html.parser — Simple HTML and XHTML parser

Source code: Lib/html/parser.py

This module defines a class, HTMLParser, that serves as the basis for parsing text files formatted in HTML (HyperText Markup Language) and XHTML.
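As a minimal sketch of what "serves as the basis" means in practice, an application typically subclasses HTMLParser and overrides handler methods. The handler names used here (handle_starttag, handle_endtag, handle_data) and the feed() and close() methods belong to the class but are not described in this excerpt; the TagLogger class and its events list are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical subclass that records every parse event in a list."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


parser = TagLogger()
parser.feed('<p class="intro">Hello, <b>world</b>!</p>')
parser.close()  # flush any buffered text

events = parser.events
for event in events:
    print(event)
```

The base class only tokenizes the input and calls the handlers; any state, such as the list of events here, is the subclass's responsibility.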

class html.parser.HTMLParser(*, convert_charrefs=True)

Create a parser instance able to parse invalid markup.
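To illustrate the tolerance for invalid markup, the sketch below feeds unbalanced tags to a parser. It assumes the handle_starttag and handle_endtag handler methods, which are not described in this excerpt; the TagCollector class and its attributes are hypothetical names for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical subclass that collects start and end tag names."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


collector = TagCollector()
# <b> is never closed and </i> was never opened; no exception is raised.
collector.feed("<p>one<b>two</p></i>")
collector.close()

start_tags = collector.start_tags
end_tags = collector.end_tags
print(start_tags, end_tags)
```

The parser reports tags in the order they appear and does not check that they are balanced, so any validation or repair is left to the subclass.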

If convert_charrefs is True (the default), all character references (except the ones in script/style elements) are automatically converted to the corresponding Unicode charac