Markup HTML¶

HTML manipulation and sanitization utilities.

HTML Module¶

HTML processing utilities.

This submodule provides utilities for cleaning, escaping, and manipulating HTML content.

Examples¶

>>> from rite.markup.html import (
...     html_clean,
...     html_escape,
...     html_unescape
... )
>>> html_clean("<p>Hello</p>")
'Hello'

Modules¶

`html_clean` ¶

HTML Cleaner¶

Remove HTML tags from text.

Examples¶

>>> from rite.markup.html import html_clean
>>> html_clean("<p>Hello World</p>")
'Hello World'

Functions¶

`html_clean(raw_html: str, strip: bool = True) -> str` ¶

Remove HTML tags from string.

Parameters:

Name	Type	Description	Default
`raw_html`	`str`	Raw HTML string to clean.	required
`strip`	`bool`	Strip whitespace from result.	`True`

Returns:

Type	Description
`str`	Cleaned text without HTML tags.

Examples:

>>> html_clean("<p>Hello</p>")
'Hello'
>>> html_clean("<div>  Text  </div>", strip=False)
'  Text  '

Notes

Uses regex to remove tags. Does not parse HTML structure.
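The practical effect of regex-based removal can be illustrated with a minimal sketch. The function name `clean_sketch` and the pattern below are assumptions made for illustration; they are not taken from rite's source, which may use a different expression.

```python
import re

# Hypothetical pattern: anything from "<" up to the next ">" counts as a tag.
_TAG_PATTERN = re.compile(r"<[^>]*>")


def clean_sketch(raw_html: str, strip: bool = True) -> str:
    """Remove tag-like spans with a regex (illustration only, not rite's code)."""
    text = _TAG_PATTERN.sub("", raw_html)
    return text.strip() if strip else text


print(clean_sketch("<p>Hello</p>"))                             # Hello
print(repr(clean_sketch("<div>  Text  </div>", strip=False)))   # '  Text  '

# Because the HTML structure is not parsed, element content always survives,
# and a ">" inside an attribute value ends the match early.
print(clean_sketch("<script>alert(1)</script>Hi"))              # alert(1)Hi
print(clean_sketch('<a title="a>b">link</a>'))                  # b">link
```

A function built this way is suitable for producing plain text from trusted markup. It should not be relied on to sanitize untrusted input.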

`html_escape` ¶

HTML Escaper¶

Escape special HTML characters.

Examples¶

>>> from rite.markup.html import html_escape
>>> html_escape("<div>Hello & goodbye</div>")
'&lt;div&gt;Hello &amp; goodbye&lt;/div&gt;'

Functions¶

`html_escape(text: str) -> str` ¶

Escape special HTML characters.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text to escape.	required

Returns:

Type	Description
`str`	HTML-escaped text.

Examples:

>>> html_escape("5 < 10 & 10 > 5")
'5 &lt; 10 &amp; 10 &gt; 5'
>>> html_escape('"quoted"')
'&quot;quoted&quot;'

Notes

Escapes: `&`, `<`, `>`, `"`, `'`.

Uses `html.escape` from the standard library.
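Since the note names `html.escape`, calling the standard-library function directly shows which replacements to expect. This sketch assumes `html_escape` passes its argument through with the default `quote=True`; the wrapper itself is not shown in this page.

```python
import html

# With the default quote=True, both quote characters are escaped as well.
print(html.escape("5 < 10 & 10 > 5"))   # 5 &lt; 10 &amp; 10 &gt; 5
print(html.escape('"quoted"'))          # &quot;quoted&quot;
print(html.escape("it's"))              # it&#x27;s

# Escaping is not idempotent: a second pass escapes the "&" of each entity.
print(html.escape(html.escape("<")))    # &amp;lt;
```

The last line is the reason to escape exactly once, at the point where text is inserted into HTML.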

`html_strip_tags` ¶

HTML Tag Stripper¶

Strip specific HTML tags.

Examples¶

>>> from rite.markup.html import html_strip_tags
>>> html_strip_tags("<div>Keep</div>", ["script"])
'<div>Keep</div>'

Functions¶

`html_strip_tags(html: str, tags: list[str]) -> str` ¶

Strip specific HTML tags and their content.

Parameters:

Name	Type	Description	Default
`html`	`str`	HTML string.	required
`tags`	`list[str]`	List of tag names to strip.	required

Returns:

Type	Description
`str`	HTML with specified tags removed.

Examples:

>>> html_strip_tags("<div>Keep</div><style>Remove</style>", ["style"])
'<div>Keep</div>'
>>> html_strip_tags(
...     "<p>Text</p><script>alert()</script>",
...     ["script", "style"]
... )
'<p>Text</p>'

Notes

Removes both opening and closing tags plus content. Case-insensitive tag matching.
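The behavior described in these notes can be approximated with one regular expression per tag name. This is a sketch under assumptions: `strip_tags_sketch` and its pattern are hypothetical and are not taken from rite's implementation.

```python
import re


def strip_tags_sketch(html_text: str, tags: list[str]) -> str:
    """Remove each listed element together with its content (illustration only)."""
    for tag in tags:
        name = re.escape(tag)
        # Opening tag with optional attributes, shortest possible content,
        # then the matching closing tag. DOTALL lets content span lines.
        pattern = re.compile(
            rf"<{name}\b[^>]*>.*?</{name}\s*>",
            flags=re.IGNORECASE | re.DOTALL,
        )
        html_text = pattern.sub("", html_text)
    return html_text


print(strip_tags_sketch("<div>Keep</div><style>Remove</style>", ["style"]))
# <div>Keep</div>
print(strip_tags_sketch("<p>Text</p><SCRIPT>alert()</SCRIPT>", ["script", "style"]))
# <p>Text</p>
```

In this sketch an element with no closing tag is left in place, because the pattern only matches complete open/close pairs.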

`html_unescape` ¶

HTML Unescaper¶

Unescape HTML entities.

Examples¶

>>> from rite.markup.html import html_unescape
>>> html_unescape("&lt;div&gt;Hello&lt;/div&gt;")
'<div>Hello</div>'

Functions¶

`html_unescape(text: str) -> str` ¶

Unescape HTML entities.

Parameters:

Name	Type	Description	Default
`text`	`str`	HTML-escaped text.	required

Returns:

Type	Description
`str`	Unescaped text.

Examples:

>>> html_unescape("&lt;p&gt;Hello&lt;/p&gt;")
'<p>Hello</p>'
>>> html_unescape("&amp;")
'&'

Notes

Converts entities like `&lt;` back to `<`.

Uses `html.unescape` from the standard library.
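A short round trip through the standard-library functions shows how unescaping reverses escaping. The sample strings are made up for illustration, and the sketch assumes `html_unescape` delegates to `html.unescape` without extra processing.

```python
import html

original = '<p class="x">Tom & Jerry</p>'
escaped = html.escape(original)
print(escaped)
# &lt;p class=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/p&gt;

# Unescaping restores the original text exactly.
restored = html.unescape(escaped)
print(restored == original)  # True

# Named, decimal, and hexadecimal character references are all decoded.
print(html.unescape("&copy; &#169; &#xA9;"))  # © © ©
```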
