Markup Module

The rite.markup module provides utilities for working with HTML, XML, and Markdown.

Overview

markup

Markup Module

Comprehensive markup language processing utilities.

This module provides utilities for HTML, XML, Markdown processing, entity encoding/decoding, and content sanitization.

Submodules

  • html: HTML cleaning, escaping, unescaping, tag stripping
  • xml: XML escaping, unescaping, formatting
  • markdown: Markdown to HTML conversion, escaping
  • entities: HTML entity encoding and decoding
  • sanitize: URL, filename, and HTML sanitization

Examples

HTML

>>> from rite.markup import html_clean, html_escape
>>> html_clean("<p>Hello</p>")
'Hello'
>>> html_escape("<tag>")
'&lt;tag&gt;'

XML

>>> from rite.markup import xml_escape
>>> xml_escape("<tag>value</tag>")
'&lt;tag&gt;value&lt;/tag&gt;'

Markdown

>>> from rite.markup import markdown_to_html
>>> markdown_to_html("**bold**")
'<strong>bold</strong>'

Entities

>>> from rite.markup import entities_encode
>>> entities_encode("©")
'&#169;'

Sanitize

>>> from rite.markup import sanitize_url
>>> sanitize_url("javascript:alert(1)")
''

Modules

entities

Entities Module

HTML entity encoding and decoding utilities.

This submodule provides utilities for encoding text to HTML entities and decoding entities back to text.

Examples

>>> from rite.markup.entities import (
...     entities_encode,
...     entities_decode
... )
>>> entities_encode("©")
'&#169;'

Modules
entities_decode
Entity Decoder

Decode HTML entities to text.

Examples

>>> from rite.markup.entities import entities_decode
>>> entities_decode("caf&#233;")
'café'

Functions
entities_decode
entities_decode(text: str) -> str

Decode HTML entities to text.

Parameters:

  • text (str, required): Entity-encoded text.

Returns:

  • str: Decoded text.

Examples:

>>> entities_decode("&#169;")
'©'
>>> entities_decode("&copy;")
'©'
Notes

Decodes both numeric (&#N;) and named (&copy;) entities. Uses html.unescape from the standard library.
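
Because the notes say decoding relies on html.unescape, the behaviour can be sketched with the standard library alone. The name decode_sketch is hypothetical and not part of rite.markup; this is an illustration under that assumption, not the library's source.

```python
import html


def decode_sketch(text: str) -> str:
    """Hypothetical stand-in: decode numeric and named HTML entities."""
    # html.unescape handles decimal (&#169;), hexadecimal (&#xA9;)
    # and named (&copy;) character references.
    return html.unescape(text)


print(decode_sketch("&#169; &#xA9; &copy;"))  # © © ©
```

Text that contains no entities passes through unchanged, so the call is safe to apply to already-decoded input.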

entities_encode
Entity Encoder

Encode text to HTML entities.

Examples

>>> from rite.markup.entities import entities_encode
>>> entities_encode("café")
'caf&#233;'

Functions
entities_encode
entities_encode(text: str, ascii_only: bool = False) -> str

Encode text to HTML entities.

Parameters:

  • text (str, required): Text to encode.
  • ascii_only (bool, default False): Encode only non-ASCII characters.

Returns:

  • str: Entity-encoded text.

Examples:

>>> entities_encode("©")
'&#169;'
>>> entities_encode("Hello", ascii_only=True)
'Hello'
Notes

Converts characters to &#N; format. Useful for encoding special characters.
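
The &#N; conversion can be sketched as below. This assumes that only characters above code point 127 are converted; how rite treats ASCII characters when ascii_only is False is not stated here, so that case is left out. The function name is hypothetical.

```python
def encode_non_ascii_sketch(text: str) -> str:
    """Hypothetical helper: replace each non-ASCII character with &#N;."""
    # ord(ch) is the Unicode code point, which is the N in &#N;.
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


print(encode_non_ascii_sketch("café ©"))  # caf&#233; &#169;
```

The standard library offers the same result through text.encode("ascii", "xmlcharrefreplace").decode("ascii").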

html

HTML Module

HTML processing utilities.

This submodule provides utilities for cleaning, escaping, and manipulating HTML content.

Examples

>>> from rite.markup.html import (
...     html_clean,
...     html_escape,
...     html_unescape
... )
>>> html_clean("<p>Hello</p>")
'Hello'

Modules
html_clean
HTML Cleaner

Remove HTML tags from text.

Examples

>>> from rite.markup.html import html_clean
>>> html_clean("<p>Hello World</p>")
'Hello World'

Functions
html_clean
html_clean(raw_html: str, strip: bool = True) -> str

Remove HTML tags from string.

Parameters:

  • raw_html (str, required): Raw HTML string to clean.
  • strip (bool, default True): Strip whitespace from result.

Returns:

  • str: Cleaned text without HTML tags.

Examples:

>>> html_clean("<p>Hello</p>")
'Hello'
>>> html_clean("<div>  Text  </div>", strip=False)
'  Text  '
Notes

Uses regex to remove tags. Does not parse HTML structure.
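
A regex-based tag remover of the kind the notes describe might look like the sketch below. The pattern and the name clean_sketch are assumptions for illustration; the actual expression used by rite is not shown in this page.

```python
import re

# Anything that looks like a tag: "<", one or more non-">" characters, ">".
_TAG_RE = re.compile(r"<[^>]+>")


def clean_sketch(raw_html: str, strip: bool = True) -> str:
    """Hypothetical regex-based tag remover (no real HTML parsing)."""
    text = _TAG_RE.sub("", raw_html)
    return text.strip() if strip else text


print(clean_sketch("<p>Hello</p>"))  # Hello
```

Since no parsing takes place, a pattern like this can mistake plain text such as "1 < 2 and 3 > 2" for a tag and remove part of it, which is one practical consequence of not parsing the HTML structure.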

html_escape
HTML Escaper

Escape special HTML characters.

Examples

>>> from rite.markup.html import html_escape
>>> html_escape("<div>Hello & goodbye</div>")
'&lt;div&gt;Hello &amp; goodbye&lt;/div&gt;'

Functions
html_escape
html_escape(text: str) -> str

Escape special HTML characters.

Parameters:

  • text (str, required): Text to escape.

Returns:

  • str: HTML-escaped text.

Examples:

>>> html_escape("5 < 10 & 10 > 5")
'5 &lt; 10 &amp; 10 &gt; 5'
>>> html_escape('"quoted"')
'&quot;quoted&quot;'
Notes

Escapes &, <, >, ", and '. Uses html.escape from the standard library.
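
The standard-library call named in the notes can be tried directly. The snippet below uses html.escape and html.unescape themselves rather than the rite wrappers, so it only illustrates the underlying behaviour.

```python
import html

raw = "<div>Hello & goodbye</div>"

# quote=True is the default, so double and single quotes are escaped as well.
escaped = html.escape(raw)
print(escaped)  # &lt;div&gt;Hello &amp; goodbye&lt;/div&gt;

# html.unescape reverses the escaping and gives back the original string.
restored = html.unescape(escaped)
print(restored == raw)  # True
```

Note that html.escape writes the single quote as the numeric reference &#x27; rather than a named entity.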

html_strip_tags
HTML Tag Stripper

Strip specific HTML tags.

Examples

>>> from rite.markup.html import html_strip_tags
>>> html_strip_tags("<p>Keep</p><script>alert()</script>", ["script"])
'<p>Keep</p>'

Functions
html_strip_tags
html_strip_tags(html: str, tags: list[str]) -> str

Strip specific HTML tags and their content.

Parameters:

  • html (str, required): HTML string.
  • tags (list[str], required): List of tag names to strip.

Returns:

  • str: HTML with specified tags removed.

Examples:

>>> html_strip_tags("<div>Keep</div><style>Remove</style>", ["style"])
'<div>Keep</div>'
>>> html_strip_tags(
...     "<p>Text</p><script>alert()</script>",
...     ["script", "style"]
... )
'<p>Text</p>'
Notes

Removes both opening and closing tags plus content. Case-insensitive tag matching.
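
One way to remove an element together with its content, matching tag names case-insensitively, is sketched below. The regular expression and the name strip_tags_sketch are assumptions; the page does not show how rite implements this.

```python
import re


def strip_tags_sketch(html_text: str, tags: list[str]) -> str:
    """Hypothetical helper: drop the listed elements together with their content."""
    for tag in tags:
        name = re.escape(tag)
        # <tag ...> ... </tag>: non-greedy, spanning newlines, any letter case.
        pattern = re.compile(
            rf"<{name}\b[^>]*>.*?</{name}\s*>",
            re.IGNORECASE | re.DOTALL,
        )
        html_text = pattern.sub("", html_text)
    return html_text


print(strip_tags_sketch("<p>Text</p><SCRIPT>alert()</SCRIPT>", ["script", "style"]))
# <p>Text</p>
```

A pattern like this does not handle nested elements of the same name or unclosed tags, which is a general limit of regex-based stripping.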

html_unescape
HTML Unescaper

Unescape HTML entities.

Examples

>>> from rite.markup.html import html_unescape
>>> html_unescape("&lt;div&gt;Hello&lt;/div&gt;")
'<div>Hello</div>'

Functions
html_unescape
html_unescape(text: str) -> str

Unescape HTML entities.

Parameters:

  • text (str, required): HTML-escaped text.

Returns:

  • str: Unescaped text.

Examples:

>>> html_unescape("&lt;p&gt;Hello&lt;/p&gt;")
'<p>Hello</p>'
>>> html_unescape("&amp;")
'&'
Notes

Converts entities like &lt; back to <. Uses html.unescape from the standard library.

markdown

Markdown Module

Markdown processing utilities.

This submodule provides utilities for converting and escaping Markdown content.

Examples

>>> from rite.markup.markdown import (
...     markdown_to_html,
...     markdown_escape
... )
>>> markdown_to_html("**bold**")
'<strong>bold</strong>'

Modules
markdown_escape
Markdown Escape

Escape Markdown special characters.

Examples

>>> from rite.markup.markdown import markdown_escape
>>> markdown_escape("*not italic*")
'\\*not italic\\*'

Functions
markdown_escape
markdown_escape(text: str) -> str

Escape Markdown special characters.

Parameters:

  • text (str, required): Text to escape.

Returns:

  • str: Escaped text.

Examples:

>>> markdown_escape("# Not a heading")
'\\# Not a heading'
>>> markdown_escape("[not](link)")
'\\[not\\]\\(link\\)'
Notes

Escapes: *, _, #, [, ], (, ), `, ~ Prevents Markdown interpretation.
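
Escaping the characters listed in the notes amounts to placing a backslash in front of each one. The sketch below does that with a single regular expression; md_escape_sketch is a hypothetical name and the pattern is an assumption, not the library's code.

```python
import re

# The characters listed in the notes: * _ # [ ] ( ) ` ~
_SPECIAL = re.compile(r"([*_#\[\]()`~])")


def md_escape_sketch(text: str) -> str:
    """Hypothetical helper: prefix each Markdown special character with a backslash."""
    # In the replacement template, "\\" is a literal backslash and "\1" the matched character.
    return _SPECIAL.sub(r"\\\1", text)


print(md_escape_sketch("[not](link)"))  # \[not\]\(link\)
```

The doctest outputs in this section show doubled backslashes because they are Python string reprs; the escaped text itself contains one backslash per special character.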

markdown_to_html
Markdown to HTML

Convert Markdown to HTML (basic).

Examples

>>> from rite.markup.markdown import markdown_to_html
>>> markdown_to_html("**bold text**")
'<strong>bold text</strong>'

Functions
markdown_to_html
markdown_to_html(markdown: str) -> str

Convert basic Markdown to HTML.

Parameters:

  • markdown (str, required): Markdown text.

Returns:

  • str: HTML string.

Examples:

>>> markdown_to_html("# Heading")
'<h1>Heading</h1>'
>>> markdown_to_html("**bold** and *italic*")
'<strong>bold</strong> and <em>italic</em>'
Notes

Basic conversion only. Supports headings, bold, italic, and code. For full Markdown support, use an external library.
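
A converter limited to the four constructs named in the notes can be sketched with a few substitutions. Everything below (the name md_to_html_sketch, the patterns, the line-by-line approach) is an assumption for illustration and may differ from what rite does, for example in how paragraphs are handled.

```python
import re


def md_to_html_sketch(markdown: str) -> str:
    """Hypothetical converter for headings, bold, italic and inline code only."""
    out = []
    for line in markdown.splitlines():
        # "# Title" to "###### Title" become <h1> to <h6>.
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            level = len(heading.group(1))
            line = f"<h{level}>{heading.group(2)}</h{level}>"
        # Bold must be handled before italic, otherwise "**" is read as two "*".
        line = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", line)
        line = re.sub(r"\*(.+?)\*", r"<em>\1</em>", line)
        line = re.sub(r"`(.+?)`", r"<code>\1</code>", line)
        out.append(line)
    return "\n".join(out)


print(md_to_html_sketch("**bold** and *italic*"))
# <strong>bold</strong> and <em>italic</em>
```

The order of the substitutions is the main design point: swapping the bold and italic rules would turn `**bold**` into nested emphasis instead of a strong element.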

sanitize

Sanitize Module

Content sanitization utilities.

This submodule provides utilities for sanitizing URLs, filenames, and HTML content for security.

Examples

>>> from rite.markup.sanitize import (
...     sanitize_url,
...     sanitize_filename,
...     sanitize_html
... )
>>> sanitize_url("javascript:alert(1)")
''

Modules
sanitize_filename
Filename Sanitizer

Sanitize filenames for safe filesystem use.

Examples

>>> from rite.markup.sanitize import sanitize_filename
>>> sanitize_filename("file:name?.txt")
'filename.txt'

Functions
sanitize_filename
sanitize_filename(filename: str, replacement: str = '') -> str

Sanitize filename by removing unsafe characters.

Parameters:

  • filename (str, required): Filename to sanitize.
  • replacement (str, default ''): Character to replace unsafe chars with.

Returns:

  • str: Safe filename.

Examples:

>>> sanitize_filename("my/file:name.txt")
'myfilename.txt'
>>> sanitize_filename("file<>name.txt", "_")
'file__name.txt'
Notes

Removes: / : * ? " < > | Preserves file extension.
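
The character removal described in the notes can be sketched with one character class. The set below contains exactly the characters listed above; whether rite filters additional characters is not stated here, and filename_sketch is a hypothetical name.

```python
import re

# Characters named in the notes: / : * ? " < > |
_UNSAFE = re.compile(r'[/:*?"<>|]')


def filename_sketch(filename: str, replacement: str = "") -> str:
    """Hypothetical helper: replace each unsafe character individually."""
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted literally rather than read as a template.
    return _UNSAFE.sub(lambda _match: replacement, filename)


print(filename_sketch("my/file:name.txt"))        # myfilename.txt
print(filename_sketch("file<>name.txt", "_"))     # file__name.txt
```

Each unsafe character is replaced on its own, which is why two adjacent characters produce two underscores in the second call.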

sanitize_html
HTML Sanitizer

Sanitize HTML by removing dangerous elements.

Examples

>>> from rite.markup.sanitize import sanitize_html
>>> sanitize_html("<p>Safe</p><script>Bad</script>")
'<p>Safe</p>'

Functions
sanitize_html
sanitize_html(html: str, allowed_tags: list[str] | None = None) -> str

Sanitize HTML by removing dangerous tags.

Parameters:

  • html (str, required): HTML to sanitize.
  • allowed_tags (list[str] | None, default None): List of allowed tags (default: p, br, strong, em).

Returns:

  • str: Sanitized HTML.

Examples:

>>> sanitize_html("<p>Safe</p><script>Bad</script>")
'<p>Safe</p>'
>>> sanitize_html("<div>Text</div>", ["div"])
'<div>Text</div>'
Notes

Removes script, iframe, object, embed by default. Only allows whitelisted tags.
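
The two rules in the notes (drop dangerous elements, keep only allowed tags) can be combined as in the sketch below. It is an assumption-laden illustration: the names, the regular expressions, and the choice to keep the text of disallowed tags are all hypothetical, and the sketch does not inspect attributes.

```python
from __future__ import annotations

import re

DANGEROUS = ("script", "iframe", "object", "embed")  # from the notes above
DEFAULT_ALLOWED = ("p", "br", "strong", "em")        # from the parameter description

_ANY_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def sanitize_html_sketch(html_text: str, allowed_tags: list[str] | None = None) -> str:
    """Hypothetical allowlist sanitizer built on regular expressions."""
    chosen = allowed_tags if allowed_tags is not None else DEFAULT_ALLOWED
    allowed = {tag.lower() for tag in chosen}

    # 1. Drop dangerous elements together with their content.
    for tag in DANGEROUS:
        html_text = re.sub(
            rf"<{tag}\b[^>]*>.*?</{tag}\s*>",
            "",
            html_text,
            flags=re.IGNORECASE | re.DOTALL,
        )

    # 2. Remove every remaining tag that is not allowed, keeping its inner text.
    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() in allowed else ""

    return _ANY_TAG.sub(keep_or_drop, html_text)


print(sanitize_html_sketch("<p>Safe</p><script>Bad</script>"))  # <p>Safe</p>
```

Because attributes such as event handlers pass through untouched, a sketch like this is not sufficient for untrusted input; a parser-based sanitizer is the safer choice there.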

sanitize_url
URL Sanitizer

Sanitize and validate URLs.

Examples

>>> from rite.markup.sanitize import sanitize_url
>>> sanitize_url("javascript:alert('xss')")
''

Functions
sanitize_url
sanitize_url(url: str, allowed_schemes: list[str] | None = None) -> str

Sanitize URL by checking scheme.

Parameters:

  • url (str, required): URL to sanitize.
  • allowed_schemes (list[str] | None, default None): Allowed URL schemes (default: http, https).

Returns:

  • str: Sanitized URL or empty string if invalid.

Examples:

>>> sanitize_url("https://example.com")
'https://example.com'
>>> sanitize_url("javascript:void(0)")
''
>>> sanitize_url("ftp://server.com", ["ftp"])
'ftp://server.com'
Notes

Blocks dangerous schemes like javascript:. Returns empty string for invalid URLs.
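
A scheme check of this kind can be sketched with urllib.parse. The function below is a hypothetical illustration, not the rite implementation; in particular it rejects relative URLs (which have no scheme), and whether rite does the same is not stated here.

```python
from __future__ import annotations

from urllib.parse import urlparse


def sanitize_url_sketch(url: str, allowed_schemes: list[str] | None = None) -> str:
    """Hypothetical scheme allowlist check."""
    schemes = allowed_schemes if allowed_schemes is not None else ["http", "https"]
    cleaned = url.strip()
    try:
        scheme = urlparse(cleaned).scheme.lower()
    except ValueError:
        # urlparse raises ValueError for some malformed inputs.
        return ""
    return cleaned if scheme in schemes else ""


print(sanitize_url_sketch("https://example.com"))          # https://example.com
print(repr(sanitize_url_sketch("javascript:void(0)")))     # ''
```

Checking against an allowlist, rather than blocking a list of known-bad schemes, means that unfamiliar schemes such as data: or vbscript: are rejected without having to be named.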

xml

XML Module

XML processing utilities.

This submodule provides utilities for escaping, unescaping, and formatting XML content.

Examples

>>> from rite.markup.xml import (
...     xml_escape,
...     xml_unescape
... )
>>> xml_escape("<tag>value</tag>")
'&lt;tag&gt;value&lt;/tag&gt;'

Modules
xml_escape
XML Escaper

Escape special XML characters.

Examples

>>> from rite.markup.xml import xml_escape
>>> xml_escape("<tag>value & more</tag>")
'&lt;tag&gt;value &amp; more&lt;/tag&gt;'

Functions
xml_escape
xml_escape(text: str) -> str

Escape special XML characters.

Parameters:

  • text (str, required): Text to escape.

Returns:

  • str: XML-escaped text.

Examples:

>>> xml_escape("5 < 10 & 10 > 5")
'5 &lt; 10 &amp; 10 &gt; 5'
>>> xml_escape('"quoted"')
'&quot;quoted&quot;'
Notes

Escapes &, <, >, ", and '. Uses xml.sax.saxutils.escape.
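
On its own, xml.sax.saxutils.escape only replaces &, < and >; quotes are escaped when an extra mapping is passed. The sketch below assumes that is how the quote handling in the notes is achieved. The mapping and the name xml_escape_sketch are hypothetical.

```python
from xml.sax.saxutils import escape

# Extra replacements applied after the built-in &, < and > handling.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def xml_escape_sketch(text: str) -> str:
    """Hypothetical wrapper: escape the five XML special characters."""
    return escape(text, QUOTE_ENTITIES)


print(xml_escape_sketch('5 < 10 & "quoted"'))  # 5 &lt; 10 &amp; &quot;quoted&quot;
```

The ampersand is replaced first inside escape, so the entities introduced for the other characters are not escaped a second time.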

xml_format
XML Formatter

Format XML with proper indentation.

Examples

>>> from rite.markup.xml import xml_format
>>> xml_format("<root><child>text</child></root>")  # doctest: +SKIP
'<root>\n  <child>text</child>\n</root>'

Functions
xml_format
xml_format(xml_string: str, indent: str = '  ') -> str

Format XML string with indentation.

Parameters:

  • xml_string (str, required): Unformatted XML string.
  • indent (str, default '  '): Indentation string (default: 2 spaces).

Returns:

  • str: Formatted XML string.

Examples:

>>> xml = "<root><child>text</child></root>"
>>> formatted = xml_format(xml)
>>> print(formatted)
<root>
  <child>text</child>
</root>
Notes

Uses xml.dom.minidom for formatting. May fail on malformed XML.
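
Formatting with xml.dom.minidom can be sketched as follows. The name xml_format_sketch is hypothetical, and the step that drops the XML declaration is an assumption made so the output matches the example above; the page does not say how rite treats the declaration.

```python
from xml.dom import minidom


def xml_format_sketch(xml_string: str, indent: str = "  ") -> str:
    """Hypothetical pretty-printer built on xml.dom.minidom."""
    # parseString raises xml.parsers.expat.ExpatError on malformed input.
    pretty = minidom.parseString(xml_string).toprettyxml(indent=indent)
    # toprettyxml prepends an XML declaration; drop it along with blank lines.
    lines = [
        line
        for line in pretty.splitlines()
        if line.strip() and not line.startswith("<?xml")
    ]
    return "\n".join(lines)


print(xml_format_sketch("<root><child>text</child></root>"))
```

Malformed input makes minidom.parseString raise an exception instead of returning partial output, which is the failure mode the notes refer to.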

xml_unescape
XML Unescaper

Unescape XML entities.

Examples

>>> from rite.markup.xml import xml_unescape
>>> xml_unescape("&lt;tag&gt;value&lt;/tag&gt;")
'<tag>value</tag>'

Functions
xml_unescape
xml_unescape(text: str) -> str

Unescape XML entities.

Parameters:

  • text (str, required): XML-escaped text.

Returns:

  • str: Unescaped text.

Examples:

>>> xml_unescape("&lt;root&gt;&lt;/root&gt;")
'<root></root>'
>>> xml_unescape("&amp;&apos;&quot;")
'&\'"'
Notes

Converts entities like &lt; back to <. Uses xml.sax.saxutils.unescape.
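
By default, xml.sax.saxutils.unescape converts only &amp;, &lt; and &gt;. Decoding &apos; and &quot;, as in the example above, requires passing a mapping, which the sketch below assumes. The mapping and the function name are hypothetical.

```python
from xml.sax.saxutils import unescape

# Extra entities to decode in addition to the built-in &amp;, &lt; and &gt;.
QUOTE_ENTITIES = {"&apos;": "'", "&quot;": '"'}


def xml_unescape_sketch(text: str) -> str:
    """Hypothetical wrapper: decode the five predefined XML entities."""
    return unescape(text, QUOTE_ENTITIES)


print(xml_unescape_sketch("&lt;root&gt;&lt;/root&gt;"))  # <root></root>
```

Inside unescape the ampersand is restored last, so an input such as "&amp;lt;" decodes to the literal text "&lt;" rather than to "<".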

Submodules

HTML

HTML manipulation and sanitization.

XML

XML parsing and formatting.

Markdown

Markdown processing.

Entities

HTML entity encoding/decoding.

Examples

from rite.markup import (
    html_escape,
    xml_format,
    markdown_to_html
)

# Escape HTML
safe = html_escape("<script>alert('xss')</script>")

# Format XML
formatted = xml_format("<root><child>text</child></root>")

# Convert Markdown
html = markdown_to_html("# Heading\n\nParagraph")