Markup Module¶
The rite.markup module provides utilities for working with HTML, XML, and Markdown.
Overview¶
markup ¶
Markup Module¶
Comprehensive markup language processing utilities.
This module provides utilities for HTML, XML, Markdown processing, entity encoding/decoding, and content sanitization.
Submodules¶
- html: HTML cleaning, escaping, unescaping, tag stripping
- xml: XML escaping, unescaping, formatting
- markdown: Markdown to HTML conversion, escaping
- entities: HTML entity encoding and decoding
- sanitize: URL, filename, and HTML sanitization
Examples¶
HTML
>>> from rite.markup import html_clean, html_escape
>>> html_clean("<p>Hello</p>")
'Hello'
>>> html_escape("<div>Hello & goodbye</div>")
'&lt;div&gt;Hello &amp; goodbye&lt;/div&gt;'
XML
>>> from rite.markup import xml_escape
>>> xml_escape("<tag>value</tag>")
'&lt;tag&gt;value&lt;/tag&gt;'
Markdown
>>> from rite.markup import markdown_to_html
>>> markdown_to_html("**bold**")
'<strong>bold</strong>'
Entities
>>> from rite.markup import entities_encode
>>> entities_encode("©")
'&#169;'
Sanitize
>>> from rite.markup import sanitize_url
>>> sanitize_url("javascript:alert(1)")
''
Modules¶
entities ¶
Entities Module¶
HTML entity encoding and decoding utilities.
This submodule provides utilities for encoding text to HTML entities and decoding entities back to text.
Examples¶
>>> from rite.markup.entities import (
...     entities_encode,
...     entities_decode
... )
>>> entities_encode("©")
'&#169;'
Modules¶
entities_decode ¶
Entity Decoder¶
Decode HTML entities to text.
Examples¶
>>> from rite.markup.entities import entities_decode
>>> entities_decode("caf&#233;")
'café'
entities_encode ¶
Entity Encoder¶
Encode text to HTML entities.
Examples¶
>>> from rite.markup.entities import entities_encode
>>> entities_encode("café")
'caf&#233;'
Encode text to HTML entities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to encode. | required |
| ascii_only | bool | Encode only non-ASCII characters. | False |
Returns:
| Type | Description |
|---|---|
| str | Entity-encoded text. |
Notes
Converts characters to &#N; format.
Useful for encoding special characters.
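The numeric `&#N;` form mentioned in the notes can be illustrated with the standard library alone. The sketch below is not rite's implementation: `encode_numeric_entities` is a hypothetical helper that assumes only non-ASCII characters are encoded, and `html.unescape` stands in for the decoding direction.

```python
import html


def encode_numeric_entities(text: str) -> str:
    """Hypothetical helper: replace each non-ASCII character with &#N;."""
    return "".join(f"&#{ord(ch)};" if ord(ch) > 127 else ch for ch in text)


encoded = encode_numeric_entities("café ©")
print(encoded)                 # caf&#233; &#169;
print(html.unescape(encoded))  # café ©
```

N is the decimal Unicode code point of the character, which is why é becomes `&#233;` and © becomes `&#169;`.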
html ¶
HTML Module¶
HTML processing utilities.
This submodule provides utilities for cleaning, escaping, and manipulating HTML content.
Examples¶
>>> from rite.markup.html import (
...     html_clean,
...     html_escape,
...     html_unescape
... )
>>> html_clean("<p>Hello</p>")
'Hello'
Modules¶
html_clean ¶
HTML Cleaner¶
Remove HTML tags from text.
Examples¶
>>> from rite.markup.html import html_clean
>>> html_clean("<p>Hello World</p>")
'Hello World'
Remove HTML tags from string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_html | str | Raw HTML string to clean. | required |
| strip | bool | Strip whitespace from result. | True |
Returns:
| Type | Description |
|---|---|
| str | Cleaned text without HTML tags. |
Notes
Uses regex to remove tags.
Does not parse HTML structure.
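To make the "regex, not a parser" note concrete, here is a minimal sketch of regex-based tag removal. It is an assumption about the general technique, not the library's actual code; `strip_all_tags` and `TAG_PATTERN` are hypothetical names.

```python
import re

# Matches anything that looks like a tag: "<", one or more non-">" chars, ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_all_tags(raw_html: str, strip: bool = True) -> str:
    """Hypothetical stand-in: delete tag-like substrings, keep the text."""
    text = TAG_PATTERN.sub("", raw_html)
    return text.strip() if strip else text


print(strip_all_tags("<p>Hello <b>World</b></p>"))  # Hello World
print(strip_all_tags("<script>alert(1)</script>"))  # alert(1)
```

Because nothing is parsed, the text inside a `script` element survives in this sketch. A cleaner built this way removes markup but should not be treated as a sanitizer.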
html_escape ¶
HTML Escaper¶
Escape special HTML characters.
Examples¶
>>> from rite.markup.html import html_escape
>>> html_escape("<div>Hello & goodbye</div>")
'&lt;div&gt;Hello &amp; goodbye&lt;/div&gt;'
Escape special HTML characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to escape. | required |
Returns:
| Type | Description |
|---|---|
| str | HTML-escaped text. |
Examples:
>>> html_escape("5 < 10 & 10 > 5")
'5 &lt; 10 &amp; 10 &gt; 5'
>>> html_escape('"quoted"')
'&quot;quoted&quot;'
Notes
Escapes: &, <, >, ", '
Uses html.escape from the standard library.
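Since the notes name `html.escape` as the underlying call, the standard-library behaviour can be checked directly. This sketch uses only the `html` module and makes no claim about any extra handling rite may add on top.

```python
import html

print(html.escape("5 < 10 & 10 > 5"))        # 5 &lt; 10 &amp; 10 &gt; 5
print(html.escape("it's"))                   # it&#x27;s
print(html.escape('"quoted"', quote=False))  # "quoted"

# Escaping and unescaping are inverses for ordinary text.
original = '<a href="x">Tom & Jerry</a>'
assert html.unescape(html.escape(original)) == original
```

With the default `quote=True`, both quote characters are escaped, which matters when the result is placed inside an HTML attribute value.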
html_strip_tags ¶
HTML Tag Stripper¶
Strip specific HTML tags.
Examples¶
>>> from rite.markup.html import html_strip_tags
>>> html_strip_tags("<p>Keep</p><script>alert()</script>", ["script"])
'<p>Keep</p>'
Strip specific HTML tags and their content.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| html | str | HTML string. | required |
| tags | list[str] | List of tag names to strip. | required |
Returns:
| Type | Description |
|---|---|
| str | HTML with specified tags removed. |
Examples:
>>> html_strip_tags("<div>Keep</div><style>Remove</style>", ["style"])
'<div>Keep</div>'
>>> html_strip_tags(
... "<p>Text</p><script>alert()</script>",
... ["script", "style"]
... )
'<p>Text</p>'
Notes
Removes both opening and closing tags plus content. Case-insensitive tag matching.
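The behaviour described in the notes (remove the element and its content, match tag names case-insensitively) could be approximated as follows. This is a sketch under those assumptions; `strip_tags_with_content` is a hypothetical name, and the real function may differ in edge cases such as nested or unclosed tags.

```python
import re


def strip_tags_with_content(html_text: str, tags: list[str]) -> str:
    """Hypothetical stand-in: drop <tag ...>...</tag> blocks for each name."""
    for tag in tags:
        pattern = re.compile(
            rf"<{re.escape(tag)}\b[^>]*>.*?</{re.escape(tag)}\s*>",
            re.IGNORECASE | re.DOTALL,  # any letter case, content may span lines
        )
        html_text = pattern.sub("", html_text)
    return html_text


print(strip_tags_with_content("<div>Keep</div><STYLE>x{}</STYLE>", ["style"]))
# <div>Keep</div>
```

The non-greedy `.*?` stops at the first closing tag, so two separate `style` blocks are removed individually rather than swallowing the text between them.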
html_unescape ¶
HTML Unescaper¶
Unescape HTML entities.
Examples¶
>>> from rite.markup.html import html_unescape
>>> html_unescape("&lt;div&gt;Hello&lt;/div&gt;")
'<div>Hello</div>'
markdown ¶
Markdown Module¶
Markdown processing utilities.
This submodule provides utilities for converting and escaping Markdown content.
Examples¶
>>> from rite.markup.markdown import (
...     markdown_to_html,
...     markdown_escape
... )
>>> markdown_to_html("**bold**")
'<strong>bold</strong>'
Modules¶
markdown_escape ¶
Markdown Escape¶
Escape Markdown special characters.
Examples¶
>>> from rite.markup.markdown import markdown_escape
>>> markdown_escape("*not italic*")
'\\*not italic\\*'
Escape Markdown special characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to escape. | required |
Returns:
| Type | Description |
|---|---|
| str | Escaped text. |
Examples:
>>> markdown_escape("# Not a heading")
'\\# Not a heading'
>>> markdown_escape("[not](link)")
'\\[not\\]\\(link\\)'
Notes
Escapes: *, _, #, [, ], (, ), `, ~
Prevents Markdown interpretation.
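Backslash-escaping the nine characters listed in the notes can be sketched with a single regular expression. The names `escape_markdown` and `MARKDOWN_SPECIALS` are hypothetical, and the character set is taken from the notes rather than from the library source.

```python
import re

# The nine characters listed in the notes: * _ # [ ] ( ) ` ~
MARKDOWN_SPECIALS = re.compile(r"([*_#\[\]()`~])")


def escape_markdown(text: str) -> str:
    """Hypothetical stand-in: prefix each special character with a backslash."""
    return MARKDOWN_SPECIALS.sub(r"\\\1", text)


print(escape_markdown("# Not a heading"))  # \# Not a heading
print(escape_markdown("[not](link)"))      # \[not\]\(link\)
```

`print` shows a single backslash per escaped character; the doubled backslashes in the doctest output above are only how Python's `repr` displays them.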
markdown_to_html ¶
Markdown to HTML¶
Convert Markdown to HTML (basic).
Examples¶
>>> from rite.markup.markdown import markdown_to_html
>>> markdown_to_html("**bold text**")
'<strong>bold text</strong>'
Convert basic Markdown to HTML.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| markdown | str | Markdown text. | required |
Returns:
| Type | Description |
|---|---|
| str | HTML string. |
Examples:
>>> markdown_to_html("# Heading")
'<h1>Heading</h1>'
>>> markdown_to_html("**bold** and *italic*")
'<strong>bold</strong> and <em>italic</em>'
Notes
Basic conversion only.
Supports headings, bold, italic, and code.
For full Markdown, use an external library.
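A "basic" converter limited to the four constructs in the notes might look like the sketch below. It is an illustration of the approach, not rite's code; `basic_markdown_to_html` is a hypothetical name, and treating "code" as inline backtick spans is an assumption.

```python
import re


def basic_markdown_to_html(markdown: str) -> str:
    """Hypothetical stand-in covering headings, bold, italic, inline code."""
    html_lines = []
    for line in markdown.splitlines():
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            level = len(heading.group(1))  # number of leading "#" characters
            line = f"<h{level}>{heading.group(2)}</h{level}>"
        # Bold is handled before italic so "**" is not read as two "*".
        line = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", line)
        line = re.sub(r"\*(.+?)\*", r"<em>\1</em>", line)
        line = re.sub(r"`(.+?)`", r"<code>\1</code>", line)
        html_lines.append(line)
    return "\n".join(html_lines)


print(basic_markdown_to_html("# Heading"))              # <h1>Heading</h1>
print(basic_markdown_to_html("**bold** and *italic*"))
# <strong>bold</strong> and <em>italic</em>
```

Lists, links, block quotes, and nested emphasis are out of reach for line-by-line substitution, which is the practical reason to switch to a full Markdown library for anything beyond simple text.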
sanitize ¶
Sanitize Module¶
Content sanitization utilities.
This submodule provides utilities for sanitizing URLs, filenames, and HTML content for security.
Examples¶
>>> from rite.markup.sanitize import (
...     sanitize_url,
...     sanitize_filename,
...     sanitize_html
... )
>>> sanitize_url("javascript:alert(1)")
''
Modules¶
sanitize_filename ¶
Filename Sanitizer¶
Sanitize filenames for safe filesystem use.
Examples¶
>>> from rite.markup.sanitize import sanitize_filename
>>> sanitize_filename("file:name?.txt")
'filename.txt'
Sanitize filename by removing unsafe characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str | Filename to sanitize. | required |
| replacement | str | Character to replace unsafe chars with. | '' |
Returns:
| Type | Description |
|---|---|
| str | Safe filename. |
Examples:
>>> sanitize_filename("my/file:name.txt")
'myfilename.txt'
>>> sanitize_filename("file<>name.txt", "_")
'file__name.txt'
Notes
Removes: / : * ? " < > |
Preserves file extension.
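The character list in the notes maps naturally onto one character class. The following is a sketch of that idea only; `clean_filename` and `UNSAFE_CHARS` are hypothetical names, and the set of characters is copied from the notes.

```python
import re

# The characters listed in the notes: / : * ? " < > |
UNSAFE_CHARS = re.compile(r'[/:*?"<>|]')


def clean_filename(filename: str, replacement: str = "") -> str:
    """Hypothetical stand-in: substitute every unsafe character."""
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted literally instead of being parsed by re.
    return UNSAFE_CHARS.sub(lambda _match: replacement, filename)


print(clean_filename("my/file:name.txt"))     # myfilename.txt
print(clean_filename("file<>name.txt", "_"))  # file__name.txt
```

The dot is not in the unsafe set, so the extension passes through untouched, consistent with the "preserves file extension" note.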
sanitize_html ¶
HTML Sanitizer¶
Sanitize HTML by removing dangerous elements.
Examples¶
>>> from rite.markup.sanitize import sanitize_html
>>> sanitize_html("<p>Safe</p><script>Bad</script>")
'<p>Safe</p>'
Sanitize HTML by removing dangerous tags.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| html | str | HTML to sanitize. | required |
| allowed_tags | list[str] \| None | List of allowed tags (default: p, br, strong, em). | None |
Returns:
| Type | Description |
|---|---|
| str | Sanitized HTML. |
Examples:
>>> sanitize_html("<p>Safe</p><script>Bad</script>")
'<p>Safe</p>'
>>> sanitize_html("<div>Text</div>", ["div"])
'<div>Text</div>'
Notes
Removes script, iframe, object, embed by default.
Only allows whitelisted tags.
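The two rules in the notes (always drop dangerous elements, then keep only allowlisted tags) can be sketched in two regex passes. This is an assumption about how such a function might work, not the library's implementation; `allowlist_html` and `DANGEROUS` are hypothetical names.

```python
import re
from typing import Optional

DANGEROUS = ("script", "iframe", "object", "embed")


def allowlist_html(html_text: str, allowed_tags: Optional[list[str]] = None) -> str:
    """Hypothetical stand-in for an allowlist-based tag filter."""
    allowed = {t.lower() for t in (allowed_tags or ["p", "br", "strong", "em"])}

    # Pass 1: drop dangerous elements together with their content.
    for tag in DANGEROUS:
        html_text = re.sub(
            rf"<{tag}\b[^>]*>.*?</{tag}\s*>", "", html_text,
            flags=re.IGNORECASE | re.DOTALL,
        )

    # Pass 2: drop any remaining tag whose name is not allowed, keep its text.
    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() in allowed else ""

    return re.sub(r"</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>", keep_or_drop, html_text)


print(allowlist_html("<p>Safe</p><script>Bad</script>"))  # <p>Safe</p>
print(allowlist_html("<div>Text</div>"))                  # Text
```

This sketch leaves attributes on allowed tags untouched, so an `onclick` on a `p` element would pass through; filtering attributes would need an additional step.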
sanitize_url ¶
URL Sanitizer¶
Sanitize and validate URLs.
Examples¶
>>> from rite.markup.sanitize import sanitize_url
>>> sanitize_url("javascript:alert('xss')")
''
Sanitize URL by checking scheme.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str | URL to sanitize. | required |
| allowed_schemes | list[str] \| None | Allowed URL schemes (default: http, https). | None |
Returns:
| Type | Description |
|---|---|
| str | Sanitized URL or empty string if invalid. |
Examples:
>>> sanitize_url("https://example.com")
'https://example.com'
>>> sanitize_url("javascript:void(0)")
''
>>> sanitize_url("ftp://server.com", ["ftp"])
'ftp://server.com'
Notes
Blocks dangerous schemes like javascript:.
Returns empty string for invalid URLs.
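Scheme checking of this kind is commonly built on `urllib.parse.urlparse`. The sketch below assumes that approach; `check_url_scheme` is a hypothetical name, and how rite treats relative URLs or parsing errors is not stated in the source.

```python
from typing import Optional
from urllib.parse import urlparse


def check_url_scheme(url: str, allowed_schemes: Optional[list[str]] = None) -> str:
    """Hypothetical stand-in: return the URL only if its scheme is allowed."""
    allowed = allowed_schemes or ["http", "https"]
    try:
        scheme = urlparse(url.strip()).scheme.lower()
    except ValueError:
        # urlparse raises ValueError on some malformed inputs.
        return ""
    return url if scheme in allowed else ""


print(check_url_scheme("https://example.com"))       # https://example.com
print(repr(check_url_scheme("javascript:void(0)")))  # ''
print(check_url_scheme("ftp://server.com", ["ftp"]))  # ftp://server.com
```

An allowlist is used rather than a blocklist: any scheme not explicitly permitted, including `data:` or `vbscript:`, is rejected without having to be enumerated. In this sketch a relative URL has an empty scheme and is rejected as well.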
xml ¶
XML Module¶
XML processing utilities.
This submodule provides utilities for escaping, unescaping, and formatting XML content.
Examples¶
>>> from rite.markup.xml import (
...     xml_escape,
...     xml_unescape
... )
>>> xml_escape("<tag>value</tag>")
'&lt;tag&gt;value&lt;/tag&gt;'
Modules¶
xml_escape ¶
XML Escaper¶
Escape special XML characters.
Examples¶
>>> from rite.markup.xml import xml_escape
>>> xml_escape("<tag>value & more</tag>")
'&lt;tag&gt;value &amp; more&lt;/tag&gt;'
Escape special XML characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to escape. | required |
Returns:
| Type | Description |
|---|---|
| str | XML-escaped text. |
Examples:
>>> xml_escape("5 < 10 & 10 > 5")
'5 &lt; 10 &amp; 10 &gt; 5'
>>> xml_escape('"quoted"')
'&quot;quoted&quot;'
Notes
Escapes: &, <, >, ", '
Uses xml.sax.saxutils.escape.
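The notes point at `xml.sax.saxutils.escape`, whose default behaviour covers only `&`, `<`, and `>`. The sketch below shows the standard-library call; the `QUOTE_ENTITIES` mapping is an assumption about how quote escaping could be added, not a statement of what rite passes.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes need an explicit mapping.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}

print(escape("5 < 10 & 10 > 5"))           # 5 &lt; 10 &amp; 10 &gt; 5
print(escape('"quoted"'))                  # "quoted"
print(escape('"quoted"', QUOTE_ENTITIES))  # &quot;quoted&quot;
```

Quote escaping only matters inside attribute values; for text nodes the three default replacements are sufficient.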
xml_format ¶
XML Formatter¶
Format XML with proper indentation.
Examples¶
>>> from rite.markup.xml import xml_format
>>> print(xml_format("<root><child>text</child></root>"))  # doctest: +SKIP
<root>
  <child>text</child>
</root>
Format XML string with indentation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xml_string | str | Unformatted XML string. | required |
| indent | str | Indentation string (default: 2 spaces). | '  ' |
Returns:
| Type | Description |
|---|---|
| str | Formatted XML string. |
Examples:
>>> xml = "<root><child>text</child></root>"
>>> formatted = xml_format(xml)
>>> print(formatted)
<root>
  <child>text</child>
</root>
Notes
Uses xml.dom.minidom for formatting.
May fail on malformed XML.
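Pretty-printing with `xml.dom.minidom` usually goes through `parseString` and `toprettyxml`. The sketch below shows that route; `pretty_xml` is a hypothetical name, and dropping the XML declaration is an assumption made so the output matches the example above.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def pretty_xml(xml_string: str, indent: str = "  ") -> str:
    """Hypothetical stand-in built on minidom's toprettyxml()."""
    document = minidom.parseString(xml_string)
    pretty = document.toprettyxml(indent=indent)
    # toprettyxml() prepends an XML declaration; drop it and any blank lines.
    lines = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(lines[1:])


print(pretty_xml("<root><child>text</child></root>"))

try:
    pretty_xml("<root>")  # unclosed element
except ExpatError as error:
    print("malformed XML:", error)
```

The `ExpatError` branch is what "may fail on malformed XML" looks like in practice with minidom: parsing raises before any formatting happens, so callers handling untrusted input would need to catch it.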
xml_unescape ¶
XML Unescaper¶
Unescape XML entities.
Examples¶
>>> from rite.markup.xml import xml_unescape
>>> xml_unescape("&lt;tag&gt;value&lt;/tag&gt;")
'<tag>value</tag>'
Unescape XML entities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | XML-escaped text. | required |
Returns:
| Type | Description |
|---|---|
| str | Unescaped text. |
Examples:
>>> xml_unescape("&lt;root&gt;&lt;/root&gt;")
'<root></root>'
>>> xml_unescape("&amp;&apos;&quot;")
'&\'"'
Notes
Converts entities like &lt; back to <.
Uses xml.sax.saxutils.unescape.
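As with escaping, `xml.sax.saxutils.unescape` only knows `&lt;`, `&gt;`, and `&amp;` unless it is given extra entities. The sketch below demonstrates the standard-library call; the `QUOTE_ENTITIES` mapping is an assumption about how quote entities could be covered.

```python
from xml.sax.saxutils import unescape

# Extra entities beyond the built-in &lt;, &gt; and &amp;.
QUOTE_ENTITIES = {"&quot;": '"', "&apos;": "'"}

print(unescape("&lt;root&gt;&lt;/root&gt;"))          # <root></root>
print(unescape("&amp;&apos;&quot;"))                  # &&apos;&quot;
print(unescape("&amp;&apos;&quot;", QUOTE_ENTITIES))  # &'"
```

`unescape` replaces `&amp;` last, so a doubly escaped sequence such as `&amp;lt;` comes back as the literal text `&lt;` rather than `<`.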