CVE-2024-52595

lxml_html_clean is a project for HTML cleaning functionalities copied from `lxml.html.clean`. Prior to version 0.4.0, the HTML Parser in lxml does not properly handle context-switching for special HTML tags such as `<svg>`, `<math>` and `<noscript>`. This behavior deviates from how web browsers parse and interpret such tags. Specifically, content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-Site Scripting (XSS) attacks, compromising the security of users relying on lxml_html_clean in default configuration for sanitizing untrusted HTML content. Users employing the HTML cleaner in a security-sensitive context should upgrade to lxml 0.4.0, which addresses this issue. As a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent the exploitation of this vulnerability. Via `remove_tags`, one may specify tags to remove - their content is moved to their parents' tags. Via `kill_tags`, one may specify tags to be removed completely. Via `allow_tags`, one may restrict the set of permissible tags, excluding context-switching tags like `<svg>`, `<math>` and `<noscript>`.
Cross-site Scripting
ProviderTypeBase ScoreAtk. VectorAtk. ComplexityPriv. RequiredVector
NISTNIST
7.7 HIGH
NETWORK
HIGH
NONE
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:L/A:H
GitHub_MCNA
7.7 HIGH
NETWORK
HIGH
NONE
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:L/A:H
CISA-ADPADP
---
---
Base Score
CVSS 3.x
EPSS Score
Percentile: 27%
VendorProductVersion
fedoralovespythonlxml_html_clean
𝑥
< 0.4.0
𝑥
= Vulnerable software versions
Debian logo
Debian Releases
Debian Product
Codename
lxml-html-clean
sid
0.4.2-1
fixed
trixie
0.4.2-1
fixed
Ubuntu logo
Ubuntu Releases
Ubuntu Product
Codename
lxml-html-clean
plucky
needs-triage
oracular
needs-triage
noble
needs-triage
jammy
dne
focal
dne