{"id":"GHSA-x6wf-f3px-wcqx","summary":"xmldom has XML node injection through unvalidated processing instruction serialization","details":"## Summary\n\nThe package allows attacker-controlled processing instruction data to be serialized into XML without validating or neutralizing the PI-closing sequence `?\u003e`. As a result, an attacker can terminate the processing instruction early and inject arbitrary XML nodes into the serialized output.\n\n---\n\n## Details\n\nThe issue is in the DOM construction and serialization flow for processing instruction nodes.\n\nWhen `createProcessingInstruction(target, data)` is called, the supplied `data` string is stored directly on the node without validation. Later, when the document is serialized, the serializer writes PI nodes by concatenating `\u003c?`, the target, a space, `node.data`, and `?\u003e` directly.\n\nThat behavior is unsafe because processing instructions are a syntax-sensitive context. The closing delimiter `?\u003e` terminates the PI. If attacker-controlled input contains `?\u003e`, the serializer does not preserve it as literal PI content. Instead, it emits output where the remainder of the payload is treated as live XML markup.\n\nThe same class of vulnerability was previously addressed for CDATA sections (GHSA-wh4c-j3r5-mjhp / CVE-2026-34601), where `]]\u003e` in CDATA data was handled by splitting. The serializer applies no equivalent protection to processing instruction data.\n\n---\n\n## Affected code\n\n**`lib/dom.js` — `createProcessingInstruction` (lines 2240–2246):**\n\n```js\ncreateProcessingInstruction: function (target, data) {\n    var node = new ProcessingInstruction(PDC);\n    node.ownerDocument = this;\n    node.childNodes = new NodeList();\n    node.nodeName = node.target = target;\n    node.nodeValue = node.data = data;\n    return node;\n},\n```\n\nNo validation is performed on `data`. Any string including `?\u003e` is stored as-is.\n\n**`lib/dom.js` — serializer PI case (line 2966):**\n\n```js\ncase PROCESSING_INSTRUCTION_NODE:\n    return buf.push('\u003c?', node.target, ' ', node.data, '?\u003e');\n```\n\n`node.data` is emitted verbatim. If it contains `?\u003e`, that sequence terminates the PI in the output\nstream and the remainder appears as active XML markup.\n\n**Contrast — CDATA (line 2945, patched):**\n\n```js\ncase CDATA_SECTION_NODE:\n    return buf.push(g.CDATA_START, node.data.replace(/]]\u003e/g, ']]]]\u003e\u003c![CDATA[\u003e'), g.CDATA_END);\n```\n\n---\n\n## PoC\n\n### Minimal (from @tlsbollei report, 2026-04-01)\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');\n\nconst doc = new DOMImplementation().createDocument(null, 'r', null);\ndoc.documentElement.appendChild(\n    doc.createProcessingInstruction('a', '?\u003e\u003cz/\u003e\u003c?q ')\n);\nconsole.log(new XMLSerializer().serializeToString(doc));\n// \u003cr\u003e\u003c?a ?\u003e\u003cz/\u003e\u003c?q ?\u003e\u003c/r\u003e\n//          ^^^^ injected \u003cz/\u003e element is active markup\n```\n\n### With re-parse verification (from @tlsbollei report)\n\n```js\nconst assert = require('assert');\nconst { DOMParser, XMLSerializer } = require('@xmldom/xmldom');\n\nconst doc = new DOMParser().parseFromString('\u003cr/\u003e', 'application/xml');\ndoc.documentElement.appendChild(doc.createProcessingInstruction('a', '?\u003e\u003cz/\u003e\u003c?q '));\nconst xml = new XMLSerializer().serializeToString(doc);\nassert.strictEqual(new DOMParser().parseFromString(xml, 'application/xml')\n    .getElementsByTagName('z').length, 1); // passes — z is a real element\n```\n\n---\n\n## Impact\n\nAn application that uses the package to build XML from untrusted input can be made to emit attacker-controlled elements outside the intended PI boundary. That allows the attacker to alter the meaning and structure of generated XML documents.\n\nIn practice, this can affect any workflow that generates XML and then stores it, forwards it, signs it, or hands it to another parser. Realistic targets include XML-based configuration, policy documents, and message formats where downstream consumers trust the serialized structure.\n\nAs noted by @tlsbollei: this is the same delimiter-driven XML injection bug class previously addressed by GHSA-wh4c-j3r5-mjhp for `createCDATASection()`. Fixing CDATA while leaving PI creation and PI serialization unguarded leaves the same standards-constrained issue open for another node type.\n\n---\n\n## Disclosure\n\nThis vulnerability was publicly disclosed at 2026-04-06T11:25:07Z via\n[xmldom/xmldom#987](https://github.com/xmldom/xmldom/pull/987), which was subsequently closed\nwithout being merged.\n\n---\n\n## Fix Applied\n\n\u003e **⚠ Opt-in required.** Protection is not automatic. Existing serialization calls remain\n\u003e vulnerable unless `{ requireWellFormed: true }` is explicitly passed. Applications that pass\n\u003e untrusted data to `createProcessingInstruction()` or mutate PI nodes with untrusted input\n\u003e (via `.data =` or `CharacterData` mutation methods) should audit all `serializeToString()`\n\u003e call sites and add the option.\n\n`XMLSerializer.serializeToString()` now accepts an options object as a second argument. When `{ requireWellFormed: true }` is passed, the serializer throws `InvalidStateError` before emitting any ProcessingInstruction node whose `.data` contains `?\u003e`. This check applies regardless of how `?\u003e` entered the node — whether via `createProcessingInstruction` directly or a subsequent mutation (`.data =`, `CharacterData` methods).\n\nOn `@xmldom/xmldom` ≥ 0.9.10, the serializer additionally applies the full W3C DOM Parsing §3.2.1.7 checks when `requireWellFormed: true`:\n\n1. **Target check**: throws `InvalidStateError` if the PI target contains a `:` character or is an ASCII case-insensitive match for `\"xml\"`.\n2. **Data Char check**: throws `InvalidStateError` if the PI data contains characters outside the XML Char production.\n3. **Data sequence check**: throws `InvalidStateError` if the PI data contains `?\u003e`.\n\nOn `@xmldom/xmldom` ≥ 0.8.13 (LTS), only the `?\u003e` data check (check 3) is applied. The target and XML Char checks are not included in the LTS fix.\n\n### PoC — fixed path\n\n```js\nconst { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');\n\nconst doc = new DOMImplementation().createDocument(null, 'r', null);\ndoc.documentElement.appendChild(doc.createProcessingInstruction('a', '?\u003e\u003cz/\u003e\u003c?q '));\n\n// Default (unchanged): verbatim — injection present\nconst unsafe = new XMLSerializer().serializeToString(doc);\nconsole.log(unsafe);\n// \u003cr\u003e\u003c?a ?\u003e\u003cz/\u003e\u003c?q ?\u003e\u003c/r\u003e\n\n// Opt-in guard: throws InvalidStateError before serializing\ntry {\n  new XMLSerializer().serializeToString(doc, { requireWellFormed: true });\n} catch (e) {\n  console.log(e.name, e.message);\n  // InvalidStateError: The ProcessingInstruction data contains \"?\u003e\"\n}\n```\n\nThe guard catches `?\u003e` regardless of when it was introduced:\n\n```js\n// Post-creation mutation: also caught at serialization time\nconst pi = doc.createProcessingInstruction('target', 'safe data');\ndoc.documentElement.appendChild(pi);\npi.data = 'safe?\u003e\u003cinjected/\u003e';\nnew XMLSerializer().serializeToString(doc, { requireWellFormed: true });\n// InvalidStateError: The ProcessingInstruction data contains \"?\u003e\"\n```\n\n### Why the default stays verbatim\n\nThe W3C DOM Parsing and Serialization spec §3.2.1.3 defines a `require well-formed` flag whose **default value is `false`**. With the flag unset, the spec explicitly permits serializing PI data verbatim. This matches browser behavior: Chrome, Firefox, and Safari all emit `?\u003e` in PI data verbatim by default without error.\n\nUnconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in `requireWellFormed: true` flag allows applications that require injection safety to enable strict mode without breaking existing code.\n\n### Residual limitation\n\n`createProcessingInstruction(target, data)` does not validate `data` at creation time. The WHATWG DOM spec (§4.5 step 2) mandates an `InvalidCharacterError` when `data` contains `?\u003e`; enforcing this check unconditionally at creation time is a breaking change and is deferred to a future breaking release.\n\nWhen the default serialization path is used (without `requireWellFormed: true`), PI data containing `?\u003e` is still emitted verbatim. Applications that do not pass `requireWellFormed: true` remain exposed.","aliases":["CVE-2026-41675"],"modified":"2026-05-08T20:26:37.301909Z","published":"2026-04-22T20:17:58Z","related":["CGA-q8q8-3c7r-m9jv"],"database_specific":{"cwe_ids":["CWE-91"],"github_reviewed_at":"2026-04-22T20:17:58Z","github_reviewed":true,"nvd_published_at":"2026-05-07T04:16:33Z","severity":"HIGH"},"references":[{"type":"WEB","url":"https://github.com/xmldom/xmldom/security/advisories/GHSA-x6wf-f3px-wcqx"},{"type":"ADVISORY","url":"https://nvd.nist.gov/vuln/detail/CVE-2026-41675"},{"type":"WEB","url":"https://github.com/xmldom/xmldom/commit/7207a4b0e0bcc228868075ed991665ef9f73b1c2"},{"type":"PACKAGE","url":"https://github.com/xmldom/xmldom"},{"type":"WEB","url":"https://github.com/xmldom/xmldom/releases/tag/0.8.13"},{"type":"WEB","url":"https://github.com/xmldom/xmldom/releases/tag/0.9.10"}],"affected":[{"package":{"name":"@xmldom/xmldom","ecosystem":"npm","purl":"pkg:npm/%40xmldom/xmldom"},"ranges":[{"type":"SEMVER","events":[{"introduced":"0"},{"fixed":"0.8.13"}]}],"database_specific":{"source":"https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/04/GHSA-x6wf-f3px-wcqx/GHSA-x6wf-f3px-wcqx.json"}},{"package":{"name":"@xmldom/xmldom","ecosystem":"npm","purl":"pkg:npm/%40xmldom/xmldom"},"ranges":[{"type":"SEMVER","events":[{"introduced":"0.9.0"},{"fixed":"0.9.10"}]}],"database_specific":{"source":"https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/04/GHSA-x6wf-f3px-wcqx/GHSA-x6wf-f3px-wcqx.json"}},{"package":{"name":"xmldom","ecosystem":"npm","purl":"pkg:npm/xmldom"},"ranges":[{"type":"SEMVER","events":[{"introduced":"0"},{"last_affected":"0.6.0"}]}],"database_specific":{"source":"https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/04/GHSA-x6wf-f3px-wcqx/GHSA-x6wf-f3px-wcqx.json"}}],"schema_version":"1.7.5","severity":[{"type":"CVSS_V4","score":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N"}]}