Commit

Editorial: improve fragment parsing
This makes the argument order consistent and corrects a false statement about what the XML fragment parser returns. It also generally improves alignment with Infra and other best practices, though it does not improve the actual integration with the parsers.
annevk authored Jan 6, 2025
1 parent 8d2829a commit 7536a8f
Showing 1 changed file with 88 additions and 135 deletions.
223 changes: 88 additions & 135 deletions source
@@ -114897,28 +114897,27 @@ enum <dfn enum>DOMParserSupportedType</dfn> {
<var>context</var> and a string <var>markup</var>, are:</p>

<ol>
<li><p>Let <var>algorithm</var> be the <span>HTML fragment parsing algorithm</span>.</p></li>
<li><p>Let <var>algorithm</var> be the <span>HTML fragment parsing algorithm</span>.</p></li>

<li><p>If <var>context</var>'s <span>node document</span> is an <span data-x="XML
documents">XML document</span>, then set <var>algorithm</var> to the <span>XML fragment parsing
algorithm</span>.</p></li>
<li><p>If <var>context</var>'s <span>node document</span> is an <span data-x="XML documents">XML
document</span>, then set <var>algorithm</var> to the <span>XML fragment parsing
algorithm</span>.</p></li>

<li><p>Let <var>new children</var> be the result of invoking <var>algorithm</var> given
<var>markup</var>, with <var data-x="concept-frag-parse-context">context</var> set to
<var>context</var>.</p></li>
<li><p>Let <var>newChildren</var> be the result of invoking <var>algorithm</var> given
<var>context</var> and <var>markup</var>.</p></li>

<li><p>Let <var>fragment</var> be a new <code>DocumentFragment</code> whose <span>node
document</span> is <var>context</var>'s <span>node document</span>.</p></li>
<li><p>Let <var>fragment</var> be a new <code>DocumentFragment</code> whose <span>node
document</span> is <var>context</var>'s <span>node document</span>.</p></li>

<li>
<p><span data-x="concept-node-append">Append</span> each <code>Node</code> in <var>new
children</var> to <var>fragment</var> (in <span>tree order</span>).</p>
<li>
<p>For each <var>node</var> of <var>newChildren</var>, in <span>tree order</span>: <span
data-x="concept-node-append">append</span> <var>node</var> to <var>fragment</var>.</p>

<p class=note>This ensures the <span>node document</span> for the new <span
data-x="node">nodes</span> is correct.</p>
</li>
<p class=note>This ensures the <span>node document</span> for the new <span
data-x="node">nodes</span> is correct.</p>
</li>

<li><p>Return <var>fragment</var>.</p></li>
<li><p>Return <var>fragment</var>.</p></li>
</ol>
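
A minimal sketch of the revised fragment parsing steps, to make the new argument order (context first, then markup) concrete. The `Document` and `Node` classes and the `html_parser` / `xml_parser` callables are hypothetical stand-ins rather than spec or DOM APIs. The sketch assumes that appending a node adopts it, and its descendants, into the parent's node document.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    # Hypothetical flag standing in for "is an XML document".
    is_xml: bool = False


@dataclass
class Node:
    name: str
    node_document: Document
    children: List["Node"] = field(default_factory=list)

    def append(self, child: "Node") -> None:
        # Assumption: appending adopts the child subtree into this node's document.
        _adopt(child, self.node_document)
        self.children.append(child)


def _adopt(node: Node, document: Document) -> None:
    node.node_document = document
    for child in node.children:
        _adopt(child, document)


# Both parsers take (context, markup), matching the consistent argument order.
Parser = Callable[[Node, str], List[Node]]


def fragment_parsing_steps(context: Node, markup: str,
                           html_parser: Parser, xml_parser: Parser) -> Node:
    algorithm = html_parser
    if context.node_document.is_xml:
        algorithm = xml_parser
    new_children = algorithm(context, markup)
    fragment = Node("#document-fragment", context.node_document)
    for node in new_children:
        # Appending is what corrects each new node's node document.
        fragment.append(node)
    return fragment
```

Because either parser may build its nodes in a scratch document, the append loop is the step that moves them into the node document of `context`.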

<p><code>Element</code>'s <dfn attribute for="Element"><code
@@ -134276,84 +134275,63 @@ console.assert(container.firstChild instanceof SuperP);

<h3>Parsing HTML fragments</h3>

<p>The following steps form the <dfn>HTML fragment parsing algorithm</dfn>. The algorithm
takes as input an <code>Element</code> node, referred to as the <dfn
data-x="concept-frag-parse-context"><var>context</var></dfn> element, which gives the context for
the parser, <var>input</var>, a string to parse, and an optional boolean
<var>allowDeclarativeShadowRoots</var> (default false). It returns a list of zero or more
nodes.</p>

<p class="note">Parts marked <dfn>fragment case</dfn> in algorithms in the parser section are
parts that only occur if the parser was created for the purposes of this algorithm. The algorithms have been annotated
with such markings for informational purposes only; such markings have no normative weight. If it
is possible for a condition described as a <span>fragment case</span> to occur even when the
parser wasn't created for the purposes of handling this algorithm, then that is an error in the
specification.</p>
<p>The <dfn>HTML fragment parsing algorithm</dfn>, given an <code>Element</code> node <dfn
data-x="concept-frag-parse-context"><var>context</var></dfn>, string <var>input</var>, and an
optional boolean <var>allowDeclarativeShadowRoots</var> (default false) is the following steps.
They return a list of zero or more nodes.</p>

<p class="note">Parts marked <dfn>fragment case</dfn> in algorithms in the <span>HTML
parser</span> section are parts that only occur if the parser was created for the purposes of this
algorithm. The algorithms have been annotated with such markings for informational purposes only;
such markings have no normative weight. If it is possible for a condition described as a
<span>fragment case</span> to occur even when the parser wasn't created for the purposes of
handling this algorithm, then that is an error in the specification.</p>

<ol>
<li>
<p>Create a new <code>Document</code> node, and mark it as being an <span data-x="HTML
documents">HTML document</span>.</p>
</li>
<li><p>Let <var>document</var> be a <code>Document</code> node whose <span
data-x="concept-document-type">type</span> is "<code data-x="">html</code>".</p></li>

<li>
<p>If the
<span>node document</span> of the <var data-x="concept-frag-parse-context">context</var> element is in
<span>quirks mode</span>, then let the <code>Document</code> be in <span>quirks mode</span>.
Otherwise, if the
<span>node document</span> of the <var data-x="concept-frag-parse-context">context</var> element is in
<span>limited-quirks mode</span>, then let the <code>Document</code> be in <span>limited-quirks
mode</span>. Otherwise, leave the <code>Document</code> in <span>no-quirks mode</span>.</p>
</li>
<li><p>If <var data-x="concept-frag-parse-context">context</var>'s <span>node document</span> is
in <span>quirks mode</span>, then set <var>document</var>'s <span
data-x="concept-document-mode">mode</span> to "<code data-x="">quirks</code>".</p></li>

<li><p>If <var>allowDeclarativeShadowRoots</var> is true, then set the <code>Document</code>'s
<span data-x="concept-document-allow-declarative-shadow-roots">allow declarative shadow
roots</span> to true.</p></li>
<li><p>Otherwise, if <var data-x="concept-frag-parse-context">context</var>'s <span>node
document</span> is in <span>limited-quirks mode</span>, then set <var>document</var>'s <span
data-x="concept-document-mode">mode</span> to "<code data-x="">limited-quirks</code>".</p></li>

<li>
<p>Create a new <span>HTML parser</span>, and associate it with the just created
<code>Document</code> node.</p>
</li>
<li><p>If <var>allowDeclarativeShadowRoots</var> is true, then set <var>document</var>'s <span
data-x="concept-document-allow-declarative-shadow-roots">allow declarative shadow roots</span> to
true.</p></li>

<li><p>Create a new <span>HTML parser</span>, and associate it with <var>document</var>.</p></li>

<li>
<p>Set the state of the <span>HTML parser</span>'s <span>tokenization</span> stage as
follows, switching on the <var data-x="concept-frag-parse-context">context</var> element:</p>

<dl class="switch">

<dt><code>title</code></dt>
<dt><code>textarea</code></dt>

<dd>Switch the tokenizer to the <span>RCDATA state</span>.</dd>


<dt><code>style</code></dt>
<dt><code>xmp</code></dt>
<dt><code>iframe</code></dt>
<dt><code>noembed</code></dt>
<dt><code>noframes</code></dt>

<dd>Switch the tokenizer to the <span>RAWTEXT state</span>.</dd>


<dt><code>script</code></dt>

<dd>Switch the tokenizer to the <span>script data state</span>.</dd>


<dt><code>noscript</code></dt>

<dd>If the <span>scripting flag</span> is enabled, switch the tokenizer to the <span>RAWTEXT
state</span>. Otherwise, leave the tokenizer in the <span>data state</span>.</dd>


<dt><code>plaintext</code></dt>

<dd>Switch the tokenizer to the <span>PLAINTEXT state</span>.</dd>


<dt>Any other element</dt>

<dd>Leave the tokenizer in the <span>data state</span>.</dd>
</dl>

@@ -134365,35 +134343,29 @@ console.assert(container.firstChild instanceof SuperP);
transitions.</p>
</li>

<li>
<p>Let <var>root</var> be a new <code>html</code> element with no attributes.</p>
</li>
<li><p>Let <var>root</var> be the result of <span data-x="create an element">creating an
element</span> given <var>document</var>, "<code data-x="">html</code>", and the <span>HTML
namespace</span>.</p></li>

<li>
<p>Append the element <var>root</var> to the <code>Document</code> node created
above.</p>
</li>
<li><p><span data-x="concept-node-append">Append</span> <var>root</var> to
<var>document</var>.</p></li>

<li>
<p>Set up the parser's <span>stack of open elements</span> so that it contains just the single
element <var>root</var>.</p>
</li>
<li><p>Set up the <span>HTML parser</span>'s <span>stack of open elements</span> so that it
contains just the single element <var>root</var>.</p></li>

<li>
<p>If the <var data-x="concept-frag-parse-context">context</var> element is a
<code>template</code> element, push "<span data-x="insertion mode: in template">in
template</span>" onto the <span>stack of template insertion modes</span> so that it is the new
<span>current template insertion mode</span>.</p>
</li>
<li><p>If <var data-x="concept-frag-parse-context">context</var> is a <code>template</code>
element, then push "<span data-x="insertion mode: in template">in template</span>" onto the
<span>stack of template insertion modes</span> so that it is the new <span>current template
insertion mode</span>.</p></li>

<li>
<p>Create a start tag token whose name is the local name of <var
data-x="concept-frag-parse-context">context</var> and whose attributes are the attributes of
<var data-x="concept-frag-parse-context">context</var>.</p>

<p>Let this start tag token be the start tag token of the <var
data-x="concept-frag-parse-context">context</var> node, e.g. for the purposes of determining
if it is an <span>HTML integration point</span>.</p>
<p>Let this start tag token be the start tag token of <var
data-x="concept-frag-parse-context">context</var>; e.g. for the purposes of determining if it is
an <span>HTML integration point</span>.</p>
</li>

<li>
@@ -134404,29 +134376,22 @@ console.assert(container.firstChild instanceof SuperP);
data-x="concept-frag-parse-context">context</var> element as part of that algorithm.</p>
</li>

<li>
<p>Set the parser's <span><code>form</code> element pointer</span> to the nearest node to the
<var data-x="concept-frag-parse-context">context</var> element that is a <code>form</code>
element (going straight up the ancestor chain, and including the element itself, if it is a
<code>form</code> element), if any. (If there is no such <code>form</code> element, the
<span><code>form</code> element pointer</span> keeps its initial value, null.)</p>
</li>
<li><p>Set the <span>HTML parser</span>'s <span><code>form</code> element pointer</span> to the
nearest node to <var data-x="concept-frag-parse-context">context</var> that is a
<code>form</code> element (going straight up the ancestor chain, and including the element
itself, if it is a <code>form</code> element), if any. (If there is no such <code>form</code>
element, the <span><code>form</code> element pointer</span> keeps its initial value,
null.)</p></li>

<li>
<p>Place the <var>input</var> into the <span>input stream</span> for the <span>HTML
parser</span> just created. The encoding <span
data-x="concept-encoding-confidence">confidence</span> is <i>irrelevant</i>.</p>
</li>
<li><p>Place the <var>input</var> into the <span>input stream</span> for the <span>HTML
parser</span> just created. The encoding <span
data-x="concept-encoding-confidence">confidence</span> is <i>irrelevant</i>.</p></li>

<li>
<p>Start the parser and let it run until it has consumed all the characters just inserted into
the input stream.</p>
</li>
<li><p>Start the <span>HTML parser</span> and let it run until it has consumed all the characters
just inserted into the input stream.</p></li>

<li>
<p>Return the child
nodes of <var>root</var>, in <span>tree order</span>.</p>
</li>
<li><p>Return <var>root</var>'s <span data-x="concept-tree-child">children</span>, in <span>tree
order</span>.</p></li>
</ol>

</div>
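
The setup decisions in the HTML fragment parsing algorithm can be sketched as three pure functions: the mode for the new document, the initial tokenizer state, and the `form` element pointer. This is an illustration only. The `Element` class is a hypothetical toy type, states and modes are plain strings, and the sketch switches on the local name alone, ignoring namespaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Toy element with only a local name and a parent link (hypothetical type)."""
    local_name: str
    parent: Optional["Element"] = None


def fragment_document_mode(context_document_mode: str) -> str:
    """Mode for the new document, taken from context's node document."""
    if context_document_mode == "quirks":
        return "quirks"
    if context_document_mode == "limited-quirks":
        return "limited-quirks"
    return "no-quirks"


def initial_tokenizer_state(context: Element, scripting_enabled: bool) -> str:
    """Tokenizer state to start in, switching on the context element."""
    name = context.local_name
    if name in ("title", "textarea"):
        return "RCDATA"
    if name in ("style", "xmp", "iframe", "noembed", "noframes"):
        return "RAWTEXT"
    if name == "script":
        return "script data"
    if name == "noscript":
        # Only raw text when the scripting flag is enabled.
        return "RAWTEXT" if scripting_enabled else "data"
    if name == "plaintext":
        return "PLAINTEXT"
    return "data"


def form_element_pointer(context: Element) -> Optional[Element]:
    """Nearest form element, walking up from context and including context."""
    node: Optional[Element] = context
    while node is not None:
        if node.local_name == "form":
            return node
        node = node.parent
    return None  # the pointer keeps its initial value, null
```

For example, `initial_tokenizer_state(Element("noscript"), False)` leaves the tokenizer in the data state, while the same call with scripting enabled selects RAWTEXT.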
@@ -134740,57 +134705,45 @@ console.assert(container.firstChild instanceof SuperP);

<h3 id="parsing-xhtml-fragments">Parsing XML fragments</h3>

<p>The <dfn>XML fragment parsing algorithm</dfn> either returns a <code>Document</code> or throws
a <span>"<code>SyntaxError</code>"</span> <code>DOMException</code>. Given a string
<var>input</var> and a context element <var data-x="concept-frag-parse-context">context</var>, the
algorithm is as follows:</p>
<p>The <dfn>XML fragment parsing algorithm</dfn> given an <code>Element</code> node <var
data-x="concept-frag-parse-context">context</var> and a string <var>input</var>, runs the
following steps. They return a list of nodes.</p>

<ol>
<li>
<p>Create a new <span>XML parser</span>.</p>
</li>
<li><p>Create a new <span>XML parser</span>.</p></li>

<li>
<p><span>Feed the
parser</span> just created the string corresponding to the start tag of the <var
data-x="concept-frag-parse-context">context</var> element, declaring
all the namespace prefixes that are in scope on that element in the DOM, as well as declaring
the default namespace (if any) that is in scope on that element in the DOM.</p>
<p><span>Feed the parser</span> just created the string corresponding to the start tag of <var
data-x="concept-frag-parse-context">context</var>, declaring all the namespace prefixes that are
in scope on that element in the DOM, as well as declaring the default namespace (if any) that is
in scope on that element in the DOM.</p>

<p>A namespace prefix is in scope if the DOM <code data-x="">lookupNamespaceURI()</code> method
on the element would return a non-null value for that prefix.</p>

<p>The default namespace is the namespace for which the DOM <code
data-x="">isDefaultNamespace()</code> method on the element would return true.</p>

<p class="note">No
<code data-x="">DOCTYPE</code> is passed to the parser, and therefore no external subset is
referenced, and therefore no entities will be recognized.</p>
</li>

<li>
<p><span>Feed the parser</span> just created the string <var>input</var>.</p>
<p class="note">No <code data-x="">DOCTYPE</code> is passed to the parser, and therefore no
external subset is referenced, and therefore no entities will be recognized.</p>
</li>

<li>
<p><span>Feed the parser</span> just created the string corresponding to the end tag of the <var
data-x="concept-frag-parse-context">context</var> element.</p>
</li>
<li><p><span>Feed the parser</span> just created the string <var>input</var>.</p></li>

<li>
<p>If there is an XML well-formedness or XML namespace well-formedness error, then throw a
<span>"<code>SyntaxError</code>"</span> <code>DOMException</code>.</p>
</li>
<li><p><span>Feed the parser</span> just created the string corresponding to the end tag of <var
data-x="concept-frag-parse-context">context</var>.</p></li>

<li>
<p>If the <span>document element</span> of the resulting <code>Document</code> has any sibling
nodes, then throw a <span>"<code>SyntaxError</code>"</span> <code>DOMException</code>.</p>
<li><p>If there is an XML well-formedness or XML namespace well-formedness error, then throw a
<span>"<code>SyntaxError</code>"</span> <code>DOMException</code>.</p></li>

<!-- https://software.hixie.ch/utilities/js/live-dom-viewer/?saved=1443 -->
</li>
<li><p>If the <span>document element</span> of the resulting <code>Document</code> has any
sibling nodes, then throw a <span>"<code>SyntaxError</code>"</span>
<code>DOMException</code>.</p></li>
<!-- This is technically redundant, but apparently it has gone wrong in the past:
https://software.hixie.ch/utilities/js/live-dom-viewer/?saved=1443 -->

<li><p>Return the child nodes of the <span>document element</span> of the resulting
<code>Document</code>, in <span>tree order</span>.</p></li>
<li><p>Return the resulting <code>Document</code> node's <span>document element</span>'s <span
data-x="concept-tree-child">children</span>, in <span>tree order</span>.</p></li>
</ol>

</div>
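
A rough sketch of the XML fragment parsing steps using Python's `xml.dom.minidom`, showing that the result is a list of nodes rather than a `Document`. Several things here are assumptions: the caller supplies the in-scope namespaces as a dictionary (an empty key denotes the default namespace), the context element is identified by its qualified name only, and Python's built-in `SyntaxError` stands in for the "`SyntaxError`" `DOMException`.

```python
from typing import Dict, List
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr


def _xmlns_attribute(prefix: str, uri: str) -> str:
    # An empty prefix declares the default namespace.
    name = "xmlns:" + prefix if prefix else "xmlns"
    return " " + name + "=" + quoteattr(uri)


def xml_fragment_parse(context_qname: str,
                       in_scope_namespaces: Dict[str, str],
                       markup: str) -> List[minidom.Node]:
    declarations = "".join(
        _xmlns_attribute(prefix, uri)
        for prefix, uri in in_scope_namespaces.items()
    )
    # Feed the start tag, then the input, then the end tag.
    wrapped = (
        "<" + context_qname + declarations + ">"
        + markup
        + "</" + context_qname + ">"
    )
    try:
        document = minidom.parseString(wrapped)
    except ExpatError as error:
        # Well-formedness errors surface as ExpatError.
        raise SyntaxError(str(error)) from error
    # The document element must have no siblings.
    if len(document.childNodes) != 1:
        raise SyntaxError("document element has sibling nodes")
    # Return the children of the document element, in tree order.
    return list(document.documentElement.childNodes)
```

Since no `DOCTYPE` is fed to the parser, entity references beyond the predefined XML ones are expected to fail as well-formedness errors in this sketch.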
