From 7536a8fa69ce6303c0c2ccbcb93fdc068d0b27f6 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Mon, 6 Jan 2025 16:18:29 +0100
Subject: [PATCH] Editorial: improve fragment parsing

This makes the argument order consistent and corrects a false statement
about what the XML fragment parser returns. It also generally improves
alignment with Infra and other best practices, though it does not improve
the actual integration with the parsers.
---
 source | 223 +++++++++++++++++++++++----------------------------------
 1 file changed, 88 insertions(+), 135 deletions(-)

diff --git a/source b/source
index 4b88e013a05..6dec90001bf 100644
--- a/source
+++ b/source
@@ -114897,28 +114897,27 @@ enum DOMParserSupportedType {
 context and a string markup, are:

    -
  1. Let algorithm be the HTML fragment parsing algorithm.

  2. +
  3. Let algorithm be the HTML fragment parsing algorithm.

  4. -
  5. If context's node document is an XML document, then set algorithm to the XML fragment parsing - algorithm.

  6. +
  7. If context's node document is an XML + document, then set algorithm to the XML fragment parsing + algorithm.

  8. -
  9. Let new children be the result of invoking algorithm given - markup, with context set to - context.

  10. +
  11. Let newChildren be the result of invoking algorithm given + context and markup.

  12. -
  13. Let fragment be a new DocumentFragment whose node - document is context's node document.

  14. +
  15. Let fragment be a new DocumentFragment whose node + document is context's node document.

  16. -
  17. -

    Append each Node in new - children to fragment (in tree order).

    +
  18. +

    For each node of newChildren, in tree order: append node to fragment.

    -

    This ensures the node document for the new nodes is correct.

    -
  19. +

    This ensures the node document for the new nodes is correct.

    + -
  20. Return fragment.

  21. +
  22. Return fragment.
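
Reviewer sketch (not part of the patch): the fragment parsing steps in the hunk above can be modelled roughly as follows. The classes `Document`, `Node`, and `DocumentFragment` and the two parser callables are hypothetical toy stand-ins, not DOM APIs. The sketch only illustrates the new argument order (context first, then markup) and why appending each node fixes up its node document.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Document:
    # Hypothetical stand-in: only records whether this is an XML document.
    is_xml: bool = False


@dataclass
class Node:
    name: str
    node_document: Optional[Document] = None


@dataclass
class DocumentFragment:
    node_document: Document
    children: List[Node] = field(default_factory=list)

    def append(self, node: Node) -> None:
        # Appending adopts the node, so its node document becomes the
        # fragment's node document.
        node.node_document = self.node_document
        self.children.append(node)


# A parser takes (context, markup), matching the argument order in the patch.
Parser = Callable[[Node, str], List[Node]]


def fragment_parsing_steps(
    context: Node, markup: str, html_parser: Parser, xml_parser: Parser
) -> DocumentFragment:
    # Default to the HTML algorithm; switch for XML documents.
    algorithm = html_parser
    if context.node_document.is_xml:
        algorithm = xml_parser
    new_children = algorithm(context, markup)
    fragment = DocumentFragment(node_document=context.node_document)
    for node in new_children:
        fragment.append(node)
    return fragment
```

In this toy model the parsers create nodes in a scratch document, and the append loop is what moves them into the context's node document.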

Element's Parsing HTML fragments -

The following steps form the HTML fragment parsing algorithm. The algorithm - takes as input an Element node, referred to as the context element, which gives the context for - the parser, input, a string to parse, and an optional boolean - allowDeclarativeShadowRoots (default false). It returns a list of zero or more - nodes.

- -

Parts marked fragment case in algorithms in the parser section are - parts that only occur if the parser was created for the purposes of this algorithm. The algorithms have been annotated - with such markings for informational purposes only; such markings have no normative weight. If it - is possible for a condition described as a fragment case to occur even when the - parser wasn't created for the purposes of handling this algorithm, then that is an error in the - specification.

+

The HTML fragment parsing algorithm, given an Element node context, string input, and an + optional boolean allowDeclarativeShadowRoots (default false) is the following steps. + They return a list of zero or more nodes.

+ +

Parts marked fragment case in algorithms in the HTML + parser section are parts that only occur if the parser was created for the purposes of this + algorithm. The algorithms have been annotated with such markings for informational purposes only; + such markings have no normative weight. If it is possible for a condition described as a + fragment case to occur even when the parser wasn't created for the purposes of + handling this algorithm, then that is an error in the specification.

    -
  1. -

    Create a new Document node, and mark it as being an HTML document.

    -
  2. +
  3. Let document be a Document node whose type is "html".

  4. -
  5. -

    If the - node document of the context element is in - quirks mode, then let the Document be in quirks mode. - Otherwise, if the - node document of the context element is in - limited-quirks mode, then let the Document be in limited-quirks - mode. Otherwise, leave the Document in no-quirks mode.

    -
  6. +
  7. If context's node document is + in quirks mode, then set document's mode to "quirks".

  8. -
  9. If allowDeclarativeShadowRoots is true, then set the Document's - allow declarative shadow - roots to true.

  10. +
  11. Otherwise, if context's node + document is in limited-quirks mode, then set document's mode to "limited-quirks".

  12. -
  13. -

    Create a new HTML parser, and associate it with the just created - Document node.

    -
  14. +
  15. If allowDeclarativeShadowRoots is true, then set document's allow declarative shadow roots to + true.

  16. + +
  17. Create a new HTML parser, and associate it with document.

  18. Set the state of the HTML parser's tokenization stage as follows, switching on the context element:

    -
    title
    textarea
    -
    Switch the tokenizer to the RCDATA state.
    -
    style
    xmp
    iframe
    noembed
    noframes
    -
    Switch the tokenizer to the RAWTEXT state.
    -
    script
    -
    Switch the tokenizer to the script data state.
    -
    noscript
    -
    If the scripting flag is enabled, switch the tokenizer to the RAWTEXT state. Otherwise, leave the tokenizer in the data state.
    -
    plaintext
    -
    Switch the tokenizer to the PLAINTEXT state.
    -
    Any other element
    -
    Leave the tokenizer in the data state.
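
Reviewer sketch (not part of the patch): the tokenizer state switch listed above amounts to a lookup on the context element's local name. The function name `initial_tokenizer_state` and its `scripting_enabled` parameter are assumptions made for illustration, and the states are represented as plain strings.

```python
RCDATA_ELEMENTS = frozenset({"title", "textarea"})
RAWTEXT_ELEMENTS = frozenset({"style", "xmp", "iframe", "noembed", "noframes"})


def initial_tokenizer_state(context_local_name: str, scripting_enabled: bool = False) -> str:
    """Return the tokenizer state to start in for a given context element."""
    if context_local_name in RCDATA_ELEMENTS:
        return "RCDATA"
    if context_local_name in RAWTEXT_ELEMENTS:
        return "RAWTEXT"
    if context_local_name == "script":
        return "script data"
    if context_local_name == "noscript":
        # Only treated as raw text when the scripting flag is enabled.
        return "RAWTEXT" if scripting_enabled else "data"
    if context_local_name == "plaintext":
        return "PLAINTEXT"
    # Any other element: leave the tokenizer in the data state.
    return "data"
```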
@@ -134365,35 +134343,29 @@ console.assert(container.firstChild instanceof SuperP);
 transitions.

  19. -
  20. -

    Let root be a new html element with no attributes.

    -
  21. +
  22. Let root be the result of creating an + element given document, "html", and the HTML + namespace.

  23. -
  24. -

    Append the element root to the Document node created - above.

    -
  25. +
  26. Append root to + document.

  27. -
  28. -

    Set up the parser's stack of open elements so that it contains just the single - element root.

    -
  29. +
  30. Set up the HTML parser's stack of open elements so that it + contains just the single element root.

  31. -
  32. -

    If the context element is a - template element, push "in - template" onto the stack of template insertion modes so that it is the new - current template insertion mode.

    -
  33. +
  34. If context is a template + element, then push "in template" onto the + stack of template insertion modes so that it is the new current template + insertion mode.

  35. Create a start tag token whose name is the local name of context and whose attributes are the attributes of context.

    -

    Let this start tag token be the start tag token of the context node, e.g. for the purposes of determining - if it is an HTML integration point.

    +

    Let this start tag token be the start tag token of context; e.g. for the purposes of determining if it is + an HTML integration point.

@@ -134404,29 +134376,22 @@ console.assert(container.firstChild instanceof SuperP);
 context element as part of that algorithm.

  37. -
  38. -

    Set the parser's form element pointer to the nearest node to the - context element that is a form - element (going straight up the ancestor chain, and including the element itself, if it is a - form element), if any. (If there is no such form element, the - form element pointer keeps its initial value, null.)

    -
  39. +
  40. Set the HTML parser's form element pointer to the + nearest node to context that is a + form element (going straight up the ancestor chain, and including the element + itself, if it is a form element), if any. (If there is no such form + element, the form element pointer keeps its initial value, + null.)

  41. -
  42. -

    Place the input into the input stream for the HTML - parser just created. The encoding confidence is irrelevant.

    -
  43. +
  44. Place the input into the input stream for the HTML + parser just created. The encoding confidence is irrelevant.

  45. -
  46. -

    Start the parser and let it run until it has consumed all the characters just inserted into - the input stream.

    -
  47. +
  48. Start the HTML parser and let it run until it has consumed all the characters + just inserted into the input stream.

  49. -
  50. -

    Return the child - nodes of root, in tree order.

    -
  51. +
  52. Return root's children, in tree + order.
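
Reviewer sketch (not part of the patch): two of the setup steps in the rewritten HTML fragment parsing algorithm are easy to model in isolation. The first is how the new document inherits its mode and its declarative shadow roots flag. The second is how the form element pointer is found. `ParserDocument`, `Element`, and both function names are hypothetical illustrations, not spec or DOM definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParserDocument:
    type: str = "html"
    mode: str = "no-quirks"
    allow_declarative_shadow_roots: bool = False


@dataclass
class Element:
    local_name: str
    parent: Optional["Element"] = None


def create_fragment_document(
    context_document_mode: str, allow_declarative_shadow_roots: bool = False
) -> ParserDocument:
    document = ParserDocument()
    if context_document_mode == "quirks":
        document.mode = "quirks"
    elif context_document_mode == "limited-quirks":
        document.mode = "limited-quirks"
    # Otherwise the document stays in no-quirks mode.
    if allow_declarative_shadow_roots:
        document.allow_declarative_shadow_roots = True
    return document


def form_element_pointer(context: Element) -> Optional[Element]:
    # Walk straight up the ancestor chain, starting with the context itself.
    node: Optional[Element] = context
    while node is not None:
        if node.local_name == "form":
            return node
        node = node.parent
    return None  # The pointer keeps its initial value, null.
```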

@@ -134740,22 +134705,18 @@ console.assert(container.firstChild instanceof SuperP);

Parsing XML fragments

-

The XML fragment parsing algorithm either returns a Document or throws - a "SyntaxError" DOMException. Given a string - input and a context element context, the - algorithm is as follows:

+

The XML fragment parsing algorithm given an Element node context and a string input, runs the + following steps. They return a list of nodes.

    -
  1. -

    Create a new XML parser.

    -
  2. +
  3. Create a new XML parser.

  4. -

    Feed the - parser just created the string corresponding to the start tag of the context element, declaring - all the namespace prefixes that are in scope on that element in the DOM, as well as declaring - the default namespace (if any) that is in scope on that element in the DOM.

    +

    Feed the parser just created the string corresponding to the start tag of context, declaring all the namespace prefixes that are + in scope on that element in the DOM, as well as declaring the default namespace (if any) that is + in scope on that element in the DOM.

    A namespace prefix is in scope if the DOM lookupNamespaceURI() method on the element would return a non-null value for that prefix.

    @@ -134763,34 +134724,26 @@ console.assert(container.firstChild instanceof SuperP);

    The default namespace is the namespace for which the DOM isDefaultNamespace() method on the element would return true.

    -

    No - DOCTYPE is passed to the parser, and therefore no external subset is - referenced, and therefore no entities will be recognized.

    -
  5. - -
  6. -

    Feed the parser just created the string input.

    +

    No DOCTYPE is passed to the parser, and therefore no + external subset is referenced, and therefore no entities will be recognized.

  7. -
  8. -

    Feed the parser just created the string corresponding to the end tag of the context element.

    -
  9. +
  10. Feed the parser just created the string input.

  11. -
  12. -

    If there is an XML well-formedness or XML namespace well-formedness error, then throw a - "SyntaxError" DOMException.

    -
  13. +
  14. Feed the parser just created the string corresponding to the end tag of context.

  15. -
  16. -

    If the document element of the resulting Document has any sibling - nodes, then throw a "SyntaxError" DOMException.

    +
  17. If there is an XML well-formedness or XML namespace well-formedness error, then throw a + "SyntaxError" DOMException.

  18. - - +
  19. If the document element of the resulting Document has any + sibling nodes, then throw a "SyntaxError" + DOMException.

  20. + -
  21. Return the child nodes of the document element of the resulting - Document, in tree order.

  22. +
  23. Return the resulting Document node's document element's children, in tree order.
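
Reviewer sketch (not part of the patch): the corrected statement, that the XML fragment parsing algorithm returns a list of nodes rather than a Document, can be illustrated with Python's standard `xml.dom.minidom`. This is a simplification under stated assumptions: the start tag, input, and end tag are concatenated into one string instead of being fed to the parser in three steps, the in-scope namespaces are passed in as a hypothetical `namespaces` mapping (prefix to URI, with the empty string standing for the default namespace), and `FragmentSyntaxError` stands in for a "SyntaxError" DOMException.

```python
from typing import Dict, List, Optional
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr


class FragmentSyntaxError(ValueError):
    """Hypothetical stand-in for a "SyntaxError" DOMException."""


def parse_xml_fragment(
    context_name: str, markup: str, namespaces: Optional[Dict[str, str]] = None
) -> List[minidom.Node]:
    # Build the start tag of the context element, declaring every namespace
    # prefix (and the default namespace, keyed by "") that is in scope.
    declarations = ""
    for prefix, uri in (namespaces or {}).items():
        attribute = f"xmlns:{prefix}" if prefix else "xmlns"
        declarations += f" {attribute}={quoteattr(uri)}"
    source = f"<{context_name}{declarations}>{markup}</{context_name}>"

    try:
        document = minidom.parseString(source)
    except ExpatError as error:
        # Well-formedness or namespace well-formedness error.
        raise FragmentSyntaxError(str(error)) from error

    # The document element must not have any sibling nodes.
    if len(document.childNodes) != 1:
        raise FragmentSyntaxError("document element has sibling nodes")

    # Return the document element's children, in tree order.
    return list(document.documentElement.childNodes)
```

A caller then gets back plain child nodes, for example `parse_xml_fragment("root", "<a/>text<b/>")` yields an element, a text node, and another element.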