Consider using sectioning elements for the HTML spec itself #5649
Comments
I would be happy to implement the sections in the way that is implied by heading elements according to the spec. Advantages
My suggestion would be to do this in the source itself. If you give me the go-ahead, that is what I will do. FYI: this would be my first issue on this standard.
@mithrayls sorry for the delay in responding! I'm excited that you're interested in tackling this. We'd love your help. I'd be happy to have you do this in the source itself. However, I'd feel most comfortable if you did it in an automated fashion somehow. It would be much easier for any reviewer to audit a script, than it would be for them to audit the hundreds of lines of diffs (all of which are just adding P.S.
To be clear, not using |
My natural starting point would be to use the parse5 library, iterate through the nodes, and surround the h tags with section tags wherever I hit a boundary of equal or greater importance. Another approach, which I have already used successfully to solve my own problem parsing the standard, involves a multiline regex, but that might be considered unprofessional ;-)
No worries about the delay. I noticed you blogging about the spammy pull requests! This was an awkward moment for me, as I knew I had this issue outstanding with you! :-p
[EDIT] I realize the source meets the standard it describes, but I think it would not meet best practices of semantic HTML? At any rate, section tags would make it easier to parse and locate sections.
[EDIT] I've tried both parse5 and jsdom to parse and then serialize, with and without passing the result through a prettifier, but the diff is very large due to what seem to be very minor changes, such as whitespace between tags. For this reason, it might be better to go with the regex idea, unless there is some kind of canonical prettification for the source that would let me change a parsed tree without creating a huge diff of irrelevant changes (the diff changes actually make the source harder to read by getting rid of helpful formatting). I think a canonical prettifier would be useful for making automated changes.
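The wrapping rule described in this comment (open a section at each heading, close it at the next heading of equal or greater importance) can be sketched as below. This is only an illustration under stated assumptions: `wrapHeadingsInSections` is a hypothetical name, and headings are located with a naive regex rather than with a real parser, so headings inside comments or scripts would be picked up too. Everything between insertion points is copied verbatim, which keeps the diff limited to the added tags.

```javascript
// Hypothetical helper, not part of any existing build tooling.
// Wraps each heading and the content after it in a <section>, closing open
// sections when a heading of equal or higher rank appears.
// Assumption: a naive regex finds heading start tags; a real implementation
// would more likely take offsets from a parser.
function wrapHeadingsInSections(source) {
  const headingStart = /<h([1-6])\b/gi;
  const openLevels = []; // ranks of currently open sections, outermost first
  let output = "";
  let cursor = 0;
  let match;
  while ((match = headingStart.exec(source)) !== null) {
    const level = Number(match[1]);
    // Copy the untouched text that precedes this heading.
    output += source.slice(cursor, match.index);
    // A heading of equal or greater importance ends the sections before it.
    while (openLevels.length > 0 && openLevels[openLevels.length - 1] >= level) {
      output += "</section>\n";
      openLevels.pop();
    }
    output += "<section>\n";
    openLevels.push(level);
    cursor = match.index;
  }
  output += source.slice(cursor);
  // Close whatever is still open at the end of the input.
  while (openLevels.length > 0) {
    output += "</section>\n";
    openLevels.pop();
  }
  return output;
}

const sample = "<h1>A</h1>\n<p>a</p>\n<h2>B</h2>\n<p>b</p>\n<h2>C</h2>\n<h1>D</h1>\n";
console.log(wrapHeadingsInSections(sample));
```

The sketch closes trailing sections at the very end of the input, so it suits a body fragment; for a full document the closing tags would need to land before `</body>` instead.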
Yeah, when I saw your pre-edit message this morning, I was afraid that parsing-then-serializing would cause too many diffs, since HTML generally does not roundtrip in that way. Although I'm interested in canonical prettification of the source at some point, I don't think it's a good idea to block this project on that.
What about using parse5, but instead of using its serialization, using its node location info to textually insert into the source string? I.e. something like this pseudocode:
const source = readSourceFile();
let output = source;
const parsed = parseIt(source);
// delta tracks how far earlier insertions have shifted later offsets
let delta = 0;
for (const h1 of parsed.getH1s()) {
  output = output.substring(0, h1.nodeLocation + delta) + "\n<section>\n" + output.substring(h1.nodeLocation + delta);
  delta += "\n<section>\n".length;
}
I'm not sure if that's workable, or if it's better than regexes.
Another route would be to use tools like parse5 to validate the output. In particular, I'm thinking of something that verifies that each hN is contained in N-deep section elements. That sounds pretty easy. And then you could use regexes or any other technique; we'd just need to hand-check the validation code, then we could trust it.
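The validation idea above (each hN sits inside exactly N nested section elements) might look roughly like the following. It is a sketch under assumptions, not the check that was eventually used: `findMisnestedHeadings` is a hypothetical name, and tags are tokenized with a naive regex to keep the example dependency-free, whereas the comment suggests a parser such as parse5. Attribute values containing `>` would confuse this version.

```javascript
// Hypothetical checker: reports headings whose rank differs from the number
// of <section> elements that enclose them.
// Assumption: regex tokenizing stands in for a real HTML parser.
function findMisnestedHeadings(source) {
  const tag = /<(\/?)(section|h[1-6])\b[^>]*>/gi;
  const problems = [];
  let depth = 0; // number of currently open <section> elements
  let match;
  while ((match = tag.exec(source)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2].toLowerCase();
    if (name === "section") {
      depth += isClosing ? -1 : 1;
      if (depth < 0) {
        problems.push({ index: match.index, message: "unmatched </section>" });
        depth = 0;
      }
    } else if (!isClosing) {
      // Only heading start tags are checked; the digit in hN is the rank.
      const level = Number(name[1]);
      if (level !== depth) {
        problems.push({
          index: match.index,
          message: `<${name}> found at section depth ${depth}`,
        });
      }
    }
  }
  if (depth !== 0) {
    problems.push({ index: source.length, message: `${depth} unclosed <section>` });
  }
  return problems;
}

console.log(findMisnestedHeadings("<section><h2>B</h2></section>"));
```

Because the checker is independent of how the tags were inserted, reviewers would only need to audit this one function, whichever insertion technique is chosen.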
The HTML singlepage spec currently has 10374 child nodes of <body>, because we just put a bunch of <hN>s and <p>s all together.
If we instead used sections (or even <div>s), we could get a few benefits:
Apparently 10K nodes crashes some accessibility tools, at least in Chromium. @alice mentioned this and might be able to link us to the Chromium bug.
It might allow us to introduce sticky headers; see "Make headers sticky?" (whatwg.org#320), although I don't think it solves the cross-links-getting-hidden problem mentioned there.
If this is a good idea, some thoughts on implementation:
We could consider doing this at build time, so that spec editors don't have to manually match </section> elements. However, that would probably be a pain to implement. It might be better to generate the change once in an automated fashion.
We could consider using all-<h1>s in the source, and then transforming them into <hN>s during the build. This would be pretty easy to implement and would actually improve the authoring experience, I think. (We can't keep them as all-<h1>s because of "Suggest adding a warning about outline algorithm", #83.)
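The all-<h1> idea could be sketched as a small build step like the one below. It assumes, as the issue does not spell out, that N is taken from the number of enclosing section elements and capped at 6; `renumberHeadings` is a hypothetical name, and the regex tokenizing is a simplification rather than how the spec's build tooling works.

```javascript
// Hypothetical build step: rewrites every <h1> to <hN>, where N is the number
// of enclosing <section> elements (assumed rule, clamped to the range 1..6).
// Attributes and all other text are left untouched.
function renumberHeadings(source) {
  let depth = 0; // number of currently open <section> elements
  return source.replace(/<(\/?)(section|h1)\b/gi, (whole, slash, name) => {
    if (name.toLowerCase() === "section") {
      depth += slash ? -1 : 1;
      return whole;
    }
    // Start and end tags of a heading see the same depth, so they stay paired.
    const level = Math.min(Math.max(depth, 1), 6);
    return `<${slash}h${level}`;
  });
}

const authored =
  '<section><h1 id="a">A</h1><section><h1>B</h1></section></section>';
console.log(renumberHeadings(authored));
```

With this arrangement editors would write only <h1> inside nested sections, and the heading rank would follow from the nesting.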