Spec and WPT inconsistencies #239
Another issue @annevk : new URLPattern({ hostname: 'bad#hostname' }); should not throw but there is a WPT that validates that it throws.
This is wrongly implemented on Deno's URLPattern, Cloudflare's workerd and Chromium. |
Hmm, but |
If I understand it correctly: This doesn't fail, hence it shouldn't fail on URLPattern as well:
|
The host parser (specifically domain to ASCII with domain and false) strips all trailing values whenever it sees |
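For reference, here is a minimal sketch of the URL-side behavior this comment appears to describe. It is not the snippet from the original comment; it assumes a runtime with a WHATWG-compliant `URL` (for example a recent Node.js), and the helper name `setHostname` and the example base URL are made up for illustration.

```javascript
// Hypothetical helper (not from the thread): assign a hostname through the
// URL setter and report what the parser actually kept.
function setHostname(base, value) {
  const url = new URL(base);
  url.hostname = value; // the setter runs the URL parser in the hostname state
  return url.hostname;
}

// In the setter, '/', '?' and '#' end the host instead of raising an error,
// so everything from the delimiter onward is dropped.
const hostAfterHash = setHostname("https://example.com/", "bad#hostname");
const hostAfterSlash = setHostname("https://example.com/", "bad/hostname");
const hostAfterQuery = setHostname("https://example.com/", "bad?hostname");

console.log(hostAfterHash, hostAfterSlash, hostAfterQuery);
```

If URLPattern canonicalizes the hostname through the same setter path, `bad#hostname` would be accepted and canonicalized rather than rejected, which is the inconsistency with the WPT expectation raised above.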
Another test case is invalid: ada-url/ada@d17f000 If you run the following on Google Chrome, you'll get the following error:
In particular, the following should work, and does work according to the URL spec:
Therefore, this test case shouldn't fail. |
Going to see how many of these I can get through today. For this example: new URLPattern({ "protocol": "http", "port": "80 " }) that's not the relevant Chromium code, but instead that we use
I think this change is probably minor enough (especially since it only makes previously invalid patterns valid) that changing the implementation(s) to match the spec is okay. |
@jeremyroman I think there are more invalid cases like this. I've removed and updated the following test cases on Appreciate if you could take a look. They are mostly around |
Yeah, I'm in the process of looking into what you've said. For the port example, on further inspection the change I mention does address some whitespace (newlines and tabs) but not spaces. In the Chromium implementation, it also winds through ParsePortFromStringPosition which simply ignores any leading zeroes and any junk after the ASCII digit sequence, whose spec counterpart is here. |
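To make the port discussion concrete, here is a rough sketch of how the plain `URL` port setter treats junk after the digit sequence. It assumes a WHATWG-compliant `URL` implementation; the helper `setPort` and the base URL are illustrative only, and the sketch says nothing about what Chromium's `ParsePortFromStringPosition` does internally.

```javascript
// Hypothetical helper (not from the thread): assign a port through the URL
// setter and report what the parser actually kept.
function setPort(base, value) {
  const url = new URL(base);
  url.port = value; // the setter runs the URL parser in the port state
  return url.port;
}

// Anything after the leading ASCII digits stops the port parser in the
// setter, but it is not treated as an error.
const portTrailingSpace = setPort("http://example.com:1234/", "8080 ");
const portTrailingJunk = setPort("http://example.com:1234/", "8080stuff");

// Leading zeroes are part of the digit sequence; printed for inspection only.
const portLeadingZeroes = setPort("http://example.com:1234/", "0008080");

console.log(portTrailingSpace, portTrailingJunk, portLeadingZeroes);
```

A port other than 80 is used here on purpose: 80 is the default port for `http`, so the setter would store it as the empty string and hide the effect.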
For the hostname ones, the cases of /?# seem quite parallel to the port ones. The case of \ is quite weird -- during pattern parsing we can't tell for sure if the URL is special so treat it as not, but for interpreting the init dictionary to For the other bad characters, we have comments in Chromium linking to https://issues.chromium.org/u/0/issues/40124263. If it's just a matter of a (long-standing?) Chromium-specific bug I suppose we should probably test the standard behavior, though that kinda suggests it might be tough for us to fix which might motivate the spec changing. Not familiar enough with that bug yet to comment off the top of my head. |
I'm beginning to think that port and possibly also hostname canonicalization should be revisited in the spec, rather than in the implementations and tests. I've looked more at port, but there are other quirks of the current specified behavior. For instance, the pattern A version of that for ports might be:
Curious for your opinions as well as those of @sisidovski. But basically I think the spec might be the thing that ought to give, possibly with changes to the implementations and tests if what it changes to doesn't match them. |
@jeremyroman All canonicalize methods are basically running into the exact same issues as hostname, because the Chromium implementation does not call URL. Since the URL specification is a living specification, I don't see how URLPattern can be spec compliant while using URL. There are more test cases that are failing for pathname as well. I recommend having an "example" implementation to ensure that the behavior is the same in all implementations. In the current state, it's not possible to complete an implementation and be spec compliant when the only implementation does not follow the spec... On top of that, the existing web-platform tests are not self-explanatory and are in fact very cryptic. For example, I can't understand how the URLPattern spec ends up using a hacky solution in canonicalize_pathname. And I haven't found an answer to #240. I think we should have a call to understand what our options are, and proceed accordingly. I'm more than happy to help! FYI: I'm working on adding URLPattern to Ada (which is used by Node.js and CF Workers for URL), which will power the Node.js and Cloudflare Workers URLPattern implementations. |
Another inconsistency. There is a test in WPT that takes a pathname that starts with
But it should return The inconsistency comes from Chromium not following the spec. It should check for
|
As mentioned in #240, though, I think the spec may have the sense of the opaque path check inverted -- we should be shortening only non-opaque paths. |
Thank you for the discussion, all. @jeremyroman Regarding the port, stripping TAB/LF/CR when compiling a component makes sense to me. As for process port for init, I'm not sure we should introduce something new to the current algorithm, as it already uses the result of the basic URL parser, which is intuitive and consistent with URL. So maybe we can simply update the test not to throw an error first, then fix the implementation? And, the issue of the pathname starting with |
What is the issue with the URL Pattern Standard?
There is a web-platform test, one that Chrome even implements, that is not covered by the URLPattern spec.
This fails on Chromium due to this function validation: https://chromium.googlesource.com/chromium/src/+/main/extensions/common/url_pattern.cc#101
But other than canonicalizePort there is no place that actually validates the validity of the port, and canonicalizePort calls url parser setter which removes leading and trailing spaces which makes "80 " input valid.
Relevant WPT: https://github.com/web-platform-tests/wpt/blob/0c1d19546fd4873bb9f4147f0bbf868e7b4f91b7/urlpattern/resources/urlpatterntestdata.json#L1146