Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input events dispatch to top-level frame #1847

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

sadym-chromium
Copy link
Contributor

@sadym-chromium sadym-chromium commented Oct 22, 2024

As discussed in w3c/webdriver-bidi#795, the actions should be dispatched from the top-level browsing context.


Preview | Diff

index.html Outdated Show resolved Hide resolved
index.html Outdated Show resolved Hide resolved
@sadym-chromium
Copy link
Contributor Author

@whimboo WDYT?

@OrKoN
Copy link
Contributor

OrKoN commented Oct 23, 2024

The following two things seem to be still missing:

  1. steps to convert the coordinates to the top-level browsing context coordinate space if they are frame scoped. Edit: unless this is covered by the implementation-defined bits?
  2. apply the same fix to other actions (I assume we do actually want to dispatch keyboard events through the top-level frame as well)

@whimboo
Copy link
Contributor

whimboo commented Oct 23, 2024

The following two things seem to be still missing:
1. steps to convert the coordinates to the top-level browsing context coordinate space if they are frame scoped. Edit: unless this is covered by the implementation-defined bits?

Yes, absolutely. By changing the way where we dispatch the events the coordinates should be exactly the same. Not taking care of offsets would cause quite a lot of regressions for those users who make use of actions a lot.

Therefore see:

web-platform-tests/wpt#48147
web-platform-tests/wpt#48123

Given that Chrome seems to already dispatch actions in the top-level browsing context the referenced tests are failing.

2. apply the same fix to other actions (I assume we do actually want to dispatch keyboard events through the top-level frame as well)

Yes, we would have to do it for all the pointer and wheel input sources.

Keyboard actions are more challenging because they could allow users to trigger shortcuts that access restricted browser features. This would require additional checks to determine what actions are permitted and which ones should be blocked. Maybe we should have a follow-up issue for it?

@sadym-chromium
Copy link
Contributor Author

  1. apply the same fix to other actions (I assume we do actually want to dispatch keyboard events through the top-level frame as well)

Aren't all the actions end up in perform implementation-specific action dispatch steps?

@sadym-chromium
Copy link
Contributor Author

@OrKoN @whimboo I added calculation of the offset relative to the parent browsing context, but IDK how well it will work

@sadym-chromium sadym-chromium requested a review from OrKoN October 23, 2024 13:38
Comment on lines +8293 to +8294
such that trusted events corresponding to the entries in
<var>list of events</var> are dispatched.
Copy link
Contributor Author

@sadym-chromium sadym-chromium Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we can "hand-wave" hear like "These steps must be equivalent to user trying to perform the given input device manipulations on context through the top-level browsing context."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@sadym-chromium sadym-chromium Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OrKoN @whimboo we can keep specification implementation-specific, just clarify the events are dispatched via top-level browsing context. I'm not sure if that approach would work with the events intercepted by the top-level frame.

@whimboo
Copy link
Contributor

whimboo commented Oct 23, 2024

@OrKoN @whimboo I added calculation of the offset relative to the parent browsing context, but IDK how well it will work

I think that we should have more wdspec tests (similar to mine for mouse) so we cover at least each input source (including sub sources) and verify implementations against. Did you get Chrome working for my example?

@OrKoN
Copy link
Contributor

OrKoN commented Oct 23, 2024

@sadym-chromium it looks like each action type dispatches on the context parameter https://w3c.github.io/webdriver/#dfn-dispatch-a-keydown-action

  1. apply the same fix to other actions (I assume we do actually want to dispatch keyboard events through the top-level frame as well)

Aren't all the actions end up in perform implementation-specific action dispatch steps?

that's right, overlooked that.

@OrKoN
Copy link
Contributor

OrKoN commented Oct 23, 2024

Keyboard actions are more challenging because they could allow users to trigger shortcuts that access restricted browser features. This would require additional checks to determine what actions are permitted and which ones should be blocked. Maybe we should have a follow-up issue for it?

so actually I am not sure, even with this change it is not quite defined what it means that the actions are dispatched to the top-level browsing context. For example, in Chrome that is the case but still would not give you access to browser shortcuts.
Perhaps we need a better definition of what the dispatch of an event means, perhaps hook into native event handles in https://www.w3.org/TR/uievents/#handle-native-mouse-move-id?

@whimboo
Copy link
Contributor

whimboo commented Oct 23, 2024

so actually I am not sure, even with this change it is not quite defined what it means that the actions are dispatched to the top-level browsing context. For example, in Chrome that is the case but still would not give you access to browser shortcuts.

Yes you are right. I mixed it up with the native event dispatching that I'm working on as well right now. In those cases we would have that particular issue but not when dispatching it in the content process of the top-level browsing context.

must be equivalent to performing the given input device manipulations
on <var>context</var>, such that trusted events corresponding to the
entries in <var>list of events</var>are dispatched.
dispatch steps</dfn> on a <var>context</var>, and a <var>list of events</var>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep browsing context indicating the context var type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I missed it.

index.html Outdated Show resolved Hide resolved
index.html Outdated Show resolved Hide resolved
@@ -8392,7 +8438,7 @@ <h3>Processing actions</h3>
</dd>
</dl>

<li><p>Return (<var>x</var>, <var>y</var>)
<li><p>Return (<var>x</var> + <var>parentOffsetLeft</var>, <var>y</var> + <var>parentOffsetTop</var>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the offset be applied to all origins or only the viewport? I think the origin pointer might already be in the top level context coordinates and not sure if the element origin does some adjustments already.

Copy link
Contributor Author

@sadym-chromium sadym-chromium Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the origin "pointer" to be relative to "context" param of "PerformActionsParameters" of the command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not seem to follow from the spec text since input sources are per top-level browsing context.

</li>
<li>
Let <var>navigable</var> be <var>context</var>'s <a>active document</a>'s
[=navigable/parent=].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[=navigable/parent=] is a property of a navigable and not the document. Did you mean to get the current navigable instead of a parent (e.g., https://html.spec.whatwg.org/#node-navigable)?

sadym-chromium and others added 2 commits October 24, 2024 10:08
Co-authored-by: Alex Rudenko <[email protected]>
Co-authored-by: Alex Rudenko <[email protected]>
Copy link
Member

@jgraham jgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that in principle this is the model we need to change to in order to work with modern browser designs.

However I think the change as written is insufficient, and makes the spec contradictory. In each action we still claim we're dispatching events to |context| which, afaict after these changes, is not the parent context but the actual child context. The spec is unfortunately rather badly written here, so the text you've modified is apparently normative, but exists in what is essentially an informative section describing the overall model.

I also suggest (perhaps as a followup) that we make it possible for specific actions to provide the context that is used for calculating origins (which must in all cases be an ancestor of the top-level command context). That makes it much easier to construct action chains that interact with both the top level document and iframes.

@whimboo
Copy link
Contributor

whimboo commented Dec 11, 2024

We should as well clarify what should happen if a frame gets closed by an action. Following actions in the same chain should still be dispatched even through they will reach some other document and elements? See the following wpt test as example: https://github.com/web-platform-tests/wpt/blob/master/input-events/input-events-spin-button-click-on-number-input-delete-document.html

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Input events dispatch to top-level frame.

The full IRC log of that discussion <AutomatedTester> Topic: Input events dispatch to top-level frame
<AutomatedTester> github: https://github.com//pull/1847
<jgraham> q+
<AutomatedTester> sadym: There is already a discussion that is in the PR. I am a bit stuck with which approach we should be doing here
<AutomatedTester> ack next
<AutomatedTester> jgraham: At the moment the spec say we pick and iframe and send events to/from that
<AutomatedTester> ... but we may want other data from all frames. e.g. if an overlay is over an iframe and click. We want to have the envet fro the overlay and then the Iframe
<AutomatedTester> ... and there might also be a case with what happens when the iframe disappears
<AutomatedTester> ... we should handle cancelling when frame goes and stop the propagation from the frame but items can still go that way
<AutomatedTester> ... in the future we should probably have a way doing calculations based off the iframe
<AutomatedTester> sadym: my first question: do we want to specify the calc the coords or dispatch to
<jgraham> q+
<AutomatedTester> ... and do that on the <missed what was said>
<orkon> q+
<AutomatedTester> ... and then do the calculations more precise and then do htat to the top level
<AutomatedTester> ack jgraham
<AutomatedTester> jgraham: yes... we need to work with how browsers actually work and then do that from the top/parent and let that go down to the correct place
<AutomatedTester> ... I feel like we agree on the model here
<AutomatedTester> ... the main issue is what happens when the iframe disappears
<AutomatedTester> ... we can either keep going or can fail
<whimboo> q+
<AutomatedTester> q+
<AutomatedTester> ack next
<AutomatedTester> orkon: I think the issue if the iframe disappears. I thought that was solved with the PR from whimboo .
<AutomatedTester> ... I think that if the the iframe disappears we should still continue sending the actions
<AutomatedTester> ... e.g. mouse down removes the iframe we should continue
<jgraham> q+
<AutomatedTester> ... back to sadym if we change to to the top level then the calculations could be a lot harder to do where the current way is already working
<AutomatedTester> ack next
<AutomatedTester> whimboo: a follow to the PR, I haven't done this
<orkon> PR I meant https://github.com//pull/1861
<AutomatedTester> ... I wanted to give a comment to jgraham if we have to continue then it might be good to handle both case (carry on and error)
<AutomatedTester> ... and we would have a default that is managed by an argument
<AutomatedTester> ack next
<tidoust> AutomatedTester: Initially, actions were "do as I say", not "do what I mean". actions.mousedown would assume that the element would be in the viewport. Little things like that. Actions should be above the glass. You would just be telling the coordinates and do the action. But you don't necessarily know what's underneath. If I do element.click,
<tidoust> behavior is different.
<tidoust> i/AutomatedTester:/scribe+ tidoust
<tidoust> AutomatedTester: For iframes, behavior has indeed always be different.
<AutomatedTester> ack next
<orkon> q+
<orkon> q-
<AutomatedTester> jgraham: Do we need to be precise in the spec? Yes definitely
<AutomatedTester> ... I know that there are parts that say browser specific but coordiniates is different and we all have the same on that
<AutomatedTester> ... my proposal for clients would send details from the top level traversible
<AutomatedTester> ... but clients could send them at iframe if they want but then handle the situation if it disappears
<orkon> q+
<AutomatedTester> ... I think we need to follow this up in the issues
<AutomatedTester> ack next
<AutomatedTester> orkon: I agree with the error if it doesn't still exist
<AutomatedTester> ... we could do the calculations at the beginning
<AutomatedTester> jgraham: I don't think we can beause if we have scroll then all the coords are out
<AutomatedTester> q?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants