I think the language of “pull” and “push” focus might be a little more clear to everyone.
Applications “pull” focus by requesting from the compositor, window manager, or X server that focus be set to one of their surfaces or windows.
Compositors and window managers “push” focus by setting it to an application’s surface or window.
Right now, all compositors and window managers are pushy. For Wayland compositors, this is easy because they are completely in control. Window managers can push by calling XSetInputFocus or by sending a WM_TAKE_FOCUS message. X clients can ignore the WM_TAKE_FOCUS message, but they don’t have any good reason to do so; in practice, the message is the same as a push.
Applications can pull focus by using xdg-activation, _NET_ACTIVE_WINDOW, or one of the two active input models on X. Whichever method is used, there is either a token or a timestamp corresponding to a user action, which indicates to the compositor, window manager, or X server that the pull is what the user wants.
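To make the token/timestamp idea concrete, here is a toy model in Python. It is only a sketch: `ToyCompositor`, `deliver_button_press`, and `request_focus` are names I made up for illustration, not part of any Wayland or X API, and a single increasing serial stands in for both xdg-activation tokens and X timestamps.

```python
class ToyCompositor:
    """Hypothetical focus arbiter; not a real Wayland or X interface."""

    def __init__(self):
        self.focused = None
        self._next_serial = 1
        self._last_action = {}   # window -> serial of its latest user action
        self._focus_serial = 0   # serial of the action behind the current focus

    def deliver_button_press(self, window):
        """Deliver a user action to a window and return its serial."""
        serial = self._next_serial
        self._next_serial += 1
        self._last_action[window] = serial
        return serial

    def request_focus(self, window, serial):
        """A pull: granted only if backed by a user action on that window
        that is newer than the action behind the current focus."""
        valid = (
            serial is not None
            and self._last_action.get(window) == serial
            and serial > self._focus_serial
        )
        if valid:
            self.focused = window
            self._focus_serial = serial
        return valid
```

In this model a pull with no serial is refused, a pull with the serial of the press the window just received is granted, and a pull that arrives with an older serial after focus has moved on is refused as stale.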
I talked about this in another way with the goal of supporting background window actions:
Now I’d like to break that down into two steps.
1. Change toolkits to pull focus for events that indicate the user wants to focus the window.
2. Get compositors and window managers to stop being so pushy.
If step 2 happens without step 1, focus handling is going to be messed up. An example of this on X is the Java toolkits that indicate they pull focus, but don’t actually do so. There was, as I recall, special case code in Metacity to deal with that.
If step 1 happens without step 2, nothing obvious changes. With today’s compositors and window managers, applications will already have focus when they get the events indicating the user wants to move focus. All that actually changes is the addition of whatever steps are needed to say “I do or don’t have focus and I do or do not want it.” This can happen without any change to toolkit APIs. Call this phase 1 of step 1.
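Here is a rough Python sketch of what phase 1 could look like inside a toolkit. Everything in it is hypothetical: `Phase1Toolkit` and the `request_focus` callback are stand-ins for whatever the toolkit's backend would really call (xdg-activation on Wayland, XSetInputFocus or _NET_ACTIVE_WINDOW on X).

```python
class Phase1Toolkit:
    """Hypothetical toolkit core: track focus, pull it only when missing."""

    def __init__(self, request_focus):
        self.has_focus = False
        # Callback taking the serial/timestamp of the triggering user action.
        self._request_focus = request_focus

    def on_focus_in(self):
        self.has_focus = True

    def on_focus_out(self):
        self.has_focus = False

    def on_button_press(self, serial):
        # "I don't have focus and I want it": pull, citing the user action.
        if not self.has_focus:
            self._request_focus(serial)
```

With a pushy compositor the focus-in arrives before the button press, so `on_button_press` never calls `request_focus` and nothing visibly changes. The pull only starts to matter once the compositor stops pushing.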
Note well that step 1 is within the parameters of existing protocols. It’s just a toolkit doing something it’s already allowed to do on both Wayland and X.
When phase 1 of step 1 is complete, step 2 can happen without messing up focus. Step 2 can almost be summarized as “ignore button presses in the client area”. The compositor still needs to know about the event so it knows when to allow a focus pull. If the compositor or window manager draws decorations, supports Alt-Tab window switching, or offers other ways of changing focus that do not involve button events in the client area, then it still needs to be pushy for those.
There’s probably a way to do step 2 even if not all toolkits have done phase 1 of step 1. Wayland protocols have version numbers, so there might be a way for clients and compositors to negotiate how pushy the compositor should be. So maybe step 2 can happen without step 1.
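As a sketch of the compositor side of step 2, here is a toy Python model. All the names (`LessPushyCompositor`, the `"client"` and `"decoration"` region strings, `pull_focus`) are invented for illustration and don't correspond to any real compositor's code. It assumes at least one window exists.

```python
class LessPushyCompositor:
    """Hypothetical compositor that no longer pushes focus on client-area clicks."""

    def __init__(self, windows):
        self.windows = list(windows)      # assumed non-empty
        self.focused = self.windows[0]
        self._serial = 0
        self._pull_allowed = {}           # window -> serial of the press permitting a pull

    def button_press(self, window, region):
        self._serial += 1
        if region == "decoration":
            # Clicks on decorations never reach the client: still pushy here.
            self.focused = window
            return None
        # Client area: leave focus alone, but remember the press so that
        # a later pull citing this serial can be allowed.
        self._pull_allowed[window] = self._serial
        return self._serial

    def alt_tab(self):
        # Window switching is also a push; the client is not consulted.
        i = self.windows.index(self.focused)
        self.focused = self.windows[(i + 1) % len(self.windows)]

    def pull_focus(self, window, serial):
        if serial is not None and self._pull_allowed.get(window) == serial:
            del self._pull_allowed[window]   # each press permits one pull
            self.focused = window
            return True
        return False
```

A client-area press on an unfocused window leaves focus where it was until that window calls `pull_focus` with the serial it received. Decoration clicks and Alt-Tab move focus immediately, as they do today.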
Phase 2 of step 1 is where the behavior of the GUI starts to change. It’s what makes background window actions possible. Phase 2 requires some API changes to all toolkits, as far as I can tell. (Actually, GNUstep should be an exception.) What’s needed is an API for a widget to indicate whether or not the button press it has received should lead to the application pulling focus. Most widgets would indicate focus should be pulled. A text field with a selection under the pointer would indicate not to pull focus yet, because the button press may start a drag. The same goes for an icon field with an icon under the pointer, or a list with a list item under the pointer. If there turns out to be no drag or other background window action, the application should still be able to pull focus based on the button press.
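To show the shape of the API I have in mind, here is a Python sketch. None of these names exist in any toolkit: `PressPolicy`, `press_policy`, and the `Toolkit` event handlers are all hypothetical, and the text field is reduced to a one-dimensional selection range.

```python
from enum import Enum


class PressPolicy(Enum):
    PULL_NOW = "pull now"   # the press should focus the window right away
    DEFER = "defer"         # the press may start a drag; decide later


class Widget:
    def press_policy(self, x, y):
        # Default for most widgets: a press means the user wants focus here.
        return PressPolicy.PULL_NOW


class TextField(Widget):
    def __init__(self, selection=None):
        self.selection = selection   # (start_x, end_x) in pixels, or None

    def press_policy(self, x, y):
        # A press on the selection may begin a drag, so don't pull focus yet.
        if self.selection and self.selection[0] <= x < self.selection[1]:
            return PressPolicy.DEFER
        return PressPolicy.PULL_NOW


class Toolkit:
    def __init__(self, pull_focus):
        self._pull_focus = pull_focus   # callback taking the press serial
        self._pending = None            # serial of a deferred press, if any

    def on_button_press(self, widget, x, y, serial):
        if widget.press_policy(x, y) is PressPolicy.PULL_NOW:
            self._pull_focus(serial)
        else:
            self._pending = serial

    def on_drag_start(self):
        # A background window action happened; the window stays unfocused.
        self._pending = None

    def on_button_release(self):
        # No drag after all: pull focus based on the original press.
        if self._pending is not None:
            self._pull_focus(self._pending)
            self._pending = None
```

The important detail is that the deferred pull still cites the serial of the original button press, so the compositor can tie it back to a real user action.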
I wrote bug reports and some code 19 years ago to get this process started, but consensus could not be reached on how this would work. Here’s one of the old bug reports:
It has been my hope for a few years now that the introduction of Wayland might lead to a solution, but drag and drop doesn’t seem to be too popular in Linux GUIs. Maybe that’s because it doesn’t yet work quite right.
Oh, one last thing. Focus-follows-mouse, sloppy focus, and similar focus models are unaffected by this. They don’t get the benefits of background window actions, but that’s a trade-off for using those focus models. They presumably have other benefits that click-to-focus users can’t enjoy.