Click UI elements of focused window using keyboard

Unfortunately, my carpal tunnel problems are not getting much better, and I am thinking about ways to avoid using the mouse.

I really enjoy the extension “Vimium C” for Firefox, which lets you click a link without using the mouse. First, you press “F”. Then every link is captioned with one or more letters. Next, you type the caption of the link you want to click. See the following screenshot:

It would be really great to have a similar feature for Gnome.
I imagine it working like this: you assign the feature to an arbitrary key combination. When you press it, every clickable button or input field in the focused window is captioned as described above. In the same way, you click a captioned element by typing its caption.
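To make the captioning step concrete, here is a minimal sketch of how the captions could be generated. The function name `hint_labels` and the default home-row alphabet are my own assumptions, not something Vimium C or GNOME provides. All labels get the same length, so no caption is a prefix of another and typing one is never ambiguous.

```python
from itertools import islice, product


def hint_labels(count, alphabet="asdfghjkl"):
    """Return `count` distinct captions of equal length over `alphabet`.

    `hint_labels` and the default alphabet are hypothetical choices for
    illustration only.
    """
    if count <= 0:
        return []
    if len(alphabet) < 2:
        raise ValueError("alphabet needs at least two characters")

    # Smallest caption length that yields at least `count` combinations.
    length = 1
    while len(alphabet) ** length < count:
        length += 1

    # Take the first `count` combinations in lexicographic order.
    combos = product(alphabet, repeat=length)
    return ["".join(chars) for chars in islice(combos, count)]
```

With the default alphabet, up to 9 elements get one-letter captions and up to 81 get two-letter captions, which keeps typing short for most windows.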

First, it would be great to have some feedback from you about this idea.

Secondly, I have no knowledge of how to implement this and did not find any related resources. Is there any way to implement this? I’d be happy to try to code an extension if there is a way to do this.

I hope I reached the correct site for this topic, otherwise it would be great to have a recommendation where to get any help on this task.

Thanks a lot,
Best regards

While this has been a standard feature in Windows since Vista, I don’t see any such accessibility option on Linux, and I suppose that is due to how security is implemented and to the Wayland compositor…

I can only recommend that you look into other accessibility options, or invest in a good mouse such as an ergonomic vertical mouse or a trackball. Alternatively, get an ambidextrous mouse so you can use your left hand instead of injuring your right hand further, which gives it time to heal. There are also exercises you can do to help heal the carpal tunnel. Good luck!

The compositor has no idea about the contents of a window, so writing an extension for GNOME Shell won’t do you any good. Additionally, there’s no single UI toolkit in Linux, so you’d have to implement this feature in every major toolkit, and you’d still leave custom toolkits, or projects with their own UI, out.

If everything used the accessibility API, it would theoretically be possible to expose link objects to it and then have some form of high level entity assign a key shortcut to activate the link; in practice, it’s a lot of work.

Thanks for your replies.

I did check out the accessibility API and the wrapper library pyatspi2. The API contains the “Action” class, which could be used to perform clicks as far as I understood. Do you think it is possible to click UI elements using pyatspi2?
Is it true that the Orca screen reader basically uses the same API?
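For illustration, here is a minimal sketch of what such a click could look like. It assumes, as far as I understand pyatspi2, that an accessible object offers `queryAction()` (raising `NotImplementedError` when the Action interface is missing) and that the returned object has `nActions`, `getName(i)` and `doAction(i)`. The helper name `click` and the list of preferred action names are my own guesses, since the action names differ between toolkits.

```python
def click(accessible, preferred=("click", "press", "activate", "jump")):
    """Try to trigger a click-like action on an accessible object.

    Returns True if an action was invoked successfully, False otherwise.
    `click` and `preferred` are hypothetical; the names an application
    exposes depend on its toolkit.
    """
    try:
        # Assumed pyatspi2 behaviour: raises NotImplementedError when the
        # object does not implement the Action interface.
        action = accessible.queryAction()
    except NotImplementedError:
        return False

    # Collect the names of all actions the object exposes.
    names = [action.getName(i) for i in range(action.nActions)]

    # Invoke the first preferred action that the object actually offers.
    for wanted in preferred:
        if wanted in names:
            return bool(action.doAction(names.index(wanted)))
    return False
```

The function does not import pyatspi itself, so it can be tried on any object obtained from `pyatspi.Registry.getDesktop(0)` or on a simple stand-in object while experimenting.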

Did I understand this correctly: every application manually needs to expose all UI elements to the accessibility API? I wrote a quick test script, which found e.g. windows from Discord, gedit, gnome-terminal but not Firefox. Do you have any idea why Firefox is missing in the list?

import pyatspi
from pyatspi import XY_SCREEN

# Root of the accessibility tree; its children are the registered applications.
desktop = pyatspi.Registry.getDesktop(0)

def find_children(element, depth=1):
	"""Print each descendant of element with its screen position, at most 3 levels deep."""
	if depth > 3:
		return

	for m in element:
		if m is not None:
			point = m.get_position(XY_SCREEN)
			print(m, point.x, point.y)
			find_children(m, depth+1)

for application in desktop:
	print(application.name)

	# Iterate over the direct children of each application.
	for o in application:
		print(o)
		print(o.role, o.name)

		print("children:")
		find_children(o)

Edit: probably using LDTP2 is a better idea than using pyatspi2 directly…
