GMainLoop freezes(?) after returning from nested loop when handling Wayland events

My program processes Wayland events in a GTK main loop. It does this by adding a GSource for the Wayland display fd using g_source_add_unix_fd(source, wl_display_get_fd(display), G_IO_IN | G_IO_ERR | G_IO_HUP);
Here is the full code for the GSources: https://github.com/sardemff7/libgwater/blob/master/wayland/libgwater-wayland.c

My program connects to Wayland events and runs them through the program’s event handler.

At one point it opens a modal GtkWindow in a nested GTK main loop by calling gtk_main(). When it calls gtk_main_quit() and returns to the outer loop, it stops handling the Wayland events, which is not intended. It seems that the GMainLoop freezes.

void clientCycleEventFilter ()
{
    do_stuff ();
    gtk_main_quit ();
}

void ClientCycle ()
{
    passdata.window = myWindowCreate ();
    eventFilterPush (display_info->xfilter, clientCycleEventFilter, &passdata);

    GWaterWaylandSource *source = g_water_wayland_source_new_for_display (NULL, screen_info->display_info->wayland_display);

    gtk_main ();
    eventFilterPop (display_info->xfilter);
}

int main (int argc, char **argv)
{
    wayland_display = wl_display_connect (NULL);

    gtk_init (&argc, &argv);

    registry = wl_display_get_registry (wayland_display);

    wl_registry_add_listener (registry, &registry_listener, myStruct);
    /* This binds to Wayland interfaces, which then feed events
       to the event filter */

    wl_display_roundtrip (wayland_display);
    wl_display_roundtrip (wayland_display);

    source = g_water_wayland_source_new_for_display (NULL, wayland_display);

    gtk_main ();
}

(It seemed that I had to add the GSource a second time when entering my nested loop.)

Here is the rest of the code: https://github.com/adlocode/xfwm4/blob/wayland/src/cycle.c#L557

How can I resolve this?

Does the same thing happen if you use the response signal on the dialog instead of running a nested loop?

First of all, you should probably avoid a nested main context iteration. Better try to stick to proper asynchronous operations (and yes, that is a bit painful often as you need extra functions/structures).

Then, the main context/loop is obviously not frozen. It is just your GSource that is being blocked, as it shouldn’t run recursively; have a look at g_source_set_can_recurse.

Sorry, I may have over-generalised it a bit. It’s not actually a GtkDialog, it’s an ordinary GtkWindow that updates itself in response to Wayland events.

Do you have a backtrace for where it’s stuck, i.e. is it actually hanging on a poll somewhere? I don’t know if this could be an issue with recursive sources because it seems you’re running GTK with the X11 backend and then managing the wayland connection yourself, right?

GTK is actually running on Wayland, the entrypoint is src/main-shell-client.c

How do I do a backtrace, is that by attaching gdb to the process and then running backtrace when it freezes?

If so, then it seems to look like this:

#0  __futex_abstimed_wait_common64
    (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x15e7a00) at futex-internal.c:57
#1  __futex_abstimed_wait_common
    (futex_word=futex_word@entry=0x15e7a00, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007f582cc2f78f in __GI___futex_abstimed_wait_cancelable64
    (futex_word=futex_word@entry=0x15e7a00, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0)
    at futex-internal.c:139
#3  0x00007f582cc31ed9 in __pthread_cond_wait_common
    (abstime=0x0, clockid=0, mutex=0x15e79a8, cond=0x15e79d8)
    at pthread_cond_wait.c:504
#4  ___pthread_cond_wait
    (cond=cond@entry=0x15e79d8, mutex=mutex@entry=0x15e79a8)
    at pthread_cond_wait.c:619
#5  0x00007f582e33ac0b in read_events (display=0x15e78c0)
    at ../src/wayland-client.c:1504
#6  wl_display_read_events (display=0x15e78c0) at ../src/wayland-client.c:1574
#7  0x0000000000447771 in _g_water_wayland_source_check (source=0x1736e00)
    at ../util/libgwater-wayland.c:82
#8  _g_water_wayland_source_check (source=source@entry=0x1736e00)
    at ../util/libgwater-wayland.c:70
#9  0x00007f582d049682 in g_main_context_check
    (context=0x1606020, max_priority=0, fds=<optimized out>, n_fds=<optimized out>) at ../glib/gmain.c:3999
#10 0x00007f582d09e09b in g_main_context_iterate.constprop.0
    (context=0x1606020, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4172
#11 0x00007f582d048853 in g_main_loop_run (loop=0x17f0ac0)
    at ../glib/gmain.c:4373
#12 0x00007f582d9b062d in gtk_main ()
    at /usr/src/debug/gtk3-3.24.30-4.fc35.x86_64/gtk/gtkmain.c:1329
#13 0x000000000040ed9e in main (argc=<optimized out>, argv=<optimized out>)
    at main-shell-client.c:1075

Interestingly, often my GtkWindow doesn’t open if I run it under gdb.

Is your application multi-threaded?

Yes, that backtrace is good: it shows the Wayland event loop deadlocking. I looked at the wayland branch, and it seems that function is creating a second Wayland source using g_water_wayland_source_new_for_display when one was already created here, which is likely causing the deadlock: main-shell-client.c#L1072
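Why a second source on the same wl_display can hang in wl_display_read_events() may be easier to see with a toy model. The code below is not libwayland code; it only mimics the reader counting that, as far as I understand the wl_display_prepare_read() documentation, makes read_events wait until every caller that announced a read has either read or cancelled. ToyDisplay and the toy_* functions are invented for the illustration.

```c
#include <stdbool.h>

/* Toy stand-in for the reader bookkeeping of a wl_display. */
typedef struct {
    int reader_count;   /* callers that announced an intent to read */
} ToyDisplay;

/* Models wl_display_prepare_read(): one more pending reader. */
void toy_prepare_read(ToyDisplay *display)
{
    display->reader_count++;
}

/* Models the start of wl_display_read_events(): the last reader does the
 * actual read, every other reader has to wait for it. Returns true if the
 * caller would have to wait. */
bool toy_read_events_would_block(ToyDisplay *display)
{
    display->reader_count--;
    return display->reader_count > 0;
}

/* GLib calls prepare() on every source before poll and check() on every
 * source afterwards, all in one thread. With n_sources sources on the same
 * display, would the first check() wait for a reader that can never run? */
bool first_check_would_block(int n_sources)
{
    ToyDisplay display = { 0 };

    for (int i = 0; i < n_sources; i++)
        toy_prepare_read(&display);     /* prepare() phase */

    return toy_read_events_would_block(&display);   /* first check() */
}
```

With one source the first check() proceeds; with two it waits on a reader that lives in the same thread and so never gets to run. That would also be consistent with the hang only showing up after the nested loop returns, assuming the first source was blocked as non-recursive while the nested loop ran, which leaves only the second one active until then.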

Also, it seems that both of these are unnecessary. When using GTK you do not need to add any additional GSources at all, or call a nested gtk_main, if you’ve detected that the display is a GdkWaylandDisplay, because in that case GDK is already managing the Wayland connection for you. As it is, the code is creating a second Wayland connection using wl_display_connect and then dispatching events on that separately from the GDK events, so it will not have any effect on a GTK dialog. If you need to implement Wayland client extensions and you want them to interact with GTK, you have to use its internal wl_display, which can be accessed through gdk_wayland_display_get_wl_display and gdk_wayland_display_query_registry.

The event filtering happening here will be an issue in wayland too, that will probably have to be refactored to use libweston’s grabs and not gdk grabs/filtering.

Edit: It doesn’t appear to be multithreaded, the server code is running in a different process from the client code it looks like.

Can you bind to non-standard/third-party Wayland protocols using GTK’s internal wl_display?

Can you still do stuff like:

void global_add (void *data, struct wl_registry *registry, uint32_t name,
                 const char *interface, uint32_t version)
{
    if (strcmp (interface, "my_interface") == 0) {
        myStruct = wl_registry_bind (registry, name, &my_interface, 2);
    }
}

struct wl_registry_listener registry_listener =
{
    .global = global_add,
    .global_remove = global_remove
};

int main ()
{
    wayland_display = gdk_wayland_display_get_wl_display (gdisplay);
    registry = wl_display_get_registry (wayland_display);
    wl_registry_add_listener (registry, &registry_listener, data);
}

I have tried doing global_add listener on gdk_wayland_display_get_wl_display() and doing gdk_wayland_display_query_registry () on my third-party protocols, and neither has worked. How do I bind to third-party Wayland protocols using GTK’s internal wl_display?

Possibly a call to wl_display_roundtrip is missing? The registry should work just like that, here is an example:

It seems to be segfaulting at wl_display_get_registry ().

afaict any operation on the wl_display seems to segfault.

Update: It was because I’d re-set the wl_display variable to NULL after I’d got the wl_display. I don’t mind it when it’s a silly mistake.
