GTK3 app UI updates get stuck in macOS

Dear community,

I have a very strange issue with my pod-ui app that only shows up on macOS. This is a cross-platform macOS/Linux/Windows GTK3 app written in Rust that talks MIDI to an obscure piece of guitar tech. First, it detects the Line6 modelling amp on the MIDI line and requests a full dump from it. It then receives 128 programs over MIDI one by one and populates internal buffers with them, updating the UI (program names are shown on buttons) as the programs are received.

All this works fine on Linux and Windows and used to work fine on macOS. The app used to be a plain GTK application, but was recently converted to a GtkApplication, and that is when the problem became noticeable. The pod-ui project issue contains videos of the two app versions “side-by-side” and how they behave.

I even went as far as reimplementing g_application_run(), which, to my surprise, helped on my rather old Mac but had no effect on a newer, faster Mac.

Once the UI is stuck and not updating, the app still works - there’s a “preferences” button which does open a preferences dialog.

If I grab the window by the titlebar and try moving it around the desktop, after a little while it gets unstuck, the updates finish and all works as expected. Otherwise, it doesn’t get unstuck on its own no matter how much time goes by.

I’m really at a loss as to how to even start debugging this. Any pointers on what to try are welcome.

PS. The app is multi-threaded, but all GTK GUI updates are done as a part of a glib::MainContext::channel receiver attach-callback, i.e. a source registered on the GMainContext, running all in one thread.

Sincerely, liet_

Can you clarify what you mean by old Mac and new Mac?

Certainly,

the “old Mac” is an Intel Core i7 2015 MacBook Pro, running macOS 12.7.2,
the “new Mac” is an Apple M1 Pro 2021 MacBook Pro, running macOS 14.6.1

Both are running the same app, compiled on the “old Mac”, packaged with the same gtk3 3.24.43 library.

The fact that a custom main context loop helped was a fluke, as GTK is running the exact same code under the hood. I’ve updated the code to run on top of gtk-rs 0.18 and done more testing, and the UI gets stuck consistently, whether it is running g_application_run(), a custom main context loop, or even gtk_main().

Hi @liet,

As the application handles events like button clicks, I assume that the GLib main loop is spinning normally. You could test that by adding a timeout source with GLib.timeout_add_seconds_full which prints a message to the terminal.

Once the UI is stuck and not updating, the app still works - there’s a “preferences” button which does open a preferences dialog.

The button is part of the window, right? I can see that it’s in the titlebar, but it seems to be a client-side decoration (CSD) titlebar. So the app is still handling input events, but graphical updates are blocked. In that case, buttons should not highlight when you hover the cursor over them. If you want to test that all input handling keeps working, add a button to the UI that prints a message when clicked.

Finally: when launching your app with the environment variable GTK_DEBUG=interactive, does the inspector work when the app is stuck?

It may be that the window or the frame clock is frozen. Could you prepare a custom build of GTK? I can post a small patch here that prints some info.

Hi @lb90,

Thank you so much for your input.

Yes, the main loop is spinning normally. A timeout added with timeout_add_seconds_full does print a tick message to the terminal correctly.

Yes, the UI is responsive even after the graphics get stuck. The buttons in the titlebar indeed do not highlight, but they do respond to clicks (the preferences button fires the “app.preferences” action, which gets handled correctly). I can also verify that a simple button that prints a message to the console, as you suggested, works correctly. The UI controls also generate correct “changed”/“clicked”/“value-changed” events when they are touched, as I can see that the callbacks connected to those events get called.

To answer your last question, I’ll have to recompile GTK to get GTK_DEBUG=interactive working, as the one shipped with Homebrew doesn’t have this enabled by default. If you have a patch I can try, I’ll definitely apply it and report back.

Now that you mentioned the frame clock, I added a tick callback to the main window using gtk_widget_add_tick_callback and indeed, when the UI freezes, the ticks stop coming. After I shake the window awake, I can once again see the ticks coming. If I add a call to gdk_frame_clock_get_frame_counter, I see that no frames are dropped: if it froze with frame counter 96, the frame counter when it wakes up is 97.

Here’s a first patch:

diff --git a/gdk/gdkframeclockidle.c b/gdk/gdkframeclockidle.c
index 89f5823a72..512c6b6123 100644
--- a/gdk/gdkframeclockidle.c
+++ b/gdk/gdkframeclockidle.c
@@ -805,3 +805,12 @@ _gdk_frame_clock_idle_new (void)
 
   return GDK_FRAME_CLOCK (clock);
 }
+
+int
+gdk_frame_clock_idle_is_frozen (GdkFrameClock *clock)
+{
+  GdkFrameClockIdle *clock_idle = GDK_FRAME_CLOCK_IDLE (clock);
+  GdkFrameClockIdlePrivate *priv = clock_idle->priv;
+
+  return priv->freeze_count > 0;
+}
diff --git a/gdk/gdkwindow.c b/gdk/gdkwindow.c
index 62e0cf816f..8109d36bb2 100644
--- a/gdk/gdkwindow.c
+++ b/gdk/gdkwindow.c
@@ -1336,6 +1336,40 @@ sync_native_window_stack_position (GdkWindow *window)
     }
 }
 
+int
+gdk_frame_clock_idle_is_frozen (GdkFrameClock *clock);
+
+static gboolean
+gdk_window_is_toplevel_frozen (GdkWindow *window);
+
+typedef struct
+{
+  GdkWindow *window;
+  unsigned int counter;
+} check_window_data;
+
+static gboolean
+check_window_callback (gpointer user_data)
+{
+  check_window_data *data = (check_window_data*) user_data;
+  GdkWindow *impl = gdk_window_get_impl_window (data->window);
+  GdkFrameClock *frame_clock = (GdkFrameClock*) gdk_window_get_frame_clock (data->window);
+
+  g_print ("window #%u", data->counter);
+  g_print (" freeze count: %u", (unsigned int)impl->update_freeze_count);
+  g_print (" toplevel frozen: %d", (int)gdk_window_is_toplevel_frozen (data->window));
+
+  if (GDK_IS_FRAME_CLOCK_IDLE (frame_clock))
+    {
+      int frozen = gdk_frame_clock_idle_is_frozen (frame_clock);
+      g_print (" frame clock frozen: %d", frozen);
+    }
+
+  g_print ("\n");
+
+  return G_SOURCE_CONTINUE;
+}
+
 /**
  * gdk_window_new: (constructor)
  * @parent: (allow-none): a #GdkWindow, or %NULL to create the window as a child of
@@ -1545,6 +1579,17 @@ gdk_window_new (GdkWindow     *parent,
         }
     }
 
+  if (window->window_type == GDK_WINDOW_TOPLEVEL)
+    {
+      static unsigned int count = 0;
+
+      check_window_data *data = g_new0 (check_window_data, 1);
+      data->window = g_object_ref (window);
+      data->counter = ++count;
+
+      g_timeout_add_seconds (2, check_window_callback, data);
+    }
+
   return window;
 }

If the frame clock is not cycling then perhaps the CVDisplayLinkSource is not acting correctly. Look for gdk_frame_clock_freeze and gdk_frame_clock_thaw in gdk/quartz:

$ git grep -n 'gdk_frame_clock_freeze\|gdk_frame_clock_thaw'
gdkdisplay-quartz.c:163:      _gdk_frame_clock_thaw (frame_clock);
gdkwindow-quartz.c:870:  _gdk_frame_clock_freeze (frame_clock);

@lb90 I built GTK with the debug build type and your patch applied. The inspector works fine, and I can see that the buttons that were updated after the UI froze indeed have the correct labels.

After the UI gets stuck, I get the following statistics printed:

window #1 freeze count: 0 toplevel frozen: 1 frame clock frozen: 1

Once unstuck, I get this:

window #1 freeze count: 0 toplevel frozen: 0 frame clock frozen: 0
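For anyone repeating this test and wanting to scan a long log for these lines, here is a minimal C sketch that parses the format printed by the patch above. The names WindowStatus, parse_window_status and matches_stuck_sample are hypothetical helpers made up for this illustration; they are not part of GTK. A single matching sample may well be transient, so only the same reading on many consecutive samples would suggest a real stall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* One line as printed by check_window_callback in the debug patch. */
typedef struct {
  unsigned int window;        /* the "#N" counter of the toplevel      */
  unsigned int freeze_count;  /* value printed after "freeze count:"   */
  int toplevel_frozen;        /* value printed after "toplevel frozen:" */
  int clock_frozen;           /* value printed after "frame clock frozen:" */
} WindowStatus;

/* Hypothetical helper: parse one status line.
 * Returns true only if all four fields were read. A line without the
 * "frame clock frozen" part (the patch prints it only for idle frame
 * clocks) is reported as a parse failure. */
bool parse_window_status(const char *line, WindowStatus *out)
{
  assert(line != NULL && out != NULL);

  int n = sscanf(line,
                 "window #%u freeze count: %u toplevel frozen: %d"
                 " frame clock frozen: %d",
                 &out->window, &out->freeze_count,
                 &out->toplevel_frozen, &out->clock_frozen);
  return n == 4;
}

/* Hypothetical helper: true when a sample shows the same pattern that was
 * reported while the UI was stuck (toplevel and frame clock both frozen). */
bool matches_stuck_sample(const WindowStatus *s)
{
  return s->toplevel_frozen != 0 && s->clock_frozen != 0;
}
```

Feeding each captured line to parse_window_status and counting consecutive samples for which matches_stuck_sample returns true gives a rough way to spot the stall in a log without watching the terminal.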

I incorrectly assumed that I hadn’t updated the Homebrew gtk between building versions 1.3.0 (the one that doesn’t get stuck) and 1.4.0 (the one that gets stuck). Version 1.3.0 is packaged with gtk 3.24.37 and version 1.4.0 with gtk 3.24.38. Now I have 3.24.43, but by playing with DYLD_LIBRARY_PATH I can run the app with any of those, and indeed, without changing anything else, the app doesn’t get stuck anymore when run with gtk 3.24.37.

The only quartz-related change in 3.24.38 is this: [quartz] Convert frame_link, windows_awaiting_frame to GSList. (33fd9eb4) · Commits · GNOME / gtk · GitLab

Ok, this was fun. Another quartz-related change in 3.24.38 that I overlooked is [quartz] Pad both the content rect and the window width. (32e5c182) · Commits · GNOME / gtk · GitLab. Reverting this fixes my problem :exploding_head: Maybe someone else understands why?

At first, I followed the CV display link, as you suggested. However, when things got stuck, I saw the following:

  1. _gdk_quartz_display_add_frame_callback called, which calls gdk_display_link_source_unpause;
  2. gdk_quartz_display_frame_cb called, which processes all pending frame clocks;
  3. gdk_quartz_display_frame_cb called, nothing to process, calls gdk_display_link_source_pause;

At this point, there are no more calls to _gdk_quartz_display_add_frame_callback until I shake the window alive again.

With the change above reverted, I see _gdk_quartz_display_add_frame_callback called regularly without fail.
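The sequence above can be pictured with a small toy model in plain C. This is only a sketch of the behaviour as described in this thread, not GDK code: ToyDisplayLink and the toy_* functions are hypothetical names invented for the illustration, and the model assumes that the only thing that unpauses the source is a new frame request.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the GDK quartz implementation. */
typedef struct {
  bool paused;           /* true while the display link source is paused */
  int  pending_clocks;   /* frame clocks waiting for the next frame      */
  int  frames_delivered; /* frames handed out to frame clocks so far     */
} ToyDisplayLink;

/* Models a frame request: queue a frame clock and unpause the source. */
void toy_add_frame_callback(ToyDisplayLink *dl)
{
  dl->pending_clocks++;
  dl->paused = false;
}

/* Models one display link tick. Returns true if the callback ran at all. */
bool toy_frame_cb(ToyDisplayLink *dl)
{
  if (dl->paused)
    return false;                 /* a paused source never fires */

  if (dl->pending_clocks == 0)
    {
      dl->paused = true;          /* nothing to process: pause */
      return true;
    }

  dl->frames_delivered += dl->pending_clocks; /* process all pending clocks */
  dl->pending_clocks = 0;
  return true;
}

/* Replays steps 1 to 3 from the post, then lets extra_ticks display
 * refreshes go by with no new frame request. Returns frames delivered. */
int toy_replay_stall(int extra_ticks)
{
  ToyDisplayLink dl = { true, 0, 0 };

  toy_add_frame_callback(&dl);    /* step 1: request arrives, unpause    */
  toy_frame_cb(&dl);              /* step 2: pending clocks processed    */
  toy_frame_cb(&dl);              /* step 3: nothing to process, pause   */

  for (int i = 0; i < extra_ticks; i++)
    toy_frame_cb(&dl);            /* paused: these ticks do nothing      */

  assert(dl.paused);              /* stays paused until a new request    */
  return dl.frames_delivered;
}
```

In this model, toy_replay_stall returns 1 no matter how many extra ticks pass, which mirrors the observation that waiting never unsticks the UI: something has to ask for a new frame first, and moving the window appears to be what triggers that.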

Nice! :slightly_smiling_face:

Could you open an issue at Issues · GNOME / gtk · GitLab?

This issue is tracked in:
