Strawman for GLib's type system improvements

ebassi · January 4, 2025, 12:09pm

Background: halting problem : The Mirror

Draft merge request: Draft: Revamp the GLib type system and slowly decommission GObject (!4451) · Merge requests · GNOME / GLib · GitLab

I’ve opened this topic for long form discussions that do not belong to social media, Matrix/IRC, or the merge request itself.

The goals of the work:

consolidate the type system as the core of GLib
cut through the stalemate in GObject’s design
improve the type system with concepts that have become mainstream in language and API design after 1997
provide a basis for a conceptual redesign of the core libraries and interaction with other languages
avoid a costly API/ABI break at the core of GNOME, by providing more flexibility for libraries going forward

ebassi · January 4, 2025, 12:11pm

Seeding the discussion, using @lwildberg questions on Mastodon:

[For] the properties and signals, why do you plan to have them in the type system? An interface like clonable and finalizable would work too, wouldn’t it?
For the GTuple type, wouldn’t it be a form of generic, but with arbitrary typed parameters? Like you can do MyTuple<int, string, int>. Having this feature for generics would also allow things like making GtkColumnView<int, string, int>, having a typed parameter for every column. Or is that too complicated? (btw I have found multiple cases where it could be used like that)
About the union type I am also not very sure which use cases should be supported and which not.

ebassi · January 4, 2025, 12:23pm

GSignalable and GPropertable?

Properties are core to types; signals, ostensibly, less, but they are central to the GNOME ecosystem. Both are building blocks, and they should be part of the central design of the type system.

Moving property and signal objects into the type system avoids having weird “initialise when the class is initialised” restrictions, alongside out of band storage that needs to be recursively locked to ensure thread safety: type registration happens under a lock already, and the lock is in the type registration code, instead of being embedded into the type system. Finally, for reflection purposes you only want to get the type, not instantiate a class or an instance, to gather metainformation about a class.

Unfortunately, we cannot add instance destruction in an ABI-compatible way, so we need an interface. Cloning is an add-on: you can get away with just move semantics.

Yes, it would; the roadmap in the merge request is not a sequential set of steps, so don’t assume that one depends on the other.

Discriminated union types are used all over the place. The basic use cases are Option<T> and Result<T, Error>—which would remove the need of having a GError out argument in all our runtime recoverable API—but if you look inside GTK alone you’ll find a few other such types where the content of the type is determined by an enumeration identifier.
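To make the Result<T, Error> idea concrete, here is a minimal plain-C sketch of a function that returns its outcome and payload together instead of using an error out argument. Every name in it (IntResult, parse_int, int_result_unwrap) is hypothetical and chosen for illustration; it does not use GLib and is not the API from the merge request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is GLib API. */
typedef enum {
  INT_RESULT_OK,
  INT_RESULT_ERR,
} IntResultTag;

typedef struct {
  IntResultTag tag;        /* discriminator: selects the valid payload */
  union {
    long value;            /* valid only when tag == INT_RESULT_OK */
    const char *message;   /* valid only when tag == INT_RESULT_ERR */
  } payload;
} IntResult;

/* Parse a base-10 integer; success and failure share one return value. */
static IntResult
parse_int (const char *str)
{
  char *end = NULL;

  errno = 0;
  long v = strtol (str, &end, 10);

  if (end == str || *end != '\0')
    return (IntResult) { .tag = INT_RESULT_ERR, .payload.message = "not a number" };
  if (errno == ERANGE)
    return (IntResult) { .tag = INT_RESULT_ERR, .payload.message = "out of range" };

  return (IntResult) { .tag = INT_RESULT_OK, .payload.value = v };
}

static bool
int_result_is_ok (const IntResult *res)
{
  return res->tag == INT_RESULT_OK;
}

/* Accessor that refuses to read the wrong payload. */
static long
int_result_unwrap (const IntResult *res)
{
  assert (res->tag == INT_RESULT_OK);
  return res->payload.value;
}
```

A caller checks int_result_is_ok() before reading the value, so there is no separate out argument that can be forgotten or left uninitialised.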

lwildberg · January 4, 2025, 12:43pm

About the discriminated unions, I was mostly thinking that there can be different ways to use them (simplified, no discriminator; think GObject classes instead of structs and properties instead of struct fields):
Like GResult:

union Foo {
int,
string
}

union Foo {
struct {
  int,
  string
},
struct {
  string,
  float
}
}

struct Foo {
union { int, string }
string
}

So how much flexibility should be allowed? Maybe there are simple solutions that cover all use cases, maybe some should be unsupported…

ebassi · January 4, 2025, 12:54pm

Union types are not C unions. Do not confuse the two.

The basic approach I have prototyped is:

typedef enum {
  G_RESULT_TYPE_OK,
  G_RESULT_TYPE_ERR,
} GResultType;

typedef struct {
  GUnion parent;  

  GResultType result_type;

  GError *error;
} GResult;

G_DEFINE_UNION_TYPE (GResult, g_result,
  // Offset of the tag
  G_STRUCT_OFFSET (GResult, result_type),
  // GType of the tag
  G_TYPE_RESULT_TYPE,
  // Number of states
  2,
  // First state: name, offset, type
  G_UNION_FIELD (value, -1, G_TYPE_INVALID),
  // Second state: name, offset, type
  G_UNION_FIELD (error, G_STRUCT_OFFSET (GResult, error), G_TYPE_ERROR)
)

The GResult API would then look like:

static inline bool
g_result_is_ok (const GResult *self)
{
  return g_union_get_union_type ((GUnion *) self) == G_RESULT_TYPE_OK;
}

static inline bool
g_result_is_err (const GResult *self)
{
  return g_union_get_union_type ((GUnion *) self) == G_RESULT_TYPE_ERR;
}

static inline const GError *
g_result_get_error (const GResult *self)
{
  return g_union_get_state ((GUnion *) self, G_RESULT_TYPE_ERR);
}

If we want to add a payload to a result, then we can register a new type:

typedef struct {
  GResult parent;

  char *str;
} GStringResult;

And add a g_string_result_get_string().

lwildberg · January 4, 2025, 12:57pm

So basically union types would be similar to classes with one property, which can have different types?

ebassi · January 4, 2025, 1:00pm

Union types are discriminated union types: a type that has multiple payloads discriminated by an enumeration field. I don’t understand why you’re trying to reduce them to something else, when they have a very specific meaning.

You can also use a C union inside them, for storage purposes, as long as you can describe the field as an offset so that the generic API can retrieve the value depending on the state.
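As a rough sketch of what "describe the field as an offset" could mean, the following plain C keeps the payloads in an ordinary C union and registers one offset per state, so that a generic accessor can find the payload that matches the tag. The names (UnionInfo, StateField, union_get_state, Shape) are hypothetical stand-ins and are not the GUnion API from the prototype.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the merge request's API. */
typedef enum { SHAPE_CIRCLE, SHAPE_RECT, SHAPE_N_STATES } ShapeKind;

typedef struct { float width, height; } Size;

typedef struct {
  int kind;               /* discriminator; holds a ShapeKind */
  union {                 /* plain C union, used only as storage */
    float radius;         /* payload for SHAPE_CIRCLE */
    Size size;            /* payload for SHAPE_RECT */
  } data;
} Shape;

/* One entry per state: a name and where its payload lives. */
typedef struct {
  const char *name;
  size_t offset;
} StateField;

typedef struct {
  size_t tag_offset;      /* where the discriminator lives */
  size_t n_states;
  const StateField *states;
} UnionInfo;

static const StateField shape_states[SHAPE_N_STATES] = {
  [SHAPE_CIRCLE] = { "radius", offsetof (Shape, data.radius) },
  [SHAPE_RECT]   = { "size",   offsetof (Shape, data.size) },
};

static const UnionInfo shape_info = {
  .tag_offset = offsetof (Shape, kind),
  .n_states = SHAPE_N_STATES,
  .states = shape_states,
};

/* Generic code reads the tag without knowing the concrete struct. */
static int
union_get_tag (const UnionInfo *info, const void *instance)
{
  return *(const int *) ((const char *) instance + info->tag_offset);
}

/* Returns the payload for `state`, or NULL if the instance is in another state. */
static const void *
union_get_state (const UnionInfo *info, const void *instance, int state)
{
  assert (state >= 0 && (size_t) state < info->n_states);

  if (union_get_tag (info, instance) != state)
    return NULL;

  return (const char *) instance + info->states[state].offset;
}
```

Both payloads share the same storage, yet the accessor only hands out a pointer when the tag matches the requested state.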

lwildberg · January 4, 2025, 1:09pm

OK, maybe I was more focused on the C-style unions; I think I understand it now. And I guess a discriminated union could still have properties and signals, right?

ebassi · January 4, 2025, 1:12pm

Yes, since it’s a typed instance.

lwildberg · January 4, 2025, 1:18pm

I see. My concern was mostly about putting things into the core type system that eventually cannot be changed anymore. Having them one layer above makes it possible to switch them out later, like right now with a new base type instead of GObject. Also modularity, but I see that people should actually be encouraged to use properties and signals (and how much can you actually do without them?), so that’s not really an issue, I guess.

Properties are kind of independent from the rest. They will always build on top of the core type system, but you can easily introduce a new property system without breaking the old one or other “systems”. Of course, things like property bindings won’t work anymore between new and old, but even for that some manual signal handlers would work. So that is why I was thinking they don’t necessarily need to be part of the type system. The type system does not depend on properties or signals.

lwildberg · January 4, 2025, 1:24pm

For reference about the opportunities of generics for GTK, here is an MR for the GTK bindings for Vala that introduced type-checked generics. It covers quite a lot, including properties, methods and signals. Only arbitrary typed parameters are not supported by Vala and are therefore not included in the MR. (It’s not merged yet and will need more work, but the bindings serve well as a reference.)

I would like to see this being possible just with GIR, and no binding specific extra data. But that is a whole other discussion I guess.

ebassi · January 4, 2025, 1:25pm

Down that road lies the false sense of security given by “modularity”: tons of locking, extremely complex defensive programming, and the potential to introduce undefined behaviour.

Instead of being wishy-washy about things, we should identify the core principles of the type system, and ensure that the exposed functionality is strongly tied to them. This also helps define the minimum amount of functionality to be exposed.

I would not worry too much about possible replacements: we went for nearly 30 years with GObject, I fully expect we’re going to move somewhere else in the next 30.

lwildberg · January 4, 2025, 1:47pm

I just went back to your blog post and saw this example:

typedef struct {
  GTypeInstance parent;

  ShapeKind kind;

  union {
    struct { Point origin; float side; };
    struct { Point origin; Size size; };
    struct { Point center; float radius; };
  } shape;
} Shape;

So this won’t actually be supported?

chergert · January 7, 2025, 1:24am

Are there any plans to allow control over the GTypeInstance allocator for a specific GType?

There might be cases where you have a type that cannot ref/unref but you still want to use GTypeInstance for all the bindings support. And yes, I’m thinking of stack based allocations here.

For example, if we had a way to query size/alignment and then an init_instance().
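A minimal sketch of that idea in plain C, assuming a hypothetical per-type descriptor that exposes the instance size, its alignment, and an in-place initialiser. TypeInfo, type_init_in_place and Point are illustrative names only; nothing like them exists in GLib today.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-type descriptor; not an existing GLib structure. */
typedef struct {
  const char *name;
  size_t instance_size;
  size_t instance_align;
  void (*init_instance) (void *storage);   /* set up caller-owned memory */
  void (*clear_instance) (void *storage);  /* release contents, not the memory */
} TypeInfo;

typedef struct {
  const TypeInfo *type;   /* plays the role of the type pointer in GTypeInstance */
  int x, y;
} Point;

static void point_init (void *storage);
static void point_clear (void *storage);

static const TypeInfo point_type = {
  "Point", sizeof (Point), alignof (Point), point_init, point_clear,
};

static void
point_init (void *storage)
{
  Point *p = storage;

  p->type = &point_type;
  p->x = 0;
  p->y = 0;
}

static void
point_clear (void *storage)
{
  Point *p = storage;

  p->type = NULL;   /* nothing to free; mark the storage as no longer typed */
}

/* Initialise an instance in storage the caller owns (stack, arena, ...).
 * There is no reference counting: the storage's lifetime is the caller's. */
static void *
type_init_in_place (const TypeInfo *type, void *storage, size_t storage_size)
{
  assert (storage_size >= type->instance_size);
  assert ((uintptr_t) storage % type->instance_align == 0);

  type->init_instance (storage);
  return storage;
}
```

A C caller that knows the concrete type can declare a Point on the stack and pass its address; a binding that only knows the type at run time would query the size and alignment first and provide suitably aligned storage.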

swilmet · January 13, 2025, 4:38am

Breaking the API of GObject is not possible, but breaking only the ABI would be: bump the soversion and require everything that depends on it to be recompiled. That would allow adding more padding for future expansion in structs, etc., and so give more flexibility to refactor GObject.
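For readers unfamiliar with the padding technique, here is a small illustrative C sketch: a public struct reserves unused slots, and a later version turns one slot into a real member without changing the struct's size or the offsets of existing members. WidgetClassV1 and WidgetClassV2 are made-up names standing in for two releases of the same struct.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Callback) (void);

/* "Version 1" of a hypothetical public class struct. */
typedef struct {
  Callback dispose;
  Callback reserved[4];   /* padding kept for future expansion */
} WidgetClassV1;

/* "Version 2": one reserved slot has become a real virtual function. */
typedef struct {
  Callback dispose;
  Callback snapshot;      /* new member, takes over the first reserved slot */
  Callback reserved[3];
} WidgetClassV2;

/* Size and existing offsets are unchanged, so code compiled against
 * version 1 still sees the layout it expects. */
static_assert (sizeof (WidgetClassV1) == sizeof (WidgetClassV2),
               "adding a member must not change the struct size");
static_assert (offsetof (WidgetClassV1, dispose) == offsetof (WidgetClassV2, dispose),
               "existing members must keep their offsets");
```

Once the reserved slots run out, the only way to get more is an ABI break of the kind described above.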

But I suppose that is not enough to fully modernize the object system, and fix the performance issues.

I think it’s normal to be a bit worried about having two object systems instead of one. There is a parallel to that with two graphical toolkits: GTK and Clutter.

Is it planned to migrate all the GTK API to the new object system? Would it be easy for apps to adapt?

ebassi · January 13, 2025, 9:25am

Bumping the ABI would only allow changing the size or layout of the GObject structures; it would do nothing about its fundamental design flaws, or the API. It would also require a “rebuild the world” downstream of GLib that not everyone is willing, or able, to perform. The result would be a lot of work for negligible results.

The original design of GObject called for multiple root objects; things got progressively moved into GObject because of maintenance burden, and because we did not have the benefit of seeing what other standard libraries in other languages were doing. The design of the type system predates a fair chunk of the C++ standardisation; the movement from GtkObject to GObject predates the C++11 standard library types, and the entirety of Rust.

That’s not really a parallel: GTK and Clutter had a whole windowing system abstraction, even if their base objects were GObject types. The fact that a ClutterActor was not a widget is not really why two toolkits were bad; after all, GtkWidget, today, is just a generic container for other widgets and render commands, which are GTypeInstance types but not GObject.

If anything, the design of Clutter not only informed the design of GTK, but also demonstrated that having more lightweight object types is a good goal.

In the end, GObject will remain for the foreseeable future; the end goal of this effort is to expand the type system to include things that, right now, are either not representable, or are limited to GObject.

GTK has already migrated to GTypeInstance for various types; adding more features, like properties, or being able to put instances on the stack, is driven, in part, by feedback coming from GTK. Having typed containers in GLib is a long-standing request from GStreamer, as a way to replace GValueArray.

There isn’t much to “adapt” inside applications.

swilmet · January 13, 2025, 12:46pm

Thank you for that detailed answer. I’ll see in practice how it turns out.

Since GLib releases new versions every 6 months, and all new API is directly considered stable, is there going to be an unstable API period for this work as an exception (e.g. one year or one year and a half)? That’s not what GLib currently does, but it would allow some time to test the new system well in the real world and find potential refinements.

Or, at least, merge this work early in the development cycle to have 5 months to test the new APIs.

Or develop the system as a separate library (with the same G namespace), and merge it into GLib once it’s ready.

(Perhaps not for the initial work of moving GType into GLib core, but for later steps)

ptomato · January 14, 2025, 4:18am

This would be a lot less abstract for me if there were some examples of how you’d use the new API in an app.

For example let’s say I have a GTK app. I guess all of my custom widgets would stay the same until I ported to a version of GTK that had non-GObject-based widgets, but I’d immediately be able to replace all of my non-widget GObjects with lightweight objects?

ebassi · January 14, 2025, 10:12am

“Immediately” may be overselling it, but: yes, the idea is that you could move away from GObject for your own types.

The main issue is interoperability with GObject; for instance, if you have a GListModel implementation displayed by a list widget, you’d still need to use GObject for the row data:

we cannot compatibly change the GListModel interface to return a GTypeInstance
property bindings depend on GObject’s concept of properties

On the other hand, you could use GTypeInstance for the data stored inside a GObject wrapper, and ideally the various typed container API (GVector, GHashMap, GSet, and any future types we add there) would provide a shared API for iteration and change notification.

In practice, porting over a library like GTK will take some time, not just because of mechanical changes, but also because of a redesign of its API patterns:

methods and functions that currently return a value and take a GError out argument would now return a GResult<T, GError> sum type
methods, functions, and properties that return or take a reference to an instance would explicitly use GRc<T> to signal their intent
methods, functions, and properties that take or return NULL would explicitly use GOption<T> to signal their intent
we’d be able to use collection types in the API, unlike the current state of using untyped data structures
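As an illustration of the "explicit option instead of NULL" pattern from the list above, here is a small plain-C sketch. StrOption, lookup and the helper functions are hypothetical names for this example and are not the proposed GOption<T> API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical optional string; absence is a state, not a NULL pointer. */
typedef struct {
  bool is_some;
  const char *value;   /* meaningful only when is_some is true */
} StrOption;

static StrOption
str_option_some (const char *value)
{
  assert (value != NULL);
  return (StrOption) { .is_some = true, .value = value };
}

static StrOption
str_option_none (void)
{
  return (StrOption) { .is_some = false, .value = NULL };
}

/* Callers must decide what happens when there is no value. */
static const char *
str_option_unwrap_or (StrOption opt, const char *fallback)
{
  return opt.is_some ? opt.value : fallback;
}

typedef struct {
  const char *key;
  const char *value;
} Entry;

/* The return type documents that the lookup may find nothing. */
static StrOption
lookup (const Entry *entries, size_t n_entries, const char *key)
{
  for (size_t i = 0; i < n_entries; i++)
    {
      if (strcmp (entries[i].key, key) == 0)
        return str_option_some (entries[i].value);
    }

  return str_option_none ();
}
```

With a signature like this, "may be absent" is visible in the type rather than in the documentation, which is the kind of intent that introspection data and language bindings could then pick up.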

This kind of re-design is going to take a while to shake out, and maybe it’ll go nowhere; but it’d still be a prerequisite if we ever decide to move the GNOME stack to another language, given that these concepts are prevalent everywhere.

ebassi · January 14, 2025, 10:16am

If this work gets merged—and it’s a big “if”, at the moment—then it’ll be merged piecemeal, to give time to people to digest it and incrementally work with it over multiple cycles.

No, this will definitely not happen. We have a proliferation of libraries already, and we cannot hijack the G namespace for an experiment. We’ve seen what happened with gobject-introspection, and its ~15-year history.