Implementing GSettingsBackend: blocking reads

Hello GNOME/glib community!

We have been working on an alternative GSettings backend based on Elektra and we currently have a working but unfinished prototype.

The documentation of g_settings_backend_read() (as well as g_settings_backend_write_tree()) states that these calls will never block. Our current prototype does disk I/O in the read function, which strictly speaking violates the documentation, but it works quite well in our tests.
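
To make that concrete, our read vfunc currently looks roughly like this simplified sketch, where elektra_lookup_from_disk() is a placeholder for the actual Elektra lookup:

```c
/* Simplified sketch of our read vfunc. elektra_lookup_from_disk() is a
 * placeholder for the real Elektra lookup, which may touch the disk. */
static GVariant *
elektra_settings_backend_read (GSettingsBackend   *backend,
                               const gchar        *key,
                               const GVariantType *expected_type,
                               gboolean            default_value)
{
  if (default_value)
    return NULL; /* no backend-specific default: fall through */

  /* This call may block on disk I/O, which is what the documentation
   * of g_settings_backend_read() appears to forbid. */
  return elektra_lookup_from_disk (backend, key, expected_type);
}
```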

It is a bit unclear to me why g_settings_backend_write() does not have the same requirement to be non-blocking. Am I misunderstanding the docs?

May a compliant implementation perform blocking operations, or is it strictly necessary to implement everything in a non-blocking way? Afaik, dconf solves this by using a separate writer process. We’d like to avoid that in order to keep our architecture simple.

Regards,
Mihael / Elektra Initiative

Does Elektra operate on one file per backend instance, as a single-reader, single-writer? If so, that may be okay: it would be similar in behavior and performance to GKeyfileSettingsBackend, which simply uses blocking calls to g_file_load_contents() and g_file_replace_contents().
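
For reference, that pattern boils down to something like the following sketch (not the actual GKeyfileSettingsBackend code; load_and_save_example() is just an illustration):

```c
#include <gio/gio.h>

/* Sketch of the keyfile-style pattern: both calls below block, and
 * g_file_replace_contents() replaces the file atomically. */
static gboolean
load_and_save_example (GFile *file, GError **error)
{
  gchar *contents = NULL;
  gsize  length = 0;

  /* Blocking read of the whole file. */
  if (!g_file_load_contents (file, NULL, &contents, &length, NULL, error))
    return FALSE;

  /* ... mutate the contents in memory ... */

  /* Blocking, atomic replace (temp file + rename under the hood). */
  if (!g_file_replace_contents (file, contents, length, NULL, FALSE,
                                G_FILE_CREATE_NONE, NULL, NULL, error))
    {
      g_free (contents);
      return FALSE;
    }

  g_free (contents);
  return TRUE;
}
```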

Otherwise I would say you may want to refer to this section of the dconf README:

dconf has a partial client/server architecture. It uses D-Bus. The server is only involved in writes (and is not activated in the user session until the user modifies a preference). The service is stateless and can exit freely at any time (and is therefore robust against crashes). The list of paths that each process is watching is stored within the D-Bus daemon itself (as D-Bus signal match rules).

Reads are performed by direct access (via mmap) to the on-disk database which is essentially a hashtable. For this reason, dconf reads typically involve zero system calls and are comparable to a hashtable lookup in terms of speed. Practically speaking, in simple non-layered setups, dconf is less than 10 times slower than GHashTable.

Writes are not optimised at all. On some file systems, dconf-service will call fsync() for every write, which can introduce a latency of up to 100ms. This latency is hidden by the client libraries through a clever “fast” mechanism that records the outstanding changes locally (so they can be read back immediately) until the service signals that a write has completed.
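
To make the “zero system calls” point concrete: after a one-time open() and mmap(), every read is plain memory access. A minimal sketch (not dconf’s actual code):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch of the mmap read path: map the database once, then every
 * lookup is ordinary memory access with no further system calls. */
static const void *
map_database (const char *path, size_t *size_out)
{
  struct stat st;
  void *base;
  int fd = open (path, O_RDONLY);

  if (fd < 0 || fstat (fd, &st) < 0)
    {
      if (fd >= 0)
        close (fd);
      return NULL;
    }

  base = mmap (NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  close (fd); /* the mapping stays valid after close() */

  if (base == MAP_FAILED)
    return NULL;

  *size_out = (size_t) st.st_size;
  return base; /* hashtable lookups now read straight from this memory */
}
```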

Keeping the architecture simple may not be worth it if it means a performance regression compared to dconf for apps that store a lot of settings. Or is similar background-worker functionality already provided by libelektra?


Thank you for the reply!

Effectively it is multi-reader, single-writer. Elektra does not modify existing files, so reads are always safe; writes go to temporary files, which are then rename()d when the changes are committed. According to the docs, g_file_replace_contents() uses the same approach with atomic renames.
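
Roughly, the commit path looks like this sketch (error handling abbreviated; atomic_write() is just an illustration of the pattern, not our actual code):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch of the temp-file-plus-rename() pattern: readers see either
 * the old file or the new one, never a half-written state. */
static int
atomic_write (const char *path, const char *data, size_t len)
{
  char tmp[4096];
  int fd;

  snprintf (tmp, sizeof tmp, "%s.XXXXXX", path);
  fd = mkstemp (tmp);
  if (fd < 0)
    return -1;

  if (write (fd, data, len) != (ssize_t) len || fsync (fd) < 0)
    {
      close (fd);
      unlink (tmp);
      return -1;
    }
  close (fd);

  /* rename() is atomic on POSIX filesystems: this is the commit point. */
  if (rename (tmp, path) < 0)
    {
      unlink (tmp);
      return -1;
    }
  return 0;
}
```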

The main problem I was thinking of is much simpler: what if, e.g., a read() call blocks? I’m not sure how g_file_load_contents() works around this. The docs for g_input_stream_read() even state that it blocks during the read, which contradicts the g_settings_backend_read() documentation.

Unfortunately such a background worker does not exist yet. We would implement it only if strictly needed.
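
If it does become strictly needed, I imagine an in-process worker thread (rather than dconf’s separate process) could be enough. A rough sketch, where do_blocking_write() is a placeholder for the real (blocking) write:

```c
#include <glib.h>

typedef struct {
  gchar    *key;
  GVariant *value;
} WriteJob;

/* Placeholder for the real blocking write (i.e. whatever ends up
 * doing the disk I/O). */
static void
do_blocking_write (const gchar *key, GVariant *value)
{
  /* ... blocking disk I/O would happen here ... */
}

/* Runs on the worker thread: performs the blocking write, then frees
 * the job. */
static void
write_job_func (gpointer data, gpointer user_data)
{
  WriteJob *job = data;

  do_blocking_write (job->key, job->value);

  g_variant_unref (job->value);
  g_free (job->key);
  g_free (job);
}

static GThreadPool *
writer_pool_new (void)
{
  /* max_threads = 1 keeps writes serialised in submission order. */
  return g_thread_pool_new (write_job_func, NULL, 1, TRUE, NULL);
}

/* Callers queue a write and return immediately, without blocking. */
static void
writer_pool_push (GThreadPool *pool, const gchar *key, GVariant *value)
{
  WriteJob *job = g_new0 (WriteJob, 1);

  job->key = g_strdup (key);
  job->value = g_variant_ref_sink (value);
  g_thread_pool_push (pool, job, NULL);
}
```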

Another problem we faced is that GSettingsBackend instances seem to be used concurrently, i.e. multiple threads use the same instance instead of each thread using its own. This corrupted our (non-thread-safe) data structures private to a GSettingsBackend instance, and we had to wrap most accesses in a mutex, which is suboptimal. Is this a known problem, or am I misunderstanding something?
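
For illustration, the workaround looks roughly like this sketch (the cache field stands in for our private, non-thread-safe state; the GMutex must be set up with g_mutex_init()):

```c
#include <glib.h>

typedef struct {
  GMutex      lock;  /* initialised with g_mutex_init() */
  GHashTable *cache; /* stand-in for our private, non-thread-safe state */
} ElektraBackendPrivate;

/* Every access to the private state is funnelled through the mutex,
 * because the same backend instance is apparently used from multiple
 * threads. */
static GVariant *
cached_lookup (ElektraBackendPrivate *priv, const gchar *key)
{
  GVariant *value;

  g_mutex_lock (&priv->lock);
  value = g_hash_table_lookup (priv->cache, key);
  if (value != NULL)
    g_variant_ref (value);
  g_mutex_unlock (&priv->lock);

  return value;
}
```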

Correct.

AIUI, a read() of a small amount of data (less than a few pages) can be assumed not to block in all practical situations, provided you are reading from a normal local file on a normal local mount. By contrast, it may block when reading from a pipe, an NFS mount, or a file backed by a FUSE mount.

[quote=“mpranj, post:3, topic:6544”]
Unfortunately such a background worker does not exist yet. We would implement it only if strictly needed.
[/quote]

So how do you serialise concurrent writes to the same key from multiple processes?

It seems to me like you are ultimately going to reimplement something which looks a lot like dconf.

I’m not sure. None of the other GSettingsBackend implementations provide any internal thread safety. Are you using a single GSettings instance from multiple threads?


In Elektra we have the concept of conflicts. To write a config (or a single key from it), you first have to get the latest version of the config. This way, if you try to change a config that was modified without your knowledge, you get back a conflict.

Since this conflict concept does not really exist in GSettings, we simply write the last version of a key. When multiple processes write to the same key, basically the last process “wins”.
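
In pseudo-C the pattern is roughly the following sketch (Config, get_latest() and commit() are hypothetical stand-ins, not Elektra’s actual API):

```c
/* Sketch of the conflict pattern: a write only succeeds when it is
 * based on the latest version of the configuration. */
typedef struct {
  int version; /* bumped on every successful commit */
  int value;   /* stand-in for the actual configuration data */
} Config;

static Config current = { 0, 0 };

/* Fetch the latest version of the configuration. */
static Config
get_latest (void)
{
  return current;
}

/* The commit only succeeds if `proposed` is based on the latest
 * version; otherwise the caller gets back a conflict. */
static int
commit (Config proposed)
{
  if (proposed.version != current.version)
    return -1;       /* conflict: modified without our knowledge */
  current = proposed;
  current.version++; /* new version after the commit */
  return 0;
}

/* In GSettings terms: on conflict we simply retry with the latest
 * version, so the last writer effectively wins. */
static void
write_with_retry (void (*mutate) (Config *))
{
  for (;;)
    {
      Config cfg = get_latest ();
      mutate (&cfg);
      if (commit (cfg) == 0)
        return; /* committed cleanly */
    }
}
```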

On a high level both are key-value databases, but we don’t want to reimplement dconf.

Elektra aims to be a general-purpose configuration framework, unifying access to configuration files. We are also working on KDE (KConfig) support and already support many other configuration formats (TOML, JSON, XML, CSV, …) as well as bindings for many languages (C, C++, Python, Lua, Ruby, …). Elektra is extensible via a plugin interface and allows users to specify and validate configuration. Elektra’s GSettings backend is a comparatively small component.

Afaik dconf is very specialized to suit the needs of GNOME. As such, dconf might be faster than Elektra, but it does not solve the other big-picture problems. Performance is something we are still evaluating, as Elektra also utilizes an mmap-based cache. More important to us, though, is that we simplify and unify access to configuration for developers, providing a better experience.

Actually we are not directly using it. We have a working prototype GSettingsBackend based on Elektra and install it to the GIO module dir. It replaces the dconf module, so everything uses Elektra and not dconf. Whatever threading is going on is done by GNOME, GIO, or whatever manages the GSettings backend instances.

This is how we noticed that data which should be private to a GSettings backend instance is modified by another thread, leading to corrupted data and segmentation faults. When we wrap access to the data in a GMutex, the problems are gone. The mutex would not be needed if a single instance were not accessed by multiple threads at the same time.

Not really. The needs of GNOME are not that specific. It’s just a multiple-reader, single-writer mmapped data store. You can use it independently of GSettings. It depends on GLib, but nothing else in the GNOME stack.

Sounds like we need to improve the documentation around that, for sure. Could you please get a backtrace of one of these multi-threaded access situations, and attach it to a new issue filed against GLib?
