It seems there’s a baked-in assumption that a guchar is 8 bits wide [1]. Is that a safe assumption to make for all platforms on which GLib is supported?
POSIX mandates that a byte is exactly 8 bits. So the question is whether GLib requires (implicitly or explicitly) POSIX compliance in this regard.
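For a project that wants to state this dependency explicitly rather than inherit it silently, a compile-time check is one option. This is only a minimal sketch, assuming a C11 compiler; the message text is illustrative and none of this is taken from GLib's sources.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/* Stop the build on any platform where a byte is not exactly 8 bits wide. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
```

On a POSIX system the assertion always holds, so it costs nothing there and only documents the assumption.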
[1] Looking at code such as g_utf8_validate() → fast_validate() and at the utf8_skip_data array.
Thanks. I work on Pacemaker (an HA cluster resource manager). We don’t have an exhaustive list of supported platforms; we test against RHEL, Fedora, CentOS, Debian, Ubuntu, OpenSUSE, and FreeBSD on several CPU architectures. We don’t explicitly require an 8-bit char. However, AFAIK all of our test systems use an 8-bit char.
We use GLib heavily – mostly for lists, hash tables, option parsing, and the main event loop, and a little bit of GString.
I recently wrote some CHAR_BIT-aware code to handle non-ASCII UTF-8 characters. I wanted to look into replacing it with GLib calls instead of reinventing the wheel.
I’ll discuss with the project lead next week… perhaps we already have enough of a de facto dependency on 8-bit char via GLib to proceed with g_utf8_next_char(), etc.
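For context, here is roughly the kind of CHAR_BIT-aware code meant above. It is a simplified sketch, not Pacemaker's actual code, and `utf8_seq_len()` is a hypothetical name. It classifies a lead byte by bit pattern only and does not validate the sequence (overlong encodings and out-of-range lead bytes are not rejected).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the length in bytes of the UTF-8 sequence
 * introduced by the lead byte at *s, or 0 if that byte cannot start one.
 * Masking with 0xFF keeps only the low 8 bits, so the result does not
 * depend on CHAR_BIT being 8.
 */
static size_t
utf8_seq_len(const char *s)
{
    unsigned int lead;

    assert(s != NULL);
    lead = (unsigned char) *s & 0xFFu;

    if (lead < 0x80u) {
        return 1;                   /* 0xxxxxxx: ASCII */
    }
    if ((lead & 0xE0u) == 0xC0u) {
        return 2;                   /* 110xxxxx */
    }
    if ((lead & 0xF0u) == 0xE0u) {
        return 3;                   /* 1110xxxx */
    }
    if ((lead & 0xF8u) == 0xF0u) {
        return 4;                   /* 11110xxx */
    }
    return 0;                       /* 10xxxxxx continuation or invalid lead */
}
```

g_utf8_next_char() does the same job with a table lookup indexed by a guchar, which is where the 8-bit assumption comes in.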