I wonder, has anyone benchmarked json-glib against other JSON parser/writer libraries? I recently tried adding json-glib 1.4.4 to a clone of the suite at https://github.com/miloyip/nativejson-benchmark . I’m not sure I’ve done so optimally, so I’m not going to give any numbers, but json-glib does appear to be slow relative to the competition.
If that’s really the case I can think of one possible reason, at least for JSON heavy with floating-point values such as GeoJSON. In the 1.4.4 code json-glib appears to call g_ascii_strtod() for each such value. Seems to me if the library were just to push the ‘C’ numeric locale at the outset; rely on the system’s strtod() while parsing; then pop whatever locale was in place after parsing, that might save a ton of cycles. (The JSON standard insists on the decimal point.)
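To make the push/pop idea concrete, here is a minimal sketch in C. It is not json-glib code: the function name parse_doubles_c_locale and the whitespace-separated input are made up for illustration, and it assumes a POSIX.1-2008 system that provides newlocale() and uselocale(). It uses the per-thread uselocale() rather than setlocale(), since a library that changes the process-wide locale could upset other threads.

```c
#define _POSIX_C_SOURCE 200809L  /* for newlocale/uselocale/freelocale */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse up to `max` whitespace-separated numbers from
 * `text` into `out`, switching this thread to the "C" numeric locale once
 * for the whole batch. Returns the number of values parsed. */
size_t parse_doubles_c_locale(const char *text, double *out, size_t max)
{
    assert(text != NULL && out != NULL);

    locale_t c_loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return 0;                        /* could not create the locale */

    locale_t saved = uselocale(c_loc);   /* "push": remember the old locale */

    size_t n = 0;
    const char *p = text;
    while (n < max) {
        char *end;
        double v = strtod(p, &end);      /* plain strtod, now '.'-based */
        if (end == p)
            break;                       /* nothing more to parse */
        out[n++] = v;
        p = end;
    }

    uselocale(saved);                    /* "pop": restore the old locale */
    freelocale(c_loc);
    return n;
}
```

Whether this would actually beat calling g_ascii_strtod() per value is something only a measurement can settle.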
I’m not sure I’ve done so optimally so I’m not going to give any numbers, but json-glib does appear to be slow relative to the competition.
It’s a bit hard to say whether you’re “doing it optimally” if you don’t share any code to review.
In the 1.4.4 code json-glib appears to call g_ascii_strtod() for each such value. Seems to me if the library were just to push the ‘C’ numeric locale at the outset; rely on the system’s strtod() while parsing; then pop whatever locale was in place after parsing, that might save a ton of cycles.
On “doing it optimally” or not: I don’t mind sharing my code. You can find the file I placed under src/tests in the nativejson-benchmark directory (root dir of the test kit I referenced) at http://users.wfu.edu/cottrell/json/json-glib.cpp . To make this work, however, you also need to modify build/premake5.lua to set the paths required for the json-glib and GLib headers (see includedirs in that file) and add the required linker specifications (see linkoptions).
(The base assumption of this kit is that a C-coded candidate can be added just by including its *.c files and headers, but obviously that won’t work for json-glib, with its dependence on GLib.)
On g_ascii_strtod(): it wasn’t obvious to me under what conditions USE_XLOCALE is defined. But if it’s defined on any sane system, as seems to be the case, then only two trivial function calls are wasted per evaluation of a floating-point number, so maybe that doesn’t explain the speed differential I saw (supposing it’s “real”).
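For reference, this is roughly what I understand the USE_XLOCALE path to amount to: fetch a cached “C” locale object, then hand off to strtod_l(). The sketch below is my own illustration, not GLib’s source; the names cached_c_locale and ascii_strtod_sketch are hypothetical, it assumes glibc (where strtod_l() needs _GNU_SOURCE), and the lazy initialisation is deliberately not thread-safe.

```c
#define _GNU_SOURCE   /* strtod_l() is an extension, not standard C */
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical: create the "C" locale object once and reuse it.
 * NOTE: not thread-safe; a real library would guard the initialisation. */
static locale_t cached_c_locale(void)
{
    static locale_t c_loc = (locale_t)0;
    if (c_loc == (locale_t)0)
        c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    return c_loc;
}

/* Hypothetical: locale-independent strtod in two cheap calls. */
double ascii_strtod_sketch(const char *nptr, char **endptr)
{
    assert(nptr != NULL);

    locale_t c_loc = cached_c_locale();
    if (c_loc == (locale_t)0)
        return strtod(nptr, endptr);     /* fallback: locale-dependent */

    return strtod_l(nptr, endptr, c_loc);
}
```

If that is close to what happens, the per-number overhead over a bare strtod() is one function call and one pointer check, which is hard to blame for a large slowdown.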
JSON-GLib was never written to be fast. It’s barely written to be correct.
A faster JSON-GLib would need a different tokeniser than the fork of GScanner I used, one that used SSE instructions to quickly parse strings for instance; it would probably need better data structures to hold the values, something that tried to store numbers as either integers or doubles, and that used references to the data instead of copies; and it would probably need a parser that was rewritten from scratch not to perform as many allocations as the current one does.
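As a rough illustration of the data-structure point only, here is a sketch in C of a value that stores a number as either an integer or a double, plus a borrowed string reference that points into the input buffer instead of copying it. None of this is JSON-GLib API; NumValue, StrRef, classify_number and strref_eq are hypothetical names. The classifier is not a JSON validator, since strtod() also accepts things like “inf”, and it depends on the current locale.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum { NUM_INVALID, NUM_INT, NUM_DOUBLE } NumKind;

/* A number kept in its natural representation, with no boxing. */
typedef struct {
    NumKind kind;
    union { int64_t i; double d; } as;
} NumValue;

/* A string held as a view into the input buffer (no copy, no allocation). */
typedef struct { const char *ptr; size_t len; } StrRef;

/* Compare a borrowed view against a NUL-terminated literal. */
int strref_eq(StrRef s, const char *lit)
{
    return strlen(lit) == s.len && memcmp(s.ptr, lit, s.len) == 0;
}

/* Classify a NUL-terminated numeric token as integer or double. */
NumValue classify_number(const char *token)
{
    assert(token != NULL);
    NumValue v = { NUM_INVALID, { 0 } };
    char *end;

    /* No fraction or exponent: try the integer path first. */
    if (strpbrk(token, ".eE") == NULL) {
        errno = 0;
        long long i = strtoll(token, &end, 10);
        if (end != token && *end == '\0' && errno != ERANGE) {
            v.kind = NUM_INT;
            v.as.i = (int64_t)i;
            return v;
        }
    }

    /* Otherwise, or on integer overflow, fall back to a double. */
    double d = strtod(token, &end);
    if (end != token && *end == '\0') {
        v.kind = NUM_DOUBLE;
        v.as.d = d;
    }
    return v;
}
```

With something like this, "42" stays an integer, "1.5" becomes a double, and a string value costs a pointer and a length rather than a heap allocation, provided the input buffer outlives the parsed tree.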