Tracker 3 Creating unique URNs

dean · June 4, 2021, 5:51am

I come from relational databases, so I’m not sure I’m using SPARQL correctly. I want to create unique IDs (URNs) for instances in my database. Does Tracker have functionality for this?

Below I have created myFirstPizza and mySecondPizza. I don’t care about these ‘names’ as I’ll find the instances by their ic:name and the user can select the correct instance.

My reason is that two pizza instances can have the same name but different toppings, and I want to make sure I always reference the correct instance. I’ve seen the Tracker docs example with blank nodes, but I don’t think this is the place to use blank nodes.

INSERT { <myFirstPizza> a ic:Pizza ;
    ic:name 'Veggie Sensation' ;
    ic:topping 'Olives' .
  <mySecondPizza> a ic:Pizza ;
    ic:name 'Veggie Sensation' ;
    ic:topping 'Red Capsicum' }

INSERT { <myFirstPizza> ic:topping 'Rocket' ; ic:topping 'Mozzarella' }

SELECT * WHERE { ?u ic:name 'Veggie Sensation' ;
    ?property ?value } ORDER BY ?u

Note: My Italian Cuisine ontology is just a simple example.

sthursfield · June 4, 2021, 9:17am

Hi Dean,

The simplest approach is to call tracker_sparql_get_uuid_urn() to generate a unique identifier of the format urn:uuid:9e57ae55-c18b-468f-99da-f33ac95d357d.

There are two downsides to that approach:

- It’s not very readable.
- It doesn’t identify the actual content, so if the same resource is inserted twice, you might get two different IDs pointing at the same thing.
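
To make the urn:uuid: format concrete, here is a minimal C sketch, using only the standard library, of how 16 random bytes map onto such a URN. format_uuid_urn is a hypothetical helper written for this illustration and is not Tracker’s implementation. In real code you would simply call tracker_sparql_get_uuid_urn(), which also takes care of the randomness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* "urn:uuid:" (9 chars) + 36 UUID chars + terminating NUL. */
#define UUID_URN_SIZE 46

/* Hypothetical helper (not part of Tracker): formats 16 caller-supplied
 * random bytes as a version 4 UUID URN. The caller is responsible for
 * providing good random bytes. */
static void format_uuid_urn(const unsigned char *random_bytes,
                            char out[UUID_URN_SIZE])
{
    unsigned char b[16];

    assert(random_bytes != NULL && out != NULL);
    memcpy(b, random_bytes, sizeof b);

    /* Overwrite the bits that RFC 4122 reserves. */
    b[6] = (unsigned char)((b[6] & 0x0F) | 0x40); /* version 4 (random) */
    b[8] = (unsigned char)((b[8] & 0x3F) | 0x80); /* RFC 4122 variant */

    snprintf(out, UUID_URN_SIZE,
             "urn:uuid:%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x%02x%02x%02x%02x",
             b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
             b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

Because the bytes are random, two calls for the same pizza give two different URNs, which is exactly the second downside listed above.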

For things with names, Tracker Miner FS uses custom URN schemes such as urn:artist:, which identifies a music artist by their name, and urn:contact:, which identifies a contact by their name. These aren’t standardised anywhere else, as far as I know.

For contents of files, Tracker Miner FS uses the UUID approach. It would be great if it could hash the contents and use content-addressing, but there are obvious performance implications.

For the pizza example, if the pizzas are internal to the app, then you could use a counter. Something like urn:pizza-id:1, urn:pizza-id:2. You can construct these with g_strdup_printf() since there’s no need for escaping, although tracker_sparql_escape_uri_printf() is there if you need it.
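
As a rough sketch of the counter approach, the helper below builds urn:pizza-id:N with plain snprintf() so that it compiles without GLib. make_pizza_urn is a hypothetical name, and how the counter is stored and incremented is left to the app. With GLib available, g_strdup_printf("urn:pizza-id:%lu", id) does the same job and allocates the string for you.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "urn:pizza-id:<id>" into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small to hold the whole URN. */
static int make_pizza_urn(unsigned long id, char *buf, size_t size)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, size, "urn:pizza-id:%lu", id);

    /* snprintf reports the length it wanted to write, so a value
     * >= size means the output was truncated. */
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

No escaping is needed here because the only variable part is a decimal number.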

If the pizzas are to be shared, the ID needs to be universally unique, and the best way would be to hash the pizza contents somehow. Tracker doesn’t have helper code for this, but it would be interesting to add some kind of content-hash function to TrackerResource.
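
One possible shape for such a content hash, sketched in standard C: sort the toppings so that insertion order does not matter, then hash the name and toppings together. Everything here is an assumption made for illustration: the urn:pizza-hash: scheme, the function names, the limit of 16 toppings, and the choice of 64-bit FNV-1a. FNV-1a is not collision-resistant, so a shared database would want a cryptographic hash such as SHA-256 instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* "urn:pizza-hash:" (15 chars) + 16 hex digits + terminating NUL. */
#define PIZZA_URN_SIZE 32
#define MAX_TOPPINGS 16

/* qsort comparator for an array of C strings. */
static int compare_strings(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Feeds one string, including its NUL terminator as a field separator,
 * into a running 64-bit FNV-1a hash. */
static uint64_t fnv1a_update(uint64_t hash, const char *s)
{
    size_t len = strlen(s) + 1;
    for (size_t i = 0; i < len; i++) {
        hash ^= (unsigned char)s[i];
        hash *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
    }
    return hash;
}

/* Hypothetical helper: derives a URN from the pizza's name and toppings.
 * The same name and set of toppings always give the same URN, whatever
 * order the toppings are passed in. */
static void pizza_content_urn(const char *name, const char **toppings,
                              size_t n_toppings, char out[PIZZA_URN_SIZE])
{
    const char *sorted[MAX_TOPPINGS];
    uint64_t hash = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */

    assert(name != NULL && out != NULL && n_toppings <= MAX_TOPPINGS);
    for (size_t i = 0; i < n_toppings; i++)
        sorted[i] = toppings[i];
    qsort(sorted, n_toppings, sizeof sorted[0], compare_strings);

    hash = fnv1a_update(hash, name);
    for (size_t i = 0; i < n_toppings; i++)
        hash = fnv1a_update(hash, sorted[i]);

    snprintf(out, PIZZA_URN_SIZE, "urn:pizza-hash:%016llx",
             (unsigned long long)hash);
}
```

With this scheme, inserting the same pizza twice yields the same URN, while the two ‘Veggie Sensation’ pizzas from the question get different URNs because their toppings differ.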

garnacho · June 4, 2021, 9:44am

Another approach, if you don’t care about the format of the URIs, is to use blank nodes instead of pre-made URIs. Tracker deviates from the SPARQL standard here, which allows blank nodes to be used as a URI generator of sorts. E.g.:

INSERT DATA { _:u a ic:Pizza }

This will insert a resource and give it a unique name that you can query later on.

dean · June 4, 2021, 11:17am

Thanks @sthursfield and @garnacho

If I want to expose this endpoint (for example to gnome-search), what would other developers prefer?

I could also concatenate the name and other attributes to form a key, e.g. “Margarita_Dean”.

How do you distinguish artists with the same name?

sthursfield · June 4, 2021, 12:08pm

A search engine just needs a unique ID for each resource. I don’t think developers mind as long as it’s a valid URN.

In Tracker Miner FS, artists with the same name are treated as the same artist. There are improvements possible there.
