Hello there younger Tracker people,
I am wondering why this almost obvious suggestion has not been made yet:
With sqlite-vss(1) you could store vectors. With OpenAI’s vector embeddings API(2) you could calculate those vectors. And with, for example, a DOT_PRODUCT SQL operator you could rank stored vectors by how similar they are to a query vector, and so find the closest matches.
1. asg017/sqlite-vss: A SQLite extension for efficient vector search, based on Faiss (https://github.com/asg017/sqlite-vss)
2. OpenAI embeddings API reference (https://platform.openai.com/docs/api-reference/embeddings)
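To make the idea a bit more concrete, here is a minimal sketch in Python. It does not use sqlite-vss or OpenAI at all: it assumes plain SQLite with vectors stored as float32 blobs, and it registers its own `dot_product` SQL function, which is a stand-in I made up for whatever operator the real extension would provide. The table name, the helper names and the toy two-dimensional vectors are all hypothetical. Note that a dot product only ranks by cosine similarity when the vectors are normalized to unit length.

```python
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats into a float32 blob."""
    return struct.pack(f"{len(vec)}f", *vec)


def dot_product(a, b):
    """Dot product of two float32 blobs of equal length."""
    n = len(a) // 4
    va = struct.unpack(f"{n}f", a)
    vb = struct.unpack(f"{n}f", b)
    return sum(x * y for x, y in zip(va, vb))


def build_index(items):
    """Create an in-memory database holding (urn, vector) pairs."""
    db = sqlite3.connect(":memory:")
    # Hypothetical stand-in for an operator a vector extension would supply.
    db.create_function("dot_product", 2, dot_product)
    db.execute("CREATE TABLE embeddings (urn TEXT PRIMARY KEY, vec BLOB)")
    db.executemany(
        "INSERT INTO embeddings VALUES (?, ?)",
        [(urn, pack(vec)) for urn, vec in items],
    )
    return db


def nearest(db, query, limit=3):
    """Return the `limit` best-scoring (urn, score) pairs for a query vector."""
    rows = db.execute(
        "SELECT urn, dot_product(vec, ?) AS score "
        "FROM embeddings ORDER BY score DESC LIMIT ?",
        (pack(query), limit),
    )
    return list(rows)


if __name__ == "__main__":
    db = build_index([
        ("file:///a.txt", [1.0, 0.0]),
        ("file:///b.txt", [0.0, 1.0]),
        ("file:///c.txt", [0.5, 0.5]),
    ])
    print(nearest(db, [1.0, 0.0], limit=2))
```

This scans every row, so it only shows the shape of the query. Avoiding that full scan is the job an index such as the Faiss-based one in sqlite-vss would do.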
It doesn’t sound too hard to me to add support for these things to Tracker’s SPARQL side for the querying and to tracker-extract’s side for obtaining the vectors (using e.g. OpenAI’s embeddings API).
Obviously this would have to be optional somehow, as most users probably don’t want to send all of their metadata to OpenAI (also because the calculations cost some money).
But I think various (open source) technologies will soon become available (or already are) that can be run locally. So obtaining the vectors from the metadata content should be a plugin, with different implementations to choose from.
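Such a plugin point could look roughly like the sketch below. It is in Python purely for illustration and none of it is existing Tracker API: the `Embedder` interface, the `BACKENDS` registry, `load_embedder` and the toy `HashingEmbedder` are all names I invented. The hashing backend is not a real embedding model. It only shows that a local implementation can sit behind the same interface as a remote one.

```python
import hashlib
import math
from typing import Callable, Dict, List, Protocol


class Embedder(Protocol):
    """Hypothetical interface every embedding backend would implement."""

    def embed(self, text: str) -> List[float]: ...


class HashingEmbedder:
    """Toy local backend: counts words in hashed buckets, then normalizes."""

    def __init__(self, dimensions: int = 8) -> None:
        self.dimensions = dimensions

    def embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dimensions
        for word in text.lower().split():
            digest = hashlib.sha256(word.encode("utf-8")).digest()
            vec[digest[0] % self.dimensions] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        # Unit length, so a dot product between vectors is a cosine similarity.
        return [x / norm for x in vec] if norm else vec


# Hypothetical registry: a remote backend would be added here under its own name.
BACKENDS: Dict[str, Callable[[], Embedder]] = {"hashing": HashingEmbedder}


def load_embedder(name: str) -> Embedder:
    """Look up a backend by its configured name."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown embedding backend: {name!r}") from None


if __name__ == "__main__":
    embedder = load_embedder("hashing")
    print(embedder.embed("holiday photos from Lisbon"))
```

The extractor would then only ever call `embed()`, and which backend answers would be a configuration choice, including none at all.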
Where are our new young Tracker maintainers who steer this development work for future Tracker releases?
Kind regards,
Philip
(One of the old grey former Tracker maintainers of the past, retired Gen-X boomer)