Configure Tracker to always index certain file extensions

Is it possible to instruct Tracker to always index files with certain extensions?

The FAQ doesn’t say anything about this. ChatGPT first advised me to set the indexer-extensions key in the ~/.config/tracker3/config.yml file, but I guess it made those things up. The conversation (in German) can be found here if anyone wants to see it.

So, is it possible to configure Tracker to always index the .Rproj file extension, for example?

For text files there is a specific allowlist for extensions.

> gsettings get org.freedesktop.Tracker3.Extract text-allowlist
['*.txt', '*.md', '*.mdwn']

Use gsettings set to add to the list, e.g.:

 gsettings set org.freedesktop.Tracker3.Extract text-allowlist "['*.txt', '*.md', '*.mdwn', '*.Rproj']"
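To avoid retyping the whole list, the new pattern can be appended to whatever gsettings currently reports. This is only a sketch: gv_list_append is a hypothetical helper name, not part of gsettings or Tracker, and it assumes the list is in the single-quoted form that gsettings get prints.

```shell
#!/bin/sh
# Append one pattern to a string list as printed by `gsettings get`,
# e.g. "['*.txt', '*.md']" becomes "['*.txt', '*.md', '*.Rproj']".
# gv_list_append is a hypothetical helper, not a gsettings or Tracker command.
gv_list_append() {
    list=$1
    item=$2
    case $list in
        *"'$item'"*)   printf '%s\n' "$list" ;;                   # already present
        "@as []"|"[]") printf "['%s']\n" "$item" ;;                # empty list
        *)             printf "%s, '%s']\n" "${list%"]"}" "$item" ;; # drop "]" and re-close
    esac
}

# Usage (changes your settings, so it is left commented out):
#   schema=org.freedesktop.Tracker3.Extract
#   current=$(gsettings get "$schema" text-allowlist)
#   gsettings set "$schema" text-allowlist "$(gv_list_append "$current" '*.Rproj')"
```

The helper only does string handling, so it can be tried on a pasted list before touching any setting.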

In general, Tracker Miner FS decides what to do with a file based on the MIME type reported by xdg-mime query filetype, rather than on the file extension. So if something isn’t being picked up and it should be, you could check whether the MIME type makes sense, and check it against the extractor rules found in src/tracker-extract/*.rule in the tracker-miners.git tree.
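A rough way to do that check from a shell, assuming you have a tracker-miners checkout somewhere. rules_for_mime is a hypothetical helper name and the paths in the usage comments are examples only. It searches for the MIME type as a literal string, so it will miss any rule that matches by wildcard.

```shell
#!/bin/sh
# List extractor rule files that mention a MIME type verbatim.
# rules_for_mime is a hypothetical helper; the rules directory is passed in,
# e.g. src/tracker-extract inside a tracker-miners checkout.
rules_for_mime() {
    mime=$1
    rules_dir=$2
    # -l prints only the names of matching files, -F treats the MIME type
    # as a fixed string instead of a regular expression
    grep -lF -- "$mime" "$rules_dir"/*.rule 2>/dev/null
}

# Usage (example paths, adjust to your system):
#   mime=$(xdg-mime query filetype ~/projects/demo/demo.Rproj)
#   echo "$mime"
#   rules_for_mime "$mime" ~/src/tracker-miners/src/tracker-extract
```

No output means no rule names that MIME type literally, which is a hint rather than proof that no extractor handles the file.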

Thanks for your reply!

Just to be sure: I don’t want Tracker to index the actual content of .Rproj files but only their filenames so they always appear in Gnome Shell’s search. But I guess text-allowlist is intended to whitelist text content indexing, no?

That is correct, and it is not necessary for filename matches to appear in Shell search.

I suspect other settings are preventing your .Rproj files from being indexed by Tracker. E.g. index-recursive-directories contains selected XDG folders to be indexed recursively, and ignored-directories-with-content prevents the indexer from stepping into git trees.

You can use tracker3 index to see the currently indexed folders, and its subcommands to change them.
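For a quick look at those settings, something along these lines should work. show_index_settings is a hypothetical helper name. The schema org.freedesktop.Tracker3.Miner.Files and the key index-single-directories are not named above and are given here as my understanding of the Tracker 3 settings. The tracker3 index options in the comments are from the tracker3-index man page as I remember it, so check man tracker3-index on your version.

```shell
#!/bin/sh
# Print the Tracker Miner FS settings that decide which directories get indexed.
# show_index_settings is a hypothetical helper; the schema name and the
# index-single-directories key are assumptions not stated in the thread.
show_index_settings() {
    schema=org.freedesktop.Tracker3.Miner.Files
    for key in index-recursive-directories index-single-directories \
               ignored-directories ignored-directories-with-content; do
        printf '%s = %s\n' "$key" "$(gsettings get "$schema" "$key")"
    done
}

# Usage:
#   show_index_settings
#   tracker3 index                               # list currently indexed folders
#   tracker3 index --add --recursive ~/projects  # example path
```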

Thanks for the hints!

Once I’ve opened an .Rproj file, it appears in search (for a while). What I want is for them to always appear.

My .Rproj files almost all live in the root of Git-revisioned directories. Does the default (at least Ubuntu’s) setting of ignored-directories-with-content ['.trackerignore', '.git', '.hg', '.nomedia'] mean that version-controlled directories (.git, .hg) are never indexed? I guess .git/.hg are not just in ignored-directories to avoid constant re-indexing while working with version control?

Just to be sure: I don’t want Tracker to index the actual content of .Rproj files but only their filenames so they always appear in Gnome Shell’s search. But I guess text-allowlist is intended to whitelist text content indexing, no?

I see. We do have a tracking issue about making that easier: Index all filenames on the system (#106) · Issues · GNOME / tracker-miners · GitLab

FWIW that is fixed in 3.6.0. Some stray file monitors allowed otherwise ignored files to seep in.

That’s indeed the intent.

There are more philosophical factors involved. Tracker Miner FS intends to be an indexer useful for end-user purposes; source code and git projects are low-hanging fruit to weed out, and doing so can be decisive with respect to wasting finite resources (e.g. file monitors) or database space. In my past local experiments with removing .git, for example, I ended up with a surplus of half a million text files indexed, despite the allow list. Tracker Miner FS can deal with it though, including deep tree changes like git checkout.

Typically, our stance as Tracker developers on indexing git trees is that there are more suitable tools for it. For the specific purpose of locating git projects, I think it is possible to polish the edges some more, and to specify in the indexed data that a directory is also a git project. Something would need to make use of that information though.

Feel free to file an issue at Issues · GNOME / tracker-miners · GitLab if you think that would help you, although I suspect you want .Rproj files indexed specifically so that pressing Enter in Shell search also opens your IDE…

Good to know. I’m on Ubuntu 22.04 which ships Tracker 3.3.0.

Yep. I’m experimenting a bit with having .git in ignored-directories instead of ignored-directories-with-content plus a few important (and big) directories that contain Git repos in index-recursive-directories. I had to increase the inotify watches limit, which I currently set to a bit over a million (I have plenty of RAM). The tracker-miner-fs-3 process took a while to do the re-indexing – and crashed once. But since then I didn’t notice any negative impacts. And search results are great, i.e. they include all the files I care about!
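In case someone wants to try the same thing, the list edit can be sketched like this. gv_list_remove is a hypothetical helper name, the schema name is my assumption, and 1048576 is just an example of "a bit over a million". The commands that change settings are left as comments.

```shell
#!/bin/sh
# Remove one entry from a string list as printed by `gsettings get`.
# gv_list_remove is a hypothetical helper, not a gsettings or Tracker command.
gv_list_remove() {
    list=$1
    item=$2
    case $list in
        *"'$item', "*)   # first or middle entry: cut it out together with its separator
            printf '%s%s\n' "${list%%"'$item', "*}" "${list#*"'$item', "}" ;;
        *", '$item']")   # last entry: cut it out and re-close the list
            printf '%s]\n' "${list%", '$item']"}" ;;
        "['$item']")     # only entry
            printf '[]\n' ;;
        *)               # not present
            printf '%s\n' "$list" ;;
    esac
}

# Usage (changes your settings, so it is left commented out):
#   schema=org.freedesktop.Tracker3.Miner.Files    # assumed schema name
#   cur=$(gsettings get "$schema" ignored-directories-with-content)
#   gsettings set "$schema" ignored-directories-with-content "$(gv_list_remove "$cur" .git)"
#   # then add '.git' to ignored-directories and your repo parents to
#   # index-recursive-directories in the same way
#
# Check and raise the inotify watch limit (example value):
#   cat /proc/sys/fs/inotify/max_user_watches
#   sudo sysctl fs.inotify.max_user_watches=1048576
```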

Sounds interesting. I’m keeping my fingers crossed that someone gets the itch to implement the missing something someday. :crossed_fingers:

You suspect right, this is exactly my main use case. I would actually be happy if only the top level of Git repositories were indexed (where .Rproj files usually reside). There is no allow-list counterpart to ignored-directories-with-content for non-recursive indexing, something like index-single-directories-with-content, is there?

Thanks for the hint. I think resolving this issue would suffice for my use case: Index all filenames on the system (#106) · Issues · GNOME / tracker-miners · GitLab