Tracker fails to index .qmd (text) files

I work a lot in R with Quarto Markdown files (.qmd). They are simple text files written in an extended Markdown syntax.

I know, for instance, of a .qmd file in a sub/sub/sub directory of my home folder that contains the word ‘opioids’.

When I search (using Files/Nautilus) for ‘opioids’ (Full Text - not File Name), I get lots of results based on content, including pdf files, libreoffice files, markdown (.md) and HTML, but not .qmd files at all.

If I limit the search (What) to just Text Files, I still get results based on content, but markdown (.md) files are not displayed … and neither are any .qmd files.

If I limit the search (What) to Other Type and specify Quarto Markdown File - nothing at all is listed.

I have changed the text-allowlist setting to include ‘*.qmd’, but it makes no difference.

I have also made sure to remove .git from ignored-directories-with-content, as I often use git for version control of manuscripts.

After these changes, I have tried to reset and re-index - nothing helps.

I am at a loss now, and any help is appreciated.

My platform:
Pop!_OS (Ubuntu-based), 64-bit.

Gsettings:
org.freedesktop.Tracker3.Extract max-bytes 1048576
org.freedesktop.Tracker3.Extract text-allowlist ['*.txt', '*.md', '*.mdwn', '*.qmd']
org.freedesktop.Tracker3.Extract wait-for-miner-fs false
org.freedesktop.Tracker3.FTS enable-stemmer true
org.freedesktop.Tracker3.FTS enable-unaccent true
org.freedesktop.Tracker3.FTS ignore-numbers true
org.freedesktop.Tracker3.FTS ignore-stop-words true
org.freedesktop.Tracker3.Miner.Files crawling-interval -1
org.freedesktop.Tracker3.Miner.Files enable-monitors true
org.freedesktop.Tracker3.Miner.Files ignored-directories ['po', 'CVS', 'core-dumps', 'lost+found']
org.freedesktop.Tracker3.Miner.Files ignored-directories-with-content ['.trackerignore', '.hg', '.nomedia']
org.freedesktop.Tracker3.Miner.Files ignored-files ['*~', '*.o', '*.la', '*.lo', '*.loT', '*.in', '*.csproj', '*.m4', '*.rej', '*.gmo', '*.orig', '*.pc', '*.omf', '*.aux', '*.tmp', '*.vmdk', '*.vm*', '*.nvram', '*.part', '*.rcore', '*.lzo', 'autom4te', 'conftest', 'confstat', 'Makefile', 'SCCS', 'ltmain.sh', 'libtool', 'config.status', 'confdefs.h', 'configure', '#*#', '~$*.doc?', '~$*.dot?', '~$*.xls?', '~$*.xlt?', '~$*.xlam', '~$*.ppt?', '~$*.pot?', '~$*.ppam', '~$*.ppsm', '~$*.ppsx', '~$*.vsd?', '~$*.vss?', '~$*.vst?', 'mimeapps.list', 'mimeinfo.cache', 'gnome-mimeapps.list', 'kde-mimeapps.list', '*.directory']
org.freedesktop.Tracker3.Miner.Files index-applications true
org.freedesktop.Tracker3.Miner.Files index-on-battery false
org.freedesktop.Tracker3.Miner.Files index-on-battery-first-time true
org.freedesktop.Tracker3.Miner.Files index-optical-discs false
org.freedesktop.Tracker3.Miner.Files index-recursive-directories ['$HOME']
org.freedesktop.Tracker3.Miner.Files index-removable-devices false
org.freedesktop.Tracker3.Miner.Files index-single-directories @as []
org.freedesktop.Tracker3.Miner.Files initial-sleep 15
org.freedesktop.Tracker3.Miner.Files low-disk-space-limit -1
org.freedesktop.Tracker3.Miner.Files removable-days-threshold 3
org.freedesktop.Tracker3.Miner.Files throttle 0

Tracker3 status:
Currently indexed: 52571 files, 12490 folders
Remaining space on database partition: 832,8 GB (83,50%)
All data miners are idle, indexing complete
56 recorded failures

Regards,
Soren, Denmark

Hi,

What MIME type is reported if you run xdg-mime query filetype on one of the .qmd files in question?

You already spotted the text allowlist, which is specific to the plaintext extractor. Each extractor also has a .rule file, e.g. /usr/share/tracker3-miners/extract-rules/15-text.rule:

[ExtractorRule]
ModulePath=libextract-text.so
MimeTypes=text/plain;text/markdown
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
Graph=tracker:Documents
Hash=099925eabaa7a05a96418129b1f22e8510862c9287c4bb47522b4f150af5f62d

Notice how there’s a specific list of mime types that will be processed. I’m just guessing here but maybe qmd files have a different mimetype reported?
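To make the point about the MimeTypes list concrete, here is a minimal Python sketch of the check. It assumes an extractor only picks up a file when the reported type matches one of the semicolon-separated entries, with entries treated as glob-style patterns. That matching rule, the helper names, and the `text/x-example` type are assumptions for illustration, not taken from the Tracker sources.

```python
from fnmatch import fnmatchcase

# Contents of the stock 15-text.rule as quoted above.
STOCK_RULE = """\
[ExtractorRule]
ModulePath=libextract-text.so
MimeTypes=text/plain;text/markdown
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
Graph=tracker:Documents
Hash=099925eabaa7a05a96418129b1f22e8510862c9287c4bb47522b4f150af5f62d
"""


def rule_mime_patterns(rule_text):
    """Return the entries of the MimeTypes= line as a list of strings."""
    for line in rule_text.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "MimeTypes":
            # Drop empty entries left by a trailing semicolon.
            return [entry for entry in value.split(";") if entry]
    return []


def rule_handles(rule_text, mime_type):
    """True if mime_type matches any MimeTypes entry (assumed glob matching)."""
    return any(fnmatchcase(mime_type, pattern)
               for pattern in rule_mime_patterns(rule_text))


# text/x-example stands in for whatever type xdg-mime reports for the file.
for mime in ("text/plain", "text/markdown", "text/x-example"):
    print(mime, rule_handles(STOCK_RULE, mime))
```

If the type reported for the .qmd files is not in that list, the sketch returns False for it, which would be consistent with the files being skipped by the text extractor.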

Hi,
xdg-mime reports: text/x-quarto-markdown

I added text/x-quarto-markdown to the 15-text.rule file, like so:
MimeTypes=text/plain;text/markdown;text/x-quarto-markdown

Still no .qmd files in the search result, but I suspect the miners need to re-trawl $HOME for .qmd files…

Sadly, still no .qmd files in the search results.

Hi,

You may also need to change the Hash field in the .rule file so tracker-miners only requires a restart to reindex the affected files, instead of a database reset.

And you might prefer to use a standalone .rule file for .qmd files alone, as that will not be overwritten by future system updates. Compared to the stock 15-text.rule, this file would need its MimeTypes and Hash fields changed, and I would suggest prefixing the file name with 99- to avoid conflicts with possible upstream changes.
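As a rough illustration of that suggestion, the Python sketch below derives the text of a standalone rule from the stock one quoted earlier in the thread. It swaps the MimeTypes line for the type reported by xdg-mime and replaces the Hash line with a new value. The helper name `make_standalone_rule` is hypothetical, and deriving the new Hash as a SHA-256 digest of the remaining lines is purely an assumption: the thread only establishes that the value has to change, not how it is meant to be computed.

```python
import hashlib

# Contents of the stock 15-text.rule as quoted earlier in the thread.
STOCK_RULE = """\
[ExtractorRule]
ModulePath=libextract-text.so
MimeTypes=text/plain;text/markdown
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
Graph=tracker:Documents
Hash=099925eabaa7a05a96418129b1f22e8510862c9287c4bb47522b4f150af5f62d
"""


def make_standalone_rule(stock_text, mime_type):
    """Return rule text that lists only mime_type and carries a new Hash."""
    kept = []
    for line in stock_text.splitlines():
        key = line.partition("=")[0]
        if key == "Hash":
            continue  # the stock value is dropped and replaced below
        if key == "MimeTypes":
            line = f"MimeTypes={mime_type}"
        kept.append(line)
    body = "\n".join(kept)
    # Assumption: any value that differs from the stock one will do.
    new_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"{body}\nHash={new_hash}\n"


# Print the result for review instead of writing into the system directory.
print(make_standalone_rule(STOCK_RULE, "text/x-quarto-markdown"))
```

The output would then be saved under a 99- prefixed name next to the stock rule, followed by a restart of tracker-miners.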

Cheers,
Carlos

I have created a new 99-text-qmd.rule file, as you suggest.

Sorry to be a bit of a noob here, but should the hash listed in the file be an md5sum, a sha1sum, or some other hashing algorithm? And does that hash have to be calculated on the file without the ‘hash’ line?
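One partial pointer on the algorithm: the Hash value in the stock rule is 64 hexadecimal digits long, which is the length of a SHA-256 digest (MD5 gives 32 digits, SHA-1 gives 40). The length alone cannot show what input the value was computed from, so that part of the question stays open here. A small Python sketch of the length check, where `guess_algorithms` is a hypothetical helper name:

```python
import string

# Hex-digit lengths of the digests named in the question, plus SHA-256.
HEX_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64}

# Hash value from the stock 15-text.rule quoted earlier in the thread.
STOCK_HASH = "099925eabaa7a05a96418129b1f22e8510862c9287c4bb47522b4f150af5f62d"


def guess_algorithms(value):
    """Return the algorithm names whose hex digest length matches value."""
    if not value or any(ch not in string.hexdigits for ch in value):
        return []  # not a plain hexadecimal string
    return [name for name, length in HEX_LENGTHS.items()
            if length == len(value)]


print(len(STOCK_HASH), guess_algorithms(STOCK_HASH))
```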