Significant Performance Discrepancy Between Nautilus and Terminal Tools (du/find) When Calculating Folder Size and Item Count

I’ve noticed a significant performance difference between Nautilus and terminal tools like du and find when calculating the size and number of items in a large folder. Here’s the test I performed:

Test in Terminal:

$ time (echo "Size:" $(du -sh ~/Downloads/Elements | awk '{print $1}') && echo "Items:" $(find ~/Downloads/Elements | wc -l))
Size: 767G
Items: 804341

real    0m6,351s
user    0m2,516s
sys     0m3,969s
  • Results:
    • Folder size: 767 GB
    • Number of items: 804,341
    • Total time taken: ~6 seconds

Nautilus Performance:

Using Nautilus, the same task (counting items and calculating folder size) took 2 minutes and 50 seconds, and that was consistent across repeated runs. The item count was off by one, which I consider negligible. The size difference can likely be attributed to base-10 vs. base-2 reporting: the 767G from du -sh is in binary units (GiB), which is roughly 823 GB in decimal units. The significant delay in Nautilus, however, stands out as the main issue.

Observations:

  1. Command-line tools (du, find) are extremely efficient.
  2. Nautilus takes significantly longer to process the same data, even on repeated runs within the same application session.
  3. This discrepancy becomes especially problematic when working with large directories (800k+ items in this case).

Questions:

  • Has anyone else experienced similar performance issues with Nautilus?
  • Are there any optimizations or settings to improve Nautilus’s performance when handling large folders?
  • Is this delay a fundamental limitation of the graphical interface compared to terminal-based tools?

Looking forward to insights from the community. Thank you!

p.s. Coming here from Discrepancy in Files/Nautilus performance vs du command-line tool for counting folder size and items - Feedback - Zorin Forum

Yes, it is well known that CLI tools are almost always faster than GUI applications: they do little error handling or tracking by default, and they are several decades old, so almost all performance improvement opportunities have already been squeezed out of them.

After some quick performance profiling, it seems that the current bottleneck is the tracking of inodes, which is used to avoid counting files twice through hard links. It currently uses a linear, unordered list to look up existing inodes, which is very slow for large folders. I tried replacing it with a hash set, and that boosted counting performance by 4x: Counting Files in Properties is too Slow (#3720) · Issues · GNOME / Files · GitLab. Even then, there is still some overhead.
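For illustration, here is a rough sketch of the hash-set idea using GLib (this is not the actual Nautilus code, and it assumes GLib ≥ 2.68 for g_memdup2): each file is keyed by its (device, inode) pair, so a file reachable through several hard links is only counted once, and each lookup is O(1) on average instead of a scan over the whole list.

/* Rough sketch (not the actual Nautilus code) of hard-link deduplication
 * with a GLib hash set: key each file by its (device, inode) pair so a
 * file reachable through several hard links is only counted once. */
#include <glib.h>
#include <sys/stat.h>

typedef struct {
    dev_t dev;
    ino_t ino;
} FileId;

static guint
file_id_hash (gconstpointer key)
{
    const FileId *id = key;
    guint64 ino = (guint64) id->ino;

    /* Mix inode and device numbers into one hash value. */
    return (guint) (ino ^ (ino >> 32)) ^ (guint) id->dev;
}

static gboolean
file_id_equal (gconstpointer a, gconstpointer b)
{
    const FileId *x = a, *y = b;
    return x->dev == y->dev && x->ino == y->ino;
}

/* Returns TRUE only the first time a given (device, inode) pair is seen,
 * i.e. when the file should be added to the size and item totals. */
static gboolean
seen_first_time (GHashTable *seen, const struct stat *st)
{
    FileId probe = { st->st_dev, st->st_ino };

    if (g_hash_table_contains (seen, &probe))
        return FALSE;

    g_hash_table_add (seen, g_memdup2 (&probe, sizeof probe));
    return TRUE;
}

/* Usage: create the set with
 *   g_hash_table_new_full (file_id_hash, file_id_equal, g_free, NULL);
 * call seen_first_time() for every stat()ed entry while walking the tree,
 * and g_hash_table_unref() when done.  Lookups are O(1) on average,
 * versus O(n) for a linear, unordered list. */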

After that, the limiting factors are disk speed, the enumeration of files, and the creation and disposal of GFileInfo objects.
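To make that concrete, here is a minimal, simplified sketch of the GIO enumeration pattern involved (again, not the actual Nautilus code): every directory entry comes back as a freshly allocated GFileInfo that has to be read and then unreffed, and that churn adds up over 800k+ items.

/* Simplified sketch of recursive counting via GIO (not the Nautilus code):
 * each entry is surfaced as a new GFileInfo object that must be created,
 * read and disposed of again. */
#include <gio/gio.h>

static void
count_directory (GFile *dir, guint64 *n_items, guint64 *total_size)
{
    g_autoptr (GFileEnumerator) enumerator = NULL;
    GFileInfo *info;

    enumerator = g_file_enumerate_children (dir,
                                            G_FILE_ATTRIBUTE_STANDARD_NAME ","
                                            G_FILE_ATTRIBUTE_STANDARD_TYPE ","
                                            G_FILE_ATTRIBUTE_STANDARD_SIZE,
                                            G_FILE_QUERY_INFO_NOFOLLOW_SYMLINKS,
                                            NULL, NULL);
    if (enumerator == NULL)
        return;

    while ((info = g_file_enumerator_next_file (enumerator, NULL, NULL)) != NULL)
      {
        *n_items += 1;
        *total_size += g_file_info_get_size (info);

        if (g_file_info_get_file_type (info) == G_FILE_TYPE_DIRECTORY)
          {
            g_autoptr (GFile) child = g_file_get_child (dir,
                                                        g_file_info_get_name (info));
            count_directory (child, n_items, total_size);
          }

        /* One GFileInfo allocated and freed per entry; with 800k+ items
         * this overhead is measurable even after the hash-set fix. */
        g_object_unref (info);
      }
}

int
main (int argc, char **argv)
{
    guint64 n_items = 0, total_size = 0;
    g_autoptr (GFile) root = NULL;

    if (argc < 2)
        return 1;

    root = g_file_new_for_commandline_arg (argv[1]);
    count_directory (root, &n_items, &total_size);
    g_print ("Items: %" G_GUINT64_FORMAT ", bytes: %" G_GUINT64_FORMAT "\n",
             n_items, total_size);
    return 0;
}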
