Hello GNOME community,
This (long, sorry) post has three goals:
- Raise awareness among privacy-conscious users of a way that GNOME Software may violate your expectations, and explain what you can do about it.
- Get feedback from the community on whether I’m making a big deal out of nothing.
- Assuming others agree, get some additional voices to speak up and persuade the GNOME Software maintainers to change their minds.
tl;dr for privacy-conscious users:
Run gsettings set org.gnome.software review-server ""
if you want to be protected from all of this (this will disable seeing and submitting user reviews in GNOME Software).
There’s also a poll at the bottom that I hope you don’t skip if you engage in any part of this.
The situation
GNOME Software communicates with odrs.gnome.org, the server responsible for hosting user reviews. When fetching user reviews, the ODRS server requires a pseudonymous user ID. It’s basically a cookie, in that it gets sent in the background when data are fetched from the Internet, and can be used to correlate data requests with more accuracy than an IP address.
The user ID that GNOME Software computes for this operation is a hash of the machine ID and the user’s local username, along with a fixed salt (implementation). None of these things are trivial to change or clear, which makes this ID more persistent than a web cookie.
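To make the persistence concrete, here is a minimal sketch of how an ID of this kind could be derived. The salt value, the way the inputs are joined, and the choice of SHA-1 are all assumptions for illustration; the linked implementation is the authority on what GNOME Software really does.

```python
import hashlib

# Assumption: a placeholder salt, not the one GNOME Software uses.
FIXED_SALT = "example-fixed-salt"

def pseudonymous_user_id(machine_id: str, username: str) -> str:
    """Hash a fixed salt, the username, and the machine ID into one hex string."""
    salted = f"{FIXED_SALT}[{username}:{machine_id.strip()}]"
    # Assumption: SHA-1 stands in for whatever hash the real code uses.
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()

# Made-up inputs, for illustration only.
machine = "0123456789abcdef0123456789abcdef"
example_id = pseudonymous_user_id(machine, "alice")
same_again = pseudonymous_user_id(machine, "alice")  # identical on every run
renamed = pseudonymous_user_id(machine, "bob")       # changes only if an input changes
```

The point of the sketch is that nothing random or time-dependent goes in, so the same machine and username yield the same ID indefinitely.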
Again: this ID gets sent with the request every time reviews of an application are fetched. GNOME Software does this every[1] time it displays information about an application. So when you, say, right-click on an icon from the dash, and select ‘App Details’, the name of an application you have installed and a cookie that persistently identifies you as long as you don’t reset your machine ID or rename your user are sent to the ODRS server, and you are not notified.
The GNOME Software webpage doesn’t mention this, or link to ODRS.
The GNOME project-wide privacy policy also doesn’t link to ODRS.
The About dialog in GNOME Software doesn’t mention this, or link to ODRS.
Even the ODRS privacy policy describes the situation incompletely: it mentions that ‘User ID (hashed)’ is collected, but gives the purpose only as ‘[t]o know what users have voted on each review, and to prevent abuse by users down or upvoting too many things’, which doesn’t cover merely fetching reviews. (To their credit, the odrs.gnome.org landing page does a better job of this.)
Why is the user ID sent for fetching reviews? Because the ODRS server sends back a token computed from the user ID with the response, and then requires that token if the user submits a review. The stated purpose of this two-step dance is to cut down on spam; a spammer writing a script has to make both requests, which is deemed a more challenging exercise than making one. There is no explanation given for why the token needs to be returned as part of the review fetching request that happens in the background every time GNOME Software displays application details, as opposed to a separate request issued only when needed.
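Here is a rough, in-memory sketch of the two-step flow as I understand it, next to the variant where the token comes from a separate request. Every function name, the HMAC construction, and the secret are hypothetical; this is not the ODRS API or its actual token scheme.

```python
import hashlib
import hmac

# Assumption: a made-up secret and HMAC scheme, purely for illustration.
SERVER_SECRET = b"hypothetical-server-secret"

def make_token(user_id: str) -> str:
    """Server side: derive a token from the user ID."""
    return hmac.new(SERVER_SECRET, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

def fetch_reviews_current(user_id: str, app_id: str) -> dict:
    """Flow described above: every fetch carries the user ID and returns a token."""
    return {"app_id": app_id, "reviews": [], "token": make_token(user_id)}

def fetch_reviews_anonymous(app_id: str) -> dict:
    """Alternative: browsing reviews sends no identifier at all."""
    return {"app_id": app_id, "reviews": []}

def request_token(user_id: str) -> str:
    """Alternative: a separate request, made only when a review is being written."""
    return make_token(user_id)

def submit_review(user_id: str, token: str, app_id: str, text: str) -> bool:
    """Server side: accept the review only if the token matches the user ID."""
    return hmac.compare_digest(token, make_token(user_id))

uid = "a" * 40  # stand-in for a hashed user ID
browse = fetch_reviews_anonymous("org.example.App")
accepted = submit_review(uid, request_token(uid), "org.example.App", "Works well")
```

In both variants a script still has to make two requests before it can submit, so the spam barrier is the same; the difference is that in the second one the ID is only sent when the user actually chooses to write a review.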
@pwithnall and @Richard_Hughes, two of the three GNOME Software maintainers (and in Richard’s case, the GDPR Data Protection Officer for ODRS), say that this is completely overblown.
- The ODRS server software is open-source, and doesn’t store the association between user ID and application unless the user is submitting a review. (But if the server is behind some sort of proxy, and if that proxy has logs, then the association could be stored in those logs. Plus many similar quibbles; see ‘Do you trust the server?’ below.)
- They assert that it is necessary to send the user ID along with requests for reviews to prevent abuse. (As described above, raising the cost from a one-request script to a two-request script is not much of a barrier to abuse, unless the ODRS server operators know something I don’t. Besides, getting the token needed for submitting reviews could be a separate request.)
- And they don’t want to inform users or, worse, ask for their consent, because they think it would be a distraction, confusing, or alarming to do so. (I have no fact-based rebuttal for this, but as a matter of opinion, I think it is deeply unethical to avoid asking for consent, in this and every other context, simply because asking might result in a ‘no’.)
Mitigations
For users
As the maintainers have noted, you can run
gsettings set org.gnome.software review-server ""
in a terminal. This leaves most GNOME Software functionality intact but disables viewing and submitting user reviews. I recommend this to everyone for now!
For distributions
Privacy-oriented Linux distributions may want to patch the ODRS functionality out of GNOME Software, with the same effect as the user setting above. I don’t have a public patch available yet but I intend to create one and link it here if this situation doesn’t look like it will resolve.
For GNOME Software / ODRS
The issue here is the confluence of several things, any of which could be mitigated individually.
- User ID is sent with review fetches
  - GNOME Software could send one request without an ID to get reviews, and another request to get the token needed to submit a review when it is needed. This wouldn’t require any changes to ODRS.
  - ODRS could be updated with a new request that just returns a token for reviews.
  - The two-request flow could be dropped, since how much of a barrier to spam is this really?
- Users don’t know about the user ID
  - Link to the ODRS privacy policy from About and the GNOME Software webpage. (This is a partial mitigation but better than nothing.)
  - GNOME Software could ask for consent before sending a tracking identifier.
- The user ID is hard to change
  - GNOME Software could use something other than hashing machine ID + username, like a random GUID stored somewhere in the user’s home. That would at least give users the option of ‘clearing cookies’. (This is a partial mitigation but better than nothing.)
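As a sketch of the last idea, a random ID kept in a file could look something like the following. The file name and location are hypothetical, and this is Python for brevity rather than anything resembling a patch to GNOME Software.

```python
import tempfile
import uuid
from pathlib import Path

def load_or_create_user_id(path: Path) -> str:
    """Return the stored random ID, creating one on first use."""
    if path.exists():
        existing = path.read_text(encoding="utf-8").strip()
        if existing:
            return existing
    path.parent.mkdir(parents=True, exist_ok=True)
    new_id = uuid.uuid4().hex  # random, not derived from machine ID or username
    path.write_text(new_id + "\n", encoding="utf-8")
    return new_id

def reset_user_id(path: Path) -> None:
    """The 'clear cookies' action: delete the file so a fresh ID is generated."""
    path.unlink(missing_ok=True)

# Demo in a temporary directory; a real client would pick a path in the user's home.
with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "review-id"  # hypothetical file name
    first = load_or_create_user_id(id_file)
    second = load_or_create_user_id(id_file)  # same value as first
    reset_user_id(id_file)
    third = load_or_create_user_id(id_file)   # new random value after reset
```

Unlike the hashed machine ID, the ID here carries no information about the machine and can be thrown away without renaming the user or touching system files.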
As a mitigation of last resort, GNOME Software could stop using ODRS, temporarily (if another mitigation is pending) or permanently.
Why this matters
Given Richard’s assurances that nothing nefarious is going on here, why should you be expected to care?
(Please skip to the next section if it’s obvious to you why this should matter.)
Do you trust the server?
Richard says, and the ODRS code confirms, that the association of (IP, user ID, application) isn’t stored by the ODRS server application. But that’s far from the end of the story. Other things I am left wondering:
- Is there a proxy that logs this information in front of the ODRS server?
- Who controls the infrastructure? Are they definitely deploying the ODRS software exactly as it is shown?
- Can anyone breach the server?
- Will a government require the server to secretly log and report?
- Will any of the relevant parties be sold to an entity that is incentivized to collect and sell analytics?
- Will any of the above change in the future?
Why might you care about (IP, user ID, application) associations?
All tracking data can be used to build out a more accurate picture of you. Application aside, if you move your laptop from location to location, a log of time, IP address, and user ID can be used to track your movements. Your name may not be easily extractable from your user ID alone, but it could be easier to identify you if your location and pattern of movement—home to employer, perhaps—is known. Once a user ID is known to correspond to a person with a particular home and employer, knowing what applications they have installed could lead to some negative real-life consequences for them, depending on, for example, the laws where they live, or if those applications have vulnerabilities. I don’t think it takes much imagination to fill in a few plausible scenarios here.
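To illustrate the kind of correlation I mean, here is a toy example of grouping log lines by user ID. The log entries are invented (the IPs are from documentation-reserved ranges), and nothing here is meant to suggest ODRS keeps such a log; it only shows what anyone holding one could do.

```python
from collections import defaultdict

# Hypothetical log entries: (timestamp, ip, user_id, app)
log = [
    ("2024-01-08T08:05", "192.0.2.10", "id-A", "org.example.Mail"),
    ("2024-01-08T09:40", "198.51.100.7", "id-A", "org.example.Editor"),
    ("2024-01-08T09:45", "198.51.100.7", "id-B", "org.example.Mail"),
    ("2024-01-08T19:30", "192.0.2.10", "id-A", "org.example.Vpn"),
]

def profile_by_user_id(entries):
    """Group entries by user ID, keeping the sequence of IPs and the set of apps."""
    profiles = defaultdict(lambda: {"ips": [], "apps": set()})
    for timestamp, ip, user_id, app in sorted(entries):  # sorted by timestamp
        profile = profiles[user_id]
        # Record an IP only when it differs from the previous one (a "move").
        if not profile["ips"] or profile["ips"][-1] != ip:
            profile["ips"].append(ip)
        profile["apps"].add(app)
    return dict(profiles)

profiles = profile_by_user_id(log)
```

For `id-A` the result is a morning address, a daytime address, and the morning address again in the evening, plus the list of applications looked up. With IP addresses alone, the two locations could not be tied to the same person.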
If you don’t trust the people behind GNOME Software, don’t you have bigger issues, like… all of your software?
There is a difference between software compiled by your Linux distribution and run on your device, and software run on a server. The former is publicly auditable, and if you are paranoid you can compile from source yourself, with an appropriately audited compiler and source code supply chain. But approximately everyone who uses GNOME sends data to ODRS right now, and for the above reasons, even though the ODRS server code is available, you can’t have the same level of assurance that your data aren’t being mishandled.
Is a hashed identifier any worse than an IP address?
Of course it is. That’s why it’s being collected. If both were equally personally-identifying, the ODRS server would just use the IP address for abuse prevention.
But more concretely:
- Privacy-conscious users have options to mask or rotate their IP address, such as VPNs, system-wide Tor, public or shared network access points, or a cooperative ISP. Changing username or machine ID is much more disruptive.
- Privacy-conscious users know that their IP address is something for them to keep in mind. I don’t think the vast majority of GNOME users, even privacy-conscious ones, know about this ODRS user ID.
Now what?
I’m getting some very strong pushback from Richard and Philip on doing anything about all this short of a hard fork. Maybe it’s because I’m being unreasonable, or maybe I came in too hot or was rude in some way (apologies if so). My intention isn’t at all to be edgy or provocative, and I would like to help out with whatever patches are needed so that this isn’t a demand for volunteer maintainers to fix something for me.
I’d love to get some perspective on how other users, once informed of the above facts, perceive this situation.
- This is an issue, and I lean towards doing something about it.
- This is an issue, but I lean towards not changing anything—the alternatives are all worse in some way.
- This is an issue, but I’m fully undecided on whether something should be done about it.
- I agree with the maintainers that this is not an issue, even in principle.
Additionally, if you think this issue merits consideration, speaking up here or on the GitLab issue would be useful. I don’t think addressing this situation requires a lot of implementation work—certainly not more than I’d be capable of doing myself. The hard part will be reaching consensus with the maintainers and/or the ODRS operator(s) (is this anyone other than Richard? I still don’t know) on what is to be done.
Alternatively, if I need a good talking-to about how things are properly done around here, please give that to me before I embarrass myself further.
Thanks for reading.
[1] There is some local caching of responses, so not literally every time.