GNOME Software (the ‘app store’) has cookie-like user tracking

Hello GNOME community,

This (long, sorry) post has three goals:

  1. Raise awareness for privacy-conscious users of a way that GNOME Software may violate your expectations, and what you can do about it.
  2. Get feedback from the community on whether I’m making a big deal out of nothing.
  3. Assuming others agree, get some additional voices to speak up and persuade the GNOME Software maintainers to change their minds.

tl;dr for privacy-conscious users:

Run gsettings set org.gnome.software review-server "" if you want to be protected from all of this (this will disable seeing and submitting user reviews in GNOME Software).

There’s also a poll at the bottom that I hope you don’t skip if you engage in any part of this.

The situation

GNOME Software communicates with odrs.gnome.org, the server responsible for hosting user reviews. When fetching user reviews, the ODRS server requires a pseudonymous user ID. It’s basically a cookie, in that it gets sent in the background when data are fetched from the Internet, and can be used to correlate data requests with more accuracy than an IP address.

The user ID that GNOME Software computes for this operation is a hash of the machine ID and the user’s local username, along with a fixed salt (implementation). None of these things are trivial to change or clear, which makes this ID more persistent than a web cookie.
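To make the persistence point concrete, here is a minimal sketch of an identifier derived this way. The salt value, the `salt[user:machine]` layout, and the use of SHA-1 are assumptions for illustration; the linked implementation is the authoritative source for the real details.

```python
import hashlib


def pseudonymous_user_id(machine_id: str, username: str,
                         salt: str = "example-salt") -> str:
    """Derive a stable identifier from machine ID, username, and a fixed salt.

    The salt value, the string layout, and the hash function are
    assumptions made for this sketch only.
    """
    salted = f"{salt}[{username}:{machine_id}]"
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()


MACHINE_ID = "0123456789abcdef0123456789abcdef"  # made-up machine ID

# The same inputs always give the same ID, whatever network you are on.
id_today = pseudonymous_user_id(MACHINE_ID, "alice")
id_next_year = pseudonymous_user_id(MACHINE_ID, "alice")
assert id_today == id_next_year

# Only changing the machine ID or the username yields a different ID.
id_renamed = pseudonymous_user_id(MACHINE_ID, "bob")
assert id_renamed != id_today
```

Nothing in the inputs changes on its own, so the output has no expiry and no 'clear' button.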

Again: this ID gets sent with the request every time reviews of an application are fetched. GNOME Software does this every[1] time it displays information about an application. So when you, say, right-click on an icon from the dash, and select ‘App Details’, the name of an application you have installed and a cookie that persistently identifies you as long as you don’t reset your machine ID or rename your user are sent to the ODRS server, and you are not notified.

The GNOME Software webpage doesn’t mention this, or link to ODRS.

The GNOME project-wide privacy policy also doesn’t link to ODRS.

The About dialog in GNOME Software doesn’t mention this, or link to ODRS.

Even the ODRS privacy policy describes the situation incompletely, mentioning that ‘User ID (hashed)’ is collected, but only for the purpose of ‘[t]o know what users have voted on each review, and to prevent abuse by users down or upvoting too many things’, which doesn’t cover merely fetching reviews. (To their credit, the odrs.gnome.org landing page does a better job of this.)

Why is the user ID sent for fetching reviews? Because the ODRS server sends back a token computed from the user ID with the response, and then requires that token if the user submits a review. The stated purpose of this two-step dance is to cut down on spam; a spammer writing a script has to make both requests, which is deemed a more challenging exercise than making one. There is no explanation given for why the token needs to be returned as part of the review fetching request that happens in the background every time GNOME Software displays application details, as opposed to a separate request issued only when needed.
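As I understand it, the two-step flow looks roughly like the sketch below. This is not the ODRS code: the function names, the server secret, and the use of HMAC-SHA256 are all hypothetical stand-ins chosen to show the shape of the exchange.

```python
import hashlib
import hmac

# Assumption: some secret known only to the server.
SERVER_SECRET = b"hypothetical-server-secret"


def make_token(user_id: str) -> str:
    """Compute the token the server derives from the user ID."""
    return hmac.new(SERVER_SECRET, user_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def fetch_reviews(user_id: str, app_id: str) -> dict:
    """Step 1: return reviews for app_id plus a token for this user ID."""
    return {"app_id": app_id, "reviews": [], "token": make_token(user_id)}


def submit_review(user_id: str, token: str, text: str) -> bool:
    """Step 2: accept a review only if it carries the matching token."""
    if not text.strip():
        return False
    return hmac.compare_digest(token, make_token(user_id))


response = fetch_reviews("abc123", "org.example.App")
assert submit_review("abc123", response["token"], "Works well")
assert not submit_review("abc123", "forged-token", "Spam")
```

Note that `make_token` never looks at the application. In this sketch, at least, nothing ties handing out the token to the review fetch, which is why a separate token request seems feasible to me.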

@pwithnall and @Richard_Hughes, two of the three GNOME Software maintainers (and in Richard’s case, the GDPR Data Protection Officer for ODRS), say that this is completely overblown.

  • The ODRS server software is open-source, and doesn’t store the association between user ID and application unless the user is submitting a review. (But if the server is behind some sort of proxy, and if that proxy has logs, then the association could be stored in those logs. Plus many similar quibbles; see ‘Do you trust the server?’ below.)
  • They assert that it is necessary to send the user ID along with requests for reviews to prevent abuse. (As described above, raising the cost from a one-request script to a two-request script is not much of a barrier to abuse, unless the ODRS server operators know something I don’t. Besides, getting the token needed for submitting reviews could be a separate request.)
  • And they don’t want to inform users or worse, ask for their consent, because they think it would be a distraction, confusing, or alarming to do so. (I have no fact-based rebuttal for this, but as a matter of opinion, I think it is deeply unethical to avoid asking for consent—in this and every other context—simply because asking might result in a ‘no’.)

Mitigations

For users

As the maintainers have noted, you can run

gsettings set org.gnome.software review-server ""

in a terminal. This leaves most GNOME Software functionality intact but disables viewing and submitting user reviews. I recommend this to everyone for now!

For distributions

Privacy-oriented Linux distributions may want to patch the ODRS functionality out of GNOME Software, with the same effect as the user setting above. I don’t have a public patch available yet but I intend to create one and link it here if this situation doesn’t look like it will resolve.

For GNOME Software / ODRS

The issue here is the confluence of several things, any of which could be mitigated individually.

  • User ID is sent with review fetches
    • GNOME Software could send one request without an ID to get reviews, and another request to get the token needed to submit a review when it is needed. This wouldn’t require any changes to ODRS.
    • ODRS could be updated with a new request that just returns a token for reviews.
    • The two-request flow could be dropped, since how much of a barrier to spam is this really?
  • Users don’t know about the user ID
    • Link to the ODRS privacy policy from About and the GNOME Software webpage. (This is a partial mitigation but better than nothing.)
    • GNOME Software could ask for consent before sending a tracking identifier.
  • The user ID is hard to change
    • GNOME Software could use something other than hashing machine ID + username, like a random GUID stored somewhere in the user’s home. That would at least give users the option of ‘clearing cookies’. (This is a partial mitigation but better than nothing.)
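The random-identifier idea from the last bullet could be as small as the following sketch. The file location and function names are hypothetical, and a real patch would be written in C against GNOME Software rather than in Python.

```python
import tempfile
import uuid
from pathlib import Path


def load_or_create_user_id(id_file: Path) -> str:
    """Return the stored random ID, creating one on first use."""
    if id_file.exists():
        stored = id_file.read_text(encoding="utf-8").strip()
        if stored:
            return stored
    new_id = uuid.uuid4().hex
    id_file.parent.mkdir(parents=True, exist_ok=True)
    id_file.write_text(new_id + "\n", encoding="utf-8")
    return new_id


def reset_user_id(id_file: Path) -> None:
    """'Clear cookies': delete the stored ID so the next call makes a new one."""
    id_file.unlink(missing_ok=True)


# Demonstration in a throwaway directory; the path is a made-up example.
with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "gnome-software" / "review-user-id"
    first = load_or_create_user_id(id_file)
    again = load_or_create_user_id(id_file)
    reset_user_id(id_file)
    fresh = load_or_create_user_id(id_file)

assert first == again   # stable until the user resets it
assert fresh != first   # resetting really does produce a new identity
```

Unlike the machine ID and username, deleting this file breaks nothing else on the system.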

As a mitigation of last resort, GNOME Software could stop using ODRS, temporarily (if another mitigation is pending) or permanently.

Why this matters

Given Richard’s assurances that nothing nefarious is going on here, why should you be expected to care?

(Please skip to the next section if it’s obvious to you why this should matter.)

Do you trust the server?

Richard says, and the ODRS code confirms, that the association of (IP, user ID, application) isn’t stored by the ODRS server application. But that’s far from the end of the story. Other things I am left wondering:

  • Is there a proxy that logs this information in front of the ODRS server?
  • Who controls the infrastructure? Are they definitely deploying the ODRS software exactly as it is shown?
  • Can anyone breach the server?
  • Will a government require the server to secretly log and report?
  • Will any of the relevant parties be sold to an entity that is incentivized to collect and sell analytics?
  • Will any of the above change in the future?

Why might you care about (IP, user ID, application) associations?

All tracking data can be used to build out a more accurate picture of you. Application aside, if you move your laptop from location to location, a log of time, IP address, and user ID can be used to track your movements. Your name may not be easily extractable from your user ID alone, but it could be easier to identify you if your location and pattern of movement—home to employer, perhaps—is known. Once a user ID is known to correspond to a person with a particular home and employer, knowing what applications they have installed could lead to some negative real-life consequences for them, depending on, for example, the laws where they live, or if those applications have vulnerabilities. I don’t think it takes much imagination to fill in a few plausible scenarios here.

If you don’t trust the people behind GNOME Software, don’t you have bigger issues, like… all of your software?

There is a difference between software compiled by your Linux distribution and run on your device, and software run on a server. The former is publicly auditable, and if you are paranoid you can compile from source yourself, with an appropriately audited compiler and source code supply chain. But approximately everyone who uses GNOME sends data to ODRS right now, and for the above reasons even though the ODRS server code is available, you can’t have the same level of assurance that your data aren’t being mishandled.

Is a hashed identifier any worse than an IP address?

Of course it is. That’s why it’s being collected. If both were equally personally-identifying, the ODRS server would just use the IP address for abuse prevention.

But more concretely:

  • Privacy-conscious users have options to mask or rotate their IP address, such as VPNs, system-wide Tor, public or shared network access points, or a cooperative ISP. Changing username or machine ID is much more disruptive.
  • Privacy-conscious users know that their IP address is something for them to keep in mind. I don’t think the vast majority of GNOME users, even privacy-conscious ones, know about this ODRS user ID.

Now what?

I’m getting some very strong pushback from Richard and Philip on doing anything about all this short of a hard fork. Maybe it’s because I’m being unreasonable, or maybe I came in too hot or was rude in some way (apologies if so). My intention isn’t at all to be edgy or provocative, and I would like to help out with whatever patches are needed so that this isn’t a demand for volunteer maintainers to fix something for me.

I’d love to get some perspective on how other users, once informed of the above facts, perceive this situation.

  • This is an issue, and I lean towards doing something about it.
  • This is an issue, but I lean towards not changing anything—the alternatives are all worse in some way.
  • This is an issue, but I’m fully undecided on whether something should be done about it.
  • I agree with the maintainers that this is not an issue, even in principle.

Additionally, if you think this issue merits consideration, speaking up here or on the GitLab issue would be useful. I don’t think addressing this situation requires a lot of implementation work—certainly not more than I’d be capable of doing myself. The hard part will be reaching consensus with the maintainers and/or the ODRS operator(s) (is this anyone other than Richard? I still don’t know) on what is to be done.

Alternatively, if I need a good talking-to about how things are properly done around here, please give that to me before I embarrass myself further.

Thanks for reading.


  1. There is some local caching of responses, so not literally every time.


IMO this is a bit of a tempest in a teacup

On one hand: is this a catastrophic privacy violation, and are you as a user being spied on with this? No. As you pointed out on GitLab, you believe the GNOME developers when they tell you that nothing is being collected or stored. Awesome. Does this need a big discussion about it, using accusations like “telemetry” (which was the original issue title, and implies intentional data collection) to describe what might be an oversight nobody noticed for 8 years? No, probably not. This is probably why you got the pushback you did.

On the other hand: I don’t think it’d actually be that controversial to separate fetching reviews from registering a unique ID for review/vote submission. Plus it’d be more efficient: I wouldn’t be surprised if the majority of people never interact with reviews other than reading them, so there’s no need to generate a user_id for them and keep it in the database.

There are two legitimate use cases for having the user’s identity at fetch time:

  • Votes. Votes are the way abusive reviews get suppressed and deleted, given that there’s no “ODRS moderation” team. If you actually look at ODRS’s code in /fetch: if it sees that you already voted on a review (marked it useful, or not useful, or abusive) it’ll mark the returned review as such. This lets GNOME Software know that it should hide the vote buttons (because you’ve already voted). Without a user_hash passed to /fetch, GNOME Software doesn’t know which reviews you’ve already voted on.
  • There is also this MR that ensures a review you’ve left on an app appears first, so that you can easily find it to edit or delete it.

If you’re willing to work on ODRS, I think you could make the following changes to make the issue go away:

  • You’d need to amend /fetch to stop requiring (and using) the user_hash argument. For backwards compat, though, it might be prudent to keep accepting it so that older clients can keep functioning.
  • You’d need to add a special /register endpoint. This would take a user_hash and return the user secret along with two lists: one containing the reviews you’ve ever voted on, and the other containing the reviews you’ve submitted.
  • Make GNOME Software keep track of whether or not it ever called /register before (i.e. if you’ve ever submitted a review, or voted on a review, before)
  • If you’ve called /register before, GNOME Software can call /register again on each startup to obtain the user’s history. This way, it doesn’t need to pass around the user ID each time it fetches any app’s reviews. It’d only do so once per startup. Even better, it might try to cache this result and then keep track of votes/submitted reviews locally. This way it only has to call /register once.
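Roughly, the split could look like the sketch below. Everything in it is hypothetical: `FakeOdrs` is an in-memory stand-in, /register doesn’t exist in ODRS today, and the response fields are made up to match the description above.

```python
class FakeOdrs:
    """In-memory stand-in for the proposed server behaviour (hypothetical)."""

    def __init__(self):
        self.seen_user_hashes = []          # every user_hash the server receives
        self.votes = {"hash-alice": [101]}  # review IDs each user has voted on
        self.reviews = {"org.example.App": [{"id": 101}, {"id": 102}]}

    def fetch(self, app_id):
        # Proposed /fetch: no user_hash required.
        return [dict(review) for review in self.reviews.get(app_id, [])]

    def register(self, user_hash):
        # Proposed /register: returns the user secret plus the user's history.
        self.seen_user_hashes.append(user_hash)
        return {"secret": "secret-for-" + user_hash,
                "voted": list(self.votes.get(user_hash, [])),
                "submitted": []}


class Client:
    def __init__(self, server, user_hash):
        self.server = server
        self.user_hash = user_hash
        self.account = None  # cached /register result; None until needed

    def show_app_details(self, app_id):
        """Fetch reviews anonymously, then mark votes from the local cache."""
        voted = set(self.account["voted"]) if self.account else set()
        return [{**review, "already_voted": review["id"] in voted}
                for review in self.server.fetch(app_id)]

    def ensure_registered(self):
        """Call /register only when the user first votes or submits."""
        if self.account is None:
            self.account = self.server.register(self.user_hash)
        return self.account


server = FakeOdrs()
client = Client(server, "hash-alice")
client.show_app_details("org.example.App")
assert server.seen_user_hashes == []              # browsing sends no identifier
client.ensure_registered()                        # e.g. user clicks a vote button
client.ensure_registered()
assert server.seen_user_hashes == ["hash-alice"]  # sent once, then cached
details = client.show_app_details("org.example.App")
```

After registering, the vote markers come from the cached history, so /fetch stays anonymous.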

The remaining problem is that GNOME Software isn’t currently keeping track of this, so there’s no way to know whether a user has submitted anything before until we call /register. It’s a catch-22. Let’s say there’s a /check-registered API that works just like /register but without actually doing the registering. On first startup, GNOME Software can call /check-registered with the user ID to see if they’ve ever submitted anything. But this kinda defeats the purpose of not sending a unique ID over the network until actually necessary. Alternatively, GNOME Software can assume that nobody has ever registered, but then all the users will lose the ability to interact with their reviews until they take some action (like voting on a review) that tricks GNOME Software into reinstating its knowledge about their account. My guess is that this is the reason Philip said you might have better luck replacing ODRS entirely.


Let’s discuss replacing ODRS. Because nowadays, we have Flatpak. Does seeing reviews on distro packages matter that much anymore? Additionally, we’re about to gain something that we didn’t have before: paid apps on Flathub. These require an account to purchase and download the app. Perhaps Flathub could have a reviews system, and it would be tied to this same account. This would make the reviews stop being anonymous, and it would make the situation a lot more clear to the user: “I have an account, and the reviews I write are tied to my account”. The “consent dialog” you mention would simply turn into a “sign into your flathub account” dialog.


Overall, are you suggesting the following?

  • Download the reviews for information purposes, without the “problematic cookie.”
  • Require an account to review.

Look, this isn’t the 1990s; infosec isn’t about believing one friendly sysop at the other end of the line anymore. All of these additional concerns aren’t theoretical and shouldn’t be dismissed offhand like this.

Okay! If the relevant stakeholders agree that this is a good design, it works for me!

But then, nothing stops us from implementing the exact same thing on top of ODRS, without modifying ODRS at all.

Imagine the flow for Flathub reviews:

  • Somewhere in the app is a sign-in button and a sign-out button.
  • Reviews are either visible when not signed in (in which case they offer no widgets for interacting with them), or not visible, perhaps with a message that includes, or directs the user to, the sign-in button.
  • When signed in, the Flathub reviews can be interacted with, presumably in a similar manner as the existing ODRS reviews (meaning, editing or deleting one’s own review and voting on others’).

If this is deemed good and non-confusing, we can just do exactly this on top of ODRS. The only difference is that the sign-in process can be one-click, with no need for an authentication step from the user.

Being signed in would be a boolean in GNOME Software. Everyone can start signed out; there’s no catch-22, because signing in is explicit. When not signed in, if we want to display reviews, we can fetch them from ODRS using a fixed not-signed-in user ID. We’ll display them the same way we’d display reviews to a not-signed-in Flathub user: no affordances for voting or editing. Or we won’t display reviews, if that’s how we’d do it for Flathub. Once signed in, we re-fetch reviews with the real user ID. The sign-out button flips you back to using the not-signed-in ID. No new endpoints required. If Richard wants, ODRS can be optimized not to compute tokens when it receives the not-signed-in ID, to improve efficiency, or other request flows could be designed—but that’s all optional!
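In sketch form, the client-side logic is about this much. The placeholder ID value and the class and method names are made up for illustration, and the real thing would live in GNOME Software’s C code.

```python
# Hypothetical fixed placeholder shared by every signed-out user.
NOT_SIGNED_IN_ID = "0" * 40


class ReviewSession:
    def __init__(self, real_user_id: str):
        self._real_user_id = real_user_id
        self.signed_in = False  # everyone starts signed out

    def sign_in(self) -> None:
        """One click, no authentication step: just flip the boolean."""
        self.signed_in = True

    def sign_out(self) -> None:
        self.signed_in = False

    def user_id_for_requests(self) -> str:
        """The ID sent with review fetches in the current state."""
        return self._real_user_id if self.signed_in else NOT_SIGNED_IN_ID

    def can_interact(self) -> bool:
        """Show voting/editing widgets only when signed in."""
        return self.signed_in


session = ReviewSession("a1b2c3")
assert session.user_id_for_requests() == NOT_SIGNED_IN_ID
session.sign_in()
assert session.user_id_for_requests() == "a1b2c3"
session.sign_out()
assert session.user_id_for_requests() == NOT_SIGNED_IN_ID
```

The real user ID only leaves the machine after an explicit action by the user.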

That’s all I want. It’s the same thing as the Flathub thing from the user’s perspective, and if it’s a good experience in that possible future, why can’t we have it with ODRS today?

Yes, it makes explicit to the user that they have an ‘account’ on ODRS. That should be explicit to the user, because it’s true, and right now users don’t know it!

In so many words, yes. But the devil’s in the details in terms of maintaining the features GNOME Software currently exposes and the (dubious) robustness of ODRS against some forms of abuse.

I would like to suggest what could be a middle ground for the above debate, a relatively easy, low-tech next step: making the privacy documentation more easily discoverable and available to users of the GNOME Software application.

Why would that help, and how?

  • Generally speaking, most if not all of us will consider it good ethical practice to inform users in a transparent way.

  • On a related note, I find that the ODRS service already does a great job at informing users. Specifically, in the GitLab issue I found the link to https://odrs.gnome.org/privacy This page is awesome! But to be honest, if it were not for the GitLab issue, I would never have found this page; I would therefore like to propose that…

  • Specifically, GNOME Software could maybe do a better job at communicating to users that the ODRS service is used, and that it processes specific data in a secure and private way, and only for specific purposes. This could be achieved through simple low-tech GUI changes. For example, GNOME Software could advertise the ODRS privacy documentation by linking it in the About dialog. Or GNOME Software could display an introductory screen with privacy information when started for the first time. GUI changes like this could communicate to users that GNOME Software is taking privacy seriously. They could also help users make an informed decision about whether they want to use the GNOME Software app.

Given the somewhat heated discussion, I want to emphasize that I appreciate the points raised on both sides. I’m not taking a position on whether the technical solution can or should be improved. My main point is that, whatever the solution, it should be straightforward for users to understand what data processing is taking place.

I’m also not arguing for or against requiring user consent in GNOME Software. My focus is simply on transparency: regardless of whether consent is legally required, users should be clearly informed about any data processing involved.

Hope this helps! Thank you for your consideration!
