Persistent remote desktop access API

Hi! I’m with the Chrome Remote Desktop team at Google, and I have some time this year to work toward a standard API for persistent remote access under Wayland, especially important given future plans to remove X11 support.

The current portals API for remote desktop is a good fit for remote support use cases, where there is a local user present to approve access for each connection and mirroring the physical displays is desired.

However, for persistent remote access, there are additional requirements that cannot be achieved through the existing portals API. Specifically:

  • One-time permission grant: The user needs to be able to grant permission to the remote desktop tool when they first set it up, and connect any time without further permission prompts. When connecting remotely to one’s own workstation, there will be no local user to approve the connection. This permission should survive automatic updates.
  • Virtual seat: It’s a security risk to have the physical seat unlock when a user connects to the workstation remotely, so there must be a way to create a virtual seat for the connection and have the desktop rendered there. The physical seat could then return to GDM.
  • Display configuration: Commonly, a user wants the virtual seat to match the monitor configuration of the machine they are connecting from for an immersive, full-screen, potentially multi-monitor setup. Thus, the remote desktop tool needs a way to configure virtual monitors, layout, and scaling factors to match the client.
  • Launching a session: If the user does not have an active session, the remote desktop tool should be able to launch one on their behalf.

While I realize the standardization of these APIs will likely happen outside of GNOME, GNOME is currently our first priority to support for Wayland remote-desktop, so I want to make sure whatever technical approach we pursue is one that would be accepted by GNOME.

I know @jadahl had some ideas about this in a previous discussion, and suggested a new xdg-spec under the org.freedesktop. prefix would be the right umbrella for this (as opposed to, e.g., a (set of) Wayland protocol extension(s) or twisting Portals to support persistent privileges), and had a rough sketch for what it could look like. (Do you still have that handy?)

I also saw that there is a new transient seat Wayland protocol extension geared toward remote use cases, though I’m not sure if that would be useful for this particular endeavor. (The overview says, “This protocol is intended for use with virtual input protocols such as ‘virtual_keyboard_unstable_v1’ or ‘wlr_virtual_pointer_unstable_v1’”, while I have the impression that GNOME would prefer a protocol that still uses PipeWire for capture and libei for input injection.)

For GNOME 46, gnome-remote-desktop is gaining login screen support via GDM. There is now a gnome-remote-desktop system service that serves as a dispatcher. When the user connects, they are redirected to a login screen session running its own gnome-remote-desktop session service (using a feature of RDP called ServerRedirection). After authentication, the user session is started and the user is redirected again (using RDP ServerRedirection) to that session.

Assuming Chrome Remote Desktop has a similar redirection ability, it could probably be implemented in a similar way. The GDM side of things isn’t tied to gnome-remote-desktop, so other remoting implementations should be able to use the same mechanism (I believe NiceDCV is working on an implementation as well, for instance).

The basic idea is that when a user connects, a system service calls org.gnome.DisplayManager.RemoteDisplayFactory.CreateRemoteDisplay(remote_id), which makes GDM start a login screen session. The display gets exported over the bus via a D-Bus ObjectManager interface. The login screen session should have a session agent running in it (started via, say, an autostart file). When the agent starts, it should coordinate with the system service so that the system service redirects the user to the login screen session using some protocol-specific means. When authentication completes, the newly started user session display has the same remote_id that was passed in initially at CreateRemoteDisplay time. This new session should also have a session agent running in it (started either via autostart or a systemd --user service) to coordinate the handoff from the login screen session to the user’s session (probably by way of the system dispatcher service).
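As a rough illustration of the remote_id correlation described above, here is a toy Python model with no D-Bus involved. The RemoteDisplay and Dispatcher classes, their fields, and the returned strings are all made up for this sketch; only the idea that the greeter display and the later user session display carry the same remote_id comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteDisplay:
    """Hypothetical stand-in for a display that GDM exports over the bus."""
    object_path: str
    remote_id: str


@dataclass
class Dispatcher:
    """Toy system service that correlates displays by remote_id."""
    displays: dict = field(default_factory=dict)  # remote_id -> list of displays

    def on_display_added(self, display: RemoteDisplay) -> str:
        seen = self.displays.setdefault(display.remote_id, [])
        seen.append(display)
        if len(seen) == 1:
            # First display for this client: the login screen session.
            return f"redirect client {display.remote_id} to {display.object_path}"
        # A later display with the same remote_id is the user session
        # started after authentication, so the client is handed off to it.
        previous = seen[-2]
        return (f"hand off client {display.remote_id} "
                f"from {previous.object_path} to {display.object_path}")


if __name__ == "__main__":
    dispatcher = Dispatcher()
    print(dispatcher.on_display_added(RemoteDisplay("/greeter", "client-1")))
    print(dispatcher.on_display_added(RemoteDisplay("/session", "client-1")))
```

A real dispatcher would learn about new displays from ObjectManager signals rather than direct calls, and the redirect itself would be protocol specific.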

For 46 it’s either/or. A user gets a remote login or a local login. A dialog will pop up asking the user if they want to kill the other session when they try to log in twice.

For 47, the plan is to support a “hybrid” mode where a session can be started either remotely or locally and then connected to the other way.

This work is being spearheaded by Joan Torres, so he’s a good person to reach out to (perhaps on #gnome-shell on Matrix).

Work on creating an initial draft has started at https://gitlab.freedesktop.org/jadahl/xdg-specs/-/merge_requests/1. One of the main goals is to share as much functionality with the portals as possible, meaning e.g. libei for input and PipeWire for output.

Thanks for the info! It’s great to hear hybrid functionality is already in the works. Based on my reading, it seems like the current work for gnome-remote-desktop and the proposed xdg spec use slightly different models, and I want to make sure I understand correctly:

gnome-remote-desktop

  • Uses private, unstable APIs, currently.
  • Runs as a system service.
  • When a user connects, they are presented with a GDM login screen and may log in as any user, which will either spawn a new session or connect to an existing one (once hybrid mode is implemented).
  • Multiple connections are possible at once as long as each one logs in as a different user, so different users would have a set of machine access credentials and would then log into their respective accounts using GDM?
  • Could potentially work with encrypted home directories.

XDG remote desktop spec

  • D-Bus service and remote desktop tool run as user services.
  • When a user connects, the remote desktop tool calls org.freedesktop.RemoteDesktop1.CreateSession (assuming no remote connection is currently active). This will either:
    • Start a new desktop session for the user, or
    • If there is an existing desktop session, transition it to headless.
  • Since it is a user service, only one connection at a time makes sense and is allowed.
  • Presumably there would be no GDM login step, since the service is already running as a specific user?
  • Would require linger to start at boot, and wouldn’t work for a user with an encrypted home directory until it had been unlocked.
  • Multiple users would each have their own remote desktop service.

Is that all correct?

Hi Erik.

About the draft Jonas did, that looks like what RemoteDesktop and Screencast portals do, but I don’t think that part covers starting and orchestrating headless sessions.

Based on the current GNOME approach, I’ll try to explain this at a more abstract level, to show how things could work for any remote desktop solution.

I’m assuming:

  • The current RemoteDesktop and ScreenCast portals are enough for using a Wayland session.
  • A Wayland session is capable of running headless.

An overview:

  1. One daemon running as a system service to which the remote clients connect. This daemon is in charge of dispatching the remote clients which will end up displaying a headless Wayland session using the portals API.

  2. On a new connection, this system daemon (aka dispatcher) does the required authentication with the client (if wanted), and if it succeeds, creates a headless Wayland session. This Wayland session must have a service running which will get the remote client and use the portals to show the session.

  3. To hand over the remote client, which is connected to the system daemon, to the service running in the headless Wayland session, the system daemon registers a Handover iface, and the session service uses it to get the remote client. The simple workflow for this might be 1. StartHandover, 2. TakeClientReady and 3. TakeClient.

Things to note:

  • The Handover API is based on the RDP protocol, which has a ServerRedirection functionality. When a redirection is requested, the remote client disconnects and reconnects to the system daemon; this way the session service can take the remote client using its corresponding Handover interface.

  • We rely on the display manager (GDM) to create headless Wayland sessions; in other situations that might be different. GDM exposes the org.gnome.DisplayManager.RemoteDisplayFactory.CreateRemoteDisplay method to create a headless Wayland greeter. When the greeter is created, GDM registers an org.gnome.DisplayManager.RemoteDisplay iface. When the dispatcher finds that a new RemoteDisplay has been added, it registers the corresponding Handover iface to allow the handover mentioned before.
    After a successful login at the greeter, a new headless user session is started, along with the session service that shows that session remotely. GDM registers a RemoteDisplay iface for this session, and the dispatcher responds by registering a Handover iface for that session. Here the remote client will be redirected from the greeter session to the user session.

  • As mentioned in the last point, there is a situation where the dispatcher redirects a remote client from one session to another.
    A remote client is identified by a unique id. Each registered RemoteDisplay has this unique id as a property too. When a new RemoteDisplay is registered with the same remote client id for a new session, the dispatcher registers a new Handover iface and sets it as dst, and the old Handover iface with the same remote client id is set as src. Then the workflow of the handoff would be: 1. StartHandover (dst), 2. RedirectClient (src), 3. TakeClientReady (dst) and 4. TakeClient (dst).
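The ordering of that four-step handoff can be sketched as a small Python model. This is only an illustration under my own assumptions: the Handover class below is a plain object rather than a D-Bus interface, it does not distinguish methods from signals, and the ordering checks are invented to make the sequence explicit. The step names and the src/dst roles are the ones listed above.

```python
class Handover:
    """Toy model of one Handover iface; not the real D-Bus interface."""

    def __init__(self, remote_id: str, role: str, log: list):
        self.remote_id = remote_id
        self.role = role          # "src" or "dst"
        self.log = log            # shared list recording the step order
        self.started = False
        self.client_ready = False

    def _record(self, step: str) -> None:
        self.log.append(f"{step} ({self.role})")

    def start_handover(self) -> None:
        self.started = True
        self._record("StartHandover")

    def redirect_client(self) -> None:
        self._record("RedirectClient")

    def take_client_ready(self) -> None:
        if not self.started:
            raise RuntimeError("TakeClientReady before StartHandover")
        self.client_ready = True
        self._record("TakeClientReady")

    def take_client(self) -> str:
        if not self.client_ready:
            raise RuntimeError("TakeClient before TakeClientReady")
        self._record("TakeClient")
        return self.remote_id


def hand_off(src: Handover, dst: Handover) -> str:
    """Run the four steps in the order given in the post."""
    if src.remote_id != dst.remote_id:
        raise ValueError("src and dst must share the remote client id")
    dst.start_handover()
    src.redirect_client()
    dst.take_client_ready()
    return dst.take_client()


if __name__ == "__main__":
    steps = []
    greeter = Handover("client-1", "src", steps)
    session = Handover("client-1", "dst", steps)
    hand_off(greeter, session)
    print(steps)
```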

I think a third-party remote solution could work if it implements a similar system daemon with a similar Handover iface that uses the GDM ifaces mentioned, plus a session service that uses that Handover interface.

If I try to think of different desktop environments… Maybe, if the session service that shows a session remotely relies on portals, and that desktop environment implements the RemoteDesktop and ScreenCast portals and can run headlessly, that could be a start. Maybe other display managers could implement something similar to CreateRemoteDisplay and RemoteDisplay; I don’t know how much sense it makes to standardise that.

All this is for the non-persistent use case. The next step I’m about to work on is handling persistent and hybrid sessions. I don’t have many answers for that yet. I think having persistent sessions relies more on GDM and mutter, so any third-party remote solution that works like what was mentioned before shouldn’t need to change much.

Areas where help would be useful: hybrid and persistent sessions, and improving GDM so that it does not need a greeter (maybe by plugging GDM into the dispatcher to authenticate the remote client directly; Kerberos could be used, which RDP might support in the future). Here is a list of TODOs.

Thanks for the reply!

Even if session creation / curtaining is assumed to be handled by a different API analogous to the GDM RemoteDisplay API, there are three key pieces provided by @jadahl’s draft API and the unstable Mutter API that aren’t provided by the Portal APIs:

  • Virtual monitors: The draft API has CreateVirtualMonitor, and Mutter has RecordVirtual. As far as I can tell, the Portal APIs provide no mechanism for this, which makes sense since they are geared toward the use case where a user who is physically present is granting temporary access to their existing session.
  • Persistent access permission: As far as I understand, the draft API and the Mutter API are designed to be exposed only to trusted processes, and thus don’t require a local user to approve every access. The Portal RemoteDesktop API, on the other hand, seems to explicitly disallow any persistence of permissions, and requires a local user to select what to share for each session.
  • Clipboard access: both the draft API and Mutter.RemoteDesktop have APIs for interacting with the clipboard for clipboard forwarding purposes. I don’t see anything in the Portal APIs providing that.

Portal permissions and limitations aside, what happens if the user logs into an account that doesn’t have a remote desktop user service configured? Do they get stuck due to the lack of a hand-off while the new session runs without a method of connecting to it? Is there any reason the system service couldn’t inject a process/service into the new session instead of requiring it to be configured ahead of time?

I suppose this issue can be mitigated somewhat by enabling the service in the global systemd user configuration.


Overall, to support persistent / hybrid sessions across multiple desktop environments / greeters, it sounds like we would need:

  • Each DE to implement a standard API like the one proposed by @jadahl or something else analogous to the existing unstable Mutter.{RemoteDesktop,ScreenCast} APIs that would be available in headless mode with the extra functionality needed over the existing Portal APIs.
  • Each greeter to implement something analogous to GDM’s unstable RemoteDisplay interface.
  • Some standard mechanism for a greeter to launch a desktop environment in headless mode, and to tell a supporting desktop environment to transition into or out of headless mode.

Does that sound like a reasonable summary?

I’m assuming things, since what I did was combine all the needed pieces in the GNOME ecosystem to get remote login working. I haven’t played with portals.

From the ScreenCast interface, at least, I see the SelectSources method allows a virtual source type; maybe that is enough to make the backend create a virtual monitor. However, as I mentioned, I haven’t played with portals.
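To make that concrete, here is a minimal Python sketch of the options a caller might build for the portal’s SelectSources call to ask for a virtual source. The bit values (MONITOR = 1, WINDOW = 2, VIRTUAL = 4) and the "types" and "multiple" option names are as I recall them from the ScreenCast portal documentation and should be checked there; the helper function and its error handling are my own invention. A real D-Bus call would also need these values wrapped as variants, which is omitted here.

```python
from enum import IntFlag


class SourceType(IntFlag):
    """Source type bits, assumed from the ScreenCast portal documentation."""
    MONITOR = 1
    WINDOW = 2
    VIRTUAL = 4


def select_sources_options(available_source_types: int) -> dict:
    """Build SelectSources options asking for a single virtual source.

    available_source_types is meant to be the value of the portal's
    AvailableSourceTypes property, read by the caller beforehand.
    """
    if not available_source_types & SourceType.VIRTUAL:
        raise RuntimeError("backend does not offer virtual sources")
    return {"types": int(SourceType.VIRTUAL), "multiple": False}


if __name__ == "__main__":
    # Pretend the backend advertises all three source types.
    print(select_sources_options(SourceType.MONITOR | SourceType.WINDOW | SourceType.VIRTUAL))
```

Whether a backend turns such a request into a virtual monitor suitable for a headless session is exactly the open question here.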

I think this might need to be addressed. Maybe add a method to get a restore_token, before calling Start. This way there’s an alternative way of getting permissions instead of using a dialog.

I think a candidate could be org.freedesktop.portal.Clipboard?

The solution for this is that GDM, on headless greeters, only allows selecting sessions whose desktop file has the property X-GDM-CanRunHeadless=true.
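For illustration, a check for that property could look like the following Python sketch. This is not GDM’s actual code; the helper name, the sample session file, and the choice to treat a missing key as false are all assumptions. Only the X-GDM-CanRunHeadless key and the standard [Desktop Entry] group are real.

```python
import configparser

# Hypothetical session file contents, for illustration only.
SAMPLE_SESSION = """\
[Desktop Entry]
Name=Example Session
Exec=example-session
Type=Application
X-GDM-CanRunHeadless=true
"""


def can_run_headless(desktop_file_text: str) -> bool:
    """Return True if the desktop entry sets X-GDM-CanRunHeadless=true."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # desktop file keys are case sensitive
    parser.read_string(desktop_file_text)
    value = parser.get("Desktop Entry", "X-GDM-CanRunHeadless", fallback="false")
    return value.strip().lower() == "true"


if __name__ == "__main__":
    print(can_run_headless(SAMPLE_SESSION))
```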

The system service doesn’t have the privileges to start a service in other sessions on demand. The current approach is to autostart the session service when the system service is running; see the gnome-settings-daemon merge request “sharing: Start gnome-remote-desktop --handover when gnome-remote-desktop --system is running” (!342).
There’s an RFE for systemd to support this: “RFE: monitor system units from user manager” (systemd/systemd issue #3312).

Based on the solution we implemented, this is the easiest thing that comes to mind. However, with the right resources and creativity, I think there could be improvements to the approach.