Policy routing to keep two interfaces with overlapping networks separate

I’m looking to use policy routing to address a weird (but not that weird) network setup.

I have two typical home setups, each with a wired network and a wireless network bridged together and using DHCP. My machine can have Ethernet to either network and WiFi to either network and, because of provider weirdness, the local IPv4 ranges on the two networks overlap.

I want to set things up so that I can connect Ethernet to one network, and WiFi to the other network, have my machine prefer Ethernet for Internet connections (except when that network has lost Internet, when it should use WiFi), and to route reply packets back down the interface they came in on (so that when I’m playing with services on my laptop, other people on the same network can connect to the service I’m running locally).

With dispatcher scripts, I can do this in a janky fashion. NetworkManager is responsible for the main routing table, and its connectivity checks give me my “use Ethernet if it has Internet access, use WiFi otherwise” behaviour. My dispatcher scripts then add a set of policy routing tables based on what NetworkManager reports as local addresses and the routes it got from DHCP/SLAAC.
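For readers who have not seen this approach, here is a minimal sketch of what such a dispatcher script might look like. It is not the script from this post: the file path, the table numbers (100 and 200), the interface-name patterns and the helper names (table_for, network_of, pbr_commands) are all assumptions made for illustration. It only prints the ip commands rather than running them, and it ignores IPv6 and cleanup on "down".

```shell
#!/bin/sh
# Sketch of a NetworkManager dispatcher script (hypothetical path:
# /etc/NetworkManager/dispatcher.d/50-pbr). Table numbers, interface
# patterns and helper names are assumptions, not taken from the post.

# Pick a private routing table per interface type (assumed numbering).
table_for() {
    case "$1" in
        en*|eth*) echo 100 ;;
        wl*)      echo 200 ;;
        *)        return 1 ;;
    esac
}

# network_of ADDR PREFIXLEN: e.g. 192.168.4.84 22 gives 192.168.4.0
network_of() {
    IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
    n=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
    mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
    net=$(( n & mask ))
    echo "$(( net >> 24 & 255 )).$(( net >> 16 & 255 )).$(( net >> 8 & 255 )).$(( net & 255 ))"
}

# pbr_commands IFACE ADDR/PREFIX [GATEWAY]: print the ip(8) commands that
# give this interface its own table and a "from <address>" rule.
pbr_commands() {
    iface=$1 cidr=$2 gw=$3
    table=$(table_for "$iface") || return 0
    addr=${cidr%/*}
    plen=${cidr#*/}
    net=$(network_of "$addr" "$plen")
    echo "ip route replace $net/$plen dev $iface src $addr table $table"
    if [ -n "$gw" ]; then
        echo "ip route replace default via $gw dev $iface table $table"
    fi
    echo "ip rule add from $addr lookup $table priority $table"
}

# NetworkManager passes the interface as $1 and the action as $2, and exports
# IP4_ADDRESS_0 ("address/prefix gateway") and IP4_GATEWAY to the script.
if [ "${2:-}" = "up" ] || [ "${2:-}" = "dhcp4-change" ]; then
    first=${IP4_ADDRESS_0:-}
    pbr_commands "$1" "${first%% *}" "${IP4_GATEWAY:-}"   # pipe to sh to apply
fi
```

A real script would also have to remove stale rules when the address changes or the interface goes down, which is where most of the jank tends to come from.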

I can’t, however, see a way to do this without the jankiness:

  • If I set non-zero route-table values for each connection, I cannot work out how to write rules that look at all the route tables to find the lowest metric default route, breaking my “prefer Ethernet except when that connection is down, then use WiFi”, nor can I work out how to tell NetworkManager to put in the connection’s current addresses as the from side of a rule.
  • If I set route-table to 0 for all the connections I use, then NetworkManager does what I want with the default route (changing the metric, so that the lowest metric is Ethernet except when that connection’s Internet is down), but then I can’t see how to write policy rules that do the source routing I want (neither the from side of a rule, nor skipping rules in a table that belong to the “wrong” interface).
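To make the first option concrete, this is roughly what a per-connection table plus a source rule looks like when set through nmcli. The connection name "Wired connection 1", the table number and the helper name nm_pbr_command are assumptions, and the sketch only prints the command. Note that the from address has to be written out literally, which is exactly the limitation described above when the address comes from DHCP.

```shell
# nm_pbr_command CONNECTION ADDRESS TABLE: print an nmcli command that sets
# ipv4.route-table and a matching ipv4.routing-rules entry (dry run only).
nm_pbr_command() {
    conn=$1 addr=$2 table=$3
    echo "nmcli connection modify \"$conn\" ipv4.route-table $table ipv4.routing-rules \"priority $table from $addr table $table\""
}

# Example with the Ethernet address from this thread (names are assumed).
nm_pbr_command "Wired connection 1" 192.168.4.84 100
```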

Am I missing something, or is the dispatcher script the best I can do, and I need to put up with the jank when it bites me?

Assuming your distro provides a connectivity check like this one:
Tree - rpms/NetworkManager - src.fedoraproject.org

Then you just need to decrease the interval:

sudo tee /etc/NetworkManager/conf.d/99-connectivity.conf << EOF > /dev/null
[connectivity]
interval=10
EOF
sudo systemctl restart NetworkManager.service

The rest should work automatically:

  • Ethernet defaults to metric 100, so it’s preferred over Wi-Fi that defaults to metric 600.
  • A connectivity check failure adds 20000 to the route metric, per interface and per IP family.
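The arithmetic behind those two points can be sketched as follows. The helper name effective_metric is made up for illustration; the numbers are the defaults and the penalty quoted above.

```shell
# effective_metric BASE_METRIC HAS_INTERNET(yes|no)
# Models the penalty described above: +20000 when the check fails.
effective_metric() {
    if [ "$2" = "yes" ]; then
        echo "$1"
    else
        echo $(( $1 + 20000 ))
    fi
}

echo "Ethernet, check passing: $(effective_metric 100 yes)"   # 100
echo "Ethernet, check failing: $(effective_metric 100 no)"    # 20100
echo "Wi-Fi, check passing:    $(effective_metric 600 yes)"   # 600
```

With Ethernet at 20100 and Wi-Fi still at 600, the Wi-Fi default route becomes the lowest-metric one and is used until the Ethernet check passes again.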

PBR is not required in this case.

The only catch is that active connections will need to be re-established when connectivity changes.

I just tried that, and it doesn’t work for me.

Ethernet connected to network 1, got IP address 192.168.4.84/22. WiFi connected to network 2 (unrelated to network 1 - completely different router and connection), got IP address 192.168.4.97/22. Internet access works as I’d expect, but connections from devices on network 2 to my laptop do not.

In particular, I see that when 192.168.4.158 connects to 192.168.4.97, the kernel routes reply packets over Ethernet, since 192.168.4.97 is on-link for the Ethernet, and the Ethernet has a lower metric. But for this to work, they have to go to network 2, which is only accessible over WiFi at this point in time.
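A simplified model of that decision, assuming the interfaces are named eth0 and wlan0 and that the main table holds the two on-link routes shown (the route lines are illustrative, not captured output). Both prefixes are the same length, so only the metric distinguishes them. The helper name lowest_metric_dev is hypothetical.

```shell
# Illustrative `ip -4 route show` style lines; eth0/wlan0 are assumed names.
routes='192.168.4.0/22 dev eth0 proto kernel scope link src 192.168.4.84 metric 100
192.168.4.0/22 dev wlan0 proto kernel scope link src 192.168.4.97 metric 600'

# Print the device of the lowest-metric route among equal-length prefixes.
lowest_metric_dev() {
    awk '{
        dev = ""; metric = 0
        for (i = 1; i < NF; i++) {
            if ($i == "dev") dev = $(i + 1)
            if ($i == "metric") metric = $(i + 1) + 0
        }
        if (best == "" || metric < best_metric) { best = dev; best_metric = metric }
    } END { print best }'
}

printf '%s\n' "$routes" | lowest_metric_dev   # prints eth0

# On the real machine, the kernel's actual choice can be checked with:
#   ip route get 192.168.4.158 from 192.168.4.97
```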

It does work if both Ethernet and WiFi are on the same network - if both are on network 2, or both are on network 1. But not when they’re on different networks.

This is what I thought I needed policy routing for - how do I make it work without policy routing?

Consider the following options for working around the overlapping IP address ranges:

  • Add more specific IPv4 routes like /32 on the relevant interface for each LAN host you want to connect to.
  • Use IPv6 LLA or IPv6 ULA, possibly combined with LLMNR or mDNS, prioritizing IPv6 resolution.
  • Set up a VPN between the hosts using an external server with ZeroTrust, Tailscale, WireGuard, etc.
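The first option could look something like the sketch below, assuming the Wi-Fi interface is called wlan0 and using the addresses mentioned earlier in the thread. The helper name host_route_commands is made up, and the function only prints the commands. It can only help when a given address needs to be reached on one of the two networks, not both.

```shell
# host_route_commands IFACE SRC_ADDR HOST...: print one /32 route per LAN host,
# pinned to the given interface and source address (dry run only).
host_route_commands() {
    iface=$1 src=$2
    shift 2
    for host in "$@"; do
        echo "ip route replace $host/32 dev $iface src $src"
    done
}

host_route_commands wlan0 192.168.4.97 192.168.4.158
```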

None of those are options for me in this situation. Policy routing works, I’m just looking to see if I have to stick to the janky scripts to set it up, or if there’s a better way to do it.