Debating VPN options
In my home lab(s), I have a handful of machines spread around a few points of presence, with mostly residential/commercial cable/DSL uplinks, which means, generally, NAT. This makes monitoring those devices kind of impossible. While I do punch holes for SSH, using jump hosts gets old quick, so I'm considering adding a virtual private network (a "VPN", not a VPN service) so that all machines can be reachable from everywhere.
I see three ways this can work:
- a home-made Wireguard VPN, deployed with Puppet
- a Wireguard VPN overlay, with Tailscale or equivalent
- IPv6, native or with tunnels
So which one will it be?
Wireguard Puppet modules
As is (unfortunately) typical with Puppet, I found multiple different modules to talk with Wireguard.
module | score | downloads | release | stars | watch | forks | license | docs | contrib | issue | PR | notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
halyard | 3.1 | 1,807 | 2022-10-14 | 0 | 0 | 0 | MIT | no | | | | requires firewall and Configvault_Write modules? |
voxpupuli | 5.0 | 4,201 | 2022-10-01 | 2 | 23 | 7 | AGPLv3 | good | 1/9 | 1/4 | 1/61 | optionally configures ferm, uses systemd-networkd, recommends systemd module with manage_systemd set to true, purges unknown keys |
abaranov | 4.7 | 17,017 | 2021-08-20 | 9 | 3 | 38 | MIT | okay | 1/17 | 4/7 | 4/28 | requires pre-generated private keys |
arrnorets | 3.1 | 16,646 | 2020-12-28 | 1 | 2 | 1 | Apache-2 | okay | 1 | 0 | 0 | requires pre-generated private keys? |
The voxpupuli module seems to be the most promising. The abaranov module is more popular and has more contributors, but it has more open issues and PRs.
More critically, the voxpupuli module was written after the abaranov author didn't respond to a PR from the voxpupuli author trying to add more automation (namely private key management).
It looks like setting up a wireguard network would be as simple as this on node A:
    wireguard::interface { 'wg0':
      source_addresses => ['2003:4f8:c17:4cf::1', '149.9.255.4'],
      # public key of the peer (nodeB), looked up in the wireguard_pubkeys fact
      public_key       => $facts['wireguard_pubkeys']['nodeB'],
      endpoint         => 'nodeB.example.com:53668',
      addresses        => [{'Address' => '192.168.123.6/30',},{'Address' => 'fe80::beef:1/64'},],
    }
This configuration comes from this pull request I sent to the module to document how to use that fact.
Note that the addresses used here are examples that shouldn't be reused and do not conform to RFC5737 ("IPv4 Address Blocks Reserved for Documentation", 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2), and 203.0.113.0/24 (TEST-NET-3)) or RFC3849 ("IPv6 Address Prefix Reserved for Documentation", 2001:DB8::/32), but that's another story.
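A quick way to check whether an example address falls inside those documentation ranges is Python's standard ipaddress module. This is only a sketch: the function name is_documentation_address is made up for illustration, and the prefix list is limited to the three RFC 5737 blocks and the RFC 3849 block named above.

```python
import ipaddress

# Documentation prefixes from RFC 5737 (IPv4) and RFC 3849 (IPv6).
DOCUMENTATION_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2
    ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3
    ipaddress.ip_network("2001:db8::/32"),    # RFC 3849
]


def is_documentation_address(value):
    """Return True if value (with or without a /prefix) lies in a documentation range.

    Hypothetical helper, not part of any Puppet module.
    """
    # ip_interface() accepts both "192.0.2.6" and "192.0.2.6/30".
    address = ipaddress.ip_interface(value).ip
    return any(
        address.version == network.version and address in network
        for network in DOCUMENTATION_NETWORKS
    )
```

Run against the four addresses in the Puppet snippet, this returns False for every one of them, while something like 192.0.2.6/30 or 2001:db8::1 returns True.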
(To avoid bootstrapping problems, the resubmit-facts configuration could be used so that other nodes' facts are more immediately available.)
One problem with the above approach is that you explicitly need to take care of routing, network topology, and addressing. This can get complicated quickly, especially if you have lots of devices, behind NAT, in multiple locations (which is basically my life at home, unfortunately).
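To give an idea of the bookkeeping involved, here is a rough sketch of the addressing part alone, assuming a full mesh where every pair of nodes gets its own /30 point-to-point subnet, as in the Puppet example. The function name plan_full_mesh, the node names, and the choice of carving /30s out of 192.168.123.0/24 are assumptions for illustration; the sketch ignores routing, keys, and NAT entirely.

```python
import ipaddress
from itertools import combinations


def plan_full_mesh(nodes, supernet="192.168.123.0/24"):
    """Assign one /30 point-to-point subnet to every pair of nodes.

    Hypothetical helper: returns {(node_a, node_b): (address_a, address_b)}.
    """
    pairs = list(combinations(sorted(nodes), 2))  # n * (n - 1) / 2 tunnels
    subnets = list(ipaddress.ip_network(supernet).subnets(new_prefix=30))
    if len(pairs) > len(subnets):
        raise ValueError(
            f"{len(pairs)} tunnels needed, but only {len(subnets)} /30 subnets fit in {supernet}"
        )
    plan = {}
    for (left, right), subnet in zip(pairs, subnets):
        first, second = subnet.hosts()  # a /30 has exactly two usable addresses
        plan[(left, right)] = (f"{first}/30", f"{second}/30")
    return plan
```

The number of tunnels grows as n(n-1)/2: three nodes need 3 tunnels, ten nodes need 45, and a twelfth node already overflows the 64 /30 subnets available in a /24.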
Concretely, basic Wireguard only supports one peer behind NAT. There are some workarounds for this, but they generally imply a relay server of some sort, or some custom registry; it's kind of a mess. And this is where overlay networks like Tailscale come in.
Tailscale
Tailscale is basically designed to deal with this problem. It's not fully opensource, but pretty close, and they have an interesting philosophy behind that. The client is opensource, and there is an opensource version of the server side, called headscale. They have recently (late 2022) hired the main headscale developer while promising to keep supporting it, which is pretty amazing.
Tailscale provides an overlay network based on Wireguard, where each peer basically has a peer-to-peer encrypted connection, with automatic key rotation. They also ship a multitude of applications and features on top of that like file sharing, keyless SSH access, and so on. The authentication layer is based on an existing SSO provider: you don't just register with Tailscale with a new account, you log in with Google, Microsoft, or GitHub (which, really, is still Microsoft).
The Headscale server implements many of those features:
- Full "base" support of Tailscale's features
- Configurable DNS
- Split DNS
- MagicDNS (each user gets a name)
- Node registration
- Single-Sign-On (via Open ID Connect)
- Pre authenticated key
- Taildrop (File Sharing)
- Access control lists
- Support for multiple IP ranges in the tailnet
- Dual stack (IPv4 and IPv6)
- Routing advertising (including exit nodes)
- Ephemeral nodes
- Embedded DERP server (AKA NAT-to-NAT traversal)
Neither project (client or server) is in Debian (RFP 972439 for the client, none filed yet for the server), which makes deploying this for my use case rather problematic. Their install instructions are basically a curl | bash, but they also provide packages for various platforms. Their Debian install instructions are surprisingly good, and check most of the third party checklist we're trying to establish. (It's missing a pin.)
There's also a Puppet module for tailscale, naturally.
What I find a little disturbing with Tailscale is that you not only need to trust Tailscale with authorizing your devices, you also basically delegate that trust to the SSO provider. So, in my case, GitHub (or anyone who compromises my account there) can penetrate the VPN. A little scary.
Tailscale is also kind of an "all or nothing" thing. They have MagicDNS, file transfers, all sorts of things, but those things require you to hook up your resolver with Tailscale. In fact, Tailscale kind of assumes you will use their nameservers, and have gone to great lengths to figure out how to do that. And naturally, here, it doesn't seem to work reliably; my resolv.conf somehow gets replaced and the magic resolution of the ts.net domain fails.
(I wonder why we can't opt in to just publicly resolve the ts.net domain. I don't care if someone can enumerate the private IP addresses or machines in use in my VPN, at least I don't care as much as fighting with resolv.conf everywhere.)
Because I mostly have access to the routers on the networks I'm on, I don't think I'll be using Tailscale in the long term. But it's pretty impressive stuff: in the time it took me to even review the Puppet modules to configure Wireguard (which is what I'll probably end up doing), I was up and running with Tailscale (but with a broken DNS, naturally).
(And yes, basic Wireguard won't bring me DNS either, but at least I won't have to trust Tailscale's Debian packages, and Tailscale, and Microsoft, and GitHub with this thing.)
IPv6
IPv6 is actually what is supposed to solve this. Not NAT port forwarding crap, just real IPs everywhere.
The problem is: even though IPv6 adoption is still growing, it's kind of reaching a plateau at around 40% world-wide, with Canada lagging behind at 34%. It doesn't help that major ISPs in Canada (e.g. Bell Canada, Videotron) don't care at all about IPv6 (e.g. Videotron in beta since 2011). So we can't rely on those companies to do the right thing here.
The typical solution here is often to use a tunnel like HE's tunnelbroker.net. It's kind of tricky to configure, but once it's done, it works. You get end-to-end connectivity as long as everyone on the network is on IPv6.
And that's really where the problem lies here; the second one of your nodes can't set up such a tunnel, you're kind of stuck and that tool completely breaks down. IPv6 tunnels also don't give you the kind of security a VPN provides, naturally.
The other downside of a tunnel is you don't really get peer-to-peer connectivity: you go through the tunnel. So you can expect higher latencies and possibly lower bandwidth as well. Also, HE.net doesn't currently charge for this service (and they've been doing this for a long time), but this could change in the future (just like Tailscale, that said).
Concretely, the latency difference is rather minimal. For Google:
    --- ipv6.l.google.com ping statistics ---
    10 packets transmitted, 10 received, 0,00% packet loss, time 136,8ms
    RTT[ms]: min = 13, median = 14, p(90) = 14, max = 15
    --- google.com ping statistics ---
    10 packets transmitted, 10 received, 0,00% packet loss, time 136,0ms
    RTT[ms]: min = 13, median = 13, p(90) = 14, max = 14
In the case of GitHub, latency is actually lower, interestingly:
    --- ipv6.github.com ping statistics ---
    10 packets transmitted, 10 received, 0,00% packet loss, time 134,6ms
    RTT[ms]: min = 13, median = 13, p(90) = 14, max = 14
    --- github.com ping statistics ---
    10 packets transmitted, 10 received, 0,00% packet loss, time 293,1ms
    RTT[ms]: min = 29, median = 29, p(90) = 29, max = 30
That is because HE.net peers directly with my ISP and Fastly (which is behind GitHub.com's IPv6, apparently?), so it's only 6 hops away, while over IPv4, the ping goes over New York before landing in AWS's Ashburn, Virginia datacenters, for a whopping 13 hops...
I managed to set up an HE.net tunnel at home, because I also need IPv6 for other reasons (namely debugging at work). My first attempt at setting this up in the office failed, but now that I found the openwrt.org guide, it worked... for a while, and I was able to produce the above, encouraging, mini benchmarks.
Unfortunately, a few minutes later, IPv6 just went down again. And the problem with that is that many programs (and especially OpenSSH) do not respect the Happy Eyeballs protocol (RFC 8305), which means various mysterious "hangs" at random times on random applications. It's kind of a terrible user experience, on top of breaking the one thing it's supposed to do, of course, which is to give me transparent access to all the nodes I maintain.
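For home-grown tooling, at least, the hangs can be avoided: Python's asyncio has accepted a happy_eyeballs_delay parameter on connection setup since 3.8. The sketch below is only an illustration of that option, not a fix for OpenSSH or other existing programs; the function name connect_with_happy_eyeballs is made up.

```python
import asyncio


async def connect_with_happy_eyeballs(host, port, delay=0.25):
    """Open a TCP connection, racing IPv6 and IPv4 attempts as in RFC 8305.

    Hypothetical helper: returns the peer address actually connected to.
    """
    # happy_eyeballs_delay is passed through to loop.create_connection();
    # 0.25 s is the "Connection Attempt Delay" suggested by RFC 8305.
    # A broken address family then costs a short delay instead of a full
    # connection timeout.
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=delay
    )
    peer = writer.get_extra_info("peername")
    writer.close()
    await writer.wait_closed()
    return peer
```

Something like asyncio.run(connect_with_happy_eyeballs("github.com", 443)) then returns whichever address family answered first.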
Even worse, it would still be a problem for other remote nodes I might set up where I might not have access to the router to set up the tunnel. It's also not absolutely clear what happens if you set up the same tunnel in two places... Presumably, something is smart enough to distribute only a part of the /48 block selectively, but I don't really feel like going that far, considering how flaky the setup is already.
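The address arithmetic of splitting a /48 is the easy part. As a sketch only, assuming each site gets a /56 and using the RFC 3849 documentation prefix in place of a real routed block, it could look like this. The function name carve_site_prefixes and the site names are hypothetical, and none of this says whether the tunnel provider would actually route the pieces to different places.

```python
import ipaddress


def carve_site_prefixes(routed_block, sites, site_prefix=56):
    """Split a routed IPv6 block into one equally-sized prefix per site.

    Hypothetical helper: returns {site_name: IPv6Network}.
    """
    block = ipaddress.ip_network(routed_block)
    available = 2 ** (site_prefix - block.prefixlen)  # 256 /56s in a /48
    if len(sites) > available:
        raise ValueError(f"only {available} /{site_prefix} prefixes fit in {block}")
    # subnets() yields the prefixes in ascending order; zip stops after the
    # last site, leaving the rest of the block unassigned.
    return dict(zip(sites, block.subnets(new_prefix=site_prefix)))
```

With carve_site_prefixes("2001:db8:1234::/48", ["home", "office"]), home gets 2001:db8:1234::/56 and office gets 2001:db8:1234:100::/56, each of which still holds 256 /64 networks.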
Other options
If this post sounds a little biased towards IPv6 and Wireguard, it's because it is. I would like everyone to migrate to IPv6 already, and Wireguard seems like a simple and sound system.
I'm aware of many other options to make VPNs. So before anyone jumps in and says "but what about...", do know that I have personally experimented with:
- tinc: nice, automatic meshing, used for the Montreal mesh, serious design flaws in the crypto that make it generally unsafe to use; supposedly, v1.1 (or 2.0?) will fix this, but that's been promised for over a decade by now
- ipsec, specifically strongswan: hard to configure (especially configure correctly!), harder even to debug, otherwise really nice because transparent (e.g. no need for special subnets), used at work, but also considering a replacement there because it's a major barrier to entry to train new staff
- OpenVPN: mostly used as a client for VPN services like Riseup VPN or Mullvad, mostly relevant for client-server configurations, not really peer-to-peer, shared secrets or TLS, kind of a hassle to maintain, see also SoftEther for an alternative implementation
All of those solutions have significant problems and I do not wish to use any of those for this project.
Also note that Tailscale is only one of many projects laid over Wireguard to do that kind of thing, see this LWN review for others (basically NetBird, Firezone, and Netmaker).
Future work
Those are options that came up after writing this post, and might warrant further examination in the future.
- innernet, a "private network system that uses WireGuard under the hood", Rust-based web server, not in Debian, .debs available here
- Meshbird, a "distributed private networking" with little information about how it actually works other than "encrypted with strong AES-256"
- Nebula, "A scalable overlay networking tool with a focus on performance, simplicity and security", written by Slack people to replace IPsec, docs, runs as an overlay for Slack's 50k node network, only packaged in Debian experimental, lagging behind upstream (1.4.0, from May 2021 vs upstream's 1.6.1 from September 2022), requires a central CA, Golang, I'm in "wait and see" mode for now
- n2n: "layer two VPN", seems packaged in Debian but inactive
- ouroboros: "peer-to-peer packet network prototype", sounds and seems complicated
- QuickTUN is interesting because it's just a small wrapper around NaCl, and it's in Debian... but maybe too obscure for my own good
- unetd: Wireguard-based full mesh networking from OpenWRT, not in Debian
- vpncloud: "high performance peer-to-peer mesh VPN over UDP supporting strong encryption, NAT traversal and a simple configuration", sounds interesting, not in Debian
- wgautomesh: "connect wireguard nodes together in a full mesh topology", Rust
- Yggdrasil: actually a pretty good match for my use case, but I didn't think of it when starting the experiments here; packaged in Debian, with the Golang version planned, Puppet module; major caveat: nodes exposed publicly inside the global mesh unless configured otherwise (firewall suggested), requires port forwards, alpha status
Conclusion
Right now, I'm going to deploy Wireguard tunnels with Puppet. It seems like kind of a pain in the back, but it's something I will be able to reuse for work, possibly completely replacing strongswan.
I have another Puppet module for IPsec which I was planning to publish, but now I'm thinking I should just abort that and replace everything with Wireguard, assuming we still need VPNs at work in the future. (I have a number of reasons to believe we might not need any in the near future anyways...)
Hi,
Did you look at zerotier-one, easy to setup, works like a charm for me.
So another thing to keep in mind while reading my blog post is that I did not do an exhaustive search of all possible VPN software out there.
I have 80 bookmark entries under the tag vpn. There's so much stuff out there, it would make this post unreadable. I'm trying to stick with simple, free software options, preferably in Debian.
zerotier-one actually completely fails at the latter two, as it's now non-free...
I'll still add a bunch more links to the post pre-emptively. That said, no, zerotier wasn't in that list yet, so thanks, I guess.