In my home lab(s), I have a handful of machines spread around a few points of presence, with mostly residential/commercial cable/DSL uplinks, which means, generally, NAT. This makes monitoring those devices kind of impossible. While I do punch holes for SSH, using jump hosts gets old quick, so I'm considering adding a virtual private network (a "VPN", not a VPN service) so that all machines can be reachable from everywhere.

I see three ways this can work:

  1. a home-made Wireguard VPN, deployed with Puppet
  2. a Wireguard VPN overlay, with Tailscale or equivalent
  3. IPv6, native or with tunnels

So which one will it be?

  1. Wireguard Puppet modules
  2. Tailscale
  3. IPv6
  4. Other options
  5. Future work
  6. Conclusion

Wireguard Puppet modules

As is (unfortunately) typical with Puppet, I found multiple different modules to talk with Wireguard.

module score downloads release stars watch forks license docs contrib issue PR notes
halyard 3.1 1,807 2022-10-14 0 0 0 MIT no requires firewall and Configvault_Write modules?
voxpupuli 5.0 4,201 2022-10-01 2 23 7 AGPLv3 good 1/9 1/4 1/61 optionnally configures ferm, uses systemd-networkd, recommends systemd module with manage_systemd to true, purges unknown keys
abaranov 4.7 17,017 2021-08-20 9 3 38 MIT okay 1/17 4/7 4/28 requires pre-generated private keys
arrnorets 3.1 16,646 2020-12-28 1 2 1 Apache-2 okay 1 0 0 requires pre-generated private keys?

The voxpupuli module seems to be the most promising. The abaranov module is more popular and has more contributors, but it has more open issues and PRs.

More critically, the voxpupuli module was written after the abaranov author didn't respond to a PR from the voxpupuli author trying to add more automation (namely private key management).

It looks like setting up a wireguard network would be as simple as this on node A:

wireguard::interface { 'wg0':
  source_addresses => ['2003:4f8:c17:4cf::1', '149.9.255.4'],
  public_key       => $facts['wireguard_pubkeys']['nodeB'],
  endpoint         => 'nodeB.example.com:53668',
  addresses        => [{'Address' => '192.168.123.6/30',},{'Address' => 'fe80::beef:1/64'},],
}

This configuration come from this pull request I sent to the module to document how to use that fact.

Note that the addresses used here are examples that shouldn't be reused and do not confirm to RFC5737 ("IPv4 Address Blocks Reserved for Documentation", 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2), and 203.0.113.0/24 (TEST-NET-3)) or RFC3849 ("IPv6 Address Prefix Reserved for Documentation", 2001:DB8::/32), but that's another story.

(To avoid boostrapping problems, the resubmit-facts configuration could be used so that other nodes facts are more immediately available.)

One problem with the above approach is that you explicitly need to take care of routing, network topology, and addressing. This can get complicated quickly, especially if you have lots of devices, behind NAT, in multiple locations (which is basically my life at home, unfortunately).

Concretely, basic Wireguard only support one peer behind NAT. There are some workarounds for this, but they generally imply a relay server of some sort, or some custom registry, it's kind of a mess. And this is where overlay networks like Tailscale come in.

Tailscale

Tailscale is basically designed to deal with this problem. It's not fully opensource, but pretty close, and they have an interesting philosophy behind that. The client is opensource, and there is an opensource version of the server side, called headscale. They have recently (late 2022) hired the main headscale developer while promising to keep supporting it, which is pretty amazing.

Tailscale provides an overlay network based on Wireguard, where each peer basically has a peer-to-peer encrypted connexion, with automatic key rotation. They also ship a multitude of applications and features on top of that like file sharing, keyless SSH access, and so on. The authentication layer is based on an existing SSO provider, you don't just register with Tailscale with new account, you login with Google, Microsoft, or GitHub (which, really, is still Microsoft).

The Headscale server ships with many features out of that:

Neither project (client or server) is in Debian (RFP 972439 for the client, none filed yet for the server), which makes deploying this for my use case rather problematic. Their install instructions are basically a curl | bash but they also provide packages for various platforms. Their Debian install instructions are surprisingly good, and check most of the third party checklist we're trying to establish. (It's missing a pin.)

There's also a Puppet module for tailscale, naturally.

What I find a little disturbing with Tailscale is that you not only need to trust Tailscale with authorizing your devices, you also basically delegate that trust also to the SSO provider. So, in my case, GitHub (or anyone who compromises my account there) can penetrate the VPN. A little scary.

Tailscale is also kind of an "all or nothing" thing. They have MagicDNS, file transfers, all sorts of things, but those things require you to hook up your resolver with Tailscale. In fact, Tailscale kind of assumes you will use their nameservers, and have suffered great lengths to figure out how to do that. And naturally, here, it doesn't seem to work reliably; my resolv.conf somehow gets replaced and the magic resolution of the ts.net domain fails.

(I wonder why we can't opt in to just publicly resolve the ts.net domain. I don't care if someone can enumerate the private IP addreses or machines in use in my VPN, at least I don't care as much as fighting with resolv.conf everywhere.)

Because I mostly have access to the routers on the networks I'm on, I don't think I'll be using tailscale in the long term. But it's pretty impressive stuff: in the time it took me to even review the Puppet modules to configure Wireguard (which is what I'll probably end up doing), I was up and running with Tailscale (but with a broken DNS, naturally).

(And yes, basic Wireguard won't bring me DNS either, but at least I won't have to trust Tailscale's Debian packages, and Tailscale, and Microsoft, and GitHub with this thing.)

IPv6

IPv6 is actually what is supposed to solve this. Not NAT port forwarding crap, just real IPs everywhere.

The problem is: even though IPv6 adoption is still growing, it's kind of reaching a plateau at around 40% world-wide, with Canada lagging behind at 34%. It doesn't help that major ISPs in Canada (e.g. Bell Canada, Videotron) don't care at all about IPv6 (e.g. Videotron in beta since 2011). So we can't rely on those companies to do the right thing here.

The typical solution here is often to use a tunnel like HE's tunnelbroker.net. It's kind of tricky to configure, but once it's done, it works. You get end-to-end connectivity as long as everyone on the network is on IPv6.

And that's really where the problem lies here; the second one of your nodes can't setup such a tunnel, you're kind of stuck and that tool completely breaks down. IPv6 tunnels also don't give you the kind of security a VPN provides as well, naturally.

The other downside of a tunnel is you don't really get peer-to-peer connectivity: you go through the tunnel. So you can expect higher latencies and possibly lower bandwidth as well. Also, HE.net doesn't currently charge for this service (and they've been doing this for a long time), but this could change in the future (just like Tailscale, that said).

Concretely, the latency difference is rather minimal, Google:

--- ipv6.l.google.com ping statistics ---
10 packets transmitted, 10 received, 0,00% packet loss, time 136,8ms
RTT[ms]: min = 13, median = 14, p(90) = 14, max = 15

--- google.com ping statistics ---
10 packets transmitted, 10 received, 0,00% packet loss, time 136,0ms
RTT[ms]: min = 13, median = 13, p(90) = 14, max = 14

In the case of GitHub, latency is actually lower, interestingly:

--- ipv6.github.com ping statistics ---
10 packets transmitted, 10 received, 0,00% packet loss, time 134,6ms
RTT[ms]: min = 13, median = 13, p(90) = 14, max = 14

--- github.com ping statistics ---
10 packets transmitted, 10 received, 0,00% packet loss, time 293,1ms
RTT[ms]: min = 29, median = 29, p(90) = 29, max = 30

That is because HE.net peers directly with my ISP and Fastly (which is behind GitHub.com's IPv6, apparently?), so it's only 6 hops away. While over IPv4, the ping goes over New York, before landing AWS's Ashburn, Virginia datacenters, for a whopping 13 hops...

I managed setup a HE.net tunnel at home, because I also need IPv6 for other reasons (namely debugging at work). My first attempt at setting this up in the office failed, but now that I found the openwrt.org guide, it worked... for a while, and I was able to produce the above, encouraging, mini benchmarks.

Unfortunately, a few minutes later, IPv6 just went down again. And the problem with that is that many programs (and especially OpenSSH) do not respect the Happy Eyeballs protocol (RFC 8305), which means various mysterious "hangs" at random times on random applications. It's kind of a terrible user experience, on top of breaking the one thing it's supposed to do, of course, which is to give me transparent access to all the nodes I maintain.

Even worse, it would still be a problem for other remote nodes I might setup where I might not have acess to the router to setup the tunnel. It's also not absolutely clear what happens if you setup the same tunnel in two places... Presumably, something is smart enough to distribute only a part of the /48 block selectively, but I don't really feel like going that far, considering how flaky the setup is already.

Other options

If this post sounds a little biased towards IPv6 and Wireguard, it's because it is. I would like everyone to migrate to IPv6 already, and Wireguard seems like a simple and sound system.

I'm aware of many other options to make VPNs. So before anyone jumps in and says "but what about...", do know that I have personnally experimented with:

All of those solutions have significant problems and I do not wish to use any of those for this project.

Also note that Tailscale is only one of many projects laid over Wireguard to do that kind of thing, see this LWN review for others (basically NetbBird, Firezone, and Netmaker).

Future work

Those are options that came up after writing this post, and might warrant further examination in the future.

Conclusion

Right now, I'm going to deploy Wireguard tunnels with Puppet. It seems like kind of a pain in the back, but it's something I will be able to reuse for work, possibly completely replacing strongswan.

I have another Puppet module for IPsec which I was planning to publish, but now I'm thinking I should just abort that and replace everything with Wireguard, assuming we still need VPNs at work in the future. (I have a number of reasons to believe we might not need any in the near future anyways...)

comment 2

So another thing to keep in mind while reading my blog post is that I did not do an exhaustive search of all possible VPN software out there.

I have 80 bookmark entries under the tag vpn. There's so much stuff out there, it would make this post unreadable. I'm trying to stick with simple, free software options, preferably in Debian. zerotier-one actually completely fails at the latter two, as it's now non-free...

I'll still add a bunch more links to the post pre-emptively. That said, no, zerotier wasn't in that list yet, so thanks, I guess. :)

Comment by anarcat
Created . Edited .