Easily accessing all your stuff with a zero-trust mesh VPN (changelog.complete.org)
177 points by JNRowe | 2023-04-15 16:08:11 | 62 comments




Nice list. ZeroTier does offer sharing as long as you add the device to the network, or you can even make the network public, but keep in mind it will share everything as if you were normally connected with the other members of the network. Alternatively, create a separate network just for sharing.

Their rules option allows you to lock down access.

> I wouldn’t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier

There are less obscure uses for broadcast on these networks. For example, mDNS "just works". You may not have automated DNS at the service level, but you can use Avahi to discover your other services, or use .local names without any extra config.
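
For instance, here is a minimal sketch of browsing for peers' services from any node, using the third-party python-zeroconf package; it assumes multicast/mDNS is enabled on the ZeroTier network and that peers actually advertise something:

    import time
    from zeroconf import Zeroconf, ServiceBrowser, ServiceListener

    class PrintListener(ServiceListener):
        def add_service(self, zc, type_, name):
            # Resolve and print the address/port of a newly seen service.
            info = zc.get_service_info(type_, name)
            if info:
                print(name, info.parsed_addresses(), info.port)

        def remove_service(self, zc, type_, name):
            print("gone:", name)

        def update_service(self, zc, type_, name):
            pass

    zc = Zeroconf()
    # Browse for HTTP services advertised by other nodes on the mesh.
    ServiceBrowser(zc, "_http._tcp.local.", PrintListener())
    try:
        time.sleep(10)  # collect announcements for a few seconds
    finally:
        zc.close()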


What about legal liability? Forwarding somebody else's traffic can be very risky.

Other than that, yggdrasil and tinc sound so much better than the alternatives.


Headscale seems like the best answer in this space

Headscale is very good, but a few caveats:

There's really no redundancy. If your headscale node goes down it can take your tailnet with it. In particular, if you drop in some ACLs that have issues, your entire tailnet can drop until you get it fixed. The recent (0.21) "configtest" can help there, but it still feels a bit brittle.

Headscale can use a lot of system resources. ~100 nodes can saturate a t3a.small instance in CPU time and disk access. Reducing the update frequency can help, but there are hard limits here. I imagine much of this is database updates to SQLite, but I haven't tried switching to an external Postgres server yet to see how much of the load is database related.
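
For anyone wanting to try that, the relevant knobs live in headscale's config.yaml. A hedged sketch of what the Postgres settings looked like in the 0.2x releases (field names have moved around between versions, so check the example config shipped with your release):

    # config.yaml excerpt (0.2x-era field names)
    db_type: postgres          # default is sqlite3
    db_host: 127.0.0.1
    db_port: 5432
    db_name: headscale
    db_user: headscale
    db_pass: <password>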


tinc performance is limited by single-core encryption performance. I suspect that for Linux network engineers it could be a better choice than the author suggests. tinc is very mature and battle-tested.

I don't know much about tinc for mobile devices; are there decent clients?

I don't use it on mobile devices, so I'm not sure. Instead I use a cloud server which joins the tinc mesh and also runs a WireGuard service, which then acts as a router.

There's a great client for Android, yes: https://tincapp.pacien.org/

For iOS there isn't. Last time I checked, the only option had no GUI and needed root.

PS: tinc is not fully zero trust. Every node can connect to every other one. This includes the VPS nodes you'll probably use for firewall traversal. Other systems have a 'lighthouse' concept, where the VPS just coordinates traffic but isn't able to actually read it.


> Every node can connect to every other one.

At the Tinc level, they can't connect to you unless you have their public key configured locally.

At the IP level, set 'StrictSubnets = yes' in the main config file to prevent nodes that aren't explicitly configured locally from sending packets to you.
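
For reference, a minimal sketch of what that looks like on disk (hypothetical network and node names; each hosts/ file also carries the peer's public key):

    # /etc/tinc/mynet/tinc.conf
    Name = laptop
    ConnectTo = vps
    StrictSubnets = yes      # only honour Subnet statements from local hosts/ files

    # /etc/tinc/mynet/hosts/vps   (a locally configured peer)
    Address = vps.example.org
    Subnet = 10.20.0.1/32
    # ...followed by the peer's public key block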


Semi-related: does anyone know of any software frameworks/libraries for writing IoT event-based systems on this sort of mesh network?

I have a side project in mind where IoT devices (dropping in and out of 4G and/or ad-hoc wifi or bluetooth) need to communicate and send/receive events, reconciling with event sourcing or whatever

The tough parts of the architecture for this seem pretty generic so I'm hopeful that something already exists!


Have you looked at nats.io? High-performance messaging with a flexible persistence layer. Servers can be meshed via leaf node connections. Perfect for IoT and edge.

Works nicely with zerotier to simplify testing it out.
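
To give a flavour of what that looks like from a device, a minimal hedged sketch using the nats-py client (the server address on the ZeroTier network and the subject names are hypothetical):

    # pip install nats-py
    import asyncio
    import nats

    async def main():
        # Connect to a NATS server reachable over the mesh (address is hypothetical).
        nc = await nats.connect("nats://10.144.0.10:4222")

        async def on_reading(msg):
            print(f"{msg.subject}: {msg.data.decode()}")

        # Devices publish sensor events; anything on the mesh can subscribe.
        await nc.subscribe("sensors.>", cb=on_reading)
        await nc.publish("sensors.garage.temp", b"21.5")
        await nc.flush()

        await asyncio.sleep(1)   # give the subscription a moment to receive
        await nc.drain()

    asyncio.run(main())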

Mail me?

It's not lightweight and it's not exactly a mesh network; that is, it's not a layer 2 mesh but a layer 5 mesh, so you still need to get an IP somehow. But IPFS exposes its DHT for general-purpose use via its pubsub system.

I had a lot of fun abusing this to make a serverless[1] video streaming platform. The video streaming ended up being shit, with fundamental issues I don't think I could solve, but it was fun to put together the proof of concept.

1. actually serverless. not "the server is in the cloud" bullshit people usually mean when they say serverless.


Sounds like https://lbry.com/ ??

How did they solve or work around what you couldn't solve?

Curious due to some deeply engrained dislike of anything centralized :p


IPFS is fairly good at distributing files. I was watching a Twitch stream and wanted to see if I could replicate that sort of use case: live streaming. Note that I am not that great a programmer in the first place, and I lose interest quickly once I start getting into UI code.

The theory was to listen to an incoming video stream, chop it into 2-3 MB segments, and publish each segment via IPFS; the program then notifies everybody watching what the next segment is via a signed IPFS pubsub message.

To watch a stream, you get your streamer's public key (so you can verify that their segment messages are valid) and subscribe to their IPFS pubsub channel. As the software gets segment messages, it downloads the segments via IPFS and reassembles the video.

Theoretically it would scale automatically to the population of people watching the stream, with the watchers providing the scaling infrastructure.

Realistically, IPFS pubsub does not scale, but the real problem was the lag: it started at around 2 minutes and went up rapidly. My impression is that streamers value interactivity with their watchers and would not tolerate such a large lag.
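
For the curious, the publisher half of such a scheme is only a few lines. A hypothetical sketch (not the linked proof of concept) that shells out to the Kubo CLI; it assumes a local daemon with the experimental pubsub commands still enabled, and the exact `ipfs pubsub pub` syntax varies by version:

    import json
    import subprocess
    import time

    TOPIC = "stream-demo"          # hypothetical channel name

    def publish_segment(path: str, seq: int) -> None:
        # Add the segment and capture its CID (-Q prints only the final hash).
        cid = subprocess.run(
            ["ipfs", "add", "-Q", "--pin=false", path],
            capture_output=True, text=True, check=True,
        ).stdout.strip()

        # A real implementation would sign this message with the streamer's key.
        msg = json.dumps({"seq": seq, "cid": cid, "ts": time.time()})

        # Announce the new segment to everyone subscribed to the topic.
        subprocess.run(["ipfs", "pubsub", "pub", TOPIC],
                       input=msg, text=True, check=True)

    # publish_segment("segment-0001.ts", 1)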

My proof-of-concept code is below. It is very primitive: no UI, and designed (both client and server) for a user who enjoys a Unix-style environment. I think about trying to finish it every now and then, but have not yet found the motivation.

http://nl1.outband.net/fossil/ipfs_stream/doc/tip/readme.md


Peertube does something similar for livestreaming, with 30s to 60s of latency. WebTorrent instead of IPFS, but the concept is similar.

Have you looked at OpenZiti? I'm a bit surprised it wasn't included in this list. I've only played with it a tiny bit; I had problems getting it set up about a year ago, but I've heard they have improved the documentation.

Long-time tinc user here.

My gut is that tinc is going to be as good as anything else here -- specifically, this means that the author's "for your friends" idea doesn't hold much water for me.

I.e., either you have friends techy enough that you could get tinc going, or you don't, and at that level ALL of these solutions are going to be too annoying and complicated for them.

For the non-techy types, I don't think any VPN-like thing has been built yet? It would have to be "syncthing"-level easy.


Hamachi was pretty close; we used it back in the day to play LAN games without having to enter IP addresses or mess with routers/modems.

I'm a big fan of tinc.

It's fully private and doesn't rely on public servers, like some of the mentioned solutions. Adding new peers is a minor chore, but if you have a network that doesn't change frequently, this is not an issue. Also unlike most of the alternatives, it runs absolutely everywhere.

I had it set up on my pfSense router and on my Android phone, and it worked great. Not sure why the article says that mobile support is iffy.

I also used it professionally to set up a small Docker Swarm network over the internet. It also worked without issues, and was relatively simple to set up and maintain.

Performance could be an issue, as the article mentions, but unless you're transferring large files, in practice it's good enough.

I'm currently looking into setting this up again for personal use, and was hoping that there would be native mesh support within WireGuard, which doesn't seem to be the case.


Right? Honestly, I'm no expert, but I coded up a little crappy "good enough" onboarding shell script for myself (for adding new clients). I don't know if something similar exists already, but it seems like it could be done.
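
For reference, a hand-rolled WireGuard mesh is just every node listing every other node as a [Peer]; a minimal sketch with hypothetical keys and addresses:

    # /etc/wireguard/wg0.conf on node A
    [Interface]
    Address = 10.10.0.1/24
    PrivateKey = <node-A-private-key>
    ListenPort = 51820

    [Peer]   # node B
    PublicKey = <node-B-public-key>
    Endpoint = b.example.org:51820
    AllowedIPs = 10.10.0.2/32
    PersistentKeepalive = 25

    [Peer]   # node C
    PublicKey = <node-C-public-key>
    Endpoint = c.example.org:51820
    AllowedIPs = 10.10.0.3/32

An onboarding script then mostly just runs `wg genkey` / `wg pubkey` for the new node and appends a matching [Peer] stanza to every existing node's config.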

Twingate is missing from this list. It has nice DNS-based routing and access control.

Another tool worth looking at is vpncloud (https://github.com/dswd/vpncloud). I used to use tinc, but switched to vpncloud 2 years ago.

In my use case, I have a modest number of nodes. Although nodes learn of other nodes from each other, I use ansible to keep each node's config updated.

I use vpncloud (and previously, tinc) between Docker hosts. So you have to be careful about interface MTUs inside Docker, particularly if you use containers based on Alpine.
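
A hedged example of clamping the container network MTU below the tunnel's, via a Compose-managed bridge network (the exact value depends on your VPN's overhead):

    # docker-compose.yml excerpt: keep the container MTU below the VPN tunnel MTU
    networks:
      vpn_bridge:
        driver: bridge
        driver_opts:
          com.docker.network.driver.mtu: 1400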


What’s the problem with all devices connecting to a vpn server running somewhere?

I mean, the usual access VPN solution. It’s not peer to peer, but you can set up or rent a server near clients, like in the same city.


A hub-and-spoke VPN is slower and it can be more expensive. It's easier if you're configuring each tunnel by hand, but now that mesh VPNs are easy to use you might as well use mesh.

It’s also very much a reduction of what some of these technologies are capable of. For instance, I’ve used Netmaker somewhat extensively as a VPN solution, and I mostly agree with the sentiments that the original article expresses about Netmaker (young, unstable). However, that is not the most interesting use case that Netmaker offers at all. Rather, a much more interesting use case is something like having an easy and straightforward way for Kube pods in AWS and GCP spanning across multiple regions to talk to each other. Multi-cloud architectures are greatly simplified by these kinds of technologies, especially Netmaker, partly because of their abilities to punch through NATs.

Also, Netmaker uses kernel-level WireGuard, which is much more performant. If you're just using Tailscale to talk to clusters from your MacBook, for instance, userland WireGuard is probably going to be more than enough. When you need servers to talk to each other, though, you usually need the extra perf.


It's not necessarily slower. It really depends on the connectivity of the VPN hub.

At my previous company our VPN server was on a VERY good network, including InterNAP traffic optimizers.

One of our employees ran some testing of it and found that in most cases going over the VPN was faster than natively using his Comcast connection. Comcast would tend to route the traffic over their own network links as much as possible.

In the case of the VPN server, it was well enough connected that Comcast could get to it quickly, and the good connectivity it had (largely thanks to Level-3, but the InterNAP optimizer would help it pick great connections in any case) would get traffic to the final destination faster than Comcast's network could.


I feel like you are answering your own question though... while a hub-and-spoke VPN can be faster than the standard internet (by having peering relationships which are better than the user's local telco, i.e. effectively circumventing and improving on their standard BGP), you can do this on steroids with a mesh VPN by being able to deploy the relays into many diverse locations, so that any user anywhere gets these benefits, even if they change location and access different resources. It's basically the same, but more dynamic and distributed.

I'm not sure how I'm answering my own question, or how the question I was replying to is about mesh VPNs at all; I'm replying to "hub-and-spoke VPNs are slow".

I'm just saying: It's not that simple.

But, to clarify, when you say "relay node", are you talking VPN traffic relays (DERP nodes in tailscale parlance) or relays to the public Internet (exit nodes in tailscale parlance)?

If the former, Tailscale goes out of its way to avoid the DERP servers and instead routes traffic directly between the nodes (hence the "mesh"), so it doesn't gain the benefits of hub and spoke that I was speaking about, unless the src/dst nodes can't communicate directly.

If the latter, I don't know of any mesh that has smarts about optimizing the reachability to the public Internet and shifting traffic between exit nodes to get better reachability.

Can you mention a VPN that has the abilities you are speaking about? I'm not aware of one, unless you somehow did something like integrating BGP into the exit-node selection, combined with something like an InterNAP traffic-optimization appliance.


Huh, I did not know that about TS; I thought they were always using the nodes. I was leaning on the former point (i.e., private applications rather than the public internet). I know the OpenZiti fabric does optimisation across available nodes by doing 'smart routing'. Interesting idea with the exit node and integrated BGP; not a use case Ziti is trying to solve today, but it's a neat idea that is theoretically possible.

Yep, Tailscale definitely will do direct routing between nodes if available, and only uses the DERP relays if it can't establish a direct connection. It also uses the DERP nodes to help with NAT busting, and from what I've heard the Tailscale NAT busting is "best in class". I can say that in my situation TS is able to establish direct connections between all my nodes, with maybe a couple of exceptions.

You are no longer "zero trust", as you have to trust the VPN server to verify the identity of the nodes, because traffic is not encrypted end to end between nodes: there are two encrypted tunnels, one between each of you and the VPN server, and the information that would allow you to verify the identity of the other node is stripped at the VPN server.

Your VPN server also basically becomes a network switch in this case, doing the backhaul between all clients, which is likely to be expensive and slow compared to peer-to-peer connections.

If you try to improve this by making the clients entire networks, with switches/routers connected to each other only via inter-network VPN tunnels, then you also lose the property that only known, verified nodes can be on the network. Not only can you not verify all the nodes you connect to, neither can the VPN server (e.g. someone could plug a malicious or insecure device into the switch in your office, and it would then have the same access to the network as the other nodes).


Good point. But what would be the security difference between running a VPN server on a cloud instance, versus running a DERP/relay/coordination server on the same cloud?

Sure, the traffic is decrypted on the VPN server in the hub-and-spoke case (even if most traffic is already encrypted by TLS and applications), but in both cases the server in the cloud could compromise the connection (in different ways, but correct me if I'm wrong that malicious servers used initially to establish the peer-to-peer connection cannot compromise the network, even with the Lock feature which prevents the addition of unsigned public keys). You have to trust the cloud provider.

Somehow people think mesh VPNs don't need open ports, or are purely peer to peer. Not exactly: you just rely on someone else to open ports for your network (at minimum for the initial connection, and for the entire session when relays are used). You have to trust the cloud in both cases, though in different ways. Further, at the very least your identity provider could act as an administrator with full privileges.


Certainly there are potential exploits if you rely on a 3rd party cloud or service to run the coordination service, though these might be more limited than on a VPN server.

Yggdrasil (assuming nodes operate their own identity-based firewall, as described in the original article) and Tailscale (if you use it with self-hosted Headscale rather than their coordination server) can improve on this further.

At work we run headscale on a physical machine we control for now.

Personally I’m very interested in mesh networks like Yggdrasil and secure, radically open overlay networks and virtualised services in general.


I would note, regarding aborsy's comment about a 'VPN server on a cloud instance, versus running a DERP/relay', that if the overlay is architected 'correctly' then it uses mTLS connections between each hop while having E2E encryption between source and destination across the overlay. The net result is that you do not have to trust the node, as it never decrypts data, while the edge runs in your own environment. Further, you cannot just impersonate a node.

You do have to trust the control/coordination server, though you could also run this in confidential compute.

Also, if you are interested in mesh networks, check out open source OpenZiti - https://github.com/openziti


Increased latency, limited bandwidth, points of failure...

Privacy VPN providers are in a unique position to offer this as an add-on. NordVPN recently started offering Meshnet, which effectively accomplishes the same thing as Tailscale, ZeroTier, etc. I vaguely recall another privacy VPN provider (maybe PIA) used to offer this as well.

Seems like a good value add-on for services that already require setting up encrypted tunnels.


Interesting - I wasn't aware of that:

https://nordvpn.com/blog/nordlynx-protocol-wireguard/


Here's a more direct link to what I was referring to:

https://nordvpn.com/meshnet/


The point of zero-trust is that you don't implicitly trust something, but instead re-evaluate its trustworthiness every time you have to make an access control decision. Trusting a device simply because it's on your VPN is equivalent to trusting a device simply because it's on your LAN, and this is absolutely not a zero-trust architecture.

It's zero trust networking, unlike the internet or a traditional VPN.

And every definition of "zero trust networking" I can find still states that the point is that trust is something that's continually evaluated, which isn't the case if you're trusting someone simply because they're on your VPN.

Tailscale describe zero trust networking more eloquently than I can, here: https://tailscale.com/kb/1123/zero-trust/

The gist being that you are no longer making unverifiable trust assumptions about the identity of nodes connected to your network, as you have encrypted links between all nodes (not just between external nodes and a VPN gateway to a non-zero-trust internal LAN), based on their verified identity and allowed access.

A node's identity can be verified by its public key, so you no longer implicitly need to trust that only known and expected nodes are plugged into the switch in the office (for example). I do not have to care about whether the physical network infrastructure can be trusted (zero trust) because I can verify the identity of all nodes.

This of course does not mean your entire architecture is zero trust. There are pretty much always trust decisions and assumptions somewhere, but that is why it's described as zero trust [mesh] networking, not zero trust complete end to end systems.

edit/aside: some people prefer the term "trust minimising" rather than zero trust to emphasise that there's always trust somewhere and you are just pushing it outside of some component or layer, and I agree this is probably a better/more accurate term.


You're describing the equivalent of 802.1x. Nobody would assert that enabling 802.1x turns your office LAN into a zero-trust network.

Not quite. 802.1x provides an authentication mechanism for devices wishing to attach to a LAN or WLAN. Tailscale is describing adding E2E encryption between device and server, so that if a malicious actor is on the underlay, they only see encrypted packets.

I would instead assert that Tailscale's/WireGuard's approach of connecting devices is not zero trust, as our focus should be on protecting 'services', not 'devices'. This requires micro-segmentation, least privilege, attribute-based access control, and authenticate/authorise-before-connectivity as part of the overlay.


It is more zero trust than it would otherwise be, though. An RDP server with Tailscale can reject by default packets coming from the LAN. And you can get ACL enforcement before a connection can be made.

> Tailscale describe zero trust networking more eloquently than I can...

Allow me to rephrase that for you ...

Tailscale, a company whose business model relies on selling Wireguard-based VPN products, tries to convince people/prospective customers that a VPN can be shoehorned into the Zero Trust concept.

If all you have is a hammer, everything looks like a nail as they say....


Agreed. I literally just replied to the comment below that the TS/WG approach of connecting devices is not zero trust, as our focus should be on protecting 'services', not 'devices'. This requires micro-segmentation, least privilege, attribute-based access control, and authenticate/authorise-before-connectivity as part of the overlay.

Tailscale does offer an ACL system[0] that allows protecting individual ports (which I assume is what is meant by services here?) and defaults to least-privilege (when ACLs are enabled, a node in the network cannot access other nodes by default). Though this configuration is centralized in the control plane. Does this not address some of those issues?

I'm not well-versed in zero-trust networking, so I may be missing something fundamental.

[0]: https://tailscale.com/kb/1018/acls/
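
For illustration, a minimal sketch of such a policy file (HuJSON; the group and tag names are hypothetical, see the linked docs for the full syntax):

    // Only members of group:admins may reach SSH on hosts tagged tag:server;
    // with ACLs enabled, everything not explicitly accepted is denied.
    {
      "groups": {
        "group:admins": ["alice@example.com"]
      },
      "tagOwners": {
        "tag:server": ["group:admins"]
      },
      "acls": [
        {"action": "accept", "src": ["group:admins"], "dst": ["tag:server:22"]}
      ]
    }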


Good to know; the referenced article does not talk about this, and I was not aware of the feature set. My personal belief is that the term 'zero trust' comes in shades of grey. I personally believe that anything internet-exposed is the lowest form, implementing a software-defined perimeter is the next, and the final is to embed overlay networking into the application itself, so we do not have to trust the WAN, LAN or even the host OS network. I wrote a blog on this last year using Harry Potter analogies - https://netfoundry.io/demystifying-the-magic-of-zero-trust-w...

Disclosure: I work at Ockam.

What you describe here sounds a lot like what we’ve been building at Ockam: https://github.com/build-trust/ockam


Exactly. I am aware of Ockam; I work on the OpenZiti project (https://github.com/openziti). I believe our approach of embedding zero-trust, private overlay networking into the app is the best way (with tunnelers for the non-embedded cases where needed), so that we have the least trust in underlay networks (WAN/LAN/host OS network). Ziti is similar to Ockam at a high level (I am sure there are nuanced differences), though while we do not have a Rust SDK, we do have SDKs for Golang, C, Java, Python, C#, Kotlin, Objective-C, JavaScript, NodeJS, etc.

I would agree with that assessment — it’s applying something of a fuzzy concept to networking. Arguably it is doing it without a great explanation of how, why, and the trade offs, but then the zero trust concept itself is also still in its evolutionary phase.

Zero trust isn’t about doing it in one place, and then leaving all other doors wide open (although that is one possible realisation).

From a security standpoint, it should be about layering, and basing it on relevant security anchors. For most people, the fact of being on an authenticated local network is a huge positive indicator. (Or more to the point, being outside of it is the indicator.) This is why, generally, home routers don't leave the admin interface exposed to the WAN. Are we to advocate dropping the "deny WAN for /admin" rule simply because the admin GUI already has password access control?


Yes, you are right, it's about continuous evaluation. I'm not sure which part of this setup is implementing that continuous evaluation.

Giving the benefit of the doubt, perhaps one of the tools mentioned is doing that implicitly, in which case the author hasn't mentioned it. I think it's ZeroTier, but I'm not sure how it's being configured to do so.

Without the benefit of the doubt, the author may have misunderstood what ZTNA is, and the title is misleading.


NIST SP.800-207: "Zero trust (ZT) is the term for an evolving set of cybersecurity paradigms that move defenses from static, network-based perimeters to focus on users, assets, and resources... [it] assumes there is no implicit trust granted to assets or user accounts based solely on their physical or network location [and it] focuses on protecting resources (assets, services, workflows, network accounts, etc.), not network segments, as the network location is no longer seen as the prime component to the security posture of the resource."

So while I think your 1st sentence is a bit woolly, your 2nd is bang on. WireGuard (and ergo Tailscale) is focused on connecting devices, not services, and being on the mesh gives you access to other devices/endpoints. To achieve zero-trust networking you also need micro-segmentation, least privilege, attribute-based access control, and authenticate/authorise-before-connectivity as part of the overlay.


"Zero trust" is such a bad, repulsive name.

GL.iNet routers come with embedded support for VPNs + ZeroTier.

You may even install ZeroTier directly on these routers to [re-]configure the router from anywhere.

