Instacart reduces developer disruption with Tailscale

Founded in 2012, Instacart is the leading grocery technology company in North America, delivering the future of online grocery shopping. Partnering with over 1,000 retail banners to offer delivery and pickup services from more than 75,000 stores across more than 13,000 cities in North America, Instacart is transforming how people shop, eat, and live.

As Instacart expanded over the past decade, it accumulated a number of legacy networks, many of which were dependent on bespoke VPN solutions. At one point, Instacart was running eight separate VPNs, and engineers routinely had to switch between them multiple times a day. According to Mike Deeks, a senior staff software engineer at Instacart, that alone was a significant source of pain and lost productivity: “Switching VPNs was a long-term chronic pain that became acute as we rapidly scaled up our engineering department.”

Adding to the pain, these legacy VPNs offered a poor experience for both administrators and end users, and their fine-grained access controls were limited and difficult to use. According to Mike, this amounted to a nightmare scenario: “These VPNs were difficult to maintain, cumbersome to use, and had problems with scalability and availability.”

Mike and his team set out to replace their old VPNs with a single solution across the company. They wanted something that would reduce their support and maintenance burden and provide a better user experience, but their primary goal was to improve developer productivity. Mike estimates that Instacart’s engineers, along with many of its other employees, lost up to 20 minutes a day dealing with their various VPNs. Mike explains: “Small little cuts really affect people throughout the day, and you don’t want to break people out of their programming flow just to go log into a VPN or pull out their phone to do MFA. It’s just very disruptive.”

Searching for a replacement VPN

Mike’s team evaluated multiple VPNs before deciding to move forward with Tailscale. Mike was immediately impressed with Tailscale’s simplicity and clear documentation. He started a proof of concept and had Tailscale up and running in GCP, AWS, and multiple environments, with DNS resolving between all of them, in less than a day. When Instacart decided to switch from its old VPNs to Tailscale, Mike says, the process was seamless, thanks in part to Tailscale subnet routers. “Our Tailscale subnet routers had the same security groups applied to them that our old VPN nodes did,” he says. “Subnet routers let us access managed AWS services such as RDS where we can’t install Tailscale directly. They work almost like drop-in replacements for our old VPN nodes.”

For Instacart, Tailscale is more than just a VPN

Instacart uses Tailscale to access internal resources and troubleshoot production issues, and several individual features address some of its most specific use cases.

Instacart used to maintain a separate VPN to restrict access for HIPAA compliance, which was necessary to enable prescription medication delivery via Instacart. With Tailscale ACLs and exit nodes, Instacart found that maintaining compliance is easier than ever, because ACLs allow fine-grained control over who has access to its HIPAA-compliant environment. Tailscale exit nodes give Instacart more control over certain connections that are further restricted with IP allow lists. Exit nodes also enable Instacart to work securely with third-party services and internal tools that require allowlisting.
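To make the idea of fine-grained ACLs concrete, here is a minimal sketch in Python. It is not Instacart’s configuration: the group name, tag, users, and port are all hypothetical placeholders. The dictionary follows the general shape of a Tailscale policy file (groups, tagOwners, acls), and the is_allowed helper is a deliberately simplified model of default-deny matching written for illustration, not Tailscale’s actual evaluation logic.

```python
import json

# Hypothetical policy: the group, tag, users, and port are placeholders,
# not details taken from Instacart's environment.
policy = {
    "groups": {"group:hipaa": ["alice@example.com", "bob@example.com"]},
    "tagOwners": {"tag:hipaa": ["group:hipaa"]},
    "acls": [
        # Only members of group:hipaa may reach HIPAA-tagged machines on 443.
        {"action": "accept", "src": ["group:hipaa"], "dst": ["tag:hipaa:443"]},
    ],
}


def is_allowed(policy, user, tag, port):
    """Simplified default-deny check: True only if some accept rule matches.

    This models the idea of an allow list; it is an illustration and does
    not reproduce how Tailscale itself evaluates policies.
    """
    for rule in policy["acls"]:
        if rule["action"] != "accept":
            continue
        # A source matches if it names the user directly or a group containing them.
        src_ok = any(
            src == user or user in policy["groups"].get(src, [])
            for src in rule["src"]
        )
        # A destination matches on the exact port or a wildcard port.
        dst_ok = any(dst in (f"{tag}:{port}", f"{tag}:*") for dst in rule["dst"])
        if src_ok and dst_ok:
            return True
    return False


if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
    print(is_allowed(policy, "alice@example.com", "tag:hipaa", 443))
    print(is_allowed(policy, "carol@example.com", "tag:hipaa", 443))
```

Because nothing is permitted unless a rule accepts it, restricting an environment comes down to controlling membership of one group rather than running a separate VPN.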

Instacart uses a combination of public and private domains across several VPCs. Tailscale’s split DNS is a crucial feature for navigating these environments, says Mike: “Split DNS is essential to us joining multiple clouds and VPCs. It lets us continue to utilize all of our same domain names we have configured everywhere, but with a single VPN login.”
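The general idea behind split DNS can be sketched in a few lines of Python: queries for certain private domain suffixes go to an internal resolver, and everything else goes to a default resolver. The domain names and resolver addresses below are hypothetical, and the function only illustrates the routing concept; it is not how Tailscale implements the feature.

```python
# Hypothetical mapping of private domain suffixes to internal resolver IPs.
SPLIT_DNS = {
    "aws.internal.example": "10.0.0.2",
    "gcp.internal.example": "10.128.0.2",
}
DEFAULT_RESOLVER = "1.1.1.1"  # placeholder public resolver


def pick_resolver(hostname, routes=SPLIT_DNS, default=DEFAULT_RESOLVER):
    """Return the resolver for hostname, preferring the longest matching suffix."""
    name = hostname.rstrip(".").lower()
    best = None
    for suffix in routes:
        # Match the suffix itself or any name under it, on a label boundary.
        if name == suffix or name.endswith("." + suffix):
            if best is None or len(suffix) > len(best):
                best = suffix
    return routes[best] if best else default


if __name__ == "__main__":
    for host in ("db.aws.internal.example", "api.gcp.internal.example", "www.example.com"):
        print(host, "->", pick_resolver(host))
```

With routing like this, existing domain names in each cloud keep resolving through their own resolvers while the user stays on a single network login.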

Mike also appreciates that Tailscale’s high availability features work right out of the box, with no special configuration required: “It’s a single binary with a handful of flags, compared to our previous setup, which required manually configured clusters, a database, DNS health checking, and a non-trivial number of install and setup steps to add a node. Tailscale also provides new capabilities with ACLs, a programmable API, and an easy way to join multiple clouds into one network.”

What’s more, says Mike, Tailscale just makes Instacart more efficient.

“Because of Tailscale’s simplicity, both in architecture and end user experience, we can solve our acute problems quickly and easily.”

Mike Deeks, Senior Staff Software Engineer

The bottom line for Instacart

Switching to Tailscale has made life easier. Internal support requests at Instacart, each of which takes 15 minutes to two hours to resolve, have dropped from 10 a week to nearly zero. “It’s dead simple for end users,” notes Mike. New users can onboard in less than a minute, and they can use hardware MFA tokens and TouchID rather than less secure and less convenient push-notification or TOTP MFA. Outages have also dropped to zero.

But most importantly, thousands of Instacart engineers are no longer being disrupted by logging in and out of multiple VPNs to access the resources they need to do their jobs. With Tailscale, says Mike, “We don’t have to think about VPNs any more.”