How Positron easily scales AI deployments for customers with Tailscale

Founded in 2023, Positron builds inference systems for large language models (LLMs) using field-programmable gate arrays (FPGAs) to achieve high inference speeds. FPGAs are customizable chips that distinguish Positron from conventional GPU offerings, delivering inference performance for AI models such as those built with Hugging Face’s Transformers library. Positron excels at taking trained models and executing them without needing a compiler or integration with PyTorch or TensorFlow.

Barrett Woodside, VP of Product at Positron, shares how the company adopted Tailscale as a VPN before quickly realizing its potential as a broader networking solution that would benefit their products and customers. Tailscale is now instrumental for secure connectivity between Positron’s AI inference servers and distributed data centers, strengthening the “try before you buy” business model of Positron’s managed inference servers.

It started as a straightforward use case

Positron needed a VPN solution, and Tailscale passed the sniff test with senior, trusted minds internally. Thomas Sohmers, Positron’s CEO at the time and now CTO, advocated for Tailscale as the superior solution because it is built on WireGuard, an open-source protocol and industry standard. Barrett also mentions Edward Kmett, CTO, in a tongue-in-cheek remark that summarizes it best: “Our CTO is very particular about all of his software and everything he uses. He was onboard with Tailscale immediately. So I was like okay, can't be that bad.”

“Tailscale was our first VPN, and I have to say, it was easier to set up, configure, and manage than any past experience with OpenVPN, Palo Alto Networks, or Cisco AnyConnect.”
Barrett Woodside, VP of Product

Adopting Tailscale immediately provided the benefits of a traditional business VPN. Positron had performance dashboards and admin panels that could only be accessed internally, a problem easily solved by setting up their new tailnet. Onboarding engineers and granting them access to systems can be a headache. With Tailscale, this became trivial: “Especially with Google Workspace integration, we're capable of bringing on new team members almost without thinking. That's super helpful given that most of our team are engineers.”

Perks like Tailscale SSH, which removes the tedium of managing and transferring public keys, were a welcome change, and expanding Positron’s fleet of employee workstations, demo machines, and engineering servers became frictionless. “Tailscale has certainly saved us from having to maintain our own WireGuard infrastructure and even made us ready for an external audit in the future. It saves us an hour per onboarded prospect”, Barrett estimates, a time-saving benefit that will scale alongside the company’s growth.

But Barrett soon realized that, given Tailscale’s broader networking capabilities, these benefits could be extended to support Positron’s product offerings and customers. Tailscale quickly became a critical part of Positron’s deployment strategy.

Helping drive adoption and deployment for Positron’s customers

As a company that sells hardware, Positron mirrors their customers in preferring their own data centers and on-premises machines. However, some ingenuity is needed to connect to and serve customers globally.

Positron’s data center resides in the State of Washington and houses their FPGA-based inference servers. These machines do not serve API endpoints; they only push packets to separate API servers. These API servers do not run on-premises: Positron runs them on Fly.io in a distributed manner so that they sit geographically close to the user. “This way, the database for the API endpoint is local to the user. Whether the user is in Bombay or Stockholm, it's in a local data center”, explains Barrett.

This is where Tailscale comes in: each of these Fly.io instances is a container running Tailscale. Tailscale connects these API servers back to the original data center in Washington, granting them access to the FPGA servers. “We use Tailscale to punch a hole through these Docker containers to our data center, in a secure way, so they have access to the servers with all the FPGAs in them.”
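As a rough illustration of how a containerized API server might join a tailnet at startup, the Python sketch below builds a `tailscale up` command line from an auth key, a hostname, and a list of tags. It is not Positron's actual configuration: the function name, the hostname `api-stockholm`, and the tag `tag:api-server` are hypothetical, and the sketch assumes the `tailscaled` daemon is already running inside the container.

```python
import shlex


def tailscale_up_command(auth_key: str, hostname: str, tags: list[str]) -> list[str]:
    """Build the argv for `tailscale up` so a container can join a tailnet unattended.

    Assumes tailscaled is already running in the container. All names passed
    in by the caller (hostname, tags) are illustrative, not Positron's.
    """
    if not auth_key:
        raise ValueError("an auth key is required for unattended login")
    for tag in tags:
        # Tailscale tags are always written with a "tag:" prefix.
        if not tag.startswith("tag:"):
            raise ValueError(f"tag must start with 'tag:': {tag!r}")

    cmd = ["tailscale", "up", f"--auth-key={auth_key}", f"--hostname={hostname}"]
    if tags:
        # Multiple tags are passed as one comma-separated value.
        cmd.append("--advertise-tags=" + ",".join(tags))
    return cmd


if __name__ == "__main__":
    # Placeholder key for display only; a real key would come from a secret store.
    cmd = tailscale_up_command("tskey-auth-EXAMPLE", "api-stockholm", ["tag:api-server"])
    print(shlex.join(cmd))
```

In a real deployment the resulting command would typically be run from the container's entrypoint before the API server process starts, so the server only accepts traffic once the node is on the tailnet.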

This connectivity is essential to Positron’s business model. Positron has a managed inference server offering called Testflight, which is positioned as a “try before you buy” service. Barrett notes, “Tailscale enables us to bring up new API servers at new subdomains for every new customer and prospect that wants to use our Testflight remote access program.” For each of these Testflight customers, Positron only has to configure ACLs and tags through Tailscale to ensure their environment is secure.

Tailscale started as only a VPN solution for Positron. It has quickly become critical to spinning up API front-end nodes, with their databases, close to customers and creating a secure tunnel back to Positron’s data center. To Barrett, it is clear that Tailscale has become something more: “For us, Tailscale is like a dev tool. It brings people onto our network to access all the dev machines easily.”

Benefiting from the Tailscale AI Startup Program

“We're excited to be a part of the AI startup program as it helps us deliver the fastest and lowest cost generative AI applications without worrying about connectivity for our customers.”
Thomas Sohmers, Co-Founder and CTO

Positron is part of the Tailscale AI Startup Program, designed for early-stage companies that need to scale their networks quickly and securely. Positron and all admitted startups receive one year of Tailscale’s Enterprise plan for free. Access to this Enterprise plan better positions companies like Positron to build reliable AI infrastructure with prioritized support from the Tailscale team.