Google Cloud Platform reference architecture
This document details best practices and a reference architecture for Tailscale deployments on Google Cloud Platform (GCP). The following guidance applies to all Tailscale modes of operation—devices, exit nodes, subnet routers, and the like.
For the purposes of this document, the following terminology is used:
- Tailscale device—a Tailscale node, exit node, subnet router, and the like.
- Tailscale agent—the Tailscale client that runs on a device to allow it to connect to the Tailscale network.
- GCP resource—a resource in Google Cloud Platform. This can be a GCE instance, a Cloud SQL instance, a Cloud Run service, and the like.
High-level architecture
This diagram illustrates how Tailscale integrates with Google Cloud Platform to provide secure, reliable network connectivity across distributed cloud resources. The architecture shows a Virtual Private Cloud (VPC) divided into multiple zones, each containing both public and private subnets with strategically placed Tailscale components.

Within the public subnets, Tailscale connectors are deployed with failover capabilities to handle inbound connections from external sources. These connectors serve as access points to GCP services including Cloud SQL, Cloud Storage, and other platform offerings. In parallel, private subnets host redundant Tailscale SSH session recorders to ensure high availability for security monitoring.
The architecture maintains clear separation between public-facing components and private infrastructure, with multiple connection pathways (Tailscale connectors, Tailscale instances, GKE Control Plane, and Cloud NAT) working together to enhance reliability, security, and accessibility across the entire cloud deployment.
Ways to deploy Tailscale to connect to and from GCP resources
Tailscale provides a few options for connecting to resources within GCP. At a high level, they are:
- Agent-to-agent connectivity—connect to "static" resources such as Google Compute Engine (GCE) instances. This is recommended when you can install and run Tailscale directly on the resource you wish to connect to.
- IP-based connectivity with a Tailscale subnet router—connect to managed GCP resources such as Cloud SQL or Spanner. This is recommended when you either cannot run Tailscale on the resource you are connecting to or you want to expose an existing subnet or services in a Virtual Private Cloud (VPC) to your tailnet.
- DNS-based routing with a Tailscale app connector—connect to software as a service (SaaS) applications or other resources over your tailnet with DNS-based routing.
- Kubernetes services and auth proxy with Tailscale Kubernetes operator—expose services in your Google Kubernetes Engine (GKE) cluster and your GKE cluster control plane directly to your Tailscale network. This is recommended when you are connecting to resources running in a Kubernetes cluster, or to a Kubernetes cluster's control plane.
- Cloud Run and other container services—access resources in your tailnet from Cloud Run, Cloud Run functions, and other container solutions.
Agent-to-agent connectivity
It's best practice to install the Tailscale client (agent) whenever possible—for example, when setting up servers on GCE instances. Installing Tailscale on your GCE instances directly generally provides the best and most scalable connectivity while enabling Tailscale agent-based functionality such as Tailscale SSH.
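As a minimal sketch, on a Debian-based GCE instance you might install the agent and authenticate with an auth key generated in the admin console (the key and hostname below are placeholders):

```shell
# Install the Tailscale agent
curl -fsSL https://tailscale.com/install.sh | sh

# Join the tailnet; replace the placeholder auth key with one from the admin console
sudo tailscale up --auth-key=tskey-auth-REPLACE-ME --hostname=gce-app-server
```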
IP-based connectivity with subnet router
For managed resources where you cannot install the Tailscale agent, such as Cloud SQL, Spanner, and similar services, you can run a subnet router within your VPC to access these resources from Tailscale. Subnet routers can also be used to connect to resources using Private Service Connect.
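For example, assuming a subnet router that advertises an example VPC subnet of 10.0.0.0/24, you would enable IP forwarding on the instance, advertise the route, and then approve it in the admin console:

```shell
# Enable IP forwarding so the instance can route traffic for the subnet
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf

# Advertise the VPC subnet (10.0.0.0/24 is an example; use your subnet's CIDR)
sudo tailscale up --advertise-routes=10.0.0.0/24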
DNS-based routing with an app connector
App connectors let you route traffic bound for SaaS applications or managed services by proxying DNS for the target domains and advertising routes for the observed DNS results. This is useful when an application maintains an allowlist of IP addresses that are permitted to connect to it: you can add the IP addresses of the nodes running the app connector to the allowlist, and all nodes in the tailnet will egress traffic to that application through those addresses.
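As a sketch, a node can advertise itself as an app connector from the CLI; the domains it handles are then assigned to its tag in the tailnet policy file (tag:connector is an example tag name):

```shell
# Advertise this node as an app connector; the tag is an example and must be
# granted domains in the tailnet policy file
sudo tailscale up --advertise-connector --advertise-tags=tag:connector
```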
Kubernetes services and API server proxy with Tailscale Kubernetes Operator
The Tailscale Kubernetes operator lets you expose services in your Kubernetes cluster to your Tailscale network, and use an API server proxy for secure connectivity to the Kubernetes control plane.
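For example, with the operator installed in your GKE cluster, you can expose an existing cluster Service to your tailnet by annotating it (the Service and namespace names are placeholders):

```shell
# Expose an existing Service to the tailnet via the Tailscale Kubernetes operator
kubectl annotate service my-service -n my-namespace tailscale.com/expose=true
```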
Cloud Run and other container services
Tailscale supports userspace networking where processes in the container can connect to other resources on your Tailscale network via a SOCKS5 or HTTP proxy. This allows Cloud Run, Cloud Run functions, and other container-based solutions to connect to the Tailscale network with minimal configuration needed.
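The following is a minimal container entrypoint sketch, assuming the Tailscale binaries are bundled in the image, TS_AUTHKEY is supplied as a secret, and /app/server is the application binary:

```shell
#!/bin/sh
# Start tailscaled in userspace networking mode with local SOCKS5/HTTP proxies
tailscaled --tun=userspace-networking \
  --socks5-server=localhost:1055 \
  --outbound-http-proxy-listen=localhost:1055 &

# Authenticate to the tailnet (TS_AUTHKEY is an example secret name)
tailscale up --auth-key="${TS_AUTHKEY}" --hostname=cloud-run-app

# Route the application's outbound traffic through the local proxy
ALL_PROXY=socks5://localhost:1055/ HTTP_PROXY=http://localhost:1055/ exec /app/server
```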
Production best practices
The following are general recommendations and best practices for running Tailscale in production environments; most are explained in greater detail throughout this document:
- When possible, deploy subnet routers, exit nodes, app connectors, and the like, to public subnets with public IP addresses to ensure direct connections and optimal performance.
- Run subnet routers, exit nodes, app connectors, and the like, separately from the systems you are administering with Tailscale—for example, run your subnet routers outside of your GKE clusters.
- Deploy dynamically scaled resources (for example, containers or serverless functions) as ephemeral nodes to automatically clean up devices after they shut down.
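For example, an ephemeral, pre-authorized auth key for dynamically scaled workloads can be created with the Tailscale API and injected into those workloads (the API key, tag, and expiry below are placeholders):

```shell
# Create an ephemeral, pre-authorized auth key for short-lived workloads
curl -s -u "tskey-api-REPLACE-ME:" \
  "https://api.tailscale.com/api/v2/tailnet/-/keys" \
  --data '{
    "capabilities": {
      "devices": {
        "create": {
          "reusable": true,
          "ephemeral": true,
          "preauthorized": true,
          "tags": ["tag:cloudrun"]
        }
      }
    },
    "expirySeconds": 86400
  }'
```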
High availability and regional routing
- Run multiple subnet routers and app connectors across multiple GCP zones to improve resiliency against zone failures with high availability failover, and deploy across multiple regions for regional routing.
- Run multiple Tailscale SSH session recorder nodes across multiple GCP zones to improve resiliency against zone failures with recorder node failover, and deploy across multiple regions for regional routing.
Performance best practices
When deploying Tailscale in Google Cloud Platform (GCP), several performance considerations should be taken into account. These include selecting appropriate instance types based on workload requirements, configuring Cloud NAT for optimal direct connections, deploying subnet routers strategically, and ensuring high availability through multi-zone deployments.
Refer to Performance best practices for general recommendations.
In-region load balancing
Deploy multiple overlapping connectors within a DERP region to take advantage of in-region load balancing, which spreads load evenly across the connectors on a best-effort basis and provides in-region redundancy.
Recommended instance sizing
When selecting instance types for Tailscale deployments on GCP, the requirements vary significantly based on the role of the instance. For normal Tailscale device usage, the instance type is typically determined by the primary workload rather than Tailscale's minimal resource needs. However, for specialized roles like subnet routers, exit nodes, and app connectors, careful consideration should be given to CPU performance, network bandwidth limits, and instance stability. Below are specific recommendations for different Tailscale deployment scenarios.
Normal usage
When installing Tailscale on a GCE instance as a "normal" Tailscale device (for example, not a subnet router or exit node), you have likely already sized that instance appropriately for its workload, and running Tailscale on it will add negligible resource usage.
Subnet routers, exit nodes, and app connectors
Many variables affect performance and workloads vary widely, so we do not have specific size recommendations. We do, however, have general guidance for selecting an instance type for a GCE instance running as a subnet router, exit node, or app connector. In general:
- Higher CPU clock speed is more important than more cores.
- Instances with Arm-based processors are cost-effective for packet forwarding.
- Avoid shared-core machine types. These machine types timeshare a physical core and can result in inconsistent performance.
- Consult Google's network bandwidth documentation to better understand per-instance egress and ingress limitations.
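As an illustrative example only (the machine type, zone, and names are assumptions, not specific recommendations), a dedicated-core, Arm-based Tau T2A instance for a subnet router could be created as follows:

```shell
# Example only: a dedicated-core, Arm-based instance to run as a subnet router
gcloud compute instances create tailscale-subnet-router \
  --zone=us-central1-a \
  --machine-type=t2a-standard-2 \
  --image-family=debian-12-arm64 \
  --image-project=debian-cloud
```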
Firewall rules
Tailscale uses various NAT traversal techniques to safely connect to other Tailscale nodes without manual intervention. In nearly all cases, you do not need to open any firewall ports for Tailscale. However, if your VPC firewall rules are overly restrictive about internet-bound egress traffic, refer to What firewall ports should I open to use Tailscale.
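If you want to increase the likelihood of direct connections to devices that are reachable from the internet, you can optionally allow inbound WireGuard traffic on Tailscale's default UDP port 41641 (the rule name and target tag below are placeholders):

```shell
# Optional: allow direct inbound WireGuard connections to tagged Tailscale devices
gcloud compute firewall-rules create allow-tailscale-direct \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=udp:41641 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=tailscale
```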
Public vs private subnets
Tailscale devices deployed to a public subnet with a public IP address benefit from direct connections between nodes, which provide the best performance.
Cloud NAT
Tailscale uses both direct and relayed connections, opting for direct connections where possible. Enable endpoint-independent mapping on Cloud NAT so that Tailscale can establish direct connections to resources behind NAT.
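For example, endpoint-independent mapping can be enabled on an existing Cloud NAT gateway (the NAT, router, and region names are placeholders):

```shell
# Enable endpoint-independent mapping on an existing Cloud NAT gateway
gcloud compute routers nats update my-nat-gateway \
  --router=my-cloud-router \
  --region=us-central1 \
  --enable-endpoint-independent-mapping
```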
VPC peering and Network Connectivity Center
Deploy a subnet router (or a set for high availability) within a VPC to allow access to multiple VPCs through VPC peering or Network Connectivity Center.
If you have VPCs or subnets with overlapping IPv4 addresses, use 4via6 subnet routers to access resources with unique IPv6 addresses for each overlapping subnet.
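As a sketch, the tailscale CLI includes a helper that translates a site ID and an overlapping IPv4 subnet into the unique 4via6 route to advertise (site ID 1 and 10.0.0.0/24 are examples; the printed route will differ for your values):

```shell
# Compute the 4via6 route for site 1, subnet 10.0.0.0/24
tailscale debug via 1 10.0.0.0/24
# Example output: fd7a:115c:a1e0:b1a:0:1:a00:0/120

# Advertise the computed route from that site's subnet router
sudo tailscale up --advertise-routes=fd7a:115c:a1e0:b1a:0:1:a00:0/120
```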
Subnet routers within GKE
Organizations often use Tailscale to connect to and administer their GKE clusters. While Tailscale can run within a container and be deployed to GKE, we recommend running your subnet routers outside of these clusters so that connectivity remains available if your cluster is having issues. In other words, run your subnet routers on dedicated GCE instances or in a GKE cluster separate from the cluster you're administering.
Tailscale SSH session recording
Deploy multiple session recorder instances across multiple zones to improve resiliency against zone failures. If your organization operates across multiple regions, consider deploying SSH session recording nodes in each region where you operate, and configure SSH access rules to send recordings to the recorder in the local region for your nodes.