How Aperture works

Last validated: Jun 24, 2026

Aperture by Tailscale is currently in beta.

Aperture routes all AI requests through a central proxy, giving organizations consistent visibility into AI usage. This topic explains the four core mechanisms: identity and authentication, request routing by model, telemetry capture, and session tracking. Together, these enable auditing, cost awareness, and operational insight while letting teams safely adopt LLM clients.

Identity and authentication

Traditional API proxies require clients to authenticate with tokens or API keys. Aperture eliminates this step by using Tailscale's identity layer.

Every connection to a tailnet carries cryptographic proof of identity. When a request arrives at Aperture, the proxy queries Tailscale with the remote IP address. Tailscale responds with the user's login name (for example, alice@example.com), a persistent device identifier, and any tags assigned to that device.

This identity is trustworthy because it comes from Tailscale's control plane, not from the client. A user cannot forge their identity without compromising Tailscale's key exchange.

The identity flows through the entire system. Every metric record includes the login name and device ID. The dashboard filters by user. Access control uses roles assigned through grants, keyed on Tailscale identity.

Clients connecting using ts-unplug also authenticate through Tailscale identity. Each ts-unplug instance joins the tailnet as its own device, so Aperture identifies requests by the ts-unplug device's login name. If each person runs their own ts-unplug instance, Aperture attributes activity to individual users. If multiple people share a single ts-unplug instance, all activity appears under the same identity.

Tagged device identity

When a tagged Tailscale device connects to Aperture, it does not have a user account associated with it. Instead, Aperture creates a synthetic identity from the device's tags.

Aperture sorts the device's tags alphabetically and joins them with commas to create a stable login name. For example, a device with the tags tag:prod and tag:api appears as tag:api,tag:prod in dashboards, logs, and session tracking.

If a device has no user profile and no tags, identity resolution fails, and the device is denied access.

To grant access to a tagged device, use this synthetic identity in the grant's src field (this applies to both grants and temp_grants, which use the same src JSON field with identical matching behavior):

A single-tagged device with tag:ci-runner:
```
"src": ["tag:ci-runner"]
```
A multi-tagged device with tag:api and tag:prod:
```
"src": ["tag:api,tag:prod"]
```
A wildcard that matches all nodes, including tagged nodes:
```
"src": ["*"]
```

Finding the correct tag string: To discover the exact tag string for a node, you can:

Temporarily set src: ["*"] to allow the node to connect, then observe the synthetic identity in Aperture's dashboards and tighten grants to the exact tag string.
Construct it manually by sorting the node's tags alphabetically and joining them with commas.
Find the node's tags in the Tailscale admin console.

If all sessions from tagged nodes appear to come from the same user, refer to all sessions appear to come from the same user in the troubleshooting guide.

Request routing by model

When a request arrives, Aperture extracts the model name from the request body (for example, claude-sonnet-4-6 or gpt-5.5). The proxy looks up which provider serves that model and forwards the request to that provider's API endpoint, injecting the correct authentication headers.

From the client's perspective, the proxy appears as if it were the LLM provider itself. Clients connect to the proxy URL and send standard API requests. The proxy handles the routing transparently.

Both legs of the connection are encrypted: clients reach Aperture over the tailnet using Tailscale encryption, and Aperture connects to upstream providers over TLS.

The following table summarizes the supported API formats. For full provider details, including compatibility flags, refer to the provider compatibility reference.

Format	Endpoint	Providers
OpenAI Chat	`POST /v1/chat/completions`	OpenAI, OpenRouter, llama.cpp
OpenAI Responses	`POST /v1/responses`	OpenAI
Anthropic Messages	`POST /v1/messages`	Anthropic
Gemini	`POST /v1beta/models/{model}:generateContent`	Google
Amazon Bedrock	`POST /bedrock/model/{model}/invoke`	Amazon Bedrock
Vertex AI	`POST /v1/projects/{project}/locations/{region}/publishers/{publisher}/models/{model}`	Google Vertex AI

Telemetry capture

The capture system records everything needed to reconstruct and analyze each LLM interaction:

Request data: HTTP method, path, headers (with sensitive values redacted), and full request body.
Response data: Status code, headers, and full response body.
Extracted metrics: Token counts by type (input, output, cached, and reasoning), model name, request duration, and tool use count.
Session context: Session ID linking related requests with supported providers and agents such as Claude Code and Codex.

The proxy processes telemetry asynchronously after the response completes, so clients receive responses without significant delay.

You can configure how long Aperture keeps this captured data by setting a retention policy that purges request and response bodies after a fixed duration, or only after they have been exported. You can also run Aperture with zero local retention, where request and response bodies are never written to disk or purged immediately after export. Only metrics such as token counts, model, and cost are kept. Refer to Observe and export AI usage for retention options.

The proxy handles two technical challenges transparently:

Compression: The proxy decompresses response bodies that arrive compressed (gzip, deflate, or Brotli) before storing them.
Streaming: For streaming responses (Server-Sent Events), the proxy reconstructs a complete response object from the event stream for consistent metric extraction.

Session tracking

A single coding task or conversation typically involves many LLM requests. A developer using Claude Code can generate 50 or more requests while debugging one issue. Without grouping, these appear as 50 unrelated events. With session tracking, they form a coherent unit you can analyze as a whole.

Aperture groups related requests into sessions by detecting session identifiers from different client types. The following table describes how each client type identifies sessions:

Client type	Identification method
Claude Code	Session ID included in each request body
OpenAI Codex	Session ID sent as HTTP header
OpenAI Chat	Fingerprint generated from conversation content
Other clients	Random identifier assigned

Sessions enable conversation-level analysis. The Logs page of the Aperture dashboard groups requests by session, showing the full context of a coding session or chat conversation. You can review token costs per conversation rather than per request, trace how a coding assistant deconstructed a task, or identify which conversation consumed the most resources.

Outbound integrations

Beyond routing LLM requests, Aperture proxies connections to external MCP servers and HTTP APIs through connectors. This extends the gateway model from inbound AI traffic to outbound tool and service access.

Connectors support two protocols. With the mcp protocol, Aperture acts as an MCP server to clients and as an MCP client to remote MCP servers. Clients discover available tools through Aperture's MCP tool list without needing direct access to the remote server. With the http protocol, Aperture functions as an authenticated reverse proxy to REST APIs, forwarding requests and injecting the required credentials.

Authentication follows the same centralized model as LLM providers. An administrator configures credentials for each connector once in the Aperture configuration. When a user or agent makes a request through a connector, Aperture injects the appropriate authentication headers automatically. For most connectors, individual users do not need separate credentials configured in their local tools. Connectors using per-user OAuth 2.0 (oauth2_authorization_code) require each user to complete a one-time authorization flow through the Aperture UI, but the resulting tokens are managed by Aperture, not the user's local tools.

For detailed setup instructions, refer to the connectors feature guide for HTTP API proxying and MCP server proxying for MCP-specific configuration.

Next steps

Get started with Aperture: sign up, configure providers, and connect your first LLM client.
Control AI access: set up grants that define which models each user can access.
How Aperture grants work: understand deny-by-default access, additive grants, and how Aperture resolves precedence.
Observe and export AI usage: access dashboards and export usage data.
Set up connectors: configure outbound integrations with MCP servers and HTTP APIs.