Troubleshooting the Kubernetes operator
Using logs
If you are experiencing issues with your installation, it might be useful to take a look at the operator logs.
For ingress and egress proxies and the Connector, the operator creates a single replica StatefulSet in the tailscale namespace that is responsible for proxying the traffic to and from the tailnet. If the StatefulSet has been successfully created, you should also look at the logs of its Pod.
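To check whether the StatefulSet and its Pod have been created, you can list them in the tailscale namespace:
kubectl get statefulsets,pods --namespace tailscale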
Operator logs
You can increase the operator's log level to get debug logs.
To set the log level to debug for an operator deployed using Helm, run:
helm upgrade --install \
operator tailscale/tailscale-operator \
--set operatorConfig.logging=debug
If you deployed the operator using static manifests, you can set the OPERATOR_LOGGING environment variable for the operator's Deployment to debug.
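For example, assuming the Deployment is named operator in the tailscale namespace (as in the commands below), one way to set it without editing the manifest is:
kubectl set env deployment/operator --namespace tailscale OPERATOR_LOGGING=debug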
To view the logs, run:
kubectl logs deployment/operator --namespace tailscale
Proxy logs and events
To get logs and events for the proxy created for an Ingress resource, run:
$ pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=ingress,tailscale.com/parent-resource=<ingress-name>,tailscale.com/parent-resource-ns=<ingress-namespace> \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
$ kubectl logs ${pod_name} --namespace tailscale
$ kubectl describe pod ${pod_name} --namespace tailscale
To get logs and events for a proxy created for an ingress or egress Service, run:
$ pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=svc,tailscale.com/parent-resource=<service-name>,tailscale.com/parent-resource-ns=<service-namespace> \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
$ kubectl logs ${pod_name} --namespace tailscale
$ kubectl describe pod ${pod_name} --namespace tailscale
To get logs and events for a proxy created for a Connector, run:
$ pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=connector,tailscale.com/parent-resource=<connector-name> \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
$ kubectl logs ${pod_name} --namespace tailscale
$ kubectl describe pod ${pod_name} --namespace tailscale
To get logs and events for a proxy created for a ProxyGroup, run:
$ pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=proxygroup,tailscale.com/parent-resource=<proxy-group-name> \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
$ kubectl logs ${pod_name} --namespace tailscale
$ kubectl describe pod ${pod_name} --namespace tailscale
Cluster egress/cluster ingress proxies
The proxy pod is deployed in the tailscale namespace and will have a name of the form ts-<annotated-service-name>-<random-string>.
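To confirm that the proxy pod exists and is running, you can filter on the parent-resource label used in the earlier examples (substitute your annotated Service's name):
kubectl get pods --namespace tailscale --selector=tailscale.com/parent-resource=<annotated-service-name>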
If there are issues reaching the external service, verify the proxy pod is properly deployed:
- Review the logs and events of the proxy pods.
- Review the logs of the operator. You can do this by running kubectl logs deploy/operator --namespace tailscale. The log level can be configured using the OPERATOR_LOGGING environment variable in the operator's manifest file.
- Verify that the cluster workload can send traffic to the proxy pod in the tailscale namespace, as illustrated in the example after this list.
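A minimal sketch of such a connectivity check, assuming the workload image includes a shell and wget and that <port> is the port your workload uses to reach the service (both are assumptions; substitute your own values):
proxy_ip=$(kubectl get pod ts-<annotated-service-name>-<random-string> --namespace tailscale \
-ojsonpath='{.status.podIP}')
kubectl exec <workload-pod-name> --namespace <workload-namespace> -- wget -qO- http://${proxy_ip}:<port>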
TLS connection errors
If you are connecting to a workload exposed to the tailnet over Ingress or to the Kubernetes API server over the operator's API server proxy, you can sometimes run into TLS connection errors.
Check the following, in sequence:
- HTTPS is not enabled for the tailnet.
To use Tailscale Ingress or the API server proxy, you must ensure that HTTPS is enabled for your tailnet.
- LetsEncrypt certificate has not yet been provisioned.
If HTTPS is enabled, the errors are most likely related to the LetsEncrypt certificate provisioning flow.
For each Tailscale Ingress resource, the operator deploys a Tailscale node that runs a TLS server. This server is provisioned with a LetsEncrypt certificate for the MagicDNS name of the node. For the API server proxy, the operator also runs an in-process TLS server that proxies tailnet traffic to the Kubernetes API server. This server gets provisioned with a LetsEncrypt certificate for the MagicDNS name of the operator. In both cases, the certificates are provisioned lazily the first time a client connects to the server. Provisioning takes some time, so you might see some TLS timeout errors.
You can examine the logs to follow the certificate provisioning process:
- For the API server proxy, review the operator logs.
- For Ingress, review the proxy logs.
There is currently nothing you can do to prevent the first client connection occasionally failing. Do reach out if this is causing issues for your workflow.
- You have hit LetsEncrypt rate limits.
If the connection does not succeed even after the first connection attempt, verify that you have not hit LetsEncrypt rate limits. If a limit has been hit, you will be able to see the error returned from LetsEncrypt in the logs; see the example after this list.
We are currently working on making it less likely for users to hit LetsEncrypt rate limits. See the related discussion in tailscale/tailscale#11119.
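For example, a rough way to scan the operator logs (relevant for the API server proxy) for certificate-related messages; for an Ingress, run the same filter against the proxy Pod's logs instead. The grep pattern is only a starting point, not an exhaustive match:
kubectl logs deployment/operator --namespace tailscale | grep -iE 'cert|acme|letsencrypt'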
Tailscale Ingress troubleshooting
This section contains additional information for troubleshooting Tailscale Ingress.
- Check that the operator has configured the ingress proxy correctly.
For example, if you have a Tailscale Ingress with a backend Service similar to the following:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: tailscale
  rules:
    - http:
        paths:
          - backend:
              service:
                name: my-app
                port:
                  number: 80
            path: /login
            pathType: Prefix
  tls:
    - hosts:
        - my-app
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  clusterIP: 192.0.2.9
  ports:
    - port: 80
- Validate that the ingress proxy's configuration matches the Ingress definition:
INGRESS_NAME=<ingress-resource-name> \
INGRESS_NAMESPACE=<ingress-resource-namespace> \
secret_name=$(kubectl get secret --selector=tailscale.com/parent-resource-type=ingress,tailscale.com/parent-resource=${INGRESS_NAME},tailscale.com/parent-resource-ns=${INGRESS_NAMESPACE} \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
kubectl get secret ${secret_name} -n tailscale -ojsonpath={.data.serve-config} | base64 -d
'{"TCP":{"443":{"HTTPS":true}},"Web":{"${TS_CERT_DOMAIN}:443":{"Handlers":{"/login":{"Proxy":"http://192.0.2.9:80/"}}}}}'
If the configuration appears to be incorrect, check the operator logs for any errors relating to configuring the Ingress.
- Check that the ingress proxy has loaded the configuration.
Run the following command to view the proxy's current configuration:
INGRESS_NAME=<ingress-resource-name> \
INGRESS_NAMESPACE=<ingress-resource-namespace> \
pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=ingress,tailscale.com/parent-resource=${INGRESS_NAME},tailscale.com/parent-resource-ns=${INGRESS_NAMESPACE} \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
kubectl exec ${pod_name} --namespace tailscale -- tailscale serve status
The output should be similar to the following:
https://my-app.<tailnetxyz>.ts.net (tailnet only)
|-- / proxy http://192.0.2.9:80/
If the configuration appears to be incorrect, check the proxy Pod's logs.
- Verify that the Service backend is reachable from the ingress proxy Pod:
INGRESS_NAME=<ingress-resource-name> \
INGRESS_NAMESPACE=<ingress-resource-namespace> \
pod_name=$(kubectl get pod --selector=tailscale.com/parent-resource-type=ingress,tailscale.com/parent-resource=${INGRESS_NAME},tailscale.com/parent-resource-ns=${INGRESS_NAMESPACE} \
--namespace tailscale -ojsonpath='{.items[0].metadata.name}')
kubectl exec -it ${pod_name} -n tailscale -- sh
# Then, in the shell session inside the proxy Pod:
apk add curl
curl http://192.0.2.9:80/
If the backend cannot be reached, the issue is likely related to cluster connectivity from the tailscale namespace.
Verify that you don't have a NetworkPolicy in place that prevents Pods in the tailscale namespace from talking to the Ingress backend; listing the policies as shown below is a quick first check.
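For example, to see which NetworkPolicies exist in the backend's namespace and in the tailscale namespace (this assumes the backend runs in the default namespace; adjust to your setup):
kubectl get networkpolicy --namespace default
kubectl get networkpolicy --namespace tailscale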