Problems setting up cloudsql-proxy with Workload identity, although wi-test works #1078

lenalebt · 2022-01-20T07:44:31Z

Question

I currently have trouble connecting to my cloudsql instances using workload identity, and I don't understand the error message provided. This is the error I get:

2022/01/20 07:15:11 current FDs rlimit set to 1048576, wanted limit is 8500. Nothing to do here.
2022/01/20 07:15:11 errors parsing config:
	Get "https://sqladmin.googleapis.com/sql/v1beta4/projects/maxxeed/instances/europe-west3~main-dev-testing2/connectSettings?alt=json&prettyPrint=false": metadata: GCE metadata "instance/service-accounts/default/token?scopes=https%!A(MISSING)%!F(MISSING)%!F(MISSING)www.googleapis.com%!F(MISSING)auth%!F(MISSING)sqlservice.admin" not defined

What does that mean exactly? I don't know how to debug this further.
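
A note for readers puzzling over the `%!A(MISSING)` fragments: they look like the marker Go's fmt package emits for a format verb that has no matching argument. That would happen if the percent-encoded scope (`%3A` for `:`, `%2F` for `/`) went through a Printf-style call. Assuming that is the cause, the sketch below recovers the scope the proxy asked for. The function name `restore_scope` is made up for illustration.

```shell
# Replace the fmt artifacts with the characters they most likely stood for:
#   %!A(MISSING)  <-  %3A  (":")
#   %!F(MISSING)  <-  %2F  ("/")
# This mapping is an assumption based on the shape of the log line.
restore_scope() {
  printf '%s\n' "$1" | sed -e 's/%!A(MISSING)/:/g' -e 's|%!F(MISSING)|/|g'
}

# Usage, with the scope copied from the log line:
# restore_scope 'https%!A(MISSING)%!F(MISSING)%!F(MISSING)www.googleapis.com%!F(MISSING)auth%!F(MISSING)sqlservice.admin'
```

Read that way, the message says the metadata server had no token to offer for the default service account with the Cloud SQL admin scope.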

Additional Context

The usual workload-identity tests work, as far as I can tell. I followed the steps in https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to, including verifying that workload identity works for the set-up service account by running this pod:

apiVersion: v1
kind: Pod
metadata:
  name: workload-identity-test
  namespace: K8S_NAMESPACE
spec:
  containers:
  - image: google/cloud-sdk:slim
    name: workload-identity-test
    command: ["sleep","infinity"]
  serviceAccountName: KSA_NAME

with my KSA_NAME. I even added this container as a sidecar to the cloudsql-proxy deployment to check whether some other problem in that deployment's configuration was causing the issue. From the sidecar I could still run curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/, which returned two entries:

default/
projects/foobar/serviceAccounts/cloudsql-proxy-dev@foobar.iam.gserviceaccount.com/

As far as I understand, I should only get one entry here. I don't know where the second one comes from, and I suspect it may be the problem.
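
To narrow this down from inside the pod, it may help to query the default identity directly, since the failing request in the log targets instance/service-accounts/default/token. This is a minimal sketch. It assumes the metadata server exposes the usual `email` and `token` attributes under service-accounts/default. The helper name `sa_metadata_url` is hypothetical.

```shell
# Base URL taken from the curl command above.
METADATA_BASE='http://169.254.169.254/computeMetadata/v1/instance/service-accounts'

# Print the metadata URL for one attribute (for example email or token)
# of a service account. The account defaults to "default", which is the
# one the proxy's failing request targets.
sa_metadata_url() {
  attribute="$1"
  account="${2:-default}"
  printf '%s/%s/%s\n' "$METADATA_BASE" "$account" "$attribute"
}

# Usage from a shell inside the pod (needs network access to the metadata server):
# curl -H "Metadata-Flavor: Google" "$(sa_metadata_url email)"
# curl -H "Metadata-Flavor: Google" "$(sa_metadata_url token)"
```

If the email comes back in an unexpected form or the token request fails, the problem is more likely in the identity binding than in the proxy itself.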

This is the deployment definition that currently is running (and crashing); I extracted it from the cluster and removed a few fields around managedFields and status:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudsql-proxy
  namespace: default
  labels:
    app: cloudsql-proxy
    app.kubernetes.io/managed-by: Helm
  annotations:
    meta.helm.sh/release-name: cloudsql-proxy
    meta.helm.sh/release-namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudsql-proxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: cloudsql-proxy
      annotations:
        prometheus.io/scrape: 'false'
        sidecar.istio.io/inject: 'false'
    spec:
      containers:
        - name: cloudsql-proxy
          image: eu.gcr.io/cloudsql-docker/gce-proxy:1.28.0
          command:
            - /cloud_sql_proxy
            - '-ip_address_types=PRIVATE'
            - '-instances=foobar:europe-west3:main-dev-testing2=tcp:0.0.0.0:5432'
          ports:
            - containerPort: 5432
              protocol: TCP
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 10m
              memory: 300Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsNonRoot: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: cloudsql-proxy
      serviceAccount: cloudsql-proxy
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  progressDeadlineSeconds: 600

Any pointers are much appreciated. Sorry if this turns out to be a generic workload identity problem, but I find the cloudsql-proxy error message quite confusing :-/.


lenalebt · 2022-01-20T10:52:48Z

After rubber-ducking through every step in detail, we found that I had annotated the Kubernetes service account incorrectly. This is how I should have annotated it:

kubectl annotate serviceaccount cloudsql-proxy --namespace default iam.gke.io/gcp-service-account=cloudsql-proxy-dev@foobar.iam.gserviceaccount.com

and this is what I actually had:

kubectl annotate serviceaccount cloudsql-proxy --namespace default iam.gke.io/gcp-service-account=projects/foobar/serviceAccounts/cloudsql-proxy-dev@foobar.iam.gserviceaccount.com

The annotation was buried under a layer of Terraform. Sorry for the interruption, I hope this helps somebody else who ends up in the same situation later on :)
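
One way to catch this class of mistake before it reaches the cluster is to check the shape of the annotation value. The sketch below assumes the value of the documented `iam.gke.io/gcp-service-account` annotation must be a plain service account email of the form NAME@PROJECT.iam.gserviceaccount.com. It is a heuristic, not an official validator, and the function name `is_gsa_email` is made up.

```shell
# Return 0 if the value looks like a plain Google service account email,
# 1 otherwise. Full resource paths (projects/.../serviceAccounts/...) are
# rejected because they contain slashes.
is_gsa_email() {
  case "$1" in
    */*) return 1 ;;
    ?*@?*.iam.gserviceaccount.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage against a live cluster (requires kubectl access):
# value=$(kubectl get serviceaccount cloudsql-proxy --namespace default \
#   -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')
# is_gsa_email "$value" || echo "suspicious annotation value: $value"
```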

enocom · 2022-01-20T15:59:25Z

Glad you figured it out. Getting workload identity set up is definitely tricky, and the proxy's error message is pretty terrible. By the way, we are working on fixing the error messages as part of #872.

lenalebt added the type: question label Jan 20, 2022

blunderbuss-gcf bot assigned enocom Jan 20, 2022

lenalebt closed this as completed Jan 20, 2022
