
[BUG] when using aad-pod-identity, env-injector fail to pull image from ACR #55

Closed
rgbusato opened this issue Mar 27, 2020 · 89 comments
Labels: bug (Something isn't working)

@rgbusato

Note: Make sure to check out known issues (https://github.com/sparebankenvest/azure-key-vault-to-kubernetes#known-issues) before submitting

Your question
Does the env-injector use some custom mechanism to authenticate with a private docker registry to inspect the image in order to overwrite the entrypoint or cmd?
I'm having some issues where the env-injector container is showing logs where it attempts to pull the image from ACR but it fails.

To Reproduce
If question relates to a certain behavior, describe steps to reproduce:
I'm using AKS + Azure Container Registry with a service principal authentication mechanism that allows me to spin up containers in AKS from images in my private registry (ACR).
The original pod itself was deployed successfully, and I can verify that the AKS cluster has the necessary permissions to pull images from ACR.
However, I noticed that no env-injector debug logs were showing up for my container, so I poked around the env-injector container and found some error messages related to its inability to pull Docker images from ACR.

Logs
Here is the command I am running to find that information:
$ kubectl logs -n akv2k8s azure-key-vault-env-injector-5458b4bc9d-njxrz -c azure-key-vault-env-injector

Here are the logs indicating the problem mentioned:

time="2020-03-27T19:43:58Z" level=info msg="found pod to mutate in namespace 'argo'"
time="2020-03-27T19:43:58Z" level=info msg="found container 'api-test' to mutate"
time="2020-03-27T19:43:58Z" level=info msg="checking for env vars containing '@azurekeyvault' in container api-test"
time="2020-03-27T19:43:58Z" level=info msg="found env var: db-connection-string@azurekeyvault"
time="2020-03-27T19:43:58Z" level=info msg="did not find credentials to use with registry '****.azurecr.io' - getting default credentials"
time="2020-03-27T19:43:58Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-03-27T19:43:58Z" level=info msg="pulling docker image ***.azurecr.io/***:latest to get entrypoint and cmd, timeout is 120 seconds"
2020/03/27 19:43:59 [ERROR] admission webhook error: failed to get auto cmd, error: failed to pull docker image '****.azurecr.io/***:latest', error: Error response from daemon: Get https://****.azurecr.io/v2/***/manifests/latest: unauthorized: authentication required
@rgbusato rgbusato added the question Further information is requested label Mar 27, 2020
@rgbusato
Author

I forgot to mention that I am using aad-pod-identity with my akv2k8s setup.
I used the Helm documentation to install the env-injector with aad-pod-identity.
I was able to test the tool successfully with a publicly available Docker image. But now that I'm trying to deploy the real service, which uses a Docker image stored in ACR, it fails as described above.

@torresdal
Collaborator

@rgbusato I'm afraid this is most likely a bug in the env-injector. What happens is that you are using aad-pod-identity (custom credentials), but the webhook is trying to use default credentials to download the image (error: open /etc/kubernetes/azure.json). The problem is that because you are using custom credentials, the webhook Pod does not map that file - which is a bug.

As I see it, you have three options:

  1. Don't use aad-pod-identity and rely on default auth - that is the AKS service principal
  2. Use the built-in Kubernetes imagePullSecrets referencing a secret containing a service principal with access to ACR
  3. Wait for a bug-fix

As a side note, the upcoming version of env-injector does not download the image, but uses remote inspection of the Docker image to find the cmd/entrypoint; it will still need credentials to access ACR.
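Option 2 above could look roughly like this (all names here are hypothetical, for illustration only; the secret would typically be created with kubectl create secret docker-registry using the service principal's id and password):

```yaml
# Hypothetical sketch of Kubernetes-native imagePullSecrets for ACR.
apiVersion: v1
kind: Secret
metadata:
  name: acr-credentials          # hypothetical name
  namespace: my-namespace        # hypothetical namespace
type: kubernetes.io/dockerconfigjson
data:
  # base64-encoded docker config containing the service principal
  # credentials for the registry (placeholder value)
  .dockerconfigjson: eyJhdXRocyI6IHsgLi4uIH19
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-namespace
spec:
  imagePullSecrets:
  - name: acr-credentials        # reference the secret above
  containers:
  - name: my-app
    image: myregistry.azurecr.io/my-app:latest
```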

@torresdal torresdal added the bug Something isn't working label Mar 27, 2020
@torresdal torresdal changed the title [Question] env-injector failing to pull image from ACR [BUG] env-injector failing to pull image from ACR when using aad-pod-identity Mar 27, 2020
@rgbusato
Author

Thanks for the detailed response @torresdal.
I will probably just use the Kubernetes-native imagePullSecrets for now.
Thanks for helping build and maintain this handy tool.

@rgbusato
Author

@torresdal I've tried the second option you mentioned, but it doesn't seem to work.

Here are the steps I took:

  1. I've created a docker registry k8s secret in the akv2k8s namespace.
  2. I've patched the azure-key-vault-env-injector service account to inject the imagePullSecrets in every pod.
  3. I've forced the creation of a new set of env-injector pods and verified that the imagePullSecrets information was correct.
  4. I've redeployed my app and checked the env-injector pod for logs.

I got the same error I was getting before.

I think the reason this doesn't work is that the env-injector pod tries to pull the image from within application code, so the imagePullSecrets I've configured do not take effect on that request to the private registry.

I might just use the controller sync mechanism for now until this is fixed.

I'm not sure of the effort involved in fixing this, but if you have any ideas on how to fix it easily, I might be able to convince some of the Go devs on our team to help you get it done.

@ChaosGramer
Contributor

We had a similar issue; what we did was change how deployments are created in our CD pipelines.

For now, every pod has an explicit cmd in its spec, so the env-injector does not try to download/inspect the Docker image to find the entrypoint.
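The workaround above amounts to setting an explicit command (and optionally args) in the container spec, so the injector never needs registry access to discover the entrypoint. A minimal sketch with hypothetical names:

```yaml
spec:
  containers:
  - name: my-app                                # hypothetical
    image: myregistry.azurecr.io/my-app:latest  # hypothetical
    # With command set explicitly, the env-injector does not need to
    # pull or inspect the image to find the entrypoint/cmd.
    command: ["/app/start.sh"]
    args: ["--serve"]
```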

@torresdal
Collaborator

It should use the imagePullSecrets:

...but I cannot rule out a bug. We have a new version coming out very soon. We’ll make sure to create a test for this scenario.

The workaround suggested by @ChaosGramer should work, but it might not be a feasible solution for you.

@rgbusato
Author

rgbusato commented Apr 9, 2020

Cool, yeah, that makes sense and should be easy to implement.
I'll definitely give it a try.
Thanks for the pointer.

@VpavlovLumedic

None of the options above have worked for me yet. @rgbusato were you able to fix this?

@ChaosGramer
Contributor

@VpavlovLumedic
Would you be so kind as to send a demo file of the deployment or pod you used to deploy the application that needs the secret?

@VpavlovLumedic

VpavlovLumedic commented Jul 9, 2020

Hi, @ChaosGramer. I can't share the details of my pod and deployment, but I will try to reproduce the issue later on a demo pod that I can share.

Right now I can share this info, which is really similar to @rgbusato's case:

  • I am using aad-pod-identity with my akv2k8s setup.
  • I used the Helm documentation to install the env-injector with aad-pod-identity.
  • I was able to test the tool successfully with a publicly available Docker image (from your examples). But now that I'm trying to deploy the real service, which uses a Docker image stored in ACR, it fails as initially described.
  • I have a label for binding the pod identity so it has access to the Key Vault, e.g. aadpodidbinding: myIdentity

The error log from env-injector is:

initializing config...
config initialized
2020/07/08 19:59:53 [WARN] no metrics recorder active
2020/07/08 19:59:53 [WARN] no tracer active
2020/07/08 19:59:53 [INFO] listening on :443
time="2020-07-08T20:03:29Z" level=info msg="found pod to mutate in namespace 'myns'"
time="2020-07-08T20:03:29Z" level=info msg="found container 'mycontainer' to mutate"
time="2020-07-08T20:03:29Z" level=info msg="checking for env vars containing '@AzureKeyVault' in container mycontainer"
time="2020-07-08T20:03:29Z" level=info msg="found env var: somevar@AzureKeyVault"
time="2020-07-08T20:03:29Z" level=info msg="found env var: somevar@AzureKeyVault"
time="2020-07-08T20:03:29Z" level=info msg="found env var: somevar@AzureKeyVault"
time="2020-07-08T20:03:29Z" level=info msg="did not find credentials to use with registry 'private.azurecr.io' - getting default credentials"
time="2020-07-08T20:03:29Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-07-08T20:03:29Z" level=info msg="pulling docker image private.azurecr.io/private:latest to get entrypoint and cmd, timeout is 120 seconds"
2020/07/08 20:03:29 [ERROR] admission webhook error: failed to get auto cmd, error: failed to pull docker image 'private.azurecr.io/private:latest', error: Error response from daemon: Get https://private.azurecr.io/v2/private/manifests/latest: unauthorized: authentication required

@ChaosGramer
Contributor

Do you have a pull secret configured on your pod?

@VpavlovLumedic

VpavlovLumedic commented Jul 9, 2020

Yes, I tried that workaround with pullSecrets. (I'm going to remove it soon, because it's not necessary.)

Also, I just added the CMD stuff and it got better, but I have a private Key Vault and I see a strange IP in the pod logs that is getting denied by the Key Vault firewall.

Original Error: autorest/azure: Service returned an error. Status=403 Code="Forbidden" Message="Client address is not authorized and caller is not a trusted service.\r\nClient address: 52.148.151.235

So now we can say that the original bug happens when no CMD is defined.

@ChaosGramer Do you know anything about this IP?

** Added **
OK, 52.148.151.235 looks like a Microsoft IP. I'll play around with my Azure Key Vault firewall settings and report the results later.

@VpavlovLumedic

OK, the steps below fixed it for me.
I'm deploying an Azure Functions V3 app.
No imagePullSecrets on the pod are required.

  1. Install charts with Helm 3
  • helm install azure-key-vault-env-injector spv-charts/azure-key-vault-env-injector --set installCrd=false --set keyVault.customAuth.enabled=true --set customAuth.autoInject.enabled=true --set keyVault.customAuth.podIdentitySelector=private-pod-identity --namespace akv2k8s
  • helm install azure-key-vault-controller spv-charts/azure-key-vault-controller --set keyVault.customAuth.enabled=true --set customAuth.autoInject.enabled=true --set keyVault.customAuth.podIdentitySelector=private-pod-identity --namespace akv2k8s
  2. deployment.yaml should look like:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: privatefunc
  name: privatefunc
  namespace: privatens # namespace must support injection https://akv2k8s.io/tutorials/prerequisites/
spec:
  selector:
    matchLabels:
      app: privatefunc
  template:
    metadata:
      labels:
        app: privatefunc
        aadpodidbinding: private-pod-identity # requires custom auth https://akv2k8s.io/security/authentication/ requires aad-pod-identity https://github.com/Azure/aad-pod-identity
    spec:
      containers:        
      - env:
        - name: var1
          value: "var1-link@azurekeyvault" # requires AzureKeyVaultSecret https://akv2k8s.io/tutorials/env-injection/1-secret/
        - name: var2
          value: "var2-link@azurekeyvault"  # requires AzureKeyVaultSecret https://akv2k8s.io/tutorials/env-injection/1-secret/
        image: privateregistry.azurecr.io/privateimage:latest
        name: privatefunc
        args: ["$(var1), $(var2)"]
        command: [ "/azure-functions-host/Microsoft.Azure.WebJobs.Script.WebHost" ]
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 350m
            memory: 350Mi
          requests:
            cpu: 100m
            memory: 100Mi
      terminationGracePeriodSeconds: 30

BTW, the injector log still contains errors:

time="2020-07-09T12:20:15Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-07-09T12:20:15Z" level=info msg="using '/azure-functions-host/Microsoft.Azure.WebJobs.Script.WebHost $(private1), $(private2), $(private3), $(private4), $(private5)' as arguments for env-injector"
time="2020-07-09T12:20:15Z" level=info msg="containers mutated and pod updated with init-container and volumes"

@ChaosGramer

  • Is it possible to get rid of the error in the injector?
  • Do I need to change the YAML in some way?

@ChaosGramer
Contributor

@VpavlovLumedic I think you have enabled service endpoints on your VNet.
This means your AKS will connect to the ACR with an Azure public IP. For us, the solution was to whitelist the subnet using the az CLI:

az acr network-rule add

@ChaosGramer
Contributor

Have you tested the variable substitution you are doing in your args? I'm not so sure whether this will work out.

@VpavlovLumedic

VpavlovLumedic commented Jul 9, 2020

Have you tested the variable substitution you are doing in your args? I'm not so sure whether this will work out.

Yes, it works very well. And the variables are invisible through kubectl exec mypod -- sh -c 'echo $var1'

@torresdal
Collaborator

@VpavlovLumedic @ChaosGramer any chance you could verify whether this is still an issue in the latest beta (env-injector chart version 1.1.0-beta.28)? Ref #115. We don't have aad-pod-identity in any of our clusters.

If it's still an issue (which I hope not), we will be able to fix it before going official with 1.1.0.

@torresdal torresdal added this to the Version 1.1.0 milestone Aug 28, 2020
@torresdal torresdal removed the question Further information is requested label Aug 28, 2020
@pvmraghunandan

pvmraghunandan commented Sep 10, 2020

We are getting the same error. Below are the logs from controller.

time="2020-09-10T15:08:38Z" level=info msg="found pod to mutate in namespace 'ingress-basic'"
time="2020-09-10T15:08:38Z" level=info msg="found container 'nodewebapp' to mutate"
time="2020-09-10T15:08:38Z" level=info msg="checking for env vars containing '@azurekeyvault' in container nodewebapp"
time="2020-09-10T15:08:38Z" level=info msg="found env var: sec1@azurekeyvault"
time="2020-09-10T15:08:38Z" level=info msg="found env var: sec2@azurekeyvault"
time="2020-09-10T15:08:38Z" level=info msg="did not find credentials to use with registry 'testacr.azurecr.io' - getting default credentials"
time="2020-09-10T15:08:38Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-09-10T15:08:38Z" level=info msg="pulling docker image testacr.azurecr.io/nodewebapp:v4 to get entrypoint and cmd, timeout is 120 seconds"
2020/09/10 15:08:39 [ERROR] admission webhook error: failed to get auto cmd, error: failed to pull docker image 'testacr.azurecr.io/nodewebapp:v4', error: Error response from daemon: Get https://testacr.azurecr.io/v2/nodewebapp/manifests/v4: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.
time="2020-09-10T15:13:50Z" level=info msg="found pod to mutate in namespace 'ingress-basic'"
time="2020-09-10T15:13:50Z" level=info msg="found container 'nodewebapp' to mutate"
time="2020-09-10T15:13:50Z" level=info msg="checking for env vars containing '@azurekeyvault' in container nodewebapp"
time="2020-09-10T15:13:50Z" level=info msg="found env var: sec1@azurekeyvault"
time="2020-09-10T15:13:50Z" level=info msg="did not find credentials to use with registry 'testacr.azurecr.io' - getting default credentials"
time="2020-09-10T15:13:50Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-09-10T15:13:50Z" level=info msg="pulling docker image testacr.azurecr.io/nodewebapp:v4 to get entrypoint and cmd, timeout is 120 seconds"
2020/09/10 15:13:50 [ERROR] admission webhook error: failed to get auto cmd, error: failed to pull docker image 'testacr.azurecr.io/nodewebapp:v4', error: Error response from daemon: Get https://testacr.azurecr.io/v2/nodewebapp/manifests/v4: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.

@VpavlovLumedic I tried almost the same template and still got the same error. The logs are stuck after the daemon error. How did you fix it?

@torresdal
Collaborator

Thanks for taking the time to test, @pvmraghunandan. I looked through the code once more, and there is indeed an issue: when looking for credentials to use for image registries, it does not take custom auth into account. I'll patch this ASAP and get the 1.1.0 version out the door. Thanks.

@VpavlovLumedic

VpavlovLumedic commented Sep 10, 2020

@VpavlovLumedic I think you have enabled service endpoints on your vnet.
This means your aks will connect with a azure public ip against the acr. For us the solution was to whitelist the subnet using az cli:

Az acr Network-rule

@ChaosGramer, I used a private endpoint. I think there was a mistake in my configuration (I believe I had no VNet link between the Azure Private Link zone and the AKS VNet), because it works flawlessly now via a private IP in the cluster's VNet, so the akv2k8s env-injector now resolves the Key Vault public host to a private IP properly.

@VpavlovLumedic

VpavlovLumedic commented Sep 10, 2020

Tested it again. Works fine.

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: akvs-secret-app
  namespace: my-namespace
  labels:
    app: akvs-secret-app
spec:
  selector:
    matchLabels:
      app: akvs-secret-app
  template:
    metadata:
      labels:
        app: akvs-secret-app
        aadpodidbinding: my-pod-identity
    spec:
      serviceAccountName: my-serviceaccount
      containers:
      - name: akv2k8s-env-test
        image: spvest/akv2k8s-env-test:2.0.1
        args: ["ClientId"]
        env:
        - name: ClientId
          value: my-private-secret@azurekeyvault

Pod logs (the deployment.yaml above is a modified version of the one from https://akv2k8s.io/tutorials/env-injection/1-secret/):

time="2020-09-10T18:33:31Z" level=info msg="found original container command to be /usr/local/bin/entrypoint.sh [entrypoint.sh ClientId]" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:31Z" level=error msg="error getting azurekeyvaultsecret resource 'my-private-secret', error: Get https://10.0.0.1:443/apis/spv.no/v1alpha1/namespaces/my-namespace/azurekeyvaultsecrets/my-private-secret: dial tcp 10.0.0.1:443: connect: connection refused" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:31Z" level=info msg="will retry getting azurekeyvaultsecret resource up to 3 times, waiting 3 seconds between retries" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:31Z" level=error msg="error getting azurekeyvaultsecret resource 'my-private-secret', error: Get https://10.0.0.1:443/apis/spv.no/v1alpha1/namespaces/my-namespace/azurekeyvaultsecrets/my-private-secret: dial tcp 10.0.0.1:443: connect: connection refused" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:34Z" level=info msg="succeded getting azurekeyvaultsecret resource 'my-private-secret'" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:35Z" level=info msg="secret ClientId injected into evn var ClientId for executable /usr/local/bin/entrypoint.sh" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
time="2020-09-10T18:33:35Z" level=info msg="starting process /usr/local/bin/entrypoint.sh [entrypoint.sh ClientId] with secrets in env vars" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
*Here i see my ClientId secret value*
waiting forever...

I should mention that I already have the container images pulled:

Container image "spvest/azure-keyvault-env:1.0.2" already present on machine
Container image "spvest/akv2k8s-env-test:2.0.1" already present on machine

OK, I added imagePullPolicy: Always to deployment.yaml - still fine.

From kubectl describe pod:

Pulling image "spvest/akv2k8s-env-test:2.0.1"
Successfully pulled image "spvest/akv2k8s-env-test:2.0.1"

OK, this looks interesting. I've added imagePullPolicy: Always to my own deployment YAML and redeployed.
Now the events from kubectl describe pod are:

  Warning  Failed     2m28s (x2 over 2m45s)  kubelet, aksmynodepool-35243983-vmss000000  Error: ErrImagePull
  Warning  Failed     2m28s (x2 over 2m45s)  kubelet, aksmynodepool-35243983-vmss000000  Failed to pull image "myappregistry.azurecr.io/myacrimage:0.0.3": rpc error: code = Unknown desc = Error response from daemon: Get https://myappregistry.azurecr.io/v2/myacrimage/manifests/0.0.3: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.
  Normal   BackOff    2m14s (x3 over 2m43s)  kubelet, aksmynodepool-35243983-vmss000000  Back-off pulling image "myappregistry.azurecr.io/myacrimage:0.0.3"
  Warning  Failed     2m14s (x3 over 2m43s)  kubelet, aksmynodepool-35243983-vmss000000  Error: ImagePullBackOff
  Normal   Pulling    2m2s (x3 over 2m45s)   kubelet, aksmynodepool-35243983-vmss000000  Pulling image "myappregistry.azurecr.io/myacrimage:0.0.3"

Env injector pod logs are:

time="2020-09-10T19:10:26Z" level=info msg="found pod to mutate in namespace 'my-namespace'"
time="2020-09-10T19:10:26Z" level=info msg="found container 'my-container' to mutate"
time="2020-09-10T19:10:26Z" level=info msg="checking for env vars containing '@azurekeyvault' in container my-container"
time="2020-09-10T19:10:26Z" level=info msg="found env var: myvar1@azurekeyvault"
time="2020-09-10T19:10:26Z" level=info msg="found env var: myvar2@azurekeyvault"
time="2020-09-10T19:10:26Z" level=info msg="found env var: myvar3@azurekeyvault"
time="2020-09-10T19:10:26Z" level=info msg="found env var: myvar4@azurekeyvault"
time="2020-09-10T19:10:26Z" level=info msg="found env var: myvar5@azurekeyvault"
time="2020-09-10T19:10:26Z" level=info msg="did not find credentials to use with registry 'myregistry.azurecr.io' - getting default credentials"
time="2020-09-10T19:10:26Z" level=info msg="failed to read azure.json to get default credentials, error: open /etc/kubernetes/azure.json: no such file or directory"
time="2020-09-10T19:10:26Z" level=info msg="using '/azure-functions-host/myapp $(arg1), $(arg2), $(arg3), $(arg4), $(arg5)' as arguments for env-injector"
time="2020-09-10T19:10:26Z" level=info msg="containers mutated and pod updated with init-container and volumes"

But I still get my container running, and the pod logs are good, without any errors. BTW, I also have Istio installed, but that should not matter here.

P.S. Sorry, but I had to change variable names and obfuscate other details due to privacy concerns.

@ChaosGramer
Contributor

@VpavlovLumedic are you sure your running pod gets the secrets injected now?

ISTIO:
time="2020-09-10T18:33:31Z" level=error msg="error getting azurekeyvaultsecret resource 'my-private-secret', error: Get https://10.0.0.1:443/apis/spv.no/v1alpha1/namespaces/my-namespace/azurekeyvaultsecrets/my-private-secret: dial tcp 10.0.0.1:443: connect: connection refused" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
I guess that Istio is blocking the connection between the pod and the API server; could you please validate the Istio config?

@VpavlovLumedic

VpavlovLumedic commented Sep 10, 2020

@VpavlovLumedic are you sure your running pod gets the secrets injected now?

ISTIO:
time="2020-09-10T18:33:31Z" level=error msg="error getting azurekeyvaultsecret resource 'my-private-secret', error: Get https://10.0.0.1:443/apis/spv.no/v1alpha1/namespaces/my-namespace/azurekeyvaultsecrets/my-private-secret: dial tcp 10.0.0.1:443: connect: connection refused" application=env-injector component=akv2k8s custom_auth=true namespace=my-namespace
I guess that Istio is blocking the connection between the pod and the API server; could you please validate the Istio config?

@ChaosGramer
Yes, my pod gets the secrets - I am 100% sure, because I have a lot of pods with the same logs.
The bad part is that it only gets them after a couple of retries, so I think this is more of a synchronization issue between the Istio proxy and the akv2k8s init container. I plan to research this a bit more, but since it works after a couple of retries, I am fine with that for now.

BTW, I had to exclude my Key Vault's private IP from the Istio mesh to get it working. I might reconsider that later. Do you suggest trying to create a ServiceEntry for 10.0.0.1?
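For reference, an Istio ServiceEntry for a Key Vault private endpoint might look roughly like the following (all hosts and addresses are hypothetical; whether this is preferable to excluding the IP from the mesh depends on the Istio setup):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: keyvault-private          # hypothetical
  namespace: my-namespace         # hypothetical
spec:
  hosts:
  - myvault.vault.azure.net       # hypothetical vault host
  addresses:
  - 10.1.2.3/32                   # hypothetical private endpoint IP
  ports:
  - number: 443
    name: https
    protocol: TLS
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address: 10.1.2.3
```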

@VpavlovLumedic

VpavlovLumedic commented Sep 25, 2020

Rolled back to 1.0.2. Everything is working now. I need to restore my playground cluster to test your updates later.

@torresdal torresdal reopened this Sep 25, 2020
@torresdal torresdal self-assigned this Sep 25, 2020
@torresdal torresdal pinned this issue Sep 25, 2020
@torresdal torresdal changed the title [BUG] env-injector failing to pull image from ACR when using aad-pod-identity [BUG] when using aad-pod-identity, env-injector fail to pull image from ACR Sep 25, 2020
@ArtemTrofimushkin

ArtemTrofimushkin commented Oct 31, 2020

Hi everyone.
First of all, thanks for this project - it's really cool. (Also, sorry for my English; it's not my native language. :))
We are testing akv2k8s + AAD Pod Identity in our clusters and have some issues with it.

Context: the pods for EnvInjector & Controller (infra-secrets namespace) are labeled with aadpodidbinding: kubernetes-agentpool.
The corresponding k8s resources (azureidentities, azureidentitybindings) point to a user-assigned MSI (other than the cluster-generated one).
The MSI has permissions (get secret/key/certificate) for all key vaults in the cluster resource group.
Several applications use EnvInjector to fetch secrets from AKV and inject the values via environment variables.
Version of EnvInjector: 1.1.10 (release 1.1.1)

Problem: pods fail to start (to be created) and secret injection fails

Investigation:

Inspecting kubectl describe replicaset <some replicaset with 0 replicas> gives the following output:

'Internal error occurred: failed calling webhook "pods.env-injector.admission.spv.no": an error on the server
    ("{\"response\":{\"uid\":\"accd6cf2-3afd-4dd9-9be5-ae2daf634266\",\"allowed\":false,\"status\":{\"metadata\":{},\"status\":\"Failure\",\"message\":\"failed
    to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/: http: non-successful
    response (status=401 body=\\\"{\\\\\\\"errors\\\\\\\":[{\\\\\\\"code\\\\\\\":\\\\\\\"UNAUTHORIZED\\\\\\\",\\\\\\\"message\\\\\\\":\\\\\\\"authentication
    required, visit https://aka.ms/acr/authorization for more information.\\\\\\\",\\\\\\\"detail\\\\\\\":null}]}\\\\n\\\")\\ncannot
    create client for doker registry\\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59\\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49\\nmain.getContainerCmd\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36\\nmain.mutateContainers\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107\\nmain.mutatePodSpec\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229\\nmain.vaultSecretsMutator\\n\\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138\\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\\n\\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\\n\\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\\n\\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\\n\\t/go/pkg/mod/github.com/slok/[email protected]/pkg/web")
    has prevented the request from succeeding'

The logs of the EnvInjector pods contain the following:

[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="found pod to mutate in namespace 'default'"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="found container 'garbage-collector' to mutate"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="checking for env vars containing '@azurekeyvault' in container garbage-collector"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="found env var: cookinginfo-blob-connection-string@azurekeyvault"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="found env var: cookinginfo-rabbitmq-connection-strings@azurekeyvault"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=debug msg="getting container command for container 'garbage-collector'"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=debug msg="no cmd override in kubernetes for container garbage-collector, checking docker image configuration for entrypoint and cmd for dodoreg.azurecr.io/cookinginfo:1.0.15-2"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=debug msg="using managed identity for acr credentials"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=debug msg="image(cookinginfo:1.0.15-2) is not from ACR, skip MSI authentication"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:00Z" level=info msg="using registry https://dodoreg.azurecr.io"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] 2020-10-31T17:47:00.993496Z	info	registry.ping url=https://dodoreg.azurecr.io/v2/
[infra-secrets-envinjector-7d67b7bdfd-d29pj] time="2020-10-31T17:47:01Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
[infra-secrets-envinjector-7d67b7bdfd-d29pj] 2020-10-31T17:47:01.024417Z	info	[ERROR] admission webhook error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":null}]}\n")

After inspecting the source code, I found this call chain:

docker/registry/container/Collect
---> docker/registry/container/getAcrCredentials
------> azure/credentialprovider/acr/GetAcrCredentials
---------> azure/credentialprovider/acr/parseACRLoginServerFromImage

This line indicates that the image is not from ACR (even though the image in the application actually is from ACR):

azure/credentialprovider/acr/GetAcrCredentials

This happens because the ACR prefix is trimmed from the image name in this line:

docker/registry/container/Collect

As a temporary workaround, I decided to clone the repository and move the indicated line of code to after the ACR credentials check:

docker/registry/container/Collect
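The bug pattern can be illustrated with a minimal Go sketch (the function names here are hypothetical illustrations, not the actual akv2k8s code): the ACR check must run while the image reference still carries its registry prefix, and the prefix should only be trimmed afterwards.

```go
package main

import (
	"fmt"
	"strings"
)

// isACRImage reports whether an image reference points at an Azure
// Container Registry (roughly what parseACRLoginServerFromImage checks).
func isACRImage(image string) bool {
	host := strings.SplitN(image, "/", 2)[0]
	return strings.HasSuffix(host, ".azurecr.io")
}

// collect resolves registry credentials for an image. The bug pattern:
// the registry prefix was trimmed *before* the ACR check, so the check
// never saw the ".azurecr.io" host. The fix is to check first, trim after.
func collect(image string) (usedACRCreds bool, repo string) {
	usedACRCreds = isACRImage(image) // check while the prefix is intact
	repo = image
	if i := strings.Index(image, "/"); i >= 0 {
		repo = image[i+1:] // trim the registry prefix afterwards
	}
	return usedACRCreds, repo
}

func main() {
	ok, repo := collect("dodoreg.azurecr.io/geoservice:master-95c4b2e")
	fmt.Println(ok, repo)
}
```

With the original ordering (trim first), the ACR check would only ever see the trimmed repository name and would always report a non-ACR image.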

Logs after creating new pods (with the updated image version):

[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found pod to mutate in namespace 'default'"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found container 'backend' to mutate"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="checking for env vars containing '@azurekeyvault' in container backend"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found env var: geoservice-redis-connectionstring@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found env var: geoservice-redis-password@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found env var: geoservice-yandex-map-api-key@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="found env var: geoservice-google-map-api-key@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=debug msg="getting container command for container 'backend'"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=debug msg="no cmd override in kubernetes for container backend, checking docker image configuration for entrypoint and cmd for dodoreg.azurecr.io/geoservice:master-95c4b2e"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=debug msg="using managed identity for acr credentials"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=debug msg="azure: using managed identity extension to retrieve access token"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=debug msg="azure: using User Assigned MSI ID to retrieve access token"
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=info msg="using registry https://dodoreg.azurecr.io"
[infra-secrets-envinjector-5c776c66bc-62n6b] 2020-10-31T18:13:45.411925Z	info	registry.ping url=https://dodoreg.azurecr.io/v2/
[infra-secrets-envinjector-5c776c66bc-62n6b] time="2020-10-31T18:13:45Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/pkg/docker/registry/registry.go:59\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
[infra-secrets-envinjector-5c776c66bc-62n6b] 2020-10-31T18:13:45.443185Z	info	[ERROR] admission webhook error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":null}]}\n")

After inspecting the source code, I found this call chain:

docker/registry/container/getAcrCredentials
---> azure/credentialprovider/acr/GetAcrCredentials
------> azure/credentialprovider/provider/getServicePrincipalTokenFromCloudConfig
---------> azure/credentialprovider/provider/getServicePrincipalTokenFromMSI
---------> azure/credentialprovider/provider/getServicePrincipalTokenFromMSI
---> azure/credentialprovider/acr/GetAcrCredentials

This line is not captured in the logs.

It seems the method fails: it can't refresh the access token from the management endpoint and returns an error.

However, the error returned from the method is swallowed and never logged.
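The swallowed-error pattern can be illustrated with a small hypothetical Go sketch (not the actual akv2k8s code):

```go
package main

import (
	"errors"
	"fmt"
)

// refreshToken stands in for the MSI token refresh that fails with
// "adal: Refresh request failed" in the NMI logs.
func refreshToken() (string, error) {
	return "", errors.New("adal: Refresh request failed")
}

// getCredentialsSwallowed shows the bug pattern: the refresh error is
// silently discarded, so the caller later sees an unrelated 401 from the
// registry instead of the root cause.
func getCredentialsSwallowed() string {
	token, _ := refreshToken() // error swallowed, never logged
	return token
}

// getCredentials shows the fix pattern: wrap and propagate the error so
// it surfaces in the webhook logs.
func getCredentials() (string, error) {
	token, err := refreshToken()
	if err != nil {
		return "", fmt.Errorf("refreshing MSI token: %w", err)
	}
	return token, nil
}

func main() {
	if _, err := getCredentials(); err != nil {
		fmt.Println(err)
	}
}
```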

When I decided to look at NMI logs, I found following:

E1030 10:28:36.975337       1 server.go:361] failed to get service principal token for pod:infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v with error code 403, error: failed to get service principal token, error: failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
I1030 10:28:36.975379       1 server.go:192] status (403) took 69062797 ns for req.method=GET reg.path=/metadata/identity/oauth2/token req.remote=10.0.48.26
I1030 10:40:37.871972       1 server.go:358] exception pod infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v token handling
I1030 10:40:37.871994       1 server.go:301] fetching token for user assigned MSI for resource: https://management.core.windows.net/
E1030 10:40:37.948465       1 server.go:361] failed to get service principal token for pod:infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v with error code 403, error: failed to get service principal token, error: failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
I1030 10:40:37.948502       1 server.go:192] status (403) took 76596720 ns for req.method=GET reg.path=/metadata/identity/oauth2/token req.remote=10.0.48.26
I1030 10:40:40.004810       1 server.go:358] exception pod infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v token handling
I1030 10:40:40.004828       1 server.go:301] fetching token for user assigned MSI for resource: https://management.core.windows.net/
E1030 10:40:40.077414       1 server.go:361] failed to get service principal token for pod:infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v with error code 403, error: failed to get service principal token, error: failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
I1030 10:40:40.077455       1 server.go:192] status (403) took 72681143 ns for req.method=GET reg.path=/metadata/identity/oauth2/token req.remote=10.0.48.26
I1030 10:46:38.411145       1 server.go:358] exception pod infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v token handling
I1030 10:46:38.411173       1 server.go:301] fetching token for user assigned MSI for resource: https://management.core.windows.net/
E1030 10:46:38.479281       1 server.go:361] failed to get service principal token for pod:infra-secrets/infra-secrets-envinjector-578549bb89-pcz6v with error code 403, error: failed to get service principal token, error: failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
I1030 10:46:38.479310       1 server.go:192] status (403) took 68229531 ns for req.method=GET reg.path=/metadata/identity/oauth2/token req.remote=10.0.48.26

This failure happens because:

  1. The EnvInjector pods use the identity from CloudConfigPath
  2. In our setup, the EnvInjector pods are labeled with aadpodidbinding: kubernetes-agentpool, which points to an additional user-assigned MSI (instead of the generated one that exists in identityProfile.kubeletidentity.resourceId of the AKS cluster and is named <cluster-name>-agentpool)

So I decided to add an additional AzureIdentity and AzureIdentityBinding (with the same selector as described here) for the generated AKS identity.
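A minimal sketch of what such a pair of manifests could look like (all names, IDs, and the selector value below are placeholders; the selector must match the aadpodidbinding label on the env-injector pods):

```yaml
# Hypothetical sketch: bind the AKS-generated kubelet identity
# (<cluster-name>-agentpool) to pods labeled aadpodidbinding: kubernetes-agentpool.
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
  name: aks-agentpool-identity
spec:
  type: 0   # 0 = user-assigned managed identity
  resourceID: /subscriptions/<sub-id>/resourcegroups/<node-rg>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<cluster-name>-agentpool
  clientID: <agentpool-msi-client-id>
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentityBinding
metadata:
  name: aks-agentpool-identity-binding
spec:
  azureIdentity: aks-agentpool-identity
  selector: kubernetes-agentpool   # same selector used on the env-injector pods
```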

Logs after creating new pods (with the updated image version):

[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found pod to mutate in namespace 'infra-monitoring'"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found container 'backend' to mutate"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="checking for env vars containing '@azurekeyvault' in container backend"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-subscription-id@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-tenant-id@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-client-id@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-client-secret@azurekeyvault"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="getting container command for container 'backend'"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="no cmd override in kubernetes for container backend, checking docker image configuration for entrypoint and cmd for dodoreg.azurecr.io/infra-exporter-kusto-and-azure:kusto-query-probe-2991d6c"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="using managed identity for acr credentials"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="azure: using managed identity extension to retrieve access token"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="azure: using User Assigned MSI ID to retrieve access token"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="discovering auth redirects for: dodoreg.azurecr.io"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="exchanging an acr refresh_token"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="adding ACR docker config entry for: dodoreg.azurecr.io"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=debug msg="found acr gredentials for dodoreg.azurecr.io"
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=info msg="using registry https://dodoreg.azurecr.io"
[infra-secrets-envinjector-5c776c66bc-t6b4h] 2020-10-31T20:16:58.724867Z	info	registry.ping url=https://dodoreg.azurecr.io/v2/
[infra-secrets-envinjector-5c776c66bc-t6b4h] 2020-10-31T20:16:58.730127Z	info	registry.manifest.get url=https://dodoreg.azurecr.io/v2/infra-exporter-kusto-and-azure/manifests/kusto-query-probe-2991d6c repository=infra-exporter-kusto-and-azure reference=kusto-query-probe-2991d6c
[infra-secrets-envinjector-5c776c66bc-t6b4h] time="2020-10-31T20:16:58Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/infra-exporter-kusto-and-azure/manifests/kusto-query-probe-2991d6c: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":[{\\\"Type\\\":\\\"repository\\\",\\\"Name\\\":\\\"infra-exporter-kusto-and-azure\\\",\\\"Action\\\":\\\"pull\\\"}]}]}\\n\")\ncannot download manifest for docker image\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/pkg/docker/registry/registry.go:64\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
[infra-secrets-envinjector-5c776c66bc-t6b4h] 2020-10-31T20:16:58.805733Z	info	[ERROR] admission webhook error: failed to get auto cmd, error: Get https://dodoreg.azurecr.io/v2/infra-exporter-kusto-and-azure/manifests/kusto-query-probe-2991d6c: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":[{\"Type\":\"repository\",\"Name\":\"infra-exporter-kusto-and-azure\",\"Action\":\"pull\"}]}]}\n")
[infra-secrets-envinjector-5c776c66bc-t6b4h] cannot download manifest for docker image

It seems that the token for ACR is successfully generated:

azure/credentialprovider/acr/getACRDockerEntryFromARMToken
azure/credentialprovider/acr/getACRDockerEntryFromARMToken
azure/credentialprovider/acr/getACRDockerEntryFromARMToken
docker/registry/container/getAcrCredentials

But the code fails in this line.
Thus, the problem may be in the github.com/heroku/docker-registry-client library.

After reading some of the open issues and PRs, I found an interesting one. The problem mentioned in that issue indicates that ACR uses OAuth, but the library doesn't support it. As a temporary workaround, I decided to replace the original repo with that fork (via a replace directive in the go.mod file).
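The replace-directive workaround would look roughly like this in go.mod (the fork path and version are placeholders, since the exact fork is only linked above):

```
// go.mod (excerpt) - hypothetical sketch: point the upstream module at a
// fork that supports ACR's OAuth token flow.
replace github.com/heroku/docker-registry-client => github.com/<fork-owner>/docker-registry-client <fork-version>
```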

Logs after creating new pods (with the updated image version):

[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found pod to mutate in namespace 'infra-monitoring'"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found container 'backend' to mutate"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="checking for env vars containing '@azurekeyvault' in container backend"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-subscription-id@azurekeyvault"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-tenant-id@azurekeyvault"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-client-id@azurekeyvault"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="found env var: infra-exporter-kusto-and-azure-infra-monitoring-sp-client-secret@azurekeyvault"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="getting container command for container 'backend'"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="no cmd override in kubernetes for container backend, checking docker image configuration for entrypoint and cmd for dodoreg.azurecr.io/infra-exporter-kusto-and-azure:kusto-query-probe-2991d6c"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="using managed identity for acr credentials"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="azure: using managed identity extension to retrieve access token"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="azure: using User Assigned MSI ID to retrieve access token"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="discovering auth redirects for: dodoreg.azurecr.io"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="exchanging an acr refresh_token"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="adding ACR docker config entry for: dodoreg.azurecr.io"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=debug msg="found acr gredentials for dodoreg.azurecr.io"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:24Z" level=info msg="using registry https://dodoreg.azurecr.io"
[infra-secrets-envinjector-578549bb89-24cph] 2020-10-31T20:33:24.391065Z	info	registry.ping url=https://dodoreg.azurecr.io/v2/
[infra-secrets-envinjector-578549bb89-24cph] 2020-10-31T20:33:24.395673Z	info	registry.manifest.get url=https://dodoreg.azurecr.io/v2/infra-exporter-kusto-and-azure/manifests/kusto-query-probe-2991d6c repository=infra-exporter-kusto-and-azure reference=kusto-query-probe-2991d6c
[infra-secrets-envinjector-578549bb89-24cph] 2020-10-31T20:33:24.839260Z	info	registry.blob.download url=https://dodoreg.azurecr.io/v2/infra-exporter-kusto-and-azure/blobs/sha256:82f2485c18d8ce7bd42b5d3303eee8b38cde845d6dcc92d4527fbd331a5d13e1 repository=infra-exporter-kusto-and-azure digest=sha256:82f2485c18d8ce7bd42b5d3303eee8b38cde845d6dcc92d4527fbd331a5d13e1
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=info msg="using '/azure-metrics-exporter' as arguments for env-injector"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="signed arguments to prevent override"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="public signing key for argument verification: \n-----BEGIN RSA PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA62wJm6Ogsx1wVDZQzY3Z\nV7ORiNw13URvb8E7A6rHrtZpUWs3E+90NFTu2W2vbqN4POF07f6tumKeyaqO+HES\n1P9uGmNiOU9ian7SUaEG3QA0uZFtA6W4hHAbiUvLirFk9fVpkWpPworAtEkAC/ou\np7p+uV4NTCtdUjbYhEawYpRxcTKClgyPT980iB+43TJX+FveJVimacUA+bWpaiE2\n6Fgz6TnP+K0siTAaiQnkuZwrYG1SJTlREkTAqh0lIOE4nlQtLxU0uPTYG7G8WRGL\n18XkJAeJfGZkRRERd97KD3gpEI60heVl2uNvUXZ7q6yr3QaoyAzhXlBwX8xxjI+W\ntwIDAQAB\n-----END RSA PUBLIC KEY-----\n"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="full exec path: /azure-keyvault/azure-keyvault-env"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="container args: [/azure-metrics-exporter]"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="mounting volume 'azure-keyvault-env' to '/azure-keyvault/'"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=debug msg="setting ENV_INJECTOR_USE_AUTH_SERVICE=true for container backend"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:25Z" level=info msg="containers mutated and pod updated with init-container and volumes"
[infra-secrets-envinjector-578549bb89-24cph] time="2020-10-31T20:33:30Z" level=info msg="served oauth token to 'infra-monitoring/infra-exporter-kusto-and-azure-fc8b6f96b-j8qlr' at address '10.0.48.47:47090'"

This indicates that the pods were successfully mutated.

After that, some errors appeared in the application pod logs (from the env-injector init container):

time="2020-10-31T20:33:30Z" level=info msg="using sentralized akv2k8s auth service for authentiction with azure key vault" application=env-injector component=akv2k8s namespace=infra-monitoring
time="2020-10-31T20:33:30Z" level=info msg="found original container command to be /azure-metrics-exporter [/azure-metrics-exporter]" application=env-injector component=akv2k8s namespace=infra-monitoring
time="2020-10-31T20:33:30Z" level=info msg="requesting azure key vault oauth token from https://infra-secrets-envinjector.infra-secrets.svc:443/auth/infra-monitoring/infra-exporter-kusto-and-azure-fc8b6f96b-j8qlr" application=env-injector component=akv2k8s namespace=infra-monitoring
time="2020-10-31T20:33:30Z" level=fatal msg="failed to read secret 'infra-monitoring-sp-client-secret', error keyvault.BaseClient#GetSecret: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code=\"Forbidden\" Message=\"The user, group or application 'appid=14080efa-252e-4c8b-b690-e0828bdf4ae8;oid=b7f13d1f-db3c-47c4-9f5f-48c54614ad52;iss=https://sts.windows.net/2b8d2a3c-bc5d-4c9e-9de5-9219a69d6524/' does not have secrets get permission on key vault 'ld-keyvault-dodo;location=westeurope'. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125287\" InnerError={\"code\":\"AccessDenied\"}" application=env-injector component=akv2k8s namespace=infra-monitoring

This happens because of aad-pod-identity: a cached access token for the cluster identity (named <cluster-name>-agentpool) is used to access the Key Vaults.
After some time, the errors disappeared (because the access token for the other MSI expired and was renewed).

Conclusions:

Long story short, we have the following options:

  1. Use the cluster-generated MSI (named <cluster-name>-agentpool) to access all Key Vaults (because this MSI is used by the EnvInjector pods to inspect Docker images in ACR), together with the fixes described above.
  2. Use the controller to sync secrets, without env-injection

What do you think about the discovered issues (the trimmed ACR prefix, the swallowed error, and the stale library)?

Code can be found in this repo: https://github.com/ArtemTrofimushkin/azure-key-vault-to-kubernetes/tree/fix-aad-pod-identity

Also, I can attach our values.yaml.

@torresdal
Collaborator

First of all, thanks for taking the time to dig so deep into this issue @ArtemTrofimushkin and find the solutions you did!

I've read through your discoveries, but I need to sit down to create some unit tests and verify the issues you point out and get a cluster going with aad-pod-identity. Currently, not all of this makes sense to me.

One of the issues here, I suspect, is that both aad-pod-identity and akv2k8s have the ability to use Managed Identities. What aad-pod-identity does, however, is add iptables rules that rewrite all traffic to the Azure Instance Metadata Service (IMDS) through its NMI endpoint. This means that when aad-pod-identity is installed in a Kubernetes cluster, all requests to IMDS coming from akv2k8s get sent to the aad-pod-identity NMI endpoint. The reason I suspect this is that we don't use aad-pod-identity ourselves, and we have none of these issues in any of our Kubernetes clusters.
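For reference, aad-pod-identity supports exempting pods from this NMI interception with an AzurePodIdentityException, which is what the chart's addAzurePodIdentityException option mentioned later in this thread creates. A sketch (the namespace and pod label below are assumptions and must match the actual env-injector pods):

```yaml
# Hypothetical sketch: pods matching podLabels bypass the NMI iptables
# redirect and talk to IMDS directly.
apiVersion: aadpodidentity.k8s.io/v1
kind: AzurePodIdentityException
metadata:
  name: akv2k8s-exception
  namespace: akv2k8s
spec:
  podLabels:
    app.kubernetes.io/name: azure-key-vault-env-injector   # label assumed
```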

If you could share your values.yaml file, that would be very helpful. I'll get back to you as soon as I've dug a bit deeper. Thanks!

@ArtemTrofimushkin

Sure, this is our values.yaml:

controller:
  enabled: true
  env:
    AZURE_ENVIRONMENT: AzurePublicCloud
  image:
    tag: 1.1.0
  keyVault:
    customAuth:
      enabled: true
  logLevel: Trace
crds:
  create: true
  enabled: true
  keep: true
env_injector:
  caBundleController:
    env:
      AZURE_ENVIRONMENT: AzurePublicCloud
    image:
      tag: 1.1.0
    logLevel: Trace
  dockerImageInspection:
    timeout: '120'
  enabled: true
  envImage:
    tag: 1.1.1
  image:
    tag: 1.1.10
  keyVault:
    customAuth:
      enabled: true
      useAuthService: true
  webhook:
    certificate:
      useCertManager: true
    env:
      AZURE_ENVIRONMENT: AzurePublicCloud
    failurePolicy: Fail
    logLevel: Trace
rbac:
  create: true
runningInsideAzureAks: true

We use the akv2k8s chart (version 1.1.24) from this commit

@LumirH

LumirH commented Nov 2, 2020

Hello,

I am observing very similar issue as @ArtemTrofimushkin.
Maybe I can help replicate the issue in a simpler way.

I have a new AKS cluster using MSI. This cluster has aad-pod-identity installed.
I have installed the latest Helm chart (chart 1.1.25). The only non-default setting is "addAzurePodIdentityException: true".

Now -- I get the same authentication error as @ArtemTrofimushkin when trying to connect to ACR.
But if I skip ACR image pull, then everything works fine (i.e. I can connect to Keyvault and inject secrets).

Based on my observation, the root problem is that the MSI cluster identity is, for some reason, not used for the ACR login.
I've tested many combinations: uninstalled aad-pod-identity, tried aad bindings, etc.
I also have another cluster using a Service Principal where everything works OK.

Lumir

@torresdal
Collaborator

torresdal commented Nov 2, 2020

Thanks @LumirH - I agree it seems to be related to how we handle ACR.

What happens if you set env_injector.dockerImageInspection.useAksCredentialsWithACS to false? (cc @ArtemTrofimushkin)

@LumirH

LumirH commented Nov 2, 2020

On my side still the same error:

  - name: DOCKER_IMAGE_INSPECTION_USE_ACS_CREDENTIALS
    value: "false"
time="2020-11-02T22:03:26Z" level=info msg="using registry https://MYREPO.azurecr.io"
time="2020-11-02T22:03:26Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://MYREPO.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry

@leonids2005

leonids2005 commented Nov 3, 2020

Sorry to come back again, but after recreating my cluster I have a problem.
The injector logs look OK to me, but pods are not created; here is the log. I probably forgot something :(. Any hint as to what could go wrong?

Obviously, without the injector, everything works OK.

time="2020-11-03T11:45:40Z" level=debug msg="azure key vault env injector initializing" application=env-injector component=akv2k8s
time="2020-11-03T11:45:40Z" level=info msg="using sentralized akv2k8s auth service for authentiction with azure key vault" application=env-injector component=akv2k8s namespace=default
time="2020-11-03T11:45:40Z" level=info msg="found original container command to be /usr/local/bin/python [python hello.py]" application=env-injector component=akv2k8s namespace=default
time="2020-11-03T11:45:40Z" level=info msg="requesting azure key vault oauth token from https://azure-key-vault-env-injector.akv2k8s.svc:443/auth/default/" application=env-injector component=akv2k8s namespace=default
time="2020-11-03T11:45:40Z" level=fatal msg="failed to get credentials, error: failed to get credentials, 404 Not Found"
time="2020-11-03T11:24:05Z" level=info msg="found pod to mutate in namespace 'default'"
time="2020-11-03T11:24:05Z" level=info msg="found container 'hellosample' to mutate"
time="2020-11-03T11:24:05Z" level=info msg="checking for env vars containing '@azurekeyvault' in container hellosample"
time="2020-11-03T11:24:05Z" level=info msg="found env var: secret-inject@azurekeyvault"
time="2020-11-03T11:24:05Z" level=info msg="we are running inside azure aks, trying to get acr credentials"
time="2020-11-03T11:24:05Z" level=debug msg="azure: using managed identity extension to retrieve access token"
time="2020-11-03T11:24:05Z" level=debug msg="azure: using User Assigned MSI ID to retrieve access token"
time="2020-11-03T11:24:05Z" level=debug msg="using managed identity for acr credentials"
time="2020-11-03T11:24:05Z" level=debug msg="discovering auth redirects for: *****.azurecr.io"
time="2020-11-03T11:24:05Z" level=debug msg="exchanging an acr refresh_token"
time="2020-11-03T11:24:05Z" level=debug msg="adding ACR docker config entry for: *****.azurecr.io"
time="2020-11-03T11:24:05Z" level=debug msg="found acr gredentials for *****.azurecr.io"
time="2020-11-03T11:24:05Z" level=debug msg="did not find credentials to use with registry during docker image inspection '*****.azurecr.io' - skipping credentials"
time="2020-11-03T11:24:05Z" level=debug msg="no cmd override in kubernetes for container hellosample, checking docker image configuration for entrypoint and cmd for *****.azurecr.io/python-hello"
time="2020-11-03T11:24:05Z" level=debug msg="docker image inspection timeout: 20 seconds"
time="2020-11-03T11:24:05Z" level=debug msg="Trying to access \"*****.azurecr.io/python-hello:latest\""
time="2020-11-03T11:24:05Z" level=debug msg="Returning credentials from DockerAuthConfig"
time="2020-11-03T11:24:05Z" level=debug msg="Using registries.d directory /etc/containers/registries.d for sigstore configuration"
time="2020-11-03T11:24:05Z" level=debug msg=" No signature storage configuration found for *****.azurecr.io/python-hello:latest"
time="2020-11-03T11:24:05Z" level=debug msg="Looking for TLS certificates and private keys in /etc/docker/certs.d/*****.azurecr.io"
time="2020-11-03T11:24:05Z" level=debug msg="GET https://*****.azurecr.io/v2/"
time="2020-11-03T11:24:05Z" level=debug msg="Ping https://*****.azurecr.io/v2/ status 401"
time="2020-11-03T11:24:05Z" level=debug msg="GET https://*****.azurecr.io/oauth2/token?account=00000000-0000-0000-0000-000000000000&scope=repository%3Apython-hello%3Apull&service=*****.azurecr.io"
time="2020-11-03T11:24:05Z" level=debug msg="Increasing token expiration to: 60 seconds"
time="2020-11-03T11:24:05Z" level=debug msg="GET https://*****.azurecr.io/v2/python-hello/manifests/latest"
time="2020-11-03T11:24:05Z" level=debug msg="Downloading /v2/python-hello/blobs/sha256:126dad06087f8424dcc9e54ce1a3dab47a6184abb52b811d17955b12cb052c89"
time="2020-11-03T11:24:05Z" level=debug msg="GET https://*****.azurecr.io/v2/python-hello/blobs/sha256:126dad06087f8424dcc9e54ce1a3dab47a6184abb52b811d17955b12cb052c89"
time="2020-11-03T11:24:05Z" level=info msg="using 'python hello.py' as arguments for env-injector"
time="2020-11-03T11:24:06Z" level=debug msg="signed arguments to prevent override"
time="2020-11-03T11:24:06Z" level=debug msg="public signing key for argument verification: \n-----BEGIN RSA PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAuhXRKCDvLd5AsodFH4XZ\nNFVGaqCxlC5BMy1+buWOP6dFPR4bI4NXUZhfDxkUTlINpTlsEeWZQpCe+I7qNnqg\n3KXuWdtUh6esIL0nGSJmKtwaFJ22Ali78i4Q2dZQr/O0OV/nErGBCbjD6ixuaGjA\nLdFvg8OMhHz5agpa93HPLPDnRv9b0RRWyUVp4tHXEwm977p10kwk2HyPbZqX9wAj\nlH5Zgu6HEOxaDfOpPC9vXVWeH4sI2FQzyG4IYn2ib5ytSajz3HpZO3XvMwgbudYM\n/DRFcSDPG13MFluHz40otNOcCMQTI5vruLRraARtRxm5yWo3jugUs/eUFYHfI4nO\nmwIDAQAB\n-----END RSA PUBLIC KEY-----\n"
time="2020-11-03T11:24:06Z" level=debug msg="full exec path: /azure-keyvault/azure-keyvault-env"
time="2020-11-03T11:24:06Z" level=debug msg="container args: [python hello.py]"
time="2020-11-03T11:24:06Z" level=debug msg="mounting volume 'azure-keyvault-env' to '/azure-keyvault/'"
time="2020-11-03T11:24:06Z" level=debug msg="configure init-container to use auth service"
time="2020-11-03T11:24:06Z" level=info msg="containers mutated and pod updated with init-container and volumes"

@leonids2005

leonids2005 commented Nov 6, 2020

@torresdal I would appreciate it if you could look at the question I sent 3 days ago. Thank you.
Here are also the events from the pod:

  Normal   Scheduled  <unknown>             default-scheduler                         Successfully assigned default/hellosample-66ddc4ddbc-g88tk to aks-default-38821645-vmss000003
  Normal   Pulled     2m34s                 kubelet, aks-default-38821645-vmss000003  Container image "spvest/azure-keyvault-env:1.1.1" already present on machine
  Normal   Created    2m34s                 kubelet, aks-default-38821645-vmss000003  Created container copy-azurekeyvault-env                                  
  Normal   Started    2m34s                 kubelet, aks-default-38821645-vmss000003  Started container copy-azurekeyvault-env                                  
  Normal   Pulling    111s (x4 over 2m33s)  kubelet, aks-default-38821645-vmss000003  Pulling image "***.azurecr.io/python-hello"               
  Normal   Pulled     111s (x4 over 2m33s)  kubelet, aks-default-38821645-vmss000003  Successfully pulled image "***.azurecr.io/python-hello"   
  Normal   Created    111s (x4 over 2m33s)  kubelet, aks-default-38821645-vmss000003  Created container hellosample
  Normal   Started    110s (x4 over 2m32s)  kubelet, aks-default-38821645-vmss000003  Started container hellosample
  Warning  BackOff    96s (x6 over 2m30s)   kubelet, aks-default-38821645-vmss000003  Back-off restarting failed container 
      

@torresdal
Collaborator

@leonids2005 I'm currently working on a fix for ACR auth. Your current problem, however, does seem to be related to something else, and not something I've had reported from others (as far as I can remember). The error says 404 (Not Found) on the auth endpoint of the env-injector at https://azure-key-vault-env-injector.akv2k8s.svc:443/auth/default/. Did you install using the latest Helm chart? Did you use the old charts (now deprecated) or the new akv2k8s-chart?

@leonids2005

@torresdal I installed it using the new chart and still have problems. At least with the old version the pod was successfully mutated; with the new version there is again a problem accessing ACR:

time="2020-11-06T23:50:13Z" level=info msg="version " buildDate="2020-10-10T09:59:27Z" commit=fa25b51 component=webhook
time="2020-11-06T23:50:13Z" level=info msg="Active settings:"
time="2020-11-06T23:50:13Z" level=info msg="  Running inside Azure AKS  : true"
time="2020-11-06T23:50:13Z" level=info msg="  Webhook port              : 443"
time="2020-11-06T23:50:13Z" level=info msg="  Serve metrics             : false"
time="2020-11-06T23:50:13Z" level=info msg="  Use custom auth           : false"
time="2020-11-06T23:50:13Z" level=info msg="  Use auth service          : true"
time="2020-11-06T23:50:13Z" level=info msg="  Auth service name         : akv2k8s-envinjector"
time="2020-11-06T23:50:13Z" level=info msg="  Auth service port         : 443"
time="2020-11-06T23:50:13Z" level=info msg="  Use AKS creds with ACS    : true"
time="2020-11-06T23:50:13Z" level=info msg="  Docker inspection timeout : 20"
time="2020-11-06T23:50:13Z" level=info msg="  CA ConfigMap name         : akv2k8s-ca"
time="2020-11-06T23:50:13Z" level=info msg="  Cloud config path         : /etc/kubernetes/azure.json"
time="2020-11-06T23:50:13Z" level=debug msg="azure: using managed identity extension to retrieve access token"
time="2020-11-06T23:50:13Z" level=debug msg="azure: using User Assigned MSI ID to retrieve access token"
2020-11-06T23:50:13.142469Z     info    [WARN] no metrics recorder active
2020-11-06T23:50:13.142484Z     info    [WARN] no tracer active
time="2020-11-06T23:50:13Z" level=info msg="Serving healthz at :80/healthz"
time="2020-11-06T23:50:13Z" level=info msg="Serving encrypted webhook at :443/pods"
time="2020-11-06T23:50:13Z" level=info msg="Serving encrypted healthz at :443/healthz"
time="2020-11-06T23:50:13Z" level=info msg="Serving encrypted auth at :443/auth"
time="2020-11-06T23:50:26Z" level=info msg="found pod to mutate in namespace 'default'"
time="2020-11-06T23:50:26Z" level=info msg="found container 'hellosample' to mutate"
time="2020-11-06T23:50:26Z" level=info msg="checking for env vars containing '@azurekeyvault' in container hellosample"
time="2020-11-06T23:50:26Z" level=info msg="found env var: secret-inject@azurekeyvault"
time="2020-11-06T23:50:26Z" level=debug msg="getting container command for container 'hellosample'"
time="2020-11-06T23:50:26Z" level=debug msg="no cmd override in kubernetes for container hellosample, checking docker image configuration for entrypoint and cmd for ****.azurecr.io/python-hello"
time="2020-11-06T23:50:26Z" level=debug msg="using managed identity for acr credentials"
time="2020-11-06T23:50:26Z" level=debug msg="image(python-hello) is not from ACR, skip MSI authentication"
time="2020-11-06T23:50:26Z" level=info msg="using registry https://****.azurecr.io"
2020-11-06T23:50:26.923152Z     info    registry.ping url=https://****.azurecr.io/v2/
time="2020-11-06T23:50:27Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email 
protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
2020-11-06T23:50:27.861997Z     info    [ERROR] admission webhook error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":null}]}\n")
cannot create client for doker registry
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49
main.getContainerCmd
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36
main.mutateContainers
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107
main.mutatePodSpec
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229
main.vaultSecretsMutator
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138
github.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123
github.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42
github.com/slok/kubewebhook/pkg/http.HandlerFor.func1
        /go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:2007
github.com/gorilla/mux.(*Router).ServeHTTP
        /go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210
net/http.serverHandler.ServeHTTP
        /usr/local/go/src/net/http/server.go:2802
net/http.(*conn).serve
        /usr/local/go/src/net/http/server.go:1890
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1357
time="2020-11-06T23:50:27Z" level=info msg="found pod to mutate in namespace 'default'"
time="2020-11-06T23:50:27Z" level=info msg="found container 'hellosample' to mutate"
time="2020-11-06T23:50:27Z" level=info msg="checking for env vars containing '@azurekeyvault' in container hellosample"
time="2020-11-06T23:50:27Z" level=info msg="found env var: secret-inject@azurekeyvault"
time="2020-11-06T23:50:27Z" level=debug msg="getting container command for container 'hellosample'"
time="2020-11-06T23:50:27Z" level=debug msg="no cmd override in kubernetes for container hellosample, checking docker image configuration for entrypoint and cmd for ****.azurecr.io/python-hello"
time="2020-11-06T23:50:27Z" level=debug msg="using managed identity for acr credentials"
time="2020-11-06T23:50:27Z" level=debug msg="image(python-hello) is not from ACR, skip MSI authentication"
2020-11-06T23:50:27.884614Z     info    registry.ping url=https://****.azurecr.io/v2/
time="2020-11-06T23:50:27Z" level=info msg="using registry https://****.azurecr.io"
time="2020-11-06T23:50:27Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email 
protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
2020-11-06T23:50:27.965159Z     info    [ERROR] admission webhook error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":null}]}\n")
cannot create client for doker registry
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49
main.getContainerCmd
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36
main.mutateContainers
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107
main.mutatePodSpec
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229
main.vaultSecretsMutator
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138
github.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123
github.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42
github.com/slok/kubewebhook/pkg/http.HandlerFor.func1
        /go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:2007
github.com/gorilla/mux.(*Router).ServeHTTP
        /go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210
net/http.serverHandler.ServeHTTP
        /usr/local/go/src/net/http/server.go:2802
net/http.(*conn).serve
        /usr/local/go/src/net/http/server.go:1890
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1357
time="2020-11-06T23:50:27Z" level=info msg="found pod to mutate in namespace 'default'"
time="2020-11-06T23:50:27Z" level=info msg="found container 'hellosample' to mutate"
time="2020-11-06T23:50:27Z" level=info msg="checking for env vars containing '@azurekeyvault' in container hellosample"
time="2020-11-06T23:50:27Z" level=info msg="found env var: secret-inject@azurekeyvault"
time="2020-11-06T23:50:27Z" level=debug msg="getting container command for container 'hellosample'"
time="2020-11-06T23:50:27Z" level=debug msg="no cmd override in kubernetes for container hellosample, checking docker image configuration for entrypoint and cmd for ****.azurecr.io/python-hello"
time="2020-11-06T23:50:27Z" level=debug msg="using managed identity for acr credentials"
time="2020-11-06T23:50:27Z" level=debug msg="image(python-hello) is not from ACR, skip MSI authentication"
time="2020-11-06T23:50:27Z" level=info msg="using registry https://****.azurecr.io"
2020-11-06T23:50:27.978271Z     info    registry.ping url=https://****.azurecr.io/v2/
time="2020-11-06T23:50:28Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body=\"{\\\"errors\\\":[{\\\"code\\\":\\\"UNAUTHORIZED\\\",\\\"message\\\":\\\"authentication required, visit https://aka.ms/acr/authorization for more information.\\\",\\\"detail\\\":null}]}\\n\")\ncannot create client for doker registry\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59\ngithub.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49\nmain.getContainerCmd\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36\nmain.mutateContainers\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107\nmain.mutatePodSpec\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229\nmain.vaultSecretsMutator\n\t/go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131\ngithub.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123\ngithub.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review\n\t/go/pkg/mod/github.com/slok/[email 
protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42\ngithub.com/slok/kubewebhook/pkg/http.HandlerFor.func1\n\t/go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2007\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
2020-11-06T23:50:28.026031Z     info    [ERROR] admission webhook error: failed to get auto cmd, error: Get https://****.azurecr.io/v2/: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required, visit https://aka.ms/acr/authorization for more information.\",\"detail\":null}]}\n")
cannot create client for doker registry
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.getImageBlob
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:59
github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry.GetImageConfig
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/pkg/docker/registry/registry.go:49
main.getContainerCmd
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/registry.go:36
main.mutateContainers
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:107
main.mutatePodSpec
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/pod.go:229
main.vaultSecretsMutator
        /go/src/github.com/SparebankenVest/azure-key-vault-to-kubernetes/cmd/azure-keyvault-secrets-webhook/main.go:138
github.com/slok/kubewebhook/pkg/webhook/mutating.MutatorFunc.Mutate
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/mutator.go:25
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).mutatingAdmissionReview
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:131
github.com/slok/kubewebhook/pkg/webhook/mutating.(*staticWebhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/mutating/webhook.go:123
github.com/slok/kubewebhook/pkg/webhook/internal/instrumenting.(*Webhook).Review
        /go/pkg/mod/github.com/slok/[email protected]/pkg/webhook/internal/instrumenting/instrumenting.go:42
github.com/slok/kubewebhook/pkg/http.HandlerFor.func1
        /go/pkg/mod/github.com/slok/[email protected]/pkg/http/handler.go:64
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:2007
github.com/gorilla/mux.(*Router).ServeHTTP
        /go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210
net/http.serverHandler.ServeHTTP
        /usr/local/go/src/net/http/server.go:2802
net/http.(*conn).serve
        /usr/local/go/src/net/http/server.go:1890
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1357

@LumirH

LumirH commented Nov 7, 2020

@leonids2005 -- that's exactly the same issue I'm facing. The workaround is to specify the command in the Kubernetes pod specification, so the connection to ACR is not needed.
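The workaround above can be sketched as follows (a minimal example; the image name, command, and env var value are hypothetical placeholders). Because the entrypoint is stated explicitly, the env-injector webhook never needs to contact the registry to inspect the image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hellosample
  namespace: default
spec:
  containers:
    - name: hellosample
      image: myregistry.azurecr.io/python-hello:latest  # placeholder registry/image
      # Explicit command: the webhook can skip docker image inspection
      # because the entrypoint/cmd is already known from the pod spec.
      command: ["python", "hello.py"]
      env:
        - name: MY_SECRET
          value: "secret-inject@azurekeyvault"  # hypothetical AzureKeyVaultSecret reference
```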

@leonids2005

leonids2005 commented Nov 7, 2020

@LumirH thank you. This is really strange, because the version I used before did work. OK, I will try using the command.

UPDATE: Using the command works, but as I said, it worked before without it.

@leonids2005

@torresdal do you have any updates on this item?

@zacharias33

Hello, I am facing the same issues pointed out by @LumirH and @ArtemTrofimushkin. I am using Helm chart version 1.1.28 (https://charts.spvapi.no) on a new AKS cluster using MSI. Secrets are injected correctly when the image comes from a public repo, but for our private repo in ACR I receive the following:

time="2021-02-11T14:49:24Z" level=debug msg="using managed identity for acr credentials"  
time="2021-02-11T14:49:24Z" level=debug msg="no cmd override in kubernetes for container akv2k8s-env-test, checking docker image configuration for entrypoint and cmd for ******.azurecr.io/devops-runner:master"
time="2021-02-11T14:49:24Z" level=debug msg="using managed identity for acr credentials" 
time="2021-02-11T14:49:24Z" level=debug msg="image(devops-runner:master) is not from ACR, skip MSI authentication"
time="2021-02-11T14:49:25Z" level=error msg="failed to mutate pod, error: failed to get auto cmd, error: Get https://******.azurecr.io/v2/: http: non-successful response (status=401 body=\....................

@torresdal do we have any updates on this issue?

@torresdal
Collaborator

So it seems (after talking in the Slack #beta-testing channel) that @LumirH has confirmed that the latest 1.2 beta solves this issue. Can anyone else verify this?

@zacharias33

I can confirm that the latest beta works for me as well 👍

@torresdal
Collaborator

Have multiple confirmations that this works in latest beta. Will close and hopefully never re-open 😸

@VpavlovLumedic

VpavlovLumedic commented Mar 1, 2021

Looks like 1.2 finally fixed this. I'm also glad to move from 1.0.2 to 1.2 for the security improvements (I can finally set runAsNonRoot: true for the entire pod).

Thanks for the fixes, guys!
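The pod-level setting mentioned above can be sketched like this (a minimal example; the image name and UID are hypothetical placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hellosample
spec:
  # Pod-level securityContext applies to all containers in the pod,
  # which is what makes runAsNonRoot usable once the injected
  # init-container supports running as non-root.
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000  # hypothetical non-root UID
  containers:
    - name: hellosample
      image: myregistry.azurecr.io/python-hello:latest  # placeholder
```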

Labels
bug Something isn't working

10 participants