
[OPENJDK-2992] Create DeploymentConfig and Route for jlink apps #514


Merged
jmtd merged 19 commits into rh-openjdk:jlink-dev on Mar 3, 2025

Conversation

Josh-Matsuoka
Contributor

Addresses https://issues.redhat.com/browse/OPENJDK-2992

Cleanup/Continuation of #500

This cleans up the original PR: bringing it in line with the new naming convention for created objects, adding the missing target port, and converting the Deployment to a DeploymentConfig so it works with ImageStreamTags.
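
For context, a minimal sketch of the kind of DeploymentConfig this implies, with an ImageChange trigger pointing at an ImageStreamTag. The APPNAME and TARGET_PORT parameters and the ImageStream name are borrowed from later in this thread; the actual template may differ.

apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: ${APPNAME}
  labels:
    app: ${APPNAME}
spec:
  replicas: 1
  selector:
    app: ${APPNAME}
  template:
    metadata:
      labels:
        app: ${APPNAME}
    spec:
      containers:
      - name: ${APPNAME}
        image: ' '  # placeholder; filled in by the ImageChange trigger below
        ports:
        - containerPort: ${{TARGET_PORT}}
  triggers:
  - type: ConfigChange
  - type: ImageChange
    imageChangeParams:
      automatic: true
      containerNames:
      - ${APPNAME}
      from:
        kind: ImageStreamTag
        name: ${APPNAME}-lightweight-image:latest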

Member

@jmtd jmtd left a comment


Thanks for working on this! Some notes:

  • can we get a default value for TARGET_PORT?
  • please add TARGET_PORT to the examples in templates/jlink/README.md
  • The Port parameter in the Service object needs templating too; the default of 80 doesn't work with the quickstart we use in README.md (see the sketch below)
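
A hedged sketch of what the templated Service could look like, using the TARGET_PORT and SERVICE_PORT parameter names that appear later in this thread (the ${{...}} form is OpenShift's syntax for non-string parameter substitution):

objects:
- kind: Service
  apiVersion: v1
  metadata:
    name: ${APPNAME}
    labels:
      app: ${APPNAME}
  spec:
    ports:
    - name: http
      port: ${{SERVICE_PORT}}      # was hard-coded to 80
      targetPort: ${{TARGET_PORT}} # the quickstart listens on 8080
    selector:
      app: ${APPNAME}
parameters:
- name: SERVICE_PORT
  description: Port the Service listens on
  value: "8080"
- name: TARGET_PORT
  description: Port the container listens on
  value: "8080"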


@sefroberg sefroberg left a comment


The structure looks correct.

Member

@jmtd jmtd left a comment


Thanks for adding those; this is much closer. I've just tried it again in CRC and it's still not quite there. One more difference I notice between what this creates and what is created if I do "+Add" by hand in the console is that the latter creates a Pod object as well. Perhaps we need to add that.

@jmtd
Member

jmtd commented Nov 20, 2024

Here's a dump of the Pod object that was created in my CRC instance when I did a manual "+Add". I'm guessing the vast majority of fields here aren't needed in the template, and somehow we'll have to resolve the ImageStream/Image discrepancy for this.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.217.1.68"
          ],
          "default": true,
          "dns": {}
      }]
    openshift.io/generated-by: OpenShiftWebConsole
    openshift.io/scc: restricted-v2
    seccomp.security.alpha.kubernetes.io/pod: runtime/default
  creationTimestamp: "2024-11-20T10:27:01Z"
  generateName: join-76fb988d6-
  labels:
    app: join
    deployment: join
    pod-template-hash: 76fb988d6
  name: join-76fb988d6-gmcnp
  namespace: jlink1
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: join-76fb988d6
    uid: ed6e2e33-3d6f-40b3-9520-33f039c5c14f
  resourceVersion: "39753"
  uid: 05cea876-ef10-41ad-96d7-872468638fc0
spec:
  containers:
  - image: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    imagePullPolicy: IfNotPresent
    name: join
    ports:
    - containerPort: 8080
      protocol: TCP
    resources: {}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000660000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-r8pvn
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: default-dockercfg-m2v5c
  nodeName: crc-97g8f-master-0
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000660000
    seLinuxOptions:
      level: s0:c26,c5
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-r8pvn
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:01Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:07Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:07Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:01Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://b171de0db71303e4de17949e27f41503aa7a1f36b6696d28ae7cbcb6cea7a4c6
    image: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    imageID: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    lastState: {}
    name: join
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-11-20T10:27:06Z"
  hostIP: 192.168.126.11
  phase: Running
  podIP: 10.217.1.68
  podIPs:
  - ip: 10.217.1.68
  qosClass: BestEffort
  startTime: "2024-11-20T10:27:01Z"

@jmtd
Member

jmtd commented Dec 18, 2024

Thanks for continuing to work on this. We're not quite there yet. Here's the
process I go through to review it each time, and the acceptance criteria to
merge:

  1. create a new namespace in my crc instance (or a new crc instance)
  2. follow the steps in templates/jlink/README.md
  3. watch the builds from the web console
  4. visit "Topology" and see what's up

With the current state (adb47cce3197581b42f364bec17b353fd6b57998), the
label on the top-level Template object means the template won't load.

Once that's resolved, the final state is a DeploymentConfig and a Pod that
are separate in the Topology view, instead of part of the same Application.
In this pic, the two objects on the left are created from the template, and
the objects on the right are created when I do "+Add" manually:

[screenshot "topology": the two template-created objects shown separately on the left; the manually created objects on the right grouped into one Application]

Where we want to get to is to have the objects combined like in the manual
case. I think this should be possible with DeploymentConfigs, even though
the manual approach creates a Deployment instead (we've discussed the
difficulties of creating a Deployment in the template, vis-a-vis the
ImageStream versus image: issue). I think it's a matter of labelling,
but I'm not sure.
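
If it is labelling, one relevant detail: the web console's Topology view groups objects into a single Application using the app.kubernetes.io/part-of label. A sketch of what each templated object might carry (the value here is an assumption):

metadata:
  labels:
    app: ${APPNAME}
    app.kubernetes.io/part-of: ${APPNAME}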

Clicking on the "external link" icon, to visit the URI for the route,
results in "Application is not available" for the Route created by
the template, but works for the manual case.

The acceptance criteria are thus:

  1. template is valid (a validation sketch follows this list)
  2. builds auto-trigger
  3. Pod, DeploymentConfig are created
  4. Pod, DeploymentConfig are unified in the Topology view
  5. the relevant Route works
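
For criterion 1, a hedged sketch of how the template can be checked without creating anything (standard oc flags; the parameter values are the ones used elsewhere in this thread, and other required parameters may be needed):

# list the parameters the template declares
oc process -f templates/jlink/jlinked-app.yaml --parameters

# render the template and validate the resulting objects without creating them
oc process -f templates/jlink/jlinked-app.yaml \
  -p APPNAME=quarkus -p TARGET_PORT=8080 -p SERVICE_PORT=8080 \
  | oc apply --dry-run=client -f -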

@Josh-Matsuoka
Contributor Author

The latest push fixes the label issue.

The problem doesn't seem to be with the route; rather, it's with the container itself. Looking at the logs before it crashes, I'm seeing:

ERROR: Failed to start application (with profile [prod])
java.lang.RuntimeException: Failed to start quarkus
at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
at io.quarkus.runtime.Application.start(Application.java:101)
at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
at io.quarkus.runner.GeneratedMain.main(Unknown Source)
Caused by: java.lang.IllegalStateException: Unable to create folder at path '/tmp/vertx-cache/8814133929435640674'

It looks like Quarkus isn't able to create any temporary files/directories inside the container when built through the template. I don't see any differences in the pod specs that should be causing this; do you have any ideas, @jmtd?

@Josh-Matsuoka
Contributor Author

Upon further investigation, it looks like the user we switch to for running the java command (USER 185) doesn't have write permissions on the filesystem, so it's unable to create the cache for Quarkus. This causes Quarkus to crash and the pod to enter a CrashLoopBackOff, which is why the application is unavailable.

Removing USER 185 (and presumably running as root), everything works as expected, but this probably isn't the best solution here. Do you have any suggestions for a different user and/or how to fix this permissions issue?
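
A hedged alternative to dropping USER 185: keep the non-root user, but make the directories it needs group-0 writable at image build time, since OpenShift runs containers as an arbitrary UID in the root group. The /deployments path below is an assumption, and Vert.x can be pointed at the writable location via its standard vertx.cacheDirBase system property; this is a sketch, not necessarily the fix adopted here.

# build-time (e.g. Dockerfile RUN steps, executed before switching to USER 185)
mkdir -p /deployments/vertx-cache
chgrp -R 0 /deployments
chmod -R g=u /deployments

# run-time: redirect the Vert.x cache away from /tmp (path is an assumption)
export JAVA_OPTS="-Dvertx.cacheDirBase=/deployments/vertx-cache"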

@Josh-Matsuoka
Contributor Author

@jmtd any thoughts on the above?

@jmtd
Member

jmtd commented Jan 28, 2025

Hi Josh, I've just been looking at this a bit more and trying to reproduce your experience, but I haven't been able to.

When I load the template on a clean cluster, once the three builds have finished, I can see (in the "Topology" view) a single Application containing a DeploymentConfig and an associated Route. Trying to follow the Route shows "application not available". There are 0/1 pods running associated with the DeploymentConfig, and I cannot find logs or events to explain what's going on.

Since nothing is running, there's no issue with permissions in /tmp, at least yet.

Meanwhile, if I click "+Add" and create a new Deployment from the final image stream, that Deployment starts up a Pod, and following the relevant route leads me to the "hello world" page. So the app has deployed successfully and hasn't complained about any permissions issues.

If I access a shell on the running Pod, I can see the permissions for /tmp are as expected (anyone can write) and indeed the running user has written some stuff:

sh-5.1$ id
uid=1000670000(1000670000) gid=0(root) groups=0(root),1000670000
sh-5.1$ ls -al /tmp
total 0
drwxrwxrwt. 1 root       root 54 Jan 28 14:35 .
dr-xr-xr-x. 1 root       root 39 Jan 28 14:35 ..
drwxr-xr-x. 2 1000670000 root 15 Jan 28 14:35 hsperfdata_1000670000
drwxrwxrwx. 3 1000670000 root 69 Jan 28 14:35 vertx-cache


@Josh-Matsuoka
Contributor Author

After nuking the old crc cluster

crc delete ; crc setup ; crc start --cpus 6 --memory 24596

and updating to the most recent version of crc

CRC version: 2.46.0+8f40e8
OpenShift version: 4.17.10
MicroShift version: 4.17.10

The DeploymentConfig seems to work as expected now:

[screenshot "Jlink": the app deployed and running in the Topology view]

Template was given the following parameters:

oc process -n openshift -f templates/jlink/jlinked-app.yaml -p APP_URI=https://github.com/jboss-container-images/openjdk-test-applications -p JDK_VERSION=17 -p REF=master -p CONTEXT_DIR=quarkus-quickstarts/getting-started-3.9.2-uberjar -p TARGET_PORT=8080 -p SERVICE_PORT=8080 -p APPNAME=quarkus | oc create -f -

Make sure the builder containers used are correctly named (I updated the README accordingly):

oc create imagestream openjdk-17-jlink-tech-preview
podman tag ubi9/openjdk-17:1.18 default-route-openshift-image-registry.apps-crc.testing/default/openjdk-17-jlink-tech-preview:1.18
podman push default-route-openshift-image-registry.apps-crc.testing/default/openjdk-17-jlink-tech-preview:1.18

Josh-Matsuoka requested a review from jmtd on January 30, 2025
@jmtd
Member

jmtd commented Feb 4, 2025

Hi Josh,

Template was given the following parameters:
-p JDK_VERSION=17

Are you sure you specified JDK 17 for the template? With our move towards only doing a 21 tech preview image, I would expect this not to work.

@Josh-Matsuoka
Contributor Author

Are you sure you specified JDK 17 for the template? With our move towards only doing a 21 tech preview image, I would expect this not to work.

I tested both. JDK 21 works as well for me.

oc process -n openshift -f templates/jlink/jlinked-app.yaml -p APP_URI=https://github.com/jboss-container-images/openjdk-test-applications -p JDK_VERSION=21 -p REF=master -p CONTEXT_DIR=quarkus-quickstarts/getting-started-3.9.2-uberjar -p TARGET_PORT=8080 -p SERVICE_PORT=8080 -p APPNAME=quarkus-3 | oc create -f -

@jmtd
Member

jmtd commented Feb 13, 2025

I'm still getting undeployable results. I'll update my crc (current is CRC version: 2.31.0+6d23b6).

@Josh-Matsuoka
Contributor Author

Did you ever get a chance to re-test this with an updated CRC? I honestly have no idea where the disconnect is; I can't reproduce what you're seeing.


@sefroberg sefroberg left a comment


I think there are a couple of items that should be updated:

  1. JDK_VERSION: we are only releasing this for JDK 21, so there is no reason to confuse the customer with a lower default version.
  2. Update to using Deployment rather than DeploymentConfig; DeploymentConfig has been deprecated since OCP version 4.14.

@Josh-Matsuoka
Contributor Author

@jmtd @sefroberg

Updated the DeploymentConfig to a Deployment object.

From the documentation there's one notable limitation here:

The following default projects are considered highly privileged: default, kube-public, kube-system, openshift, openshift-infra, openshift-node, and other system-created projects that have the openshift.io/run-level label set to 0 or 1. Functionality that relies on admission plugins, such as pod security admission, security context constraints, cluster resource quotas, and image reference resolution, does not work in highly privileged projects.

In other words, for the Deployment to resolve the ImageStream correctly it needs to run in a non-default project. Running everything in a newly created project/namespace, it works as expected and deploys properly for me.

If you're getting an undeployable result, check the deployment pod to see if it's having difficulty resolving the $APPNAME-lightweight-image. If it's an ErrImagePull looking for a URL like docker.io/quarkus-lightweight-image or something similar, then it's not able to resolve the ImageStream, and you should double-check the project you're running it in (see the sketch below).
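
For reference, a minimal sketch of how a plain Deployment can resolve a local ImageStreamTag, via the alpha.image.policy.openshift.io/resolve-names annotation; this is the image-reference-resolution machinery that the quoted documentation says does not work in highly privileged projects. The names follow this thread; the real template may differ.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${APPNAME}
  annotations:
    # ask OpenShift to rewrite matching image references to the ImageStreamTag's registry URL
    alpha.image.policy.openshift.io/resolve-names: '*'
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ${APPNAME}
  template:
    metadata:
      labels:
        app: ${APPNAME}
    spec:
      containers:
      - name: ${APPNAME}
        # without resolution this falls back to docker.io/..., giving ErrImagePull
        image: ${APPNAME}-lightweight-image:latest
        ports:
        - containerPort: ${{TARGET_PORT}}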

@sefroberg

I'll have to double-check that when I get home. My understanding is that users should never use the default project. In the case I ran on Thursday, I used a project called jlink1. But I need to try it again to make sure it is reproducible.

@jmtd
Member

jmtd commented Feb 27, 2025

On today's call I said I'd share my experiment with dumping a Deployment object from CRC after creating it manually. It's at https://github.com/jmtd/redhat-openjdk-container-images/tree/514-jmtd-dump-deployment . Note that the focus of this work is still on Josh's PR; I need to catch up on #514 (comment)

@sefroberg

@jmtd I re-ran the procedure on our OCP cluster and it worked this time. If you run @Josh-Matsuoka's command above, remove the -n openshift, as I think this caused some problems. I ran it again today and the deployment came up automatically.
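
That is, the earlier command with the -n openshift flag dropped:

oc process -f templates/jlink/jlinked-app.yaml -p APP_URI=https://github.com/jboss-container-images/openjdk-test-applications -p JDK_VERSION=21 -p REF=master -p CONTEXT_DIR=quarkus-quickstarts/getting-started-3.9.2-uberjar -p TARGET_PORT=8080 -p SERVICE_PORT=8080 -p APPNAME=quarkus-3 | oc create -f -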


@sefroberg sefroberg left a comment


Some minor changes to the README. Otherwise looks good.


@sefroberg sefroberg left a comment


Looks good to me.

@jmtd
Member

jmtd commented Mar 3, 2025

Fantastic, it worked!

jmtd merged commit 06a2d7f into rh-openjdk:jlink-dev on Mar 3, 2025
0 of 6 checks passed