Context
The CTJ rewrite's target topology is three containers in a shared ECS task:
Next.js - serves the frontend (App Router pages, server components, server actions)
Django - serves the CTJ API (/api/*) and Django admin (/admin/*)
Postgres - the only database in scope for the rewrite (no managed RDS)
The ALB does path-based routing: /api/* and /admin/* go to a Django target group, everything else goes to a Next.js target group. Both target groups are backed by the same ECS task. Postgres has no public ingress; it's reachable only from the Django container via localhost:5432. The browser sees a single origin (stage.civictechjobs.org), so CORS isn't a concern.
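The routing described above could be expressed roughly as follows. This is a minimal sketch against the raw AWS provider resources, not the incubator's modules: every resource name, both variables, the rule priorities, and the Next.js port (3000, the framework default) are assumptions. It also assumes the ALB is shared, so "default" is modeled as a host-scoped catch-all rule rather than the listener's default action.

```hcl
# Hypothetical inputs; the real listener and VPC come from the incubator's shared infra.
variable "vpc_id" { type = string }
variable "listener_arn" { type = string }

resource "aws_lb_target_group" "django" {
  name        = "ctj-stage-django"
  port        = 8000 # port of today's fullstack container; assumed unchanged for Django
  protocol    = "HTTP"
  target_type = "ip" # needed if the task uses awsvpc networking (an assumption)
  vpc_id      = var.vpc_id
  # The default health check path is "/"; Django will likely need an
  # explicit path under /api/, which this issue does not specify.
}

resource "aws_lb_target_group" "nextjs" {
  name        = "ctj-stage-nextjs"
  port        = 3000 # assumed Next.js port
  protocol    = "HTTP"
  target_type = "ip"
  vpc_id      = var.vpc_id
}

# Lower priority numbers are evaluated first, so the Django paths win
# over the catch-all rule below.
resource "aws_lb_listener_rule" "django_paths" {
  listener_arn = var.listener_arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.django.arn
  }

  condition {
    host_header { values = ["stage.civictechjobs.org"] }
  }

  condition {
    path_pattern { values = ["/api/*", "/admin/*"] }
  }
}

# Everything else on the stage hostname goes to Next.js.
resource "aws_lb_listener_rule" "nextjs_default" {
  listener_arn = var.listener_arn
  priority     = 200

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.nextjs.arn
  }

  condition {
    host_header { values = ["stage.civictechjobs.org"] }
  }
}
```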
Stage today still runs as a single fullstack container (application_type = "fs", container_port = 8000, single :stage image of the combined Django + WhiteNoise + SPA-catchall shape, separate RDS instance). The deployed image and Terraform are both pre-rewrite.
This is the critical-path Terraform change. Until it lands, the rewrite can't deploy; the application-side cutover (auth rebuild, matching engine work, qualifier integration) has nowhere to run.
Scope
All changes are in the incubator repo at terraform/projects/civic-tech-jobs/, in four distinct pieces of work:
1. ECR repositories
In civic-tech-jobs.tf, split the existing civic_tech_jobs_ecr_fullstack module call into two:
ECR repo civic-tech-jobs-frontend
ECR repo civic-tech-jobs-backend
Postgres uses the upstream postgres:18 image and doesn't need an ECR repo.
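For illustration, the two repositories could look like the sketch below. It uses the plain aws_ecr_repository resource because the inputs of the existing ECR module are not shown in this issue; the resource, local, and output names are hypothetical, and mutable tags are assumed because the :stage tag appears to be re-pushed on each deploy.

```hcl
locals {
  # Repository names taken from this issue.
  ctj_ecr_repos = toset([
    "civic-tech-jobs-frontend",
    "civic-tech-jobs-backend",
  ])
}

resource "aws_ecr_repository" "ctj" {
  for_each = local.ctj_ecr_repos

  name                 = each.value
  image_tag_mutability = "MUTABLE" # assumption: :stage is overwritten per deploy
}

# Repository URLs the deploy workflow would push to, keyed by repo name.
output "ctj_ecr_repository_urls" {
  value = { for name, repo in aws_ecr_repository.ctj : name => repo.repository_url }
}
```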
2. ECS task definition
In environment-stage.tf, replace the single civic_tech_jobs_fullstack_stage_service module call with a multi-container task spec:
One ECS task with three container definitions: Next.js, Django, Postgres
ALB with two target groups (Django, Next.js); listener rules: /api/* and /admin/* -> Django target group, default -> Next.js target group
Both target groups attach to the same task; Postgres is unattached (no public ingress)
Cross-container traffic stays on localhost (shared task network namespace)
The existing container module in incubator/terraform/modules/container was written for single-container ECS services. The application-side requirement is that the three containers share a task; how the module shape gets there is your call.
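One possible shape for the task and service, written against the raw AWS provider resources rather than the container module, is sketched below. All variables, resource names, sizes, and the Next.js port are assumptions; awsvpc networking is also an assumption, chosen because it gives the containers a shared localhost. Environment variables are omitted for brevity.

```hcl
# Hypothetical inputs supplied by the surrounding incubator configuration.
variable "cluster_arn" { type = string }
variable "execution_role_arn" { type = string }
variable "subnet_ids" { type = list(string) }
variable "security_group_ids" { type = list(string) }
variable "frontend_image" { type = string }
variable "backend_image" { type = string }
variable "django_target_group_arn" { type = string }
variable "nextjs_target_group_arn" { type = string }

resource "aws_ecs_task_definition" "ctj_stage" {
  family             = "civic-tech-jobs-stage"
  network_mode       = "awsvpc" # containers in the task share localhost
  cpu                = "1024"   # placeholder sizing
  memory             = "2048"   # placeholder sizing
  execution_role_arn = var.execution_role_arn

  container_definitions = jsonencode([
    {
      name         = "nextjs"
      image        = var.frontend_image
      essential    = true
      portMappings = [{ containerPort = 3000, protocol = "tcp" }]
    },
    {
      name         = "django"
      image        = var.backend_image
      essential    = true
      portMappings = [{ containerPort = 8000, protocol = "tcp" }]
      # Start Django only after the Postgres container has started.
      dependsOn = [{ containerName = "postgres", condition = "START" }]
    },
    {
      # No port mapping and no target group: reachable only on localhost:5432.
      name      = "postgres"
      image     = "postgres:18"
      essential = true
    },
  ])
}

resource "aws_ecs_service" "ctj_stage" {
  name            = "civic-tech-jobs-stage"
  cluster         = var.cluster_arn
  task_definition = aws_ecs_task_definition.ctj_stage.arn
  # With Postgres inside the task, each additional task would get its own database.
  desired_count = 1

  network_configuration {
    subnets         = var.subnet_ids
    security_groups = var.security_group_ids
  }

  # One service, two target groups, both pointing at containers in the same task.
  load_balancer {
    target_group_arn = var.django_target_group_arn
    container_name   = "django"
    container_port   = 8000
  }

  load_balancer {
    target_group_arn = var.nextjs_target_group_arn
    container_name   = "nextjs"
    container_port   = 3000
  }
}
```

Whether this lives in a new module, an extended container module, or inline in environment-stage.tf is left open, as stated above.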
3. Frontend build-time vars
NEXT_PUBLIC_API_URL must be baked into the frontend image at build time (Next.js inlines NEXT_PUBLIC_* values into the JS bundle during npm run build; setting them at container runtime has no effect because the bundle is already frozen). This means:
The value goes into the deploy workflow's docker build step as a --build-arg, not into the Terraform container_environment list (see Rewrite deploy-stage.yml for two-image build (BLOCKED) #750).
No Terraform-side change for NEXT_PUBLIC_API_URL itself; just flagging the coordination point so the deploy workflow can match.
4. Postgres: drop RDS, in-task container
The existing civic_tech_jobs_stage_database module call provisions an RDS instance and exports database, host, port, owner_username, owner_password_arn. Those exports are wired into the fullstack container's env list today.
The new shape, per docs/developer/deployment-infra.md:
The civic_tech_jobs_stage_database module call goes away (no more RDS provisioning).
A Postgres container (image: postgres:18) is added to the ECS task. Its init env vars are POSTGRES_USER / POSTGRES_PASSWORD / POSTGRES_DB. Its data dir at /var/lib/postgresql/data needs persistent storage; the deployment-infra spec leaves persistence configuration explicitly to DevOps.
The Django container's connection env (SQL_HOST / SQL_PORT / SQL_USER / SQL_PASSWORD / SQL_DATABASE) connects to localhost:5432 with the same credentials as the Postgres init env. How those credentials are sourced (the existing random_password pattern, Secrets Manager, etc.) is a DevOps call.
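As a sketch of how the two env lists could be kept in sync, the example below derives both from one set of locals and a random_password. The database name, user name, and all resource and local names are hypothetical. Passing the password as a plain environment value is only one option and leaves it readable in the task definition; a Secrets Manager reference through the container's secrets list is the alternative mentioned above.

```hcl
resource "random_password" "ctj_stage_db" {
  length  = 32
  special = false # avoids quoting issues in connection settings
}

locals {
  ctj_db_name = "ctj" # hypothetical
  ctj_db_user = "ctj" # hypothetical

  # Init env for the postgres:18 container.
  postgres_environment = [
    { name = "POSTGRES_DB", value = local.ctj_db_name },
    { name = "POSTGRES_USER", value = local.ctj_db_user },
    { name = "POSTGRES_PASSWORD", value = random_password.ctj_stage_db.result },
  ]

  # Connection env for the Django container; same credentials, localhost target.
  django_environment = [
    { name = "SQL_HOST", value = "localhost" },
    { name = "SQL_PORT", value = "5432" },
    { name = "SQL_DATABASE", value = local.ctj_db_name },
    { name = "SQL_USER", value = local.ctj_db_user },
    { name = "SQL_PASSWORD", value = random_password.ctj_stage_db.result },
  ]
}
```

Each list would be passed as the environment attribute of the matching container definition. Note that the official Postgres image applies the POSTGRES_* values only when it initializes an empty data directory, so once persistent storage is in place, rotating the password in Terraform does not change it inside the existing database.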
References
docs/decisions/0003-separate-frontend-backend-containers.md
docs/developer/deployment-infra.md
make stage-smoke (added in chore: Adopt Make as canonical task runner with working host + stage workflows #747) builds and runs the production-shaped images in the equivalent compose topology. Useful as a sanity check that the application side actually composes the way this issue assumes.
Coordination
Substantive cross-repo change, not a drive-by PR. Worth a synchronous handoff conversation rather than starting blind.
Blocks the deploy-stage.yml rewrite on the CTJ side (Rewrite deploy-stage.yml for two-image build (BLOCKED) #750). The workflow can't be rewritten to push two images until the ECR repos and ECS task spec are in place; landing the workflow rewrite first would push to nonexistent ECR repos. Lockstep landing is fine; isolated CTJ-first is not.
The existing :stage deploy is currently broken at the build step on push to main (CTJ's deploy-stage.yml references a Dockerfile that no longer exists). That's a deliberate escape hatch; see Rewrite deploy-stage.yml for two-image build (BLOCKED) #750.
Owner
TBD - pending DevOps CoP discussion.