docs(llms): /deploy/new tarball cap 50MB → 10MB (matches api P1)

mastermanas805 · claude · mastermanas805 · commit 62a96d857ac1 · 2026-06-03T12:36:08.000+05:30
Syncs the SoT to api PR #220: direct uploads on POST /deploy/new now cap at
10MB; over-cap returns 413 tarball_too_large + an agent_action nudging to slim
the upload or deploy a prebuilt image. Pairs with the public/llms.txt fallback edit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
diff --git a/llms.txt b/llms.txt
@@ -39,7 +39,7 @@ Pick a descriptive name per resource (e.g. `"prod-db"`, `"sessions-cache"`, `"ev
 - **`POST /webhooks/brevo/:secret`** — Brevo delivery webhook receiver (internal — Brevo's transactional pipeline POSTs here for every delivery event). Authentication is by URL token: the `{secret}` path segment is constant-time-compared against the platform's `BREVO_WEBHOOK_SECRET`. Handled events: `delivered`, `soft_bounce`, `hard_bounce`, `blocked`, `complaint`, `deferred`, `unsubscribed`, `error`. The handler overwrites the matching `forwarder_sent` row's `classification` with the real Brevo outcome and stamps `delivered_at` on `delivered` only. Unknown messageIds return `200 {"matched":false}` (Brevo retries on 5xx — orphan events must not amplify retries). Unhandled event types (`click`, `open`, `request`) return `200 {"skipped":true}`. This is the truth surface for "did the user receive the email" — the worker's 201 from Brevo's API only means the relay queued the message; `forwarder_sent.classification` (set by this webhook) is the actual delivery outcome.
 - **`GET /healthz`** — Shallow liveness probe. Returns 200 with `{ok, commit_id, build_time, version}` if the binary is up and can ping its primary platform DB. Wired to the Kubernetes `livenessProbe`. Use `/readyz` for deep upstream checks.
 - **`GET /readyz`** — Deep readiness probe. Multi-component upstream reachability matrix (platform_db, customer_db, redis, provisioner_grpc, NATS, DO Spaces, Brevo, Razorpay, GeoIP). Per-check criticality: `platform_db` + `provisioner_grpc` are CRITICAL (failure → 503); everything else degrades to `200 + overall=degraded`. Each check runs in parallel behind a 10-15s cache to avoid self-DoS via the k8s `readinessProbe` cycle. Response envelope: `{ok, overall, commit_id, checks: {name: {status, latency_ms, last_checked, message?}}}`. Same shape served by api, worker, and provisioner.
-- **`POST /deploy/new`** — Container deploy. Multipart form: `tarball=@app.tar.gz` (required, gzipped tar containing Dockerfile + source, ≤50 MB) and `name=my-app` (**required** — same 1–64 char `^[A-Za-z0-9][A-Za-z0-9 _-]*$` rule), plus optional `port=8080`, `env=production` (scope), and `env_vars={"KEY":"VAL"}` (JSON string of env vars injected into the pod). Build runs in-cluster via kaniko (~30–90s); call returns `202` with `status=building`, then `status=healthy` once the URL on `*.deployment.instanode.dev` is live with a Let's Encrypt cert. **Requires a JWT** — `Authorization: Bearer <upgrade_jwt from /db/new or /claim>`.
+- **`POST /deploy/new`** — Container deploy. Multipart form: `tarball=@app.tar.gz` (required, gzipped tar containing Dockerfile + source, ≤10 MB — over the cap returns 413 `tarball_too_large` with an agent_action to slim the upload or deploy a prebuilt image) and `name=my-app` (**required** — same 1–64 char `^[A-Za-z0-9][A-Za-z0-9 _-]*$` rule), plus optional `port=8080`, `env=production` (scope), and `env_vars={"KEY":"VAL"}` (JSON string of env vars injected into the pod). Build runs in-cluster via kaniko (~30–90s); call returns `202` with `status=building`, then `status=healthy` once the URL on `*.deployment.instanode.dev` is live with a Let's Encrypt cert. **Requires a JWT** — `Authorization: Bearer <upgrade_jwt from /db/new or /claim>`.
     - **Pushing a new version of an existing app** (in-place update — same `app_id`, same URL, slot count unchanged): add `redeploy=true` as a multipart form field on the SAME `POST /deploy/new` call, with the SAME `name=` you used for the original deploy. The platform finds the existing deployment for that team + name and rebuilds it in place. The response includes `"redeployed": true` and reuses the original URL. If no matching deployment exists for that name, the call returns `404 {"error":"no_existing_deployment_to_redeploy"}` (and `409 not_ready` if the row has no provider id yet) — drop the flag and retry to create a fresh app. Without `redeploy=true`, every `POST /deploy/new` mints a NEW `app_id` and a NEW `*.deployment.instanode.dev` URL, even when `name` collides — so an agent shipping v2 of the same app MUST pass `redeploy=true` or the user ends up with two parallel deployments and two distinct URLs.
 - **`POST /stacks/new`** — Multi-service deploy. Multipart form: an `instant.yaml` manifest plus one tarball per service, and `name=my-stack` (**required** — same 1–64 char `^[A-Za-z0-9][A-Za-z0-9 _-]*$` rule). **Requires a JWT.** Returns `{ok, slug, stack_url, services: [{name, url, status}]}`. Anonymous stacks (no Bearer JWT) are accepted and expire after a 6h TTL (a stack is live compute — tighter than the 24h anon data-resource TTL; claim/upgrade to keep it).
 - **`GET /api/v1/stacks/{slug}`** — Inspect a stack by slug. Returns the manifest, current per-service status, exposed URLs, and the merged env-vars (redacted). Anonymous-tier stacks are readable by anyone holding the slug; authenticated stacks require the owning team's session JWT.
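
A client might check the new cap before uploading and branch on the responses described in the hunk above. The following is a minimal sketch, not the platform's own client. It assumes "10 MB" means 10 * 1024 * 1024 bytes (the diff does not say which unit the server counts in) and that error responses carry the code in an `error` field, as the `404 {"error":"no_existing_deployment_to_redeploy"}` example suggests. The function names and the returned step labels are hypothetical.

```python
import json
import os

# Assumption: the cap is read as 10 * 1024 * 1024 bytes; the server may count differently.
TARBALL_CAP_BYTES = 10 * 1024 * 1024


def tarball_within_cap(size_bytes: int, cap_bytes: int = TARBALL_CAP_BYTES) -> bool:
    """Return True when a non-empty tarball of this size fits under the cap."""
    return 0 < size_bytes <= cap_bytes


def file_within_cap(path: str, cap_bytes: int = TARBALL_CAP_BYTES) -> bool:
    """Check an on-disk tarball before sending it to POST /deploy/new."""
    return tarball_within_cap(os.path.getsize(path), cap_bytes)


def next_step(status_code: int, body_text: str) -> str:
    """Map a /deploy/new response to a hypothetical client-side next step."""
    try:
        body = json.loads(body_text) if body_text else {}
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    # Assumption: the error code lives under the "error" key.
    error = body.get("error")

    if status_code == 413 and error == "tarball_too_large":
        # Over the cap: slim the upload or deploy a prebuilt image instead.
        return "slim_or_use_prebuilt_image"
    if status_code == 202:
        # Build accepted: status starts as "building" and later becomes "healthy".
        return "poll_until_healthy"
    if status_code == 404 and error == "no_existing_deployment_to_redeploy":
        # redeploy=true was sent but nothing matched: drop the flag and retry.
        return "retry_without_redeploy"
    return "inspect_response"
```

Running the size check locally avoids a wasted upload, but the server's 413 remains authoritative, so a client should still handle it.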