ci: fix the always-failing PR checks (LFS-404 + upstream-only gating) #48

Merged
Pterjudin merged 7 commits into main from fix/pr-checks-2026-05-24
May 24, 2026
@Pterjudin

Summary

Every PR on this repo currently shows ~15 red checks. Almost all of them have been red since long before the 1.118.1 sync (confirmed on PR #44 from Feb 2026). The team has been relying on cortexide-builder for real validation. This PR triages the noise inherited from upstream without disabling any check that is catching real regressions.

This is a draft; do not merge until reviewed.

Root causes found

After pulling logs from PRs #46 and #47, the failures collapse into three buckets:

  1. LFS-404 (drives ~15 of the failures). actions/checkout@v6 with lfs: true fails because every LFS pointer in the repo resolves to 404 on our LFS server:

    [d326ad15c05a...] Object does not exist on the server: [404]
    error: failed to fetch some objects from 'https://github.com/OpenCortexIDE/cortexide.git/info/lfs'
    The process '/usr/bin/git' failed with exit code 2
    

    The only LFS-tracked files in the repo are extensions/copilot/test/simulation/cache/*.sqlite (and extensions/copilot/.gitattributes declares them). The pointer files are committed; the actual blobs were never uploaded to our LFS endpoint.

  2. Microsoft-specific infrastructure the fork doesn't have credentials/runners/endpoints for:

    • Monaco Editor checks — part of upstream's npm-publish flow for monaco-editor.
    • Prevent engineering system changes in PRs — queries microsoft/vscode collaborator permissions (403 Resource not accessible by integration) and references vs-code-engineering[bot].
    • Check API Proposal Version Changes — enforces upstream's vscode.proposed.*.d.ts versioning policy that we don't manage.
    • Checking Component Screenshots — uploads to hediet-screenshots.azurewebsites.net with an OIDC token our identity isn't authorized for (curl: (22) The requested URL returned error: 403).
    • copilot-setup-steps — uses runs-on: vscode-large-runners (Microsoft-only self-hosted runner), sits queued 24h before GitHub force-fails it.
  3. Real test failures. Component Fixture Tests is failing 8 Playwright tests in tests/imageCarousel.spec.ts (locator('.image-carousel-editor') never becomes visible). This fails identically on PR #44 (Sync/vscode 1.110.0) from February, so it's a pre-existing bug unrelated to the 1.118.1 rebase. Not fixed in this PR; see "Intentionally NOT fixed" below.

Fixes in this PR

| Check | Root cause | Fix |
| --- | --- | --- |
| Compile & Hygiene | LFS-404 | lfs: false in pr.yml (compile doesn't read sqlite) |
| Linux / {Browser, Electron, Remote} | LFS-404 | lfs: false in pr-linux-test.yml |
| macOS / {Browser, Electron, Remote} | LFS-404 | lfs: false in pr-darwin-test.yml |
| Windows / {Browser, Electron, Remote} | LFS-404 | lfs: false in pr-win32-test.yml |
| Linux / CLI | LFS-404 | lfs: false in pr-linux-cli-test.yml |
| Copilot - Check Telemetry | LFS-404 | lfs: false in pr.yml (telemetry extractor reads TS sources) |
| Copilot - Check Test Cache | needs LFS-tracked sqlite | Gated off on fork via if: github.repository_owner != 'OpenCortexIDE' |
| Copilot - Test (Linux) | needs LFS-tracked sqlite for simulate-ci | Gated off on fork |
| Copilot - Test (Windows) | needs LFS-tracked sqlite for simulate-ci | Gated off on fork |
| Monaco Editor checks | upstream npm-publish flow + LFS-404 | Gated off on fork |
| Prevent engineering system changes in PRs | queries microsoft/vscode permissions | Gated off on fork |
| Check API Proposal Version Changes | upstream API governance | Gated off on fork |
| Checking Component Screenshots | upstream Azure endpoint (403) | Gated off on fork |
| copilot-setup-steps | uses upstream's self-hosted runner pool | Gated off on fork |
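
For the LFS-404 rows, the change is a single input on the checkout step. The fragment below is a minimal sketch of what that step might look like in a workflow such as pr.yml; the step name and the surrounding layout are assumptions for illustration, not a copy of the actual files.

```yaml
# Sketch only: the real workflows contain more steps and inputs.
steps:
  - name: Checkout        # step name is illustrative
    uses: actions/checkout@v6
    with:
      # The LFS objects return 404 on this fork's server, and these jobs
      # do not read the sqlite simulation cache, so skip the LFS download.
      lfs: false
```

actions/checkout treats `lfs` as false when the input is omitted, so deleting the line should behave the same way. Keeping it explicit makes the one-word difference from upstream easier to spot during the next sync.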

All gating uses if: github.repository_owner != 'OpenCortexIDE' so the workflows are kept verbatim and will run normally on any other fork that has the upstream infra. To revert, drop the if: line.
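
As a rough illustration of the gate, a job-level condition could look like the following. This is a sketch using copilot-setup-steps as the example; the job key and the steps are assumed for illustration, and only the `if:` expression and the runner label come from the description above.

```yaml
jobs:
  copilot-setup-steps:      # job key assumed for illustration
    # Skipped when the repository owner is OpenCortexIDE; any other owner
    # evaluates this to true and runs the job unchanged.
    if: github.repository_owner != 'OpenCortexIDE'
    runs-on: vscode-large-runners
    steps:
      - uses: actions/checkout@v6
```

Because the condition is evaluated before a runner is requested, a gated job should be reported as skipped instead of sitting in the queue for the Microsoft-only runner pool.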

Intentionally NOT fixed

  • Component Fixture Tests (Playwright imageCarousel timeouts) — this is a real test failure, but it pre-existed the rebase by several months and isn't load-bearing for shipping (the actual product build runs in cortexide-builder, which is green). Fixing it requires investigating why .image-carousel-editor never mounts in headless Chromium and is out of scope here. Suggest a separate fix/image-carousel-playwright PR.
  • chat-lib tests (ubuntu/macos/windows) — these are passing and unaffected.
  • Check metadata (telemetry.yml) — passing.
  • pr-node-modules.yml — push-to-main only, doesn't run on PRs.
  • sessions-e2e.yml / chat-perf.yml — already workflow_dispatch-only.
  • The cortexide-builder workflows in the sibling repo were not touched.

Follow-ups requiring user decision (not in this PR)

  1. LFS data: do we want to actually upload the copilot simulation sqlite files to our LFS server (~1-2 GB) so the simulate-ci jobs can run? If yes, ungate the three copilot test jobs. If no, leave gated.
  2. Engineering-system guardrail: upstream's "no PR may modify .github/workflows or build/" rule is a sensible policy. If you want it enforced on this repo too, the workflow could be rewritten to query OpenCortexIDE/cortexide collaborator permissions instead of microsoft/vscode. Happy to do that in a follow-up.
  3. Screenshot diff service: if there's appetite, we can stand up our own Azure-Static-Web-Apps screenshot service and re-enable the workflow.
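
If follow-up 2 goes ahead, one possible shape for a fork-local guardrail is sketched below. It is not the upstream workflow: the workflow name, job key, path filters, and the choice of "write or admin" as the allowed levels are all assumptions. It relies on the GitHub REST endpoint for a collaborator's repository permission and on the `gh` CLI that ships on GitHub-hosted runners; whether the default token has enough scope on every PR type would need to be verified.

```yaml
# Hypothetical sketch, not the upstream workflow.
name: Guard engineering system changes (fork sketch)
on:
  pull_request:
    paths:
      - '.github/workflows/**'
      - 'build/**'
permissions:
  contents: read
jobs:
  check-author-permission:
    runs-on: ubuntu-latest
    steps:
      - name: Look up the PR author's permission on this repository
        env:
          GH_TOKEN: ${{ github.token }}
          # Passed through env so the login is never interpolated into the script.
          PR_AUTHOR: ${{ github.event.pull_request.user.login }}
        run: |
          # GITHUB_REPOSITORY is owner/name of the repo running the workflow,
          # so this queries the fork itself and not microsoft/vscode.
          level=$(gh api "repos/${GITHUB_REPOSITORY}/collaborators/${PR_AUTHOR}/permission" --jq '.permission')
          echo "Permission for ${PR_AUTHOR}: ${level}"
          if [ "${level}" != "admin" ] && [ "${level}" != "write" ]; then
            echo "::error::Only collaborators with write access may change workflows or build scripts."
            exit 1
          fi
```

The step fails closed: if the API call errors out, the shell exits non-zero and the check goes red.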

Test plan

  • Open this draft PR and watch the checks list — only valid checks should run; the gated ones should not appear (or appear as skipped).
  • Confirm Compile & Hygiene, Linux/macOS/Windows × Browser/Electron/Remote, Linux / CLI, and Copilot - Check Telemetry all get past actions/checkout (they may still fail later for other reasons — that's a separate problem we want to see).
  • Confirm chat-lib tests and Check metadata still pass.
  • Verify gated workflows show no run for this PR head SHA.

🤖 Generated with Claude Code

Tajudeen and others added 7 commits May 24, 2026 10:19
The repo's only LFS-tracked files are copilot simulation-cache sqlite
databases under extensions/copilot/test/simulation/cache/. Their LFS
objects are not present on our LFS server (all return 404), which makes
every `actions/checkout@v6` with `lfs: true` fail before any real work.

The Linux/macOS/Windows electron/browser/remote tests and the Linux CLI
Rust tests don't touch those files, so we can safely set `lfs: false` to
unblock checkout. The copilot simulation jobs themselves are addressed
separately in a follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nt jobs

- Compile & Hygiene and Copilot - Check Telemetry don't touch the LFS-
  tracked simulation cache, so switch `lfs: true` to `lfs: false` to
  unblock checkout (our LFS server returns 404 for those objects).
- Copilot - Check Test Cache, Copilot - Test (Linux), and Copilot - Test
  (Windows) genuinely open the sqlite simulation databases via
  cache-cli check / simulate-ci. Without LFS data they cannot work, so
  gate them off on our fork (`if: github.repository_owner != 'OpenCortexIDE'`)
  rather than masking the missing data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Monaco Editor is published from microsoft/vscode; this check belongs to
that publishing flow and isn't relevant for downstream forks. It also
hit the LFS-404 issue, which the gate makes moot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This workflow queries microsoft/vscode collaborator permissions and
enforces upstream's bot allow-list. It returns 403 on our token and
references identities (vs-code-engineering[bot], etc.) that don't apply
to this fork. Engineering-system changes here are governed by normal PR
review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This check enforces upstream VS Code's policy around versioned proposed
APIs (vscode.proposed.*.d.ts). We don't publish proposed APIs from this
fork and don't want to block PRs that touch d.ts files inherited from an
upstream sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The workflow uploads to hediet-screenshots.azurewebsites.net (upstream's
Azure-hosted screenshot diff service) using a token derived from OIDC.
Our OIDC identity is not authorized at that endpoint (403 on every run).
Until we run an equivalent screenshot service, skip the workflow on this
fork rather than fail every PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This workflow targets `runs-on: vscode-large-runners`, a self-hosted
runner pool that exists only in the microsoft/vscode org. On our fork
the job sits queued for 24h and is then auto-failed by GitHub.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Pterjudin Pterjudin marked this pull request as ready for review May 24, 2026 09:24
@Pterjudin Pterjudin merged commit a5a6126 into main May 24, 2026
14 of 25 checks passed
@Pterjudin Pterjudin deleted the fix/pr-checks-2026-05-24 branch May 24, 2026 09:24