@blink-so blink-so bot commented Sep 15, 2025

Summary

  • Add tests/schema.sql and tests/seed.sql with a small, fixed dataset covering:
    • Status flip (In Progress → Done)
    • Labels change
    • Assignee add
    • Deletion event
    • Snapshot rows consistent with changes
  • Add .github/workflows/evals.yml to create a fresh Neon branch per run, seed it, and run evals if present
  • Add Vitest config and initial scenario tests (tests/scenario.spec.ts)
    • Tests are skipped locally if DATABASE_URL is not set
    • In CI, they run against the ephemeral branch seeded above
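
The skip-when-unset behavior above could be sketched roughly as follows. This is an illustration only, not the contents of tests/scenario.spec.ts: the helper name `shouldRunScenarioTests` is hypothetical, treating a blank value as "not set" is an assumption, and the Vitest wiring appears only in a comment so the snippet has no dependencies.

```typescript
// Hypothetical helper: decide whether the DB-backed scenario tests should run.
// Assumption: an empty or whitespace-only DATABASE_URL counts as "not set".
type Env = Record<string, string | undefined>;

function shouldRunScenarioTests(env: Env): boolean {
  const url = env["DATABASE_URL"];
  return typeof url === "string" && url.trim().length > 0;
}

// In a Vitest spec this could gate the whole suite, for example:
//   describe.skipIf(!shouldRunScenarioTests(process.env))("scenarios", () => { ... });
```

Keeping the decision in a plain function means the gate itself can be checked without a database or a test runner.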

Repo settings

  • Repository secret: NEON_API_KEY
  • Repository variable: NEON_PROJECT_ID
  • Optional variables (defaults ok): NEON_DATABASE=neondb, NEON_ROLE=neondb_owner

Notes

  • Uses neondatabase/create-branch-action@v6 to create the branch and neondatabase/delete-branch-action@v3 for cleanup
  • Uses neonctl to generate a DATABASE_URL for the created branch
  • Seeds via psql with ON_ERROR_STOP

Updates

  • Fix invalid workflow expression by proxying secrets/vars through job env and using env.* in all if conditions
  • Add temporary debug step (BRANCH_NAME, neonctl --version) and assert non-empty DATABASE_URL before seeding/tests
  • Make neonctl connection-string handling resilient to JSON or raw string output; fall back when jq parsing fails
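
The JSON-or-raw fallback and the non-empty assertion can be illustrated with a small sketch. The workflow itself does this in shell with jq; the TypeScript below only mirrors the logic. The function name `extractConnectionString` is hypothetical, and the two key spellings are an assumption taken from the "connection_string vs connectionString" commit title in this PR.

```typescript
// Hypothetical helper: pull a connection string out of CLI output that may be
// a JSON object, a JSON string, or a bare URI.
function extractConnectionString(output: string): string {
  const trimmed = output.trim();
  let candidate = trimmed;
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (typeof parsed === "string") {
      candidate = parsed;
    } else if (parsed !== null && typeof parsed === "object") {
      // Assumption: either key spelling may appear in the JSON output.
      const obj = parsed as Record<string, unknown>;
      const value = obj["connection_string"] ?? obj["connectionString"];
      candidate = typeof value === "string" ? value : "";
    }
  } catch {
    // Not JSON: fall back to treating the raw output as the connection string.
  }
  candidate = candidate.trim();
  if (candidate.length === 0) {
    // Mirrors the "assert non-empty DATABASE_URL" step before seeding/tests.
    throw new Error("DATABASE_URL is empty; refusing to seed or run tests");
  }
  return candidate;
}
```

Failing fast on an empty value keeps a misparsed CLI response from surfacing later as a confusing psql or test error.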

Next PR

  • Add Vitest-based contract tests that exercise tool usage (db_schema/db_query), SQL structure, and output semantics

Co-authored by Matt Vollmer

blink-so bot and others added 6 commits September 15, 2025 19:35
… data

- Add tests/schema.sql and tests/seed.sql
- Add .github/workflows/evals.yml using Neon ephemeral branches per run
- Guard eval run when no tests are present

Co-authored-by: mattvollmer <[email protected]>
…nup failure

- Only create/seed/run when NEON_API_KEY and NEON_PROJECT_ID are set
- Guard cleanup to run only if branch was created

Co-authored-by: mattvollmer <[email protected]>
- Add vitest and test script
- Add scenario tests using seeded Neon dataset (skipped locally without DATABASE_URL)

Co-authored-by: mattvollmer <[email protected]>
- Assert prompt includes tool names, safety, and 7-day default
- Mock pg to validate runQuery clamping and SELECT-only enforcement
- Ensure schema summary mentions expected tables

Co-authored-by: mattvollmer <[email protected]>
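
For readers unfamiliar with the two runQuery behaviors these tests validate, here is a rough sketch of what clamping and SELECT-only enforcement might look like. It is not the project's implementation: `clampLimit`, `isSelectOnly`, `MAX_ROWS`, and `DEFAULT_ROWS` are all hypothetical, and it assumes that "clamping" refers to a row limit.

```typescript
// Assumption: illustrative values, not the project's actual limits.
const MAX_ROWS = 1000;
const DEFAULT_ROWS = 100;

// Clamp a requested row limit into the range [1, MAX_ROWS].
function clampLimit(requested: number | undefined): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_ROWS;
  }
  return Math.min(MAX_ROWS, Math.max(1, Math.floor(requested)));
}

// Conservative SELECT-only check: a single statement that starts with SELECT.
// It also rejects valid queries containing ";" inside string literals; a real
// guard would more likely rely on a SQL parser or a read-only database role.
function isSelectOnly(sql: string): boolean {
  const stripped = sql.trim().replace(/;\s*$/, "");
  if (stripped.includes(";")) {
    return false; // more than one statement
  }
  return /^select\b/i.test(stripped);
}
```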
…_string vs connectionString)

Co-authored-by: mattvollmer <[email protected]>
@mattvollmer mattvollmer marked this pull request as ready for review September 15, 2025 20:05
blink-so bot and others added 4 commits September 15, 2025 20:08
… DATABASE_URL assertion

- Replace secrets/vars in if with env.* to satisfy workflow parser
- Add debug step (BRANCH_NAME, neonctl --version)
- Assert DATABASE_URL non-empty before seeding/tests

Co-authored-by: mattvollmer <[email protected]>
…string

- Try JSON first, fall back to raw string to avoid jq parse errors

Co-authored-by: mattvollmer <[email protected]>
- Remove accidental escaping and trailing backslashes inserted in previous edit

Co-authored-by: mattvollmer <[email protected]>
Co-authored-by: mattvollmer <[email protected]>
@mattvollmer mattvollmer merged commit 18a6754 into main Sep 15, 2025
1 check passed
@mattvollmer mattvollmer deleted the blink/evals-neon-ephemeral branch September 15, 2025 20:27