Skip to content

Stabilize activation integration tests and add guarded completion chat cache parity coverage#193

Open
Sohailm25 wants to merge 2 commits intohijohnnylin:mainfrom
Sohailm25:codex/cache-parity-and-activation-structure
Open

Stabilize activation integration tests and add guarded completion chat cache parity coverage#193
Sohailm25 wants to merge 2 commits intohijohnnylin:mainfrom
Sohailm25:codex/cache-parity-and-activation-structure

Conversation

@Sohailm25
Copy link
Copy Markdown

Problem

  • Integration tests for activation/all and activation/single were overly brittle because they asserted exact activation snapshots, which can fail under harmless backend or dependency drift.
  • completion_chat cache-parity coverage was missing for the additive feature steering path.
  • In the local CPU test env, HookedTransformer did not expose a native generate_stream, which prevented the completion-chat integration path from running as written.

Fix

  • Replaced exact activation snapshot assertions in the activation integration tests with stronger structural invariants:
    • token lengths match activation lengths
    • activations are nontrivial
    • max_value and max_value_index are internally consistent
    • expected token suffixes remain stable
  • Added a cache-parity regression test for deterministic additive feature steering in completion_chat.
  • Added a test-only generate_stream compatibility shim so the existing completion-chat integration tests can run in environments where the backend does not provide native streaming.
  • Guarded the cache-parity regression so it only runs when the backend exposes a native generate_stream implementation.

Testing

  • Ran:
    • python -m pytest -q tests/integration/test_completion_chat.py tests/integration/test_activation_all.py tests/integration/test_activation_single.py
  • Result:
    • 9 passed, 1 skipped
  • The skipped test is the new cache-parity regression in environments without native generate_stream.

@vercel
Copy link
Copy Markdown

vercel bot commented Mar 13, 2026

@Sohailm25 is attempting to deploy a commit to the Neuronpedia Team on Vercel.

A member of the Team first needs to authorize it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant