Add Continuity compaction strategy by nhicks00 · Pull Request #306 · mpfaffenberger/code_puppy

nhicks00 · 2026-04-24T19:36:05Z

Summary

Makes compaction_strategy=continuity the default compaction mode and adds the Continuity strategy, which preserves working context across long sessions. It combines predictive triggers, deterministic observation masking, durable task-scoped memory, archive retrieval hints, fallback summarization, recent raw-tail protection, and target trimming.

The existing truncation and summarization strategies remain available for users who prefer the legacy behavior.

Structure

Continuity is implemented as a built-in plugin under code_puppy/plugins/continuity_compaction/.

The core changes are limited to generic plugin extension points and their invocations:

register_config_keys: lets plugins expose config keys in /set help.
register_compaction_strategies: lets plugins register strategy names such as continuity.
compact_message_history: lets a plugin handle message-history compaction and return rebuilt history plus dropped-message bookkeeping.
/compact now routes through the unified compaction entrypoint with force=True, so plugin strategies can handle manual compaction without command-specific core logic.
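
To make the registration hooks concrete, here is a minimal sketch of how a plugin might contribute a strategy name and config keys. Only the hook names (register_compaction_strategies, register_config_keys) come from this PR. The registry itself (`_HOOKS`, `register_callback`, `fire`) and the `available_strategies` helper are hypothetical stand-ins and should not be read as the actual callbacks.py API.

```python
from collections import defaultdict
from typing import Any, Callable

# Hypothetical stand-in for callbacks.py: hook name -> registered callables.
_HOOKS: dict[str, list[Callable[..., Any]]] = defaultdict(list)


def register_callback(hook: str, fn: Callable[..., Any]) -> None:
    """Attach a plugin callable to a named hook."""
    _HOOKS[hook].append(fn)


def fire(hook: str, *args: Any, **kwargs: Any) -> list[Any]:
    """Call every callable registered for a hook and collect the results."""
    return [fn(*args, **kwargs) for fn in _HOOKS[hook]]


# --- plugin side (illustrative) ---
def _strategy_names() -> list[str]:
    return ["continuity"]


def _config_keys() -> list[str]:
    return ["continuity_compaction_target_ratio"]


register_callback("register_compaction_strategies", _strategy_names)
register_callback("register_config_keys", _config_keys)


# --- core side (illustrative) ---
BUILT_IN_STRATEGIES = ["truncation", "summarization"]


def available_strategies() -> list[str]:
    """Return built-in names plus whatever plugins registered."""
    names = list(BUILT_IN_STRATEGIES)
    for contributed in fire("register_compaction_strategies"):
        names.extend(n for n in contributed if n not in names)
    return names
```

The point of the design is that core only knows hook names; the strategy name "continuity" appears solely on the plugin side.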

Why Continuity

Practical Scenario	Truncation Risk	Summarization Risk	Continuity Solves It By
A long session starts with OAuth work, then later switches to dashboard work.	Early OAuth goals and constraints can be deleted completely.	Summaries can blur task boundaries and make old constraints look current.	Keeping original root, active task, task ledger, and task-scoped constraints separately.
The agent reads huge files and long test logs many times.	Old observations vanish, including useful failures.	Large outputs become prose that may omit exact tool/status/archive details.	Archiving bulky raw observations locally and replacing them with deterministic capsules.
A later bug depends on an old failed test or invalidated hypothesis.	The key failure may be outside the retained tail.	Stale hypotheses can survive as vague summary text.	Tracking validation status, accepted decisions, invalidated hypotheses, and archive hints.
The session runs through many compactions.	Repeated hard cuts erase session roots.	Repeated summaries can compound drift.	Refreshing one bounded durable memory snapshot while preserving recent raw context separately.
The next turn is likely to be large.	Compaction can happen too late.	Same threshold problem unless manually compacted.	Predicting next-turn growth and compacting before the projected turn crosses the soft trigger.
The user needs to tune behavior.	Mostly global threshold/protected-token controls.	Mostly summarization model/settings.	Exposing plugin-owned trigger, target, raw-tail, archive, retention, timeout, and semantic-model knobs.
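
The predictive-trigger row can be illustrated with a short sketch. The default ratios are the ones listed in this PR, but the decision logic is an assumption: it predicts the next turn as the mean of recent per-turn token growth and treats continuity_compaction_predictive_trigger_min_ratio as a floor below which prediction never fires. The function name and return labels are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def compaction_trigger(
    current_tokens: int,
    context_window: int,
    recent_turn_growth: Sequence[int],
    soft_ratio: float = 0.825,
    predictive_min_ratio: float = 0.725,
    emergency_ratio: float = 0.90,
) -> Optional[str]:
    """Return which trigger fires, or None if no compaction is needed."""
    if current_tokens >= emergency_ratio * context_window:
        return "emergency"
    if current_tokens >= soft_ratio * context_window:
        return "soft"
    if not recent_turn_growth:
        return None
    # Assumption: next-turn growth is the mean of recent per-turn growth.
    projected = current_tokens + mean(recent_turn_growth)
    # Assumption: the floor prevents early compaction of small histories.
    above_floor = current_tokens >= predictive_min_ratio * context_window
    if above_floor and projected >= soft_ratio * context_window:
        return "predictive"
    return None
```

With a 100,000-token window, a session at 75,000 tokens that has been growing by about 10,000 tokens per turn would compact now, one turn before it would otherwise cross the soft trigger.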

What Changed

Defaults get_compaction_strategy() to continuity when no strategy is configured or an invalid strategy is provided.
Adds code_puppy/plugins/continuity_compaction/ with:
- percentage-scaled trigger settings per model context window
- predictive compaction based on recent growth history
- default post-compaction target of 35% of the full context window
- final trim protection for the newest raw tail, using continuity_compaction_recent_raw_floor_ratio (20% by default)
- deterministic archiving and masking of old bulky tool-return observations
- durable memory snapshots for active task, task ledger, constraints, decisions, validation state, active files, next action, and archive hints
- optional semantic memory update using continuity_compaction_semantic_model, defaulting to the active chat model when unset and falling back to summarization_model only for non-agent utility calls
- archive metadata indexing, search, retrieval snippets, retention cleanup, and schema v1-to-v2 migration
Registers /continuity as a plugin command for memory status, task ledger, archive search/show, and diagnostics.
Adds live comparison tooling and docs for the practical compaction evaluation.
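
As an illustration of retention cleanup, the sketch below prunes archive entries by age and by count. The PR names the two settings (continuity_compaction_archive_retention_days and continuity_compaction_archive_retention_count) but does not say how they combine, so applying both limits together is an assumption, and the `ArchiveEntry` record and `entries_to_keep` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ArchiveEntry:
    """Hypothetical archive index record."""

    archive_id: str
    created_at: datetime


def entries_to_keep(
    entries: list[ArchiveEntry],
    now: datetime,
    retention_days: int = 30,
    retention_count: int = 500,
) -> list[ArchiveEntry]:
    """Keep entries newer than the cutoff, newest first, capped by count."""
    cutoff = now - timedelta(days=retention_days)
    fresh = [e for e in entries if e.created_at >= cutoff]
    fresh.sort(key=lambda e: e.created_at, reverse=True)
    return fresh[:retention_count]
```

Anything not returned would be eligible for deletion, which keeps the local archive bounded regardless of session length.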

Before / After Model

Before compaction, a long session can contain the current task, old completed task work, repeated file reads, large test output, tool-return logs, and the latest raw conversation tail all mixed together.

After Continuity compaction, the recent raw tail stays intact, old bulky tool returns are replaced in place with short deterministic capsules, the raw logs are archived locally, and one compact durable-memory snapshot is injected near the front of the rebuilt history. If masking alone cannot reach the 35% target, only the oldest already-masked region is summarized, keeping a recent archive capsule visible when practical. If the transcript is still above target after that, older compacted history is trimmed while preserving the latest user request, current error context, and newest raw tail.
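
A minimal sketch of the masking step described above: a bulky tool return is written to a local archive and replaced with a short capsule that is a pure function of the input. The capsule format, the content-hash archive ID, the size threshold, and the `mask_observation` name are all assumptions made for illustration; the plugin's real capsule layout is not shown in this PR description.

```python
import hashlib
from pathlib import Path


def mask_observation(
    tool_name: str,
    output: str,
    archive_dir: Path,
    max_inline_chars: int = 2000,
) -> str:
    """Archive a bulky tool return and hand back a deterministic capsule."""
    if len(output) <= max_inline_chars:
        return output  # small observations stay inline, unchanged
    # A content hash makes the archive ID, and so the capsule, reproducible.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    archive_dir.mkdir(parents=True, exist_ok=True)
    (archive_dir / f"{digest}.txt").write_text(output, encoding="utf-8")
    line_count = len(output.splitlines())
    return (
        f"[archived tool return] tool={tool_name} archive_id={digest} "
        f"chars={len(output)} lines={line_count}"
    )
```

Because no model call is involved, masking the same observation twice yields the same capsule, and the archive ID in the capsule is enough to retrieve the raw log later.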

User Impact

Continuity is now the default compaction strategy. Users can still set it explicitly with:

/set compaction_strategy=continuity

Legacy strategies remain available:

/set compaction_strategy=truncation
/set compaction_strategy=summarization

Useful related knobs include:

/set continuity_compaction_semantic_model=gpt-5.4
/set continuity_compaction_semantic_timeout_seconds=60
/set continuity_compaction_soft_trigger_ratio=0.825
/set continuity_compaction_predictive_trigger_min_ratio=0.725
/set continuity_compaction_target_ratio=0.35
/set continuity_compaction_recent_raw_floor_ratio=0.20
/set continuity_compaction_emergency_trigger_ratio=0.90
/set continuity_compaction_archive_retention_days=30
/set continuity_compaction_archive_retention_count=500
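
Since the trigger and target settings are percentage-scaled per model context window, it can help to see them as token counts. The sketch below assumes each ratio is simply multiplied by the context window; the helper name `continuity_budgets` is hypothetical, and the raw-floor ratio is left out because this description does not say what base it is measured against.

```python
def continuity_budgets(context_window: int, ratios: dict[str, float]) -> dict[str, int]:
    """Convert ratio settings into absolute token budgets for one model."""
    # round() avoids off-by-one results from binary floating-point products.
    return {name: round(context_window * ratio) for name, ratio in ratios.items()}


DEFAULT_RATIOS = {
    "soft_trigger": 0.825,
    "predictive_trigger_min": 0.725,
    "target": 0.35,
    "emergency_trigger": 0.90,
}

budgets = continuity_budgets(200_000, DEFAULT_RATIOS)
# For a 200,000-token window: soft trigger 165,000, predictive floor 145,000,
# target 70,000, emergency trigger 180,000.
```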

continuity_compaction_semantic_model controls the semantic memory LLM call. If unset, Continuity uses the active chat model from the current Code Puppy session. If no active model is available for a direct utility call, it falls back to summarization_model. Fallback summarization uses Code Puppy's existing summarization path, so it is still controlled by summarization_model.
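
The fallback order in the paragraph above amounts to a first-non-empty lookup. This is a sketch of that order only; the function name and the use of plain strings for model names are assumptions, not the plugin's actual code.

```python
from typing import Optional


def resolve_semantic_model(
    configured: Optional[str],
    active_chat_model: Optional[str],
    summarization_model: Optional[str],
) -> Optional[str]:
    """Pick the semantic-memory model: explicit setting, then active model, then summarizer."""
    for candidate in (configured, active_chat_model, summarization_model):
        if candidate:
            return candidate
    return None
```

For example, with continuity_compaction_semantic_model unset and an active session model available, the active model is used and summarization_model is never consulted for the semantic memory call.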

Validation

Focused pytest suite for Continuity, config, compaction routing, and related command coverage
- 329 passed
uv run ruff check ...
- passed locally
uv run ruff format --check ...
- passed locally
Broad uv run pytest --no-cov -q
- reached 9802 passed, 87 skipped, 1 xpassed before manual interrupt after the suite stalled late in an unrelated file-operations area

Provider Compatibility Note

The ChatGPT OAuth/Codex stream reconstruction fix is intentionally isolated from Continuity. It lives only in chatgpt_codex_client.py plus its focused test because that provider can stream text deltas and then finish with response.completed.output=[], which appears to pydantic-ai as ModelResponse(parts=[]). Continuity simply uses the active model through the normal model factory; it does not import ChatGPT OAuth code or depend on this provider patch.

For forks that do not include ChatGPT OAuth, such as a Walmart fork without that provider, this patch can be omitted while still porting the Continuity plugin and the generic compaction callback hooks.

Why Any Core Code Changed

Continuity's implementation is plugin-owned, but Code Puppy did not previously expose a compaction-strategy plugin lifecycle. The small core diff adds generic extension plumbing rather than Continuity-specific behavior:

callbacks.py adds hooks for plugin config keys, plugin compaction strategy registration, and plugin-owned message-history compaction.
agents/_compaction.py invokes the compaction hook before falling back to built-in truncation/summarization.
config.py accepts plugin-registered strategy names and defaults to continuity.
/compact, /show, and /set use the generic strategy/config discovery path so plugin strategies are usable and visible from the CLI.
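
The dispatch order described for agents/_compaction.py (plugin hook first, built-in strategies as fallback) can be sketched as follows. The function names, the string message type, and the toy compactors are all hypothetical; the only details taken from this PR are that the hook returns rebuilt history plus dropped-message bookkeeping and that /compact passes force=True.

```python
from typing import Callable, Optional

Message = str
# (rebuilt history, dropped messages)
CompactionResult = tuple[list[Message], list[Message]]
PluginCompactor = Callable[[list[Message], str, bool], Optional[CompactionResult]]


def builtin_truncation(messages: list[Message], keep_last: int = 2) -> CompactionResult:
    """Toy stand-in for the legacy truncation path."""
    return messages[-keep_last:], messages[:-keep_last]


def compact(
    messages: list[Message],
    strategy: str,
    plugin_compactors: list[PluginCompactor],
    force: bool = False,
) -> CompactionResult:
    """Let plugins claim the strategy first; otherwise use the built-in path."""
    for compactor in plugin_compactors:
        result = compactor(messages, strategy, force)
        if result is not None:  # None means "not my strategy"
            return result
    return builtin_truncation(messages)


def continuity_compactor(
    messages: list[Message], strategy: str, force: bool
) -> Optional[CompactionResult]:
    """Toy plugin: one memory line plus the newest message."""
    if strategy != "continuity" or not messages:
        return None
    memory = f"[durable memory: {len(messages) - 1} earlier messages]"
    return [memory, messages[-1]], messages[:-1]
```

Returning None to decline keeps the core loop free of strategy-specific branches, which is the property the maintainers' plugin guidelines ask for.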

Without those generic hooks, the plugin could register files and commands, but it could not become a selectable compaction strategy or participate in automatic/manual compaction.

Reviewer Notes

The legacy truncation and summarization strategies remain available through /set compaction_strategy=....
Local observation archives stay under Code Puppy's data directory and are bounded by retention settings.
A follow-up may be useful if maintainers want archived observation lookup to survive restored histories across fresh agent IDs.

mpfaffenberger · 2026-04-28T19:03:13Z

I love the concept and I'd love to have this feature, but structurally this P/R is not mergeable. This needs to be created as 100% a plugin.

Your agent wrote a justification for why it shouldn't be a plugin, but that contradicts everything that's in our AGENTS.md.

For a feature like this, the only feasible way to create changes in core is to add lifecycle hooks.

I would love to have this feature, so if you can propose a set of lifecycle hooks to add in callbacks.py (and corresponding invocations), that will allow this P/R to fit into the guidelines, we can move forward. Otherwise we can close this P/R.

nhicks00 mentioned this pull request Apr 24, 2026

Add Continuity compaction strategy (replaced) #305

Closed

nhicks00 added 28 commits April 29, 2026 21:09

Add threshold-driven compaction strategy

7a676ba

Add compaction precision probe test

1e1a530

Add live compaction QA eval harness

4f20efa

Make live compaction eval use legacy router

69dabce

Respect context window in live compaction eval

243a7a4

Document threshold compaction live eval results

9322b1b

Rebrand threshold compaction as continuity

a01d89e

Make continuity fallback summarization deterministic

198c179

Show continuity compaction status

ecaa44f

Add continuity task ledger memory

2ee13d3

Add semantic continuity task detection

d8da0f4

Add continuity memory v2

f99bb3f

Show continuity semantic memory status

d1a7554

Reduce continuity semantic memory timeouts

3866a3f

Increase continuity semantic timeout

f90c162

Document Continuity core integration rationale

b1627d3

Document Continuity compaction behavior

2237765

Use raw text request for Continuity memory

53f672c

Fix Continuity semantic memory extraction

54b16c7

Add Continuity predictive trigger floor

a26d8de

Make Continuity compaction target adaptive

f412d8c

Format Continuity compaction changes

c08fa3d

Trim Continuity history to displayed target

81c7ba7

Document Continuity target trimming

2576fb1

Use configured Continuity target

5c09429

Preserve recent raw tail during Continuity trim

c236733

Move Continuity compaction into plugin hooks

a7d6fc0

Remove unrelated Codex client change

e194e62

nhicks00 force-pushed the continuity-compaction branch from 0b64051 to e194e62 Compare April 30, 2026 02:10

nhicks00 added 5 commits April 29, 2026 21:25

Add Continuity semantic model setting

575bd69

Default Continuity semantic model to active chat model

e8882cc

Default compaction strategy to Continuity

13e7df4

Retry Continuity semantic memory on empty active response

b792859

Preserve ChatGPT streamed text for semantic memory

6c54f3f

nhicks00 force-pushed the continuity-compaction branch from 314aab0 to 6c54f3f Compare April 30, 2026 03:32

nhicks00 wants to merge 33 commits into mpfaffenberger:main from nhicks00:continuity-compaction
