Skip to content

Add SessionPersistence capability#176

Draft
DouweM wants to merge 2 commits intomainfrom
capability/session-persistence
Draft

Add SessionPersistence capability#176
DouweM wants to merge 2 commits intomainfrom
capability/session-persistence

Conversation

@DouweM
Copy link
Copy Markdown
Contributor

@DouweM DouweM commented Apr 10, 2026

Summary

  • Adds SessionPersistence capability that auto-saves conversation state after each agent run and auto-restores it at the start of the next run
  • Implements SessionStore protocol with two backends: InMemorySessionStore (dict-based, for testing) and FileSessionStore (JSON files using ModelMessagesTypeAdapter)
  • Includes PLAN.md documenting the design decisions

Design

The capability uses before_run to prepend loaded messages to ctx.messages and after_run to save the full history via result.all_messages(). Session IDs are auto-generated (UUID4) if not provided. An auto_save flag allows disabling automatic persistence.

FileSessionStore stores each session as a separate .json file, using Pydantic AI's ModelMessagesTypeAdapter for full-fidelity serialization.

Test plan

  • InMemorySessionStore: save/load, overwrite, list, delete, copy isolation (10 tests)
  • FileSessionStore: save/load, directory creation, JSON validity, list, delete, roundtrip (11 tests)
  • SessionPersistence capability: auto-ID, history restore, auto-save, multi-turn accumulation, from_spec, file store roundtrip (12 tests)
  • 100% code coverage
  • pyright strict: 0 errors
  • ruff lint + format: clean

Closes #31

🤖 Generated with Claude Code

DouweM and others added 2 commits April 2, 2026 05:28
…tory

Implements SessionStore protocol with InMemorySessionStore and FileSessionStore
backends. The SessionPersistence capability restores messages in before_run
and saves them in after_run, enabling multi-turn conversations across process
restarts.

Closes #31

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The `save()` method now accepts an optional `metadata` parameter that
is persisted alongside messages. `FileSessionStore` writes it to a
separate `.meta.json` file; `InMemorySessionStore` keeps it in a dict.
A new `load_metadata()` method retrieves it. The `SessionPersistence`
capability exposes a `metadata` field that is passed through on save.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment on lines +118 to +119
def _path_for(self, session_id: str) -> Path:
return self._directory / f'{session_id}.json'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Path traversal vulnerability in FileSessionStore allows reading/writing files outside the session directory

FileSessionStore._path_for and _meta_path_for directly interpolate session_id into a file path (self._directory / f'{session_id}.json') without any sanitization. A session_id containing path traversal sequences like ../../ resolves to a path outside the intended directory. For example, session_id="../../tmp/evil" would cause save() to write to /tmp/evil.json and load() to read from it. Since session_id can be explicitly user-provided (e.g. via from_spec(session_id=user_input)) and is passed to save, load, load_metadata, and delete, this enables arbitrary file read/write/delete (with .json or .meta.json suffix) anywhere the process has permissions.

Prompt for agents
The FileSessionStore._path_for and _meta_path_for methods (lines 118-122) build file paths by directly interpolating session_id into the path without validation. This enables path traversal attacks when session_id contains sequences like '../'.

To fix this, add validation to reject session IDs containing path separator characters or '..' sequences. One approach is to add a validation method that checks the session_id and is called at the entry point of every public method (save, load, load_metadata, delete), or alternatively validate within _path_for/_meta_path_for. For example:

def _validate_session_id(self, session_id: str) -> None:
    if not session_id or '/' in session_id or '\\' in session_id or '..' in session_id or session_id != session_id.strip():
        raise ValueError(f'Invalid session_id: {session_id!r}')

Alternatively, use Path.resolve() and verify the resolved path is still within self._directory. The validation should be applied in _path_for and _meta_path_for before constructing the path.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

"""Return all session IDs found in the directory."""
if not self._directory.exists():
return []
return sorted(p.stem for p in self._directory.glob('*.json') if not p.name.endswith('.meta.json'))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 FileSessionStore.list_sessions() silently drops session IDs ending with .meta

The list_sessions method at line 163 filters out files matching p.name.endswith('.meta.json') to exclude metadata files. However, if a session has an ID ending in .meta (e.g. "experiment.meta"), its data file would be named experiment.meta.json, which also matches the .meta.json exclusion filter. This means list_sessions() would not return that session ID, even though load("experiment.meta") would successfully return its data. The inconsistency could cause sessions to become effectively invisible to enumeration.

Example demonstrating the bug
store = FileSessionStore('/tmp/sessions')
store.save('experiment.meta', [msg])
print(store.list_sessions())  # [] — session is missing!
print(store.load('experiment.meta'))  # [msg] — but it loads fine
Suggested change
return sorted(p.stem for p in self._directory.glob('*.json') if not p.name.endswith('.meta.json'))
return sorted(p.stem for p in self._directory.glob('*.json') if not p.stem.endswith('.meta'))
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +234 to +252
async def before_run(
self,
ctx: RunContext[AgentDepsT],
) -> None:
"""Load saved messages and prepend them to the conversation."""
existing = self.store.load(self.session_id)
if existing:
ctx.messages[:0] = existing

async def after_run(
self,
ctx: RunContext[AgentDepsT],
*,
result: AgentRunResult[Any],
) -> AgentRunResult[Any]:
"""Save the full message history after a successful run."""
if self.auto_save:
self.store.save(self.session_id, result.all_messages(), metadata=self.metadata)
return result
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Synchronous store operations called from async hooks

The before_run and after_run hooks at src/pydantic_harness/session_persistence.py:234-252 are async methods, but they call synchronous store.load() and store.save() methods directly. For InMemorySessionStore this is fine, but FileSessionStore performs blocking file I/O (read_bytes, write_bytes, mkdir, unlink, glob) on the event loop thread. For high-throughput async applications, this could cause event loop stalls. This isn't a correctness bug — it works — but for production use with FileSessionStore, the blocking I/O should ideally be wrapped with asyncio.to_thread() or similar.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@DouweM
Copy link
Copy Markdown
Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #153 comment (PR was recreated)

Audit vs prior art: SessionPersistence

Worth adding now:

  • SqliteSessionStore backend (stdlib, atomic, efficient listing)
  • Session metadata (timestamp, turn count) alongside messages

Follow-up opportunities:

  • Redis backend, session branching/forking, expiry/GC

@DouweM
Copy link
Copy Markdown
Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #153 comment (PR was recreated)

Audit vs prior art: SessionPersistence

Worth adding now:

  • SqliteSessionStore backend (stdlib, atomic, efficient listing)
  • Session metadata (timestamp, turn count) alongside messages

Follow-up opportunities:

  • Redis backend, session branching/forking, expiry/GC

@DouweM DouweM marked this pull request as draft April 10, 2026 15:12
ctx: RunContext[AgentDepsT],
) -> None:
"""Load saved messages and prepend them to the conversation."""
existing = self.store.load(self.session_id)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self.session_id can't works in the case of multi user's conversations, maybe use getattr(ctx, session_id) or self.session_id should be better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Session persistence capability

2 participants