Conversation
…tory Implements SessionStore protocol with InMemorySessionStore and FileSessionStore backends. The SessionPersistence capability restores messages in before_run and saves them in after_run, enabling multi-turn conversations across process restarts. Closes #31 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The `save()` method now accepts an optional `metadata` parameter that is persisted alongside messages. `FileSessionStore` writes it to a separate `.meta.json` file; `InMemorySessionStore` keeps it in a dict. A new `load_metadata()` method retrieves it. The `SessionPersistence` capability exposes a `metadata` field that is passed through on save. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
| def _path_for(self, session_id: str) -> Path: | ||
| return self._directory / f'{session_id}.json' |
There was a problem hiding this comment.
🔴 Path traversal vulnerability in FileSessionStore allows reading/writing files outside the session directory
FileSessionStore._path_for and _meta_path_for directly interpolate session_id into a file path (self._directory / f'{session_id}.json') without any sanitization. A session_id containing path traversal sequences like ../../ resolves to a path outside the intended directory. For example, session_id="../../tmp/evil" would cause save() to write to /tmp/evil.json and load() to read from it. Since session_id can be explicitly user-provided (e.g. via from_spec(session_id=user_input)) and is passed to save, load, load_metadata, and delete, this enables arbitrary file read/write/delete (with .json or .meta.json suffix) anywhere the process has permissions.
Prompt for agents
The FileSessionStore._path_for and _meta_path_for methods (lines 118-122) build file paths by directly interpolating session_id into the path without validation. This enables path traversal attacks when session_id contains sequences like '../'.
To fix this, add validation to reject session IDs containing path separator characters or '..' sequences. One approach is to add a validation method that checks the session_id and is called at the entry point of every public method (save, load, load_metadata, delete), or alternatively validate within _path_for/_meta_path_for. For example:
def _validate_session_id(self, session_id: str) -> None:
if not session_id or '/' in session_id or '\\' in session_id or '..' in session_id or session_id != session_id.strip():
raise ValueError(f'Invalid session_id: {session_id!r}')
Alternatively, use Path.resolve() and verify the resolved path is still within self._directory. The validation should be applied in _path_for and _meta_path_for before constructing the path.
Was this helpful? React with 👍 or 👎 to provide feedback.
| """Return all session IDs found in the directory.""" | ||
| if not self._directory.exists(): | ||
| return [] | ||
| return sorted(p.stem for p in self._directory.glob('*.json') if not p.name.endswith('.meta.json')) |
There was a problem hiding this comment.
🟡 FileSessionStore.list_sessions() silently drops session IDs ending with .meta
The list_sessions method at line 163 filters out files matching p.name.endswith('.meta.json') to exclude metadata files. However, if a session has an ID ending in .meta (e.g. "experiment.meta"), its data file would be named experiment.meta.json, which also matches the .meta.json exclusion filter. This means list_sessions() would not return that session ID, even though load("experiment.meta") would successfully return its data. The inconsistency could cause sessions to become effectively invisible to enumeration.
Example demonstrating the bug
store = FileSessionStore('/tmp/sessions')
store.save('experiment.meta', [msg])
print(store.list_sessions()) # [] — session is missing!
print(store.load('experiment.meta')) # [msg] — but it loads fine| return sorted(p.stem for p in self._directory.glob('*.json') if not p.name.endswith('.meta.json')) | |
| return sorted(p.stem for p in self._directory.glob('*.json') if not p.stem.endswith('.meta')) |
Was this helpful? React with 👍 or 👎 to provide feedback.
| async def before_run( | ||
| self, | ||
| ctx: RunContext[AgentDepsT], | ||
| ) -> None: | ||
| """Load saved messages and prepend them to the conversation.""" | ||
| existing = self.store.load(self.session_id) | ||
| if existing: | ||
| ctx.messages[:0] = existing | ||
|
|
||
| async def after_run( | ||
| self, | ||
| ctx: RunContext[AgentDepsT], | ||
| *, | ||
| result: AgentRunResult[Any], | ||
| ) -> AgentRunResult[Any]: | ||
| """Save the full message history after a successful run.""" | ||
| if self.auto_save: | ||
| self.store.save(self.session_id, result.all_messages(), metadata=self.metadata) | ||
| return result |
There was a problem hiding this comment.
🚩 Synchronous store operations called from async hooks
The before_run and after_run hooks at src/pydantic_harness/session_persistence.py:234-252 are async methods, but they call synchronous store.load() and store.save() methods directly. For InMemorySessionStore this is fine, but FileSessionStore performs blocking file I/O (read_bytes, write_bytes, mkdir, unlink, glob) on the event loop thread. For high-throughput async applications, this could cause event loop stalls. This isn't a correctness bug — it works — but for production use with FileSessionStore, the blocking I/O should ideally be wrapped with asyncio.to_thread() or similar.
Was this helpful? React with 👍 or 👎 to provide feedback.
Audit vs prior art: SessionPersistenceWorth adding now:
Follow-up opportunities:
|
Audit vs prior art: SessionPersistenceWorth adding now:
Follow-up opportunities:
|
| ctx: RunContext[AgentDepsT], | ||
| ) -> None: | ||
| """Load saved messages and prepend them to the conversation.""" | ||
| existing = self.store.load(self.session_id) |
There was a problem hiding this comment.
The self.session_id can't works in the case of multi user's conversations, maybe use getattr(ctx, session_id) or self.session_id should be better
Summary
SessionPersistencecapability that auto-saves conversation state after each agent run and auto-restores it at the start of the next runSessionStoreprotocol with two backends:InMemorySessionStore(dict-based, for testing) andFileSessionStore(JSON files usingModelMessagesTypeAdapter)Design
The capability uses
before_runto prepend loaded messages toctx.messagesandafter_runto save the full history viaresult.all_messages(). Session IDs are auto-generated (UUID4) if not provided. Anauto_saveflag allows disabling automatic persistence.FileSessionStorestores each session as a separate.jsonfile, using Pydantic AI'sModelMessagesTypeAdapterfor full-fidelity serialization.Test plan
InMemorySessionStore: save/load, overwrite, list, delete, copy isolation (10 tests)FileSessionStore: save/load, directory creation, JSON validity, list, delete, roundtrip (11 tests)SessionPersistencecapability: auto-ID, history restore, auto-save, multi-turn accumulation,from_spec, file store roundtrip (12 tests)Closes #31
🤖 Generated with Claude Code