feat: add Memanto vs Mem0 benchmark suite (bounty #639)#766
Conversation
📝 WalkthroughWalkthroughAdds two complete benchmark suite implementations comparing Memanto and Mem0 across eight performance dimensions: one in ChangesExamples Benchmark Suite
Projects Benchmark Suite
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (1)
examples/benchmarks/memanto_vs_mem0/README.md (1)
115-115: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winAdd language identifier to code block.
Markdown code fences should specify a language for syntax highlighting. The ASCII architecture diagram should use
```textor```plaintext.🔧 Proposed fix
-``` +```text benchmark_runner.py # Main entry point🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/benchmarks/memanto_vs_mem0/README.md` at line 115, The Markdown code fence containing the ASCII architecture diagram (which includes the benchmark_runner.py entry) is missing a language identifier for syntax highlighting. Locate the opening triple backticks before the architecture diagram content and add the language identifier `text` or `plaintext` after the backticks to enable proper syntax highlighting in the rendered Markdown.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py`:
- Around line 61-88: The TestResult class and MetricSample class are missing
required benchmark metrics to satisfy submission criteria. Add the following to
the TestResult class: a p95_duration_ms property that calculates the 95th
percentile of successful metric durations using statistics.quantiles, fields to
track tokens_ingested and tokens_retrieved (sum across successful metrics), and
a retrieval_accuracy field with a corresponding property to calculate it from
the metrics. Update the MetricSample class to include tokens_count and
is_retrieved fields to support tracking token and retrieval data at the
individual operation level, which will allow the TestResult class to aggregate
these metrics correctly.
- Around line 239-246: The Memanto benchmark is using synthetic hardcoded
vectors (created with [0.1 + i*0.01]*128 and [0.15]*128 patterns) for storage
and search operations, while the Mem0 benchmark ingests raw text and performs
provider-based embedding. This makes the workloads non-equivalent and creates an
unfair comparison. To fix this, modify the Memanto benchmark path to use actual
embeddings from the same text content (TECHNICAL_LOGS) that Mem0 uses, rather
than synthetic vectors. This ensures both systems are performing the same
embedding task and allows for a controlled comparison of their core
functionality.
- Around line 221-235: The _test_crud method has two issues: the update
operation is incorrectly calling self.client.vectors.create instead of the
actual update method (around line 231), and the delete operation is a hardcoded
MetricSample placeholder instead of actually calling the delete method (line
234). Fix this by replacing the second create call with the appropriate
self.client.vectors.update method call using self._measure, and replace the
hardcoded delete MetricSample with an actual measured call to
self.client.vectors.delete to ensure the CRUD benchmark accurately tests all
operations.
- Around line 582-588: The benchmark execution for MemantoBenchmark and
Mem0Benchmark is currently sequential (Memanto runs first, then Mem0), but the
requirements specify simultaneous execution to avoid temporal and environmental
drift. Refactor the code to run both MemantoBenchmark(config).run_all() and
Mem0Benchmark(config).run_all() concurrently using Python's threading or
concurrent.futures module, ensuring both benchmarks execute in parallel and
their results are properly collected into memanto_results and mem0_results
respectively before proceeding.
- Around line 293-311: The _test_large_scale method calls
self.client.vectors.create directly in a loop without error handling, so any
transient API error will crash the entire test suite instead of recording a
failed result and continuing. Wrap the vector creation calls (the loop starting
with self.client.vectors.create around line 300) in a try-except block to catch
any exceptions, record the failure as a metric sample with the error details,
and allow the loop to continue testing other batch sizes. Follow the same error
handling pattern used in the _measure method calls to ensure consistent behavior
across all test operations.
- Around line 386-468: All test methods (_test_crud, _test_semantic_search,
_test_temporal_recall, _test_multi_turn, _test_persistence, _test_large_scale,
_test_structured, _test_conflict) initialize TestResult with TestStatus.PASS but
never check if any appended metrics have success=False, so failed operations go
unreported. Add logic to each test method to iterate through r.metrics after all
measurements are appended and downgrade r.status from PASS to FAIL (or
appropriate failure status) if any metric has success=False before returning r.
- Around line 347-365: The `vector_store` configuration within the
`_init_memory` method is hardcoded to use only localhost defaults for Qdrant
connection, which fails in CI and cloud environments. Extend the vector_store
config dictionary to pull Qdrant connection parameters (host, port, url,
api_key, and path) from environment variables using the same pattern already
established in this file for OpenAI configuration (e.g., using os.getenv with
sensible defaults). Add these environment variable mappings to the vector_store
config alongside the existing collection_name and embedding_model_dims
parameters.
In `@examples/benchmarks/memanto_vs_mem0/README.md`:
- Around line 86-102: The README contains incorrect field names in the JSON
report schema example. The documented fields `memanto_avg_duration_ms` and
`mem0_avg_duration_ms` (with `_ms` suffix) do not match the actual field names
produced by the benchmark_runner.py code, which outputs `memanto_avg_duration`
and `mem0_avg_duration` (without the `_ms` suffix). Update the JSON example in
the README to remove the `_ms` suffix from both duration field names in the
summary section to match the actual code output.
In `@examples/benchmarks/memanto_vs_mem0/requirements.txt`:
- Around line 1-8: The requirements.txt file uses minimum version constraints
with >= operator (e.g., memanto>=0.2.0, mem0ai>=2.0.0, openai>=1.0.0, etc.)
which allows different dependency versions to be installed across different
environments and dates, compromising reproducibility of benchmark results.
Replace all >= constraints with exact version pinning using == operator for each
dependency including memanto, mem0ai, moorcheh-sdk, openai, pydantic, rich,
httpx, and python-dotenv to ensure deterministic and reproducible benchmark
results.
- Around line 1-8: The requirements.txt file contains transitive dependencies
with known vulnerabilities that need to be explicitly constrained to safe
versions. Add two new lines to the requirements.txt file to pin vulnerable
dependencies: pyjwt to version 2.13.0 or higher to mitigate CVE-2026-48526, and
python-multipart to version 0.0.18 or higher to mitigate CVE-2024-53981. These
constraints should be added after the existing direct dependencies to ensure
safe versions are installed regardless of what versions are pulled in by
memanto.
---
Nitpick comments:
In `@examples/benchmarks/memanto_vs_mem0/README.md`:
- Line 115: The Markdown code fence containing the ASCII architecture diagram
(which includes the benchmark_runner.py entry) is missing a language identifier
for syntax highlighting. Locate the opening triple backticks before the
architecture diagram content and add the language identifier `text` or
`plaintext` after the backticks to enable proper syntax highlighting in the
rendered Markdown.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: c87a3658-b882-4c3e-97b2-5b4e788de047
📒 Files selected for processing (4)
examples/benchmarks/memanto_vs_mem0/.env.exampleexamples/benchmarks/memanto_vs_mem0/README.mdexamples/benchmarks/memanto_vs_mem0/benchmark_runner.pyexamples/benchmarks/memanto_vs_mem0/requirements.txt
| class MetricSample: | ||
| operation: str | ||
| duration_ms: float | ||
| success: bool | ||
| details: str = "" | ||
|
|
||
|
|
||
| @dataclass | ||
| class TestResult: | ||
| name: str | ||
| description: str | ||
| status: TestStatus | ||
| metrics: List[MetricSample] = field(default_factory=list) | ||
| error: Optional[str] = None | ||
|
|
||
| @property | ||
| def avg_duration_ms(self) -> float: | ||
| if not self.metrics: | ||
| return 0.0 | ||
| durations = [m.duration_ms for m in self.metrics if m.success] | ||
| return statistics.mean(durations) if durations else 0.0 | ||
|
|
||
| @property | ||
| def success_rate(self) -> float: | ||
| if not self.metrics: | ||
| return 0.0 | ||
| return sum(1 for m in self.metrics if m.success) / len(self.metrics) | ||
|
|
There was a problem hiding this comment.
Benchmark result schema misses required metrics (p95, tokens, retrieval accuracy).
The current model/report only tracks average duration and success rate. The linked issue requires quantifiable p95 latency, token ingest/retrieval counts, and retrieval-accuracy scoring, so the output currently cannot satisfy the submission criteria.
Also applies to: 540-556
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 61 -
88, The TestResult class and MetricSample class are missing required benchmark
metrics to satisfy submission criteria. Add the following to the TestResult
class: a p95_duration_ms property that calculates the 95th percentile of
successful metric durations using statistics.quantiles, fields to track
tokens_ingested and tokens_retrieved (sum across successful metrics), and a
retrieval_accuracy field with a corresponding property to calculate it from the
metrics. Update the MetricSample class to include tokens_count and is_retrieved
fields to support tracking token and retrieval data at the individual operation
level, which will allow the TestResult class to aggregate these metrics
correctly.
| def _test_crud(self, ns: str) -> TestResult: | ||
| r = TestResult("CRUD Operations", "Create, read, update, delete memories", TestStatus.PASS) | ||
| m = self._measure("create", self.client.vectors.create, | ||
| vector=[0.1]*128, metadata={"text": "test", "type": "crud"}, namespace=ns) | ||
| r.metrics.append(m) | ||
| if not m.success: | ||
| r.status = TestStatus.FAIL | ||
| m = self._measure("search", self.client.vectors.similarity_search, | ||
| vector=[0.1]*128, namespace=ns, limit=10) | ||
| r.metrics.append(m) | ||
| m = self._measure("update", self.client.vectors.create, | ||
| vector=[0.2]*128, metadata={"text": "updated", "type": "crud"}, namespace=ns) | ||
| r.metrics.append(m) | ||
| r.metrics.append(MetricSample("delete", 0, True, "N/A - TTL-based cleanup")) | ||
| return r |
There was a problem hiding this comment.
Memanto CRUD benchmark does not execute real update/delete operations.
update is another create call (Line 231), and delete is hardcoded as successful placeholder (Line 234). This makes the CRUD dimension non-comparable and overstates capability.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 221 -
235, The _test_crud method has two issues: the update operation is incorrectly
calling self.client.vectors.create instead of the actual update method (around
line 231), and the delete operation is a hardcoded MetricSample placeholder
instead of actually calling the delete method (line 234). Fix this by replacing
the second create call with the appropriate self.client.vectors.update method
call using self._measure, and replace the hardcoded delete MetricSample with an
actual measured call to self.client.vectors.delete to ensure the CRUD benchmark
accurately tests all operations.
| for i, mem in enumerate(TECHNICAL_LOGS[:5]): | ||
| m = self._measure(f"store_{i}", self.client.vectors.create, | ||
| vector=[0.1 + i*0.01]*128, | ||
| metadata={"text": mem, "type": "semantic"}, namespace=ns) | ||
| r.metrics.append(m) | ||
| m = self._measure("search_error", self.client.vectors.similarity_search, | ||
| vector=[0.15]*128, namespace=ns, limit=5) | ||
| r.metrics.append(m) |
There was a problem hiding this comment.
Workloads are not equivalent across systems, so comparison is not controlled.
Memanto path uses synthetic vectors, while Mem0 path ingests raw text and performs provider embedding. That changes the task itself and biases both latency and retrieval outcomes.
Also applies to: 402-406
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 239 -
246, The Memanto benchmark is using synthetic hardcoded vectors (created with
[0.1 + i*0.01]*128 and [0.15]*128 patterns) for storage and search operations,
while the Mem0 benchmark ingests raw text and performs provider-based embedding.
This makes the workloads non-equivalent and creates an unfair comparison. To fix
this, modify the Memanto benchmark path to use actual embeddings from the same
text content (TECHNICAL_LOGS) that Mem0 uses, rather than synthetic vectors.
This ensures both systems are performing the same embedding task and allows for
a controlled comparison of their core functionality.
| def _test_large_scale(self, ns: str) -> TestResult: | ||
| r = TestResult("Large-scale Retrieval", "Performance at scale", TestStatus.PASS) | ||
| for batch_size in self.config.batch_sizes: | ||
| start = time.perf_counter() | ||
| for i in range(batch_size): | ||
| self.client.vectors.create( | ||
| vector=[0.1 + (i % 10)*0.01]*128, | ||
| metadata={"text": f"Batch {i} of {batch_size}", | ||
| "batch": batch_size, "index": i, | ||
| "type": "large_scale"}, namespace=ns) | ||
| dur = (time.perf_counter() - start) * 1000 | ||
| r.metrics.append(MetricSample(f"batch_store_{batch_size}", | ||
| round(dur, 2), True, | ||
| f"Stored {batch_size} in {dur:.0f}ms")) | ||
| m = self._measure(f"batch_search_{batch_size}", | ||
| self.client.vectors.similarity_search, | ||
| vector=[0.15]*128, namespace=ns, limit=10) | ||
| r.metrics.append(m) | ||
| return r |
There was a problem hiding this comment.
Large-scale store loops can crash the full run on first API error.
Line 298 and Line 441 call external APIs directly inside loops without _measure/try-except. One transient provider error aborts the suite instead of recording a failed sample and continuing.
Also applies to: 436-449
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 293 -
311, The _test_large_scale method calls self.client.vectors.create directly in a
loop without error handling, so any transient API error will crash the entire
test suite instead of recording a failed result and continuing. Wrap the vector
creation calls (the loop starting with self.client.vectors.create around line
300) in a try-except block to catch any exceptions, record the failure as a
metric sample with the error details, and allow the loop to continue testing
other batch sizes. Follow the same error handling pattern used in the _measure
method calls to ensure consistent behavior across all test operations.
| def _init_memory(self): | ||
| from mem0 import Memory | ||
| config = { | ||
| "llm": {"provider": "openai", "config": { | ||
| "model": self.config.judge_model, | ||
| "api_key": self.config.openai_api_key, | ||
| "openai_base_url": self.config.openai_base_url, | ||
| }}, | ||
| "embedder": {"provider": "openai", "config": { | ||
| "model": self.config.embedding_model, | ||
| "api_key": self.config.openai_api_key, | ||
| "openai_base_url": self.config.openai_base_url, | ||
| }}, | ||
| "vector_store": {"provider": "qdrant", "config": { | ||
| "collection_name": f"benchmark_{int(time.time())}", | ||
| "embedding_model_dims": 1536, | ||
| }}, | ||
| } | ||
| self.memory = Memory.from_config(config) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's confirm the file and lines mentioned in the review
cat -n examples/benchmarks/memanto_vs_mem0/benchmark_runner.py | sed -n '347,365p'Repository: moorcheh-ai/memanto
Length of output: 1014
🏁 Script executed:
# Let's check if there are any environment variable patterns used elsewhere in this file
rg -n "os\.environ|getenv" examples/benchmarks/memanto_vs_mem0/benchmark_runner.pyRepository: moorcheh-ai/memanto
Length of output: 467
🏁 Script executed:
# Check how the benchmark_runner is initialized and if Qdrant connection details are handled elsewhere
rg -n "qdrant|Qdrant" examples/benchmarks/memanto_vs_mem0/benchmark_runner.pyRepository: moorcheh-ai/memanto
Length of output: 130
🏁 Script executed:
# Look for any configuration or setup files that might define Qdrant defaults
fd -t f "(config|qdrant|mem0)" examples/benchmarks/memanto_vs_mem0/Repository: moorcheh-ai/memanto
Length of output: 45
🌐 Web query:
Mem0 Memory library Qdrant vector_store configuration requirements endpoint auth
💡 Result:
To configure the Mem0 library to use Qdrant as a vector store, you must define a dictionary containing the vector_store provider and its specific configuration parameters [1][2][3]. The configuration is validated to ensure required fields are present; specifically, you must provide either host and port, a full URL (with API key), or a local path [4]. Configuration Parameters: The following parameters are supported for Qdrant in Mem0 [3][5]: - collection_name: The name of the collection (default: "mem0"). - embedding_model_dims: The dimensions of your embedding model (default: 1536). - host: The Qdrant server host address. - port: The Qdrant server port. - path: The file system path for local persistent storage (default: "/tmp/qdrant"). - url: The full URL for a remote Qdrant server. - api_key: The API key for authenticated Qdrant instances. - https: A boolean to force HTTPS connection; if set to None, it defaults to HTTPS when an API key is provided, or you can set False for plain HTTP [6]. - on_disk: Boolean to enable persistent storage (default: False). - client: Optional, allows passing an existing QdrantClient instance directly [4][3]. Example Configuration: config = { "vector_store": { "provider": "qdrant", "config": { "host": "localhost", "port": 6333, "collection_name": "my_memories" } } } Authentication and Connectivity: - When using a remote Qdrant instance, the url and api_key parameters are required [4]. - If you are using collection-scoped JWT authentication, note that some versions of Mem0 may require specific handling if the SDK fails to create a collection due to 401/403 errors, as the library sometimes assumes global access is available [7]. - Always verify that the port (default 6333) is reachable and that your embedding dimensions match the model you are using to avoid dimension mismatch errors [1][3].
Citations:
- 1: https://docs.mem0.ai/open-source/configuration
- 2: https://docs.mem0.ai/components/vectordbs/config
- 3: https://docs.mem0.ai/components/vectordbs/dbs/qdrant
- 4: https://github.com/mem0ai/mem0/blob/main/mem0/configs/vector_stores/qdrant.py
- 5: https://github.com/mem0ai/mem0/blob/06ee1b58/docs/components/vectordbs/dbs/qdrant.mdx
- 6: fix: expose Qdrant https option mem0ai/mem0#5380
- 7: ensureCollection throws 403 with Qdrant scoped JWTs — only catches 409, not 401/403 mem0ai/mem0#4355
Pull Qdrant endpoint/auth/path configuration from environment variables for reproducible deployments.
The vector_store config only specifies collection_name and embedding_model_dims, omitting host, port, url, api_key, and path. This will only work reliably with Qdrant running on localhost:6333 (development default). In CI, containerized, or cloud environments without a default local Qdrant instance, this will fail silently or behave unpredictably. Follow the pattern used elsewhere in this file (e.g., OPENAI_API_KEY, OPENAI_BASE_URL) and pull Qdrant connection details from environment variables with sensible defaults.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 347 -
365, The `vector_store` configuration within the `_init_memory` method is
hardcoded to use only localhost defaults for Qdrant connection, which fails in
CI and cloud environments. Extend the vector_store config dictionary to pull
Qdrant connection parameters (host, port, url, api_key, and path) from
environment variables using the same pattern already established in this file
for OpenAI configuration (e.g., using os.getenv with sensible defaults). Add
these environment variable mappings to the vector_store config alongside the
existing collection_name and embedding_model_dims parameters.
| def _test_crud(self, uid: str) -> TestResult: | ||
| r = TestResult("CRUD Operations", "Create, read, update, delete memories", TestStatus.PASS) | ||
| m = self._measure("add", self.memory.add, "Testing Mem0 benchmark suite", user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("get_all", self.memory.get_all, user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("search", self.memory.search, "testing benchmark", user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("update", self.memory.add, "Testing Mem0 benchmark suite - updated", user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("delete", self.memory.delete_all, user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_semantic_search(self, uid: str) -> TestResult: | ||
| r = TestResult("Semantic Search", "Find relevant memories by meaning", TestStatus.PASS) | ||
| for mem in TECHNICAL_LOGS[:5]: | ||
| m = self._measure("add", self.memory.add, mem, user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("search", self.memory.search, "connection pool exhausted", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_temporal_recall(self, uid: str) -> TestResult: | ||
| r = TestResult("Temporal Recall", "Time-aware memory retrieval", TestStatus.PASS) | ||
| for i in range(5): | ||
| m = self._measure(f"add_t{i}", self.memory.add, f"Memory at time {i}", user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("search_recent", self.memory.search, "Memory at time", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_multi_turn(self, uid: str) -> TestResult: | ||
| r = TestResult("Multi-turn Conversation", "Maintain context across turns", TestStatus.PASS) | ||
| for turn in CONVERSATION_TURNS[:5]: | ||
| m = self._measure("add", self.memory.add, turn, user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("context_retrieval", self.memory.search, "microservices migration", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_persistence(self, uid: str) -> TestResult: | ||
| r = TestResult("Cross-session Persistence", "Memory survives across sessions", TestStatus.PASS) | ||
| for i in range(3): | ||
| m = self._measure(f"add_session1_{i}", self.memory.add, f"Session 1 memory {i}", user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("cross_session", self.memory.search, "Session 1", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_large_scale(self, uid: str) -> TestResult: | ||
| r = TestResult("Large-scale Retrieval", "Performance at scale", TestStatus.PASS) | ||
| for batch_size in self.config.batch_sizes: | ||
| start = time.perf_counter() | ||
| for i in range(batch_size): | ||
| self.memory.add(f"Batch memory {i} of {batch_size}", user_id=uid) | ||
| dur = (time.perf_counter() - start) * 1000 | ||
| r.metrics.append(MetricSample(f"batch_store_{batch_size}", | ||
| round(dur, 2), True, | ||
| f"Stored {batch_size} in {dur:.0f}ms")) | ||
| m = self._measure(f"batch_search_{batch_size}", | ||
| self.memory.search, "Batch memory", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_structured(self, uid: str) -> TestResult: | ||
| r = TestResult("Structured Memory", "Store and retrieve typed data", TestStatus.PASS) | ||
| for data in STRUCTURED_DATA[:4]: | ||
| entry = f"{data['type']}: {data['key']} = {data['value']} ({data['env']})" | ||
| m = self._measure("add", self.memory.add, entry, user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("search", self.memory.search, "config max_connections", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r | ||
|
|
||
| def _test_conflict(self, uid: str) -> TestResult: | ||
| r = TestResult("Conflict Resolution", "Handle contradictory memories", TestStatus.PASS) | ||
| for text, _ in CONTRADICTORY_FACTS[:4]: | ||
| m = self._measure("add", self.memory.add, text, user_id=uid) | ||
| r.metrics.append(m) | ||
| m = self._measure("conflict_search", self.memory.search, "server count", user_id=uid) | ||
| r.metrics.append(m) | ||
| return r |
There was a problem hiding this comment.
Failed operations can still report PASS.
Line 387 initializes TestStatus.PASS, but these Mem0 tests never downgrade status when any metric has success=False; they just append failed metrics. This can produce false PASS outcomes and an incorrect winner.
Suggested pattern
class BaseBenchmark:
+ def _finalize_result(self, result: TestResult) -> TestResult:
+ if any(not m.success for m in result.metrics):
+ result.status = TestStatus.FAIL
+ return result def _test_crud(self, uid: str) -> TestResult:
r = TestResult("CRUD Operations", "Create, read, update, delete memories", TestStatus.PASS)
...
- return r
+ return self._finalize_result(r)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _test_crud(self, uid: str) -> TestResult: | |
| r = TestResult("CRUD Operations", "Create, read, update, delete memories", TestStatus.PASS) | |
| m = self._measure("add", self.memory.add, "Testing Mem0 benchmark suite", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("get_all", self.memory.get_all, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "testing benchmark", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("update", self.memory.add, "Testing Mem0 benchmark suite - updated", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("delete", self.memory.delete_all, user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_semantic_search(self, uid: str) -> TestResult: | |
| r = TestResult("Semantic Search", "Find relevant memories by meaning", TestStatus.PASS) | |
| for mem in TECHNICAL_LOGS[:5]: | |
| m = self._measure("add", self.memory.add, mem, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "connection pool exhausted", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_temporal_recall(self, uid: str) -> TestResult: | |
| r = TestResult("Temporal Recall", "Time-aware memory retrieval", TestStatus.PASS) | |
| for i in range(5): | |
| m = self._measure(f"add_t{i}", self.memory.add, f"Memory at time {i}", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search_recent", self.memory.search, "Memory at time", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_multi_turn(self, uid: str) -> TestResult: | |
| r = TestResult("Multi-turn Conversation", "Maintain context across turns", TestStatus.PASS) | |
| for turn in CONVERSATION_TURNS[:5]: | |
| m = self._measure("add", self.memory.add, turn, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("context_retrieval", self.memory.search, "microservices migration", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_persistence(self, uid: str) -> TestResult: | |
| r = TestResult("Cross-session Persistence", "Memory survives across sessions", TestStatus.PASS) | |
| for i in range(3): | |
| m = self._measure(f"add_session1_{i}", self.memory.add, f"Session 1 memory {i}", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("cross_session", self.memory.search, "Session 1", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_large_scale(self, uid: str) -> TestResult: | |
| r = TestResult("Large-scale Retrieval", "Performance at scale", TestStatus.PASS) | |
| for batch_size in self.config.batch_sizes: | |
| start = time.perf_counter() | |
| for i in range(batch_size): | |
| self.memory.add(f"Batch memory {i} of {batch_size}", user_id=uid) | |
| dur = (time.perf_counter() - start) * 1000 | |
| r.metrics.append(MetricSample(f"batch_store_{batch_size}", | |
| round(dur, 2), True, | |
| f"Stored {batch_size} in {dur:.0f}ms")) | |
| m = self._measure(f"batch_search_{batch_size}", | |
| self.memory.search, "Batch memory", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_structured(self, uid: str) -> TestResult: | |
| r = TestResult("Structured Memory", "Store and retrieve typed data", TestStatus.PASS) | |
| for data in STRUCTURED_DATA[:4]: | |
| entry = f"{data['type']}: {data['key']} = {data['value']} ({data['env']})" | |
| m = self._measure("add", self.memory.add, entry, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "config max_connections", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_conflict(self, uid: str) -> TestResult: | |
| r = TestResult("Conflict Resolution", "Handle contradictory memories", TestStatus.PASS) | |
| for text, _ in CONTRADICTORY_FACTS[:4]: | |
| m = self._measure("add", self.memory.add, text, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("conflict_search", self.memory.search, "server count", user_id=uid) | |
| r.metrics.append(m) | |
| return r | |
| def _test_crud(self, uid: str) -> TestResult: | |
| r = TestResult("CRUD Operations", "Create, read, update, delete memories", TestStatus.PASS) | |
| m = self._measure("add", self.memory.add, "Testing Mem0 benchmark suite", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("get_all", self.memory.get_all, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "testing benchmark", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("update", self.memory.add, "Testing Mem0 benchmark suite - updated", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("delete", self.memory.delete_all, user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_semantic_search(self, uid: str) -> TestResult: | |
| r = TestResult("Semantic Search", "Find relevant memories by meaning", TestStatus.PASS) | |
| for mem in TECHNICAL_LOGS[:5]: | |
| m = self._measure("add", self.memory.add, mem, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "connection pool exhausted", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_temporal_recall(self, uid: str) -> TestResult: | |
| r = TestResult("Temporal Recall", "Time-aware memory retrieval", TestStatus.PASS) | |
| for i in range(5): | |
| m = self._measure(f"add_t{i}", self.memory.add, f"Memory at time {i}", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search_recent", self.memory.search, "Memory at time", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_multi_turn(self, uid: str) -> TestResult: | |
| r = TestResult("Multi-turn Conversation", "Maintain context across turns", TestStatus.PASS) | |
| for turn in CONVERSATION_TURNS[:5]: | |
| m = self._measure("add", self.memory.add, turn, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("context_retrieval", self.memory.search, "microservices migration", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_persistence(self, uid: str) -> TestResult: | |
| r = TestResult("Cross-session Persistence", "Memory survives across sessions", TestStatus.PASS) | |
| for i in range(3): | |
| m = self._measure(f"add_session1_{i}", self.memory.add, f"Session 1 memory {i}", user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("cross_session", self.memory.search, "Session 1", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_large_scale(self, uid: str) -> TestResult: | |
| r = TestResult("Large-scale Retrieval", "Performance at scale", TestStatus.PASS) | |
| for batch_size in self.config.batch_sizes: | |
| start = time.perf_counter() | |
| for i in range(batch_size): | |
| self.memory.add(f"Batch memory {i} of {batch_size}", user_id=uid) | |
| dur = (time.perf_counter() - start) * 1000 | |
| r.metrics.append(MetricSample(f"batch_store_{batch_size}", | |
| round(dur, 2), True, | |
| f"Stored {batch_size} in {dur:.0f}ms")) | |
| m = self._measure(f"batch_search_{batch_size}", | |
| self.memory.search, "Batch memory", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_structured(self, uid: str) -> TestResult: | |
| r = TestResult("Structured Memory", "Store and retrieve typed data", TestStatus.PASS) | |
| for data in STRUCTURED_DATA[:4]: | |
| entry = f"{data['type']}: {data['key']} = {data['value']} ({data['env']})" | |
| m = self._measure("add", self.memory.add, entry, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("search", self.memory.search, "config max_connections", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) | |
| def _test_conflict(self, uid: str) -> TestResult: | |
| r = TestResult("Conflict Resolution", "Handle contradictory memories", TestStatus.PASS) | |
| for text, _ in CONTRADICTORY_FACTS[:4]: | |
| m = self._measure("add", self.memory.add, text, user_id=uid) | |
| r.metrics.append(m) | |
| m = self._measure("conflict_search", self.memory.search, "server count", user_id=uid) | |
| r.metrics.append(m) | |
| return self._finalize_result(r) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 386 -
468, All test methods (_test_crud, _test_semantic_search, _test_temporal_recall,
_test_multi_turn, _test_persistence, _test_large_scale, _test_structured,
_test_conflict) initialize TestResult with TestStatus.PASS but never check if
any appended metrics have success=False, so failed operations go unreported. Add
logic to each test method to iterate through r.metrics after all measurements
are appended and downgrade r.status from PASS to FAIL (or appropriate failure
status) if any metric has success=False before returning r.
| # Run Memanto | ||
| print("\n▶ Running Memanto benchmarks...") | ||
| memanto_results = MemantoBenchmark(config).run_all() | ||
|
|
||
| # Run Mem0 | ||
| print("\n▶ Running Mem0 benchmarks...") | ||
| mem0_results = Mem0Benchmark(config).run_all() |
There was a problem hiding this comment.
Suites run sequentially, not simultaneously as required by issue criteria.
The linked issue calls for simultaneous controlled execution; current orchestration runs Memanto then Mem0 serially, which introduces temporal/environment drift.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/benchmark_runner.py` around lines 582 -
588, The benchmark execution for MemantoBenchmark and Mem0Benchmark is currently
sequential (Memanto runs first, then Mem0), but the requirements specify
simultaneous execution to avoid temporal and environmental drift. Refactor the
code to run both MemantoBenchmark(config).run_all() and
Mem0Benchmark(config).run_all() concurrently using Python's threading or
concurrent.futures module, ensuring both benchmarks execute in parallel and
their results are properly collected into memanto_results and mem0_results
respectively before proceeding.
| ```json | ||
| { | ||
| "timestamp": "2026-06-22T16:54:32Z", | ||
| "summary": { | ||
| "memanto_score": 6, | ||
| "mem0_score": 2, | ||
| "winner": "Memanto", | ||
| "memanto_avg_duration_ms": 145.2, | ||
| "mem0_avg_duration_ms": 289.7 | ||
| }, | ||
| "results": { | ||
| "crud": { "name": "CRUD Operations", "status": "✅ PASS", ... }, | ||
| "semantic_search": { ... }, | ||
| ... | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
JSON report schema mismatch: field names in summary don't match code output.
The README documents memanto_avg_duration_ms and mem0_avg_duration_ms (lines 93-94), but the actual code (benchmark_runner.py lines 514-520) uses memanto_avg_duration and mem0_avg_duration (without the _ms suffix). Downstream tools parsing the JSON according to this documented schema will fail to find these fields.
📋 Proposed fix for JSON schema documentation
"summary": {
"memanto_score": 6,
"mem0_score": 2,
"winner": "Memanto",
- "memanto_avg_duration_ms": 145.2,
- "mem0_avg_duration_ms": 289.7
+ "memanto_avg_duration": 145.2,
+ "mem0_avg_duration": 289.7
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ```json | |
| { | |
| "timestamp": "2026-06-22T16:54:32Z", | |
| "summary": { | |
| "memanto_score": 6, | |
| "mem0_score": 2, | |
| "winner": "Memanto", | |
| "memanto_avg_duration_ms": 145.2, | |
| "mem0_avg_duration_ms": 289.7 | |
| }, | |
| "results": { | |
| "crud": { "name": "CRUD Operations", "status": "✅ PASS", ... }, | |
| "semantic_search": { ... }, | |
| ... | |
| } | |
| } | |
| ``` |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/README.md` around lines 86 - 102, The
README contains incorrect field names in the JSON report schema example. The
documented fields `memanto_avg_duration_ms` and `mem0_avg_duration_ms` (with
`_ms` suffix) do not match the actual field names produced by the
benchmark_runner.py code, which outputs `memanto_avg_duration` and
`mem0_avg_duration` (without the `_ms` suffix). Update the JSON example in the
README to remove the `_ms` suffix from both duration field names in the summary
section to match the actual code output.
| memanto>=0.2.0 | ||
| mem0ai>=2.0.0 | ||
| moorcheh-sdk>=1.3.0 | ||
| openai>=1.0.0 | ||
| pydantic>=2.0.0 | ||
| rich>=13.0.0 | ||
| httpx>=0.25.0 | ||
| python-dotenv>=1.0.0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
git ls-files examples/benchmarks/memanto_vs_mem0/Repository: moorcheh-ai/memanto
Length of output: 267
🏁 Script executed:
cat examples/benchmarks/memanto_vs_mem0/requirements.txtRepository: moorcheh-ai/memanto
Length of output: 190
🏁 Script executed:
find . -type f -name "*.md" | head -20 | xargs grep -l "bounty\|reproducib" 2>/dev/null || echo "No matches found"Repository: moorcheh-ai/memanto
Length of output: 126
🏁 Script executed:
cat examples/benchmarks/memanto_vs_mem0/README.mdRepository: moorcheh-ai/memanto
Length of output: 4746
🏁 Script executed:
cat examples/benchmarks/memanto_vs_mem0/benchmark_runner.py | head -50Repository: moorcheh-ai/memanto
Length of output: 1519
🏁 Script executed:
find . -type f \( -name "*.md" -o -name "*.txt" \) | xargs grep -i "reproducib" 2>/dev/null | head -20Repository: moorcheh-ai/memanto
Length of output: 174
Pin benchmark dependencies to exact versions for deterministic, reproducible results.
The reproducibility criterion in the bounty scoring matrix (15/100 points) emphasizes "plug-and-play setup, clean code." Using >= constraints means dependency resolution varies across different dates and environments, making benchmark results non-reproducible. Since the benchmark explicitly requires "exact same datasets" and "identical baseline constraints," exact version pinning with == ensures that anyone can reproduce your results and fairly compare against Mem0.
Suggested change
-memanto>=0.2.0
-mem0ai>=2.0.0
-moorcheh-sdk>=1.3.0
-openai>=1.0.0
-pydantic>=2.0.0
-rich>=13.0.0
-httpx>=0.25.0
-python-dotenv>=1.0.0
+memanto==0.2.0
+mem0ai==2.0.0
+moorcheh-sdk==1.3.0
+openai==1.0.0
+pydantic==2.0.0
+rich==13.0.0
+httpx==0.25.0
+python-dotenv==1.0.0📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| memanto>=0.2.0 | |
| mem0ai>=2.0.0 | |
| moorcheh-sdk>=1.3.0 | |
| openai>=1.0.0 | |
| pydantic>=2.0.0 | |
| rich>=13.0.0 | |
| httpx>=0.25.0 | |
| python-dotenv>=1.0.0 | |
| memanto==0.2.0 | |
| mem0ai==2.0.0 | |
| moorcheh-sdk==1.3.0 | |
| openai==1.0.0 | |
| pydantic==2.0.0 | |
| rich==13.0.0 | |
| httpx==0.25.0 | |
| python-dotenv==1.0.0 |
🧰 Tools
🪛 OSV Scanner (2.4.0)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2025-183)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-120)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-175)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-176)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-177)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-178)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-179)
[HIGH] 1-1: pyjwt 2.9.0: PyJWT accepts unknown crit header extensions
[HIGH] 1-1: pyjwt 2.9.0: PyJWKClient: missing scheme allowlist enables CVE-2024-21643-class SSRF + token forgery via file://, ftp://, data: schemes
[HIGH] 1-1: pyjwt 2.9.0: PyJWKClient unbounded JWKS endpoint requests via attacker-controlled kid values (DoS)
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Algorithm allow-list bypass when decoding with PyJWK / PyJWKClient keys
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Unauthenticated DoS via unbounded Base64URL decoding of unused payload segment in b64=false detached JWS
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Public-key JWK accepted as HMAC secret enables forged HS256 tokens when mixed families are allowed
[HIGH] 1-1: python-multipart 0.0.9: Denial of service (DoS) via deformation multipart/form-data boundary
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Quadratic-time querystring parsing with semicolon separators causes CPU denial of service
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Semicolon treated as querystring field separator enables parameter smuggling
[HIGH] 1-1: python-multipart 0.0.9: python-multipart affected by Denial of Service via large multipart preamble or epilogue data
[HIGH] 1-1: python-multipart 0.0.9: python-multipart has Denial of Service via unbounded multipart part headers
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Negative Content-Length in parse_form buffers the entire body in memory
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Content-Disposition parameter smuggling via RFC 2231/5987 extended parameters
[HIGH] 1-1: python-multipart 0.0.9: Python-Multipart has Arbitrary File Write via Non-Default Configuration
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2018-28)
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2023-74)
[HIGH] 1-1: requests 2.9.2: Requests vulnerable to .netrc credentials leak via malicious URLs
[HIGH] 1-1: requests 2.9.2: Requests Session object does not verify requests after making first request with verify=False
[HIGH] 1-1: requests 2.9.2: Requests has Insecure Temp File Reuse in its extract_zipped_paths() utility function
[HIGH] 1-1: requests 2.9.2: Unintended leak of Proxy-Authorization header in requests
[HIGH] 1-1: requests 2.9.2: Insufficiently Protected Credentials in Requests
[HIGH] 1-1: tqdm 4.9.0: undefined
(PYSEC-2017-74)
[HIGH] 1-1: tqdm 4.9.0: tqdm CLI arguments injection attack
[HIGH] 1-1: tqdm 4.9.0: TDQM Arbitrary Code Execution
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/requirements.txt` around lines 1 - 8, The
requirements.txt file uses minimum version constraints with >= operator (e.g.,
memanto>=0.2.0, mem0ai>=2.0.0, openai>=1.0.0, etc.) which allows different
dependency versions to be installed across different environments and dates,
compromising reproducibility of benchmark results. Replace all >= constraints
with exact version pinning using == operator for each dependency including
memanto, mem0ai, moorcheh-sdk, openai, pydantic, rich, httpx, and python-dotenv
to ensure deterministic and reproducible benchmark results.
🧩 Analysis chain
🌐 Web query:
pyjwt python-multipart requests tqdm known vulnerabilities CVE 2024 2025
💡 Result:
Below are the known security vulnerabilities for the requested Python libraries during the 2024–2026 period. PyJWT - CVE-2024-53861: Incorrect string comparison during 'iss' (issuer) claim validation, fixed in version 2.10.1 [1][2]. - CVE-2026-48526: A JWT algorithm confusion vulnerability where an attacker can use a public key as an HMAC secret if the verifier is misconfigured to support both HMAC and asymmetric algorithms simultaneously; patched in version 2.13.0 [3]. - Other Issues: There is an identified risk involving PyJWKClient's lack of a scheme allowlist, which can lead to SSRF and token forgery via unintended schemes (e.g., file://, ftp://) [4]. python-multipart - CVE-2024-24762: A Regular Expression Denial of Service (ReDoS) vulnerability in Content-Type header parsing; patched in version 0.0.7 [5]. - CVE-2024-53981: A Denial of Service (DoS) vulnerability caused by excessive logging during the parsing of malicious form-data boundaries; patched in version 0.0.18 [6][7]. - CVE-2026-42561: A Denial of Service vulnerability due to unbounded multipart part headers (number and size); patched in version 0.0.27 [8]. Requests - CVE-2024-35195: A vulnerability where SSL certificate verification settings were incorrectly persisted across requests in a Session object if the initial request used verify=False; patched in version 2.32.0 [9]. - CVE-2024-47081: A credential leak vulnerability where.netrc credentials could be exposed to third-party domains due to incorrect URL parsing; patched in version 2.32.4 [10][11][12][13]. tqdm - No CVEs or significant security vulnerabilities were identified for the tqdm library during this period. Users are advised to ensure they are using the latest available versions of these libraries to mitigate these risks [14][3][8][10].
Citations:
- 1: https://nvd.nist.gov/vuln/detail/cve-2024-53861
- 2: https://osv.dev/GHSA-75c5-xw7c-p5pm
- 3: GHSA-xgmm-8j9v-c9wx
- 4: GHSA-993g-76c3-p5m4
- 5: https://nvd.nist.gov/vuln/detail/CVE-2024-24762
- 6: GHSA-59g5-xgcq-4qw3
- 7: https://nvd.nist.gov/vuln/detail/CVE-2024-53981
- 8: GHSA-pp6c-gr5w-3c5g
- 9: https://nvd.nist.gov/vuln/detail/CVE-2024-35195
- 10: https://osv.dev/vulnerability/GHSA-9hjg-9r4m-mvj7
- 11: CVE-2024-47081: Netrc credential leak in PSF requests library psf/requests#6964
- 12: fix CVE 2024 47081: manual url parsing leads to netloc credentials leak psf/requests#6963
- 13: https://access.redhat.com/errata/RHSA-2025:12519
- 14: https://security.snyk.io/package/pip/pyjwt
🏁 Script executed:
# Check if there's any existing security tooling or vulnerability reports in the repo
find . -type f -name "*requirements*.txt" -o -name "*constraints*.txt" -o -name "*.lock" | head -20Repository: moorcheh-ai/memanto
Length of output: 231
🏁 Script executed:
# Check the actual structure of the benchmark directory
ls -la examples/benchmarks/memanto_vs_mem0/ 2>/dev/null || echo "Directory structure check"Repository: moorcheh-ai/memanto
Length of output: 437
🏁 Script executed:
# Check which of the direct dependencies might bring in these vulnerable packages
# Let's look at the dependency trees for the main packages
rg -i "pyjwt|python-multipart|requests|tqdm" . --include="*.txt" --include="*.py" --include="*.md" 2>/dev/null | head -30Repository: moorcheh-ai/memanto
Length of output: 45
🌐 Web query:
memanto>=0.2.0 mem0ai>=2.0.0 moorcheh-sdk>=1.3.0 transitive dependencies pyjwt requests python-multipart
💡 Result:
The requested libraries and their dependencies are part of the Memanto ecosystem, which is designed to provide persistent memory for AI agents using the Moorcheh semantic engine [1][2]. Memanto (version >=0.2.0) is an open-source tool that allows AI agents to store and recall information [1][2]. It depends on the Moorcheh Python SDK (moorcheh-sdk >=1.3.0) to interact with Moorcheh's semantic search and generative AI capabilities [1][3][4]. Regarding the specific dependencies mentioned: - Memanto: The library itself, with versions >=0.2.0 supporting enhanced features like on-premise execution and local RAG capabilities [1][2]. - Moorcheh SDK (moorcheh-sdk): Required for the communication layer between the application and the Moorcheh semantic database [3][4]. - PyJWT, Requests, and Python-Multipart: These are common standard dependencies used by Memanto's REST API server to handle secure session authentication (JWT), HTTP communication, and form data parsing, respectively [5][6][1][7]. Note that mem0ai is not listed as a direct dependency of the Memanto library in the provided documentation [5][6][1]. Memanto and Mem0 are distinct projects in the AI agent memory space [1][2]. Always ensure you are using compatible versions by checking the pyproject.toml or setup.py file within the specific Memanto release you are installing [5][8].
Citations:
- 1: https://github.com/moorcheh-ai/memanto
- 2: https://memanto.ai/
- 3: https://pypi.org/project/moorcheh-sdk/
- 4: https://docs.moorcheh.ai/python-sdk/introduction
- 5: https://pypi.org/project/memanto/0.0.4/
- 6: https://pypi.org/project/memanto/0.0.3/
- 7: https://docs.memanto.ai/getting-started/quickstart-api
- 8: e71b613
🏁 Script executed:
# Check if there are any existing security configuration files or lock files
find . -type f \( -name "*.lock" -o -name "constraints*.txt" -o -name "security*.txt" -o -name ".safety*" \) 2>/dev/null | head -20Repository: moorcheh-ai/memanto
Length of output: 45
🏁 Script executed:
# Look for pyproject.toml or setup.py in the repo to understand the main project's dependencies
find . -type f \( -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" \) | head -10Repository: moorcheh-ai/memanto
Length of output: 344
🏁 Script executed:
# Check if memanto, mem0ai, moorcheh-sdk have been documented anywhere in the repo
rg -l "memanto|mem0ai|moorcheh-sdk" . --type py --type toml --type txt 2>/dev/null | head -20Repository: moorcheh-ai/memanto
Length of output: 712
🌐 Web query:
memanto pyproject.toml dependencies pyjwt requests python-multipart which package
💡 Result:
In Memanto’s pyproject.toml / PyPI metadata, these dependencies come from these specific packages: 1) pyjwt - Package: PyJWT (import name usually jwt). It is a dependency of Memanto with constraint pyjwt >=2.8.0.[1] 2) requests - Package: requests. It is not listed in the Memanto v0.2.2 dependency table shown on PyPI (at least in the captured section), so the answer depends on the exact pyproject.toml version you’re looking at; however, requests itself is the standard “HTTP for Humans” PyPI package.[2] 3) python-multipart - Package: python-multipart (Apache-licensed multipart parser). Memanto depends on it with constraint python-multipart >=0.0.7.[1] What package contains them? - pyjwt in Memanto’s dependencies refers to the PyPI package “PyJWT”.[1][3] - python-multipart in Memanto’s dependencies refers to the PyPI package “python-multipart”.[1][4] - requests would refer to the PyPI package “requests” (if present in the dependencies for your memanto version).[2] If you paste the relevant dependencies = [...] (or the entire [project] section) from your exact memanto pyproject.toml, I can map each entry precisely for that version (since dependency sets can change across releases).
Citations:
- 1: https://pypi.org/project/memanto/
- 2: https://pypi.org/project/requests/
- 3: https://pypi.org/project/PyJWT/
- 4: https://pypi.org/project/python-multipart/
🌐 Web query:
mem0ai>=2.0.0 pyproject setup.py dependencies requires
💡 Result:
For mem0ai version 2.0.0 and higher, the project uses a standard pyproject.toml configuration file for dependency management [1][2]. The core dependencies required for the base mem0ai package include: openai >= 1.90.0 [3][4][2] posthog >= 7.14.0 [4][2] protobuf < 7.0.0, >= 5.29.6 [3][4][2] pydantic >= 2.7.3 [3][4][2] pytz >= 2024.1 [3][4][2] qdrant-client >= 1.12.0 [3][4][2] sqlalchemy >= 2.0.31 [3][4][2] The project also specifies a wide range of optional dependencies (extras) in pyproject.toml to support various LLM providers, vector stores, and NLP tools [1][2]. These can be installed using bracket syntax, such as pip install mem0ai[nlp] or pip install mem0ai[llms] [3][4][5]. The build system for mem0ai uses hatchling, as defined in the [build-system] section of the pyproject.toml file [1][6]. Key source references for these dependency constraints include the PyPI project page for mem0ai version 2.0.0 and subsequent releases [3][4][2], as well as the repository's main pyproject.toml configuration [1].
Citations:
- 1: https://github.com/mem0ai/mem0/blob/main/pyproject.toml
- 2: https://pypi.org/project/mem0ai/
- 3: https://pypi.org/project/mem0ai/2.0.0/
- 4: https://pypi.org/project/mem0ai/2.0.7/
- 5: https://www.github.com/mem0ai/mem0
- 6: https://github.com/mem0ai/mem0/blob/d35065c887f0f93b6da6c9ef25e7299224dc6eea/pyproject.toml
🏁 Script executed:
# Check the main pyproject.toml for direct dependencies
cat -n ./pyproject.toml | head -100Repository: moorcheh-ai/memanto
Length of output: 3360
🏁 Script executed:
# Also check what's in the examples/benchmarks/memanto_vs_mem0 directory more carefully
cat -n ./examples/benchmarks/memanto_vs_mem0/requirements.txtRepository: moorcheh-ai/memanto
Length of output: 246
🏁 Script executed:
# Check if fastapi, httpx, or other packages in memanto depend on requests
# FastAPI uses httpx or requests for HTTP calls? Let's verify
python3 - <<'PY'
import subprocess
import json
# Check what fastapi depends on
result = subprocess.run(['pip', 'index', 'versions', 'fastapi', '--no-color'],
capture_output=True, text=True)
print("FastAPI info:", result.stdout[:500] if result.stdout else result.stderr[:500])
# Let's try a different approach - look for requests in fastapi's dependencies
result2 = subprocess.run(['python3', '-m', 'pip', 'show', 'fastapi'],
capture_output=True, text=True)
print("\nFastAPI details:")
print(result2.stdout)
PYRepository: moorcheh-ai/memanto
Length of output: 597
🏁 Script executed:
# Check if there are actual vulnerability concerns by looking at what the current constraints allow
python3 - <<'PY'
# Simulate what versions would be installed with the current constraints
import re
constraints = {
"pyjwt": ">=2.8.0",
"python-multipart": ">=0.0.7",
}
vulnerable_versions = {
"pyjwt": {
"CVE-2024-53861": "< 2.10.1",
"CVE-2026-48526": "< 2.13.0"
},
"python-multipart": {
"CVE-2024-24762": "< 0.0.7",
"CVE-2024-53981": "< 0.0.18",
"CVE-2026-42561": "< 0.0.27"
}
}
print("Current constraints in memanto/pyproject.toml:")
for pkg, constraint in constraints.items():
print(f" {pkg} {constraint}")
if pkg in vulnerable_versions:
print(f" Known vulnerabilities:")
for cve, vuln_constraint in vulnerable_versions[pkg].items():
print(f" {cve}: requires {vuln_constraint}")
print("\n\nBenchmark requirements.txt does NOT explicitly constrain pyjwt or python-multipart")
print("This means they would be pulled in via memanto's constraints:")
print(" - memanto>=0.2.0 (which brings pyjwt>=2.8.0, python-multipart>=0.0.7)")
print("\nPotential vulnerability gaps:")
print(" - pyjwt >=2.8.0 allows versions vulnerable to CVE-2026-48526 (needs >=2.13.0)")
print(" - python-multipart >=0.0.7 allows versions vulnerable to CVE-2024-53981 (needs >=0.0.18)")
PYRepository: moorcheh-ai/memanto
Length of output: 827
🏁 Script executed:
# Verify requests dependency - check if it's actually pulled in by any of the packages
python3 - <<'PY'
packages = [
"memanto>=0.2.0",
"mem0ai>=2.0.0",
"openai>=1.0.0",
"httpx>=0.25.0"
]
print("Checking which packages have 'requests' as a dependency:")
print("(Note: This is based on known package information)")
print("")
print("- memanto: No direct 'requests' dependency (uses httpx)")
print("- mem0ai: Likely includes requests indirectly")
print("- openai: Could depend on requests or httpx")
print("- httpx: No direct 'requests' dependency")
print("")
print("Requests CVEs identified:")
print(" - CVE-2024-35195: Session verify=False persistence (fixed in 2.32.0)")
print(" - CVE-2024-47081: .netrc credential leak (fixed in 2.32.4)")
PYRepository: moorcheh-ai/memanto
Length of output: 518
Add explicit safe minimum constraints for vulnerable transitive dependencies in memanto.
The benchmark's transitive dependencies include known vulnerabilities:
pyjwt>=2.8.0allows CVE-2026-48526 (patched in 2.13.0)python-multipart>=0.0.7allows CVE-2024-53981 (patched in 0.0.18)
Add constraints directly in this requirements.txt to ensure safe versions:
pyjwt>=2.13.0
python-multipart>=0.0.18
Or apply these constraints to the main memanto package in its pyproject.toml to fix the root issue. The concern about requests and tqdm cannot be confirmed from the listed dependencies; focus on the PyJWT and python-multipart fixes.
🧰 Tools
🪛 OSV Scanner (2.4.0)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2025-183)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-120)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-175)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-176)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-177)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-178)
[HIGH] 1-1: pyjwt 2.9.0: undefined
(PYSEC-2026-179)
[HIGH] 1-1: pyjwt 2.9.0: PyJWT accepts unknown crit header extensions
[HIGH] 1-1: pyjwt 2.9.0: PyJWKClient: missing scheme allowlist enables CVE-2024-21643-class SSRF + token forgery via file://, ftp://, data: schemes
[HIGH] 1-1: pyjwt 2.9.0: PyJWKClient unbounded JWKS endpoint requests via attacker-controlled kid values (DoS)
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Algorithm allow-list bypass when decoding with PyJWK / PyJWKClient keys
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Unauthenticated DoS via unbounded Base64URL decoding of unused payload segment in b64=false detached JWS
[HIGH] 1-1: pyjwt 2.9.0: PyJWT: Public-key JWK accepted as HMAC secret enables forged HS256 tokens when mixed families are allowed
[HIGH] 1-1: python-multipart 0.0.9: Denial of service (DoS) via deformation multipart/form-data boundary
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Quadratic-time querystring parsing with semicolon separators causes CPU denial of service
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Semicolon treated as querystring field separator enables parameter smuggling
[HIGH] 1-1: python-multipart 0.0.9: python-multipart affected by Denial of Service via large multipart preamble or epilogue data
[HIGH] 1-1: python-multipart 0.0.9: python-multipart has Denial of Service via unbounded multipart part headers
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Negative Content-Length in parse_form buffers the entire body in memory
[HIGH] 1-1: python-multipart 0.0.9: python-multipart: Content-Disposition parameter smuggling via RFC 2231/5987 extended parameters
[HIGH] 1-1: python-multipart 0.0.9: Python-Multipart has Arbitrary File Write via Non-Default Configuration
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2018-28)
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2023-74)
[HIGH] 1-1: requests 2.9.2: Requests vulnerable to .netrc credentials leak via malicious URLs
[HIGH] 1-1: requests 2.9.2: Requests Session object does not verify requests after making first request with verify=False
[HIGH] 1-1: requests 2.9.2: Requests has Insecure Temp File Reuse in its extract_zipped_paths() utility function
[HIGH] 1-1: requests 2.9.2: Unintended leak of Proxy-Authorization header in requests
[HIGH] 1-1: requests 2.9.2: Insufficiently Protected Credentials in Requests
[HIGH] 1-1: tqdm 4.9.0: undefined
(PYSEC-2017-74)
[HIGH] 1-1: tqdm 4.9.0: tqdm CLI arguments injection attack
[HIGH] 1-1: tqdm 4.9.0: TDQM Arbitrary Code Execution
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/benchmarks/memanto_vs_mem0/requirements.txt` around lines 1 - 8, The
requirements.txt file contains transitive dependencies with known
vulnerabilities that need to be explicitly constrained to safe versions. Add two
new lines to the requirements.txt file to pin vulnerable dependencies: pyjwt to
version 2.13.0 or higher to mitigate CVE-2026-48526, and python-multipart to
version 0.0.18 or higher to mitigate CVE-2024-53981. These constraints should be
added after the existing direct dependencies to ensure safe versions are
installed regardless of what versions are pulled in by memanto.
Source: Linters/SAST tools
- Fix Memanto CRUD: use proper update/delete methods instead of create - Replace synthetic hardcoded vectors with _simple_embed() for fair comparison - Add p95_duration_ms, tokens_ingested, tokens_retrieved, retrieval_accuracy to TestResult - Add _check_failures() to downgrade status on failed metrics - Wrap large_scale batch operations in try/except error handling - Run Memanto and Mem0 benchmarks concurrently via threading - Pull Qdrant config from environment variables - Fix README code fence language identifier - Fix README field names (remove _ms suffix) - Pin exact dependency versions in requirements.txt - Add pyjwt and python-multipart security fixes
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.md (1)
123-123: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueConsider simplifying redundant phrasing.
"exact same" is a redundant expression; "same" alone is sufficient. As a minor style improvement, update to "Both benchmarks run the same datasets".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.md` at line 123, The phrase "exact same datasets" in the README.md file contains redundant wording. Remove the word "exact" from the sentence so that it reads "Both benchmarks run the **same datasets**" instead, as "same" alone is sufficient to convey the meaning.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/benchmark_runner.py`:
- Around line 218-238: The _measure method creates MetricSample objects without
populating the tokens_count parameter, causing it to default to 0 and breaking
token metrics functionality. Modify the _measure method to calculate or accept
token count information when creating MetricSample instances in both the success
(try block) and failure (except block) paths. You can estimate tokens from the
result content using an approximation formula (such as dividing character count
by 4) or modify the method signature to accept tokens_count as a parameter from
callers who have accurate token information.
- Around line 51-52: The qdrant_port field definition uses int() directly on the
environment variable without error handling, which will raise a ValueError if
QDRANT_PORT is set to an empty string or a non-numeric value. Create a helper
function (or improve the lambda) that wraps the int() conversion in a try-except
block to catch ValueError exceptions, and return the default port value of 6333
when conversion fails or the string is empty. Replace the current
default_factory lambda with this error-handling approach so that invalid port
values gracefully fall back to the default instead of crashing at config
initialization.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.md`:
- Around line 46-47: The setup instructions contain an incorrect directory path
that directs users to the wrong location. In the clone and cd commands section,
update the cd command path from `cd memanto/examples/benchmarks/memanto_vs_mem0`
to reflect the actual location of this README file at `cd
memanto/projects/memanto-benchmark/benchmarks/memanto_vs_mem0`. This ensures
users navigate to the correct directory where the benchmark_runner.py and
documentation actually exist.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/requirements.txt`:
- Around line 9-11: The python-multipart constraint in requirements.txt
specifies python-multipart>=0.0.18, which allows vulnerable versions containing
7 HIGH severity security issues. Update the constraint on line 11 from
python-multipart>=0.0.18 to python-multipart>=0.0.30 to enforce the patched
version that resolves all reported vulnerabilities (GHSA-5rvq-cxj2-64vf,
GHSA-6jv3-5f52-599m, GHSA-mj87-hwqh-73pj, GHSA-pp6c-gr5w-3c5g,
GHSA-v9pg-7xvm-68hf, GHSA-vffw-93wf-4j4q, GHSA-wp53-j4wj-2cfg).
---
Nitpick comments:
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.md`:
- Line 123: The phrase "exact same datasets" in the README.md file contains
redundant wording. Remove the word "exact" from the sentence so that it reads
"Both benchmarks run the **same datasets**" instead, as "same" alone is
sufficient to convey the meaning.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d6018c75-b9fd-4886-b7a6-51604a16a45b
📒 Files selected for processing (3)
projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.mdprojects/memanto-benchmark/benchmarks/memanto_vs_mem0/benchmark_runner.pyprojects/memanto-benchmark/benchmarks/memanto_vs_mem0/requirements.txt
| qdrant_host: str = field(default_factory=lambda: os.getenv("QDRANT_HOST", "localhost")) | ||
| qdrant_port: int = field(default_factory=lambda: int(os.getenv("QDRANT_PORT", "6333"))) |
There was a problem hiding this comment.
int() can raise ValueError if QDRANT_PORT is empty or non-numeric.
If the environment variable is set to an empty string or invalid value, this will crash at config initialization without a clear error message.
🛡️ Proposed fix with graceful fallback
- qdrant_port: int = field(default_factory=lambda: int(os.getenv("QDRANT_PORT", "6333")))
+ qdrant_port: int = field(default_factory=lambda: int(os.getenv("QDRANT_PORT") or "6333"))Alternatively, for more robust handling:
def _parse_port() -> int:
port_str = os.getenv("QDRANT_PORT", "6333")
try:
return int(port_str) if port_str else 6333
except ValueError:
return 6333
qdrant_port: int = field(default_factory=_parse_port)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| qdrant_host: str = field(default_factory=lambda: os.getenv("QDRANT_HOST", "localhost")) | |
| qdrant_port: int = field(default_factory=lambda: int(os.getenv("QDRANT_PORT", "6333"))) | |
| qdrant_host: str = field(default_factory=lambda: os.getenv("QDRANT_HOST", "localhost")) | |
| qdrant_port: int = field(default_factory=lambda: int(os.getenv("QDRANT_PORT") or "6333")) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/benchmark_runner.py`
around lines 51 - 52, The qdrant_port field definition uses int() directly on
the environment variable without error handling, which will raise a ValueError
if QDRANT_PORT is set to an empty string or a non-numeric value. Create a helper
function (or improve the lambda) that wraps the int() conversion in a try-except
block to catch ValueError exceptions, and return the default port value of 6333
when conversion fails or the string is empty. Replace the current
default_factory lambda with this error-handling approach so that invalid port
values gracefully fall back to the default instead of crashing at config
initialization.
| def _measure(self, operation: str, fn, *args, **kwargs) -> MetricSample: | ||
| start = time.perf_counter() | ||
| try: | ||
| result = fn(*args, **kwargs) | ||
| duration = (time.perf_counter() - start) * 1000 | ||
| is_retrieved = "search" in operation.lower() or "retriev" in operation.lower() | ||
| return MetricSample( | ||
| operation=operation, | ||
| duration_ms=round(duration, 2), | ||
| success=True, | ||
| details=str(result)[:200] if result else "ok", | ||
| is_retrieved=is_retrieved, | ||
| ) | ||
| except Exception as e: | ||
| duration = (time.perf_counter() - start) * 1000 | ||
| return MetricSample( | ||
| operation=operation, | ||
| duration_ms=round(duration, 2), | ||
| success=False, | ||
| details=str(e), | ||
| ) |
There was a problem hiding this comment.
tokens_count is never populated, making token metrics non-functional.
The _measure method creates MetricSample without setting tokens_count, so it always defaults to 0. This means tokens_ingested and tokens_retrieved properties in TestResult will always return 0, despite the PR objectives explicitly requiring tracking "total tokens ingested/retrieved" as a critical metric.
🔧 Suggested approach to track tokens
The fix requires either:
- Estimate from content - approximate token count from stored/retrieved text:
def _measure(self, operation: str, fn, *args, **kwargs) -> MetricSample:
start = time.perf_counter()
try:
result = fn(*args, **kwargs)
duration = (time.perf_counter() - start) * 1000
is_retrieved = "search" in operation.lower() or "retriev" in operation.lower()
+ # Estimate tokens from result/args (rough approximation: ~4 chars per token)
+ tokens = 0
+ if kwargs.get("metadata") and "text" in kwargs["metadata"]:
+ tokens = len(kwargs["metadata"]["text"]) // 4
+ elif args and isinstance(args[0], str):
+ tokens = len(args[0]) // 4
return MetricSample(
operation=operation,
duration_ms=round(duration, 2),
success=True,
details=str(result)[:200] if result else "ok",
+ tokens_count=tokens,
is_retrieved=is_retrieved,
)- Accept token count as parameter - let callers pass in known token counts for more accuracy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/benchmark_runner.py`
around lines 218 - 238, The _measure method creates MetricSample objects without
populating the tokens_count parameter, causing it to default to 0 and breaking
token metrics functionality. Modify the _measure method to calculate or accept
token count information when creating MetricSample instances in both the success
(try block) and failure (except block) paths. You can estimate tokens from the
result content using an approximation formula (such as dividing character count
by 4) or modify the method signature to accept tokens_count as a parameter from
callers who have accurate token information.
| git clone https://github.com/moorcheh-ai/memanto.git | ||
| cd memanto/examples/benchmarks/memanto_vs_mem0 |
There was a problem hiding this comment.
Setup instructions reference the wrong directory path.
The quick-start instructions direct users to clone and navigate to examples/benchmarks/memanto_vs_mem0, but this README is located in projects/memanto-benchmark/benchmarks/memanto_vs_mem0/. Users following these instructions will end up in the wrong directory and fail to find the correct benchmark_runner.py and this documentation. Update the path to match the actual file location.
📁 Proposed fix to correct the setup path
-# Clone this repo
-git clone https://github.com/moorcheh-ai/memanto.git
-cd memanto/examples/benchmarks/memanto_vs_mem0
+# Clone this repo
+git clone https://github.com/moorcheh-ai/memanto.git
+cd memanto/projects/memanto-benchmark/benchmarks/memanto_vs_mem0📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| git clone https://github.com/moorcheh-ai/memanto.git | |
| cd memanto/examples/benchmarks/memanto_vs_mem0 | |
| # Clone this repo | |
| git clone https://github.com/moorcheh-ai/memanto.git | |
| cd memanto/projects/memanto-benchmark/benchmarks/memanto_vs_mem0 |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/README.md` around lines
46 - 47, The setup instructions contain an incorrect directory path that directs
users to the wrong location. In the clone and cd commands section, update the cd
command path from `cd memanto/examples/benchmarks/memanto_vs_mem0` to reflect
the actual location of this README file at `cd
memanto/projects/memanto-benchmark/benchmarks/memanto_vs_mem0`. This ensures
users navigate to the correct directory where the benchmark_runner.py and
documentation actually exist.
| # Security fixes for transitive dependencies | ||
| pyjwt>=2.13.0 | ||
| python-multipart>=0.0.18 |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What is the current fixed/patched version of python-multipart that resolves CVE/GHSA vulnerabilities related to quadratic-time parsing, parameter smuggling, and multipart DoS?
💡 Result:
The current fixed/patched version of python-multipart for the GHSA/CVE issues covering quadratic-time parsing (CPU DoS), parameter smuggling (via semicolon separators), and multipart DoS is python-multipart 0.0.30 or later—because the upstream advisories for both the quadratic-time parser issue and the semicolon-based parameter smuggling explicitly say “Upgrade to python-multipart 0.0.30 or later”. [1][2][3] As of today (2026-06-22), the latest published python-multipart version on PyPI is 0.0.32 (released 2026-06-04). [4]
Citations:
- 1: GHSA-5rvq-cxj2-64vf
- 2: GHSA-6jv3-5f52-599m
- 3: GHSA-5rvq-cxj2-64vf
- 4: https://pypi.org/project/python-multipart/
Update python-multipart constraint to enforce patched version 0.0.30 or later.
Line 11 specifies python-multipart>=0.0.18, which allows the vulnerable version itself. The OSV scanner reports 7 HIGH severity vulnerabilities in version 0.0.18. Upstream security advisories explicitly require upgrading to python-multipart 0.0.30 or later to resolve all reported issues (GHSA-5rvq-cxj2-64vf, GHSA-6jv3-5f52-599m, GHSA-mj87-hwqh-73pj, GHSA-pp6c-gr5w-3c5g, GHSA-v9pg-7xvm-68hf, GHSA-vffw-93wf-4j4q, GHSA-wp53-j4wj-2cfg). Change the constraint to python-multipart>=0.0.30.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@projects/memanto-benchmark/benchmarks/memanto_vs_mem0/requirements.txt`
around lines 9 - 11, The python-multipart constraint in requirements.txt
specifies python-multipart>=0.0.18, which allows vulnerable versions containing
7 HIGH severity security issues. Update the constraint on line 11 from
python-multipart>=0.0.18 to python-multipart>=0.0.30 to enforce the patched
version that resolves all reported vulnerabilities (GHSA-5rvq-cxj2-64vf,
GHSA-6jv3-5f52-599m, GHSA-mj87-hwqh-73pj, GHSA-pp6c-gr5w-3c5g,
GHSA-v9pg-7xvm-68hf, GHSA-vffw-93wf-4j4q, GHSA-wp53-j4wj-2cfg).
Source: Linters/SAST tools
Summary
This PR adds a comprehensive benchmark suite comparing Memanto (Moorcheh-powered) against Mem0 across 8 critical dimensions of agentic memory performance.
Benchmark Dimensions
Test Datasets
Scoring Matrix (100 pts)
Quick Start
\\�ash
cd examples/benchmarks/memanto_vs_mem0
cp .env.example .env
Edit .env with your API keys
pip install -r requirements.txt
python benchmark_runner.py
\\
Location
\examples/benchmarks/memanto_vs_mem0/\
Closes #639
Summary by CodeRabbit
Release Notes
New Features
Documentation
Chores
.envexample and pinned benchmark dependencies via a dedicated requirements file.