Add GraphRAG Cache package. #2153
Pull request overview
This PR extracts cache functionality from the main GraphRAG package into a new standalone graphrag-cache package. The refactoring improves modularity by separating cache concerns into its own package with a cleaner API design.
Key changes include:
- Introduction of the `graphrag-cache` package with `Cache` interface and implementations (`JsonCache`, `MemoryCache`, `NoopCache`)
- New factory pattern using `create_cache()` and `register_cache()` functions with `CacheConfig` for type-safe configuration
- Updated cache configuration model to use nested `StorageConfig` instead of flat structure
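
The factory pattern named above can be illustrated with a minimal, self-contained sketch. It does not import `graphrag_cache`. The class and function names mirror the ones in this PR, but every signature, field, and default here (synchronous `get`/`set`, string cache types, the `StorageConfig` fields) is an assumption for illustration and may differ from the actual package.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable


class Cache(ABC):
    """Illustrative cache interface (assumed shape, not the real one)."""

    @abstractmethod
    def get(self, key: str) -> Any: ...

    @abstractmethod
    def set(self, key: str, value: Any) -> None: ...


class MemoryCache(Cache):
    """Keeps entries in a plain dict."""

    def __init__(self) -> None:
        self._items: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._items.get(key)

    def set(self, key: str, value: Any) -> None:
        self._items[key] = value


class NoopCache(Cache):
    """Stores nothing; every lookup is a miss."""

    def get(self, key: str) -> Any:
        return None

    def set(self, key: str, value: Any) -> None:
        pass


@dataclass
class StorageConfig:
    # Hypothetical fields; the real model may differ.
    type: str = "file"
    base_dir: str = "cache"


@dataclass
class CacheConfig:
    type: str = "memory"
    # Nested storage settings instead of flat fields on the cache config.
    storage: StorageConfig | None = None


# Registry mapping a cache type name to a creator function.
_registry: dict[str, Callable[[CacheConfig], Cache]] = {}


def register_cache(cache_type: str, creator: Callable[[CacheConfig], Cache]) -> None:
    """Register (or override) the creator for a cache type."""
    _registry[cache_type] = creator


def create_cache(config: CacheConfig) -> Cache:
    """Look up the creator for config.type and build the cache."""
    if config.type not in _registry:
        raise ValueError(f"Unknown cache type: {config.type}")
    return _registry[config.type](config)


register_cache("memory", lambda config: MemoryCache())
register_cache("noop", lambda config: NoopCache())

# Usage: a custom type is registered the same way as the builtins.
register_cache("custom", lambda config: MemoryCache())
cache = create_cache(CacheConfig(type="custom"))
cache.set("answer", 42)
```

Keeping creation behind a registry means callers depend only on the `Cache` interface and a config object, so a new backend can be added without touching call sites.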
Reviewed changes
Copilot reviewed 34 out of 36 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| uv.lock | Adds graphrag-cache package dependency and workspace member |
| pyproject.toml | Registers graphrag-cache as workspace member and adds version update task |
| packages/graphrag-cache/pyproject.toml | Defines new graphrag-cache package metadata and dependencies |
| packages/graphrag-cache/graphrag_cache/__init__.py | Exports public API (Cache, CacheConfig, CacheType, create_cache, register_cache) |
| packages/graphrag-cache/graphrag_cache/cache.py | Defines abstract Cache interface with ABC |
| packages/graphrag-cache/graphrag_cache/cache_type.py | Defines CacheType enum (Json, Memory, Noop) |
| packages/graphrag-cache/graphrag_cache/cache_config.py | Defines CacheConfig model with optional StorageConfig |
| packages/graphrag-cache/graphrag_cache/cache_factory.py | Implements factory pattern with lazy registration of builtin cache types |
| packages/graphrag-cache/graphrag_cache/json_cache.py | Renamed from JsonPipelineCache, now accepts Storage or StorageConfig |
| packages/graphrag-cache/graphrag_cache/memory_cache.py | Renamed from InMemoryCache, simplified by removing cache key prefixing |
| packages/graphrag-cache/graphrag_cache/noop_cache.py | Renamed from NoopPipelineCache with updated interface |
| packages/graphrag-cache/README.md | Documents usage patterns for builtin and custom cache implementations |
| packages/graphrag/pyproject.toml | Adds graphrag-cache==2.7.0 dependency |
| packages/graphrag/graphrag/config/models/cache_config.py | Removed (replaced by graphrag_cache.CacheConfig) |
| packages/graphrag/graphrag/config/models/graph_rag_config.py | Updated to use graphrag_cache.CacheConfig with asdict conversion |
| packages/graphrag/graphrag/config/defaults.py | Restructured cache defaults to use CacheStorageDefaults with nested storage config |
| packages/graphrag/graphrag/config/init_content.py | Updated config template to show nested cache.storage structure |
| packages/graphrag/graphrag/cache/factory.py | Removed (replaced by graphrag_cache factory) |
| packages/graphrag/graphrag/cache/__init__.py | Emptied as cache functionality moved to separate package |
| packages/graphrag/graphrag/utils/api.py | Removed create_cache_from_config function (replaced by graphrag_cache.create_cache) |
| packages/graphrag/graphrag/index/run/run_pipeline.py | Updated to use graphrag_cache.create_cache |
| packages/graphrag/graphrag/index/run/utils.py | Updated imports to use graphrag_cache.Cache and MemoryCache |
| packages/graphrag/graphrag/index/typing/context.py | Updated PipelineRunContext to use graphrag_cache.Cache type |
| packages/graphrag/graphrag/index/workflows/*.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/index/operations//.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/language_model/providers/litellm/*.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/prompt_tune/loader/input.py | Updated to use graphrag_cache.NoopCache |
| tests/unit/config/utils.py | Updated assert_cache_configs to check nested storage configuration |
| tests/unit/indexing/cache/test_file_pipeline_cache.py | Refactored to use new cache creation API |
| tests/integration/cache/test_factory.py | Completely rewritten to test new factory API and patterns |
Comments suppressed due to low confidence (4)
packages/graphrag-cache/graphrag_cache/memory_cache.py:19 - The `MemoryCache` class removed the `_name` instance variable initialization and all `_create_cache_key` logic that used it, but the `__init__` method no longer accepts or uses `kwargs`. This means if any code passes keyword arguments to `MemoryCache`, they will be silently ignored, which could lead to confusion. Consider whether `kwargs` should be used or if documentation should clarify that no arguments are needed for `MemoryCache`.
packages/graphrag-cache/graphrag_cache/json_cache.py:14 - This class does not call `Cache.__init__` during initialization. (`JsonCache.__init__` may be missing a call to a base class `__init__`)
packages/graphrag-cache/graphrag_cache/memory_cache.py:11 - This class does not call `Cache.__init__` during initialization. (`MemoryCache.__init__` may be missing a call to a base class `__init__`)
packages/graphrag-cache/graphrag_cache/noop_cache.py:11 - This class does not call `Cache.__init__` during initialization. (`NoopCache.__init__` may be missing a call to a base class `__init__`)
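
The two concerns in these suppressed comments (keyword arguments that are silently swallowed, and a skipped base class `__init__`) can be reproduced in a small standalone sketch. The classes below are hypothetical stand-ins, not the code from this PR. In particular, the base class attribute `closed` is invented to show what is lost when `super().__init__()` is not called, and whether the real `Cache` defines any state in `__init__` is not established here.

```python
from abc import ABC, abstractmethod
from typing import Any


class Cache(ABC):
    def __init__(self) -> None:
        # Hypothetical shared state set up by the base class.
        self.closed = False

    @abstractmethod
    def get(self, key: str) -> Any: ...


class LenientCache(Cache):
    """Accepts any keyword arguments and never calls the base __init__."""

    def __init__(self, **kwargs: Any) -> None:
        # kwargs are dropped without warning; super().__init__() is skipped.
        self._items: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._items.get(key)


class StrictCache(Cache):
    """Takes no arguments and initializes the base class explicitly."""

    def __init__(self) -> None:
        super().__init__()
        self._items: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._items.get(key)


lenient = LenientCache(name="ignored")  # no error, argument has no effect
strict = StrictCache()

print(hasattr(lenient, "closed"))  # base class state was never set
print(strict.closed)               # base class state is present

try:
    StrictCache(name="rejected")
except TypeError as exc:
    print(f"rejected: {exc}")
```

With the strict signature, a stale call site that still passes a name fails with a `TypeError` instead of running with an argument that does nothing.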