@dworthen
Contributor

Add GraphRAG Cache package.

@dworthen dworthen requested a review from a team as a code owner December 15, 2025 19:44
@dworthen dworthen requested review from Copilot and removed request for a team December 15, 2025 22:19
Contributor

Copilot AI left a comment

Pull request overview

This PR extracts cache functionality from the main GraphRAG package into a new standalone graphrag-cache package. The refactoring improves modularity by separating cache concerns into its own package with a cleaner API design.

Key changes include:

  • Introduction of the graphrag-cache package with Cache interface and implementations (JsonCache, MemoryCache, NoopCache)
  • New factory pattern using create_cache() and register_cache() functions with CacheConfig for type-safe configuration
  • Updated cache configuration model to use nested StorageConfig instead of flat structure
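
The factory pattern named in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the graphrag-cache source: the synchronous `get`/`set` methods, the string cache types, the `StorageConfig` fields, and the `storage=` keyword passed to creators are all assumptions made for the example. Only the names `Cache`, `CacheConfig`, `StorageConfig`, `MemoryCache`, `NoopCache`, `create_cache`, and `register_cache` come from the PR description.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Optional


class Cache(ABC):
    """Hypothetical, simplified cache interface (the real one may differ)."""

    @abstractmethod
    def get(self, key: str) -> Any: ...

    @abstractmethod
    def set(self, key: str, value: Any) -> None: ...


class MemoryCache(Cache):
    def __init__(self, **kwargs: Any) -> None:
        # kwargs are accepted so the factory can call every creator the same way.
        self._data: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class NoopCache(Cache):
    def __init__(self, **kwargs: Any) -> None:
        pass

    def get(self, key: str) -> Any:
        return None

    def set(self, key: str, value: Any) -> None:
        pass


@dataclass
class StorageConfig:
    # Field names are assumptions for illustration only.
    type: str = "file"
    base_dir: str = "cache"


@dataclass
class CacheConfig:
    type: str = "memory"
    storage: Optional[StorageConfig] = None  # nested instead of flat fields


_registry: dict[str, Callable[..., Cache]] = {}


def register_cache(cache_type: str, creator: Callable[..., Cache]) -> None:
    """Register a creator for a cache type, replacing any earlier one."""
    _registry[cache_type] = creator


def _register_builtins() -> None:
    # setdefault keeps any custom creator registered under a builtin name.
    _registry.setdefault("memory", MemoryCache)
    _registry.setdefault("noop", NoopCache)


def create_cache(config: CacheConfig) -> Cache:
    """Build a cache from config, registering builtins lazily on first miss."""
    if config.type not in _registry:
        _register_builtins()
    if config.type not in _registry:
        raise ValueError(f"Unknown cache type: {config.type!r}")
    return _registry[config.type](storage=config.storage)
```

In this sketch a custom implementation would be added with `register_cache("custom", MyCache)` and then selected with `CacheConfig(type="custom")`, so callers only ever depend on the `Cache` interface and the config object.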

Reviewed changes

Copilot reviewed 34 out of 36 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| uv.lock | Adds graphrag-cache package dependency and workspace member |
| pyproject.toml | Registers graphrag-cache as workspace member and adds version update task |
| packages/graphrag-cache/pyproject.toml | Defines new graphrag-cache package metadata and dependencies |
| packages/graphrag-cache/graphrag_cache/__init__.py | Exports public API (Cache, CacheConfig, CacheType, create_cache, register_cache) |
| packages/graphrag-cache/graphrag_cache/cache.py | Defines abstract Cache interface with ABC |
| packages/graphrag-cache/graphrag_cache/cache_type.py | Defines CacheType enum (Json, Memory, Noop) |
| packages/graphrag-cache/graphrag_cache/cache_config.py | Defines CacheConfig model with optional StorageConfig |
| packages/graphrag-cache/graphrag_cache/cache_factory.py | Implements factory pattern with lazy registration of builtin cache types |
| packages/graphrag-cache/graphrag_cache/json_cache.py | Renamed from JsonPipelineCache, now accepts Storage or StorageConfig |
| packages/graphrag-cache/graphrag_cache/memory_cache.py | Renamed from InMemoryCache, simplified by removing cache key prefixing |
| packages/graphrag-cache/graphrag_cache/noop_cache.py | Renamed from NoopPipelineCache with updated interface |
| packages/graphrag-cache/README.md | Documents usage patterns for builtin and custom cache implementations |
| packages/graphrag/pyproject.toml | Adds graphrag-cache==2.7.0 dependency |
| packages/graphrag/graphrag/config/models/cache_config.py | Removed (replaced by graphrag_cache.CacheConfig) |
| packages/graphrag/graphrag/config/models/graph_rag_config.py | Updated to use graphrag_cache.CacheConfig with asdict conversion |
| packages/graphrag/graphrag/config/defaults.py | Restructured cache defaults to use CacheStorageDefaults with nested storage config |
| packages/graphrag/graphrag/config/init_content.py | Updated config template to show nested cache.storage structure |
| packages/graphrag/graphrag/cache/factory.py | Removed (replaced by graphrag_cache factory) |
| packages/graphrag/graphrag/cache/__init__.py | Emptied as cache functionality moved to separate package |
| packages/graphrag/graphrag/utils/api.py | Removed create_cache_from_config function (replaced by graphrag_cache.create_cache) |
| packages/graphrag/graphrag/index/run/run_pipeline.py | Updated to use graphrag_cache.create_cache |
| packages/graphrag/graphrag/index/run/utils.py | Updated imports to use graphrag_cache.Cache and MemoryCache |
| packages/graphrag/graphrag/index/typing/context.py | Updated PipelineRunContext to use graphrag_cache.Cache type |
| packages/graphrag/graphrag/index/workflows/*.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/index/operations/**/*.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/language_model/providers/litellm/*.py | Updated type annotations from PipelineCache to graphrag_cache.Cache |
| packages/graphrag/graphrag/prompt_tune/loader/input.py | Updated to use graphrag_cache.NoopCache |
| tests/unit/config/utils.py | Updated assert_cache_configs to check nested storage configuration |
| tests/unit/indexing/cache/test_file_pipeline_cache.py | Refactored to use new cache creation API |
| tests/integration/cache/test_factory.py | Completely rewritten to test new factory API and patterns |

Comments suppressed due to low confidence (4)

packages/graphrag-cache/graphrag_cache/memory_cache.py:19

  • The MemoryCache class removed the _name instance variable initialization and all _create_cache_key logic that used it, but the __init__ method no longer accepts or uses kwargs. This means if any code passes keyword arguments to MemoryCache, they will be silently ignored, which could lead to confusion. Consider whether kwargs should be used or if documentation should clarify that no arguments are needed for MemoryCache.

packages/graphrag-cache/graphrag_cache/json_cache.py:14

  • This class does not call Cache.__init__ during initialization. (JsonCache.__init__ may be missing a call to a base class __init__)

packages/graphrag-cache/graphrag_cache/memory_cache.py:11

  • This class does not call Cache.__init__ during initialization. (MemoryCache.__init__ may be missing a call to a base class __init__)

packages/graphrag-cache/graphrag_cache/noop_cache.py:11

  • This class does not call Cache.__init__ during initialization. (NoopCache.__init__ may be missing a call to a base class __init__)
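
One way to address both kinds of suppressed comment is sketched below, using a stand-in `Cache` base class rather than the real one. The `initialized` attribute, the synchronous `get` method, and the choice to raise on unexpected keyword arguments are assumptions for illustration. Whether rejecting arguments is appropriate depends on how the factory invokes each implementation, which this page does not show.

```python
from abc import ABC, abstractmethod
from typing import Any


class Cache(ABC):
    """Stand-in base class; the real Cache interface may define no state."""

    def __init__(self) -> None:
        # Hypothetical base-class state, present only to show the super() call matters.
        self.initialized = True

    @abstractmethod
    def get(self, key: str) -> Any: ...


class MemoryCache(Cache):
    def __init__(self, **kwargs: Any) -> None:
        # Explicitly run the base initializer so base-class state is set up.
        super().__init__()
        # Fail loudly rather than silently dropping arguments that have no effect.
        if kwargs:
            names = ", ".join(sorted(kwargs))
            raise TypeError(f"MemoryCache takes no arguments, got: {names}")
        self._data: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)
```

With this shape, `MemoryCache()` works as before, while a call such as `MemoryCache(name="x")` raises a `TypeError` instead of discarding the argument.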

@dworthen dworthen merged commit 3201f28 into v3/main Dec 16, 2025
14 checks passed
@dworthen dworthen deleted the graphrag-cache branch December 16, 2025 14:37
3 participants