Add GraphRAG Cache package. #2153
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| 3.12 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| # GraphRAG Cache | ||
|
|
||
| ## Basic | ||
|
|
||
| ```python | ||
| import asyncio | ||
| from graphrag_storage import StorageConfig, create_storage, StorageType | ||
| from graphrag_cache import CacheConfig, create_cache, CacheType | ||
|
|
||
| async def run(): | ||
| cache = create_cache( | ||
| CacheConfig( | ||
| type=CacheType.Json, | ||
| storage=StorageConfig( | ||
| type=StorageType.File, | ||
| base_dir="cache" | ||
| ) | ||
| ), | ||
| ) | ||
|
|
||
| await cache.set("my_key", {"some": "object to cache"}) | ||
| print(await cache.get("my_key")) | ||
|
|
||
| if __name__ == "__main__": | ||
| asyncio.run(run()) | ||
| ``` | ||
|
|
||
| ## Custom Cache | ||
|
|
||
| ```python | ||
| import asyncio | ||
| from typing import Any | ||
| from graphrag_storage import Storage | ||
| from graphrag_cache import Cache, CacheConfig, create_cache, register_cache | ||
|
|
||
| class MyCache(Cache): | ||
| def __init__(self, some_setting: str, optional_setting: str = "default setting", **kwargs: Any): | ||
| # Validate settings and initialize | ||
| # View the JsonCache implementation to see how to create a cache that relies on a Storage provider. | ||
| ... | ||
|
|
||
| # Implement the rest of the Cache interface | ||
| ... | ||
|
|
||
| register_cache("MyCache", MyCache) | ||
|
|
||
| async def run(): | ||
| cache = create_cache( | ||
| CacheConfig( | ||
| type="MyCache", | ||
| some_setting="My Setting" | ||
| ) | ||
|
||
| ) | ||
|
|
||
| # Or use the factory directly to instantiate with a dict instead of using | ||
| # CacheConfig + create_cache | ||
| # from graphrag_cache.cache_factory import cache_factory | ||
| # cache = cache_factory.create(strategy="MyCache", init_args={"some_setting": "My Setting"}) | ||
|
|
||
| await cache.set("my_key", {"some": "object to cache"}) | ||
| print(await cache.get("my_key")) | ||
|
|
||
| if __name__ == "__main__": | ||
| asyncio.run(run()) | ||
| ``` | ||
|
|
||
| ### Details | ||
|
|
||
| By default, `create_cache` comes with the following cache providers registered, which correspond to the entries in the `CacheType` enum. | ||
|
|
||
| - `JsonCache` | ||
| - `MemoryCache` | ||
| - `NoopCache` | ||
|
|
||
| The preregistration happens lazily: for example, `JsonCache` is only imported and registered the first time you request it with `create_cache(CacheConfig(type=CacheType.Json, ...))`. There is no need to manually import and register builtin cache providers when using `create_cache`. | ||
|
|
||
| If you want a clean factory with no preregistered cache providers, import `cache_factory` directly and bypass `create_cache`. The downside is that `cache_factory.create` takes a dict of init args instead of the strongly typed `CacheConfig` used with `create_cache`. | ||
|
|
||
| ```python | ||
| from graphrag_cache.cache_factory import cache_factory | ||
| from graphrag_cache.json_cache import JsonCache | ||
|
|
||
| # cache_factory has no preregistered providers so you must register any | ||
| # providers you plan on using. | ||
| # You may also register a custom implementation; see the example above. | ||
| cache_factory.register("my_cache_impl", JsonCache) | ||
|
|
||
| cache = cache_factory.create(strategy="my_cache_impl", init_args={"some_setting": "..."}) | ||
|
|
||
| ... | ||
|
|
||
| ``` | ||
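
The README's `MyCache` example leaves the class body elided. As a rough illustration of the shape such a class might take, the sketch below defines a dict-backed cache exposing the async `get`/`set` calls used in the README examples. It deliberately does not subclass the real `Cache` base class, whose full interface is not visible in this diff, so `DictCache` and everything in it are hypothetical stand-ins rather than the package's API.

```python
import asyncio
from typing import Any


class DictCache:
    """Hypothetical stand-in for a custom cache; not graphrag_cache.Cache."""

    def __init__(
        self,
        some_setting: str,
        optional_setting: str = "default setting",
        **kwargs: Any,
    ) -> None:
        # Validate settings up front so misconfiguration fails early.
        if not some_setting:
            raise ValueError("some_setting must be a non-empty string")
        self.some_setting = some_setting
        self.optional_setting = optional_setting
        # Extra keyword arguments are accepted and ignored in this sketch.
        self._items: dict[str, Any] = {}

    async def get(self, key: str) -> Any:
        """Return the cached value, or None when the key is absent."""
        return self._items.get(key)

    async def set(self, key: str, value: Any) -> None:
        """Store a value under the given key."""
        self._items[key] = value


async def run() -> Any:
    cache = DictCache(some_setting="My Setting")
    await cache.set("my_key", {"some": "object to cache"})
    return await cache.get("my_key")


if __name__ == "__main__":
    print(asyncio.run(run()))
```

A real implementation would subclass `Cache` and implement whatever abstract methods it declares before being passed to `register_cache`.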
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # Copyright (c) 2024 Microsoft Corporation. | ||
| # Licensed under the MIT License | ||
|
|
||
| """The GraphRAG Cache package.""" | ||
|
|
||
| from graphrag_cache.cache import Cache | ||
| from graphrag_cache.cache_config import CacheConfig | ||
| from graphrag_cache.cache_factory import create_cache, register_cache | ||
| from graphrag_cache.cache_type import CacheType | ||
|
|
||
| __all__ = [ | ||
| "Cache", | ||
| "CacheConfig", | ||
| "CacheType", | ||
| "create_cache", | ||
| "register_cache", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # Copyright (c) 2024 Microsoft Corporation. | ||
| # Licensed under the MIT License | ||
|
|
||
| """Cache configuration model.""" | ||
|
|
||
| from graphrag_storage import StorageConfig | ||
| from pydantic import BaseModel, ConfigDict, Field | ||
|
|
||
| from graphrag_cache.cache_type import CacheType | ||
|
|
||
|
|
||
| class CacheConfig(BaseModel): | ||
| """The configuration section for cache.""" | ||
|
|
||
| model_config = ConfigDict(extra="allow") | ||
| """Allow extra fields to support custom cache implementations.""" | ||
|
|
||
| type: str = Field( | ||
| description="The cache type to use. Builtin types include 'json', 'memory', and 'none'.", | ||
| default=CacheType.Json, | ||
| ) | ||
|
|
||
| storage: StorageConfig | None = Field( | ||
| description="The storage configuration to use for file-based caches such as 'Json'.", | ||
| default=None, | ||
| ) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| # Copyright (c) 2024 Microsoft Corporation. | ||
| # Licensed under the MIT License | ||
|
|
||
|
|
||
| """Cache factory implementation.""" | ||
|
|
||
| from collections.abc import Callable | ||
|
|
||
| from graphrag_common.factory import Factory, ServiceScope | ||
|
||
| from graphrag_storage import Storage | ||
|
|
||
| from graphrag_cache.cache import Cache | ||
| from graphrag_cache.cache_config import CacheConfig | ||
| from graphrag_cache.cache_type import CacheType | ||
|
|
||
|
|
||
| class CacheFactory(Factory[Cache]): | ||
| """A factory class for cache implementations.""" | ||
|
|
||
|
|
||
| cache_factory = CacheFactory() | ||
|
|
||
|
|
||
| def register_cache( | ||
| cache_type: str, | ||
| cache_initializer: Callable[..., Cache], | ||
| scope: ServiceScope = "transient", | ||
| ) -> None: | ||
| """Register a custom cache implementation. | ||
|
|
||
| Args | ||
| ---- | ||
| - cache_type: str | ||
| The cache id to register. | ||
| - cache_initializer: Callable[..., Cache] | ||
| The cache initializer to register. | ||
| - scope: ServiceScope | ||
| The scope to register the initializer under. Defaults to "transient". | ||
|
||
| """ | ||
| cache_factory.register(cache_type, cache_initializer, scope) | ||
|
|
||
|
|
||
| def create_cache(config: CacheConfig, storage: Storage | None = None) -> Cache: | ||
| """Create a cache implementation based on the given configuration. | ||
|
|
||
| Args | ||
| ---- | ||
| - config: CacheConfig | ||
| The cache configuration to use. | ||
| - storage: Storage | None | ||
| The storage implementation to use for file-based caches such as 'Json'. | ||
|
|
||
| Returns | ||
| ------- | ||
| Cache | ||
| The created cache implementation. | ||
| """ | ||
| config_model = config.model_dump() | ||
| cache_strategy = config.type | ||
|
|
||
| if cache_strategy not in cache_factory: | ||
|
||
| match cache_strategy: | ||
| case "json": | ||
| from graphrag_cache.json_cache import JsonCache | ||
|
|
||
| register_cache(CacheType.Json, JsonCache) | ||
|
|
||
| case "memory": | ||
| from graphrag_cache.memory_cache import MemoryCache | ||
|
|
||
| register_cache(CacheType.Memory, MemoryCache) | ||
|
|
||
| case CacheType.Noop: | ||
|
||
| from graphrag_cache.noop_cache import NoopCache | ||
|
|
||
| register_cache(CacheType.Noop, NoopCache) | ||
|
|
||
| case _: | ||
| msg = f"CacheConfig.type '{cache_strategy}' is not registered in the CacheFactory. Registered types: {', '.join(cache_factory.keys())}." | ||
| raise ValueError(msg) | ||
|
|
||
| if storage: | ||
| config_model["storage"] = storage | ||
|
|
||
| return cache_factory.create(strategy=cache_strategy, init_args=config_model) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Copyright (c) 2024 Microsoft Corporation. | ||
| # Licensed under the MIT License | ||
|
|
||
|
|
||
| """Builtin cache implementation types.""" | ||
|
|
||
| from enum import StrEnum | ||
|
|
||
|
|
||
| class CacheType(StrEnum): | ||
| """Enum for cache types.""" | ||
|
|
||
| Json = "json" | ||
| Memory = "memory" | ||
| Noop = "none" |