Yorkie Intelligence python migration #446
base: main
Conversation
- Performance is not good with local models (llama3.2, etc.)
- Add docs
Walkthrough

The pull request introduces new project configuration and API functionality for the Yorkie Intelligence project. It adds comprehensive configuration files and new API endpoints.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant C as Client
    participant A as FastAPI App
    participant L as Language Model
    C->>A: POST request to endpoint (e.g. /intelligence/pr/create)
    A->>L: Initialize processing chain with prompt template
    L-->>A: Stream response chunks
    A->>C: Deliver StreamingResponse (text/event-stream)
```
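For illustration, a client might consume this stream roughly as follows (a hypothetical sketch using httpx; only the endpoint path and request body shape are taken from this PR):

```python
# Hypothetical client sketch: consume the text/event-stream response
# from the /intelligence/pr/create endpoint using httpx.
import asyncio

import httpx


async def main() -> None:
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        async with client.stream(
            "POST", "/intelligence/pr/create", json={"query": "add login page"}
        ) as response:
            async for line in response.aiter_lines():
                print(line)  # each chunk streamed back by the FastAPI app


asyncio.run(main())
```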
Actionable comments posted: 4
🧹 Nitpick comments (15)
yorkie-intelligence/README.md (3)
1-2: Header Clarity and Consistency

The file header indicates a work-in-progress state correctly. Consider adding a space before the "(WIP)" for better readability (e.g., "Yorkie Intelligence (WIP)").
3-6: Improve Instruction Grammar and Capitalization

The setup instructions on lines 4-5 would benefit from improved grammar and consistent capitalization. For instance, "install python 3.10.* version(recommend using [pyenv]...)" can be refined for clarity and correctness.

```diff
-install python 3.10.* version(recommend using [pyenv](https://github.com/pyenv/pyenv))<br/>
-install [poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)<br/>
+Install Python 3.10.* (recommended to use [pyenv](https://github.com/pyenv/pyenv))<br/>
+Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)<br/>
```

🧰 Tools
🪛 LanguageTool
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
26-26: Link Formatting for OpenAPI Documentation

The OpenAPI documentation URL on line 26 is presented as a bare URL. For improved readability and compliance with markdown lint guidelines, consider formatting this as a clickable link.

```diff
-you can see openapi in http://localhost:8000
+You can access the OpenAPI documentation at [http://localhost:8000](http://localhost:8000).
```

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
26-26: Bare URL used
null (MD034, no-bare-urls)
yorkie-intelligence/.pre-commit-config.yaml (2)
6-9: Fix YAML Indentation for Hooks List Items

YAMLlint reports that the indentation for the hooks list items is off (expected 2 spaces but found 4). To align with common pre-commit configuration style and YAMLlint expectations, please adjust the indentation of the hooks section and its child elements. For example, the section should be reindented as follows:

```diff
-hooks:
-    # Run the linter.
-    - id: ruff
-    # Run the formatter.
-    - id: ruff-format
+hooks:
+  # Run the linter.
+  - id: ruff
+  # Run the formatter.
+  - id: ruff-format
```

🧰 Tools
🪛 YAMLlint (1.35.1)
[warning] 7-7: wrong indentation: expected 2 but found 4
(indentation)
[error] 9-9: no new line character at the end of file
(new-line-at-end-of-file)
9-9: Add Newline at End of File

The file is missing a newline at the end, which is causing a YAMLlint error. Please add a newline at the end of the file to conform with best practices and linting requirements.
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 9-9: no new line character at the end of file
(new-line-at-end-of-file)
yorkie-intelligence/.gitignore (1)
137-163: Environment Managers and Lock Files Settings

This section (lines 137–163) includes configurations for various Python environment tools:

- pyenv: Ignoring the .python-version file is typical for libraries but may be reconsidered for applications.
- pipenv: Pipfile.lock is commented out (per common pipenv recommendations).
- poetry: The comments advise that for applications it is generally recommended to commit poetry.lock for reproducible builds, yet the file explicitly ignores it.
- pdm: The guidance for pdm is clear and follows standard practices.

If this repository represents an application (as suggested by the migration objective) rather than a library, you might consider removing the exclusion for poetry.lock so that the lock file is tracked for dependency reproducibility. For example, you could revise the poetry section as follows:

```diff
-poetry.lock
+# Remove the exclusion for poetry.lock if committing it for reproducible dependency resolution.
```

yorkie-intelligence/src/common/utils.py (3)
5-5: Fix typo in app_name

There's a typo in the app name: "Intellignce" should be "Intelligence".

```diff
-    app_name: str = "Yorkie Intellignce"
+    app_name: str = "Yorkie Intelligence"
```
12-13: Make environment file path configurable

The environment file path is hardcoded to "src/.env". Consider making it configurable to support different environments (development, testing, production).

```diff
-    model_config = SettingsConfigDict(env_file="src/.env")
+    model_config = SettingsConfigDict(env_file="${ENV_FILE:src/.env}")
```
16-18: Resolve TODO comment about caching

The TODO comment asks about the difference between a basic definition and using lru_cache. Using lru_cache with settings can improve performance by caching the settings instance, preventing repeated environment variable lookups. However, it's only beneficial if the settings are accessed frequently and don't change during runtime.

Would you like me to help implement the lru_cache pattern for the settings?
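For reference, a minimal sketch of that pattern (assuming the pydantic-settings Settings class from src/common/utils.py; get_settings is a hypothetical helper name):

```python
# Minimal sketch of the lru_cache settings pattern (get_settings is a
# hypothetical helper; the Settings class mirrors src/common/utils.py).
from functools import lru_cache

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    app_name: str = "Yorkie Intelligence"

    model_config = SettingsConfigDict(env_file="src/.env")


@lru_cache
def get_settings() -> Settings:
    # Settings() reads the .env file and environment variables once;
    # every later call returns the same cached instance.
    return Settings()
```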
yorkie-intelligence/src/api/pr/views.py (1)
16-22: Improve error handling in event stream

The current error handling is too broad and doesn't provide specific error information.

```diff
 async def event_stream():
     try:
         async for chunk in chain.astream(query):
             yield f"data: {chunk}\n\n"
-    except Exception as e:
-        yield f"data: {str(e)}\n\n"
+    except ValueError as e:
+        yield f"data: Error: Invalid input - {str(e)}\n\n"
+    except ConnectionError as e:
+        yield f"data: Error: Language model service unavailable - {str(e)}\n\n"
+    except Exception as e:
+        yield f"data: Error: Unexpected error occurred - {str(e)}\n\n"
```

yorkie-intelligence/src/api/issue/views.py (1)
16-22: Enhance error handling with specific error types

The current error handling catches all exceptions generically. Consider catching specific exceptions to provide more meaningful error messages. Apply this diff to improve error handling:

```diff
 async def event_stream():
     try:
         async for chunk in chain.astream(query):
             yield f"data: {chunk}\n\n"
-    except Exception as e:
-        yield f"data: {str(e)}\n\n"
+    except ValueError as e:
+        yield f"data: Invalid input: {str(e)}\n\n"
+    except RuntimeError as e:
+        yield f"data: Processing error: {str(e)}\n\n"
+    except Exception as e:
+        yield f"data: Unexpected error: {str(e)}\n\n"
```

yorkie-intelligence/src/api/pr/config.py (2)
105-107: Improve template string readability using multi-line strings

The current string concatenation makes the prefix hard to read and maintain. Apply this diff to improve readability:

```diff
-    prefix="I want you to act as a GitHub PR Writer for me. I'll provide you with brief notes about GitHub PR, and you just need to write the PR. "
-    "Please ensure that you follow the template used in example provided. Do not provide the example as it is. Please write your responses in English. "
-    "If there is insufficient information to create the PR, request additional information",
+    prefix="""
+    I want you to act as a GitHub PR Writer for me. I'll provide you with brief notes about GitHub PR, and you just need to write the PR.
+    Please ensure that you follow the template used in example provided. Do not provide the example as it is. Please write your responses in English.
+    If there is insufficient information to create the PR, request additional information
+    """.strip(),
```
7-100: Consider adding more diverse PR examples

The current examples are good but could benefit from including scenarios like:
- Security fixes
- Documentation updates
- Performance improvements
yorkie-intelligence/src/api/issue/config.py (2)
77-79: Improve template string readability for consistency with PR config

Similar to the PR config, the string concatenation here could be improved for better readability. Apply this diff to improve readability:

```diff
-    prefix="I want you to act as a GitHub Issue writer. I will provide brief information about the GitHub issue I want to create, and you should write the GitHub issue. "
-    "The types of issues you can write are bug 🐞 or enhancement 🌟. Please ensure that you follow the template used in each type of issue example provided. Do not provide the example as it is. Please write your responses in English. "
-    "If there is insufficient information to create the issue, request additional information.",
+    prefix="""
+    I want you to act as a GitHub Issue writer. I will provide brief information about the GitHub issue I want to create, and you should write the GitHub issue.
+    The types of issues you can write are bug 🐞 or enhancement 🌟. Please ensure that you follow the template used in each type of issue example provided. Do not provide the example as it is. Please write your responses in English.
+    If there is insufficient information to create the issue, request additional information.
+    """.strip(),
```
3-5: Consider extracting common template logic to reduce duplication

The template structure is very similar between PR and issue configs. Consider creating a shared utility for template creation. Create a new utility file src/common/templates.py:

```python
from typing import List, Dict

from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate


def create_template(template_str: str, examples: List[Dict], prefix: str, suffix: str) -> FewShotPromptTemplate:
    example_prompt = PromptTemplate.from_template(template_str)
    return FewShotPromptTemplate(
        example_prompt=example_prompt,
        examples=examples,
        prefix=prefix,
        suffix=suffix,
        input_variables=["content"],
    )
```

Then use it in both config files to reduce duplication.

Also applies to: 74-82
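Usage in a config file might then look like this (a sketch; the variable names are hypothetical stand-ins for the template string, examples, prefix, and suffix already defined in each config):

```python
# Hypothetical usage in src/api/issue/config.py (names are placeholders).
issue_template_prompt = create_template(
    template_str=ISSUE_EXAMPLE_TEMPLATE,
    examples=ISSUE_EXAMPLES,
    prefix=ISSUE_PREFIX,
    suffix=ISSUE_SUFFIX,
)
```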
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- yorkie-intelligence/.gitignore (1 hunks)
- yorkie-intelligence/.pre-commit-config.yaml (1 hunks)
- yorkie-intelligence/README.md (1 hunks)
- yorkie-intelligence/pyproject.toml (1 hunks)
- yorkie-intelligence/src/.env.sample (1 hunks)
- yorkie-intelligence/src/api/__init__.py (1 hunks)
- yorkie-intelligence/src/api/issue/config.py (1 hunks)
- yorkie-intelligence/src/api/issue/views.py (1 hunks)
- yorkie-intelligence/src/api/pr/config.py (1 hunks)
- yorkie-intelligence/src/api/pr/views.py (1 hunks)
- yorkie-intelligence/src/common/llms.py (1 hunks)
- yorkie-intelligence/src/common/utils.py (1 hunks)
- yorkie-intelligence/src/main.py (1 hunks)
✅ Files skipped from review due to trivial changes (3)
- yorkie-intelligence/src/api/__init__.py
- yorkie-intelligence/src/.env.sample
- yorkie-intelligence/pyproject.toml
🧰 Additional context used
🪛 YAMLlint (1.35.1)
yorkie-intelligence/.pre-commit-config.yaml
[warning] 7-7: wrong indentation: expected 2 but found 4
(indentation)
[error] 9-9: no new line character at the end of file
(new-line-at-end-of-file)
🪛 LanguageTool
yorkie-intelligence/README.md
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
🪛 markdownlint-cli2 (0.17.2)
yorkie-intelligence/README.md
26-26: Bare URL used
null
(MD034, no-bare-urls)
🪛 Ruff (0.8.2)
yorkie-intelligence/src/api/issue/views.py
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
yorkie-intelligence/src/api/pr/views.py
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🔇 Additional comments (10)
yorkie-intelligence/README.md (4)
7-8: Development Section Acknowledgment

The "### dev" section is clear and uses a standard header for development instructions.

9-11: Dev Environment Command Block

The code block for development (poetry install --no-root) is clear and correctly formatted.

13-18: Production Environment Command Block

The "### prod" section and its associated command block (poetry install --without dev) are properly formatted and straightforward.

19-20: Section Header for 'How To Start'

The "## How To Start" section header is clear and appropriately introduces the commands to start the application.
yorkie-intelligence/.gitignore (5)
1-3: Autogenerated Header is Clear and Informative

The header clearly credits the Toptal gitignore generator and documents the templates used—this is useful context for future maintenance.

4-18: Linux Section is Standard and Comprehensive

The Linux block (lines 4–18) correctly ignores common temporary files (such as backup files ending with ~), transient files from fuse, KDE preferences, and Linux trash folders. This helps maintain a clean repository.

19-47: macOS and Remote Share Files Coverage

The macOS section (lines 19–47) properly lists system files like .DS_Store, Apple-specific files, and even directories potentially created on AFP shares. This ensures that macOS system artifacts do not clutter version control.

52-136: Python Build Artifacts and Testing Reports Inclusion

The block covering Python build artifacts (lines 52–136) is well crafted. It includes rules for ignoring:

- Byte-compiled files (__pycache__/, *.py[cod], etc.)
- C extensions (*.so)
- Packaging directories (e.g., build/, dist/, etc.)
- PyInstaller artifacts, installer logs, coverage reports, translation files, and even framework-specific files for Django, Flask, Scrapy, and Sphinx docs

This comprehensive list covers most common unwanted files for a Python project.

164-243: Additional Tools, IDE, and Editor Configurations

The remaining sections (lines 164–243) cover a wide array of tools and configurations:

- PEP 582 support
- Celery scheduling files
- SageMath, various environment directories
- IDE-specific configurations for Spyder, Rope, PyCharm, and Visual Studio Code (including local history and extension build files)
- Linters and type checkers (mypy, Pyre, pytype, ruff, LSP config)

These entries are standard and help ensure that temporary, build-specific, or environment-specific files do not get committed.
yorkie-intelligence/src/main.py (1)
10-16: Restrict CORS settings for production

The current CORS configuration is too permissive, allowing all origins (*), methods, and headers. This might pose security risks in a production environment. Consider restricting these settings based on your deployment requirements. For example:

```diff
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
-    allow_methods=["*"],
-    allow_headers=["*"],
+    allow_origins=["https://your-frontend-domain.com"],
+    allow_methods=["GET", "POST"],
+    allow_headers=["Authorization", "Content-Type"],
     allow_credentials=True,
 )
```

✅ Verification successful

CORS settings must be restricted for a secure production setup. The current configuration in yorkie-intelligence/src/main.py (lines 10-16) allows all origins, methods, and headers, which is suitable only for development. For production, it's recommended to limit these settings to trusted domains and permitted HTTP methods/headers.

- File under review: yorkie-intelligence/src/main.py
- Issue: Permissive CORS configuration (allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])
- Recommendation: Restrict these settings based on your deployment requirements (e.g., specific domains, required methods, and headers).

🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found
yorkie-intelligence/README.md
Outdated
```sh
git clone https://github.com/yorkie-team/codepair.git
cd yorkie-installigence
poetry run uvicorn src.main:app --reload
```
Correct Repository Navigation Command

Within the startup instructions, the command on line 23 (cd yorkie-installigence) appears to be a typo given the repository name. It should likely be cd yorkie-intelligence to match the project name.

```diff
-cd yorkie-installigence
+cd yorkie-intelligence
```
```python
def get_model():
    if SETTINGS.model_type == "ollama":
        llm = ChatOllama(model=SETTINGS.model_name, temperature=0)
    elif SETTINGS.model_type == "openai":
        llm = ChatOpenAI(
            model=SETTINGS.model_name, api_key=SETTINGS.api_key, temperature=0
        )
    else:
        raise ValueError("Invalid model type")

    # TODO
    # support more model type

    return llm
```
🛠️ Refactor suggestion
Add type hints and improve error handling.
The function would benefit from type hints and better error handling for configuration issues.
```diff
-def get_model():
+from typing import Literal
+
+def get_model() -> ChatOllama | ChatOpenAI:
+    """Get a language model instance based on configuration.
+
+    Returns:
+        ChatOllama | ChatOpenAI: Configured language model instance.
+
+    Raises:
+        ValueError: If model_type is invalid or required configuration is missing.
+    """
+    model_type = SETTINGS.model_type.lower()
+    if not SETTINGS.model_name:
+        raise ValueError("model_name must be configured")
+
     if SETTINGS.model_type == "ollama":
+        if not SETTINGS.ollama_url:
+            raise ValueError("ollama_url must be configured for Ollama models")
         llm = ChatOllama(model=SETTINGS.model_name, temperature=0)
     elif SETTINGS.model_type == "openai":
+        if not SETTINGS.api_key:
+            raise ValueError("api_key must be configured for OpenAI models")
         llm = ChatOpenAI(
             model=SETTINGS.model_name, api_key=SETTINGS.api_key, temperature=0
         )
     else:
-        raise ValueError("Invalid model type")
+        raise ValueError(f"Invalid model_type: {SETTINGS.model_type}. Supported types: ollama, openai")
     return llm
```
@router.get("/streaming_pr") | ||
async def get_pr(query: str, llm=Depends(get_model)): | ||
chain = issue_template_prompt | llm | StrOutputParser() |
🛠️ Refactor suggestion
Add input validation and fix Depends usage.
The endpoint needs input validation and should follow FastAPI's best practices for dependency injection.
```diff
+from typing import Annotated
+from pydantic import BaseModel, Field
+
+class PRQuery(BaseModel):
+    query: str = Field(..., min_length=1, description="The query to generate PR from")
+
 @router.get("/streaming_pr")
-async def get_pr(query: str, llm=Depends(get_model)):
+async def get_pr(
+    query: Annotated[str, Field(min_length=1)],
+    llm: Annotated[ChatOllama | ChatOpenAI, Depends(get_model)]
+):
```
🧰 Tools
🪛 Ruff (0.8.2)
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
@router.get("/streaming_issue") | ||
async def get_issue(query: str, llm=Depends(get_model)): |
🛠️ Refactor suggestion
Refactor dependency injection to follow best practices.
The current implementation of Depends(get_model)
in the parameter default violates best practices (B008). This could lead to unexpected behavior if the dependency is evaluated at import time.
Apply this diff to fix the dependency injection:
```diff
-async def get_issue(query: str, llm=Depends(get_model)):
+async def get_issue(query: str, llm: Any = Depends(get_model)):
```
🧰 Tools
🪛 Ruff (0.8.2)
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
Actionable comments posted: 0
♻️ Duplicate comments (2)
yorkie-intelligence/src/api/pr/views.py (1)
12-13: 🛠️ Refactor suggestion

Add input validation and fix Depends usage.

The endpoint needs input validation and should follow FastAPI's best practices for dependency injection.

```diff
+from typing import Annotated
+from pydantic import BaseModel, Field
+
+class PRQuery(BaseModel):
+    query: str = Field(..., min_length=1, description="The query to generate PR from")
+
 @router.get("/create")
-async def make_pr(query: str, llm=Depends(get_model)):
+async def make_pr(
+    query: Annotated[str, Field(min_length=1)],
+    llm: Annotated[ChatOllama | ChatOpenAI, Depends(get_model)]
+):
```

🧰 Tools
🪛 Ruff (0.8.2)
13-13: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
yorkie-intelligence/src/api/issue/views.py (1)
12-13: 🛠️ Refactor suggestion

Refactor dependency injection to follow best practices.

The current implementation of Depends(get_model) in the parameter default violates best practices (B008). This could lead to unexpected behavior if the dependency is evaluated at import time. Apply this diff to fix the dependency injection:

```diff
-async def make_issue(query: str, llm=Depends(get_model)):
+async def make_issue(query: str, llm: Any = Depends(get_model)):
```

Don't forget to add the following import:

```python
from typing import Any
```

🧰 Tools
🪛 Ruff (0.8.2)
13-13: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
🧹 Nitpick comments (6)
yorkie-intelligence/README.md (3)
1-1: Title Formatting Improvement

Consider adding a space before the "(WIP)" to improve readability. For example, change # Yorkie Intelligence(WIP) to # Yorkie Intelligence (WIP).
4-5: Clarify Installation Steps

The installation instructions could be rephrased to start with a capital letter and clarify the Python version requirement. For instance, you might update them as follows:

```diff
-install python 3.10.* version(recommend using [pyenv](https://github.com/pyenv/pyenv))<br/>
-install [poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)<br/>
+Install Python 3.10.x (recommended to use [pyenv](https://github.com/pyenv/pyenv))<br/>
+Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)<br/>
```
27-27: Improve OpenAPI Access Instruction

The instruction on line 27 can be improved for clarity and to comply with markdown best practices by replacing the preposition and the bare URL. For example:

```diff
-you can see openapi in http://localhost:8000/docs
+You can see the OpenAPI documentation at [http://localhost:8000/docs](http://localhost:8000/docs)
```

🧰 Tools
🪛 LanguageTool
[uncategorized] ~27-~27: The preposition “at” seems more likely in this position than the preposition “in”.
Context: ...n:app --reload ``` you can see openapi in http://localhost:8000/docs
(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_IN_AT)
🪛 markdownlint-cli2 (0.17.2)
27-27: Bare URL used
null (MD034, no-bare-urls)
yorkie-intelligence/src/api/pr/views.py (2)
1-7: Add type hints for language models.

Import the LLM types to improve type safety and IDE support.

```diff
 from fastapi import APIRouter, Depends
 from fastapi.responses import StreamingResponse
 from langchain_core.output_parsers import StrOutputParser
+from langchain_community.chat_models import ChatOllama
+from langchain_openai import ChatOpenAI

 from src.common.llms import get_model
 from .config import issue_template_prompt
```

14-23: Enhance error handling with structured responses.

The current error handling could be improved to provide more detailed error information to clients.

```diff
 async def event_stream():
     try:
         async for chunk in chain.astream(query):
             yield f"data: {chunk}\n\n"
     except Exception as e:
-        yield f"data: {str(e)}\n\n"
+        error_response = {
+            "error": {
+                "type": e.__class__.__name__,
+                "message": str(e)
+            }
+        }
+        import json
+        yield f"data: {json.dumps(error_response)}\n\n"
```

yorkie-intelligence/src/api/issue/views.py (1)

14-23: Enhance error handling in event stream.

While the basic error handling is in place, consider these improvements:

- Log the error for debugging
- Return a structured error response

Apply this diff to enhance error handling:

```diff
 async def event_stream():
     try:
         async for chunk in chain.astream(query):
             yield f"data: {chunk}\n\n"
     except Exception as e:
+        import logging
+        logging.error(f"Error in event stream: {str(e)}")
+        error_response = {
+            "error": "An error occurred during processing",
+            "detail": str(e)
+        }
-        yield f"data: {str(e)}\n\n"
+        yield f"data: {error_response}\n\n"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- yorkie-intelligence/README.md (1 hunks)
- yorkie-intelligence/src/api/__init__.py (1 hunks)
- yorkie-intelligence/src/api/issue/views.py (1 hunks)
- yorkie-intelligence/src/api/pr/views.py (1 hunks)
- yorkie-intelligence/src/common/utils.py (1 hunks)
- yorkie-intelligence/src/main.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- yorkie-intelligence/src/main.py
- yorkie-intelligence/src/common/utils.py
🧰 Additional context used
🪛 Ruff (0.8.2)
yorkie-intelligence/src/api/issue/views.py
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
yorkie-intelligence/src/api/pr/views.py
13-13: Do not perform function call Depends
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🪛 LanguageTool
yorkie-intelligence/README.md
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
[uncategorized] ~27-~27: The preposition “at” seems more likely in this position than the preposition “in”.
Context: ...n:app --reload ``` you can see openapi in http://localhost:8000/docs
(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_IN_AT)
🪛 markdownlint-cli2 (0.17.2)
yorkie-intelligence/README.md
27-27: Bare URL used
null
(MD034, no-bare-urls)
🔇 Additional comments (6)
yorkie-intelligence/README.md (1)
22-25: Correct Directory Navigation Command

There is a typo in the repository directory name on line 23. The command cd yorkie-intalligence should be corrected to cd yorkie-intelligence to accurately reflect the project name and ensure proper navigation.

```diff
-cd yorkie-intalligence
+cd yorkie-intelligence
```

yorkie-intelligence/src/api/__init__.py (3)
1-4: LGTM!

The imports are well-organized and include all necessary dependencies for the router setup.

6-6: LGTM!

The router is correctly initialized using FastAPI's APIRouter.

8-9: LGTM!

The sub-routers are correctly included with appropriate RESTful API prefixes.
yorkie-intelligence/src/api/issue/views.py (2)
1-6: LGTM!

The imports are well-organized and include all necessary dependencies for the issue creation endpoint.

9-9: LGTM!

The router is correctly initialized using FastAPI's APIRouter.
- Add pytest-asyncio for LLM tests
- Add a simple LLM response test
- Change API response
- Fix API request
- Add .gitignore rule (F401)
Actionable comments posted: 2
🔭 Outside diff range comments (1)
yorkie-intelligence/test/test_stream_response.py (1)
26-40: Fix incorrect endpoint and remove code duplication.

Critical issues found:

- The function name test_stream_issue suggests testing issue creation, but it's testing the PR creation endpoint
- The code is duplicated from test_stream_pr

Apply these changes:

1. Update the endpoint to test issue creation:

```diff
-    "POST", "/intelligence/pr/create", json={"query": "hi"}
+    "POST", "/intelligence/issue/create", json={"query": "hi"}
```

2. Consider extracting common test logic to avoid duplication:

```python
@pytest.fixture
async def api_client():
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
    ) as client:
        yield client


async def test_stream_endpoint(api_client, endpoint: str, query: str):
    async with api_client.stream(
        "POST", endpoint, json={"query": query}
    ) as response:
        assert response.status_code == 200
        data = []
        async for line in response.aiter_lines():
            data.append(line)
        return data


@pytest.mark.asyncio
async def test_stream_pr(api_client):
    await test_stream_endpoint(api_client, "/intelligence/pr/create", "test PR query")


@pytest.mark.asyncio
async def test_stream_issue(api_client):
    await test_stream_endpoint(api_client, "/intelligence/issue/create", "test issue query")
```
♻️ Duplicate comments (1)
yorkie-intelligence/src/api/issue/views.py (1)
12-13: ⚠️ Potential issue

Fix dependency injection and add type hints.

The dependency injection issue from the previous review still needs to be addressed. Apply this diff to fix the dependency injection and add proper type hints:

```diff
-async def make_issue(query: str = Body(embed=True), llm=Depends(get_model)):
+async def make_issue(
+    query: str = Body(embed=True),
+    llm: Any = Depends(get_model)
+):
```

Don't forget to add the following import:

```python
from typing import Any
```
🧹 Nitpick comments (1)
yorkie-intelligence/test/test_stream_response.py (1)
7-8: Translate and implement the TODO comment.

The TODO comment in Korean reduces code accessibility. Consider:

- Translating it to English
- Implementing the model configuration for tests
- Adding it to the test configuration file instead of inline comments

Apply this diff:

```diff
-# TODO
-# test시 smollm2:135m 모델 사용하도록 변경
+# TODO: Configure tests to use smollm2:135m model
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- yorkie-intelligence/pyproject.toml (1 hunks)
- yorkie-intelligence/src/api/__init__.py (1 hunks)
- yorkie-intelligence/src/api/issue/__init__.py (1 hunks)
- yorkie-intelligence/src/api/issue/views.py (1 hunks)
- yorkie-intelligence/src/api/pr/__init__.py (1 hunks)
- yorkie-intelligence/src/api/pr/views.py (1 hunks)
- yorkie-intelligence/src/common/llms.py (1 hunks)
- yorkie-intelligence/src/common/utils.py (1 hunks)
- yorkie-intelligence/test/test_stream_response.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- yorkie-intelligence/src/api/issue/__init__.py
- yorkie-intelligence/src/api/pr/__init__.py
🚧 Files skipped from review as they are similar to previous changes (5)
- yorkie-intelligence/src/api/pr/views.py
- yorkie-intelligence/src/api/__init__.py
- yorkie-intelligence/src/common/utils.py
- yorkie-intelligence/src/common/llms.py
- yorkie-intelligence/pyproject.toml
🔇 Additional comments (2)
yorkie-intelligence/test/test_stream_response.py (1)
1-4: LGTM!

The imports are well-organized and appropriate for async testing with FastAPI.

yorkie-intelligence/src/api/issue/views.py (1)

1-11: LGTM! Well-organized imports and router setup.

The imports are properly organized following PEP 8 guidelines, and the router setup is clean.
```python
@pytest.mark.asyncio
async def test_stream_pr():
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
    ) as client:
        async with client.stream(
            "POST", "/intelligence/pr/create", json={"query": "hi"}
        ) as response:
            assert response.status_code == 200  # check the response status

            # data = []
            # async for line in response.aiter_lines():
            #     data.append(line)
```
🛠️ Refactor suggestion
Enhance test coverage and data validation.
The test has several areas for improvement:
- It only verifies the response status code without validating the response data
- Uses hardcoded test data
- Doesn't test error cases
- Contains commented-out code that should be implemented
Consider implementing these improvements:
```python
@pytest.mark.asyncio
@pytest.mark.parametrize("query,expected_status", [
    ("valid query", 200),
    ("", 400),
    ("invalid_query" * 1000, 400)  # Test query length limits
])
async def test_stream_pr(query: str, expected_status: int):
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
    ) as client:
        async with client.stream(
            "POST", "/intelligence/pr/create", json={"query": query}
        ) as response:
            assert response.status_code == expected_status
            if response.status_code == 200:
                data = []
                async for line in response.aiter_lines():
                    data.append(line)
                    assert isinstance(line, str)  # Validate response format
                assert len(data) > 0  # Ensure we received some data
```
- Add doc write test
- Add document writing function
Actionable comments posted: 4
🔭 Outside diff range comments (1)
yorkie-intelligence/test/test_stream_response.py (1)
42-56: Implement response data validation.

The test should validate the streamed response data. Consider:

- Implementing the commented-out code for response validation
- Adding assertions to verify the response format and content
- Testing with different query inputs

Apply this diff to implement response validation:

```diff
 async with client.stream(
     "POST", "/intelligence/doc/create", json={"query": "hi"}
 ) as response:
     assert response.status_code == 200  # check the response status
+    data = []
+    async for line in response.aiter_lines():
+        data.append(line)
+        assert isinstance(line, str)
+    assert len(data) > 0
```
🧹 Nitpick comments (3)
yorkie-intelligence/src/api/write_document/config.py (1)
76-107: Minor consistency nitpick regarding the punctuation and quotes in the prefix.

You might consider making the usage of quotes around "Yorkie Intelligence" more uniform to prevent any confusion or unintended formatting in multiline strings. Example potential update:

```diff
-    prefix="""You are an AI documentation assistant named "Yorkie Intelligence.
+    prefix="""You are an AI documentation assistant named "Yorkie Intelligence."
```

yorkie-intelligence/src/api/__init__.py (1)
7-7: Add type hints for better code maintainability.

Consider adding type hints to improve code maintainability and IDE support. Apply this diff:

```diff
-router = APIRouter()
+router: APIRouter = APIRouter()
```

yorkie-intelligence/test/test_stream_response.py (1)
7-8: Translate TODO comment to English and track it.

The TODO comment in Korean suggests using the smollm2:135m model for testing. Consider:

1. Translating the comment to English for consistency.
2. Creating an issue to track this requirement.

Apply this diff to translate the comment:

```diff
-# TODO
-# test시 smollm2:135m 모델 사용하도록 변경
+# TODO: Update tests to use smollm2:135m model
```

Would you like me to create an issue to track this TODO?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- yorkie-intelligence/src/api/__init__.py (1 hunks)
- yorkie-intelligence/src/api/write_document/__init__.py (1 hunks)
- yorkie-intelligence/src/api/write_document/config.py (1 hunks)
- yorkie-intelligence/src/api/write_document/views.py (1 hunks)
- yorkie-intelligence/test/test_stream_response.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- yorkie-intelligence/src/api/write_document/__init__.py
🔇 Additional comments (6)
yorkie-intelligence/src/api/write_document/config.py (3)
1-2: No issues found in the imports.

3-4: Definition of 'example_prompt' looks good.

5-74: Examples provide clear and comprehensive subscription details.

Any potential modifications are optional, but please confirm that the large Markdown content is correctly parsed and displayed as intended in downstream usage.

yorkie-intelligence/src/api/__init__.py (2)
7-11: LGTM! Clean router organization.

The router configuration follows FastAPI best practices with clear prefix naming and logical separation of concerns.

9-11: 🏁 Script executed:

```shell
#!/bin/bash
# Search for API documentation or OpenAPI specs
fd -e md -e yaml -e json . | xargs rg -l "intelligence.*(?:pr|issue|doc)"
```

Length of output: 75
yorkie-intelligence/test/test_stream_response.py (1)
11-24: Enhance test coverage and data validation.

The test has several areas for improvement that were previously identified:
- Only verifies response status without validating response data
- Uses hardcoded test data
- Doesn't test error cases
- Contains commented-out code that should be implemented
```python
from fastapi import APIRouter

from src.api.pr import router as pr_router
from src.api.issue import router as issue_router
from src.api.write_document import router as doc_router
```
🛠️ Refactor suggestion
Use relative imports for internal modules.
The current absolute imports starting with 'src' may cause issues when the package is installed. Consider using relative imports for internal modules.
Apply this diff to improve the imports:
```diff
 from fastapi import APIRouter

-from src.api.pr import router as pr_router
-from src.api.issue import router as issue_router
-from src.api.write_document import router as doc_router
+from .pr import router as pr_router
+from .issue import router as issue_router
+from .write_document import router as doc_router
```
```python
async def event_stream():
    try:
        async for chunk in chain.astream(query):
            yield chunk
    except Exception as e:
        yield f"\n\n{str(e)}\n\n"
```
🛠️ Refactor suggestion
Enhance error handling and input validation.
The current error handling is basic and could be improved. Consider:
- Validating the input query length and format.
- Structuring error responses consistently with success responses.
- Adding specific error handling for different types of exceptions.
Apply this diff to improve error handling:
```diff
 async def event_stream():
+    if not query or len(query.strip()) == 0:
+        yield "Error: Query cannot be empty"
+        return
+
     try:
         async for chunk in chain.astream(query):
             yield chunk
-    except Exception as e:
-        yield f"\n\n{str(e)}\n\n"
+    except ValueError as e:
+        yield f"Error: Invalid input - {str(e)}"
+    except Exception as e:
+        yield f"Error: An unexpected error occurred - {str(e)}"
+        # Log the error for debugging
+        logger.error(f"Unexpected error in write_document: {str(e)}")
```
```python
async with client.stream(
    "POST", "/intelligence/pr/create", json={"query": "hi"}
) as response:
```
Fix incorrect endpoint URL.

The test is using the PR endpoint (/intelligence/pr/create) instead of the issue endpoint.
Apply this diff to fix the endpoint URL:
```diff
 async with client.stream(
-    "POST", "/intelligence/pr/create", json={"query": "hi"}
+    "POST", "/intelligence/issue/create", json={"query": "hi"}
 ) as response:
```
Thank you for your contribution! I've left few comments.
```python
router.include_router(pr_router, prefix="/pr")
router.include_router(issue_router, prefix="/issue")
router.include_router(doc_router, prefix="/doc")
```
How about integrating these APIs into a single file using a path parameter?
Since all actions appear to be the same, with only the prompt differing, integrating them could improve scalability. What do you think?
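For example, something along these lines could work (a rough, untested sketch; the prompt variable names are assumptions, not code from this PR):

```python
# Rough sketch of the consolidated endpoint idea: one route whose path
# parameter selects the prompt template (prompt names are assumed).
from fastapi import APIRouter, Body, Depends, HTTPException
from fastapi.responses import StreamingResponse
from langchain_core.output_parsers import StrOutputParser

from src.common.llms import get_model
from src.api.issue.config import issue_template_prompt
from src.api.pr.config import pr_template_prompt
from src.api.write_document.config import document_writing_template_prompt

router = APIRouter()

# Map the path parameter to the matching prompt template.
PROMPTS = {
    "pr": pr_template_prompt,
    "issue": issue_template_prompt,
    "doc": document_writing_template_prompt,
}


@router.post("/{task}/create")
async def create(task: str, query: str = Body(embed=True), llm=Depends(get_model)):
    prompt = PROMPTS.get(task)
    if prompt is None:
        raise HTTPException(status_code=404, detail=f"unknown task: {task}")
    chain = prompt | llm | StrOutputParser()
    return StreamingResponse(chain.astream(query), media_type="text/event-stream")
```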
Right now, only a few-shot prompt is applied to all of these, but I think that in the future each function could be implemented in a different way to enhance model performance (e.g., referencing a vector DB, or using a LangGraph agent). So, IMO, if we have a plan to change the method later, separating the APIs is the better choice.
Q) Also, can we create a mock LLM to improve testability?
I found this (FakeLLM). I'll apply it for testing; it might not be the optimal solution, but I'll try it and share my experience.
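Roughly like this (an untested sketch; it assumes langchain_core's FakeStreamingListLLM and the get_model dependency from src/common/llms.py):

```python
# Untested sketch: override the get_model dependency with a fake LLM so
# tests never call a real model (FakeStreamingListLLM is from langchain_core).
from langchain_core.language_models import FakeStreamingListLLM

from src.common.llms import get_model
from src.main import app

fake_llm = FakeStreamingListLLM(responses=["canned test reply"])
app.dependency_overrides[get_model] = lambda: fake_llm
```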
- Remove empty files
- Change config naming rule
- Fix URL path
Actionable comments posted: 1
♻️ Duplicate comments (4)
yorkie-intelligence/README.md (1)
22-26: Correct Repository Navigation Command

The command cd yorkie-intalligence appears to contain a typo. It should likely be updated to:

```diff
-cd yorkie-intalligence
+cd yorkie-intelligence
```

This correction ensures consistency with the repository name.
yorkie-intelligence/src/api/write_document/views.py (1)
16-22: Enhance error handling and input validation.

The current implementation needs improved error handling and input validation.

```diff
 async def event_stream():
+    # Validate input
+    if not query or len(query.strip()) == 0:
+        yield "Error: Query cannot be empty"
+        return
+
+    # Add logging
+    import logging
+    logger = logging.getLogger(__name__)
+    logger.info(f"Processing document writing request with query: {query[:50]}...")
+
     try:
         async for chunk in chain.astream(query):
             yield chunk
     except Exception as e:
-        yield f"\n\n{str(e)}\n\n"
+        error_msg = f"Error during document generation: {str(e)}"
+        logger.error(error_msg)
+        yield error_msg
```

yorkie-intelligence/src/api/issue/views.py (2)
13-13: Fix dependency injection to follow best practices.

The current implementation of Depends(get_model) in the parameter default violates best practices (B008). This could lead to unexpected behavior if the dependency is evaluated at import time. Apply this diff to fix the dependency injection:

```diff
-async def make_issue(query: str = Body(embed=True), llm=Depends(get_model)):
+async def make_issue(query: str = Body(embed=True), llm: Any = Depends(get_model)):
```

14-23: Improve error handling and add type hints for event streaming.

The error handling could be more specific and structured, and the error messages should follow SSE format. Apply this diff to improve the implementation:

```diff
+from typing import AsyncGenerator
+
 async def make_issue(query: str = Body(embed=True), llm: Any = Depends(get_model)):
     chain = issue_template_prompt | llm | StrOutputParser()

-    async def event_stream():
+    async def event_stream() -> AsyncGenerator[str, None]:
         try:
             async for chunk in chain.astream(query):
-                yield chunk
+                yield f"data: {chunk}\n\n"
         except Exception as e:
-            yield f"\n\n{str(e)}\n\n"
+            error_message = {
+                "error": str(e),
+                "type": e.__class__.__name__
+            }
+            yield f"event: error\ndata: {str(error_message)}\n\n"
```
🧹 Nitpick comments (10)
yorkie-intelligence/README.md (2)
3-5: Improve Clarity and Grammar in Setup Instructions

The instructions in these lines would read more clearly with improved phrasing. For example, consider rewording:

- From: install python 3.10.* version(recommend using [pyenv](https://github.com/pyenv/pyenv))<br/>
- To: Install Python 3.10.* (recommended using [pyenv](https://github.com/pyenv/pyenv))<br/>

And similarly update the poetry installation line for consistency.
🧰 Tools
🪛 LanguageTool
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
28-28: Format OpenAPI Documentation URL

Consider formatting the URL as a Markdown link to improve readability. For example, replace the bare URL with [OpenAPI docs](http://localhost:8000/docs). Also, capitalizing the sentence ("You can see…") would enhance clarity.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
28-28: Bare URL used
null (MD034, no-bare-urls)
yorkie-intelligence/src/api/write_document/views.py (3)
8-11: Reduce consecutive blank lines.

Multiple consecutive blank lines can be reduced to a single blank line to maintain consistent spacing.

```diff
 from .config import document_writing_template_prompt

-
 router = APIRouter()
```
12-14: Add endpoint documentation and type hints.

The endpoint would benefit from:

- OpenAPI documentation using docstring
- Type hints for better code maintainability

```diff
 @router.post("/")
 async def write_document(query: str = Body(embed=True), llm=Depends(get_model)):
+    """
+    Generate document content based on the provided query.
+
+    Args:
+        query (str): The input query for document generation
+        llm: The language model instance
+
+    Returns:
+        StreamingResponse: A stream of generated document content
+    """
+    from langchain.chains import LLMChain
+
     chain = document_writing_template_prompt | llm | StrOutputParser()
```
23-24: Add response headers for SSE.

For Server-Sent Events (SSE), additional headers should be set to ensure proper client handling.

```diff
-    return StreamingResponse(event_stream(), media_type="text/event-stream")
+    response = StreamingResponse(event_stream(), media_type="text/event-stream")
+    response.headers["Cache-Control"] = "no-cache"
+    response.headers["Connection"] = "keep-alive"
+    return response
```

yorkie-intelligence/src/api/issue/views.py (1)
8-11: Remove excessive blank lines.

There are multiple consecutive blank lines that can be reduced to a single blank line for better code organization. Apply this diff to fix the spacing:

```diff
 from .config import issue_template_prompt

-
 router = APIRouter()
-
```

yorkie-intelligence/src/common/utils.py (4)
5-5: Fix typo in APP_NAME.

There's a typo in the application name: "Intellignce" should be "Intelligence".

```diff
-    APP_NAME: str = "Yorkie Intellignce"
+    APP_NAME: str = "Yorkie Intelligence"
```
8-9: Consider providing default values for required fields.

MODEL_TYPE and MODEL_NAME are required fields without defaults. This might cause initialization failures if not properly configured. Consider either:

- Providing sensible defaults
- Adding validation to ensure these values are set correctly

```diff
-    MODEL_TYPE: str
-    MODEL_NAME: str
+    MODEL_TYPE: str = "ollama"  # Default to ollama as shown in .env.deployment
+    MODEL_NAME: str = "smollm2:135m"  # Default model
```
7-7: Make OLLAMA_URL more configurable.

The current OLLAMA_URL default assumes a localhost setup, which might not work in production or different environments. Consider using a more flexible default or environment-specific configuration.

```diff
-    OLLAMA_URL: str = "localhost:11434"
+    OLLAMA_URL: str = "http://localhost:11434"  # Add protocol
```
12-12: Make env file path more flexible.

The hardcoded path "src/.env.deployment" might cause issues in different environments or when running tests. Consider:

- Making it configurable
- Supporting multiple environment files
- Using relative paths

```diff
-    model_config = SettingsConfigDict(env_file="src/.env.deployment")
+    model_config = SettingsConfigDict(
+        env_file=".env.deployment",
+        env_file_path=[".", "src/", "../src/"]
+    )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- yorkie-intelligence/README.md (1 hunks)
- yorkie-intelligence/src/.env.deployment (1 hunks)
- yorkie-intelligence/src/api/issue/views.py (1 hunks)
- yorkie-intelligence/src/api/pr/views.py (1 hunks)
- yorkie-intelligence/src/api/write_document/views.py (1 hunks)
- yorkie-intelligence/src/common/llms.py (1 hunks)
- yorkie-intelligence/src/common/utils.py (1 hunks)
- yorkie-intelligence/test/test_stream_response.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- yorkie-intelligence/src/common/llms.py
- yorkie-intelligence/src/api/pr/views.py
- yorkie-intelligence/test/test_stream_response.py
🧰 Additional context used
🪛 LanguageTool
yorkie-intelligence/README.md
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
🪛 markdownlint-cli2 (0.17.2)
yorkie-intelligence/README.md
28-28: Bare URL used
null
(MD034, no-bare-urls)
🔇 Additional comments (3)
yorkie-intelligence/README.md (3)
1-1: Header Indicates WIP Status

The project title clearly states it's a work in progress, which is helpful.

10-12: Dev Installation Command Reviewed

The dev installation command poetry install --no-root is clear and conforms to standard usage.

14-18: Production Installation Command Reviewed

The production command poetry install --without dev is properly specified for production use.
I will try to also review this PR, but it will take some time 😅
- Add how to run tests
- Fix test code with FakeLLM
Actionable comments posted: 2
♻️ Duplicate comments (1)
yorkie-intelligence/README.md (1)
24-24: ⚠️ Potential issue

Fix the directory name typo.

The directory name contains a typo: "yorkie-intalligence" should be "yorkie-intelligence".

```diff
-cd yorkie-intalligence
+cd yorkie-intelligence
```
🧹 Nitpick comments (7)
yorkie-intelligence/README.md (6)
1-2: Add a project description.

Consider adding a brief description of what Yorkie Intelligence is, its purpose, and its key features to help users understand the project better.
`3-6`: Improve the setting section formatting and clarity.

The setting section could be improved for better readability:

```diff
 ## Setting
-install python 3.10.* version(recommend using [pyenv](https://github.com/pyenv/pyenv))<br/>
-install [poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)<br/>
+### Prerequisites
+
+1. Install Python 3.10.* (we recommend using [pyenv](https://github.com/pyenv/pyenv))
+2. Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)
```
🪛 LanguageTool
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...(SINGULAR_NOUN_VERB_AGREEMENT)
`8-18`: Add context for environment setup.

Consider explaining the differences between dev and prod environments and when to use each:

````diff
 ### dev
+Development environment includes all dependencies needed for development, testing, and debugging.
 ```sh
 poetry install --no-root
 ```
 ### prod
+Production environment includes only the dependencies required to run the application.
 ```sh
 poetry install --without dev
 ```
````

`28-28`: Format the OpenAPI URL properly.

The OpenAPI URL should be properly formatted as a Markdown link:

```diff
-you can see openapi in http://localhost:8000/docs
+You can access the OpenAPI documentation at [http://localhost:8000/docs](http://localhost:8000/docs)
```
🧰 Tools
🪛 LanguageTool
[uncategorized] ~28-~28: The preposition “at” seems more likely in this position than the preposition “in”.
Context: ...n:app --reload ``` you can see openapi in http://localhost:8000/docs ## How To R...(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_IN_AT)
🪛 markdownlint-cli2 (0.17.2)
28-28: Bare URL used
null(MD034, no-bare-urls)
`30-34`: Enhance test documentation.

Consider adding more details about the test suite:

````diff
 ## How To Run TestCode
+
+Our test suite includes:
+- Unit tests
+- Integration tests
+- API endpoint tests
 ```sh
 poetry run python -m pytest
 ```
+
+To run specific test files or test cases, you can use:
+```sh
+poetry run python -m pytest path/to/test_file.py -k "test_name"
+```
````
`36-37`: Add basic Ollama setup steps.

Consider adding basic setup steps before referring to the detailed documentation:

````diff
 ## Ollama SetUp
-If you want to more detail, visit [here!](https://hub.docker.com/r/ollama/ollama)
+
+Quick start:
+```sh
+docker pull ollama/ollama
+docker run -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
+```
+
+For more detailed setup instructions, visit the [Ollama Docker Hub page](https://hub.docker.com/r/ollama/ollama).
````
yorkie-intelligence/test/test_stream_response.py (1)
`10-18`: Enhance the client fixture for better reusability and maintainability.

While the fixture correctly implements dependency override and client setup, consider these improvements:

```diff
+# Constants
+BASE_URL = "http://127.0.0.1:8000"
+FAKE_RESPONSES = [
+    "hello, I'm a Yorkie Intelligence ChatBot, How Can I help you?",
+    "Here's your pull request summary...",
+    "I've analyzed your issue...",
+    "I've prepared a document..."
+]

 @pytest_asyncio.fixture()
 async def client():
-    fake_responses = ["hello, I'm a Yorkie Intelligence ChatBot, How Can I help you?"]
-    fake_llm = FakeStreamingListLLM(responses=fake_responses)
+    fake_llm = FakeStreamingListLLM(responses=FAKE_RESPONSES)
     app.dependency_overrides[get_model] = lambda: fake_llm
     async with AsyncClient(
-        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
+        transport=ASGITransport(app=app), base_url=BASE_URL
     ) as client:
         yield client
```
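A related cleanup worth considering: the override is never undone, so it leaks into any later test that wants the real dependency. A hedged sketch building on the constants above (the import paths for `app` and `get_model` are assumptions about the test module's layout, not confirmed by this PR):

```python
import pytest_asyncio
from httpx import ASGITransport, AsyncClient
from langchain_core.language_models.fake import FakeStreamingListLLM

# Hypothetical import paths; adjust to the application package under test.
from src.main import app
from src.common.llms import get_model

BASE_URL = "http://127.0.0.1:8000"
FAKE_RESPONSES = ["hello, I'm a Yorkie Intelligence ChatBot, How Can I help you?"]


@pytest_asyncio.fixture()
async def client():
    fake_llm = FakeStreamingListLLM(responses=FAKE_RESPONSES)
    app.dependency_overrides[get_model] = lambda: fake_llm
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url=BASE_URL
    ) as client:
        yield client
    # Undo the override after the test so later tests see the real dependency.
    app.dependency_overrides.clear()
```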
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- yorkie-intelligence/README.md (1 hunks)
- yorkie-intelligence/test/test_stream_response.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
yorkie-intelligence/README.md
[grammar] ~3-~3: There seems to be a noun/verb agreement error. Did you mean “installs” or “installed”?
Context: # Yorkie Intelligence(WIP) ## Setting install python 3.10.* version(recommend using [...
(SINGULAR_NOUN_VERB_AGREEMENT)
[uncategorized] ~28-~28: The preposition “at” seems more likely in this position than the preposition “in”.
Context: ...n:app --reload ``` you can see openapi in http://localhost:8000/docs ## How To R...
(AI_EN_LECTOR_REPLACEMENT_PREPOSITION_IN_AT)
🪛 markdownlint-cli2 (0.17.2)
yorkie-intelligence/README.md
28-28: Bare URL used
null
(MD034, no-bare-urls)
🔇 Additional comments (1)
yorkie-intelligence/test/test_stream_response.py (1)
`1-8`: LGTM! Good choice of testing tools.

The imports are well-organized and the choice of testing tools (pytest_asyncio, httpx, FakeStreamingListLLM) is appropriate for testing async streaming endpoints.
```python
@pytest.mark.asyncio()
async def test_stream_pr(client):
    async with client.stream(
        "POST", "/intelligence/pr/", json={"query": "hi"}
    ) as response:
        assert response.status_code == 200


@pytest.mark.asyncio
async def test_stream_issue():
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
    ) as client:
        async with client.stream(
            "POST", "/intelligence/issue/", json={"query": "hi"}
        ) as response:
            assert response.status_code == 200


@pytest.mark.asyncio
async def test_stream_write_doc():
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://127.0.0.1:8000"
    ) as client:
        async with client.stream(
            "POST", "/intelligence/doc/", json={"query": "hi"}
        ) as response:
            assert response.status_code == 200
```
🛠️ Refactor suggestion
Improve test coverage with parameterized tests and response validation.
The current tests only verify the response status code. They should also:
- Test different query scenarios
- Validate response data format
- Test error cases
- Verify streaming behavior
Here's a suggested implementation:
```python
import pytest
from typing import AsyncGenerator

TEST_CASES = [
    pytest.param("valid query", 200, id="valid_query"),
    pytest.param("", 400, id="empty_query"),
    pytest.param("invalid_query" * 1000, 400, id="query_too_long"),
]


async def validate_stream_response(response: AsyncGenerator) -> None:
    data = []
    async for line in response.aiter_lines():
        data.append(line)
        assert isinstance(line, str)  # Validate response format
    assert len(data) > 0  # Ensure we received some data


@pytest.mark.asyncio
@pytest.mark.parametrize("query,expected_status", TEST_CASES)
async def test_stream_pr(client, query: str, expected_status: int):
    async with client.stream(
        "POST", "/intelligence/pr/", json={"query": query}
    ) as response:
        assert response.status_code == expected_status
        if response.status_code == 200:
            await validate_stream_response(response)


@pytest.mark.asyncio
@pytest.mark.parametrize("query,expected_status", TEST_CASES)
async def test_stream_issue(client, query: str, expected_status: int):
    async with client.stream(
        "POST", "/intelligence/issue/", json={"query": query}
    ) as response:
        assert response.status_code == expected_status
        if response.status_code == 200:
            await validate_stream_response(response)


@pytest.mark.asyncio
@pytest.mark.parametrize("query,expected_status", TEST_CASES)
async def test_stream_write_doc(client, query: str, expected_status: int):
    async with client.stream(
        "POST", "/intelligence/doc/", json={"query": query}
    ) as response:
        assert response.status_code == expected_status
        if response.status_code == 200:
            await validate_stream_response(response)
```
- add models file and change query parameter type
- make in-memory chat history store
- modify apis to allow use of chat history
Actionable comments posted: 0
♻️ Duplicate comments (3)
yorkie-intelligence/src/api/write_document/views.py (1)
`34-42`: 🛠️ Refactor suggestion: Improve error handling in event_stream function.

The current error handling is quite basic. Consider adding more specific error handling with proper error categorization and logging.
The current error handling is quite basic. Consider adding more specific error handling with proper error categorization and logging.
async def event_stream(): yield f"{session_id}\n" try: async for chunk in chain.astream( content, config={"session_id": session_id} ): yield chunk - except Exception as e: - yield f"\n\n{str(e)}\n\n" + except ValueError as e: + yield f"\n\nInput Error: {str(e)}\n\n" + except ConnectionError as e: + yield f"\n\nConnection Error: Unable to connect to language model service\n\n" + except Exception as e: + # Add logging here + yield f"\n\nUnexpected Error: An error occurred while processing your request\n\n"yorkie-intelligence/src/api/issue/views.py (2)
19-19
: 🛠️ Refactor suggestionAdd type annotation to the dependency injection parameter.
The parameter
llm
lacks type annotation, which violates best practices (B008) for dependency injection in FastAPI. This could lead to unexpected behavior at import time.Apply this diff to fix the dependency injection:
-async def make_issue(query: Query, llm=Depends(get_model)): +async def make_issue(query: Query, llm: Any = Depends(get_model)):You'll also need to add the import:
+from typing import Any from fastapi import APIRouter, Depends, Body
36-44
: 🛠️ Refactor suggestionImprove error handling and add type hints for event streaming.
The current error handling is basic and doesn't follow the Server-Sent Events (SSE) format. Proper SSE formatting includes prefixing messages with
data:
and ending with double newlines.Apply this diff to improve the implementation:
+from typing import AsyncGenerator + -async def event_stream(): +async def event_stream() -> AsyncGenerator[str, None]: try: async for chunk in chain.astream( content, config={"session_id": session_id} ): - yield chunk + yield f"data: {chunk}\n\n" except Exception as e: - yield f"\n\n{str(e)}\n\n" + error_message = { + "error": str(e), + "type": e.__class__.__name__ + } + yield f"event: error\ndata: {json.dumps(error_message)}\n\n"Don't forget to add the import for
json
:+import json from fastapi import APIRouter, Depends, Body
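For a quick sanity check of the corrected framing, a client can filter on the `data: ` prefix. A minimal sketch using httpx (the same client the tests above use), with the endpoint and request payload borrowed from those tests:

```python
import asyncio

import httpx


async def read_issue_stream() -> None:
    async with httpx.AsyncClient(base_url="http://127.0.0.1:8000") as client:
        async with client.stream(
            "POST", "/intelligence/issue/", json={"query": "hi"}
        ) as response:
            async for line in response.aiter_lines():
                # With `data: `-prefixed frames, payload lines are easy to
                # separate from the blank lines that terminate each event.
                if line.startswith("data: "):
                    print(line[len("data: "):])


asyncio.run(read_issue_stream())
```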
🧹 Nitpick comments (15)
yorkie-intelligence/src/api/pr/config.py (4)
`12-105`: Consider extracting examples to a separate configuration file.

The examples are well-structured and provide good templates for PR creation. However, hardcoding such large data structures directly in the code reduces maintainability. Consider moving these examples to a separate JSON or YAML configuration file.
- examples = [ - { - "title": "Error with `qemu` when launching `yorkie` project using `yorkieteam/yorkie` image on Apple M1", - "content": """<!-- Thanks for sending a pull request! --> - -#### What this PR does / why we need it? - -Change "documents" to "document" in DocEvent - -#### Any background context you want to provide? - -There were previous proto changes in the js-sdk, so I have set the target branch to "concurrent-case-handling." - -#### What are the relevant tickets? -<!-- -*Automatically closes linked issue when PR is merged. -Usage: \`Fixes #<issue number>\`, or \`Fixes (paste link of issue)\`. ---> -Related https://github.com/yorkie-team/yorkie/issues/612 - -### Checklist -- [x] Added relevant tests or not required -- [x] Didn't break anything - """, - }, - # ... other examples - ] + from .example_data import examples # Import from a separate moduleAnd create a new file
yorkie-intelligence/src/api/pr/example_data.py
with the examples.
16-35
: Fix inconsistent indentation in example PR content.There is inconsistent indentation in the first example's content, particularly with extra spaces after the triple quotes. This might affect how Markdown is rendered.
"content": """<!-- Thanks for sending a pull request! --> - + #### What this PR does / why we need it?
107-115
: Consider breaking the long prefix string for better readability.The
prefix
text is quite long and spans multiple lines. Consider using Python's triple quotes for multi-line strings to improve readability.- prefix="I want you to act as a GitHub PR Writer for me. I'll provide you with brief notes about GitHub PR, and you just need to write the PR. " - "Please ensure that you follow the template used in example provided. Do not provide the example as it is. Please write your responses in English. " - "If there is insufficient information to create the PR, request additional information", + prefix="""I want you to act as a GitHub PR Writer for me. I'll provide you with brief notes about GitHub PR, and you just need to write the PR. + Please ensure that you follow the template used in example provided. Do not provide the example as it is. Please write your responses in English. + If there is insufficient information to create the PR, request additional information""",
`1-127`: Add docstrings to improve code documentation.

Consider adding docstrings to explain the purpose of each template and how they're intended to be used. This will help future developers understand the design intentions.
+ """ + Configuration module for PR-related prompt templates using LangChain. + + This module defines prompt templates for generating GitHub pull requests + and handling chat interactions within a Markdown editor context. + """ from langchain_core.prompts import ( PromptTemplate, FewShotPromptTemplate, ChatPromptTemplate, MessagesPlaceholder, ) + """Template for formatting individual PR examples with title and content.""" example_prompt = PromptTemplate.from_template( "[Example]\n## Title\n{title}\n## Content\n{content}" ) + """Collection of example PRs used for few-shot learning.""" examples = [ # ... ] + """ + Few-shot prompt template for generating GitHub PRs. + + Uses example PRs to guide the model in creating new PRs based on brief user input. + If insufficient information is provided, the prompt instructs to request more details. + """ pr_template_prompt = FewShotPromptTemplate( # ... ) + """ + Chat template for AI assistance in a Markdown editor context. + + Designed to maintain conversation context and respond to the latest user query. + """ chat_template = ChatPromptTemplate.from_messages( # ... )yorkie-intelligence/src/api/write_document/models.py (2)
1-7
: Consider adding docstrings to improve code documentation.Adding docstrings to describe the purpose of the class and its fields would improve code documentation and make it easier for other developers to understand the model's role in the application.
from pydantic import BaseModel class Query(BaseModel): + """ + Model representing a document write query. + + Attributes: + document_id: Unique identifier for the document + session_id: Optional session identifier for conversation tracking + content: The content to be written in the document + """ document_id: str session_id: str | None = None content: str
1-7
: Consider creating a common base model instead of duplicating across modules.I notice that identical
Query
models are defined in multiple places (issue/models.py
,pr/models.py
, and here). Consider creating a shared base model in a common location to avoid duplication.You could create a common model in
src/common/models.py
:# src/common/models.py from pydantic import BaseModel class BaseQuery(BaseModel): """Base query model with common fields.""" document_id: str session_id: str | None = None content: strThen import and use or extend it in each module:
from src.common.models import BaseQuery # Either use directly Query = BaseQuery # Or extend if needed class Query(BaseQuery): # Add additional fields if required passyorkie-intelligence/src/api/write_document/views.py (3)
18-25
: Add validation for content beyond Pydantic.While Pydantic ensures that
content
exists, it doesn't validate its quality. Consider adding validation for empty or very short content.async def write_document(query: Query, llm=Depends(get_model)): content = {"content": query.content} + + # Validate content quality + if not query.content.strip(): + return StreamingResponse( + iter([f"{generate_session_id()}\n", "Error: Content cannot be empty"]), + media_type="text/event-stream" + ) + if len(query.content.strip()) < 5: # Adjust minimum length as needed + return StreamingResponse( + iter([f"{generate_session_id()}\n", "Error: Content is too short"]), + media_type="text/event-stream" + ) if query.session_id is None: session_id = generate_session_id() _chain = document_writing_template_prompt | llm | StrOutputParser() else: session_id = query.session_id _chain = chat_template | llm | StrOutputParser()
1-44
: Add logging throughout the endpoint.Consider adding logging to track the flow of requests, especially for errors. This will help with debugging and monitoring in production.
from typing import Annotated from fastapi import APIRouter, Depends, Body from fastapi.responses import StreamingResponse from langchain_core.output_parsers import StrOutputParser from langchain_core.runnables.history import RunnableWithMessageHistory +import logging from src.common.llms import get_model from src.common.utils import get_by_session_id, generate_session_id from .models import Query from .config import document_writing_template_prompt, chat_template +# Set up logging +logger = logging.getLogger(__name__) router = APIRouter() @router.post("/") async def write_document(query: Query, llm=Depends(get_model)): + logger.info(f"Received document write request for document ID: {query.document_id}") content = {"content": query.content} if query.session_id is None: session_id = generate_session_id() + logger.info(f"Created new session: {session_id}") _chain = document_writing_template_prompt | llm | StrOutputParser() else: session_id = query.session_id + logger.info(f"Using existing session: {session_id}") _chain = chat_template | llm | StrOutputParser() chain = RunnableWithMessageHistory( _chain, get_by_session_id, input_messages_key="content", history_messages_key="chat_history", ) async def event_stream(): yield f"{session_id}\n" try: + logger.debug(f"Starting stream for session: {session_id}") async for chunk in chain.astream( content, config={"session_id": session_id} ): yield chunk except Exception as e: + logger.error(f"Error in stream for session {session_id}: {str(e)}") yield f"\n\n{str(e)}\n\n" + logger.info(f"Returning streaming response for session: {session_id}") return StreamingResponse(event_stream(), media_type="text/event-stream")
16-44
: Add documentation for the API endpoint.Adding API documentation will help users understand how to use this endpoint. FastAPI uses these docstrings to generate OpenAPI documentation.
@router.post("/") async def write_document(query: Query, llm=Depends(get_model)): + """ + Create or continue a document writing session. + + This endpoint accepts document content and processes it using a language model. + If no session_id is provided, a new session is created and the document writing + template is used. If a session_id is provided, the conversation continues using + the chat template. + + The response is streamed as events, with the first line containing the session_id. + + Args: + query: The Query object containing document_id, optional session_id, and content + llm: Language model obtained through dependency injection + + Returns: + StreamingResponse: A streaming response containing the generated text + """ content = {"content": query.content} # Rest of the function...yorkie-intelligence/src/api/issue/views.py (1)
45-45
: Return a structured response object with session ID.The API currently doesn't return the session ID to the client, which would be helpful for continuing the conversation in subsequent requests.
Apply this diff to include the session ID in the response headers:
- return StreamingResponse(event_stream(), media_type="text/event-stream") + response = StreamingResponse(event_stream(), media_type="text/event-stream") + response.headers["X-Session-ID"] = session_id + return responseyorkie-intelligence/src/common/utils.py (4)
10-10
: Fix typo in application name.

There's a spelling error in the application name.
```diff
-    APP_NAME: str = "Yorkie Intellignce"
+    APP_NAME: str = "Yorkie Intelligence"
```
39-41
: Fix typo in TODO comment and clarify TTL cache implementation.

There's a spelling error in the TODO comment, and it's unclear how the TTLCache is being used, as it's not directly integrated with the InMemoryHistory class.

Apply these changes:
Apply these changes:
-# TODO -# chane this to TTLCache -store = TTLCache(maxsize=100, ttl=60) +# TODO: Refine TTL parameters based on usage patterns +# Cache for storing conversation histories with 60-second expiration +store = TTLCache(maxsize=100, ttl=60)Consider making the TTL and maxsize configurable via settings:
```diff
-store = TTLCache(maxsize=100, ttl=60)
+store = TTLCache(maxsize=SETTINGS.CACHE_MAX_SIZE, ttl=SETTINGS.CACHE_TTL_SECONDS)
```
class Settings(BaseSettings): APP_NAME: str = "Yorkie Intelligence" GO_BACKEND_URL: str = "" OLLAMA_URL: str = "localhost:11434" MODEL_TYPE: str MODEL_NAME: str API_KEY: str | None = None + CACHE_MAX_SIZE: int = 100 + CACHE_TTL_SECONDS: int = 60
17-17
: Consider making the env file path configurable.Hardcoding the environment file path makes it less flexible for different deployment scenarios.
- model_config = SettingsConfigDict(env_file="src/.env.deployment") + model_config = SettingsConfigDict( + env_file=".env.deployment", + env_file_encoding="utf-8", + env_nested_delimiter="__" + )This allows for more flexibility when deployed in different environments, as the default path can be overridden by environment variables.
30-30
: Add type validation for messages.The
messages
field should ensure that only valid message types are added.- messages: list[BaseMessage] = Field(default_factory=list) + messages: list[BaseMessage] = Field( + default_factory=list, + description="List of conversation messages" + )yorkie-intelligence/src/api/write_document/config.py (1)
8-8
: Consider adding more detail to the example prompt template.The example prompt is quite minimal compared to the detailed examples. Adding more structure could help guide the model better.
-example_prompt = PromptTemplate.from_template("[Example]\n## Output\n{output}") +example_prompt = PromptTemplate.from_template("[Example]\n## Title\n{title}\n## Output\n{output}")If you implement this change, you'll need to add a "title" field to each example in the examples list.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- yorkie-intelligence/pyproject.toml (1 hunks)
- yorkie-intelligence/src/api/issue/config.py (1 hunks)
- yorkie-intelligence/src/api/issue/models.py (1 hunks)
- yorkie-intelligence/src/api/issue/views.py (1 hunks)
- yorkie-intelligence/src/api/pr/config.py (1 hunks)
- yorkie-intelligence/src/api/pr/models.py (1 hunks)
- yorkie-intelligence/src/api/pr/views.py (1 hunks)
- yorkie-intelligence/src/api/write_document/config.py (1 hunks)
- yorkie-intelligence/src/api/write_document/models.py (1 hunks)
- yorkie-intelligence/src/api/write_document/views.py (1 hunks)
- yorkie-intelligence/src/common/utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- yorkie-intelligence/src/api/pr/views.py
- yorkie-intelligence/pyproject.toml
🔇 Additional comments (22)
yorkie-intelligence/src/api/pr/config.py (3)
1-6
: Imports look good.The imports are clean and correctly specify only the necessary classes from
langchain_core.prompts
.
8-10
: Simple and effective template structure.The
example_prompt
is well defined with a clear structure for formatting PR examples with title and content sections.
117-126
: Well-structured chat template.The chat template is appropriately structured with a system message, history placeholder, and human message, which is perfect for a conversational AI assistant in a Markdown editor context.
yorkie-intelligence/src/api/write_document/models.py (1)
4-7
: Model definition looks good.The
Query
class is properly defined with appropriate type annotations for all fields. Using Pydantic'sBaseModel
is a good practice for input validation in FastAPI applications.yorkie-intelligence/src/api/pr/models.py (3)
4-7
: Model definition looks good.The
Query
class is properly defined with appropriate type annotations for all fields. Using Pydantic'sBaseModel
is a good practice for input validation in FastAPI applications.
1-7
: Consider adding docstrings to improve code documentation.Adding docstrings to describe the purpose of the class and its fields would improve code documentation and make it easier for other developers to understand the model's role in the application.
from pydantic import BaseModel class Query(BaseModel): + """ + Model representing a pull request creation query. + + Attributes: + document_id: Unique identifier for the document + session_id: Optional session identifier for conversation tracking + content: The content for the pull request + """ document_id: str session_id: str | None = None content: str
1-7
: Code duplication across model files.This model definition is duplicated across multiple files (
issue/models.py
,write_document/models.py
, and here). Consider creating a common base model to reduce duplication.yorkie-intelligence/src/api/issue/models.py (3)
4-7
: Model definition looks good.The
Query
class is properly defined with appropriate type annotations for all fields. Using Pydantic'sBaseModel
is a good practice for input validation in FastAPI applications.
1-7
: Consider adding docstrings to improve code documentation.Adding docstrings to describe the purpose of the class and its fields would improve code documentation and make it easier for other developers to understand the model's role in the application.
from pydantic import BaseModel class Query(BaseModel): + """ + Model representing an issue creation query. + + Attributes: + document_id: Unique identifier for the document + session_id: Optional session identifier for conversation tracking + content: The content for the issue + """ document_id: str session_id: str | None = None content: str
1-7
: Code duplication across model files.This model definition is duplicated across multiple files (
pr/models.py
,write_document/models.py
, and here). Consider creating a common base model to reduce duplication.yorkie-intelligence/src/api/write_document/views.py (3)
16-17
: API endpoint definition looks good.The POST endpoint is properly defined using FastAPI's router with appropriate parameter annotations. Using the
Query
model from your models file for request validation is a good practice.
20-25
: Conditional chain selection based on session is well implemented.The logic for selecting different templates based on session existence is clear and well-structured. This approach allows for different conversation flows based on whether it's a new or existing session.
27-32
: LangChain message history integration is well implemented.Using
RunnableWithMessageHistory
for maintaining conversation history is a good approach. The configuration with appropriate input and history message keys is correct.yorkie-intelligence/src/api/issue/views.py (1)
29-34
:❓ Verification inconclusive
Add error handling for RunnableWithMessageHistory initialization.
The current implementation doesn't handle potential errors during chain initialization, which could occur if there are issues with the message history store.
Consider adding a try-except block around this initialization:
- chain = RunnableWithMessageHistory( - _chain, - get_by_session_id, - input_messages_key="content", - history_messages_key="chat_history", - ) + try: + chain = RunnableWithMessageHistory( + _chain, + get_by_session_id, + input_messages_key="content", + history_messages_key="chat_history", + ) + except Exception as e: + return StreamingResponse( + iter([f"data: Error initializing chat history: {str(e)}\n\n"]), + media_type="text/event-stream" + )Let's verify how
RunnableWithMessageHistory
is typically used and if it throws exceptions:
🏁 Script executed:
#!/bin/bash # Look for examples of RunnableWithMessageHistory error handling in the codebase rg -A 5 -B 5 "RunnableWithMessageHistory.*try" --type pyLength of output: 58
Action Required: Verify Exception Handling for RunnableWithMessageHistory Initialization
After searching the repository for existing patterns of exception handling around
RunnableWithMessageHistory
, no similar try-except blocks were found. This suggests that error handling for this initialization might not be uniformly applied elsewhere. However, considering that issues with the message history store could potentially lead to runtime errors, it's important to manually verify whether exceptions may occur during initialization and if a try-except block is appropriate.Please review the following changes in your local environment to ensure they fit your error-handling strategy:
- Confirm that
RunnableWithMessageHistory
may indeed raise exceptions during initialization.- Verify that wrapping the initialization in a try-except block does not inadvertently obscure underlying issues.
- Ensure that returning a
StreamingResponse
with the error message is consistent with the project’s expected error propagation framework.Here’s the suggested diff:
- chain = RunnableWithMessageHistory( - _chain, - get_by_session_id, - input_messages_key="content", - history_messages_key="chat_history", - ) + try: + chain = RunnableWithMessageHistory( + _chain, + get_by_session_id, + input_messages_key="content", + history_messages_key="chat_history", + ) + except Exception as e: + return StreamingResponse( + iter([f"data: Error initializing chat history: {str(e)}\n\n"]), + media_type="text/event-stream" + )Please manually verify this change and confirm that it handles runtime exceptions in accordance with your expectations.
yorkie-intelligence/src/common/utils.py (2)
26-27
:❓ Verification inconclusive
Address the TODO comments for TTL cache implementation.
There are unresolved TODO comments about applying TTL cache, but a TTLCache is already implemented on line 41. This suggests inconsistency or incomplete implementation.
🏁 Script executed:
```bash
#!/bin/bash
# Check if InMemoryHistory already uses the TTLCache defined on line 41
rg -A 5 -B 5 "InMemoryHistory.*TTLCache" --type py
```
Action: Reconcile TTLCache Usage in InMemoryHistory
The grep search did not return any instance where InMemoryHistory leverages the TTLCache (as intended by the comment). Please verify one of the following:
- If TTLCache is meant to be integrated within InMemoryHistory, update the class to actually use it.
- Otherwise, remove or update the TODO comment in
yorkie-intelligence/src/common/utils.py
(lines 26–27) to reflect the current implementation.
44-47
:❓ Verification inconclusive
Consider thread safety for the global store variable.
The
get_by_session_id
function modifies a global dictionary, which could lead to race conditions in a multi-threaded environment.Verify if the code runs in a multi-threaded environment:
If multi-threading is used, consider using a thread-safe dictionary or adding locks:
+import threading + +# Thread lock for store access +_store_lock = threading.Lock() def get_by_session_id(session_id: str) -> BaseChatMessageHistory: - if session_id not in store: - store[session_id] = InMemoryHistory() - return store[session_id] + with _store_lock: + if session_id not in store: + store[session_id] = InMemoryHistory() + return store[session_id]
🏁 Script executed:
```bash
#!/bin/bash
# Check if the application uses multi-threading
rg -A 2 -B 2 "ThreadPoolExecutor|ProcessPoolExecutor|threading|multiprocessing|concurrent.futures" --type py
```
Review Threading Concern Verification for Global Store in
get_by_session_id
It appears from the search output that there are no clear signs of multi-threading usage in the codebase (no references to
ThreadPoolExecutor
,threading
, etc.). However, because the inference quality is low and the output was empty, please manually verify that no part of the application creates concurrent access tostore
. If there is any chance of future threading, consider applying a lock as suggested:+import threading + +# Thread lock for store access +_store_lock = threading.Lock() def get_by_session_id(session_id: str) -> BaseChatMessageHistory: - if session_id not in store: - store[session_id] = InMemoryHistory() - return store[session_id] + with _store_lock: + if session_id not in store: + store[session_id] = InMemoryHistory() + return store[session_id]Please double-check the threading model usage across the codebase to confirm if this function may be accessed concurrently.
yorkie-intelligence/src/api/issue/config.py (3)
12-77
: Excellent examples for few-shot learning.The examples provided for GitHub issues are comprehensive, well-structured, and cover different issue types (bug, enhancement, common issue). They follow GitHub issue conventions and provide clear templates for the model to learn from.
The inclusion of real-world examples with proper markdown formatting will help the model generate more accurate and useful issue templates.
79-87
: Well-structured prompt template with clear instructions.The few-shot prompt template is well-designed with clear instructions for the model. It provides context on the task (GitHub issue writing), specifies the types of issues it can write, and instructs the model to follow the templates from the examples.
The instruction to request additional information when there's insufficient data is particularly valuable for ensuring quality issue creation.
89-98
: Good chat template design for Markdown editor context.The chat template is designed appropriately for a Markdown editor interface. It provides clear system instructions and properly handles the conversation history through the MessagesPlaceholder.
This implementation will allow for maintaining context in multi-turn conversations while keeping the model focused on the task of assisting with Markdown editing.
yorkie-intelligence/src/api/write_document/config.py (3)
10-79
: Comprehensive documentation example with detailed formatting.The example provided is detailed and illustrates proper documentation format for JavaScript code, including proper annotation of events, code examples, and alerts for important information.
The example will help the model generate similarly well-structured documentation responses.
81-112
: Thorough AI assistant guidelines with clear boundaries.The guidelines for the AI documentation assistant are comprehensive and clearly define expected behavior. They enforce neutrality, professionalism, and focus on documentation tasks while explicitly prohibiting certain types of content.
The instructions are specific about the assistant's name, language, and the format of responses, which will help ensure consistency in the interaction with users.
114-123
: Chat template is consistent with other modules.The chat template follows the same pattern as in the issue config, providing a consistent user experience across different features of the application.
Consistency in design patterns helps maintain a cohesive codebase and reduces cognitive load for developers working on the project.
Thank you for your contribution!
It seems like the PR is still WIP, so I will start with the small but common ones.
- change pre-commit ruff version to 0.9.10
- add autofix arg in ruff
- add end line
- change .env.deployment API_KEY to empty string
What this PR does / why we need it:
This PR begins the Yorkie Intelligence Python migration, using Poetry and pyenv to provide a standardized environment for development and dependency management.

It is currently in a Work In Progress (WIP) state, with API development underway.
Which issue(s) this PR fixes:
Fixes #430
Checklist:
Summary by CodeRabbit
New Features
Documentation
Chores