fix: ripgrep fallback for list_files #250
Open
RyanLHicks wants to merge 7 commits into mpfaffenberger:main from RyanLHicks:fix/list-files-ripgrep-fallback
Changes from 4 commits
7 commits
cfeba28 fix: standardize token estimation heuristic and add consistency tests (RyanLHicks)
78ce419 fix: fall back to non-recursive listing when ripgrep is not installed (RyanLHicks)
b4a9938 fix: add missing math import and update stale token estimation comment (RyanLHicks)
c2a6663 fix: update test and stale comment to reflect ripgrep fallback behavior (RyanLHicks)
d32b274 Delete tests/agents/test_token_estimation_consistency.py (RyanLHicks)
b76f6dc style: apply ruff formatting (RyanLHicks)
1b46d6d fix: update tests for 2.5 chars/token heuristic (RyanLHicks)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,52 @@ |
| | | """Tests for token estimation consistency across modules. |
| | | |
| | | Ensures file_operations._read_file and BaseAgent.estimate_token_count |
| | | use the same chars-per-token heuristic to prevent unexpected early |
| | | compaction triggered by estimation mismatch. |
| | | """ |
| | | |
| | | import math |
| | | |
| | | from code_puppy.agents.agent_code_puppy import CodePuppyAgent |
| | | |
| | | |
| | | class TestTokenEstimationConsistency: |
| | |     """Token estimation should be consistent between file_operations and BaseAgent.""" |
| | | |
| | |     def test_estimate_token_count_matches_file_operations_heuristic(self): |
| | |         """ |
| | |         BaseAgent.estimate_token_count and file_operations._read_file |
| | |         must use the same 2.5 chars/token heuristic. |
| | |         """ |
| | |         agent = CodePuppyAgent() |
| | |         content = "x" * 1000 |
| | | |
| | |         base_agent_estimate = agent.estimate_token_count(content) |
| | |         expected_heuristic = math.floor(len(content) / 2.5) |
| | | |
| | |         assert base_agent_estimate == expected_heuristic |
| | | |
| | |     def test_estimation_consistent_across_content_sizes(self): |
| | |         """ |
| | |         Consistency holds across small, medium, and large content sizes. |
| | |         """ |
| | |         agent = CodePuppyAgent() |
| | | |
| | |         for size in [100, 1000, 10000, 25000]: |
| | |             content = "x" * size |
| | |             base_agent_estimate = agent.estimate_token_count(content) |
| | |             expected_heuristic = math.floor(len(content) / 2.5) |
| | |             assert base_agent_estimate == expected_heuristic, ( |
| | |                 f"Mismatch at size {size}: " |
| | |                 f"base_agent={base_agent_estimate}, " |
| | |                 f"expected={expected_heuristic}" |
| | |             ) |
| | | |
| | |     def test_minimum_token_count_is_one(self): |
| | |         """ |
| | |         estimate_token_count enforces a minimum of 1 token even for empty content. |
| | |         """ |
| | |         agent = CodePuppyAgent() |
| | | |
| | |         result = agent.estimate_token_count("") |
| | |         assert result == 1 |
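
For reviewers without the implementation open, the behavior these tests pin down can be sketched as follows. This is a minimal stand-in, not the actual `BaseAgent.estimate_token_count` code: the function and the `CHARS_PER_TOKEN` constant are hypothetical, and the formula is inferred only from the assertions above (floor of length divided by 2.5, with a minimum of 1).

```python
import math

# Assumed constant, taken from the "2.5 chars/token" wording in the test docstrings.
CHARS_PER_TOKEN = 2.5


def estimate_token_count(text: str) -> int:
    """Hypothetical stand-in for the heuristic the tests describe.

    Floors len(text) / 2.5 and never returns less than 1, so empty
    content still counts as one token.
    """
    return max(1, math.floor(len(text) / CHARS_PER_TOKEN))
```

Under this sketch, a 1000-character string is estimated at 400 tokens and an empty string at 1, which is what the three tests assert against the real agent.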
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,48 @@ |
| | | """Regression test for ripgrep fallback in _list_files. |
| | | |
| | | When ripgrep is not installed, _list_files should fall back to |
| | | non-recursive os.listdir instead of returning an error. |
| | | """ |
| | | |
| | | import os |
| | | import tempfile |
| | | from unittest.mock import patch |
| | | |
| | | from code_puppy.tools.file_operations import _list_files |
| | | |
| | | |
| | | class TestListFilesRipgrepFallback: |
| | |     """_list_files should gracefully handle missing ripgrep.""" |
| | | |
| | |     def test_falls_back_when_ripgrep_not_found(self): |
| | |         """ |
| | |         When ripgrep is not installed, _list_files should return |
| | |         a non-recursive listing instead of an error. |
| | |         """ |
| | |         with tempfile.TemporaryDirectory() as tmpdir: |
| | |             test_file = os.path.join(tmpdir, "test.py") |
| | |             with open(test_file, "w") as f: |
| | |                 f.write("print('hello')") |
| | | |
| | |             with patch("shutil.which", return_value=None): |
| | |                 result = _list_files(None, tmpdir, recursive=True) |
| | | |
| | |             # Should not return a hard error |
| | |             assert result.content is not None |
| | |             assert "not found" not in (result.content or "").lower() or "falling back" in (result.content or "").lower() |
| | |             # Should still return file listing |
| | |             assert "test.py" in result.content |
| | | |
| | |     def test_returns_files_without_ripgrep(self): |
| | |         """ |
| | |         Files in the directory should be listed even without ripgrep. |
| | |         """ |
| | |         with tempfile.TemporaryDirectory() as tmpdir: |
| | |             test_file = os.path.join(tmpdir, "myfile.py") |
| | |             with open(test_file, "w") as f: |
| | |                 f.write("x = 1") |
| | | |
| | |             with patch("shutil.which", return_value=None): |
| | |                 result = _list_files(None, tmpdir, recursive=True) |
| | | |
| | |             assert "myfile.py" in result.content |
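
The fallback these regression tests exercise might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `code_puppy.tools.file_operations`: the helper name `list_files_with_fallback`, its return shape, and the warning text are all hypothetical. Only `shutil.which`, `os.listdir`, and ripgrep's `--files` and `--max-depth` flags are real.

```python
from __future__ import annotations

import os
import shutil
import subprocess


def list_files_with_fallback(
    directory: str, recursive: bool = True
) -> tuple[list[str], str | None]:
    """Hypothetical helper: list files, degrading gracefully without ripgrep.

    Returns (entries, warning). `warning` is None when ripgrep handled the call.
    """
    rg_path = shutil.which("rg")
    if rg_path is None:
        # Fallback path: a shallow os.listdir instead of a hard error.
        entries = sorted(os.listdir(directory))
        warning = "ripgrep not found; falling back to non-recursive listing"
        return entries, warning

    # ripgrep path: `rg --files` prints every file it would search.
    cmd = [rg_path, "--files"]
    if not recursive:
        cmd += ["--max-depth", "1"]
    cmd.append(directory)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return sorted(completed.stdout.splitlines()), None
```

Because the sketch looks up `shutil.which` through the module at call time, `patch("shutil.which", return_value=None)` forces the fallback branch, which is the same patching strategy the tests above use. Whether that patch target also reaches the real `_list_files` depends on how the module imports `which`, which this view of the diff does not show.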