2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -25,7 +25,7 @@ repos:
hooks:
- id: mypy
name: mypy
-entry: uv run mypy src/
+entry: uv run mypy src/ tests/
language: system
types: [python]
pass_filenames: false
45 changes: 45 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,45 @@
This is built on top of [CocoIndex v1](https://cocoindex.io/docs-v1/llms.txt).


## Build and Test Commands

This project uses [uv](https://docs.astral.sh/uv/) for project management.

```bash
uv run mypy . # Type check Python code
uv run pytest tests/ # Run Python tests
```

## Code Conventions

### Internal vs External Modules

We distinguish between **internal modules** (under packages with `_` prefix, e.g. `_internal.*` or `connectors.*._source`) and **external modules** (which users can directly import).

**External modules** (user-facing, e.g. `cocoindex/ops/sentence_transformers.py`):

* Be strict about not leaking implementation details
* Use `__all__` to explicitly list public exports
* Prefix ALL non-public symbols with `_`, including:
  * Standard library imports: `import threading as _threading`, `import typing as _typing`
  * Third-party imports: `import numpy as _np`, `from numpy.typing import NDArray as _NDArray`
  * Internal package imports: `from cocoindex.resources import schema as _schema`
* Exception: `TYPE_CHECKING` imports for type hints don't need prefixing

**Internal modules** (e.g. `cocoindex/_internal/component_ctx.py`):

* Less strict since users shouldn't import these directly
* Standard library and internal imports don't need underscore prefix
* Only prefix symbols that are truly private to the module itself (e.g. `_context_var` for a module-private ContextVar)

### Type Annotations

Avoid `Any` whenever feasible. Use specific types — including concrete types from third-party libraries. Only use `Any` when the type is truly generic and no downstream code needs to downcast it.
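
A minimal sketch of what this can look like in practice, assuming a hypothetical `ChunkInfo` record (the class and its field names are illustrative, not part of the codebase): a small typed record replaces a `dict[str, Any]` parameter, so mypy can check every attribute access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkInfo:
    """Hypothetical record type; the field names are illustrative."""

    path: str
    start_line: int
    end_line: int


# Discouraged alternative: `def describe(chunk: dict[str, Any]) -> str`,
# which forces callers and the type checker to guess keys and value types.
def describe(chunk: ChunkInfo) -> str:
    """Format a chunk location using fully typed attributes."""
    return f"{chunk.path}:{chunk.start_line}-{chunk.end_line}"
```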

### Multi-Value Returns

For functions returning multiple values, use `NamedTuple` instead of plain tuples. At call sites, access fields by name (`result.can_reuse`) rather than positional unpacking — this prevents misreading fields in the wrong order.
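
The convention above could be sketched as follows. `ReuseDecision` and `check_reuse` are hypothetical names made up for this example; only the `can_reuse` field name comes from the text above.

```python
from typing import NamedTuple


class ReuseDecision(NamedTuple):
    """Hypothetical result type with named fields instead of a bare tuple."""

    can_reuse: bool
    reason: str


def check_reuse(cached_hash: str, current_hash: str) -> ReuseDecision:
    """Decide whether a cached value is still valid for the current content."""
    if cached_hash == current_hash:
        return ReuseDecision(can_reuse=True, reason="content unchanged")
    return ReuseDecision(can_reuse=False, reason="content hash differs")


# Call site: access fields by name rather than `ok, why = check_reuse(...)`.
result = check_reuse("abc123", "abc123")
if result.can_reuse:
    print(f"reusing cached value ({result.reason})")
```

Because fields are read by name, adding or reordering fields later does not silently break existing call sites the way positional unpacking can.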

### Testing Guidelines

We prefer end-to-end tests on user-facing APIs over unit tests on smaller internal functions. That said, unit tests are sometimes necessary, e.g. for internal logic with many situations and edge cases, where it is usually easier to cover the scenarios with unit tests.
2 changes: 0 additions & 2 deletions README.md
@@ -128,7 +128,6 @@ Use the cocoindex-code MCP server for semantic code search when:
|----------|-------------|---------|
| `COCOINDEX_CODE_ROOT_PATH` | Root path of the codebase | Auto-discovered (see below) |
| `COCOINDEX_CODE_EMBEDDING_MODEL` | Embedding model (see below) | `sbert/sentence-transformers/all-MiniLM-L6-v2` |
-| `COCOINDEX_CODE_BATCH_SIZE` | Max batch size for local embedding model | `16` |
| `COCOINDEX_CODE_EXTRA_EXTENSIONS` | Additional file extensions to index (comma-separated, e.g. `"inc:php,yaml,toml"` — use `ext:lang` to override language detection) | _(none)_ |
| `COCOINDEX_CODE_EXCLUDED_PATTERNS` | Additional glob patterns to exclude from indexing as a JSON array (e.g. `'["**/migration.sql", "{**/*.md,**/*.txt}"]'`) | _(none)_ |

@@ -281,7 +280,6 @@ claude mcp add cocoindex-code \
```bash
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=sbert/nomic-ai/CodeRankEmbed \
-  -e COCOINDEX_CODE_BATCH_SIZE=16 \
-- cocoindex-code
```

7 changes: 6 additions & 1 deletion pyproject.toml
@@ -23,12 +23,15 @@ classifiers = [

dependencies = [
"mcp>=1.0.0",
-"cocoindex[litellm]==1.0.0a29",
+"cocoindex[litellm]==1.0.0a31",
"sentence-transformers>=2.2.0",
"sqlite-vec>=0.1.0",
"pydantic>=2.0.0",
"numpy>=1.24.0",
"einops>=0.8.2",
+"typer>=0.9.0",
+"msgspec>=0.19.0",
+"pyyaml>=6.0",
]

[project.optional-dependencies]
@@ -43,6 +46,7 @@ dev = [

[project.scripts]
cocoindex-code = "cocoindex_code:main"
+ccc = "cocoindex_code.cli:app"

[project.urls]
Homepage = "https://github.com/cocoindex-io/cocoindex-code"
@@ -66,6 +70,7 @@ dev = [
"ruff>=0.1.0",
"mypy>=1.0.0",
"prek>=0.1.0",
+"types-pyyaml>=6.0.12.20250915",
]

[tool.uv]
7 changes: 3 additions & 4 deletions src/cocoindex_code/__init__.py
@@ -4,8 +4,7 @@

logging.basicConfig(level=logging.WARNING)

-from .config import Config # noqa: E402
-from .server import main, mcp # noqa: E402
+from ._version import __version__ # noqa: E402
+from .server import main # noqa: E402

-__version__ = "0.1.0"
-__all__ = ["Config", "main", "mcp"]
+__all__ = ["main", "__version__"]
3 changes: 3 additions & 0 deletions src/cocoindex_code/_version.py
@@ -0,0 +1,3 @@
# This file will be rewritten by the release workflow.
# DO NOT ADD ANYTHING ELSE TO THIS FILE.
__version__ = "999.0.0"