Problem Statement
The OpenViking CLI currently requires users or AI agents to specify file and directory ignore flags (ignore_dirs, include, exclude) explicitly on every upload. This is error-prone and makes it hard to keep the set of ingested files and directories consistent, both for human users and for agents that automate uploads (e.g., after each feature implementation).
We lack a robust way to keep certain files/folders out of the knowledge base by default.
Proposed Solution
Support a configurable, automatic ignore mechanism for the CLI's folder/file uploads:
- By default, respect the repository's .gitignore (and all nested .gitignore files) when deciding which files and directories to skip during upload. This behaviour should ideally be optional, with a config setting to turn it off. Several OSS Python implementations (e.g. mherrmann/gitignore_parser) can be leveraged for parsing.
- Extend ovcli.conf with default ignore rules (ignore_dirs, include, exclude) that apply to every upload, so the corresponding CLI flags do not have to be passed each time. This allows personalization and consistent filtering at the user/project level, following the documented config format.
- Merge the ignore rules from all sources (.gitignore, ovcli.conf, CLI flags) into the final configuration, rather than letting one source override another.
Combined, these improvements ensure uploads are always filtered correctly/consistently without having to set ignore flags every time.
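The merge rule above could be sketched roughly as follows. This is only an illustration: `merge_rules` and `RULE_KEYS` are hypothetical names, not part of the existing CLI, and .gitignore matching is assumed to be handled separately by a dedicated parser because its pattern semantics differ from plain glob lists.

```python
from typing import Dict, Iterable, List

# Rule keys taken from the flags named in this issue.
RULE_KEYS = ("ignore_dirs", "include", "exclude")


def merge_rules(*sources: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Union the rule lists of every source, keeping first-seen order.

    Hypothetical helper: no source overrides another, duplicates are dropped.
    """
    merged: Dict[str, List[str]] = {key: [] for key in RULE_KEYS}
    for source in sources:
        for key in RULE_KEYS:
            for pattern in source.get(key, ()):
                if pattern not in merged[key]:
                    merged[key].append(pattern)
    return merged


# Rules from ovcli.conf (assumed layout) plus rules passed on the command line.
config_rules = {"ignore_dirs": [".cache", ".nx"], "exclude": ["*.tmp", "*.log"]}
cli_rules = {"exclude": ["*.test.ts", "*.log"]}
merged = merge_rules(config_rules, cli_rules)
```

Keeping the merge as a plain union means a CLI flag can only tighten the defaults for a single upload, which matches the "merged, not overridden" requirement.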
Alternatives Considered
- Only extending ovcli.conf with default ignore_dirs/include/exclude. This might be easier to implement and would likely cover most of the use cases mentioned here. The only downside is that most rules in the repo's .gitignore would have to be duplicated in the config, which is an acceptable trade-off. In other words, if respecting .gitignore is too complex, it's okay to only extend ovcli.conf.
Feature Area
CLI Tools
Use Case
I want to ensure that confidential, irrelevant, or project-specific files are never accidentally uploaded to OpenViking (especially if an agent or user forgets to set all the right ignore flags on each upload), and to inherit sensible, concrete ignore rules without manual intervention, since the default IGNORE_DIRS might not cover all cases.
This enables safer and more predictable ingestion at scale and in automated workflows.
Example API (Optional)
# Example: ovcli.conf config
{
  "url": "http://localhost:1933",
  "api_key": "your-secret-key",
  "upload": {
    "ignore_dirs": [".cache", ".mypy_cache", "custom_data", ".nx"],
    "exclude": ["*.tmp", "*.log", "*.bak"],
    "include": ["*.md", "*.pdf"]
  }
}
# Example: CLI behaviour
openviking add-resource ./docs # Uses .gitignore, ovcli.conf ignore rules
openviking add-resource ./docs --exclude "*.test.ts" # Adds one extra exclude for this upload only; .gitignore and ovcli.conf rules still apply and are NOT overridden
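To make the intended effect of the example concrete, here is a minimal sketch of how a combined rule set might be applied to one file. The function name `should_upload` is hypothetical, and the precedence shown (ignore_dirs first, then exclude, then include, with an empty include list meaning "accept everything") is an assumption, not documented OpenViking behaviour. .gitignore handling is left out.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Dict, List


def should_upload(rel_path: str, rules: Dict[str, List[str]]) -> bool:
    """Decide whether a file (path relative to the upload root) passes the rules."""
    path = PurePosixPath(rel_path)
    # Skip the file if any of its parent directories is listed in ignore_dirs.
    if any(part in rules.get("ignore_dirs", []) for part in path.parts[:-1]):
        return False
    # Assumed precedence: an exclude match always wins over include.
    if any(fnmatch(path.name, pattern) for pattern in rules.get("exclude", [])):
        return False
    # Assumed: an empty include list accepts every remaining file.
    include = rules.get("include", [])
    return not include or any(fnmatch(path.name, pattern) for pattern in include)


# ovcli.conf rules from the example above plus the per-upload --exclude "*.test.ts".
rules = {
    "ignore_dirs": [".cache", ".mypy_cache", "custom_data", ".nx"],
    "exclude": ["*.tmp", "*.log", "*.bak", "*.test.ts"],
    "include": ["*.md", "*.pdf"],
}
```

With these rules, `should_upload("docs/guide.md", rules)` accepts the file, while anything under `.cache/` or matching `*.log` is rejected.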
Additional Context
No response
Contribution