Releases: psarno/pyragix-net
v0.3.2 - CancellationToken Support
What's New
CancellationToken Support
Added cooperative cancellation throughout the query pipeline, making it easy to cancel in-flight LLM requests - useful for GUI applications where users may want to stop a query mid-flight.
Changes:
- RagEngine.QueryAsync now accepts an optional CancellationToken
- RetrievalService.QueryAsync propagates the token through the pipeline
- LlmGenerator.GenerateAnswerAsync and the internal CallLlmAsync pass the token to all HttpClient calls
- OperationCanceledException is re-thrown (not swallowed by the error handler), so callers receive clean cancellation semantics
Usage:
    using var cts = new CancellationTokenSource();
    // Cancel after 30 seconds, or call cts.Cancel() from a button click, etc.
    cts.CancelAfter(TimeSpan.FromSeconds(30));
    var answer = await ragEngine.QueryAsync("What is RAG?", cancellationToken: cts.Token);

This is a non-breaking change: the parameter is optional and defaults to CancellationToken.None.
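On the caller side, a cancelled query surfaces as an OperationCanceledException. The following is a minimal, self-contained sketch of how a caller might handle that. FakeQueryAsync is a hypothetical stand-in for RagEngine.QueryAsync (it is not part of PyRagix.Net) and only simulates a slow call that honours its token.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Hypothetical stand-in for RagEngine.QueryAsync: simulates a slow LLM call
    // that observes the token it is given.
    public static async Task<string> FakeQueryAsync(
        string question, CancellationToken cancellationToken = default)
    {
        await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
        return "answer to: " + question;
    }

    // Runs the query with a timeout and reports whether it completed.
    public static async Task<(bool Completed, string Answer)> QueryWithTimeoutAsync(
        string question, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource();
        cts.CancelAfter(timeout);
        try
        {
            var answer = await FakeQueryAsync(question, cts.Token);
            return (true, answer);
        }
        catch (OperationCanceledException)
        {
            // Task.Delay throws TaskCanceledException, which derives from
            // OperationCanceledException, so this catch covers both.
            return (false, string.Empty);
        }
    }
}
```

In a GUI application the same CancellationTokenSource could be stored in a field so that a Stop button can call Cancel() on it.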
v0.3.1 - Thinking Model Compatibility & Documentation
Bug Fix
Wildcard query crash with reasoning/thinking models (QueryExpander, HybridRetriever)
Models that emit chain-of-thought output (DeepSeek-R1, Qwen3, etc.) sometimes prefix their expanded query variants with markdown list markers (* , - , 1. ). When those markers survived into the Lucene BM25 query parser, it threw a ParseException because * and ? are illegal as the first character in a wildcard query.
Two-layer fix:
- QueryExpander now strips leading bullet and numbered-list markers from each expanded variant before they enter the pipeline.
- HybridRetriever catches any remaining ParseException and retries with the query text fully escaped (retrieval degrades gracefully rather than crashing).
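The first layer can be illustrated with a small sketch. This is not the actual QueryExpander code; the class name, method name, and regular expression below are assumptions chosen to show one way of stripping a leading list marker.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryVariantSanitizer
{
    // Matches optional leading whitespace, then one bullet ("*", "-", "+", "•")
    // or a numbered marker ("1." or "1)"), then at least one whitespace character.
    private static readonly Regex LeadingListMarker =
        new Regex(@"^\s*(?:[*\-+•]|\d+[.)])\s+", RegexOptions.Compiled);

    // Removes a single leading list marker and trims the result.
    public static string StripListMarker(string variant)
    {
        if (string.IsNullOrWhiteSpace(variant)) return string.Empty;
        return LeadingListMarker.Replace(variant, string.Empty).Trim();
    }
}
```

Because the pattern requires whitespace after the marker, an input such as "*wildcard" is left untouched by this sketch; a case like that is what the second layer (catching ParseException and retrying with escaped text) is there to handle.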
Improvements
XML documentation on all public members
All publicly visible types and members now carry <summary> XML doc comments, eliminating the CS1591 warning flood when building with <GenerateDocumentationFile>true</GenerateDocumentationFile>. This also means IntelliSense tooltips are populated when consuming PyRagix.Net as a library.
v0.3.0 - OpenAI-compatible LLM endpoint
What's New
OpenAI-compatible LLM endpoint
PyRagix.Net now works with any local inference server that implements the OpenAI /v1/chat/completions API - KoboldCpp, llamacpp, LM Studio, vLLM, LocalAI, and Ollama (via its /v1 endpoint).
The Ollama-native /api/generate wire format has been replaced with the OpenAI chat completions format. Health checks now use /v1/models.
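For orientation, here is a rough sketch of what a request in the OpenAI chat completions format looks like. It is not PyRagix.Net's internal code: the class and method names are hypothetical, and the payload is limited to the commonly used model, messages, and stream fields.

```csharp
using System;
using System.Text.Json;

public static class ChatCompletionsSketch
{
    // Joins a server base URL (without a /v1/ suffix) with an API path.
    public static Uri BuildUri(string baseUrl, string path)
        => new Uri(baseUrl.TrimEnd('/') + "/" + path.TrimStart('/'));

    // Builds a minimal, non-streaming chat completions request body.
    public static string BuildRequestJson(string model, string prompt)
    {
        var payload = new
        {
            model,
            messages = new[] { new { role = "user", content = prompt } },
            stream = false
        };
        return JsonSerializer.Serialize(payload);
    }
}
```

With a base URL of http://localhost:8080, BuildUri(baseUrl, "/v1/chat/completions") yields the generation endpoint and BuildUri(baseUrl, "/v1/models") the health-check endpoint. In this API format the generated text is returned under choices[0].message.content.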
Breaking: config key rename
Update your settings.toml:
| Old key | New key |
|---|---|
| OllamaEndpoint | LlmEndpoint |
| OllamaModel | LlmModel |
| OllamaTimeout | LlmTimeout |
Default port changed from 11434 to 8080 (llamacpp default).
Set LlmEndpoint to your server's base URL - do not include /v1/ in the value.
    LlmEndpoint = "http://localhost:5001" # KoboldCpp
    LlmEndpoint = "http://localhost:8080" # llamacpp
    LlmEndpoint = "http://localhost:11434" # Ollama
    LlmModel = "your-model-name"

Other fixes
- Console app now loads settings.toml from the working directory (run from pyragix-net-console/)
- Errors when the LLM server is unreachable now exit cleanly without a stack trace
- Removed unused Microsoft.SemanticKernel dependency (carried a CVE)
- Updated ONNX Runtime, EF Core, AngleSharp, and test packages to latest
Upgrading from v0.2.0
- Rename the three config keys above in your settings.toml
- Verify LlmEndpoint does not include a /v1/ suffix
v0.2.0
What's New
--fresh flag
Ingest command now accepts --fresh to wipe existing indexes and database before rebuilding from scratch. Previously required manual deletion of artifacts.
    dotnet run -- ingest ./docs --fresh

ONNX Execution Provider
Replaced the GpuEnabled bool with ExecutionProviderPreference, a three-way enum controlling how ONNX Runtime picks a compute device for embedding and reranking inference:
| Value | Behaviour |
|---|---|
| Cpu | CPU only. Default. |
| Auto | Try CUDA, fall back to CPU silently. |
| Cuda | Require CUDA; fails at startup if unavailable. |
    ExecutionProviderPreference = "Auto"
    GpuDeviceId = 0

Settings format fixed
settings.toml / settings.example.toml were copies of the Python project's sectioned format and were silently ignored by the C# config binder. Converted to flat PascalCase format matching PyRagixConfig property names.
Bug Fixes
- Fixed CI workflow failing on dotnet tool install due to env: PATH: not expanding shell variables in GitHub Actions YAML.
- Fixed reportgenerator step treating ; in -reporttypes: as a shell command separator.
Initial Release
PyRagix.Net v0.1.0 Release Notes
Highlights
- Local-first Retrieval-Augmented Generation engine targeting .NET 9.0 with a shared codebase for both library (pyragix-net/) and demo console (pyragix-net-console/).
- End-to-end RAG pipeline: multi-format ingestion (PDF, HTML, image OCR), semantic chunking, ONNX-based embeddings, FAISS/Lucene hybrid retrieval, cross-encoder reranking, and Ollama-driven answer generation.
- Configurable via TOML (pyragix-net/settings.toml) or programmatic PyRagixConfig, including knobs for query expansion, hybrid weighting, reranking depth, GPU enablement, and batching.
- Works fully offline once ONNX models and Ollama weights are exported locally; no remote APIs or cloud services required.
What's in v0.1.0
- RagEngine public API encapsulating ingestion and retrieval orchestration with EF Core metadata persistence.
- Document pipeline featuring PdfPig, AngleSharp, and Tesseract OCR, plus sentence-aware SemanticChunker with overlap controls for higher recall.
- Hybrid retrieval stack combining FAISS (or managed inner-product fallback) with Lucene BM25 and SQLite metadata for Reciprocal Rank Fusion.
- Retrieval refinements including multi-query expansion via Ollama, ONNX Runtime cross-encoder reranking, and streaming-friendly answer generation.
- Console CLI (dotnet run -- ingest … / -- query …) that exercises the full pipeline using the shared configuration file.
- Docs & tooling covering ONNX export (docs/ONNX_SETUP.md), technology explainer, OS-specific behavior, and test harness instructions.
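Reciprocal Rank Fusion, mentioned in the hybrid retrieval item above, merges ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below shows the plain, unweighted form. It is not the library's implementation: the class name, the use of string IDs, and k = 60 (a commonly used constant) are assumptions, and any hybrid weighting PyRagix.Net applies is not modelled here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Fuses several ranked lists of document IDs (best first).
    // score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
    public static List<string> Fuse(
        IEnumerable<IReadOnlyList<string>> rankedLists, int k = 60)
    {
        var scores = new Dictionary<string, double>();
        foreach (var list in rankedLists)
        {
            for (int i = 0; i < list.Count; i++)
            {
                // TryGetValue leaves current at 0.0 for unseen documents.
                scores.TryGetValue(list[i], out var current);
                scores[list[i]] = current + 1.0 / (k + i + 1);
            }
        }

        // Highest fused score first; ties broken by ID for determinism.
        return scores.OrderByDescending(p => p.Value)
                     .ThenBy(p => p.Key, StringComparer.Ordinal)
                     .Select(p => p.Key)
                     .ToList();
    }
}
```

For example, fusing a vector ranking of a, b, c with a BM25 ranking of b, d, a puts b first (1/62 + 1/61), then a (1/61 + 1/63), then d and c, which each appear in only one list.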
Recent Improvements
- Reranker honors caller limits – Reranker now respects the requested topK, enabling tighter answer contexts and lower latency (4f5638e).
- End-to-end resiliency – Polly-based retries guard ingestion, retrieval, and Ollama calls against transient IO or HTTP faults (6c54679).
- Startup validation – Pipeline bootstrapping checks for required models, tessdata, and database paths before ingesting or querying (617f623).
- Cross-platform vector index – Added managed inner-product index fallback plus docs on when to regenerate FAISS files when switching OSes (1a52419, f26ddd9).
- Operator ergonomics – Structured ingestion progress output and clarified README/asset guidance; build scripts now quote arguments and ensure bundled dotnet tools resolve (71e8658, 721658a, fef4d22, a6a9ef8).
Setup Essentials
- Models – Export embedding (sentence-transformers/all-MiniLM-L6-v2) and reranker (cross-encoder/ms-marco-MiniLM-L-6-v2) ONNX bundles into pyragix-net-console/Models/... (see docs/ONNX_SETUP.md).
- Config – Copy pyragix-net/settings.example.toml → settings.toml, point to ONNX paths, database file, and Ollama endpoint/model.
- Artifacts – Keep pyragix.db, faiss_index.bin, and lucene_index/ alongside the process working directory; delete them when switching operating systems before re-ingesting.
- Runtime deps – Install Tesseract eng.traineddata under ./tessdata/ for OCR, ensure Ollama is running locally, and restore/build with dotnet restore && dotnet build.
Happy querying!