PyRagix.Net requires ONNX models for embeddings and reranking. Export once from Python.
pip install optimum-onnx
pip install onnxruntime  # or onnxruntime-gpu for CUDA

# Embedding model (sentence-transformers)
optimum-cli export onnx \
--model sentence-transformers/all-MiniLM-L6-v2 \
--task feature-extraction \
pyragix-net-console/Models/embeddings
# Reranker model (cross-encoder)
optimum-cli export onnx \
--model cross-encoder/ms-marco-MiniLM-L-6-v2 \
--task text-classification \
pyragix-net-console/Models/reranker

Check for model.onnx in each folder:
pyragix-net-console/Models/embeddings/model.onnx
pyragix-net-console/Models/reranker/model.onnx
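
The folder check can also be scripted. The following is a minimal sketch, assuming the `embeddings` and `reranker` folder names used by the export commands. The helper names `missing_models` and `report` are hypothetical and are not part of PyRagix.Net.

```python
from pathlib import Path

# Folder names taken from the export commands; adjust them if you exported elsewhere.
MODEL_DIRS = ("embeddings", "reranker")


def missing_models(models_root):
    """Return the expected model.onnx paths that do not exist under models_root."""
    root = Path(models_root)
    expected = [root / name / "model.onnx" for name in MODEL_DIRS]
    return [path for path in expected if not path.is_file()]


def report(models_root):
    """Build a short human-readable summary of the check."""
    missing = missing_models(models_root)
    if not missing:
        return "both ONNX models are in place"
    return "\n".join(f"missing: {path}" for path in missing)


# Run from the repository root so the relative path resolves.
print(report("pyragix-net-console/Models"))
```

Any path listed as missing means the corresponding `optimum-cli export onnx` command needs to be run again.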
For image/PDF OCR, install Tesseract:
Windows:
# Install from: https://github.com/UB-Mannheim/tesseract/wiki
# Then verify:
tesseract --version
Linux:
sudo apt install tesseract-ocr tesseract-ocr-eng
macOS:
brew install tesseract

Models are gitignored (large files).
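
The `tesseract --version` check from the OCR setup works on every platform, so it can be wrapped in a small script. This is a rough sketch, assuming the binary is named `tesseract` and is on `PATH`. The helper name `tesseract_version` is hypothetical.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line of `<executable> --version`.

    Returns None if the executable is not on PATH or prints nothing.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract builds write the version banner to stderr, so check both streams.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
print(version if version else "tesseract not found on PATH")
```

If the script reports that Tesseract is not found on Windows, the install directory probably still needs to be added to `PATH`.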