Pinned
-
alfred-abliterate (Public)
Residual-stream abliteration toolkit for MoE models (Qwen3.5-397B-A10B) on Apple Silicon. Removes PRC-aligned content policies from local inference. Tested on Mac Studio M3 Ultra 512GB.
Python
-
alfred-infra (Public)
AI-infrastructure hardening kit for multi-machine local-LLM clusters: monitoring (node_exporter, DCGM, Prometheus), Grafana dashboards, cold backups, and network-binding audits.
Shell 1
-
alfred-rag (Public)
Hybrid RAG stack: dense (LanceDB) + BM25 (Tantivy) + RRF + Qwen3-Reranker-8B. Includes a sentence-transformers fine-tuning recipe for Qwen3-Embedding-8B.
Python 1
-
blockops-proxy (Public)
HTTP proxy that translates text-format tool calls from local LLMs into OpenAI tool_calls. SSE streaming, concurrency gating, context truncation. Works with MLX/vLLM/llama.cpp backends.
Python
-
context-bench (Public)
Context-window benchmark for local LLM serving endpoints (MLX, vLLM, llama.cpp). Measures throughput and memory pressure across 4k-128k context sizes.
Python 3
-
hermes-agent (Public)
Forked from NousResearch/hermes-agent
The agent that grows with you
Python
