on-device-inference
Here are 39 public repositories matching this topic...
TinyML & Edge AI: On-device inference, model quantization, embedded ML, ultra-low-power AI for microcontrollers and IoT devices.
Updated Nov 10, 2025 - Python
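The model quantization mentioned in the entry above can be illustrated with a minimal, generic sketch: symmetric per-tensor int8 quantization of a weight list. This is a textbook illustration of the technique, not the scheme used by any particular repo listed here.

```python
# Symmetric int8 quantization sketch (illustrative only; not the exact
# scheme of any repo on this page).

def quantize_int8(weights):
    """Map float weights to int8 values with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid scale == 0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.03, -1.27, 0.5, 0.9981]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step per weight.
max_err = max(abs(a - b) for a, b in zip(weights, approx))
assert max_err <= scale / 2 + 1e-9
```

Real TinyML toolchains typically add per-channel scales, zero points for asymmetric ranges, and calibration data, but the core idea is the same rescale-and-round shown here.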
AI for Apple silicon devices.
Updated May 15, 2026 - Rust
Auditable offline edge intelligence for low-cost edge devices, with benchmark evidence and public board proof on ESP32-C3.
Updated Mar 23, 2026 - Python
Flutter starter example app to get started with NobodyWho, a library designed to run LLMs locally and efficiently on any device.
Updated May 12, 2026 - Dart
The Private Agent OS — search files, run AI agents, connect to 10,000+ tools via the complete protocol stack (MCP, AG-UI, A2UI, A2A). Zero cloud. Zero telemetry. On-device inference.
Updated May 14, 2026 - Rust
React Native starter example app to get started with NobodyWho, a library designed to run LLMs locally and efficiently on any device.
Updated May 12, 2026 - TypeScript
Custom llama.cpp fork with character intelligence engine: control vectors, attention bias, head rescaling, attention temperature, fast weight memory
Updated May 16, 2026 - C++
iOS and Android app that runs local LLMs on-device, with routstr cloud LLMs for anonymous inference.
Updated Sep 18, 2025 - TypeScript
Mobile AI: iOS CoreML, Android TFLite, on-device inference, ONNX, TensorRT, and ML deployment for smartphones.
Updated Nov 10, 2025 - Python
High-performance Android SDK for on-device LLM inference (GGUF). Privacy-focused, offline-first, and powered by llama.cpp with a clean Kotlin Coroutines API.
Updated Mar 27, 2026 - Kotlin
Production Android AI with ExecuTorch 1.0: deploy PyTorch models to mobile with NPU acceleration and a 50 KB footprint.
Updated Nov 14, 2025 - Python
Unofficial Swift SDK for Google's LiteRT-LM — run Gemma 4 on-device with text, vision, audio, and tool calling. CPU + GPU (Metal). iOS 17+ / macOS 14+.
Updated May 2, 2026 - Swift
Updated May 14, 2026 - JavaScript
On-device AI inference for Swift apps. Run LLMs locally on iOS, macOS, visionOS and watchOS.
Updated May 15, 2026 - Swift
Neural acoustic echo cancellation for Apple platforms using CoreML — Swift package with 128/256/512-unit DTLN-aec models
Updated May 1, 2026 - Swift
iOS & macOS starter example apps to get started with NobodyWho, a library designed to run LLMs locally and efficiently on any device.
Updated May 13, 2026 - Swift
Swift wrapper for Apple's BNNS graph API — run compiled CoreML models (.mlmodelc) on CPU with zero-copy buffer management
Updated Mar 9, 2026 - Swift
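The zero-copy buffer management mentioned in the BNNS wrapper entry above can be illustrated generically (in Python, not the Swift API itself): the caller owns the output storage, and the inference routine writes results into it through a view, with no intermediate allocation or copy.

```python
# Generic zero-copy buffer illustration (not the BNNS graph API):
# the caller allocates output storage once; the "layer" writes into a
# memoryview over that storage, so no per-call copies are made.
import array

def run_layer(inp, out):
    """Write 2*x for each input element into the caller's output buffer."""
    for i, x in enumerate(inp):
        out[i] = 2.0 * x

buf = array.array("f", [0.0] * 4)   # caller-owned float32 output buffer
view = memoryview(buf)              # zero-copy view over the same bytes
run_layer([1.0, 2.0, 3.0, 4.0], view)
assert buf.tolist() == [2.0, 4.0, 6.0, 8.0]  # original buffer was mutated
```

The design point this sketches is that the buffer can be preallocated once and reused across inference calls, which matters on memory-constrained devices.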
React Native SDK for local LLM inference and on-device AI on iOS and Android.
Updated Mar 14, 2026 - TypeScript
Ad generation via offline LLMs with on-device inference, optionally managed by a self-hosted CMS.
Updated May 13, 2026