Architecture
This document tracks architectural decisions and planning progress for Zelara.
Status: Foundation Phase - No implementation yet
- Client-side application - Not a web service. Users install the app.
- Edge-first execution - Runs locally, works offline. No cloud dependency.
- Device linking & distributed computing - Cross-device task offloading (Desktop > Mobile > Web priority).
- Cross-platform - Web, Desktop (Windows/Mac/Linux), Mobile (iOS/Android) from shared codebase.
- Conditional builds - Build system compiles custom binaries based on unlocked features.
- Git submodules as feature gates - Each feature domain is a separate repo.
Framework Options:
- Option A: Flutter - Single codebase for all platforms, compiled binaries, good performance, built-in device sync capabilities
- Option B: React Native + Tauri - RN for mobile, Tauri for desktop/web, TypeScript shared logic, Rust for heavy processing
- Option C: Electron + React Native - Electron for desktop/web, RN for mobile, heavier but mature ecosystem
Shared Logic Language:
- TypeScript (works with all options, familiar, good tooling, easier AI integration)
- Rust (performance, security, works with Tauri, better for on-device CV/AI, steeper learning curve)
On-Device AI/Computer Vision:
- TensorFlow Lite (mobile-optimized, good for image validation)
- ONNX Runtime (cross-platform, good performance)
- Core ML (iOS native, excellent performance)
- Custom lightweight models (for recycling image validation)
Decision needed: Choose framework + language + on-device AI stack
Core Concept: Tasks processed on most capable linked device (Desktop > Mobile > Web priority).
Questions:
- How do devices discover each other? (Local network? Bluetooth? Cloud relay as fallback?)
- How is task offloading triggered? (Automatic detection of device capability? Manual user preference?)
- What happens if no capable device is linked? (Prompt user to link? Degrade functionality? Queue task?)
- How is state synchronized across devices? (Real-time sync? Periodic sync? Event-driven?)
- Security model for device linking? (Encryption? Device approval? Revocation?)
Use Cases:
- Image validation: Mobile captures recycling photo → offloads to Desktop for CV processing → results back to mobile
- Financial calculations: Mobile logs expense → Desktop runs tax prep analysis → syncs back results
- Carbon footprint analysis: Mobile inputs property data → Desktop runs complex calculations with local regulations → generates pathway
Decision needed: Define device linking protocol and task offloading mechanics
Challenge: How does the build system know which submodules to include?
Potential Approaches:
- Option A: User's unlock state stored in local file (e.g., user-progress.json). Build script reads it and pulls submodules.
- Option B: Build-time configuration file user manually edits before rebuild.
- Option C: Separate "updater" app that manages submodules and triggers rebuilds.
- Option D: App downloads feature modules dynamically (like plugins) without full rebuild.
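A minimal sketch of Option A, assuming the unlock file lists module names and that each module maps to a submodule path. The field names (points, unlockedModules) and the modules/... paths are hypothetical, not taken from the repository.

```typescript
// Hypothetical shape of user-progress.json; field names are assumptions.
interface UserProgress {
  points: number;
  unlockedModules: string[];
}

// Assumed mapping from module name to submodule path inside the core repo.
const SUBMODULE_PATHS: Record<string, string> = {
  finance: "modules/finance",
  productivity: "modules/productivity",
  green: "modules/green",
};

// Returns the git commands a build script could run for the unlocked modules.
// Unknown module names are skipped rather than treated as errors.
function submoduleCommands(progress: UserProgress): string[] {
  const commands: string[] = [];
  for (const name of progress.unlockedModules) {
    if (!Object.prototype.hasOwnProperty.call(SUBMODULE_PATHS, name)) continue;
    const path = SUBMODULE_PATHS[name];
    if (path !== undefined) {
      commands.push("git submodule update --init " + path);
    }
  }
  return commands;
}
```

A real build script would read the file from disk and execute the commands; the sketch stops at producing them so the mapping stays easy to inspect.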
Questions:
- How does user trigger a rebuild when they unlock new features?
- Is this automatic (app detects unlock → downloads update) or manual (user clicks "update app")?
- Where is the unlock state stored (local file, SQLite, platform keychain)?
- Can features be hot-swapped without app restart?
Decision needed: Define build system mechanics
Revised Core Mechanics:
- Universal Starter: Basic recycling task (use paper bags, validate with images processed on-edge)
- Complete tasks → earn points
- Points unlock feature branches:
- Finance: Personal/business finance organization → suggest donations from excess
- Productivity: Focus tools, email/calendar AI → sell themed skins/accessories (profits to green causes)
- Homeowner: Property-based carbon reduction (city/suburbs/farm calculations, regulations, pathways)
- Each branch unlocked adds new submodule to user's build
Questions:
- What's the skill tree schema? (JSON? YAML? Database?)
- How are prerequisites defined? (Linear progression? Multiple paths? Branching?)
- How are points awarded? (Manual logging? Automated tracking? AI verification via image validation?)
- Can users "re-lock" features to reduce app size?
- How granular are unlocks? (Entire modules? Sub-features within modules?)
Image Validation for Recycling:
- User takes photo of paper bag with recyclables
- On-device AI validates:
- Is it a paper bag? (vs plastic)
- Does it contain paper/cardboard items?
- Confidence score for validation
- Points awarded based on validation success
- Processing happens on-edge (Desktop > Mobile > Web priority)
Decision needed: Define skill tree schema and progression mechanics
Proposed Initial Repos:
Core:
1. zelara-ai/core (this repo) - Anchor, shared systems, skill tree engine, device linking
2. zelara-ai/core.wiki - Technical documentation
Feature Modules:
3. zelara-ai/module-finance + .wiki - Personal/business finance, donation suggestions
4. zelara-ai/module-productivity + .wiki - Focus, calendar/email AI, themed skins marketplace
5. zelara-ai/module-green + .wiki - Recycling (starter), carbon footprint, homeowner tools
Platform Apps:
6. zelara-ai/app-desktop + .wiki - Desktop app (Windows/Mac/Linux)
7. zelara-ai/app-mobile + .wiki - Mobile app (iOS/Android)
8. zelara-ai/app-web + .wiki - Web app
Questions:
- Should we create all repos upfront or only as needed?
- How granular should modules be? (e.g., one repo for all finance vs separate repos for budget/tax)
- Should starter features (recycling) live in core or module-green?
- Do apps share UI components repo or keep separate?
Decision needed: Define initial repository creation plan
What is the absolute minimum we need to prove the concept?
Potential MVP:
- Starter feature: Recycling task (paper bag usage, image validation on-edge)
- Single unlockable module: Finance module (basic budget tracker, donation suggestion)
- Point system: Complete recycling tasks → earn points → unlock finance module
- Image validation: On-device CV for recycling verification
- Single platform: Desktop (most capable for edge processing) OR Mobile (most common use case)
- Manual rebuild: User clicks "update app" to pull new submodules after unlock
Questions:
- Which platform should MVP target? (Desktop = best for edge AI, Mobile = better UX for photos)
- Should MVP include device linking or defer to v2?
- Which image validation approach for MVP? (Simple rule-based? Lightweight ML model?)
- How complex should first finance module be? (Just expense logging? Or basic budget analysis?)
Decision needed: Define MVP scope and first platform target
Local-first principle - All user data lives on device.
Storage Options:
- File system - JSON files for simple data (user progress, settings, unlock state)
- SQLite - Structured data (transactions, task history, recycling logs)
- IndexedDB - Web platform storage
- Platform-native storage - iOS Core Data, Android Room, etc.
Cross-Device Sync:
- User unlock state synced across linked devices
- Task completion history synced
- Module preferences synced
Questions:
- Can we use same storage mechanism across platforms?
- How do we handle data migration when app updates?
- What gets synced across linked devices? (Everything? Just unlock state?)
- Conflict resolution for cross-device syncing?
Decision needed: Choose storage strategy per platform
Revised Mechanics:
Finance Branch:
- App organizes personal/business finances
- Calculates "excess" (disposable income after essentials)
- Suggests optional donation from excess to green initiatives
- User can accept/decline/adjust amount
Productivity Branch:
- Free core tools (focus timer, calendar/email AI)
- Paid themed skins/accessories (UI customization)
- All profits from sales go to green causes
- Transparency on donation amounts
Homeowner Branch:
- Free carbon footprint calculations
- Free pathways to reduce emissions (solar, insulation, etc.)
- Partnerships with green vendors (solar installers, etc.)
- Referral fees go to green causes
Questions:
- What green initiatives do we partner with? (Tree planting? Carbon offset? Renewables?)
- How do we verify donations actually happen? (Payment integration? Transparency reports?)
- Is donation required to unlock features or purely optional? (Always optional)
- How do we handle different currencies/regions?
Decision needed: Define green initiative mechanics and partnerships
Core Requirement: On-device validation of recycling tasks (paper bag usage, correct sorting).
Questions:
- Which computer vision approach? (Lightweight CNN? Rule-based detection? Hybrid?)
- How to train/validate models? (Custom dataset? Transfer learning?)
- Where do models live? (Bundled with app? Downloaded on unlock? Per-platform?)
- Confidence thresholds for validation? (How strict? False positive handling?)
- Privacy considerations? (Images stay on-device? Optionally uploaded for model improvement?)
- Fallback if device can't run CV? (Manual validation? Skip validation? Link to capable device?)
Edge Processing Priority:
- Desktop: Full CV models, high accuracy
- Mobile: Lightweight models, acceptable accuracy
- Web: Very lightweight or rule-based, lower accuracy
- If Mobile/Web can't process → offload to linked Desktop
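The Desktop > Mobile > Web priority can be sketched as a simple ranking over linked devices. The LinkedDevice shape and the online flag are assumptions made for illustration.

```typescript
type DeviceKind = "desktop" | "mobile" | "web";

// Hypothetical record for a linked device.
interface LinkedDevice {
  id: string;
  kind: DeviceKind;
  online: boolean;
}

// Lower rank means more capable, following Desktop > Mobile > Web.
const PRIORITY: Record<DeviceKind, number> = { desktop: 0, mobile: 1, web: 2 };

// Picks the most capable online device, or undefined if none is reachable.
function pickProcessor(devices: LinkedDevice[]): LinkedDevice | undefined {
  let best: LinkedDevice | undefined;
  for (const device of devices) {
    if (!device.online) continue;
    if (best === undefined || PRIORITY[device.kind] < PRIORITY[best.kind]) {
      best = device;
    }
  }
  return best;
}
```

When pickProcessor returns undefined, the caller would fall back to one of the options listed in the questions above (manual validation, skipping, or prompting the user to link a device).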
Decision needed: Define image validation approach and model distribution strategy
Decision: MVP includes Mobile + Desktop, device linking, ML-based recycling validation, finance module with budget analysis
MVP Features:
- Mobile app (React Native) + Desktop app (Tauri)
- Device linking (manual QR pairing, local network)
- Recycling validation (ML-based, ONNX Runtime)
- Point system (10 points per task, 50 to unlock finance)
- Finance module (expense logging, categorization, donation suggestions)
- Conditional builds (manual rebuild, git submodule pulling)
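The point system above works out to five recycling tasks before Finance unlocks (5 × 10 = 50). A minimal sketch, assuming progress is tracked as a point total plus a list of unlocked module names (both field names are hypothetical):

```typescript
const POINTS_PER_TASK = 10;
const FINANCE_UNLOCK_THRESHOLD = 50;

// Hypothetical progress record; field names are assumptions.
interface Progress {
  points: number;
  unlockedModules: string[];
}

// Returns a new progress record after one validated recycling task.
function completeTask(progress: Progress): Progress {
  const points = progress.points + POINTS_PER_TASK;
  const unlockedModules = progress.unlockedModules.slice();
  if (points >= FINANCE_UNLOCK_THRESHOLD && unlockedModules.indexOf("finance") === -1) {
    unlockedModules.push("finance");
  }
  return { points: points, unlockedModules: unlockedModules };
}
```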
Scope Exclusions:
- No web platform (v2)
- No productivity/homeowner modules (v2)
- No automatic device discovery (manual pairing only)
- No cloud sync (fully local)
- No marketplace/skins (v2)
Rationale:
- Proves all core concepts (device linking, skill tree, conditional builds, on-device ML)
- Mobile + Desktop together demonstrates distributed computing
- ML validation more impressive than rule-based
- Finance with budget analysis provides real value
- Coarse module granularity reduces complexity
See: MVP-Scope.md for full details
Decision: React Native + Tauri + TypeScript + Rust + ONNX Runtime
Components:
- Mobile: React Native (TypeScript)
- Desktop: Tauri (TypeScript UI + Rust backend)
- Shared logic: TypeScript packages in core
- CV/ML: ONNX Runtime with Rust bindings
- UI components: Duplicated per app (not shared repo)
Rationale:
- Best balance of performance, code reuse, and edge-first principles
- TypeScript for rapid UI development
- Rust for high-performance CV and device linking
- ONNX Runtime cross-platform and framework-agnostic
See: Technology-Stack.md for full analysis
Decision: Coarse granularity - one repo per feature domain
Structure:
- module-finance (contains all finance features: budget, tax, expenses)
- module-green (contains all green features: recycling, carbon, solar)
- module-productivity (contains all productivity features: focus, calendar, email)
Rationale:
- Simpler repository management
- Fewer submodules to orchestrate
- Still allows nested submodules later if needed
- MVP proves concept without complex dependency graph
Decision: Manual QR pairing over local network, TLS encryption
Protocol:
- Desktop generates QR code (contains IP, port, pairing token)
- Mobile scans QR code
- Direct device-to-device connection (no cloud relay)
- TLS for all communication
- Connection persists across sessions
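One way to carry the IP, port, and pairing token in the QR code is a small JSON payload. This is a sketch under that assumption; the actual encoding is defined in Device-Linking.md, and the field names here are hypothetical.

```typescript
// Hypothetical QR payload; the real field names may differ.
interface PairingPayload {
  ip: string;
  port: number;
  token: string;
}

function encodePairingPayload(payload: PairingPayload): string {
  return JSON.stringify(payload);
}

// Returns null for anything that is not a well-formed payload,
// so a scan of an unrelated QR code is rejected instead of crashing.
function decodePairingPayload(text: string): PairingPayload | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const ip = obj["ip"];
  const port = obj["port"];
  const token = obj["token"];
  if (typeof ip !== "string" || typeof token !== "string" || typeof port !== "number") return null;
  if (ip.length === 0 || token.length === 0) return null;
  if (Math.floor(port) !== port || port < 1 || port > 65535) return null;
  return { ip: ip, port: port, token: token };
}
```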
Task Offloading:
- Mobile sends image to Desktop for CV processing
- Desktop runs ONNX model, returns result
- Fallback: Prompt user to link Desktop if unavailable
Rationale:
- Simplest implementation for MVP
- No cloud infrastructure needed
- Proves device linking concept
- Automatic discovery deferred to v2
See: Device-Linking.md for full protocol design
Decision: ML-based validation using ONNX Runtime on Desktop
Approach:
- Pre-trained ONNX model (bag classification, content detection)
- Mobile captures image, sends to Desktop
- Desktop runs inference (Rust + ONNX Runtime)
- Returns validation result to Mobile
Model:
- Lightweight CNN (MobileNet or EfficientNet base)
- Fine-tuned for paper vs plastic classification
- Bundled with Desktop app (no download)
Fallback:
- If no Desktop linked: Prompt user to pair
- Future: Lightweight model on Mobile for degraded validation
Rationale:
- ML validation more impressive and accurate than rule-based
- Desktop has compute resources for full model
- Proves device linking immediately
- ONNX allows model flexibility (can swap later)
Decision: SQLite for structured data, JSON for simple state
Storage Strategy:
- User progress: ~/.zelara/progress.json (points, unlocked modules)
- Finance transactions: SQLite database
- Recycling history: SQLite database
- App settings: JSON file
Cross-Device Sync:
- Progress file synced via device linking
- SQLite databases synced selectively (unlock state only)
- No cloud sync (fully local)
Rationale:
- SQLite cross-platform and performant
- JSON simple for configuration/state
- Both work with TypeScript and Rust
- Easy migration path for future cloud sync (optional)
Implementation can now proceed:
- ✅ Technology stack decided
- ✅ MVP scope defined
- ✅ Device linking approach decided
- ✅ Image validation approach decided
- ✅ Module granularity decided
- ✅ Data storage decided
Ready for:
- Initialize repository structures (app-mobile, app-desktop, modules)
- Set up core/src/ with shared TypeScript packages
- Configure git submodules
- Begin implementation (device linking → CV → modules)
Status: Designed, awaiting implementation
Goal: Enable Mobile to work fully offline, queue tasks when Desktop unavailable, auto-sync when Desktop comes online.
Documentation: See Device-Linking.md § Mobile Offline Architecture
Key insight: The current Metro bundler requirement is development-only. The production APK already works offline with an embedded JS bundle. Task queueing is a UX enhancement, not a fundamental requirement.
Implementation phases:
- Verify release APK works offline (no Metro/WiFi dependency)
- Task queue service (AsyncStorage persistence)
- Auto-sync service (background Desktop detection)
- Connection state persistence (remember paired Desktop)
- UI polish (status indicators, sync notifications)
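The task queue and auto-sync phases can be sketched as a queue that serializes to a string (which would be stored under an AsyncStorage key) and flushes in order once Desktop is reachable. The QueuedTask shape is hypothetical, and send is synchronous here only to keep the sketch short; a real sender would be asynchronous.

```typescript
// Hypothetical queued task; field names are assumptions.
interface QueuedTask {
  id: string;
  type: string;
  payload: string;
}

class OfflineTaskQueue {
  private tasks: QueuedTask[] = [];

  enqueue(task: QueuedTask): void {
    this.tasks.push(task);
  }

  size(): number {
    return this.tasks.length;
  }

  // String form suitable for persisting between app launches.
  serialize(): string {
    return JSON.stringify(this.tasks);
  }

  static restore(text: string): OfflineTaskQueue {
    const queue = new OfflineTaskQueue();
    queue.tasks = JSON.parse(text) as QueuedTask[];
    return queue;
  }

  // Sends tasks oldest-first and stops at the first failure,
  // so unsent tasks stay queued for the next sync attempt.
  flush(send: (task: QueuedTask) => boolean): number {
    let sent = 0;
    while (this.tasks.length > 0) {
      const next = this.tasks[0];
      if (next === undefined || !send(next)) break;
      this.tasks.shift();
      sent++;
    }
    return sent;
  }
}
```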
- Web platform support (deferred from MVP)
- Productivity & Homeowner modules (deferred from MVP)
- Automatic device discovery (deferred from MVP - manual QR pairing only)
- Marketplace for themed skins (deferred from MVP)
- Cloud sync as optional feature (deferred - fully local for MVP)
Decision: Rename and redesign Zelara Desktop as "Zelara Core" — a local AI hub that powers all Zelara mobile apps, not just a pairing/testing tool.
Reasoning: The device linking + TLS WSS infrastructure already built is exactly the
right foundation for a local AI service. Mobile apps already offload ONNX inference to
Desktop (recycling CV). Formalizing this as "Core = AI Hub" with a generic ai_task
dispatch system unlocks Finance OCR, DistilBERT categorization, LLM coaching, and all
future capabilities without any new protocol work.
Impact:
- Desktop app renamed "Zelara Core" in all UI and branding
- React frontend: 4-section nav (Hub / AI Models / Finance / Settings) replacing current layout
- Rust backend: new ai/ module directory: dispatcher.rs, model_manager.rs, ocr_receipt.rs, categorization.rs, llm.rs
- New generic ai_task WSS message type; existing image_validation retained for backward compat
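Seen from the mobile side, the two message types could be modeled as a discriminated union. This is a sketch only: the field names (taskId, imageBase64, payload) and the capability string used in the usage note are assumptions, and the real routing lives in the Rust backend.

```typescript
// Legacy message kept for backward compatibility.
interface ImageValidationRequest {
  type: "image_validation";
  taskId: string;
  imageBase64: string;
}

// Generic message; capability selects the handler on Core.
interface AiTaskRequest {
  type: "ai_task";
  taskId: string;
  capability: string;
  payload: unknown;
}

type TaskRequest = ImageValidationRequest | AiTaskRequest;

// Describes where a request would be routed; stands in for the real dispatch.
function routeTask(request: TaskRequest): string {
  switch (request.type) {
    case "image_validation":
      return "legacy:image_validation";
    case "ai_task":
      return "dispatcher:" + request.capability;
  }
}
```

For example, a request with type "ai_task" and a capability of "ocr_receipt" would be routed to "dispatcher:ocr_receipt", while older clients sending image_validation keep working unchanged.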
See: Zelara-Core.md for full design
Decision: Finance will be a separate app (apps/finance-mobile/) with its own App Store
listing, not a locked module inside Zelara Mobile.
Reasoning: The skill tree gate ("earn 50 points to unlock Finance") creates friction for users who only want to track spending. A standalone app is immediately useful, has better App Store discoverability in the Finance category, and builds more user trust for a finance product. The multi-app architecture validates the Core hub model — Zelara Finance is the first app that uses Core purely as an AI service.
See: Zelara-Finance.md for full design
Decision: Every mobile application is a standalone React Native project with its own
package name, App Store presence, and apps/ directory from day one. No "embed now, extract
later" pattern.
What changed: The 2026-04-04 Finance decision originally planned building Finance inside
apps/mobile/ during v0.6.0 and extracting it in v0.7.0. Additionally, the Testing module
was expected to remain inside the main app indefinitely.
New structure:
- apps/mobile/ - Zelara (ai.zelara.mobile) - Green, recycling, skill tree
- apps/finance-mobile/ - Zelara Finance (ai.zelara.finance) - standalone from v0.6.0
- apps/testing-mobile/ - Zelara Testing (ai.zelara.testing) - standalone from v0.6.0
Rationale: Eliminates the fragile relative import in modules/testing/mobile/ that
reached 5 directories back into apps/mobile/src/. Removes the v0.7.0 extraction tax
entirely. Makes architecture fully consistent — every entry in apps/ is an independent,
deployable app. Shared business logic lives in src/packages/ as pure TypeScript packages.
Shared RN infrastructure (DeviceLinkingService, BLE, ZelaraPinnedWebSocket, native TLS Kotlin
modules) is intentionally duplicated per app to keep each independently buildable.
See: Zelara-Finance.md, Repository-Structure.md
What was built:
- Desktop renamed Zelara Core: tauri.conf.json productName/identifier/title updated; Cargo.toml renamed zelara-core/zelara_core_lib; version bumped to 0.6.0
- Tab navigation: App.tsx replaced single-page layout with Hub / AI Models / Finance / Settings state-based tab nav. TestingPanel moved to Settings → Debug Tools.
- Rust ai/ module: dispatcher.rs, model_manager.rs, ocr_receipt.rs, categorization.rs (all stubs returning NotYetImplemented); AiTaskRequest, AiTaskResponse, AiTaskError types in mod.rs; list_ai_models + get_model_status Tauri commands registered
- Generic ai_task WSS message: TaskRequest gains capability: Option<String>; new "ai_task" match arm routes to ai::dispatcher::dispatch(); backward compat with image_validation preserved
- @zelara/finance package (src/packages/finance/): types.ts, schema.ts, FinanceRepository (abstract), CategorizationEngine, BudgetEngine, ForecastEngine, ReportParser (interface stub); zero react-native dependencies
- CI: guard step + @zelara/finance build in all 4 jobs; Windows exe path corrected to zelara-core.exe
What is stubbed (implemented in later phases):
- ocr_receipt.rs/categorization.rs handlers - Phase 5
- model_manager download + ONNX loading - Phase 5
- @zelara/finance SQLite implementation (concrete FinanceRepository) - Phase 3
- ReportParser implementations - Phase 4
- Finance / AI Models tab UI - Phase 4/5
All major architectural decisions made. Implementation phase in progress (Phase 1 of 7 complete).
What was built:
- apps/testing-mobile/ standalone app: Independent React Native 0.76.6 project (ai.zelara.testing, @zelara/app-testing-mobile); eliminates the fragile 5-level relative import from modules/testing/mobile/ into apps/mobile/src/services/
- Android re-namespace: 7 Kotlin TLS native modules copied from apps/mobile and re-namespaced to ai.zelara.testing (ZelaraWebSocketModule, ZelaraTLSModule, ZelaraTLSPackage, ZelaraTrustManagers, ZelaraOkHttpFactory, MainApplication, MainActivity)
- Brand identity: Zelara palette (#0c1e24 dark, #3daeae teal, #deedf2 light) applied to all screens; ZelaraLogoSub on TestingScreen, ZelaraLogo on DevicePairingScreen via react-native-svg@15.3.0 (pinned; 15.6+ incompatible with RN 0.76.6 Yoga API)
- tsconfig.json self-contained: Removed extends @react-native/typescript-config so the IDE resolves types correctly before node_modules is installed
- Git submodule: https://github.com/zelara-ai/testing-mobile.git registered at apps/testing-mobile in core repo
- CI: build-testing-android job added to release.yml; produces testing-android-apk artifact; create-release job updated to include testing APK in release assets
RN 0.76.6 native dependency pinning lesson:
All native packages must use exact versions (not ^ ranges) — apps/mobile/node_modules
is the source-of-truth baseline. Key pins: react-native-screens@4.4.0,
react-native-svg@15.3.0, react-native-safe-area-context@5.6.2.
Package.json overrides block enforces svg version across transitive deps.
Phase 2 of 7 complete.
What was built:
- apps/finance-mobile/ standalone app: Independent React Native 0.76.6 project (ai.zelara.finance, @zelara/app-finance-mobile); mirrors testing-mobile's structure
- Android re-namespace: 5 Kotlin TLS native modules (ZelaraWebSocketModule, ZelaraTLSModule, ZelaraTLSPackage, ZelaraTrustManagers, ZelaraOkHttpFactory) copied from apps/testing-mobile and re-namespaced to ai.zelara.finance; MainActivity/MainApplication register ZelaraFinance component
- FinanceService.ts: Concrete implementation of the FinanceRepository abstract class from @zelara/finance; uses expo-sqlite openDatabaseAsync API; seeds DEFAULT_CATEGORIES on first init; maps camelCase fields to snake_case columns; generates IDs without uuid dep
- BiometricService.ts: expo-local-authentication wrapper; authenticate() is called on every AppState 'active' event via listener in App.tsx; device without hardware returns true (no-op gate)
- Stub services: OCRService, CategorizationService, SMSImportService, ReportImportService; each returns empty/null with detailed Phase 4/5 comments
- CategorizationService: wraps CategorizationEngine from @zelara/finance for offline keyword-based categorization (Phase 5 upgrades to Core DistilBERT)
- Navigation: RootStack (GetStarted / Main / DevicePairing) + MainTabNavigator (Overview / Transactions / Analytics / Reports); first-launch routing via @zelara_finance_launched AsyncStorage key; GetStarted sets key then replaces to Main
- Biometric overlay: LockedOverlay rendered above NavigationContainer when locked === true; AppState listener fires on every foreground; no lock on initial mount
- ConnectionBanner: subscribes to DeviceLinkingService.onConnectionChange; renders a teal banner only when disconnected
- AndroidManifest: READ_SMS, USE_BIOMETRIC, USE_FINGERPRINT declared from day one (required by stores even before feature is wired)
- Git submodule: https://github.com/zelara-ai/finance-mobile.git registered at apps/finance-mobile in core repo
- CI: build-finance-android job added to release.yml; builds @zelara/finance before npm install; produces finance-android-apk / zelara-finance.apk; create-release updated
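The camelCase-to-snake_case mapping mentioned for FinanceService.ts could look roughly like the following. This is an illustrative sketch, not the actual implementation; the example field names in the usage note are hypothetical.

```typescript
// Converts a camelCase key to snake_case, e.g. "categoryId" to "category_id".
function toSnakeCase(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => "_" + ch.toLowerCase());
}

// Builds a column-keyed row from a camelCase record; values are copied as-is.
function toRow(record: Record<string, unknown>): Record<string, unknown> {
  const row: Record<string, unknown> = {};
  for (const key of Object.keys(record)) {
    row[toSnakeCase(key)] = record[key];
  }
  return row;
}
```

Calling toRow on a record with merchantName and createdAt fields yields keys merchant_name and created_at, which can then be bound to SQL parameters.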
Phase 3 of 7 complete.
What was built:
- Chart library: react-native-gifted-charts@1.4.49 + react-native-linear-gradient@2.8.3 added to apps/finance-mobile/package.json; gifted-charts peers with existing react-native-svg@15.12.1; flat fills used in Phase 4 (no react-native-linear-gradient runtime use); both pinned in overrides
- Theme.ts: single color constant file (bg, teal, light, border, muted) eliminates magic strings across all new components
- BudgetProgressBar.tsx: animated horizontal progress bar using RN built-in Animated (useNativeDriver: false for width); color shifts orange at 90%, red over 100% of limit
- TransactionRow.tsx: collapsible transaction row with source badge pills (MANUAL/OCR/IMPORT/SMS), inline expand showing merchant/notes + red Delete button
- DonationCard.tsx: renders when user is under budget; shows savings amount framed as a green-causes contribution; returns null when over budget
- SpendCoachCard.tsx: renders SpendCoachInsight[] from BudgetEngine.checkBudgetStatus; shows a green "on track" card when no insights; accent colors per insight type
- ExpenseFormSheet.tsx: Modal-based slide-up form; Animated.Value(500) → 0 on open, 0 → 500 on close (useNativeDriver: true, translateY); description blur triggers CategorizationService.categorize() with teal suggestion chip; saves via FinanceService.addTransaction() + ProgressService.awardPoints(5) (fire-and-forget)
- App.tsx FAB: MainTabNavigator converted to stateful component; teal + FAB (position:absolute, bottom:88, right:24) overlays all 4 tabs; tab icons via emoji text (no icon library); ExpenseFormSheet rendered as sibling of Tab.Navigator
- OverviewScreen: useFocusEffect + Promise.all for txns/budgets/cats; monthly spend/income summary card; BudgetProgressBar per budget.categoryLimits entry; DonationCard when under budget; last 5 transactions via TransactionRow
- TransactionsScreen: SectionList grouped by YYYY-MM-DD, sorted DESC; horizontal filter pills (All / Manual / OCR / Import / SMS); inline expand + delete
- AnalyticsScreen: PieChart donut (gifted-charts) for monthly category spend; horizontal category selector + LineChart for 6-month trend + forecast data point; SpendCoachCard for insights; "Upcoming" forecasts table (top 3 categories, ForecastEngine.forecast() linear regression, trend icons)
- ReportsScreen: four action cards: Scan Receipt (locked if Core disconnected, Phase 5 stub), Import Bank File (Phase 5 stub), Read Bank SMS (Android only, SMSImportService stub), Export CSV (live: generates CSV via react-native-fs + Share.share)
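The linear-regression forecast used for the 6-month trend can be illustrated with an ordinary least-squares fit over month indices 0..n-1, extrapolated one step ahead. This is a generic sketch of the technique, not the ForecastEngine code; the function name and the handling of short inputs are assumptions.

```typescript
// Fits y = intercept + slope * x over x = 0..n-1 and returns the value at x = n.
function forecastNext(values: number[]): number {
  const n = values.length;
  if (n === 0) return 0; // assumption: no history means no forecast
  if (n === 1) return values[0] ?? 0; // a single point cannot define a trend

  let sumX = 0;
  let sumY = 0;
  let sumXY = 0;
  let sumXX = 0;
  for (let i = 0; i < n; i++) {
    const y = values[i] ?? 0;
    sumX += i;
    sumY += y;
    sumXY += i * y;
    sumXX += i * i;
  }
  // The denominator is non-zero for n >= 2 because the x values are distinct.
  const slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
  const intercept = (sumY - slope * sumX) / n;
  return intercept + slope * n;
}
```

For six months of spend rising by 10 each month from 100 to 150, the fitted slope is 10 and the intercept is 100, so the forecast for the seventh month is 160.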
What is stubbed (later phases):
- SMSImportService.importTransactions() - returns [] until Phase 5 SMS parser
- ReportImportService.importFile() - Alert stub, document picker deferred to Phase 5
- Scan Receipt OCR - requires Phase 5 ocr_receipt Rust backend on Core
Phase 4 of 7 complete.
Phase 5A — AI Runtime Platform foundation:
- Model manager (ai/model_manager.rs): list_ai_models, get_model_status, download_model_cmd Tauri commands; ensure_capability_ready() handles auto-download + session loading; deduplicates concurrent download requests
- Hardware tier (ai/hardware_tier.rs): HardwareTier (Light/Medium/Heavy); detects AC power via windows::Win32::System::Power::GetSystemPowerStatus; RAM via sysinfo; cached in AiRuntimeState at startup
- ONNX runner (ai/onnx_runner.rs): thin OnnxRunner wrapper; exposes pub session: Mutex<Session> for handlers to use directly with ort 2.0.0-rc.12 API
- Model manifest (assets/model_manifest.json): real URLs + SHA-256 for PaddleOCR det+rec and MiniLM embedder; bundled via include_str!
- Dispatcher (ai/dispatcher.rs): routes ai_task by capability; tier check; activity ring buffer; device_linking.rs async ai_task arm
- Finance import (finance_import.rs): CSV / OFX (quick-xml) / XLSX (calamine) parsers; push_finance_transactions Tauri command
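The dispatcher's tier check can be illustrated in TypeScript, although the real code is Rust. The capability names and their minimum tiers below are assumptions chosen for the example; only the Light/Medium/Heavy ordering comes from the notes above.

```typescript
type HardwareTier = "Light" | "Medium" | "Heavy";

const TIER_RANK: Record<HardwareTier, number> = { Light: 0, Medium: 1, Heavy: 2 };

// Hypothetical capability table: names and minimum tiers are assumptions.
const MIN_TIER: Record<string, HardwareTier> = {
  ocr_receipt: "Medium",
  categorization: "Light",
};

// True when the capability is known and the device tier meets its minimum.
function canRun(capability: string, tier: HardwareTier): boolean {
  if (!Object.prototype.hasOwnProperty.call(MIN_TIER, capability)) return false;
  const required = MIN_TIER[capability];
  if (required === undefined) return false;
  return TIER_RANK[tier] >= TIER_RANK[required];
}
```

A dispatcher built this way would reject unknown capabilities and under-powered devices before loading any model.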
Phase 5B+ — Receipt scanning pipeline overhaul:
Accuracy:
- Real per-field confidence (ai/ocr_receipt.rs): FieldConfidence { merchant, total, date, line_quality, overall } replaces flat constants; merchant scored via edit-distance match against known list; total via keyword priority vs. largest-amount fallback; date via recency check; line quality = printable-ASCII ratio; completion threshold overall ≥ 0.88 (was 0.95 on fake scores)
- Image preprocessing: deskew via gradient histogram (1°-bin Hough approximation, 0.5°–12° gate); 3×3 median filter; adaptive local-mean threshold (15×15 window, C=7); new "deskewed_adaptive" variant competes against 4 existing passes; winner logged with per-variant scores
- PaddleOCR ONNX integration: two-stage pipeline: detection (paddleocr-det-v4-en, 4.83 MB) produces bounding boxes; recognition (paddleocr-rec-v4-en, 16.6 MB) runs CTC decoding per crop; results sorted by Y-coordinate (reading order); gated to Medium/Heavy tier; Windows OCR retained as fallback
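Two of the confidence rules above are simple enough to sketch: line quality as a printable-ASCII ratio, and the completion check against the 0.88 threshold. The real code is Rust; this TypeScript version is illustrative, and how overall is combined from the other fields is not specified here, so it is taken as an input.

```typescript
// Fraction of characters in the printable ASCII range (0x20 to 0x7E).
// Returning 0 for an empty string is an assumption.
function lineQuality(text: string): number {
  if (text.length === 0) return 0;
  let printable = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 0x20 && code <= 0x7e) printable++;
  }
  return printable / text.length;
}

const COMPLETION_THRESHOLD = 0.88;

// Mirrors the field names listed for FieldConfidence.
interface FieldConfidence {
  merchant: number;
  total: number;
  date: number;
  line_quality: number;
  overall: number;
}

// A receipt counts as complete when overall confidence meets the threshold.
function isComplete(confidence: FieldConfidence): boolean {
  return confidence.overall >= COMPLETION_THRESHOLD;
}
```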
Robustness:
- Persistent Qwen3 subprocess (ai/qwen_receipt.rs): PersistentQwenProcess { child, stdin, stdout }; spawned once with --server flag; qwen_receipt_cleanup.py --server pre-loads model then processes stdin JSON lines indefinitely; 18s per-call timeout kills + resets process; auto-respawn on next call; python_available bool checked at startup (rejects Python < 3.9 fast)
- Dispatcher timeout: tokio::time::timeout(45s) wraps each capability handler; AiTaskError::Timeout returns structured error to mobile before mobile's 90s deadline
- Retry cap: MAX_JOB_RETRIES = 5; jobs exceeding cap → "failed" status; ReceiptQueueState.shutdown() signals processing loop via Arc<AtomicBool>
- Safe concurrency: eliminated all unsafe { &*(ptr as *const T) } raw pointer patterns; AiRuntimeState and ReceiptQueueState managed as Arc<T>; inner mutable fields (downloading, loaded_sessions, download_progress) wrapped in Arc<Mutex<>> for spawn safety
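The retry cap can be sketched as a small state transition. This assumes "exceeding cap" means strictly more than MAX_JOB_RETRIES failed attempts; the ReceiptJob shape and the "pending" status name are hypothetical.

```typescript
const MAX_JOB_RETRIES = 5;

type JobStatus = "pending" | "failed";

// Hypothetical job record; only the retry counter and status matter here.
interface ReceiptJob {
  id: string;
  retries: number;
  status: JobStatus;
}

// Records one failed attempt and marks the job failed once the cap is exceeded.
function recordFailure(job: ReceiptJob): ReceiptJob {
  const retries = job.retries + 1;
  const status: JobStatus = retries > MAX_JOB_RETRIES ? "failed" : "pending";
  return { id: job.id, retries: retries, status: status };
}
```

Under this reading a job is retried five times and moves to "failed" on the sixth failure, which keeps a permanently unreadable receipt from looping forever.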
Structured logging:
- ReceiptLogger (receipt_queue.rs): 500-entry in-memory ring buffer + JSONL file (logs/receipt_ocr.jsonl, rotates at 10 MB); get_receipt_log(job_id?) Tauri command; ReceiptLogEntry carries stage, level, duration, error detail, variant scores
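A fixed-capacity ring buffer like the 500-entry log buffer overwrites its oldest entry once full. The sketch below shows the general technique in TypeScript; it is not the Rust ReceiptLogger, and the class and method names are illustrative.

```typescript
class RingBuffer<T> {
  private items: T[] = [];
  private start = 0; // index of the oldest entry once the buffer is full
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  // Appends an entry, overwriting the oldest one when the buffer is full.
  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item);
    } else {
      this.items[this.start] = item;
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Returns entries from oldest to newest.
  toArray(): T[] {
    return this.items.slice(this.start).concat(this.items.slice(0, this.start));
  }
}
```

With a capacity of 500, memory stays bounded however many receipts are processed, while the JSONL file keeps the longer history.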
User story — gallery folder sync:
- WATCHED_FOLDER (ReceiptQueueService.ts): Android ExternalStorageDirectoryPath/Zelara/receipts, iOS DocumentDirectoryPath/ZelaraReceipts; scanWatchedFolder() requests READ_MEDIA_IMAGES permission, reads dir, enqueues new images, persists filename→receiptId map in AsyncStorage; moves processed files to processed/ subfolder after Desktop acknowledges; hooked into onConnectionChange + ReportsScreen mount
- Parallel upload: sequential loop replaced with chunked Promise.all() (3 concurrent); named quality constants replace all magic numbers
- Finance.tsx Processing Log: convertFileSrc(imagePath) replaces broken base64 image display; per-field confidence shown inline; expandable "Processing Log" section per receipt showing stage/duration/level/error/variant scores via get_receipt_log Tauri command
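The chunked Promise.all() upload can be sketched as follows: items are split into batches of three, and each batch is awaited before the next starts. The helper names (chunk, uploadAll, UPLOAD_CONCURRENCY) are illustrative, and the upload callback stands in for the real transfer to Core.

```typescript
const UPLOAD_CONCURRENCY = 3;

// Splits items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Uploads batch by batch; within a batch the uploads run concurrently.
// Results are returned in the same order as the input items.
async function uploadAll<T, R>(
  items: T[],
  upload: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, UPLOAD_CONCURRENCY)) {
    const batchResults = await Promise.all(batch.map((item) => upload(item)));
    for (const result of batchResults) {
      results.push(result);
    }
  }
  return results;
}
```

Because Promise.all rejects as soon as one upload fails, a production version would likely catch errors per item so that one bad image does not discard the rest of its batch.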
Phase 5 of 7 complete.