J.A.R.V.I.S — MARK XXXIX (v2.0)

Just A Rather Very Intelligent System
The Last Monolithic Architecture Before the Agentic Evolution
Version 2.0 | Developer: Charuka Mayura Bandara
Forked from FatihMakes


Executive Summary

JARVIS Mark XXXIX v2.0 is a personal AI assistant built on a monolithic, threaded architecture engineered for reliability, performance, and direct system control. This iteration improves on v1.0 with an enhanced memory flow, self-learning capabilities, and more robust error recovery.

v2.0 Improvements

| Feature | v1.0 | v2.0 |
| --- | --- | --- |
| Memory Flow | Basic RAG | Advanced RAG with auto-consolidation |
| Self-Learning | No | LangGraph-based Reflexion cycle |
| Embedding Model | text-embedding-004 (deprecated) | gemini-embedding-001 |
| API Client | google.generativeai | google.genai SDK (v1.0+) |
| Error Recovery | Manual | Autonomous with heuristic fallback |

Note: This is the last iteration of our monolithic architecture. JARVIS is currently in a migration-readiness state, with all components prepared for our transition to a LangGraph-driven Multi-Agent Agentic AI framework.


🏗️ System Architecture

Overview

JARVIS Mark XXXIX v2.0 operates on a hybrid monolithic-threaded architecture — a single-threaded event loop with asynchronous capabilities, thread-safe queue management, and synchronous tool execution.
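
As a rough illustration of that description, the sketch below drains a thread-safe queue from a single asyncio loop and runs each tool call synchronously, one at a time. It is not taken from main.py: `run_tool`, `producer`, and the tuple format on the queue are hypothetical stand-ins.

```python
import asyncio
import queue
import threading


def run_tool(name: str, params: dict) -> str:
    """Hypothetical stand-in for a synchronous action module call."""
    return f"{name} done with {params}"


async def event_loop(requests: queue.Queue) -> list:
    """Consume requests from a thread-safe queue inside one asyncio loop.

    Blocking calls are handed to a worker thread via asyncio.to_thread so the
    loop stays responsive, but tools still execute strictly one after another.
    """
    results = []
    while True:
        item = await asyncio.to_thread(requests.get)  # blocks off-loop
        if item is None:  # sentinel: no more work
            return results
        name, params = item
        results.append(await asyncio.to_thread(run_tool, name, params))


def producer(requests: queue.Queue) -> None:
    """Simulates an input thread (audio, UI) enqueueing work."""
    requests.put(("open_app", {"app": "notepad"}))
    requests.put(("web_search", {"query": "weather"}))
    requests.put(None)


def demo() -> list:
    requests: queue.Queue = queue.Queue()
    thread = threading.Thread(target=producer, args=(requests,))
    thread.start()
    results = asyncio.run(event_loop(requests))
    thread.join()
    return results
```

Calling `demo()` returns one result string per queued request, in the order the requests were enqueued.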

┌──────────────────────────────────────────────────────────────────────────────┐
│                         JARVIS ARCHITECTURE v2.0                             │
├──────────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────────────────────┐ │
│  │                   USER INTERFACE LAYER (PyQt6 HUD)                      │ │
│  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐   │ │
│  │  │   HUD Canvas │ │  Metrics     │ │ File Drop    │ │  Status      │   │ │
│  │  │   (Visual)   │ │  (System)    │ │  Zone        │ │  Indicator   │   │ │
│  │  └───────┬──────┘ └───────┬──────┘ └───────┬──────┘ └───────┬──────┘   │ │
│  └───────────┼────────────────┼────────────────┼────────────────┼───────────┘ │
│              │                │                │                │              │
│              ▼                ▼                ▼                ▼              │
│  ┌─────────────────────────────────────────────────────────────────────────┐ │
│  │                    CORE ENGINE LAYER                                     │ │
│  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐   │ │
│  │  │ Audio Queue  │ │ Text Input   │ │ File Loader  │ │ Event Loop   │   │ │
│  │  │ Management   │ │ Handler      │ │ Manager      │ │ (Async)      │   │ │
│  │  └───────┬──────┘ └───────┬──────┘ └───────┬──────┘ └───────┬──────┘   │ │
│  └───────────┼────────────────┼────────────────┼────────────────┼───────────┘ │
│              │                │                │                │              │
│              ▼                ▼                ▼                ▼              │
│  ┌─────────────────────────────────────────────────────────────────────────┐ │
│  │                    AGENTIC PROCESSING LAYER                              │ │
│  │  ┌───────────────────────────────────────────────────────────────────┐  │ │
│  │  │   JarvisRAGProcessor v2                                           │  │ │
│  │  │   - Short-term: SQLite conversation history                     │  │ │
│  │  │   - Long-term: ChromaDB semantic memory                         │  │ │
│  │  │   - Auto-memory extraction & consolidation (NEW)                │  │ │
│  │  │   - Heuristic fallback for or_client (NEW)                      │  │ │
│  │  └───────────────────────────────────────────────────────────────────┘  │ │
│  │  ┌───────────────────────────────────────────────────────────────────┐  │ │
│  │  │   SelfLearningEngine (NEW)                                        │  │ │
│  │  │   - Generator → Executor → Critic → Router                       │  │ │
│  │  │   - Reflexion cycles for autonomous code correction             │  │ │
│  │  └───────────────────────────────────────────────────────────────────┘  │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
│                 ┌───────────────────────┐                                    │
│                 │       ROUTER v2       │                                    │
│                 └──────────┬────────────┘                                    │
│                            ▼                                                 │
│  ┌─────────────────────────────────────────────────────────────────────────┐ │
│  │                        ACTION MODULES (17+)                             │ │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────────────┐   │ │
│  │  │ open_app│ │web_search│ │ browser │ │ file_con│ │  screen_process│   │ │
│  │  │         │ │         │ │ control │ │ troller │ │  (camera+screen) │   │ │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘ └──────────────────┘   │ │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────────────┐   │ │
│  │  │computer │ │computer │ │ game_up │ │ flight_ │ │  code_helper v2  │   │ │
│  │  │ control │ │ settings│ │ dated   │ │ finder  │ │  (Self-Learning) │   │ │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘ └──────────────────┘   │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘

Architecture Components

| Layer | Components | Responsibility |
| --- | --- | --- |
| UI Layer | ui.py (1,530 lines) | PyQt6 HUD, metrics display, file drop zones |
| Core Engine | main.py (1,120 lines) | Event loop, audio queues, tool routing |
| Processing | memory/rag_processor.py | RAG pipeline, auto-memory, conversation history |
| Self-Learning | agent/self_learning_engine.py | Reflexion cycles, code generation, error recovery |
| Planning | agent/planner.py | Goal decomposition into tool steps |
| Execution | agent/executor.py | Plan execution, retry logic, error recovery |
| Error Handling | agent/error_handler.py | Recovery decisions, replanning suggestions |
| Action Modules | 17 modules in actions/ | System control, web automation, file ops |

Technical Stack

| Component | Technology | Version |
| --- | --- | --- |
| Core Language | Python | 3.10+ |
| UI Framework | PyQt6 | 6.5+ |
| LLM SDK | Google Gen AI | v1.0+ (Modern) |
| Legacy SDK | google.generativeai | 0.x+ (Fallback) |
| Vector DB | ChromaDB | 0.4+ |
| Short-term Memory | SQLite | Built-in |
| Audio Processing | SoundDevice | 0.4+ |
| Web Automation | Playwright | 1.40+ |
| Screen Capture | mss | 6.1+ |
| Computer Control | pyautogui | 0.9+ |
| Multi-Agent | LangGraph | 0.1+ |

🎯 Key Capabilities

1. Web & Information Access

| Module | Lines | Capabilities |
| --- | --- | --- |
| web_search.py | 385 | Serper API + Playwright stealth deep scraping, ChromaDB storage |
| browser_control.py | 519 | Full Playwright automation (navigation, form filling, scraping) |
| file_processor.py | 832 | Document analysis (PDF, DOCX, XLSX, images) |
| screen_processor.py | 367 | Screenshot capture + camera feed + Gemini Live analysis |

Use Cases:

  • Real-time web research with deep content scraping
  • Automated form submission and data extraction
  • Multi-format document processing
  • Visual understanding of screen content

2. System & Application Control

| Module | Lines | Capabilities |
| --- | --- | --- |
| open_app.py | 207 | Windows app launching with path detection |
| computer_control.py | 477 | Mouse/keyboard automation, screenshot, screen finding |
| computer_settings.py | 671 | OS configuration (brightness, volume, network) |
| desktop.py | 456 | Desktop wallpaper, organization, cleanup |

Use Cases:

  • Context-aware application launching
  • Automated system configuration
  • Windows automation for repetitive tasks
  • Desktop environment management

3. Communication & Media

| Module | Lines | Capabilities |
| --- | --- | --- |
| send_message.py | 214 | WhatsApp, Telegram messaging |
| youtube_video.py | 417 | YouTube video control, summarization |
| weather_report.py | 61 | Current weather and forecast |
| flight_finder.py | 338 | Flight search with price tracking |

Use Cases:

  • Instant messaging automation
  • Media playback control and content discovery
  • Travel planning and flight monitoring
  • Weather-based decision support

4. Task Automation & Productivity

| Module | Lines | Capabilities |
| --- | --- | --- |
| reminder.py | 155 | Windows Task Scheduler integration |
| file_controller.py | 481 | File I/O operations, search, disk usage |
| game_updater.py | 816 | Steam/Epic game management |
| code_helper.py | 582 | Code generation, editing, execution |

Use Cases:

  • Scheduled task automation
  • File system management
  • Gaming ecosystem management
  • Developer productivity assistant

5. Advanced AI Operations (v2.0)

| Module | Lines | Capabilities |
| --- | --- | --- |
| dev_agent.py | 596 | Agent task delegation and orchestration |
| code_helper.py | 582 | Code generation, explanation, debugging |
| self_learning_engine.py | 810 | Reflexion cycles, autonomous code correction |
| planner.py | 240 | Goal decomposition, step planning |
| executor.py | 400 | Plan execution, retries, error recovery |
| error_handler.py | 196 | Error analysis, recovery suggestions |

Use Cases (v2.0):

  • Multi-step task orchestration
  • Autonomous error recovery with reflexion
  • Self-correcting code generation
  • Memory extraction with heuristic fallback
  • Goal-oriented task execution

🚀 Installation & Setup

Prerequisites

  • Python: 3.10 or higher
  • Operating System: Windows 10/11 (primary), Linux (partial support), macOS (limited)
  • Hardware: Minimum 4GB RAM, modern CPU
  • API Keys: Gemini API, OpenRouter API (for advanced features)

Quick Start

# Clone the repository
git clone https://github.com/SLxnoat/Project-J.A.R.V.I.S.git
cd Project-J.A.R.V.I.S

# Create and activate virtual environment
python -m venv jarvis-env
source jarvis-env/bin/activate  # Windows: jarvis-env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Initialize configuration
# Edit config/api_keys.json with your Gemini and OpenRouter API keys
# Edit config/config.json with system preferences

# Launch JARVIS
python main.py

Dependencies

| Package | Purpose | Version |
| --- | --- | --- |
| google-genai | Gemini LLM integration | 1.0+ |
| chromadb | Vector database | 0.4+ |
| playwright | Web automation | 1.40+ |
| playwright-stealth | Anti-bot bypass | 1.0+ |
| PyQt6 | UI framework | 6.5+ |
| sounddevice | Audio processing | 0.4+ |
| psutil | System monitoring | 5.9+ |
| python-dotenv | Environment config | 1.0+ |
| opencv-python | Image processing | 4.8+ |
| mss | Screen capture | 6.1+ |

Post-Installation Setup

After installing dependencies, run:

# Install Playwright browsers
playwright install chromium
playwright install-deps

# Verify installation
python -c "from memory.memory_manager import jarvis_memory; print('JARVIS Ready!')"

⚙️ Configuration

API Keys (config/api_keys.json)

{
  "gemini_api_key": "AIza...",
  "openrouter_api_key": "sk-or-v1-...",
  "serper_api_key": "your-serper-key-if-available"
}

Environment Variables (Optional)

# .env file in project root
GEMINI_API_KEY=AIza...
OPENROUTER_API_KEY=sk-or-v1-...
SERPER_API_KEY=your-serper-key-if-available
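
With keys available in two places, some lookup order has to apply. The sketch below assumes environment variables take precedence over config/api_keys.json; that precedence is an assumption, not something this README states, and `resolve_api_key` is a hypothetical helper rather than the project's config/loader.py.

```python
import json
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(
    name: str, keys_file: Path = Path("config/api_keys.json")
) -> Optional[str]:
    """Return an API key, checking the environment first, then the JSON file.

    `name` is the JSON field, e.g. "gemini_api_key"; the matching environment
    variable is assumed to be its upper-case form (GEMINI_API_KEY).
    """
    env_value = os.environ.get(name.upper())
    if env_value:
        return env_value
    if keys_file.exists():
        data = json.loads(keys_file.read_text(encoding="utf-8"))
        return data.get(name.lower()) or None
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal (Gemini) or optional (Serper).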

Configuration Options (config/config.json)

| Option | Default | Description |
| --- | --- | --- |
| audio_sample_rate | 16000 | Microphone sample rate (Hz) |
| output_sample_rate | 24000 | Speaker sample rate (Hz) |
| max_memory_facts | 10 | Maximum facts to load |
| tool_timeout | 30 | Tool execution timeout (seconds) |
| retry_attempts | 3 | Maximum tool retry attempts |
| debug_mode | false | Enable verbose logging |
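
A minimal sketch of how these defaults could be merged with a user-supplied config/config.json. The option names and default values come from the table above; `load_config` itself is hypothetical and may differ from the project's loader.

```python
import json
from pathlib import Path

# Defaults copied from the configuration table.
DEFAULTS = {
    "audio_sample_rate": 16000,
    "output_sample_rate": 24000,
    "max_memory_facts": 10,
    "tool_timeout": 30,
    "retry_attempts": 3,
    "debug_mode": False,
}


def load_config(path: Path = Path("config/config.json")) -> dict:
    """Return the defaults overlaid with any values found in the JSON file."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

Only the options present in the file are overridden, so a config.json containing just `{"tool_timeout": 60}` leaves every other setting at its default.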

🧠 Memory & Self-Learning Flow (v2.0)

Memory Architecture

JARVIS v2.0 implements a dual-memory system:

  1. Short-term Memory (SQLite)

    • Conversation history (last 5 turns)
    • Fast retrieval for current session
    • Automatic persistence
  2. Long-term Memory (ChromaDB)

    • Semantic recall using embeddings
    • Persistent storage across sessions
    • Auto-consolidation of important facts
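
The short-term half of this design can be sketched with the standard-library sqlite3 module. The table name, schema, and function names below are illustrative assumptions, not the layout of jarvis_memory/short_term.db.

```python
import sqlite3


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a conversation-history database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS turns ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def save_turn(conn: sqlite3.Connection, role: str, content: str) -> None:
    """Persist one message; the context manager commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO turns (role, content) VALUES (?, ?)", (role, content)
        )


def recent_turns(conn: sqlite3.Connection, limit: int = 5) -> list:
    """Return the last `limit` rows, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM turns ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return rows[::-1]
```

Passing a file path instead of `":memory:"` gives the persistence across restarts that the list describes.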

Memory Flow Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                    MEMORY FLOW v2.0                                      │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  USER INPUT → RAG Processor → should_extract_memory()                  │
│                                  │                                      │
│                                  ├─ LLM (or_client)                    │
│                                  ├─ Heuristic Fallback (NEW)           │
│                                  ▼                                      │
│                           extract_memory()                             │
│                                  │                                      │
│                                  ▼                                      │
│                           update_memory()                              │
│                                  │                                      │
│                                  ├─ store_permanent_fact()             │
│                                  ├─ embed_text()                       │
│                                  └─ ChromaDB storage                   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
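
The decision step at the top of this flow can be sketched as "ask the LLM, and fall back to pattern matching if the client is missing or fails". The patterns and the `should_extract_sketch` name are illustrative assumptions; the project's actual should_extract_memory() heuristics are not shown in this README.

```python
import re
from typing import Callable, Optional

# Hypothetical cues that a message contains a fact worth remembering.
PERSONAL_PATTERNS = [
    re.compile(r"\bmy name is\b", re.IGNORECASE),
    re.compile(r"\bi (?:like|love|prefer|hate)\b", re.IGNORECASE),
    re.compile(r"\bi (?:live|work) (?:in|at)\b", re.IGNORECASE),
]


def heuristic_should_extract(text: str) -> bool:
    """Cheap offline check used when no LLM judgement is available."""
    return any(pattern.search(text) for pattern in PERSONAL_PATTERNS)


def should_extract_sketch(
    text: str, llm_judge: Optional[Callable[[str], bool]] = None
) -> bool:
    """Prefer the LLM's verdict; fall back to heuristics on any failure."""
    if llm_judge is not None:
        try:
            return bool(llm_judge(text))
        except Exception:
            pass  # client unavailable or errored: use the fallback below
    return heuristic_should_extract(text)
```

The fallback keeps memory extraction working without an OpenRouter key, at the cost of missing facts that do not match a known pattern.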

Self-Learning Engine (Reflexion Cycle)

┌─────────────────────────────────────────────────────────────────────────┐
│              SELF-LEARNING ENGINE v2.0                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  START → node_generate_or_adapt (Generator)                            │
│                  │                                                      │
│                  ▼                                                      │
│       node_execute_and_observe (Executor)                              │
│                  │                                                      │
│                  ▼                                                      │
│         node_critique_and_reflect (Critic)                             │
│                  │                                                      │
│                  ├─ SUCCESS → END                                       │
│                  ├─ EXHAUSTED → END                                     │
│                  └─ REGENERATE → node_generate_or_adapt (Back-edge)    │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
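
The cycle above can be approximated in plain Python without LangGraph. This is a sketch of the control flow only: `CycleState`, `run_reflexion`, and the `generate` callback are hypothetical names, and running generated code with `exec` is unsafe outside a sandbox.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CycleState:
    task: str
    code: str = ""
    error: Optional[str] = None
    attempts: int = 0
    reflections: List[str] = field(default_factory=list)


def run_reflexion(
    task: str,
    generate: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> CycleState:
    """Loop Generator -> Executor -> Critic until success or exhaustion."""
    state = CycleState(task=task)
    while True:
        # Generator: produce code, taking earlier failures into account.
        state.code = generate(state.task, state.reflections)
        state.attempts += 1
        # Executor: run the candidate and capture any failure.
        try:
            exec(compile(state.code, "<generated>", "exec"), {})
            state.error = None
        except Exception as exc:
            state.error = f"{type(exc).__name__}: {exc}"
        # Critic: decide between SUCCESS, EXHAUSTED, and REGENERATE.
        if state.error is None:
            return state
        if state.attempts >= max_attempts:
            return state
        state.reflections.append(state.error)
```

Feeding the accumulated error messages back into `generate` is what distinguishes this from a plain retry loop.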

Memory API (v2.0)

from memory.memory_manager import (
    save_interaction,
    get_recent_context,
    store_permanent_fact,
    recall_relevant_facts,
    load_memory,
    update_memory,
    should_extract_memory,
    extract_memory,
)

# Save conversation turn
save_interaction("user", "Hello, my name is John")
save_interaction("jarvis", "Hello John, nice to meet you")

# Get recent context
context = get_recent_context(5)

# Store permanent fact
store_permanent_fact(
    "User name is John",
    metadata={"category": "identity", "key": "name"}
)

# Recall relevant facts
facts = recall_relevant_facts("user name", n_results=3)

# Update memory from structured data
update_memory({
    "identity": {"name": {"value": "John"}},
    "preferences": {"favorite_language": {"value": "Python"}}
})

📋 Migration Roadmap

Current State: Monolithic Architecture v2.0

JARVIS Mark XXXIX v2.0 operates as a single-threaded event loop with synchronous tool execution. While robust and reliable, this architecture has limitations:

| Limitation | Impact |
| --- | --- |
| Sequential tool execution | 7.5s for 3-tool queries |
| No autonomous replanning | User intervention required |
| Tightly coupled modules | Difficult testing and maintenance |

Strategic Migration to LangGraph

We are migrating to a LangGraph-driven Multi-Agent Agentic AI architecture.

Migration Timeline

| Phase | Duration | Status |
| --- | --- | --- |
| Phase 0: Foundation | Week 1 | ✅ Complete |
| Phase 1: RAG Migration | Week 2 | ✅ Complete |
| Phase 2: Self-Learning Engine | Week 3 | ✅ Complete |
| Phase 3: Memory Flow Fix | Week 4 | ✅ Complete |
| Phase 4: Tool Nodes | Week 5 | ✅ Complete |
| Phase 5: State Persistence | Week 6 | ✅ Complete |
| Phase 6: Audio Integration | Week 7 | Pending |
| Phase 7: Final Integration | Week 8 | Pending |

Benefits of Migration

| Feature | Before v2.0 | After Migration |
| --- | --- | --- |
| Parallel Tool Execution | Sequential | 4-7x faster |
| Error Recovery | Manual | Autonomous |
| State Persistence | SQLite only | MemorySaver + ChromaDB |
| Testability | Integration tests | 80%+ node coverage |
| Code Organization | Monolithic | Modular nodes |
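
To make the sequential versus parallel row concrete, the sketch below times the same independent calls run one after another and then concurrently in a thread pool. `fake_tool` is a hypothetical stand-in that only sleeps; real speedups depend on how many tools a query needs and whether they are independent and I/O-bound.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_tool(name: str, delay: float) -> str:
    """Pretend tool: waits `delay` seconds, then returns its name."""
    time.sleep(delay)
    return name


def run_sequential(calls: list) -> list:
    """Run each call only after the previous one finishes."""
    return [fake_tool(name, delay) for name, delay in calls]


def run_concurrent(calls: list) -> list:
    """Run all calls at once; results keep the submission order."""
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(fake_tool, name, delay) for name, delay in calls]
        return [future.result() for future in futures]


def timed(fn, calls: list) -> tuple:
    """Return (result, elapsed_seconds) for fn(calls)."""
    start = time.perf_counter()
    result = fn(calls)
    return result, time.perf_counter() - start
```

With three equally slow calls, the sequential time is roughly the sum of the delays while the concurrent time is roughly the longest single delay.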

New Capabilities Enabled

Phase 4 - Tool Nodes Architecture (NEW)

  1. LangGraph StateGraph Integration: Full migration to typed state channels
  2. Parallel Tool Execution: Execute multiple tools concurrently (4-7x faster)
  3. Autonomous Self-Correction: Agent detects and recovers from errors via reflexion
  4. Multi-Turn Goal Tracking: Agent remembers and continues user goals across state boundaries
  5. Context-Aware Interrupts: Natural conversation turn-taking with state persistence
  6. Declarative State Machine: Explicit state transitions between planning, execution, and correction

Phase 4 Implementation Details:

| Component | Purpose |
| --- | --- |
| tool_nodes.py | LangGraph tool node wrappers for all 17+ action modules |
| planning_node | Goal decomposition into tool calls |
| executor_node | Sequential tool execution with error handling |
| self_correction_node | Error analysis and plan regeneration |
| memory_update_node | ChromaDB persistence with MemorySaver |
| ToolNodesArchitecture | Full compiled StateGraph with checkpointing |

Phase 5: State Persistence ✅ Complete

Overview: Phase 5 implements comprehensive state persistence using LangGraph's MemorySaver for checkpoint storage and ChromaDB for long-term memory management. This enables agents to resume execution from any point and retain conversational context across sessions.

Key Capabilities

| Feature | Description |
| --- | --- |
| Checkpoint Persistence | Save/restore agent state using MemorySaver with JSON file storage |
| Agent Memory Storage | ChromaDB-based long-term memory for agent experiences and learnings |
| Conversation History | Persistent chat history with automatic trimming and disk storage |
| Error Recovery | Graceful degradation when ChromaDB or LangGraph is unavailable |
| Thread Safety | Lock-protected concurrent access to shared state |
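
The last two rows (automatic trimming and lock-protected access) can be illustrated together. The class below is a hypothetical sketch, not the code in agent/state_persistence.py; the 50-turn cap is an arbitrary example value.

```python
import threading
from collections import deque


class ConversationHistory:
    """Bounded, thread-safe list of conversation turns."""

    def __init__(self, max_turns: int = 50) -> None:
        # A deque with maxlen drops the oldest turn automatically.
        self._turns = deque(maxlen=max_turns)
        self._lock = threading.Lock()

    def add_turn(self, user_message: str, agent_message: str) -> None:
        with self._lock:
            self._turns.append({"user": user_message, "agent": agent_message})

    def last(self, n_turns: int) -> list:
        """Return up to `n_turns` most recent turns, oldest first."""
        with self._lock:
            return list(self._turns)[-n_turns:] if n_turns > 0 else []
```

Copying the deque into a list under the lock means callers never iterate over state that another thread is mutating.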

Architecture

StatePersistenceManager
├── Checkpoint Management (MemorySaver + JSON files)
│   ├── save_checkpoint()
│   ├── load_checkpoint()
│   └── list_checkpoints()
├── Agent Memory (ChromaDB)
│   ├── save_agent_memory()
│   ├── recall_agent_memories()
│   └── get_memories_by_agent()
├── Conversation History
│   ├── add_conversation_turn()
│   ├── get_conversation_history()
│   └── save_conversation_history()
└── LangGraphCheckpointer
    └── get()/put() for StateGraph config

Files

| File | Purpose |
| --- | --- |
| agent/state_persistence.py | Core persistence manager (800+ lines) |
| tests/test_state_persistence.py | 13 test cases covering all components |

API Reference

StatePersistenceManager

from typing import Dict, List, Optional

class StatePersistenceManager:
    # Signature summary only; method bodies are omitted.
    # Checkpoints
    def save_checkpoint(self, checkpoint_id, state, metadata) -> str: ...
    def load_checkpoint(self, checkpoint_id) -> Optional[Dict]: ...
    def list_checkpoints(self) -> List[str]: ...
    # Agent memory (ChromaDB)
    def save_agent_memory(self, agent_id, memory_type, content, metadata) -> str: ...
    def recall_agent_memories(self, agent_id, memory_type, query, n_results) -> List[Dict]: ...
    # Conversation history
    def add_conversation_turn(self, user_id, user_message, agent_message, context): ...
    def get_conversation_history(self, user_id, n_turns) -> List[Dict]: ...
    def save_conversation_history(self, user_id) -> bool: ...
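
For orientation, the checkpoint half of this interface could be backed by plain JSON files along the following lines. `JsonCheckpointStore` is a hypothetical class written for this example; the payload layout and the write-then-rename step are assumptions, not a description of StatePersistenceManager's internals.

```python
import json
import time
from pathlib import Path
from typing import Any, Dict, List, Optional


class JsonCheckpointStore:
    """Stores each checkpoint as <directory>/<checkpoint_id>.json."""

    def __init__(self, directory: Path) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def save_checkpoint(
        self,
        checkpoint_id: str,
        state: Dict[str, Any],
        metadata: Optional[Dict[str, Any]] = None,
    ) -> str:
        payload = {"state": state, "metadata": metadata or {}, "saved_at": time.time()}
        path = self.directory / f"{checkpoint_id}.json"
        tmp = path.with_suffix(".json.tmp")
        tmp.write_text(json.dumps(payload), encoding="utf-8")
        tmp.replace(path)  # rename so readers never see a half-written file
        return str(path)

    def load_checkpoint(self, checkpoint_id: str) -> Optional[Dict[str, Any]]:
        path = self.directory / f"{checkpoint_id}.json"
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))["state"]

    def list_checkpoints(self) -> List[str]:
        return sorted(p.stem for p in self.directory.glob("*.json"))
```

The state must be JSON-serializable for this to work, which is one reason to keep agent state as plain dictionaries and lists.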

EnhancedToolNodesArchitecture

Extends ToolNodesArchitecture with:

  • enable_persistence=True for checkpointing
  • resume_execution(checkpoint_id) to restart from saved state
  • Automatic checkpoint creation during execution

Storage Locations

jarvis_memory/
├── state_persistence/
│   ├── checkpoints/          # JSON checkpoint files
│   │   ├── checkpoint_*.json
│   │   └── full_workflow_test.json
│   └── agent_memories/       # ChromaDB data
│       └── chroma/
└── chroma/                   # Vector embeddings

Testing

All 13 tests pass successfully:

python tests/test_state_persistence.py
# Result: 13 passed in ~13 seconds

Test Coverage:

  • Directory management and initialization
  • Checkpoint save/load operations
  • Agent memory save/recall
  • Conversation history management
  • LangGraph checkpointer integration
  • EnhancedToolNodesArchitecture creation and execution

Phase 5 Implementation Details:

| Component | Purpose |
| --- | --- |
| state_persistence.py | StatePersistenceManager with MemorySaver + ChromaDB |
| LangGraphCheckpointer | Wrapper for MemorySaver checkpoint operations |
| EnhancedToolNodesArchitecture | ToolNodes with persistence capabilities |
| StatePersistenceFactory | Convenience factory for quick setup |

📁 Project Structure

Project-J.A.R.V.I.S/
├── actions/                    # 17+ action modules
│   ├── web_search.py          # Serper API + Playwright
│   ├── browser_control.py     # Playwright automation
│   ├── screen_processor.py    # Camera + screenshot
│   ├── computer_control.py    # Mouse/keyboard
│   ├── file_controller.py     # File operations
│   ├── code_helper.py         # Code generation (v2.0)
│   ├── game_updater.py        # Steam/Epic management
│   ├── flight_finder.py       # Flight search
│   └── [8 more modules]...
├── agent/
│   ├── planner.py             # Goal decomposition
│   ├── executor.py            # Plan execution
│   ├── error_handler.py       # Recovery analysis
│   ├── task_queue.py          # Priority scheduling
│   ├── self_learning_engine.py# Reflexion cycle (v2.0)
│   ├── sle_integration_harness.py# Validation (v2.0)
│   ├── crew_orchestration_engine.py# CrewAI integration
│   ├── tool_nodes.py          # LangGraph nodes (v2.1 - Phase 4)
│   └── state_persistence.py   # State persistence (v2.1 - Phase 5)
├── config/
│   ├── api_keys.json          # API credentials
│   ├── config.json            # System settings
│   └── loader.py              # Unified config loader (v2.0)
├── core/
│   └── prompt.txt             # System instructions
├── memory/
│   ├── memory_manager.py      # Short-term + long-term (v2.0)
│   ├── rag_processor.py       # RAG pipeline (v2.0)
│   └── config_manager.py      # Config helpers
├── jarvis_memory/             # Persistent data
│   ├── short_term.db          # SQLite history
│   └── chroma/                # Vector store
├── ui.py                      # PyQt6 interface
├── main.py                    # Core engine (v2.0)
├── or_client.py               # OpenRouter client
├── requirements.txt           # Dependencies
└── README.md                  # This file

🔧 Development

Running Tests

# Unit tests
pytest tests/unit/ -v

# Integration tests
pytest tests/integration/ -v

# Self-learning engine test
python -m agent.sle_integration_harness

# Linting
ruff check .

# Type checking
mypy --config-file=pyproject.toml .

Adding New Action Modules

  1. Create actions/new_module.py
  2. Implement new_module_action(parameters: dict, player=None) -> str
  3. Register in main.py TOOL_DECLARATIONS
  4. Add import in main.py action imports
  5. Add import in agent/executor.py _call_tool
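
A skeleton for steps 1 and 2 might look like the sketch below. Only the `new_module_action(parameters: dict, player=None) -> str` signature comes from the steps above; the `text` parameter, `TOOL_REGISTRY`, and `call_tool` are hypothetical, and the real TOOL_DECLARATIONS format in main.py is not shown in this README.

```python
from typing import Callable, Dict


def new_module_action(parameters: dict, player=None) -> str:
    """Example action: validate input and return a human-readable result."""
    text = str(parameters.get("text", "")).strip()
    if not text:
        return "Error: 'text' parameter is required."
    return f"new_module received: {text}"


# Hypothetical dispatch table standing in for the registration steps.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "new_module": new_module_action,
}


def call_tool(name: str, parameters: dict, player=None) -> str:
    """Look up an action by name and shield the caller from exceptions."""
    action = TOOL_REGISTRY.get(name)
    if action is None:
        return f"Error: unknown tool '{name}'."
    try:
        return action(parameters, player=player)
    except Exception as exc:
        return f"Error: {name} failed: {exc}"
```

Returning error strings rather than raising keeps a single failing tool from interrupting the rest of a plan.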

Debugging

# Enable debug mode
export JARVIS_DEBUG=true

# View logs
tail -f logs/jarvis.log

# Debug with pdb
python -m pdb main.py

🐛 Troubleshooting

Common Issues

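| Error | Solution |
| --- | --- |
| GEMINI_API_KEY not found | Set gemini_api_key in config/api_keys.json or GEMINI_API_KEY in .env |
| Playwright not installed | Run pip install playwright, then playwright install chromium (on Linux, also playwright install-deps) |
| ChromaDB initialization failed | Clear jarvis_memory/chroma/ |
| Audio queue full | Increase CHUNK_SIZE in config |
| Tool timeout | Increase tool_timeout in config |
| Embedding model not found | Use gemini-embedding-001; text-embedding-004 is deprecated (changed in v2.0) |
| or_client not available | No action needed; v2.0 falls back to heuristic memory extraction |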

Logging

| Level | Environment Variable | Output |
| --- | --- | --- |
| DEBUG | JARVIS_DEBUG=true | Full logging |
| INFO | (default) | Standard output |
| ERROR | JARVIS_ERROR_LOG=true | Error file |
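
One way the two environment variables could map onto Python's logging module is sketched below. The logger name, the accepted truthy values, and the logs/errors.log path are assumptions made for the example; only the variable names come from the table.

```python
import logging
import os


def _flag(name: str, env) -> bool:
    """Treat '1', 'true', and 'yes' (any case) as enabled."""
    return env.get(name, "").strip().lower() in {"1", "true", "yes"}


def configure_logging(env=None, error_log_path: str = "logs/errors.log") -> logging.Logger:
    """Build a logger whose level and handlers follow the environment."""
    env = os.environ if env is None else env
    logger = logging.getLogger("jarvis_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG if _flag("JARVIS_DEBUG", env) else logging.INFO)
    logger.addHandler(logging.StreamHandler())  # console output
    if _flag("JARVIS_ERROR_LOG", env):
        os.makedirs(os.path.dirname(error_log_path) or ".", exist_ok=True)
        handler = logging.FileHandler(error_log_path)
        handler.setLevel(logging.ERROR)  # only errors reach the file
        logger.addHandler(handler)
    return logger
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real environment.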

🤝 Contributing

We welcome contributions! Please read our Contributing Guidelines first.

Development Workflow

  1. Create a feature branch: git checkout -b feature/amazing-feature
  2. Commit your changes: git commit -m 'feat: add amazing feature'
  3. Push to the branch: git push origin feature/amazing-feature
  4. Open a Pull Request

Pull Request Process

  1. Update documentation to match changes
  2. Add tests for new functionality
  3. Ensure all checks pass
  4. Request review from maintainers

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


🙏 Acknowledgments

Original Creator

  • FatihMakes (GitHub): original JARVIS project

v2.0 Developer

  • Charuka Mayura Bandara - Developer of JARVIS v2.0

Technologies

  • Google Gen AI team for the Gemini LLM
  • ChromaDB team for the vector database
  • Playwright team for browser automation
  • PyQt6 team for the UI framework
  • The open-source community for inspiration and support

📌 Status

Production Ready - JARVIS Mark XXXIX v2.0 is ready for production use.

J.A.R.V.I.S. — Just A Rather Very Intelligent System
Building the future of personal AI assistance, one command at a time

About

The new model of JARVIS: a modular, voice-activated personal AI assistant inspired by Iron Man's J.A.R.V.I.S., built with the Gemini Live API, OpenRouter, a PyQt6 GUI, and context-aware long-term memory.
