LLM-Trace

Systems-Oriented Interpretability & Inspection for Language Models


LLM-Trace is a tooling and interpretability library designed to treat language models as observable systems. It exposes the internal computation paths of transformer-based models, focusing on attention-based circuits and token-level influence.

Unlike standard evaluation tools that analyze inputs and outputs, LLM-Trace intercepts intermediate activations to explain how a model reaches a conclusion, bridging the gap between academic mechanistic interpretability and engineering diagnostics.


📖 Table of Contents

  • Project Philosophy
  • Core Features
  • System Architecture
  • Scope & Non-Goals
  • Installation
  • Roadmap
  • Contributing
  • License

💡 Project Philosophy

LLM-Trace is built on the premise that modern language models require observability, not just evaluation. The project adheres to three core tenets:

  1. Measurement > Aesthetics: While visualization is supported, the primary goal is quantifiable metrics of influence (e.g., attention scores, activation magnitudes).
  2. Causality > Correlation: We aim to validate findings through ablation and counterfactual testing, ensuring that detected circuits actually drive model behavior.
  3. Systems Thinking: The model is treated as a computational graph where specific nodes (heads, layers) perform distinct sub-tasks.

🚀 Core Features

1. Internal Signal Extraction

  • Attention Hooking: Direct extraction of attention matrices ($A$), Queries ($Q$), and Keys ($K$) from intermediate transformer layers without altering model weights.
  • Activation Tracing: Capture residual stream states at critical junctures (a minimal hook sketch covering both signals follows below).
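The sketch below illustrates what this kind of extraction looks like in plain PyTorch with Hugging Face transformers: forward hooks capture post-block residual-stream states, and output_attentions=True returns per-layer attention matrices. It is a standalone, illustrative example assuming torch and transformers are installed; it is not LLM-Trace's own API.

# Illustrative sketch only (not LLM-Trace's API): capture attention matrices
# and residual-stream states from GPT-2 without modifying any weights.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2TokenizerFast.from_pretrained("gpt2")

residual = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Each GPT-2 block returns a tuple; its first element is the hidden state
        residual[name] = output[0].detach()
    return hook

handles = [
    block.register_forward_hook(make_hook(f"block_{i}"))
    for i, block in enumerate(model.transformer.h)
]

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

attentions = out.attentions  # one (batch, heads, seq, seq) tensor per layer
print(attentions[0].shape, residual["block_0"].shape)

for h in handles:
    h.remove()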

2. Circuit Detection & Analysis

  • Token Influence Mapping: Identify which source tokens (e.g., a specific instruction) contribute most heavily to a target token's generation (see the scoring sketch after this list).
  • Dependency Graphs: Visualize recurring patterns of information flow across layers.
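As a simplified proxy for token influence (illustrative only, and not necessarily the metric LLM-Trace computes), one can average a layer's attention over heads and read off how strongly the final position attends to each source token:

# Illustrative proxy only: rank source tokens by the attention the final
# position pays to them in a single layer, averaged over heads.
import torch

def token_influence(attn_layer, tokens):
    # attn_layer: (batch, heads, seq, seq); rows index query positions, columns index key positions
    last_row = attn_layer[0].mean(dim=0)[-1]       # average over heads, take the final query position
    scores = (last_row / last_row.sum()).tolist()  # normalise into a distribution over source tokens
    return sorted(zip(tokens, scores), key=lambda p: p[1], reverse=True)

# Example usage with the `attentions` tuple from the hook sketch above:
# token_influence(attentions[5], tok.convert_ids_to_tokens(inputs["input_ids"][0]))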

3. Verification Utilities

  • Ablation Testing: Tools to mask or zero out specific attention heads to verify their contribution to the output (a minimal ablation example follows this list).
  • Prompt-Set Evaluation: Batch processing capabilities to test if a circuit holds true across varied inputs.
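The snippet below is a minimal ablation sketch using Hugging Face's head_mask argument rather than LLM-Trace's own utilities: it zeroes one attention head and measures how far the next-token distribution moves.

# Illustrative sketch only: zero out a single GPT-2 attention head via the
# `head_mask` argument and compare next-token distributions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2TokenizerFast.from_pretrained("gpt2")
inputs = tok("The Eiffel Tower is located in", return_tensors="pt")

head_mask = torch.ones(model.config.n_layer, model.config.n_head)
head_mask[5, 3] = 0.0  # ablate layer 5, head 3 (arbitrary indices for illustration)

with torch.no_grad():
    base = model(**inputs).logits[0, -1].log_softmax(-1)
    ablated = model(**inputs, head_mask=head_mask).logits[0, -1].log_softmax(-1)

# KL divergence between baseline and ablated next-token distributions
kl = torch.sum(base.exp() * (base - ablated))
print(f"KL(base || ablated) = {kl.item():.4f}")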

4. Inspection Console (Frontend)

  • A dedicated React-based dashboard for navigating token-to-token relationships, heatmaps, and layer-wise comparisons.

πŸ— System Architecture

LLM-Trace operates via a split architecture designed for performance and rigor:

  • The Backend (Python/PyTorch): Handles model loading, hook injection, tensor manipulation, and metric calculation. It produces structured JSON/Tensor data representing the model's internal state.
  • The Frontend (React/Node.js): Consumes structured data to render high-fidelity inspection interfaces. It acts as the "DevTools" for the model.

Conceptual Model

  1. Input: A prompt is fed into a Transformer model.
  2. Interception: Custom hooks capture activations during the forward pass.
  3. Analysis: Raw tensors are processed into influence metrics; interventions (e.g., head ablations) can be applied at this stage.
  4. Output: Structured trace data is returned for visualization or programmatic analysis (a hypothetical example follows below).
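For orientation only, the structured output of one traced forward pass might be organised along these lines. This is a hypothetical sketch with placeholder values, not the schema the backend actually emits.

# Hypothetical trace record with placeholder values; not LLM-Trace's actual schema.
trace = {
    "model": "gpt2",
    "prompt": "The capital of France is",
    "layers": [
        {
            "layer": 0,
            "attention_shape": [1, 12, 6, 6],  # (batch, heads, seq, seq)
            "residual_norm": 0.0,              # placeholder activation metric
        },
        # ...one entry per layer
    ],
    "influence": [
        # placeholder: per-target-token influence over source tokens
        {"target": "Paris", "sources": {"France": 0.0, "capital": 0.0}},
    ],
}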

🎯 Scope & Non-Goals

In Scope

  • Internal behavior inspection during inference.
  • Mechanistic interpretability workflows.
  • Safety audits via internal state monitoring.
  • Small-to-medium-sized transformer models (e.g., GPT-2, TinyLlama).

Out of Scope

  • Training/Fine-tuning: This is an inference-only tool.
  • Black-box Evaluation: We do not provide benchmarks like MMLU.
  • Prompt Engineering: We analyze prompts, we do not optimize them.

βš™οΈ Installation

Prerequisites

  • Python 3.10+
  • Node.js 18+ (for Frontend dashboard)
  • PyTorch (CUDA/MPS recommended)

Setup

# Clone the repository
git clone https://github.com/ashwin-r11/LLMTrace.git
cd LLMTrace

# Install Backend Dependencies
pip install -r requirements.txt
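Since the prerequisites recommend CUDA or MPS, the following small Python check (not part of the install scripts themselves) confirms which accelerator PyTorch can see:

# Quick check of the available PyTorch backend (CUDA, Apple MPS, or CPU fallback).
import torch

if torch.cuda.is_available():
    print("CUDA available:", torch.cuda.get_device_name(0))
elif torch.backends.mps.is_available():
    print("Apple MPS backend available")
else:
    print("No accelerator found; running on CPU")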

🗺 Roadmap

Phase 1 (Current): Core tracing logic for attention mechanisms. Basic heatmap visualization.

Phase 2: Implementation of ablation/intervention tools to prove causality.

Phase 3: MLP layer analysis and support for larger architectures (e.g., Llama-2 7B).

Phase 4: Automated "Circuit Discovery" pipelines.


🤝 Contributing

We welcome contributions from researchers and engineers. However, because this is a scientific tool, strict standards apply regarding code correctness, testing, and commit hygiene. Please read the Contribution Guide before submitting a Pull Request.

Key Contribution Areas:

  • Core tracing logic & PyTorch hooks.
  • Influence metrics & statistical analysis.
  • Frontend data visualization (WebGL/Canvas).
  • Documentation & reproducible notebooks.

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

LLM-Trace is an open-source initiative to demystify machine intelligence.
