I ship production AI / ML systems at Fortune-100 scale and contribute upstream to the SDKs that power them.
Currently at Southwest Airlines · formerly Software Development Engineer at AWS.
Now: building agent guardrails, MCP linters, and eval harnesses; contributing to AI SDKs upstream.

Production AI / ML

| Metric | Result | Context |
|---|---|---|
| COST | −78% | SageMaker → Bedrock, $1,740 → $371 / mo |
| LATENCY | 600× retrieval improvement | aircraft fault prediction |
| RAG SCALE | 30K+ knowledge entries | 9-stage agentic pipeline |
| QUALITY | 370+ tests & evaluations | across production systems |

Open Source Footprint

| Metric | Count | Scope |
|---|---|---|
| ORIGINALS | 164 | public repos I authored |
| PACKAGES | 50 | npm + PyPI + MCP across 3 registries |
| UPSTREAM | 98 | merged PRs in external public repos |
| REPOS REACHED | 409 | external projects I've contributed to |

npm · CLI · Homebrew · MCP Registry · TypeScript · 64 tests · 0 deps
Streaming JSON parser that yields partial valid trees as tokens arrive. Render LLM tool calls mid-stream, recover dropped responses, parse messy

🛡️ mcpcheck
npm · GitHub Action · SARIF · TypeScript · CLI · CI-ready
Lint MCP config files for Claude Desktop, Cursor, Cline, Windsurf, and Zed. Catches malformed transports, missing auth, duplicate servers, and placeholder values. Drops into any CI as a 3-line composite Action with SARIF output for code-scanning.

npm + PyPI · zero deps · CLI + programmatic API
Eval harness for comparing model, prompt, and agent behavior. Exact, regex, token-F1, JSON, and citation-coverage checks. CLI for one-shot runs, programmatic API for CI integration. Sister Python release on PyPI.

npm · zero deps · 33 tests · TypeScript types
Snapshot tests for AI agents: record an agent run's tool-call trace, diff against a baseline, and fail CI on regressions. Drops into any test runner. First in a five-package agent reliability stack (snap · guard · vet · cast · fit).

+ 46 more packages across npm, PyPI, the MCP Registry, GitHub Marketplace Actions, and Homebrew
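
Token-F1, one of the checks named above, scores the word overlap between a model answer and a reference. The sketch below is a generic illustration of that metric, not the API of any package listed here; `tokenize` and `tokenF1` are hypothetical names, and the whitespace tokenizer is an assumption chosen for brevity.

```typescript
// Illustrative only: hypothetical helper names, not a published package API.

/** Lowercase and split on whitespace (a deliberately simple tokenizer). */
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

/** Token-level F1 between a prediction and a reference answer. */
function tokenF1(prediction: string, reference: string): number {
  const pred = tokenize(prediction);
  const ref = tokenize(reference);
  if (pred.length === 0 || ref.length === 0) {
    // Both empty counts as a match; exactly one empty counts as a miss.
    return pred.length === ref.length ? 1 : 0;
  }

  // Count reference tokens so a repeated token matches at most as often as it occurs.
  const refCounts = new Map<string, number>();
  for (const t of ref) refCounts.set(t, (refCounts.get(t) ?? 0) + 1);

  let overlap = 0;
  for (const t of pred) {
    const remaining = refCounts.get(t) ?? 0;
    if (remaining > 0) {
      overlap += 1;
      refCounts.set(t, remaining - 1);
    }
  }
  if (overlap === 0) return 0;

  const precision = overlap / pred.length;
  const recall = overlap / ref.length;
  return (2 * precision * recall) / (precision + recall);
}

// Precision 3/3, recall 3/4, so F1 = 6/7.
console.log(tokenF1("the cat sat", "the cat sat down").toFixed(3)); // prints "0.857"
```

Unlike an exact-match check, this gives partial credit when an answer is mostly right, which is why the two are often reported side by side.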
Distribution pattern. Each flagship ships across the surfaces it makes sense to live on:
library (npm) → Python port (PyPI) → CLI binary → GitHub Action (Marketplace) → Homebrew formula (brew tap) → MCP server (npm)
So the same problem is solvable from a TypeScript app, a Python script, a CI workflow, a terminal, or directly inside Claude / Cursor / Cline / Windsurf / Zed.
Categories (npm + PyPI):
| Area | Examples |
|---|---|
| Structured outputs & parsing | streamparse, streamparse-mcp, llm-response-schema-lite |
| Agent reliability | agentsnap, agentguard, agentvet, agentcast, agentfit |
| Agent infrastructure | agent-loop-breaker, agent-regression-lens, tool-call-contracts, tool-permission-gate |
| RAG & retrieval | rag-quality-kit, retrieval-acl-filter, vector-poison-score, embedding-dedupe |
| Prompt & output safety | pii-sentry, prompt-injection-shield, llm-output-sanitizer, system-prompt-leak-scan |
| Cost, routing, caching | llm-cost-guard, model-fallback-planner, model-router-policy, semantic-cache-key |
| MCP & Claude Code linting | mcpcheck, mcp-config-check, claude-skill-check, claude-hooks-check, claude-commands-check |
| Evals & tracing | ai-eval-forge, eval-dataset-smith, eval-flake-detector, llm-trace-sampler, agent-run-diff |
Browse the full set at npmjs.com/~mukundakatta and pypi.org/user/mukundakatta.
GitHub Marketplace Actions: mcpcheck, claude-skill-check, mcp-config-check, claude-hooks-check, claude-commands-check.
Homebrew tap: run brew tap mukundakatta/tools, then install any of the linters above.
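
As a rough picture of what an MCP config linter checks, here is a minimal sketch of two rules: a server with no transport, and an environment value that still looks like a placeholder. It assumes the `mcpServers` layout used by Claude Desktop-style config files. The rule names, the `PLACEHOLDER` pattern, and `lintMcpConfig` are hypothetical and do not describe how any of the linters above are implemented.

```typescript
// Hypothetical lint sketch; rule names and the placeholder pattern are assumptions.
interface ServerEntry {
  command?: string;
  url?: string;
  args?: string[];
  env?: Record<string, string>;
}

interface Finding {
  server: string;
  rule: string;
  message: string;
}

// Values such as "<token>", "YOUR_API_KEY", "changeme", or "xxxx".
const PLACEHOLDER = /^(<.*>|your[-_ ].*|changeme|x{3,})$/i;

function lintMcpConfig(configJson: string): Finding[] {
  const findings: Finding[] = [];
  const config = JSON.parse(configJson) as { mcpServers?: Record<string, ServerEntry> };

  for (const [name, entry] of Object.entries(config.mcpServers ?? {})) {
    // A server needs either a local command or a remote URL.
    if (!entry.command && !entry.url) {
      findings.push({ server: name, rule: "missing-transport", message: "needs a command or a url" });
    }
    // Flag env values that were never filled in.
    for (const [key, value] of Object.entries(entry.env ?? {})) {
      if (PLACEHOLDER.test(value.trim())) {
        findings.push({ server: name, rule: "placeholder-value", message: `env ${key} looks like a placeholder` });
      }
    }
  }
  return findings;
}

const sampleConfig = JSON.stringify({
  mcpServers: {
    broken: { args: ["--stdio"] },
    search: { command: "npx", args: ["-y", "example-server"], env: { API_KEY: "YOUR_API_KEY" } },
  },
});
console.log(lintMcpConfig(sampleConfig).map((f) => `${f.server}: ${f.rule}`));
```

Returning structured findings rather than printing them makes it straightforward to map the same results to other output formats, such as SARIF.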
Selected upstream contributions, recent first:
- openai/openai-node #1831 — improved fallback handling for non-standard JSON error bodies
- openai/tiktoken #529 — added PyInstaller hooks for dynamic encoding plugins
- googleapis/python-genai #2298 — clarified response_schema vs response_json_schema
- microsoft/playwright-mcp #1562 — clarified extension connection and tab-selection flow
- anthropics/anthropic-sdk-python #1412 — fixed async memory tool example docs
- stanford-crfm/helm #4210 — fixed later-page deep links for run instances
Where I tend to land: AI SDKs, MCP tooling, eval frameworks, agent infrastructure, structured outputs. Public log of selected work in oss-contributions.
Last refreshed 2026-04-26 from npm, PyPI, and the GitHub API.
Latest releases
- 2026-04-25 · @mukundakatta/streamparse-mcp v1.0.1 · npm
- 2026-04-25 · @mukundakatta/streamparse v1.0.1 · npm
- 2026-04-25 · @mukundakatta/agentsnap v0.1.0 · npm
Recently merged PRs
- 2026-04-24 · langgenius/dify #35547 · docs: fix Kubernetes deployment wording
- 2026-04-24 · infiniflow/ragflow #14352 · docs: fix API key guide typo
- 2026-04-25 · PrefectHQ/fastmcp #4001 · docs: add best practices for custom telemetry spans
- 2026-04-22 · pydantic/pydantic-ai #5156 · fix(vercel-ai): allow regenerate requests without messageId
- 2026-04-23 · ntop/ntopng #10297 · fix(locales/en): correct display string 'Enstablished' -> 'Established'

| Project | What it does | Stack |
|---|---|---|
| Karna (AI Agent Platform) | Self-hosted AI assistant across 7 messaging channels (Telegram, Slack, Discord, WhatsApp, SMS, iMessage, Web). Extensible plugin SDK, semantic memory, voice. TypeScript monorepo with Next.js dashboard and React Native mobile app. | TypeScript · Next.js · Supabase · WebSocket · pgvector |
| Chetana (AI Consciousness Research) | Research platform for studying machine consciousness via 14 indicators grounded in 6 scientific theories. Turns abstract questions into structured experiments, scoring, and analysis. | Python · Evaluation · Experimentation |
| AgentRAG (Modular RAG) | Provider-agnostic RAG framework with pluggable vector stores, chunking strategies, and retrieval methods. Designed for agentic workflows with clean API boundaries. | TypeScript · Vector Search · Embeddings |
| Astra Agent (Runtime) | Standalone AI agent runtime with tool execution, context management, and multi-model routing. Foundation for autonomous AI assistants with structured tool use. | TypeScript · LLM Orchestration · Tool Use |

+ more projects (Sadhak, Prithvi, AgentMem, Patchly, Evalharness, TokenWise, …)
| Project | What it does |
|---|---|
| Sadhak | AI-powered job search command center — automated evaluation, resume tailoring, application tracking |
| Prithvi | Container security scanner — vulnerability detection, compliance checks, Docker audits |
| AgentMem | Pluggable memory management for AI agents |
| Patchly | AI code review bot — flags bugs, suggests fixes, explains why |
| Evalharness | Prompt, agent, and RAG test harness — red teaming, regression testing |
| TokenWise | Token usage optimization across providers |
| LLM Bench CLI | Benchmark local LLMs — speed, throughput, quality |
| Amogha Cafe | Full-stack Firebase restaurant platform · live |
| Role | Company | Era |
|---|---|---|
| AI/ML Engineer | Southwest Airlines | Aug 2025 — Present |
| AI/ML Engineer | GPS IT Solutions | Jun 2024 — Aug 2025 |
| Software Development Engineer | Amazon Web Services | Aug 2022 — May 2024 |
| Data Engineer | GPS IT Solutions | Jan 2022 — Aug 2022 |
| Software Engineer | American Express | Feb 2017 — Dec 2020 |
Highlights from each role
Southwest Airlines — AI/ML Engineer
- Architected ML fault-prediction system for aircraft maintenance: 5 prediction types, 10K+ records, sub-second retrieval.
- Led SageMaker → Bedrock migration: 78% cost reduction ($1,740 → $371/mo), 600× latency improvement.
- Designed 9-stage agentic RAG pipeline (LangGraph, Bedrock Nova Pro/Micro, FAISS + BM25) over 30K+ KB entries.
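
The pipeline above pairs a dense index (FAISS) with a lexical ranker (BM25). How the two result lists are merged is not stated here; reciprocal rank fusion (RRF) is one common option, sketched below purely as an illustration. The document IDs, the `reciprocalRankFusion` name, and the `k = 60` constant are assumptions, not details of the production system.

```typescript
// Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d)).
// Illustrative sketch; IDs and the k default are assumptions.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const denseHits = ["kb-12", "kb-7", "kb-3"]; // e.g. nearest neighbours from a vector index
const lexicalHits = ["kb-7", "kb-40", "kb-12"]; // e.g. top results from BM25
console.log(reciprocalRankFusion([denseHits, lexicalHits]));
// prints [ 'kb-7', 'kb-12', 'kb-40', 'kb-3' ]
```

RRF uses only ranks, so it needs no score normalization between the two retrievers; entries that both lists agree on rise to the top.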
GPS IT Solutions — AI/ML Engineer
- Built GPT-4 + RAG content generation platform with compliance validation; reduced production time by 40%.
- Designed AI model-risk governance framework with 23 automated evaluation tests (regulatory compliance).
- Architected FastAPI microservices with FAISS / Pinecone vector search on Kubernetes.
Amazon Web Services — SDE
- Shipped features for AWS Application Manager (Systems Manager) serving enterprise customers globally.
- Owned full-stack delivery: React/TypeScript frontend + Java/Python backend APIs.
- Designed CI/CD and IaC patterns enabling zero-downtime deployments at enterprise scale.
GPS IT Solutions — Data Engineer
- Led end-to-end migration of data pipelines from on-prem to AWS (Glue, PySpark).
American Express — Software Engineer
- Built Python backend services and REST APIs for enterprise platforms handling high-volume transactions.
What I reach for, in order of frequency:
TypeScript · Python · React/Next.js · Node.js · FastAPI ·
AWS (Bedrock, SageMaker, ECS, OpenSearch) · LangGraph · Postgres + pgvector · Docker / Kubernetes · Terraform

| Metric | Value | Window |
|---|---|---|
| STARS | 1,395 | cumulative |
| COMMITS | 7,400 | last 12 months |
| PRS | 993 | all-time |
| CONTRIBUTIONS | 8,703 | last 12 months |
| REPOS REACHED | 409 | external repos contributed to |

- M.S., Big Data Analytics & Information Technology · University of Central Missouri (2022)
- B.Tech, Mechanical Engineering · SRM University (2016)
Selected certifications
Anthropic — MCP Advanced Topics · Claude with Bedrock · Claude with Vertex AI · Building with Claude API · Intro to Agent Skills · Intro to Subagents
AWS — Generative AI Applications · Services for AI Solutions · AI Fundamentals & the Cloud · Amazon Q for Software Dev
Cloud / GCP — Advanced Terraform with GCP · Build & Deploy Agent with Reasoning Engine
Stanford — Machine Learning · Introduction to Statistics
Wharton (Coursera) — Business Analytics · Customer Analytics · People Analytics
Microsoft — Generative AI for Software Devs · GitHub Copilot Productivity
Senior AI/ML Engineer · GenAI Platform Engineer · Software Engineer
Las Vegas, NV · mukunda-ai.vercel.app




