Hi, I'm Cao Hanzhe (Kral), a CS student, AI researcher, and open-source enthusiast.
- Building Multi-Agent Systems that reason, debate, and collaborate to solve complex real-world problems
- Advancing Reinforcement Learning, from RLHF reward modeling to agentic RL with multi-turn reasoning
- Bridging the gap between simulation and real-world robotics: sim-to-real transfer and embodied AI
- Pushing the frontier of LLM Reasoning: test-time compute scaling, search-augmented generation, and tool-use agents
- Creator of MingJian (明鉴), an evidence-driven multi-agent simulation platform for strategic decision-making
- Reach me at: [email protected]
- Designing agent systems where multiple LLMs collaborate through debate protocols, evidence chains, and structured reasoning, not just simple tool-calling.
- Implementing and fixing core RL algorithms, from PettingZoo parallel environments to LinUCB contextual bandits. Contributing fixes upstream to pytorch/rl and Pearl.
- Working with robosuite and NVIDIA IsaacLab to build robust simulation pipelines that transfer to real robots. Fixing core physics engine bugs and resource management.
- Building automated safety checks: prompt injection detection, red-teaming frameworks, and LLM evaluation harnesses. Contributing to the Giskard AI safety platform.
MingJian: an AI-powered multi-agent platform for evidence-driven scenario simulation and strategic decision-making
- 19 stars · Python · FastAPI
- Supports corporate and military strategic domains
- Multi-agent debate protocol with evidence chains
- Real-time scenario simulation engine
- github.com/dashitongzhi/MingJian
Active contributor to 30+ AI and agent projects across GitHub: fixing core bugs, adding safety features, and improving developer experience
| Category | Projects | Highlights |
|---|---|---|
| Agent Frameworks | rllm, notte, Composio/agent-orchestrator, stakpak/agent | Core bug fixes, session management, async improvements |
| Reinforcement Learning | pytorch/rl, facebookresearch/Pearl, alibaba/ROLL | Fixing PettingZoo parallel env bugs, LinUCB tensor squeeze, agentic LR scheduler |
| Robotics | robosuite, IsaacLab, SmolVM | Resource leak fixes, docstring corrections, sim-to-real improvements |
| AI Infrastructure | Kokoro-FastAPI, any-llm, Art, burr | Error message fixes, kwargs passthrough, install automation |
| AI Safety | Giskard-AI, Agent-R1 | LLM-based prompt injection detection, red-teaming checks |
| Dev Tools | visidata, cc-switch, hermecore, go-micro | Shell command fixes, metadata parsing, CI improvements |
- Starstruck: repository earned 16+ stars
- Pull Shark: merged 30+ pull requests across major open-source projects
- 330+ contributions in the last year
- Contributed to projects from Meta, PyTorch, Alibaba, NVIDIA, Apache, and more
- Built and maintained MingJian, a production-grade multi-agent platform
- MingJian v2: enhanced multi-agent debate protocol with evidence chain validation
- Agentic RL: multi-turn reinforcement learning for LLM agents
- IsaacLab Contributions: improving sim-to-real transfer pipelines
- Prompt Injection Detection: building automated LLM safety evaluation tools
"The question of whether machines can think is about as relevant as the question of whether submarines can swim." - Edsger W. Dijkstra
I'm always open to collaboration on multi-agent systems, RL research, and robotics projects.
If you're building something in the AI agent space, let's talk!

