Senior AI/ML Engineer with extensive expertise in GPU acceleration (CUDA), specializing in computer vision, real-time AI systems, and scalable deep learning applications. I bridge state-of-the-art research and production, leading teams that build GPU-accelerated ML systems.
Languages: C++, CUDA, Rust, Python
ML/AI: PyTorch, OpenCV, CUDA, cuDNN, cuBLAS, CUTLASS
High-Performance: OpenMP, BLAS, Eigen, Ceres, Boost, TBB
Data Science: NumPy, pandas, Matplotlib, SciPy
MLOps: Union/Flyte, MLflow
⏲ Uncovering and Resolving Bottlenecks in SGLang - A deep dive into SGLang, a state-of-the-art LLM inference engine
📚 Cutting Through the Noise - AI Paper Curation with LLMs and RLHF
Passionate about transforming cutting-edge research into high-quality products that deliver real-world impact.