Torrowdawn, or Chenxia Tang here. You may find my resume (in chinese) here.
I was born in 2003, and I'm currently a Master student in USTC. My research focuses on post-training and inference optimization for large language models. I agree with Euripides: The language of truth is simple.
My favorite Top-2 English words are Meditation and Philosophy. I treat meditation as the best medicine for life, and I also deeply love wisdom.
I am highly proficient in
which you can verify from my repos. I also wrote verilog
, and planning to learn dart to develop some android apps.
Primary Research Area:
- Post-training & Inference Optimization for Large Language Models
- Token Sampling Methods and Generation Quality
- High-speed Inference and Performance Optimization
Previous Experience:
- Reinforcement Learning, Recommendation Systems (brief exploration)
My research on large language models is particularly in-depth. I have extensive experience with RLHF (GPT-2 scale), fine-tuning (LLaMA2, 3), and have measured execution times of almost all operators in LLaMA. I am quite familiar with the bottlenecks of LLaMA, especially for single-GPU execution on the A6000.
- Bachelor's Degree
University of Science and Technology of China (USTC)
Special Class for the Gifted Young (enrolled at age 16)
2019 -2023
- Master's Degree (Current)
University of Science and Technology of China (USTC)
School of Computer Science and Technology
2023 -
I served as a teaching assistant(TA)for Algebra,in 2023 fall.
Authors: Chenxia Tang, Jianchun Liu, Hongli Xu, Liusheng Huang
Conference: ACL 2025 (Long Papers) | Pages: 10758-10774 | Venue: Vienna, Austria
TL;DR: Proposed top-nσ, a novel sampling method that eliminates noise directly in logit space. Key findings: (1) pre-softmax logits show clear separation between informative tokens and noise, (2) proved mathematical equivalence of min-p and top-(1-p). Achieves temperature-invariant token selection while preserving diversity, outperforming existing methods especially at high temperatures.
Links: ACL Anthology | PDF
Authors: Chenxia Tang | Preprint: arXiv:2406.07028 | Submitted: June 2024
TL;DR: Proposed adaptive learning rate scheduling for DARTS on long-tailed datasets. Traditional re-sampling/re-weighting techniques cause performance degradation with DARTS. Our method specifically optimizes architecture parameters for imbalanced class distributions.
Repository: GitHub | Language: Python/C++
TL;DR: A comprehensive simulator for Genshin Impact's Genius Invokation TCG, designed specifically for AI training and reinforcement learning research.
Key Contributions:
- Game Engine Development: Built a complete TCG simulation engine with event-driven architecture supporting complex card interactions, elemental reactions, and turn-based gameplay mechanics
- AI Framework: Implemented MCTS (Monte Carlo Tree Search) algorithms and AlphaBeta search for intelligent gameplay, achieving competitive performance against human players
- Modular Design: Created extensible card system with string-based reflection, supporting custom card implementations and easy game state serialization
Repository: GitHub | Language: Python
TL;DR: Self-study project focused on advancing vision-language model capabilities with modern frameworks.
Repository: GitHub | Language: Python
TL;DR: A light-weight FlashInfer wrapper that simplifies memory management and enables easy implementation of custom sparse attention operations.
Key Contributions:
- Memory Management: Developed a streamlined wrapper around FlashInfer that significantly simplifies memory allocation and deallocation for attention operations
- Custom Sparse Attention: Created an intuitive interface for implementing custom sparse attention patterns and operations
- Performance Optimization: Leveraged FlashInfer's optimized kernels while providing a more accessible API for research and development
- Research Framework: Built a flexible foundation for experimenting with various attention mechanisms and sparse patterns
My research on Top-nσ sampling has been adopted by several open-source projects:
- llama.cpp: Pull Request #11223
- SillyTavern: Pull Request #3094