    Repositories list

    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      28k300 Updated Jan 28, 2025
    • Python
      Apache License 2.0
      14270 Updated Jan 24, 2025
    • Zamba2

      Public
      PyTorch implementation of models from the Zamba2 series.
      Python
      Apache License 2.0
      1617331 Updated Jan 23, 2025
    • bci_minRF

      Public
      Minimal implementation of scalable rectified flow transformers, based on SD3's approach
      Jupyter Notebook
      Apache License 2.0
      41000 Updated Jan 23, 2025
    • zcookbook

      Public
      Training hybrid models for dummies.
      Python
      Apache License 2.0
      21801 Updated Jan 16, 2025
    • Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
      Python
      511310 Updated Dec 3, 2024
    • FastChat

      Public
      An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
      Python
      Apache License 2.0
      4.6k000 Updated Nov 6, 2024
    • Ongoing research training transformer models at scale
      Python
      Other
      2.5k0104 Updated Aug 20, 2024
    • Ongoing research training transformer language models at scale, including: BERT & GPT-2
      Python
      Other
      2.5k002 Updated Aug 19, 2024
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Jul 8, 2024
    • Python
      Apache License 2.0
      1700 Updated Jul 1, 2024
    • mamba

      Public
      Python
      Apache License 2.0
      1.2k400 Updated Jun 27, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2k000 Updated Jun 20, 2024
    • Python
      Apache License 2.0
      13110 Updated Jun 19, 2024
    • High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
      C++
      MIT License
      418100 Updated Jun 11, 2024
    • Dataset for the temporal memory tests
      0300 Updated Jun 4, 2024
    • Robust recipes to align language models with human and AI preferences
      Python
      Apache License 2.0
      428000 Updated Jun 3, 2024
    • Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
      Python
      Apache License 2.0
      200000 Updated Mar 8, 2024
    • Code repository for Black Mamba
      Python
      1823450 Updated Feb 8, 2024
    • 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
      Rust
      Apache License 2.0
      833101 Updated Feb 3, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      28k200 Updated Jan 25, 2024
    • DeepSpeed

      Public
      DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
      Python
      Apache License 2.0
      4.2k000 Updated Nov 2, 2023
    • apex

      Public
      A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
      Python
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Nov 1, 2023