    Repositories list

    • ShinkaEvolve

      Public
      ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
      Python
      15081079 Updated Jan 25, 2026
    • Kamon

      Public
      Data and code for understanding and generation of Kamon.
      Python
      63100 Updated Jan 24, 2026
    • drq

      Public
      Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
      Red
      2017010 Updated Jan 13, 2026
    • DroPE

      Public
      Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding
      Python
      1719421 Updated Jan 12, 2026
    • ALE-Bench

      Public
      The official repository of ALE-Bench
      Python
      2015320 Updated Jan 5, 2026
    • IASC

      Public
      LLMs for Constructed Languages
      HTML
      44210 Updated Jan 2, 2026
    • continuous-thought-machines

      Public
      Continuous Thought Machines, because thought takes time and reasoning is a process.
      Python
      2691.7k12 Updated Dec 29, 2025
    • repo

      Public
      RePo: Language Models with Context Re-Positioning
      Python
      76410 Updated Dec 24, 2025
    • AI-Scientist-v2

      Public
      The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
      Python
      3832.1k309 Updated Dec 19, 2025
    • AI-Scientist

      Public
      The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
      Jupyter Notebook
      1.7k12k8920 Updated Dec 19, 2025
    • An AI benchmark for creative, human-like problem solving using Sudoku variants
      JavaScript
      1515710 Updated Dec 13, 2025
    • treequest

      Public
      A Tree Search Library with Flexible API for LLM Inference-Time Scaling
      Python
      6551311 Updated Dec 9, 2025
    • TinySwallow-ChatUI

      Public
      Browser-based chat UI for TinySwallow-1.5B that runs without API calls.
      CSS
      813000 Updated Dec 1, 2025
    • Python
      118041 Updated Nov 22, 2025
    • Neuroevolution Community
      2600 Updated Nov 17, 2025
    • MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
      Python
      205000 Updated Nov 12, 2025
    • petri-dish-nca

      Public
      Python
      65102 Updated Nov 6, 2025
    • Python
      0400 Updated Oct 31, 2025
    • asal

      Public
      Automating the Search for Artificial Life with Foundation Models!
      Jupyter Notebook
      5244910 Updated Oct 23, 2025
    • shachi

      Public
      Reimagining Agent-based Modeling with Large Language Model Agents via Shachi
      Python
      32700 Updated Oct 10, 2025
    • TAID

      Public
      Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
      Python
      812030 Updated Oct 6, 2025
    • The code repository of the paper: Competition and Attraction Improve Model Fusion
      Jupyter Notebook
      3316810 Updated Aug 25, 2025
    • BALROG

      Public
      Benchmarking Agentic LLM and VLM Reasoning On Games
      Python
      41100 Updated Aug 19, 2025
    • Evaluating the performance of LLMs on challenging Japanese financial tasks.
      Python
      32900 Updated Jul 28, 2025
    • Reasoning-based Evaluation and Ranking of Translations.
      Python
      41810 Updated Jul 18, 2025
    • Python
      1810610 Updated Jun 30, 2025
    • RLT

      Public
      Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.
      Python
      5435830 Updated Jun 23, 2025
    • edinet2dataset is a tool to construct financial datasets using EDINET.
      Python
      83000 Updated Jun 11, 2025
    • Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input
      Python
      6593920 Updated Jun 8, 2025
    • L2D

      Public
      Large language models to diffusion finetuning code
      Python
      32300 Updated Jun 2, 2025