    Repositories list

    • Python
      47226222 Updated Feb 18, 2026
    • STAT

      Public
      Skill-Targeted Adaptive Training
      Python
      21510 Updated Jan 27, 2026
    • [ICLR 2026] Why is Your Language Model a Poor Implicit Reward Model?
      Python
      0300 Updated Jan 26, 2026
    • QRHead

      Public
      QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
      Python
      13430 Updated Jan 20, 2026
    • Tools for analyzing cluster usage, etc.
      Python
      0000 Updated Dec 31, 2025
    • Python
      23530 Updated Dec 25, 2025
    • Codebase for the paper "How Does RL Post-training Induce Skill Composition? A Case Study Using Countdown"
      Python
      0400 Updated Dec 2, 2025
    • RLMT

      Public
      [R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
      Python
      612400 Updated Oct 27, 2025
    • Python
      0300 Updated Oct 23, 2025
    • LongProc

      Public
      LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
      HTML
      13301 Updated Oct 11, 2025
    • [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
      Python
      34200 Updated Sep 18, 2025
    • Examples for distributed model training on the cluster.
      Python
      0300 Updated Sep 17, 2025
    • PruLong

      Public
      Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"
      Python
      44810 Updated Jul 29, 2025
    • AdaptMI

      Public
      [COLM 2025] Adaptive Skill-based In-context Math Instruction for Small Language Models
      Python
      4800 Updated Jul 10, 2025
    • MeCo

      Public
      Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"
      Python
      24950 Updated Jun 30, 2025
    • MixiT

      Public
      Disentangling the transformer.
      Python
      0700 Updated Jun 9, 2025
    • VLM_S2H

      Public
      Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
      Python
      01500 Updated Jun 3, 2025
    • Code for Preprint "On the Power of Context-Enhanced Learning"
      Jupyter Notebook
      0300 Updated Mar 7, 2025
    • Python
      0800 Updated Feb 11, 2025
    • Python
      0350 Updated Feb 4, 2025