    Repositories list

    • C2C

      Public
      [ICLR'26] The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"
      Python
      3734800 Updated Feb 21, 2026
    • MARSHAL

      Public
      MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
      Python
      13800 Updated Feb 19, 2026
    • db-SP

      Public
      This repository contains the official implementation of db-SP, a sparsity-aware sequence parallelism strategy designed to accelerate sparse attention in visual …
      Python
      1200 Updated Feb 12, 2026
    • R2R

      Public
      [NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
      Python
      117810 Updated Feb 10, 2026
    • [DATE'23] The official code for paper <CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory>
      C++
      02301 Updated Jan 19, 2026
    • UniNDP

      Public
      Github repository of HPCA 2025 paper "UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures"
      Python
      121900 Updated Jan 18, 2026
    • MoA

      Public
      [CoLM'25] The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>
      Python
      815400 Updated Jan 14, 2026
    • [ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"
      Python
      16910 Updated Jan 13, 2026
    • TaH

      Public
      Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"
      Python
      106500 Updated Jan 13, 2026
    • AED

      Public
      An automatic, effective, and diverse vulnerability discovery framework for autonomous driving policies
      Python
      0000 Updated Nov 30, 2025
    • Python
      0920 Updated Oct 23, 2025
    • USF

      Public
      The official code of paper "A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models" (ICLR24)
      Jupyter Notebook
      0300 Updated Sep 27, 2025
    • NIPA

      Public
      Python
      0000 Updated Sep 26, 2025
    • JavaScript
      3000 Updated Aug 16, 2025
    • PM-KVQ

      Public
      The official code implementation for paper "PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs"
      Python
      31900 Updated May 24, 2025
    • VGDFR

      Public
      VGDFR: Diffusion-based Video Generation with Dynamic Frame Rate
      Python
      01710 Updated May 16, 2025
    • ViDiT-Q

      Public
      [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
      Python
      24149241 Updated Mar 21, 2025
    • MBQ

      Public
      The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"
      Python
      376100 Updated Mar 17, 2025
    • DLFR-VAE

      Public
      01110 Updated Feb 18, 2025
    • Jupyter Notebook
      10190100 Updated Jan 14, 2025
    • MNSIM-2.0

      Public
      A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems
      Python
      5819571 Updated Nov 27, 2024
    • MixDQ

      Public
      [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
      Python
      549130 Updated Nov 27, 2024
    • Rad-NeRF

      Public
      [NeurIPS24] Rad-NeRF: Ray-decoupled Training of Neural Radiance Field
      Python
      0700 Updated Nov 9, 2024
    • JavaScript
      0000 Updated Nov 8, 2024
    • SCSS
      0000 Updated Oct 19, 2024
    • The official CUDA kernel implementation for Mixture of Sparse Attention
      Cuda
      0610 Updated Oct 9, 2024
    • qllm-eval

      Public
      Code Repository of Evaluating Quantized Large Language Models
      Python
      1013550 Updated Sep 8, 2024
    • FlashEval

      Public
      Python
      11410 Updated Aug 9, 2024
    • Here are some implementations of some basic hardware units in RTL language (Verilog for now), which can be used for area/power evaluation and support the hardwar…
      9000 Updated May 11, 2023
    • Some docs for rookies in nics-efc
      72201 Updated Mar 17, 2022