Lynncc6/Awesome-Edge-LLMs

🔍 Awesome Edge LLMs

A comprehensive survey on Edge AI, covering hardware, software, frameworks, applications, performance optimization, and the deployment of LLMs on edge devices.

Open Source Edge Models

The listed models are base models that meet either of the following criteria:

  • Parameters ≤ 10B
  • Officially described as an edge model

| Model | Size | Org | Time | Download | Paper |
| --- | --- | --- | --- | --- | --- |
| SmolLM3 | 3B | Hugging Face | 2025.7.9 | 🤗 | 📖 |
| MiniCPM4 | 8B | OpenBMB | 2025.6.6 | 🤗 | arXiv |
| Qwen2.5-Omni | 7B | Qwen | 2025.3.26 | 🤗 | arXiv |
| MiniCPM-o 2.6 | 8B | OpenBMB | 2025.1.14 | 🤗 | - |
| Phi-4 | 14B | Microsoft | 2025.1.9, 2024.12.12 (release) | 🤗 | arXiv |
| VITA-1.5 | 7B | VITA | 2025.1.6 | - | arXiv |
| Megrez-3B-Omni | 3B | Infinigence | 2024.12.16 | 🤗 | - |
| OmniAudio | 2.6B | Nexa AI | 2024.12.12 | 🤗 | 📖 |
| InternVL 2.5 | 8B | OpenGVLab | 2024.12.5 | 🤗 | - |
| GLM-Edge | 1.5B, 2B, 4B, 5B | THUDM | 2024.11.29 | 🤗 | - |
| SmolVLM | 2B | Hugging Face | 2024.11.26 | 🤗 | 📖 |
| SmolLM2 | 135M, 360M, 1.7B | Hugging Face | 2024.11.1 | 🤗 | 📖 |
| Ministral | 3B, 8B | Mistral AI | 2024.10.16 | 🤗 | 📖 |
| Qwen2.5 | 0.5B, 1.5B, 3B, 7B | Qwen | 2024.9.19 | 🤗 | 📖 |
| Pixtral 12B | 12B | Mistral AI | 2024.9.17 | 🤗 | 📖 |
| Qwen2-VL | 2B, 7B | Qwen | 2024.8.30 | 🤗 | 📖 |
| Phi 3.5 | 3.8B, 4.1B | Microsoft | 2024.8.21 | 🤗 | - |
| MiniCPM-V 2.6 | 8B | OpenBMB | 2024.8.6 | 🤗 | - |
| SmolLM | 135M, 360M, 1.7B | Hugging Face | 2024.8.2 | 🤗 | 📖 |
| Gemma2 | 2B, 9B | Google | 2024.7.31 | 🤗 | 📖 |
| DCLM 7B | 7B | Apple | 2024.7.18 | 🤗 | arXiv |
| Phi-3 | 3.8B, 7B | Microsoft | 2024.4.23 | 🤗 | arXiv |
| Mistral NeMo | 12B | Mistral AI | 2024.6.18 | 🤗 | 📖 |
| Gemma | 2B, 7B | Google | 2024.2.21 | 🤗 | 📖 |
| Mistral 7B | 2B, 7B | Mistral AI | 2023.9.27 | 🤗 | 📖 |
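
The inclusion rule stated above the table can be written as a small predicate. This is only a sketch under assumptions: the `ModelEntry` class, its field names, and the reading that a single variant at or under the limit is enough are all illustrative and are not part of this repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelEntry:
    """One candidate row for the model table (field names are illustrative)."""
    name: str
    sizes_b: List[float]        # parameter counts in billions, one per variant
    claimed_edge: bool = False  # True if the vendor officially calls it an edge model


def qualifies(entry: ModelEntry, limit_b: float = 10.0) -> bool:
    """Return True if the entry meets either listing criterion."""
    # Assumption: one variant at or under the limit is enough to qualify.
    has_small_variant = any(size <= limit_b for size in entry.sizes_b)
    return has_small_variant or entry.claimed_edge


# Hypothetical entries, not rows taken from the table.
examples = [
    ModelEntry("tiny-demo", [0.5, 1.5]),
    ModelEntry("big-demo", [70.0]),
    ModelEntry("big-edge-demo", [14.0], claimed_edge=True),
]
listed = [e.name for e in examples if qualifies(e)]
print(listed)
```

The second criterion is what would let a model larger than 10B appear in the list, provided its publisher positions it as an edge model.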

Embodied Model

LLM Inference

| Title | Date | Org | Paper |
| --- | --- | --- | --- |
| DashInfer-VLM | 2025.1 | ModelScope | 📖 |
| SparseInfer | 2024.11 | University of Seoul, etc | arXiv |
| Mooncake | 2024.6 | Moonshot AI | 📖 |
| flashinfer | 2024.2 | flashinfer-ai | 📖 |
| inferflow | 2024.2 | Tencent AI Lab | arXiv |
| PowerInfer | 2023.12 | SJTU | |
| PETALS | 2023.12 | HSE University, etc | arXiv |
| TensorRT-LLM | 2023.10 | NVIDIA | - |
| LightSeq | 2023.10 | UC Berkeley, etc | arXiv |
| vLLM | 2023.9 | UC Berkeley, etc | arXiv |
| StreamingLLM | 2023.9 | Meta AI, etc | arXiv |
| MLC-LLM | 2023.5 | mlc-ai | 📖 |
| Medusa | 2023.9 | Tianle Cai, etc | 📖 |
| LightLLM | 2023.8 | ModelTC | - |
| FastServe | 2023.5 | Peking University | arXiv |
| SpecInfer | 2023.5 | Peking University, etc | arXiv |
| Ollama | 2023.8 | Ollama Inc | - |
| LMDeploy | 2023.6 | InternLM | 📖 |
| Megatron-LM | 2020.5 | NVIDIA | arXiv |

Processor

NVIDIA

✅ 50 Series @2025

| GPU Specs | GeForce RTX 5090 | GeForce RTX 5080 | GeForce RTX 5070 Ti | GeForce RTX 5070 |
| --- | --- | --- | --- | --- |
| NVIDIA CUDA Cores | 21760 | 10752 | 8960 | 6144 |
| Shader Cores | Blackwell | Blackwell | Blackwell | Blackwell |
| Tensor Cores (AI) | 5th Generation, 3352 AI TOPS | 5th Generation, 1801 AI TOPS | 5th Generation, 1406 AI TOPS | 5th Generation, 988 AI TOPS |
| Ray Tracing Cores | 4th Generation, 318 TFLOPS | 4th Generation, 171 TFLOPS | 4th Generation, 133 TFLOPS | 4th Generation, 94 TFLOPS |
| Boost Clock (GHz) | 2.41 | 2.62 | 2.45 | 2.51 |
| Base Clock (GHz) | 2.01 | 2.30 | 2.30 | 2.16 |
| Standard Memory Config | 32 GB GDDR7 | 16 GB GDDR7 | 16 GB GDDR7 | 12 GB GDDR7 |
| Memory Interface Width | 512-bit | 256-bit | 256-bit | 192-bit |
| Price | $1999 | $999 | $749 | $549 |

✅ 40 Super Series @2024

| GPU Specs | GeForce RTX 4080 Super | GeForce RTX 4070 Ti Super | GeForce RTX 4070 Super |
| --- | --- | --- | --- |
| CUDA Cores | 10240 | 8448 | 7168 |
| Memory Configuration | 16 GB GDDR6X | 16 GB GDDR6X | 12 GB GDDR6X |
| Memory Interface Width | 256-bit | 256-bit | 256-bit |
| Memory Bandwidth | 736 GB/s | 736 GB/s | 736 GB/s |
| Base Clock (GHz) | 2.21 | 2.31 | 1.92 |
| Boost Clock (GHz) | 2.55 | 2.61 | 2.48 |
| Graphics Card Power (W) | 320 | 285 | 200 |
| Recommended PSU (W) | 750 | 700 | 650 |
| Price | $999 | $799 | $599 |

✅ 40 Series @2022

| GPU Specs | GeForce RTX 4090 | GeForce RTX 4080 | GeForce RTX 4070 Ti | GeForce RTX 4070 | GeForce RTX 4060 Ti | GeForce RTX 4060 |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA CUDA Cores | 16384 | 9728 | 7680 | 5888 | 4352 | 3072 |
| Shader Cores | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace |
| Tensor Cores (AI) | 4th Gen, 330 AI TFLOPS | 4th Gen, 200 AI TFLOPS | 4th Gen, 150 AI TFLOPS | 4th Gen, 100 AI TFLOPS | 4th Gen, 90 AI TFLOPS | 4th Gen, 60 AI TFLOPS |
| Ray Tracing Cores | 3rd Gen, 191 TFLOPS | 3rd Gen, 112 TFLOPS | 3rd Gen, 92 TFLOPS | 3rd Gen, 64 TFLOPS | 3rd Gen, 54 TFLOPS | 3rd Gen, 35 TFLOPS |
| Boost Clock (GHz) | 2.52 | 2.51 | 2.61 | 2.48 | 2.54 | 2.42 |
| Base Clock (GHz) | 2.23 | 2.21 | 2.31 | 1.92 | 2.31 | 1.83 |
| Standard Memory Config | 24 GB GDDR6X | 16 GB GDDR6X | 12 GB GDDR6X | 12 GB GDDR6X | 8 GB GDDR6 | 8 GB GDDR6 |
| Memory Interface Width | 384-bit | 256-bit | 192-bit | 192-bit | 128-bit | 128-bit |
| Graphics Card Power (W) | 450 | 320 | 285 | 200 | 160 | 115 |
| Recommended PSU (W) | 850 | 750 | 700 | 650 | 550 | 450 |
| Price | $1,599 | $1,199 | $799 | $599 | $399 (8GB), $499 (16GB) | $299 |

Hardware Applications

AI Glasses

Name Company Model Time Price
RayNeo V3 RayNeo Qwen 2025.1.7 ¥1799+
Sharge AI Glasses (闪极拍拍镜) Sharge Qwen Kimi GLM, etc. 2024.12.19 ¥999+
INMO GO2 INMO - 2024.11.29 ¥3999
Rokid Glasses Rokid Qwen 2024.11.18 ¥2499
Looktech Looktech ChatGPT Claude Gemini 2024.11.16 $199
Ray-Ban Meta Meta AI 2023.9 $299

Reference

Awesome-LLMs-on-device

Awesome-LLM-Inference

数字生命卡兹克: AI Hardware Compendium
