Repositories list
186 repositories
chatterbox
Public template
Chatterbox is a TTS model by Resemble AI featuring emotion exaggeration control, zero-shot voice cloning, alignment-informed real-time synthesis, and built-in PerTh …
qwen3-30b-a3b-instruct-2507
Public template
30.5B MoE language model from the Qwen team, tuned for broad instruction following, reasoning, multilingual tasks, and agentic tool use. <metadata> gpu: A100 | colle…
Qwen3-30B-A3B-Thinking
Public
flux-1-krea-dev
Public template
12B model distilled from Krea 1, designed to deliver highly photorealistic results. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>
code-debugging-agent
Public
dia-1.6b
Public
qwen-image
Public
pyannote-speaker-diarization-3.1
Public template
A state-of-the-art model that segments and labels audio recordings by accurately distinguishing different speakers. <metadata> gpu: T4 | collections: ["HF Trans…
facebook-bart-cnn
Public template
A variant of the BART model designed specifically for natural language summarization. It was pre-trained on a large corpus of English text and later fine-tuned …
30.5B MoE model purpose-tuned for code generation and agentic tool use. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
gpt-oss-20b
Public template
A 21B open‑weight language model (with ~3.6 billion active parameters per token) developed by OpenAI for reasoning, tool integration, and low‑latency usage. <m…
voxtral-mini-3b
Public template
3B-parameter audio-language model with speech transcription, translation, and audio understanding capabilities. <metadata> gpu: A10 | collections:["HF_Transform…
kyutai-tts-1.6b
Public template
1.6B-parameter text-to-speech model that supports real-time streaming text input with ultra-low latency and voice conditioning capabilities. <metadata> gpu: A10 …
llama-3.1-8b-instruct-gguf
Public template
An 8B-parameter, instruction-tuned variant of Meta's Llama-3.1 model, optimized in GGUF format for efficient inference. <metadata> gpu: A100 | collections: ["l…
stable-diffusion-3-5-large-turbo
Public template
A fast, optimized diffusion model that generates high-quality images from text prompts, ideal for creative visual content. <metadata> gpu: A100 | collections: […
jina-embeddings-v4
Public template
A 3.8B multimodal, multilingual embedding model that unifies text and image understanding in a single late-interaction space, delivers both dense and multi-vector outp…
flux-1-kontext-dev
Public template
12B model from Black Forest Labs that allows in‑context image editing with character and style consistency, supporting iterative, instruction-guided edits. <met…
gemma-3n-e4b-it
Public template
8B variant of the lightweight Gemma 3n series that operates with a 4B‑parameter memory footprint, enabling full multimodal inference (text, image, audio, video)…
qwen3-embedding-0.6b
Public template
600M-parameter, 100-language embedding model that turns inputs of up to 32k tokens into instruction-aware vectors. <metadata> gpu: A10 | collections: ["HF_Transform…
devstral-small
Public template
An agentic LLM for software engineering tasks that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. <metada…
deepseek-r1-qwen3-8b
Public template
A distilled 8B-parameter reasoning model that leverages deep chain‑of‑thought from DeepSeek R1‑0528, delivering SOTA open‑source performance. <metadata> gp…
nanonets-ocr-s
Public template
Nanonets-OCR-s turns images or PDFs into structured Markdown, capturing tables, LaTeX, captions and tags, for fast, powerful, human-readable OCR. <metadata> …
Open-NotebookLM
Public
yolo11m-detect
Public
kokoro
Public template
Lightweight 82M-parameter text-to-speech (TTS) model that delivers high-quality voice synthesis. <metadata> gpu: T4 | collections: ["SSE Events"] </metadata>
qwen3-14b
Public template
14B model with a hybrid approach to problem-solving built on two distinct modes: "thinking mode," which enables step-by-step reasoning, and "non-thinking mode," design…
qwen2.5-omni-7b
Public template
An advanced end-to-end multimodal model that can process text, image, audio, and video inputs, generating real-time text and natural speech responses. <metadata> …
qwen3-8b
Public template
Qwen3-8B is a language model that supports seamless switching between “thinking” mode, for advanced math, coding, and logical inference, and “non-thinking” mode f…
MCP-Google-Map-Agent
Public