liutaocode/talking-face-arxiv-daily

Talking-Face Research Papers

Automatically Updated on 2026.02.18

Current Search Keywords: Talking Face, Talking Head, Visual Dubbing, Face Generation, Lip Sync, Talker, Portrait, Talking Video, Head Synthesis, Face Reenactment, Wav2Lip, Talking Avatar, Lip Generation, Lip-Synchronization, Portrait Animation, Facial Animation, Lip Expert

If you have any other keywords, please feel free to let us know :)
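
For readers who want to try a new keyword before suggesting it, here is a minimal sketch of how a keyword list could be turned into a query against the public arXiv API. It is not this repository's scraper: the function name `build_query_url`, the choice to search all fields, and the exact-phrase matching are assumptions made for illustration. Only the endpoint and the `search_query`, `sortBy`, `sortOrder`, and `max_results` parameters come from the arXiv API itself.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(keywords, max_results=10):
    """Build an arXiv API URL that matches any of the given keywords.

    Hypothetical helper for illustration; the repository's own scraper
    may construct its queries differently.
    """
    # Quote each keyword so multi-word terms are matched as phrases,
    # then OR the terms together across all fields.
    terms = " OR ".join(f'all:"{kw}"' for kw in keywords)
    params = {
        "search_query": terms,
        "sortBy": "submittedDate",   # newest submissions first
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


# Build (but do not send) a query for two of the keywords listed above.
url = build_query_url(["Talking Face", "Lip Sync"], max_results=5)
print(url)
```

Fetching the resulting URL returns an Atom feed. Broad keywords such as "Portrait" or "Talker" can match unrelated papers, so a candidate keyword is worth checking against a few results first.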

Web Page (Scrape Code)

Table of Contents
  1. Talking Face
  2. Image Animation

Talking Face

Publish Date Title Authors PDF Code
2026-02-14 EchoTorrent: Towards Swift, Sustained, and Streaming Multi-Modal Video Generation Rang Meng et.al. 2602.13669 null
2026-02-13 VineetVC: Adaptive Video Conferencing Under Severe Bandwidth Constraints Using Audio-Driven Talking-Head Reconstruction Vineet Kumar Rakesh et.al. 2602.12758 null
2026-02-12 3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars Zhongju Wang et.al. 2602.10516 null
2026-02-11 SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads Tan Yu et.al. 2602.07449 null
2026-02-10 Toward Fine-Grained Facial Control in 3D Talking Head Generation Shaoyang Xie et.al. 2602.09736 null
2026-02-10 AUHead: Realistic Emotional Talking Head Generation via Action Units Control Jiayi Lyu et.al. 2602.09534 null
2026-02-10 MOVA: Towards Scalable and Synchronized Video-Audio Generation SII-OpenMOSS Team et.al. 2602.08794 null
2026-02-09 VedicTHG: Symbolic Vedic Computation for Low-Resource Talking-Head Generation in Educational Avatars Vineet Kumar Rakesh et.al. 2602.08775 null
2026-02-06 Condition Matters in Full-head 3D GANs Heyuan Li et.al. 2602.07198 null
2026-02-06 Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models Haoyu Zhang et.al. 2602.07106 null
2026-02-05 From Blurry to Believable: Enhancing Low-quality Talking Heads with 3D Generative Priors Ding-Jiun Huang et.al. 2602.06122 null
2026-02-04 A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model Xiaolin Hu et.al. 2602.04913 null
2026-02-03 Asymmetric Hierarchical Anchoring for Audio-Visual Joint Representation: Resolving Information Allocation Ambiguity for Robust Cross-Modal Generalization Bixing Wu et.al. 2602.03570 null
2026-02-02 Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars Youliang Zhang et.al. 2602.01538 null
2026-01-31 JoyAvatar: Unlocking Highly Expressive Avatars via Harmonized Text-Audio Conditioning Ruikui Wang et.al. 2602.00702 null
2026-01-30 LPIPS-AttnWav2Lip: Generic Audio-Driven lip synchronization for Talking Head Generation in the Wild Zhipeng Chen et.al. 2602.00189 null
2026-01-30 MIRRORTALK: Forging Personalized Avatars Via Disentangled Style and Hierarchical Motion Control Renjie Lu et.al. 2601.22501 null
2026-01-29 JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Anthony Chen et.al. 2601.22143 null
2026-01-29 EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers John Flynn et.al. 2601.22127 null
2026-01-29 Lightweight High-Fidelity Low-Bitrate Talking Face Compression for 3D Video Conference Jianglong Li et.al. 2601.21269 null
2026-01-29 SkyReels-V3 Technique Report Debang Li et.al. 2601.17323 null
2026-01-28 SFQA: A Comprehensive Perceptual Quality Assessment Dataset for Singing Face Generation Zhilin Gao et.al. 2601.20385 null
2026-01-27 Uncertainty-Aware 3D Emotional Talking Face Synthesis with Emotion Prior Distillation Nanhan Shen et.al. 2601.19112 null
2026-01-26 Audio-Driven Talking Face Generation with Blink Embedding and Hash Grid Landmarks Encoding Yuhui Zhang et.al. 2601.18849 null
2026-01-26 Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting Tong Shi et.al. 2601.18633 null
2026-01-21 FunCineForge: A Unified Dataset Toolkit and Model for Zero-Shot Movie Dubbing in Diverse Cinematic Scenes Jiaxuan Liu et.al. 2601.14777 null
2026-01-20 HoverAI: An Embodied Aerial Agent for Natural Human-Drone Interaction Yuhua Jin et.al. 2601.13801 null
2026-01-19 Exploring Talking Head Models With Adjacent Frame Prior for Speech-Preserving Facial Expression Manipulation Zhenxuan Lu et.al. 2601.12876 null
2026-01-19 Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image Shuling Zhao et.al. 2601.12770 null
2026-01-15 RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation Peng Chen et.al. 2601.10606 null
2026-01-15 EditEmoTalk: Controllable Speech-Driven 3D Facial Animation with Continuous Expression Editing Diqiong Jiang et.al. 2601.10000 null
2026-01-14 Now You See Me, Now You Don't: A Unified Framework for Expression Consistent Anonymization in Talking Head Videos Anil Egin et.al. 2601.11635 null
2026-01-14 MoCha:End-to-End Video Character Replacement without Structural Guidance Zhengbo Xu et.al. 2601.08587 null
2026-01-13 Deep Learning Based Facial Retargeting Using Local Patches Yeonsoo Choi et.al. 2601.08429 null
2026-01-08 MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning Chunyu Qiang et.al. 2601.01568 null
2026-01-07 REFA: Real-time Egocentric Facial Animations for Virtual Reality Qiang Zhang et.al. 2601.03507 null
2026-01-05 HeadLighter: Disentangling Illumination in Generative 3D Gaussian Heads via Lightstage Captures Yating Wang et.al. 2601.02103 null
2026-01-05 ESGaussianFace: Emotional and Stylized Audio-Driven Facial Animation via 3D Gaussian Splatting Chuhang Ma et.al. 2601.01847 null
2026-01-05 MANGO:Natural Multi-speaker 3D Talking Head Generation via 2D-Lifted Enhancement Lei Zhu et.al. 2601.01749 null
2026-01-02 Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Taekyung Ki et.al. 2601.00664 null
2025-12-31 From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing Xu He et.al. 2512.25066 null
2025-12-30 DyStream: Streaming Dyadic Talking Heads Generation via Flow Matching-based Autoregressive Model Bohong Chen et.al. 2512.24408 null
2025-12-30 SyncAnyone: Implicit Disentanglement via Progressive Self-Correction for Lip-Syncing in the wild Xindi Zhang et.al. 2512.21736 null
2025-12-29 Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation Steven Xiao et.al. 2512.21734 null
2025-12-29 Efficient and Robust Video Defense Framework against 3D-field Personalized Talking Face Rui-qing Sun et.al. 2512.21019 null
2025-12-27 PTalker: Personalized Speech-Driven 3D Talking Head Animation via Style Disentanglement and Modality Alignment Bin Wang et.al. 2512.22602 null
2025-12-24 ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction Md Zabirul Islam et.al. 2512.20858 null
2025-12-23 TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation Ji-Hoon Kim et.al. 2512.20296 null
2025-12-23 FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs Andreas Zinonos et.al. 2512.20033 null
2025-12-22 ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars Ziqiao Peng et.al. 2512.19546 null
2025-12-21 In-Context Audio Control of Video Diffusion Transformers Wenze Liu et.al. 2512.18772 null
2025-12-20 Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems Eren Caglar et.al. 2512.18318 null
2025-12-20 MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Kaixing Yang et.al. 2512.18181 null
2025-12-19 SynergyWarpNet: Attention-Guided Cooperative Warping for Neural Portrait Animation Shihang Li et.al. 2512.17331 null
2025-12-19 InstructDubber: Instruction-based Alignment for Zero-shot Movie Dubbing Zhedong Zhang et.al. 2512.17154 null
2025-12-18 FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction Shuyuan Tu et.al. 2512.16900 null
2025-12-18 Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation Kaiwen Jiang et.al. 2512.16893 null
2025-12-17 FlexAvatar: Learning Complete 3D Head Avatars with Partial Supervision Tobias Kirschstein et.al. 2512.15599 null
2025-12-17 DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations Yuxiang Shi et.al. 2512.15524 null
2025-12-16 TalkVerse: Democratizing Minute-Long Audio-Driven Video Generation Zhenzhi Wang et.al. 2512.14938 null
2025-12-16 VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image Sicheng Xu et.al. 2512.14677 null
2025-12-16 FacEDiT: Unified Talking Face Editing and Generation via Facial Motion Infilling Kim Sung-Bin et.al. 2512.14056 null
2025-12-16 Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model Heyi Chen et.al. 2512.13507 null
2025-12-15 JoVA: Unified Multimodal Learning for Joint Video-Audio Generation Xiaohu Huang et.al. 2512.13677 null
2025-12-15 Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation Jiangning Zhang et.al. 2512.13495 null
2025-12-15 KlingAvatar 2.0 Technical Report Kling Team et.al. 2512.13313 null
2025-12-15 STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits Foivos Paraperas Papantoniou et.al. 2512.13247 null
2025-12-12 FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint Jiapeng Tang et.al. 2512.11645 null
2025-12-12 JoyAvatar: Real-time and Infinite Audio-Driven Avatar Generation with Autoregressive Diffusion Chaochao Li et.al. 2512.11423 null
2025-12-12 KeyframeFace: From Text to Expressive Facial Keyframes Jingchao Wu et.al. 2512.11321 null
2025-12-12 PersonaLive! Expressive Portrait Image Animation for Live Streaming Zhiyuan Li et.al. 2512.11253 null
2025-12-12 REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation Haotian Wang et.al. 2512.11229 null
2025-12-11 GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting Madhav Agarwal et.al. 2512.10939 null
2025-12-10 EmoDiffTalk:Emotion-aware Diffusion for Editable 3D Gaussian Talking Head Chang Liu et.al. 2512.05991 null
2025-12-04 LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging Zhijian Shu et.al. 2512.04939 null
2025-12-04 Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild Yigui Feng et.al. 2512.04728 null
2025-12-02 DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions Yifan Zhou et.al. 2512.02727 null
2025-12-01 ELVIS: Enhance Low-Light for Video Instance Segmentation in the Dark Joanne Lin et.al. 2512.01495 null
2025-12-01 EvalTalker: Learning to Evaluate Real-Portrait-Driven Multi-Subject Talking Humans Yingjie Zhou et.al. 2512.01340 null
2025-11-30 TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model Alireza Javanmardi et.al. 2512.00909 null
2025-11-29 MVAD : A Comprehensive Multimodal Video-Audio Dataset for AIGC Detection Mengxue Hu et.al. 2512.00336 null
2025-11-28 AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Zhizhou Zhong et.al. 2511.23475 null
2025-11-28 DAONet-YOLOv8: An Occlusion-Aware Dual-Attention Network for Tea Leaf Pest and Disease Detection Yefeng Wu et.al. 2511.23222 null
2025-11-28 CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation Fengyi Fang et.al. 2511.22863 null
2025-11-27 AI killed the video star. Audio-driven diffusion model for expressive talking head generation Baptiste Chopin et.al. 2511.22488 null
2025-11-27 VSpeechLM: A Visual Speech Language Model for Visual Text-to-Speech Task Yuyue Wang et.al. 2511.22229 null
2025-11-27 IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer Bo Chen et.al. 2511.22167 null
2025-11-27 Lips-Jaw and Tongue-Jaw Articulatory Tradeoff in DYNARTmo Bernd J. Kröger et.al. 2511.22155 null
2025-11-27 DiP: Taming Diffusion Models in Pixel Space Zhennan Chen et.al. 2511.18822 null
2025-11-26 Passive Dementia Screening via Facial Temporal Micro-Dynamics Analysis of In-the-Wild Talking-Head Video Filippo Cenacchi et.al. 2511.13802 null
2025-11-25 Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos Youngseo Kim et.al. 2511.19936 null
2025-11-24 Blinking Beyond EAR: A Stable Eyelid Angle Metric for Driver Drowsiness Detection and Data Augmentation Mathis Wolter et.al. 2511.19519 null
2025-11-24 Assessing the alignment between infants' visual and linguistic experience using multimodal language models Alvin Wei Ming Tan et.al. 2511.18824 null
2025-11-23 The Locally Deployable Virtual Doctor: LLM Based Human Interface for Automated Anamnesis and Database Conversion Jan Benedikt Ruhland et.al. 2511.18632 null
2025-11-23 RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data Wenchao Ma et.al. 2511.18601 null
2025-11-22 A superpersuasive autonomous policy debating system Allen Roush et.al. 2511.17854 null
2025-11-21 Investigating self-supervised representations for audio-visual deepfake detection Dragos-Alexandru Boldisor et.al. 2511.17181 null
2025-11-21 One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Aleksandr Razin et.al. 2511.10629 null
2025-11-20 Motion Transfer-Enhanced StyleGAN for Generating Diverse Macaque Facial Expressions Takuya Igaue et.al. 2511.16711 null
2025-11-19 StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model Yifan Yang et.al. 2511.14223 null
2025-11-18 Blur-Robust Detection via Feature Restoration: An End-to-End Framework for Prior-Guided Infrared UAV Target Detection Xiaolin Wang et.al. 2511.14371 null
2025-11-18 Towards Authentic Movie Dubbing with Retrieve-Augmented Director-Actor Interaction Learning Rui Liu et.al. 2511.14249 null
2025-11-17 B2F: End-to-End Body-to-Face Motion Generation with Style Reference Bokyung Jang et.al. 2511.13988 null
2025-11-17 Uni-Hand: Universal Hand Motion Forecasting in Egocentric Views Junyi Ma et.al. 2511.12878 null
2025-11-14 3D Gaussian and Diffusion-Based Gaze Redirection Abiram Panchalingam et.al. 2511.11231 null
2025-11-12 GRACE: Designing Generative Face Video Codec via Agile Hardware-Centric Workflow Rui Wan et.al. 2511.09272 null
2025-11-11 StableMorph: High-Quality Face Morph Generation with Stable Diffusion Wassim Kabbani et.al. 2511.08090 null
2025-11-11 Is It Truly Necessary to Process and Fit Minutes-Long Reference Videos for Personalized Talking Face Generation? Rui-Qing Sun et.al. 2511.07940 null
2025-11-10 LiveNeRF: Efficient Face Replacement Through Neural Radiance Fields Integration Tung Vu et.al. 2511.07552 null
2025-11-10 The Inner Kernel of the Classical Kuiper Belt Amir Siraj et.al. 2511.07512 null
2025-11-10 ConsistTalk: Intensity Controllable Temporally Consistent Talking Head Generation with Diffusion Noise Search Zhenjie Liu et.al. 2511.06833 null
2025-11-08 DiLO: Disentangled Latent Optimization for Learning Shape and Deformation in Grouped Deforming 3D Objects Mostofa Rafid Uddin et.al. 2511.06115 null
2025-11-08 Reperio-rPPG: Relational Temporal Graph Neural Networks for Periodicity Learning in Remote Physiological Measurement Ba-Thinh Nguyen et.al. 2511.05946 null
2025-11-07 Shared Latent Representation for Joint Text-to-Audio-Visual Synthesis Dogucan Yaman et.al. 2511.05432 null
2025-11-07 THEval. Evaluation Framework for Talking Head Video Generation Nabyl Quignon et.al. 2511.04520 null
2025-11-05 UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions Guozhen Zhang et.al. 2511.03334 null
2025-11-04 Densemarks: Learning Canonical Embeddings for Human Heads Images via Point Tracks Dmitrii Pozdeev et.al. 2511.02830 null
2025-10-29 Learning Disentangled Speech- and Expression-Driven Blendshapes for 3D Talking Face Animation Yuxiang Mao et.al. 2510.25234 null
2025-10-28 See the Speaker: Crafting High-Resolution Talking Faces from Speech with Prior Guidance and Region Refinement Jinting Wang et.al. 2510.26819 null
2025-10-27 Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation Junyoung Seo et.al. 2510.23581 null
2025-10-27 Revising Second Order Terms in Deep Animation Video Coding Konstantin Schmidt et.al. 2510.23561 null
2025-10-26 MAGIC-Talk: Motion-aware Audio-Driven Talking Face Generation with Customizable Identity Control Fatemeh Nazarieh et.al. 2510.22810 null
2025-10-26 DeepfakeBench-MM: A Comprehensive Benchmark for Multimodal Deepfake Detection Kangran Zhao et.al. 2510.22622 null
2025-10-24 Unmasking Puppeteers: Leveraging Biometric Leakage to Disarm Impersonation in AI-based Videoconferencing Danial Samadi Vahdati et.al. 2510.03548 null
2025-10-23 LSF-Animation: Label-Free Speech-Driven Facial Animation via Implicit Feature Representation Xin Lu et.al. 2510.21864 null
2025-10-16 PIA: Deepfake Detection Using Phoneme-Temporal and Identity-Dynamic Analysis Soumyya Kanti Datta et.al. 2510.14241 null
2025-10-14 Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback Xingpei Ma et.al. 2510.12089 null
2025-10-12 DEMO: Disentangled Motion Latent Flow Matching for Fine-Grained Controllable Talking Portrait Synthesis Peiyin Chen et.al. 2510.10650 null
2025-10-11 VividAnimator: An End-to-End Audio and Pose-driven Half-Body Human Animation Framework Donglin Huang et.al. 2510.10269 null
2025-10-11 SyncLipMAE: Contrastive Masked Pretraining for Audio-Visual Talking-Face Representation Zeyu Ling et.al. 2510.10069 null
2025-10-09 Paper2Video: Automatic Video Generation from Scientific Papers Zeyu Zhu et.al. 2510.05096 null
2025-10-08 A Bridge from Audio to Video: Phoneme-Viseme Alignment Allows Every Face to Speak Multiple Languages Zibo Su et.al. 2510.06612 null
2025-10-03 EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation Tianheng Zhu et.al. 2510.08587 null
2025-10-02 Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation Beijia Lu et.al. 2510.02617 null
2025-10-01 Audio Driven Real-Time Facial Animation for Social Telepresence Jiye Lee et.al. 2510.01176 null
2025-09-30 3DiFACE: Synthesizing and Editing Holistic 3D Facial Animation Balamurugan Thambiraja et.al. 2509.26233 null
2025-09-26 StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing Liyang Chen et.al. 2509.21887 null
2025-09-25 Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos Sarmistha Das et.al. 2509.20961 null
2025-09-24 KSDiff: Keyframe-Augmented Speech-Aware Dual-Path Diffusion for Facial Animation Tianle Lyu et.al. 2509.20128 null
2025-09-24 Comparative Study of Subjective Video Quality Assessment Test Methods in Crowdsourcing for Varied Use Cases Babak Naderi et.al. 2509.20118 null
2025-09-24 SynchroRaMa : Lip-Synchronized and Emotion-Aware Talking Face Generation via Multi-Modal Emotion Embedding Phyo Thet Yee et.al. 2509.19965 null
2025-09-24 Talking Head Generation via AU-Guided Landmark Prediction Shao-Yu Chang et.al. 2509.19749 null
2025-09-23 Audio-Driven Universal Gaussian Head Avatars Kartik Teotia et.al. 2509.18924 null
2025-09-22 "I don't like my avatar": Investigating Human Digital Doubles Siyi Liu et.al. 2509.17748 null
2025-09-22 Stable Video-Driven Portraits Mallikarjun B. R. et.al. 2509.17476 null
2025-09-21 Beat on Gaze: Learning Stylized Generation of Gaze and Head Dynamics Chengwei Shi et.al. 2509.17168 null
2025-09-21 PGSTalker: Real-Time Audio-Driven Talking Head Generation via 3D Gaussian Splatting with Pixel-Aware Density Control Tianheng Zhu et.al. 2509.16922 null
2025-09-20 Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation Yue Ma et.al. 2509.16630 null
2025-09-17 Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Yikang Ding et.al. 2509.09595 null
2025-09-16 A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis Javeria Amir et.al. 2509.12831 null
2025-09-15 AvatarSync: Rethinking Talking-Head Animation through Autoregressive Perspective Yuchen Deng et.al. 2509.12052 null
2025-09-10 Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video Xiao Li et.al. 2509.08376 null
2025-09-09 PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image Peng Li et.al. 2509.07552 null
2025-09-04 Durian: Dual Reference-guided Portrait Animation with Attribute Transfer Hyunsoo Cha et.al. 2509.04434 null
2025-08-28 EmoCAST: Emotional Talking Portrait via Emotive Text Description Yiguo Jiang et.al. 2508.20615 null
2025-08-27 InfinityHuman: Towards Long-Term Audio-Driven Human Xiaodi Li et.al. 2508.20210 null
2025-08-27 Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning Stelios Mylonas et.al. 2508.19730 null
2025-08-26 OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation Jianwen Jiang et.al. 2508.19209 null
2025-08-26 PanoHair: Detailed Hair Strand Synthesis on Volumetric Heads Shashikant Verma et.al. 2508.18944 null
2025-08-26 Wan-S2V: Audio-Driven Cinematic Video Generation Xin Gao et.al. 2508.18621 null
2025-08-26 Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis Radek Daněček et.al. 2504.13386 null
2025-08-25 Lightning Fast Caching-based Parallel Denoising Prediction for Accelerating Talking Head Generation Jianzhi Long et.al. 2509.00052 null
2025-08-25 EAI-Avatar: Emotion-Aware Interactive Talking Head Generation Haijie Yang et.al. 2508.18337 null
2025-08-22 Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars NVIDIA et.al. 2508.16401 null
2025-08-20 D^3-Talker: Dual-Branch Decoupled Deformation Fields for Few-Shot 3D Talking Head Synthesis Yuhang Guo et.al. 2508.14449 null
2025-08-20 Taming Transformer for Emotion-Controllable Talking Face Generation Ziqi Zhang et.al. 2508.14359 null
2025-08-19 TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Shunian Chen et.al. 2508.13618 null
2025-08-19 EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis Shuai Tan et.al. 2508.13442 null
2025-08-18 Human Feedback Driven Dynamic Speech Emotion Recognition Ilya Fedorov et.al. 2508.14920 null
2025-08-17 CEM-Net: Cross-Emotion Memory Network for Emotional Talking Face Generation Kangyi Wu et.al. 2508.12368 null
2025-08-16 RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis Wenqing Wang et.al. 2508.12163 null
2025-08-16 SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System Truong Thanh Hung Nguyen et.al. 2508.11873 null
2025-08-15 FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation MengChao Wang et.al. 2508.11255 null
2025-08-14 HM-Talker: Hybrid Motion Modeling for High-Fidelity Talking Head Synthesis Shiyu Liu et.al. 2508.10566 null
2025-08-14 M2DAO-Talker: Harmonizing Multi-granular Motion Decoupling and Alternating Optimization for Talking-head Generation Kui Jiang et.al. 2507.08307 null
2025-08-14 MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding Chang Liu et.al. 2507.06071 null
2025-08-13 LIA-X: Interpretable Latent Portrait Animator Yaohui Wang et.al. 2508.09959 null
2025-08-12 Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos Chaoyi Wang et.al. 2508.08891 null
2025-08-11 Learning Phonetic Context-Dependent Viseme for Enhancing Speech-Driven 3D Facial Animation Hyung Kyu Kim et.al. 2507.20568 null
2025-08-10 KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features Ivan Kukanov et.al. 2508.07337 null
2025-08-08 MotionSwap Om Patil et.al. 2508.06430 null
2025-08-08 MoDA: Multi-modal Diffusion Architecture for Talking Head Generation Xinyang Li et.al. 2507.03256 null
2025-08-07 Evaluation of a Sign Language Avatar on Comprehensibility, User Experience & Acceptability Fenya Wasserroth et.al. 2508.05358 null
2025-08-07 RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer Fangyu Du et.al. 2508.05115 null
2025-08-07 UniTalker: Conversational Speech-Visual Synthesis Yifan Hu et.al. 2508.04585 null
2025-08-07 AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation Le Wang et.al. 2508.00733 null
2025-08-06 MienCap: Realtime Performance-Based Facial Animation with Live Mood Dynamics Ye Pan et.al. 2508.04687 null
2025-08-06 READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation Haotian Wang et.al. 2508.03457 null
2025-08-06 Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation Weipeng Tan et.al. 2504.18087 null
2025-08-05 Multi-human Interactive Talking Dataset Zeyu Zhu et.al. 2508.03050 null
2025-08-04 X-Actor: Emotional and Expressive Long-Range Portrait Acting from Audio Chenxu Zhang et.al. 2508.02944 null
2025-08-04 Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering Xu Wang et.al. 2508.02362 null
2025-08-04 Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos Laura Pedrouzo-Rodriguez et.al. 2508.00748 null
2025-07-31 Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads Yingjie Zhou et.al. 2507.23343 null
2025-07-30 X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention Xiaochen Zhao et.al. 2507.23143 null
2025-07-30 Robust Deepfake Detection for Electronic Know Your Customer Systems Using Registered Images Takuma Amada et.al. 2507.22601 null
2025-07-29 DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation He Feng et.al. 2508.06511 null
2025-07-29 JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1 Xinhan Di et.al. 2507.20987 null
2025-07-28 Mask-Free Audio-driven Talking Face Generation for Enhanced Visual Quality and Identity Preservation Dogucan Yaman et.al. 2507.20953 null
2025-07-28 MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization Hyung Kyu Kim et.al. 2507.20562 null
2025-07-28 JOLT3D: Joint Learning of Talking Heads and 3DMM Parameters with Application to Lip-Sync Sungjoon Park et.al. 2507.20452 null
2025-07-25 Face2VoiceSync: Lightweight Face-Voice Consistency for Text-Driven Talking Face Generation Fang Kang et.al. 2507.19225 null
2025-07-24 Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation Zhen Han et.al. 2507.18352 null
2025-07-24 Celeb-DF++: A Large-scale Challenging Video DeepFake Benchmark for Generalizable Forensics Yuezun Li et.al. 2507.18015 null
2025-07-22 Livatar-1: Real-Time Talking Heads Generation with Tailored Flow Matching Haiyang Liu et.al. 2507.18649 null
2025-07-22 Navigating Large-Pose Challenge for High-Fidelity Face Reenactment with Video Diffusion Model Mingtao Guo et.al. 2507.16341 null
2025-07-21 VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis Alexandre Symeonidis-Herzig et.al. 2507.06060 null
2025-07-17 FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers Qiang Wang et.al. 2507.12956 null
2025-07-17 ATL-Diff: Audio-Driven Talking Head Generation with Early Landmarks-Guide Noise Diffusion Hoang-Son Vo et.al. 2507.12804 null
2025-07-17 Think-Before-Draw: Decomposing Emotion Semantics & Fine-Grained Controllable Expressive Talking Head Generation Hanlei Shi et.al. 2507.12761 null
2025-07-17 Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries Minyoung Kim et.al. 2507.12723 null
2025-07-16 AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation Hao Li et.al. 2507.12001 null
2025-07-15 Model See Model Do: Speech-Driven Facial Animation with Style Control Yifang Pan et.al. 2505.01319 null
2025-07-11 Detecting Deepfake Talking Heads from Facial Biometric Anomalies Justin D. Norman et.al. 2507.08917 null
2025-07-10 GGTalker: Talking Head Systhesis with Generalizable Gaussian Priors and Identity-Specific Adaptation Wentao Hu et.al. 2506.21513 null
2025-07-07 MoDiT: Learning Highly Consistent 3D Motion Coefficients with Diffusion Transformer for Talking Head Generation Yucheng Wang et.al. 2507.05092 null
2025-07-05 EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation Rang Meng et.al. 2507.03905 null
2025-07-03 CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation Xiangyang Luo et.al. 2507.02691 null
2025-07-02 FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases Shuai Tan et.al. 2507.01390 null
2025-07-01 ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing Babak Naderi et.al. 2506.12269 null
2025-06-30 JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching Mingi Kwon et.al. 2506.23552 null
2025-06-27 MirrorMe: Towards Realtime and High Fidelity Audio-Driven Halfbody Animation Dechao Meng et.al. 2506.22065 null
2025-06-27 Few-Shot Identity Adaptation for 3D Talking Heads via Global Gaussian Field Hong Nie et.al. 2506.22044 null
2025-06-27 RiverEcho: Real-Time Interactive Digital System for Ancient Yellow River Culture Haofeng Wang et.al. 2506.21865 null
2025-06-24 Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router Yubo Huang et.al. 2506.19833 null
2025-06-23 Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions Vineet Kumar Rakesh et.al. 2507.02900 null
2025-06-23 OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation Qijun Gan et.al. 2506.18866 null
2025-06-23 CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis Florian Barthel et.al. 2505.17590 null
2025-06-17 SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting Ziqiao Peng et.al. 2506.14742 null
2025-06-17 Compressed Video Super-Resolution based on Hierarchical Encoding Yuxuan Jiang et.al. 2506.14381 null
2025-06-16 Audio-Visual Driven Compression for Low-Bitrate Talking Head Videos Riku Takahashi et.al. 2506.13419 null
2025-06-15 iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer Zhelun Shen et.al. 2506.12847 null
2025-06-10 HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation Ziyao Huang et.al. 2506.08797 null
2025-06-03 NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results Xiaohong Liu et.al. 2506.02875 null
2025-06-03 OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking Zhongjian Wang et.al. 2504.02433 null
2025-06-02 Cocktail-Party Audio-Visual Speech Recognition Thai-Binh Nguyen et.al. 2506.02178 null
2025-06-02 Low-Rank Head Avatar Personalization with Registers Sai Tanmay Reddy Chakkera et.al. 2506.01935 null
2025-06-02 Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation Yuan Gan et.al. 2506.01591 null
2025-06-01 SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers Zhengcong Fei et.al. 2506.00830 null
2025-05-30 TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection Xinqi Xiong et.al. 2505.24866 null
2025-05-29 Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation Jiahao Cui et.al. 2505.23525 null
2025-05-29 Video Editing for Audio-Visual Dubbing Binyamin Manela et.al. 2505.23406 null
2025-05-29 Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation Hao Li et.al. 2505.23290 null
2025-05-29 MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation Siyuan Wang et.al. 2505.23120 null
2025-05-28 Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation Zhe Kong et.al. 2505.22647 null
2025-05-28 Tell me Habibi, is it Real or Fake? Kartik Kuckreja et.al. 2505.22581 null
2025-05-28 Neural Face Skinning for Mesh-agnostic Facial Expression Cloning Sihun Cha et.al. 2505.22416 null
2025-05-28 FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing Guanwen Feng et.al. 2505.22141 null
2025-05-28 RESOUND: Speech Reconstruction from Silent Videos via Acoustic-Semantic Decomposed Modeling Long-Khanh Pham et.al. 2505.22024 null
2025-05-27 OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers Ziqiao Peng et.al. 2505.21448 null
2025-05-26 Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting Yizhou Zhao et.al. 2505.20582 null
2025-05-26 DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations Ziqiao Peng et.al. 2505.18096 null
2025-05-14 Test-Time Augmentation for Pose-invariant Face Recognition Jaemin Jung et.al. 2505.09256 null
2025-05-10 VTutor: An Animated Pedagogical Agent SDK that Provide Real Time Multi-Model Feedback Eason Chen et.al. 2505.06676 null
2025-05-10 OT-Talk: Animating 3D Talking Head with Optimal Transportation Xinmu Wang et.al. 2505.01932 null
2025-05-10 MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance Mengting Wei et.al. 2504.21497 null
2025-05-08 OXSeg: Multidimensional attention UNet-based lip segmentation using semi-supervised lip contours Hanie Moghaddasi et.al. 2505.05531 null
2025-05-03 GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting Anushka Agarwal et.al. 2505.01928 null
2025-05-02 FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing Gaoxiang Cong et.al. 2505.01263 null
2025-05-01 KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution Antoni Bigata et.al. 2505.00497 null
2025-04-29 IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos Yuan Li et.al. 2504.19165 null
2025-04-27 Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions Mohammad Mahdi Abootorabi et.al. 2504.19056 null
2025-04-26 Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning Yifan Xie et.al. 2504.18810 null
2025-04-14 Controllable Expressive 3D Facial Animation via Diffusion in a Unified Multimodal Space Kangwei Liu et.al. 2506.10007 null
2025-04-14 SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models Stathis Galanakis et.al. 2504.10716 null
2025-04-10 ChildlikeSHAPES: Semantic Hierarchical Region Parsing for Animating Figure Drawings Astitva Srivastava et.al. 2504.08022 null
2025-04-08 VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing Juan Luis Gonzalez Bello et.al. 2504.07146 null
2025-04-08 SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity Yihuan Huang et.al. 2504.05803 null
2025-04-08 Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation Zhihua Xu et.al. 2504.05746 null
2025-04-08 Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation Tianshui Chen et.al. 2504.05672 null
2025-04-07 Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Fa-Ting Hong et.al. 2504.02542 null
2025-04-06 FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency Shiyan Liu et.al. 2504.04427 null
2025-04-04 A Human Digital Twin Architecture for Knowledge-based Interactions and Context-Aware Conversations Abdul Mannan Mohammed et.al. 2504.03147 null
2025-04-03 VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models Kim Sung-Bin et.al. 2504.02386 null
2025-04-02 Detecting Lip-Syncing Deepfakes: Vision Temporal Transformer for Analyzing Mouth Inconsistencies Soumyya Kanti Datta et.al. 2504.01470 link
2025-04-02 EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters Xuli Shen et.al. 2503.19416 null
2025-04-01 Monocular and Generalizable Gaussian Talking Head Animation Shengjie Gong et.al. 2504.00665 null
2025-03-31 Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics Lee Chae-Yeon et.al. 2503.20308 null
2025-03-30 MoCha: Towards Movie-Grade Talking Character Synthesis Cong Wei et.al. 2503.23307 null
2025-03-29 STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing Zijun Ding et.al. 2503.23039 link
2025-03-28 Audio-Plane: Audio Factorization Plane Gaussian Splatting for Real-Time Talking Head Synthesis Shuai Shen et.al. 2503.22605 null
2025-03-28 Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance Haijie Yang et.al. 2503.22225 null
2025-03-27 ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model Jinwei Qi et.al. 2503.21144 null
2025-03-27 DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Hanbo Cheng et.al. 2410.13726 null
2025-03-26 Dual Audio-Centric Modality Coupling for Talking Head Generation Ao Fu et.al. 2503.22728 null
2025-03-25 AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers Jiazhi Guan et.al. 2503.19824 null
2025-03-25 MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation Yukang Lin et.al. 2503.19383 null
2025-03-25 HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation Zunnan Xu et.al. 2503.18860 null
2025-03-25 Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model Yingying Fan et.al. 2503.16942 null
2025-03-24 DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model Kangwei Liu et.al. 2503.19001 null
2025-03-24 Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation Dingcheng Zhen et.al. 2503.18429 null
2025-03-23 DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation Peng Chen et.al. 2503.18159 link
2025-03-21 TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting Jianchuan Chen et.al. 2503.17032 null
2025-03-21 From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech Ji-Hoon Kim et.al. 2503.16956 null
2025-03-20 UniSync: A Unified Framework for Audio-Visual Synchronization Tao Feng et.al. 2503.16357 null
2025-03-20 PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation Baiqin Wang et.al. 2503.14295 null
2025-03-19 DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis Yuming Gu et.al. 2503.15667 link
2025-03-17 SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization Xulin Fan et.al. 2503.13371 null
2025-03-17 Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Chaolong Yang et.al. 2503.12963 link
2025-03-16 Versatile Multimodal Controls for Whole-Body Talking Human Animation Zheng Qin et.al. 2503.08714 null
2025-03-14 Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control Hejia Chen et.al. 2503.14517 null
2025-03-14 EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models Yixuan Zhang et.al. 2503.11028 null
2025-03-12 StyleSpeaker: Audio-Enhanced Fine-Grained Style Modeling for Speech-Driven 3D Facial Animation An Yang et.al. 2503.09852 null
2025-03-12 Bidirectional Learned Facial Animation Codec for Low Bitrate Talking Head Videos Riku Takahashi et.al. 2503.09787 null
2025-03-09 Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter Yanyu Zhu et.al. 2503.06397 null
2025-03-07 MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Hongwei Yi et.al. 2503.05978 null
2025-03-06 FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis Ziqi Ni et.al. 2503.04067 null
2025-03-03 KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation Antoni Bigata et.al. 2503.01715 null
2025-03-02 FaceShot: Bring Any Character into Life Junyao Gao et.al. 2503.00740 null
2025-03-01 Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture Xuanchen Li et.al. 2503.00495 null
2025-02-28 Two-Stream Spatial-Temporal Transformer Framework for Person Identification via Natural Conversational Keypoints Masoumeh Chapariniya et.al. 2502.20803 null
2025-02-28 ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model Xuangeng Chu et.al. 2502.20323 null
2025-02-27 InsTaG: Learning Personalized 3D Talking Head from Few-Second Video Jiahe Li et.al. 2502.20387 link
2025-02-27 High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model Mingtao Guo et.al. 2502.19894 link
2025-02-26 FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model Lingzhou Mu et.al. 2502.19455 null
2025-02-24 Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation Baptiste Chopin et.al. 2502.17198 null
2025-02-20 NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head Synthesis Xiaoxing Liu et.al. 2502.14178 null
2025-02-18 AV-Flow: Transforming Text to Audio-Visual Human-like Interactions Aggelina Chatziagapi et.al. 2502.13133 null
2025-02-17 SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion Junxian Ma et.al. 2502.11515 null
2025-02-15 SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers Di Qiu et.al. 2502.10841 link
2025-02-13 Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model Fei Shen et.al. 2502.09533 null
2025-02-13 VTutor: An Open-Source SDK for Generative AI-Powered Animated Pedagogical Agents with Multi-Media Output Eason Chen et.al. 2502.04103 null
2025-02-11 Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion Xingpei Ma et.al. 2502.07203 null
2025-02-07 Towards Multimodal Empathetic Response Generation: A Rich Text-Speech-Vision Avatar-based Benchmark Han Zhang et.al. 2502.04976 null
2025-02-02 EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis Junuk Cha et.al. 2502.00654 null
2025-01-24 SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation Yujian Liu et.al. 2501.14646 null
2025-01-21 A Lightweight and Interpretable Deepfakes Detection Framework Muhammad Umar Farooq et.al. 2501.11927 null
2025-01-18 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation Linrui Tian et.al. 2501.10687 null
2025-01-17 TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation Yixiang Zhuang et.al. 2501.09921 null
2025-01-15 Joint Learning of Depth and Appearance for Portrait Image Animation Xinya Ji et.al. 2501.08649 null
2025-01-15 Make-A-Character 2: Animatable 3D Character Generation From a Single Image Lin Liu et.al. 2501.07870 null
2025-01-09 Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding Ji-Ha Park et.al. 2501.14790 null
2025-01-09 Identity-Preserving Video Dubbing Using Motion Warping Runzhen Liu et.al. 2501.04586 null
2025-01-09 MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation Huaize Liu et.al. 2501.01808 null
2025-01-07 Generating and Detecting Various Types of Fake Image and Audio Content: A Review of Modern Deep Learning Technologies and Tools Arash Dehghani et.al. 2501.06227 null
2025-01-07 VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Yuanpeng Tu et.al. 2501.01427 null
2025-01-06 RDD4D: 4D Attention-Guided Road Damage Detection And Classification Asma Alkalbani et.al. 2501.02822 link
2025-01-06 Takeaways from Applying LLM Capabilities to Multiple Conversational Avatars in a VR Pilot Study Mykola Maslych et.al. 2501.00168 null
2025-01-03 JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing Qili Wang et.al. 2501.01798 link
2024-12-28 DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis Kaijun Deng et.al. 2412.20148 link
2024-12-26 UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control Wenzhang Sun et.al. 2412.19860 null
2024-12-26 Generating Editable Head Avatars with 3D Gaussian GANs Guohao Li et.al. 2412.19149 link
2024-12-23 FaceLift: Single Image to 3D Head with View Generation and GS-LRM Weijie Lyu et.al. 2412.17812 null
2024-12-22 FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation Tianyun Zhong et.al. 2412.16915 null
2024-12-18 Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters Steven Hogue et.al. 2412.14333 link
2024-12-18 GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection Xiaocan Chen et.al. 2412.13656 null
2024-12-18 Learning to Control an Android Robot Head for Facial Animation Marcel Heisler et.al. 2412.13641 null
2024-12-18 Real-time One-Step Diffusion-based Expressive Portrait Videos Generation Hanzhong Guo et.al. 2412.13479 link
2024-12-18 VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization Tao Liu et.al. 2412.09892 null
2024-12-16 Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content Rohit Kundu et.al. 2412.12278 null
2024-12-13 GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression Ziqi Zhou et.al. 2412.09296 link
2024-12-12 LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync Chunyu Li et.al. 2412.09262 link
2024-12-12 EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing Gaoxiang Cong et.al. 2412.08988 null
2024-12-12 PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis Yifan Xie et.al. 2412.08504 null
2024-12-10 PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation Fatemeh Nazarieh et.al. 2412.07754 null
2024-12-10 IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation Sejong Yang et.al. 2412.04000 null
2024-12-05 MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation Longtao Zheng et.al. 2412.04448 null
2024-12-05 Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks Jiahao Cui et.al. 2412.00733 link
2024-12-04 SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Yan Li et.al. 2412.03430 null
2024-12-02 One Shot, One Talk: Whole-body Talking Avatar from a Single Image Jun Xiang et.al. 2412.01106 null
2024-12-01 Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation Shuling Zhao et.al. 2412.00719 null
2024-11-29 LokiTalk: Learning Fine-Grained and Generalizable Correspondences to Enhance NeRF-based Talking Head Synthesis Tianqi Li et.al. 2411.19525 null
2024-11-29 Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis Tianqi Li et.al. 2411.19509 null
2024-11-29 V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow Jeongsoo Choi et.al. 2411.19486 null
2024-11-26 Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey Hong-Hanh Nguyen-Le et.al. 2411.17911 null
2024-11-25 Sonic: Shifting Focus to Global Audio Perception in Portrait Animation Xiaozhong Ji et.al. 2411.16331 null
2024-11-25 ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations Xulong Zhang et.al. 2411.13089 null
2024-11-24 LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis Haojie Zhang et.al. 2411.16748 null
2024-11-23 EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion Haotian Wang et.al. 2411.16726 null
2024-11-23 ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance Haijie Yang et.al. 2411.15436 null
2024-11-20 Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis Pegah Salehi et.al. 2411.13209 link
2024-11-20 JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation Xuyang Cao et.al. 2411.09209 link
2024-11-14 LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space Guanwen Feng et.al. 2411.09268 null
2024-11-06 Large Generative Model-assisted Talking-face Semantic Communication System Feibo Jiang et.al. 2411.03876 null
2024-10-31 Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts Xiang Deng et.al. 2410.23836 null
2024-10-29 Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing Haonan Tong et.al. 2410.22112 null
2024-10-24 Real-time 3D-aware Portrait Video Relighting Ziqi Cai et.al. 2410.18355 link
2024-10-21 Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions Malte Prinzler et.al. 2410.16395 null
2024-10-18 Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization Bin Lin et.al. 2410.14283 null
2024-10-16 MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting Yue Zhang et.al. 2410.10122 link
2024-10-15 Titanic Calling: Low Bandwidth Video Conference from the Titanic Wreck Fevziye Irem Eyiokur et.al. 2410.11434 null
2024-10-15 MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes Zhenhui Ye et.al. 2410.06734 null
2024-10-14 Character-aware audio-visual subtitling in context Jaesung Huh et.al. 2410.11068 null
2024-10-14 Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads Federico Nocentini et.al. 2410.11041 null
2024-10-14 TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model Jiazhi Guan et.al. 2410.10696 null
2024-10-14 Generative Human Video Compression with Multi-granularity Temporal Trajectory Factorization Shanzhi Yin et.al. 2410.10171 null
2024-10-10 MMHead: Towards Fine-grained Multi-modal 3D Facial Animation Sijing Wu et.al. 2410.07757 null
2024-10-09 FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model Feng Qiu et.al. 2409.13180 null
2024-10-01 LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details Jian Yang et.al. 2410.00990 null
2024-09-29 Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation Jingyi Xu et.al. 2409.19501 null
2024-09-27 Diverse Code Query Learning for Speech-Driven Facial Animation Chunzhi Gu et.al. 2409.19143 null
2024-09-26 Stable Video Portraits Mirela Ostrek et.al. 2409.18083 null
2024-09-25 ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE Sichun Wu et.al. 2409.07966 link
2024-09-24 FastTalker: Jointly Generating Speech and Conversational Gestures from Text Zixin Guo et.al. 2409.16404 null
2024-09-23 FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset Donglin Di et.al. 2410.07151 null
2024-09-23 MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning Yue Han et.al. 2409.15179 null
2024-09-18 JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation Sai Tanmay Reddy Chakkera et.al. 2409.12156 null
2024-09-18 GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations Kartik Teotia et.al. 2409.11951 null
2024-09-17 3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy Xuanmeng Sha et.al. 2409.10848 null
2024-09-16 DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis Fa-Ting Hong et.al. 2409.10281 null
2024-09-14 StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads Suzhen Wang et.al. 2409.09292 null
2024-09-11 DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures Steven Hogue et.al. 2409.07649 null
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 null
2024-09-09 PersonaTalk: Bring Attention to Your Persona in Visual Dubbing Longhao Zhang et.al. 2409.05379 null
2024-09-09 KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation Hoang-Son Vo-Thanh et.al. 2409.05330 link
2024-09-05 SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing Lingyu Xiong et.al. 2409.03605 null
2024-09-05 SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model Weipeng Tan et.al. 2409.03270 null
2024-09-04 PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation Jun Ling et.al. 2409.02657 null
2024-09-02 KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding Zhihao Xu et.al. 2409.01113 link
2024-08-28 Micro and macro facial expressions by driven animations in realistic Virtual Humans Rubens Halbig Montanha et.al. 2408.16110 null
2024-08-27 MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer Shurong Yang et.al. 2408.14975 null
2024-08-25 TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation Jack Saunders et.al. 2408.13714 null
2024-08-23 G3FA: Geometry-guided GAN for Face Animation Alireza Javanmardi et.al. 2408.13049 null
2024-08-21 AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition Minheng Ni et.al. 2408.11564 null
2024-08-21 EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention Yihong Lin et.al. 2408.11518 null
2024-08-20 DEGAS: Detailed Expressions on Full-Body Gaussian Avatars Zhijing Shao et.al. 2408.10588 null
2024-08-18 FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model Ziyu Yao et.al. 2408.09384 null
2024-08-18 Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation Xukun Zhou et.al. 2408.09357 null
2024-08-18 S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis Dongze Li et.al. 2408.09347 null
2024-08-16 GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer Yihong Lin et.al. 2408.01826 null
2024-08-14 Content and Style Aware Audio-Driven Facial Animation Qingju Liu et.al. 2408.07005 null
2024-08-12 DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation Jisoo Kim et.al. 2408.06010 null
2024-08-10 High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model Weizhi Zhong et.al. 2408.05416 null
2024-08-10 Style-Preserving Lip Sync via Audio-Aware Style Reference Weizhi Zhong et.al. 2408.05412 null
2024-08-09 DeepSpeak Dataset v1.0 Sarah Barrington et.al. 2408.05366 null
2024-08-06 ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer Jiazhi Guan et.al. 2408.03284 null
2024-08-03 Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation Jintao Tan et.al. 2408.01732 null
2024-08-03 JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model Farzaneh Jafari et.al. 2408.01627 null
2024-08-01 UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model Xiangyu Fan et.al. 2408.00762 null
2024-08-01 Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion Manuel Kansy et.al. 2408.00458 null
2024-08-01 EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head Qianyun He et.al. 2408.00297 null
2024-07-31 Deformable 3D Shape Diffusion Model Dengsheng Chen et.al. 2407.21428 null
2024-07-26 LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement Rui Zhang et.al. 2407.18595 null
2024-07-24 A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation Jose Geraldo Fernandes et.al. 2407.17430 null
2024-07-24 The impact of differences in facial features between real speakers and 3D face models on synthesized lip motions Rabab Algadhy et.al. 2407.17253 null
2024-07-22 PAV: Personalized Head Avatar from Unstructured Video Collection Akin Caliskan et.al. 2407.21047 null
2024-07-21 Anchored Diffusion for Video Face Reenactment Idan Kligvasser et.al. 2407.15153 null
2024-07-20 Text-based Talking Video Editing with Cascaded Conditional Diffusion Bo Han et.al. 2407.14841 null
2024-07-17 Universal Facial Encoding of Codec Avatars from VR Headsets Shaojie Bai et.al. 2407.13038 null
2024-07-17 EmoFace: Audio-driven Emotional 3D Face Animation Chang Liu et.al. 2407.12501 link
2024-07-13 Learning Online Scale Transformation for Talking Head Video Generation Fa-Ting Hong et.al. 2407.09965 null
2024-07-12 Real Face Video Animation Platform Xiaokai Chen et.al. 2407.18955 null
2024-07-12 One-Shot Pose-Driving Face Animation Platform He Feng et.al. 2407.08949 null
2024-07-12 EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions Zhiyuan Chen et.al. 2407.08136 null
2024-07-08 MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices Jianwen Jiang et.al. 2407.05712 null
2024-07-08 Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN Jiacheng Su et.al. 2407.05577 null
2024-07-04 Compressed Skinning for Facial Blendshapes Ladislav Kavan et.al. 2406.11597 null
2024-07-03 LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control Jianzhu Guo et.al. 2407.03168 link
2024-07-01 Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert Han EunGi et.al. 2407.01034 null
2024-06-26 RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network Xiaozhong Ji et.al. 2406.18284 null
2024-06-24 The Effects of Embodiment and Personality Expression on Learning in LLM-based Educational Agents Sinan Sonlu et.al. 2407.10993 null
2024-06-21 EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot Hao Fei et.al. 2406.15177 link
2024-06-20 MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset Kim Sung-Bin et.al. 2406.14272 null
2024-06-19 DF40: Toward Next-Generation Deepfake Detection Zhiyuan Yan et.al. 2406.13495 null
2024-06-19 AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models Ken Chen et.al. 2406.13272 null
2024-06-18 RITA: A Real-time Interactive Talking Avatars Framework Wuxinlin Cheng et.al. 2406.13093 null
2024-06-18 A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing Ming Meng et.al. 2406.10553 null
2024-06-17 NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation Niu Guanchen et.al. 2406.11259 null
2024-06-17 Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement Runyi Yu et.al. 2406.08096 null
2024-06-16 Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation Mingwang Xu et.al. 2406.08801 null
2024-06-14 DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details Haitao Cao et.al. 2405.19688 null
2024-06-13 Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Jack Merullo et.al. 2406.09519 null
2024-06-13 DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing Neha Sahipjohn et.al. 2406.08802 null
2024-06-12 Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation Jiadong Liang et.al. 2406.07895 null
2024-06-07 Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation Yue Ma et.al. 2406.01900 null
2024-06-05 Controllable Talking Face Generation by Implicit Facial Keypoints Editing Dong Zhao et.al. 2406.02880 null
2024-05-31 MunchSonic: Tracking Fine-grained Dietary Actions through Active Acoustic Sensing on Eyeglasses Saif Mahmud et.al. 2405.21004 null
2024-05-31 MegActor: Harness the Power of Raw Video for Vivid Portrait Animation Shurong Yang et.al. 2405.20851 link
2024-05-30 Audio2Rig: Artist-oriented deep learning tool for facial animation Bastien Arcelin et.al. 2405.20412 null
2024-05-28 OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance Shuheng Ge et.al. 2405.14709 null
2024-05-24 InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation Yuchi Wang et.al. 2405.15758 link
2024-05-22 Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children's Engagement in Storytelling Yibo Wang et.al. 2405.13701 null
2024-05-21 Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Yue Han et.al. 2405.12970 null
2024-05-16 Faces that Speak: Jointly Synthesising Talking Face and Speech from Text Youngjoon Jang et.al. 2405.10272 null
2024-05-14 PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset Yang Hou et.al. 2405.08838 link
2024-05-12 Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation Changpeng Cai et.al. 2405.07257 null
2024-05-10 NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior Gihoon Kim et.al. 2405.05749 null
2024-05-09 SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space Zeren Zhang et.al. 2405.05636 null
2024-05-08 Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention Ruijie Tao et.al. 2404.18501 null
2024-05-07 Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation Dogucan Yaman et.al. 2405.04327 null
2024-05-06 AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding Tao Liu et.al. 2405.03121 link
2024-04-29 EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars Nikita Drobyshev et.al. 2404.19110 null
2024-04-29 GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting Bo Chen et.al. 2404.19040 null
2024-04-29 Embedded Representation Learning Network for Animating Styled Video Portrait Tianyong Wang et.al. 2404.19038 null
2024-04-29 CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation Xiangyu Liang et.al. 2404.18604 null
2024-04-28 GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting Hongyun Yu et.al. 2404.14037 null
2024-04-25 GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting Kyusun Cho et.al. 2404.16012 link
2024-04-23 TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting Jiahe Li et.al. 2404.15264 null
2024-04-22 Learn2Talk: 3D Talking Face Learns from 2D Talking Face Yixiang Zhuang et.al. 2404.12888 null
2024-04-16 VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Sicheng Xu et.al. 2404.10667 null
2024-04-15 FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features Andre Rochow et.al. 2404.09736 null
2024-04-13 THQA: A Perceptual Quality Assessment Database for Talking Heads Yingjie Zhou et.al. 2404.09003 link
2024-04-11 EFHQ: Multi-purpose ExtremePose-Face-HQ dataset Trung Tuan Dao et.al. 2312.17205 null
2024-04-09 Deepfake Generation and Detection: A Benchmark and Survey Gan Pei et.al. 2403.17881 link
2024-04-08 SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation Heyuan Li et.al. 2404.05680 null
2024-04-07 GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets Dongjing Shan et.al. 2404.04924 null
2024-04-07 Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation Renshuai Liu et.al. 2401.01207 null
2024-04-03 MI-NeRF: Learning a Single Face NeRF from Multiple Identities Aggelina Chatziagapi et.al. 2403.19920 null
2024-04-02 EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis Shuai Tan et.al. 2404.01647 null
2024-04-02 Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation Taekyung Ki et.al. 2404.00636 null
2024-04-01 FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio Chao Xu et.al. 2403.01901 link
2024-04-01 Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation Se Jin Park et.al. 2305.19556 null
2024-03-29 Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior Jaehoon Ko et.al. 2403.20153 link
2024-03-28 MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation Seyeon Kim et.al. 2403.19144 link
2024-03-28 GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response Govind Mittal et.al. 2210.06186 link
2024-03-27 X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention You Xie et.al. 2403.15931 null
2024-03-26 Superior and Pragmatic Talking Face Generation with Teacher-Student Framework Chao Liang et.al. 2403.17883 null
2024-03-26 AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation Huawei Wei et.al. 2403.17694 link
2024-03-26 Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis Zhenhui Ye et.al. 2401.08503 null
2024-03-25 DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment Stella Bounareli et.al. 2403.17217 null
2024-03-25 AnimateMe: 4D Facial Expressions via Diffusion Models Dimitrios Gerogiannis et.al. 2403.17213 null
2024-03-25 Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework Ziyao Huang et.al. 2403.16510 link
2024-03-23 Adaptive Super Resolution For One-Shot Talking-Head Generation Luchuan Song et.al. 2403.15944 link
2024-03-22 LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example Soyeon Yoon et.al. 2403.15227 link
2024-03-22 Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing Juan Zhang et.al. 2403.11700 null
2024-03-19 EmoVOCA: Speech-Driven Emotional 3D Talking Heads Federico Nocentini et.al. 2403.12886 null
2024-03-19 ScanTalk: 3D Talking Heads from Unregistered Scans Federico Nocentini et.al. 2403.10942 null
2024-03-15 StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation Dongchan Min et.al. 2208.10922 null
2024-03-14 GAIA: Zero-shot Talking Avatar Generation Tianyu He et.al. 2311.15230 null
2024-03-13 Say Anything with Any Style Shuai Tan et.al. 2403.06363 null
2024-03-12 FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization Shuai Tan et.al. 2403.06375 null
2024-03-12 Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style Shuai Tan et.al. 2403.06365 null
2024-03-11 A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos Weixia Zhang et.al. 2403.06421 link
2024-03-05 Memories are One-to-Many Mapping Alleviators in Talking Face Generation Anni Tang et.al. 2212.05005 null
2024-03-02 G4G:A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment Juan Zhang et.al. 2402.18122 null
2024-03-01 DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder Chenpeng Du et.al. 2303.17550 null
2024-02-29 Learning a Generalized Physical Face Model From Data Lingchen Yang et.al. 2402.19477 null
2024-02-28 Context-aware Talking Face Video Generation Meidai Xuanyuan et.al. 2402.18092 null
2024-02-27 EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Linrui Tian et.al. 2402.17485 null
2024-02-27 Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis Zicheng Zhang et.al. 2402.17364 link
2024-02-26 Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields Yifei Li et.al. 2402.16599 null
2024-02-25 AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation Yasheng Sun et.al. 2402.16124 null
2024-02-21 Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters Zechen Bai et.al. 2402.13724 link
2024-02-21 StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing Gaoxiang Cong et.al. 2402.12636 null
2024-02-12 StyleLipSync: Style-based Personalized Lip-sync Video Generation Taekyung Ki et.al. 2305.00521 null
2024-02-08 DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer Zhiyuan Ma et.al. 2402.05712 link
2024-02-05 One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space Stella Bounareli et.al. 2402.03553 null
2024-02-02 EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation Guanwen Feng et.al. 2402.01422 null
2024-01-31 MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis Wenhao Guan et.al. 2312.10687 null
2024-01-30 Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance Qingcheng Zhao et.al. 2401.15687 null
2024-01-28 Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes Weifeng Liu et.al. 2401.15668 link
2024-01-27 An Implicit Physical Face Model Driven by Expression and Style Lingchen Yang et.al. 2401.15414 null
2024-01-26 Implicit Neural Representation for Physics-driven Actuated Soft Bodies Lingchen Yang et.al. 2401.14861 null
2024-01-25 SAiD: Speech-driven Blendshape Facial Animation with Diffusion Inkyu Park et.al. 2401.08655 link
2024-01-23 NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis Chongke Bi et.al. 2401.12568 null
2024-01-19 Fast Registration of Photorealistic Avatars for VR Facial Animation Chaitanya Patel et.al. 2401.11002 null
2024-01-18 Exposing Lip-syncing Deepfakes from Mouth Inconsistencies Soumyya Kanti Datta et.al. 2401.10113 null
2024-01-18 Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models Jeongsoo Choi et.al. 2306.16003 null
2024-01-16 EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model Bingyuan Zhang et.al. 2401.08049 null
2024-01-12 DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder Tao Liu et.al. 2311.01811 null
2024-01-11 Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors Jack Saunders et.al. 2401.06126 null
2024-01-11 Jump Cut Smoothing for Talking Heads Xiaojuan Wang et.al. 2401.04718 null
2024-01-08 AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation Liyang Chen et.al. 2310.07236 null
2024-01-07 Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness Sicheng Yang et.al. 2401.03476 null
2024-01-04 Expressive Speech-driven Facial Animation with controllable emotions Yutong Chen et.al. 2301.02008 link
2023-12-23 TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation Xize Cheng et.al. 2312.15197 null
2023-12-22 DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation Chenxu Zhang et.al. 2312.13578 null
2023-12-20 FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability Linze Li et.al. 2312.03775 null
2023-12-19 Learning Dense Correspondence for NeRF-Based Face Reenactment Songlin Yang et.al. 2312.10422 null
2023-12-19 Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing Yushi Lan et.al. 2312.03763 null
2023-12-18 VectorTalker: SVG Talking Face Generation with Progressive Vectorisation Hao Hu et.al. 2312.11568 null
2023-12-18 AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis Dongze Li et.al. 2312.10921 null
2023-12-18 Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation Hui Fu et.al. 2312.10877 null
2023-12-15 DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Yifeng Ma et.al. 2312.09767 null
2023-12-15 Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars Andre Rochow et.al. 2312.09750 null
2023-12-13 uTalk: Bridging the Gap Between Humans and AI Hussam Azzuni et.al. 2310.02739 null
2023-12-13 MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation Haozhe Wu et.al. 2303.09797 null
2023-12-12 GMTalker: Gaussian Mixture based Emotional talking video Portraits Yibo Xia et.al. 2312.07669 null
2023-12-12 GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance Haiming Zhang et.al. 2312.07385 null
2023-12-11 Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism Georgios Milis et.al. 2312.06613 link
2023-12-11 Study of Non-Verbal Behavior in Conversational Agents Camila Vicari Maccari et.al. 2312.06530 null
2023-12-11 DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers Aaron Mir et.al. 2312.06400 null
2023-12-11 Audio-driven Talking Face Generation by Overcoming Unintended Information Flow Dogucan Yaman et.al. 2307.09368 null
2023-12-10 DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation Fa-Ting Hong et.al. 2305.06225 link
2023-12-09 R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning Zhiling Ye et.al. 2312.05572 null
2023-12-09 FT2TF: First-Person Statement Text-To-Talking Face Generation Xingjian Diao et.al. 2312.05430 null
2023-12-08 SingingHead: A Large-scale 4D Dataset for Singing Head Animation Sijing Wu et.al. 2312.04369 null
2023-12-07 VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior Xusen Sun et.al. 2312.01841 null
2023-12-05 PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features Tianshun Han et.al. 2312.02781 null
2023-12-05 MyPortrait: Morphable Prior-Guided Personalized Portrait Generation Bo Ding et.al. 2312.02703 null
2023-12-02 DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser Peng Chen et.al. 2311.16565 null
2023-12-01 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing Balamurugan Thambiraja et.al. 2312.00870 null
2023-11-30 Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data Yu Deng et.al. 2311.18729 null
2023-11-30 Talking Head(?) Anime from a Single Image 4: Improved Model and Its Distillation Pramook Khungurn et.al. 2311.17409 null
2023-11-29 SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis Ziqiao Peng et.al. 2311.17590 link
2023-11-28 THInImg: Cross-modal Steganography for Presenting Talking Heads in Images Lin Zhao et.al. 2311.17177 null
2023-11-28 BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis Hao-Bin Duan et.al. 2311.05521 link
2023-11-28 Continuously Controllable Facial Expression Editing in Talking Face Videos Zhiyao Sun et.al. 2209.08289 null
2023-11-20 MemoryCompanion: A Smart Healthcare Solution to Empower Efficient Alzheimer's Care Via Unleashing Generative AI Lifei Zheng et.al. 2311.14730 null
2023-11-15 CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding Jianzong Wang et.al. 2311.08673 null
2023-11-13 DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation Guinan Su et.al. 2311.04766 null
2023-11-12 ChatAnything: Facetime Chat with LLM-Enhanced Personas Yilin Zhao et.al. 2311.06772 null
2023-11-08 Synthetic Speaking Children -- Why We Need Them and How to Make Them Muhammad Ali Farooq et.al. 2311.06307 null
2023-11-06 RADIO: Reference-Agnostic Dubbing Video Synthesis Dongyeun Lee et.al. 2309.01950 null
2023-11-05 3D-Aware Talking-Head Video Motion Transfer Haomiao Ni et.al. 2311.02549 null
2023-11-03 Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading Songtao Luo et.al. 2310.05058 link
2023-11-02 LaughTalk: Expressive 3D Talking Head Generation with Laughter Kim Sung-Bin et.al. 2311.00994 null
2023-11-02 High-Fidelity and Freely Controllable Talking Head Video Generation Yue Gao et.al. 2304.10168 null
2023-10-31 Breathing Life into Faces: Speech-driven 3D Facial Animation with Natural Head Pose and Detailed Shape Wei Zhao et.al. 2310.20240 null
2023-10-29 On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models Marija Ivanovska et.al. 2307.05397 null
2023-10-25 Personalized Speech-driven Expressive 3D Facial Animation Synthesis with Style Control Elif Bozkurt et.al. 2310.17011 null
2023-10-23 The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills Qingxiao Zheng et.al. 2310.15112 null
2023-10-19 Gemino: Practical and Robust Neural Compression for Video Conferencing Vibhaalakshmi Sivaraman et.al. 2209.10507 null
2023-10-17 CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation Zhaojie Chu et.al. 2310.11295 null
2023-10-15 HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation Yaosen Chen et.al. 2310.05720 link
2023-10-12 CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity Abdullah Hayajneh et.al. 2310.07969 link
2023-10-12 Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation Yuan Gan et.al. 2309.04946 link
2023-10-08 GestSync: Determining who is speaking without a talking head Sindhu B Hegde et.al. 2310.05304 link
2023-09-30 DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models Zhiyao Sun et.al. 2310.00434 null
2023-09-28 OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions Jin Liu et.al. 2309.16148 null
2023-09-26 Emotional Speech-Driven Animation with Content-Emotion Disentanglement Radek Daněček et.al. 2306.08990 null
2023-09-20 FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion Stefan Stan et.al. 2309.11306 link
2023-09-20 Context-Aware Talking-Head Video Editing Songlin Yang et.al. 2308.00462 null
2023-09-18 That's What I Said: Fully-Controllable Talking Face Generation Youngjoon Jang et.al. 2304.03275 null
2023-09-15 Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech Junjie Li et.al. 2309.08408 link
2023-09-14 DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Yaoyu Su et.al. 2309.07752 null
2023-09-14 DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks Zipeng Qi et.al. 2309.07509 null
2023-09-14 HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods Yongyuan Li et.al. 2309.07495 link
2023-09-13 PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network Qinghua Liu et.al. 2309.06723 null
2023-09-12 DF-TransFusion: Multimodal Deepfake Detection via Lip-Audio Cross-Attention and Facial Self-Attention Aaditya Kharel et.al. 2309.06511 null
2023-09-12 Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos Ekta Prashnani et.al. 2305.03713 null
2023-09-11 ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment Yicheng Zhong et.al. 2308.14448 null
2023-09-10 MaskRenderer: 3D-Infused Multi-Mask Realistic Face Reenactment Tina Behrouzi et.al. 2309.05095 null
2023-09-09 Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video Xiuzhe Wu et.al. 2309.04814 link
2023-09-01 Unsupervised Learning of Style-Aware Facial Animation from Real Acting Performances Wolfgang Paier et.al. 2306.10006 null
2023-08-30 From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications Shreyank N Gowda et.al. 2308.16041 null
2023-08-30 SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces Ziqiao Peng et.al. 2306.10799 link
2023-08-30 Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models Antoni Bigata Casademunt et.al. 2305.08854 link
2023-08-29 Papeos: Augmenting Research Papers with Talk Videos Tae Soo Kim et.al. 2308.15224 null
2023-08-25 EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation Ziqiao Peng et.al. 2303.11089 link
2023-08-24 ToonTalker: Cross-Domain Face Reenactment Yuan Gong et.al. 2308.12866 null
2023-08-24 Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Jiahe Li et.al. 2307.09323 link
2023-08-23 DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion Se Jin Park et.al. 2310.05934 null
2023-08-21 Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis Tong Sha et.al. 2109.02081 null
2023-08-18 Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization Soumik Mukhopadhyay et.al. 2308.09716 link
2023-08-18 Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation Fa-Ting Hong et.al. 2307.09906 link
2023-08-17 A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation Li Liu et.al. 2308.08849 link
2023-08-16 Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions Yuqi Sun et.al. 2306.10813 null
2023-08-12 Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation Zhichao Wang et.al. 2308.06457 link
2023-08-12 DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation Yichao Yan et.al. 2203.07931 null
2023-08-11 Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space Haoyu Wang et.al. 2308.06076 link
2023-08-11 VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer Liyang Chen et.al. 2308.04830 null
2023-08-10 Near-realtime Facial Animation by Deep 3D Simulation Super-Resolution Hyojoon Park et.al. 2305.03216 null
2023-08-02 Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis Zhenhui Ye et.al. 2306.03504 null
2023-07-29 Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation Michał Stypułkowski et.al. 2301.03396 null
2023-07-26 Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation Federico Nocentini et.al. 2306.01415 link
2023-07-20 HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces Stella Bounareli et.al. 2307.10797 link
2023-07-19 MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions Yunfei Liu et.al. 2307.10008 null
2023-07-19 Hierarchical Semantic Perceptual Listener Head Video Generation: A High-performance Pipeline Zhigang Chang et.al. 2307.09821 null
2023-07-19 OPHAvatars: One-shot Photo-realistic Head Avatars Shaoxu Li et.al. 2307.09153 link
2023-07-18 FACTS: Facial Animation Creation using the Transfer of Styles Jack Saunders et.al. 2307.09480 null
2023-07-09 Predictive Coding For Animation-Based Video Compression Goluck Konuko et.al. 2307.04187 null
2023-07-08 FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction Ganglai Wang et.al. 2307.03990 null
2023-07-05 Interactive Conversational Head Generation Mohan Zhou et.al. 2307.02090 null
2023-07-04 A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation Louis Airale et.al. 2307.03270 link
2023-07-04 Generating Animatable 3D Cartoon Faces from Single Portraits Chuanyu Pan et.al. 2307.01468 null
2023-07-03 RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations Neha Sahipjohn et.al. 2307.01233 null
2023-06-20 Audio-Driven 3D Facial Animation from In-the-Wild Videos Liying Lu et.al. 2306.11541 null
2023-06-13 Parametric Implicit Face Representation for Audio-Driven Facial Reenactment Ricong Huang et.al. 2306.07579 null
2023-06-13 AniFaceDrawing: Anime Portrait Exploration during Your Sketching Zhengyu Huang et.al. 2306.07476 null
2023-06-12 NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection Yu Chen et.al. 2306.06885 null
2023-06-10 StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles Yifeng Ma et.al. 2301.01081 link
2023-06-08 ReliableSwap: Boosting General Face Swapping Via Reliable Supervision Ge Yuan et.al. 2306.05356 link
2023-06-06 Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks Jianrong Wang et.al. 2306.03594 null
2023-06-05 Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions Shaoxu Li et.al. 2306.02903 link
2023-05-31 High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning Chao Xu et.al. 2305.02572 null
2023-05-23 CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation Jingning Xu et.al. 2305.13962 null
2023-05-22 RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars Dongwei Pan et.al. 2305.13353 link
2023-05-19 UniFLG: Unified Facial Landmark Generator from Text or Speech Kentaro Mitsui et.al. 2302.14337 null
2023-05-18 An Android Robot Head as Embodied Conversational Agent Marcel Heisler et.al. 2305.10945 null
2023-05-18 Audio-Visual Person-of-Interest DeepFake Detection Davide Cozzolino et.al. 2204.03083 link
2023-05-17 INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network Shuang Chen et.al. 2305.10589 null
2023-05-17 LPMM: Intuitive Pose Control for Neural Talking-Head Model via Landmark-Parameter Morphable Model Kwangho Lee et.al. 2305.10456 null
2023-05-15 Identity-Preserving Talking Face Generation with Landmark and Appearance Priors Weizhi Zhong et.al. 2305.08293 link
2023-05-09 Zero-shot personalized lip-to-speech synthesis with face image based voice control Zheng-Yan Sheng et.al. 2305.14359 null
2023-05-09 StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator Jiazhi Guan et.al. 2305.05445 null
2023-05-09 Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator Chao Xu et.al. 2305.02594 null
2023-05-01 StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video Lizhen Wang et.al. 2305.00942 link
2023-05-01 GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation Zhenhui Ye et.al. 2305.00787 null
2023-04-28 A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation Bo-Kyeong Kim et.al. 2304.00471 null
2023-04-27 Controllable One-Shot Face Video Synthesis With Semantic Aware Prior Kangning Liu et.al. 2304.14471 null
2023-04-25 AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head Rongjie Huang et.al. 2304.12995 link
2023-04-24 VR Facial Animation for Immersive Telepresence Avatars Andre Rochow et.al. 2304.12051 null
2023-04-21 Implicit Neural Head Synthesis via Controllable Local Deformation Fields Chuhan Chen et.al. 2304.11113 null
2023-04-20 DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation Shuai Shen et.al. 2301.03786 link
2023-04-18 Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations Rongliang Wu et.al. 2304.08945 null
2023-04-17 Autoregressive GAN for Semantic Unconditional Head Motion Generation Louis Airale et.al. 2211.00987 link
2023-04-11 One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field Weichuang Li et.al. 2304.05097 null
2023-04-06 Face Animation with an Attribute-Guided Diffusion Model Bohan Zeng et.al. 2304.03199 link
2023-04-06 4D Agnostic Real-Time Facial Animation Pipeline for Desktop Scenarios Wei Chen et.al. 2304.02814 null
2023-04-03 CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior Jinbo Xing et.al. 2301.02379 link
2023-04-01 DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance Longwen Zhang et.al. 2304.03117 null
2023-04-01 TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles Yifeng Ma et.al. 2304.00334 null
2023-03-31 FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions Jin Liu et.al. 2303.17789 null
2023-03-29 Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert Jiadong Wang et.al. 2303.17480 link
2023-03-27 OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis Hongyi Xu et.al. 2303.15539 null
2023-03-27 Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms Stevo Racković et.al. 2302.04843 null
2023-03-27 MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation Bowen Zhang et.al. 2212.08062 link
2023-03-27 A Majorization-Minimization Based Method for Nonconvex Inverse Rig Problems in Facial Animation: Algorithm Derivation Stevo Racković et.al. 2205.04289 null
2023-03-26 OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering Zhiyuan Ma et.al. 2303.14662 link
2023-03-26 Emotionally Enhanced Talking Face Generation Sahil Goyal et.al. 2303.11548 link
2023-03-26 Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation Stevo Racković et.al. 2303.06370 null
2023-03-24 Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement Siddarth Ravichandran et.al. 2209.01320 null
2023-03-23 PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° Sizhe An et.al. 2303.13071 null
2023-03-22 Style Transfer for 2D Talking Head Animation Trong-Thang Pham et.al. 2303.09799 link
2023-03-22 MARLIN: Masked Autoencoder for facial video Representation LearnINg Zhixi Cai et.al. 2211.06627 link
2023-03-14 DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions Geumbyeol Hwang et.al. 2303.07697 link
2023-03-13 SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation Wenxuan Zhang et.al. 2211.12194 link
2023-03-09 FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning Kazi Injamamul Haque et.al. 2303.05416 link
2023-03-09 Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation Qi Chen et.al. 2303.05322 link
2023-03-07 DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video Zhimeng Zhang et.al. 2303.03988 link
2023-03-05 Cyber Vaccine for Deepfake Immunity Ching-Chun Chang et.al. 2303.02659 null
2023-03-04 High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors Yunpeng Bai et.al. 2211.15064 null
2023-03-01 DPE: Disentanglement of Pose and Expression for General Video Portrait Editing Youxin Pang et.al. 2301.06281 link
2023-02-27 Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video Minsu Kim et.al. 2303.08670 null
2023-02-27 Memory-augmented Contrastive Learning for Talking Head Generation Jianrong Wang et.al. 2302.13469 link
2023-02-24 Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention Bin Liu et.al. 2302.12532 null
2023-02-16 OPT: One-shot Pose-Controllable Talking Head Generation Jin Liu et.al. 2302.08197 null
2023-02-14 Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space Trevine Oorloff et.al. 2203.14512 link
2023-01-31 GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis Zhenhui Ye et.al. 2301.13430 null
2023-01-23 Data standardization for robust lip sync Chun Wang et.al. 2202.06198 null
2023-01-20 Neural Volumetric Blendshapes: Computationally Efficient Physics-Based Facial Blendshapes Nicolas Wagner et.al. 2212.14784 null
2023-01-15 Learning Audio-Driven Viseme Dynamics for 3D Face Animation Linchao Bao et.al. 2301.06059 null
2022-12-30 Imitator: Personalized Speech-driven 3D Facial Animation Balamurugan Thambiraja et.al. 2301.00023 null
2022-12-28 All's well that FID's well? Result quality and metric scores in GAN models for lip-sychronization tasks Carina Geldhauser et.al. 2212.13810 null
2022-12-23 Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing William Brannon et.al. 2212.12137 null
2022-12-09 Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers Yasheng Sun et.al. 2212.04970 null
2022-12-07 Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors Zhentao Yu et.al. 2212.04248 null
2022-12-07 SPACE: Speech-driven Portrait Animation with Controllable Expression Siddharth Gururani et.al. 2211.09809 null
2022-11-30 Extracting Semantic Knowledge from GANs with Unsupervised Learning Jianjin Xu et.al. 2211.16710 null
2022-11-29 VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild Kun Cheng et.al. 2211.14758 null
2022-11-26 Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis Duomin Wang et.al. 2211.14506 link
2022-11-22 Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition Jiaxiang Tang et.al. 2211.12368 null
2022-11-10 On the role of Lip Articulation in Visual Speech Perception Zakaria Aldeneh et.al. 2203.10117 null
2022-11-03 SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory Se Jin Park et.al. 2211.00924 null
2022-10-21 Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection Alexandros Haliassos et.al. 2201.07131 link
2022-10-14 Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar Aolan Sun et.al. 2210.06877 null
2022-10-13 Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors Vladimir Iashin et.al. 2210.07055 link
2022-10-07 Compressing Video Calls using Synthetic Talking Heads Madhav Agarwal et.al. 2210.03692 null
2022-10-07 A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis Yichen Han et.al. 2210.03335 null
2022-10-06 Audio-Visual Face Reenactment Madhav Agarwal et.al. 2210.02755 link
2022-10-06 Finding Directions in GAN's Latent Space for Neural Face Reenactment Stella Bounareli et.al. 2202.00046 link
2022-10-04 Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale Aditya Agarwal et.al. 2208.09796 null
2022-09-29 Facial Landmark Predictions with Applications to Metaverse Qiao Han et.al. 2209.14698 link
2022-09-27 StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment Stella Bounareli et.al. 2209.13375 link
2022-09-23 EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model Xinya Ji et.al. 2205.15278 null
2022-09-21 FNeVR: Neural Volume Rendering for Face Animation Bohan Zeng et.al. 2209.10340 link
2022-09-19 AutoLV: Automatic Lecture Video Generator Wenbin Wang et.al. 2209.08795 null
2022-09-09 Talking Head from Speech Audio using a Pre-trained Image Generator Mohammed M. Alghamdi et.al. 2209.04252 null
2022-09-07 Restructurable Activation Networks Kartikeya Bhardwaj et.al. 2208.08562 link
2022-08-29 StableFace: Analyzing and Improving Motion Stability for Talking Face Generation Jun Ling et.al. 2208.13717 null
2022-08-17 Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors Sindhu B Hegde et.al. 2208.08118 link
2022-08-03 Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control Michail Christos Doukas et.al. 2208.02210 null
2022-08-02 Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer Ailin Huang et.al. 2206.12837 link
2022-08-01 A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip Shuang Chen et.al. 2208.01149 link
2022-07-27 A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing Goluck Konuko et.al. 2207.13530 null
2022-07-24 Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis Shuai Shen et.al. 2207.11770 link
2022-07-22 Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos Panagiotis P. Filntisis et.al. 2207.11094 link
2022-07-20 NARRATE: A Normal Assisted Free-View Portrait Stylizer Youjia Wang et.al. 2207.00974 null
2022-07-20 VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection Joanna Hong et.al. 2206.07458 null
2022-07-20 Responsive Listening Head Generation: A Benchmark Dataset and Baseline Mohan Zhou et.al. 2112.13548 null
2022-07-13 FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis Yongqi Wang et.al. 2207.03800 null
2022-06-29 Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs Bo-Kyeong Kim et.al. 2206.14658 null
2022-06-09 Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos Alexander Waibel et.al. 2206.04523 null
2022-05-31 Text/Speech-Driven Full-Body Animation Wenlin Zhuang et.al. 2205.15573 null
2022-05-27 Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast Boqing Zhu et.al. 2204.14057 link
2022-05-26 One-Shot Face Reenactment on Megapixels Wonjun Kang et.al. 2205.13368 null
2022-05-24 Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts Debjoy Saha et.al. 2205.12194 link
2022-05-20 MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement Alexander Richard et.al. 2104.08223 link
2022-05-13 Talking Face Generation with Multilingual TTS Hyoung-Kyu Song et.al. 2205.06421 null
2022-05-02 Emotion-Controllable Generalized Talking Face Generation Sanjana Sinha et.al. 2205.01155 null
2022-05-02 A Novel Speech-Driven Lip-Sync Model with CNN and LSTM Xiaohong Li et.al. 2205.00916 null
2022-04-27 Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion Sen Chen et.al. 2204.12756 null
2022-04-25 Fast Facial Landmark Detection and Applications: A Survey Kostiantyn Khabarlak et.al. 2101.10808 null
2022-04-13 Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions Zipeng Ye et.al. 2204.06180 null
2022-04-06 Transformer-S2A: Robust and Efficient Speech-to-Animation Liyang Chen et.al. 2111.09771 null
2022-04-03 Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text Pulkit Tandon et.al. 2106.14014 link
2022-03-30 End to End Lip Synchronization with a Temporal AutoEncoder Yoav Shalev et.al. 2203.16224 link
2022-03-29 Thin-Plate Spline Motion Model for Image Animation Jian Zhao et.al. 2203.14367 link
2022-03-17 StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN Fei Yin et.al. 2203.04036 link
2022-03-17 FaceFormer: Speech-Driven 3D Facial Animation with Transformers Yingruo Fan et.al. 2112.05329 link
2022-03-16 Efficient conditioned face animation using frontally-viewed embedding Maxime Oquab et.al. 2203.08765 null
2022-03-15 Depth-Aware Generative Adversarial Network for Talking Head Video Generation Fa-Ting Hong et.al. 2203.06605 link
2022-03-10 An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection Ganglai Wang et.al. 2203.05178 null
2022-03-08 Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild Ganglai Wang et.al. 2203.03984 null
2022-03-04 Multi-modality Deep Restoration of Extremely Compressed Face Videos Xi Zhang et.al. 2107.05548 null
2022-03-01 FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset Hasam Khalid et.al. 2108.05080 link
2022-02-25 FSGANv2: Improved Subject Agnostic Face Swapping and Reenactment Yuval Nirkin et.al. 2202.12972 null
2022-02-22 Thinking the Fusion Strategy of Multi-reference Face Reenactment Takuya Yashima et.al. 2202.10758 null
2022-01-24 Selective Listening by Synchronizing Speech with Lips Zexu Pan et.al. 2106.07150 link
2022-01-22 Text2Video: Text-driven Talking-head Video Synthesis with Personalized Phoneme-Pose Dictionary Sibo Zhang et.al. 2104.14631 null
2022-01-21 Stitch it in Time: GAN-Based Facial Editing of Real Videos Rotem Tzaban et.al. 2201.08361 link
2022-01-17 Towards Realistic Visual Dubbing with Heterogeneous Sources Tianyi Xie et.al. 2201.06260 null
2022-01-16 Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels Zipeng Ye et.al. 2201.05986 null
2022-01-03 DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering Shunyu Yao et.al. 2201.00791 null
2021-12-20 Parallel and High-Fidelity Text-to-Lip Generation Jinglin Liu et.al. 2107.06831 link
2021-12-19 Initiative Defense against Facial Manipulation Qidong Huang et.al. 2112.10098 link
2021-12-07 Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation Yingruo Fan et.al. 2112.02214 null
2021-12-06 One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning Suzhen Wang et.al. 2112.02749 null
2021-11-29 Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates Shenhan Qian et.al. 2108.08020 link
2021-11-04 FEAFA+: An Extended Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation Wei Gan et.al. 2111.02751 null
2021-11-02 BiosecurID: a multimodal biometric database Julian Fierrez et.al. 2111.03472 null
2021-10-30 Imitating Arbitrary Talking Style for Realistic Audio-DrivenTalking Face Synthesis Haozhe Wu et.al. 2111.00203 link
2021-10-26 Emotion recognition in talking-face videos using persistent entropy and neural networks Eduardo Paluzo-Hidalgo et.al. 2110.13571 link
2021-10-26 ViDA-MAN: Visual Dialog with Digital Humans Tong Shen et.al. 2110.13384 null
2021-10-22 Invertible Frowns: Video-to-Video Facial Emotion Translation Ian Magnusson et.al. 2109.08061 null
2021-10-19 Talking Head Generation with Audio and Speech Related Facial Action Units Sen Chen et.al. 2110.09951 null
2021-10-16 Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor Anchit Gupta et.al. 2110.08580 null
2021-10-12 Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment Haichao Zhang et.al. 2110.04708 null
2021-10-07 Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution Yangyang Shi et.al. 2110.05241 null
2021-09-24 Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation Yuanxun Lu et.al. 2109.10595 null
2021-09-20 Accurate, Interpretable, and Fast Animation: An Iterative, Sparse, and Nonconvex Approach Stevo Rackovic et.al. 2109.08356 null
2021-09-17 Detection of GAN-synthesized street videos Omran Alamayreh et.al. 2109.04991 null
2021-08-30 Audiovisual Speech Synthesis using Tacotron2 Ahmed Hussen Abdelaziz et.al. 2008.00620 null
2021-08-23 KoDF: A Large-scale Korean DeepFake Detection Dataset Patrick Kwon et.al. 2103.10094 null
2021-08-23 HeadGAN: One-shot Neural Head Synthesis and Editing Michail Christos Doukas et.al. 2012.08261 null
2021-08-19 AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis Yudong Guo et.al. 2103.11078 link
2021-08-18 DeepFake MNIST+: A DeepFake Facial Animation Dataset Jiajun Huang et.al. 2108.07949 link
2021-08-18 FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning Chenxu Zhang et.al. 2108.07938 link
2021-08-12 UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing Meng Cao et.al. 2108.05650 null
2021-08-11 AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person Xinsheng Wang et.al. 2108.04325 null
2021-08-06 SofGAN: A Portrait Image Generator with Dynamic Styling Anpei Chen et.al. 2007.03780 link
2021-07-27 Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations Laurent Benaroya et.al. 2107.12346 null
2021-07-21 Speech Driven Talking Face Generation from a Single Image and an Emotion Condition Sefik Emre Eskimez et.al. 2008.03592 link
2021-07-20 Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion Suzhen Wang et.al. 2107.09293 link
2021-07-10 Speech2Video: Cross-Modal Distillation for Speech to Video Generation Shijing Si et.al. 2107.04806 null
2021-07-07 Egocentric Videoconferencing Mohamed Elgharib et.al. 2107.03109 null
2021-06-08 LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization Avisek Lahiri et.al. 2106.04185 null
2021-05-20 Audio-Driven Emotional Video Portraits Xinya Ji et.al. 2104.07452 null
2021-05-07 Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation Lincheng Li et.al. 2104.07995 link
2021-05-05 A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors Ruobing Zheng et.al. 2002.08700 null
2021-04-29 Learned Spatial Representations for Few-shot Talking-Head Synthesis Moustafa Meshry et.al. 2104.14557 null
2021-04-26 One-shot Face Reenactment Using Appearance Adaptive Normalization Guangming Yao et.al. 2102.03984 null
2021-04-25 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head Qianyun Wang et.al. 2104.12051 null
2021-04-23 Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation Hang Zhou et.al. 2104.11116 null
2021-04-07 Single Source One Shot Reenactment using Weighted motion From Paired Feature Points Soumya Tripathy et.al. 2104.03117 null
2021-04-07 Everything's Talkin': Pareidolia Face Reenactment Linsen Song et.al. 2104.03061 link
2021-04-07 LI-Net: Large-Pose Identity-Preserving Face Reenactment Network Jin Liu et.al. 2104.02850 null
2021-04-02 One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing Ting-Chun Wang et.al. 2011.15126 null
2021-03-20 Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization Komal Chugh et.al. 2005.14405 link
2021-03-19 End-to-End Lip Synchronisation Based on Pattern Classification You Jin Kim et.al. 2005.08606 null
2021-03-05 Real-time RGBD-based Extended Body Pose Estimation Renat Bashirov et.al. 2103.03663 link
2021-03-03 Estimating Uniqueness of I-Vector Representation of Human Voice Erkam Sinan Tandogan et.al. 2008.11985 null
2021-02-25 MakeItTalk: Speaker-Aware Talking-Head Animation Yang Zhou et.al. 2004.12992 null
2021-02-19 One Shot Audio to Animated Video Generation Neeraj Kumar et.al. 2102.09737 null
2021-02-18 AudioVisual Speech Synthesis: A brief literature review Efthymios Georgiou et.al. 2103.03927 null
2020-12-14 Robust One Shot Audio to Video Generation Neeraj Kumar et.al. 2012.07842 null
2020-12-14 Multi Modal Adaptive Normalization for Audio to Video Generation Neeraj Kumar et.al. 2012.07304 null
2020-11-30 Adaptive Compact Attention For Few-shot Video-to-video Translation Risheng Huang et.al. 2011.14695 null
2020-11-21 Stochastic Talking Face Generation Using Latent Distribution Matching Ravindra Yadav et.al. 2011.10727 link
2020-11-21 Iterative Text-based Editing of Talking-heads Using Neural Retargeting Xinwei Yao et.al. 2011.10688 null
2020-11-09 FACEGAN: Facial Attribute Controllable rEenactment GAN Soumya Tripathy et.al. 2011.04439 null
2020-11-06 Large-scale multilingual audio visual dubbing Yi Yang et.al. 2011.03530 null
2020-11-02 Facial Keypoint Sequence Generation from Audio Prateek Manocha et.al. 2011.01114 null
2020-10-25 APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment Jiangning Zhang et.al. 2010.13017 link
2020-10-12 Intuitive Facial Animation Editing Based On A Generative RNN Framework Eloïse Berson et.al. 2010.05655 null
2020-10-05 SMILE: Semantically-guided Multi-attribute Image and Layout Editing Andrés Romero et.al. 2010.02315 link
2020-10-05 Dynamic Facial Asset and Rig Generation from a Single Scan Jiaman Li et.al. 2010.00560 null
2020-09-20 An Improved Approach of Intention Discovery with Machine Learning for POMDP-based Dialogue Management Ruturaj Raval et.al. 2009.09354 null
2020-09-18 Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks Guangming Yao et.al. 2008.07783 null
2020-09-12 DualLip: A System for Joint Lip Reading and Generation Weicong Chen et.al. 2009.05784 null
2020-09-02 Seeing wake words: Audio-visual Keyword Spotting Liliane Momeni et.al. 2009.01225 null
2020-08-29 "It took me almost 30 minutes to practice this". Performance and Production Practices in Dance Challenge Videos on TikTok Daniel Klug et.al. 2008.13040 null
2020-08-25 A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild K R Prajwal et.al. 2008.10010 null
2020-08-11 Audio- and Gaze-driven Facial Animation of Codec Avatars Alexander Richard et.al. 2008.05023 null
2020-08-04 Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract Tamás Gábor Csapó et.al. 2008.02098 link
2020-08-04 Real-Time Cleaning and Refinement of Facial Animation Signals Eloïse Berson et.al. 2008.01332 null
2020-08-02 Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos Yanhui Guo et.al. 2008.01652 null
2020-07-29 Neural Voice Puppetry: Audio-driven Facial Reenactment Justus Thies et.al. 1912.05566 link
2020-07-20 Deformable Style Transfer Sunnie S. Y. Kim et.al. 2003.11038 link
2020-07-18 A Robust Interactive Facial Animation Editing System Eloïse Berson et.al. 2007.09367 null
2020-07-16 Talking-head Generation with Rhythmic Head Motion Lele Chen et.al. 2007.08547 link
2020-07-08 Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision Abhinav Shukla et.al. 2007.04134 null
2020-06-20 Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams Huirong Huang et.al. 2006.11610 null
2020-05-27 Modality Dropout for Improved Performance-driven Talking Faces Ahmed Hussen Abdelaziz et.al. 2005.13616 null
2020-05-25 Identity-Preserving Realistic Talking Face Generation Sanjana Sinha et.al. 2005.12318 null
2020-05-22 Head2Head: Video-based Neural Head Synthesis Mohammad Rami Koujan et.al. 2005.10954 null
2020-05-16 FReeNet: Multi-Identity Face Reenactment Jiangning Zhang et.al. 1905.11805 null
2020-05-13 FaR-GAN for One-Shot Face Reenactment Hanxiang Hao et.al. 2005.06402 null
2020-05-13 Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning Hao Zhu et.al. 1812.06589 null
2020-05-11 Dancing to the Partisan Beat: A First Analysis of Political Communication on TikTok Juan Carlos Medina Serrano et.al. 2004.05478 link
2020-05-07 What comprises a good talking-head video generation?: A Survey and Benchmark Lele Chen et.al. 2005.03201 link
2020-05-04 Disentangled Speech Embeddings using Cross-modal Self-supervision Arsha Nagrani et.al. 2002.08742 null
2020-04-30 APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals Jiangning Zhang et.al. 2004.14569 null
2020-03-30 ActGAN: Flexible and Efficient One-shot Face Reenactment Ivan Kosarevych et.al. 2003.13840 null
2020-03-29 Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose Xianfang Zeng et.al. 2003.12957 null
2020-03-26 High-Accuracy Facial Depth Models derived from 3D Synthetic Data Faisal Khan et.al. 2003.06211 null
2020-03-05 Talking-Heads Attention Noam Shazeer et.al. 2003.02436 link
2020-03-05 Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose Ran Yi et.al. 2002.10137 link
2020-03-01 Towards Automatic Face-to-Face Translation Prajwal K R et.al. 2003.00418 link
2020-02-19 Speech-driven facial animation using polynomial fusion of features Triantafyllos Kefalas et.al. 1912.05833 null
2020-01-17 ICface: Interpretable and Controllable Face Reenactment Using GANs Soumya Tripathy et.al. 1904.01909 null
2019-12-20 Disentangling Style and Content in Anime Illustrations Sitao Xiang et.al. 1905.10742 null
2019-11-21 FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis Kuangxiao Gu et.al. 1911.09224 null
2019-11-19 MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets Sungjoo Ha et.al. 1911.08139 null
2019-10-28 Few-shot Video-to-Video Synthesis Ting-Chun Wang et.al. 1910.12713 null
2019-10-19 Real-Time Lip Sync for Live 2D Animation Deepali Aneja et.al. 1910.08685 link
2019-10-16 Designing Style Matching Conversational Agents Deepali Aneja et.al. 1910.07514 null
2019-10-15 A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities Deepali Aneja et.al. 1909.08766 link
2019-10-09 EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos Haipeng Zeng et.al. 1907.12918 null
2019-10-02 Animating Face using Disentangled Audio Representations Gaurav Mittal et.al. 1910.00726 null
2019-09-25 Few-Shot Adversarial Learning of Realistic Neural Talking Head Models Egor Zakharov et.al. 1905.08233 null
2019-09-06 Neural Style-Preserving Visual Dubbing Hyeongwoo Kim et.al. 1909.02518 null
2019-08-29 3D Face Pose and Animation Tracking via Eigen-Decomposition based Bayesian Approach Ngoc-Trung Tran et.al. 1908.11039 null
2019-08-20 Prosodic Phrase Alignment for Machine Dubbing Alp Öktem et.al. 1908.07226 link
2019-08-16 FSGAN: Subject Agnostic Face Swapping and Reenactment Yuval Nirkin et.al. 1908.05932 link
2019-08-11 Emotion Dependent Facial Animation from Affective Speech Rizwan Sadiq et.al. 1908.03904 null
2019-08-05 One-shot Face Reenactment Yunxuan Zhang et.al. 1908.03251 link
2019-07-25 Talking Face Generation by Conditional Recurrent Adversarial Network Yang Song et.al. 1804.04786 link
2019-07-24 Data-Driven Physical Face Inversion Yeara Kozlov et.al. 1907.10402 null
2019-07-23 A system for efficient 3D printed stop-motion face animation Rinat Abdrashitov et.al. 1907.10163 null
2019-06-14 Realistic Speech-Driven Facial Animation with GANs Konstantinos Vougioukas et.al. 1906.06337 null
2019-06-04 Text-based Editing of Talking-head Video Ohad Fried et.al. 1906.01524 null
2019-05-27 Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks Guanzhong Tian et.al. 1905.11142 null
2019-05-09 Hierarchical Cross-Modal Talking Face Generationwith Dynamic Pixel-Wise Loss Lele Chen et.al. 1905.03820 link
2019-05-08 Capture, Learning, and Synthesis of 3D Speaking Styles Daniel Cudeiro et.al. 1905.03079 link
2019-04-23 Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Hang Zhou et.al. 1807.07860 null
2019-04-02 FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation Yanfu Yan et.al. 1904.01509 null
2019-03-13 Animating an Autonomous 3D Talking Avatar Dominik Borer et.al. 1903.05448 null
2018-12-22 Deep Audio-Visual Speech Recognition Triantafyllos Afouras et.al. 1809.02108 null
2018-12-20 DeepFakes: a New Threat to Face Recognition? Assessment and Detection Pavel Korshunov et.al. 1812.08685 null
2018-11-22 Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos Ying Tai et.al. 1811.00342 link
2018-11-16 Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters Maartje M. E. Hendrikse et.al. 1812.02088 null
2018-08-28 GANimation: Anatomically-aware Facial Animation from a Single Image Albert Pumarola et.al. 1807.09251 link
2018-08-19 Dynamic Temporal Alignment of Speech to Lips Tavi Halperin et.al. 1808.06250 link
2018-07-29 ReenactGAN: Learning to Reenact Faces via Boundary Transfer Wayne Wu et.al. 1807.11079 link
2018-07-26 Learnable PINs: Cross-Modal Embeddings for Person Identity Arsha Nagrani et.al. 1805.00833 null
2018-07-19 End-to-End Speech-Driven Facial Animation with Temporal GANs Konstantinos Vougioukas et.al. 1805.09313 null
2018-05-29 Deep Video Portraits Hyeongwoo Kim et.al. 1805.11714 null
2018-05-24 VisemeNet: Audio-Driven Animator-Centric Speech Animation Yang Zhou et.al. 1805.09488 null
2018-05-21 Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks Sitao Xiang et.al. 1805.07997 null
2018-04-23 Generating Talking Face Landmarks from Speech Sefik Emre Eskimez et.al. 1803.09803 null
2018-03-28 Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network Hai X. Pham et.al. 1803.07716 null
2018-03-20 Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks Seyed Ali Jalalifar et.al. 1803.07461 null
2017-12-07 End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech Hai X. Pham et.al. 1710.00920 null
2017-12-06 ObamaNet: Photo-realistic lip-sync from text Rithesh Kumar et.al. 1801.01442 null
2017-07-30 Kernel Projection of Latent Structures Regression for Facial Animation Retargeting Christos Ouzounis et.al. 1707.09629 null
2017-07-26 Fast Deep Matting for Portrait Animation on Mobile Phone Bingke Zhu et.al. 1707.08289 null
2017-07-21 Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking Rahul Sharma et.al. 1707.06830 null
2017-07-18 You said that? Joon Son Chung et.al. 1705.02966 null
2017-01-30 Lip Reading Sentences in the Wild Joon Son Chung et.al. 1611.05358 link
2016-10-28 Galaxy gas as obscurer: II. Separating the galaxy-scale and nuclear obscurers of Active Galactic Nuclei Johannes Buchner et.al. 1610.09380 link
2016-07-11 Large-Scale MIMO is Capable of Eliminating Power-Thirsty Channel Coding for Wireless Transmission of HEVC/H.265 Video Shaoshi Yang et.al. 1601.06684 null
2016-05-22 Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression David Rim et.al. 1512.08212 null
2016-02-08 Automatic Face Reenactment Pablo Garrido et.al. 1602.02651 null
2015-11-20 ExpressionBot: An Emotive Lifelike Robotic Face for Face-to-Face Communication Ali Mollahosseini et.al. 1511.06502 null
2014-09-03 Visual Speech Recognition Ahmad B. A. Hassanat et.al. 1409.1411 null
2012-09-22 Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis Ingmar Steiner et.al. 1209.4982 null
2012-03-30 Face Expression Recognition and Analysis: The State of the Art Vinay Bettadapura et.al. 1203.6722 null
2012-01-19 Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis Ingmar Steiner et.al. 1201.4080 null
2010-03-01 Re-verification of a Lip Synchronization Protocol using Robust Reachability Piotr Kordy et.al. 1003.0431 null

(back to top)

Image Animation

Publish Date Title Authors PDF Code
2026-02-11 MotionWeaver: Holistic 4D-Anchored Framework for Multi-Humanoid Image Animation Xirui Hu et.al. 2602.13326 null
2026-01-29 DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning Mingshuang Luo et.al. 2601.21716 null
2026-01-16 CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation Shuai Tan et.al. 2601.11096 null
2026-01-06 DreamLoop: Controllable Cinemagraph Generation from a Single Photograph Aniruddha Mahapatra et.al. 2601.02646 null
2025-12-31 Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models Zhenhao Li et.al. 2512.20000 null
2025-12-30 APOLLO Blender: A Robotics Library for Visualization and Animation in Blender Peter Messina et.al. 2512.23103 null
2025-12-29 MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Kaixing Yang et.al. 2512.18181 null
2025-12-26 High-Fidelity and Long-Duration Human Image Animation with Diffusion Transformer Shen Zheng et.al. 2512.21905 null
2025-12-12 PersonaLive! Expressive Portrait Image Animation for Live Streaming Zhiyuan Li et.al. 2512.11253 null
2025-12-05 SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations Wenhao Yan et.al. 2512.05905 null
2025-12-05 Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer Rong Wang et.al. 2512.05593 null
2025-12-04 ShadowDraw: From Any Object to Shadow-Drawing Compositional Art Rundong Luo et.al. 2512.05110 null
2025-12-04 Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex Zhizhen Wu et.al. 2512.04556 null
2025-12-03 Artificial Microsaccade Compensation: Stable Vision for an Ornithopter Levi Burner et.al. 2512.03995 null
2025-12-02 PPTArena: A Benchmark for Agentic PowerPoint Editing Michael Ofengenden et.al. 2512.03042 null
2025-12-01 Know Thyself by Knowing Others: Learning Neuron Identity from Population Context Vinam Arora et.al. 2512.01199 null
2025-12-01 One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer Shijun Shi et.al. 2511.22940 null
2025-11-30 TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model Alireza Javanmardi et.al. 2512.00909 null
2025-11-29 Astro-Animation -- How Artists and Scientists Envision the Universe Laurence Arcadias et.al. 2512.00535 null
2025-11-28 MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation Yuta Oshima et.al. 2511.22989 null
2025-11-27 A Progressive Evaluation Framework for Multicultural Analysis of Story Visualization Janak Kapuriya et.al. 2511.22576 null
2025-11-27 INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts Anshul Bagaria et.al. 2511.22351 null
2025-11-25 MotionV2V: Editing Motion in a Video Ryan Burgert et.al. 2511.20640 null
2025-11-25 New York Smells: A Large Multimodal Dataset for Olfaction Ege Ozguroglu et.al. 2511.20544 null
2025-11-24 SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation Jiaming Zhang et.al. 2511.19320 null
2025-11-22 AnimAgents: Coordinating Multi-Stage Animation Pre-Production with Human-Multi-Agent Collaboration Wen-Fan Wang et.al. 2511.17906 null
2025-11-20 Motion Transfer-Enhanced StyleGAN for Generating Diverse Macaque Facial Expressions Takuya Igaue et.al. 2511.16711 null
2025-11-20 Integrating Deep Learning and Spatial Statistics in Marine Ecosystem Monitoring Gian Mario Sangiovanni et.al. 2511.16447 null
2025-11-20 How Robot Dogs See the Unseeable Oliver Bimber et.al. 2511.16262 null
2025-11-18 PFAvatar: Pose-Fusion 3D Personalized Avatar Reconstruction from Real-World Outfit-of-the-Day Photos Dianbing Xi et.al. 2511.12935 null
2025-11-16 Sketch2PoseNet: Efficient and Generalized Sketch to 3D Human Pose Prediction Li Wang et.al. 2510.26196 null
2025-11-14 EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation Zongyang Qiu et.al. 2511.11002 null
2025-11-11 OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild Yuncheng Guo et.al. 2511.08423 null
2025-11-11 oboro: Text-to-Image Synthesis on Limited Data using Flow-based Diffusion Transformer with MMH Attention Ryusuke Mizutani et.al. 2511.08168 null
2025-11-11 Beyond the Pixels: VLM-based Evaluation of Identity Preservation in Reference-Guided Synthesis Aditi Singhania et.al. 2511.08087 null
2025-11-09 Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Assaf Singer et.al. 2511.08633 null
2025-11-04 Video Text Preservation with Synthetic Text-Rich Videos Ziyang Liu et.al. 2511.05573 null
2025-11-03 FreeArt3D: Training-Free Articulated Object Generation using 3D Diffusion Chuhao Chen et.al. 2510.25765 null
2025-11-02 A Hybrid YOLOv5-SSD IoT-Based Animal Detection System for Durian Plantation Protection Anis Suttan Shahrir et.al. 2511.00777 null
2025-10-31 DANCER: Dance ANimation via Condition Enhancement and Rendering with diffusion model Yucheng Xing et.al. 2510.27169 null
2025-10-29 4-Doodle: Text to 3D Sketches that Move! Hao Chen et.al. 2510.25319 null
2025-09-19 TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection Wenkui Yang et.al. 2505.08437 null
2025-09-09 LINR Bridge: Vector Graphic Animation via Neural Implicits and Video Diffusion Priors Wenshuo Gao et.al. 2509.07484 null
2025-08-23 AnimateAnywhere: Rouse the Background in Human Image Animation Xiaoyu Liu et.al. 2504.19834 null
2025-08-13 Animate-X++: Universal Character Image Animation with Dynamic Backgrounds Shuai Tan et.al. 2508.09454 null
2025-08-10 Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers Xin Ma et.al. 2508.07246 null
2025-08-01 FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Taekyung Ki et.al. 2412.01064 null
2025-07-20 StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation Shuyuan Tu et.al. 2507.15064 null
2025-07-11 X-Dancer: Expressive Music to Human Dance Video Generation Zeyuan Chen et.al. 2502.17414 null
2025-07-01 DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution Zhe Kong et.al. 2507.01012 null
2025-06-09 Efficient Long-duration Talking Video Synthesis with Linear Diffusion Transformer under Multimodal Guidance Haojie Zhang et.al. 2411.16748 null
2025-05-30 MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation Yanbo Ding et.al. 2505.10238 null
2025-05-29 HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions Shuolin Xu et.al. 2505.22977 null
2025-05-24 EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation Qiang Qu et.al. 2503.18552 null
2025-05-18 DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation Haoyu Zhao et.al. 2503.21246 null
2025-04-20 DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance Yuxuan Luo et.al. 2504.01724 null
2025-04-15 UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer Xiang Wang et.al. 2504.11289 null
2025-04-15 Taming Consistency Distillation for Accelerated Human Image Animation Xiang Wang et.al. 2504.11143 null
2025-04-14 TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation Sunjae Yoon et.al. 2410.24037 null
2025-04-05 Multi-identity Human Image Animation with Structural Video Diffusion Zhenzhi Wang et.al. 2504.04126 null
2025-04-04 Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images In-Hwan Jin et.al. 2504.05458 null
2025-04-01 VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer Xinyu Liu et.al. 2502.05979 null
2025-03-23 MotiF: Making Text Count in Image Animation with Motion Focal Loss Shijie Wang et.al. 2412.16153 null
2025-03-13 Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer Jiahao Cui et.al. 2412.00733 null
2025-03-10 Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation Yingjie Chen et.al. 2501.05020 null
2025-03-10 VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs Moayed Haji Ali et.al. 2304.06020 null
2025-03-01 Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling Jingyun Xue et.al. 2406.03035 null
2025-02-25 DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Hongxiang Li et.al. 2412.09349 null
2025-02-24 Dormant: Defending against Pose-driven Human Image Animation Jiachen Zhou et.al. 2409.14424 null
2025-02-15 SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers Di Qiu et.al. 2502.10841 null
2025-02-10 Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance Li Hu et.al. 2502.06145 null
2025-02-06 MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Jinbo Xing et.al. 2502.04299 null
2025-01-30 Every Image Listens, Every Image Dances: Music-Driven Image Animation Zhikang Dong et.al. 2501.18801 null
2025-01-20 X-Dyna: Expressive Dynamic Human Image Animation Di Chang et.al. 2501.10021 null
2025-01-15 Joint Learning of Depth and Appearance for Portrait Image Animation Xinya Ji et.al. 2501.08649 null
2024-12-12 Animate-X: Universal Character Image Animation with Enhanced Motion Representation Shuai Tan et.al. 2410.10306 null
2024-11-30 DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses Yatian Pang et.al. 2412.00397 null
2024-11-28 JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation Xuyang Cao et.al. 2411.09209 null
2024-11-27 StableAnimator: High-Quality Identity-Preserving Human Image Animation Shuyuan Tu et.al. 2411.17697 null
2024-11-22 HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation Zhenzhi Wang et.al. 2407.17438 null
2024-11-12 LEO: Generative Latent Image Animator for Human Video Synthesis Yaohui Wang et.al. 2305.03989 null
2024-10-20 FrameBridge: Improving Image-to-Video Generation with Bridge Models Yuji Wang et.al. 2410.15371 null
2024-10-14 Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation Jiahao Cui et.al. 2410.07718 null
2024-09-30 Illustrious: an Open Advanced Illustration Model Sang Hyun Park et.al. 2409.19946 null
2024-09-29 High Quality Human Image Animation using Regional Supervision and Motion Blur Condition Zhongcong Xu et.al. 2409.19580 null
2024-07-23 Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models Xin Ma et.al. 2407.15642 null
2024-07-17 Audio-Synchronized Visual Animation Lin Zhang et.al. 2403.05659 null
2024-07-12 TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models Jeongho Kim et.al. 2407.09012 null
2024-07-12 EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions Zhiyuan Chen et.al. 2407.08136 null
2024-07-11 MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model Muyao Niu et.al. 2405.20222 null
2024-06-16 Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation Mingwang Xu et.al. 2406.08801 null
2024-06-03 UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation Xiang Wang et.al. 2406.01188 null
2024-06-01 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance Shenhao Zhu et.al. 2403.14781 null
2024-05-29 Evaluating the effectiveness of sonification in science education using Edukoi Lucrezia Guiotto Nai Fovino et.al. 2405.18908 null
2024-05-28 VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation Qilin Wang et.al. 2405.18156 null
2024-05-28 Controllable Longer Image Animation with Diffusion Models Qiang Wang et.al. 2405.17306 null
2024-05-20 Dynamic modeling of a sliding ring on an elastic rod with incremental potential formulation Weicheng Huang et.al. 2208.01238 null
2024-03-25 PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models Yiming Zhang et.al. 2312.13964 null
2024-03-13 Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts Yue Ma et.al. 2403.08268 null
2024-03-05 Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation Weijie Li et.al. 2403.02827 null
2024-01-17 Continuous Piecewise-Affine Based Motion Model for Image Animation Hexiang Wang et.al. 2401.09146 null
2024-01-03 Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions David Junhao Zhang et.al. 2401.01827 null
2023-12-08 AnimateZero: Video Diffusion Models are Zero-Shot Image Animators Jiwen Yu et.al. 2312.03793 null
2023-12-05 LivePhoto: Real Image Animation with Text-guided Motion Control Xi Chen et.al. 2312.02928 null
2023-12-04 AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance Zuozhuo Dai et.al. 2311.12886 null
2023-11-30 Motion-Conditioned Image Animation for Video Editing Wilson Yan et.al. 2311.18827 null
2023-11-27 MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model Zhongcong Xu et.al. 2311.16498 null
2023-11-27 DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Jinbo Xing et.al. 2310.12190 null
2023-11-19 Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation Peirong Liu et.al. 2110.04658 null
2023-10-16 LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation Ruiqi Wu et.al. 2310.10769 null
2023-09-26 Text-Guided Synthesis of Eulerian Cinemagraphs Aniruddha Mahapatra et.al. 2307.03190 null
2023-09-25 Automatic Animation of Hair Blowing in Still Portrait Photos Wenpeng Xiao et.al. 2309.14207 null
2023-07-10 AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning Yuwei Guo et.al. 2307.04725 link
2023-07-09 Predictive Coding For Animation-Based Video Compression Goluck Konuko et.al. 2307.04187 null
2023-03-10 3D Cinemagraphy from a Single Image Xingyi Li et.al. 2303.05724 null
2023-02-02 Dreamix: Video Diffusion Models are General Video Editors Eyal Molad et.al. 2302.01329 null
2023-01-27 Animating Still Images Kushagr Batra et.al. 2209.10497 null
2023-01-14 Continuous odor profile monitoring to study olfactory navigation in small animals Kevin S. Chen et.al. 2301.05905 null
2022-11-30 NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation Yu Yin et.al. 2211.17235 null
2022-10-05 Implicit Warping for Animation with Image Sets Arun Mallya et.al. 2210.01794 null
2022-09-28 Motion Transformer for Unsupervised Image Animation Jiale Tao et.al. 2209.14024 null
2022-07-19 Single Stage Virtual Try-on via Deformable Attention Flows Shuai Bai et.al. 2207.09161 null
2022-07-08 Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation Yucheng Suo et.al. 2207.03714 null
2022-06-11 Bayesian Statistics Guided Label Refurbishment Mechanism: Mitigating Label Noise in Medical Image Classification Mengdi Gao et.al. 2106.12284 null
2022-04-05 Neural Fields in Visual Computing and Beyond Yiheng Xie et.al. 2111.11426 null
2022-03-29 Thin-Plate Spline Motion Model for Image Animation Jian Zhao et.al. 2203.14367 null
2022-03-29 Image Animation with Perturbed Masks Yoav Shalev et.al. 2011.06922 null
2022-03-25 3D GAN Inversion for Controllable Portrait Image Animation Connor Z. Lin et.al. 2203.13441 null
2022-03-18 Latent Image Animator: Learning to Animate Images via Latent Space Navigation Yaohui Wang et.al. 2203.09043 null
2021-12-21 Image Animation with Keypoint Mask Or Toledano et.al. 2112.10457 null
2021-12-19 Move As You Like: Image Animation in E-Commerce Scenario Borun Xu et.al. 2112.13647 null
2021-12-17 AI-Empowered Persuasive Video Generation: A Survey Chang Liu et.al. 2112.09401 null
2021-10-28 Application of Time Separation Technique to Enhance C-arm CT Dynamic Liver Perfusion Imaging Hana Haseljić et.al. 2110.14318 null
2021-10-26 Incremental Learning for Animal Pose Estimation using RBF k-DPP Gaurav Kumar Nayak et.al. 2110.13598 null
2021-09-06 Sparse to Dense Motion Transfer for Face Image Animation Ruiqi Zhao et.al. 2109.00471 null
2021-08-18 DeepFake MNIST+: A DeepFake Facial Animation Dataset Jiajun Huang et.al. 2108.07949 null
2021-06-23 Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0 Adellia et.al. 2106.15342 null
2021-04-07 Single Source One Shot Reenactment using Weighted motion From Paired Feature Points Soumya Tripathy et.al. 2104.03117 null
2021-03-22 PriorityCut: Occlusion-guided Regularization for Warp-based Image Animation Wai Ting Cheung et.al. 2103.11600 null
2020-12-01 Ultra-low bitrate video conferencing using deep image animation Goluck Konuko et.al. 2012.00346 null
2020-10-01 First Order Motion Model for Image Animation Aliaksandr Siarohin et.al. 2003.00196 null
2020-08-27 Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation Yurui Ren et.al. 2008.12606 null
2019-08-30 Animating Arbitrary Objects via Deep Motion Transfer Aliaksandr Siarohin et.al. 1812.08861 null
2019-07-01 Style Generator Inversion for Image Enhancement and Animation Aviv Gabbay et.al. 1906.11880 null
2018-10-09 3D model silhouette-based tracking in depth images for puppet suit dynamic video-mapping Guillaume Caron et.al. 1810.03956 null
2018-06-24 A Design of FPGA Based Small Animal PET Real Time Digital Signal Processing and Correction Logic Jiaming Lu et.al. 1806.09117 null
2018-01-31 RAPTOR I: Time-dependent radiative transfer in arbitrary spacetimes Thomas Bronzwaer et.al. 1801.10452 null
2016-06-23 Gender and Interest Targeting for Sponsored Post Advertising at Tumblr Mihajlo Grbovic et.al. 1606.07189 null
2015-03-16 Use of Effective Audio in E-learning Courseware Kisor Ray et.al. 1503.04837 null
2015-02-04 Multimedia-Video for Learning Kah Hean Chua et.al. 1502.01090 null
2013-01-25 Measurements of Martian Dust Devil Winds with HiRISE David S. Choi et.al. 1301.6130 null
2010-01-04 Tutoring System for Dance Learning Rajkumar Kannan et.al. 1001.0440 null

(back to top)

Notes:

  • We have modified the sorting rule of the above table to prioritize papers based on the time of their latest update rather than their initial publication date. If an article has been recently modified, it will appear earlier in the list.
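The sorting rule described in this note can be illustrated with a minimal Python sketch. The `Paper` record, its field names, and the sample entries are assumptions made up for illustration; they are not taken from this repository's scrape code.

```python
from datetime import date
from typing import NamedTuple


class Paper(NamedTuple):
    # Hypothetical record layout, assumed for this example only.
    title: str
    published: date  # date of the first version
    updated: date    # date of the most recent revision


def sort_by_latest_update(papers):
    """Return papers ordered newest-first by their latest update date."""
    return sorted(papers, key=lambda p: p.updated, reverse=True)


# Placeholder entries, not real papers.
papers = [
    Paper("Paper A", date(2024, 1, 10), date(2024, 1, 10)),
    Paper("Paper B", date(2023, 6, 1), date(2025, 3, 5)),
    Paper("Paper C", date(2024, 9, 20), date(2024, 12, 1)),
]

ordered = sort_by_latest_update(papers)
titles = [p.title for p in ordered]
print(titles)
```

In this sketch, "Paper B" has the oldest publication date but comes first because it was revised most recently, which is the behavior the note describes.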

Features added:

  • Support for a more reliable text parser. Link

  • Support for rich Markdown format (better at parsing experimental tables). Link

About

🎓 Update Talking-Face Research Papers Daily
