语音/音频论文速递 2026-04-21
语音/音频论文速递 2026-04-21 共分析 34 篇论文 ⚡ 今日概览 📥 抓取 34 篇 → 🔬 深度分析完成 🏷️ 热门方向 方向 数量 分布 模型评估 13篇 █████████████ 基准测试 9篇 █████████ 音频大模型 8篇 ████████ 数据集 7篇 ███████ 多语言 7篇 ███████ 多模态模型 5篇 █████ 强化学习 5篇 █████ 语音对话系统 4篇 ████ 📊 论文评分排行榜(34 篇,按分数降序) 排名 论文 评分 🥇 FreezeEmpath: Efficient Training for Empathetic Spoken 10.0分 🥈 Audio-DeepThinker: Progressive Reasoning-Aware Reinforc 9.5分 🥉 VoxSafeBench: Not Just What Is Said, but Who, How, and 9.5分 4 Benign Fine-Tuning Breaks Safety Alignment in Audio LLM 9.0分 5 Prosody as Supervision: Bridging the Non-Verbal–Verbal 9.0分 6 Anonymization, Not Elimination: Utility-Preserved Speec 8.5分 7 MimicLM: Zero-Shot Voice Imitation through Autoregressi 8.5分 8 ArtifactNet: Detecting AI-Generated Music via Forensic 8.5分 9 Audio-Cogito: Towards Deep Audio Reasoning in Large Aud 8.5分 10 LLM-Codec: Neural Audio Codec Meets Language Model Obje 8.5分 11 NIM4-ASR: Towards Efficient, Robust, and Customizable R 8.5分 12 Video-Robin: Autoregressive Diffusion Planning for Inte 8.0分 13 A state-space representation of the boundary integral e 8.0分 14 AVRT: Audio-Visual Reasoning Transfer through Single-Mo 8.0分 15 MoVE: Translating Laughter and Tears via Mixture of Voc 8.0分 16 SELF-EMO: Emotional Self-Evolution from Recognition to 8.0分 17 BhashaSutra: A Task-Centric Unified Survey of Indian NL 8.0分 18 MINT-Bench: A Comprehensive Multilingual Benchmark for 8.0分 19 ICLAD: In-Context Learning with Comparison-Guidance for 7.5分 20 Still Between Us? Evaluating and Improving Voice Assist 7.5分 21 Where Do Self-Supervised Speech Models Become Unfair? 7.5分 22 Neural Encoding Detection is Not All You Need for Synth 7.5分 23 Omni-Embed-Audio: Leveraging Multimodal LLMs for Robust 7.5分 24 Latent Fourier Transform 7.5分 25 Hard to Be Heard: Phoneme-Level ASR Analysis of Phonolo 7.5分 26 VIBE: Voice-Induced open-ended Bias Evaluation for Larg 7.5分 27 Aligning Language Models for Lyric-to-Melody Generation 7.5分 28 ClariCodec: Optimising Neural Speech Codes for 200bps C 7.0分 29 From Reactive to Proactive: Assessing the Proactivity o 7.0分 30 A novel LSTM music generator based on the fractional ti 6.5分 31 Incremental learning for audio classification with Hebb 6.5分 32 Coexisting Tempo Traditions in Beethoven’s Piano and Ce 6.0分 33 FLiP: Towards understanding and interpreting multimodal 5.5分 34 HCFD: A Benchmark for Audio Deepfake Detection in Healt 5.0分 📋 论文列表 🥇 FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs 🔥 10.0分 | #语音对话系统 #多模态模型 #迁移学习 #语音情感识别 | arxiv ...