Bias Diagnosis | Speech/Music/Audio Paper Digest

Speech/Music/Audio Paper Digest 2026-05-12
39 papers analyzed in total

⚡ Today's Overview
📥 Fetched 39 papers → 🔬 In-depth analysis complete

🏷️ Hot Topics
Topic | Count | Distribution
#speech-recognition | 3 | ███
#music-generation | 2 | ██
#speech-synthesis | 2 | ██
#speech-enhancement | 2 | ██
#audio-deepfake-detection | 2 | ██
#benchmark | 2 | ██
#speech-quality-assessment | 1 | █
#audio-coding | 1 | █

📊 Paper Score Ranking (39 papers, sorted by score in descending order)
Rank | Paper | Score | Tier | Main task
🥇 | Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Mus | 7.5 | Top 30% | #music-generation
🥈 | PoDAR: Power-Disentangled Audio Representation for Gene | 7.3 | Top 25% | #speech-synthesis
🥉 | Evaluating the Expressive Appropriateness of Speech in | 7.2 | Top 25% | #speech-quality-assessment
4 | Reducing Linguistic Hallucination in LM-Based Speech En | 7.2 | Top 25% | #speech-enhancement
5 | Encoding and Decoding Temporal Signals with Spiking Ban | 7.0 | Top 25% | #audio-coding
6 | Mitigating Multimodal Inconsistency via Cognitive Dual- | 7.0 | Top 50% | #intent-recognition
7 | SF-Flow: Sound field magnitude estimation via flow matc | 6.8 | Top 25% | #spatial-audio
8 | Probing Cross-modal Information Hubs in Audio-Visual LL | 6.5 | Top 25% | #model-analysis
9 | Towards Trustworthy Audio Deepfake Detection: A Systema | 6.5 | Top 25% | #audio-deepfake-detection
10 | Unison: Harmonizing Motion, Speech, and Sound for Human | 6.5 | Top 30% | #audio-visual-generation
11 | CORTEG: Foundation Models Enable Cross-Modality Represe | 6.5 | Top 25% | #brain-computer-interface
12 | Omni-Persona: Systematic Benchmarking and Improving Omn | 6.5 | Top 25% | #benchmark
13 | DiffVQE: Hybrid Diffusion Voice Quality Enhancement Und | 6.2 | Top 30% | #speech-enhancement
14 | A Cold Diffusion Approach for Percussive Dereverberatio | 6.2 | Top 35% | #audio-restoration
15 | APEX: Audio Prototype EXplanations for Classification T | 6.2 | Top 25% | #audio-classification
16 | How Should LLMs Listen While Speaking? A Study of User- | 6.0 | Top 25% | #spoken-dialogue-systems
17 | RADAR Challenge 2026: Robust Audio Deepfake Recognition | 6.0 | Top 50% | #audio-deepfake-detection
18 | ShipEcho – An Interactive Tool for Global Mapping of U | 6.0 | Top 25% | #underwater-acoustics
19 | Rethinking Entropy Minimization in Test-Time Adaptation | 6.0 | Top 40% | #speech-recognition
20 | Separate First, Fuse Later: Mitigating Cross-Modal Inte | 6.0 | Top 50% | #audio-visual-question-answering
21 | ChladniSonify: A Visual-Acoustic Mapping Method for Chl | 6.0 | Top 50% | #audio-generation
22 | Omni-DeepSearch: A Benchmark for Audio-Driven Omni-Moda | 6.0 | Top 25% | #benchmark
23 | Online Segmented Beamforming via Dynamic Programming | 6.0 | Top 25% | #sound-source-localization
24 | FLARE: Full-Modality Long-Video Audiovisual Retrieval B | 6.0 | Top 25% | #audio-retrieval
25 | Speech-based Psychological Crisis Assessment using LLMs | 5.8 | Top 25% | #speech-emotion-recognition
26 | EAR: Enhancing Uni-Modal Representations for Weakly Sup | 5.8 | Top 25% | #audio-event-detection
27 | Kinetic-Optimal Scheduling with Moment Correction for M | 5.5 | Top 50% | #speech-synthesis
28 | Dolphin-CN-Dialect: Where Chinese Dialects Matter | 5.5 | Top 50% | #speech-recognition
29 | Latent Secret Spin: Keyed Orthogonal Rotations for Blin | 5.5 | Top 50% | #audio-watermarking
30 | Bangla-WhisperDiar: Fine-Tuning Whisper and PyAnnote fo | 5.5 | Top 50% | #speech-recognition #speaker-diarization
31 | Remix the Timbre: Diffusion-Based Style Transfer Across | 5.5 | Top 30% | #timbre-transfer
32 | Low-Cost Detection of Degraded Voice Clones via Source- | 5.3 | Top 50% | #speech-forgery-detection
33 | Single-Microphone Audio Point Source Discriminative Loc | 5.0 | Top 50% | #speaker-separation
34 | Responsible Benchmarking of Fairness for Automatic Spee | 5.0 | Top 50% | #speech-recognition
35 | Sub-JEPA: Subspace Gaussian Regularization for Stable E | 5.0 | Top 50% | #world-models
36 | AllocMV: Optimal Resource Allocation for Music Video Ge | 4.8 | Top 50% | #music-video-generation
37 | Multi-layer attentive probing improves transfer of audi | 4.0 | Above average | #bioacoustics #audio-classification
38 | Drum Synthesis from Expressive Drum Grids via Neural Au | 4.0 | Top 50% | #music-generation
39 | Voice Biomarkers for Depression and Anxiety | 1.0 | Bottom 50% | #voice-biomarkers

📋 Paper List
🥇 Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
✅ 7.5/10 | Top 30% | #music-generation | #diffusion-models | #attention-mechanism #zero-shot | arxiv ...