Reasoning LLM Improves Speaker Recognition in Long-form TV Dramas
📄 Reasoning LLM Improves Speaker Recognition in Long-form TV Dramas ✅ 7.0/10 | top 50% | arXiv ← Back to 2026-05-23 Speech/Music/Audio Paper Digest
📄 ReGen: Hierarchical Multi-Prompt Representation Generation for Efficient Waveform Diffusion Models ✅ 7.0/10 | top 50% | arXiv
📄 REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation ✅ 6.5/10 | top 50% | arXiv
📄 Rethinking Attention in Spiking Transformers: Overcoming Density Bias with Set Similarity ✅ 6.5/10 | top 50% | arXiv
📄 Robust Signal Enhancement via Fractional Detail Views and Knowledge Guided Multi-view Fusion ✅ 7.5/10 | top 25% | arXiv
📄 S3Audio: Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer ✅ 6.5/10 | top 50% | arXiv
📄 SALSA-V: Shortcut-Augmented Long-form Synchronized Audio from Videos ✅ 6.5/10 | top 50% | arXiv
📄 SAM Audio: Segment Anything in Audio ✅ 6.5/10 | top 50% | arXiv
📄 SARSteer: Safeguarding Large Audio Language Models via Safe-Ablated Refusal Steering ✅ 7.5/10 | top 25% | arXiv
📄 Scaling Laws in Model Fine-tuning for Audio DeepFake Detection 📝 5.0/10 | bottom 50% | arXiv