Scaling Transformers for End-to-End Discrete Audio Tokenization
📄 Scaling Transformers for End-to-End Discrete Audio Tokenization ✅ 6.0/10 | top 50% | arxiv ← Back to the 2026-05-23 Speech/Music/Audio Paper Digest
📄 Self-Guidance: Enhancing Neural Codecs via Decoder Manifold Alignment ✅ 7.5/10 | top 25% | arxiv
📄 Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis ✅ 7.5/10 | top 25% | arxiv
📄 Simultaneous Speech-to-Speech Translation Without Aligned Data ✅ 7.5/10 | top 25% | arxiv
📄 SONAR: Spectral‑Contrastive Audio Residuals for Generalizable Deepfake Detection 📝 4.0/10 | bottom 50% | arxiv
📄 SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering ✅ 7.0/10 | top 50% | arxiv
📄 Sparse Autoencoders for Interpretable Emotion Control in Text-to-Speech ✅ 6.5/10 | top 50% | arxiv
📄 Sparse Tokens Suffice: Jailbreaking Audio Language Models via Token-Aware Gradient Optimization 📝 5.5/10 | top 50% | arxiv
📄 SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations ✅ 7.2/10 | top 50% | arxiv
📄 Speech-Audio Compositional Attacks on Multimodal LLMs and Their Defense with SALMONN-Guard ✅ 7.5/10 | top 25% | arxiv