Spherical Procrustes Alignment for Reliable Medical Audio Diagnosis
📄 Spherical Procrustes Alignment for Reliable Medical Audio Diagnosis ✅ 7.5/10 | Top 25% | arxiv ← Back to 2026-05-23 Speech/Music/Audio Paper Digest
📄 STAR-VAE: Structured Topology-Aware Regularization for Audio Reconstruction and Generation ✅ 7.5/10 | Top 25% | arxiv
📄 STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits 📝 3.5/10 | Bottom 50% | arxiv
📄 Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage ✅ 7.5/10 | Top 25% | arxiv
📄 T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation ✅ 6.5/10 | Top 50% | arxiv
📄 TextME: Bridging Unseen Modalities Through Text Descriptions ✅ 7.0/10 | Top 50% | arxiv
📄 The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning ✅ 7.0/10 | Top 50% | arxiv
📄 TMD-Bench: A Multi-Level Evaluation Paradigm for Music–Dance Co-Generation ✅ 7.0/10 | Top 50% | arxiv
📄 Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition 📝 4.5/10 | Bottom 50% | arxiv