BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps

📄 BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps 🔥 8.0/10 | top 25% | arxiv ← Back to the 2026-05-23 Speech/Music/Audio Paper Digest

2026-05-23 · Updated 2026-06-19 · 1 min · 21 words

Bioacoustic Geolocation: Species Sounds as Geographic Signals

📄 Bioacoustic Geolocation: Species Sounds as Geographic Signals ✅ 7.2/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 18 words

Bridging the Stability-Expressivity Gap: Synthetic Data Scaling and Preference Alignment for Low-Resource Spoken Language Models

📄 Bridging the Stability-Expressivity Gap: Synthetic Data Scaling and Preference Alignment for Low-Resource Spoken Language Models ✅ 7.3/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 26 words

Bridging Your Imagination with Audio-Video Generation via a Unified Director

📄 Bridging Your Imagination with Audio-Video Generation via a Unified Director ✅ 7.0/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 21 words

Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling

📄 Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling ✅ 6.5/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 21 words

CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

📄 CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction 🔥 8.2/10 | top 25% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 20 words

CoCoEmo: Composable and Controllable Human-Like Emotional TTS via Activation Steering

📄 CoCoEmo: Composable and Controllable Human-Like Emotional TTS via Activation Steering ✅ 6.5/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 21 words

CoLA: Cross-Modal Low-rank Adaptation for Multimodal Downstream Tasks

📄 CoLA: Cross-Modal Low-rank Adaptation for Multimodal Downstream Tasks ✅ 6.5/10 | top 50% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 19 words

Convex Low-resource Accent-Robust Language Detection in Speech Recognition

📄 Convex Low-resource Accent-Robust Language Detection in Speech Recognition #convex-optimization #speech-recognition #language-detection #low-resource #accent-robustness #ADMM ✅ 7.5/10 | top 25% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 33 words

DiscoForcing: A Unified Framework for Real-Time Audio-Driven Character Control with Diffusion Forcing

📄 DiscoForcing: A Unified Framework for Real-Time Audio-Driven Character Control with Diffusion Forcing 🔥 8.2/10 | top 25% | arxiv

2026-05-23 · Updated 2026-06-19 · 1 min · 23 words