<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>语音活动检测 on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E6%B4%BB%E5%8A%A8%E6%A3%80%E6%B5%8B/</link>
    <description>Recent content in 语音活动检测 on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E6%B4%BB%E5%8A%A8%E6%A3%80%E6%B5%8B/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Automatic Estimation of Speaker Diarization Error Rate Based on Features of Audio Quality and Speaker Discriminability</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-automatic-estimation-of-speaker-diarization-error/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-automatic-estimation-of-speaker-diarization-error/</guid>
      <description>说话人分离 | 7.5/10</description>
    </item>
    <item>
      <title>Dual Data Scaling for Robust Two-Stage User-Defined Keyword Spotting</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dual-data-scaling-for-robust-two-stage-user/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dual-data-scaling-for-robust-two-stage-user/</guid>
      <description>语音活动检测 | 7.5/10</description>
    </item>
    <item>
      <title>EdgeSpot: Efficient and High-Performance Few-Shot Model for Keyword Spotting</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-edgespot-efficient-and-high-performance-few-shot/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-edgespot-efficient-and-high-performance-few-shot/</guid>
      <description>语音活动检测 | 7.5/10</description>
    </item>
    <item>
      <title>EEND-SAA: Enrollment-Less Main Speaker Voice Activity Detection Using Self-Attention Attractors</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-eend-saa-enrollment-less-main-speaker-voice/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-eend-saa-enrollment-less-main-speaker-voice/</guid>
      <description>语音活动检测 | 7.5/10</description>
    </item>
    <item>
      <title>Enhancing Dialogue-Related Speech Tasks with Generated Spoken Dialogues</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-dialogue-related-speech-tasks-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-dialogue-related-speech-tasks-with/</guid>
      <description>语音对话系统 | 6.5/10</description>
    </item>
    <item>
      <title>From Diet to Free Lunch: Estimating Auxiliary Signal Properties Using Dynamic Pruning Masks in Speech Enhancement Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-diet-to-free-lunch-estimating-auxiliary/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-diet-to-free-lunch-estimating-auxiliary/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>ICASSP 2026 - 语音活动检测 论文列表</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-068/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-068/</guid>
      <description>共 5 篇 ICASSP 2026 语音活动检测 方向论文</description>
    </item>
    <item>
      <title>Lingometer: On-Device Personal Speech Word Counting System</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lingometer-on-device-personal-speech-word/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lingometer-on-device-personal-speech-word/</guid>
      <description>语音活动检测 | 8.0/10</description>
    </item>
    <item>
      <title>Spatially Aware Self-Supervised Models for Multi-Channel Neural Speaker Diarization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatially-aware-self-supervised-models-for-multi/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatially-aware-self-supervised-models-for-multi/</guid>
      <description>说话人分离 | 8.0/10</description>
    </item>
    <item>
      <title>SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synparaspeech-automated-synthesis-of/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synparaspeech-automated-synthesis-of/</guid>
      <description>语音合成 | 7.5/10</description>
    </item>
    <item>
      <title>The Role of Prosodic and Lexical Cues in Turn-Taking with Self-Supervised Speech Representations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-role-of-prosodic-and-lexical-cues-in-turn/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-role-of-prosodic-and-lexical-cues-in-turn/</guid>
      <description>语音对话系统 | 7.5/10</description>
    </item>
    <item>
      <title>TVP-UNet: Threshold Variance Penalty U-Net for Voice Activity Detection in Dysarthric Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tvp-unet-threshold-variance-penalty-u-net-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tvp-unet-threshold-variance-penalty-u-net-for/</guid>
      <description>语音活动检测 | 7.0/10</description>
    </item>
    <item>
      <title>Aligning Stuttered-Speech Research with End-User Needs: Scoping Review, Survey, and Guidelines</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-aligning-stuttered-speech-research-with-end-user/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-aligning-stuttered-speech-research-with-end-user/</guid>
      <description>1.  **Problem**: Current research on stuttered-speech technology is systematically disconnected from the actual needs of people who stutter (PWS) and speech-language pathologists (SLPs); research priorities, task definitions, and evaluation methods are insufficiently user-centered. 2.  **Core method**: A two-part combined analysis: 1) a scoping review of 228 relevant papers, proposing a taxonomy of research tasks and analyzing the state of the field; 2) a survey of 70 stakeholders</description>
    </item>
  </channel>
</rss>
