<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>U-Net on Speech/Audio Paper Digest</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/u-net/</link>
    <description>Recent content in U-Net on Speech/Audio Paper Digest</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/u-net/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Bridging the Front-End and Back-End for Robust ASR via Cross-Attention-Based U-Net</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bridging-the-front-end-and-back-end-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bridging-the-front-end-and-back-end-for-robust/</guid>
      <description>Speech Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Diff-vs: Efficient Audio-Aware Diffusion U-Net for Vocals Separation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-diff-vs-efficient-audio-aware-diffusion-u-net-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-diff-vs-efficient-audio-aware-diffusion-u-net-for/</guid>
      <description>Speech Separation | 7.5/10</description>
    </item>
    <item>
      <title>FUN-SSL: Full-Band Layer Followed by U-Net With Narrow-Band Layers for Multiple Moving Sound Source Localization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fun-ssl-full-band-layer-followed-by-u-net-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fun-ssl-full-band-layer-followed-by-u-net-with/</guid>
      <description>Sound Source Localization | 8.0/10</description>
    </item>
    <item>
      <title>TVP-UNet: Threshold Variance Penalty U-Net for Voice Activity Detection in Dysarthric Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tvp-unet-threshold-variance-penalty-u-net-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tvp-unet-threshold-variance-penalty-u-net-for/</guid>
      <description>Voice Activity Detection | 7.0/10</description>
    </item>
    <item>
      <title>ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-artifactnet-detecting-ai-generated-music-via/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-artifactnet-detecting-ai-generated-music-via/</guid>
      <description>This paper targets the poor generalization that plagues AI-generated music detection. Current mainstream methods (e.g., CLAM, SpecTTTra) learn the acoustic signatures of AI music and degrade sharply when facing unseen generators. The authors propose a core hypothesis: today's mainstream AI music generators (e.g., Suno, Udio) all rely on the residual vector quantization of neural audio codecs (e.g., EnCodec)</description>
    </item>
    <item>
      <title>Speech/Audio Paper Digest 2026-04-21</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21/</guid>
      <description>Analyzed 34 speech/AI papers in total</description>
    </item>
  </channel>
</rss>
