<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>鲁棒性 on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E9%B2%81%E6%A3%92%E6%80%A7/</link>
    <description>Recent content in 鲁棒性 on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E9%B2%81%E6%A3%92%E6%80%A7/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Feature-Optimized Audio Watermarking Algorithm with Adaptive Embedding Strength</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-feature-optimized-audio-watermarking-algorithm/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-feature-optimized-audio-watermarking-algorithm/</guid>
      <description>音频安全 | 7.5/10</description>
    </item>
    <item>
      <title>A Framework for Controlled Multi-Speaker Audio Synthesis for Robustness Evaluation of Speaker Diarisation Systems</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-framework-for-controlled-multi-speaker-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-framework-for-controlled-multi-speaker-audio/</guid>
      <description>说话人日志 | 7.5/10</description>
    </item>
    <item>
      <title>A Robust KNN Approach for Multi-Class Laryngeal Disease Detection using MFCC Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-robust-knn-approach-for-multi-class-laryngeal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-robust-knn-approach-for-multi-class-laryngeal/</guid>
      <description>音频分类 | 7.5/10</description>
    </item>
    <item>
      <title>A Superb-Style Benchmark of Self-Supervised Speech Models for Audio Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-superb-style-benchmark-of-self-supervised/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-superb-style-benchmark-of-self-supervised/</guid>
      <description>音频深度伪造检测 | 7.0/10</description>
    </item>
    <item>
      <title>A Unified SVD-Modal Solution for Sparse Sound Field Reconstruction with Hybrid Spherical-Linear Microphone Arrays</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-unified-svd-modal-solution-for-sparse-sound/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-unified-svd-modal-solution-for-sparse-sound/</guid>
      <description>声源定位 | 6.5/10</description>
    </item>
    <item>
      <title>AccLID: Accent-aware Language Identification for Robust Multilingual Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acclid-accent-aware-language-identification-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acclid-accent-aware-language-identification-for/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>Adaptive Per-Channel Energy Normalization Front-End for Robust Audio Signal Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-per-channel-energy-normalization-front/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-per-channel-energy-normalization-front/</guid>
      <description>音频分类 | 7.5/10</description>
    </item>
    <item>
      <title>Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-addressing-gradient-misalignment-in-data/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-addressing-gradient-misalignment-in-data/</guid>
      <description>语音伪造检测 | 7.0/10</description>
    </item>
    <item>
      <title>Advanced modeling of interlanguage speech intelligibility benefit with L1-L2 multi-task learning using differentiable K-means for accent-robust discrete token-based ASR</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-advanced-modeling-of-interlanguage-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-advanced-modeling-of-interlanguage-speech/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>Adversarial Defense via Generative Speech Enhancement Module</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-defense-via-generative-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-defense-via-generative-speech/</guid>
      <description>语音增强 对抗防御 | 7.5/10</description>
    </item>
    <item>
      <title>Adversarial Fine-Tuning on Speech Foundation Model with Vulnerable Attention Consistency Regularization for Robust Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-fine-tuning-on-speech-foundation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-fine-tuning-on-speech-foundation/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>AFT: An Exemplar-Free Class Incremental Learning Method for Environmental Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</guid>
      <description>音频分类 | 7.0/10</description>
    </item>
    <item>
      <title>AI-Generated Music Detection in Broadcast Monitoring</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ai-generated-music-detection-in-broadcast/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ai-generated-music-detection-in-broadcast/</guid>
      <description>音频深度伪造检测 | 7.0/10</description>
    </item>
    <item>
      <title>AMBER2: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-amber2-dual-ambiguity-aware-emotion-recognition/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-amber2-dual-ambiguity-aware-emotion-recognition/</guid>
      <description>语音情感识别 | 8.0/10</description>
    </item>
    <item>
      <title>AmbiDrop: Array-Agnostic Speech Enhancement Using Ambisonics Encoding and Dropout-Based Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ambidrop-array-agnostic-speech-enhancement-using/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ambidrop-array-agnostic-speech-enhancement-using/</guid>
      <description>语音增强 | 7.0/10</description>
    </item>
    <item>
      <title>AnyAccomp: Generalizable Accompaniment Generation Via Quantized Melodic Bottleneck</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-anyaccomp-generalizable-accompaniment-generation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-anyaccomp-generalizable-accompaniment-generation/</guid>
      <description>音乐生成 | 8.0/10</description>
    </item>
    <item>
      <title>AnyRIR: Robust Non-Intrusive Room Impulse Response Estimation in the Wild</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-anyrir-robust-non-intrusive-room-impulse-response/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-anyrir-robust-non-intrusive-room-impulse-response/</guid>
      <description>空间音频 | 7.0/10</description>
    </item>
    <item>
      <title>AQUA-Bench: Beyond finding answers to knowing when there are None in Audio Question Answering</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aqua-bench-beyond-finding-answers-to-knowing-when/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aqua-bench-beyond-finding-answers-to-knowing-when/</guid>
      <description>音频问答 | 7.0/10</description>
    </item>
    <item>
      <title>Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-are-modern-speech-enhancement-systems-vulnerable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-are-modern-speech-enhancement-systems-vulnerable/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>Audio Classification Models are Vulnerable to Filter Perturbations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-classification-models-are-vulnerable-to/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-classification-models-are-vulnerable-to/</guid>
      <description>音频分类 | 7.5/10</description>
    </item>
    <item>
      <title>Audio Deepfake Detection at the First Greeting: &#34;Hi!&#34;</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-deepfake-detection-at-the-first-greeting-hi/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-deepfake-detection-at-the-first-greeting-hi/</guid>
      <description>音频深度伪造检测 | 7.5/10</description>
    </item>
    <item>
      <title>AudioFuse: Unified Spectral-Temporal Learning Via A Hybrid VIT-1D CNN Architecture for Phonocardiogram Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiofuse-unified-spectral-temporal-learning-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiofuse-unified-spectral-temporal-learning-via/</guid>
      <description>音频分类 | 7.5/10</description>
    </item>
    <item>
      <title>AURA: A Stegaformer-Based Scalable Deep Audio Watermark with Extreme Robustness</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aura-a-stegaformer-based-scalable-deep-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aura-a-stegaformer-based-scalable-deep-audio/</guid>
      <description>音频水印 | 7.5/10</description>
    </item>
    <item>
      <title>Auxiliary Multi-Label Training For Improving the Robustness of Audio Deepfake Detection on AI-Processed Data</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-auxiliary-multi-label-training-for-improving-the/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-auxiliary-multi-label-training-for-improving-the/</guid>
      <description>音频深度伪造检测 | 6.5/10</description>
    </item>
    <item>
      <title>AVATAR: Audio-Visual Adaptive Fusion via Trained Agent Reinforcement for Multimodal Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-avatar-audio-visual-adaptive-fusion-via-trained/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-avatar-audio-visual-adaptive-fusion-via-trained/</guid>
      <description>音频深度伪造检测 | 7.5/10</description>
    </item>
    <item>
      <title>Bloodroot: When Watermarking Turns Poisonous for Stealthy Backdoor</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bloodroot-when-watermarking-turns-poisonous-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bloodroot-when-watermarking-turns-poisonous-for/</guid>
      <description>音频安全 | 7.5/10</description>
    </item>
    <item>
      <title>Brainprint-Modulated Target Speaker Extraction</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-brainprint-modulated-target-speaker-extraction/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-brainprint-modulated-target-speaker-extraction/</guid>
      <description>语音分离 | 8.0/10</description>
    </item>
    <item>
      <title>Bridging the Front-End and Back-End for Robust ASR via Cross-Attention-Based U-Net</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bridging-the-front-end-and-back-end-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bridging-the-front-end-and-back-end-for-robust/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>CaMoD: Causal-Aware Modality Denoising for Multimodal Dialogue Intent Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-camod-causal-aware-modality-denoising-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-camod-causal-aware-modality-denoising-for/</guid>
      <description>多模态对话意图识别 | 7.5/10</description>
    </item>
    <item>
      <title>Condition-Invariant fMRI decoding of speech intelligibility with deep state space model</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-condition-invariant-fmri-decoding-of-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-condition-invariant-fmri-decoding-of-speech/</guid>
      <description>神经解码 | 7.0/10</description>
    </item>
    <item>
      <title>Confidence-Guided Error Correction for Disordered Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-confidence-guided-error-correction-for-disordered/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-confidence-guided-error-correction-for-disordered/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Content Leakage in Librispeech and its Impact on the Privacy Evaluation of Speaker Anonymization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-content-leakage-in-librispeech-and-its-impact-on/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-content-leakage-in-librispeech-and-its-impact-on/</guid>
      <description>语音匿名化 | 7.5/10</description>
    </item>
    <item>
      <title>Content-Preserving Speech Representation Learning Via Adaptive Segment-Level Alignment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-content-preserving-speech-representation-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-content-preserving-speech-representation-learning/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Context-Aware Dynamic Graph Learning for Multimodal Emotion Recognition with Missing Modalities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-context-aware-dynamic-graph-learning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-context-aware-dynamic-graph-learning-for/</guid>
      <description>语音情感识别 | 8.8/10</description>
    </item>
    <item>
      <title>Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-contextual-biasing-for-asr-in-speech-llm-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-contextual-biasing-for-asr-in-speech-llm-with/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>Cooperative Multi-Agent Reinforcement Learning for Adaptive Aggregation in Semi-Supervised Federated Learning with non-IID Data</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cooperative-multi-agent-reinforcement-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cooperative-multi-agent-reinforcement-learning/</guid>
      <description>联邦学习 | 7.0/10</description>
    </item>
    <item>
      <title>Coupling Acoustic Geometry and Visual Semantics for Robust Depth Estimation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-coupling-acoustic-geometry-and-visual-semantics/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-coupling-acoustic-geometry-and-visual-semantics/</guid>
      <description>空间音频 | 7.5/10</description>
    </item>
    <item>
      <title>Cross-Modal Bottleneck Fusion for Noise Robust Audio-Visual Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cross-modal-bottleneck-fusion-for-noise-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cross-modal-bottleneck-fusion-for-noise-robust/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Dissecting Performance Degradation in Audio Source Separation under Sampling Frequency Mismatch</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dissecting-performance-degradation-in-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dissecting-performance-degradation-in-audio/</guid>
      <description>音乐源分离 | 7.5/10</description>
    </item>
    <item>
      <title>DOMA: Leveraging Diffusion Language Models with Adaptive Prior for Intent Classification and Slot Filling</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-doma-leveraging-diffusion-language-models-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-doma-leveraging-diffusion-language-models-with/</guid>
      <description>语音对话系统 | 8.5/10</description>
    </item>
    <item>
      <title>DSRMS-TransUnet: A Decentralized Non-Shifted Transunet for Shallow Water Acoustic Source Range Estimation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dsrms-transunet-a-decentralized-non-shifted/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dsrms-transunet-a-decentralized-non-shifted/</guid>
      <description>声源定位 | 8.0/10</description>
    </item>
    <item>
      <title>DSSR: Decoupling Salient and Subtle Representations Under Missing Modalities for Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dssr-decoupling-salient-and-subtle/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dssr-decoupling-salient-and-subtle/</guid>
      <description>情感识别 | 7.5/10</description>
    </item>
    <item>
      <title>Dynamic Balanced Cross-Modal Attention with Gated Sequence Restoration: Towards Robust Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-balanced-cross-modal-attention-with-gated/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-balanced-cross-modal-attention-with-gated/</guid>
      <description>跨模态 | 7.5/10</description>
    </item>
    <item>
      <title>Dynamic Noise-Aware Multi Lora Framework Towards Real-World Audio Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-noise-aware-multi-lora-framework-towards/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-noise-aware-multi-lora-framework-towards/</guid>
      <description>音频深度伪造检测 | 8.0/10</description>
    </item>
    <item>
      <title>Enhancing Noise Robustness for Neural Speech Codecs Through Resource-Efficient Progressive Quantization Perturbation Simulation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-noise-robustness-for-neural-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-noise-robustness-for-neural-speech/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fake-speech-wild-detecting-deepfake-speech-on/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fake-speech-wild-detecting-deepfake-speech-on/</guid>
      <description>语音伪造检测 | 7.0/10</description>
    </item>
    <item>
      <title>Fine-Tuning Bigvgan-V2 for Robust Musical Tuning Preservation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fine-tuning-bigvgan-v2-for-robust-musical-tuning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fine-tuning-bigvgan-v2-for-robust-musical-tuning/</guid>
      <description>音乐生成 | 7.5/10</description>
    </item>
    <item>
      <title>Frontend Token Enhancement for Token-Based Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-frontend-token-enhancement-for-token-based-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-frontend-token-enhancement-for-token-based-speech/</guid>
      <description>语音识别 | 8.0/10</description>
    </item>
    <item>
      <title>Gdiffuse: Diffusion-Based Speech Enhancement with Noise Model Guidance</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gdiffuse-diffusion-based-speech-enhancement-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gdiffuse-diffusion-based-speech-enhancement-with/</guid>
      <description>语音增强 | 7.0/10</description>
    </item>
    <item>
      <title>Generalizability of Predictive and Generative Speech Enhancement Models to Pathological Speakers</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/</guid>
      <description>语音增强 | 7.0/10</description>
    </item>
    <item>
      <title>Graph-based Modality Alignment for Robustness in Conversational Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-graph-based-modality-alignment-for-robustness-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-graph-based-modality-alignment-for-robustness-in/</guid>
      <description>语音情感识别 | 8.0/10</description>
    </item>
    <item>
      <title>GRNet: Graph Reconstruction Network for Robust Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-grnet-graph-reconstruction-network-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-grnet-graph-reconstruction-network-for-robust/</guid>
      <description>多模态情感分析 | 7.5/10</description>
    </item>
    <item>
      <title>Hanui: Harnessing Distributional Discrepancies for Singing Voice Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hanui-harnessing-distributional-discrepancies-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hanui-harnessing-distributional-discrepancies-for/</guid>
      <description>音频深度伪造检测 | 8.0/10</description>
    </item>
    <item>
      <title>HVAC-EAR: Eavesdropping Human Speech Using HVAC Systems</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hvac-ear-eavesdropping-human-speech-using-hvac/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hvac-ear-eavesdropping-human-speech-using-hvac/</guid>
      <description>音频安全 | 8.5/10</description>
    </item>
    <item>
      <title>I-DCCRN-VAE: An Improved Deep Representation Learning Framework for Complex VAE-Based Single-Channel Speech Enhancement</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-i-dccrn-vae-an-improved-deep-representation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-i-dccrn-vae-an-improved-deep-representation/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>Improving Automatic Speech Recognition by Mitigating Distortions Introduced by Speech Enhancement Under Drone Noise</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-automatic-speech-recognition-by/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-automatic-speech-recognition-by/</guid>
      <description>语音识别 | 6.5/10</description>
    </item>
    <item>
      <title>Improving Binaural Distance Estimation in Reverberant Rooms Through Contrastive And Multi-Task Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-binaural-distance-estimation-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-binaural-distance-estimation-in/</guid>
      <description>声源定位 | 7.0/10</description>
    </item>
    <item>
      <title>Input-Adaptive Differentiable Filterbanks via Hypernetworks for Robust Speech Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Joint Estimation of Primary and Secondary Paths for Personalized Hearable Applications</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-estimation-of-primary-and-secondary-paths/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-estimation-of-primary-and-secondary-paths/</guid>
      <description>主动降噪 | 7.5/10</description>
    </item>
    <item>
      <title>Learnable Mel-Frontend for Robust Underwater Acoustic Target Detection under Non-Target Interference</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learnable-mel-frontend-for-robust-underwater/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learnable-mel-frontend-for-robust-underwater/</guid>
      <description>音频分类 | 6.5/10</description>
    </item>
    <item>
      <title>Leveraging Multiple Speech Enhancers for Non-Intrusive Intelligibility Prediction for Hearing-Impaired Listeners</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-multiple-speech-enhancers-for-non/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-multiple-speech-enhancers-for-non/</guid>
      <description>模型评估 | 7.5/10</description>
    </item>
    <item>
      <title>LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-And-Play Dereverberation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lipsam-lipschitz-continuous-amplitude-modifier/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lipsam-lipschitz-continuous-amplitude-modifier/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>Localizing Speech Deepfakes Beyond Transitions via Segment-Aware Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-localizing-speech-deepfakes-beyond-transitions/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-localizing-speech-deepfakes-beyond-transitions/</guid>
      <description>音频深度伪造检测 | 8.0/10</description>
    </item>
    <item>
      <title>Low-Frequency Harmonic Control for Speech Intelligibility in Open-Ear Headphones</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-frequency-harmonic-control-for-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-frequency-harmonic-control-for-speech/</guid>
      <description>语音增强 | 6.5/10</description>
    </item>
    <item>
      <title>LP-CFM: Perceptual Invariance-Aware Conditional Flow Matching for Speech Modeling</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lp-cfm-perceptual-invariance-aware-conditional/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lp-cfm-perceptual-invariance-aware-conditional/</guid>
      <description>语音合成 | 7.0/10</description>
    </item>
    <item>
      <title>Membership Inference Attack against Music Diffusion Models via Generative Manifold Perturbation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-membership-inference-attack-against-music/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-membership-inference-attack-against-music/</guid>
      <description>音频安全 | 7.5/10</description>
    </item>
    <item>
      <title>MFF-RVRDI: Multimodal Fusion Framework for Robust Video Recording Device Identification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mff-rvrdi-multimodal-fusion-framework-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mff-rvrdi-multimodal-fusion-framework-for-robust/</guid>
      <description>视频设备识别 | 7.5/10</description>
    </item>
    <item>
      <title>Multi-Task Learning For Speech Quality Assessment Using ASR-Derived Entropy Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-task-learning-for-speech-quality-assessment/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-task-learning-for-speech-quality-assessment/</guid>
      <description>语音质量评估 | 7.5/10</description>
    </item>
    <item>
      <title>NeuroSIFT: A Biologically-Inspired Framework with Explicit Signal-Noise Separation for Robust Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-neurosift-a-biologically-inspired-framework-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-neurosift-a-biologically-inspired-framework-with/</guid>
      <description>多模态情感识别 | 8.0/10</description>
    </item>
    <item>
      <title>Noise-Robust AV-ASR Using Visual Features both in the Whisper Encoder and Decoder</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-av-asr-using-visual-features-both-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-av-asr-using-visual-features-both-in/</guid>
      <description>语音识别 | 8.0/10</description>
    </item>
    <item>
      <title>Noise-Robust Contrastive Learning with an MFCC-Conformer for Coronary Artery Disease Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-contrastive-learning-with-an-mfcc/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-contrastive-learning-with-an-mfcc/</guid>
      <description>音频分类 | 7.0/10</description>
    </item>
    <item>
      <title>Noise-to-Notes: Diffusion-Based Generation and Refinement for Automatic Drum Transcription</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-to-notes-diffusion-based-generation-and/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-to-notes-diffusion-based-generation-and/</guid>
      <description>音乐信息检索 | 8.0/10</description>
    </item>
    <item>
      <title>Off-The-Grid Multi-Pitch Estimation Using Optimal Transport</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-off-the-grid-multi-pitch-estimation-using-optimal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-off-the-grid-multi-pitch-estimation-using-optimal/</guid>
      <description>音乐信息检索 | 7.5/10</description>
    </item>
    <item>
      <title>On deepfake voice detection - It’s all in the presentation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-on-deepfake-voice-detection-its-all-in-the/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-on-deepfake-voice-detection-its-all-in-the/</guid>
      <description>音频深度伪造检测 | 8.0/10</description>
    </item>
    <item>
      <title>Optimizing Speech Language Models for Acoustic Consistency</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-optimizing-speech-language-models-for-acoustic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-optimizing-speech-language-models-for-acoustic/</guid>
      <description>语音合成 | 8.0/10</description>
    </item>
    <item>
      <title>Position-Invariant Fine-Tuning Of Speech Enhancement Models With Self-Supervised Speech Representations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-position-invariant-fine-tuning-of-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-position-invariant-fine-tuning-of-speech/</guid>
      <description>语音增强 | 6.5/10</description>
    </item>
    <item>
      <title>Prompt-Guided Mixture-of-Experts for Robust Multimodal Sentiment Analysis with Missing Modalities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-prompt-guided-mixture-of-experts-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-prompt-guided-mixture-of-experts-for-robust/</guid>
      <description>语音情感识别 | 8.5/10</description>
    </item>
    <item>
      <title>Random Matrix-Driven Graph Representation Learning For Bioacoustic Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-random-matrix-driven-graph-representation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-random-matrix-driven-graph-representation/</guid>
      <description>生物声学 | 7.5/10</description>
    </item>
    <item>
      <title>RAS: a Reliability Oriented Metric for Automatic Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ras-a-reliability-oriented-metric-for-automatic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ras-a-reliability-oriented-metric-for-automatic/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>RASD-SR: A Robust Anomalous Sound Detection Framework with Score Recalibration</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rasd-sr-a-robust-anomalous-sound-detection/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rasd-sr-a-robust-anomalous-sound-detection/</guid>
      <description>异常声音检测 | 8.5/10</description>
    </item>
    <item>
      <title>Reading Between the Waves: Robust Topic Segmentation Using Inter-Sentence Audio Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reading-between-the-waves-robust-topic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reading-between-the-waves-robust-topic/</guid>
      <description>音频分类 | 7.0/10</description>
    </item>
    <item>
      <title>Reasoning Driven Captions to Assist Noise Robust Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reasoning-driven-captions-to-assist-noise-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reasoning-driven-captions-to-assist-noise-robust/</guid>
      <description>语音情感识别 | 7.0/10</description>
    </item>
    <item>
      <title>Reducing Prompt Sensitivity in LLM-Based Speech Recognition Through Learnable Projection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reducing-prompt-sensitivity-in-llm-based-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reducing-prompt-sensitivity-in-llm-based-speech/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>Regularized Inverse Filter Design for Rigid Spherical Microphone Array Processing: Laplace- And Time-Domain Representations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-regularized-inverse-filter-design-for-rigid/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-regularized-inverse-filter-design-for-rigid/</guid>
      <description>空间音频 | 8.0/10</description>
    </item>
    <item>
      <title>RMODGDF: A Robust STFT-Derived Feature for Musical Instrument Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rmodgdf-a-robust-stft-derived-feature-for-musical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rmodgdf-a-robust-stft-derived-feature-for-musical/</guid>
      <description>音乐信息检索 | 7.0/10</description>
    </item>
    <item>
      <title>Robust and Lightweight F0 Estimation Through Mid-Level Fusion of DSP-Informed Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-robust-and-lightweight-f0-estimation-through-mid/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-robust-and-lightweight-f0-estimation-through-mid/</guid>
      <description>基频估计 | 8.0/10</description>
    </item>
    <item>
      <title>Robust Deepfake Audio Detection via Multi-Level Intermediate Feature Fusion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-robust-deepfake-audio-detection-via-multi-level/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-robust-deepfake-audio-detection-via-multi-level/</guid>
      <description>音频深度伪造检测 | 7.5/10</description>
    </item>
    <item>
      <title>RoCo: Robust Code for Fast and Effective Proactive Defense against Voice Cloning Attack</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-roco-robust-code-for-fast-and-effective-proactive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-roco-robust-code-for-fast-and-effective-proactive/</guid>
      <description>音频安全 | 7.5/10</description>
    </item>
    <item>
      <title>RRPO: Robust Reward Policy Optimization for LLM-Based Emotional TTS</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rrpo-robust-reward-policy-optimization-for-llm/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rrpo-robust-reward-policy-optimization-for-llm/</guid>
      <description>语音合成 | 7.5/10</description>
    </item>
    <item>
      <title>Sampling-Rate-Agnostic Speech Super-Resolution Based on Gaussian Process Dynamical Systems with Deep Kernel Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sampling-rate-agnostic-speech-super-resolution/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sampling-rate-agnostic-speech-super-resolution/</guid>
      <description>语音增强 | 6.5/10</description>
    </item>
    <item>
      <title>Snore Sound Classification Based on Physiological Features and Adaptive Loss Function</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-snore-sound-classification-based-on-physiological/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-snore-sound-classification-based-on-physiological/</guid>
      <description>音频分类 | 6.5/10</description>
    </item>
    <item>
      <title>Spectral or Spatial? Leveraging Both for Speaker Extraction in Challenging Data Conditions</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spectral-or-spatial-leveraging-both-for-speaker/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spectral-or-spatial-leveraging-both-for-speaker/</guid>
      <description>语音分离 | 7.0/10</description>
    </item>
    <item>
      <title>Spectrogram Event Based Feature Representation for Generalizable Automatic Music Transcription</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spectrogram-event-based-feature-representation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spectrogram-event-based-feature-representation/</guid>
      <description>音乐信息检索 | 7.5/10</description>
    </item>
    <item>
      <title>Spiking Attention Network: A Hybrid Neuromorphic Approach to Underwater Acoustic Localization and Zero-Shot Adaptation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spiking-attention-network-a-hybrid-neuromorphic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spiking-attention-network-a-hybrid-neuromorphic/</guid>
      <description>声源定位 | 7.0/10</description>
    </item>
    <item>
      <title>Staged Diffusion with Hybrid Mixture-of-Experts (MOE) for Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-staged-diffusion-with-hybrid-mixture-of-experts/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-staged-diffusion-with-hybrid-mixture-of-experts/</guid>
      <description>语音情感识别 | 8.0/10</description>
    </item>
    <item>
      <title>StreamMark: A Deep Learning-Based Semi-Fragile Audio Watermarking for Proactive Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-streammark-a-deep-learning-based-semi-fragile/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-streammark-a-deep-learning-based-semi-fragile/</guid>
      <description>音频深度伪造检测 | 8.0/10</description>
    </item>
    <item>
      <title>SURE: Synergistic Uncertainty-Aware Reasoning for Multimodal Emotion Recognition in Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sure-synergistic-uncertainty-aware-reasoning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sure-synergistic-uncertainty-aware-reasoning-for/</guid>
      <description>语音情感识别 | 7.5/10</description>
    </item>
    <item>
      <title>Target-Speaker LLM-ASR with Speaker-Aware Speech Encoder</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-target-speaker-llm-asr-with-speaker-aware-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-target-speaker-llm-asr-with-speaker-aware-speech/</guid>
      <description>语音识别 | 8.8/10</description>
    </item>
    <item>
      <title>Toward Robust And Efficient Beat Tracking Via Beat-Aware Attention</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-toward-robust-and-efficient-beat-tracking-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-toward-robust-and-efficient-beat-tracking-via/</guid>
      <description>音乐理解 | 8.5/10</description>
    </item>
    <item>
      <title>Towards Blind Data Cleaning: A Case Study in Music Source Separation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-blind-data-cleaning-a-case-study-in-music/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-blind-data-cleaning-a-case-study-in-music/</guid>
      <description>音乐信息检索 | 7.0/10</description>
    </item>
    <item>
      <title>Towards Lightweight Adaptation of Speech Enhancement Models in Real-World Environments</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-lightweight-adaptation-of-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-lightweight-adaptation-of-speech/</guid>
      <description>语音增强 | 8.5/10</description>
    </item>
    <item>
      <title>Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-robust-dysarthric-speech-recognition-llm/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-robust-dysarthric-speech-recognition-llm/</guid>
      <description>语音识别 | 9.0/10</description>
    </item>
    <item>
      <title>Training Flow Matching Models with Reliable Labels via Self-Purification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-training-flow-matching-models-with-reliable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-training-flow-matching-models-with-reliable/</guid>
      <description>语音合成 | 7.5/10</description>
    </item>
    <item>
      <title>Transferable Audio Lottery Tickets: Gradient Accumulation for Extreme Sparsity</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</guid>
      <description>音频分类 | 7.0/10</description>
    </item>
    <item>
      <title>Two-Stage Language Model Framework for Acoustic Echo Cancellation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-two-stage-language-model-framework-for-acoustic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-two-stage-language-model-framework-for-acoustic/</guid>
      <description>语音增强 | 7.5/10</description>
    </item>
    <item>
      <title>UMV: A Mixture-Of-Experts Vision Transformer with Multi-Spectrogram Fusion for Underwater Ship Noise Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-umv-a-mixture-of-experts-vision-transformer-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-umv-a-mixture-of-experts-vision-transformer-with/</guid>
      <description>音频分类 | 7.5/10</description>
    </item>
    <item>
      <title>UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/</guid>
      <description>说话人验证 | 7.5/10</description>
    </item>
    <item>
      <title>Unseen but Not Unknown: Using Dataset Concealment to Robustly Evaluate Speech Quality Estimation Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unseen-but-not-unknown-using-dataset-concealment/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unseen-but-not-unknown-using-dataset-concealment/</guid>
      <description>语音质量评估 | 8.3/10</description>
    </item>
    <item>
      <title>Voting-Based Pitch Estimation with Temporal and Frequential Alignment and Correlation Aware Selection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-voting-based-pitch-estimation-with-temporal-and/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-voting-based-pitch-estimation-with-temporal-and/</guid>
      <description>语音识别 | 8.0/10</description>
    </item>
    <item>
      <title>Wave-Trainer-Fit: Neural Vocoder With Trainable Prior And Fixed-Point Iteration Towards High-Quality Speech Generation From SSL Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wave-trainer-fit-neural-vocoder-with-trainable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wave-trainer-fit-neural-vocoder-with-trainable/</guid>
      <description>语音合成 | 7.0/10</description>
    </item>
    <item>
      <title>When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-silence-matters-the-impact-of-irrelevant/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-silence-matters-the-impact-of-irrelevant/</guid>
      <description>模型评估 | 7.0/10</description>
    </item>
    <item>
      <title>When Voice Matters: A Controlled Study of Audio LLM Behavior in Clinical Decision-Making</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-voice-matters-a-controlled-study-of-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-voice-matters-a-controlled-study-of-audio/</guid>
      <description>模型评估 | 7.0/10</description>
    </item>
    <item>
      <title>RAS: a Reliability Oriented Metric for Automatic Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-ras-a-reliability-oriented-metric-for-automatic/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-ras-a-reliability-oriented-metric-for-automatic/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Robust Audio-Text Retrieval via Cross-Modal Attention and Hybrid Loss</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-robust-audio-text-retrieval-via-cross-modal/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-robust-audio-text-retrieval-via-cross-modal/</guid>
      <description>音频检索 | 7.5/10</description>
    </item>
    <item>
      <title>Advancing automatic speech recognition using feature fusion with self-supervised learning features: A case study on Fearless Steps Apollo corpus</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-advancing-automatic-speech-recognition-using/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-advancing-automatic-speech-recognition-using/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>&#34;This Wasn&#39;t Made for Me&#34;: Recentering User Experience and Emotional Impact in the Evaluation of ASR Bias</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-this-wasnt-made-for-me-recentering-user/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-this-wasnt-made-for-me-recentering-user/</guid>
      <description>语音识别 | 7.0/10</description>
    </item>
    <item>
      <title>Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-do-llm-decoders-listen-fairly-benchmarking-how/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-do-llm-decoders-listen-fairly-benchmarking-how/</guid>
      <description>语音识别 | 7.5/10</description>
    </item>
    <item>
      <title>Before the Mic: Physical-Layer Voiceprint Anonymization with Acoustic Metamaterials</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-before-the-mic-physical-layer-voiceprint/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-before-the-mic-physical-layer-voiceprint/</guid>
      <description>这篇论文针对在公共场景（如会议、演讲）中，不可信录音设备可能导致声纹泄露且事后无法补救的问题，提出了EchoMask——首个基于声学超材料的物理层实时声纹匿名化系统。其核心方法是在声音到达麦克风前，通过精心设计的被动声学结构对特定低频段（300-700Hz）进行选择性干扰，该频段对说话人识别至关重要</description>
    </item>
    <item>
      <title>Enhancing Speaker Verification with Whispered Speech via Post-Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-enhancing-speaker-verification-with-whispered/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-enhancing-speaker-verification-with-whispered/</guid>
      <description>1. **问题**：耳语语音因缺乏声带振动，其声学特征与正常语音差异显著，导致现有的说话人验证系统性能严重下降。这在用户为保护隐私而低语、或因疾病无法正常发声等实际场景中构成挑战。 2. **方法核心**：在预训练的说话人验证骨干网络（ReDimNet-B6）之上，添加一个轻量级的编码器-解码器结构</description>
    </item>
    <item>
      <title>FastTurn: Unifying Acoustic and Streaming Semantic Cues for Low-Latency and Robust Turn Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-fastturn-unifying-acoustic-and-streaming-semantic/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-fastturn-unifying-acoustic-and-streaming-semantic/</guid>
      <description>这篇论文针对全双工语音对话系统中需要低延迟、高精度判断用户是否结束发言（轮次检测）的难题，提出了FastTurn统一框架。其核心方法是将流式CTC解码提供的快速部分语义信息，与Conformer编码器提取的声学特征，通过适配器输入给大语言模型（LLM）进行推理，并最终融合声学与语义特征进行轮次预测。</description>
    </item>
    <item>
      <title>Omni-Embed-Audio: Leveraging Multimodal LLMs for Robust Audio-Text Retrieval</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-omni-embed-audio-leveraging-multimodal-llms-for/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-omni-embed-audio-leveraging-multimodal-llms-for/</guid>
      <description>这篇论文旨在解决当前音频-文本检索模型在**真实、多样化用户查询**下性能下降的问题。作者指出，现有基准测试（如AudioCaps, Clotho）依赖描述性标题式查询，与真实世界中简短、多变的搜索行为（如问题、命令、关键词、排除性查询）存在巨大差距。为此，论文提出了两大核心贡献：1) **Omni</description>
    </item>
    <item>
      <title>Still Between Us? Evaluating and Improving Voice Assistant Robustness to Third-Party Interruptions</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-still-between-us-evaluating-and-improving-voice/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-still-between-us-evaluating-and-improving-voice/</guid>
      <description>本文旨在解决语音语言模型（SLMs）在真实场景中无法有效区分主要用户与第三方插入语音（Third-Party Interruption, TPI）的问题，这会导致上下文理解失败。为此，作者首先创建了 **TPI-Train**，一个包含8.8万个样本的训练数据集，其核心设计是“说话人感知的难负例”，</description>
    </item>
  </channel>
</rss>
