<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Transfer Learning on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E8%BF%81%E7%A7%BB%E5%AD%A6%E4%B9%A0/</link>
    <description>Recent content in Transfer Learning on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E8%BF%81%E7%A7%BB%E5%AD%A6%E4%B9%A0/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Parameter-Efficient Multi-Scale Convolutional Adapter for Synthetic Speech Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-parameter-efficient-multi-scale-convolutional/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-parameter-efficient-multi-scale-convolutional/</guid>
      <description>A Parameter-Efficient Multi-Scale Convolutional Adapter for Synthetic Speech Detection</description>
    </item>
    <item>
      <title>AFT: An Exemplar-Free Class Incremental Learning Method for Environmental Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>AISHELL6-Whisper: A Chinese Mandarin Audio-Visual Whisper Speech Dataset with Speech Recognition Baselines</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aishell6-whisper-a-chinese-mandarin-audio-visual/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aishell6-whisper-a-chinese-mandarin-audio-visual/</guid>
      <description>Speech Recognition | 8.3/10</description>
    </item>
    <item>
      <title>CASTELLA: Long Audio Dataset with Captions and Temporal Boundaries</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-castella-long-audio-dataset-with-captions-and/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-castella-long-audio-dataset-with-captions-and/</guid>
      <description>Audio Retrieval | 8.5/10</description>
    </item>
    <item>
      <title>Discrete-Continuous Fusion With Adaptive Hierarchical Features For Audio Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-discrete-continuous-fusion-with-adaptive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-discrete-continuous-fusion-with-adaptive/</guid>
      <description>Audio Deepfake Detection | 8.0/10</description>
    </item>
    <item>
      <title>E2E-AEC: Implementing An End-To-End Neural Network Learning Approach for Acoustic Echo Cancellation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-e2e-aec-implementing-an-end-to-end-neural-network/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-e2e-aec-implementing-an-end-to-end-neural-network/</guid>
      <description>Speech Enhancement | 7.5/10</description>
    </item>
    <item>
      <title>Efficient Depression Detection from Speech via Language-Independent Prompt-Driven Reprogramming</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-efficient-depression-detection-from-speech-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-efficient-depression-detection-from-speech-via/</guid>
      <description>Speech Biomarkers | 7.5/10</description>
    </item>
    <item>
      <title>Exploring Fine-Tuning Of Large Audio Language Models For Spoken Language Understanding Under Limited Speech Data</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-fine-tuning-of-large-audio-language/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-fine-tuning-of-large-audio-language/</guid>
      <description>Spoken Language Understanding | 8.0/10</description>
    </item>
    <item>
      <title>Fine-Tuning Large Audio-Language Models with LoRA for Precise Temporal Localization of Prolonged Exposure Therapy Elements</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fine-tuning-large-audio-language-models-with-lora/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fine-tuning-large-audio-language-models-with-lora/</guid>
      <description>Audio Event Detection | 6.5/10</description>
    </item>
    <item>
      <title>FlowSE-GRPO: Training Flow Matching Speech Enhancement via Online Reinforcement Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flowse-grpo-training-flow-matching-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flowse-grpo-training-flow-matching-speech/</guid>
      <description>Speech Enhancement | 7.5/10</description>
    </item>
    <item>
      <title>From Human Speech to Ocean Signals: Transferring Speech Large Models for Underwater Acoustic Target Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-human-speech-to-ocean-signals-transferring/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-human-speech-to-ocean-signals-transferring/</guid>
      <description>Underwater Acoustic Target Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Generalizability of Predictive and Generative Speech Enhancement Models to Pathological Speakers</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/</guid>
      <description>Speech Enhancement | 7.0/10</description>
    </item>
    <item>
      <title>GLUE: Gradient-free Learning to Unify Experts</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-glue-gradient-free-learning-to-unify-experts/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-glue-gradient-free-learning-to-unify-experts/</guid>
      <description>Transfer Learning | 6.5/10</description>
    </item>
    <item>
      <title>How Far Do SSL Speech Models Listen for Tone? Temporal Focus of Tone Representation under Low-Resource Transfer</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-how-far-do-ssl-speech-models-listen-for-tone/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-how-far-do-ssl-speech-models-listen-for-tone/</guid>
      <description>Speech Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-human-1-by-josh-talks-a-full-duplex/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-human-1-by-josh-talks-a-full-duplex/</guid>
      <description>Spoken Dialogue Systems | 7.5/10</description>
    </item>
    <item>
      <title>ICASSP 2026 - Transfer Learning Paper List</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-099/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-099/</guid>
      <description>A total of 1 ICASSP 2026 paper in the Transfer Learning area</description>
    </item>
    <item>
      <title>Improving Active Learning for Melody Estimation by Disentangling Uncertainties</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-active-learning-for-melody-estimation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-active-learning-for-melody-estimation/</guid>
      <description>Music Information Retrieval | 7.5/10</description>
    </item>
    <item>
      <title>Influence-Aware Curation and Active Selection for Industrial and Surveillance Sound Events</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-aware-curation-and-active-selection-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-aware-curation-and-active-selection-for/</guid>
      <description>Audio Event Detection | 7.0/10</description>
    </item>
    <item>
      <title>Interval-Aware Retrieval Framework For Speech-Based Automatic Alzheimer’s Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-interval-aware-retrieval-framework-for-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-interval-aware-retrieval-framework-for-speech/</guid>
      <description>Speech Biomarkers | 8.5/10</description>
    </item>
    <item>
      <title>It Is Personal: The Importance of Personalization for Recognizing Self-Reported Emotion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-it-is-personal-the-importance-of-personalization/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-it-is-personal-the-importance-of-personalization/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Learning to Align with Unbalanced Optimal Transport in Linguistic Knowledge Transfer for ASR</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learning-to-align-with-unbalanced-optimal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learning-to-align-with-unbalanced-optimal/</guid>
      <description>Speech Recognition | 6.5/10</description>
    </item>
    <item>
      <title>LOTUSDIS: A Thai Far-Field Meeting Corpus for Robust Conversational ASR</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lotusdis-a-thai-far-field-meeting-corpus-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lotusdis-a-thai-far-field-meeting-corpus-for/</guid>
      <description>Speech Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Low-Resource Speech-Based Early Alzheimers Detection via Cross-Lingual and Few-Shot Transfer Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-resource-speech-based-early-alzheimers/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-resource-speech-based-early-alzheimers/</guid>
      <description>Speech Biomarkers | 7.5/10</description>
    </item>
    <item>
      <title>MMAudioSep: Taming Video-to-Audio Generative Model Towards Video/Text-Queried Sound Separation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mmaudiosep-taming-video-to-audio-generative-model/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mmaudiosep-taming-video-to-audio-generative-model/</guid>
      <description>Speech Separation | 8.0/10</description>
    </item>
    <item>
      <title>Multi-Layer Attentive Probing Improves Transfer of Audio Representations for Bioacoustics</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-layer-attentive-probing-improves-transfer/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-layer-attentive-probing-improves-transfer/</guid>
      <description>Bioacoustics | 7.5/10</description>
    </item>
    <item>
      <title>Multilingual Supervised Pretraining with LM-Assisted Decoding for Visual Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multilingual-supervised-pretraining-with-lm/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multilingual-supervised-pretraining-with-lm/</guid>
      <description>Speech Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Perceptual Loss Optimized HRTF Personalization in Spherical Harmonic Domain</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-perceptual-loss-optimized-hrtf-personalization-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-perceptual-loss-optimized-hrtf-personalization-in/</guid>
      <description>Spatial Audio | 7.0/10</description>
    </item>
    <item>
      <title>Praxy Voice: Voice-Prompt Recovery &#43; BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-praxy-voice-voice-prompt-recovery-bups-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-praxy-voice-voice-prompt-recovery-bups-for/</guid>
      <description>Speech Synthesis | 8.0/10</description>
    </item>
    <item>
      <title>Probing the Hidden Talent of ASR foundation models for L2 English Oral Assessment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-probing-the-hidden-talent-of-asr-foundation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-probing-the-hidden-talent-of-asr-foundation/</guid>
      <description>Pretraining | 7.5/10</description>
    </item>
    <item>
      <title>Probing Whisper for Dysarthric Speech in Detection and Assessment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-probing-whisper-for-dysarthric-speech-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-probing-whisper-for-dysarthric-speech-in/</guid>
      <description>Speech Biomarkers | 6.5/10</description>
    </item>
    <item>
      <title>Quality Assessment of Noisy and Enhanced Speech with Limited Data: UWB-NTIS System for VoiceMOS 2024</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-quality-assessment-of-noisy-and-enhanced-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-quality-assessment-of-noisy-and-enhanced-speech/</guid>
      <description>Speech Quality Assessment | 7.0/10</description>
    </item>
    <item>
      <title>Ranking The Impact of Contextual Specialization in Neural Speech Enhancement</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ranking-the-impact-of-contextual-specialization/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ranking-the-impact-of-contextual-specialization/</guid>
      <description>Speech Enhancement | 7.5/10</description>
    </item>
    <item>
      <title>Representation-Diverse Self-Supervision for Cross-Domain Bioacoustic Learning in Low-Resource Settings</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-representation-diverse-self-supervision-for-cross/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-representation-diverse-self-supervision-for-cross/</guid>
      <description>Bioacoustics | 7.0/10</description>
    </item>
    <item>
      <title>SAUNA: Song-Level Audio &amp; User-Listening Data Neural Alignment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sauna-song-level-audio-user-listening-data-neural/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sauna-song-level-audio-user-listening-data-neural/</guid>
      <description>Music Information Retrieval | 7.0/10</description>
    </item>
    <item>
      <title>SELD-MOHA: A Fine-Tuning Method with the Mixture of Heterogeneous Adapters for Sound Event Localization and Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-seld-moha-a-fine-tuning-method-with-the-mixture/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-seld-moha-a-fine-tuning-method-with-the-mixture/</guid>
      <description>Audio Event Detection | 7.0/10</description>
    </item>
    <item>
      <title>Semantic Anchor Transfer from Short to Long Speech in a Distillation-Based Summarization Framework</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-semantic-anchor-transfer-from-short-to-long/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-semantic-anchor-transfer-from-short-to-long/</guid>
      <description>Speech Summarization | 7.5/10</description>
    </item>
    <item>
      <title>SightSound-R1: Cross-Modal Reasoning Distillation from Vision to Audio Language Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sightsound-r1-cross-modal-reasoning-distillation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sightsound-r1-cross-modal-reasoning-distillation/</guid>
      <description>Audio Question Answering | 7.5/10</description>
    </item>
    <item>
      <title>SpeechMapper: Speech-To-Text Embedding Projector for LLMs</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speechmapper-speech-to-text-embedding-projector/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speechmapper-speech-to-text-embedding-projector/</guid>
      <description>Large Speech Models | 7.0/10</description>
    </item>
    <item>
      <title>Stress Prediction from Temporal Emotion Trajectories in Clinical Patient-Physician Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stress-prediction-from-temporal-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stress-prediction-from-temporal-emotion/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Synthesized Data Selection via Score Distribution Matching for Te Reo Māori Automatic Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synthesized-data-selection-via-score-distribution/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synthesized-data-selection-via-score-distribution/</guid>
      <description>Speech Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Three Seconds is Sufficient: A Multi-Pronged Framework for Model-Based Speaker Adaptation in ASR Under Data-Scarce Conditions</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-three-seconds-is-sufficient-a-multi-pronged/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-three-seconds-is-sufficient-a-multi-pronged/</guid>
      <description>Speech Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Towards Fair ASR for Second Language Speakers using Fairness Prompted Finetuning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-fair-asr-for-second-language-speakers/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-fair-asr-for-second-language-speakers/</guid>
      <description>Speech Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Transfer Learning for Paediatric Sleep Apnoea Detection using Physiology-Guided Acoustic Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transfer-learning-for-paediatric-sleep-apnoea/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transfer-learning-for-paediatric-sleep-apnoea/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Transferable Audio Lottery Tickets: Gradient Accumulation for Extreme Sparsity</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/</guid>
      <description>Speaker Verification | 7.5/10</description>
    </item>
    <item>
      <title>WavLink: Compact Audio–Text Embeddings with a Global Whisper Token</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wavlink-compact-audiotext-embeddings-with-a/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wavlink-compact-audiotext-embeddings-with-a/</guid>
      <description>Audio Retrieval | 8.0/10</description>
    </item>
    <item>
      <title>Windowed SummaryMixing: An Efficient Fine-Tuning of Self-Supervised Learning Models for Low-Resource Speech Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-windowed-summarymixing-an-efficient-fine-tuning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-windowed-summarymixing-an-efficient-fine-tuning/</guid>
      <description>Speech Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Low-Rank Adaptation Redux for Large Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-low-rank-adaptation-redux-for-large-models/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-low-rank-adaptation-redux-for-large-models/</guid>
      <description>Large Language Models | 5.5/10</description>
    </item>
    <item>
      <title>Environmental Sound Deepfake Detection Using Deep-Learning Framework</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-environmental-sound-deepfake-detection-using-deep/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-environmental-sound-deepfake-detection-using-deep/</guid>
      <description>1.  **Problem**: Deepfake detection for environmental sounds (ESDD), covering both acoustic scenes and sound events, is under-studied, and it remains unclear whether scene and event forgery detection require different models. 2.  **Core method**: A deep-learning framework whose core is a pretrained audio model (BEATs) as the feature extractor, combined with a three-stage training strategy (including</description>
    </item>
    <item>
      <title>SAND: The Challenge on Speech Analysis for Neurodegenerative Disease Assessment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-sand-the-challenge-on-speech-analysis-for/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-sand-the-challenge-on-speech-analysis-for/</guid>
      <description>1.  **Problem addressed**: For early diagnosis and monitoring of neurodegenerative diseases (especially amyotrophic lateral sclerosis, ALS), there is a lack of large-scale, clinically annotated speech datasets and of a standardized framework for evaluating algorithms. 2.  **Core method**: The SAND challenge was built and released; at its core it provides an extended, longitudinal speech dataset of ALS patients and healthy controls (VOC-A</description>
    </item>
    <item>
      <title>FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-freezeempath-efficient-training-for-empathetic/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-freezeempath-efficient-training-for-empathetic/</guid>
      <description>This paper tackles three major challenges in training empathetic spoken chatbots: **scarce empathetic speech data, weak model generalization, and degradation of the LLM's general abilities caused by fine-tuning**. The authors propose **FreezeEmpath**, an efficient end-to-end training framework. Its core method is to **freeze the base LLM** and adopt a **semantics-emotion decoupled encoding strategy**, using an independent semantic adapter and emotion extractor to</description>
    </item>
    <item>
      <title>Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-contextual-biasing-for-asr-in-speech-llm-with/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-contextual-biasing-for-asr-in-speech-llm-with/</guid>
      <description>This paper addresses the poor performance of speech LLMs (SLLMs) when recognizing "bias words" that are rare or unseen in the training data. Traditional approaches rely on supplying exact phoneme sequences for the bias words (generated by a G2P system), which demands expertise from users and suffers from poor tool compatibility. To this end,</description>
    </item>
    <item>
      <title>SpeakerRPL v2: Robust Open-set Speaker Identification through Enhanced Few-shot Foundation Tuning and Model Fusion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-speakerrpl-v2-robust-open-set-speaker/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-speakerrpl-v2-robust-open-set-speaker/</guid>
      <description>This paper addresses robustness in open-set speaker identification: given only a few enrollment samples per target speaker, the system must both accurately identify known speakers and reliably reject unknown ones. Building on the earlier SpeakerRPL V1 framework, the authors propose three key improvements:</description>
    </item>
    <item>
      <title>Transformer Based Machine Fault Detection From Audio Input</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-transformer-based-machine-fault-detection-from/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-transformer-based-machine-fault-detection-from/</guid>
      <description>This paper explores the potential advantages of Transformer-based architectures over conventional convolutional neural networks (CNNs) for machine-fault detection from audio. **The problem addressed** is that the inductive biases inherent to CNNs when processing spectrograms, such as locality and translation invariance, may</description>
    </item>
  </channel>
</rss>
