<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Speech Emotion Recognition on Speech/Audio Paper Digest</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E6%83%85%E6%84%9F%E8%AF%86%E5%88%AB/</link>
    <description>Recent content in Speech Emotion Recognition on Speech/Audio Paper Digest</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E6%83%85%E6%84%9F%E8%AF%86%E5%88%AB/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Acoustic and Facial Markers of Perceived Conversational Success in Spontaneous Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-and-facial-markers-of-perceived/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-and-facial-markers-of-perceived/</guid>
      <description>Speech Emotion Recognition | 6.0/10</description>
    </item>
    <item>
      <title>ADH-VA: Adaptive Directed-Hypergraph Convolution with VA Contrastive Learning for Multimodal Conversational Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adh-va-adaptive-directed-hypergraph-convolution/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adh-va-adaptive-directed-hypergraph-convolution/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Affect-Jigsaw: Integrating Core and Peripheral Emotions for Harmonious Fine-Grained Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-affect-jigsaw-integrating-core-and-peripheral/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-affect-jigsaw-integrating-core-and-peripheral/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>AMBER2: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-amber2-dual-ambiguity-aware-emotion-recognition/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-amber2-dual-ambiguity-aware-emotion-recognition/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>APKD: Aligned And Paced Knowledge Distillation Towards Lightweight Heterogeneous Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-apkd-aligned-and-paced-knowledge-distillation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-apkd-aligned-and-paced-knowledge-distillation/</guid>
      <description>Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Attention-Weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied To Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-attention-weighted-centered-kernel-alignment-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-attention-weighted-centered-kernel-alignment-for/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>B-GRPO: Unsupervised Speech Emotion Recognition Based on Batched-Group Relative Policy Optimization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-b-grpo-unsupervised-speech-emotion-recognition/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-b-grpo-unsupervised-speech-emotion-recognition/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Behind the Scenes: Mechanistic Interpretability of Lora-Adapted Whisper for Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-behind-the-scenes-mechanistic-interpretability-of/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-behind-the-scenes-mechanistic-interpretability-of/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Bimodal Fusion Framework for Dynamic Facial Expression Recognition In-The-Wild</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bimodal-fusion-framework-for-dynamic-facial/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bimodal-fusion-framework-for-dynamic-facial/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Clue2Emo: A Brain-Inspired Framework for Open-Vocabulary Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-clue2emo-a-brain-inspired-framework-for-open/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-clue2emo-a-brain-inspired-framework-for-open/</guid>
      <description>Speech Emotion Recognition | 8.5/10</description>
    </item>
    <item>
      <title>Context-Aware Dynamic Graph Learning for Multimodal Emotion Recognition with Missing Modalities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-context-aware-dynamic-graph-learning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-context-aware-dynamic-graph-learning-for/</guid>
      <description>Speech Emotion Recognition | 8.8/10</description>
    </item>
    <item>
      <title>DDSR-Net: Robust Multimodal Sentiment Analysis via Dynamic Modality Reliability Assessment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ddsr-net-robust-multimodal-sentiment-analysis-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ddsr-net-robust-multimodal-sentiment-analysis-via/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>DGSDNet: Dual-Graph Spectral Diffusion Network for Incomplete Multimodal Emotion Recognition in Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dgsdnet-dual-graph-spectral-diffusion-network-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dgsdnet-dual-graph-spectral-diffusion-network-for/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Diffemotalk: Audio-Driven Facial Animation with Fine-Grained Emotion Control via Diffusion Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-diffemotalk-audio-driven-facial-animation-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-diffemotalk-audio-driven-facial-animation-with/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Do You Hear What I Mean? Quantifying the Instruction-Perception GAP in Instruction-Guided Expressive Text-to-Speech Systems</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-you-hear-what-i-mean-quantifying-the/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-you-hear-what-i-mean-quantifying-the/</guid>
      <description>Speech Synthesis | 8.0/10</description>
    </item>
    <item>
      <title>Dynamic Balanced Cross-Modal Attention with Gated Sequence Restoration: Towards Robust Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-balanced-cross-modal-attention-with-gated/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamic-balanced-cross-modal-attention-with-gated/</guid>
      <description>Cross-Modal | 7.5/10</description>
    </item>
    <item>
      <title>ECSA: Dual-Branch Emotion Compensation for Emotion-Consistent Speaker Anonymization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ecsa-dual-branch-emotion-compensation-for-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ecsa-dual-branch-emotion-compensation-for-emotion/</guid>
      <description>Speaker Anonymization | 8.5/10</description>
    </item>
    <item>
      <title>Emo-TTA: Improving Test-Time Adaptation of Audio-Language Models for Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emo-tta-improving-test-time-adaptation-of-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emo-tta-improving-test-time-adaptation-of-audio/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>EMORL-TTS: Reinforcement Learning for Fine-Grained Emotion Control in LLM-based TTS</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emorl-tts-reinforcement-learning-for-fine-grained/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emorl-tts-reinforcement-learning-for-fine-grained/</guid>
      <description>Speech Synthesis | 8.5/10</description>
    </item>
    <item>
      <title>Emotion-Aligned Generation in Diffusion Text to Speech Models Via Preference-Guided Optimization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotion-aligned-generation-in-diffusion-text-to/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotion-aligned-generation-in-diffusion-text-to/</guid>
      <description>Speech Synthesis | 8.0/10</description>
    </item>
    <item>
      <title>Emotional Dimension Control in Language Model-Based Text-To-Speech: Spanning a Broad Spectrum of Human Emotions</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotional-dimension-control-in-language-model/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotional-dimension-control-in-language-model/</guid>
      <description>Speech Synthesis | 7.5/10</description>
    </item>
    <item>
      <title>EmoTri-RL: Emotion- and Cause-Aware Reinforcement Learning for Multi-Modal Empathetic Dialogue</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotri-rl-emotion-and-cause-aware-reinforcement/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotri-rl-emotion-and-cause-aware-reinforcement/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Encoding Emotion Through Self-Supervised Eye Movement Reconstruction</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-encoding-emotion-through-self-supervised-eye/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-encoding-emotion-through-self-supervised-eye/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Evaluating Emotion Recognition in Spoken Language Models on Emotionally Incongruent Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-evaluating-emotion-recognition-in-spoken-language/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-evaluating-emotion-recognition-in-spoken-language/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Expressive Voice Conversion with Controllable Emotional Intensity</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-expressive-voice-conversion-with-controllable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-expressive-voice-conversion-with-controllable/</guid>
      <description>Voice Conversion | 7.5/10</description>
    </item>
    <item>
      <title>FIDIC: Fine-Grained Conversational Emotion Recognition via Individual Differences in Inertia and Contagion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fidicfine-grained-conversational-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fidicfine-grained-conversational-emotion/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Gen-SER: When the Generative Model Meets Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gen-ser-when-the-generative-model-meets-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gen-ser-when-the-generative-model-meets-speech/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Graph-based Modality Alignment for Robustness in Conversational Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-graph-based-modality-alignment-for-robustness-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-graph-based-modality-alignment-for-robustness-in/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>ICASSP 2026 - Speech Emotion Recognition Paper List</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-066/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-066/</guid>
      <description>49 ICASSP 2026 papers in the Speech Emotion Recognition area</description>
    </item>
    <item>
      <title>InconVAD: A Two-Stage Dual-Tower Framework for Multimodal Emotion Inconsistency Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-inconvad-a-two-stage-dual-tower-framework-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-inconvad-a-two-stage-dual-tower-framework-for/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Input-Adaptive Differentiable Filterbanks via Hypernetworks for Robust Speech Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</guid>
      <description>Speech Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Inter-Dialog Contrastive Learning for Multimodal Emotion Recognition in Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-inter-dialog-contrastive-learning-for-multimodal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-inter-dialog-contrastive-learning-for-multimodal/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>It Is Personal: The Importance of Personalization for Recognizing Self-Reported Emotion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-it-is-personal-the-importance-of-personalization/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-it-is-personal-the-importance-of-personalization/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Korean aegyo speech shows systematic F1 increase to signal childlike qualities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-korean-aegyo-speech-shows-systematic-f1-increase/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-korean-aegyo-speech-shows-systematic-f1-increase/</guid>
      <description>Speech Emotion Recognition | 6.0/10</description>
    </item>
    <item>
      <title>LETPAV: Lexicon-Enhanced Text with Progressive Audio-Visual Fusion for Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-letpav-lexicon-enhanced-text-with-progressive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-letpav-lexicon-enhanced-text-with-progressive/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Leveraging Large Speech Language Models as Evaluators for Expressive Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-large-speech-language-models-as/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-large-speech-language-models-as/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>MECap-R1: Emotion-Aware Policy with Reinforcement Learning for Multimodal Emotion Captioning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mecap-r1-emotion-aware-policy-with-reinforcement/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mecap-r1-emotion-aware-policy-with-reinforcement/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large Audio-Language Model</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mi-fuse-label-fusion-for-unsupervised-domain/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mi-fuse-label-fusion-for-unsupervised-domain/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Mixture-of-Experts Based Soft-Label Learning for Multi-Label Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mixture-of-experts-based-soft-label-learning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mixture-of-experts-based-soft-label-learning-for/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>ML-SAN: Multi-Level Speaker-Adaptive Network for Emotion Recognition in Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ml-san-multi-level-speaker-adaptive-network-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ml-san-multi-level-speaker-adaptive-network-for/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Modeling Both Intra- And Inter-Utterance Variability for Conversational Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-both-intra-and-inter-utterance/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-both-intra-and-inter-utterance/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>MSF-SER: Enriching Acoustic Modeling with Multi-Granularity Semantics for Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-msf-ser-enriching-acoustic-modeling-with-multi/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-msf-ser-enriching-acoustic-modeling-with-multi/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Multi-Channel Speech Enhancement for Cocktail Party Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-channel-speech-enhancement-for-cocktail/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-channel-speech-enhancement-for-cocktail/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Multi-View Hierarchical Hypergraph Neural Network for Automatic Stuttering Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-view-hierarchical-hypergraph-neural-network/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-view-hierarchical-hypergraph-neural-network/</guid>
      <description>Speech Biomarkers | 7.5/10</description>
    </item>
    <item>
      <title>Multimodal Self-Attention Network with Temporal Alignment for Audio-Visual Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multimodal-self-attention-network-with-temporal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multimodal-self-attention-network-with-temporal/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Multimodal Variational Graph Network for Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multimodal-variational-graph-network-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multimodal-variational-graph-network-for/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Plug-and-Play Emotion Graphs for Compositional Prompting in Zero-Shot Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-plug-and-play-emotion-graphs-for-compositional/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-plug-and-play-emotion-graphs-for-compositional/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Prompt-Guided Mixture-of-Experts for Robust Multimodal Sentiment Analysis with Missing Modalities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-prompt-guided-mixture-of-experts-for-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-prompt-guided-mixture-of-experts-for-robust/</guid>
      <description>Speech Emotion Recognition | 8.5/10</description>
    </item>
    <item>
      <title>Rationale-Guided Learning for Multimodal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rationale-guided-learning-for-multimodal-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rationale-guided-learning-for-multimodal-emotion/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Reasoning Driven Captions to Assist Noise Robust Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reasoning-driven-captions-to-assist-noise-robust/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reasoning-driven-captions-to-assist-noise-robust/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Recovering Performance in Speech Emotion Recognition from Discrete Tokens Via Multi-Layer Fusion and Paralinguistic Feature Integration</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-recovering-performance-in-speech-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-recovering-performance-in-speech-emotion/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Scaling Ambiguity: Augmenting Human Annotation in Speech Emotion Recognition with Audio-Language Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-scaling-ambiguity-augmenting-human-annotation-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-scaling-ambiguity-augmenting-human-annotation-in/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>SmoothCLAP: Soft-Target Enhanced Contrastive Language-Audio Pretraining for Affective Computing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-smoothclap-soft-target-enhanced-contrastive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-smoothclap-soft-target-enhanced-contrastive/</guid>
      <description>Speech Emotion Recognition | 6.5/10</description>
    </item>
    <item>
      <title>Speaker Anonymisation for Speech-Based Suicide Risk Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speaker-anonymisation-for-speech-based-suicide/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speaker-anonymisation-for-speech-based-suicide/</guid>
      <description>Speaker Anonymization | 7.5/10</description>
    </item>
    <item>
      <title>Speech Emotion Recognition based on Hierarchical Transformer with Shifted Windows</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speech-emotion-recognition-based-on-hierarchical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speech-emotion-recognition-based-on-hierarchical/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Staged Diffusion with Hybrid Mixture-of-Experts (MOE) for Multimodal Sentiment Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-staged-diffusion-with-hybrid-mixture-of-experts/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-staged-diffusion-with-hybrid-mixture-of-experts/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Stress Prediction from Temporal Emotion Trajectories in Clinical Patient-Physician Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stress-prediction-from-temporal-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stress-prediction-from-temporal-emotion/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stylebench-evaluating-speech-language-models-on/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stylebench-evaluating-speech-language-models-on/</guid>
      <description>Benchmark | 8.5/10</description>
    </item>
    <item>
      <title>SURE: Synergistic Uncertainty-Aware Reasoning for Multimodal Emotion Recognition in Conversations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sure-synergistic-uncertainty-aware-reasoning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sure-synergistic-uncertainty-aware-reasoning-for/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Synthetic yet Striking? Assessing Vocal Charisma in TTS via Perceptual and Algorithmic Measures</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synthetic-yet-striking-assessing-vocal-charisma/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-synthetic-yet-striking-assessing-vocal-charisma/</guid>
      <description>Speech Synthesis | 7.5/10</description>
    </item>
    <item>
      <title>Temporal Graph Modeling for Speech Emotion Recognition Using LSTM-Aggregated Multigraph Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-temporal-graph-modeling-for-speech-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-temporal-graph-modeling-for-speech-emotion/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Test Time Adaptation for Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-test-time-adaptation-for-speech-emotion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-test-time-adaptation-for-speech-emotion/</guid>
      <description>Speech Emotion Recognition | 7.0/10</description>
    </item>
    <item>
      <title>Tpeformer: Temporal Patch Embedding Transformer</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tpeformer-temporal-patch-embedding-transformer/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tpeformer-temporal-patch-embedding-transformer/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Unrequited Emotions: Investigating the Gaps in Motivation and Practice in Speech Emotion Recognition Research</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unrequited-emotions-investigating-the-gaps-in/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unrequited-emotions-investigating-the-gaps-in/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>When Audio Matters: A Lightweight, Hierarchical Fusion Model for Speech and Non-Verbal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-audio-matters-a-lightweight-hierarchical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-audio-matters-a-lightweight-hierarchical/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Whisper-QF: Leveraging Dual Cross-Attention Q-Former for Speech Emotion Recognition With Multi-Task Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-whisper-qf-leveraging-dual-cross-attention-q/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-whisper-qf-leveraging-dual-cross-attention-q/</guid>
      <description>Speech Emotion Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Psychologically-Grounded Graph Modeling for Interpretable Depression Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-psychologically-grounded-graph-modeling-for/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-psychologically-grounded-graph-modeling-for/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>MER 2026: From Discriminative Emotion Recognition to Generative Emotion Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-mer-2026-from-discriminative-emotion-recognition/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-mer-2026-from-discriminative-emotion-recognition/</guid>
      <description>Speech Emotion Recognition | 6.0/10</description>
    </item>
    <item>
      <title>Prosody as Supervision: Bridging the Non-Verbal--Verbal for Multilingual Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-prosody-as-supervision-bridging-the-non-verbal/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-prosody-as-supervision-bridging-the-non-verbal/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-move-translating-laughter-and-tears-via-mixture/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-move-translating-laughter-and-tears-via-mixture/</guid>
      <description>This paper addresses the problem that speech-to-speech translation (S2ST) systems commonly lose the non-verbal sounds (such as laughter and crying) and emotional information present in the source speech, which severely degrades the naturalness and accuracy of cross-lingual communication. The authors make three core contributions: first, they design a scalable, automated data-synthesis pipeline that generates a large-scale, high-quality English-Chinese expressive S2ST parallel corpus, overcoming the bottleneck of scarce training data</description>
    </item>
    <item>
      <title>Deep Supervised Contrastive Learning of Pitch Contours for Robust Pitch Accent Classification in Seoul Korean</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-deep-supervised-contrastive-learning-of-pitch/</link>
      <pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-deep-supervised-contrastive-learning-of-pitch/</guid>
      <description>This paper tackles the difficult problem of mapping continuously varying fundamental-frequency (F0) contours to the discrete, invariant pitch-accent categories of Seoul Korean (e.g., LHLH, HHLH). Traditional methods are susceptible to F0 measurement noise and speaker variability. The authors propose **Dual-Glob**, a deep supervised contrastive learning framework. At its core, a **dual-branch (clean-view and augmented-view) encoder** operates in a shared</description>
    </item>
    <item>
      <title>FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-freezeempath-efficient-training-for-empathetic/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-freezeempath-efficient-training-for-empathetic/</guid>
      <description>This paper addresses three major challenges in training empathetic spoken chatbots: **scarce empathetic speech data, weak model generalization, and degradation of the LLM's general abilities caused by fine-tuning**. The authors propose **FreezeEmpath**, an efficient end-to-end training framework. Its core method **freezes the base LLM** and adopts a **semantic-emotion decoupled encoding strategy**, using an independent semantic adapter and emotion extractor to extract from</description>
    </item>
    <item>
      <title>Prosody as Supervision: Bridging the Non-Verbal--Verbal for Multilingual Speech Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-prosody-as-supervision-bridging-the-non-verbal/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-prosody-as-supervision-bridging-the-non-verbal/</guid>
      <description>This paper addresses the core bottleneck of scarce annotated data in low-resource multilingual speech emotion recognition (SER). The authors propose a disruptive paradigm: **reframing SER as an unsupervised "non-verbal-to-verbal" transfer problem**. The core hypothesis is that the prosodic emotional cues carried by non-verbal vocalizations (such as laughter and crying) are purer and more cross-lingual than those in speech, and can therefore serve as a better supervision source. To this end, the authors design **NOVA-</description>
    </item>
    <item>
      <title>SELF-EMO: Emotional Self-Evolution from Recognition to Consistent Expression</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-self-emo-emotional-self-evolution-from/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-self-emo-emotional-self-evolution-from/</guid>
      <description>This paper addresses the problem that emotion recognition in conversation (ERC) and emotional expression in dialogue systems are limited by high-quality annotated data that is both scarce and static. The **core contribution** is **SELF-EMO**, a psychologically motivated self-evolution framework. The **key method** builds a role-playing self-play paradigm in which the model simultaneously acts as an "emotion recognizer" and a "dialogue responder", and, through a "generate-filter-reuse"</description>
    </item>
  </channel>
</rss>
