<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Audio Classification on Speech/Audio Paper Digest</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E9%9F%B3%E9%A2%91%E5%88%86%E7%B1%BB/</link>
    <description>Recent content in Audio Classification on Speech/Audio Paper Digest</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E9%9F%B3%E9%A2%91%E5%88%86%E7%B1%BB/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Consistent Learning Depression Detection Framework Integrating Multi-View Attention</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-consistent-learning-depression-detection/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-consistent-learning-depression-detection/</guid>
      <description>Speech Biomarkers | 6.5/10</description>
    </item>
    <item>
      <title>A Dynamic Gated Cross-Attention Framework for Audio-Text Apparent Personality Analysis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-dynamic-gated-cross-attention-framework-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-dynamic-gated-cross-attention-framework-for/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>A LLM-Driven Acoustic Semantic Enriched Framework for Underwater Acoustic Target Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-llm-driven-acoustic-semantic-enriched-framework/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-llm-driven-acoustic-semantic-enriched-framework/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>A Metric Learning Approach to Heart Murmur Detection from Phonocardiogram Recordings</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-metric-learning-approach-to-heart-murmur/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-metric-learning-approach-to-heart-murmur/</guid>
      <description>Audio Classification | 7.7/10</description>
    </item>
    <item>
      <title>A Robust KNN Approach for Multi-Class Laryngeal Disease Detection using MFCC Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-robust-knn-approach-for-multi-class-laryngeal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-robust-knn-approach-for-multi-class-laryngeal/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>ACAVCaps: Enabling Large-Scale Training for Fine-Grained and Diverse Audio Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acavcaps-enabling-large-scale-training-for-fine/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acavcaps-enabling-large-scale-training-for-fine/</guid>
      <description>Audio Classification | 8.5/10</description>
    </item>
    <item>
      <title>Acoustic Feedback Cancellation in Hearing Aids Exploiting an Inertial Sensor</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-feedback-cancellation-in-hearing-aids/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-feedback-cancellation-in-hearing-aids/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Acoustic Non-Stationarity Objective Assessment with Hard Label Criteria for Supervised Learning Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-non-stationarity-objective-assessment/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-non-stationarity-objective-assessment/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Adaptive Embedding Fusion with Contrastive Learning for Robust Fully Few-Shot Class-Incremental Audio Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-embedding-fusion-with-contrastive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-embedding-fusion-with-contrastive/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Adaptive Per-Channel Energy Normalization Front-End for Robust Audio Signal Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-per-channel-energy-normalization-front/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-per-channel-energy-normalization-front/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Adversarial Rivalry Learning for Music Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-rivalry-learning-for-music/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-rivalry-learning-for-music/</guid>
      <description>Music Classification | 6.5/10</description>
    </item>
    <item>
      <title>AFT: An Exemplar-Free Class Incremental Learning Method for Environmental Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aft-an-exemplar-free-class-incremental-learning/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>AnimalCLAP: Taxonomy-Aware Language-Audio Pretraining for Species Recognition and Trait Inference</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-animalclap-taxonomy-aware-language-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-animalclap-taxonomy-aware-language-audio/</guid>
      <description>Audio Classification | 8.0/10</description>
    </item>
    <item>
      <title>Attentive Masked Self-Distillation for Respiratory Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-attentive-masked-self-distillation-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-attentive-masked-self-distillation-for/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Audio Classification Models are Vulnerable to Filter Perturbations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-classification-models-are-vulnerable-to/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-classification-models-are-vulnerable-to/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>AUDIOCARDS: Structured Metadata Improves Audio Language Models for Sound Design</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiocards-structured-metadata-improves-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiocards-structured-metadata-improves-audio/</guid>
      <description>Audio Retrieval | 7.5/10</description>
    </item>
    <item>
      <title>AudioFuse: Unified Spectral-Temporal Learning Via A Hybrid VIT-1D CNN Architecture for Phonocardiogram Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiofuse-unified-spectral-temporal-learning-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audiofuse-unified-spectral-temporal-learning-via/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-automated-dysphagia-screening-using-noninvasive/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-automated-dysphagia-screening-using-noninvasive/</guid>
      <description>Audio Classification | 8.0/10</description>
    </item>
    <item>
      <title>Benchmarking Music Autotagging with MGPHot Expert Annotations vs. Generic Tag Datasets</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-benchmarking-music-autotagging-with-mgphot-expert/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-benchmarking-music-autotagging-with-mgphot-expert/</guid>
      <description>Music Information Retrieval | 7.5/10</description>
    </item>
    <item>
      <title>Beyond Mapping: Domain-Invariant Representations via Spectral Embedding of Optimal Transport Plans</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beyond-mapping-domain-invariant-representations/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beyond-mapping-domain-invariant-representations/</guid>
      <description>Domain Adaptation | 7.5/10</description>
    </item>
    <item>
      <title>Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content?</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-can-hierarchical-cross-modal-fusion-predict-human/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-can-hierarchical-cross-modal-fusion-predict-human/</guid>
      <description>Model Evaluation | 6.0/10</description>
    </item>
    <item>
      <title>Constructing Composite Features for Interpretable Music-Tagging</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-constructing-composite-features-for-interpretable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-constructing-composite-features-for-interpretable/</guid>
      <description>Music Information Retrieval | 7.5/10</description>
    </item>
    <item>
      <title>Cooperative Multi-Agent Reinforcement Learning for Adaptive Aggregation in Semi-Supervised Federated Learning with non-IID Data</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cooperative-multi-agent-reinforcement-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cooperative-multi-agent-reinforcement-learning/</guid>
      <description>Federated Learning | 7.0/10</description>
    </item>
    <item>
      <title>Directly Trained Spiking Neural Networks with Adaptive Phase Coding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-directly-trained-spiking-neural-networks-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-directly-trained-spiking-neural-networks-with/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>ECHO: Frequency-Aware Hierarchical Encoding for Variable-Length Signals</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-echo-frequency-aware-hierarchical-encoding-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-echo-frequency-aware-hierarchical-encoding-for/</guid>
      <description>Audio Classification | 9.5/10</description>
    </item>
    <item>
      <title>Empowering Multimodal Respiratory Sound Classification with Counterfactual Adversarial Debiasing for Out-of-Distribution Robustness</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-empowering-multimodal-respiratory-sound/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-empowering-multimodal-respiratory-sound/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Enhanced Generative Machine Listener</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhanced-generative-machine-listener/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhanced-generative-machine-listener/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Estimating Respiratory Effort from Nocturnal Breathing Sounds for Obstructive Sleep Apnoea Screening</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-estimating-respiratory-effort-from-nocturnal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-estimating-respiratory-effort-from-nocturnal/</guid>
      <description>Audio Classification | 6.5/10</description>
    </item>
    <item>
      <title>FOCA: Multimodal Malware Classification via Hyperbolic Cross-Attention</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-foca-multimodal-malware-classification-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-foca-multimodal-malware-classification-via/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Hair Noise Analysis and Mitigation for Smart Glasses Audio Captures</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hair-noise-analysis-and-mitigation-for-smart/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hair-noise-analysis-and-mitigation-for-smart/</guid>
      <description>Speech Enhancement | 7.5/10</description>
    </item>
    <item>
      <title>Hanui: Harnessing Distributional Discrepancies for Singing Voice Deepfake Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hanui-harnessing-distributional-discrepancies-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hanui-harnessing-distributional-discrepancies-for/</guid>
      <description>Audio Deepfake Detection | 8.0/10</description>
    </item>
    <item>
      <title>HFSQVAE: Hierarchical Vector Quantization with Residuals for Frequency-Specific Embedding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hfsqvae-hierarchical-vector-quantization-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hfsqvae-hierarchical-vector-quantization-with/</guid>
      <description>Audio Generation | 7.0/10</description>
    </item>
    <item>
      <title>Hierarchical Activity Recognition and Captioning from Long-Form Audio</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hierarchical-activity-recognition-and-captioning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hierarchical-activity-recognition-and-captioning/</guid>
      <description>Audio Event Detection | 7.5/10</description>
    </item>
    <item>
      <title>ICASSP 2026 - Audio Classification Paper List</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-117/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-117/</guid>
      <description>39 ICASSP 2026 papers on Audio Classification</description>
    </item>
    <item>
      <title>Incremental Learning for Audio Classification with Hebbian Deep Neural Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-incremental-learning-for-audio-classification/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-incremental-learning-for-audio-classification/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Influence-Aware Curation and Active Selection for Industrial and Surveillance Sound Events</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-aware-curation-and-active-selection-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-aware-curation-and-active-selection-for/</guid>
      <description>Audio Event Detection | 7.0/10</description>
    </item>
    <item>
      <title>Input-Adaptive Differentiable Filterbanks via Hypernetworks for Robust Speech Processing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-input-adaptive-differentiable-filterbanks-via/</guid>
      <description>Speech Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Keeping Models Listening: Segment- and time-aware attention rescaling at decoding time</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-keeping-models-listening-segment-and-time-aware/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-keeping-models-listening-segment-and-time-aware/</guid>
      <description>Audio Question Answering | 7.5/10</description>
    </item>
    <item>
      <title>Learnable Mel-Frontend for Robust Underwater Acoustic Target Detection under Non-Target Interference</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learnable-mel-frontend-for-robust-underwater/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learnable-mel-frontend-for-robust-underwater/</guid>
      <description>Audio Classification | 6.5/10</description>
    </item>
    <item>
      <title>Learning Domain-Robust Bioacoustic Representations for Mosquito Species Classification with Contrastive Learning and Distribution Alignment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learning-domain-robust-bioacoustic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-learning-domain-robust-bioacoustic/</guid>
      <description>Bioacoustics | 7.5/10</description>
    </item>
    <item>
      <title>LenslessMic: Audio Encryption and Authentication via Lensless Computational Imaging</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lenslessmic-audio-encryption-and-authentication/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lenslessmic-audio-encryption-and-authentication/</guid>
      <description>Audio Security | 7.5/10</description>
    </item>
    <item>
      <title>Leveraging prediction entropy for Automatic prompt weighting in Zero-Shot Audio-Language Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-prediction-entropy-for-automatic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-prediction-entropy-for-automatic/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Modeling Inter-Segment Relationships in Speech for Dementia Detection with Audio Spectrogram Transformers and Graph Attention Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-inter-segment-relationships-in-speech/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-inter-segment-relationships-in-speech/</guid>
      <description>Speech Biomarkers | 7.0/10</description>
    </item>
    <item>
      <title>More Than a Shortcut: A Hyperbolic Approach to Early-Exit Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-more-than-a-shortcut-a-hyperbolic-approach-to/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-more-than-a-shortcut-a-hyperbolic-approach-to/</guid>
      <description>Audio Event Detection | 8.0/10</description>
    </item>
    <item>
      <title>Noise-Robust Contrastive Learning with an MFCC-Conformer for Coronary Artery Disease Detection</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-contrastive-learning-with-an-mfcc/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-noise-robust-contrastive-learning-with-an-mfcc/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Non-Line-of-Sight Vehicle Detection via Audio-Visual Fusion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-non-line-of-sight-vehicle-detection-via-audio/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-non-line-of-sight-vehicle-detection-via-audio/</guid>
      <description>Audio Classification | 8.0/10</description>
    </item>
    <item>
      <title>One Model–Three Tasks: Discovering a Shared Winning Ticket for Low-Complexity Audio Intelligence</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-one-modelthree-tasks-discovering-a-shared-winning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-one-modelthree-tasks-discovering-a-shared-winning/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Optimizing Domain-Adaptive Self-Supervised Learning for Clinical Voice-Based Disease Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-optimizing-domain-adaptive-self-supervised/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-optimizing-domain-adaptive-self-supervised/</guid>
      <description>Speech Biomarkers | 7.0/10</description>
    </item>
    <item>
      <title>PADAM: Perceptual Audio Defect Assessment Model</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-padam-perceptual-audio-defect-assessment-model/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-padam-perceptual-audio-defect-assessment-model/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>PC-MCL: Patient-Consistent Multi-Cycle Learning with Multi-Label Bias Correction for Respiratory Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-pc-mcl-patient-consistent-multi-cycle-learning/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-pc-mcl-patient-consistent-multi-cycle-learning/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Reading Between the Waves: Robust Topic Segmentation Using Inter-Sentence Audio Features</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reading-between-the-waves-robust-topic/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reading-between-the-waves-robust-topic/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Reliable AI via Age-Balanced Validation: Fair Model Selection for Parkinson’s Detection from Voice</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reliable-ai-via-age-balanced-validation-fair/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reliable-ai-via-age-balanced-validation-fair/</guid>
      <description>Speech Biomarkers | 7.5/10</description>
    </item>
    <item>
      <title>RMODGDF: A Robust STFT-Derived Feature for Musical Instrument Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rmodgdf-a-robust-stft-derived-feature-for-musical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rmodgdf-a-robust-stft-derived-feature-for-musical/</guid>
      <description>Music Information Retrieval | 7.0/10</description>
    </item>
    <item>
      <title>S-SONDO: Self-Supervised Knowledge Distillation for General Audio Foundation Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-s-sondo-self-supervised-knowledge-distillation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-s-sondo-self-supervised-knowledge-distillation/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Semantic-Guided Pseudo-Feature Attention Network for Audio-Visual Zero-Shot Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-semantic-guided-pseudo-feature-attention-network/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-semantic-guided-pseudo-feature-attention-network/</guid>
      <description>Audio Classification, Zero-Shot Learning | 7.0/10</description>
    </item>
    <item>
      <title>SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-slap-scalable-language-audio-pretraining-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-slap-scalable-language-audio-pretraining-with/</guid>
      <description>Audio Retrieval | 8.0/10</description>
    </item>
    <item>
      <title>Snore Sound Classification Based on Physiological Features and Adaptive Loss Function</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-snore-sound-classification-based-on-physiological/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-snore-sound-classification-based-on-physiological/</guid>
      <description>Audio Classification | 6.5/10</description>
    </item>
    <item>
      <title>Speech Emotion Recognition based on Hierarchical Transformer with Shifted Windows</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speech-emotion-recognition-based-on-hierarchical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speech-emotion-recognition-based-on-hierarchical/</guid>
      <description>Speech Emotion Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Spiking Temporal-Enhanced Network for Zero-Shot Audio-Visual Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spiking-temporal-enhanced-network-for-zero-shot/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spiking-temporal-enhanced-network-for-zero-shot/</guid>
      <description>Audio Classification | 7.0/10</description>
    </item>
    <item>
      <title>Testing The Efficient Coding Hypothesis Beyond Humans: The Auditory Kernels of Bat Vocalizations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-testing-the-efficient-coding-hypothesis-beyond/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-testing-the-efficient-coding-hypothesis-beyond/</guid>
      <description>Bioacoustics | 7.5/10</description>
    </item>
    <item>
      <title>Thinking While Listening: Simple Test Time Scaling for Audio Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-thinking-while-listening-simple-test-time-scaling/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-thinking-while-listening-simple-test-time-scaling/</guid>
      <description>Audio Classification | 6.5/10</description>
    </item>
    <item>
      <title>Timbre-Aware Audio Difference Captioning for Anomalous Machine Sounds without Paired Training Data via Synthetic Perturbations</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-timbre-aware-audio-difference-captioning-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-timbre-aware-audio-difference-captioning-for/</guid>
      <description>Audio Classification | 7.5/10</description>
    </item>
    <item>
      <title>Timbre-Based Pretraining with Pseudo-Labels for Multi-Instrument Automatic Music Transcription</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-timbre-based-pretraining-with-pseudo-labels-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-timbre-based-pretraining-with-pseudo-labels-for/</guid>
      <description>Music information retrieval | 7.0/10</description>
    </item>
    <item>
      <title>Transfer Learning for Paediatric Sleep Apnoea Detection using Physiology-Guided Acoustic Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transfer-learning-for-paediatric-sleep-apnoea/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transfer-learning-for-paediatric-sleep-apnoea/</guid>
      <description>Audio classification | 7.0/10</description>
    </item>
    <item>
      <title>Transferable Audio Lottery Tickets: Gradient Accumulation for Extreme Sparsity</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-transferable-audio-lottery-tickets-gradient/</guid>
      <description>Audio classification | 7.0/10</description>
    </item>
    <item>
      <title>UMV: A Mixture-Of-Experts Vision Transformer with Multi-Spectrogram Fusion for Underwater Ship Noise Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-umv-a-mixture-of-experts-vision-transformer-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-umv-a-mixture-of-experts-vision-transformer-with/</guid>
      <description>Audio classification | 7.5/10</description>
    </item>
    <item>
      <title>Unsupervised Discovery and Analysis of the Vocal Repertoires and Patterns of Select Corvid Species</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unsupervised-discovery-and-analysis-of-the-vocal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unsupervised-discovery-and-analysis-of-the-vocal/</guid>
      <description>Bioacoustics | 7.5/10</description>
    </item>
    <item>
      <title>UVT-LM: Unifying Visual and Tactile Perception with Language Model</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-uvt-lm-unifying-visual-and-tactile-perception/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-uvt-lm-unifying-visual-and-tactile-perception/</guid>
      <description>Cross-modal | 7.0/10</description>
    </item>
    <item>
      <title>WaveSpikeNet: A Wavelet-Spiking Fusion Architecture for Audio Classification on Edge Devices</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wavespikenet-a-wavelet-spiking-fusion/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-wavespikenet-a-wavelet-spiking-fusion/</guid>
      <description>Audio classification | 7.5/10</description>
    </item>
    <item>
      <title>When Audio Matters: A Lightweight, Hierarchical Fusion Model for Speech and Non-Verbal Emotion Recognition</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-audio-matters-a-lightweight-hierarchical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-when-audio-matters-a-lightweight-hierarchical/</guid>
      <description>Speech emotion recognition | 8.0/10</description>
    </item>
    <item>
      <title>Meta-Ensemble Learning with Diverse Data Splits for Improved Respiratory Sound Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-meta-ensemble-learning-with-diverse-data-splits/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-meta-ensemble-learning-with-diverse-data-splits/</guid>
      <description>Audio classification | 8.0/10</description>
    </item>
    <item>
      <title>Audio Effect Estimation with DNN-Based Prediction and Search Algorithm</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-effect-estimation-with-dnn-based-prediction/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-effect-estimation-with-dnn-based-prediction/</guid>
      <description>Music understanding | 8.0/10</description>
    </item>
    <item>
      <title>Deep Hierarchical Knowledge Loss for Fault Intensity Diagnosis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-deep-hierarchical-knowledge-loss-for-fault/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-deep-hierarchical-knowledge-loss-for-fault/</guid>
      <description>1.  **Problem addressed**: Conventional fault intensity diagnosis treats each fault type as an independent label, ignoring the inherent hierarchical dependencies among physical states (e.g. "cavitation" is the parent class of "incipient cavitation", "stable cavitation", and so on), which limits model performance and robustness. 2.  **Core method**: A general framework named DHK, built around two new loss functions, including a **hierarchical tree loss**</description>
    </item>
    <item>
      <title>Explicit Dropout: Deterministic Regularization for Transformer Architectures</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-explicit-dropout-deterministic-regularization-for/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-explicit-dropout-deterministic-regularization-for/</guid>
      <description>This paper targets the drawbacks of conventional dropout: reliance on random masks, opaque regularization effects, and imprecise control. Its core idea is a deterministic formulation that recasts dropout as an explicit regularization term added directly to the training loss, with derived expressions for the attention mechanism (Q, K, V) and the feed-forward networks of Transformer architectures.</description>
    </item>
    <item>
      <title>Environmental Sound Deepfake Detection Using Deep-Learning Framework</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-environmental-sound-deepfake-detection-using-deep/</link>
      <pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-environmental-sound-deepfake-detection-using-deep/</guid>
      <description>For the emerging task of deepfake detection on environmental sounds (e.g. sound events, acoustic scenes), this paper proposes a systematic deep-learning framework. Its **core contribution** is a large-scale experimental study that evaluates different spectrograms (Mel, CQT, Gammatone), several CNN architectures (ResNet, Inception, etc.), and the pretrained BEATs model on this task.</description>
    </item>
    <item>
      <title>Incremental learning for audio classification with Hebbian Deep Neural Networks</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-incremental-learning-for-audio-classification/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-incremental-learning-for-audio-classification/</guid>
      <description>This paper proposes a biologically inspired approach to incremental (continual) learning for audio classification, targeting the "catastrophic forgetting" of old knowledge that deep models exhibit when learning new tasks. The authors are the first to combine **Hebbian learning** (an unsupervised, feedback-free learning rule based on synchronous neuron activation) with **incremental learning**, and design a **kernel plasticity** mechanism.</description>
    </item>
    <item>
      <title>Adaptive Test-Time Scaling for Zero-Shot Respiratory Audio Classification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-adaptive-test-time-scaling-for-zero-shot/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-adaptive-test-time-scaling-for-zero-shot/</guid>
      <description>This paper addresses the compute wasted by one-size-fits-all inference in zero-shot respiratory audio classification. It proposes TRIAGE, a three-tier adaptive inference pipeline: the first tier (Tier-L) performs fast label-text similarity matching; if confidence is insufficient, the input escalates to the second tier.</description>
    </item>
    <item>
      <title>Classical Machine Learning Baselines for Deepfake Audio Detection on the Fake-or-Real Dataset</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-classical-machine-learning-baselines-for-deepfake/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-classical-machine-learning-baselines-for-deepfake/</guid>
      <description>This paper addresses the lack of transparent, interpretable baselines for deepfake audio detection. Using classical machine learning, the authors build a complete detection pipeline on the Fake-or-Real (FoR) dataset, working from high-fidelity (44.1 kHz) audio.</description>
    </item>
    <item>
      <title>Comparison of window shapes and lengths in short-time feature extraction for classification of heart sound signals</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-comparison-of-window-shapes-and-lengths-in-short/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-comparison-of-window-shapes-and-lengths-in-short/</guid>
      <description>For heart sound (PCG) classification, where signal non-stationarity motivates sliding-window segmentation for short-time feature extraction, this paper presents an experimental evaluation of window shapes and lengths, a design choice that has lacked systematic study. The authors use a bidirectional long short-term memory network.</description>
    </item>
    <item>
      <title>Elastic Net Regularization and Gabor Dictionary for Classification of Heart Sound Signals using Deep Learning</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-elastic-net-regularization-and-gabor-dictionary/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-elastic-net-regularization-and-gabor-dictionary/</guid>
      <description>This paper tackles multi-class classification of heart sound (PCG) signals to support automatic diagnosis of cardiovascular disease. Its core contribution is a feature-extraction framework that combines an **optimized Gabor dictionary** with **elastic net regularization**, paired with a **CNN-LSTM** deep-learning network.</description>
    </item>
    <item>
      <title>Room compensation for loudspeaker reproduction using a supporting source</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-room-compensation-for-loudspeaker-reproduction/</link>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-room-compensation-for-loudspeaker-reproduction/</guid>
      <description>Conventional room compensation corrects only the spectrum (timbre) and cannot control spatial perception such as the sense of distance. This paper proposes a novel method that introduces a delayed, spectrally filtered supporting loudspeaker to selectively add energy to the room's reverberant sound field.</description>
    </item>
  </channel>
</rss>
