<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>音乐理解 on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E9%9F%B3%E4%B9%90%E7%90%86%E8%A7%A3/</link>
    <description>Recent content in 音乐理解 on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E9%9F%B3%E4%B9%90%E7%90%86%E8%A7%A3/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Bayesian Approach to Singing Skill Evaluation Using Semitone Pitch Histogram and MCMC-Based Generated Quantities</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-bayesian-approach-to-singing-skill-evaluation/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-bayesian-approach-to-singing-skill-evaluation/</guid>
      <description>音乐理解 | 7.0/10</description>
    </item>
    <item>
      <title>Beat and Downbeat Detection: A Reformulated Approach</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beat-and-downbeat-detection-a-reformulated/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beat-and-downbeat-detection-a-reformulated/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>Controllable Embedding Transformation for Mood-Guided Music Retrieval</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-controllable-embedding-transformation-for-mood/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-controllable-embedding-transformation-for-mood/</guid>
      <description>音乐检索 | 7.5/10</description>
    </item>
    <item>
      <title>Do Foundational Audio Encoders Understand Music Structure?</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-foundational-audio-encoders-understand-music/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-foundational-audio-encoders-understand-music/</guid>
      <description>音乐信息检索 | 7.0/10</description>
    </item>
    <item>
      <title>Exploring How Audio Effects Alter Emotion with Foundation Models</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-how-audio-effects-alter-emotion-with/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-how-audio-effects-alter-emotion-with/</guid>
      <description>音乐理解 | 7.0/10</description>
    </item>
    <item>
      <title>ICASSP 2026 - 音乐理解 (Music Understanding) Paper List</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-109/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-109/</guid>
      <description>11 ICASSP 2026 papers under the 音乐理解 (music understanding) tag</description>
    </item>
    <item>
      <title>Interpretable Music Harmonic Analysis Through Multilinear Mixture of Experts</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-interpretable-music-harmonic-analysis-through/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-interpretable-music-harmonic-analysis-through/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>Investigating Modality Contribution in Audio LLMs for Music</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-investigating-modality-contribution-in-audio-llms/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-investigating-modality-contribution-in-audio-llms/</guid>
      <description>模型评估 | 6.5/10</description>
    </item>
    <item>
      <title>Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-Task Multi-Scale Network</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-estimation-of-piano-dynamics-and-metrical/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-estimation-of-piano-dynamics-and-metrical/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>MIDI-LLaMA: An Instruction-Following Multimodal LLM for Symbolic Music Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-midi-llama-an-instruction-following-multimodal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-midi-llama-an-instruction-following-multimodal/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-musetok-symbolic-music-tokenization-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-musetok-symbolic-music-tokenization-for/</guid>
      <description>音乐生成 | 8.5/10</description>
    </item>
    <item>
      <title>Rethinking Music Captioning with Music Metadata LLMs</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rethinking-music-captioning-with-music-metadata/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-rethinking-music-captioning-with-music-metadata/</guid>
      <description>音乐理解 | 7.0/10</description>
    </item>
    <item>
      <title>SAUNA: Song-Level Audio &amp; User-Listening Data Neural Alignment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sauna-song-level-audio-user-listening-data-neural/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sauna-song-level-audio-user-listening-data-neural/</guid>
      <description>音乐信息检索 | 7.0/10</description>
    </item>
    <item>
      <title>The Muse Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-muse-benchmark-probing-music-perception-and/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-muse-benchmark-probing-music-perception-and/</guid>
      <description>音乐理解 | 8.5/10</description>
    </item>
    <item>
      <title>TinyMU: A Compact Audio-Language Model for Music Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tinymu-a-compact-audio-language-model-for-music/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-tinymu-a-compact-audio-language-model-for-music/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>Toward Robust And Efficient Beat Tracking Via Beat-Aware Attention</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-toward-robust-and-efficient-beat-tracking-via/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-toward-robust-and-efficient-beat-tracking-via/</guid>
      <description>音乐理解 | 8.5/10</description>
    </item>
    <item>
      <title>Towards Effective Negation Modeling in Joint Audio-Text Models for Music</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-effective-negation-modeling-in-joint/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-effective-negation-modeling-in-joint/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>An event-based sequence modeling approach to recognizing non-triad chords with oversegmentation minimization</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-an-event-based-sequence-modeling-approach-to/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-an-event-based-sequence-modeling-approach-to/</guid>
      <description>音乐理解 | 7.5/10</description>
    </item>
    <item>
      <title>Audio Effect Estimation with DNN-Based Prediction and Search Algorithm</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-effect-estimation-with-dnn-based-prediction/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-effect-estimation-with-dnn-based-prediction/</guid>
      <description>音乐理解 | 8.0/10</description>
    </item>
    <item>
      <title>Beyond Rules: Towards Basso Continuo Personal Style Identification</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-beyond-rules-towards-basso-continuo-personal/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-beyond-rules-towards-basso-continuo-personal/</guid>
      <description>音乐理解 | 7.0/10</description>
    </item>
    <item>
      <title>ONOTE: Benchmarking Omnimodal Notation Processing for Expert-level Music Intelligence</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-onote-benchmarking-omnimodal-notation-processing/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-onote-benchmarking-omnimodal-notation-processing/</guid>
      <description>1. **Problem**: Current multimodal large models have serious deficiencies in Omnimodal Notation Processing (ONP): research is fragmented, models exhibit strong notation bias (favoring staff notation), and evaluation commonly relies on the unreliable "LLM-as-a-Judge" method, which masks systematic failures in music-theory reasoning. …</description>
    </item>
    <item>
      <title>Coexisting Tempo Traditions in Beethoven&#39;s Piano and Cello Sonatas: A K-means Clustering Analysis of Recorded Performances, 1930-2012</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-coexisting-tempo-traditions-in-beethovens-piano/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-coexisting-tempo-traditions-in-beethovens-piano/</guid>
      <description>This paper challenges the single regression model widely used in empirical research on music performance, which tends to portray historical tempo change as a one-directional, uniform process. The authors argue that such a model obscures the coexistence of multiple performance traditions. The study applies K-means clustering to bar-by-bar tempo data from more than one hundred recorded movements (1930-2012) of Beethoven's five sonatas for piano and cello (Op. 5, 69, 102) …</description>
    </item>
    <item>
      <title>TinyMU: A Compact Audio-Language Model for Music Understanding</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-tinymu-a-compact-audio-language-model-for-music/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-tinymu-a-compact-audio-language-model-for-music/</guid>
      <description>Existing large audio-language models (LALMs) have billions of parameters, are costly to train and run, and are difficult to deploy on edge devices. To address this, the paper proposes TinyMU, a compact music-language model with only 229M parameters. The authors build the MusicSkills-3.5M dataset of 3.5 million music question-answer samples covering multiple-choice, binary-judgment, and open-ended formats, and combine …</description>
    </item>
  </channel>
</rss>
