<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Open-Source Tools on Speech/Audio Paper Digest</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E5%BC%80%E6%BA%90%E5%B7%A5%E5%85%B7/</link>
    <description>Recent content in Open-Source Tools on Speech/Audio Paper Digest</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Wed, 29 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E5%BC%80%E6%BA%90%E5%B7%A5%E5%85%B7/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Text-To-Text Alignment Algorithm for Better Evaluation of Modern Speech Recognition Systems</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-text-to-text-alignment-algorithm-for-better/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-text-to-text-alignment-algorithm-for-better/</guid>
      <description>Model Evaluation | 7.5/10</description>
    </item>
    <item>
      <title>Constructing Composite Features for Interpretable Music-Tagging</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-constructing-composite-features-for-interpretable/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-constructing-composite-features-for-interpretable/</guid>
      <description>Music Information Retrieval | 7.5/10</description>
    </item>
    <item>
      <title>Denoising Of Stochastic Ray Tracing Room Impulse Responses</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-denoising-of-stochastic-ray-tracing-room-impulse/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-denoising-of-stochastic-ray-tracing-room-impulse/</guid>
      <description>Spatial Audio | 7.5/10</description>
    </item>
    <item>
      <title>ECHO: Frequency-Aware Hierarchical Encoding for Variable-Length Signals</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-echo-frequency-aware-hierarchical-encoding-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-echo-frequency-aware-hierarchical-encoding-for/</guid>
      <description>Audio Classification | 9.5/10</description>
    </item>
    <item>
      <title>Evaluating High-Resolution Piano Sustain Pedal Depth Estimation with Musically Informed Metrics</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-evaluating-high-resolution-piano-sustain-pedal/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-evaluating-high-resolution-piano-sustain-pedal/</guid>
      <description>Music Information Retrieval | 8.0/10</description>
    </item>
    <item>
      <title>MNV-17: A High-Quality Performative Mandarin Dataset for Nonverbal Vocalization Recognition in Speech</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mnv-17-a-high-quality-performative-mandarin/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mnv-17-a-high-quality-performative-mandarin/</guid>
      <description>Speech Recognition | 7.5/10</description>
    </item>
    <item>
      <title>Polynomial Mixing for Efficient Self-Supervised Speech Encoders</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-polynomial-mixing-for-efficient-self-supervised/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-polynomial-mixing-for-efficient-self-supervised/</guid>
      <description>Speech Recognition | 8.0/10</description>
    </item>
    <item>
      <title>Praxy Voice: Voice-Prompt Recovery &#43; BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-praxy-voice-voice-prompt-recovery-bups-for/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-praxy-voice-voice-prompt-recovery-bups-for/</guid>
      <description>Speech Synthesis | 8.0/10</description>
    </item>
    <item>
      <title>SA-SSL-MOS: Self-Supervised Learning MOS Prediction with Spectral Augmentation for Generalized Multi-Rate Speech Assessment</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sa-ssl-mos-self-supervised-learning-mos/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sa-ssl-mos-self-supervised-learning-mos/</guid>
      <description>Speech Quality Assessment | 7.0/10</description>
    </item>
    <item>
      <title>Sidon: Fast and Robust Open-Source Multilingual Speech Restoration for Large-Scale Dataset Cleansing</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sidon-fast-and-robust-open-source-multilingual/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sidon-fast-and-robust-open-source-multilingual/</guid>
      <description>Speech Enhancement | 8.5/10</description>
    </item>
    <item>
      <title>The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion to Singing Style Conversion</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-singing-voice-conversion-challenge-2025-from/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-singing-voice-conversion-challenge-2025-from/</guid>
      <description>Singing Voice Conversion | 7.0/10</description>
    </item>
    <item>
      <title>Via Score to Performance: Efficient Human-Controllable Long Song Generation with Bar-Level Symbolic Notation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-via-score-to-performance-efficient-human/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-via-score-to-performance-efficient-human/</guid>
      <description>Music Generation | 7.5/10</description>
    </item>
    <item>
      <title>Z-Scores: A Metric for Linguistically Assessing Disfluency Removal</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-z-scores-a-metric-for-linguistically-assessing/</link>
      <pubDate>Wed, 29 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-z-scores-a-metric-for-linguistically-assessing/</guid>
      <description>Model Evaluation | 6.5/10</description>
    </item>
    <item>
      <title>Opening the Design Space: Two Years of Performance with Intelligent Musical Instruments</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-opening-the-design-space-two-years-of-performance/</link>
      <pubDate>Tue, 28 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-opening-the-design-space-two-years-of-performance/</guid>
      <description>Music Generation | 6.5/10</description>
    </item>
    <item>
      <title>Audio Video Verbal Analysis (AVVA) for Capturing Classroom Dialogues</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-video-verbal-analysis-avva-for-capturing/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-audio-video-verbal-analysis-avva-for-capturing/</guid>
      <description>Audio Question Answering | 6.0/10</description>
    </item>
    <item>
      <title>TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-tts-prism-a-perceptual-reasoning-and/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27-tts-prism-a-perceptual-reasoning-and/</guid>
      <description>Speech Quality Assessment | 7.5/10</description>
    </item>
    <item>
      <title>Speech/Audio Paper Digest 2026-04-27</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27/</link>
      <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-27/</guid>
      <description>13 speech/AI papers analyzed in total</description>
    </item>
    <item>
      <title>DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-diarizen-explained-a-tutorial-for-the-open-source/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-diarizen-explained-a-tutorial-for-the-open-source/</guid>
      <description>Speaker Diarization | 6.5/10</description>
    </item>
    <item>
      <title>Centering Ecological Goals in Automated Identification of Individual Animals</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-centering-ecological-goals-in-automated/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-centering-ecological-goals-in-automated/</guid>
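      <description>This paper addresses a key question: why have the high-accuracy algorithms reported in recent years for automated identification of individual animals (from images or sound) so rarely become routine tools in ecological practice? At its core, the method proposes an evaluation and deployment framework "centered on ecological goals", emphasizing that the usefulness of automated identification depends on the specific ecological question it serves, the available data, and the practical consequences of each type of error. Unlike prior work, which has focused mainly on algorithmic…</description>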
    </item>
  </channel>
</rss>
