<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>全双工交互 on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E5%85%A8%E5%8F%8C%E5%B7%A5%E4%BA%A4%E4%BA%92/</link>
    <description>Recent content in 全双工交互 on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Mon, 20 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E5%85%A8%E5%8F%8C%E5%B7%A5%E4%BA%A4%E4%BA%92/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Beyond Monologue: Interactive Talking-Listening Avatar Generation with Conversational Audio Context-Aware Kernels</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-beyond-monologue-interactive-talking-listening/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-beyond-monologue-interactive-talking-listening/</guid>
      <description>This paper tackles the core challenge of moving avatar generation from one-way "monologue" synthesis to natural "full-duplex" interaction. **Core problem:** existing methods are either rigid in their reactions due to strict frame alignment, or break lip synchronization by introducing global attention. **Key method:** a unified attention architecture built on multi-head Gaussian kernels (MHGK), which assigns Gaussian receptive fields ranging from narrow to wide across different attention heads, enabling</description>
    </item>
    <item>
      <title>语音/音频论文速递 2026-04-20</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20/</guid>
      <description>24 speech/AI papers analyzed in total</description>
    </item>
  </channel>
</rss>
