<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>全双工交互 on 语音/音频论文速递</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E5%85%A8%E5%8F%8C%E5%B7%A5%E4%BA%A4%E4%BA%92/</link>
    <description>Recent content in 全双工交互 on 语音/音频论文速递</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Mon, 20 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E5%85%A8%E5%8F%8C%E5%B7%A5%E4%BA%A4%E4%BA%92/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Beyond Monologue: Interactive Talking-Listening Avatar Generation with Conversational Audio Context-Aware Kernels</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-beyond-monologue-interactive-talking-listening/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20-beyond-monologue-interactive-talking-listening/</guid>
      <description>This paper tackles the core challenge of moving avatar generation from one-way "monologue" synthesis to natural "full-duplex" interaction. **Core problem:** existing methods are either rigid in their reactions due to strict frame alignment, or break lip synchronization by introducing global attention. **Key method:** a unified attention architecture built on multi-head Gaussian kernels (MHGK), which assigns Gaussian receptive fields ranging from narrow to wide across different attention heads, enabling</description>
    </item>
    <item>
      <title>语音/音频论文速递 2026-04-20</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-20/</guid>
      <description>24 speech/AI papers analyzed in total</description>
    </item>
  </channel>
</rss>
