<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Human-Computer Interaction on Speech/Audio Paper Digest</title>
    <link>https://nanless.github.io/audio-paper-digest-blog/tags/%E4%BA%BA%E6%9C%BA%E4%BA%A4%E4%BA%92/</link>
    <description>Recent content in Human-Computer Interaction on Speech/Audio Paper Digest</description>
    <generator>Hugo</generator>
    <language>zh-cn</language>
    <lastBuildDate>Thu, 23 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://nanless.github.io/audio-paper-digest-blog/tags/%E4%BA%BA%E6%9C%BA%E4%BA%A4%E4%BA%92/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-cointeract-physically-consistent-human-object/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-cointeract-physically-consistent-human-object/</guid>
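      <description>1. **Problem**: When generating human-object interaction (HOI) videos, existing video diffusion models often produce collapsed hand and face structures and physical interpenetration between humans and objects. The root cause is that the models lack an understanding of 3D spatial relationships and interaction structure. 2. **Core method**: Proposes the CoInteract framework, built around a "spatially-structured co-generation" paradigm. Within a shared DiT backbone, it jointly trains an RGB appearance stream and an auxiliary</description>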
    </item>
    <item>
      <title>MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-momo-a-framework-for-seamless-physical-verbal-and/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-momo-a-framework-for-seamless-physical-verbal-and/</guid>
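      <description>1. **Problem**: Industrial robots must frequently adapt to new tasks and environments, but existing skill-adjustment methods (such as manual reprogramming) are unfriendly to non-expert users, and no single interaction modality can efficiently handle every type of adjustment. 2. **Core method**: Proposes the MOMO framework, which integrates three complementary interaction modalities: kinesthetic contact (for precise spatial corrections), natural language (for high-level semantic modifications), and a graphical interface (for parameter</description>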
    </item>
    <item>
      <title>Speech/Audio Paper Digest 2026-04-23</title>
      <link>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23/</link>
      <pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23/</guid>
      <description>27 speech/AI papers analyzed in total</description>
    </item>
  </channel>
</rss>
