Human-Computer Interaction on Speech/Audio Paper Digest

Recent content in Human-Computer Interaction on Speech/Audio Paper Digest
https://nanless.github.io/audio-paper-digest-blog/tags/%E4%BA%BA%E6%9C%BA%E4%BA%A4%E4%BA%92/
Generated by Hugo (zh-cn), Thu, 23 Apr 2026 00:00:00 +0000

CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-cointeract-physically-consistent-human-object/
Thu, 23 Apr 2026 00:00:00 +0000
1. **Problem**: When generating human-object interaction (HOI) videos, existing video diffusion models often produce collapsed hand and face structures and physical interpenetration between humans and objects. The root cause is that the models lack an understanding of 3D spatial relationships and interaction structure.
2. **Core method**: Proposes the CoInteract framework, built around a "spatially-structured co-generation" paradigm. Within a shared DiT backbone, it jointly trains an RGB appearance stream and an auxiliary …

MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-momo-a-framework-for-seamless-physical-verbal-and/
Thu, 23 Apr 2026 00:00:00 +0000
1. **Problem**: Industrial robots must frequently adapt to new tasks and environments, but existing skill-adjustment methods (such as manual reprogramming) are unfriendly to non-expert users, and a single interaction modality cannot efficiently handle every type of adjustment.
2. **Core method**: Proposes the MOMO framework, which integrates three complementary interaction modalities: kinesthetic contact (for precise spatial corrections), natural language (for high-level semantic modifications), and a graphical interface (for parameter …

Speech/Audio Paper Digest 2026-04-23
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23/
Thu, 23 Apr 2026 00:00:00 +0000
Analyzed 27 speech/AI papers in total.