Voice Conversion on Speech/Audio Paper Digest

https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E8%BD%AC%E6%8D%A2/
Recent content in Voice Conversion on Speech/Audio Paper Digest
Hugo zh-cn Wed, 29 Apr 2026 00:00:00 +0000

Conditional Diffusion Models for Mental Health-Preserving Voice Conversion
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-conditional-diffusion-models-for-mental-health/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 8.0/10

CosyAccent: Duration-Controllable Accent Normalization using Source-Synthesis Training Data
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cosyaccent-duration-controllable-accent/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.8/10

Expressive Voice Conversion with Controllable Emotional Intensity
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-expressive-voice-conversion-with-controllable/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.5/10

FAC-FACodec: Controllable Zero-Shot Foreign Accent Conversion with Factorized Speech Codec
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fac-facodec-controllable-zero-shot-foreign-accent/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 8.0/10

ICASSP 2026 - Voice Conversion Paper List
https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-082/
Wed, 29 Apr 2026 00:00:00 +0000
9 ICASSP 2026 papers on voice conversion in total

Leveraging Text-to-Speech and Voice Conversion as Data Augmentation for Alzheimer's Disease Detection from Spontaneous Speech
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-text-to-speech-and-voice-conversion-as/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Biomarkers | 7.0/10

Lightweight and Perceptually-Guided Voice Conversion for Electro-Laryngeal Speech
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lightweight-and-perceptually-guided-voice/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.5/10

MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion with Increased Controllability via Multiple Guidances
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-maskvct-masked-voice-codec-transformer-for-zero/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 6.5/10

MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanvc-lightweight-and-streaming-zero-shot-voice/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.5/10

MeanVoiceFlow: One-Step Nonparallel Voice Conversion with Mean Flows
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanvoiceflow-one-step-nonparallel-voice/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.0/10

QE-XVC: Zero-Shot Cross-Lingual Voice Conversion via Query-Enhancement and Conditional Flow Matching
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-qe-xvc-zero-shot-cross-lingual-voice-conversion/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion | 7.5/10

Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-quantifying-speaker-embedding-phonological-rule/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Synthesis | 7.0/10

Robust Accent Identification via Voice Conversion and Non-Timbral Embeddings
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-robust-accent-identification-via-voice-conversion/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Recognition | 7.5/10

S2Voice: Style-Aware Autoregressive Modeling with Enhanced Conditioning for Singing Style Conversion
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-s2voice-style-aware-autoregressive-modeling-with/
Wed, 29 Apr 2026 00:00:00 +0000
Singing Voice Conversion | 7.0/10

Speaker Anonymisation for Speech-Based Suicide Risk Detection
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speaker-anonymisation-for-speech-based-suicide/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Anonymization | 7.5/10

StylePitcher: Generating Style-Following and Expressive Pitch Curves for Versatile Singing Tasks
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stylepitcher-generating-style-following-and/
Wed, 29 Apr 2026 00:00:00 +0000
Singing Voice Synthesis | 7.5/10

Target Speaker Anonymization in Multi-Speaker Recordings
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-target-speaker-anonymization-in-multi-speaker/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Anonymization | 7.6/10

VChangeCodec: An Ultra Low-Complexity Neural Speech Codec with Built-In Voice Changer for Customized Real-Time Communication
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-vchangecodec-an-ultra-low-complexity-neural/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Conversion, Speech Enhancement | 8.0/10

X-VC: Zero-shot Streaming Voice Conversion in Codec Space
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-23-x-vc-zero-shot-streaming-voice-conversion-in/
Thu, 23 Apr 2026 00:00:00 +0000
1. **Problem**: Zero-shot voice conversion must deliver both high-quality transfer of speaker characteristics and low-latency streaming inference, a challenge that has not yet been solved well.
2. **Core method**: The paper proposes the X-VC system, which performs one-step conversion in the latent space of the pretrained SAC speech codec. At its core is a dual-conditioned acoustic converter that jointly processes the codec latent representation of the source speech and the frame-level mel ...

MimicLM: Zero-Shot Voice Imitation through Autoregressive Modeling of Pseudo-Parallel Speech Corpora
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-mimiclm-zero-shot-voice-imitation-through/
Tue, 21 Apr 2026 00:00:00 +0000
This paper targets the core bottleneck in zero-shot voice imitation: the scarcity of high-quality parallel training data. Conventional methods either rely on complex disentanglement architectures or use synthetic speech as the training target, which caps output quality at the capability of the synthesis system. The authors propose a new framework named **MimicLM**, whose key innovation is a **"role-swapping" data construction strategy**: TTS-generated speech is used as the ...

X-VC: Zero-shot Streaming Voice Conversion in Codec Space
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-x-vc-zero-shot-streaming-voice-conversion-in/
Sun, 19 Apr 2026 00:00:00 +0000
This paper addresses the core challenge that **high-fidelity speaker transfer** and **low-latency streaming inference** are hard to achieve together in zero-shot voice conversion. The authors propose the **X-VC** system, whose key innovation lies in performing, in the latent space of a pretrained neural codec (SAC), one-step ...