Audio Security on Speech/Audio Paper Digest

https://nanless.github.io/audio-paper-digest-blog/tags/%E9%9F%B3%E9%A2%91%E5%AE%89%E5%85%A8/
Recent content in Audio Security on Speech/Audio Paper Digest | Hugo | zh-cn | Wed, 29 Apr 2026 00:00:00 +0000

A Feature-Optimized Audio Watermarking Algorithm with Adaptive Embedding Strength
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-feature-optimized-audio-watermarking-algorithm/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Audio-Text Jailbreak Attack on Large Audio-Language Models: Towards Generality and Stealthiness
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-audio-text-jailbreak-attack-on-large-audio/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.0/10

AURA: A Stegaformer-Based Scalable Deep Audio Watermark with Extreme Robustness
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aura-a-stegaformer-based-scalable-deep-audio/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Watermarking | 7.5/10

Bloodroot: When Watermarking Turns Poisonous for Stealthy Backdoor
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bloodroot-when-watermarking-turns-poisonous-for/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Co-Initialization of Control Filter and Secondary Path via Meta-Learning for Active Noise Control
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-co-initialization-of-control-filter-and-secondary/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Cross-Domain Contrastive Learning with Dynamic Threshold Calibration for Source Speaker Tracing
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-cross-domain-contrastive-learning-with-dynamic/
Wed, 29 Apr 2026 00:00:00 +0000
Speaker Verification | 8.0/10

Disentangled Authenticity Representation for Partially Deepfake Audio Localization
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-disentangled-authenticity-representation-for/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Deepfake Detection | 6.5/10

Emotional Damage: Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-emotional-damage-investigating-safety/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Erasing Your Voice Before it’s Heard: Training-Free Speaker Unlearning for Zero-Shot Text-to-Speech
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-erasing-your-voice-before-its-heard-training-free/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Synthesis | 7.5/10

HVAC-EAR: Eavesdropping Human Speech Using HVAC Systems
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hvac-ear-eavesdropping-human-speech-using-hvac/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 8.5/10

ICASSP 2026 - Audio Security Paper List
https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-125/
Wed, 29 Apr 2026 00:00:00 +0000
11 ICASSP 2026 papers on audio security in total

Impact of Phonetics on Speaker Identity in Adversarial Voice Attack
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-impact-of-phonetics-on-speaker-identity-in/
Wed, 29 Apr 2026 00:00:00 +0000
Speaker Verification | 7.0/10

LenslessMic: Audio Encryption and Authentication via Lensless Computational Imaging
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lenslessmic-audio-encryption-and-authentication/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Linguard: Authenticating Speech Recordings Using Speech Recognition and Watermark
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-linguard-authenticating-speech-recordings-using/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 6.5/10

Membership Inference Attack against Music Diffusion Models via Generative Manifold Perturbation
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-membership-inference-attack-against-music/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Mitigating Data Replication in Text-to-Audio Generative Diffusion Models Through Anti-Memorization Guidance
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mitigating-data-replication-in-text-to-audio/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Generation | 7.5/10

Multi-Task Transformer for Explainable Speech Deepfake Detection via Formant Modeling
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-task-transformer-for-explainable-speech/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Forgery Detection | 7.5/10

PADAM: Perceptual Audio Defect Assessment Model
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-padam-perceptual-audio-defect-assessment-model/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Classification | 7.0/10

PRoADS: Provably Secure And Robust Audio Diffusion Steganography With Latent Optimization And Backward Euler Inversion
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-proads-provably-secure-and-robust-audio-diffusion/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 6.5/10

RoCo: Robust Code for Fast and Effective Proactive Defense against Voice Cloning Attack
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-roco-robust-code-for-fast-and-effective-proactive/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

The Impact of Audio Watermarking on Audio Anti-Spoofing Countermeasures
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-impact-of-audio-watermarking-on-audio-anti/
Wed, 29 Apr 2026 00:00:00 +0000
Audio Deepfake Detection | 8.5/10

Training Dynamics-Aware Multi-Factor Curriculum Learning for Target Speaker Extraction
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-training-dynamics-aware-multi-factor-curriculum/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Separation | 7.0/10

VoxMorph: Scalable Zero-Shot Voice Identity Morphing via Disentangled Embeddings
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-voxmorph-scalable-zero-shot-voice-identity/
Wed, 29 Apr 2026 00:00:00 +0000
Voice Cloning | 9.0/10

ZK-VSA: Zero-Knowledge Verifiable Speaker Anonymization Leveraging Phase Vocoder with Time-Scale Modification
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-zk-vsa-zero-knowledge-verifiable-speaker/
Wed, 29 Apr 2026 00:00:00 +0000
Speech Anonymization | 7.5/10

Misinformation Span Detection in Videos via Audio Transcripts
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-misinformation-span-detection-in-videos-via-audio/
Fri, 24 Apr 2026 00:00:00 +0000
Audio Security | 7.5/10

Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-benign-fine-tuning-breaks-safety-alignment-in/
Wed, 22 Apr 2026 00:00:00 +0000
This paper presents the first systematic study of how fine-tuning on benign (harmless) audio data damages the safety alignment of large audio models. **The problem addressed** is whether routine fine-tuning that users perform to improve model performance can inadvertently break the model's safety guardrails. **For the method**, the authors propose a filtering framework based on embedding-space proximity that, along semantic, acoustic, and mixed dimensions, selectively uses benign data that lies close to harmful content in representation space…

Environmental Sound Deepfake Detection Using Deep-Learning Framework
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-22-environmental-sound-deepfake-detection-using-deep/
Wed, 22 Apr 2026 00:00:00 +0000
This paper proposes a systematic deep-learning framework for the emerging task of deepfake detection on environmental sounds (such as sound events and sound scenes). **The core contribution** is an extensive set of experiments that systematically evaluates different spectrograms (MEL, CQT, Gammatone), several CNN architectures (ResNet, Inception, etc.), and a pre-trained model (BEATs) on this task, and…

Anonymization, Not Elimination: Utility-Preserved Speech Anonymization
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-anonymization-not-elimination-utility-preserved/
Tue, 21 Apr 2026 00:00:00 +0000
This paper proposes a novel two-stage framework that targets the core tension between "privacy leakage" and "loss of data utility" in speech data privacy protection. First, to address insufficient identity diversity and poor controllability in voice anonymization (protecting "who is speaking"), it proposes a flow-matching-based speaker embedding anonymizer (F3-VA) that generates new identities that are diverse and well separated from the original speaker. Second, to address content anonymization (protecting…

Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-21-benign-fine-tuning-breaks-safety-alignment-in/
Tue, 21 Apr 2026 00:00:00 +0000
This paper presents the first systematic study of **the destructive effect that fine-tuning on benign audio data has on the safety alignment of large audio models**. The core question is whether fine-tuning a model on entirely harmless audio data, with the goal of improving performance, can unexpectedly weaken its ability to refuse harmful instructions. The authors propose **a filtering framework based on embedding-space proximity** that computes the distance between benign and harmful audio in the space of the model's internal encoder or an external reference encoder…

Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-hijacking-large-audio-language-models-via-context/
Sun, 19 Apr 2026 00:00:00 +0000
This paper reveals a new security threat to large audio-language models (LALMs): **context-agnostic and imperceptible audio prompt injection attacks**. By tampering only with the input audio data (such as a meeting recording or a music clip), an attacker can, without the user's knowledge, hijack the model's…

StreamMark: A Deep Learning-Based Semi-Fragile Audio Watermarking for Proactive Deepfake Detection
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-streammark-a-deep-learning-based-semi-fragile/
Sun, 19 Apr 2026 00:00:00 +0000
To counter the audio deepfake threat posed by generative AI, this paper proposes a proactive defense framework called StreamMark. The framework is a deep-learning-based semi-fragile audio watermarking system whose core innovation is to redefine the goal of watermarking: rather than pursuing absolute robustness to all transformations…

VoxSafeBench: Not Just What Is Said, but Who, How, and Where
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-voxsafebench-not-just-what-is-said-but-who-how/
Sun, 19 Apr 2026 00:00:00 +0000
This paper addresses a key problem: when speech language models (SLMs) enter shared multi-user environments, safety alignment strategies based on textual content alone are insufficient, because audio context such as speaker identity, paralinguistic features, and the acoustic scene can fundamentally change the nature of a request. To this end, the authors propose…
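
The two "Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs" summaries above describe a filtering framework based on embedding-space proximity between benign and harmful audio. The summaries are truncated and give no formulas, so the following is only a minimal sketch of that general idea, not the authors' implementation. Cosine similarity as the proximity measure, taking the maximum over the harmful set, the top-k selection rule, and all function names (`cosine_similarity`, `proximity_score`, `select_closest_benign`) are assumptions made for illustration; the embeddings are assumed to come from some encoder that is not shown here.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def proximity_score(benign: Vector, harmful_set: Sequence[Vector]) -> float:
    """Assumed scoring rule: similarity to the nearest harmful embedding."""
    return max(cosine_similarity(benign, h) for h in harmful_set)


def select_closest_benign(
    benign_set: Sequence[Vector], harmful_set: Sequence[Vector], k: int
) -> List[int]:
    """Return indices of the k benign embeddings closest to the harmful set.

    Ties are broken by the lower index so the result is deterministic.
    """
    scored = [(proximity_score(b, harmful_set), i) for i, b in enumerate(benign_set)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in scored[:k]]


# Toy 2-D embeddings (hypothetical values, for illustration only).
benign_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
harmful_embeddings = [[1.0, 0.05]]

closest = select_closest_benign(benign_embeddings, harmful_embeddings, k=2)
print(closest)
```

In this toy example the first and third benign vectors point in nearly the same direction as the harmful one, so they are the two selected. A real study would replace the toy vectors with encoder outputs and could use a different distance or selection rule.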