Speech Enhancement on Speech/Audio Paper Digest (语音/音频论文速递)

语音增强 on 语音/音频论文速递 https://nanless.github.io/audio-paper-digest-blog/tags/%E8%AF%AD%E9%9F%B3%E5%A2%9E%E5%BC%BA/ Recent content in 语音增强 on 语音/音频论文速递 Hugo zh-cn Wed, 29 Apr 2026 00:00:00 +0000 A Generalization Strategy for Speech Quality Prediction: From Domain-Specific to Unified Datasets https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-generalization-strategy-for-speech-quality/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-generalization-strategy-for-speech-quality/ 语音质量评估 | 6.5/10 A Lightweight Fourier-Based Network for Binaural Speech Enhancement with Spatial Cue Preservation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-lightweight-fourier-based-network-for-binaural/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-lightweight-fourier-based-network-for-binaural/ 语音增强 | 8.5/10 A Noniterative Phase Retrieval Considering the Zeros of STFT Magnitude https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-noniterative-phase-retrieval-considering-the/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-noniterative-phase-retrieval-considering-the/ 信号处理 | 7.5/10 A Stabilized Hybrid Active Noise Control Algorithm of GFANC and FxNLMS with Online Clustering https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-stabilized-hybrid-active-noise-control/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-stabilized-hybrid-active-noise-control/ 语音增强 | 7.5/10 A State-Dependent Markov Diffusion Process for Generative Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-state-dependent-markov-diffusion-process-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-a-state-dependent-markov-diffusion-process-for/ 语音增强 | 6.5/10 
Acoustic Teleportation Via Disentangled Neural Audio Codec Representations https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-teleportation-via-disentangled-neural/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-acoustic-teleportation-via-disentangled-neural/ 语音增强 | 7.0/10 Adaptive Deterministic Flow Matching for Target Speaker Extraction https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-deterministic-flow-matching-for-target/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adaptive-deterministic-flow-matching-for-target/ 目标说话人提取 | 8.0/10 Adversarial Defense via Generative Speech Enhancement Module https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-defense-via-generative-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-adversarial-defense-via-generative-speech/ 语音增强对抗防御 | 7.5/10 Aligning Generative Speech Enhancement with Perceptual Feedback https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aligning-generative-speech-enhancement-with/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aligning-generative-speech-enhancement-with/ 语音增强 | 7.5/10 AmbiDrop: Array-Agnostic Speech Enhancement Using Ambisonics Encoding and Dropout-Based Learning https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ambidrop-array-agnostic-speech-enhancement-using/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ambidrop-array-agnostic-speech-enhancement-using/ 语音增强 | 7.0/10 An Efficient Neural Network for Modeling Human Auditory Neurograms for Speech https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-an-efficient-neural-network-for-modeling-human/ Wed, 29 Apr 2026 00:00:00 +0000 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-an-efficient-neural-network-for-modeling-human/ 语音增强 | 7.0/10 Aneural Forward Filtering for Speaker-Image Separation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aneural-forward-filtering-for-speaker-image/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-aneural-forward-filtering-for-speaker-image/ 语音分离 | 7.5/10 Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks? https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-are-modern-speech-enhancement-systems-vulnerable/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-are-modern-speech-enhancement-systems-vulnerable/ 语音增强 | 7.5/10 Auditory-Inspired Transformer for Binaural Speech Enhancement and Spatial Cue Preservation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-auditory-inspired-transformer-for-binaural-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-auditory-inspired-transformer-for-binaural-speech/ 语音增强 | 7.0/10 Beamforming Using Virtual Microphones for Hearing Aid Applications https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beamforming-using-virtual-microphones-for-hearing/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-beamforming-using-virtual-microphones-for-hearing/ 语音增强 | 7.5/10 Bone-Conduction Guided Multimodal Speech Enhancement with Conditional Diffusion Models https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bone-conduction-guided-multimodal-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bone-conduction-guided-multimodal-speech/ 语音增强 | 7.5/10 Brainprint-Modulated Target Speaker Extraction 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-brainprint-modulated-target-speaker-extraction/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-brainprint-modulated-target-speaker-extraction/ 语音分离 | 8.0/10 BSMP-SENet:Band-Split Magnitude-Phase Network for Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bsmp-senetband-split-magnitude-phase-network-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-bsmp-senetband-split-magnitude-phase-network-for/ 语音增强 | 7.0/10 Confidence-Based Filtering for Speech Dataset Curation with Generative Speech Enhancement Using Discrete Tokens https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-confidence-based-filtering-for-speech-dataset/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-confidence-based-filtering-for-speech-dataset/ 语音增强 | 6.5/10 DAT-CFTNet: Speech Enhancement for Cochlear Implant Recipients using Attention-based Dual-Path Recurrent Neural Network https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dat-cftnet-speech-enhancement-for-cochlear/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dat-cftnet-speech-enhancement-for-cochlear/ 语音增强 | 7.0/10 DECAF: Dynamic Envelope Context-Aware Fusion for Speech-Envelope Reconstruction from EEG https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-decaf-dynamic-envelope-context-aware-fusion-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-decaf-dynamic-envelope-context-aware-fusion-for/ 语音增强 | 7.0/10 Deep Learning-Based Joint Optimization of Adaptive Feedback Cancellation and Residual Feedback Suppression for Hearing Aids https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-deep-learning-based-joint-optimization-of/ 
Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-deep-learning-based-joint-optimization-of/ 语音增强 | 8.0/10 DisContSE: Single-Step Diffusion Speech Enhancement based on Joint Discrete and Continuous Embeddings https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-discontse-single-step-diffusion-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-discontse-single-step-diffusion-speech/ 语音增强 | 8.5/10 DISSR: Disentangling Speech Representation for Degradation-Prior Guided Cross-Domain Speech Restoration https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dissr-disentangling-speech-representation-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dissr-disentangling-speech-representation-for/ 语音增强 | 7.5/10 DiTSE: High-Fidelity Generative Speech Enhancement via Latent Diffusion Transformers https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ditse-high-fidelity-generative-speech-enhancement/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ditse-high-fidelity-generative-speech-enhancement/ 语音增强 | 8.5/10 Do We Need EMA for Diffusion-Based Speech Enhancement? 
Toward A Magnitude-Preserving Network Architecture https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-we-need-ema-for-diffusion-based-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-do-we-need-ema-for-diffusion-based-speech/ 语音增强 | 7.5/10 Dynamically Slimmable Speech Enhancement Network with Metric-Guided Training https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamically-slimmable-speech-enhancement-network/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-dynamically-slimmable-speech-enhancement-network/ 语音增强 | 7.5/10 E2E-AEC: Implementing An End-To-End Neural Network Learning Approach for Acoustic Echo Cancellation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-e2e-aec-implementing-an-end-to-end-neural-network/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-e2e-aec-implementing-an-end-to-end-neural-network/ 语音增强 | 7.5/10 Enhancing Noise Robustness for Neural Speech Codecs Through Resource-Efficient Progressive Quantization Perturbation Simulation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-noise-robustness-for-neural-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-noise-robustness-for-neural-speech/ 语音增强 | 7.5/10 Enhancing Speech Intelligibility Prediction for Hearing Aids with Complementary Speech Foundation Model Representations https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-speech-intelligibility-prediction-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-enhancing-speech-intelligibility-prediction-for/ 语音增强 | 7.5/10 Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enhancement 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-resolution-wise-shared-attention-in/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-exploring-resolution-wise-shared-attention-in/ 语音增强 | 8.0/10 Fast-ULCNet: A Fast and Ultra Low Complexity Network for Single-Channel Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fast-ulcnet-a-fast-and-ultra-low-complexity/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fast-ulcnet-a-fast-and-ultra-low-complexity/ 语音增强 | 7.5/10 FastEnhancer: Speed-Optimized Streaming Neural Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fastenhancer-speed-optimized-streaming-neural/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-fastenhancer-speed-optimized-streaming-neural/ 语音增强 | 8.5/10 Flexio: Flexible Single- and Multi-Channel Speech Separation and Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flexio-flexible-single-and-multi-channel-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flexio-flexible-single-and-multi-channel-speech/ 语音分离 | 8.0/10 FlowSE-GRPO: Training Flow Matching Speech Enhancement via Online Reinforcement Learning https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flowse-grpo-training-flow-matching-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-flowse-grpo-training-flow-matching-speech/ 语音增强 | 7.5/10 Forward Convolutive Prediction for Frame Online Monaural Speech Dereverberation based on Kronecker Product Decomposition https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-forward-convolutive-prediction-for-frame-online/ Wed, 29 Apr 2026 00:00:00 +0000 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-forward-convolutive-prediction-for-frame-online/ 语音增强 | 7.5/10 From Diet to Free Lunch: Estimating Auxiliary Signal Properties Using Dynamic Pruning Masks in Speech Enhancement Networks https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-diet-to-free-lunch-estimating-auxiliary/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-from-diet-to-free-lunch-estimating-auxiliary/ 语音增强 | 7.5/10 Frontend Token Enhancement for Token-Based Speech Recognition https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-frontend-token-enhancement-for-token-based-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-frontend-token-enhancement-for-token-based-speech/ 语音识别 | 8.0/10 Gdiffuse: Diffusion-Based Speech Enhancement with Noise Model Guidance https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gdiffuse-diffusion-based-speech-enhancement-with/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-gdiffuse-diffusion-based-speech-enhancement-with/ 语音增强 | 7.0/10 Generalizability of Predictive and Generative Speech Enhancement Models to Pathological Speakers https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-generalizability-of-predictive-and-generative/ 语音增强 | 7.0/10 H-nnPBFDAF: Hierarchical Neural Network Partitioned Block Frequency Domain Adaptive Filter with Novel Block Activation Probability https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-h-nnpbfdaf-hierarchical-neural-network/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-h-nnpbfdaf-hierarchical-neural-network/ 语音增强 | 7.5/10 Hair Noise 
Analysis and Mitigation for Smart Glasses Audio Captures https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hair-noise-analysis-and-mitigation-for-smart/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hair-noise-analysis-and-mitigation-for-smart/ 语音增强 | 7.5/10 HCGAN: Harmonic-Coupled Generative Adversarial Network for Speech Super-Resolution in Low-Bandwidth Scenarios https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hcgan-harmonic-coupled-generative-adversarial/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hcgan-harmonic-coupled-generative-adversarial/ 语音增强 | 8.0/10 High-Fidelity Speech Enhancement Via Discrete Audio Tokens https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-high-fidelity-speech-enhancement-via-discrete/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-high-fidelity-speech-enhancement-via-discrete/ 语音增强 | 7.5/10 HyFlowSE: Hybrid End-To-End Flow-Matching Speech Enhancement via Generative-Discriminative Learning https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hyflowse-hybrid-end-to-end-flow-matching-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-hyflowse-hybrid-end-to-end-flow-matching-speech/ 语音增强 | 8.0/10 I-DCCRN-VAE: An Improved Deep Representation Learning Framework for Complex VAE-Based Single-Channel Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-i-dccrn-vae-an-improved-deep-representation/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-i-dccrn-vae-an-improved-deep-representation/ 语音增强 | 7.5/10 ICASSP 2026 - 语音增强论文列表 https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-062/ Wed, 29 Apr 2026 00:00:00 +0000 
https://nanless.github.io/audio-paper-digest-blog/posts/icassp2026-task-062/ 共 75 篇 ICASSP 2026 语音增强方向论文 Improving Automatic Speech Recognition by Mitigating Distortions Introduced by Speech Enhancement Under Drone Noise https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-automatic-speech-recognition-by/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-improving-automatic-speech-recognition-by/ 语音识别 | 6.5/10 Influence of Clean Speech Characteristics on Speech Enhancement Performance https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-of-clean-speech-characteristics-on/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-influence-of-clean-speech-characteristics-on/ 语音增强 | 8.0/10 Is Phase Really Needed for Weakly-Supervised Dereverberation? https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-is-phase-really-needed-for-weakly-supervised/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-is-phase-really-needed-for-weakly-supervised/ 语音增强 | 6.0/10 Joint Deep Secondary Path Estimation and Adaptive Control for Active Noise Cancellation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-deep-secondary-path-estimation-and-adaptive/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-deep-secondary-path-estimation-and-adaptive/ 语音增强 | 7.5/10 Joint Multichannel Acoustic Feedback Cancellation and Speaker Extraction via Kalman Filter and Deep Non-Linear Spatial Filter https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-multichannel-acoustic-feedback-cancellation/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-joint-multichannel-acoustic-feedback-cancellation/ 语音增强 | 7.0/10 LAFUFU: Latent Acoustic Features For Ultra-Fast 
Utterance Restoration https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lafufu-latent-acoustic-features-for-ultra-fast/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lafufu-latent-acoustic-features-for-ultra-fast/ 语音增强 | 8.0/10 Leveraging Multiple Speech Enhancers for Non-Intrusive Intelligibility Prediction for Hearing-Impaired Listeners https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-multiple-speech-enhancers-for-non/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-leveraging-multiple-speech-enhancers-for-non/ 模型评估 | 7.5/10 Lightweight and Perceptually-Guided Voice Conversion for Electro-Laryngeal Speech https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lightweight-and-perceptually-guided-voice/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lightweight-and-perceptually-guided-voice/ 语音转换 | 7.5/10 Lightweight Phoneme-Conditioned Bandwidth Extension for Body-Conducted Speech https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lightweight-phoneme-conditioned-bandwidth/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lightweight-phoneme-conditioned-bandwidth/ 语音增强 | 7.5/10 LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-And-Play Dereverberation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lipsam-lipschitz-continuous-amplitude-modifier/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-lipsam-lipschitz-continuous-amplitude-modifier/ 语音增强 | 7.5/10 Low-Bandwidth High-Fidelity Speech Transmission with Generative Latent Joint Source-Channel Coding https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-bandwidth-high-fidelity-speech-transmission/ Wed, 
29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-bandwidth-high-fidelity-speech-transmission/ 语音增强 | 7.5/10 Low-Frequency Harmonic Control for Speech Intelligibility in Open-Ear Headphones https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-frequency-harmonic-control-for-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-frequency-harmonic-control-for-speech/ 语音增强 | 6.5/10 Low-Latency Audio Front-End Region-of-Interest Beamforming for Smart Glasses https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-latency-audio-front-end-region-of-interest/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-low-latency-audio-front-end-region-of-interest/ 语音增强 | 7.0/10 MAGE: A Coarse-to-Fine Speech Enhancer with Masked Generative Model https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mage-a-coarse-to-fine-speech-enhancer-with-masked/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mage-a-coarse-to-fine-speech-enhancer-with-masked/ 语音增强 | 8.0/10 Mambaformer: State-Space Augmented Self-Attention with Downup Sampling for Monaural Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mambaformer-state-space-augmented-self-attention/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mambaformer-state-space-augmented-self-attention/ 语音增强 | 7.0/10 MeanFlowSE: One-Step Generative Speech Enhancement via Conditional Mean Flow https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanflowse-one-step-generative-speech-enhancement/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanflowse-one-step-generative-speech-enhancement/ 语音增强 | 7.5/10 MeanSE: Efficient Generative Speech Enhancement 
with Mean Flows https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanse-efficient-generative-speech-enhancement/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-meanse-efficient-generative-speech-enhancement/ 语音增强 | 6.5/10 Mixture To Beamformed Mixture: Leveraging Beamformed Mixture As Weak-Supervision for Speech Enhancement and Noise-Robust ASR https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mixture-to-beamformed-mixture-leveraging/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-mixture-to-beamformed-mixture-leveraging/ 语音增强 | 8.0/10 Modeling Strategies For Speech Enhancement in The Latent Space of a Neural Audio Codec https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-strategies-for-speech-enhancement-in-the/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-modeling-strategies-for-speech-enhancement-in-the/ 语音增强 | 8.0/10 MSANET: Multi-Scale Semantic Aggregation Network for Brain-Assisted Speech Enhancement in Multi-Speaker Conditions https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-msanet-multi-scale-semantic-aggregation-network/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-msanet-multi-scale-semantic-aggregation-network/ 语音增强 | 7.5/10 Multi-Channel Speech Enhancement for Cocktail Party Speech Emotion Recognition https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-channel-speech-enhancement-for-cocktail/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-channel-speech-enhancement-for-cocktail/ 语音情感识别 | 7.5/10 Multi-Task Learning For Speech Quality Assessment Using ASR-Derived Entropy Features 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-task-learning-for-speech-quality-assessment/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-multi-task-learning-for-speech-quality-assessment/ 语音质量评估 | 7.5/10 On The Design of Efficient Neural Methods for Geometry-Agnostic Multichannel Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-on-the-design-of-efficient-neural-methods-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-on-the-design-of-efficient-neural-methods-for/ 语音增强 | 6.5/10 ParaGSE: Parallel Generative Speech Enhancement with Group-Vector-Quantization-Based Neural Speech Codec https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-paragse-parallel-generative-speech-enhancement/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-paragse-parallel-generative-speech-enhancement/ 语音增强 | 7.5/10 PG-SE: Predictive Acceleration and Correction for Generative Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-pg-se-predictive-acceleration-and-correction-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-pg-se-predictive-acceleration-and-correction-for/ 语音增强 | 7.5/10 Position-Invariant Fine-Tuning Of Speech Enhancement Models With Self-Supervised Speech Representations https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-position-invariant-fine-tuning-of-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-position-invariant-fine-tuning-of-speech/ 语音增强 | 6.5/10 Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-purification-before-fusion-toward-mask-free/ Wed, 29 Apr 2026 
00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-purification-before-fusion-toward-mask-free/ 语音识别 | 7.5/10 Quality Assessment of Noisy and Enhanced Speech with Limited Data: UWB-NTIS System for Voicemos 2024 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-quality-assessment-of-noisy-and-enhanced-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-quality-assessment-of-noisy-and-enhanced-speech/ 语音质量评估 | 7.0/10 Ranking The Impact of Contextual Specialization in Neural Speech Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ranking-the-impact-of-contextual-specialization/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ranking-the-impact-of-contextual-specialization/ 语音增强 | 7.5/10 Reference Microphone Selection for Guided Source Separation Based on The Normalized L-P Norm https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reference-microphone-selection-for-guided-source/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-reference-microphone-selection-for-guided-source/ 语音增强 | 7.0/10 Residual Tokens Enhance Masked Autoencoders for Speech Modeling https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-residual-tokens-enhance-masked-autoencoders-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-residual-tokens-enhance-masked-autoencoders-for/ 语音合成 | 7.0/10 Sampling-Rate-Agnostic Speech Super-Resolution Based on Gaussian Process Dynamical Systems with Deep Kernel Learning https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sampling-rate-agnostic-speech-super-resolution/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sampling-rate-agnostic-speech-super-resolution/ 语音增强 | 6.5/10 Shortcut Flow 
Matching for Speech Enhancement: Step-Invariant Flows via Single Stage Training https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-shortcut-flow-matching-for-speech-enhancement/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-shortcut-flow-matching-for-speech-enhancement/ 语音增强 | 7.0/10 Sidon: Fast and Robust Open-Source Multilingual Speech Restoration for Large-Scale Dataset Cleansing https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sidon-fast-and-robust-open-source-multilingual/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-sidon-fast-and-robust-open-source-multilingual/ 语音增强 | 8.5/10 SLM-SS: Speech Language Model for Generative Speech Separation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-slm-ss-speech-language-model-for-generative/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-slm-ss-speech-language-model-for-generative/ 语音分离 | 7.5/10 Spatial Covariance Matrix Reconstruction for Speech Enhancement in Reverberant Multi-Source Environments https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatial-covariance-matrix-reconstruction-for/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatial-covariance-matrix-reconstruction-for/ 语音增强 | 7.5/10 SpatialNet-Echo: Real-Time Acoustic Echo Cancellation via Integrated Narrow-Band and Cross-Band Processing https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatialnet-echo-real-time-acoustic-echo/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spatialnet-echo-real-time-acoustic-echo/ 语音增强 | 7.5/10 Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speaking-clearly-a-simplified-whisper-based-codec/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-speaking-clearly-a-simplified-whisper-based-codec/ 语音编码 | 7.5/10 Spike-Driven Low-Power Speech Bandwidth Extension https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spike-driven-low-power-speech-bandwidth-extension/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-spike-driven-low-power-speech-bandwidth-extension/ 语音增强 | 8.0/10 Stereophonic Acoustic Echo Cancellation Using an Improved Affine Projection Algorithm with Adaptive Multiple Sub-Filters https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stereophonic-acoustic-echo-cancellation-using-an/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-stereophonic-acoustic-echo-cancellation-using-an/ 语音增强 | 6.0/10 The 3rd Clarity Prediction Challenge: A Machine Learning Challenge for Hearing aid Speech Intelligibility Prediction https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-3rd-clarity-prediction-challenge-a-machine/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-the-3rd-clarity-prediction-challenge-a-machine/ 语音增强 | 7.5/10 Towards Lightweight Adaptation of Speech Enhancement Models in Real-World Environments https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-lightweight-adaptation-of-speech/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-lightweight-adaptation-of-speech/ 语音增强 | 8.5/10 Towards Real-Time Generative Speech Restoration with Flow-Matching https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-real-time-generative-speech-restoration/ Wed, 29 Apr 2026 00:00:00 +0000 
https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-towards-real-time-generative-speech-restoration/ Speech enhancement | 6.0/10 Training-Free Inference-Time Scaling for Audio Source Separation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-training-free-inference-time-scaling-for-audio/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-training-free-inference-time-scaling-for-audio/ Speech enhancement | 7.5/10 Two-Stage Language Model Framework for Acoustic Echo Cancellation https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-two-stage-language-model-framework-for-acoustic/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-two-stage-language-model-framework-for-acoustic/ Speech enhancement | 7.5/10 UJCodec: An End-to-end Unet-Style Codec for Joint Speech Compression and Enhancement https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ujcodec-an-end-to-end-unet-style-codec-for-joint/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-ujcodec-an-end-to-end-unet-style-codec-for-joint/ Speech enhancement | 7.5/10 UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-unet-based-fusion-and-exponential-moving-average/ Speaker verification | 7.5/10 Universr: Unified and Versatile Audio Super-Resolution Via Vocoder-Free Flow Matching https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-universr-unified-and-versatile-audio-super/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-universr-unified-and-versatile-audio-super/ Audio super-resolution | 8.0/10 VChangeCodec: An Ultra Low-Complexity Neural Speech Codec with Built-In Voice Changer 
for Customized Real-Time Communication https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-vchangecodec-an-ultra-low-complexity-neural/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-vchangecodec-an-ultra-low-complexity-neural/ Voice conversion, speech enhancement | 8.0/10 What the student learns in knowledge distillation: A subspace view and evidence on Convolutional Recurrent Network https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-what-the-student-learns-in-knowledge-distillation/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-what-the-student-learns-in-knowledge-distillation/ Speech enhancement | 6.5/10 Whisper-FEST: Single-Channel Far-Field Enhanced Speech-to-text without Parallel Data https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-whisper-fest-single-channel-far-field-enhanced/ Wed, 29 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-29-whisper-fest-single-channel-far-field-enhanced/ Speech recognition | 7.5/10 Speech Enhancement Based on Drifting Models https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-speech-enhancement-based-on-drifting-models/ Tue, 28 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-28-speech-enhancement-based-on-drifting-models/ Speech enhancement | 7.5/10 Dilated CNNs for Periodic Signal Processing: A Low-Complexity Approach https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-dilated-cnns-for-periodic-signal-processing-a-low/ Fri, 24 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-dilated-cnns-for-periodic-signal-processing-a-low/ Speech enhancement | 6.5/10 Time vs. 
Layer: Locating Predictive Cues for Dysarthric Speech Descriptors in wav2vec 2.0 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-time-vs-layer-locating-predictive-cues-for/ Fri, 24 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-24-time-vs-layer-locating-predictive-cues-for/ Speech biomarkers | 7.0/10 TokenSE: a Mamba-based discrete token speech enhancement framework for cochlear implants https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-tokense-a-mamba-based-discrete-token-speech/ Sun, 19 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-tokense-a-mamba-based-discrete-token-speech/ This paper addresses the difficulty that cochlear implant users have understanding speech in noisy and reverberant environments, and proposes a speech enhancement framework named TokenSE. The framework's core innovation is to move the speech enhancement task from the conventional time-frequency or waveform domain into the discrete token space of a neural audio codec UniPASE: A Generative Model for Universal Speech Enhancement with High Fidelity and Low Hallucinations https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-unipase-a-generative-model-for-universal-speech/ Sun, 19 Apr 2026 00:00:00 +0000 https://nanless.github.io/audio-paper-digest-blog/posts/2026-04-19-unipase-a-generative-model-for-universal-speech/ This paper aims to resolve the core tension that generative models face in universal speech enhancement (USE): "high perceptual quality" and "low content hallucination" are hard to achieve together. The authors propose the UniPASE framework, which extends their earlier low-hallucination PASE model to handle degradations including noise, reverberation, packet loss, wind