Eess-As

Scaling Speech Tokenizers with Diffusion Autoencoders

Artificial Intelligence 6 JAN, 2026

By Yuancheng Wang

STACodec: Semantic Token Assignment for Balancing Acoustic Fidelity and Semantic Information in Audio Codecs

Eess As 5 JAN, 2026

By Kaiyuan Zhang

B-GRPO: Unsupervised Speech Emotion Recognition based on Batched-Group Relative Policy Optimization

Eess As 6 JAN, 2026

By Yingying Gao

Reciprocal Latent Fields for Precomputed Sound Propagation

Sound 6 JAN, 2026

By Hugo Seuté

Automatic Detection and Analysis of Singing Mistakes for Music Pedagogy

Machine Learning 6 JAN, 2026

By Sumit Kumar

Misophonia Trigger Sound Detection on Synthetic Soundscapes Using a Hybrid Model with a Frozen Pre-Trained CNN and a Time-Series Module

Sound 6 JAN, 2026

By Kurumi Sashida

Complete reconstruction of the tongue contour through acoustic to articulatory inversion using real-time MRI data

Eess As 4 JAN, 2024

By Sofiane Azzouz