A Review of Voicing Decision in Whispered Speech: From Rules to Machine Learning

Pinto da Silva, João; Duarte Nunes, Gonçalo; Ferreira, Aníbal

doi:10.2139/ssrn.5123047

83 pages. Posted: 9 Feb 2025

João Pinto da Silva

INESC TEC - Institute for Systems and Computer Engineering, Technology and Science

Gonçalo Duarte Nunes

INESC TEC - Institute for Systems and Computer Engineering, Technology and Science

Speech is a fundamental medium of human communication and spans several modes, including normal and whispered speech, each with distinct acoustic properties. Normal speech relies on vocal fold vibration, which produces a rich harmonic structure that supports intelligibility and vocal projection. Whispered speech lacks this vibration, so its signal is noisier and less clear. Individuals with impaired phonation, resulting from conditions such as vocal fold paralysis or laryngeal trauma, often resort to involuntary whispered speech, which creates significant communication challenges. Whispered-to-normal speech conversion systems have been developed to address these challenges by reconstructing the missing voicing components of whispered speech, thereby improving speech quality. Central to the effectiveness of these systems is the voicing decision process, which classifies speech segments as candidates or non-candidates for voicing so that harmonic structures are restored only where appropriate. This review provides a comprehensive examination of the voicing decision process within whispered-to-normal speech conversion systems. Recent trends show a shift from rule-based methods to machine learning approaches, reflecting the latter's increasing effectiveness and potential. By analyzing current methodologies and identifying research gaps, the review highlights the need for further advances in the voicing decision process to improve communication, and ultimately quality of life, for individuals with phonation disorders.

Keywords: Voicing decision, candidate for voicing, whispered speech

Suggested Citation

Pinto da Silva, João and Duarte Nunes, Gonçalo and Ferreira, Aníbal, A Review of Voicing Decision in Whispered Speech: From Rules to Machine Learning. Available at SSRN: https://ssrn.com/abstract=5123047 or http://dx.doi.org/10.2139/ssrn.5123047