Blind Restoration of Real-World Audio by 1D Operational GANs

21 Pages Posted: 1 Sep 2023

Turker Ince

University of Massachusetts Amherst

Serkan Kiranyaz

Qatar University

Ozer Can Devecioglu

Tampere University

Muhammad Salman Khan

Qatar University

Muhammad Enamul Hoque Chowdhury

University of Nottingham - Sir Peter Mansfield Imaging Centre

Moncef Gabbouj

Tampere University

Abstract

Objective: Although many audio restoration studies have been proposed in the literature, most focus on an isolated restoration problem such as denoising or dereverberation and ignore other artifacts. Moreover, it is common practice to assume a noisy or reverberant environment with a limited number of fixed signal-to-distortion ratio (SDR) levels. However, real-world audio is often corrupted by a blend of artifacts, such as reverberation, sensor noise, and background audio mixture, with varying types, severities, and durations. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of the restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Op-GANs are used with a generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets, corrupted with a random blend of artifacts, each with a random severity, to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneering study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio while achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source code and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository.
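
The abstract reports its results as average SDR improvements in dB. As a rough illustration of what such a figure measures, the following minimal sketch computes a plain energy-ratio SDR and the improvement between a corrupted and a restored signal. The exact SDR definition used in the paper is not given in the abstract, so the formula, the function names `sdr_db` and `sdr_improvement_db`, and the synthetic signals are all assumptions made for illustration only.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Energy-ratio SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    This is one common definition; the paper may use a different variant.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    distortion_power = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero for a perfect estimate.
    return 10.0 * np.log10((signal_power + eps) / (distortion_power + eps))


def sdr_improvement_db(reference, corrupted, restored):
    """SDR gain of the restored signal over the corrupted input, in dB."""
    return sdr_db(reference, restored) - sdr_db(reference, corrupted)


# Synthetic stand-ins (hypothetical): a 440 Hz tone as the clean signal,
# additive Gaussian noise as the only artifact, and a "restored" signal
# that simply carries less of the same noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)
noise = rng.standard_normal(clean.shape)
corrupted = clean + 0.5 * noise
restored = clean + 0.1 * noise  # placeholder for a restoration model's output

improvement = sdr_improvement_db(clean, corrupted, restored)
print(f"SDR improvement: {improvement:.2f} dB")
```

In this toy setup the residual noise amplitude drops by a factor of 5, so the distortion energy drops by a factor of 25 and the improvement equals 10·log10(25) dB regardless of the random seed. Real corrupted audio with blended artifacts would not reduce to such a closed form.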

Keywords: Real-World Audio, Blind Audio Restoration, Self-Organized Operational Neural Networks, Operational GANs

Suggested Citation

Ince, Turker and Kiranyaz, Serkan and Devecioglu, Ozer Can and Khan, Muhammad Salman and Chowdhury, Muhammad Enamul Hoque and Gabbouj, Moncef, Blind Restoration of Real-World Audio by 1D Operational GANs. Available at SSRN: https://ssrn.com/abstract=4558725 or http://dx.doi.org/10.2139/ssrn.4558725

Turker Ince (Contact Author)

University of Massachusetts Amherst

Department of Operations and Information Managemen
Amherst, MA 01003
United States

Serkan Kiranyaz

Qatar University

College of Law
Qatar University
Doha, 2713
Qatar

Ozer Can Devecioglu

Tampere University

Tampere, FIN-33101
Finland

Muhammad Salman Khan

Qatar University

College of Law
Qatar University
Doha, 2713
Qatar

Muhammad Enamul Hoque Chowdhury

University of Nottingham - Sir Peter Mansfield Imaging Centre

Moncef Gabbouj

Tampere University

Tampere, FIN-33101
Finland
