Vocal separation using Karaoke U-net

Karaoke tracks currently have to be produced by an audio engineer, and the process of generating a high-quality karaoke track is not accessible to the general public because it relies on specialized software such as Audacity. In this paper we propose a modified U-net, called Karaoke U-net, that separates the vocals from a song containing both vocal and instrumental components simply and quickly, and yields a high-quality karaoke track without any special audio processing software. The proposed system takes a song as input, generates its spectrograms, and passes them through the Karaoke U-net. The network generates spectrograms of the vocals and of the instrumental, which are then used to create the corresponding audio files. We have created the first U-net model designed specifically for generating a karaoke track. The model achieves an overall accuracy of 88.6%, and its performance on the MUSDB18 dataset is better than that of other similar systems. Karaoke U-net allows the user to create an instrumental for any song with vocal components. Students learning audio mixing and mastering can also use it to analyze the vocals separately from the track and understand what processing has been applied to them. A further application is removing background noise during live video conferencing, which in turn helps users communicate more effectively.
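
The pipeline described in the abstract (song to spectrogram, spectrogram through the network, predicted spectrograms back to audio) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the FFT size and hop length, and the reuse of the mixture phase for reconstruction are all assumptions made for illustration, and `placeholder_model` is a hypothetical stand-in for the trained Karaoke U-net that performs no real separation.

```python
import numpy as np

def stft(signal, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=1024, hop=256):
    """Inverse STFT by windowed overlap-add, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start : start + n_fft] += frames[i] * window
        norm[start : start + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def placeholder_model(magnitude):
    """Hypothetical stand-in for the trained network; NOT a real separator.

    It assigns an arbitrary band of frequency bins to "vocals" and the rest
    to "instrumental", only so that the pipeline can run end to end.
    """
    n_bins = magnitude.shape[1]
    mask = np.zeros(n_bins)
    mask[n_bins // 16 : n_bins // 4] = 1.0
    vocal_mag = magnitude * mask
    return vocal_mag, magnitude - vocal_mag

def separate(mix, model, n_fft=1024, hop=256):
    """Mixture waveform -> (vocals, instrumental) waveforms."""
    spec = stft(mix, n_fft, hop)
    magnitude, phase = np.abs(spec), np.angle(spec)
    # The model predicts one magnitude spectrogram per source.
    vocal_mag, instr_mag = model(magnitude)
    # Assumption: the mixture phase is reused for both sources.
    vocals = istft(vocal_mag * np.exp(1j * phase), n_fft, hop)
    instrumental = istft(instr_mag * np.exp(1j * phase), n_fft, hop)
    return vocals, instrumental
```

Calling `separate(mix, placeholder_model)` on a mono waveform stored as a one-dimensional NumPy array returns two waveforms of equal length. In a real system the placeholder would be replaced by the trained network's forward pass, and the outputs would be written to audio files.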

Keywords: Karaoke, Vocal separation, U-net, Audio processing

JEL Classification: Y9

Suggested Citation:

Mehendale, Ninad and Dube, Vipul and Patel, Rutwik and Sule, Vrushali, Vocal separation using Karaoke U-net (December 12, 2021). Available at SSRN: https://ssrn.com/abstract=3983514 or http://dx.doi.org/10.2139/ssrn.3983514

Ninad Mehendale (Contact Author)

University of Mumbai - K. J. Somaiya College of Engineering (K.J.S.C.E.)

Mumbai, Maharashtra 400007
India

Ninad's research Lab

M.G. Road, Naupada Thane
Thane, 400602
India

Vipul Dube

K J Somaiya College of Engineering, Somaiya Vidyavihar University, Mumbai, India

India

Rutwik Patel

University of Mumbai - K. J. Somaiya College of Engineering (K.J.S.C.E.)

Mumbai, Maharashtra 400007
India

Vrushali Sule

K. J. Somaiya College of Engineering

India

Related eJournals

Applied Computing eJournal