LMCodec2: Ultra-Low Bit Rate Codec with Transformer for Satellite Voice Transmission
22 Pages Posted: 3 Aug 2024
Abstract
Over the past few years, the bandwidth bottleneck in IoT has driven new methods for compressing transmitted speech. Satellite voice communication requires high-quality codecs with a code rate below 1 kbps, because the rate of the Beidou-3 channel is often less than 1 kbps. MELPE vocoders typically compress narrowband speech to 2.4 kbps, 1.2 kbps, and 0.6 kbps, but their MUSHRA scores are not good enough at any of these code rates. Neural network vocoders, which aim to provide high-fidelity audio compression, have attracted wide attention in the AI community. In this paper, we propose LMCodec2, a causal speech codec that works at all bit rates and provides high-quality audio at low bit rates for satellite voice transmission. LMCodec2 trains a Transformer language model to predict the tokens. This technique reduces the bit rate by 25% while keeping the quality of the decoded audio unchanged. Experimental results show that LMCodec2 provides high-quality decoded audio at 0.76 kbps and 1.15 kbps. The MUSHRA score of LMCodec2 at 0.76 kbps exceeds that of Encodec at 1.5 kbps. Demo audio is available at https://dingweipeng.github.io/JACK.github.io. Our codec offers a new idea and solution for satellite voice communication.
Keywords: End-to-End Codec, Transformer Model, Huffman Coding, Satellite Voice Transmission
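The abstract does not spell out how token prediction lowers the bit rate. One common reading, suggested by the Huffman Coding keyword, is that a predicted token distribution lets frequent tokens receive shorter codewords than a fixed-length code would give them. The following is a minimal sketch of that general idea only, not the authors' implementation: the 4-token codebook, the probabilities in `predicted`, and the helper names `huffman_code_lengths` and `average_bits` are all hypothetical.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return the Huffman codeword length (in bits) for each symbol index."""
    if len(probs) == 1:
        return [1]
    # Heap items: (probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees puts every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def average_bits(probs):
    """Expected bits per token when coding with the Huffman code for probs."""
    lengths = huffman_code_lengths(probs)
    return sum(p * n for p, n in zip(probs, lengths))


# Hypothetical distribution predicted by a language model over 4 tokens.
predicted = [0.5, 0.25, 0.125, 0.125]

fixed_bits = math.log2(len(predicted))   # fixed-length code: 2 bits per token
coded_bits = average_bits(predicted)     # Huffman code: 1.75 bits per token
saving = 1 - coded_bits / fixed_bits     # relative bit-rate reduction

print(f"fixed: {fixed_bits} bits, coded: {coded_bits} bits, saving: {saving:.1%}")
```

In this toy case the saving is 12.5%. The actual reduction depends on how sharply the model's predictions concentrate probability, so the figure here should not be compared with the 25% reported in the abstract.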