Optimized lossless audio compression using DCT energy thresholding and machine learning technique

Uttam Mondal; Asish Debnath

doi:10.7494/csci.2025.26.3.6427

Authors

Uttam Mondal, Vidyasagar University
Asish Debnath, Vidyasagar University

DOI:

https://doi.org/10.7494/csci.2025.26.3.6427

Abstract

This paper proposes a novel lossless audio compression technique that combines Discrete Cosine Transform (DCT) coefficient selection based on energy thresholding, an XOR-based neural network compression model, and a CNN model. The DCT is first applied to the input audio signal to achieve better energy compaction, and the selected DCT coefficients are then transformed into a compressed binary stream. This binary stream is passed through two prediction-based optimized models in sequence for further compression: it is first processed by the neural network model for the XOR operation, and the resulting output is then fed into the CNN model, which reduces data dimensionality and generates the compressed audio data. The simulation results are analyzed using various statistical and robustness measures and compared with existing approaches.
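The first stage described above, DCT followed by energy-based coefficient selection, can be illustrated with a minimal sketch. The abstract does not state the exact thresholding rule, so this example assumes one plausible reading: keep the fewest largest-magnitude coefficients whose combined energy reaches a chosen fraction of the total. The names `dct_matrix`, `select_by_energy`, and `keep_fraction`, the frame length, and the synthetic test frame are all hypothetical and are not taken from the paper. Thresholding on its own discards information, so this sketch does not reproduce the paper's lossless pipeline, and the XOR and CNN stages are not shown.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (each row is one basis vector)."""
    k = np.arange(n)[:, None]          # frequency index
    t = np.arange(n)[None, :]          # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)            # rescale the DC row so the matrix is orthonormal
    return m

def select_by_energy(coeffs, keep_fraction=0.99):
    """Return sorted indices of the fewest largest-energy coefficients
    whose cumulative energy reaches keep_fraction of the total (assumed rule)."""
    energy = coeffs ** 2
    order = np.argsort(energy)[::-1]   # indices from largest to smallest energy
    cumulative = np.cumsum(energy[order])
    count = int(np.searchsorted(cumulative, keep_fraction * cumulative[-1])) + 1
    count = min(count, len(coeffs))
    return np.sort(order[:count])

# Synthetic frame (hypothetical data): two sinusoids plus a little noise.
rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
frame = (np.sin(2 * np.pi * 5 * t / n)
         + 0.5 * np.sin(2 * np.pi * 12 * t / n)
         + 0.01 * rng.standard_normal(n))

D = dct_matrix(n)
coeffs = D @ frame                     # forward DCT
kept = select_by_energy(coeffs, keep_fraction=0.99)

sparse = np.zeros(n)
sparse[kept] = coeffs[kept]            # zero out the coefficients below the threshold
approx = D.T @ sparse                  # inverse DCT of the thresholded coefficients

retained = float(np.sum(sparse ** 2) / np.sum(coeffs ** 2))
print(f"kept {len(kept)} of {n} coefficients, retained energy fraction {retained:.4f}")
```

Because the basis is orthonormal, the inverse transform is the matrix transpose and the full coefficient set reconstructs the frame exactly up to floating-point error. Only the thresholded reconstruction `approx` is an approximation.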


Author Biography

Asish Debnath, Vidyasagar University

B.Tech. (Comp. Sci.), M.Tech. (Comp. Sci.)

15 years of industrial experience
