Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement

Herasimovich, V.; Petrovsky, Al. A.; Avramov, V. V.; Petrovsky, A.

Full metadata record

DC Field	Value	Language
dc.contributor.author	Herasimovich, V.	-
dc.contributor.author	Petrovsky, Al. A.	-
dc.contributor.author	Avramov, V. V.	-
dc.contributor.author	Petrovsky, A.	-
dc.date.accessioned	2018-12-15T07:24:42Z	-
dc.date.available	2018-12-15T07:24:42Z	-
dc.date.issued	2018	-
dc.identifier.citation	Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement / V. Herasimovich [et al.] // Multimedia and Network Information Systems : Proceedings of the 11th International Conference MISSI 2018 / Editors: K. Choroś [et al.]. – 2018. – P. 109–119.	ru_RU
dc.identifier.uri	https://libeldoc.bsuir.by/handle/123456789/33888	-
dc.description.abstract	The article presents a universal sound coding framework. The encoding algorithm works at the junction of the transform and parametric approaches. The input signal passes through a decorrelating transform, wavelet packet decomposition (WPD), which is tuned to the perceptual structure of the analyzed signal by psychoacoustic modelling. The parameterization stage is the matching pursuit (MP) algorithm with WPD-based dictionaries. The selected parameters are then quantized and coded for transmission to the decoder. A quantization algorithm based on artificial neural networks with a deep autoencoder (DAE) architecture is presented. The decoding part of the coder includes a listening enhancement function. Since the decoder input consists of parameters distributed across subbands, it is only necessary to decompose the noise signal with the corresponding filterbank and estimate the subband gain factor from this information. Results of the research, including objective difference grade scores and a performance demonstration, are shown.	ru_RU
dc.language.iso	en	ru_RU
dc.publisher	Springer	ru_RU
dc.subject	scientists' publications	ru_RU
dc.subject	audio/speech coding	ru_RU
dc.subject	wavelet packet	ru_RU
dc.subject	matching pursuit	ru_RU
dc.subject	psychoacoustics	ru_RU
dc.subject	neural networks	ru_RU
dc.subject	deep autoencoder	ru_RU
dc.subject	listening enhancement	ru_RU
dc.title	Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement	ru_RU
dc.type	Article	ru_RU
dc.identifier.DOI	https://doi.org/10.1007/978-3-319-98678-4_13	-
Appears in Collections:	Publications in Foreign Editions
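
Illustrative note on the parameterization stage named in the abstract: matching pursuit (MP) greedily picks, at each step, the dictionary atom most correlated with the current residual and subtracts its contribution. The sketch below is a minimal generic MP, not the authors' implementation. The function names `matching_pursuit` and `haar_dictionary` are hypothetical, and the 4-sample orthonormal Haar basis is an assumption that merely stands in for the WPD-based dictionaries of the paper; the psychoacoustic tuning and DAE quantizer are not modelled.

```python
import math


def matching_pursuit(signal, dictionary, n_atoms):
    """Generic greedy matching pursuit over unit-norm atoms.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_atoms):
        # Find the atom with the largest absolute inner product with the residual.
        best_idx, best_coef = 0, 0.0
        for idx, atom in enumerate(dictionary):
            coef = sum(r * a for r, a in zip(residual, atom))
            if abs(coef) > abs(best_coef):
                best_idx, best_coef = idx, coef
        if best_coef == 0.0:
            break  # residual is orthogonal to every atom; nothing left to extract
        selected.append((best_idx, best_coef))
        # Subtract the chosen atom's contribution from the residual.
        atom = dictionary[best_idx]
        residual = [r - best_coef * a for r, a in zip(residual, atom)]
    return selected, residual


def haar_dictionary():
    """Toy 4-sample orthonormal Haar basis (assumed stand-in for a WPD dictionary)."""
    s = 1 / math.sqrt(2)
    return [
        [0.5, 0.5, 0.5, 0.5],
        [0.5, 0.5, -0.5, -0.5],
        [s, -s, 0.0, 0.0],
        [0.0, 0.0, s, -s],
    ]


if __name__ == "__main__":
    atoms, residual = matching_pursuit([4.0, 2.0, 1.0, 1.0], haar_dictionary(), 2)
    print(atoms)
    print(residual)
```

In a coder of the kind the abstract describes, the (index, coefficient) pairs would be the parameters passed on to quantization; how many atoms to keep per frame is a design choice the abstract does not specify.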