Please use this identifier to cite or link to this item: https://libeldoc.bsuir.by/handle/123456789/38792
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Taha, M.
dc.contributor.author: Azarov, E. S.
dc.contributor.author: Likhachov, D. S.
dc.contributor.author: Petrovsky, A. A.
dc.contributor.author: Таха, М.
dc.contributor.author: Азаров, И. С.
dc.contributor.author: Лихачев, Д. С.
dc.contributor.author: Петровский, А. А.
dc.date.accessioned: 2020-04-04T11:37:52Z
dc.date.available: 2020-04-04T11:37:52Z
dc.date.issued: 2020
dc.identifier.citation: An efficient speech generative model based on deterministic/stochastic separation of spectral envelopes / Mostafa Taha [et al.] // Доклады БГУИР. – 2020. – № 18 (2). – P. 23–29. – DOI: http://dx.doi.org/10.35596/1729-7648-2020-18-2-23-29. [ru_RU]
dc.identifier.uri: https://libeldoc.bsuir.by/handle/123456789/38792
dc.description.abstract: The paper presents a speech generative model that provides an efficient way of generating a speech waveform from its amplitude spectral envelopes. The model is based on a hybrid speech representation that includes deterministic (harmonic) and stochastic (noise) components. The main idea behind the approach originates from the fact that a speech signal has a determined spectral structure that is statistically bound with the deterministic/stochastic energy distribution in the spectrum. The performance of the model is evaluated using an experimental low-bitrate wide-band speech coder. The quality of the reconstructed speech is evaluated using objective and subjective methods. Two objective quality characteristics were calculated: Modified Bark Spectral Distortion (MBSD) and Perceptual Evaluation of Speech Quality (PESQ). Narrow-band and wide-band versions of the proposed solution were compared with the MELP (Mixed Excitation Linear Prediction) speech coder and the AMR (Adaptive Multi-Rate) speech coder, respectively. A speech database of two female and two male speakers was used for testing. The tests show that the overall performance of the proposed approach is speaker-dependent and is better for male voices. Supposedly, this difference indicates the influence of pitch height on separation accuracy. Using the proposed approach in an experimental speech compression system provides decent MBSD values and PESQ values comparable to those of the AMR speech coder at 6.6 kbit/s. Additional subjective listening tests demonstrate that the implemented coding system retains the phonetic content and the speaker's identity, which supports the consistency of the proposed approach. [ru_RU]
dc.language.iso: en [ru_RU]
dc.publisher: БГУИР, РБ [ru_RU]
dc.subject: доклады БГУИР [ru_RU]
dc.subject: speech generative model [ru_RU]
dc.subject: harmonic plus noise model [ru_RU]
dc.subject: speech analysis [ru_RU]
dc.subject: speech coding [ru_RU]
dc.title: An efficient speech generative model based on deterministic/stochastic separation of spectral envelopes [ru_RU]
dc.type: Article [ru_RU]
Appears in Collections: № 18(2)

Files in This Item:
File: Taha_An.pdf
Size: 492 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.