Title: | Software for recognizing speaker by voice |
Authors: | Chen Zhengyu |
Keywords: | conference proceedings;speaker recognition;feature extraction;neural networks |
Issue Date: | 2025 |
Publisher: | BSUIR |
Citation: | Chen Zhengyu. Software for recognizing speaker by voice / Chen Zhengyu // Information Security : proceedings of the 61st Scientific Conference of Postgraduates, Master's Students and Students of BSUIR, Minsk, April 21–25, 2025 / Belarusian State University of Informatics and Radioelectronics. – Minsk, 2025. – P. 13–17. |
Abstract: | FBank (Filter Bank) is a front-end processing algorithm that processes audio in a way similar to the human ear and extracts features that improve speech recognition performance. The system uses CAM++, an efficient network based on context-aware masking, which takes a densely connected time-delay neural network (D-TDNN) as its backbone and adopts a novel multi-granularity pooling to capture context information at different levels. Building on the respective advantages of FBank and CAM++, this study designs software for recognizing a speaker by voice and implements the system in PyTorch. |
URI: | https://libeldoc.bsuir.by/handle/123456789/61457 |
Appears in Collections: | Information Security : proceedings of the 61st Scientific Conference of Postgraduates, Master's Students and Students (2025) |
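
The FBank front end named in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the 16 kHz sample rate, the 25 ms frame, the 512-point transform, and the 40 filters are all assumptions chosen for illustration, and a naive DFT stands in for an optimized FFT so that the example needs only the Python standard library. It computes log Mel filter bank energies for a single frame; a real system would process many overlapping frames with an optimized library routine.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 spectrum bins."""
    n_bins = n_fft // 2 + 1
    # n_mels + 2 edge points, equally spaced in mel, converted back to Hz.
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            # Triangle: zero outside [lo, hi], peak of 1.0 at mid.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame, n_fft):
    """Hamming-window one frame, zero-pad to n_fft, return |DFT|^2 per bin."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n_fft // 2 + 1):
        # Naive DFT; samples beyond len(frame) are zero and contribute nothing.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n_fft)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def fbank_frame(frame, bank, n_fft):
    """Log Mel filter bank energies for one frame (small floor avoids log(0))."""
    spec = power_spectrum(frame, n_fft)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
            for row in bank]


# Illustrative settings (assumptions, not taken from the paper).
SAMPLE_RATE = 16000
N_FFT = 512
N_MELS = 40

# One 25 ms frame of a 1 kHz test tone.
frame = [math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(400)]
bank = mel_filter_bank(N_MELS, N_FFT, SAMPLE_RATE)
features = fbank_frame(frame, bank, N_FFT)
peak = max(range(N_MELS), key=features.__getitem__)
```

In this sketch `features` holds one 40-dimensional vector, and `peak` is the index of the filter with the most energy, which for the test tone should be one of the filters whose passband covers 1 kHz. Stacking such vectors over successive frames gives the kind of time-by-frequency input that a speaker embedding network such as CAM++ consumes.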
File | Description | Size | Format |
---|---|---|---|
Chen_Zhengyu_Software.pdf | | 299.19 kB | Adobe PDF |