CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task

Psutka, Josef; Švec, Jan; Pražák, Aleš

Full metadata record

DC field	Value	Language
dc.contributor.author	Psutka, Josef
dc.contributor.author	Švec, Jan
dc.contributor.author	Pražák, Aleš
dc.date.accessioned	2022-03-28T10:00:27Z	-
dc.date.available	2022-03-28T10:00:27Z	-
dc.date.issued	2021
dc.identifier.citation	PSUTKA, J., ŠVEC, J., PRAŽÁK, A. CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task. In Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings. Cham: Springer International Publishing, 2021. pp. 523-533. ISBN: 978-3-030-83526-2, ISSN: 0302-9743	cs
dc.identifier.isbn	978-3-030-83526-2
dc.identifier.issn	0302-9743
dc.identifier.uri	2-s2.0-85115207848
dc.identifier.uri	http://hdl.handle.net/11025/47248
dc.format	11 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer International Publishing	en
dc.relation.ispartofseries	Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings	en
dc.rights	The full text is accessible within the university to logged-in users.	cs
dc.rights	© Springer	en
dc.title	CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task	en
dc.type	conference paper	cs
dc.type	ConferenceObject	en
dc.rights.access	restrictedAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	Czech and Slovak languages are very similar, not only in writing but also in phonetic form. This work aims to find a suitable combination of these two languages to achieve better recognition results. We would like to show such a contribution on the Malach project. The Malach speech of Holocaust survivors is highly emotional, filled with many disfluencies, heavy accents, age-related coarticulation, and many non-speech events. Due to the nature of the corpus, it is very difficult to find other appropriate data for acoustic modeling, so such a combination can significantly increase the amount of training data. We will discuss the differences between the phoneme and grapheme way of combining Czech with Slovak. We will also compare different architectures of deep neural networks (TDNN, TDNNF, CNN-TDNNF) and tune the optimal topology. The proposed bilingual ASR approach provides a slight improvement over monolingual ASR systems, not only at the phoneme level but also at the grapheme level.	en
dc.subject.translated	Speech recognition	en
dc.subject.translated	Multilingual training	en
dc.subject.translated	Robustness	en
dc.subject.translated	Acoustic modeling	en
dc.identifier.doi	10.1007/978-3-030-83527-9_45
dc.type.status	Peer-reviewed	en
dc.identifier.obd	43933412
dc.project.ID	TN01000024/National Competence Center - Cybernetics and Artificial Intelligence	cs
Appears in collections:	Konferenční příspěvky / Conference Papers (KKY) OBD

Files attached to this record:

File	Size	Format
Psutka2021_Chapter_CNN-TDNN-BasedArchitectureForS.pdf	228.83 kB	Adobe PDF


Use this identifier to cite or link to this record: http://hdl.handle.net/11025/47248

All records in DSpace are protected by copyright, all rights reserved.
