How Much End-to-End is Tacotron 2 End-to-End TTS System

Tihelka, Daniel; Matoušek, Jindřich; Tihelková, Alice

Full metadata record

DC field	Value	Language
dc.contributor.author	Tihelka, Daniel
dc.contributor.author	Matoušek, Jindřich
dc.contributor.author	Tihelková, Alice
dc.date.accessioned	2022-03-28T10:00:27Z	-
dc.date.available	2022-03-28T10:00:27Z	-
dc.date.issued	2021
dc.identifier.citation	TIHELKA, D. MATOUŠEK, J. TIHELKOVÁ, A. How Much End-to-End is Tacotron 2 End-to-End TTS System. In Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings. Cham: Springer International Publishing, 2021. pp. 511-522. ISBN: 978-3-030-83526-2, ISSN: 0302-9743	cs
dc.identifier.isbn	978-3-030-83526-2
dc.identifier.issn	0302-9743
dc.identifier.uri	2-s2.0-85115273150
dc.identifier.uri	http://hdl.handle.net/11025/47247
dc.format	12 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer International Publishing	en
dc.relation.ispartofseries	Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings	en
dc.rights	The full text is accessible within the university to logged-in users.	cs
dc.rights	© Springer	en
dc.title	How Much End-to-End is Tacotron 2 End-to-End TTS System	en
dc.type	conference paper	cs
dc.type	ConferenceObject	en
dc.rights.access	restrictedAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	In recent years, the concept of end-to-end text-to-speech synthesis has begun to attract the attention of researchers. The motivation is simple – replacing the individual modules that TTS traditionally built on with a powerful deep neural network simplifies the architecture of the entire system. However, how capable are such end-to-end systems of dealing with classic tasks such as G2P, text normalisation, homograph disambiguation and other issues inseparably linked to text-to-speech systems? In the present paper, we explore three free implementations of the Tacotron 2-based speech synthesizers, focusing on their abilities to transform the input text into correct pronunciation, not only in terms of G2P conversion but also in handling issues related to text analysis and the prosody patterns used.	en
dc.subject.translated	End-to-end speech synthesis	en
dc.subject.translated	Tacotron 2	en
dc.subject.translated	WaveRNN	en
dc.subject.translated	MelGan	en
dc.subject.translated	Text processing	en
dc.subject.translated	Homograph disambiguation	en
dc.subject.translated	Prosody patterns	en
dc.identifier.doi	10.1007/978-3-030-83527-9_44
dc.type.status	Peer-reviewed	en
dc.identifier.obd	43933411
dc.project.ID	GA19-19324S/Fully trainable Czech text-to-speech synthesis using deep neural networks	cs
dc.project.ID	90140/Large research infrastructure_(J) - e-INFRA CZ	cs
Appears in collections:	Conference papers (KAJ); Conference Papers (KKY); OBD

Files attached to this record:

File	Size	Format
Tihelka2021_Chapter_HowMuchEnd-to-EndIsTacotron2En.pdf	222.38 kB	Adobe PDF


Use this identifier to cite or link to this record: http://hdl.handle.net/11025/47247

All records in DSpace are protected by copyright, all rights reserved.
