Title: | Automatic Correction of i/y Spelling in Czech ASR Output |
Authors: | Švec, Jan; Lehečka, Jan; Šmídl, Luboš; Ircing, Pavel |
Citation: | ŠVEC, J., LEHEČKA, J., ŠMÍDL, L., IRCING, P. Automatic Correction of i/y Spelling in Czech ASR Output. In: Text, Speech, and Dialogue: 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings. Cham: Springer, 2020. pp. 321-330. ISBN 978-3-030-58322-4, ISSN 0302-9743. |
Issue Date: | 2020 |
Publisher: | Springer |
Document type: | conference paper (conferenceObject) |
URI: | 2-s2.0-85091182120 http://hdl.handle.net/11025/43118 |
ISBN: | 978-3-030-58322-4 |
ISSN: | 0302-9743 |
Keywords in different language: | Grammatical error correction, ASR, BERT |
Abstract in different language: | This paper concentrates on the design and evaluation of a method that automatically corrects the spelling of i/y in Czech words at the output of an ASR decoder. After analyzing both the Czech grammar rules and the data, we decided to deal only with endings consisting of one of the consonants b/f/l/m/p/s/v/z followed by i/y, in both short and long forms. The correction is framed as a classification task in which each word belongs to the “i” class, the “y” class, or the “empty” class. Using the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) architecture, we were able to substantially improve the correctness of the i/y spelling on both simulated and real ASR output. Since the majority of native Czech speakers see a misspelled i/y in Czech text as a blatant error, the corrected output greatly improves the perceived quality of the ASR system. |
Rights: | Full text is not accessible. © Springer |
Appears in Collections: | Konferenční příspěvky / Conference papers (NTIS); Konferenční příspěvky / Conference Papers (KKY); OBD |
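The three-way labelling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only how a word might be assigned to the “i”, “y”, or “empty” class and how a predicted class could be written back, not the BERT classifier itself. It assumes that an "ending" means the final consonant-plus-vowel pair of a lowercased word and that "short and long forms" means i/í and y/ý. The names `label` and `apply_label` are hypothetical.

```python
import re

# Assumption: the target ending is one of b/f/l/m/p/s/v/z followed by a
# word-final i/í/y/ý, and the input is already lowercased.
ENDING = re.compile(r"[bflmpsvz]([iíyý])$")

# Vowel substitutions that preserve vowel length (short stays short, long stays long).
TO_I = {"i": "i", "í": "í", "y": "i", "ý": "í"}
TO_Y = {"i": "y", "í": "ý", "y": "y", "ý": "ý"}


def label(word: str) -> str:
    """Return the class of a word: "i", "y", or "empty" (no target ending)."""
    match = ENDING.search(word)
    if match is None:
        return "empty"
    return "i" if match.group(1) in "ií" else "y"


def apply_label(word: str, predicted: str) -> str:
    """Rewrite the final vowel so the word agrees with the predicted class."""
    if predicted == "empty" or ENDING.search(word) is None:
        return word
    table = TO_I if predicted == "i" else TO_Y
    return word[:-1] + table[word[-1]]


print(label("byli"))             # i
print(label("byly"))             # y
print(label("ženy"))             # empty ("n" is not in the consonant set)
print(apply_label("byli", "y"))  # byly
print(apply_label("malí", "y"))  # malý
```

In this sketch a classifier would only need to predict one of the three classes per word; the rewrite step is deterministic and leaves words in the “empty” class untouched.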
Files in This Item:
File | Size | Format |
---|---|---|
Švec2020_Chapter_AutomaticCorrectionOfIYSpellin.pdf | 251.46 kB | Adobe PDF |
Please use this identifier to cite or link to this item:
http://hdl.handle.net/11025/43118