Fast phonetic/lexical searching in the archives of the Czech Holocaust testimonies: advancing towards the MALACH project visions

Psutka, Josef; Švec, Jan; Psutka, Josef V.; Vaněk, Jan; Pražák, Aleš; Šmídl, Luboš

Title:	Fast phonetic/lexical searching in the archives of the Czech Holocaust testimonies: advancing towards the MALACH project visions
Other titles:	Rychlé fonetické/lexikální hledání v archivech výpovědí českého holokaustu: směřování k cílům projektu MALACH
Authors:	Psutka, Josef; Švec, Jan; Psutka, Josef V.; Vaněk, Jan; Pražák, Aleš; Šmídl, Luboš
Source document citation:	PSUTKA, Josef; ŠVEC, Jan; PSUTKA, Josef V. [et al.]. Fast phonetic/lexical searching in the archives of the Czech Holocaust testimonies: advancing towards the MALACH project visions. In: Text, speech and dialogue. Berlin: Springer, 2010, p. 385-391. (Lecture notes in computer science; 6231). ISBN 978-3-642-15759-2.
Date of issue:	2010
Publisher:	Springer
Document type:	article
URI:	http://www.kky.zcu.cz/cs/publications/PsutkaJosefV_2010_FastPhoneticLexical http://hdl.handle.net/11025/16998
ISBN:	978-3-642-15759-2
Keywords:	speech recognition; phonetic searching; lexical searching
Abstract (translated from Czech):	This article describes a system for fast phonetic/lexical searching of a large audiovisual archive of Czech Holocaust testimonies. The described system is the first step towards fulfilling the MALACH project. More than 1,000 hours of testimonies were automatically recognized and phonetically indexed. Special attention was paid to colloquial (non-standard) words.
Abstract (English):	In this paper we describe a system for fast phonetic/lexical searching in the large archives of Czech Holocaust testimonies. The developed system is the first step towards fulfilling the MALACH project visions [1,2], at least with respect to easier and faster access to the Czech part of the archives. More than one thousand hours of spontaneous, accented and highly emotional speech of Czech Holocaust survivors, stored at the USC Shoah Foundation Institute as video interviews, were automatically transcribed and phonetically/lexically indexed. Special attention was paid to the processing of colloquial words, which appear very frequently in spontaneous Czech speech. The final access to the archives is very fast and makes it possible to detect segments of interviews containing spoken words, clusters of words occurring within pre-defined time intervals, and also words that were not included in the working vocabulary (OOV words).
Rights:	© Josef Psutka - Jan Švec - Josef V. Psutka - Jan Vaněk - Aleš Pražák - Luboš Šmídl
Appears in collections:	Články / Articles (KIV); Články / Articles (KKY)

Files attached to this record:

File	Description	Size	Format
PsutkaJosefV_2010_FastPhoneticLexical.pdf	Full text	201.41 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/16998

All records in DSpace are protected by copyright, all rights reserved.
