Sign Pose-based Transformer for Word-level Sign Language Recognition

Boháček, Matyáš; Hrúz, Marek

Full metadata record

DC field	Value	Language
dc.contributor.author	Boháček, Matyáš
dc.contributor.author	Hrúz, Marek
dc.date.accessioned	2023-02-13T11:00:21Z	-
dc.date.available	2023-02-13T11:00:21Z	-
dc.date.issued	2022
dc.identifier.citation	BOHÁČEK, M., HRÚZ, M. Sign Pose-based Transformer for Word-level Sign Language Recognition. In Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops. New York: IEEE, 2022. pp. 182-191. ISBN: 978-1-66545-824-5, ISSN: 2572-4398	cs
dc.identifier.isbn	978-1-66545-824-5
dc.identifier.issn	2572-4398
dc.identifier.uri	2-s2.0-85126778924
dc.identifier.uri	http://hdl.handle.net/11025/51463
dc.format	10 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	IEEE	en
dc.relation.ispartofseries	Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops	en
dc.rights	The full text is accessible within the university to logged-in users.	cs
dc.rights	© IEEE	en
dc.title	Sign Pose-based Transformer for Word-level Sign Language Recognition	en
dc.type	conference paper	cs
dc.type	ConferenceObject	en
dc.rights.access	restrictedAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	In this paper we present a system for word-level sign language recognition based on the Transformer model. We aim at a solution with low computational cost, since we see great potential in using such a recognition system on hand-held devices. We base the recognition on the estimation of the pose of the human body in the form of 2D landmark locations. We introduce a robust pose normalization scheme which takes the signing space into consideration and processes the hand poses in a separate local coordinate system, independent of the body pose. We show experimentally the significant impact of this normalization on the accuracy of our proposed system. We introduce several augmentations of the body pose that further improve the accuracy, including a novel sequential joint rotation augmentation. With all the systems in place, we achieve state-of-the-art top-1 results on the WLASL and LSA64 datasets. For WLASL, we are able to successfully recognize 63.18 % of sign recordings in the 100-gloss subset, which is a relative improvement of 5 % over the prior state of the art. For the 300-gloss subset, we achieve a recognition rate of 43.78 %, which is a relative improvement of 3.8 %. On the LSA64 dataset, we report a test recognition accuracy of 100 %.	en
dc.subject.translated	training	en
dc.subject.translated	visualization	en
dc.subject.translated	computational modeling	en
dc.subject.translated	gesture recognition	en
dc.subject.translated	assistive technologies	en
dc.subject.translated	transformers	en
dc.subject.translated	data models	en
dc.identifier.doi	10.1109/WACVW54805.2022.00024
dc.type.status	Peer-reviewed	en
dc.identifier.document-number	802187100020
dc.identifier.obd	43937109
dc.project.ID	LM2018101/LINDAT/CLARIAH-CZ – Digital Research Infrastructure for Language Technologies, Arts and Humanities	cs
dc.project.ID	90140/Large Research Infrastructure_(J) - e-INFRA CZ	cs
Appears in collections:	Conference papers (NTIS); Conference Papers (KKY); OBD

Files attached to this record:

File	Size	Format
Bohacek_Hruz_Sign_Pose-based_Transformer_WACVW_2022.pdf	840.81 kB	Adobe PDF

Please use this identifier to cite or link to this record: http://hdl.handle.net/11025/51463

All records in DSpace are protected by copyright, with all rights reserved.
