Multimodal appearance based Gaze-controlled virtual keyboard with synchronous asynchronous interaction for low-resource settings

dc.contributor.author	Meena, Yogesh Kumar
dc.contributor.author	Salvi, Manish
dc.coverage.spatial	United States of America
dc.date.accessioned	2025-09-04T07:14:08Z
dc.date.available	2025-09-04T07:14:08Z
dc.date.issued	2025-08
dc.identifier.citation	Meena, Yogesh Kumar and Salvi, Manish, "Multimodal appearance based Gaze-controlled virtual keyboard with synchronous asynchronous interaction for low-resource settings", arXiv, Cornell University Library, DOI: arXiv:2508.16606, Aug. 2025
dc.identifier.issn	2331-8422
dc.identifier.uri	https://doi.org/10.48550/arXiv.2508.16606
dc.identifier.uri	https://repository.iitgn.ac.in/handle/123456789/11852
dc.description.abstract	Over the past decade, the demand for communication devices has increased among individuals with mobility and speech impairments. Eye-gaze tracking has emerged as a promising solution for hands-free communication; however, traditional appearance-based interfaces often face challenges such as accuracy issues, involuntary eye movements, and difficulties with extensive command sets. This work presents a multimodal appearance-based gaze-controlled virtual keyboard that utilises deep learning in conjunction with standard camera hardware, incorporating both synchronous and asynchronous modes for command selection. The virtual keyboard application supports menu-based selection with nine commands, enabling users to spell and type up to 56 English characters, including uppercase and lowercase letters, punctuation, and a delete function for corrections. The proposed system was evaluated with twenty able-bodied participants who completed specially designed typing tasks using three input modalities: (i) a mouse, (ii) an eye-tracker, and (iii) an unmodified webcam. Typing performance was measured in terms of speed and information transfer rate (ITR) at both command and letter levels. Average typing speeds were 18.3 ± 5.31 letters/min (mouse), 12.60 ± 2.99 letters/min (eye-tracker, synchronous), 10.94 ± 1.89 letters/min (webcam, synchronous), 11.15 ± 2.90 letters/min (eye-tracker, asynchronous), and 7.86 ± 1.69 letters/min (webcam, asynchronous). ITRs were approximately 80.29 ± 15.72 bits/min (command level) and 63.56 ± 11 bits/min (letter level) with webcam in synchronous mode. The system demonstrated good usability and low workload with webcam input, highlighting its user-centred design and promise as an accessible communication tool in low-resource settings.
dc.description.statementofresponsibility	by Yogesh Kumar Meena and Manish Salvi
dc.language.iso	en_US
dc.publisher	Cornell University Library
dc.title	Multimodal appearance based Gaze-controlled virtual keyboard with synchronous asynchronous interaction for low-resource settings
dc.type	Article
dc.relation.journal	arXiv
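
The abstract reports information transfer rate (ITR) in bits/min at the command level (nine commands) and the letter level (56 characters), but this record does not state which formula was used. The sketch below assumes the Wolpaw definition that is common in brain-computer interface and gaze-typing studies, with equiprobable targets and errors spread uniformly over the wrong targets. The function names and the default accuracy of 1.0 are illustrative assumptions, not values taken from the paper.

```python
import math


def bits_per_selection(n_targets: int, accuracy: float = 1.0) -> float:
    """Bits conveyed by one selection under the Wolpaw ITR definition.

    Assumes n_targets equiprobable targets and errors distributed
    uniformly over the remaining n_targets - 1 targets.
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    bits = math.log2(n_targets)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets: int, selections_per_min: float,
                     accuracy: float = 1.0) -> float:
    """ITR in bits/min: bits per selection times the selection rate."""
    return bits_per_selection(n_targets, accuracy) * selections_per_min


# Letter level, webcam in synchronous mode: 56 characters at the reported
# mean speed of 10.94 letters/min. Error-free selection is an assumption here.
letter_itr = itr_bits_per_min(n_targets=56, selections_per_min=10.94)
print(f"{letter_itr:.2f} bits/min")
```

Under the error-free assumption, the letter-level figure for the webcam in synchronous mode comes out near 63.5 bits/min, close to the reported 63.56 bits/min. That agreement is suggestive only; the paper itself should be consulted for the exact ITR definition and the accuracy values used.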
