Virtual Keyboard App Using Computer Vision

An innovative input method for smartphones based on finger motion tracking

Background and Motivation

The emergence of mobile devices has revolutionized the way we communicate, work and access information. As mobile networking and microprocessor technologies progress, mobile devices such as smartphones are becoming powerful enough to replace desktop computers. In 2019, mobile users accounted for 53% of all web traffic. People are more likely to use a smartphone than a desktop computer for tasks like online shopping, video streaming and browsing social media sites. Moreover, there is a growing trend of people using their smartphones for work-related tasks, such as emailing, calling and appointment scheduling.

Despite the convenience offered by mobile devices, their compact form factor makes typing text difficult. For instance, texting on a smartphone usually involves tapping the virtual on-screen keyboard with both thumbs. In contrast to a standard physical keyboard, which lets users type with all ten fingers, a virtual on-screen keyboard has considerably smaller keys, making it hard to type with multiple fingers. The small key size also reduces typing speed and increases the chance of making a typo. To improve the typing experience on mobile devices, features such as autocorrect, swipe typing (popularized by keyboards like SwiftKey) and speech-to-text input have been added to mobile virtual keyboard apps. These features allow users to enter text without tapping individual keys on the on-screen keyboard and reduce typing errors.

Inspired by these new features on mobile virtual keyboard apps, we believe that alternative text input methods are needed on mobile devices. A new text input method should overcome the limitations of the small form factor while providing a good user experience. With the increase in computational power and camera resolution on mobile devices, we are motivated to explore an innovative solution that leverages computer vision and machine learning techniques. We seek to provide users with a more natural and intuitive way to interact with their mobile devices, ultimately enhancing their typing experience and overall satisfaction.

“SelfieType”

At the Consumer Electronics Show 2020 (CES 2020), Samsung introduced a concept product called “SelfieType”, which uses the front-facing camera as a keyboard for mobile devices. Users simply place their device on any flat surface and hold their hands in a typing position. The SelfieType AI engine analyzes the user’s finger movements and converts them into keyboard inputs; no additional hardware is required. This technology promises a portable and user-friendly typing experience on smartphones.

See more about SelfieType: https://news.samsung.com/global/how-c-lab-is-preparing-for-a-future-full-of-potential-part-1-c-lab-inside

Building on this idea, the project aims to create an innovative input method by developing an application similar to “SelfieType”, exploring the possibility of building a keyboard with computer vision techniques and finger motion tracking on mobile devices.

Objectives

Android Keyboard Application

A system-level keyboard application is developed on the Android platform to give users a typing experience on smartphones comparable to that of a standard keyboard. Computer vision techniques are leveraged to track hand and fingertip movements. Instead of tapping on the screen of the device, users type by tapping on any flat surface in front of the device camera, as if typing on a real keyboard. The virtual keyboard app uses the device camera to track the user’s finger movements and gestures and determines the text input from them.
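
To show how these pieces are intended to fit together, the skeleton below sketches the per-frame loop of such an app. It is a rough sketch only: Python is used for illustration (the production app would be built with Android APIs), and every helper function is a hypothetical placeholder for a component described in the Methodology section, with stub bodies included solely so the example runs.

    # Hypothetical per-frame loop for the virtual keyboard app. Each stub
    # stands in for a Methodology component and is a placeholder only.
    import cv2

    def detect_landmarks(frame):   # hand landmark detection (see "MediaPipe")
        return None

    def detect_tap(landmarks):     # tap position or None (see "Tap Detection")
        return None

    def map_to_key(tap):           # tap -> key (see "Key Input Determination")
        return ""

    def commit_text(key):          # would forward the key event to the Android IME
        print(key, end="", flush=True)

    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        landmarks = detect_landmarks(frame)
        tap = detect_tap(landmarks)
        if tap is not None:
            commit_text(map_to_key(tap))
    cap.release()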

Methodology

Hand Landmark Detection Model

MediaPipe
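
MediaPipe’s Hands solution returns 21 landmarks per detected hand, with x and y normalized to the frame dimensions. The sketch below uses MediaPipe’s Python API to print the index fingertip position from a live camera feed; an Android build would use the equivalent MediaPipe Android SDK, but the landmark output has the same structure.

    # Minimal MediaPipe Hands sketch (Python API, for illustration).
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands

    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=2,
                        min_detection_confidence=0.5,
                        min_tracking_confidence=0.5) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV delivers BGR frames.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                for hand in results.multi_hand_landmarks:
                    # Landmark 8 is the index fingertip; x and y are
                    # normalized to [0, 1] relative to the frame.
                    tip = hand.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
                    print(f"index fingertip: x={tip.x:.3f} y={tip.y:.3f}")
    cap.release()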

Tap Detection
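
One plausible heuristic for detecting a tap, shown here as an assumption rather than a confirmed design, is to watch a fingertip’s vertical velocity: in image coordinates a tap appears as rapid downward motion followed by an abrupt stop. The TapDetector class and its thresholds below are illustrative and would need tuning on real capture data.

    # Hypothetical velocity-threshold tap detector.
    from collections import deque

    class TapDetector:
        """Fires when a fingertip moves down quickly, then stops.
        Thresholds are placeholders, in normalized units per frame."""

        def __init__(self, down_thresh=0.015, stop_thresh=0.003):
            self.history = deque(maxlen=3)   # last three normalized y values
            self.down_thresh = down_thresh   # minimum downward speed
            self.stop_thresh = stop_thresh   # maximum speed counted as stopped

        def update(self, tip_y):
            """Feed one fingertip y per frame; returns True on a tap."""
            self.history.append(tip_y)
            if len(self.history) < 3:
                return False
            v_prev = self.history[-2] - self.history[-3]
            v_now = self.history[-1] - self.history[-2]
            # Image y grows downward: a tap is a large positive velocity
            # followed by near-zero velocity on the next frame.
            return v_prev > self.down_thresh and abs(v_now) < self.stop_thresh

A separate detector instance would be kept for each fingertip, fed the smoothed y coordinate of that fingertip’s landmark on every frame.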

Fingertip Motion Tracking
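
Raw landmark positions jitter from frame to frame, so fingertip tracks are typically smoothed before velocities are computed. The exponential moving average below is one common lightweight choice; it is a sketch under that assumption, not necessarily the filter adopted in this project.

    # Simple exponential moving-average smoother for fingertip tracks.
    class FingertipTracker:
        """alpha near 1 follows the raw signal; near 0 smooths harder."""

        def __init__(self, alpha=0.5):
            self.alpha = alpha
            self.state = {}   # finger id -> smoothed (x, y)

        def update(self, finger_id, x, y):
            if finger_id in self.state:
                px, py = self.state[finger_id]
                x = self.alpha * x + (1 - self.alpha) * px
                y = self.alpha * y + (1 - self.alpha) * py
            self.state[finger_id] = (x, y)
            return self.state[finger_id]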

Key Input Determination
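
Once a tap and its position are known, the position must be translated into a key. A straightforward scheme, hypothetical here, overlays a virtual QWERTY grid on a calibrated keyboard region of the frame and returns the key whose cell contains the fingertip; the region coordinates below are placeholders.

    # Hypothetical lookup from a normalized tap position to a QWERTY key.
    ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

    def key_at(x, y, region=(0.1, 0.5, 0.8, 0.4)):
        """Map a normalized tap position to a key, or None if it falls
        outside the keyboard region (left, top, width, height)."""
        left, top, width, height = region
        u = (x - left) / width    # position within the region, in [0, 1)
        v = (y - top) / height
        if not (0.0 <= u < 1.0 and 0.0 <= v < 1.0):
            return None           # tap landed outside the keyboard area
        row = ROWS[int(v * len(ROWS))]
        return row[int(u * len(row))]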

Fingertip Positioning
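
Because the camera views the typing surface at an oblique angle, fingertip positions in the image are perspective-distorted relative to the flat keyboard plane. One way to correct for this, again an assumption for illustration, is a one-off calibration in which the user taps the four corners of the desired keyboard area; OpenCV’s perspective transform then maps image coordinates into a rectified keyboard plane whose coordinates can feed the key lookup above.

    # Perspective-calibration sketch: rectify image-space fingertip
    # positions into a flat keyboard plane.
    import numpy as np
    import cv2

    def build_rectifier(corners):
        """corners: four image points tapped during calibration, ordered
        top-left, top-right, bottom-right, bottom-left."""
        src = np.float32(corners)
        dst = np.float32([[0, 0], [1, 0], [1, 1], [0, 1]])
        matrix = cv2.getPerspectiveTransform(src, dst)

        def rectify(x, y):
            point = np.float32([[[x, y]]])
            u, v = cv2.perspectiveTransform(point, matrix)[0, 0]
            return float(u), float(v)

        return rectify

    # Placeholder corner positions as they might be observed at calibration.
    rectify = build_rectifier([(0.18, 0.55), (0.82, 0.55),
                               (0.95, 0.95), (0.05, 0.95)])
    print(rectify(0.50, 0.75))   # fingertip in rectified keyboard coordinates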