Institute of Electronics, Technical University of Lodz, Wolczanska 211/215, 90-924, Lodz, Poland
Typical input devices used nowadays for communication with the machine require manual control and cannot be used by persons with impaired movement capacity. Therefore, it is necessary to develop easily accessible systems for human-computer interaction that would give physically disabled persons the opportunity to become part of the Information Society. In this paper an algorithm for the detection of eye-blinks in image sequences is presented. The employed image processing methods include Haar-like features for automatic face detection, and template matching for eye tracking and eye-blink detection. Moreover, the developed human-computer interface controlled by eye-blinks is described.
A Human-Computer Interface (HCI) can be described as the point of communication between the human user and a computer. Typical input devices used nowadays require manual control and cannot be used by persons with impaired movement capacity. This fact induces the need for alternative methods of communication between humans and computers that would be suitable for the disabled. Therefore, work on the development of innovative human-computer interfaces attracts the attention of researchers all over the world.
A user-friendly human-computer interface for severely movement-impaired persons should fulfill the following conditions:
In this paper a vision-based system for the detection of voluntary eye-blinks is presented, together with its implementation as a Human-Computer Interface for the disabled. Section 2 gives an overview of previous work on interfaces for the disabled. The proposed eye-blink detection algorithm is described in Section 3. Section 4 presents the eye-blink controlled human-computer interface based on the proposed algorithm. The results are discussed in Section 5 and the conclusions are given in Section 6.
For severely paralyzed persons, whose ability of movement is limited to the muscles around the eyes, two main groups of human-computer interfaces are most suitable: brain-computer interfaces (BCI) and systems controlled by gaze (Starner 1998) or eye-blinks.
A brain-computer interface is a system that allows computer applications to be controlled by measuring and interpreting electrical brain activity. No muscle movements are required. Interfaces of this type enable users to operate virtual keyboards (Materka 2006), manage environmental control systems, use a text editor or web browser, or make physical movements (Ghaoui 2006). Brain-computer interfaces show great promise for people with severe physical impairments; however, their main drawbacks are intrusiveness and the need for special hardware.
Gaze-controlled and eye-blink-controlled HCIs belong to the second group of systems suitable for people who cannot speak or use their hands to communicate. Most of the existing methods for gaze communication are intrusive or use specialized hardware, such as infrared (IR) illumination devices (Thoumies 1998) or electrooculographs (EOG) (Gips 1996). Such systems use two kinds of input signals: the scanpath (the line of gaze determined by eye fixations) or eye-blinks. Eye-blink-controlled systems distinguish between voluntary and involuntary blinks and interpret single voluntary blinks or sequences of them. Specific mouth movements can also be included as an additional modality. Particular eye-blink patterns have specific keyboard or mouse commands assigned to them, e.g. a single long blink is associated with the TAB action, while a double short blink is interpreted as a mouse click. Such HCIs can be used to control simple games or to operate word-spelling programs.
Vision-based eye-blink detection methods can be classified into two groups: active and passive. Active eye-blink detection techniques require special illumination to take advantage of the retro-reflective property of the eye. The active approach to eye and eye-blink detection gives very accurate results and the method is robust (Seki 1998). However, active methods have some disadvantages. The main one is that they are ineffective in outdoor environments, where direct sunlight interferes with the IR illumination.
The main drawback of EOG-based and IR-based systems is the need for specialized hardware. Moreover, there is a concern about the safety of long-term use of IR illumination, since prolonged exposure of the eyeball to IR lighting may damage the retina. Therefore, the preferred solution is a passive, vision-based system controlled by gaze or eye-blinks.
Passive eye-blink detection methods do not use additional light sources. Blinks are detected from a sequence of images captured in the visible spectrum under natural illumination conditions. Most eye-blink detection techniques are in fact eye detection methods. Many approaches are used for this purpose, such as template matching (Horng 2004), skin color models (Horng 2004), projection (Zhou 2004), directional Circle Hough Transform (Kawaguchi 2000), multiple Gabor response waves (Li 2008) or eye detection using Haar-like features (Bradski 2005).
The proposed vision-based system for voluntary eye-blink detection is built from off-the-shelf components: a consumer-grade PC or laptop and a medium-quality webcam. Face images of low resolution (320×240 pixels) are processed at a speed of approximately 28 fps. The eye-blink detection algorithm consists of four major steps (Figure 1): (1) face detection, (2) eye region extraction, (3) eye-blink detection and (4) eye-blink classification. These steps are described in more detail in Sections 3.1-3.4.
Figure 1. Scheme of the proposed algorithm for eye-blink detection
The algorithm allows for eye-blink detection, estimation of eye-blink duration and, on this basis, classification of the eye-blinks as spontaneous or voluntary.
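A minimal control-flow sketch of this four-stage pipeline is given below, assuming an OpenCV-based capture loop; the stage names, variable names and camera settings are illustrative assumptions, and the stage bodies correspond to the steps detailed in Sections 3.1-3.4.

```cpp
// Illustrative control-flow skeleton of the four-stage pipeline; the stage
// bodies are filled in by the step-by-step sketches in Sections 3.1-3.4.
#include <opencv2/opencv.hpp>

enum class Stage { DetectFace, ExtractEyeRegion, TrackAndDetectBlink };

int main() {
    cv::VideoCapture cam(0);                    // medium-quality webcam
    cam.set(cv::CAP_PROP_FRAME_WIDTH, 320);     // 320x240 frames, as in the text
    cam.set(cv::CAP_PROP_FRAME_HEIGHT, 240);

    Stage stage = Stage::DetectFace;
    cv::Mat frame;
    while (cam.read(frame)) {
        switch (stage) {
        case Stage::DetectFace:          /* (1) Haar-based face detection     */ break;
        case Stage::ExtractEyeRegion:    /* (2) crop eye strip, save template */ break;
        case Stage::TrackAndDetectBlink: /* (3) correlation-based tracking,
                                            (4) blink-duration classification */ break;
        }
    }
    return 0;
}
```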
The first step in the proposed algorithm for eye-blink monitoring is face detection. For this purpose a statistical approach using features calculated on the basis of Haar-like masks is employed (Viola 2001). The Haar-like features are computed by convolving the image with templates of different sizes and orientations. The detection decision is carried out by a cascade of boosted tree classifiers. The simple "weak" classifiers are trained on face images of a fixed size of 24×24 pixels. Face detection is performed by sliding a search window of the same size as the training images across the test image. The method was tested on a set of 150 face images and achieved an accuracy of 94%.
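This step corresponds closely to the boosted Haar-cascade detector available in OpenCV; the sketch below shows how such a detector can be invoked. The cascade file name and the detection parameters are assumptions, not values reported by the authors.

```cpp
// Haar-cascade face detection on a grayscale frame (OpenCV).
#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

std::vector<cv::Rect> detectFaces(const cv::Mat& frameBgr,
                                  cv::CascadeClassifier& cascade) {
    cv::Mat gray;
    cv::cvtColor(frameBgr, gray, cv::COLOR_BGR2GRAY);
    cv::equalizeHist(gray, gray);             // improve contrast before detection

    std::vector<cv::Rect> faces;
    // The detector slides a search window over the image at several scales;
    // the window corresponds to the 24x24 training samples scaled up.
    cascade.detectMultiScale(gray, faces,
                             1.1,             // scale step between window sizes
                             3,               // min. neighbouring detections to accept
                             0, cv::Size(60, 60));
    return faces;
}

// Usage (frontal-face cascade shipped with OpenCV, path assumed):
//   cv::CascadeClassifier cascade("haarcascade_frontalface_alt.xml");
//   std::vector<cv::Rect> faces = detectFaces(frame, cascade);
```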
The next step of the algorithm is eye localization. The position of the eyes in the face image is found on the basis of geometrical dependencies known for the human face. The traditional rules of proportion divide the face into six equal squares, two by three (Oguz 1996). According to these rules, the eyes are located at about 0.4 of the head height, measured down from the top of the head (Figure 2). The extracted eye-region image is then preprocessed for eye-blink detection.
Figure 2. Rules of human face proportions
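A minimal sketch of extracting the eye region from a detected face rectangle according to this proportion rule is given below; the exact crop margins are assumed values consistent with the 0.4 rule, not the authors' exact ones.

```cpp
// Crop an eye-region strip from a detected face rectangle using the classical
// face-proportion rule: the eyes lie roughly 0.4 of the way down from the top
// of the head (here, the top of the detected face box). The strip from 0.3 to
// 0.5 of the face height is an assumed margin around that line.
#include <opencv2/core.hpp>

cv::Rect eyeRegion(const cv::Rect& face) {
    int y = face.y + static_cast<int>(0.3 * face.height);
    int h = static_cast<int>(0.2 * face.height);
    return cv::Rect(face.x, y, face.width, h);   // full face width, narrow strip
}

// Usage:
//   cv::Mat eyes = frame(eyeRegion(faces[0])).clone();
```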
The detected eyes are tracked using the normalized cross-correlation method (1). The template image of the user's eyes is automatically acquired during the initialization of the system.
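The equation referred to as (1) does not survive in this text; one standard form of the normalized cross-correlation coefficient between the open-eye template $T$ and the image patch under it at position $(x,y)$ is given below (this reconstruction is an assumption, and the paper may use a slightly different variant):

$$r(x,y)=\frac{\sum_{i,j}\bigl(I(x+i,\,y+j)-\bar{I}_{x,y}\bigr)\bigl(T(i,j)-\bar{T}\bigr)}{\sqrt{\sum_{i,j}\bigl(I(x+i,\,y+j)-\bar{I}_{x,y}\bigr)^{2}\,\sum_{i,j}\bigl(T(i,j)-\bar{T}\bigr)^{2}}}\qquad(1)$$

where $\bar{T}$ is the mean intensity of the template and $\bar{I}_{x,y}$ is the mean intensity of the image patch under the template; the maximum of $r$ over the search region is taken as the correlation coefficient for the current frame.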
The correlation coefficient is a measure of the resemblance of the current eye image to the saved template of the open eye (Figure 3). Therefore, it can be regarded as a measure of the openness of the eye. An example plot of the correlation coefficient over time is presented in Figure 4.
Figure 3. Eye images used as templates
The change of the correlation coefficient over time is analyzed in order to detect voluntary eye-blinks lasting longer than 250 ms. If the value of the coefficient is lower than the predefined threshold TL for two consecutive frames, the beginning of an eye-blink is detected. The end of the eye-blink is found when the value of the correlation coefficient rises above the threshold TH (Figure 4). The values of the thresholds TL and TH were determined experimentally. If the duration of the detected eye-blink is longer than 250 ms and shorter than 2 s, the blink is regarded as a "control" blink.
Figure 4. Eye-blink detection procedure
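A compact sketch of this detection and classification logic is shown below. It assumes OpenCV's normalized correlation template matching as the implementation of the correlation measure, and the concrete values of TL and TH are placeholders, since the paper only states that they were determined experimentally.

```cpp
// Per-frame blink detection from the normalized correlation coefficient.
// The TL/TH values are assumed placeholders; the 250 ms and 2 s limits
// follow the description in the text.
#include <opencv2/imgproc.hpp>
#include <chrono>

struct BlinkDetector {
    double TL = 0.55, TH = 0.75;           // assumed low/high thresholds
    int lowFrames = 0;                     // consecutive frames below TL
    bool blinkInProgress = false;
    std::chrono::steady_clock::time_point blinkStart;

    // Returns true when a voluntary ("control") blink has just been classified.
    bool update(const cv::Mat& eyeRegionGray, const cv::Mat& openEyeTemplate) {
        cv::Mat result;
        cv::matchTemplate(eyeRegionGray, openEyeTemplate, result,
                          cv::TM_CCOEFF_NORMED);
        double corr;
        cv::minMaxLoc(result, nullptr, &corr);      // best match = eye "openness"

        auto now = std::chrono::steady_clock::now();
        if (!blinkInProgress) {
            lowFrames = (corr < TL) ? lowFrames + 1 : 0;
            if (lowFrames >= 2) {                   // two consecutive frames below TL
                blinkInProgress = true;
                blinkStart = now;
            }
        } else if (corr > TH) {                     // eye re-opened
            blinkInProgress = false;
            lowFrames = 0;
            auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                          now - blinkStart).count();
            return ms > 250 && ms < 2000;           // voluntary "control" blink
        }
        return false;
    }
};
```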
The algorithm for automatic detection of voluntary eye-blinks was employed in a Human-Computer Interface. The applications were written in C++ using Visual Studio and the OpenCV library. The system is built from off-the-shelf components: an internet camera and a consumer-grade personal computer. For best performance, the distance between the camera and the user's head should not be greater than 150 cm. The system set-up is presented in Figure 5.
Figure 5. Test bench for the proposed Human-Computer Interface
The proposed interface, designed for Windows OS, has the following functionalities:
The main elements of the developed HCI are the virtual keyboard and the screen mouse. The operation of the interface is based on activating certain "buttons" of the virtual keyboard or mouse by performing control blinks. Subsequent buttons are highlighted automatically in a sequence. If a control blink is detected by the system, the action assigned to the highlighted button is executed. The user is informed about the detection of the eye-blink in two steps: a sound is generated when the start of an eye-blink is detected, and another sound is played when the control blink is registered.
In the case of the virtual keyboard the alphanumeric signs are selected in two steps. The first step is selecting the column containing the desired sign. When a control blink is detected, the signs in that column are highlighted in sequence, and a second control blink enters the selected sign into the active text editor.
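The column-then-sign scanning described above can be sketched as follows; the callbacks waitForScanStep and controlBlinkDetected are hypothetical hooks standing in for the interface's highlighting timer and blink detector.

```cpp
// Two-step scanning selection for the virtual keyboard: columns are highlighted
// in sequence; a control blink selects the column, then the signs within that
// column are scanned and a second control blink selects the sign.
#include <string>
#include <vector>

char selectSign(const std::vector<std::string>& columns,   // signs per column
                bool (*controlBlinkDetected)(),            // assumed callback
                void (*waitForScanStep)()) {               // assumed callback
    // Step 1: scan columns until a control blink is registered.
    for (std::size_t c = 0; ; c = (c + 1) % columns.size()) {
        waitForScanStep();                       // highlight column c
        if (!controlBlinkDetected()) continue;
        // Step 2: scan the signs within the selected column.
        for (std::size_t r = 0; ; r = (r + 1) % columns[c].size()) {
            waitForScanStep();                   // highlight sign r
            if (controlBlinkDetected())
                return columns[c][r];            // enter the selected sign
        }
    }
}
```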
The arrangement of the alphanumeric signs on the virtual keyboard differs from the QWERTY layout. The signs are placed in such a way that the access time to a particular sign is inversely proportional to the frequency of appearance of that sign in a given language, so the most frequent signs are reached fastest (Figure 6). Therefore, a different arrangement of letters is required for each language.
The screen mouse menu consists of seven function buttons and the "Exit" button (Figure 7). Activating one of the four arrow buttons causes the mouse cursor to move in the selected direction. A second control blink stops the movement of the cursor. Buttons L, R and 2L are responsible for left click, right click and double left click, respectively. In this way the screen mouse gives the user access to all functions of Windows Explorer.
Figure 6. Letter arrangement on the virtual keyboard
Figure 7. Screen mouse panel
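On Windows, the arrow and click buttons of such a screen mouse can be mapped to standard Win32 calls; the sketch below is an assumed illustration of this mapping, not the interface's actual code.

```cpp
// Win32 primitives behind the screen-mouse buttons: incremental cursor
// movement in a chosen direction and a left-button click.
#include <windows.h>

// Move the cursor by (dx, dy) pixels; called repeatedly by a timer while an
// arrow button is active, until the stopping control blink arrives.
void nudgeCursor(int dx, int dy) {
    POINT p;
    GetCursorPos(&p);
    SetCursorPos(p.x + dx, p.y + dy);
}

// Emit a left mouse click at the current cursor position (button L).
void leftClick() {
    INPUT in[2] = {};
    in[0].type = INPUT_MOUSE;
    in[0].mi.dwFlags = MOUSEEVENTF_LEFTDOWN;
    in[1].type = INPUT_MOUSE;
    in[1].mi.dwFlags = MOUSEEVENTF_LEFTUP;
    SendInput(2, in, sizeof(INPUT));
}
```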
The developed eye-blink controlled HCI was tested by 49 users: 12 disabled and 37 healthy. The functionality of the interface was assessed in two ways: by measuring the time required to enter particular sequences of alphanumeric signs, and by assessing the precision of using the screen mouse. Each user's performance was measured twice: before and after a 2-hour training session. The results are summarized in Table 1. The average time of entering a single sign was 16.8 s before the training session and 11.7 s after it. The average time of moving the cursor from the bottom right corner to the center of the screen was 7.46 s. The percentage of detected control eye-blinks was also calculated; it was approximately 99%.
Table 1. Time of entering the test words before and after the training session

| Word | Before training: time range (s) | Before training: average time (s) | Before training: std. dev. (s) | After training: time range (s) | After training: average time (s) | After training: std. dev. (s) |
|---|---|---|---|---|---|---|
| | 30.4-76.3 | 40.1 | 6.85 | 27.3-50.1 | 35.03 | 6.36 |
| NAME | 37.1-78.9 | 46.2 | 7.02 | 30.8-56.9 | 39.77 | 6.95 |
| MY NAME IS | 91.3-142.7 | 111.8 | 8.39 | 82.6-109.3 | 93.45 | 7.05 |
| INFORMATION | 98.1-157.9 | 132.5 | 14.2 | 82.9-131.2 | 100.57 | 13.13 |
| GOOD MORNING | 104.2-178.3 | 141.3 | 13.98 | 93.1-140.3 | 108.27 | 14.47 |
The users were asked to assess the usefulness of the proposed interface as good or poor. 91% of the testers described the interface as good. The main complaints concerned the difficulty of learning the interface and the need for long training.
The obtained results show that the proposed algorithm allows for accurate detection of voluntary eye-blinks, with a detection rate of approximately 99%. The performed tests demonstrate that the designed eye-blink controlled Human-Computer Interface is a useful tool for communication with the machine. The opinions of the disabled testers of the prototype version of the eye-blink controlled human-computer interface were enthusiastic, and the proposed system was brought to market by Telekomunikacja Polska and the Orange™ group as an interface for the disabled under the name b-Link (http://sourceforge.net/projects/b-link/).