Can people who have difficulty producing and articulating speech use sounds to control technology? The pilot project Sound Control examined this question and assessed whether a development project would be feasible.
Story by: Morten Tollefsen - 12.01.2011
MediaLT received several questions from organizations and users about how people with speech disabilities could use sounds and/or indistinct speech to control technology. We found little previous research or experience in this area, so we took the initiative for a pilot project on sound control. IT Funk supported the project, which ended on 31 December 2010.
The main objective of the project was to:
"Examine the possibilities for sound control of the PC, and to lay the foundation for a development project if possible and appropriate."
Milestones:
1. Analyze international R & D status in the field of sound control.
2. Define a functionality matrix for sound control of the PC.
3. Determine basic profiles with key functionality.
4. Evaluate possible technologies for sound control.
5. Establish international cooperation.
6. Lay the foundation for a main project if appropriate.
The main target group of the project was people who have speech difficulties and who have problems using a PC with a standard user interface. More broadly, the target group included everyone who can benefit from sound control as a substitute for, or in combination with, other forms of interaction.
Analyze international R & D: References are collected and published on the project website:
http://www.medialt.no/lenkerreferanser/847.aspx
Furthermore, the R&D situation is summarized in a separate status report:
http://www.medialt.no/statusrapport/1001.aspx (Norwegian only)
Define a functionality matrix for sound control of the PC: Personas were used as a method to identify the functionality typically sought in a sound control system. Personas are detailed descriptions of fictitious persons who serve as good examples of the characteristics of the user group. Based on knowledge from Sikte, Sunnaas and the Cerebral Palsy Association, four personas were created. This work made it clear that a wide range of accessible functionality was desirable, that users would have very different needs and abilities, and that adaptation at the individual level should be possible. This is consistent with other research findings.
Our assessment is that definition of a functionality matrix is not appropriate and that the system should be developed in such a way as to ensure the greatest possible degree of individual customization.
Determine basic profiles with key functionality: At the start of the project we thought that we could define different "standard packages" of functions that could be controlled by sounds, for example moving the mouse pointer, clicking, and dragging and dropping. During the project we saw that it would be better to develop a system that allows for as much individual customization as possible. Based on the work with personas, the project group agreed that the most desirable solution could be described as follows:
Evaluate possible technologies for sound control: A key question for the value of a sound control system is how many sounds a typical user is able to make, and whether he or she can reproduce these sounds consistently enough for the system to distinguish them. Since the user group is very heterogeneous, this question has no single clear answer, but the project group agreed to record the sounds of typical users in order to assess the technological possibilities for sound control.
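The question of whether a user's sounds can be told apart can be illustrated with a small sketch. It is not the project's method: it assumes, hypothetically, that each recorded sound has already been reduced to a short list of numeric features, and it uses a simple nearest-centroid comparison with leave-one-out testing. The function names, labels and feature values are all made up for illustration.

```python
from math import dist
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def leave_one_out_rate(recordings):
    """Estimate how reliably a user's sounds can be told apart.

    recordings maps a sound label to a list of feature vectors, one per
    repetition. Each repetition is held out in turn and assigned to the
    nearest centroid computed from the remaining repetitions. The return
    value is the fraction of repetitions assigned to their own label.
    """
    correct = 0
    total = 0
    for label, vectors in recordings.items():
        for i, held_out in enumerate(vectors):
            centroids = {}
            for other, other_vectors in recordings.items():
                # Exclude the held-out repetition from its own label's centroid.
                rest = vectors[:i] + vectors[i + 1:] if other == label else other_vectors
                if rest:
                    centroids[other] = centroid(rest)
            guess = min(centroids, key=lambda name: dist(centroids[name], held_out))
            correct += guess == label
            total += 1
    return correct / total


# Hypothetical two-dimensional features for three repetitions of two sounds.
recordings = {
    "ah": [[1.0, 0.2], [1.1, 0.1], [0.9, 0.3]],
    "mm": [[0.1, 1.0], [0.2, 0.9], [0.0, 1.1]],
}
print(leave_one_out_rate(recordings))
```

A rate close to 1.0 would suggest that the chosen sounds are reproduced consistently enough to be separated; a low rate would suggest choosing fewer or different sounds for that user.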
We went on to survey international work in this field. As part of this work, Miriam Nes Begnum attended the International Conference on Computers Helping People with Special Needs (ICCHP) in July 2010, where contact was established with Foad Hamidi from York University in Canada. Hamidi presented the paper "CanSpeak: A Customizable Speech Interface for People with Dysarthric Speech". The recordings made by Hamidi and his colleagues were considered sufficient for our purposes, so our own scheduled recordings were not carried out.
Testing at York University was done with four people and a vocabulary of 47 words. Without adaptation, vocabulary recognition was between 30 and 56%. Among people without speech difficulties, the result was 94%. With adaptation, the recognition rate increased radically to 84.3%. The very best results were achieved when family, teachers, nursing staff or speech specialists were involved. Relying only on the user to define appropriate phrases gave minimal improvement. Individual adjustment of the system, based on an assessment of pronunciation difficulties by speech specialists, is valuable.
User testing in the SMUDI project showed that the challenges associated with the use of microphones were far greater than we had foreseen. A sound control system for people with speech disabilities would make these challenges even greater, and we found it necessary to include microphone testing in the technological evaluation of the solution.
The testing clearly showed that there was a need for further work in this field, and a project on microphone and switch solutions was started on 1 September 2010:
In this project we tried to find an existing sound recognizer, but our national and international search did not turn up a suitable solution.
Our conclusion is that a dedicated sound recognizer must be developed in order to create a solution for sound control. We believe such a sound recognizer can be realized. Research is needed to measure the quality of recognition in the target group and the value of sound control.
Establish international cooperation: Our analysis of the present international R&D situation identified two relevant groups: York University and Hørselsbroen (Hearing Bridge). Cooperation was established with both, with York University as a research partner and Hørselsbroen as a development partner. A good foundation for further international cooperation has been laid.
Lay the foundation for a main project: In autumn 2010, most of the building blocks for a main project were in place, but signals from the organizations we were in contact with made us uncertain about how great the need for sound control of technology actually is. We therefore decided to conduct a targeted survey of organizations and users. Based on the results of the survey, the project group agreed that there was currently no basis for proceeding with a main project, as the number of users who would benefit is relatively small.