HCI Beyond the GUI: Design for Haptic, Speech, Olfactory and Other Nontraditional Interfaces

Multimodal systems build on recognition-based component technologies (e.g., speech, drawing, and gesture recognizers). Advances in recognition technologies make it increasingly possible to build more capable multimodal systems. The expressive power of rich modalities such as speech and handwriting nonetheless frequently comes with ambiguity and imprecision in the messages (Bourguet, 2006). This ambiguity is reflected in the multiple candidate interpretations that recognizers produce for each input. Recognition technology has made steady progress, but remains considerably limited compared to human-level natural-language interpretation. Multimodal systems have therefore developed techniques that reduce this uncertainty by leveraging multiple modalities to produce more robust interpretations.
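As a rough illustration of how a second modality can reduce recognition uncertainty, the sketch below combines two lists of candidate interpretations. Everything in it is an assumption made for illustration: the candidate labels, the confidence values, the compatibility table, and the choice to score a joint interpretation as the product of the two confidences. It is not the method of any particular system discussed in this chapter.

```python
from itertools import product

# Hypothetical candidate lists: (interpretation, recognizer confidence).
speech_nbest = [("delete", 0.50), ("create", 0.45)]
gesture_nbest = [("point", 0.60), ("circle", 0.40)]

# Assumed compatibility table: which spoken command may pair with which gesture.
COMPATIBLE = {("create", "point"), ("delete", "circle")}


def fuse(speech, gesture, compatible):
    """Rank joint interpretations by the product of the two confidences,
    keeping only the pairs that the compatibility table allows."""
    joint = [
        ((s, g), s_conf * g_conf)
        for (s, s_conf), (g, g_conf) in product(speech, gesture)
        if (s, g) in compatible
    ]
    # Highest joint score first.
    return sorted(joint, key=lambda item: item[1], reverse=True)


ranked = fuse(speech_nbest, gesture_nbest, COMPATIBLE)
best_pair, best_score = ranked[0]
print(best_pair)
```

With these made-up numbers, the speech recognizer alone would pick "delete", but the joint ranking prefers the pair ("create", "point"), because the most confident gesture is only compatible with the second speech candidate. Either modality taken alone is ambiguous; the combination is less so.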
The following sections first present a historical perspective of the field (Section 12.2.1). Sections 12.2.2 and 12.2.3 then present technical concepts and mechanisms, and information flow, respectively.
Bolt's "Put that there" (1980) was one of the first demonstrations of multimodal user interface concepts. The system allowed users to create and control geometric shapes of multiple sizes and colors, shown on a large-format display embedded in an instrumented media room (Negroponte, 1978). Its novelty was that shape characteristics could be specified via speech, while location could be established via either speech or deictic (pointing) gestures. The position of users' arms was tracked using a device attached to a bracelet, and the system displayed an x on the screen to mark the perceived pointing location when a user spoke an utterance. To create new objects, users could for instance...