Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration

  • Ercan Ozyildiz,
  • Indrajit Poddar,
  • Rajeev Sharma,
  • Yogesh Sethi

In order to incorporate naturalness in the design of Human Computer Interfaces (HCI), it is desirable to develop recognition techniques capable of handling continuous natural gesture and speech inputs. Hidden Markov Models (HMMs) provide a good framework for continuous gesture recognition and for multimodal fusion. Many researchers have reported high recognition rates for HMM-based gesture recognition, but the gestures they recognized were precisely defined and bound by syntactic and grammatical constraints. Natural gestures do not string together under such syntactic bindings, and a strict classification of natural gestures is not feasible. In this paper we examine hand gestures made in a very natural domain: a weather person narrating in front of a weather map. The gestures made by the weather person are embedded in a narration, which provides abundant data from an uncontrolled environment for studying the interaction between speech and gesture in the context of a display. We hypothesize that this domain is very similar to that of a natural HCI interface. We have implemented a continuous HMM-based gesture recognition framework. To understand the interaction between gesture and speech, we perform a co-occurrence analysis of different gestures with selected spoken keywords, and we show that the co-occurrence analysis can be used to improve continuous gesture recognition results. Fast feature extraction and tracking are accomplished by applying predictive Kalman filtering to a color-segmented stream of video images. The results in the weather narration domain should be a step toward natural gesture/speech HCI.
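As a concrete illustration of the recognition framework described above, the sketch below shows one common way to apply HMMs to gesture classification: train one model per gesture class and label an unseen feature sequence by maximum log-likelihood. This is a minimal sketch, not the paper's implementation; the hmmlearn library, the feature layout (one observation vector per video frame), and all parameter values (number of states, iterations) are assumptions.

```python
# Minimal per-class HMM gesture classifier, assuming hypothetical
# feature sequences (e.g. hand position/velocity per frame).
import numpy as np
from hmmlearn import hmm

def train_gesture_models(training_data, n_states=4):
    """Fit one Gaussian HMM per gesture class.

    training_data maps a gesture label to a list of observation
    sequences, each of shape (n_frames, n_features).
    """
    models = {}
    for label, sequences in training_data.items():
        X = np.concatenate(sequences)          # stack all frames
        lengths = [len(s) for s in sequences]  # sequence boundaries
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        models[label] = model
    return models

def classify(models, sequence):
    """Label an unseen sequence by maximum log-likelihood."""
    return max(models, key=lambda label: models[label].score(sequence))
```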
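The co-occurrence analysis mentioned above can be illustrated with a toy counting scheme: a keyword co-occurs with a gesture if it is spoken within a padded window around the gesture's time span. The annotation format, the 0.5 s window, and the example labels below are hypothetical, not taken from the paper.

```python
# Toy gesture/keyword co-occurrence counting over time-stamped
# annotations: each gesture is (label, start, end) and each keyword
# is (word, time), all in seconds.
from collections import defaultdict

def cooccurrence_counts(gestures, keywords, window=0.5):
    """Count keywords spoken within each gesture's padded time span."""
    counts = defaultdict(lambda: defaultdict(int))
    for label, start, end in gestures:
        for word, t in keywords:
            if start - window <= t <= end + window:
                counts[label][word] += 1
    return counts

gestures = [("point", 1.0, 2.2), ("contour", 3.0, 4.5)]
keywords = [("here", 1.4), ("front", 3.8), ("temperature", 6.0)]
counts = cooccurrence_counts(gestures, keywords)
print({g: dict(w) for g, w in counts.items()})
# {'point': {'here': 1}, 'contour': {'front': 1}}
```

Conditional statistics derived from such counts could then serve as a speech-based prior for re-ranking the gesture hypotheses produced by the HMMs.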
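For the tracking step, a minimal sketch of predictive Kalman filtering under a constant-velocity motion model is given below: the filter predicts the hand centroid in the next frame, so color segmentation can be restricted to a small search window around the prediction. The state layout and all noise covariances are assumed values, not the paper's.

```python
# Constant-velocity Kalman filter for predicting the hand centroid
# between frames. State is (x, y, vx, vy); only position is measured.
import numpy as np

dt = 1.0                                   # one frame
F = np.array([[1, 0, dt, 0],               # state transition
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)
H = np.array([[1, 0, 0, 0],                # measurement: position only
              [0, 1, 0, 0]], float)
Q = np.eye(4) * 1e-2                       # process noise (assumed)
R = np.eye(2) * 4.0                        # measurement noise (assumed)

def predict(x, P):
    """Propagate state and covariance one frame ahead."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with a measured centroid z = (x, y)."""
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P
```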