
18th ACM International Conference on Multimodal Interaction (ICMI 2016), November 12–16, 2016, Tokyo, Japan

ICMI 2016 – Proceedings


Invited Talks

Understanding People by Tracking Their Word Use (Keynote)
James W. Pennebaker
(University of Texas at Austin, USA)
The words people use in their conversations, emails, and diaries can tell us how they think, approach problems, connect with others, and behave. Of particular interest is people's use of function words -- pronouns, articles, and other small and forgettable words. Processed in the brain differently from content words, function words reveal where people are paying attention and how they think about themselves and others. After summarizing dozens of studies on language and psychological state, the talk will explore how text analysis can help us get inside the heads of the people we study.
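As a concrete illustration of the kind of analysis described above, the following Python sketch computes the proportion of function words (pronouns and articles) in a text. The category word lists are small hypothetical samples for illustration only, not the actual dictionaries used in Pennebaker's work.

# Hedged sketch: simple function-word profiling of a text.
# The word lists below are illustrative placeholders, not real LIWC categories.
from collections import Counter
import re

FUNCTION_WORDS = {
    "pronouns": {"i", "you", "we", "he", "she", "they", "it", "me", "us", "them"},
    "articles": {"a", "an", "the"},
}

def function_word_profile(text):
    """Return the proportion of each function-word category in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1
    counts = Counter(tokens)
    return {category: sum(counts[w] for w in words) / total
            for category, words in FUNCTION_WORDS.items()}

print(function_word_profile("I think we should talk about the results before they arrive."))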

Learning to Generate Images and Their Descriptions (Keynote)
Richard Zemel
(University of Toronto, Canada)
Recent advances in computer vision, natural language processing, and related areas have led to a renewed interest in artificial intelligence applications spanning multiple domains. In particular, the generation of natural, human-like captions for images has seen an extraordinary increase in interest. I will describe approaches that combine state-of-the-art computer vision techniques and language models to produce descriptions of visual content of surprisingly high quality. Related methods have also led to significant progress in generating images. I will emphasize both the limitations of current approaches and the challenges that lie ahead.
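The general pattern behind such captioning systems is an image encoder feeding a conditioned language model. The PyTorch sketch below is a minimal illustration of that encoder-decoder pattern with placeholder dimensions; it is an assumption-laden sketch, not the specific models described in the talk.

# Hedged sketch of a CNN-encoder / RNN-decoder captioning model (placeholder sizes).
import torch
import torch.nn as nn
import torchvision.models as models

class CaptionModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        cnn = models.resnet50(weights=None)  # pretrained weights would normally be loaded
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # pooled image features
        self.project = nn.Linear(cnn.fc.in_features, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).flatten(1)       # (B, 2048) image features
        img_token = self.project(feats).unsqueeze(1)  # image fed as the first input step
        word_tokens = self.embed(captions)
        hidden, _ = self.lstm(torch.cat([img_token, word_tokens], dim=1))
        return self.out(hidden)                       # next-word logits at each step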

Embodied Media: Expanding Human Capacity via Virtual Reality and Telexistence (Keynote)
Susumu Tachi
(University of Tokyo, Japan)
The information we acquire in real life gives us a holistic experience that fully incorporates a variety of sensations and bodily motions such as seeing, hearing, speaking, touching, smelling, tasting, and moving. However, the sensory modalities that can be transmitted in our information space are usually limited to visual and auditory ones. Haptic information is rarely used in the information space in our daily lives except in the case of warnings or alerts such as cellphone vibrations. Embodied media such as virtual reality and telexistence should provide holistic sensations, i.e., integrate visual, auditory, haptic, palatal, olfactory, and kinesthetic sensations, such that human users feel present in a computer-generated virtual information space or in a remote space, having an alternate presence in that environment. Haptics plays an important role in embodied media because it provides both proprioception and cutaneous sensations; it lets users feel as if they are touching distant people and objects, and also lets them “touch” artificial objects as they see them. In this keynote, embodied media that extend human experiences are overviewed, and our research on embodied media that are both visible and tangible, based on our proposed theory of haptic primary colors, is introduced. Such embodied media would enable telecommunication, tele-experience, and pseudo-experience, providing sensations such that the user would feel as though they were working in a natural environment. They would also enable humans to engage in creative activities such as design and creation as though they were in the real environment. We have succeeded in transmitting fine haptic sensations, such as material texture and temperature, from an avatar robot’s fingers to a human user’s fingers. The avatar robot is a telexistence anthropomorphic robot, called TELESAR V, with a body and limbs with 53 degrees of freedom. This robot can transmit not only visual and auditory sensations of presence to human users but also realistic haptic sensations. Our other inventions include RePro3D, a full-parallax autostereoscopic 3D (three-dimensional) display with haptic feedback using RPT (retroreflective projection technology); TECHTILE Toolkit, a prototyping tool for the design and improvement of haptic media; and HaptoMIRAGE, a 180°-field-of-view autostereoscopic 3D display using ARIA (active-shuttered real image autostereoscopy) that can be used by three users simultaneously.

Help Me If You Can: Towards Multiadaptive Interaction Platforms (ICMI Awardee Talk)
Wolfgang Wahlster
(DFKI, Germany)
Autonomous systems such as self-driving cars and collaborative robots must occasionally ask people around them for help in anomalous situations. A new generation of multiadaptive interaction platforms provides a comprehensive multimodal presentation of the current situation in real time, so that a smooth transfer of control back and forth between human agents and AI systems is guaranteed. We present the anatomy of our multiadaptive human-environment interaction platform, which includes explicit models of the attentional and cognitive state of the human agents as well as a dynamic model of the cyber-physical environment, and supports massive multimodality as well as multiscale and multiparty interaction. It is based on the principles of symmetric multimodality and bidirectional representations: all input modes are also available as output modes and vice versa, so that the system not only understands and represents the user’s multimodal input but also its own multimodal output. We illustrate our approach with examples from advanced automotive and manufacturing applications.
