ICMI Workshops 2017
19th ACM International Conference on Multimodal Interaction (ICMI 2017)

1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents (ISIAA 2017), November 13, 2017, Glasgow, UK

ISIAA 2017 – Proceedings




Title Page

Message from the Chairs
The workshop “Investigating Social Interactions With Artificial Agents”, organized within the “International Conference on Multimodal Interaction 2017”, attempts to bring together researchers from different fields who share an interest in human interactions with other agents. If interdisciplinarity is necessary to address the question of the “Turing Test”, namely “can an artificial conversational agent be perceived as human”, it is also a very promising new way to investigate social interactions in the first place. Biology is represented by social cognitive neuroscience, which aims to describe the physiology of human social behaviors. Linguistics, from the humanities, attempts to characterize a specifically human behavior and language. Social Signal Processing is a recent approach that uses advanced information technologies to automatically analyze the behaviors pertaining to natural human interactions. Finally, from Artificial Intelligence, the development of artificial agents, conversational and/or embodied, onscreen or physical, attempts to recreate non-human socially interactive agents for a multitude of applications.


How Do Artificial Agents Think?
Thierry Chaminade
(Aix-Marseille University, France)
Anthropomorphic artificial agents, whether computed characters or humanoid robots, can be used to investigate human cognition. They are intrinsically ambivalent. They appear and act as humans, hence we should tend to consider them as human, yet we know they are machines designed by humans, and should not consider them as humans. Reviewing a number of behavioral and neurophysiological studies provides insights into social mechanisms that are primarily influenced by the appearance of the agent, and in particular its resemblance to humans, and other mechanisms that are influenced by the knowledge we have about the artificial nature of the agent. A significant finding is that, as expected, humans don’t naturally adopt an intentional stance when interacting with artificial agents.

Body Language without a Body: Nonverbal Communication in Technology Mediated Settings
Alessandro Vinciarelli
(University of Glasgow, UK)
Cognitive and psychological processes underlying social interaction are built around face-to-face interactions, the only possible and available communication setting during the long evolutionary process that has resulted in Homo sapiens. As the fraction of interactions that take place in technology mediated settings keeps increasing, it is important to investigate how the cognitive and psychological processes mentioned above, which are ultimately grounded in neural structures, act in and react to the new interaction settings. In particular, it is important to investigate whether nonverbal communication, one of the main channels through which people convey socially and psychologically relevant information, still plays a role in settings where natural nonverbal cues (facial expressions, vocalisations, gestures, etc.) are no longer available. Addressing this issue has important implications not only for the understanding of cognition and psychology, but also for the design of interaction technology and the analysis of phenomena like cyberbullying and viral diffusion of content that play an important role in today's society.

Dialogue Management in Task-Oriented Dialogue Systems
Philippe Blache
(Aix-Marseille University, France)
This paper presents a new framework for implementing a dialogue manager, making it possible to infer new information in the course of the interaction as well as to generate responses from the virtual agent. The approach relies on a specific organization of knowledge bases, including the creation of a common ground and a belief base. Moreover, the same type of rules implements both inference and control of the dialogue. This approach is implemented within a dialogue system for training doctors to break bad news (ACORFORMed).

Greta: A Conversing Socio-emotional Agent
Catherine Pelachaud
(CNRS, France; UPMC, France)
To create socially aware virtual agents, we conduct research along two main directions: 1) develop richer models of multimodal behaviors for the agent; 2) make the agent a more socially competent interlocutor.

Challenges for Adaptive Dialogue Management in the KRISTINA Project
Louisa Pragst, Juliana Miehle, Wolfgang Minker, and Stefan Ultes
(University of Ulm, Germany; Cambridge University, UK)
Health care related information can be vital and should be easily accessible. However, immigrants often have difficulty obtaining the relevant information due to language barriers and cultural differences. In the KRISTINA project, we address those difficulties by creating a socially competent multimodal dialogue system that can assist immigrants in getting information about health care related questions. Dialogue management, as the core component responsible for the system behaviour, has a significant impact on the successful reception of such a system. Hence, this work presents the specific challenges that the KRISTINA project poses for adaptive dialogue management, namely the handling of a large dialogue domain and the cultural adaptability required by the envisioned dialogue system, and our approach to handling them.

En Route to a Better Integration and Evaluation of Social Capacities in Vocal Artificial Agents
Fabrice Lefèvre
(University of Avignon, France)
This talk presents ongoing work on vocal artificial agents in the Vocal Interaction Group at LIA, University of Avignon. The focus is on the research line aiming at endowing such interactive agents with human-like social abilities. After a short overview of the state of the art in spoken dialogue systems, a summary is given of recent efforts to improve systems' development through online learning using social signals. Then two examples of skills favoring human-like social interactions are presented: first, a new turn-taking management scheme based on incremental processing and reinforcement learning; second, automatic generation and usage optimisation of humor traits. These studies converge towards enabling the development of interactive systems that could foster studies in human sciences to better understand the specificities of human social communication.



A Corpus for Experimental Study of Affect Bursts in Human-Robot Interaction
Lucile Bechade, Kevin El Haddad, Juliette Bourquin, Stéphane Dupont, and Laurence Devillers
(University of Paris-Saclay, France; University of Mons, Belgium)
This paper presents a data collection carried out in the framework of the Joker Project. Interaction scenarios have been designed in order to study the effects of affect bursts in a human-robot interaction and to build a system capable of using multilevel affect bursts in a human-robot interaction. We use two main audio expression cues: verbal (synthesised sentences) and nonverbal (affect bursts). The nonverbal cues used are sounds expressing disgust, amusement, fear, misunderstanding and surprise. Three different intensity levels of each sound have been generated for each emotion.

Could a Virtual Agent Be Warm and Competent? Investigating User's Impressions of Agent's Non-verbal Behaviours
Beatrice Biancardi, Angelo Cafaro, and Catherine Pelachaud
(CNRS, France; UPMC, France)
In this abstract we introduce the design of an experiment aimed at investigating how users' impressions of an embodied conversational agent are influenced by the agent's non-verbal behaviour. We focus on impressions of warmth and competence, the two fundamental dimensions of social perception. The agent's gestures, arm rest poses and smile frequency are manipulated, as well as users' expectations about the agent's competence. We hypothesize that users' judgments will differ according to their expectations, following the Expectancy Violation Theory proposed by Burgoon and colleagues. We also expect to replicate the results found in our previous study concerning human-human interaction: for example, a high frequency of smiles will elicit higher warmth and lower competence impressions compared to a low frequency of smiles, while crossed arms will elicit low competence and low warmth impressions.

A Review of Evaluation Techniques for Social Dialogue Systems
Amanda Cercas Curry, Helen Hastie, and Verena Rieser
(Heriot-Watt University, UK)
In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. Consequently, evaluation of these systems is notoriously hard. In this paper, we review current evaluation methods, focusing on automatic metrics. We conclude that turn-based metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. Both lack grounding in human perceptions.

Introducing a ROS Based Planning and Execution Framework for Human-Robot Interaction
Christian Dondrup, Ioannis Papaioannou, Jekaterina Novikova, and Oliver Lemon
(Heriot-Watt University, UK)
Working in human-populated environments requires fast and robust action selection and execution, especially when deliberately trying to interact with humans. This work presents the combination of a high-level planner (ROSPlan) for action sequencing and automatically generated finite state machines (PNP) for execution. Using this combined system, we are able to exploit both the speed and robustness of the execution and the flexibility of the sequence generation, combining the positive aspects of both approaches.

Dialog Acts in Greeting and Leavetaking in Social Talk
Emer Gilmartin, Brendan Spillane, Maria O’Reilly, Ketong Su, Christian Saam, Benjamin R. Cowan, Nick Campbell, and Vincent Wade
(Trinity College Dublin, Ireland; University College Dublin, Ireland)
Conversation proceeds through dialogue moves or acts, and dialog act annotation can aid the design of artificial dialog. While many dialogs are task-based or instrumental, with clear goals, as in the case of a service encounter or business meeting, many are more interactional in nature, as in friendly chats or longer casual conversations. Early research on dialogue acts focussed on transactional or task-based dialogue but work is now expanding to social aspects of interaction. We review how dialog annotation schemes treat non-task elements of dialog -- greeting and leave-taking sequences in particular. We describe the collection and annotation, using the ISO Standard 24617-2 Semantic annotation framework, Part 2: Dialogue acts, of a corpus of 187 text dialogues and study the dialog acts used in greeting and leave-taking.

Social Talk: Making Conversation with People and Machine
Emer Gilmartin, Marine Collery, Ketong Su, Yuyun Huang, Christy Elias, Benjamin R. Cowan, and Nick Campbell
(Trinity College Dublin, Ireland; Grenoble INP, France; University College Dublin, Ireland)
Social or interactive talk differs from task-based or instrumental interactions in many ways. Quantitative knowledge of these differences will aid the design of convincing human-machine interfaces for applications requiring machines to take on roles including social companions, healthcare providers, or tutors. We briefly review accounts of social talk from the literature. We outline a three-part data collection of human-human, human-WOZ and human-machine dialogs incorporating light social talk and a guessing game. Finally, we describe our ongoing experiments on the corpus collected.

Analyses of the Effects of Agents' Performing Self-Adaptors
Tomoko Koda
(Osaka Institute of Technology, Japan)
This paper introduces the results of a series of experiments on the impression of agents that perform self-adaptors. Human-human interactions were video-taped and analyzed with respect to usage of different types of self-adaptors (relaxed/stressful), and gender-specific self-adaptors (masculine/feminine). We then implemented virtual agents that performed these self-adaptors. Evaluation of the interactions between humans and agents suggested: 1) Agents with relaxed self-adaptors were more likely to avoid deterioration in their perceived friendliness than agents without self-adaptors. 2) People with higher social skills perceive agents that exhibit self-adaptors as friendlier than people with lower social skills do. 3) Impressions of interactions with agents are formed by mutual interactions between the self-adaptors and the conversational content. 4) There are cultural differences in sensitivity to other cultures' self-adaptors. 5) Impressions of agents that perform gender-specific self-adaptors divide according to participants’ gender.

Who Has to Do It? The Use of Personal Pronouns in Human-Human and Human-Robot-Interaction
Brigitte Krenn, Stephanie Gross, and Lisa Nussbaumer
(Austrian Research Institute for Artificial Intelligence, Austria)
In human communication, pronouns are an important means of perspective taking, and in task-oriented communication in particular, personal pronouns are an indicator of who has to do what at a certain moment in a given task. The ability to handle task-related discourse is a factor for robots to interact with people in their homes in everyday life. Both the learning and the resolution of personal pronouns pose a challenge for robot architectures, as there has to be a permanent adaptation to the human interlocutor’s use of personal pronouns. The use of ich, du, wir (I, you, we) in particular may be confusing for the robot’s natural language processing system.

Using Crowd-Sourcing for the Design of Listening Agents: Challenges and Opportunities
Catharine Oertel, Patrik Jonell, Kevin El Haddad, Eva Szekely, and Joakim Gustafson
(KTH, Sweden; University of Mons, Belgium)
In this paper we describe how audio-visual corpora recorded using crowd-sourcing techniques can be used for the audio-visual synthesis of attitudinal non-verbal feedback expressions for virtual agents. We discuss the limitations of this approach as well as where we see the opportunities for this technology.

Intimately Intelligent Virtual Agents: Knowing the Human beyond Sensory Input
Deborah Richards
(Macquarie University, Australia)
Despite being in the era of Big Data, where our devices seem to anticipate and feed our every desire, intelligent virtual agents appear to lack intimate and important knowledge of their user. Current cognitive agent architectures usually include situation awareness that allows agents to sense their environment, including their human partner, and provide congruent empathic behaviours. Depending on the framework, agents may exhibit their own personality, culture, memories, goals and reasoning styles. However, tailored adaptive behaviours based on a multi-dimensional and deep understanding of the human, which are essential for enduring beneficial relationships in certain contexts, are lacking. In this paper, examples are provided of what an agent may need to know about the human in the application domains of education, health and cybersecurity, along with the challenges around agent adaptation and acquisition of relevant data and knowledge.

Integration and Evaluation of Social Competences such as Humor in an Artificial Interactive Agent
Matthieu Riou, Bassam Jabaian, Stéphane Huet, Thierry Chaminade, and Fabrice Lefèvre
(University of Avignon, France; Aix-Marseille University, France)
In this paper, we present a brief overview of our ongoing work on artificial interactive agents and their adaptation to users. Several possibilities to introduce humorous productions in a spoken dialog system are investigated in order to enhance naturalness during social interactions between the agent and the user. Finally, we describe our plan for using neuroscience to better evaluate the proposed systems, both objectively and subjectively.

Introducing ADELE: A Personalized Intelligent Companion
Brendan Spillane, Emer Gilmartin, Christian Saam, Ketong Su, Benjamin R. Cowan, Séamus Lawless, and Vincent Wade
(Trinity College Dublin, Ireland; University College Dublin, Ireland)
This paper introduces ADELE, a Personalized Intelligent Companion designed to engage with users through spoken dialog to help them explore topics of interest. The system will maintain a user model of information consumption habits and preferences in order to (1) personalize the user’s experience for ongoing interactions, and (2) build the user-machine relationship to model that of a friendly companion. The paper details the overall research goal, existing progress, the current focus, and the long term plan for the project.

Recognizing Emotions in Spoken Dialogue with Acoustic and Lexical Cues
Leimin Tian, Johanna D. Moore, and Catherine Lai
(University of Edinburgh, UK)
Emotions play a vital role in human communication. Therefore, it is desirable for virtual agent dialogue systems to recognize and react to users' emotions. However, current automatic emotion recognizers have limited performance compared to humans. Our work attempts to improve the performance of emotion recognition in spoken dialogue by identifying dialogue cues predictive of emotions, and by building multimodal recognition models with a knowledge-inspired hierarchy. We conduct experiments on both spontaneous and acted dialogue data to study the efficacy of the proposed approaches. Our results show that including prior knowledge on emotions in dialogue in either the feature representation or the model structure is beneficial for automatic emotion recognition.

Multi-Modal Social Interaction Recognition using View-Invariant Features
Rim Trabelsi, Jagannadan Varadarajan, Yong Pei, Le Zhang, Issam Jabri, Ammar Bouallegue, and Pierre Moulin
(Advanced Digital Sciences Center, Singapore; SAP, Singapore; Al Yamamah University, Saudi Arabia; National Engineering School of Tunis, Tunisia; University of Illinois at Urbana-Champaign, USA)
This paper addresses the issue of analyzing social interactions between humans in videos. We focus on recognizing dyadic human interactions through multi-modal data, specifically depth, color and skeleton sequences. First, we introduce a new person-centric proxemic descriptor, named PROF, extracted from skeleton data, which incorporates intrinsic and extrinsic distances between two interacting persons in a view-invariant scheme. Then, a novel key frame selection approach is introduced to identify salient instants of the interaction sequence based on the joint energy. From RGBD videos, more holistic CNN features are extracted by applying adapted pre-trained CNNs to optical flow frames. Features from the three modalities are combined and then classified using a linear SVM. Finally, extensive experiments carried out on two multi-modal and multi-view interaction datasets demonstrate the robustness of the introduced approach compared to state-of-the-art methods.

