Designing Embodied Conversational Agents
Research Proposal
Kathryn Lawrence
March 1, 2017
Conversation is a central metaphor in the design of human-computer interaction, from the earliest programming "languages" written in binary to today's voice-commanded interfaces that respond to the spoken word. Conversational interface agents are pieces of software designed to help people use computers by structuring the interaction as though it were taking place with another human rather than a machine. These agents can offer information, navigational cues, and entertainment to facilitate computer use, and they are often personified, given a name and some representation of embodiment. The design of these anthropomorphic agents' embodiments will be the subject of my research. Designing the embodiment of conversational agents raises many questions: how does representing a conversational agent with an avatar that has humanoid face or body features make it more effective? Which design choices create the most comfort, ease of use, and enjoyment in engaging with software, and in which contexts? How do the embodied politics of gender, race, and sexuality affect the way these agents are designed? By analyzing existing conversational agents and the way their technology and design have evolved over time, I hope to answer some of these questions and to create an effective framework in which to test my findings by designing my own conversational agents.
The background research for analyzing conversational agents necessarily begins in the era of personal computing, the early 1990s. Before this point, computers were generally used by specialists and programmers, whose system images (Norman, 2013) of how computers worked were closer to the actual mechanisms of hardware and software. When computers began to enter the homes of non-programmers, design metaphors evolved to facilitate ease of use for casual users, whose system images included little or no knowledge of hardware or software. In this period we see the development of the graphical user interface, which contains many visual metaphors that persist to this day, such as a file system (or folders) for data storage and retrieval, or simulated paper for word processors. These translations of familiar objects into the interface illustrate how the computer as a tool simultaneously becomes easier to use and harder to understand (Grudin, 1990).
Many researchers have noted the phenomenon of computers being anthropomorphized, possibly because of the opacity of their systems: because we do not understand how the computer works, it seems to have a mind of its own (Hybs, 1996). Human-computer interface designers use the conversational agent to bridge this gap. The conversational agent speaks directly to the user, anticipating their needs and their lack of knowledge of the computer's inner workings, to help them accomplish whatever task they are using the computer for. Both appearance and functionality play a role in whether the user trusts and successfully uses the conversational agent; notable historical examples include Microsoft Bob and the infamous Clippy (Koda, 1996).
After collecting examples of conversational agents from the last two or three decades and analyzing the discourse surrounding them among the general population and in academia, I intend to compile an analysis of the most and least successful design features of these agents. These features include how realistically the agents are rendered, their avatars' degree of humanity, perceived gender, expressiveness and emotion, animation, placement and context within a web or software interface, and accessibility. Combining this visual analysis and literature review of studies of human-computer interaction and conversational interfaces with web implementation work, I intend to create some conversational agents myself, adhering to the best practices identified in the examples.
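As a rough illustration of what that web implementation work could involve, the sketch below shows a minimal text-based agent with an embodied avatar. The class name, element structure, and canned reply logic here are placeholders for illustration, not part of any finished design from this proposal.

```typescript
// Minimal sketch of an embodied, text-based conversational agent widget.
// All names (AgentWidget, "Ada", the avatar URL) and the canned-reply logic
// are hypothetical placeholders, not a committed design.

interface AgentConfig {
  name: string;                          // the agent's displayed persona name
  avatarUrl: string;                     // image representing the agent's embodiment
  reply: (userText: string) => string;   // stand-in for real dialogue logic
}

class AgentWidget {
  private transcript: HTMLElement;

  constructor(root: HTMLElement, private config: AgentConfig) {
    // Embodiment: the avatar sits alongside the text transcript.
    const avatar = document.createElement("img");
    avatar.src = config.avatarUrl;
    avatar.alt = `${config.name}, a conversational agent`;

    this.transcript = document.createElement("div");

    const input = document.createElement("input");
    input.placeholder = `Ask ${config.name} something…`;
    input.addEventListener("keydown", (e) => {
      if (e.key === "Enter" && input.value.trim()) {
        this.addLine("You", input.value);
        this.addLine(this.config.name, this.config.reply(input.value));
        input.value = "";
      }
    });

    root.append(avatar, this.transcript, input);
  }

  private addLine(speaker: string, text: string): void {
    const line = document.createElement("p");
    line.textContent = `${speaker}: ${text}`;
    this.transcript.appendChild(line);
  }
}

// Example usage with a trivial canned reply standing in for dialogue logic.
new AgentWidget(document.body, {
  name: "Ada",
  avatarUrl: "/avatars/ada.png",
  reply: (text) => `I'm not sure about "${text}" yet, but I'm learning.`,
});
```

Even a skeleton like this makes the design questions concrete: where the avatar sits relative to the transcript, how human its image appears, and what persona its name and replies imply are all choices the analysis above is meant to inform.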
My research is limited in scope by two key features that separate a certain type of conversational agent from other conversational tools for human-computer interaction. The first is that the conversation between human and agent should be text-based. I find this an intriguing limit to impose because of its historical precedent: text-based conversations with computer systems go back as far as Alan Turing's famous test of machine intelligence. Incorporating text also poses an interesting problem in user interface design and centers conversation as a visual metaphor, as opposed to a literal conversation occurring in speech. There is a great deal to explore in how verbal and nonverbal conversational functions and cues are represented and executed using only the keyboard and screen. The second limitation is that a representation of the agent's embodiment must be an essential part of the conversational interface. For example, the ELIZA chatbot is one of the best-known historical examples of a conversational agent because it was the most advanced of its time and came close to passing the Turing test. A female embodiment may be implied by the name, but beyond that there is very little design to ELIZA besides her information architecture; chatbots like ELIZA, which have no embodied representation to speak of, will therefore be excluded.
Based on the findings of this research, I will design new conversational agents. These may include redesigns that improve on unsuccessful historical examples, or entirely new agents that fill contemporary needs. Because the agents can be web-based, they could be made publicly available on the internet for testing by a large audience, whose reactions and use could be observed and quantified. It would be straightforward to track mouse behavior and clicks and to record the conversations people have with these agents. Another way to analyze how users respond to the agents would be to ask them to complete a survey before or after encountering one. The most difficult part of conducting research with new conversational agents will be coming up with realistic scenarios or contexts in which to place them, but these use cases should become clear through the research process.
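As one hedged example of how that tracking could be instrumented on the client side, the sketch below buffers click, mouse-movement, and conversation events and periodically sends them to a logging endpoint. The /log URL, event shapes, and function names are hypothetical placeholders under the assumption of a web-based deployment, not a committed study design.

```typescript
// Sketch of client-side instrumentation for observing how visitors use an
// agent: click positions, throttled mouse-movement samples, and the chat
// transcript. The /log endpoint and event shapes are hypothetical.

type InteractionEvent =
  | { kind: "click"; x: number; y: number; target: string; t: number }
  | { kind: "mousemove"; x: number; y: number; t: number }
  | { kind: "message"; speaker: "user" | "agent"; text: string; t: number };

const buffer: InteractionEvent[] = [];

function record(event: InteractionEvent): void {
  buffer.push(event);
}

// Capture every click, noting where it landed and on what kind of element.
document.addEventListener("click", (e) =>
  record({ kind: "click", x: e.clientX, y: e.clientY,
           target: (e.target as Element).tagName, t: Date.now() }));

// Sample mouse movement at roughly four points per second.
let lastMove = 0;
document.addEventListener("mousemove", (e) => {
  if (Date.now() - lastMove > 250) {
    lastMove = Date.now();
    record({ kind: "mousemove", x: e.clientX, y: e.clientY, t: Date.now() });
  }
});

// The chat widget would call this for each turn of the conversation.
function recordMessage(speaker: "user" | "agent", text: string): void {
  record({ kind: "message", speaker, text, t: Date.now() });
}

// Every five seconds, ship any buffered events to the (hypothetical) endpoint.
setInterval(() => {
  if (buffer.length === 0) return;
  fetch("/log", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buffer.splice(0, buffer.length)),
  });
}, 5000);
```

Pairing logs like these with pre- or post-encounter surveys would allow the observed behavior to be compared against participants' self-reported impressions of the agent.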
I expect the visual analysis and literature review to show that the design of conversational agents and their chat interfaces has not progressed significantly in the last 20 years. I expect to find trends toward more advanced, human-like designs and more female conversational agents, and away from anthropomorphized objects. I also expect that conversational agents will appear in many contexts, but that for the most part they will be used for entertainment, as replacements for software tutorials, and as customer service assistants. The expected outcomes of the experiments with new conversational agents will have to be determined during the design process.
Resources cited in this proposal:
Grudin, Jonathan. (1990) The Computer Reaches Out: The Historical Continuity of Interface Design. Computer Science Department Aarhus University. Aarhus, Denmark

Hybs, Ivan. (1996) Beyond the Interface: A Phenomenological View of Computer Systems Design. Leonardo, Vol. 29, No. 3, pp. 215-223. Retrieved December 8, 2016 from http://www.jstor.org/stable/1576250

Koda, Tomoko. (1996) Agents with faces: a study on the effects of personification of software agents. Massachusetts Institute of Technology. Cambridge, Massachusetts

Norman, Don. (2013) The Design of Everyday Things: Revised & Expanded Edition. Basic Books. New York, NY
Additional resources collected:
Anabuki, Mahoro, Kakuta, Hiroyuki, Yamamoto, Hiroyuki, & Tamura, Hideyuki. (2000) Welbo: An Embodied Conversational Agent Living in Mixed Reality Space. Mixed Reality Systems Laboratory Inc. Yokohama, Japan

Baylor, Amy L. (2009) Promoting motivation with virtual agents and avatars: role of visual presence and appearance. Philosophical Transactions of the Royal Society, 364, 3559-3565. Retrieved December 8, 2016 from http://rstb.royalsocietypublishing.org

Baylor, Amy L. (2011) The design of motivational agents and avatars. Educational Technology Research and Development, Vol. 59, No. 2, Special Issue on Motivation and New Media, pp. 291-300. Retrieved December 8, 2016 from http://www.jstor.org/stable/41414939

Bertelsen, Olav W., & Pold, Søren. (2004) Criticism as an Approach to Interface Aesthetics. NordiCHI, October 23-27. Tampere, Finland

Brahnam, Sheryl, Karanikas, Marianthe, & Weaver, Margaret. (2011) (Un)dressing the interface: Exposing the foundational HCI metaphor “computer is woman”. Interacting with Computers, 23, pp. 401-412. British Informatics Society Limited. Elsevier B.V.

Cassell, Justine. (2000). Embodied Conversational Interface Agents. Communications of the ACM, Vol. 43, No. 4, pp. 70-78. ACM

Chafai, Nicholas Ech, Pelachaud, Catherine, & Pelé, Danielle. (2007) A Case Study of Gesture Expressivity Breaks. Language Resources and Evaluation, Vol. 41, No. 3 / 4, pp. 341-365. Retrieved December 8, 2016 from http://www.jstor.org/stable/30204710

Daniels, Jessie. (2009) Rethinking Cyberfeminism(s): Race, Gender, and Embodiment. Women's Studies Quarterly, Vol. 37, No. 1 / 2, Technologies, pp. 101-124. Retrieved December 8, 2016 from http://www.jstor.org/stable/27655141

Duffy, Brian R. (2003) Anthropomorphism and the social robot. Robotics and Autonomous Systems, 42, 177-190. Elsevier Science B.V.

Fineman, Benjamin. (2004) Computers as people: human interaction metaphors in human-computer interaction. Carnegie Mellon University. Pittsburgh, Pennsylvania

Fink, Julia. (2012) Anthropomorphism and Human Likeness in the Design of Robots and Human-Robot Interaction. CRAFT, Ecole Polytechnique Fédérale de Lausanne. Lausanne, Switzerland. Springer-Verlag Berlin Heidelberg

Foster, Mary Ellen, & Oberlander, Jon. (2007) Corpus-Based Generation of Head and Eyebrow Motion for an Embodied Conversational Agent. Language Resources and Evaluation, Vol. 41, No. 3 / 4, pp. 305-323. Retrieved December 12, 2016 from http://www.jstor.org/stable/30204708

Hanson, David. (2006) Exploring the Aesthetic Range for Humanoid Robots. The University of Texas at Dallas. Richardson, TX

Hutchins, Edwin. (1987) Metaphors for Interface Design. Institute for Cognitive Science, University of California, San Diego. La Jolla, CA

Höök, Kristina. (2009, December 12) Affective Loop Experiences: Designing for Interactional Embodiment. Philosophical Transactions: Biological Sciences, Vol. 364, No. 1535, Computation of Emotions in Man and Machines, pp. 3585-3595. Retrieved December 8, 2016 from http://www.jstor.org/stable/40538152

Ju, Wendy, & Leifer, Larry. (2008) The Design of Implicit Interactions: Making Interactive Systems Less Obnoxious. Design Issues, Vol. 24, No. 3, Interaction Design Research in Human-Computer Interaction, pp. 72-84. Retrieved January 27, 2017 from http://www.jstor.org/stable/25224184

Kim, Yanghee, Baylor, Amy L., & PALS Group. (2006) Pedagogical Agents as Learning Companions: The Role of Agent Competency and Type of Interaction. Educational Technology Research and Development, Vol. 54, No. 3, pp. 223-243. Retrieved January 27, 2017 from http://www.jstor.org/stable/30221218

Looser, Christine E., & Wheatley, Thalia. (2010) The Tipping Point of Animacy: How, When, and Where We Perceive Life in a Face. Psychological Science, Vol. 21, No. 12, pp. 1854-1862. Retrieved December 8, 2016 from http://www.jstor.org/stable/40984586

Phan, Thao. (2017) The Materiality of the Digital and the Gendered Voice of Siri. Transformations, issue 29, pp. 23-33. Retrieved February 26, 2017 from www.transformationsjournal.org

Ramey, Christopher H. (2006) An Inventory of Reported Characteristics for Home Computers, Robots, and Human Beings: Applications for Android Science and the Uncanny Valley. Department of Psychology, Florida Southern College. Lakeland, FL

Robinson, Peter, & el Kaliouby, Rana. (2009, December 12) Introduction: Computation of Emotions in Man and Machines. Philosophical Transactions: Biological Sciences, Vol. 364, No. 1535, Computation of Emotions in Man and Machines, pp. 3441-3447. Retrieved December 8, 2016 from http://www.jstor.org/stable/40538137
Appendix: