Turn-taking and the identification of humanness in a highly restricted interactive task

Madeleine McGrath, Gregory Mills and Gareth Roberts

While language, understood narrowly, may be unique to humans, communicative interaction is extraordinarily widespread in nature and evolutionarily ancient [1]. An important prerequisite for language is that humans identify behaviors as interactive and as originating from conspecifics. But what are the hallmarks of such behavior? There is a wealth of literature investigating the structure and nature of linguistic and non-linguistic joint action using both naturalistic and experimental data, e.g., [2,3,4,5]. The vast majority of this work, however, has focused on rich contexts, in which the behaviors under investigation are embedded in a complex set of other behaviors. Here we present an exploratory experimental study in which we stripped interaction down to a basic level and asked participants to guess whether they were interacting with a human.

Method: 231 participants sat at isolated computers and performed a simple 10-minute task. Each had a partner who was either human (human conditions) or a computer (computer conditions); participants were told they would have to guess which. Most of the computer screen was black or white, with a rectangle in the bottom corner that was independently black or white, representing the partner’s screen. By clicking, each participant could flip their partner’s screen color. After 10 min, each participant was asked where they thought the signals they had received came from. All could select “human”; the other option was either “from a computer” or “at random”. The 117 participants in the computer conditions were assigned to one of two further conditions. In the random computer condition, the server flipped the participant’s screen at random intervals of 1–10 s, regardless of participant behavior (and participants were asked whether they thought the signals came from a human partner or at random). In the turn-taking computer condition, the server waited for the participant to send a signal and then sent its own signal 0–3 s later (and participants were asked whether they thought the signals came from a human partner or a computer).

Results: Participants showed above-chance accuracy in all conditions except the human condition in which “computer” was an option (Fig. 1), likely due to high expectations about computer partners. There were also differences in behavior between human and computer conditions. Participants with human partners sent more signals than those in computer conditions (mean 481 vs. 409), but the number of signals sent was unrelated to identification accuracy. More interesting results concerned the emergence of interactive behaviors. Human pairs tended to take turns sending individual signals, with a mean turn-length (number of signals sent before the partner sends a signal) of 1.05 in human conditions and 4.3 in computer conditions, t(114) = 7.17, p < 0.001. The mean time between signals was also shorter in human conditions: 0.62 s vs. 1.82 s, t(101740) = 12, p < 0.001. Participants’ response times tended to converge over time with both human and computer partners (Fig. 2); there was no evidence that convergence differed between human and computer conditions. Overall, the results suggest that humans readily align with both human and non-human signaling, and are good at identifying non-human behavior and at distinguishing human behavior from random behavior.
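
As an illustration only, the two computer-partner schedules and the turn-length measure described above might be sketched as follows. This is not the software used in the study. The function names, the uniform sampling of intervals, the one-reply-per-signal rule, and the (time, sender) event format are all assumptions made for the example; the abstract specifies only the 1–10 s and 0–3 s ranges and the definition of turn-length.

```python
import random
from itertools import groupby
from statistics import mean


def random_partner_times(duration_s, rng):
    """Random computer condition: signal times at random 1-10 s intervals,
    independent of anything the participant does.
    Assumption: intervals are drawn uniformly."""
    times, t = [], 0.0
    while True:
        t += rng.uniform(1.0, 10.0)
        if t > duration_s:
            return times
        times.append(t)


def turn_taking_partner_times(participant_times, rng):
    """Turn-taking computer condition: a reply 0-3 s after a participant signal.
    Assumptions: one reply per participant signal, uniform delay."""
    return [t + rng.uniform(0.0, 3.0) for t in participant_times]


def mean_turn_length(events, sender):
    """Mean number of consecutive signals `sender` sends before the partner
    sends one. `events` is a list of (time_s, sender_id) tuples."""
    ordered = sorted(events)  # sort by time
    runs = [len(list(group))
            for who, group in groupby(ordered, key=lambda e: e[1])
            if who == sender]
    return mean(runs) if runs else 0.0


# Toy log: A sends 1 signal, B replies, A sends 3 in a row, B replies.
log = [(0.0, "A"), (0.5, "B"), (1.0, "A"), (1.2, "A"), (1.3, "A"), (2.0, "B")]
print(mean_turn_length(log, "A"))  # turns of length 1 and 3 -> 2
print(mean_turn_length(log, "B"))  # turns of length 1 and 1 -> 1
```

Under this definition, a pair that alternates strictly has a mean turn-length of 1, so the reported value of 1.05 for human conditions corresponds to near-perfect alternation.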

Figure 1: Percentage of correct judgements per condition (chance level: 50%).

Figure 2: Alignment of response time over interactions for four randomly selected example pairs: (a) Human (random), (b) Human (computer), (c) Random computer (computer is Participant 2), (d) Turn-taking computer (computer is Participant 2). Y-axis shows cumulative average response time.


[1] W. Tecumseh Fitch. The Evolution of Language. Cambridge University Press, Cambridge, 2010.

[2] Günther Knoblich, Stephen Butterfill, and Natalie Sebanz. Psychological research on joint action: Theory and data. Psychology of Learning and Motivation, 54:59–101, 2011.

[3] Gregory J. Mills. Dialogue in joint activity: Complementarity, convergence and conventionalization. New Ideas in Psychology, 32:158–173, 2014.

[4] Jack Sidnell and Tanya Stivers, editors. The Handbook of Conversation Analysis. John Wiley & Sons, 2012.

[5] Holly P. Branigan and Martin J. Pickering. An experimental approach to linguistic representation. Behavioral and Brain Sciences, 40, 2017.