I have long disliked machines that talk. I don’t even like machines that beep at me, and I switch all system sounds off at the first possible opportunity. I think my attitude to voice interaction with computers was established by seeing “2001: A Space Odyssey” as a child. HAL set my default reaction.
I started using computers in the early 1980s: I first used a device (which I think must have been an Apple) as an undergraduate in 1982; my postgrad work required me to use a mainframe for some stats, and I wrote my thesis on a BBC Micro. A keyboard was the only way to interact with these machines, and that could be very frustrating – with the mainframe, “interaction” could take hours, as the machine was very unresponsive. Feedback is powerful. The machines tried to talk to me – an irritated beep when they didn’t like something (which back in those days was often) – but one only talked back in frustration. Or anger. (Come on – who hasn’t sworn at their computer?!)
I first used a mouse in 1986, again on an Apple. But when I first started using PCs, the keyboard and command line were still the main way of interacting. I still use keystrokes rather than mouse actions for a lot of programmes.
My experience of voice-activated systems has been limited to supermarket self-service checkouts and telephony systems. I hate self-service checkouts with a vengeance, largely down to the universally patronising tone of voice used. And my experience of telephony systems is similar to these poor miscreants, shown by Ben.
It’s not just Glaswegian accents – Birmingham City Council installed a voice-activated telephony system which couldn’t recognise Brummie accents. They must have done extensive testing of that one!
And then of course there was HAL, lurking at the edge of my technological nightmares.
Perhaps it is a matter of control.
The thing is, voice interaction is becoming much more common. My phone and my tablet – both Android devices – can use voice-activated systems (most commonly Google Now, which is the standard app). You’ve probably realised I’ve not tried them. It appears I’m not the only one.
But voice interaction systems are likely to spread further still. As well as Siri and Google Now, Google Glass is voice-activated. Satnav appears almost ubiquitous (though I of course abstain…).
I’m beginning to think this might be my problem rather than HCI’s, and Cowan explained why this might be. I’d say I have an issue with the aural “uncanny valley” (my words, not Cowan’s) – the closer to a human voice they sound, the stranger, more passive and downright unemotional – unhuman, even – they seem.
Cowan discussed some of the psychology that goes into this. There are rules in conversation – like “partners in a dance”, even if we’re not aware of the steps. We learn the steps as we learn to talk. Computers don’t. They have to be programmed, and at the moment those programmes are largely database-driven and deterministic. They work off keywords, rather than natural language. Instead, we fall into line with the machines: Cowan explained how when we talk to people, we model their usage and align our vocabularies. (Starbucks works hard to get us to model their language.) Interestingly, people communicating with computers align their language, too. This has been going on as long as there have been computers: when writing Fortran or Basic programmes back in the 1980s, the vocabulary I could use was very restricted and each word had a very specific meaning. I had to use those words and the languages’ syntax because otherwise the programmes wouldn’t work, or would give different results from those expected.
When we speak, though, whether to another person or to a computer through voice interaction, the modelling is internal – subconscious – rather than deliberate.
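To make the keyword point concrete, here is a minimal sketch of how a keyword-driven system might map an utterance to an action. It is my own illustration rather than anything Cowan showed, and every name in it (`KEYWORD_ACTIONS`, `match_command`, the actions themselves) is hypothetical; real telephony systems are certainly more elaborate.

```python
from typing import Optional

# Hypothetical keyword table: these keywords and action names are invented
# for illustration and do not come from any real system.
KEYWORD_ACTIONS = {
    "balance": "read_balance",
    "payment": "take_payment",
    "operator": "transfer_to_human",
}


def match_command(utterance: str) -> Optional[str]:
    """Return the action for the first known keyword in the utterance, or None."""
    for word in utterance.lower().split():
        cleaned = word.strip(".,!?")  # drop punctuation stuck to the word
        if cleaned in KEYWORD_ACTIONS:
            return KEYWORD_ACTIONS[cleaned]
    return None  # no keyword found: the system cannot act


if __name__ == "__main__":
    print(match_command("What is my balance?"))
    print(match_command("How much money have I got?"))
```

The second phrasing means the same thing to a person but contains none of the expected keywords, so a system like this would fail on it. That may be one reason we end up aligning our vocabulary with the machine’s.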
I was surprised to learn that Siri has a single, masculine voice in the UK (apparently with an American accent). In the USA, Siri seems to have a feminine voice (which can be changed). Presumably its implementation in other languages takes on different voices or accents. Perhaps in the future we will be able to programme computerised voices, as some people do with satnav, which I am sure would go some distance towards overcoming the uncanny valley – though it may raise other issues (who owns the sound of their voice? What if one decides to use an ex’s voice? …And so on).
Still, it would appear that Siri has a sense of humour…
Which I think is where I came in…
Edit: I have been reliably informed that in the UK, Siri has a British accent. Chris says: “UK Siri [is] decidedly British; he sounds like a sarcastic airline pilot.”