Older people tend to confuse the speech of living people with speech generated by neural networks, according to researchers at the Baycrest Centre for Geriatric Care.
In recent years, neural-network-based algorithms have succeeded in creating realistic fakes, whether generated images or synthetic audio. Unlike traditional "computer" voices, neural-network speech can mimic intonation and emotion rather than sounding like the "robots" of science fiction films. Despite many useful applications, such technologies can also be exploited by attackers for fraud.
Bjorn Herrmann and his colleagues set out to measure how convincingly neural-network speech can deceive listeners. In the experiment, younger adults (around 30 years old) and older adults (around 60 years old) listened to sentences spoken by 10 human speakers (five men, five women) and 10 AI voices (five male, five female). In one task, participants rated how natural they found the human and AI voices; in another, they judged whether each sentence was spoken by a human or generated by AI.
The results showed that, compared with younger listeners, older listeners rated AI speech as more natural and were significantly worse at distinguishing neural-network output from human speech. The researchers have no definitive explanation for this, but they suggest that with age, people rely more on the words themselves and less on intonation and tempo.