AI voices easier to understand than human voices

Wednesday, 22 Apr, 2026
Voice clones are more intelligible in noisy environments. (Photo courtesy: AIP)

Washington: Synthetic voices are increasingly a part of our lives, from digital assistants like Siri and Alexa to automated telemarketers and answering machines. With the expansion of generative AI, a new type of synthetic voice has been developed: voice clones, which can produce a facsimile of a person’s voice from only a few seconds of recorded speech.

In the Journal of the Acoustical Society of America (JASA), published on behalf of the Acoustical Society of America by AIP Publishing, a pair of researchers from University College London and the University of Roehampton compared the intelligibility of human voices and voice clones. They found that voice clones are easier to understand than human voices in noisy environments.

Voice clones differ from traditional synthetic voices in the amount of recorded speech they require. Traditional synthetic voices like Siri require a voice actor to spend hours in a recording booth. In contrast, a voice clone can be made from as little as 10 seconds of speech, significantly expanding the number of potential voices as well as the number of potential applications.

Researchers Patti Adank and Han Wang specialize in studying human perception of unclear speech and were fascinated by the idea of machine-replicated speech.

“I thought initially that voice clones would be less intelligible because they were unfamiliar,” said Adank. “I found they were up to 20% more intelligible, which was quite shocking. A small part of our paper is talking about that experiment, and then a large part is me and my collaborator frantically trying to find out what it is that makes those voice clones more intelligible.”