Yanny V Laurel and the death of the audio CAPTCHA

If you’ve recently been travelling through a part of the world without any form of internet, television or radio, then you may be one of the few to have missed the recent Yanny versus Laurel debate. The premise is an audio sample in which some people distinctly hear the word ‘Yanny’ while others just as clearly hear ‘Laurel’.

In my case, I hear ‘Yanny’ most of the time, but recently I listened to it through a poor-quality mobile phone speaker and heard ‘Laurel’, so it seems I’m in a good position to get into an argument with myself and represent both sides.

Yet while the online community has been in meltdown over the past week or so arguing about which word our ears should hear, I found that the cleverly designed sound sample brilliantly highlights an accessibility issue that is very close to my heart: distorted electronic audio can be interpreted differently depending on a variety of factors. A well-known example of this in action is the audio CAPTCHA.

CAPTCHA is an acronym that stands for Completely Automated Public Turing test to tell Computers and Humans Apart. The purpose of a CAPTCHA is ultimately to prevent personal data from being harvested by clever computer code known as bots or scripts. While it is important to identify whether a real person is entering information, the issue for people with disabilities is that CAPTCHAs not only tell humans and computers apart but also tend to put people with disabilities on the ‘computer’ side of the fence, blocking access to processes such as buying tickets online or signing up to an online service.

The fight against the use of CAPTCHAs has been a long one for people with disabilities. The traditional CAPTCHA, which features a bitmapped image of distorted text, is impossible for people who are blind or have low vision to complete, so people started looking at the possibility of audio CAPTCHAs.

 

As noted in the video above, the idea of an audio CAPTCHA is that humans can pick out the ‘real’ audio information from the garbled background noise, while a computer trying to decipher it would get tripped up by the extra sounds.

However, what the Yanny and Laurel debate highlights beautifully is that how people interpret a combination of sounds will vary significantly from person to person, especially for people with a hearing impairment. It may be the case that you can hear the numbers read out in the video clip clearly, but for many, identifying the information required and then typing it into a form to pass the CAPTCHA would be impossible. Furthermore, many audio CAPTCHAs mix words and numbers together, making it difficult to know if a number should be entered as a numerical value like ‘9’ or typed out in full such as ‘nine’. For people with a hearing impairment, an audio CAPTCHA is the equivalent of saying ‘because you can’t hear Laurel, you’re not allowed to buy a ticket to the football’ or ‘because you can’t hear Yanny, you can’t join our new social media platform.’ Thinking about audio CAPTCHAs in this way really helps, in my view, to highlight the challenges such technologies pose.
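The ‘9’ versus ‘nine’ confusion, at least, is something a form could absorb rather than pass on to the user. The following is only a minimal sketch in Python of one way that might be done, assuming the expected answer is made up of single digits and plain words; the names `DIGIT_WORDS`, `normalise_answer` and `answers_match` are hypothetical and are not taken from any real CAPTCHA service.

```python
# Hypothetical lookup table: spelled-out digits mapped to numerals.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalise_answer(text: str) -> str:
    """Lower-case the answer, turn digit words into numerals, drop separators."""
    tokens = text.lower().replace(",", " ").replace("-", " ").split()
    return "".join(DIGIT_WORDS.get(token, token) for token in tokens)


def answers_match(expected: str, typed: str) -> bool:
    """Compare two answers after normalising both in the same way."""
    return normalise_answer(expected) == normalise_answer(typed)
```

With this in place, `answers_match("9 4 2", "nine four two")` treats the two forms as the same answer. It does nothing, of course, for someone who cannot make out the audio in the first place, which is the bigger problem.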

In my work with the W3C Research Questions Task Force, we’ve been looking closely at the issues with CAPTCHAs as we work on an update to the W3C CAPTCHA advisory note. The upshot is that CAPTCHAs such as those that depend on audio are not as secure as they used to be, now that digital assistants can understand more spoken words than ever before. With ever-improving ways to tell humans and computers apart, such as federated identities, multiple devices and biometrics such as fingerprint and facial recognition being built into our everyday devices, it is likely that more traditional CAPTCHAs will soon disappear and people with disabilities will once again be counted as human when completing an online task.

So next time you’re having a friendly debate over Yanny and Laurel, consider that for many people, how they hear things could actually be preventing access to critical online content. It’s exciting, though, to consider that in the not-too-distant future traditional CAPTCHAs will be gone, so that it is our choices, not our ears, that determine our online participation.

Published in commentary