When a person experiencing hearing difficulties visits a hearing clinic, the assessment may involve listening to simple tones, words, or sentences and repeating them back exactly. These kinds of tests are useful, but they do not fully capture how hearing works in everyday life.
Much of the speech we listen to is continuous, such as conversations or stories, and we usually remember the meaning rather than the exact wording. Tests that use more naturalistic speech materials can provide a better picture of real-world comprehension difficulties, but generating these materials and scoring a person’s responses are challenging and time-consuming without automation.
The Data Sciences Institute (DSI) Catalyst Grant awarded to Björn Herrmann, a psychologist and cognitive neuroscientist at Baycrest’s Rotman Research Institute, and Karen Gordon, an audiologist in the Hospital for Sick Children’s Department of Otolaryngology, aims to improve hearing assessment processes through the use of AI. The DSI seed grant awarded to this interdisciplinary team is enabling new applications of large language models (LLMs) to generate naturalistic speech materials for comprehension testing, and to create tools that automate scoring of how well the participant understood what was said.
LLMs can compare the semantic meaning of the text that was played to a participant or patient and the text of what they recalled, assigning a higher recall score when the meaning of the recalled response more closely matches the original. “We can use large language models to identify meaningful units of speech and automatically score how much a listener understood them,” Dr. Herrmann explains. “LLMs can actually be more consistent than if we asked humans to do that kind of task.”
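To make the scoring idea concrete, here is a minimal sketch of meaning-based recall scoring. It is an illustration only and not the team's method: the `embed` function is a hypothetical stand-in that uses simple word counts, where a real system would presumably use an LLM-based representation of meaning, and the names `cosine` and `recall_score` are likewise assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an LLM embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_score(original_units: list[str], recalled: str) -> float:
    """Average, over the original units, of the best match among recalled sentences."""
    recalled_units = [s for s in re.split(r"[.!?]+", recalled) if s.strip()]
    if not original_units or not recalled_units:
        return 0.0
    recalled_vecs = [embed(s) for s in recalled_units]
    best = [max(cosine(embed(u), r) for r in recalled_vecs) for u in original_units]
    return sum(best) / len(best)


# Assumed example data: two "meaningful units" from a story and two recalls.
story = ["The dog chased the ball into the lake", "A girl laughed and clapped"]
close_recall = "The dog chased a ball into the lake. A girl clapped."
unrelated_recall = "I went shopping yesterday."

print(recall_score(story, close_recall) > recall_score(story, unrelated_recall))
```

Word overlap is a crude proxy: unlike an LLM, it gives no credit to a paraphrase that uses different words, which is exactly the gap the approach described above is meant to close.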
Importantly, LLMs also enable testing participants in their first language, removing the added language-processing effort required of non-native speakers. Speech materials can be generated in the participant’s first language, and their responses can be scored in that same language using a standardized automated approach. This helps ensure that results are comparable across languages, including with English-language assessments.
“With modern AI-based voices, we can use the same voice across many languages. So, it’s as if one voice actor could record speech materials in more than 100 languages, and a multilingual scorer evaluates comprehension consistently across them,” says Dr. Herrmann.
With the support of the DSI seed funding, the team is collecting data from older adults who use hearing aids. Participants listen to speech in background noise, as they would at a hearing clinic, both with and without their hearing aids. The aim is to see whether the automated scoring approaches developed in the project can detect whether hearing aids improve speech comprehension. This will be the first step towards a clinically viable approach.
The team is also working with hearing aid companies. Manufacturers are interested in measuring how well their hearing aids work with more natural stimuli to further improve their products. The team’s findings suggest that hearing aids provide more of a benefit for verbatim recognition than for naturalistic story listening, which is consistent with many people’s experience in everyday life. Hearing aid manufacturers want to use these automated tools to directly compare verbatim recall with story-based recall and to develop better metrics for evaluating how well hearing aids work.
“This is an exciting application of AI and data science to address an important health challenge. It’s amazing to see how this innovative work is attracting industry partnerships and moving towards having a clinical benefit,” says Gary Bader, DSI Associate Director, Research and Software.
Applications are now open for the 2026 Catalyst Grants.
Photo by Mark Paton on Unsplash