How do humans understand speech?

UNIVERSITY PARK, Pa. — New funding from the National Science Foundation’s Build and Broaden Program will enable a team of researchers from Penn State and North Carolina Agricultural and Technical State University (NC A&T) to explore how speech recognition works while training a new generation of speech scientists at America’s largest historically Black university.

Research has shown that speech-recognition technology performs significantly worse at understanding speech by Black Americans than by white Americans. These systems can be biased, and that bias may be exacerbated by the fact that few Americans of color work in speech-science-related fields.

Understanding how humans understand speech

Navin Viswanathan, associate professor of communication sciences and disorders, will lead the research team at Penn State.

“In this research, we are pursuing a fundamental question,” Viswanathan explained. “How human listeners perceive speech so successfully despite considerable variation across different speakers, speaking rates, listening situations, etc., is not fully understood. Understanding this will provide insight into how human speech works on a fundamental level. On an immediate, practical level, it will enable researchers to improve speech-recognition technology.”

Joseph Stephens, professor of psychology, will lead the research team at NC A&T.

“There are conflicting theories of how speech perception works at a very basic level,” Stephens said. “One of the great strengths of this project is that it brings together investigators from different theoretical perspectives to resolve this conflict with careful experiments.”

According to the research team, speech-recognition technology is used in many aspects of people’s lives, but it is not as capable as a human listener at understanding speech, especially when the speech varies from norms established in the software. Once the mechanisms that humans use to understand speech are better understood, the same mechanisms can be used to improve speech-recognition technology.

Building and broadening the field of speech science

Increasing diversity in speech science is the other focus of the project.

“When a field lacks diversity among researchers, it can limit the perspectives and approaches that are used, which can lead to technologies and solutions being limited, as well,” Stephens said. “We will help speech science to become more inclusive by increasing the capacity and involvement of students from groups that are underrepresented in the field.”

The National Science Foundation’s Build and Broaden Program focuses on supporting research, offering training opportunities, and creating greater research infrastructure at minority-serving institutions. New awards for the Build and Broaden Program, which total more than $12 million, support more than 20 minority-serving institutions in 12 states and Washington, D.C. Nearly half of this funding came from the American Rescue Plan Act of 2021. These funds aim to bolster institutions and researchers who were impacted particularly hard by the COVID-19 pandemic.

Build and Broaden is funding this project in part because it will strengthen research capacity in speech science at NC A&T. The project will provide research training for NC A&T students in speech science, foster collaborations between researchers at NC A&T and Penn State, and enhance opportunities for faculty development at NC A&T.

By providing training in speech science at NC A&T, the research team will mentor a more diverse group of future researchers. Increasing the diversity in this field will help to decrease bias in speech-recognition technology and throughout the field.

Viswanathan expressed excitement about developing a meaningful and far-reaching collaboration with NC A&T.

“This project directly creates opportunities for students and faculty from both institutions to work together on questions of common interest,” Viswanathan said. “More broadly, we hope that this will be the first step towards building stronger connections across the two research groups and promoting critical conversations about fundamental issues that underlie the underrepresentation of Black scholars in the field of speech science.”

Ji Min Lee, associate professor of communication sciences and disorders; Anne Olmstead, assistant professor of communication sciences and disorders; Matthew Carlson, associate professor of Spanish and linguistics; Paola “Guili” Dussias, professor of Spanish, linguistics and psychology; Elisabeth Karuza, assistant professor of psychology; and Janet van Hell, professor of psychology and linguistics, will contribute to this project at Penn State. Cassandra Germain, assistant professor of psychology; Deana McQuitty, associate professor of speech communication; and Joy Kennedy, associate professor of speech communication, will contribute to the project at NC A&T.
