UW linguists at Acoustical Society of America 

Submitted by Joyce Parvi on
Photo by Ted Kye

On Dec. 1, 2021, at the 181st meeting of the Acoustical Society of America in Seattle, Richard Wright was named a Fellow, “for contributions to understanding how phonetic variability impacts communication”.  It’s easy to see why Wright received this honor.  At the Seattle ASA, for example, Wright was a co-author on two posters and one presentation, and co-organized a special session. 

UW Linguistics was well represented at this ASA:

  • Posters
    • Ted Kye, “Effects of uvular consonants on vowel quality in Lushootseed”. Kye studied the four vowels of Lushootseed, /i u ə a/, in uvular and non-uvular environments in archival recordings of Annie Jack Lobehan Daniels, a native speaker of the Southern dialect of Lushootseed born near the Green River in the early 1870s. In contrast to what studies of uvular effects on vowel quality in other languages have reported, he found that F1 increased more than F2 decreased, and that all four vowels, even the low vowel, were affected.
    • Sara Ng, Gregory Ellis (Northwestern), Pamela Souza (Northwestern), Frederick Gallun (Oregon Health & Science Univ.), and Richard Wright, “Modeling the time course of cue weighting angle calculations”. The authors developed a statistical measure to quantify the stability of a listener’s attention to spectral or temporal speech cues, and used this measure alongside a machine learning classifier to support shortening the existing cue weighting testing procedure.
    • Gavriel D Kohlberg (UW Medicine), Eric M Prater (UW Medicine), Yi Shen (UW SHS), Adrian KC Lee (UW SHS), Jay T Rubinstein (UW Medicine), Les E Atlas (UW CSE), and Richard Wright, “Do Humans Integrate Auditory and Text Information in a Statistically Optimal Fashion?” The authors evaluated how listeners combine visual, speech, text, and auditory speech information in order to improve speech perception in noise.
  • Presentations
    • Courtney Mansfield (2021 PhD), Sara Ng, Gina-Anne Levow, Mari Ostendorf (Electrical Engineering, adjunct in Linguistics), and Richard Wright, “What does parity mean? A detailed comparison of ASR [automatic speech recognition] and human transcription errors.” Improvement in ASR has led to the conclusion that ASR is approaching parity with human performance. However, when human and machine errors are analyzed in detail, stark differences emerge. For example, while humans tend to make errors on backchannels and discourse markers, ASR systems tend to make errors on content words.
  • Special session, “Speech and Machines”, organized by Richard Wright and Gina-Anne Levow, which focused on how recent advances in computational processing of speech can address the needs and challenges of diverse applications and speakers. Presentations included:
    • Michael Tjalve, UW (affiliate in Linguistics) and Microsoft Philanthropies, on “The non-native canonical accent - and how to use it”
    • Courtney Mansfield (2021 PhD), LivePerson, on “What does parity mean? A detailed comparison of ASR and human transcription errors”
    • Rachael Tatman (2017 PhD), Rasa Technologies, on “Why ASR + NLP isn’t enough for commercial language technology”
    • Matthew Kelley (UW Linguistics post-doc), on “APhL Aligner: A Neural Network Forced-Alignment System”