Vowel decoding from single-trial speech-evoked electrophysiological responses: A feature-based machine learning approach

Spectral projection of single-trial frequency-following responses (FFRs) onto the spectral feature space.

Abstract

Scalp-recorded electrophysiological responses to complex, periodic auditory signals reflect phase-locked activity from neural ensembles within the auditory system. These responses, referred to as frequency-following responses (FFRs), have been widely used to index typical and atypical representation of speech signals in the auditory system. One of the major limitations of the FFR is the low signal-to-noise ratio at the level of single trials. For this reason, analysis relies on averaging across thousands of trials. The ability to examine the quality of single-trial FFRs would allow investigation of trial-by-trial dynamics of the FFR, which the averaging approach has made impossible. In a novel, data-driven approach, we used machine learning principles to decode information related to the speech signal from single-trial FFRs. FFRs were collected from participants while they listened to two vowels produced by two speakers. The scalp-recorded electrophysiological responses were projected onto a low-dimensional spectral feature space independently derived from the same two vowels produced by 40 speakers who were not presented to the participants. A supervised machine learning classifier was trained to discriminate vowel tokens on a subset of FFRs from each participant and tested on the remaining subset. We demonstrate reliable decoding of speech signals at the level of single trials by decomposing the raw FFR based on independently derived, information-bearing spectral features of the speech signal. Taken together, the ability to extract interpretable features from single trials in a data-driven manner offers uncharted possibilities for the noninvasive assessment of human auditory function.
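The pipeline the abstract describes can be sketched roughly as follows: derive a low-dimensional spectral feature space from an independent set of productions of the same vowels, project each noisy single-trial response into that space, and train a classifier on a subset of trials. Everything below is an illustrative stand-in, not the authors' actual data or model: the synthetic "vowel" signals, their fundamental frequencies, the noise level, the choice of five SVD components, and the nearest-centroid classifier are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 1000                         # sampling rate (Hz), illustrative
t = np.arange(0, 0.25, 1 / fs)    # 250-ms "epoch"
f0 = {"a": 100.0, "i": 140.0}     # hypothetical vowel fundamentals

def trial(vowel):
    # A periodic "vowel" response buried in heavy noise,
    # mimicking the low SNR of a single-trial FFR.
    clean = (np.sin(2 * np.pi * f0[vowel] * t)
             + 0.5 * np.sin(4 * np.pi * f0[vowel] * t))
    return clean + 3.0 * rng.standard_normal(t.size)

vowels = ["a", "i"]

# "Recorded" single-trial responses: the data to be decoded.
X = np.array([trial(v) for v in vowels for _ in range(200)])
y = np.array([v for v in vowels for _ in range(200)])

# Independent "reference" productions, used only to derive the
# spectral feature space (never shown to the classifier's test set).
ref_spec = np.abs(np.fft.rfft(
    np.array([trial(v) for v in vowels for _ in range(50)]), axis=1))
mu = ref_spec.mean(axis=0)
_, _, Vt = np.linalg.svd(ref_spec - mu, full_matrices=False)
basis = Vt[:5]                    # top spectral components span the space

# Project each single trial's magnitude spectrum onto that space.
Z = (np.abs(np.fft.rfft(X, axis=1)) - mu) @ basis.T

# Train/test split, then nearest-centroid classification in feature space
# (a stand-in for the supervised classifier used in the study).
idx = rng.permutation(len(y))
tr, te = idx[:200], idx[200:]
centroids = {v: Z[tr][y[tr] == v].mean(axis=0) for v in vowels}
pred = np.array([min(vowels, key=lambda v: np.linalg.norm(z - centroids[v]))
                 for z in Z[te]])
acc = float(np.mean(pred == y[te]))
print(f"single-trial decoding accuracy: {acc:.2f}")
```

Even with trials that look like pure noise in the time domain, projecting onto a spectral basis derived from independent exemplars concentrates the vowel-distinguishing energy into a few dimensions, which is what makes above-chance single-trial decoding feasible.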

Publication
Brain and Behavior, 7(6)