2022
Learning speaker-specific structural expectations
Rachel Ostrand, Rachel Ryskin
Journal of Cognitive Neuroscience, 2022
Abstract
A hallmark of human cognition is its adaptability to input and the environment. In particular, language processing appears to be exquisitely sensitive to the statistical properties of the input: for example, words and structures that are infrequently experienced are less expected and more challenging to process. But these statistical properties differ across contexts, such as genres of texts, historical periods, and individual speakers. Yet evidence that listeners adapt their linguistic expectations to the speaker-context has been mixed. In this registered report, we aim to first replicate effects of rapid syntactic adaptation to the global linguistic statistics of the environment, using a novel EEG/ERP paradigm which provides a sensitive measure of real-time expectations. In a second experiment, we aim to extend this ERP paradigm to test whether listeners who are exposed to two speakers with distinct syntactic distributions can learn differential syntactic predictions for each speaker and thus build partner-specific syntactic expectations.
Lasting effects of the COVID-19 pandemic on language processing
Daniel Kleinman, Adam M. Morgan, Rachel Ostrand, Eva Wittenberg
PLOS ONE 17(6), 1-14, 2022
Abstract
A central question in understanding human language is how people store, access, and comprehend words. The ongoing COVID-19 pandemic presented a natural experiment to investigate whether language comprehension can be changed in a lasting way by external experiences. We leveraged the sudden increase in the frequency of certain words (mask, isolation, lockdown) to investigate the effects of rapid contextual changes on word comprehension, measured over 10 months within the first year of the pandemic. Using the phonemic restoration paradigm, in which listeners are presented with ambiguous auditory input and report which word they hear, we conducted four online experiments with adult participants across the United States (combined N = 899). We find that the pandemic has reshaped language processing for the long term, changing how listeners process speech and what they expect from ambiguous input. These results show that abrupt changes in linguistic exposure can cause enduring changes to the language system.
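As a sketch of how responses from such a paradigm might be analyzed (the data layout and the logistic-regression approach here are illustrative assumptions, not the paper's reported analysis), one could model the probability of reporting the pandemic-related word as a function of when in the pandemic a listener was tested:

```python
# Illustrative sketch: does the probability of hearing an ambiguous token
# as a pandemic-related word (e.g., "mask" rather than "task") change with
# months of pandemic exposure? Data below are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

trials = pd.DataFrame({
    "month": [0, 0, 0, 3, 3, 3, 6, 6, 6, 10, 10, 10],
    "reported_pandemic_word": [0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1],
})

# Logistic regression: does report probability shift with time of testing?
model = smf.logit("reported_pandemic_word ~ month", data=trials).fit()
print(model.summary())
```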
2021
It's alignment all the way down, but not all the way up: Speakers align on some features but not others within a dialogue
Rachel Ostrand, Eleanor Chodroff
Journal of Phonetics 88, 2021
Abstract
During conversation, speakers modulate characteristics of their production to match their interlocutors' characteristics. This behavior is known as alignment. Speakers align at many linguistic levels, including the syntactic, lexical, and phonetic levels. As a result, alignment is often treated as a unitary phenomenon, in which evidence of alignment on one feature is cast as alignment of the entire linguistic level. This experiment investigates whether alignment can occur at some levels but not others, and on some features but not others, within a given dialogue. Participants interacted with two experimenters with highly contrasting acoustic-phonetic and syntactic profiles. The experimenters each described sets of pictures using a consistent acoustic-phonetic and syntactic profile; the participants then described new pictures to each experimenter individually. Alignment was measured as the degree to which subjects matched their current listener's speech (vs. their non-listener's) on each of several individual acoustic-phonetic and syntactic features. Additionally, a holistic measure of phonetic alignment was assessed using 323 acoustic-phonetic features analyzed jointly in a machine learning classifier. Although participants did not align on several individual spectral-phonetic or syntactic features, they did align on individual temporal-phonetic features and as measured by the holistic acoustic-phonetic profile. Thus, alignment can simultaneously occur at some levels but not others within a given dialogue, and is not a single phenomenon but rather a constellation of loosely related effects. These findings suggest that the mechanism underlying alignment is not a primitive, automatic priming mechanism but rather is guided by communicative or social factors.
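The holistic measure described above can be sketched as a speaker-identification classifier: train a model to distinguish the two experimenters from their per-utterance acoustic-phonetic feature vectors, then score each participant utterance by how much probability the model assigns to the current listener's profile. This is a minimal illustration with random placeholder features, not the paper's actual pipeline:

```python
# Sketch of a holistic acoustic-phonetic alignment measure. Feature
# vectors are random placeholders; in practice they would be extracted
# from recordings of each utterance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_feats = 323  # acoustic-phonetic features per utterance, as in the paper

# Hypothetical per-utterance features for the two experimenters (labels 0 and 1).
X_exp = rng.normal(size=(200, n_feats))
y_exp = np.repeat([0, 1], 100)
clf = LogisticRegression(max_iter=1000).fit(X_exp, y_exp)

# Hypothetical participant utterances produced while addressing experimenter 1.
X_participant = rng.normal(size=(20, n_feats))
p_listener = clf.predict_proba(X_participant)[:, 1]
# Values reliably above 0.5 would indicate alignment toward the current listener.
print("mean similarity to current listener:", p_listener.mean())
```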
Automated Computer Vision Assessment of Hypomimia in Parkinson Disease: Proof-of-Principle Pilot Study
Avner Abrami, Steven Gunzler, Camilla Kilbane, Rachel Ostrand, Bryan Ho, Guillermo Cecchi
Journal of Medical Internet Research 23(2), 2021
Abstract
Background: Facial expressions require the complex coordination of 43 different facial muscles. Parkinson disease (PD) affects facial musculature, leading to "hypomimia" or "masked facies." Objective: We aimed to determine whether modern computer vision techniques can be applied to detect masked facies and quantify drug states in PD. Methods: We trained a convolutional neural network on images extracted from videos of 107 self-identified people with PD, along with 1595 videos of controls, in order to detect PD hypomimia cues. This trained model was applied to clinical interviews of 35 PD patients in their on and off drug motor states, and to seven journalist interviews of the actor Alan Alda obtained before and after he was diagnosed with PD. Results: The algorithm achieved a test-set area under the receiver operating characteristic curve of 0.71 on 54 subjects for detecting PD hypomimia, compared to a value of 0.75 for trained neurologists using the Unified Parkinson Disease Rating Scale-III Facial Expression score. Additionally, the model's accuracy in classifying the on and off drug states in the clinical samples was 63% (22/35), in contrast to an accuracy of 46% (16/35) when using clinical rater scores. Finally, each of Alan Alda's seven interviews was successfully classified as occurring before (versus after) his diagnosis, with 100% accuracy (7/7). Conclusions: This proof-of-principle pilot study demonstrated that computer vision holds promise as a valuable tool for detecting PD hypomimia and for monitoring a patient's motor state in an objective and noninvasive way, particularly given the increasing importance of telemedicine.
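For illustration, a minimal binary CNN of the general kind described above might look like the following; the architecture, input size, and training configuration are assumptions, not the published model:

```python
# Minimal sketch of a binary CNN over face crops (PD hypomimia vs. control).
# Architecture and input size are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(128, 128, 3)),        # face crop extracted from a video frame
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # estimated P(hypomimia)
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[keras.metrics.AUC(name="auc")])  # AUC, the metric reported above
model.summary()
```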
Using Automatic Assessment of Speech Production to Predict Current and Future Cognitive Function in Older Adults
Rachel Ostrand, John Gunstad
Journal of Geriatric Psychiatry and Neurology, 2021
Abstract
Neurodegenerative conditions like Alzheimer disease affect millions and have no known cure, making early detection important. In addition to memory impairments, dementia causes substantial changes in speech production, particularly lexical-semantic characteristics. Existing clinical tools for detecting change often require considerable expertise or time, and efficient methods for identifying persons at risk are needed. This study examined whether early stages of cognitive decline can be identified using an automated calculation of lexical-semantic features of participants' spontaneous speech. Unimpaired or mildly impaired older adults (N = 39, mean 81 years old) produced several monologues (picture descriptions and expository descriptions) and completed a neuropsychological battery, including the Modified Mini-Mental State Exam. Most participants (N = 30) returned one year later for follow-up. Lexical-semantic features of participants' speech (particularly lexical frequency) were significantly correlated with cognitive status at the same visit and also with cognitive status one year in the future. Thus, automated analysis of speech production is closely associated with current and future cognitive test performance and could provide a novel, scalable method for longitudinal tracking of cognitive health.
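The core lexical-frequency measure lends itself to a compact sketch. The wordfreq package, the toy transcripts, and the simple correlation below are illustrative assumptions, not the study's exact pipeline:

```python
# Sketch: mean lexical frequency of a spoken monologue, correlated with a
# cognitive score across participants. Transcripts and scores are toy data.
from wordfreq import zipf_frequency  # assumed choice of frequency norms
from scipy.stats import pearsonr

def mean_zipf(transcript: str) -> float:
    """Mean Zipf-scale (log) word frequency of a monologue."""
    words = transcript.lower().split()
    return sum(zipf_frequency(w, "en") for w in words) / len(words)

# Hypothetical (picture-description transcript, cognitive test score) pairs.
data = [
    ("a child attempts to procure confections surreptitiously", 99),
    ("the boy is reaching for the cookie jar", 95),
    ("the um the boy the thing falls down", 88),
]
freqs = [mean_zipf(t) for t, _ in data]
scores = [s for _, s in data]
r, p = pearsonr(freqs, scores)  # higher mean frequency ~ lower score, on these toy data
print(f"r = {r:.2f}, p = {p:.3f}")
```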
2020
Speech-based characterization of dopamine replacement therapy in people with Parkinson's disease
R. Norel, C. Agurto, S. Heisig, J. J. Rice, H. Zhang, R. Ostrand, P. W. Wacnik, B. K. Ho, V. L. Ramos, G. A. Cecchi
npj Parkinson's Disease 6(1), 2020
Abstract
People with Parkinson's disease (PWP) are under constant tension with respect to their dopamine replacement therapy (DRT) regimen. Waiting too long between doses results in more prominent symptoms, loss of motor function, and greater risk of falling per step. Shortened pill cycles can lead to accelerated habituation and faster development of disabling dyskinesias. The Unified Parkinson's Disease Rating Scale (MDS-UPDRS) is the gold standard for monitoring Parkinson's disease progression, but it requires a neurologist to administer and therefore is not an ideal instrument for continuously evaluating short-term disease fluctuations. We investigated the feasibility of using speech to detect changes in medication states, based on expectations of subtle changes in voice and content related to dopaminergic levels. We calculated acoustic and prosodic features for three speech tasks (picture description, reverse counting, and diadochokinetic rate) for 25 PWP, each evaluated "ON" and "OFF" DRT. Additionally, we generated semantic features for the picture description task. Classification of ON/OFF medication states using features generated from the picture description, reverse counting, and diadochokinetic rate tasks resulted in cross-validated accuracy rates of 0.89, 0.84, and 0.60, respectively. The most discriminating task was picture description, which provided evidence that participants are more likely to use action words in the ON than in the OFF state. We also found that speech tempo was modified by DRT. Our results suggest that automatic speech assessment can capture changes associated with the DRT cycle. Given the ease of acquiring speech data, this method shows promise for remotely monitoring DRT effects.
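A sketch of the ON/OFF classification setup, assuming precomputed acoustic and prosodic feature vectors; grouping cross-validation folds by participant (so the same speaker never appears in both training and test data) is a standard choice for such within-participant designs, though not necessarily the paper's exact scheme:

```python
# Sketch: cross-validated ON/OFF-medication classification from speech
# features. Features are random placeholders for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(1)
n_participants, n_feats = 25, 40   # 25 PWP as in the study; feature count assumed
X = rng.normal(size=(n_participants * 2, n_feats))  # one ON and one OFF recording each
y = np.tile([0, 1], n_participants)                 # 0 = OFF, 1 = ON
groups = np.repeat(np.arange(n_participants), 2)    # fold assignment by participant

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=GroupKFold(n_splits=5))
print(f"cross-validated accuracy: {scores.mean():.2f}")
```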
Automated assessment of speech production and prediction of MCI in older adults
Victoria Sanborn, Rachel Ostrand, Jeffrey Ciesla, John Gunstad
Applied Neuropsychology, 1-8, 2020
Abstract
The population of older adults is growing dramatically, and with it comes an increased prevalence of neurological disorders, including Alzheimer's disease (AD). Though existing cognitive screening tes...
Detection of Acute 3,4-Methylenedioxymethamphetamine (MDMA) Effects Across Protocols Using Automated Natural Language Processing
Carla Agurto, Guillermo A. Cecchi, Raquel Norel, Rachel Ostrand, Matthew Kirkpatrick, Matthew J. Baggott, Margaret C. Wardle, Harriet de Wit, Gillinder Bedi
Neuropsychopharmacology, 2020
Abstract
The detection of changes in mental states such as those caused by psychoactive drugs relies on clinical assessments that are inherently subjective. Automated speech analysis may represent a novel method to detect objective markers, which could help improve the characterization of these mental states. In this study, we employed computer-extracted speech features from multiple domains (acoustic, semantic, and psycholinguistic) to assess mental states after controlled administration of 3,4-methylenedioxymethamphetamine (MDMA) and intranasal oxytocin. The training/validation set comprised within-participants data from 31 healthy adults who, over four sessions, were administered MDMA (0.75, 1.5 mg/kg), oxytocin (20 IU), and placebo in randomized, double-blind fashion. Participants completed two 5-minute speech tasks during peak drug effects. Analyses included group-level comparisons of drug conditions and estimation of classification at the individual level, both within this dataset and on two independent datasets. Promising classification results were obtained for detecting drug conditions, with cross-validated accuracies of up to 87% in training/validation and 92% in the independent datasets, suggesting that the detected patterns of speech variability are associated with drug consumption. Specifically, we found that the effects of oxytocin seem to be driven mostly by changes in emotion and prosody, which are mainly captured by acoustic features. In contrast, mental states driven by MDMA consumption appear to manifest in multiple domains of speech. Furthermore, we found that the experimental task affects the speech response within these mental states, which can be attributed to the presence or absence of an interaction with another individual. These results represent a proof-of-concept application of the potential of speech to provide an objective measurement of mental states elicited during intoxication.
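The train-on-one-study, test-on-independent-data evaluation described above can be sketched as follows; the feature matrices, sample sizes, and the linear SVM are assumptions for illustration:

```python
# Sketch: fit a drug-vs-placebo classifier on the training/validation study,
# then evaluate it on an independent dataset collected under a different
# protocol. All data below are random placeholders.
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_feats = 60  # assumed total of acoustic + semantic + psycholinguistic features

X_train = rng.normal(size=(124, n_feats))   # 31 participants x 4 sessions
y_train = rng.integers(0, 2, size=124)      # 1 = drug, 0 = placebo (placeholder labels)
X_indep = rng.normal(size=(40, n_feats))    # independent dataset, different protocol
y_indep = rng.integers(0, 2, size=40)

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X_train, y_train)
print("independent-set accuracy:", accuracy_score(y_indep, clf.predict(X_indep)))
```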
2019
Repeat after us: Syntactic alignment is not partner-specific
Rachel Ostrand, Victor S. Ferreira
Journal of Memory and Language 108, 2019
Abstract
Conversational partners match each other's speech, a process known as alignment. Such alignment can be partner-specific, when speakers match particular partners' production distributions, or partner-independent, when speakers match aggregated linguistic statistics across their input. However, partner-specificity has only been assessed in situations where it had clear communicative utility, and non-alignment might cause communicative difficulty. Here, we investigate whether speakers align partner-specifically even without a communicative need, and thus whether the mechanism driving alignment is sensitive to communicative and social factors of the linguistic context. In five experiments, participants interacted with two experimenters, each with unique and systematic syntactic preferences (e.g., Experimenter A only produced double object datives and Experimenter B only produced prepositional datives). Across multiple exposure conditions, participants engaged in partner-independent but not partner-specific alignment. Thus, when partner-specificity does not add communicative utility, speakers align to aggregate, partner-independent statistical distributions, supporting a communicatively-modulated mechanism underlying alignment.
Syntactic entrainment: The repetition of syntactic structures in event descriptions
Nicholas Gruberg, Rachel Ostrand, Shota Momma, Victor S. Ferreira
Journal of Memory and Language 107, 216-232, 2019
Abstract
Syntactic structures can convey certain (subtle) emergent properties of events. For example, the double-object dative ("the doctor is giving a patient pills") can convey the successful transfer of possession, whereas its syntactic alternative, the prepositional dative ("the doctor is giving pills to a patient"), conveys just a transfer to a location. Four experiments explore how syntactic structures may become associated with particular semantic content - such as these emergent properties of events. Experiment 1 provides evidence that speakers form associations between syntactic structures and particular event depictions. Experiment 2 shows that these associations also hold for different depictions of the same events. Experiments 3 and 4 implicate representations of the semantic features of events in these associations. Taken together, these results reveal an effect we term syntactic entrainment that is well positioned to reflect the recalibration of the strength of the mappings or associations that allow syntactic structures to convey emergent properties of events.
2017
Phonological markers of Oxytocin and MDMA ingestion
Carla Agurto, Raquel Norel, Rachel Ostrand, Gillinder Bedi, Harriet de Wit, Matthew J. Baggott, Matthew G. Kirkpatrick, Margaret Wardle, Guillermo Cecchi
Proc. Interspeech 2017, pp. 3142-3146
Abstract
Speech data has the potential to become a powerful tool to provide quantitative information about emotion beyond that achieved by subjective assessments. Based on this concept, we investigate the use of speech to identify effects in subjects under the influence of two different drugs: Oxytocin (OT) and 3,4-methylenedioxymethamphetamine (MDMA), also known as ecstasy. We extract a set of informative phonological features that can characterize emotion. Then, we perform classification to detect if the subject is under the influence of a drug. Our best results show low error rates of 13% and 17% for the classification of OT and MDMA vs. placebo, respectively. We also analyze the performance of the features to differentiate the two levels of MDMA doses, obtaining an error rate of 19%. The results indicate that subtle emotional changes can be detected in the context of drug use.
2016
What you see isn't always what you get: Auditory word signals trump consciously perceived words in lexical access
Rachel Ostrand, Sheila E. Blumstein, Victor S. Ferreira, James L. Morgan
Cognition 151, 96-107, 2016
Abstract
Human speech perception often includes both an auditory and visual component. A conflict in these signals can result in the McGurk illusion, in which the listener perceives a fusion of the two streams, implying that information from both has been integrated. We report two experiments investigating whether auditory-visual integration of speech occurs before or after lexical access, and whether the visual signal influences lexical access at all. Subjects were presented with McGurk or Congruent primes and performed a lexical decision task on related or unrelated targets. Although subjects perceived the McGurk illusion, McGurk and Congruent primes with matching real-word auditory signals equivalently primed targets that were semantically related to the auditory signal, but not targets related to the McGurk percept. We conclude that the time course of auditory-visual integration is dependent on the lexicality of the auditory and visual input signals, and that listeners can lexically access one word and yet consciously perceive another.
2011
When Hearing Lips and Seeing Voices Becomes Perceiving Speech: Auditory-Visual Integration in Lexical Access
Rachel Ostrand, Sheila E Blumstein, James L Morgan
Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011
Abstract
In the McGurk Effect, a visual stimulus can affect the perception of an auditory signal, suggesting integration of the auditory and visual streams. However, it is unclear when in speech processing this auditory-visual integration occurs. The present study used a semantic priming paradigm to investigate whether integration occurs before, during, or after access of the lexical-semantic network. Semantic associates of the un-integrated auditory signal were activated when the auditory stream was a word, while semantic associates of the integrated McGurk percept (a real word) were activated when the auditory signal was a nonword. These results suggest that the temporal relationship between lexical access and integration depends on the lexicality of the auditory stream.