Lab Notes

SPEECH TECHNOLOGY
Posted August 2014

Beyond What the Words Say

Vocal features may predict the severity of depressive disorders.


The television ad for medication to treat depressive disorders tells us, “Depression hurts.” While the ad refers to the physical and emotional pain endured by individuals, recent news articles remind us that depression also hurts society. According to an AtWork posting in October 2013, a Gallup survey conducted in 2011 and 2012 concluded that America’s businesses face an annual $23 billion loss of productivity caused by employees’ depression-related health problems. The National Alliance on Mental Illness reports that compared to people without depressive conditions, individuals experiencing depression are more apt to abuse alcohol and drugs, are more prone to chronic illnesses such as diabetes or heart disease, and, most sadly, are at greater risk of suicide. A May 2013 article in the New York Times reveals that the suicide rate for Americans aged 35 to 64 increased by 30% from 1999 to 2010. Every 65 minutes, a military veteran commits suicide, and suicides among active-duty military personnel reached an all-time high in 2012, reports Forbes magazine. The costs exacted by depression are great: the personal anguish and dysfunction experienced by the sufferers and those close to them; organizational and financial burdens shouldered by health-care, military, and business enterprises; and tragic losses borne by those close to suicide victims.

Both civilian organizations and the military are very interested in methods that may alert doctors to a patient’s early-stage symptoms of major depressive disorder (MDD) so that intervention can be started before the condition worsens. To that end, researchers in the Bioengineering Systems and Technologies Group are applying Lincoln Laboratory’s expertise in speech and language processing to the identification of physiological indicators of MDD. They have had promising results from investigations into phonological and articulatory characteristics of speech that may indicate the severity of a person’s depressive state. “We are looking at the correlation between depression severity and certain vocal biomarkers,” says Thomas Quatieri, a senior staff member in the group. “Some of the vocal features we have considered reflect articulation and quality of voice. These include coordination of precisely timed vocal articulators, average speaking rate, individual phoneme speaking rate, pitch and energy dynamics, and irregularities in vocal-fold vibration. These characteristics are often perceived as slowness, slur, monotony, breathiness, and hoarseness in the depressed voice, but may be ‘silent’ and only be detected by advanced temporal-spectral analysis.”

Traditionally, mental health professionals have diagnosed MDD by evaluating patients on scoring systems that rely on qualitative indicators such as a patient’s insomnia, sudden weight gain or loss, agitated state, and self-reported feelings of ennui and guilt. Two commonly used scoring tools are the Hamilton Rating Scale for Depression (HAM-D), which looks at 17 different symptoms, and the self-reported Quick Inventory of Depressive Symptomatology (QIDS), which gauges 16 factors. The Laboratory’s vocal biomarker approach could provide clinicians with a quantitative, more objective tool to supplement the interview-based HAM-D or QIDS diagnostic methods. In addition, detection of speech characteristics associated with depression may enable earlier and more uniform diagnoses than use of the HAM-D or QIDS techniques alone and may also be useful in automatic monitoring of treatment or relapse after treatment has ceased.

[Figure: Schematic] The vocal biomarkers used in Lincoln Laboratory’s study are based on articulation and quality of voice. Articulation is associated with vocal tract dynamics and precise coordination, while quality depends on the regularity of vocal-fold vibration.

“The research on using voice to detect depression severity has been going very well,” says Quatieri. “We have focused our investigation on a set of parameters that reflects the physiological aspects of creating speech.” Applying signal processing techniques to recorded samples of freely spoken and read speech, the research team has developed two innovative vocal biomarkers based on (1) phoneme-dependent speaking rate and (2) incoordination of the vocal tract articulators. This focus was motivated by clinicians’ observations that people with depressive conditions exhibit psychomotor impairment, manifested as slow thinking, sluggish and uncoordinated physical movements, and listless emotional reactions. “For the phonological class of biomarkers, our team has examined durations for specific language phonemes (sound units such as a vowel or a consonant) and for pauses,” says team member Bea Yu. When compared to baseline data from non-depressed subjects, the signal patterns show that speech produced by depressed individuals exhibits differences in phoneme and pause durations. Identification of subjects’ level of depression from their speech samples correlated well with the assessments of those subjects made by clinicians using the HAM-D assessment metric.
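The phoneme-duration idea above can be sketched in a few lines. This is an illustrative toy, not the Laboratory's pipeline: it assumes a time-aligned phoneme segmentation is already available (in practice one would come from a forced aligner), and the segment data, the `sil` pause label, and the z-score comparison against a non-depressed baseline are all assumptions made up for this example.

```python
# Sketch: summarize phoneme and pause durations from a time-aligned
# transcript, then compare a subject's averages against a baseline.
from collections import defaultdict

def duration_stats(segments):
    """segments: list of (label, start_sec, end_sec); 'sil' marks a pause.
    Returns the mean duration per label."""
    totals, counts = defaultdict(float), defaultdict(int)
    for label, start, end in segments:
        totals[label] += end - start
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}

def deviation_from_baseline(subject_means, baseline_means, baseline_stds):
    """Per-phoneme z-scores of a subject's mean durations vs. a
    non-depressed baseline; large positive values mean slowed speech."""
    return {p: (subject_means[p] - baseline_means[p]) / baseline_stds[p]
            for p in subject_means if p in baseline_means}

# Toy segmentation: a lengthened pause between two vowel tokens.
segments = [("ah", 0.00, 0.16), ("sil", 0.16, 0.58), ("ah", 0.58, 0.76)]
means = duration_stats(segments)
print(means)  # mean duration per phoneme/pause label, in seconds
```

In a real study these per-phoneme statistics would be aggregated over many utterances before being correlated with clinical scores.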

The Laboratory team is also looking at speech patterns that may indicate a lack of coordination among the vocal articulators (tongue, jaw, larynx, vocal folds, lips). By measuring the cross-correlation of vocal tract resonant frequencies in samples of recorded speech at different time scales, the researchers may obtain an indirect measure of how well the vocal articulators are moving together. Changes in resonant cross-correlation may be due to uncoordinated dynamics of speech production, such as dispersed timing and phasing of the articulators, and may indicate depression. “The more highly depressed a person is, the greater the decline in coordination. This incoordination measure of the vocal articulators has provided our team with high correlation gains against the HAM-D assessment metric,” says James Williamson, a technical staff member in the Bioengineering Systems and Technologies Group.
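One way to make the cross-correlation idea concrete is to stack time-delayed copies of each resonance (formant) track, compute their correlation matrix, and look at its eigenvalue spectrum: when the channels move together, variance concentrates in a few large eigenvalues, while weaker coupling flattens the spectrum. The sketch below is a simplified illustration of that principle, not the team's actual algorithm; the delay spacing, the synthetic sinusoidal "formant tracks," and the noise level are all assumptions.

```python
# Sketch: channel-delay correlation matrix from resonance tracks,
# summarized by its eigenvalue spectrum.
import numpy as np

def channel_delay_corr_eigs(tracks, delays=(0, 3, 7)):
    """tracks: (n_channels, n_frames) array of resonance-frequency tracks.
    Stacks time-delayed copies of each channel, correlates all copies,
    and returns the eigenvalues of the correlation matrix, largest first."""
    n_ch, n_frames = tracks.shape
    max_d = max(delays)
    rows = [tracks[c, max_d - d : n_frames - d]
            for c in range(n_ch) for d in delays]
    corr = np.corrcoef(np.vstack(rows))
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

rng = np.random.default_rng(0)
t = np.linspace(0, 2, 400)
# Three tightly coupled channels vs. the same channels plus heavy noise.
coupled = np.vstack([np.sin(2 * np.pi * 3 * t + p) for p in (0.0, 0.4, 0.9)])
noisy = coupled + 0.8 * rng.standard_normal(coupled.shape)

# A flatter eigenvalue spectrum suggests less coordinated movement.
print(channel_delay_corr_eigs(coupled)[:3])
print(channel_delay_corr_eigs(noisy)[:3])
```

The top eigenvalues of the coupled channels dominate their spectrum, while the noisy (less coordinated) channels spread variance more evenly, which is the kind of contrast an incoordination feature can exploit.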

The team has seen that their correlations between vocal biomarkers and heightened depressive states are in line with the determinations of depression severity made by clinicians using the HAM-D and QIDS methodologies. Interestingly, quite often the biomarkers correlated more strongly with a single criterion of the two scales—for example, psychomotor retardation—than with a total score from a person’s evaluation. “This may lead to predictors of depression state that use biomarkers tuned to individual subsymptoms of depression,” says team member Brian Helfer.
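The subsymptom observation can be illustrated with a plain Pearson correlation: compare how strongly one biomarker tracks a single scale item versus the total score. All scores and biomarker values below are fabricated placeholders purely to exercise the code; they are not study data.

```python
# Sketch: correlating a vocal biomarker with a single HAM-D item
# (e.g., psychomotor retardation) versus the 17-item total score.
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

biomarker = [0.9, 1.4, 2.1, 2.4, 3.2]   # e.g., an incoordination measure
retardation_item = [0, 1, 2, 2, 3]      # single HAM-D item score
total_score = [8, 20, 12, 25, 18]       # full HAM-D total

print(pearson_r(biomarker, retardation_item))  # strong item-level link
print(pearson_r(biomarker, total_score))       # weaker link to the total
```

In this toy case the biomarker correlates far more strongly with the single item than with the total, which is the pattern that motivates tuning predictors to individual subsymptoms.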

In October 2013, a Lincoln Laboratory team was awarded first place in the Audio/Visual Emotion Challenge and Workshop (AVEC 2013). The challenge was to exploit audio and/or video information to estimate the depression severity of people diagnosed with MDD. In spring 2013, participating research teams were given a set of audio/video recording sessions of patients and were tasked with predicting patients’ scores on a clinical depression assessment (the Beck Depression Inventory) on the basis of objective biomarkers derived from the recorded speech samples or videotaped facial features. The Lincoln Laboratory team used only the speech data, applying its techniques for assessing phoneme-dependent speaking rate and lack of coordination of the vocal tract articulators. Although the AVEC datasets were in German and the Laboratory had previously worked only with English-language data, the Laboratory’s biomarkers performed well: the team’s predictions correlated highly with the Beck scores, and the team won by a large margin.
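The final step of a challenge entry like this, mapping per-session vocal features to a clinical score, can be sketched with an ordinary least-squares fit. This stands in for the team's actual machine-learning pipeline, which the article does not detail; the two feature columns (a speaking-rate measure and an incoordination measure) and all Beck Depression Inventory values below are fabricated for illustration.

```python
# Sketch: predict a clinical depression score from two vocal features
# with ordinary least squares (illustrative stand-in for the real model).
import numpy as np

features = np.array([   # one row per recording session (toy values)
    [4.1, 0.21], [3.2, 0.45], [2.8, 0.62], [3.9, 0.30], [2.5, 0.71]])
bdi = np.array([5.0, 18.0, 27.0, 9.0, 33.0])  # Beck scores (fabricated)

X = np.hstack([features, np.ones((len(features), 1))])  # add intercept
w, *_ = np.linalg.lstsq(X, bdi, rcond=None)             # fit weights

predicted = X @ w
rmse = np.sqrt(np.mean((predicted - bdi) ** 2))  # training-set error only
print(rmse)
```

A real evaluation would, of course, report error on held-out sessions rather than the training fit shown here.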


[Figure: Processing] A multimodal technological approach may dramatically change the way, and how rapidly, depressive disorders or neurological impairments are diagnosed and treated. The future vision includes a mobile technology assistant that would bring lab results to clinicians in the field.

The Lincoln Laboratory team consisted of Quatieri, Williamson, and Helfer, technical staff members in the Bioengineering Systems and Technologies Group; Daryush Mehta, a consultant to that group; Bea Yu of the Intelligence and Decision Technologies Group; and MIT graduate student Rachelle Horwitz. “Our depression team is highly interdisciplinary, representing skills in signal processing, machine learning, speech science, psychology, and neuroscience—a mix that is essential in addressing the complexity of human depression,” says Quatieri. The team also acknowledges Nicolas Malyska, a staff member in the Human Language Technology Group, and Andrea Trevino, a former summer intern and currently a University of Illinois graduate student, for their involvement in early work on phoneme-dependent speaking rate.

Quatieri says that the work on vocal cues to depression is becoming part of a larger multimodal approach to determining depressive states via physiological indicators. In collaboration with doctors and researchers from Massachusetts General Hospital and the Wyss Institute for Biologically Inspired Engineering, Lincoln Laboratory researchers are studying how the muscle movements behind facial expressions may be indicative of MDD and what changes in autonomic physiological functions, such as heart rate and skin conductance, are symptomatic of MDD. An analysis combining results from this variety of quantifiable factors with biomarkers tuned to individual symptoms of depression could add scientific measures to the art of diagnosing depression. “This research may also have implications beyond depression,” says Quatieri, “for example, in diagnosing traumatic brain injury, post-traumatic stress disorders, early dementia, or amyotrophic lateral sclerosis (ALS, also known as Lou Gehrig’s disease), or even in assessing cognitive overload or stress.”

“Our team is quite pleased with our progress and, in particular, our first-place standing in the recent international depression challenge,” says Quatieri. “We view this win as a research step toward helping our soldiers and veterans, as well as civilians, among whom depression and suicide rates are particularly high. We envision, for example, mobile devices for use in automatically monitoring the effectiveness of treatment or for early intervention upon relapse.”

