Background Information

  • Schizophrenia affects about 1 in 100 people in the United States, and knows no racial, cultural, or economic boundaries.

  • Symptoms commonly include delusions, hallucinations, emotional unresponsiveness, and disordered thinking and speech. These symptoms usually appear between the ages of 13 and 25.

  • As with many other mental illnesses, diagnosis and important treatment decisions currently rest either on the patient's reports of subjective experiences or on the clinician's observations.

  • This reliance on subjective reports and observations can make it hard to determine reliably how mental illnesses such as schizophrenia should be treated. New tools are therefore needed to measure clinically relevant behavior accurately and reliably.


Speech and language abnormalities have long been known to accompany schizophrenia. These can include phonetic symptoms like monotone speech, indicative of flat affect, and lexical or semantic symptoms like a lack of coherence (“tangentiality” or “looseness of associations”). The organization/coherence and prosodic (rhythmic) elements of speech are important indicators of functional impairment and treatment targets for this condition. However, there are currently no objective quantitative tools for measuring these speech and language components available for use in clinical practice.
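As a rough illustration of what an objective measure of monotone speech could look like, the sketch below computes the spread of a speaker's pitch (F0) in semitones. It assumes an F0 contour has already been extracted by a pitch tracker and that unvoiced frames are marked with 0; the function name and that convention are illustrative assumptions, not part of any specific tool or of this project's method.

```python
import math
import statistics

def pitch_variability_semitones(f0_hz):
    """Standard deviation of F0 in semitones relative to the speaker's median pitch.

    Assumption: frames with F0 <= 0 are unvoiced and are skipped.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    median = statistics.median(voiced)
    # Convert to semitones so speakers with different baseline pitch are comparable
    semitones = [12 * math.log2(f / median) for f in voiced]
    return statistics.stdev(semitones)

# Made-up contours (Hz) for illustration only
flat_contour = [120, 0, 121, 119, 120]
varied_contour = [100, 0, 140, 90, 160]
flat_score = pitch_variability_semitones(flat_contour)
varied_score = pitch_variability_semitones(varied_contour)
```

A lower value indicates a flatter pitch contour. Whether such a value tracks clinical ratings of flat affect is exactly the kind of question the project sets out to test.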

Using speech samples from both individuals with schizophrenia and healthy control subjects, students will be expected to

  1. optimize and apply methods from computational linguistics and speech analysis to these speech samples, categorizing their semantic coherence and acoustic prosody.

  2. validate these objective measures by comparing them to standard clinical assessments of speech in the sample of patients with schizophrenia, and determine whether the measures differ between patients and healthy control subjects.
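One common way to turn "semantic coherence" into a number is to represent each sentence as the average of its word vectors and take the cosine similarity between adjacent sentences. The sketch below shows that idea under strong simplifying assumptions: the two-dimensional word vectors are invented for illustration, tokenization is a plain whitespace split, and the function names are hypothetical. A real analysis would use pretrained embeddings and is not necessarily the formulation this project will adopt.

```python
import math

def sentence_vector(sentence, word_vectors):
    """Average the vectors of the known words in a sentence (None if none are known)."""
    vecs = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def adjacent_sentence_coherence(sentences, word_vectors):
    """Mean cosine similarity between each pair of consecutive sentences."""
    vectors = [sentence_vector(s, word_vectors) for s in sentences]
    vectors = [v for v in vectors if v is not None]
    sims = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    return sum(sims) / len(sims) if sims else float("nan")

# Toy vectors, made up for illustration only
TOY_VECTORS = {
    "dog": [1.0, 0.0], "cat": [0.9, 0.1], "barks": [0.8, 0.2],
    "stocks": [0.0, 1.0], "market": [0.1, 0.9],
}

on_topic = adjacent_sentence_coherence(["dog barks", "cat barks"], TOY_VECTORS)
off_topic = adjacent_sentence_coherence(["dog barks", "stocks market"], TOY_VECTORS)
```

With these toy vectors, the pair of sentences that stays on topic scores higher than the pair that jumps to an unrelated topic, which is the behavior a coherence measure should show.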

Students will be expected to work on the following:

  • Moving the patients' speech data onto a proper server

  • Acquiring and testing speech recognition software to measure how accurately it transcribes the conversations automatically

  • Learning to use Praat and other speech analysis tools (like openSMILE) to extract acoustic features

  • Reading the literature on text-based features for schizophrenia, and then learning to explore, implement, and use natural language tools for automatic measures of "discourse coherence" based on vector semantics.
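For the transcription task above, accuracy is usually reported as word error rate (WER): the word-level edit distance between a human reference transcript and the recognizer's output, divided by the number of reference words. The sketch below is a minimal implementation assuming lowercase, whitespace-split tokens and no further text normalization; the function name and the example sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference word deleted
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical reference transcript and recognizer output
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Here the recognizer dropped one of six reference words, so the WER is 1/6. Note that WER can exceed 1.0 when the recognizer inserts many extra words.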

Technical Skills

At minimum, students should have taken CS 109 (or another introductory probability course, such as STATS 116). A computational linguistics or natural language processing course (such as CS 124) is highly recommended, as is some background in speech/phonetics (Linguistics 105), or willingness to learn.