Twitter, Facebook, and other social media provide a rich, if imperfect, portal into people's lives. Penn researchers Lyle Ungar and Andy Schwartz describe "differential language analysis," a technique being used in the World Well-Being Project* in which large data sources are used to measure what word use can reveal about various psychosocial characteristics. Tens of millions of Facebook posts and billions of tweets are being quantified to explain how language use varies according to age, gender, personality, health, and happiness. Word clouds visually illustrate the Big Five personality traits (e.g., "What is it like to be neurotic?"), while correlations between language use and county-level health data suggest connections between health and happiness.
Lyle Ungar holds joint faculty appointments in Penn's Schools of Engineering, Wharton, and Medicine, and is associate director of the Center for BioInformatics. He heads a research group that develops scalable machine learning and text mining methods for large bioinformatic and web-based problems. He also holds several data mapping and mining patents and is widely published in these and other fields. Andy Schwartz is the lead research scientist of the World Well-Being Project.
*Jointly conducted with Johannes Eichstaedt, Lukasz Dziurzynski, Peggy Kern, and Martin Seligman
Computer and Information Science and Positive Psychology Center, Penn