Researchers from the University of Pennsylvania, Johns Hopkins University, University College London and Microsoft Research started by looking at Twitter users' self-described

In the UK, a job code system sorts occupations into nine classes. Using that hierarchy, the researchers determined the average income for each code, then sought a representative sample of users from each.
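As a rough illustration of that step, the sketch below averages income per job class and then draws a sample of users from each class. It is not the researchers' code: the function names, the input shapes and the choice to draw an equal number of users per class are all assumptions made for the example.

```python
import random
from collections import defaultdict
from statistics import mean


def mean_income_by_class(records):
    """records: iterable of (job_class, income) pairs (hypothetical input shape)."""
    by_class = defaultdict(list)
    for job_class, income in records:
        by_class[job_class].append(income)
    # One average income figure per job class.
    return {job_class: mean(incomes) for job_class, incomes in by_class.items()}


def sample_per_class(users, per_class, seed=0):
    """users: iterable of (user_id, job_class) pairs.

    Draws up to `per_class` users from every class. Equal-sized samples are an
    assumption here; the article does not say how the sample was drawn.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_class = defaultdict(list)
    for user_id, job_class in users:
        by_class[job_class].append(user_id)
    return {
        job_class: rng.sample(members, min(per_class, len(members)))
        for job_class, members in by_class.items()
    }
```

Drawing from every class, rather than from the pool of users as a whole, keeps smaller occupational groups from being crowded out by larger ones.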

After manually removing ambiguous profiles, the team ended up with 5,191 Twitter users and more than 10 million tweets to analyze.

"It's the largest dataset of its kind for this type of research," said Daniel Preotiuc-Pietro, a post-doctoral researcher at the University of Pennsylvania, who led the study.

Researchers then built a statistical natural language processing algorithm that identified the words people in each occupational class use distinctively.

Most people tend to use the same or similar words, so the algorithm's job was to 'understand' which were most predictive for each class. Researchers analyzed these groupings and assigned them qualitative signifiers.
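One simple way to picture what "most predictive for each class" means is to compare how often a word appears inside a class with how often it appears everywhere else. The sketch below does that with a smoothed log-ratio of word frequencies. It is an illustrative stand-in, not the study's model, and the function name, the `alpha` smoothing parameter and the toy inputs are assumptions.

```python
import math
from collections import Counter


def distinctive_words(docs_by_class, top_n=3, alpha=1.0):
    """Rank words per class by a smoothed log-ratio of in-class to
    out-of-class relative frequency.

    docs_by_class: dict mapping a class label to a list of text strings.
    alpha: add-alpha smoothing so unseen words do not produce log(0).
    """
    # Word counts per class, using naive lowercase whitespace tokenization.
    counts = {
        label: Counter(word for doc in docs for word in doc.lower().split())
        for label, docs in docs_by_class.items()
    }
    vocab = set().union(*counts.values())
    total = Counter()
    for class_counts in counts.values():
        total.update(class_counts)
    n_total = sum(total.values())

    result = {}
    for label, class_counts in counts.items():
        n_in = sum(class_counts.values())
        n_out = n_total - n_in
        scores = {}
        for word in vocab:
            p_in = (class_counts[word] + alpha) / (n_in + alpha * len(vocab))
            p_out = (total[word] - class_counts[word] + alpha) / (n_out + alpha * len(vocab))
            scores[word] = math.log(p_in / p_out)
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[label] = ranked[:top_n]
    return result
```

Words that everyone uses, such as "the", score near zero because they are about equally common inside and outside any one class, while words concentrated in a single class rise to the top of that class's list.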

Some of the results validated what is already known: for instance, that a person's words can reveal age and gender, and that these are tied to income, researchers said.

The researchers also found that those who earn more tend to express more fear and anger on Twitter. Perceived optimists have a lower mean income.

Text from those in lower income brackets includes more swear words, whereas those in higher brackets more frequently discuss politics, corporations and the nonprofit world.

The study was published in the journal PLOS ONE.

