A team of researchers led by Dr. Monica Stephens of Humboldt State University (HSU) has released the results of a year-long study identifying the geographic origins of hate speech on Twitter in the U.S. Using the Google Maps API, the team has overlaid the data with a county-level analysis to show the distribution of hate speech on Twitter across the country via an interactive map.
The results? Hate speech tends to be more prevalent in the eastern part of the country than in the west. Concentrations of hate speech vary noticeably from one term to the next; playing with the interactive map is both fascinating and depressing.
The team analyzed a total of 150,000 geo-coded tweets (i.e., the user had opted into Twitter’s location-sharing service) sent between June 2012 and April 2013, each of which contained one of the homophobic or racist hate words. Rather than using algorithmic sentiment analysis, the project relied upon Humboldt State undergraduate students to read each tweet in its entirety and classify it as positive, neutral or negative. Only those tweets that were identified by human readers as negative were used in this analysis.
To produce the map, all 150,000 negative tweets containing each hate word were aggregated to the county level and normalized by the total Twitter traffic in each county. This is important because it keeps counties with more Twitter users from appearing to have higher concentrations of hate speech simply because they produce more tweets overall.
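To make the normalization step concrete, here is a minimal Python sketch of the idea. It is not the team's code: the function name, the county labels, and all of the numbers are made up for illustration. It assumes each negative tweet has already been assigned to a county and that a total tweet count per county is available.

```python
from collections import Counter


def normalized_rates(negative_tweet_counties, total_tweets_by_county):
    """Return, per county, the share of all tweets that were negative hate-word tweets.

    negative_tweet_counties: one county label per negative tweet (hypothetical input).
    total_tweets_by_county: total tweet volume per county (hypothetical input).
    """
    negative_counts = Counter(negative_tweet_counties)
    rates = {}
    for county, total in total_tweets_by_county.items():
        if total > 0:  # skip counties with no Twitter traffic to avoid dividing by zero
            rates[county] = negative_counts.get(county, 0) / total
    return rates


# Made-up example: County B has more negative tweets in absolute terms,
# but County A has the higher rate once total traffic is accounted for.
negative = ["County A", "County A", "County B", "County B", "County B", "County B"]
totals = {"County A": 100, "County B": 1000}
rates = normalized_rates(negative, totals)
```

In this toy example County B sends twice as many negative tweets as County A, yet County A's rate is five times higher, which is the kind of difference that raw counts alone would hide.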
More details about the methodology can be found on the group’s Floating Sheep blog (which also features some of the team’s other cool geo-visualizations); there is also an FAQ that addresses some of the controversies this project has stirred up.
Are you surprised at the results?