New Tweet-Screening Method Tracks Flu More Accurately

Tracking the flu with Twitter is no longer a new concept, but Johns Hopkins researchers recently developed a new tweet-screening method that allows for more accurate, real-time data on flu cases.

Previously, flu tracking via Twitter was hampered by the difficulty of identifying which tweets came from people who actually had the virus, rather than from people who were merely talking about the flu in general. Tweets that mentioned the illness without the author being sick could skew the results. For example, when basketball star Kobe Bryant exhibited flu-like symptoms during a game, Twitter flu mentions spiked as fans discussed the symptoms during and after the game.

“We wanted to separate hype about the flu from messages from people who truly become ill,” said Mark Dredze, an assistant research professor in the Department of Computer Science who uses tweets to monitor public health trends.  To do this, the Johns Hopkins team developed “sophisticated statistical methods based on human language processing technologies.” For example, the system can tell the difference between “I have the flu” and “I’m worried about getting the flu.”
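To make that distinction concrete, here is a deliberately simplified Python sketch of the filtering idea. It is not the Johns Hopkins system, which relies on trained statistical language models; the hand-written cue patterns, the label names, and the `classify_tweet` function are all assumptions made for illustration.

```python
import re

# Hypothetical cue patterns, chosen for illustration only. The real system
# uses statistical language-processing models, not hand-written rules.
INFECTION_CUES = [
    r"\bi (?:have|had|got|caught) the flu\b",
    r"\bi(?:'m| am) (?:sick|down|in bed) with the flu\b",
]
CONCERN_CUES = [
    r"\bworried about\b",
    r"\bafraid of\b",
    r"\bflu shot\b",
    r"\bflu season\b",
]


def normalize(text):
    """Lowercase the tweet and turn curly apostrophes into straight ones."""
    return text.lower().replace("\u2019", "'")


def classify_tweet(text):
    """Return 'infection', 'awareness', or 'unrelated' for one tweet."""
    t = normalize(text)
    if not re.search(r"\bflu\b", t):
        return "unrelated"
    # Check concern cues first so that worry about the flu is never
    # counted as a reported case.
    if any(re.search(pattern, t) for pattern in CONCERN_CUES):
        return "awareness"
    if any(re.search(pattern, t) for pattern in INFECTION_CUES):
        return "infection"
    # Any other flu mention (news, sports chatter) counts as general talk.
    return "awareness"


if __name__ == "__main__":
    for tweet in ["I have the flu", "I'm worried about getting the flu."]:
        print(tweet, "->", classify_tweet(tweet))
```

Only tweets labeled as infections would feed the case count in a scheme like this. A rule list this small would miss most real phrasing, which is one reason a statistical approach is the more plausible choice at scale.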

These methods filter out the “online chatter” that is not directly linked to actual flu infections. The system analyzes 5,000 publicly available tweets per minute, and its results (available in real time) align closely with government disease data that takes much longer to compile. For example, the U.S. Centers for Disease Control and Prevention, working from flu-related symptoms recorded during hospital visits, usually takes two weeks to publish data on the flu’s prevalence. When compared with the CDC’s data from November and December 2012, the Johns Hopkins results “demonstrated a substantial improvement in tracking with CDC figures as compared to previous Twitter-based tracking methods.”

The data has also produced maps that show the difference between last year’s mild flu season and this year’s more severe outbreak.

This type of data, and its availability in real time, will aid health officials in determining both the scope and severity of an epidemic. And the technology can go beyond tracking the flu; Dredze adds, “Our hope is that the new technology can be used [to] track other diseases as well.” So far, the system has only been used to track tweets in the U.S.