
Twitter Sentiment: Health Predictions: Python, R Text Mining Tutorials

Hosted By
John V.


The event will include presentations by two local professors, Professor Alan Mislove of Northeastern University and Professor Ben Liu of the University of Massachusetts Lowell, each presenting leading-edge research on using Twitter to provide real-time insight into the mood and health of the country. Two beginner tutorials (one on Python, given by Shankar Ambady, and another on R) will also be given to help audience members tap into Twitter and other APIs and then apply text mining tools.

(1) Professor Alan Mislove will present "Pulse of the Nation: U.S. Mood Throughout the Day Inferred from Twitter". This research was featured last year by many national news outlets, including CBS News, the NY Times, and the Wall Street Journal. A description of the project, including several visualizations (such as a time-lapse video), is available online.

(2) Professor Ben Liu will present "Predicting Flu Trends using Twitter Data". From the research paper's abstract: "In this paper we present the Social Network Enabled Flu Trends framework, which monitors messages posted on Twitter with a mention of flu indicators to track and predict the emergence and spread of an influenza epidemic in a population. The models predict data collected and published by CDC, as the percentage of visits to “sentinel” physicians attributable to ILI in successively weeks." The research paper is available on Professor Liu's website.

(3) Shankar Ambady of Session M will give a tutorial on the Python NLTK (Natural Language Toolkit). Shankar previously presented a comprehensive overview of the NLTK last December at the Python meetup. The NLTK is a powerful collection of libraries that can be applied to a variety of NLP applications such as sentiment analysis. His presentation from last December is available online (click on Boston Python Meetup Materials).

(4) A tutorial on using R for text mining will also be presented. This tutorial is being co-developed by the Boston Predictive Analytics group and the Greater Boston useR Group (R Programming Language).
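
For a flavor of the kind of text mining the tutorials touch on, here is a minimal sketch of word-list sentiment scoring in plain Python. It is only an illustration: it does not use the NLTK, and the word lists and function names (`POSITIVE`, `NEGATIVE`, `sentiment_score`, `label`) are hypothetical placeholders, not taken from the presenters' materials.

```python
import re

# Tiny hypothetical word lists, for illustration only.
# A real analysis would use a much larger, validated lexicon.
POSITIVE = {"good", "great", "happy", "love", "excellent"}
NEGATIVE = {"bad", "sad", "terrible", "hate", "awful"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return a score in [-1.0, 1.0]; 0.0 means neutral or no known words."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    if total == 0:
        return 0.0
    return (pos - neg) / total


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I love this great city", "What a terrible, awful day"]:
        print(tweet, "->", label(tweet))
```

Counting matches against fixed word lists is about the simplest approach available; it ignores negation, sarcasm, and context, which is where a toolkit's tokenizers and trained classifiers become useful.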

The tentative schedule is:

6:45 - 7:00 - R/Text Libraries

7:00 - 7:40 - Python/NLTK

7:40 - 7:50 - Break (we will ask the audience if and when to take it)

7:50 - 8:20 - U.S. Mood Throughout the Day Using Twitter

8:20 - 8:50 - Predicting Flu Trends Utilizing Twitter

The tutorials come first so that the basic concepts they introduce can set the stage for the large-scale research studies.

1 Memorial Drive · Cambridge, MA