PLEASE NOTE THE TIME CHANGE - WE WILL START AT 5PM INSTEAD OF 5:30PM!
NU ML is proud to have Dhrumil Mehta give this month's talk, titled "Political Framing: A Machine Learning Approach to Studying Congressional Rhetoric". Dhrumil is a graduating MS student in Computer Science working with Prof. Doug Downey and will continue his work at the Berkman Center for Internet and Society at Harvard this summer. He will also be consulting for USA Today, where he will employ his machine learning algorithm to help them improve political reporting. More info below!
Talk title: "Political Framing: A Machine Learning Approach to Studying Congressional Rhetoric"
Thinking about natural language as data that can be processed quantitatively is a relatively new concept outside of academia and large high-tech companies. Only in recent years have personal computers become powerful enough for natural language analysis to be a feasible way for organizations to use language data to gain strategic insights. Language data can be extremely large and unruly. In this presentation, I will explain how I extracted meaningful insights from one large language data set (speeches in the US Congress).
In Metaphors We Live By, George Lakoff and Mark Johnson argue that certain metaphorical structures are deeply embedded in our use of language. A conceptual metaphor is a domain through which we can understand an idea. For example, when we use phrases like “he shot down my argument”, we are understanding argumentation through the metaphor of war or battle. In The Political Mind, Lakoff explains that political speech about a particular issue is often framed, intentionally or unintentionally, through a particular metaphor. One such example may be the framing of rhetoric related to immigration in terms of crime, or the framing of a war in terms related to the rhetoric of freedom and liberation.
The aim of this project is to take preliminary steps towards a computational study of rhetorical framing. I have modeled a rhetorical frame using a bag-of-words approach and created sets of approximately 500 words for each rhetorical frame that I aim to study. The data corpus is about 21,000 speeches spanning the last 15 years in both chambers of the US Congress. In this study, I take a supervised learning approach in which I have trained a multinomial Naïve Bayes classifier to classify speeches by the categories assigned to them by the Capitol Words API query used to retrieve them. After ascertaining that the classifier works with a good degree of accuracy, rather than asking it to classify a novel speech, I instead ask it to classify the bag-of-words model of a frame. Analyzing the log-likelihoods gives a preliminary and intuitive idea of the effectiveness of this approach in understanding rhetorical framing. In this paper, I have also presented an analysis of the results of two other variants of the aforementioned program that allow us to understand the rhetoric segmented by party as well as over predefined, discrete time intervals.
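To make the idea of "classifying a frame" concrete, here is a minimal sketch of how it might look in Python. It is not the speaker's implementation: the four toy speeches, the two category labels, the short frame word lists, the function names, and the add-one smoothing are all assumptions made for illustration, and the real corpus (about 21,000 speeches) and frames (about 500 words each) are far larger. The sketch trains a multinomial Naive Bayes model on labeled speeches, then scores a frame's bag of words against each category and compares the log-likelihoods.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(speeches):
    """Collect the counts a multinomial Naive Bayes model needs.

    speeches: list of (text, category) pairs.
    """
    doc_counts = Counter()              # number of speeches per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for text, category in speeches:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def frame_log_likelihoods(model, frame_words, alpha=1.0):
    """Score a bag of words against every category.

    Returns {category: log P(category) + sum of log P(word | category)}.
    Uses additive (Laplace) smoothing with strength alpha (an assumption).
    Words never seen in training are skipped.
    """
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for category in doc_counts:
        total_words = sum(word_counts[category].values())
        denominator = total_words + alpha * len(vocab)
        score = math.log(doc_counts[category] / total_docs)
        for word in frame_words:
            if word in vocab:
                count = word_counts[category][word]
                score += math.log((count + alpha) / denominator)
        scores[category] = score
    return scores


# Hypothetical toy data standing in for categorized Congressional speeches.
toy_speeches = [
    ("illegal border crossing criminal enforcement border", "immigration"),
    ("visa border workers illegal law", "immigration"),
    ("troops freedom liberation enemy troops", "war"),
    ("war troops freedom democracy enemy", "war"),
]

# Hypothetical, very short frame word lists.
crime_frame = ["illegal", "criminal", "law", "enforcement"]
freedom_frame = ["freedom", "liberation", "democracy"]

model = train(toy_speeches)
crime_scores = frame_log_likelihoods(model, crime_frame)
freedom_scores = frame_log_likelihoods(model, freedom_frame)

for name, scores in [("crime", crime_scores), ("freedom", freedom_scores)]:
    best = max(scores, key=scores.get)
    print(name, "frame is most likely under category:", best)
```

A higher log-likelihood for a category suggests that speeches in that category use the frame's vocabulary more heavily. Running the same scoring on models trained separately per party or per time interval would be one way to approximate the two variants mentioned above.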
Speaker bio: Dhrumil Mehta is graduating from NU with an MS in Computer Science with a focus on AI and Natural Language Processing, and a BA in Philosophy.
This summer I will be conducting research at the Berkman Center for Internet and Society at Harvard on a project called Mediacloud, where I will use computational methods to process and gain insights from news media rhetoric. I will also be consulting for USA Today to help them harness machine learning and language processing techniques to extract meaning from texts such as political speeches to improve their political reporting.