How Accurate is 90% Accurate? On Evaluation of Sentiment Analysis engines

  • October 2, 2013 · 7:00 PM

“Nobody can go beyond 70% accuracy”. “Our tool reaches 90% accuracy”.

The statements above are meaningful and meaningless at the same time.

Meaningful because they make it clear that there is an issue here, and that 100% accuracy is beyond our wildest dreams. Meaningless because they don’t say how accuracy is measured and, most importantly, they don’t specify the task on which that accuracy is achieved: document-level sentiment (by the way, is that useful for business purposes?), entity-level sentiment… In other words, when sentiment is assigned to a piece of text, do we know which brand (Microsoft, Apple…) or which topic (operating system, price…) it is assigned to? So maybe the question is rather: accuracy on what?

Evaluation of accuracy is a scientific task that should be performed with open methods and metrics. This raises questions such as: what is the difference between accuracy, precision, and recall? How do we measure inter-tagger agreement? Are there texts that are genuinely ambiguous even for humans?
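To make these distinctions concrete, here is a minimal sketch in Python of accuracy, per-class precision and recall, and Cohen's kappa (one common measure of inter-tagger agreement). The label names and the toy data are assumptions made up for illustration; they don't come from any real corpus or from the talk itself.

```python
from collections import Counter

def accuracy(gold, pred):
    """Share of items where the predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def precision_recall(gold, pred, label):
    """Precision and recall for a single class."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    predicted = sum(p == label for p in pred)  # items the engine tagged as `label`
    actual = sum(g == label for g in gold)     # items that really are `label`
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

def cohen_kappa(tags_a, tags_b):
    """Agreement between two taggers, corrected for chance agreement."""
    n = len(tags_a)
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    count_a, count_b = Counter(tags_a), Counter(tags_b)
    labels = set(tags_a) | set(tags_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both taggers used one identical label throughout
    return (observed - expected) / (1 - expected)

# Toy data (made up): mostly neutral texts, and an engine that always says "neu".
gold = ["neu"] * 8 + ["pos", "neg"]
pred = ["neu"] * 10

print(accuracy(gold, pred))                 # overall accuracy
print(precision_recall(gold, pred, "pos"))  # precision and recall for "pos"

# Two hypothetical human taggers labelling the same four texts.
tagger_a = ["pos", "pos", "neg", "neg"]
tagger_b = ["pos", "neg", "neg", "neg"]
print(cohen_kappa(tagger_a, tagger_b))
```

In this toy setup the engine scores 80% accuracy while finding none of the positive or negative texts, which is one reason a single accuracy figure says little without the task and the class distribution behind it.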

In our view, there is one factor that plays a major role here: business rules, i.e. the way a company sees its space. If I say “ACME just launched a new release of its explosive tennis balls”, is that a positive statement (new release) or just a neutral fact that shouldn’t distract marketeers? Being able to efficiently implement these peculiarities (business rules) is key in achieving high accuracy in a way that is meaningful for the end user of the information.
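The effect of a business rule on measured accuracy can be sketched in a few lines of Python. Everything here is hypothetical: the texts (other than the ACME sentence above), the engine output, and the keyword-based rule, which is only a crude stand-in and not the linguistic approach the talk refers to. The point is simply that the same predictions score differently depending on which gold standard the business rule implies.

```python
def accuracy(gold, pred):
    """Share of items where the predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

texts = [
    "ACME just launched a new release of its explosive tennis balls",
    "The new ACME tennis balls are great",
    "ACME tennis balls exploded too early, terrible",
]

# Hypothetical output of a sentiment engine for the three texts.
engine_output = ["pos", "pos", "neg"]

# Gold standard A: this company counts a product launch as positive news.
gold_launch_is_positive = ["pos", "pos", "neg"]
# Gold standard B: this company treats a launch as a neutral fact.
gold_launch_is_neutral = ["neu", "pos", "neg"]

def apply_launch_rule(text, label):
    """Hypothetical business rule: launch announcements are neutral."""
    if "launched" in text.lower():
        return "neu"
    return label

adjusted_output = [apply_launch_rule(t, l) for t, l in zip(texts, engine_output)]

print(accuracy(gold_launch_is_positive, engine_output))   # engine vs. gold A
print(accuracy(gold_launch_is_neutral, engine_output))    # engine vs. gold B
print(accuracy(gold_launch_is_neutral, adjusted_output))  # rule applied, vs. gold B
```

The unmodified engine matches gold standard A on all three texts but only two of three under gold standard B; applying the rule brings it back in line with B.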

We will show why linguistic approaches to sentiment analysis (symbolic approaches, as opposed to machine learning approaches) are better suited to this challenge of integrating business rules efficiently. We will also use real corpora for sentiment evaluation and study their peculiarities.

We will refer to Seth Grimes’s article “Never Trust Sentiment Accuracy Claims”, a common reference for the industry.

This event will be of interest to users of Sentiment Analysis and Text Analytics technology in these sectors:

Social CRM: because customer sentiment in social media is key
Business Intelligence: because its new challenge is integrating unstructured data
Contact Center: because Social Media is becoming the channel of choice for many customers
Big Data: because most of the data in “big data” is text

Venue Details:

Date: Wednesday October 2nd, 2013
Hour: 19:00
Place: WeWork (room to be announced)
Address: 156 2nd Street, San Francisco, CA 94104
Presentation by: Antonio Valderrabanos, CEO & Founder, Bitext

Feel free to join via the usual Meetup procedures, or by sending an email to Vicky Ortiz at [masked]

  • Bill J.

    Commuting from Sunnyvale, hoping to be there but will be late, e.g. 7:30 to 8.

    October 2, 2013

15 went
