The 10,000 Minute Lecture on Document Sanitization – A Text Mining Technique


Welcome to The 10,000 Minute Lecture workshop. This is your opportunity to get hands-on practice with a wide variety of data science skills in this 1.5-hour online lecture.

The online lecture is FREE. Your laptop must have Anaconda 3.6 pre-installed before you begin the workshop. Find the software here:

Registration Link:


07:00 pm - 08:30 pm: Document Sanitization – A Text Mining Technique


* Session

Topic: Document Sanitization – A Text Mining Technique

Instructor: Smitha Ganesh

LinkedIn:

About the instructor:

Smitha is an experienced data scientist with 14+ years of industry experience and a proven track record of leading, designing and developing solutions that help businesses in aviation, oil & gas and automotive achieve $MM productivity gains.

She is currently a Principal Consultant at ThoughtWorks, India, working on projects that help clients become AI mature.

She has a master’s degree in structural engineering and previously worked with GM and GE on simulations and machine learning techniques.
She has strong interests in applied machine learning: pattern recognition, text mining, trend analysis and anomaly detection.

Smitha also works with a few universities to help them align their curricula with current data science trends.

Learning Outcomes: Learn the text mining stream of machine learning
Techniques to sanitize documents using text mining concepts

Key takeaways:
1. What is text mining?
2. Is text mining a part of machine learning?
3. What is document sanitization and why is it so critical?
4. Use cases of document sanitization
5. Sensitive information identification vs removal strategy
6. Techniques to identify sensitive information – named entity recognition (NER) and sentence similarity
7. Final look of a sanitized document
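
To give a feel for takeaways 5 to 7, here is a minimal Python sketch of the identify-then-remove flow. It is an illustration only, not the instructor's material: the regular expressions stand in for a trained NER model, `difflib` stands in for a proper sentence-similarity measure, and the names `PATTERNS`, `identify`, `sanitize` and `similarity` are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumption: simple regex patterns stand in for a trained NER model.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def identify(text):
    """Identification step: return (label, start, end) spans, sorted by position."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((label, match.start(), match.end()))
    return sorted(spans, key=lambda span: span[1])


def sanitize(text):
    """Removal step: replace every identified span with a [LABEL] placeholder."""
    pieces, cursor = [], 0
    for label, start, end in identify(text):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def similarity(a, b):
    """Character-based similarity in [0, 1]; a rough stand-in for sentence similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    print(sanitize("Contact jane.doe@example.com or call +91 98765 43210."))
    print(similarity("The contract value is confidential.",
                     "Contract values are confidential."))
```

Keeping identification and removal as separate functions mirrors takeaway 5: the detector can be swapped for an NER model without touching the masking logic.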


The online lecture is FREE to attend.
RSVP now to reserve your spot at the event!