[webMeetup] Profiling Text Data

Hosted By
Kornelia P.

Details

Join the 35th NLP webMeetup online on 5th October 2020. Mani Sarkar will explain why profiling text data is important and introduce his NLP Profiler library to the community. We are looking forward to welcoming you!

If you want to keep up to date with NLP, join our LinkedIn community site: https://www.linkedin.com/company/nlp-zurich/

Agenda

18:55 Join the NLP Zurich webMeetup
19:00 Mani Sarkar: Profiling Text Data (NLP Profiler)
19:35 Q&A
19:50 Virtual Hugs and Kisses ⊂(◉‿◉)つ

👉 Register in advance for this webMeetup:
https://us02web.zoom.us/meeting/register/tZIuf-GuqDIsG9aYzCAA0sZ99lOYZmwsECot

Talk summary:
Natural language processing (NLP) is a broad field with many innovations and advancements. Despite that, at a very basic level, there are no comprehensive tools for analyzing tabular text data. So we all end up building our own little solutions to analyze text datasets. Each of us might do it differently and get different results.

While preparing for a talk some time ago, I wrote a utility called NLP Profiler. Given a dataset and the name of a column containing text, NLP Profiler returns either high-level insights about the text or low-level, granular statistics about the same text. Think of it as calling pandas' DataFrame.describe() or running Pandas Profiling on your data frame, but for text columns rather than numeric ones.

In this talk, we will see what profiling means, why it is important, and how it can be applied to datasets to extract interesting information. High-level information includes things like sentiment analysis, subjectivity/objectivity analysis, and grammar or spelling quality checks. Low-level details include the number of words in a sentence, the number of emojis in the text, etc.

NLP Profiler can do this analysis using a single line of code. Above all, it can be extended and shared openly with others.
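
As a rough illustration of the low-level statistics mentioned above (word counts, emoji counts), here is a minimal sketch in plain Python. It does not use NLP Profiler's actual API; the function names and output keys are hypothetical, and the emoji check is only an approximation based on Unicode categories.

```python
import unicodedata


def count_words(text: str) -> int:
    """Count whitespace-separated tokens in the text."""
    return len(text.split())


def count_emojis(text: str) -> int:
    """Approximate emoji count.

    Assumption: a character in Unicode category 'So' (Symbol, other)
    is treated as an emoji. This also matches some non-emoji symbols,
    so it is a rough heuristic only.
    """
    return sum(1 for ch in text if unicodedata.category(ch) == "So")


def profile_text(text: str) -> dict:
    """Return a few low-level statistics for one piece of text."""
    return {
        "word_count": count_words(text),
        "emoji_count": count_emojis(text),
        "char_count": len(text),
    }


def profile_column(rows: list, column: str) -> list:
    """Profile one text column of a dataset given as a list of dicts."""
    return [profile_text(row[column]) for row in rows]


if __name__ == "__main__":
    dataset = [
        {"id": 1, "text": "I love NLP 😀😀"},
        {"id": 2, "text": "Profiling text data is useful."},
    ]
    for stats in profile_column(dataset, "text"):
        print(stats)
```

A real profiler would add the high-level measures (sentiment, subjectivity, spelling quality) on top of counts like these, typically by delegating to dedicated NLP libraries.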

About the speaker:
Mani Sarkar is a passionate developer who currently works with small teams and startups as a freelance software, data and ML engineer, strengthening those teams and helping them accelerate.

A Java Champion, JCP Member, OpenJDK contributor, thought leader in the LJC and other developer communities and involved with @adoptopenjdk, @graalvm and other F/OSS projects. Writes code, not just on the Java/JVM platform but in other programming languages as well, hence likes to call himself a polyglot developer. He sees himself working in the areas of core Java, Hotspot, GraalVM, Truffle, VMs, Performance Tuning, Data, AI/ML/DL and NLP.

An advocate of a number of agile and software craftsmanship practices and a regular at many talks, conferences and hands-on-workshops – speaks, participates, organises and helps out at many of them. Expresses his thoughts often via blog posts (on his own blog site, DZone, Medium and other third-party sites), and microblogs (tweets).

NLP resources

AI/ML/DL resources

Mani’s thoughts on many things AI/ML/DL/NLP
https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/

More about Mani
https://neomatrix369.wordpress.com/about

NLP Zurich (merging with Language AI Meetup)