[webMeetup] Profiling Text Data

![[webMeetup] Profiling Text Data](https://secure.meetupstatic.com/photos/event/5/7/3/highres_492541395.webp?w=750)
Details
Join the 35th NLP webMeetup online on 5th October 2020. Mani Sarkar will explain why profiling text data is important and introduce his NLP Profiler library to the community. We are looking forward to welcoming you!
If you want to keep up to date with NLP, join our LinkedIn community site: https://www.linkedin.com/company/nlp-zurich/
Agenda
18:55 Join the NLP Zurich webMeetup
19:00 Mani Sarkar: Profiling Text Data (NLP Profiler)
19:35 Q&A
19:50 Virtual Hugs and Kisses ⊂(◉‿◉)つ
👉 Register in advance for this webMeetup:
https://us02web.zoom.us/meeting/register/tZIuf-GuqDIsG9aYzCAA0sZ99lOYZmwsECot
Talk summary:
Natural language processing (NLP) is a broad field with many innovations and advancements. Despite that, at a very basic level, there are no comprehensive tools to analyze tabular text data. So we all end up building our own little solutions to analyze text datasets. Each of us might do it differently and get different results.
While preparing for a talk some time back, I wrote a utility called NLP Profiler. When given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights about the text or low-level, granular statistical information about the same text. Think of it as calling the pandas DataFrame.describe() function or running Pandas Profiling on your data frame, but for datasets containing text columns rather than the usual numeric columns.
In this talk, we will see what profiling means, why it is important, and how it can be applied to datasets to extract interesting information. High-level information includes things like sentiment analysis, subjectivity/objectivity analysis, and grammar or spelling quality checks. Low-level details include the number of words in a sentence, the number of emojis in the text, etc.
NLP Profiler can do this analysis using a single line of code. Above all, it can be extended and shared openly with others.
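To make the idea of low-level profiling concrete, here is a minimal hand-rolled sketch in plain Python. It is not NLP Profiler's API: the function names (`profile_text`, `profile_column`), the output keys and the sample dataset are all hypothetical, chosen only to illustrate what granular statistics for a text column can look like.

```python
import string


def profile_text(text: str) -> dict:
    """Return a few granular statistics for one piece of text.

    Hypothetical helper for illustration; not part of NLP Profiler.
    """
    words = text.split()  # naive whitespace tokenisation
    return {
        "characters_count": len(text),
        "words_count": len(words),
        "punctuation_count": sum(ch in string.punctuation for ch in text),
        "digits_count": sum(ch.isdigit() for ch in text),
    }


def profile_column(rows: list, column: str) -> list:
    """Add the statistics above to every row of a list-of-dicts dataset."""
    return [{**row, **profile_text(row[column])} for row in rows]


# Made-up sample data with a single text column named "text".
dataset = [
    {"text": "I love this talk!"},
    {"text": "Room 42 is closed, sorry."},
]

profiled = profile_column(dataset, "text")
for row in profiled:
    print(row)
```

Each row keeps its original text and gains one new field per statistic, which is the same "describe, but for text" shape the summary refers to. High-level measures such as sentiment or spelling quality need dedicated NLP libraries and are left out of this sketch.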
About the speaker:
Mani Sarkar is a passionate developer who currently works with small teams and startups as a freelance software, data and ML engineer, strengthening teams and helping them accelerate.
A Java Champion, JCP Member, OpenJDK contributor, thought leader in the LJC and other developer communities and involved with @adoptopenjdk, @graalvm and other F/OSS projects. Writes code, not just on the Java/JVM platform but in other programming languages as well, hence likes to call himself a polyglot developer. He sees himself working in the areas of core Java, Hotspot, GraalVM, Truffle, VMs, Performance Tuning, Data, AI/ML/DL and NLP.
An advocate of a number of agile and software craftsmanship practices and a regular at many talks, conferences and hands-on-workshops – speaks, participates, organises and helps out at many of them. Expresses his thoughts often via blog posts (on his own blog site, DZone, Medium and other third-party sites), and microblogs (tweets).
NLP resources
AI/ML/DL resources
Mani’s thoughts on many things AI/ML/DL/NLP
https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/
More about Mani
https://neomatrix369.wordpress.com/about
