Spark/NLP 2-4-1 Edition (formerly Spark Summit Edition)


Hi all, I am writing this invitation on behalf of the Spark Meetup AND the NLP Meetup (Sebastian)!!!

Spark Summit is in town!!!

And we have been able to secure one of the top speakers from the event to deliver a talk at the meetup, but ... wait ... there is more!!!

Sebastian wanted to run an NLP meetup on the same day, so we decided to join forces and create some synergy between the NLP group and the Spark group!!!

This means it is a Spark/NLP two-for-one edition (two meetups for the price of one), and we have a second speaker to talk about an NLP topic.

We are very excited about this. I will raise the max attendance to 100. Spread the word.

And see you on the 26th.

Regards ... Sebastian & Roland


Akmal Chaudhri, PhD from GridGain.

His talk, "Apache Spark and Apache Ignite: Where Fast Data Meets the IoT," will show attendees how to build a Fast Data solution that receives endless streams from the IoT side and processes them in real time using Apache Ignite's cluster resources.

In particular, attendees will learn about data streaming to an Apache Ignite cluster from embedded devices and real-time data processing with Apache Spark.

Abstract: It is not enough to build a mesh of sensors or embedded devices to obtain more insights about the surrounding environment and optimize your production systems. Usually, your IoT solution needs to be capable of transferring enormous amounts of data to storage or the cloud, where the data have to be processed further. Quite often, the endless streams of data have to be processed in real time so that you can react to the state of the IoT subsystem accordingly.


Yufang Hou, IBM

The title of her talk is "Information Status: Tasks, Datasets and Models."

Abstract: Information status (IS henceforth) describes the degree to which a discourse entity is available to the hearer regarding the speaker’s assumption about the hearer’s knowledge and beliefs. In this talk, I will first give a brief overview of research work on IS, including NLP tasks and datasets. Then I will focus on one subtask (i.e., fine-grained IS classification) and talk about two models which we developed recently for this task.