Speed Optimization & Benchmarks in Spark NLP 3: Making the Most of Hardware

Miami Hadoop User Group

Online event

Details

Register: https://events.johnsnowlabs.com/speed-optimization-benchmarks-in-spark-nlp-3-making-the-most-of-modern-hardware

Spark NLP is the most widely used NLP library in the enterprise, thanks to its production-grade, trainable, and scalable implementations of state-of-the-art deep learning and transfer learning NLP research. It is open source under the permissive Apache 2.0 license, officially supports Python, Java, and Scala, and is backed by a highly active community and JSL members.

The Spark NLP library implements core NLP algorithms including lemmatization, part-of-speech tagging, dependency parsing, named entity recognition, spell checking, multi-class and multi-label text classification, sentiment analysis, emotion detection, and unsupervised keyword extraction, as well as state-of-the-art Transformers such as BERT, ELECTRA, ELMo, ALBERT, XLNet, and Universal Sentence Encoder.

The latest release, Spark NLP 3.0, comes with more than 1,100 pretrained models, pipelines, and Transformers in over 190 languages. It also delivers massive speedups on both CPU and GPU devices while extending support to the latest computing platforms, such as new Databricks runtimes and EMR versions.

The talk will focus on how to scale Apache Spark / PySpark applications in YARN clusters, use GPUs in Databricks' new Apache Spark 3.x runtimes, and efficiently manage large-scale datasets in resource-demanding NLP applications. We will share benchmarks, tips and tricks, and lessons learned from scaling Spark NLP.