Next Meetup

Visual Spark Development with KNIME
Abstract: The KNIME Analytics Platform is the leading open and open source solution for data-driven innovation. KNIME can be used to help discover the potential hidden in your data, mine for fresh insights, or predict new futures. KNIME provides a visual development environment that enables data scientists of varying experience levels to quickly build complex solutions. The visual environment also enables collaboration with peers and other groups who may not be as technically savvy. KNIME nodes (functions) provide a wide variety of capabilities, including sourcing data, data transformation, and modeling using a variety of algorithms. KNIME also integrates with Python, R, Spark, H2O, and deep learning, all on an open source platform.

The Big Data Extensions integrate the power of Apache Hadoop and Spark with the ease of use of the KNIME Analytics Platform. They consist of two complementary node libraries. KNIME Big Data Connectors enable you to import and export HDFS data and perform SQL analytics within Hive and Impala through a series of KNIME nodes. KNIME Extension for Apache Spark enables you to create and run Spark jobs for data transformation and model learning through another set of KNIME nodes.

In this talk we'll provide a quick overview of the KNIME Analytics Platform, then jump right into building Spark workflows using Hive and HDFS-based data. The examples and demonstrations will illustrate using a visual environment to build machine learning workflows that execute on a Hadoop cluster using Spark. KNIME also enables mixing visual development with coding using the Spark SQL and Java Snippet nodes.

Bio: Jim has worked with KNIME for the past year, helping to get its US-based operations up and running. His work includes evangelizing the KNIME open source platform and supporting customers through their journeys in data science. Jim has a background in both data science and computer science, including building a dataflow-based, distributed computation platform for deep data analysis (similar to Spark).

Agenda:
6:30 Food + Networking
7:00 Presentation + Q&A

Location: Visa: 12301 Research Blvd, Bldg 3, Austin, TX 78759, United States

RSVP:
• Seating is limited to the first 75 to RSVP.
• Please let me know if you have any questions about RSVP.


Respond by: 6/19/2018

What we're about

Austin chapter of the Association for Computing Machinery (ACM) Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), the premier professional society for machine learning and data mining. This group specializes in Hadoop-based big data machine learning.