Big Data and Machine Learning - London - Meetup #10


Details
PLEASE NOTE: Limit of 130 attendees!
Meetup #10
Welcome to Meetup #10, and what we hope will be another interesting evening of presentations and lightning talks. The agenda is listed below, followed by further details about the main presentations and their presenters.
Should you wish to contact me, email me at mark.whalley@microfocus.com.
Kindest regards
Mark
---
Agenda
18:30 Doors open and networking
18:55 Welcome
Mark Whalley (5 mins)
19:00 Effectively Scaling with Machine Learning: HBase based Big-Data Reporting
Dragan Milosevic (30 mins)
19:30 Shout Out: Introducing a Hands-On Vertica Workshop
Mark Whalley (10 mins)
19:45 Unstructured Big Data Analytics – Stories from the trenches
Michael McGrath (40 mins)
20:30 Networking
21:30 Close
Effectively Scaling with Machine Learning: HBase based Big-Data Reporting
Dragan will describe the fully operational big-data system that they have built, which implements the lambda architecture with Kafka as the input source and HBase as the output store. The stream-processing layer is a Spring Boot application that reads tracking events from Kafka, denormalizes them and inserts them into HBase. The batch-processing layer uses Oozie to orchestrate MapReduce jobs that retrieve events from Kafka, aggregate them and bulk-load them into HBase. A machine-learning optimization over collected queries is used to design the HBase schema, which, together with Endpoint coprocessors that build reports on the region-server side, makes it possible to execute expensive aggregation queries in real time.
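For readers new to the lambda architecture, the split between the two layers described above can be sketched roughly as follows. This is only an illustrative sketch, not the system presented in the talk: plain Python dictionaries stand in for Kafka and HBase, and the event fields, the PUBLISHERS lookup and the row-key layout are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup data used to denormalize raw tracking events.
PUBLISHERS = {1: "news-site", 2: "blog"}


def denormalize(event):
    """Attach the publisher name so a report needs no join at read time."""
    name = PUBLISHERS.get(event["publisher_id"], "unknown")
    return {**event, "publisher_name": name}


def row_key(event):
    """Hypothetical row key: zero-padded publisher id, then day."""
    return f"{event['publisher_id']:06d}|{event['day']}"


def stream_layer(events, table):
    """Speed layer: store each denormalized event as soon as it arrives."""
    for event in events:
        table[row_key(event)].append(denormalize(event))


def batch_layer(events):
    """Batch layer: aggregate all events into per-key click totals."""
    totals = defaultdict(int)
    for event in events:
        totals[row_key(event)] += event["clicks"]
    return dict(totals)


# Made-up sample events standing in for messages read from Kafka.
events = [
    {"publisher_id": 1, "day": "2018-03-01", "clicks": 3},
    {"publisher_id": 1, "day": "2018-03-01", "clicks": 2},
    {"publisher_id": 2, "day": "2018-03-01", "clicks": 7},
]

speed_table = defaultdict(list)  # stand-in for the table fed by the stream layer
stream_layer(events, speed_table)
batch_totals = batch_layer(events)  # stand-in for the bulk-loaded aggregates
```

In this sketch the row key groups one publisher's rows by day, which is the kind of schema decision the abstract says is driven by an optimization over collected queries; the actual schema is not given in the abstract.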
Dragan Milosevic
Dr. Dragan Milosevic is a certified Solr/Lucene, Hadoop and HBase developer and currently works as Chief Search Architect at Awin AG. The firm has successfully implemented several Apache open-source projects for building a world-class reporting framework. He is also the author of the book "Beyond Centralized Search Engines: An Agent-Based Filtering Framework," which describes the application of various machine-learning techniques for solving cooperation and coordination challenges in distributed systems.
Shout Out: Introducing a Hands-On Vertica Workshop
Through 2017 and into the start of 2018, the BD&ML (London) Meetup has hosted a series of presentations on Tracking Commercial Aircraft in Near Real-Time using a Raspberry Pi, Kafka and Vertica.
Following a number of requests from Members to know more about Vertica, this “Shout Out” provides details of a series of Hands-On Workshops that are being planned in April. The first will cover installing the Community Edition of Vertica on a VM, with a quick guide to getting started.
Mark Whalley
From the early 1980s, Mark worked with Michael Stonebraker's Ingres RDBMS and then a number of column-store big data analytic technologies. In 2016, he joined HPE Big Data Platform as a Systems Engineer specializing in Vertica and Vertica SQL in Hadoop, and from September 2017 followed Vertica as it moved over to Micro Focus.
Mark frequently delivers talks at the London, Cambridge and Munich Big Data & Machine Learning Meetups, the British Computer Society - Advanced Programming Specialist Group, Vertica Forums and elsewhere.
Unstructured Big Data Analytics – Stories from the trenches
This talk presents the work to date and the next steps for Digital Safe and Investigative Analytics (IA), using IDOL and Vertica. It covers the scale problem experienced (customers with up to 4.5 PB of unstructured data), the business problem (massive numbers of false positives), the solution (IA), the results, and the future use cases we are investigating, such as sales optimisation.
Dr. Michael McGrath
Michael is Chief Strategist for Micro Focus’s Information Archiving and Discovery Division. He is an award-winning researcher and has been involved in technology-enabled change for over 20 years, primarily in financial services. There is a longer profile at https://www.linkedin.com/in/michaelmcgrath/
