Math-y Data Science

For the March Data Science DC Meetup, we're very happy to have two talks that will dive into the mathematical details. First, Brandon Bass from Navigant Energy (http://www.navigant.com/energy/) will talk about heuristics in optimization, a fundamental component of many machine learning algorithms and a highly useful set of techniques in their own right. Then, Biplab Pal and Mark Silverman from Treeminer (http://www.treeminer.com/) will talk about an innovative mathematical re-representation of data that allows dramatically faster classification and clustering. Like math? We do too. Join us to learn some new things.

NOTE: We're up in Microsoft's offices in Chevy Chase this month! If you're taking the Metro, exit North from the Friendship Heights stop. If you're driving, there's a garage underneath the building.

Agenda:

• 6:30pm -- Networking, Empanadas, and Refreshments

• 7:00pm -- Introduction, Announcements, Give-aways

• 7:15pm -- Presentations and Discussion

• 8:30pm -- Data Drinks (TBA)

Talks:

Heuristic Optimization

Lots of people use optimization, but do you know what's going on under the hood? Using the right algorithm for your data can drastically improve the accuracy and speed of your analysis. But what if you're unsure of the theoretical properties of your data, such as its convexity or continuity? Heuristic methods of optimization can come to the rescue. We will go through a range of algorithms, from those modeled on how genes are passed between generations to methods inspired by animal herd movement. The concepts and mathematical basis of Greedy Search, Simulated Annealing, and Genetic Algorithms will be emphasized, and, time permitting, we will discuss a few tricks, parallelization, and the "No Free Lunch" Theorem.

Brandon Bass is a Senior Consultant at Navigant Energy, working on the development of data-driven web and platform applications and the expansion of Navigant's proprietary energy market models. Before Navigant, Brandon worked with Altenex, a Boston-based small business in the energy space, to develop their power market forecasting and risk analytics engine. Brandon completed his Master's in Engineering in Power Systems Engineering and Optimization, and his B.S. in Environmental Engineering, both at Cornell University. Follow him on Twitter @brandon_d_bass (https://twitter.com/Brandon_D_Bass).

Vertical Algebra and Machine Learning

Organizations today face challenges in performing predictive classification and clustering, driven by the volume, speed, and variety of data. We are pioneering a new approach to these tasks, one based on a novel vertical organization of data. This approach has significant scalability advantages over traditional approaches, and has been implemented in Big Data environments like Hadoop and Storm. We’ll review the mathematical foundations of vertical data mining (vertical “algebra”), and some example classification and auto clustering algorithms that have been implemented using vertical “algebra”. We’ll also review some applications where vertical data organization can provide great benefit.

Mark Silverman is the Founder and CEO of Treeminer, Inc., a start-up company tackling the problem of large dataset analytics using Vertical Data Mining techniques. Mark has a diverse technology background, including Data Mining and Analytics, IT software and infrastructure, Networking and Security, and Wireless technologies. He co-founded Treeminer with Dr. William Perrizo to bring to market a novel approach to solving the problems of data mining in large datasets. Mark has been intimately involved with the development of the software and algorithms that are at the heart of Vertical Data Mining. Mark received his Bachelor’s Degree from Columbia University.

Biplab Pal is CTO of Autopredictivecoding LLC (http://www.autopredictivecoding.com/) and Product/Application Development Manager for Treeminer Inc. Biplab has a PhD in Telecommunication Engineering and more than 16 years of experience modeling sensor and telecom system engineering data. He has developed a Big Data platform for IoT in mid-sized manufacturing for MRO (Maintenance, Repair & Operation) using Treeminer's new classification technology.

Sponsors:

This event is sponsored by Microsoft (http://www.microsoft.com/en-us/default.aspx), Statistics.com (http://bit.ly/12YljkP), Elder Research (http://datamininglab.com/), Novetta Solutions (http://novetta.com/), and Pearson/InformIT (http://www.informit.com/). (Would your organization like to sponsor too? Please get in touch!)

Microsoft Chevy Chase Office
5404 Wisconsin Ave, Suite 5186 · Chevy Chase, MD