Date: Tuesday, March 19th. Time: 6:30 pm
Location: DUMBO Startup Lab 68 Jay St. Suite 718, Brooklyn, NY
6:30-6:45 – Gathering and introductions.
6:45-7:30 – Big Data Apps in the Cloud: What It Really Takes.
7:30-8:15 – Matrix factorizations for data analysis.
8:15-8:40 – Cloud Leads & Networking.
- Big Data Apps in the Cloud: What It Really Takes -
Big data applications need massive computing and storage resources, which makes on-demand, elastic cloud environments an ideal fit. However, managing a big data app in the cloud is no walk in the park: configuration, orchestration, high availability, and auto-scaling are all complex, and so is choosing the right cloud for you, whether public, private, or hybrid. This is where Cloudify and Eucalyptus come together. In this session, you'll learn how to deploy, manage, monitor, and scale your big data apps on public clouds such as EC2 and on private clouds such as the open source Eucalyptus cloud platform, and how to test drive your apps locally and then migrate the workload to Amazon Web Services EC2.
- Matrix factorizations for data analysis -
Topic: A large number of "big data" problems can be phrased in terms of cleaning up noisy matrices. For example, in the "Netflix problem", you are given a matrix with rows indexed by movies and columns indexed by users, where each entry represents how much a user likes a movie. However, most of the entries of the matrix are empty, and some of the filled entries are corrupted by malicious noise. Your job is to fill in all the empty entries while deciding which filled entries are untrustworthy.
These sorts of problems can often be approached by matrix factorizations, that is, writing the matrix as a product of two (or more) matrices with special structure. I will give an introduction to some of the basic methods used in matrix factorization, and pointers to code you can use in your own projects.
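As a rough illustration of the idea above (not the speaker's method or code), here is a minimal sketch that fills in the empty entries of a ratings matrix by fitting a low-rank product U V^T to the observed entries with alternating least squares. The function name `complete_matrix` and the parameters `rank`, `reg`, and `iters` are hypothetical choices made for this example, and the sketch assumes the observed entries are trustworthy, so it does not try to detect malicious noise.

```python
import numpy as np


def complete_matrix(M, mask, rank=2, reg=0.1, iters=50, seed=0):
    """Estimate all entries of M from the observed ones (mask == True).

    Fits M ~ U @ V.T using only observed entries, alternating between
    ridge-regression updates for the rows of U and the rows of V.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = M.shape
    U = rng.normal(scale=0.1, size=(n_rows, rank))  # one factor row per movie
    V = rng.normal(scale=0.1, size=(n_cols, rank))  # one factor row per user
    ridge = reg * np.eye(rank)  # keeps every small linear system invertible

    for _ in range(iters):
        # Fix V and solve for each row of U from that row's observed entries.
        for i in range(n_rows):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ M[i, obs])
        # Fix U and solve for each row of V from that column's observed entries.
        for j in range(n_cols):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ M[obs, j])

    return U @ V.T
```

To use it, pass a matrix `M` (values at unobserved positions are ignored) and a boolean `mask` of the same shape marking which entries are filled; the returned matrix has an estimate at every position. Handling untrustworthy entries would need a robust variant, which this sketch leaves out.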
Bio: Arthur Szlam is an assistant professor in the math department at City College. He was previously a postdoc at NYU in the vision and learning group, and before that a postdoc at UCLA.
Bio: Ali Hodroj is a Solutions Architect at GigaSpaces working with small and large organizations within financial services, retail, telecommunications, and internet media on their Cloud Computing, In-Memory Computing, and Scalability use cases. He also enjoys co-authoring the Cloudify enterprise training curricula and sharing the knowledge with fellow cloud users. Prior to joining GigaSpaces, Ali spent his days as a data and healthcare analytics ninja at GE Healthcare trying to make the world better by writing code to extract healthcare-improving insights from billions of patient records through data mining, big data, and business intelligence technologies.