Züri Machine Learning Meetup #8


Details
We're happy to be hosted at UZH this time for three exciting talks. And we're still looking for an apéro sponsor!
• Processing Large Graphs at Warp Speed
Abraham Bernstein (http://www.ifi.uzh.ch/ddis/people/bernstein.html), Professor at Department of Informatics, UZH (http://www.ifi.uzh.ch/)
Abstract: Large graphs are ubiquitous and need to be processed everywhere. Unfortunately, most middleware architectures are aimed at tables of key/value pairs. The Web of Data, for example, has grown immensely over the past years: from a single dataset in 2007, the linked portion of the Open Data Cloud has grown to over 31 billion triples (as of 2011) usually shown in the diagrams, plus a plethora of open datasets published by individuals, organizations, and governments all over the world that usually are not shown. Given this immense growth, the question arises how to process these data. Even at 10’000 triples per second, processing the whole cloud would take more than 861 hours, so algorithms traveling (or traversing) the linked data cloud using conventional methods are going to be slow. In this talk I will present some methods to swiftly process large graphs in a distributed setting.
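As a sanity check, the 861-hour figure follows directly from the numbers quoted in the abstract (31 billion triples at 10’000 triples per second):

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
triples = 31_000_000_000  # linked Open Data Cloud, ca. 2011
rate = 10_000             # triples processed per second

hours = triples / rate / 3600
print(f"{hours:.0f} hours")  # prints "861 hours"
```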
• I'm Rubber, you're Data: A short Introduction to Topological Data Analysis
Simon Pepin Lehalleur (https://www.math.uzh.ch/index.php?id=assistenten&L=&key1=5699&key2=&key3=&keySemId=), PhD Student at UZH
Abstract: Statistical models and machine learning techniques are built and analysed via a rich blend of mathematical disciplines: linear algebra, multivariate calculus, probability, graph theory, etc. Topological Data Analysis (TDA) is a new framework for data analysis built on ideas from topology, an important branch of pure mathematics. In a nutshell, topology is the study of the properties of shapes that are invariant under continuous deformation. In particular, algebraic topology associates computable, linear invariants to spaces, which can be used to "count the number of holes of various dimensions". In TDA, these invariants are computed for certain spaces naturally associated to datasets (in the form of point clouds or distance matrices). This provides topological summaries of the "shape of the data", which can then be used either for data visualization and exploration or as input for other ML algorithms. TDA methods are being used for production data analysis problems, both in scientific research and by the rapidly expanding start-up Ayasdi.
In this talk, I will present the ingredients of topology used in TDA and demonstrate two basic algorithms: Mapper (developed by the founders of Ayasdi) and Persistent Homology.
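To make the Mapper idea concrete ahead of the talk, here is a minimal toy sketch (not Ayasdi's implementation; the interval cover, the crude clustering rule, and all parameter values are illustrative choices): cover the range of a filter function with overlapping intervals, cluster the points falling into each interval, and link clusters that share data points.

```python
import math

def threshold_clusters(pts, idx, eps):
    """Connected components of the eps-neighbourhood graph (crude single linkage)."""
    seen = [False] * len(idx)
    comps = []
    for s in range(len(idx)):
        if seen[s]:
            continue
        stack, comp, seen[s] = [s], set(), True
        while stack:
            u = stack.pop()
            comp.add(idx[u])  # store global point indices
            for v in range(len(idx)):
                if not seen[v] and math.dist(pts[u], pts[v]) < eps:
                    seen[v] = True
                    stack.append(v)
        comps.append(comp)
    return comps

def mapper(points, filt, n_intervals=6, overlap=0.3, eps=0.6):
    """Toy Mapper: cover the filter range with overlapping intervals,
    cluster each preimage, and link clusters that share points."""
    lo, hi = min(filt), max(filt)
    step = (hi - lo) / n_intervals
    nodes = []
    for i in range(n_intervals):
        a = lo + i * step - overlap * step
        b = lo + (i + 1) * step + overlap * step
        idx = [k for k, f in enumerate(filt) if a <= f <= b]
        if idx:
            nodes += threshold_clusters([points[k] for k in idx], idx, eps)
    edges = {(j, i) for i, u in enumerate(nodes)
             for j, v in enumerate(nodes[:i]) if u & v}
    return nodes, edges

# Point cloud on a circle, filtered by the x-coordinate: the Mapper
# graph should close up into a loop, reflecting the circle's one "hole".
cloud = [(math.cos(2 * math.pi * k / 60), math.sin(2 * math.pi * k / 60))
         for k in range(60)]
nodes, edges = mapper(cloud, [p[0] for p in cloud])
print(len(nodes), "clusters,", len(edges), "links")
```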
• Randomized Methods in Optimization
Sebastian U. Stich (http://people.inf.ethz.ch/sstich/), Postdoc at ETH Zurich
Abstract: Randomized methods are ubiquitous in all areas of optimization. For instance, in (i) high-dimensional optimization, the choice of random search directions allows one to trade off the efficiency of an update against its computational cost (e.g. Random Coordinate Descent). Or in (ii) applications where the gradient cannot easily be accessed (e.g. parameter tuning), randomized methods can work by querying only function values instead. In both settings, the convergence of the randomized schemes is fastest if the sampling distribution of the search directions fits the underlying metric of the optimization problem (as in variable metric schemes of second-order optimization).
In this talk, we review complexity results for first- and zeroth-order optimization algorithms. We will especially focus on randomized schemes, such as Nesterov's accelerated random gradient methods for setting (i), or the class of algorithms termed "Evolution Strategies" designed for (ii). We will not only present some applications but also a few theoretical results. For instance, we will quantify the influence of the sampling distribution on the convergence rate.
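As a small illustration of setting (i), here is a sketch of plain random coordinate descent on a separable quadratic (the objective, step size, and step count are made up for the example; this is not the accelerated variant discussed in the talk). Each step touches only one randomly chosen coordinate, so an update is cheap even in high dimensions:

```python
import random

def random_coordinate_descent(grad_i, x, n, lr=0.1, steps=2000):
    """Minimise f by updating one randomly chosen coordinate per step,
    using only that coordinate's partial derivative (cheap per update)."""
    for _ in range(steps):
        i = random.randrange(n)      # uniform sampling of the search direction
        x[i] -= lr * grad_i(x, i)    # update a single coordinate
    return x

# Example objective: f(x) = sum_j (j+1) * x_j^2, whose partial
# derivative in coordinate i is 2 * (i+1) * x_i; the minimiser is 0.
random.seed(0)
x = random_coordinate_descent(lambda x, i: 2 * (i + 1) * x[i], [5.0] * 4, 4)
print(x)  # all coordinates have shrunk toward the minimiser 0
```

Sampling coordinates non-uniformly (e.g. proportionally to their curvature, here `j+1`) is one way to "fit the sampling distribution to the metric of the problem" mentioned in the abstract.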