Züri Machine Learning Meetup #3

  • April 23, 2014 · 6:30 PM

Here is the schedule for our 3rd meetup. Looking forward to it! And many thanks to Google for hosting us this time.

• Scalable Probabilistic Entity-Topic Linking
Massimiliano Ciaramita, Research Scientist at Google Zürich

Entity linking involves labeling phrases in text with the Id of the entity they refer to, e.g., from Wikipedia or Freebase. This task is challenging because the number of possible entities runs into the millions and mention ambiguity is heavy-tailed. We formulate this problem in terms of probabilistic inference within a topic model (LDA), where each topic is associated with an entity Id. To scale, we propose an efficient Gibbs sampling scheme. This conceptually simple approach achieves state-of-the-art performance on a popular benchmark and can be easily extended to a distributed learning framework.

• Introduction to Speech Recognition
Paul R. Dixon, Research Scientist at Yandex

Speech recognition is a difficult problem that requires knowledge from areas including signal processing, machine learning, algorithms and linguistics. The aim of this presentation is to give an accessible introduction to the algorithms and techniques used for creating state-of-the-art spoken language systems. Recently, a large number of high quality open source toolkits for speech recognition have become available. These will be of interest to the larger community because many of the algorithms and tools that are used for speech recognition can be applied to other machine learning and language processing tasks.

• Randomized Linear Regression: A brief Overview and Recent Results
Brian McWilliams, Postdoc at ETH Zürich

Linear regression is an important and ubiquitous tool in machine learning, statistics and data analysis. However, standard solvers for ordinary least squares scale poorly to large datasets. Recently, randomized algorithms based on subsampling the dataset have been proposed which recover good approximate solutions much faster than standard LAPACK routines. I will give a brief overview of the ideas behind these techniques.
At the same time, the assumptions underlying linear regression are known to be unrealistic. We introduce a new statistical model which assumes that some of the datapoints we observe are corrupted. We propose a random subsampling algorithm which is able to identify the corrupted datapoints so that they are sampled with low probability. We show an application of our algorithm to the problem of predicting flight delay time.
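
To make the subsampling idea concrete, here is a minimal sketch, not the speaker's algorithm: it fits ordinary least squares on a uniformly drawn subset of the rows and compares the result with a fit on all rows. Uniform sampling is the simplest possible choice and is an assumption of this sketch; the methods described in the abstract use more careful sampling schemes, and the corrupted-datapoint model is not reproduced here. All names (`least_squares`, `subsampled_least_squares`, the synthetic data) are hypothetical, and the solver uses the normal equations in pure Python instead of LAPACK to stay dependency-free.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def subsampled_least_squares(rows, targets, sample_size, rng):
    """Fit OLS on a uniform random subset of the rows (without replacement)."""
    idx = rng.sample(range(len(rows)), sample_size)
    return least_squares([rows[i] for i in idx], [targets[i] for i in idx])


# Synthetic data (an assumption for illustration): y = x . true_w + small noise.
rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]
rows = [[rng.gauss(0, 1) for _ in true_w] for _ in range(5000)]
targets = [sum(w * x for w, x in zip(true_w, r)) + rng.gauss(0, 0.1) for r in rows]

w_full = least_squares(rows, targets)
w_sub = subsampled_least_squares(rows, targets, 500, rng)
print("full fit:      ", w_full)
print("subsampled fit:", w_sub)
```

On clean, well-conditioned data like this, a fit on 10% of the rows lands close to the full solution at a fraction of the cost. Uniform sampling can fail badly when a few rows dominate the fit, which is the situation the more refined sampling schemes are meant to address.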

- Important: Please update your RSVP if you change your plans! Places are limited to 111 this time.
- Please get in touch if you would like to present something at one of the future meetups, or if you have ideas about topics, speakers, or locations! And we're still urgently searching for companies willing to sponsor a small apéro at future events from May on :-)


  • Rakan D.

    Hello guys, are you going to upload videos for the presentations like last time?

    April 26

    • Rakan D.

      Hi Martin, I only saw one video uploaded. Are you going to upload the other videos?

      May 21

    • Martin J.

      we'd like to. but for the two other talks, this depends on whether we get legal approval, and then on whether somebody can sync the PDF slides to the audio...

      May 22

  • Dan F.

    Regarding the last presentation, on randomized linear regression (also my favorite one!), I wonder about the Hadamard matrix used.

    Rambling ahead:

    What about other matrices for the projection?

    I recall using some Gaussian vectors for a fast nearest-neighbors implementation.
    The idea was that if the components are sampled i.i.d. from N(0, 1) and the resulting vectors are normalized, they're a good basis to project onto.

    Also, there was talk of a MATLAB package, but what about R?

    Finally, can we have the slides? :)

    April 24

    • Brian

      Here is a great lecture by Gilbert Strang on the Fast Fourier Transform. The SRHT follows the same principle:


      April 25

    • Quim

      I think the fact of the matrix being recursive is only important because then it can be applied to vectors of any size. About why this shape? I don't know but it does look similar to the one used in FFT, it's a super cool super fast projection to a super fancy space where everything is cooler and easier.

      April 25

  • Martin J.

    here are the slides of the two later talks:
    Introduction to Speech Recognition (Paul R. Dixon): http://goo.gl/Sy0v2b
    Randomized Linear Regression (Brian McWilliams): http://goo.gl/IN2bpf

    stay tuned for videos and the NLP slides... thanks to everybody who joined us yesterday! (and as a reminder: if an event has a cap, please update your RSVP if it turns out you cannot make it. we had 21 people who failed to show up but didn't free their spot, while about as many were still waiting to get a spot, which is annoying. usual meetup problem i guess, but you get the idea...)

    2 · April 24

  • Erik V.

    I loved the last presentation. The other ones were good as well, but hard to follow because the audio transmission was constantly fading away.

    1 · April 24

    • Martin J.

      yes i'm really sorry about the audio problem in the back of the room. hope this will be better next time.

      April 24

  • Roy R.

    Great speaker line-up and content. Would love an opportunity to get to know attendees (ahead of or after) in future events. Thank you for organizing!

    April 23

  • A former member

    Big apologies, I was on my way this evening and had to turn around to fix an emergency. So sad to miss out on being there tonight :'(

    April 23

  • Dragica K.

    Sorry can't make it in time.

    April 23

  • Lukas G.

    Hi, is anyone planning to go from Basel?

    April 2

    • Roy R.

      Lukas, I'll be coming by train from Rheinfelden but earlier in the day.

      April 22

    • Lukas G.

      thanks, but that will not help me :(

      April 22

  • Marc-Philippe H.

    Hello, will the presentations and discussions be in English or in German? Thanks for the answer. Regards

    March 31, 2014

    • Martin J.

      As we have many German and English speakers attending, communicating in English makes things easier. But we could also imagine having a presentation or discussion in German for a change.

      2 · March 31, 2014
