After going deep into the Earthquake Damage Prediction Competition, we are taking a slight turn and diving into clustering techniques. Clustering can be a great way to utilise unlabelled data, and it is potentially very powerful for the competition we have just looked at.
Please spend some time looking into one or more clustering techniques before the meetup and, optionally, try applying them to the earthquake prediction dataset (possibly by making better use of the geolocation data): https://www.drivendata.org/competitions/57/nepal-earthquake/page/136/
A good place to start looking into clustering techniques is the scikit-learn user guide: https://scikit-learn.org/stable/modules/clustering.html
Please leave a comment naming the technique you plan to look into so that we can all explore separate methods. So far we have:
- Agglomerative Clustering (Tony)
- DBSCAN (Alan)
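If you want a quick starting point, here is a minimal sketch that runs both of the techniques listed above with scikit-learn. It uses made-up 2D points rather than the competition data, and the parameter values (`n_clusters=3`, `eps=0.5`, `min_samples=5`) are illustrative assumptions, not recommendations for the earthquake dataset.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, DBSCAN

# Synthetic stand-in data: three tight blobs of 50 points each.
# These are NOT the competition features, just a toy example.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + rng.normal(scale=0.1, size=(50, 2)) for c in centres])

# Agglomerative clustering needs the number of clusters up front.
agglo_labels = AgglomerativeClustering(n_clusters=3).fit_predict(points)

# DBSCAN infers the number of clusters from density; label -1 marks noise.
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(points)

n_agglo = len(set(agglo_labels.tolist()))
n_dbscan = len(set(dbscan_labels.tolist()) - {-1})
print("Agglomerative clusters:", n_agglo)
print("DBSCAN clusters (excluding noise):", n_dbscan)
```

One idea to try on the real data is to add the resulting cluster label as an extra feature for your model, but whether that helps is something to experiment with.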
As always, we encourage you all to spend some time investigating and experimenting with these techniques. It's fine if you don't ultimately succeed with them; the purpose of the meetup is to talk about what you did and to ask questions.
Have a think about any research papers or Kaggle competitions you would like to discuss in future meetups. We will spend a few minutes discussing this at the end.
6:00 - 6:30 - Arrive at venue
6:30 - Introductions
6:35 - Discussion
7:25 - Discuss and decide on next topic
7:30 - Meeting ends
A big 'Thank you' to GridAKL for providing the venue!