25.04.19 Why R? Pre-meeting: Segmentation with NMF + R in palaeontology

Hosted By
MarcinKosinski

Details

Room 329, Faculty of Mathematics and Information Science

This pre-meeting promotes

Plan
18:00 - 18:05 Opening
18:05 - 18:35 Marcin Kosinski: Segmentation with NMF decomposition
18:35 - 19:00 break
19:00 - 19:30 Jakub Nowicki: Using R in palaeontology
20:00 - afterparty TBA

# Descriptions

## Segmentation using NMF decomposition

A modern segmentation is expected to have the following properties:

  • it should be balanced,
  • segments should be distinctive,
  • the over- and under-indexed features discovered within segments should tell a meaningful story,
  • and, in the best case, the number of differentiating factors that drive the segmentation should be small.

The last requirement is often a bottleneck in surveys where respondents are asked an enormous number of questions.

One of many possible solutions for this use case is nonnegative matrix factorization (NMF), which segments both respondents and their features in a single pass!

I'll present the concept of NMF decomposition and its applications in R, and explain the diagnostic plots.

Working with high-dimensional data? Often need to group observations? Then this presentation is for you.
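
As a rough illustration of how one factorization can segment respondents and features at the same time, here is a minimal sketch. It is not the talk's material (which uses R): it is written in dependency-free Python, the survey matrix is hypothetical, and the Lee-Seung multiplicative update rule is only one of several ways to fit an NMF. The helper names (`nmf`, `matmul`, `argmax`) are illustrative.

```python
import random

def matmul(A, B):
    """Plain-Python product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (n x m, nonnegative) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start, so no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical survey: 6 respondents (rows) x 4 questions (columns).
V = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 4, 5],
    [0, 0, 5, 4],
    [0, 0, 5, 5],
]

W, H = nmf(V, rank=2)

# Each respondent goes to the component with the largest weight in its row of W;
# each question goes to the component with the largest weight in its column of H.
respondent_segment = [argmax(row) for row in W]
question_segment = [argmax(col) for col in zip(*H)]

print("respondent segments:", respondent_segment)
print("question segments:  ", question_segment)
```

Reading segments off the largest entry of `W` and `H` is a simplification that works for this clean toy matrix. On real survey data the component scales are arbitrary, so the factors would normally be normalized before comparison.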

## Using R in palaeontology

In recent years, data science (and R) has become increasingly popular in various areas of research. Palaeontology is probably one of the most exotic of these domains. I'll show you a case study (coming directly from my PhD research) in which domain expertise is combined with a data science approach.
Trilobites are among the most fascinating extinct animals. Various shapes, sizes, environments, modes of life - all within one extinct group. Describing their systematics can lead to broader conclusions about the shape of the whole ancient world. Trilobites from the Holy Cross Mountains are an excellent example of this. The area lay between two major palaeocontinents - Baltica and Gondwana - and its exact position during the Cambrian is still unclear.
Studying trilobites can help resolve this puzzle, but such old fossils suffer from various deformations, so their true diversity is hidden behind 500 million years of history. In such cases, data science tools can help remove the noise. I'll show you how to use Principal Component Analysis with landmark-type data, how to remove bias, and how to interpret the results.
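
To give a feel for what PCA on landmark-type data involves, here is a small sketch under loud assumptions. The specimens are hypothetical outlines with four landmarks each, they are assumed to be already aligned (no superimposition or deformation-correction step is shown), and power iteration stands in for a library PCA routine. It is written in plain Python rather than R, and it is not the method from the talk.

```python
import math
import random

def first_principal_component(rows, n_iter=100, seed=0):
    """Return (direction, scores) of the first principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the flattened coordinates (d x d).
    C = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)]
         for a in range(d)]
    # Power iteration: repeatedly multiply by C and renormalize.
    rng = random.Random(seed)
    v = [rng.random() + 0.5 for _ in range(d)]
    for _ in range(n_iter):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Fix the sign so that the largest loading is positive.
    if max(v, key=abs) < 0:
        v = [-c for c in v]
    scores = [sum(x[j] * v[j] for j in range(d)) for x in X]
    return v, scores

def specimen(stretch):
    """Hypothetical specimen: 4 (x, y) landmarks, flattened to 8 numbers."""
    landmarks = [(0.0, 0.0), (1.0, 0.0),
                 (1.0, 2.0 * stretch), (0.0, 2.0 * stretch)]
    return [coord for point in landmarks for coord in point]

# Five made-up specimens that differ only in how elongated they are.
stretches = [0.8, 0.9, 1.0, 1.1, 1.2]
rows = [specimen(s) for s in stretches]

direction, scores = first_principal_component(rows)
print("loadings:", [round(c, 3) for c in direction])
print("scores:  ", [round(s, 3) for s in scores])
```

In this toy data the only variation is elongation, so the first component loads solely on the y coordinates of the two upper landmarks, and the scores order the specimens from shortest to longest.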

# Speaker bios

Jakub Nowicki holds a PhD in Earth Sciences. He is an expert in morphometry and Cambrian trilobites. He currently works at Appsilon as a Software Engineer.

Marcin Kosiński has a master's degree with a specialty in Mathematical Statistics and Data Analysis. He is a challenge seeker and devoted R language enthusiast. In the past, he was keen on large-scale online learning and various approaches to personalized news article recommendation.
Community events host: organizer of the Why R? conferences.
He is interested in R package development and survival analysis models. He currently explores and improves methods for quantitative marketing analyses and global surveys at Gradient Metrics.

# Sponsors

The event will be sponsored by Appsilon.

Appsilon delivers the most advanced R Shiny apps, data science consulting services and support with R Shiny and Python Dash technologies.
www: https://appsilon.com
fb: https://fb.com/appsilondatascience
twitter: https://twitter.com/appsilonds
linkedin: https://linkedin.com/company/appsilon/

R Users & R-Ladies Warsaw (Spotkania Entuzjastów R)