[Online] Lessons from COVID-19: Non-random Missing Data and Its Consequences

Hosted by Beryl K. and Reshama S.

Details

## How to join the webinar:

NOTE: You can join via your browser (no app download required). Use Chrome or Firefox.
Pre-register for the webinar:
https://www.bigmarker.com/neo4j/Data-Umbrella-Webinar

--------------------------------
Video Recording
--------------------------------
This event will be recorded and posted to our YouTube channel, usually within 24 hours of the event. Subscribe to the channel and turn on notifications:
https://www.youtube.com/c/DataUmbrella/

--------------------------------
Time
--------------------------------
16:00 UTC
9 am PT / 12 pm ET / 7 pm EAT / 9:30 pm IST

## Speaker

Mitzi Morris

## Talk Level

Intermediate

## Pre-reqs

Familiarity with statistical modeling, ideally survey statistics and Bayesian inference.

## Prep Work

  1. Modeling rates of disease with missing categorical data
  2. Multiple Imputation of Missing Data

## Resources

https://github.com/rtrangucci/epi-missing-data

## Event

A fundamental challenge for survey and observational datasets is that not all records in the dataset are complete; key pieces of information may be missing.

In this talk I work through the models and methods from the paper "Modeling Racial/Ethnic Differences in COVID-19 Incidence with Covariates Subject to Non-Random Missingness."

The authors write:
In emergency situations, such as a surging pandemic, it is easy to see how the disease process itself may induce non-random missingness of covariates. For example, during a period of rapidly increasing caseloads, such as the Delta and Omicron surges of the COVID-19 pandemic, the overwhelming number of cases is likely to limit the ability of case investigators to collect data that are as detailed as those collected during lower-incidence periods. These differences may also be more pronounced when comparing wealthier and poorer jurisdictions with differential resources for case-finding and intervention.

Using the Stan language and the CmdStanR interface, together with a simulated dataset of COVID-19 cases and population demographics in which age, gender, race/ethnicity, and neighborhood have varying degrees of missingness, we will demonstrate how different approaches to handling missing data produce different estimates of COVID-19 prevalence among key demographic groups.
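To give a rough sense of why the choice of approach matters, here is a minimal sketch in plain Python. It is not the paper's model and not the Stan code used in the talk. The group names, population sizes, and case counts are all hypothetical, chosen only so that one group's labels go missing more often than the other's.

```python
# Illustrative counts only: hypothetical numbers, not taken from the paper or the talk.
population = {"A": 100_000, "B": 100_000}   # residents per group
true_cases = {"A": 1_000, "B": 2_000}       # cases that actually occurred
recorded_cases = {"A": 900, "B": 1_200}     # cases whose group label was recorded

# Group B labels are missing more often (40% vs 10%), so missingness
# depends on the very covariate we want to analyze.
n_missing = sum(true_cases.values()) - sum(recorded_cases.values())


def rates_per_1000(cases):
    """Cases per 1,000 residents for each group."""
    return {g: 1000 * cases[g] / population[g] for g in population}


def rate_ratio(rates):
    """Incidence in group B relative to group A."""
    return rates["B"] / rates["A"]


# Approach 1: complete-case analysis drops every case with a missing label.
complete_case = rates_per_1000(recorded_cases)

# Approach 2: reallocate unlabeled cases in proportion to the labeled ones,
# which is only justified if labels are missing completely at random.
n_recorded = sum(recorded_cases.values())
reallocated_cases = {
    g: recorded_cases[g] + n_missing * recorded_cases[g] / n_recorded
    for g in population
}
reallocated = rates_per_1000(reallocated_cases)

# Ground truth, available here only because the data are made up.
truth = rates_per_1000(true_cases)

for name, rates in [
    ("truth", truth),
    ("complete case", complete_case),
    ("reallocated", reallocated),
]:
    print(
        f"{name:>14}: A={rates['A']:.1f}  B={rates['B']:.1f}  "
        f"ratio B/A={rate_ratio(rates):.2f}"
    )
```

In this toy setup both simple approaches understate the gap between the two groups, because neither accounts for the fact that group B's labels are the ones more likely to be missing. The models covered in the talk address this kind of problem with a full statistical treatment rather than the arithmetic shown here.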

This event is being co-promoted with R-Ladies NYC. R-Ladies NYC is part of a worldwide organization that promotes gender diversity in the R community. We aspire to encourage and support women and gender minorities interested in learning and sharing their experiences in R programming by hosting a variety of events including talks, workshops, book clubs, data dives, and socials. (https://www.rladiesnyc.org/)

## CONNECTING WITH US

We invite you to follow us on our social networking sites to keep up to date on the latest news we will be sharing.

Data Umbrella
Online event
This event has passed