Speaker: De'Mel Mojica

Abstract: This talk presents a general approach to automatically joining large-scale geospatial data across distinct data sets, using a combination of Levenshtein distance thresholds and Haversine distance thresholds. The approach permits joining multiple data sets without the need to define ad hoc normalization conventions for each data source. In addition, it can be generalized beyond the geospatial domain and applied to any domain that requires joining across two or more non-identical dimensions.
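
As a rough illustration of the idea, the sketch below pairs two records when their names are within a Levenshtein distance threshold and their coordinates are within a Haversine distance threshold. It is not the speaker's implementation: the record layout (`name`, `lat`, `lon`), the function name `fuzzy_geo_join`, and the default thresholds (`max_edits=2`, `max_km=0.5`) are all assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def fuzzy_geo_join(left, right, max_edits=2, max_km=0.5):
    """Return (left_name, right_name) pairs that pass both thresholds.

    Hypothetical helper: records are dicts with 'name', 'lat', 'lon' keys,
    and the default thresholds are illustrative only.
    """
    matches = []
    for l in left:
        for r in right:
            # Spatial threshold first, then the string threshold.
            if haversine_km(l["lat"], l["lon"], r["lat"], r["lon"]) > max_km:
                continue
            if levenshtein(l["name"].lower(), r["name"].lower()) <= max_edits:
                matches.append((l["name"], r["name"]))
    return matches
```

The nested loop compares every pair of records, so it only suits small inputs; at large scale, some form of spatial blocking or indexing would presumably be needed to limit the candidate pairs. The same two-threshold pattern carries over to other domains by swapping in a different pair of distance functions.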

