De'Mel Mojica: Probabilistic Approaches to Multi-dimensional Fuzzy Joins

Details

Speaker: De'Mel Mojica

Abstract: This talk presents a general approach to automatically joining large-scale geospatial data across distinct data sets, using a combination of Levenshtein distance thresholds and Haversine distance thresholds. This approach permits joining multiple data sets without the need to provide ad hoc normalization conventions for each data resource. In addition, the approach can be generalized beyond the geospatial field and applied to any domain that requires joining across two or more non-identical dimensions.
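
For readers who want a concrete picture of the idea in the abstract, here is a minimal sketch of a two-threshold fuzzy join. It is not the speaker's implementation: the function names, the record layout (`name`, `lat`, `lon`), the threshold values, and the sample records are all assumptions made up for illustration, and the nested loop is only suitable for small inputs.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def fuzzy_join(left, right, max_edits=2, max_km=0.5):
    """Pair records that are close in BOTH name and location.

    Assumed record shape: {"name": str, "lat": float, "lon": float}.
    Thresholds are illustrative defaults, not values from the talk.
    """
    matches = []
    for l in left:
        for r in right:
            # Distance check first: it is cheaper than the edit distance.
            if haversine_km(l["lat"], l["lon"], r["lat"], r["lon"]) > max_km:
                continue
            if levenshtein(l["name"].lower(), r["name"].lower()) <= max_edits:
                matches.append((l, r))
    return matches


# Made-up sample records: same place spelled two ways, plus a distant namesake.
left = [{"name": "Cafe Luna", "lat": 45.5231, "lon": -122.6814}]
right = [
    {"name": "Cafe Lunna", "lat": 45.5232, "lon": -122.6813},
    {"name": "Cafe Luna", "lat": 45.6000, "lon": -122.6814},
]
pairs = fuzzy_join(left, right)
```

Requiring both thresholds to pass is what lets the join tolerate spelling variation without merging identically named places that are far apart. At large scale the quadratic comparison would need some form of blocking or spatial indexing, which this sketch leaves out.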

Doors open after 6 pm. DO NOT SHOW UP BEFORE 6 PM. Talks start at 6:30 pm. Repeat: DO NOT SHOW UP BEFORE 6 PM.

The ground-floor doors will be open. Take the elevator to the 3rd floor; the door to suite 320 should be open.

We'll visit a local watering hole afterwards.

---------------------------

Propose a talk! Or suggest a talk you want to hear or attend: https://sckott.typeform.com/to/ShM55K

The hashtag for PDX R meetups is #pdxrlang, and the Twitter account to follow or tweet at is @pdxrlang. We also use http://pdxdata.slack.com/ as a back-channel during and between meetups. Invite yourself here: http://pdxdata.org/slack/

Portland R User Group
Mozilla Corp
1120 NW Couch St #320 · Portland, OR