#3 => John Feminella on "Coresets and their applications"

Name: #3 => John Feminella on "Coresets and their applications"
Start: 2015-04-30T19:00:00-04:00
End: 2015-04-30T22:00:00-04:00
Location: WeWork DC - Chinatown

Hosted By

Mike B.

#3 => John Feminella on "Coresets and their applications"

Details

Description of the paper:

We'll be looking at the Ph.D. thesis of the head of the MIT Robotics and Big Data Lab, Dan Feldman. It's entitled "Coresets and their applications" (2010). (That's "coresets" as in the mathematical reduction technique, not "corsets" as in the article of clothing.) You can find it here:

http://www.cs.tau.ac.il/thesis/thesis/feldman

Coresets are a way of turning large datasets into much smaller, more manageable sets in a way that preserves the interesting features ofthe larger set. Approximation isn't a new technique; we compress datasets all the time to preserve interesting features (JPEGs for pictures, MP4 for audio, etc.). The choice of the way in which you construct the coreset determines which interesting features are (or are not) preserved, and within what error bounds.

The surprising result is that you can, very frequently, throw away large parts of your dataset still get good approximations to the optimal answer. This is valuable for cases where you want to ask an expensive question on a large dataset (e.g. Hamiltonian tours /Traveling Salesperson problem, convex hull, etc.) when you're willing to trade off some fidelity in the answer for response speed.

Brief outline:

How we improve solutions to problems with computer science: faster algorithms (hard!), faster computers (hard!)
How we improve solutions with coresets: approximate the dataset (much easier!)
Prerequisites for constructing a coreset
What kinds of problems do and don't work well, and why
A few examples of constructing a coreset
Diving into the math of why it works
Tips for constructing an arbitrary coreset (it's hard to do)
Maybe a demo or two, time permitting

A Note:

§1 and 2 (14 pages out of the 80, a lot of which is pictures) are really the only parts to read. The rest is more detailed math and proofs of different corner cases, and a generalization to N dimensions.

About the presenter:

On a dare, John once drank a 32-ounce chocolate milkshake in under 10 seconds. Life has more or less been downhill from there.

Find him at: @jxxf / http://jxf.me

Events in Washington, DC

Papers We Love DC

See more events

Papers We Love DC

Thursday, April 30, 2015 at 7:00 PM to Thursday, April 30, 2015 at 10:00 PM EDT

718 7th Street NW · Washington, DC

Papers We Love DC

public group

#3 => John Feminella on "Coresets and their applications"