
Big Data and Wee Data

Avery Rosen, MongoDB expert, will be the headline speaker at our next MUG. We'll also have Dmitry from GridGain do a demo of their In-Memory Accelerator for MongoDB.

We all know MongoDB is great for Big Data, but it's also great for work on the other end of the scale -- call it "Wee Data". This type of data is far more common than Big Data scenarios... in fact, just about every project starts with it. In this domain, we don't care about disk access and indices; instead, we care about skipping past the wheel inventing and getting right down to playing with the data. MongoDB lets you persist your prototype or small-working-set data without making you deal with freeze-drying and reconstitution, provides structure well beyond CSV, gets out of your way as you evolve your schemas, and provides simple tools for introspecting data and crunching numbers.

"MongoDB and Wee Data: Hacking a Workflow" will start with theory and proceed to walk through Ruby code that shows MongoDB's place in a working ecommerce site's data ecosystem.

On display: An ETL workflow, with persistence

• import CSVs for updates to persistent data

• run aggregation to answer business questions

• merge external documents into DB, for workflow phase decoupling

• run validation

• export to CSV for upload elsewhere

Also on display: BDD style workflow

• Cucumber features

• RSpec specs
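
To give a feel for the ETL steps listed above, here is a minimal Ruby sketch. It is not the speaker's code. To stay runnable without a database it uses a plain Hash keyed by SKU as a hypothetical stand-in for a MongoDB collection; with a real database, each write would instead be an upsert through the MongoDB Ruby driver and the totals would come from an aggregation query. The field names (`sku`, `category`, `price`, `qty`) and all method names are assumptions made up for illustration.

```ruby
require 'csv'

# Import: upsert each CSV row into the store, keyed by SKU.
# `store` is a Hash standing in for a MongoDB collection (an assumption).
def import_csv(store, csv_text)
  CSV.parse(csv_text, headers: true).each do |row|
    doc = store[row['sku']] ||= { 'sku' => row['sku'] }
    doc['category'] = row['category']
    doc['price']    = Float(row['price'])
    doc['qty']      = Integer(row['qty'])
  end
  store
end

# Aggregate: total inventory value per category, a typical "business question".
def inventory_value_by_category(store)
  store.values
       .group_by { |doc| doc['category'] }
       .transform_values { |docs| docs.sum { |d| d['price'] * d['qty'] } }
end

# Merge: fold externally produced documents into existing ones by SKU,
# so an earlier workflow phase does not need to know about later fields.
def merge_documents(store, docs)
  docs.each do |doc|
    store[doc['sku']] = (store[doc['sku']] || {}).merge(doc)
  end
  store
end

# Validate: return the documents that fail a simple sanity check.
def invalid_docs(store)
  store.values.reject { |d| d['price'] > 0 && d['qty'] >= 0 }
end

# Export: write the core fields back out as CSV text for upload elsewhere.
def export_csv(store)
  CSV.generate do |csv|
    csv << %w[sku category price qty]
    store.values.sort_by { |d| d['sku'] }.each do |d|
      csv << d.values_at('sku', 'category', 'price', 'qty')
    end
  end
end
```

Calling `import_csv` twice with overlapping SKUs updates the existing entries rather than duplicating them, which is the "updates to persistent data" behavior the first bullet describes. Because every step reads from and writes to the same store, each phase can be run and tested separately.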

GridGain for MongoDB

If you have a medium to large-scale MongoDB deployment, and want to enhance scale and/or realize more real-time performance, this session is for you. GridGain brings in-memory capabilities to MongoDB, the most popular API for unstructured data storage and query. We will discuss how you can achieve elastic scale (think automatic transparent re-sharding) up to thousands of nodes, as well as dramatically improve performance: higher throughput and lower latency via concurrent locking and parallel read/write operations. We will discuss various options around choosing databases or collections to keep in memory, synchronous and asynchronous persistence, vertical scale with terabytes of off-heap memory, Visor GUI-based management and monitoring, and much more.


  • Avery R.

    Hey, everyone, thanks for coming to the talk -- I wanted to follow up with a couple of corrections, having looked up some things.

    1) It was shutterfly.com that migrated their user data from MySQL to mongodb, not Photobucket.

    2) CouchDB (mentioned in the Q&A) is not "just a key-value store" at all (as I offhandedly said), it's a JSON document storage db, and it is a valid choice for that wee data project, but it wasn't my choice. Firstly, I prefer to invest my expertise in MongoDB, because I like it better, and secondly, aggregation, which CouchDB *does* have, is still done in JavaScript with map reduce, which is cumbersome and hard to debug.

    October 30, 2013

  • A former member

    Interesting to me

    October 22, 2013

  • Paul T.

    Checking in

    October 22, 2013

  • A former member

    Is this a 7pm start or 7:30?

    October 22, 2013

    • Avery R.

      My understanding is talking starts ~7:15

      October 22, 2013

  • Alim S. G.

    Rather upset I can no longer make this. I hope the event will be recorded.

    October 21, 2013

  • Avery R.

    I'm curious (and I will tailor my talk accordingly) as to how many people attending a) speak Ruby and b) know BDD, either by use or reputation?

    October 20, 2013

  • Eric M.

    Nice!

    October 4, 2013

  • Eric G.

    I will be out of the country

    September 23, 2013

Our Sponsors

  • MongoDB

    MongoDB organizes the NY MongoDB User Group

  • O'Reilly

    Members save 40% off print and 50% off ebooks; use discount code DSUG

  • Pearson

    Providing technical books and discounts! 35% off: USERGROUP
