
Ensuring 100% Database Uptime for Real-Time Big Data

Brian Bulkowski, Co-founder and CTO of Aerospike

Sunil Sayyaparaju, Tech Lead, Aerospike

Internet environments for consumer-facing applications routinely demand high throughput and sub-millisecond latencies for read/write transactions against terabytes of data, and service-level agreements demand 100% uptime. This session will review 10 proven practices for ensuring the high performance and availability that interactive Internet applications demand—even during power outages or natural disasters. These real-world lessons come from supporting large-scale, multi-data-center deployments for CTOs delivering platforms in the high-stakes ad sector, where speed means responses in 5 milliseconds or less, scale ranges from 200,000 to 2 million TPS against terabytes of data, and downtime is not an option. The lessons include:

#1. When scaling, keep the architecture simple, so there are fewer points of failure. For instance, load balancers may fail at high transaction rates even as the database is cruising.

#2. Provide full end-to-end automation. People make mistakes, and anything that’s not automated will have production issues.

#3. Keep the system asynchronous; otherwise one small failure will quickly snowball into an avalanche of degradation.

#4. Keep metrics of everything, because scale tends to creep up from behind, and no one wants to be caught blind.

#5. Ensure full intra-data center redundancy because servers fail…often.

#6. Extend full data redundancy across multiple data centers, so storms like Sandy don’t put operations out of commission.

#7. Have a backup plan for a remote graceful shutdown that accounts for IP-based security.

#8. Make sure code is testable, so there’s a way to let the world know what’s going on.

#9. Divide intelligence into online and offline, so all the heavy lifting with predictive modeling is offline.

#10. Use the right data management tool for the job; too often “all-in-one” means mediocre for all.

Dinner will be provided.



  • vadiraj

    Unfortunately I didn't attend due to rain. Can you please let me know if you have recorded videos.

    July 17, 2013

  • nagabrahmam m.

    Very Good.

    July 17, 2013

  • Naresh s.

    too far from my place on a weekday

    July 16, 2013

  • Akshay

    Is there some ticket we need to buy for attending this event?

    July 14, 2013

  • praveen k.

    More about Automation using existing data.

    July 15, 2013

  • omprakash

    We are from Strategic outsourcing services

    July 15, 2013

  • Sunil S.

    No. No need to buy any ticket. But please do RSVP to help us get an idea of number of people attending.

    July 14, 2013

  • Suresh

Big data enthusiast. Looking for good ideas to work on.

    July 13, 2013

  • Naresh s.

    me too!

    July 3, 2013

  • Vikram Nagaraja R.


    July 3, 2013

54 went
