"Increasing MySQL Performance through Aggressive Data Archiving Database" by David Berube
This talk comes straight from its debut at Percona Live - and it's free!
Many strategies exist for increasing the performance of a database system. Some attempt to make each access to the data faster: for example, faster hardware may be used, or system settings may be tweaked. Other approaches attempt to decrease the total number of database operations: for example, by caching database queries or entire pages. Nevertheless, generally, performance decreases as the sytem grows larger, and as a result, access times increase; after at certain point, it may be difficult to offset this increase with additional hardware.
It is also true, however, that "not all data is created equal" - some data is far more relevant than others. Although MySQL and other RDBMses attempt to give priority to frequently accessed data, it can be said that MySQL performance still decreases with table size - older, infrequently accessed data can slow down systems significantly. Archiving this data can be a powerful and important approach to speeding up database systems - by decreasing the size of the working data set, nearly all operations become faster.
This talk will outline the use of pt-archiver to archive database tables; the design of simple archiving scripts in an application's host langauge, such as Ruby or Python, will also be briefly covered. A number of archiving strategies, from very simple to relatively complex, will be covered; for example, an enterprise application may decide to archive records after a certain period of activity. Some types of child records (i.e. unaccepted bids) may be deleted or moved to "deep storage", whereas other types of child records (accepted bids) may be moved to a "staging area" - ready to be unarchived transparently when requested by a user. With some thought, it is nearly always possible to develop an archiving scheme that balances business needs and the desire for performance.
Finally, some consideration will be given to the various options for storing the archived data, including archive tables, flat files (CSV, XML, SQL), and document stores (MongoDB, et al.) Each have different advantages and disadvantages, and choosing an appropriate approach can be very important.
About the speaker:
David Berube is a software developer, consultant, speaker, and writer. He is constantly researching, perfecting, and practicing his trade. He is a prolific writer, appearing in places such as Dr Dobbs Journal, Linux Magazine, IBM DeveloperWorks, PHP International Magazine, and many others. He speaks frequently, notably including his seminar series, "Making Money Using Open Source Software". He authored the books "Practical Rails Gems" and "Practical Reporting with Ruby and Rails", and co-authored the book "Practical Rails Plugins."
------------- Logistics -------------
MIT's map with the building (E51) highlighted: http://whereis.mit.edu/map-jpg?selection=E5...
MIT's map with the Kendall square T stop highlighted: http://whereis.mit.edu/map-jpg?selection=L5... (the stop is in red on that map, and you can see E51 in the bottom right)
MIT's maps with parking lot information: http://whereis.mit.edu/map-jpg?selection=P4... and http://whereis.mit.edu/map-jpg?selection=P5...
Note that MIT lots are free to park in after 3 pm or so, even if it says "reserved for faculty"
Free pizza and soda will be served, so please RSVP accurately.
To RSVP anonymously, please login to the Meetup site with the e-mail address "admin at sheeri dot com" and the password "guest". For more information, see: http://www.meetup.com/mysqlbos/calendar/