Re: [php-49] analytics logging - where to store data

From: Will W.
Sent on: Thursday, November 15, 2012 12:58 PM
This is part of the big data problem that a lot of major companies are dealing with. Chances are good that you won't ever generate terabytes of data on a daily basis, but you can generate quite a bit less than that and still have concerns. For example, I regularly work with 100M-user data sets that run around 20 GB in file size.

You are definitely right to think about the future, and there is a lot to consider. Here is a cool article I read just the other day; it has some comparisons of data warehouse query speeds: http://37signals.com/svn/posts/3315-how-i-came-to-love-big-data-or-at-least-acknowledge-its-existence

MySQL seems to be the norm right now, coupled with tools like Hadoop/Hive/AWS. I don't work on the data warehousing side at an e-commerce level, so I can't comment on the db architecture. If I were to approach it, I would definitely try to keep a series of static tables (customer, job, page, etc.), not just an annual one. By segmenting the data and using GUIDs with proper keys, you should be able to keep things moving along fast enough. Also, remember that for a lot of reporting/analysis you shouldn't be working off a production server. I keep a small, slightly outdated data set running on a local MSSQL server so I can bog it down without pissing off the bosses when I need new queries.
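
To make the "static tables with GUIDs and proper keys" idea a bit more concrete, here is a rough sketch. It is only an illustration under assumptions: the table and column names (customer, page, page_view, viewed_at) are made up, and it uses Python's built-in sqlite3 in place of MySQL so it runs anywhere. The same layout should carry over to MySQL with the appropriate column types.

```python
import sqlite3
import uuid

# In-memory SQLite stands in for MySQL here (assumption, for a runnable sketch).
conn = sqlite3.connect(":memory:")

# Small, stable lookup tables plus one narrow event table that references them.
# All table and column names are hypothetical.
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE page     (page_id     TEXT PRIMARY KEY, url  TEXT NOT NULL UNIQUE);
CREATE TABLE page_view (
    view_id     TEXT PRIMARY KEY,
    customer_id TEXT NOT NULL REFERENCES customer(customer_id),
    page_id     TEXT NOT NULL REFERENCES page(page_id),
    viewed_at   TEXT NOT NULL
);
-- Index on the columns the reports filter and group by.
CREATE INDEX idx_view_customer_time ON page_view (customer_id, viewed_at);
""")


def new_id():
    """Generate a GUID as a string to use as a primary key."""
    return str(uuid.uuid4())


alice, home = new_id(), new_id()
conn.execute("INSERT INTO customer VALUES (?, ?)", (alice, "alice"))
conn.execute("INSERT INTO page VALUES (?, ?)", (home, "/home"))
for ts in ("2012-11-14T09:00:00", "2012-11-15T10:30:00"):
    conn.execute(
        "INSERT INTO page_view VALUES (?, ?, ?, ?)", (new_id(), alice, home, ts)
    )

# A typical report: number of visits per customer.
visits = conn.execute(
    "SELECT c.name, COUNT(*) FROM page_view v "
    "JOIN customer c USING (customer_id) GROUP BY c.name"
).fetchall()
print(visits)
```

Reports like the last query are the kind of thing I would run against a copy of the data rather than the production server.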




On Thu, Nov 15, 2012 at 11:50 AM, Mark Steudel <[address removed]> wrote:
BACKGROUND

So I'm working on a project that is all about tracking user activity on a site:

1. How much time spent on site
2. How many activities a user has started and finished
3. How many times a user has come to the site
4. etc.

And of course they want lots of reporting on all these aspects.

The site gets a fair amount of traffic, and I can see all this logging
creating quite a bit of traffic, especially over the years.

QUESTIONS

1. Not knowing much of anything about NoSQL implementations, would
NoSQL be worth exploring for this project?

2. If I stick with MySQL, are there some ways I can architect it from
the get-go so that it won't be a bear to work with in, say, 3 years
(i.e., some ginormous table)? I was thinking of either trying to get
some sort of business process retention policy (i.e., we only need to
worry about a year's worth of data) or database retention (i.e., dynamic
tables that store a year's worth of data: user_time_2012, user_time_2013).
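
The two options in question 2 could look roughly like the sketch below. It is a minimal illustration, not a recommendation: the helper names (yearly_table, log_time, purge_older_than) and the columns are hypothetical, and Python's built-in sqlite3 stands in for MySQL so the example runs as written.

```python
import sqlite3
from datetime import datetime


def yearly_table(base, when):
    """Return the per-year table name, e.g. user_time_2012."""
    return f"{base}_{when.year}"


def ensure_table(conn, name):
    """Create the per-year table on first use (hypothetical columns)."""
    # Table names cannot be bound as parameters, so validate before formatting.
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unsafe table name: {name}")
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {name} ("
        "user_id INTEGER NOT NULL, "
        "seconds_on_site INTEGER NOT NULL, "
        "logged_at TEXT NOT NULL)"
    )


def log_time(conn, user_id, seconds, when):
    """Database retention: route each row to the table for its year."""
    table = yearly_table("user_time", when)
    ensure_table(conn, table)
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        (user_id, seconds, when.isoformat()),
    )
    return table


def purge_older_than(conn, table, cutoff):
    """Business retention: delete rows logged before the cutoff."""
    cur = conn.execute(
        f"DELETE FROM {table} WHERE logged_at < ?", (cutoff.isoformat(),)
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
t1 = log_time(conn, 1, 300, datetime(2012, 11, 15, 11, 50))
t2 = log_time(conn, 1, 120, datetime(2013, 1, 2, 9, 0))
log_time(conn, 2, 60, datetime(2012, 1, 5, 8, 0))
removed = purge_older_than(conn, "user_time_2012", datetime(2012, 6, 1))
print(t1, t2, removed)
```

With per-year tables, dropping an old year is a single DROP TABLE, while the purge approach keeps one table but needs a scheduled delete. Reports that span years have to query more than one table in the first case.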

Anyway, I would love to get some feedback on this.

Thanks, Mark


--
-----------------------------------------
Mark Steudel
P: [masked]
F: [masked]
[address removed]

. : Work : .
http://www.mindfulinteractive.com

. : Play : .
http://www.steudel.org/blog

. : LinkedIn : .
http://www.linkedin.com/in/steudel



--
http://www.meetup.com/php-49/
This message was sent by Mark Steudel ([address removed]) from The Seattle PHP Meetup Group.