
Alamo City R Users Group Message Board

aggregate not working as expected in loop with ggplot

Don D.
San Antonio, TX
Post #: 1
I'm having trouble with aggregate (in a loop). Basically, for each time_chunk, I'd like two mean values, one for each of two columns. I've tried getting at this multiple ways, and what I thought worked doesn't seem to when I graph the data. Namely, there should not be two values for either of the metrics in the same time chunk. (Each metric has its own value.)

At the moment, I am being quite specific in grabbing the data in hopes of eliminating the problem (with no effect).
Might ddply be a solution here? Could data type be a problem?
The status is a boolean, but I'm fine with coloring the graph by the average of the booleans in each five-minute period.

aggboolrarall <- split(partfill, partfill$group)
for(i in 1:length(aggboolrarall)){
cat(names(aggboolrarall), "\n")

boolsub <- subset(aggboolrarall[][,c("username", "time_chunk", "pair_bool")])
boolsub$pair_bool <- as.numeric(boolsub$pair_bool)
tempboolagg <- aggregate(pair_bool ~ username + time_chunk, data = boolsub, FUN = mean)

rarsub <- subset(aggboolrarall[][,c("username", "time_chunk", "rarity")])

tempraragg <- aggregate(rarity ~ username + time_chunk, data = rarsub, FUN = mean)
currentmerge <- merge(tempraragg, tempboolagg)
d <- ggplot(data=currentmerge, aes(time_chunk, rarity, group=pair_bool, color = pair_bool)) +
  geom_line() +
  facet_wrap(~ username) +
  xlab("Five Minute Intervals") +
  ylab("Rarity") +
  ggtitle(label = paste(names(aggboolrarall), "Rarity by Time\nwith Pair Status", sep="\n")) +
  scale_colour_gradient(limits=c(0, 1), low="red")

Don D.
San Antonio, TX
Post #: 2
It seems to have been the grouping. I redid this with ddply and set the group more appropriately (username) in ggplot, and the lines no longer split. pair_bool was a mean value, so I'm not sure what happened.
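
For anyone landing on this thread later, here is a minimal sketch of that kind of fix. It uses base aggregate rather than ddply, and the toy data frame is made up; only the column names come from the code above. One plausible reading of the split lines is that group=pair_bool puts every distinct mean value in its own group, so a user's points end up on different lines, whereas group=username keeps one line per user.

```r
# Toy stand-in for one group's data. Column names follow the thread;
# the values are invented for illustration.
toy <- data.frame(
  username   = rep(c("u1", "u2"), each = 6),
  time_chunk = rep(rep(1:3, each = 2), times = 2),
  pair_bool  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE,
                 TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  rarity     = c(1, 3, 2, 4, 5, 7, 2, 2, 6, 8, 1, 3)
)
toy$pair_bool <- as.numeric(toy$pair_bool)

# One row per username and time_chunk, holding the mean of each metric.
agg <- aggregate(cbind(pair_bool, rarity) ~ username + time_chunk,
                 data = toy, FUN = mean)

# Plot only if ggplot2 is installed; group by username, colour by mean pair_bool.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(agg, aes(time_chunk, rarity, group = username, colour = pair_bool)) +
    geom_line() +
    facet_wrap(~ username) +
    scale_colour_gradient(limits = c(0, 1), low = "red")
  print(p)
}
```
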
Alex B.
San Antonio, TX
Post #: 4
So, aggboolrarall is a list of data.frames of identical dimensions (unless of course some of them came up empty due to there being empty levels of the group variable), is that right? To check if there are empty levels (or levels populated by one record, which can also cause trouble) just do table(partfill$group) and see how many observations there are for each level.
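
A small sketch of that check, on a made-up partfill with one unused level and one level that has a single record (the data and the name sparse are purely illustrative):

```r
# Invented data: level "c" is declared but never observed, "b" has one record.
partfill <- data.frame(
  group  = factor(c("a", "a", "a", "b"), levels = c("a", "b", "c")),
  rarity = c(1, 2, 3, 4)
)

# Observations per level of the grouping variable.
counts <- table(partfill$group)
print(counts)

# Levels with fewer than two records, which may cause trouble downstream.
sparse <- names(counts)[as.vector(counts) < 2]

# split() keeps empty levels, so the list has an empty data.frame for "c".
n_before <- length(split(partfill, partfill$group))

# Dropping unused levels first removes the empty chunk.
partfill$group <- droplevels(partfill$group)
chunks <- split(partfill, partfill$group)
```
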

Next comment: I'm not sure what the subset(aggboolrarall[]...) statement does. It seems to be expanding the chunks out to various combinations of values that were not originally observed. Were you perhaps thinking of boolsub <- aggboolrarall[[ i ]], which will give you the ith data.frame within aggboolrarall and does not need to be wrapped in a subset() call?
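
To make the difference between the indexing forms concrete, here is a short sketch on invented data (chunks stands in for aggboolrarall):

```r
# Invented data using the column names from the thread.
partfill <- data.frame(
  group      = c("a", "a", "b"),
  username   = c("u1", "u2", "u1"),
  time_chunk = c(1, 1, 2),
  pair_bool  = c(TRUE, FALSE, TRUE),
  rarity     = c(1, 2, 3)
)
chunks <- split(partfill, partfill$group)

class(chunks[])     # "list": empty brackets return the whole list
class(chunks[1])    # "list": single brackets return a one-element list
class(chunks[[1]])  # "data.frame": double brackets return the element itself

# Loop over the chunks, pulling out one data.frame and its name per pass.
for (i in seq_along(chunks)) {
  boolsub <- chunks[[i]][, c("username", "time_chunk", "pair_bool")]
  cat(names(chunks)[i], ":", nrow(boolsub), "rows\n")
}
```

Using seq_along(chunks) instead of 1:length(chunks) also avoids a spurious pass when the list is empty.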

Next, what is the original data type of boolsub$pair_bool before it gets converted to numeric? Did you know that it's possible to have a matrix or a list be one column in a data.frame? Don't know if it was always the case, but I think I was doing an aggregate operation when I first stumbled onto this behavior last year, and maybe you were having the same problem.
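
One way this can happen, sketched on invented data: when FUN returns more than one value, aggregate stores the results as a matrix inside a single column. This is an illustration of the behavior, not a claim about what happened in your data.

```r
df <- data.frame(g = c("a", "a", "b", "b"), x = c(1, 3, 5, 9))

# FUN returns two values per group, so the result column x is a matrix.
agg <- aggregate(x ~ g, data = df,
                 FUN = function(v) c(mean = mean(v), n = length(v)))
str(agg)
is.matrix(agg$x)  # TRUE

# Flatten the embedded matrix into ordinary columns.
flat <- do.call(data.frame, agg)
names(flat)  # "g" "x.mean" "x.n"
```

Checking is.matrix() or str() on the column before converting it with as.numeric() shows whether you are dealing with this case.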

Thirdly, is your goal to get the two column means of pair_bool (I'm assuming from here on that pair_bool is a matrix embedded in your original data.frame) for each combination of group, username, and time_chunk? If so, you can skip looping and do it all in one call to aggregate:
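
A minimal sketch of such a call, assuming pair_bool and rarity are ordinary columns (as in the original question) rather than an embedded matrix. The toy data is made up, and keys is just a helper name for building it:

```r
# Build invented data: every combination of group, username and time_chunk
# appears twice.
keys <- expand.grid(group = c("g1", "g2"),
                    username = c("u1", "u2"),
                    time_chunk = 1:2,
                    stringsAsFactors = FALSE)
partfill <- rbind(keys, keys)
partfill$pair_bool <- rep(c(TRUE, FALSE), each = 8)
partfill$rarity <- seq_len(16)

# cbind() on the left-hand side aggregates both columns at once;
# the logical pair_bool is coerced to 0/1 before the mean is taken.
agg <- aggregate(cbind(pair_bool, rarity) ~ group + username + time_chunk,
                 data = partfill, FUN = mean)
print(agg)
```

The result has one row per observed combination, so it can go straight to ggplot with facet_wrap() or be split by group afterwards if a separate plot per group is still wanted.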
