r - Apply a specific function to a certain subset of a dataframe based on time frequency -


I am having trouble figuring out how to apply a function, for example mean, to a subset of a data frame based on time frequency.

Let me explain the specific situation: I have a data frame reporting fuel consumption data for trucks (each with a specific plate number), measured at a specific day/time. I'd like to calculate the mean fuel consumption over time series with a maximum time frequency of 5 minutes (if consecutive events happen within 5 minutes of each other, calculate the mean).

Here is an example of the initial data frame, and of the subsets of data I want to obtain:

data.frame:

The column names are, respectively, plate.number, date.time and fuel.consumption.

    ab    2016-07-03 09:21:10    23.45
    ab    2016-07-03 09:22:33    33.65
    bc    2016-07-03 09:23:28    56.22
    ab    2016-07-03 09:24:13    21.33
    bc    2016-07-03 10:32:45    33.42
    zf    2016-07-03 10:32:45    28.45
    zf    2016-07-03 10:34:12    29.55
    ab    2016-07-03 11:26:54    28.73
    ab    2016-07-03 11:27:33    27.98
    bc    2016-07-03 11:28:45    42.45
    ab    2016-07-04 10:32:45    34.72
    ab    2016-07-04 10:33:33    30.51
    ab    2016-07-04 14:54:28    28.66
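For anyone who wants to reproduce this, the table can be rebuilt in R roughly as follows. This is only a sketch: the name df matches the code used further down, and the UTC time zone is an assumption, since the data itself does not state one.

```r
# Rebuild the example data shown above.
# Assumption: times are in UTC (the original data gives no time zone).
df <- data.frame(
  plate.number = c("ab", "ab", "bc", "ab", "bc", "zf", "zf",
                   "ab", "ab", "bc", "ab", "ab", "ab"),
  date.time = as.POSIXct(
    c("2016-07-03 09:21:10", "2016-07-03 09:22:33", "2016-07-03 09:23:28",
      "2016-07-03 09:24:13", "2016-07-03 10:32:45", "2016-07-03 10:32:45",
      "2016-07-03 10:34:12", "2016-07-03 11:26:54", "2016-07-03 11:27:33",
      "2016-07-03 11:28:45", "2016-07-04 10:32:45", "2016-07-04 10:33:33",
      "2016-07-04 14:54:28"),
    tz = "UTC"
  ),
  fuel.consumption = c(23.45, 33.65, 56.22, 21.33, 33.42, 28.45, 29.55,
                       28.73, 27.98, 42.45, 34.72, 30.51, 28.66),
  stringsAsFactors = FALSE
)

# Inspect column types; date.time should be POSIXct
str(df)
```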

A time series in this case would be:

    ab    2016-07-03 09:21:10    23.45
    ab    2016-07-03 09:22:33    33.65
    ab    2016-07-03 09:24:13    21.33

or:

    ab    2016-07-03 11:26:54    28.73
    ab    2016-07-03 11:27:33    27.98

As you can see, the time between one event and the following one is less than 5 minutes. Once I have these groups, it is quite easy to calculate the mean fuel consumption for each group.

Ah, it might be helpful to know that "date.time" is in POSIXct format, so it is a proper date/time.

Any idea which function I should use? I thought it might be possible using the aggregate function, but how do I specify the time frequency?

Thank you for your time and help.

First, define a function that calculates the number of seconds since the first observation of the current group. If it exceeds 300, start a new group and reset the start time. The function assumes the observations are ordered in time.

group_on_seconds <- function(df_part, nr_of_secs = 300) {
  group_start <- df_part$date.time[1]
  group_ind <- df_part$group <- 1

  # seq_len(...)[-1] is empty for a single-row part, unlike 2:nrow(df_part)
  for (i in seq_len(nrow(df_part))[-1]) {
    # seconds elapsed since the start of the current group
    elapsed <- as.numeric(df_part$date.time[i]) - as.numeric(group_start)
    if (elapsed > nr_of_secs) {
      group_start <- df_part$date.time[i]
      group_ind <- group_ind + 1
    }
    df_part$group[i] <- group_ind
  }
  df_part
}
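Note that this function measures elapsed time from the start of each group, so a group never spans more than 300 seconds. If the intent is instead that each event lies within 5 minutes of the previous one, a vectorised variant could look like the sketch below. The helper name group_on_gap and the argument max_gap are made up for illustration, and it again assumes the times are sorted in ascending order.

```r
# Hypothetical helper: open a new group whenever the gap to the previous
# observation exceeds max_gap seconds. Assumes times are sorted ascending.
group_on_gap <- function(times, max_gap = 300) {
  if (length(times) == 0) return(integer(0))
  gaps <- diff(as.numeric(times))   # seconds between consecutive events
  cumsum(c(TRUE, gaps > max_gap))   # the first event always opens group 1
}

# Illustrative times (UTC is an assumption): each of the first three is
# within 5 minutes of the previous one, but the third is more than
# 5 minutes after the first.
times <- as.POSIXct(c("2016-07-03 09:21:10", "2016-07-03 09:24:13",
                      "2016-07-03 09:28:00", "2016-07-03 11:26:54"),
                    tz = "UTC")
groups <- group_on_gap(times)
print(groups)
```

Here the first three events end up in the same group, whereas the group-start rule would put the third one in a new group. Which rule is appropriate depends on what "within 5 minutes" is meant to cover.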

Order df by time, split it on plate number, and apply the function to each part. Then bind the results together.

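library(dplyr)

# Sort by time first, then split the sorted data on its own plate numbers
# (splitting on the unsorted df$plate.number would mismatch the rows)
df_sorted <- df[order(df$date.time), ]
df_group <- df_sorted %>%
  split(df_sorted$plate.number) %>%
  lapply(group_on_seconds) %>%
  do.call('rbind', .)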

Calculate the mean for each combination of plate.number and group.

df_group %>%
  group_by(plate.number, group) %>%
  summarise(mn = mean(fuel.consumption))
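Since the question asked about aggregate, the same summary can probably also be done in base R once the group column exists. The sketch below uses a small hand-made stand-in for df_group (the group values are filled in by hand, not computed), purely to show the formula interface.

```r
# Small stand-in for df_group; the group column is typed in by hand here.
df_group <- data.frame(
  plate.number = c("ab", "ab", "ab", "ab", "ab", "zf", "zf"),
  group = c(1, 1, 1, 2, 2, 1, 1),
  fuel.consumption = c(23.45, 33.65, 21.33, 28.73, 27.98, 28.45, 29.55)
)

# Mean fuel consumption for each plate.number / group combination
means <- aggregate(fuel.consumption ~ plate.number + group,
                   data = df_group, FUN = mean)
print(means)
```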
