r - Update a subset of a df with mutate_each


I have a dplyr data frame with 100k+ rows and ~200 columns. There are 15 columns that contain date values in Excel format (number of days since Jan 1, 1900), and their column names contain the string "date", which makes it easy to subset the data frame.

library(dplyr)

x <- data.frame(date1 = 45000 + 500 * rnorm(100),
                date2 = 50000 + 500 * rnorm(100),
                var1 = 50 * rnorm(100),
                var2 = 100 + 20 * rnorm(100))

> x %>% head
     date1    date2       var1      var2
1 44952.83 49432.88   8.125523 125.95802
2 44331.47 49231.76 -34.814162 117.26881
3 44597.69 49651.91  27.747881 108.45787
4 45113.50 49802.87  24.584569  83.84904
5 46212.14 49972.59  72.444414  80.61595
6 45753.38 50074.57 -34.927552 127.70018

date_cols <- x %>% select(contains('date', ignore.case = TRUE)) %>% names
> date_cols
[1] "date1" "date2"

I'd like to change these date columns to actual R datetimes without changing the other columns. I can't figure out how to update just the date_cols subset of the data frame. I can convert the subset on its own:

x %>% select_(.dots = date_cols) %>%
  mutate_each(funs(as.Date(., origin = "1900-01-01"))) %>%
  head
       date1      date2
1 2023-01-28 2035-05-05
2 2021-05-17 2034-10-16
3 2022-02-07 2035-12-10
4 2023-07-08 2036-05-09
5 2026-07-11 2036-10-26
6 2025-04-08 2037-02-05

I've tried the following, but it doesn't work:

x %>% select_(.dots = date_cols) <- x %>% select_(.dots = date_cols) %>%
  mutate_each(funs(as.Date(., origin = "1900-01-01")))

I guess there is a better way than "rbinding" the original data frame without the date columns to the date_cols subset once it has been mutated.

As commented by @alistaire, you can use mutate_at to convert the date columns and keep the rest of the data frame unchanged, so you can avoid binding the original data frame to its subsets:

library(dplyr)
mux <- x %>% mutate_at(vars(contains('date')), funs(as.Date(., origin = "1900-01-01")))

head(mux)
#        date1      date2       var1      var2
# 1 2021-11-09 2038-10-20  44.524710  86.15957
# 2 2020-06-04 2037-08-04  31.402905  94.74633
# 3 2023-12-22 2038-03-06  31.600929  85.90605
# 4 2020-05-08 2037-01-02   7.140777  82.80565
# 5 2025-03-25 2038-07-30 -54.913577 100.83949
# 6 2021-02-18 2034-06-20  28.616670  93.92246

And according to ?mutate_at:

summarise_each() and mutate_each() are older variants that will be deprecated in the future.

So it is better to use these newer APIs.
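For readers on a more recent dplyr, here is a minimal sketch of the same conversion written with across(), which, as far as I know, supersedes mutate_at() from dplyr 1.0.0 onward (funs() is deprecated there as well). The set.seed() call and the name x_dates are assumptions added only for illustration; the sample data and the "1900-01-01" origin are taken from the question as-is.

```r
library(dplyr)

# Assumption: a seed is set only so the sample data is reproducible.
set.seed(1)

# Same sample data as in the question: two Excel-style date columns
# and two ordinary numeric columns.
x <- data.frame(date1 = 45000 + 500 * rnorm(100),
                date2 = 50000 + 500 * rnorm(100),
                var1  = 50 * rnorm(100),
                var2  = 100 + 20 * rnorm(100))

# Convert only the columns whose names contain "date";
# every other column passes through unchanged.
# x_dates is a hypothetical name chosen for this sketch.
x_dates <- x %>%
  mutate(across(contains("date"), ~ as.Date(.x, origin = "1900-01-01")))

# Inspect the column classes: date1 and date2 should now be Date,
# var1 and var2 should still be numeric.
str(x_dates)
```

The selection helper inside across() plays the role that vars(contains('date')) plays in mutate_at(), so the column choice stays driven by the column names rather than by a separately stored date_cols vector.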

