r - Split a data frame, select random observations from each list element, and merge the list back into a data frame
I have a data frame with 3 variables (subject, trialtype, and rt), and I need to randomly select half of the rt observations for each subject and then re-create the data frame from that selection.
From browsing the list, I've got this far:

    split_df <- split(bucnidata_rt, list(bucnidata_rt$subject, bucnidata_rt$trialtype))

(This gives a series of split_df[1], split_df[2], ....)
But I cannot subset using this:

    split_df[1] <- split_df[1][sample(nrow(split_df[1]), 24), ]

I think this is because sample works on data frames and split_df[1] is a list.
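The subsetting trouble described above most likely comes down to single versus double brackets on a list. A minimal sketch, using a small made-up data frame (`toy` and its values are hypothetical stand-ins for bucnidata_rt):

```r
# Toy data standing in for bucnidata_rt (hypothetical values)
toy <- data.frame(
  subject   = rep(c("s1", "s2"), each = 4),
  trialtype = rep(c("a", "b"), times = 4),
  rt        = c(512, 430, 601, 488, 397, 455, 520, 610)
)
pieces <- split(toy, toy$subject)

class(pieces[1])     # "list": single brackets return a list of length 1
class(pieces[[1]])   # "data.frame": double brackets return the element itself

nrow(pieces[1])      # NULL, so there is no row count to sample from
nrow(pieces[[1]])    # 4

# Sampling half of the rows of one element works with double brackets
first <- pieces[[1]]
half  <- first[sample(nrow(first), nrow(first) %/% 2), ]
```

So sample() itself is not the obstacle; it is handed NULL because nrow() is being called on a list rather than on a data frame.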
To re-merge I would do:

    remerged_df <- unsplit(split_df[1], list(bucnidata_rt$subject, bucnidata_rt$trialtype))

Could you please help me with step 2?
I propose a different approach using dplyr, if you don't mind. You can group by subject and randomly select 50% of the observations in each group:

    library(dplyr)

    bucnidata_rt %>%
      group_by(subject) %>%
      sample_frac(size = 0.5)

Edit
Here's another way, closer to what you started with. I use the mtcars dataset in this case:

    split_df <- split(mtcars, mtcars$cyl)  # split by `cyl`

    # randomly select 50% of rows per group, without replacement
    split_df <- lapply(split_df, function(x) x[sample(seq_len(nrow(x)), nrow(x)/2, replace = FALSE), ])

    # merge the randomly selected list elements into one data.frame
    remerged_df <- do.call(rbind, split_df)

    # check the result
    nrow(remerged_df)
    #[1] 15

Edit #2: corrected the dplyr method after a comment by @gregor.
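For data shaped like the one in the question, the same split / lapply / rbind pattern might look like the sketch below. The data frame `toy`, its column values, and the seed are all made up for illustration; only the column names subject, trialtype and rt come from the question.

```r
set.seed(42)  # arbitrary seed so the draw is reproducible

# Hypothetical stand-in for bucnidata_rt: 3 subjects x 2 trial types x 8 trials
toy <- expand.grid(
  trial     = 1:8,
  trialtype = c("congruent", "incongruent"),
  subject   = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)
toy$rt <- round(rnorm(nrow(toy), mean = 500, sd = 80))

# One list element per subject/trialtype cell; drop = TRUE skips empty cells
cells <- split(toy, list(toy$subject, toy$trialtype), drop = TRUE)

# Keep half of the rows (rounded down) in each cell, without replacement
sampled <- lapply(cells, function(x) {
  x[sample(seq_len(nrow(x)), floor(nrow(x) / 2)), , drop = FALSE]
})

# Stack the pieces back into one data frame and discard the generated row names
half_df <- do.call(rbind, sampled)
rownames(half_df) <- NULL

# Each subject/trialtype cell should now hold 4 of its 8 rows
table(half_df$subject, half_df$trialtype)
```

To sample per subject only, regardless of trial type, the split call would simply become split(toy, toy$subject). Writing floor() makes the rounding explicit for cells with an odd number of rows.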