Split data based on the cumulative value of a column in R


I have the following type of data:

    myd <- data.frame(group = c(rep(1, 15), rep(2, 15)),
                      distance = c(0, 4, 8, 9, 11, 14, 18, 19, 23, 24, 29, 30, 35, 40, 43,
                                   0, 8, 9, 9, 12, 13, 14, 15, 16, 18, 23, 24, 28, 29, 30),
                      var1 = c(1:15, 11:25), var2 = 1:30, var3 = 1:30)
    myd
       group distance var1 var2 var3
    1      1        0    1    1    1
    2      1        4    2    2    2
    3      1        8    3    3    3
    4      1        9    4    4    4
    5      1       11    5    5    5
    6      1       14    6    6    6
    7      1       18    7    7    7
    8      1       19    8    8    8
    9      1       23    9    9    9
    10     1       24   10   10   10
    11     1       29   11   11   11
    12     1       30   12   12   12
    13     1       35   13   13   13
    14     1       40   14   14   14
    15     1       43   15   15   15
    16     2        0   11   16   16
    17     2        8   12   17   17
    18     2        9   13   18   18
    19     2        9   14   19   19
    20     2       12   15   20   20
    21     2       13   16   21   21
    22     2       14   17   22   22
    23     2       15   18   23   23
    24     2       16   19   24   24
    25     2       18   20   25   25
    26     2       23   21   26   26
    27     2       24   22   27   27
    28     2       28   23   28   28
    29     2       29   24   29   29
    30     2       30   25   30   30

I have more group levels than the 2 shown above. Within each group, distance (say, mileposts on a highway) starts at 0 and is cumulative to the end of the group. I want to split the data (make bins) within each group so that each bin covers a distance of approximately 10. The resulting split data should look like this:

    data group1subset1
       group distance var1 var2 var3
    1      1        0    1    1    1
    2      1        4    2    2    2
    3      1        8    3    3    3
    4      1        9    4    4    4
    data group1subset2
    5      1       11    5    5    5
    6      1       14    6    6    6
    7      1       18    7    7    7
    8      1       19    8    8    8
    data group1subset3
    9      1       23    9    9    9
    10     1       24   10   10   10
    11     1       29   11   11   11
    12     1       30   12   12   12
    data group1subset4
    13     1       35   13   13   13
    14     1       40   14   14   14
    data group1subset5
    15     1       43   15   15   15
    =====
    data group2subset1
    16     2        0   11   16   16
    17     2        8   12   17   17
    18     2        9   13   18   18
    19     2        9   14   19   19
    data group2subset2
    20     2       12   15   20   20
    21     2       13   16   21   21
    22     2       14   17   22   22
    23     2       15   18   23   23
    24     2       16   19   24   24
    25     2       18   20   25   25
    data group2subset3
    26     2       23   21   26   26
    27     2       24   22   27   27
    28     2       28   23   28   28
    29     2       29   24   29   29
    30     2       30   25   30   30

I need to automate this process because the real data is big. Please suggest how I can do it.

I'd use cut to accomplish this:

    # Round the largest distance up to the next multiple of 10 (43 -> 50 here)
    maxd <- (max(myd$distance) %/% 10 * 10) + 10
    # Bin distance into [0,10], (10,20], ...; include.lowest keeps distance 0 in the first bin
    # Note: the cumdist column in the output below equals cumsum(myd$distance);
    # it is not created by this code and is not needed for the binning
    transform(myd, cutdist = cut(distance, breaks = seq(0, maxd, by = 10),
                                 include.lowest = TRUE))
       group distance var1 var2 var3 cumdist cutdist
    1      1        0    1    1    1       0  [0,10]
    2      1        4    2    2    2       4  [0,10]
    3      1        8    3    3    3      12  [0,10]
    4      1        9    4    4    4      21  [0,10]
    5      1       11    5    5    5      32 (10,20]
    6      1       14    6    6    6      46 (10,20]
    7      1       18    7    7    7      64 (10,20]
    8      1       19    8    8    8      83 (10,20]
    9      1       23    9    9    9     106 (20,30]
    10     1       24   10   10   10     130 (20,30]
    11     1       29   11   11   11     159 (20,30]
    12     1       30   12   12   12     189 (20,30]
    13     1       35   13   13   13     224 (30,40]
    14     1       40   14   14   14     264 (30,40]
    15     1       43   15   15   15     307 (40,50]
    16     2        0   11   16   16     307  [0,10]
    17     2        8   12   17   17     315  [0,10]
    18     2        9   13   18   18     324  [0,10]
    19     2        9   14   19   19     333  [0,10]
    20     2       12   15   20   20     345 (10,20]
    21     2       13   16   21   21     358 (10,20]
    22     2       14   17   22   22     372 (10,20]
    23     2       15   18   23   23     387 (10,20]
    24     2       16   19   24   24     403 (10,20]
    25     2       18   20   25   25     421 (10,20]
    26     2       23   21   26   26     444 (20,30]
    27     2       24   22   27   27     468 (20,30]
    28     2       28   23   28   28     496 (20,30]
    29     2       29   24   29   29     525 (20,30]
    30     2       30   25   30   30     555 (20,30]

There's no need to calculate the cumulative distance, since you want to keep the rows in bins at multiples of 10.
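To get from the binned column to the separate subsets shown in the question, one option is base R's split() on the group and bin together. This is a minimal sketch, not part of the original answer. It assumes the sample myd data and the cutdist column name from the answer; the names binned and subsets are only illustrative.

```r
# Sample data from the question
myd <- data.frame(group = c(rep(1, 15), rep(2, 15)),
                  distance = c(0, 4, 8, 9, 11, 14, 18, 19, 23, 24, 29, 30, 35, 40, 43,
                               0, 8, 9, 9, 12, 13, 14, 15, 16, 18, 23, 24, 28, 29, 30),
                  var1 = c(1:15, 11:25), var2 = 1:30, var3 = 1:30)

# Bin the distances in steps of 10, as in the answer
maxd <- (max(myd$distance) %/% 10 * 10) + 10
binned <- transform(myd, cutdist = cut(distance, breaks = seq(0, maxd, by = 10),
                                       include.lowest = TRUE))

# One data frame per group/bin combination (hypothetical name `subsets`);
# drop = TRUE discards combinations with no rows, e.g. group 2 in (30,40]
subsets <- split(binned, list(binned$group, binned$cutdist), drop = TRUE)

names(subsets)          # names have the form "<group>.<interval>"
subsets[["1.[0,10]"]]   # rows of group 1 with distance in [0,10]
```

Each element of subsets is an ordinary data frame, so the same analysis can be applied to all of them with lapply(subsets, ...). For very large data, keeping the single binned data frame and grouping on group and cutdist may be more practical than materialising every subset.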

