Python - Missing column after pandas groupby
I've got a pandas dataframe, df. I group it by 3 columns and count the results. When I do this, I lose some information, specifically the name column. This column is mapped 1:1 with the desk_id column. Is there any way to include both in the final dataframe?
Here is the dataframe:
   shift_id     shift_start_time       shift_end_time        name                    end_time   desk_id  shift_hour
0  37423064  2014-01-17 08:00:00  2014-01-17 12:00:00  adam scott  2014-01-17 10:16:41.040000  15557987           2
1  37423064  2014-01-17 08:00:00  2014-01-17 12:00:00  adam scott  2014-01-17 10:16:41.096000  15557987           2
2  37423064  2014-01-17 08:00:00  2014-01-17 12:00:00  adam scott  2014-01-17 10:52:17.402000  15557987           2
3  37423064  2014-01-17 08:00:00  2014-01-17 12:00:00  adam scott  2014-01-17 11:06:59.083000  15557987           3
4  37423064  2014-01-17 08:00:00  2014-01-17 12:00:00  adam scott  2014-01-17 08:27:57.998000  15557987           0
I group it like this:
grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour']).size()
grouped = grouped.reset_index()
And here is the result, which is missing the name column:
    desk_id  shift_id  shift_hour  0
0  14468690  37729081           0  7
1  14468690  37729081           1  3
2  14468690  37729081           2  6
3  14468690  37729081           3  5
4  14468690  37729082           0  5
Also, is there any way to rename the count column to 'count' instead of '0'?
You need to include 'name' in the groupby groups:
import numpy as np

grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour', 'name']).size()
grouped = grouped.reset_index()
# Replace the default column label 0 with 'count'
grouped.columns = np.where(grouped.columns == 0, 'count', grouped.columns)
print(grouped)

    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  adam scott      1
1  15557987  37423064           2  adam scott      3
2  15557987  37423064           3  adam scott      1
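As a possible shortcut for the rename step, Series.reset_index accepts a name argument, so the count column can be labelled in the same call. A minimal sketch, using a small frame rebuilt here from the rows in the question (the timestamp columns are left out for brevity):

```python
import pandas as pd

# Sample rows copied from the question; only the columns used for grouping are kept.
df = pd.DataFrame({
    "shift_id": [37423064] * 5,
    "name": ["adam scott"] * 5,
    "desk_id": [15557987] * 5,
    "shift_hour": [2, 2, 2, 3, 0],
})

# size() returns a Series; reset_index(name=...) labels its values column directly,
# so no separate rename of the default 0 column is needed.
grouped = (
    df.groupby(["desk_id", "shift_id", "shift_hour", "name"])
      .size()
      .reset_index(name="count")
)
print(grouped)
```

This should give the same table as the np.where version, just without the extra renaming line.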
If the name-to-id relationship is of the many-to-one type, say we also have a pete scott for the same set of data, the result will become:
    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  adam scott      1
1  15557987  37423064           0  pete scott      1
2  15557987  37423064           2  adam scott      3
3  15557987  37423064           2  pete scott      3
4  15557987  37423064           3  adam scott      1
5  15557987  37423064           3  pete scott      1
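Since the extra rows only appear when one desk_id carries more than one name, it may be worth checking the mapping before adding 'name' to the group keys. A rough sketch of one way to do that; the pete scott row follows the hypothetical above, and the desk_id 99999999 with jane doe is made up purely for illustration:

```python
import pandas as pd

# Illustrative data only: desk 15557987 has two names (the hypothetical case above),
# while desk 99999999 / "jane doe" is an invented row with a clean 1:1 mapping.
df = pd.DataFrame({
    "desk_id": [15557987, 15557987, 15557987, 99999999],
    "name": ["adam scott", "adam scott", "pete scott", "jane doe"],
})

# Count distinct names per desk_id; a value above 1 means that desk has several
# names, so grouping by 'name' as well would split its rows.
names_per_desk = df.groupby("desk_id")["name"].nunique()
ambiguous = names_per_desk[names_per_desk > 1]
print(ambiguous)
```

If ambiguous comes back empty, adding 'name' to the groupby keys should not change the number of groups.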