python - Add columns to pandas dataframe containing max of each row, AND corresponding column name
my system
windows 7, 64 bit
python 3.5.1
the challenge
I guess this should be easy, but to the best of my abilities it is hard to accomplish and difficult to explain. I hope the reproducible example below sheds light on the problem. A similar question has been asked and answered for R in this post.
I've got a pandas dataframe, and I'd like to know the maximum value of each row and append that info as a new column. I'd also like to know the name of the column where the maximum value is located, and add another column to the existing dataframe containing the name of the column where the max value can be found.
reproducible example
in[1]:
import pandas as pd

# make pandas dataframe
df = pd.DataFrame({'a':[1,0,0,1,3], 'b':[0,0,1,0,1], 'c':[0,0,0,0,0]})

# calculate max
my_series = df.max(numeric_only=True, axis=1)
my_series.name = "maxval"

# include maxval in df
df = df.join(my_series)
df

out[1]:
   a  b  c  maxval
0  1  0  0       1
1  0  0  0       0
2  0  1  0       1
3  1  0  0       1
4  3  1  0       3
So far so good. Now for the part about adding another column to the existing dataframe containing the name of the column:
in[2]:
? ? ?

# what i'd like to accomplish:
out[2]:
   a  b  c  maxval maxcol
0  1  0  0       1      a
1  0  0  0       0  a,b,c
2  0  1  0       1      b
3  1  0  0       1      a
4  3  1  0       3      a
Notice that I'd like to return all column names if multiple columns contain the same maximum value. Also, please notice that the column maxval is not included in maxcol, since that would not make sense. Thanks in advance if anyone out there finds this interesting.
You can compare the df against maxval using eq with axis=0, then use apply with a lambda to produce a boolean mask, use it to mask the columns, and join them:
in [183]:
# compare columns a..c with maxval row by row, then join the names of the matching columns
df['maxcol'] = df.loc[:,:'c'].eq(df['maxval'], axis=0).apply(lambda x: ','.join(df.columns[:3][x==x.max()]),axis=1)
df

out[183]:
   a  b  c  maxval maxcol
0  1  0  0       1      a
1  0  0  0       0  a,b,c
2  0  1  0       1      b
3  1  0  0       1      a
4  3  1  0       3      a
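For reference, here is a self-contained sketch of the same idea (compare each cell with its row maximum, then join the names of the matching columns) that can be pasted into a fresh session. The names value_cols, is_max and firstmaxcol are made up for this illustration and are not part of the question. The last line assumes you only want the first matching column, which is not what the question asks for, so it is shown purely for contrast.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0, 1, 3],
                   'b': [0, 0, 1, 0, 1],
                   'c': [0, 0, 0, 0, 0]})

# Hypothetical helper name: the columns to search for the maximum
value_cols = ['a', 'b', 'c']

# Row-wise maximum over the value columns only
df['maxval'] = df[value_cols].max(axis=1)

# Boolean mask: True where a cell equals the maximum of its row
is_max = df[value_cols].eq(df['maxval'], axis=0)

# Join the names of all columns that reach the maximum in each row
df['maxcol'] = [','.join(name for name, hit in zip(value_cols, row) if hit)
                for row in is_max.values]

# For contrast: idxmax keeps only the first matching column, so ties are lost
df['firstmaxcol'] = df[value_cols].idxmax(axis=1)

print(df)
```

Restricting every step to value_cols keeps maxval (and any column added later) out of the comparison, which matches the requirement that maxval never shows up in maxcol.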