python - Filling NaN data with mode() doesn't work - Pandas


I have a data set in which there is a series called outlet_size whose values are one of {'medium', nan, 'high', 'small'}. Around 2566 records are missing, so I thought I would fill them with the mode() value and wrote:

  train['outlet_size'] = train['outlet_size'].fillna(train['outlet_size'].dropna().mode())

But when I tried to find the number of missing (NaN) records with the command

  sum(train['outlet_size'].isnull())

it still shows 2566 NaN records. Why?

Thanks for your answers.

The problem here is that mode() returns a Series, and this causes fillna to fail. If we look at a simple example:

  In [194]:
  df = pd.DataFrame({'a':['low','low',np.nan,'medium','medium','medium','medium']})
  df

  Out[194]:
          a
  0     low
  1     low
  2     NaN
  3  medium
  4  medium
  5  medium
  6  medium

  In [195]:
  df['a'].fillna(df['a'].mode())

  Out[195]:
  0       low
  1       low
  2       NaN
  3    medium
  4    medium
  5    medium
  6    medium
  Name: a, dtype: object

So you can see that it fails above. If we look at what mode() returns:

  In [196]:
  df['a'].mode()

  Out[196]:
  0    medium
  dtype: object

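It's a Series, albeit with a single row. When you pass a Series to fillna, the values are aligned on the index, so it can only fill a missing value in the first row (index 0); the NaN at index 2 has no matching entry and is left alone. What you want is the scalar value, which you get by indexing into the Series: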

  In [197]:
  df['a'].fillna(df['a'].mode()[0])

  Out[197]:
  0       low
  1       low
  2    medium
  3    medium
  4    medium
  5    medium
  6    medium
  Name: a, dtype: object
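For anyone who wants to reproduce the difference outside of an IPython session, here is a minimal self-contained sketch. The column name 'a' and the variable names `mode_series`, `unchanged` and `filled` are just illustrative, and it uses `.iloc[0]` (positional) instead of `[0]`, which gives the same result here:

```python
import numpy as np
import pandas as pd

# Toy data mirroring the example above; the column name 'a' is illustrative.
df = pd.DataFrame({'a': ['low', 'low', np.nan, 'medium', 'medium', 'medium', 'medium']})

# mode() returns a Series (here a single row with index 0), not a scalar.
mode_series = df['a'].mode()

# Passing the Series: fillna aligns on the index, and the NaN sits at
# index 2, which has no counterpart in mode_series, so it stays NaN.
unchanged = df['a'].fillna(mode_series)

# Passing the scalar: every NaN is replaced with the most frequent value.
filled = df['a'].fillna(mode_series.iloc[0])

print(unchanged.isnull().sum())  # 1
print(filled.isnull().sum())     # 0
```

Note that fillna returns a new Series, so the result still has to be assigned back to the column, as in the question.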

Edit

Regarding whether dropna is required: no, it isn't:

  In [204]:
  df = pd.DataFrame({'a':['low','low',np.nan,'medium','medium','medium','medium',np.nan,np.nan,np.nan,np.nan]})
  df['a'].mode()

  Out[204]:
  0    medium
  dtype: object

You can see that NaN is ignored, even though it occurs more often than any other value.
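Putting this together for the column in the question, a small helper might look like the sketch below. `fill_with_mode` is a hypothetical name, not a pandas function, and the `train` frame is made-up sample data. The guard covers the case where every value is missing, in which mode() returns an empty Series and there is nothing to index:

```python
import numpy as np
import pandas as pd

def fill_with_mode(series):
    """Return a copy of `series` with NaN replaced by its most frequent value.

    Hypothetical helper for illustration only.
    """
    modes = series.mode()      # NaN is ignored by default
    if modes.empty:            # all values missing: nothing to fill with
        return series.copy()
    # If several values tie, mode() returns them sorted; take the first.
    return series.fillna(modes.iloc[0])

# Made-up sample data standing in for the real training set.
train = pd.DataFrame({'outlet_size': ['medium', np.nan, 'high', 'small', 'medium', np.nan]})
train['outlet_size'] = fill_with_mode(train['outlet_size'])

print(train['outlet_size'].isnull().sum())  # 0
```

No dropna() call is needed before mode(), for the reason shown above.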

