python - Drop NaNs from a pandas DataFrame -
I don't understand how NaNs are treated in pandas, and I'd be happy for an explanation, because the logic seems "broken" to me.

I have a CSV file that I'm loading with read_csv. The file has a "comments" column, which is empty a lot of the time.

I've isolated that column and tried various ways to drop the empty values. First, when I write:
marked_results.comments
I get:

0      vp
1      vp
2      vp
3    test
4     NaN
5     NaN
...

The rest of the column is NaN. So pandas loads the empty entries as NaNs. Great so far. Now I'm trying to drop those entries. I tried:
marked_results.comments.dropna()
and received the same column back; nothing was dropped. Confused, I tried to understand why nothing was dropped, so I tried:
marked_results.comments==nan
and received a Series of Falses. So nothing is NaN... confusing. I tried:
marked_results.comments==nan
and again, nothing but Falses. I got a little pissed at that point and thought I'd be smarter. I did:
In [71]: comments_values = marked_results.comments.unique()
         comments_values
Out[71]: array(['vp', 'test', nan], dtype=object)
Ah, gotcha! So I tried:
marked_results.comments==comments_values[2]
and surprisingly, that still results in Falses! The only thing that worked was:
marked_results.comments.isnull()
which returned the desired outcome. Can anyone explain what happened here?
You should use isnull
and notnull
to test for NaN (these are more robust when using pandas dtypes than numpy's), see the "values considered missing" section in the docs.
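The equality comparisons all returned False because NaN compares unequal to everything, including itself (IEEE 754 semantics) — which is also why comparing against the nan pulled out of unique() found nothing. A quick sketch with made-up data mirroring the question's column:

```python
import numpy as np
import pandas as pd

# Toy Series standing in for marked_results.comments (assumed data)
s = pd.Series(['vp', 'vp', 'vp', 'test', np.nan, np.nan], name='comments')

# NaN != NaN, so elementwise equality is all False:
print((s == np.nan).sum())         # 0 matches, every entry is False

# isnull()/notnull() are the reliable tests for missing values:
print(s.isnull().tolist())         # [False, False, False, False, True, True]
print(s[s.notnull()].tolist())     # ['vp', 'vp', 'vp', 'test']
```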
Using the Series method dropna
on a column won't affect the original DataFrame, but it does what you want:
In [11]: df
Out[11]:
  comments
0       vp
1       vp
2       vp
3     test
4      NaN
5      NaN

In [12]: df.comments.dropna()
Out[12]:
0      vp
1      vp
2      vp
3    test
Name: comments, dtype: object
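This is likely why "nothing was dropped" in the question: dropna returns a new object and leaves the original untouched, so you have to assign the result back if you want to keep it. A minimal sketch, assuming the same toy data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'comments': ['vp', 'vp', 'vp', 'test', np.nan, np.nan]})

cleaned = df.comments.dropna()   # new Series; df itself is unchanged
print(len(df))                   # still 6 rows
print(cleaned.tolist())          # ['vp', 'vp', 'vp', 'test']

# Assign back to persist the change on the DataFrame:
df = df.dropna(subset=['comments'])
print(len(df))                   # now 4 rows
```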
The dropna
DataFrame method has a subset argument (to drop rows which have NaNs in specific columns):
In [13]: df.dropna(subset=['comments'])
Out[13]:
  comments
0       vp
1       vp
2       vp
3     test

In [14]: df = df.dropna(subset=['comments'])
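When subset names several columns, it combines with the how argument: the default how='any' drops a row if any listed column is NaN, while how='all' drops it only when all of them are. A sketch with a hypothetical second column 'rating':

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'comments': ['vp', np.nan, 'test', np.nan],
                   'rating':   [1.0,  2.0,    np.nan, np.nan]})

# how='any' (default): drop rows where either column is NaN -> 1 row left
any_missing = df.dropna(subset=['comments', 'rating'])

# how='all': drop rows only where both columns are NaN -> 3 rows left
all_missing = df.dropna(subset=['comments', 'rating'], how='all')

print(len(any_missing), len(all_missing))
```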