Having trouble replacing empty strings with NaN using pandas.DataFrame.replace()

I have a pandas dataframe which has some observations with empty strings which I want to replace with NaN (np.nan).

I am successfully replacing most of these empty strings using

df.replace(r'\s+',np.nan,regex=True).replace('',np.nan)

But I am still finding empty strings. For example, when I run

sub_df = df[df['OBJECT_COL'] == '']
sub_df.replace(r'\s+', np.nan, regex = True)
print(sub_df['OBJECT_COL'] == '') 

The output is True for every row.

Is there a different method I should be trying? Is there a way to read the encoding of these cells such that perhaps my .replace() is not effective because the encoding is weird?

Other alternatives:

sub_df.replace(r'^\s+$', np.nan, regex=True)

Or, to replace both empty strings and strings containing only whitespace:

sub_df.replace(r'^\s*$', np.nan, regex=True)

Alternatively, use apply() with a lambda function:

sub_df.apply(lambda x: x.str.strip()).replace('', np.nan)

An example for illustration:

>>> import numpy as np
>>> import pandas as pd

Example DataFrame containing empty strings and whitespace-only strings:

>>> sub_df
        col_A
0
1
2   somevalue
3  othervalue
4

Solutions applied to the different conditions:

Best Solution:

1)

>>> sub_df.replace(r'\s+',np.nan,regex=True).replace('',np.nan)
        col_A
0         NaN
1         NaN
2   somevalue
3  othervalue
4         NaN

2) This works only partially: it replaces whitespace-only strings but leaves empty strings untouched:

>>> sub_df.replace(r'^\s+$', np.nan, regex=True)
        col_A
0
1         NaN
2   somevalue
3  othervalue
4         NaN

3) This also works for both conditions.

>>> sub_df.replace(r'^\s*$', np.nan, regex=True)
        col_A
0         NaN
1         NaN
2   somevalue
3  othervalue
4         NaN

4) This also works for both conditions.

>>> sub_df.apply(lambda x: x.str.strip()).replace('', np.nan)
        col_A
0         NaN
1         NaN
2   somevalue
3  othervalue
4         NaN
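
One caveat worth checking on your own data: as far as I can tell, when the replacement value is not a string, replace() with regex=True swaps out the whole cell whenever the pattern matches anywhere in it. The sketch below uses a made-up frame (the value "other value" with an internal space is my own addition, not from the question) to compare the anchored and unanchored patterns.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an empty string, whitespace-only strings, and real
# values, one of which contains an internal space.
sub_df = pd.DataFrame(
    {"col_A": ["", " ", "somevalue", "other value", "   "]}, dtype=object
)

# Anchored pattern: only cells that are empty or whitespace-only become NaN.
anchored = sub_df.replace(r"^\s*$", np.nan, regex=True)

# Unanchored pattern: any cell that contains whitespace matches, so
# "other value" is replaced as well, while the truly empty string is kept.
unanchored = sub_df.replace(r"\s+", np.nan, regex=True)

print(anchored["col_A"].isna().tolist())
print(unanchored["col_A"].isna().tolist())
```

If your real values can contain spaces, the anchored pattern r'^\s*$' is probably the safer choice.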


DataFrame.replace (like Series.replace) does not operate in place by default. You need to pass inplace=True explicitly:

sub_df.replace(r'\s+', np.nan, regex=True, inplace=True)

Or, alternatively, assign back to sub_df:

sub_df = sub_df.replace(r'\s+', np.nan, regex=True)
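
Here is a minimal sketch of the difference, assuming a made-up frame shaped like the one in the question. Note that it uses r'^\s*$' rather than r'\s+', because \s+ needs at least one whitespace character and so never matches a truly empty string. The .copy() call is my own addition to avoid SettingWithCopyWarning on the filtered rows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the question: OBJECT_COL holds empty strings.
df = pd.DataFrame({"OBJECT_COL": ["", "abc", "", "  "]}, dtype=object)

# Take an explicit copy of the filtered rows so later changes do not
# trigger SettingWithCopyWarning.
sub_df = df[df["OBJECT_COL"] == ""].copy()

# Calling replace without keeping the result leaves sub_df untouched.
sub_df.replace(r"^\s*$", np.nan, regex=True)
still_empty = bool((sub_df["OBJECT_COL"] == "").all())

# Assigning the result back is what actually changes sub_df.
sub_df = sub_df.replace(r"^\s*$", np.nan, regex=True)
now_missing = bool(sub_df["OBJECT_COL"].isna().all())

print(still_empty, now_missing)
```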

Try np.where:

df['OBJECT_COL'] = np.where(df['OBJECT_COL'] == '', np.nan, df['OBJECT_COL'])
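
If you need the same cleanup on every text column rather than one, something along these lines might work. It is only a sketch: the frame and the column names OTHER_COL and NUM_COL are invented for illustration, and it assumes the text columns have object dtype. Assigning back to the original frame (instead of modifying a filtered slice) should also avoid SettingWithCopyWarning.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two text columns and one numeric column.
df = pd.DataFrame(
    {
        "OBJECT_COL": pd.Series(["", "abc", " "], dtype=object),
        "OTHER_COL": pd.Series(["x", "", "y z"], dtype=object),
        "NUM_COL": [1, 2, 3],
    }
)

# Pick only the object (string) columns; numeric columns are left alone.
text_cols = df.select_dtypes(include="object").columns

# Replace empty or whitespace-only cells in those columns and assign the
# result back to the original frame, so no slice copy is involved.
df[text_cols] = df[text_cols].replace(r"^\s*$", np.nan, regex=True)

print(df.isna().sum())
```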

Comments
  • What if you use sub_df.replace(r'\s+',np.nan,regex=True).replace('',np.nan)?
  • that works! but returns a scalar?
  • thanks! the second replace worked as well as the commented out line, however both seem to return scalar when the columns are supposed to be objects (strings) - is there a way to return NaN as object type using .apply method?
  • 'sub_df.apply(lambda x: x.str.strip()).replace('', np.nan)' ^ this was the solution! thanks!
  • thanks for your reply! however checking sub_df['OBJECT_COL'] == '' still returns all True values, so it does not appear to be working..
  • thanks! this does work, but I am getting the warning SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Is there an effective way you recommend to call this on all columns of the DF?