How to set a cell to NaN in a pandas dataframe

I'd like to replace bad values in a column of a DataFrame with NaN.

import numpy as np
import pandas as pd

mydata = {'x' : [10, 50, 18, 32, 47, 20], 'y' : ['12', '11', 'N/A', '13', '15', 'N/A']}
df = pd.DataFrame(mydata)

df[df.y == 'N/A']['y'] = np.nan

However, the last line fails and raises a warning because it operates on a copy of df. What's the correct way to handle this? I've seen many solutions using iloc or ix, but here I need to use a boolean condition.


Just use replace:

In [106]:
df.replace('N/A', np.nan)

Out[106]:
    x    y
0  10   12
1  50   11
2  18  NaN
3  32   13
4  47   15
5  20  NaN
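
One detail worth spelling out: replace returns a new DataFrame, so the result has to be assigned to a name (or back to df) to be kept. A minimal sketch, using a shortened version of the question's data; the names cleaned, original_still_has_placeholder and cleaned_missing_count are only illustrative:

```python
import numpy as np
import pandas as pd

# Shortened version of the question's data (illustrative)
df = pd.DataFrame({"x": [10, 50, 18], "y": ["12", "N/A", "13"]})

# replace() returns a new DataFrame; df itself is left untouched
cleaned = df.replace("N/A", np.nan)

# The original still contains the placeholder string ...
original_still_has_placeholder = bool((df["y"] == "N/A").any())
# ... while the returned frame has one missing value in its place
cleaned_missing_count = int(cleaned["y"].isna().sum())
```

To keep the change, write df = df.replace('N/A', np.nan).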

What you're attempting is called chained indexing: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

You can use loc to ensure you operate on the original df:

In [108]:
df.loc[df['y'] == 'N/A','y'] = np.nan
df

Out[108]:
    x    y
0  10   12
1  50   11
2  18  NaN
3  32   13
4  47   15
5  20  NaN
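
As a self-contained sketch of the loc approach with the question's data: the boolean condition is built once and passed to loc together with the column label in a single indexing call. The names bad and missing_rows are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [10, 50, 18, 32, 47, 20],
                   "y": ["12", "11", "N/A", "13", "15", "N/A"]})

# Boolean mask selecting the rows that hold the placeholder string
bad = df["y"] == "N/A"

# Row selector and column label go into one .loc call,
# so the assignment targets df itself rather than a temporary copy
df.loc[bad, "y"] = np.nan

# Index labels of the rows that are now missing
missing_rows = df.index[df["y"].isna()].tolist()
```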


While using replace seems to solve the problem, I would like to propose an alternative. The real issue with a column that mixes numeric values and a few strings is not replacing those strings with np.nan, but giving the whole column a proper type. I would bet that the original column is most likely of object type:

Name: y, dtype: object

What you really need is to make it a numeric column (it will have a proper dtype and be considerably faster to work with), with all non-numeric values replaced by NaN.

So a good conversion would be:

df['y'] = pd.to_numeric(df['y'], errors='coerce')

Specifying errors='coerce' forces strings that can't be parsed as a number to become NaN. The column's dtype then becomes:

Name: y, dtype: float64
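
A minimal end-to-end sketch of this conversion on the question's data. Since to_numeric returns a new Series, it is assigned back to the column here; y_dtype and y_values are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 50, 18, 32, 47, 20],
                   "y": ["12", "11", "N/A", "13", "15", "N/A"]})

# to_numeric returns a new Series, so assign it back to the column;
# errors="coerce" turns unparseable strings such as "N/A" into NaN
df["y"] = pd.to_numeric(df["y"], errors="coerce")

y_dtype = str(df["y"].dtype)   # dtype name of the converted column
y_values = df["y"].tolist()    # floats, with NaN where parsing failed
```

After this step the column supports numeric operations directly, for example df['y'].mean(), which skips NaN by default.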


You can use replace:

df['y'] = df['y'].replace({'N/A': np.nan})

Also be aware of the inplace parameter for replace. You can do something like:

df.replace({'N/A': np.nan}, inplace=True)

This replaces every occurrence in df, modifying it in place instead of returning a new DataFrame.

Similarly, if you run into other kinds of placeholder values, such as an empty string or None:

df['y'] = df['y'].replace({'': np.nan})

df['y'] = df['y'].replace({None: np.nan})

Reference: Pandas Latest - Replace
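
If several placeholders occur in the same column, they can also be handled in one call by passing a list to replace. A small sketch on a made-up Series (the data and the names s, cleaned and missing_positions are assumptions for illustration); note that isna() already treats None as missing, so it is not included in the list here:

```python
import numpy as np
import pandas as pd

# Made-up column mixing valid strings with several placeholders
s = pd.Series(["12", "N/A", "", None, "15"])

# A list of placeholder values can be mapped to NaN in one replace() call
cleaned = s.replace(["N/A", ""], np.nan)

# Positions that count as missing afterwards (None is already missing)
missing_positions = cleaned.index[cleaned.isna()].tolist()
```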


df.loc[df.y == 'N/A',['y']] = np.nan

This solves your problem. With the double [] (as in df[...]['y']), you are working on a copy of the DataFrame. You have to specify the exact location in a single call to be able to modify it.


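You can try this snippet. Be aware that df.y[...] = ... is itself a form of chained assignment: it may trigger a warning in recent pandas versions and does not modify df when Copy-on-Write is enabled, so the df.loc form is the safer choice.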

In [16]: mydata = {'x' : [10, 50, 18, 32, 47, 20], 'y' : ['12', '11', 'N/A', '13', '15', 'N/A']}
In [17]: df = pd.DataFrame(mydata)

In [18]: df.y[df.y == "N/A"] = np.nan

In [19]: df
Out[19]:
    x    y
0  10   12
1  50   11
2  18  NaN
3  32   13
4  47   15
5  20  NaN
