Fill NaN values based on key

I am trying to fill null values in one dataframe from another dataframe, matching rows on a key found in both dataframes.

df

parcel     ID
1234       NaN
4586       lmnop
5960       wywy

df1

parcel     ID
1234       abcd
4586       lmnop

Since the parcel number is the same in df and df1, I want to fill only the null values in the ID column based on df1.

I think combine_first() is a good approach, but you need to set the index first; in this case the column parcel is common to both:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'parcel': [1234, 4586, 5960, 9999],
    'ID': [np.nan, 'lmnop', 'wywy', np.nan]
    })

df1 = pd.DataFrame({
    'parcel': [1234, 4586, 9999, 8888],
    'ID': ['abcd', 'lmnop', 'xxx', 'nonexistent']
    })

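# Align both frames on parcel, then fill nulls in df from df1
df_out = df.set_index('parcel').combine_first(df1.set_index('parcel'))
# combine_first keeps the union of both indexes, so drop parcels that exist only in df1
df_out = df_out[df_out.index.isin(df.parcel)].reset_index()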
print(df_out)

Prints:

   parcel     ID
0    1234   abcd
1    4586  lmnop
2    5960   wywy
3    9999    xxx

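You can use map, which lets you look values up in a dictionary. Build the dictionary from df1, map it over the parcel column (the key) rather than over ID, and assign the result back:

# Lookup table: parcel -> ID, built from df1
na_dict = dict(zip(df1.parcel, df1.ID))

# Fill only the null IDs with the value looked up from each row's parcel
df['ID'] = df.ID.fillna(df.parcel.map(na_dict))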
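For reference, here is a minimal self-contained sketch of the map approach run against the sample data from the question. It assumes the column names are exactly parcel and ID (lowercase parcel, as in the question's tables) and that each parcel appears at most once in df1.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'parcel': [1234, 4586, 5960],
    'ID': [np.nan, 'lmnop', 'wywy'],
})
df1 = pd.DataFrame({
    'parcel': [1234, 4586],
    'ID': ['abcd', 'lmnop'],
})

# Lookup table: parcel -> ID, built from df1
na_dict = dict(zip(df1['parcel'], df1['ID']))

# Look up each row's parcel in the table; fillna uses the result only where ID is null
df['ID'] = df['ID'].fillna(df['parcel'].map(na_dict))
print(df)
```

Parcels missing from df1 map to NaN, so a null ID with no match in df1 stays null. Existing non-null IDs are never overwritten.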

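You can use combine_first. Note that it aligns the two dataframes on their index, not on parcel, so this works here only because the rows of df and df1 happen to line up by position: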

df.combine_first(df1)

Output:

   parcel     ID
0    1234   abcd
1    4586  lmnop
2    5960   wywy
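To see why the index matters, here is a small sketch comparing combine_first on the default index with combine_first after set_index('parcel'). The frame df1_shuffled is a hypothetical variant of df1 with its rows reversed; it is not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'parcel': [1234, 4586, 5960],
    'ID': [np.nan, 'lmnop', 'wywy'],
})
# Hypothetical: same rows as df1 in the question, in the opposite order
df1_shuffled = pd.DataFrame({
    'parcel': [4586, 1234],
    'ID': ['lmnop', 'abcd'],
})

# Default index: row 0 of df is filled from row 0 of df1_shuffled,
# so parcel 1234 receives the ID that belongs to parcel 4586
by_position = df.combine_first(df1_shuffled)

# Indexed on parcel: each null ID is filled from the matching parcel
by_key = (
    df.set_index('parcel')
      .combine_first(df1_shuffled.set_index('parcel'))
      .reset_index()
)

print(by_position)
print(by_key)
```

When the row order of the two dataframes is not guaranteed to match, setting parcel as the index first is the safer choice.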

Comments
  • Unfortunately, this did not fill in the NaN values. Does it matter if there are other NaN values within the dataframe?