Python Pandas: replace NaN in one column with the value from another column of the same row, wrapped in a list

Input dataframe

import numpy as np
import pandas as pd

data = {
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
}

df = pd.DataFrame.from_dict(data)
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                       NaN
3   557                       NaN
4   557                       NaN
5   104                       NaN
6   581                       NaN
7    69               [69, 68, 7]

Desired output dataframe:

data = {
    'id': [70, 70, 1148, 557, 557, 104, 581, 69],
    'r_id': [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             [1148], [557], [557], [104], [581], [69, 68, 7]],
}

df = pd.DataFrame.from_dict(data)
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]

I want the target column r_id to hold lists, while the source column id is not a list. I looked at related Stack Overflow questions (python-pandas-replace-nan-in-one-column) and also tried the following:

data_merge_rel.RELATED_DEVICE.fillna(data_merge_rel.DF0_Desc_Label_i.to_list(), inplace=True)

We can use a list comprehension together with Series.fillna.

First we build a temporary column in which every id value is wrapped in a single-element list. Then we fill the NaN entries of r_id from that column:

df['temp'] = [[x] for x in df['id']]
df['r_id'] = df['r_id'].fillna(df['temp'])
df = df.drop(columns='temp')

Or in one line using apply (thanks r.ook):

df['r_id'] = df['r_id'].fillna(df['id'].apply(lambda x: [x]))
print(df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
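
For reference, the one-line version can be wrapped in a small helper so it can be tried on the sample data from the question. This is only a sketch: the function name fill_na_with_singleton and its parameters are made up for illustration, and it assumes the target column holds either lists or NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [70, 70, 1148, 557, 557, 104, 581, 69],
    "r_id": [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
})


def fill_na_with_singleton(frame, target, source):
    """Hypothetical helper: return a copy where NaN in `target`
    is replaced by a one-element list holding the `source` value."""
    out = frame.copy()
    # Wrap every source value in a list; fillna aligns on the index
    # and only uses the entries where `target` is missing.
    singletons = out[source].apply(lambda x: [x])
    out[target] = out[target].fillna(singletons)
    return out


result = fill_na_with_singleton(df, "r_id", "id")
print(result)
```

The helper works on a copy, so the original df still contains its NaN entries afterwards.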

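You can use explode() and groupby(): explode r_id into one row per element, forward-fill along each row so that a NaN takes the id value, then group the rows back into lists:

(df.explode('r_id').ffill(axis=1).reset_index()
   .groupby(['index', 'id'], sort=False).agg(list)
   .reset_index(1))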

         id                      r_id
index                                
0        70  [70, 34, 44, 23, 11, 71]
1        70      [70, 53, 33, 73, 41]
2      1148                    [1148]
3       557                     [557]
4       557                     [557]
5       104                     [104]
6       581                     [581]
7        69               [69, 68, 7]
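
When the dataframe has more columns, aggregating everything with agg(list) also turns those other columns into lists (this is raised in the comments). A possible variation, sketched here under the assumption that only r_id should be rebuilt, fills the exploded r_id from id directly instead of using ffill(axis=1) and aggregates only that column. The extra column label is hypothetical and is there only to show that it is left alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [70, 70, 1148, 557, 557, 104, 581, 69],
    "r_id": [[70, 34, 44, 23, 11, 71], [70, 53, 33, 73, 41],
             np.nan, np.nan, np.nan, np.nan, np.nan, [69, 68, 7]],
    "label": list("abcdefgh"),  # hypothetical extra column
})

# One row per list element; the original row label moves into the 'index' column.
exploded = df.explode("r_id").reset_index()

# Where r_id is missing, take the id of the same row instead.
exploded["r_id"] = exploded["r_id"].where(exploded["r_id"].notna(), exploded["id"])

# Rebuild the lists per original row, aggregating only r_id.
rebuilt = exploded.groupby("index", sort=False)["r_id"].agg(list)

# Assignment aligns on the original row labels.
out = df.assign(r_id=rebuilt)
print(out)
```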

You can convert the id column to a NumPy array, add a dimension, turn the result into a list of single-element lists, and pass it to fillna as a Series:

df['r_id'] = df['r_id'].fillna(pd.Series(df.id.to_numpy()[:,None].tolist(), index=df.index))
print (df)
     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]

Or, if there are only a few NaN values, it may be worth selecting just those rows before doing anything else:

mask_na = df.r_id.isna()
df.loc[mask_na, 'r_id'] = pd.Series(df.loc[mask_na,'id'].to_numpy()[:,None].tolist(), 
                                    index=df[mask_na].index)

I think anky_91's answer will be faster, but you could also try this:

df['r_id'] = np.where(df['r_id'].isnull(),
                      df['id'].apply(lambda x: [x]),
                      df['r_id'])

Output:

     id                      r_id
0    70  [70, 34, 44, 23, 11, 71]
1    70      [70, 53, 33, 73, 41]
2  1148                    [1148]
3   557                     [557]
4   557                     [557]
5   104                     [104]
6   581                     [581]
7    69               [69, 68, 7]
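
Since performance on a bigger dataframe came up, here is a rough way to time the fillna and np.where variants against each other. It is a sketch only: make_frame, with_fillna and with_np_where are names invented for this example, the synthetic data (about half the rows missing) is an assumption, and the timings will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n, seed=0):
    """Build synthetic data: roughly half of the r_id entries are NaN."""
    rng = np.random.default_rng(seed)
    ids = rng.integers(0, 10_000, size=n)
    keep = rng.random(n) < 0.5
    r_id = [[int(i), int(i) + 1] if k else np.nan for i, k in zip(ids, keep)]
    return pd.DataFrame({"id": ids, "r_id": r_id})


def with_fillna(df):
    # fillna with a Series of single-element lists
    return df["r_id"].fillna(df["id"].apply(lambda x: [x]))


def with_np_where(df):
    # np.where returns a plain object array, so wrap it back into a Series
    values = np.where(df["r_id"].isna(),
                      df["id"].apply(lambda x: [x]),
                      df["r_id"])
    return pd.Series(values, index=df.index, name="r_id")


bench = make_frame(10_000)
for func in (with_fillna, with_np_where):
    seconds = timeit.timeit(lambda: func(bench), number=3)
    print(f"{func.__name__}: {seconds:.4f} s for 3 runs")
```

Both functions return a new Series and leave bench untouched, so the same frame can be reused for every run.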


Comments
  • If you are using a list comprehension, why not df['r_id'].fillna(df['id'].apply(lambda x: [x]))?
  • Yes, I was thinking about that as well, but went for the more "readable" approach. I have added it as a second option. Thanks @r.ook
  • Thank you. Can you post both solutions? I have a bigger dataframe and need to check the performance as well.
  • I thought you had two solutions, one with explode and another with groupby.
  • It went and did the same operation on other columns as well :(
  • @vinsentparamanantham you can select the column before agg, such as .groupby(['index','id'],sort=False)['r_id'].agg(list)