Updating pandas dataframe with new column


I want to create a new column containing all the distinct values found across the columns of each row. Each cell value is a string (not a list).

This is what the dataframe looks like:

+-----------------------------+-------------------------+---------------------------------------------+
|         first               |            second       |           third                             |  
+-----------------------------+-------------------------+---------------------------------------------+
|['able', 'shovel', 'door']   |['shovel raised']        |['shovel raised', 'raised', 'door', 'shovel']|
|['grade control']            |['grade']                |['grade']                                    |
|['light telling', 'love']    |['would love', 'closed'] |['closed', 'light']                          |
+-----------------------------+-------------------------+---------------------------------------------+

This is what the dataframe should look like after creating a new column with the distinct values:

df = pd.DataFrame({'first': "['able', 'shovel', 'door']" , 'second': "['shovel raised']", 'third': "['shovel raised', 'raised', 'door', 'shovel']", "Distinct_set": "['able', 'shovel', 'door', 'shovel raised', 'raised']" }, index = [0])

Try this:

df['new_col'] = df.apply(lambda x: list(set(x['first'] + x['second']+x['third'])), axis =1)

This creates a set of single characters, because each cell in your data is a string, not a list:

"['able', 'shovel', 'door']"

To correct this, use the following:

df['new_col'] = df.apply(lambda x: list(set(eval(x['first']) + eval(x['second'])+eval(x['third']))), axis =1)
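As a hedged alternative to eval, the string cells could be parsed with ast.literal_eval from the standard library, which only accepts Python literals. The sketch below assumes every cell is a well-formed list literal like the ones in the question; the helper name distinct_values is made up for illustration.

```python
import ast

import pandas as pd

# Sample row from the question: every cell is a string that looks like a list.
df = pd.DataFrame(
    {
        "first": ["['able', 'shovel', 'door']"],
        "second": ["['shovel raised']"],
        "third": ["['shovel raised', 'raised', 'door', 'shovel']"],
    }
)


def distinct_values(row):
    """Parse each string cell into a list and merge them, keeping first-seen order."""
    merged = []
    for cell in row:
        # literal_eval only evaluates literals, so it cannot run arbitrary code.
        merged.extend(ast.literal_eval(cell))
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(merged))


columns = ["first", "second", "third"]
df["Distinct_set"] = [
    distinct_values(row) for row in df[columns].itertuples(index=False)
]
print(df["Distinct_set"].iloc[0])
# ['able', 'shovel', 'door', 'shovel raised', 'raised']
```

Unlike list(set(...)), this keeps the values in the order they first appear, which matches the expected Distinct_set shown in the question.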

Let's discuss how to add new columns to an existing DataFrame in pandas. There are multiple ways to do this. Method #1: by declaring a new list as a column.

How about this:

import pandas as pd
import numpy as np

df = pd.DataFrame([[['able', 'shovel', 'door'], ['shovel raised'], ['shovel raised', 'raised', 'door', 'shovel']], [['grade control'], ['grade'], ['grade']], [['light telling', 'love'], ['would love', 'closed'], ['closed', 'light']]], columns=['first', 'second', 'third'])

df.apply(lambda row: [np.unique(np.hstack(row))], raw=True, axis=1)

The last command produces:

0        [[able, door, raised, shovel, shovel raised]]
1                             [[grade, grade control]]
2    [[closed, light, light telling, love, would lo...

which can be saved in a new column of the dataframe:

df['Distinct_set'] = df.apply(lambda row: [np.unique(np.hstack(row))], raw=True, axis=1) 

Modifying columns in a DataFrame:

Rename columns. Use the rename() method of the DataFrame to change the name of a column.

Add columns. You can add a column to a DataFrame by assigning an array-like object (list, ndarray, Series) to a new column using the [ ] operator.

pandas.DataFrame.update

DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore')

Modify in place using non-NA values from another DataFrame. Aligns on indices. There is no return value.

Parameters: other: DataFrame, or object coercible into a DataFrame.
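A minimal sketch of the DataFrame.update behaviour described above, using made-up sample data: non-NA values from the other frame overwrite the matching cells in place, NA values are skipped, and the call itself returns None.

```python
import numpy as np
import pandas as pd

# Illustrative data only; float columns keep the dtype stable during the update.
df = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, 500.0, 600.0]})
other = pd.DataFrame({"B": [4.0, np.nan, 6.0]})

# update() modifies df in place and aligns on the index and column labels.
result = df.update(other)

print(result)          # None: there is no return value
print(df["B"].tolist())
# [4.0, 500.0, 6.0]  (the NaN in `other` left 500.0 untouched)
```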

You can try the snippet below:

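import json

def get_list_from_str(s):
    # Turn "['a', 'b']" into valid JSON by swapping the quote characters, then parse it.
    # Note: this breaks if a value itself contains an apostrophe or a double quote.
    return json.loads(s.replace("'", '"'))

def flatten_list_rows(row):
    # Merge the parsed lists from all three columns and keep only the distinct values.
    return set(
        get_list_from_str(row['first']) +
        get_list_from_str(row['second']) +
        get_list_from_str(row['third'])
    )

df['Distinct_set'] = df.apply(flatten_list_rows, axis=1)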

How to update a value in each row of a pandas DataFrame in Python: use the syntax DataFrame[column] and perform the desired operation(s) on each entry in that column. df = pd.DataFrame({"A" : […

Updating the existing DataFrame with a new column. Let us now look at ways to add a new column to an existing DataFrame.

(i) DataFrame.insert(): a new column can be added to an existing dataframe with this method. Its syntax is as follows: DataFrame.insert(loc, column, value, allow_duplicates=False)
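The two operations mentioned above could look like the following sketch. The column names and values are invented for illustration; only DataFrame[column] assignment and DataFrame.insert(loc, column, value) come from the text.

```python
import pandas as pd

# Illustrative frame with two columns.
df = pd.DataFrame({"first": ["a", "b"], "third": ["e", "f"]})

# Operate on every entry of one column and assign the result back.
df["first"] = df["first"].str.upper()

# Insert a new column at position 1 (between "first" and "third").
# insert() works in place and raises if the column already exists,
# unless allow_duplicates=True is passed.
df.insert(1, "second", ["c", "d"])

print(list(df.columns))
# ['first', 'second', 'third']
```

Plain assignment with df["second"] = [...] would append the column at the end instead; insert is mainly useful when the position matters.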

How to change/update a cell value in a Python pandas dataframe: the at() method added a new column in my data frame instead of modifying Col B of my dataframe. Was this the case for anyone else?

While working with data in pandas, we perform a vast array of operations to get the data into the desired form. One of these operations is creating new columns in the DataFrame based on the result of some operations on its existing columns.

A quick and dirty solution that all of us have tried at least once while working with pandas is re-creating the entire dataframe after adding the new row or column in the source (CSV, TXT, DB, etc.). pandas is a feature-rich data analytics library and offers many features for these simple tasks of adding, deleting, and updating.

Accessing or setting a single value is sometimes required when we don't want to create a new DataFrame just to update one cell. Indexing and slicing methods are available, but for accessing a single cell pandas provides the built-in accessors at and iat.
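A short sketch of the at and iat accessors, with made-up data: at addresses a cell by row and column label, iat by integer position.

```python
import pandas as pd

# Illustrative frame with a labelled index.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]}, index=["x", "y", "z"])

# Read a single cell by label.
value = df.at["y", "B"]

# Overwrite single cells in place: by label with at, by position with iat.
df.at["y", "B"] = 99
df.iat[0, 0] = 7

print(value)             # 20
print(df["B"].tolist())  # [10, 99, 30]
print(df["A"].tolist())  # [7, 2, 3]
```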

Comments
  • Please provide reproducible code that produces the data so that people can copy and use it directly. Also, can you please provide more details?
  • Could you tell us a bit more about what you are trying to achieve? The code you have tried would also help.
  • It is creating a set of individual letters and punctuation marks, whereas I need a set of the quoted words.
  • @Mohit Sharma, avoid using eval; it can have very destructive effects. See stackoverflow.com/questions/1832940/…
  • You missed that the list is stored as a string value and needs converting. Read the question again.
  • The brackets in the example from the main question are misleading, i.e. are they supposed to be part of the actual string? Once the author provides the code to generate the dataframe I will reassess.
  • He mentioned in the question, "Each value in a row is a string(not list)". If you look at the first column of the expected df, you will see it too.
  • OK, you are right. This means that the square brackets are part of the strings, which is weird...