Updating a column in one data frame with values from another data frame based on matching values


I have a data frame "z"

   letter color
1       a     0
2       e     0
3       b     0
4       b     0
5       d     0
6       d     0
7       a     0
8       b     0
9       c     0
10      d     0
11      c     0
12      c     0
13      c     0
14      c     0
15      e     0
16      e     0
17      a     0
18      d     0
19      e     0
20      b     0

and another data frame "y"

  letter color
1      a   red
2      b  blue
3      c green

When the letter in z matches a letter in y, I would like to insert the color from y into the corresponding color field in z, but I do not want to remove any values from z. If no match occurs, z$color should remain unchanged. I used "0" as a placeholder in z$color; this could be text instead.

I've been attempting things with for loops, the match() function and statements with %in%, but I'm not quite achieving the results I'm after.

Any ideas?

This is the code I used for the data frames

set.seed(3)
z=data.frame(sample(c("a","b","c","d","e"),20,replace=T))
names(z)="letter"
z$color=rep(0,dim(z)[1])
z

y1=c("a","b","c")
y2=c("red","blue","green")
y=data.frame(cbind(y1,y2))
names(y)=c("letter","color")
y

You don't need z$color in the first place if it's just a placeholder; you can replace the NAs with 0 afterwards:

z$color<-y[match(z$letter, y$letter),2]
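If z$color already holds values you want to keep, a variation on the same match() idea is to assign only where a match was found, so unmatched rows are left untouched instead of becoming NA. This is a minimal sketch, assuming both color columns are character rather than factor; the small z below is a made-up stand-in for the question's data, and idx and hit are names used only for illustration.

```r
# Small stand-in data; stringsAsFactors = FALSE keeps the columns as character
z <- data.frame(letter = c("a", "e", "b", "d", "c"),
                color  = "0",
                stringsAsFactors = FALSE)
y <- data.frame(letter = c("a", "b", "c"),
                color  = c("red", "blue", "green"),
                stringsAsFactors = FALSE)

idx <- match(z$letter, y$letter)   # row of y for each letter in z, NA if absent
hit <- !is.na(idx)                 # TRUE where a match exists
z$color[hit] <- y$color[idx[hit]]  # overwrite matched rows only
z
```

Rows with letters "e" and "d" keep their "0" placeholder, and the row order of z is unchanged.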


You can use merge:

dat <- merge(z, y, by = "letter", all.x = TRUE)
transform(dat, color = ifelse(is.na(color.y), 
                              color.x, as.character(color.y)))[-(2:3)]

   letter color
1       a   red
2       a   red
3       a   red
4       b  blue
5       b  blue
6       b  blue
7       b  blue
8       c green
9       c green
10      c green
11      c green
12      c green
13      d     0
14      d     0
15      d     0
16      d     0
17      e     0
18      e     0
19      e     0
20      e     0
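Note that merge() sorts the result by the key column by default, which is why the rows above come back grouped by letter. If the original order of z matters, one possible workaround is to carry a row counter through the merge and sort on it afterwards. The sketch below assumes character columns; row_id is a hypothetical helper column, and the small z is made-up stand-in data.

```r
z <- data.frame(letter = c("d", "a", "e", "b", "a"),
                color  = "0",
                stringsAsFactors = FALSE)
y <- data.frame(letter = c("a", "b", "c"),
                color  = c("red", "blue", "green"),
                stringsAsFactors = FALSE)

z$row_id <- seq_len(nrow(z))                     # remember the original order
dat <- merge(z, y, by = "letter", all.x = TRUE)  # yields color.x (z) and color.y (y)
# Take y's color where a match exists, otherwise keep z's value
dat$color <- ifelse(is.na(dat$color.y), dat$color.x, dat$color.y)
dat <- dat[order(dat$row_id), c("letter", "color")]  # restore order, drop helpers
rownames(dat) <- NULL
dat
```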


sqldf/sqlite is very flexible:

library(sqldf)
z$color <- "0"  # character placeholder, to avoid numeric/character type conflicts
z <- sqldf(c(
  "UPDATE z
      SET color = (SELECT y.color
                     FROM y
                    WHERE z.letter = y.letter)
    WHERE EXISTS (SELECT 1
                    FROM y
                   WHERE z.letter = y.letter)",
  "select * from main.z"
))
z
   letter color
1       b  blue
2       a   red
3       d   0.0
4       d   0.0
5       e   0.0
6       a   red
7       a   red
8       c green
9       b  blue
10      c green
11      e   0.0
12      c green
13      b  blue
14      d   0.0
15      d   0.0
16      d   0.0
17      c green
18      e   0.0
19      a   red
20      c green
