Combine two columns while giving priority to the first one

Following up on an earlier question, I have two DataFrames and want to left join dfB onto dfA, replacing NaN values with non-NaN values wherever I have them.

That is,

>>> dfA
  s_name  geo    zip  date value
0      A  zip  60601  2010   NaN  # In the earlier question, this was None
1      B  zip  60601  2010   NaN  # rather than NaN, which was
2      C  zip  60601  2010   NaN  # a mistake.
3      D  zip  60601  2010   NaN

>>> dfB
  s_name  geo    zip  date  value
0      A  zip  60601  2010    1.0
1      B  zip  60601  2010    NaN
3      D  zip  60601  2010    4.0

Merging them, I see:

>>> new = pd.merge(dfA, dfB, on=["s_name", "geo", "zip", "date"], how="left")
>>> new.head()
  name    geo   zip  date  value_x  value_y
0    A  state    01  2009      NaN      1.0
1    B  state    01  2010      NaN      NaN
2    C  state    01  2011      NaN      NaN
3    D  state    01  2012      NaN      4.0
4    E  state    01  2013      NaN      5.0

I can't be sure that value_y is always populated and value_x is always NaN. What I want is a single merged column, call it value, that holds whichever of the two is not NaN. I tried this:

>>> new["value"] = new.apply(lambda r: r.value_x or r.value_y, axis=1)
>>> new.head()
  name    geo   zip  date  value_x  value_y  value
0    A  state    01  2009      NaN      1.0    NaN
1    B  state    01  2010      NaN      NaN    NaN
2    C  state    01  2011      NaN      NaN    NaN
3    D  state    01  2012      NaN      4.0    NaN
4    E  state    01  2013      NaN      5.0    NaN

Oh no.

That makes sense in that NaN propagates, but it is not what I'm looking for. I'd like logic that returns whichever value is present, rather than returning NaN whenever either one is NaN.

I'd like the logic that None gives me. You can see:

>>> new["value_z"] = None
>>> new.head()
  name    geo   zip  date  value_x  value_y  value value_z
0    A  state    01  2009      NaN      1.0    NaN    None
1    B  state    01  2010      NaN      NaN    NaN    None
2    C  state    01  2011      NaN      NaN    NaN    None
3    D  state    01  2012      NaN      4.0    NaN    None
4    E  state    01  2013      NaN      5.0    NaN    None

>>> new["value2"] = new.apply(lambda r: r.value_z or r.value_y, axis=1)
>>> new.head()
  name    geo   zip  date  value_x  value_y  value value_z   value2
0    A  state    01  2009      NaN      1.0    NaN    None      1.0
1    B  state    01  2010      NaN      NaN    NaN    None      NaN
2    C  state    01  2011      NaN      NaN    NaN    None      NaN
3    D  state    01  2012      NaN      4.0    NaN    None      4.0
4    E  state    01  2013      NaN      5.0    NaN    None      5.0

The logic that creates value2 is the behavior I'm looking for, not value.
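
The difference between value and value2 comes down to truthiness rather than NaN propagation: Python's `or` returns its first truthy operand, NaN is a non-zero float and therefore truthy, while None is falsy. A minimal sketch using only plain Python floats (the variable names are just for illustration) to check this:

```python
import math

nan = float("nan")

# NaN is a non-zero float, so it counts as truthy; None is falsy.
assert bool(nan) is True
assert bool(None) is False

# `a or b` returns a when a is truthy, otherwise b.
picked_from_nan = nan or 1.0    # NaN wins, 1.0 is never looked at
picked_from_none = None or 1.0  # None is skipped, 1.0 is returned

print(math.isnan(picked_from_nan))  # True
print(picked_from_none)             # 1.0
```

By the same reasoning, `or` would also skip a legitimate 0.0 in value_x, so an explicit NaN test is the safer building block.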

What's the best way to do this?


If you have a preference for value_x, you could try the following (here df is the merged frame, called new in the question):

df.value_x = df.value_x.fillna(df.value_y)
df.pop('value_y')

or:

df.value_x = df.value_x.fillna(df.pop('value_y'))

>>> df
  name    geo  zip  date  value_x
0    A  state    1  2009      1.0
1    B  state    1  2010      NaN
2    C  state    1  2011      NaN
3    D  state    1  2012      4.0
4    E  state    1  2013      5.0
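
A self-contained sketch of the fillna approach, assuming a small hand-built frame that stands in for the merged result. The sample values are made up for illustration, and row "C" has both columns populated to make the priority visible:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the merged frame; row "C" has both values present.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "value_x": [np.nan, np.nan, 3.0, np.nan],
    "value_y": [1.0, np.nan, 30.0, 4.0],
})

# Keep value_x where it is present, fall back to value_y otherwise,
# and remove both helper columns from the frame in the same step.
df["value"] = df.pop("value_x").fillna(df.pop("value_y"))

print(df)
```

Row "C" keeps 3.0 from value_x even though value_y holds 30.0, and row "B" stays NaN because neither column has a value.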

pandas.DataFrame.combine_first: Combine two DataFrame objects by filling null values in one DataFrame with non-null values from other DataFrame.


combine_first will work after merge:

dfC = pd.merge(dfA, dfB, on=["s_name", "geo", "zip", "date"], how="left")
dfC['value'] = dfC.pop('value_x').combine_first(dfC.pop('value_y'))
dfC

  s_name  geo    zip  date  value
0      A  zip  60601  2010    1.0
1      B  zip  60601  2010    NaN
2      C  zip  60601  2010    NaN
3      D  zip  60601  2010    4.0

combine_first gives preference to "value_x" over "value_y". You can also write this as:

dfC = pd.merge(dfA, dfB, on=["s_name", "geo", "zip", "date"], how="left")
dfC['value_x'] = dfC['value_x'].combine_first(dfC.pop('value_y'))
dfC

  s_name  geo    zip  date  value_x
0      A  zip  60601  2010      1.0
1      B  zip  60601  2010      NaN
2      C  zip  60601  2010      NaN
3      D  zip  60601  2010      4.0
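
To try this end to end, here is a minimal sketch that rebuilds dfA and dfB from the question before merging. The dtypes (zip as a string, date as an integer) and the `keys` variable are assumptions, since the question only shows the printed frames:

```python
import numpy as np
import pandas as pd

keys = ["s_name", "geo", "zip", "date"]

# dfA has every key but no values yet (assumed dtypes: zip as str, date as int).
dfA = pd.DataFrame({
    "s_name": ["A", "B", "C", "D"],
    "geo": "zip",
    "zip": "60601",
    "date": 2010,
    "value": np.nan,
})

# dfB is missing row "C" and has a NaN for "B".
dfB = pd.DataFrame({
    "s_name": ["A", "B", "D"],
    "geo": "zip",
    "zip": "60601",
    "date": 2010,
    "value": [1.0, np.nan, 4.0],
})

# The left join keeps all rows of dfA; the clashing "value" columns
# come back as value_x (from dfA) and value_y (from dfB).
dfC = pd.merge(dfA, dfB, on=keys, how="left")

# Coalesce: take value_x where present, otherwise value_y.
dfC["value"] = dfC.pop("value_x").combine_first(dfC.pop("value_y"))

print(dfC)
```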



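This technically works by hammering out the logic, but it is ugly and feels like a hack. Note that it does not really give preference to value_x: whenever value_x is not NaN, short-circuiting makes the expression return True rather than the value, so it only matches value2 here because value_x is NaN in every row: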

>>> new["value3"] = new.apply(lambda r: (not(pd.isna(r.value_x)) or r.value_y) or (r.value_x or not(pd.isna(r.value_y))), axis=1)

>>> new.head()
  name    geo   zip  date  value_x  value_y  value value_z   value2 value3
0    A  state    01  2009      NaN      1.0    NaN    None      1.0    1.0
1    B  state    01  2010      NaN      NaN    NaN    None      NaN    NaN
2    C  state    01  2011      NaN      NaN    NaN    None      NaN    NaN
3    D  state    01  2012      NaN      4.0    NaN    None      4.0    4.0
4    E  state    01  2013      NaN      5.0    NaN    None      5.0    5.0
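
For comparison, a minimal sketch of a row-wise coalesce written as a conditional expression, which returns the value itself rather than a boolean. The helper names `coalesce_row` and `boolean_hack` and the sample data are assumptions for illustration:

```python
import numpy as np
import pandas as pd

def coalesce_row(x, y):
    """Return x unless it is NaN/None, in which case return y."""
    return y if pd.isna(x) else x

def boolean_hack(x, y):
    """The boolean expression from above, wrapped for comparison."""
    return (not pd.isna(x) or y) or (x or not pd.isna(y))

# Hypothetical data; the last row has value_x present.
new = pd.DataFrame({
    "value_x": [np.nan, np.nan, 2.0],
    "value_y": [1.0, np.nan, 7.0],
})

new["value3"] = new.apply(lambda r: coalesce_row(r.value_x, r.value_y), axis=1)
print(new)

# With value_x present, the boolean form yields True instead of 2.0.
print(boolean_hack(2.0, 7.0))
```

For whole columns, the vectorized fillna or combine_first forms are usually preferable to apply with axis=1, which calls the function once per row.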
