Combining text values in a pandas dataframe column based on same value in another column


I have data where I may have different people associated with the same entry.

I need to combine the two entries together and note that two people are on it.

For example, the data may look like:

Name Share_ID value1 value2 value3 etc.
Joe  0001     1      2      4
Ann  0002     2      5      2
Mel  0001     1      2      4

The output would need to be:

Name      Share_ID value1 value2 value3 etc.
Joe, Mel  0001     1      2      4
Ann       0002     2      5      2

I tried to use groupby:

df1.groupby(['Share_ID'])['Name'].apply(', '.join).reset_index()

But my result from that was just:

Share_ID Name
0001     Joe, Mel
0002     Ann

The Name column combined correctly, but I lost the other columns. Note that I do not want anything applied to the other columns: Joe's and Mel's records are identical.

I think my approach is off, but I'm not sure what function to use.


Starting where you left off, you could join your grouped result back to the initial DataFrame:

# Find the merged name data set and rename the 'Name' column
names = df1.groupby(['Share_ID'])['Name'].apply(', '.join).reset_index().rename(columns={'Name':'Merged Name'})
# Join it to the original dataset
df1 = df1.merge(names, on='Share_ID')
# Drop the 'Name' column then drop duplicates.
df1 = df1.drop(columns=['Name']).drop_duplicates()
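
As a self-contained check, here is a minimal sketch of the same steps run on the sample rows from the question. The DataFrame construction is an assumption (the question only shows the printed table), and Share_ID is stored as a string so the leading zeros survive.

```python
import pandas as pd

# Sample rows from the question; the exact dtypes are an assumption.
df1 = pd.DataFrame({
    "Name": ["Joe", "Ann", "Mel"],
    "Share_ID": ["0001", "0002", "0001"],
    "value1": [1, 2, 1],
    "value2": [2, 5, 2],
    "value3": [4, 2, 4],
})

# One row per Share_ID holding the comma-separated names.
names = (
    df1.groupby("Share_ID")["Name"]
    .apply(", ".join)
    .reset_index()
    .rename(columns={"Name": "Merged Name"})
)

# Attach the merged names, drop the per-person column, then collapse
# the rows that have become identical.
result = (
    df1.merge(names, on="Share_ID")
    .drop(columns=["Name"])
    .drop_duplicates()
    .reset_index(drop=True)
)
print(result)
```

Note that drop_duplicates() compares every remaining column, so two rows with the same Share_ID only collapse when their value columns really are identical.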


You can take the outcome you got, merge it with the original dataframe (minus its Name column, so the merge does not produce Name_x and Name_y), and drop duplicates:

pd.merge(df1.groupby(['Share_ID'])['Name'].apply(', '.join).reset_index(), df1.drop(columns=['Name']), on='Share_ID').drop_duplicates(subset='Share_ID')


Is there a particular reason not to include the value columns in the groupby?

df1.groupby(['Share_ID','value1', 'value2', 'value3'])['Name'].apply(', '.join).reset_index()

This gives the required rows, because the value columns are identical within each Share_ID. Note that Name ends up as the last column.
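
If listing every value column by hand is impractical, one variant (not taken from the answers here) is a single groupby with agg: join the names and keep the first value of every other column. This is only a sketch, and it assumes, as the question states, that the non-name columns are identical within each Share_ID. The sample DataFrame and the helper names other_cols and agg_map are illustrative.

```python
import pandas as pd

# Sample rows from the question; the exact dtypes are an assumption.
df1 = pd.DataFrame({
    "Name": ["Joe", "Ann", "Mel"],
    "Share_ID": ["0001", "0002", "0001"],
    "value1": [1, 2, 1],
    "value2": [2, 5, 2],
    "value3": [4, 2, 4],
})

# Every column except the grouping key and the column being joined.
other_cols = [c for c in df1.columns if c not in ("Share_ID", "Name")]

# Join the names; take the first value per group for everything else.
agg_map = {"Name": ", ".join, **{c: "first" for c in other_cols}}

result = df1.groupby("Share_ID", as_index=False).agg(agg_map)
print(result)
```

Because the aggregation map lists Name first, the output keeps Share_ID, Name, and then the value columns in their original order.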
