Rename variously formatted column headers in pandas

I'm working on a small tool that does some calculations on a dataframe, let's say something like this:

df['column_c'] = df['column_a'] + df['column_b']

For this to work, the dataframe needs to have the columns 'column_a' and 'column_b'. I would like this code to work even if the columns are named slightly differently in the import file (csv or xlsx), for example 'columnA', 'Col_a', etc.

The easiest way would be renaming the columns inside the imported file, but let's assume this is not possible. Therefore I would like to do something like this:

if column name is in list ['columnA', 'Col_A', 'col_a', 'a'... ] rename it to 'column_a'

I was thinking about having a dictionary of possible column names: when a column name is found in this dictionary, it is renamed to 'column_a'. An additional complication is that the columns can be in arbitrary order.

How would one solve this problem?

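Simply build a new list of names and assign it back (a pandas Index is immutable, so its items cannot be assigned one by one):

new_columns = list(df.columns)
for index, column_name in enumerate(new_columns):
    if column_name in ['columnA', 'Col_A', 'col_a']:
        new_columns[index] = 'column_a'
df.columns = new_columns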

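With a dictionary:

dico = {'column_a': ['columnA', 'Col_A', 'col_a'], 'column_b': ['columnB', 'Col_B', 'col_b']}
new_columns = list(df.columns)
for index, column_name in enumerate(new_columns):
    # .items() yields (target name, list of accepted aliases) pairs
    for name, ex_names in dico.items():
        if column_name in ex_names:
            new_columns[index] = name
df.columns = new_columns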

I recommend you formulate the conversion logic and write a function accordingly:

lst = ['columnA', 'Col_A', 'col_a', 'a']

def converter(x):
    return 'column_'+x[-1].lower()

res = list(map(converter, lst))

['column_a', 'column_a', 'column_a', 'column_a']

You can then use this directly in pd.DataFrame.rename:

df = df.rename(columns=converter)

Example usage:

df = pd.DataFrame(columns=['columnA', 'col_B', 'c'])
df = df.rename(columns=converter)

print(df.columns)

Index(['column_a', 'column_b', 'column_c'], dtype='object')

This should solve it:

import pandas as pd
df = pd.DataFrame({'colA': [1, 2], 'columnB': [3, 4]})
def rename_df(col):
    if col in ['columnA', 'Col_A', 'colA' ]:
        return 'column_a'
    if col in ['columnB', 'Col_B', 'colB' ]:
        return 'column_b'
    return col
df = df.rename(rename_df, axis=1)
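
As a rough end-to-end sketch of the approach above, the following applies the same rename function to a frame whose columns arrive in a different order, then runs the calculation from the question. The input data and the extra 'other' column are made up for illustration.

```python
import pandas as pd

# Hypothetical input: aliases in arbitrary order, plus an unrelated column.
df = pd.DataFrame({'columnB': [3, 4], 'other': [0, 0], 'colA': [1, 2]})

def rename_df(col):
    # Map each known alias to its canonical name; leave other names alone.
    if col in ['columnA', 'Col_A', 'colA']:
        return 'column_a'
    if col in ['columnB', 'Col_B', 'colB']:
        return 'column_b'
    return col

df = df.rename(rename_df, axis=1)

# The calculation from the question now works regardless of column order.
df['column_c'] = df['column_a'] + df['column_b']
print(df)
```

Because the function is applied to each label independently, the position of the columns in the file does not matter.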

If you have lists of the alternative names, such as list_othername_A and list_othername_B, you can do:

for col_name in df.columns:
    if col_name in list_othername_A:
        df = df.rename(columns = {col_name : 'column_a'})
    elif col_name in list_othername_B:
        df = df.rename(columns = {col_name : 'column_b'})
    elif ...

EDIT: using the dictionary of @djangoliv, you can do even shorter:

dico = {'column_a':['columnA', 'Col_A', 'col_a' ], 'column_b':['columnB', 'Col_B', 'col_b' ]}
#create a dict to rename, kind of reverse dico:
dict_rename = {col:key for key in dico.keys() for col in dico[key]}
# then just rename:
df = df.rename(columns = dict_rename )

Note that this method does not give a usable result if df contains both 'columnA' and 'Col_A', since both would be renamed to 'column_a' and you would end up with duplicate column names. Otherwise it should work, because rename ignores any key in dict_rename that is not in df.columns.
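
One possible way to guard against that duplicate-name case is to check the renamed columns before using them. The sketch below is only an illustration: rename_checked is a hypothetical helper name, not part of pandas, and the sample frame is invented.

```python
import pandas as pd

dico = {'column_a': ['columnA', 'Col_A', 'col_a'],
        'column_b': ['columnB', 'Col_B', 'col_b']}
# Reverse the dictionary: alias -> canonical name.
dict_rename = {col: key for key, aliases in dico.items() for col in aliases}

def rename_checked(df, mapping):
    """Hypothetical helper: rename columns, raising if names collide."""
    renamed = df.rename(columns=mapping)
    # Index.duplicated() flags every repeated label after its first occurrence.
    duplicated = renamed.columns[renamed.columns.duplicated()].unique().tolist()
    if duplicated:
        raise ValueError(f"several columns map to the same name: {duplicated}")
    return renamed

# Made-up sample input with one alias per target column.
ok = rename_checked(pd.DataFrame({'Col_A': [1], 'columnB': [2]}), dict_rename)
print(list(ok.columns))
```

Raising an error is just one choice here; depending on the tool, warning and keeping the first match might be preferable.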

Comments
  • edited my question to make it more clear
  • This looks like a neat and compact solution, exactly what I was looking for, thank you!
  • Yes, I could see this work for renaming one column at a time. But not for renaming multiple columns at once, right?
  • @XanderMJ see now with several lists of names