Renaming columns in a pandas DataFrame using regular expressions

I have a pandas DataFrame and I want to remove the trailing space at the end of each column name. I tried something like:

raw_data.columns.values = re.sub(' $','',raw_data.columns.values)

But this does not work. What did I do wrong?

I should have applied re.sub to each column name through rename:

raw_data = raw_data.rename(columns=lambda x: re.sub(' $','',x))
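A minimal sketch of why the first attempt fails and the second works, assuming a small made-up frame (the column names "a " and "b " are hypothetical): re.sub expects one string, while raw_data.columns.values is an array of names.

```python
import re

import pandas as pd

# Hypothetical frame whose column names carry a trailing space.
raw_data = pd.DataFrame({"a ": [1, 2], "b ": [3, 4]})

# re.sub works on a single string, so passing the whole array of names fails.
try:
    re.sub(r" $", "", raw_data.columns.values)
except TypeError as exc:
    print("TypeError:", exc)

# rename calls the function once per column name, so each call receives a str.
cleaned = raw_data.rename(columns=lambda name: re.sub(r" $", "", name))
print(list(cleaned.columns))
```

rename returns a new DataFrame here, which is why the result is assigned to a name rather than discarded.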

Renaming columns in pandas dataframe using regular expressions: I'd use map:

In [11]: df.columns.map(lambda x: int(x[1:]))
Out[11]: array([2010, 2011, 2012, 2013])

In [12]: df.columns = df.columns.map(lambda x: int(x[1:]))

Method #1: Using the rename() function. One way of renaming the columns in a pandas DataFrame is the rename() function. It is most useful when only some columns need renaming, because you specify information only for the columns that are to be renamed.
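The map approach quoted above can be tried on a small frame. This is only a sketch: the column names "y2010" and "y2011" are made up to match the shape of that example, where the first character is dropped and the rest is converted to an integer.

```python
import pandas as pd

# Hypothetical frame with a one-letter prefix in front of each year.
df = pd.DataFrame([[1, 2]], columns=["y2010", "y2011"])

# Index.map applies the function to every label and returns a new Index.
df.columns = df.columns.map(lambda label: int(label[1:]))
print(list(df.columns))
```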

I would recommend using pandas.Series.str.strip, which is also available on the column index through the .str accessor:

df.columns = df.columns.str.strip()
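Note that str.strip removes whitespace on both sides of each name. A short sketch of the alternatives, assuming made-up column names, in case only trailing whitespace should go:

```python
import pandas as pd

# Hypothetical names: one padded on both sides, one only at the end.
df = pd.DataFrame({" a ": [1], "b  ": [2]})

both_sides = df.columns.str.strip()        # leading and trailing whitespace
trailing_only = df.columns.str.rstrip()    # trailing whitespace only
# The same trailing-only cleanup written as a regular expression.
with_regex = df.columns.str.replace(r"\s+$", "", regex=True)

print(list(both_sides), list(trailing_only), list(with_regex))
```

None of these calls modifies df; assign the result to df.columns to apply it.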

Replace values in Pandas dataframe using regex: Additionally, we will use the DataFrame.apply() function to apply our customized function to each value in the column.

Use pandas.DataFrame.rename(): You can use the rename() method of pandas.DataFrame to change any row / column name individually (see pandas.DataFrame.rename in the pandas 0.22.0 documentation). Specify the original name and the new name in a dict like {original name: new name} for the index / columns arguments of rename(). index is for index names and columns is for column names.
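As an illustration of the dict form of rename() described above, here is a small sketch. The labels "r1", "A", and their replacements are hypothetical; labels missing from the dicts are left unchanged.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["r1", "r2"])

# Only the labels named in each dict are renamed.
renamed = df.rename(index={"r1": "first"}, columns={"A": "alpha"})

print(list(renamed.index), list(renamed.columns))
```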

The answer from @Christian is probably right for this specific question, but for the more general problem of replacing text in column names, I would suggest building a dictionary comprehension and passing it to rename:

df.rename(columns={element: re.sub(r'^ (.+)', r'\1', element, flags=re.MULTILINE) for element in df.columns.tolist()})

In my case, I wanted to add something to the beginning of each column name, so:

df.rename(columns={element: re.sub(r'(.+)',r'x_\1', element) for element in df.columns.tolist()})

rename returns a new DataFrame and leaves the original untouched, so either assign the result back or pass inplace=True to change the DataFrame itself.
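A minimal sketch of the dictionary-comprehension approach with the prefix example, assuming a made-up frame with columns "a" and "b". It shows both ways of keeping the result:

```python
import re

import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2]})

# Build {old name: new name} by running the regex over every column name.
mapping = {name: re.sub(r"(.+)", r"x_\1", name) for name in df.columns}

renamed = df.rename(columns=mapping)       # new frame, df is unchanged
before_inplace = list(df.columns)

df.rename(columns=mapping, inplace=True)   # modifies df, returns None
print(before_inplace, list(renamed.columns), list(df.columns))
```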

How to rename columns in Pandas DataFrame: The rename method gained an axis parameter, which may be set to "columns" or 1. This makes the method match the rest of the pandas API. It still has the index and columns parameters, but you are no longer forced to use them. The set_axis method, with inplace set to False, lets you rename all the index or column labels with a list. (Recent pandas releases no longer accept inplace in set_axis; the method always returns a new object.)
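The two calls mentioned above might look like this. It is a sketch assuming a reasonably recent pandas, where set_axis returns a new object, and made-up column names:

```python
import pandas as pd

df = pd.DataFrame({"a ": [1], "b ": [2]})

# rename with a function and axis="columns" instead of the columns= keyword.
by_axis = df.rename(str.strip, axis="columns")

# set_axis replaces every label at once from a list of the same length.
relabeled = df.set_axis(["first", "second"], axis=1)

print(list(by_axis.columns), list(relabeled.columns))
```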

pandas.DataFrame.replace, regex parameter: regexes matching to_replace will be replaced with value. Note: with inplace=True, this will modify any other views on this object (e.g. a column from a DataFrame).

When data follows some pattern, regular expressions can be used to deal with it. A previous article discussed how to replace known string values in a DataFrame; this post uses regular expressions to replace strings that follow a pattern.
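For cell values rather than column names, a regex replace could look like the following sketch. The "city" column and its values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"city": ["New York ", "Boston  "]})

# With regex=True, the first argument is treated as a pattern that is
# substituted inside each string value.
cleaned = df.replace(r"\s+$", "", regex=True)

print(cleaned["city"].tolist())
```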

pandas.Series.str.replace: use of case, flags, or regex=False with a compiled regex will raise an error.

I'm trying to apply some regular expressions that I have coded up and can run against a variable, but I would like to apply them to a DataFrame column and then write the results to a new column. df["Details"] is my DataFrame column, and it contains some text.
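One way to run a regular expression over a column and store the result in a new column is Series.str.extract. This is a sketch only: the contents of "Details", the pattern, and the "order_id" column name are all hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Details": ["order 123 shipped", "order 45 pending"]})

# expand=False returns a Series for a single capture group; rows that do
# not match would get a missing value.
df["order_id"] = df["Details"].str.extract(r"order (\d+)", expand=False)

print(df["order_id"].tolist())
```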

Class notes on replacing values and strings: Renaming columns on an existing dataframe. The "real" NaN is from numpy, the numeric powerhouse hiding inside of pandas. You need to make sure to specify both regex=True to use regular expressions and inplace=True to save the result back.

There are several pandas methods that accept a regex to find a pattern in a string within a Series or DataFrame object. These methods work along the same lines as Python's re module. They are really helpful if you want to find names starting with a particular character, search for a pattern within a DataFrame column, or extract dates from text.

Comments
  • If the empty space is at the end of the column names, shouldn't it be re.sub(' $', '')?
  • Sorry for the wrong regex there. I tried the corrected one, but Python raised a TypeError.
  • This will only remove one space. Use this if you want to remove all of them: raw_data = raw_data.rename(columns=lambda x: re.sub(r'[ ]*$', '', x))
  • Would this work if the resulting column names were non-unique?
  • I tried my hand at looping through columns [for col in df.columns], using regex to identify what I needed to remove, and then re-name the columns one-by-one, which worked just fine, but was very slow. Your solution is much more performant! (for context: I was stripping a column-counter, i.e. (1), (2), ... (300), so that's why I needed regex)
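On the last two comments: a vectorized regex over the column index avoids the per-column loop, and pandas does allow the resulting names to be non-unique. A rough sketch, with made-up column names that carry a counter such as "(1)":

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["x (1)", "x (2)", "y (3)"])

# Strip a trailing "(n)" counter from every name in one call.
stripped = df.columns.str.replace(r"\s*\(\d+\)$", "", regex=True)

# Check for collisions before assigning the names back.
has_duplicates = stripped.duplicated().any()

df.columns = stripped
# With duplicate names, selecting "x" returns a DataFrame with both columns.
print(bool(has_duplicates), df["x"].shape)
```

Whether duplicate names are acceptable depends on what the frame is used for afterwards; the check above only reports them.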