Pandas - Replace number-strings with the name ID

I have a column in my dataframe in which every value is a string. Some values are TAG (machines/computers), some are other items, and the rest are IDs. I want to change every string that is an ID to "ID" instead of the number-string.

type(df.columnOne[1])
str 

This is what my df column looks like:

df
  columnOne
0 TAG
1 1115268
2 13452
3 system
4 TAG
5 355511
6 95221543
7 5124
8 111333544
9 TAG
10 local
11 434312

Desired output:

df
  columnOne
0 TAG
1 ID
2 ID
3 system
4 TAG
5 ID
6 ID
7 ID
8 ID
9 TAG
10 local
11 ID

Normally I would replace anything that doesn't equal TAG, system, or local with ID, but the set of names keeps changing.

If I understand correctly, you can use str.isnumeric:

df.loc[df.columnOne.str.isnumeric(),'columnOne'] = 'ID'

>>> df
   columnOne
0        TAG
1         ID
2         ID
3     system
4        TAG
5         ID
6         ID
7         ID
8         ID
9        TAG
10     local
11        ID
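
For reference, here is a self-contained sketch of the same approach that rebuilds the column from the question first. The variable names `mask` and `result` are only illustrative, and it assumes the column contains no missing values (with NaN present, `str.isnumeric` yields NaN for those rows and the mask would need filling before use).

```python
import pandas as pd

# Rebuild the example column from the question (all values are strings).
df = pd.DataFrame({"columnOne": ["TAG", "1115268", "13452", "system", "TAG",
                                 "355511", "95221543", "5124", "111333544",
                                 "TAG", "local", "434312"]})

# Boolean mask: True where the whole string consists of numeric characters.
mask = df["columnOne"].str.isnumeric()

# Overwrite only the rows selected by the mask.
df.loc[mask, "columnOne"] = "ID"

result = df["columnOne"].tolist()
print(result)
```

Because the mask is computed from the values themselves, it does not matter which non-numeric names (TAG, system, local, or anything new) appear in the column.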

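Try str.replace with a regular expression (in recent pandas versions regex=True must be passed explicitly):

df.columnOne = df.columnOne.str.replace(r'\d+', 'ID', regex=True)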

0        TAG
1         ID
2         ID
3     system
4        TAG
5         ID
6         ID
7         ID
8         ID
9        TAG
10     local
11        ID
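
One caveat with a digit pattern that has no anchors: it also rewrites digits embedded in other text. The sketch below contrasts an unanchored and an anchored pattern. The value `server42` is a made-up example that is not in the question's data, and the names `loose` and `strict` are only illustrative.

```python
import pandas as pd

# "server42" is a hypothetical mixed value, added only to show the difference.
s = pd.Series(["TAG", "13452", "server42", "local"])

# Unanchored: every run of digits is replaced, even inside other text.
loose = s.str.replace(r"\d+", "ID", regex=True)

# Anchored: only strings made up entirely of digits are replaced.
strict = s.str.replace(r"^\d+$", "ID", regex=True)

print(loose.tolist())
print(strict.tolist())
```

For the data in the question both patterns give the same result, since every value is either all digits or has no digits at all.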

As RafaelC mentioned in the comments, if the column may also contain floats stored as strings, use pd.to_numeric instead:

df.loc[pd.to_numeric(df.columnOne,errors='coerce').notna(),'columnOne']='ID'
df
Out[536]: 
   columnOne
0        TAG
1         ID
2         ID
3     system
4        TAG
5         ID
6         ID
7         ID
8         ID
9        TAG
10     local
11        ID
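
To see why `pd.to_numeric` matters for floats and negative numbers, here is a small sketch comparing the two masks. The values `3.14` and `-7` are made-up examples that are not in the question's data, and the variable names are only illustrative.

```python
import pandas as pd

# "3.14" and "-7" are hypothetical values, added to show the difference.
s = pd.Series(["TAG", "13452", "3.14", "-7", "system"])

# str.isnumeric only accepts strings made of numeric characters,
# so the decimal point and the minus sign make it return False.
is_digits = s.str.isnumeric()

# to_numeric with errors="coerce" turns unparseable strings into NaN,
# so notna() is True for anything that parses as a number.
is_number = pd.to_numeric(s, errors="coerce").notna()

replaced = s.copy()
replaced.loc[is_number] = "ID"

print(is_digits.tolist())
print(is_number.tolist())
print(replaced.tolist())
```

Which mask is appropriate depends on what counts as an ID in your data: `str.isnumeric` is stricter, while `pd.to_numeric` accepts anything that parses as a number.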

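Solution using apply (just for completeness; the str.replace and str.isnumeric solutions are much simpler):

import pandas as pd

df = pd.DataFrame({'columnOne': ['TAG',
                                 '1111',
                                 'system']})

def ids_replace(x):
    # Return 'ID' if the string parses as an integer, otherwise keep it.
    try:
        int(x)
        return 'ID'
    except ValueError:
        return x

# Apply to the column's values (a Series), not to whole rows.
df['columnOne'] = df['columnOne'].apply(ids_replace)
print(df)

  columnOne
0       TAG
1        ID
2    system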


Comments
  • Yes, based on the OP's example, should be OK. Good to point it out though, thanks!
  • I like the replace approach.