Create New Column Based On String

I have a data frame and want to create a new column based on the string in column1_sport.

import pandas as pd

df = pd.read_csv('C:/Users/test/dataframe.csv', encoding='iso-8859-1')

Data contains:

column1_sport
baseball
basketball
tennis
boxing
golf

I want to look for certain strings ("ball" or "box") and create a new column based on whether each value in column1_sport contains that string. If a value contains neither string, the new column should say "other". See below.

column1_sport    column2_type
baseball         ball
basketball       ball
tennis           other 
boxing           box              
golf             other

For multiple conditions I suggest np.select. For example:

import numpy as np

values = ['ball', 'box']
conditions = list(map(df['column1_sport'].str.contains, values))

df['column2_type'] = np.select(conditions, values, 'other')

print(df)

#   column1_sport column2_type
# 0      baseball         ball
# 1    basketball         ball
# 2        tennis        other
# 3        boxing          box
# 4          golf        other
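Here is a self-contained sketch of the same np.select idea, in case you want to try it without the CSV. The inline DataFrame and the extra missing-value row are my own assumptions for illustration, not part of the question. Passing na=False treats missing values as "no match", and regex=False makes the keywords match as plain substrings.

```python
import numpy as np
import pandas as pd

# Inline stand-in for the CSV in the question; the None row is an added assumption.
df = pd.DataFrame(
    {"column1_sport": ["baseball", "basketball", "tennis", "boxing", "golf", None]}
)

values = ["ball", "box"]

# One boolean mask per keyword. na=False turns missing values into False
# instead of letting NaN propagate into the masks.
conditions = [
    df["column1_sport"].str.contains(v, regex=False, na=False) for v in values
]

# For each row, np.select takes the choice of the first condition that is True
# and falls back to the default when none is.
df["column2_type"] = np.select(conditions, values, default="other")

print(df)
```

Because np.select uses the first condition that is True, the order of values decides which label wins when a string contains more than one keyword.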


df["column2_type"] = df.column1_sport.apply(lambda x: "ball" if "ball" in x else ("box" if "box" in x else "Other"))
df

    column1_sport   column2_type
0        baseball           ball
1      basketball           ball
2          tennis          Other
3          boxing            box
4            golf          Other

In case you have more complex conditions, you can move the logic into a function:

def func(a):
    if "ball" in a.lower():
        return "ball"
    elif "box" in a.lower():
        return "box"
    else:
        return "Other"

df["column2_type"] = df.column1_sport.apply(func)
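One thing to watch with the function approach: `"ball" in a.lower()` raises an error if a cell is not a string, for example a NaN from a blank CSV cell. Below is a minimal sketch of a more defensive variant. The name classify, its keywords and default parameters, and the sample data are hypothetical, chosen only for illustration.

```python
import pandas as pd


def classify(sport, keywords=("ball", "box"), default="Other"):
    """Return the first keyword found in `sport` (case-insensitive), else `default`."""
    # Non-string cells (e.g. NaN or None) cannot be searched, so use the default.
    if not isinstance(sport, str):
        return default
    lowered = sport.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword
    return default


# Sample data is an assumption; it mixes upper/lower case and a missing value.
df = pd.DataFrame(
    {"column1_sport": ["Baseball", "basketball", "tennis", "BOXING", None]}
)

# apply calls classify once per cell, so no lambda wrapper is needed.
df["column2_type"] = df["column1_sport"].apply(classify)

print(df)
```

Keeping the keywords in a tuple means a new category only needs a new entry, not another elif branch.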


You can use a nested np.where

import numpy as np

cond1 = df.column1_sport.str.contains('ball')
cond2 = df.column1_sport.str.contains('box')
df['column2_type'] = np.where(cond1, 'ball', np.where(cond2, 'box', 'other'))

    column1_sport   column2_type
0   baseball        ball
1   basketball      ball
2   tennis          other
3   boxing          box
4   golf            other
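If the new value is always the matched keyword itself, a different technique from the ones above may also work: Series.str.extract with a regex alternation, followed by fillna. This is only a sketch under that assumption, and the inline DataFrame stands in for the CSV from the question.

```python
import pandas as pd

# Inline stand-in for the CSV in the question.
df = pd.DataFrame(
    {"column1_sport": ["baseball", "basketball", "tennis", "boxing", "golf"]}
)

# The capture group returns whichever keyword matches first in the string;
# rows with no match come back as missing and are filled with "other".
df["column2_type"] = (
    df["column1_sport"].str.extract(r"(ball|box)", expand=False).fillna("other")
)

print(df)
```

Note that this picks the keyword that appears earliest in the string, whereas the nested np.where checks 'ball' before 'box' regardless of position, so the two can differ when a value contains both.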

Comments
  • @nia4life, go with jpp's np.select for more conditions