Using a for-loop to change a column in a dataframe

I am new to R and have written some code that I believe could be shortened with a for loop. The problem is that I cannot figure out how to write the loop.

I have a data frame with a column 'TestGrade' containing values like 'Grade 1' or 'Kindergarten'. I am trying to change that column to hold only numeric values. For example, 'Kindergarten' would be changed to 0 and 'Grade 1' would be changed to 1. Below I provide code for a sample data frame and show how I solved the problem without a loop.

Any guidance will be greatly appreciated!

##Sample Data
FirstInitial <- c("A", "D", "M", "C", "J", "S", "K", "L", "M", "K", "G", "B", "F")
LastInitial <- c("S", "M", "T", "M", "A", "B", "H", "M", "S", "W", "L", "Z", "P")
TestGrade <- c('Kindergarten', 'Grade 1','Grade 2', 'Grade 3','Grade 4', 'Grade 5', 'Grade 6','Grade 7','Grade 8', 'Grade 9', 'Grade 10', 'Grade 11','Grade 12')

df <- data.frame(FirstInitial, LastInitial, TestGrade)

## The code's current function
if(any(df$TestGrade == 'Kindergarten')){
  df$TestGrade <- gsub('Kindergarten', '0', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 1')){
  df$TestGrade <- gsub('Grade 1', '1', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 2')){
  df$TestGrade <- gsub('Grade 2', '2', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 3')){
  df$TestGrade <- gsub('Grade 3', '3', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 4')){
  df$TestGrade <- gsub('Grade 4', '4', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 5')){
  df$TestGrade <- gsub('Grade 5', '5', df$TestGrade)
}

if(any(df$TestGrade == 'Grade 6')){
  df$TestGrade <- gsub('Grade 6', '6', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 7')){
  df$TestGrade <- gsub('Grade 7', '7', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 8')){
  df$TestGrade <- gsub('Grade 8', '8', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 9')){
  df$TestGrade <- gsub('Grade 9', '9', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 10')){
  df$TestGrade <- gsub('Grade 10', '10', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 11')){
  df$TestGrade <- gsub('Grade 11', '11', df$TestGrade)
}
if(any(df$TestGrade == 'Grade 12')){
  df$TestGrade <- gsub('Grade 12', '12', df$TestGrade)
}

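First shortening: you don't need any of the if(any(...)) checks. gsub works like find/replace: gsub('Grade 9', '9', df$TestGrade) replaces 'Grade 9' with '9' and leaves values that don't contain the pattern unchanged. (It matches substrings, so 'Grade 1' also matches inside 'Grade 10'; here that still yields '10', which is the value we want.) So deleting all your if statements, we get: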

df$TestGrade <- gsub('Kindergarten', '0', df$TestGrade)
df$TestGrade <- gsub('Grade 1', '1', df$TestGrade)
df$TestGrade <- gsub('Grade 2', '2', df$TestGrade)
df$TestGrade <- gsub('Grade 3', '3', df$TestGrade)
df$TestGrade <- gsub('Grade 4', '4', df$TestGrade)
df$TestGrade <- gsub('Grade 5', '5', df$TestGrade)
df$TestGrade <- gsub('Grade 6', '6', df$TestGrade)
df$TestGrade <- gsub('Grade 7', '7', df$TestGrade)
df$TestGrade <- gsub('Grade 8', '8', df$TestGrade)
df$TestGrade <- gsub('Grade 9', '9', df$TestGrade)
df$TestGrade <- gsub('Grade 10', '10', df$TestGrade)
df$TestGrade <- gsub('Grade 11', '11', df$TestGrade)
df$TestGrade <- gsub('Grade 12', '12', df$TestGrade)

Next improvement, we could do a loop. This is exactly equivalent to the code above, just less typing.

pattern = c("Kindergarten", paste("Grade", 1:12))
replacement = as.character(0:12)

for (i in seq_along(pattern)) {
  df$TestGrade <- gsub(pattern[i], replacement[i], df$TestGrade)
}
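
The loop gives the right answer partly by luck: gsub matches substrings, so the result depends on what is left behind when a short pattern hits a longer label. A minimal sketch, assuming you want each pattern to match whole values only, anchors the patterns with ^ and $ so the order of the replacements no longer matters. The names grades and pattern_anchored are made up for illustration.

```r
# Hypothetical standalone vector standing in for df$TestGrade
grades <- c("Kindergarten", paste("Grade", 1:12))

pattern <- c("Kindergarten", paste("Grade", 1:12))
replacement <- as.character(0:12)

# ^ and $ anchor each pattern to the whole string, so "Grade 1"
# can no longer match inside "Grade 10"
pattern_anchored <- paste0("^", pattern, "$")

for (i in seq_along(pattern_anchored)) {
  grades <- sub(pattern_anchored[i], replacement[i], grades)
}

# Every value is now a digit string, so the conversion is safe
grades <- as.integer(grades)
```

With anchored patterns, sub() is enough, since each value can match at most once.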

Even better, we could be cleverer: treat Kindergarten as a special case and just delete "Grade " from everything else, as in Juian's and Ronak's answers. Another variation of that is this:

df$TestGrade = as.character(df$TestGrade) # needed only if it is a factor
df$TestGrade[df$TestGrade == "Kindergarten"] = 0
df$TestGrade = sub("Grade ", "", df$TestGrade)
df$TestGrade = as.numeric(df$TestGrade) # if needed

If we really want to be fancy, we could set fixed = TRUE inside sub(). This tells sub to treat the pattern as a literal string rather than a regular expression. That makes the code run faster, but unless you've got a lot of data, you won't notice a difference. If you have 100,000+ rows, this method will be quite fast:

# optimized
df$TestGrade = as.character(df$TestGrade) # needed only if it is a factor
df$TestGrade[df$TestGrade == "Kindergarten"] = 0
df$TestGrade = as.integer(sub("Grade ", "", df$TestGrade, fixed = TRUE))
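
To check the speed claim on your own machine, something like the following sketch could be used. It assumes a synthetic vector of 100,000 labels (big) and a hypothetical helper convert(); neither is part of the original question. Timings vary by machine, so none are quoted here.

```r
# Hypothetical large input: 100,000 grade labels sampled at random
set.seed(1)
labels <- c("Kindergarten", paste("Grade", 1:12))
big <- sample(labels, 1e5, replace = TRUE)

# Illustrative helper: same steps as the answer, with `fixed` as a switch
convert <- function(x, fixed) {
  x[x == "Kindergarten"] <- "0"
  as.integer(sub("Grade ", "", x, fixed = fixed))
}

# system.time() reports user, system and elapsed seconds for each call
time_regex <- system.time(res_regex <- convert(big, fixed = FALSE))
time_fixed <- system.time(res_fixed <- convert(big, fixed = TRUE))

# Both variants should produce the same integer vector
identical(res_regex, res_fixed)
```

Comparing time_regex["elapsed"] with time_fixed["elapsed"] shows whether the difference matters for your data size.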

We can use ifelse to assign 0 for "Kindergarten" and strip "Grade " from the others:

as.numeric(ifelse(df$TestGrade == "Kindergarten", 0, 
          sub("Grade ", "", df$TestGrade)))

#[1]  0  1  2  3  4  5  6  7  8  9 10 11 12

We can use case_when

library(dplyr)
library(readr)
df %>%
  mutate(TestGrade = case_when(as.character(TestGrade) == "Kindergarten" ~ 0,
                               # parse_number() needs character input, so
                               # convert in case TestGrade is a factor
                               TRUE ~ parse_number(as.character(TestGrade))))

#   FirstInitial LastInitial TestGrade
#1             A           S         0
#2             D           M         1
#3             M           T         2
#4             C           M         3
#5             J           A         4
#6             S           B         5
#7             K           H         6
#8             L           M         7
#9             M           S         8
#10            K           W         9
#11            G           L        10
#12            B           Z        11
#13            F           P        12

This can be done without a for loop in two lines of code. I also suggest adding stringsAsFactors = FALSE to your data.frame call before running these lines (it is the default from R 4.0.0 onward).

df$TestGrade[df$TestGrade == "Kindergarten"] = 0
df$TestGrade <- gsub("Grade ", "", df$TestGrade)

> df
   FirstInitial LastInitial TestGrade
1             A           S         0
2             D           M         1
3             M           T         2
4             C           M         3
5             J           A         4
6             S           B         5
7             K           H         6
8             L           M         7
9             M           S         8
10            K           W         9
11            G           L        10
12            B           Z        11
13            F           P        12

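You can write a key that lists the grades in order and convert the column to a factor with those levels. This does not rely on the labels following any particular text format, as long as the key lists them in the right order.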

key <- c('Kindergarten',
         'Grade 1',
         'Grade 2',
         'Grade 3',
         'Grade 4',
         'Grade 5',
         'Grade 6',
         'Grade 7',
         'Grade 8',
         'Grade 9',
         'Grade 10',
         'Grade 11',
         'Grade 12')
dat <- c('Grade 3', 'Grade 5', 'Grade 2')
dat <- factor(dat, levels = key)
dat <- as.numeric(dat) - 1
dat

We subtract 1 at the end because the factors start at 1 and you wanted kindergarten set to 0.
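
One property of this approach worth seeing in action: a value that is not in the key becomes NA rather than being mis-coded. A small sketch, using a made-up input that includes an unlisted label ("Preschool"):

```r
key <- c("Kindergarten", paste("Grade", 1:12))

# Hypothetical input including a label that is not in the key
dat <- c("Grade 3", "Kindergarten", "Grade 10", "Preschool")

# Values missing from `levels` become NA in the factor, and stay NA
# after the conversion to numeric
num <- as.numeric(factor(dat, levels = key)) - 1
num

# Show the labels that could not be matched against the key
dat[is.na(num)]
```

Checking is.na() after the conversion is a cheap way to catch typos or unexpected labels in real data.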

Comments
  • The tangential advice I repeatedly offer up is to use explicit loops as a LAST choice in R. The apply functions and related packages are designed for coding efficiency.
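
In that spirit, here is one loop-free sketch using a named lookup vector instead of an explicit loop or an apply call. The names lookup and grades are illustrative, and grades stands in for df$TestGrade.

```r
# Named vector mapping each label to its numeric grade
lookup <- setNames(0:12, c("Kindergarten", paste("Grade", 1:12)))

# Hypothetical input standing in for df$TestGrade; if the column is a
# factor, wrap it in as.character() first so indexing is by name
grades <- c("Grade 2", "Kindergarten", "Grade 12")

# Indexing by name is vectorized: one lookup per element, no loop
result <- unname(lookup[grades])
result
```

Labels that are missing from lookup come back as NA, which makes unexpected values easy to spot.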