Add factors to a df by subsetting rows containing certain letters

This is my data:

  Year variable value
 1951     MF12 1.441
 1952     MF12 2.068
 1953     RF12 2.008  
 1954     RF12 2.044
 1955     MW12 2.288
 1956     RW12 1.800

Where MF = Managed Frame, RF = Reserve Frame, MW = Managed Wind, and RW = Reserve Wind. So there are four labels in total: Managed, Reserve, Frame, and Wind.

I want to create two types of factors based on these levels and add them as columns to the data frame. Factor 1 will be management.type (Managed, Reserve) and Factor 2 will be object.type (Frame, Wind).

Something like this:

Year variable value Management Object
1951   MF12 1.37845 Managed      Frame 
1952   MF12 1.38950 Managed      Frame
1953   MW12 1.55510 Managed      Wind
1954   RF12 1.66125 Reserve      Frame
1955   RW12 1.62600 Reserve      Wind
1956   RW13 1.58760 Reserve      Wind

How can I do this in R (rather than going back and sorting in Excel)? For Management type, I think I could use some kind of "starts with" command to check whether the variable starts with 'M' or 'R', but I'm not sure how to do that. For Object, is there a way to pick out values that contain the letter 'F' or 'W'?

Give grepl() and ifelse() a try:

df$Management <- ifelse(test = grepl(pattern = "M", x = df$variable), 
                        yes  = "Managed", 
                        no   = "Reserve")
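
The same idea extends to the Object column, and wrapping each result in factor() gives real factors rather than character vectors. Here test, yes and no are simply the argument names of ifelse(): the logical condition, the value used where it is TRUE, and the value used where it is FALSE. The following is a minimal sketch, assuming the data frame is called df and that the first letter of variable is always M or R and the second always F or W; the data is rebuilt from the question so the example runs on its own.

```r
# Rebuild the example data from the question so the sketch is self-contained
df <- data.frame(
  Year     = 1951:1956,
  variable = c("MF12", "MF12", "RF12", "RF12", "MW12", "RW12"),
  value    = c(1.441, 2.068, 2.008, 2.044, 2.288, 1.800),
  stringsAsFactors = FALSE
)

# startsWith() tests only the first character, so an "M" later in the
# string cannot trigger a match
df$Management <- factor(
  ifelse(test = startsWith(df$variable, "M"), yes = "Managed", no = "Reserve"),
  levels = c("Managed", "Reserve")
)

# "^.F" means: any single first character, then F in second position
df$Object <- factor(
  ifelse(test = grepl("^.F", df$variable), yes = "Frame", no = "Wind"),
  levels = c("Frame", "Wind")
)

str(df)
```

Note that ifelse() only has two outcomes, so any value that does not start with M is labelled Reserve; if other codes can occur, check for them before relying on this.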

We can use mutate() from dplyr together with str_extract() from stringr. The first character of variable gives the management type and the second gives the object type:

library(dplyr)
library(stringr)  # provides str_extract()
df  %>%
    mutate(Management = factor(str_extract(variable, "^."),      # first character
          levels = c("M", "R"), labels = c("Managed", "Reserve")), 
          Object = factor(str_extract(variable, "(?<=^.)."),     # second character
          levels = c("F", "W"), labels = c("Frame", "Wind")))
#   Year variable value Management Object
#1 1951     MF12 1.441    Managed  Frame
#2 1952     MF12 2.068    Managed  Frame
#3 1953     RF12 2.008    Reserve  Frame
#4 1954     RF12 2.044    Reserve  Frame
#5 1955     MW12 2.288    Managed   Wind
#6 1956     RW12 1.800    Reserve   Wind
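
If you would rather not load any packages, the same extraction can be sketched in base R with substr(), which pulls characters out by position. This assumes, as in the question, that the management code is always the first character of variable and the object code is always the second; the data frame is rebuilt here so the example is self-contained.

```r
# Example data from the question
df <- data.frame(
  Year     = 1951:1956,
  variable = c("MF12", "MF12", "RF12", "RF12", "MW12", "RW12"),
  value    = c(1.441, 2.068, 2.008, 2.044, 2.288, 1.800),
  stringsAsFactors = FALSE
)

# Character 1 is the management code, character 2 is the object code;
# factor() maps each code to its label
df$Management <- factor(substr(df$variable, 1, 1),
                        levels = c("M", "R"),
                        labels = c("Managed", "Reserve"))
df$Object     <- factor(substr(df$variable, 2, 2),
                        levels = c("F", "W"),
                        labels = c("Frame", "Wind"))

# Cross-tabulate the two new factors as a quick check
table(df$Management, df$Object)
```

Because the levels are listed explicitly, any code that is not among them becomes NA instead of being silently assigned to one of the groups.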

Comments
  • Perfect, thank you. May I ask so that I know for the future, what is the role of 'test' in this function?