Creating a column whose values are dependent on multiple other columns
I'm trying to create a new column ("newcol") in a dataframe ("data"), whose values will be determined by the contents of up to two other columns in the dataframe ("B_stance" and "C_stance"). The values within B_stance are either "L", "R", "U" or "N". Within C_stance they are either "L" or "R".
Please excuse the semi-logical language, but I need R code which will achieve this for the contents of newcol:
if (data$B_stance = "L" AND data$C_stance = "L") then (data$newcol = "N")
if (data$B_stance = "L" AND data$C_stance = "R") then (data$newcol = "Y")
if (data$B_stance = "R" AND data$C_stance = "R") then (data$newcol = "N")
if (data$B_stance = "R" AND data$C_stance = "L") then (data$newcol = "Y")
if (data$B_stance = "U") then (data$newcol = "N")
if (data$B_stance = "N") then (data$newcol = "N")
I've tried to see if/how "ifelse" could achieve this, but cannot find an example of how to draw from multiple column values in determining the new value.
In base R, the ifelse function is the most direct tool for conditions like these. The dplyr library includes a stricter if_else function and a case_when function. ifelse is vectorized: for each element, it returns the second argument where the first argument is true and the third argument where the first argument is false.
data <- read.table(text = "
B_stance C_stance
L R
L L
U X
R L
R R
N X
X X
", header = TRUE)

data$newcol <- ifelse(data$B_stance == "L" & data$C_stance == "L", "N",
               ifelse(data$B_stance == "L" & data$C_stance == "R", "Y",
               ifelse(data$B_stance == "R" & data$C_stance == "R", "N",
               ifelse(data$B_stance == "R" & data$C_stance == "L", "Y",
               ifelse(data$B_stance == "U", "N",
               ifelse(data$B_stance == "N", "N", NA))))))

data
#   B_stance C_stance newcol
# 1        L        R      Y
# 2        L        L      N
# 3        U        X      N
# 4        R        L      Y
# 5        R        R      N
# 6        N        X      N
# 7        X        X   <NA>
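Since "Y" only ever appears when both stances are "L" or "R" and they differ, the nested ifelse calls can arguably be collapsed into two tests. The following is a minimal sketch in base R, not the answerer's code: it rebuilds the same sample data with data.frame, and the helper names lr, valid and opposite are purely illustrative. It assumes that any combination not covered by the six rules should stay NA, as in the nested version.

```r
data <- data.frame(
  B_stance = c("L", "L", "U", "R", "R", "N", "X"),
  C_stance = c("R", "L", "X", "L", "R", "X", "X"),
  stringsAsFactors = FALSE
)

lr <- c("L", "R")

# Rows covered by the six rules: B is U or N (C is ignored),
# or both B and C are L or R
valid <- data$B_stance %in% c("U", "N") |
  (data$B_stance %in% lr & data$C_stance %in% lr)

# "Y" only when B is L or R and C is the other side
opposite <- data$B_stance %in% lr & data$B_stance != data$C_stance

# Uncovered rows become NA; covered rows are "Y" or "N"
data$newcol <- ifelse(!valid, NA_character_, ifelse(opposite, "Y", "N"))
```

Whether this reads better than the explicit chain is a matter of taste; the explicit chain mirrors the stated rules one for one, while this version states the underlying pattern.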
It may be easier to create a key/val dataset and then do a join:
keydat <- data.frame(
  B_stance = c('L', 'L', 'R', 'R'),
  C_stance = c('L', 'R', 'R', 'L'),
  newcol = c('N', 'Y', 'N', 'Y'),
  stringsAsFactors = FALSE
)

library(dplyr)

# Pairs not listed in keydat come back as NA; replace those with 'N'
left_join(data, keydat) %>%
  mutate(newcol = replace(newcol, is.na(newcol), 'N'))
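The same key/value idea can be sketched in base R without a join, using a named character vector as the lookup table. This is an illustrative alternative rather than part of the answer above: the "|" separator and the names lookup and keys are assumptions made for the example, and the sample data is rebuilt inline so the block runs on its own.

```r
data <- data.frame(
  B_stance = c("L", "L", "U", "R", "R", "N"),
  C_stance = c("R", "L", "X", "L", "R", "X"),
  stringsAsFactors = FALSE
)

# Lookup table keyed by "B_stance|C_stance"
lookup <- c("L|L" = "N", "L|R" = "Y", "R|R" = "N", "R|L" = "Y")

# Build one key per row and index the lookup table by name
keys <- paste(data$B_stance, data$C_stance, sep = "|")
data$newcol <- unname(lookup[keys])      # NA where the pair is not in the table

# Same default as the replace() step in the join version
data$newcol[is.na(data$newcol)] <- "N"
```

Unlike merge, indexing by name keeps the rows in their original order.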
With dplyr you can use case_when. It's a little cleaner than nested if_else calls if you have numerous conditions.
library(dplyr)

df <- data.frame(
  B_stance = c('L', 'L', 'R', 'R'),
  C_stance = c('L', 'R', 'R', 'L'),
  stringsAsFactors = FALSE
)

df %>%
  mutate(
    newcol = case_when(
      B_stance == 'U' ~ 'N',
      B_stance == 'N' ~ 'N',
      B_stance == 'L' & C_stance == 'L' ~ 'N',
      B_stance == 'L' & C_stance == 'R' ~ 'Y',
      B_stance == 'R' & C_stance == 'L' ~ 'Y',
      B_stance == 'R' & C_stance == 'R' ~ 'N',
      TRUE ~ B_stance  # fallback: keep the B_stance value if nothing matched
    )
  )
#   B_stance C_stance newcol
# 1        L        L      N
# 2        L        R      Y
# 3        R        R      N
# 4        R        L      Y
Note that the conditions within case_when are checked in order; for each row, the first condition that is true determines the result. The final TRUE ensures there's a fallback in case no condition is true.
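To illustrate why the order of conditions matters, here is a small sketch on a made-up numeric vector (x and the labels are hypothetical, chosen only for the example). It assumes dplyr is installed.

```r
library(dplyr)

x <- c(1, 5, 10, NA)

# Specific condition first: values above 3 are caught before the broader test
sizes <- case_when(
  x > 3 ~ "big",
  x > 0 ~ "small",
  TRUE  ~ "other"   # fallback; NA > 3 is NA, which does not count as a match
)

# Broad condition first: it matches every positive value,
# so the "big" branch is never reached
swapped <- case_when(
  x > 0 ~ "small",
  x > 3 ~ "big",
  TRUE  ~ "other"
)
```

Here sizes distinguishes "big" from "small", while swapped labels every positive value "small". The practical rule is to list the most specific conditions first.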