Removing multiple columns from R data.table with parameter for columns to remove

I'm trying to manipulate a number of data.tables in similar ways, and would like to write a function to accomplish this. I would like to pass in a parameter containing the names of the columns to operate on. This works fine when the vector of column names is written directly on the left hand side of the := operator, but not if it is declared earlier (or passed into the function). The following code shows the issue.

library(data.table)
dt = data.table(a = letters, b = 1:2, c=1:13)
colsToDelete = c('b', 'c')
dt[,colsToDelete := NULL] # doesn't work but I don't understand why not.
dt[,c('b', 'c') := NULL] # works fine, but doesn't allow passing in of columns

The error is "Adding new column 'colsToDelete' then assigning NULL (deleting it)." So clearly, it's interpreting 'colsToDelete' as a new column name.

The same issue occurs when doing something along these lines

dt[, colNames := lapply(.SD, adjustValue, y=factor), .SDcols = colNames]

I'm new to R, but rather more experienced with some other languages, so this may be a silly question.

It's basically because we allow a bare symbol on the LHS of := as a convenient way to add a new column, for example DT[, col := val]. So, to distinguish col itself being the column name from col holding the column names, we check whether the LHS is a name or an expression.

If it's a name, a column with that literal name is added (or updated). If it's an expression, it is evaluated and the result is used as the column names.

DT[, col := val] # col is the column name.

DT[, (col) := val]  # col gets evaluated and replaced with its value
DT[, c(col) := val] # same as above

The preferred idiom is: dt[, (colsToDelete) := NULL]
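
Since the question asks for a function that receives the columns as a parameter, here is a minimal sketch of how the idiom might be wrapped. The helpers dropCols and scaleCols, and the multiplier argument by, are hypothetical names chosen for illustration; they are not part of data.table.

```r
library(data.table)

# Hypothetical helper: delete the columns named in `cols` by reference.
dropCols <- function(DT, cols) {
  DT[, (cols) := NULL]   # parentheses force `cols` to be evaluated
  invisible(DT)
}

# Hypothetical helper: overwrite the columns named in `cols` in place,
# mirroring the lapply(.SD, ...) pattern from the question.
scaleCols <- function(DT, cols, by) {
  DT[, (cols) := lapply(.SD, function(x) x * by), .SDcols = cols]
  invisible(DT)
}

dt <- data.table(a = letters, b = rep(1:2, 13), c = rep(1:13, 2))
scaleCols(dt, c("b", "c"), by = 10)   # b and c are multiplied by 10
dropCols(dt, "c")                     # c is removed from dt itself
names(dt)
```

Both helpers modify the table passed in, so there is no need to assign the result back; invisible(DT) is returned only to allow chaining.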

HTH


To extend the previous answer, you can also delete columns by reference using their positions:

# delete columns 10 to 15
dt[ , (10:15) := NULL ]

or

# delete columns 3, 5 and 10 to 15
dt[ , (c(3,5,10:15)) := NULL ]
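
Positions refer to the column order at the time of the call, so when several operations follow one another it may be safer to resolve positions to names once. A small sketch under that assumption (the table and the variable pos are made up for illustration):

```r
library(data.table)

dt <- data.table(id = 1:3, x = 4:6, y = 7:9, z = 10:12)

# Resolve positions to names once, before anything changes the column order.
pos <- c(2, 4)
colsToDelete <- names(dt)[pos]   # the names at positions 2 and 4

dt[, (colsToDelete) := NULL]     # same idiom as deleting by name
names(dt)
```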


I am surprised no answer provided uses the set() function.

set(DT, , colsToDelete, NULL)

This should be the easiest.
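
As a rough sketch of how set() could be applied to several tables at once, which is the situation described in the question (the list tables and its contents are invented for illustration):

```r
library(data.table)

tables <- list(
  first  = data.table(a = 1:3, b = 4:6, c = 7:9),
  second = data.table(a = 1:2, b = 3:4, c = 5:6)
)
colsToDelete <- c("b", "c")

# set() works by reference, so each table in the list is changed in place.
# Leaving i unspecified means the operation applies to all rows.
for (DT in tables) {
  set(DT, j = colsToDelete, value = NULL)
}

lapply(tables, names)
```

set() avoids the overhead of [.data.table, which can matter when it is called many times in a loop.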





Comments
  • To add to this, you can also do dt[ , -(10:15) ] or dt[ , -c(3,5,10:15)], which returns a new data.table without those columns rather than deleting them by reference.
  • two commas in a row?
  • @wolfsatthedoor The empty argument is the i argument, which refers to rows. As it's omitted, it indicates that all rows are to be updated.
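
To illustrate the points raised in the comments, here is a small sketch contrasting negative selection with deletion by reference. It assumes a reasonably recent data.table version; the example table is made up.

```r
library(data.table)

dt <- data.table(a = 1:3, b = 4:6, c = 7:9)

# Negative selection returns a new data.table; dt keeps all its columns.
smaller <- dt[, -c(2, 3)]
ncol(dt)        # still 3
names(smaller)

# With an empty i, := applies to all rows and drops the columns from dt itself.
dt[, c("b", "c") := NULL]
names(dt)
```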