R Melt reshape data
r melt multiple variables
opposite of melt in r
r transpose data frame column names
recast in r
r reshape long to wide
r reshape long to wide multiple variables
reshape2 cheat sheet
Here' my data:
Day Morning_1_id Var1 Morning_2_id Var2 Afternoon_1_id Var3 Afternoon_2_id Var4 1 20180501-033-000001 3.156667 20180501-033-000002 2.866667 20180501-033-000008 2.946667 20180501-033-000009 3.133333 2 20180502-033-000001 2.986667 20180502-033-000002 2.930000 20180502-033-000020 3.076667 20180502-033-000021 3.013333 3 20180503-033-000001 3.073333 20180503-033-000002 3.070000 20180503-033-000011 3.106667 20180503-033-000012 2.900000 4 20180507-033-000001 3.236667 20180507-033-000002 2.990000 20180507-033-000015 3.043333 20180507-033-000016 3.116667 5 20180508-033-000001 3.030000 20180508-033-000002 3.150000 20180508-033-000015 3.156667 20180508-033-000017 3.343333 6 20180509-033-000001 3.010000 20180509-033-000002 3.020000 20180509-033-000007 3.000000 20180509-033-000008 3.156667 7 20180510-033-000001 2.916667 20180510-033-000002 3.103333 20180510-033-000007 3.336667 20180510-033-000008 3.066667 8 20180511-033-000001 3.293333 20180511-033-000002 3.163333 20180511-033-000013 2.980000 20180511-033-000014 2.940000 9 20180514-033-000001 3.136667 20180514-033-000002 3.186667 20180514-033-000007 2.766667 20180514-033-000008 3.100000 10 20180516-033-000001 3.116667 20180516-033-000002 3.283333 20180516-033-000008 3.133333 20180516-033-000009 3.040000 11 20180517-033-000003 2.843333 20180517-033-000004 3.120000 20180517-033-000008 3.060000 20180517-033-000009 3.033333 12 20180518-033-000001 3.033333 20180518-033-000002 3.290000 20180518-033-000007 3.006667 20180518-033-000008 2.973333 13 20180521-033-000002 3.173333 20180521-033-000003 2.993333 20180521-033-000008 2.983333 20180521-033-000009 3.020000 14 20180523-033-000001 3.336667 20180523-033-000002 3.026667 20180523-033-000007 3.300000 20180523-033-000008 3.210000
Reproducible form:
structure(list(Day = 1:14, Morning_1_id = structure(1:14, .Label = c("20180501-033-000001", "20180502-033-000001", "20180503-033-000001", "20180507-033-000001", "20180508-033-000001", "20180509-033-000001", "20180510-033-000001", "20180511-033-000001", "20180514-033-000001", "20180516-033-000001", "20180517-033-000003", "20180518-033-000001", "20180521-033-000002", "20180523-033-000001"), class = "factor"), Var1 = c(3.156666667, 2.986666667, 3.073333333, 3.236666667, 3.03, 3.01, 2.916666667, 3.293333333, 3.136666667, 3.116666667, 2.843333333, 3.033333333, 3.173333333, 3.336666667), Morning_2_id = structure(1:14, .Label = c("20180501-033-000002", "20180502-033-000002", "20180503-033-000002", "20180507-033-000002", "20180508-033-000002", "20180509-033-000002", "20180510-033-000002", "20180511-033-000002", "20180514-033-000002", "20180516-033-000002", "20180517-033-000004", "20180518-033-000002", "20180521-033-000003", "20180523-033-000002"), class = "factor"), Var2 = c(2.866666667, 2.93, 3.07, 2.99, 3.15, 3.02, 3.103333333, 3.163333333, 3.186666667, 3.283333333, 3.12, 3.29, 2.993333333, 3.026666667), Afternoon_1_id = structure(1:14, .Label = c("20180501-033-000008", "20180502-033-000020", "20180503-033-000011", "20180507-033-000015", "20180508-033-000015", "20180509-033-000007", "20180510-033-000007", "20180511-033-000013", "20180514-033-000007", "20180516-033-000008", "20180517-033-000008", "20180518-033-000007", "20180521-033-000008", "20180523-033-000007"), class = "factor"), Var3 = c(2.946666667, 3.076666667, 3.106666667, 3.043333333, 3.156666667, 3, 3.336666667, 2.98, 2.766666667, 3.133333333, 3.06, 3.006666667, 2.983333333, 3.3), Afternoon_2_id = structure(1:14, .Label = c("20180501-033-000009", "20180502-033-000021", "20180503-033-000012", "20180507-033-000016", "20180508-033-000017", "20180509-033-000008", "20180510-033-000008", "20180511-033-000014", "20180514-033-000008", "20180516-033-000009", "20180517-033-000009", "20180518-033-000008", "20180521-033-000009", "20180523-033-000008"), class = "factor"), Var4 = c(3.133333333, 3.013333333, 2.9, 3.116666667, 3.343333333, 3.156666667, 3.066666667, 2.94, 3.1, 3.04, 3.033333333, 2.973333333, 3.02, 3.21)), class = "data.frame", row.names = c(NA, -14L))
Here's what I want it to be:
Day Id Var Time 1 20180501-033-000001 3.156666667 Morning1 2 20180502-033-000001 2.986666667 Morning1 3 20180503-033-000001 3.073333333 Morning1 4 20180507-033-000001 3.236666667 Morning1 5 20180508-033-000001 3.03 Morning1 6 20180509-033-000001 3.01 Morning1 7 20180510-033-000001 2.916666667 Morning1 8 20180511-033-000001 3.293333333 Morning1 9 20180514-033-000001 3.136666667 Morning1 10 20180516-033-000001 3.116666667 Morning1 11 20180517-033-000003 2.843333333 Morning1 12 20180518-033-000001 3.033333333 Morning1 13 20180521-033-000002 3.173333333 Morning1 14 20180523-033-000001 3.336666667 Morning1 1 20180501-033-000002 2.866666667 Morning2 2 20180502-033-000002 2.93 Morning2 3 20180503-033-000002 3.07 Morning2 4 20180507-033-000002 2.99 Morning2 5 20180508-033-000002 3.15 Morning2 6 20180509-033-000002 3.02 Morning2 7 20180510-033-000002 3.103333333 Morning2 8 20180511-033-000002 3.163333333 Morning2 9 20180514-033-000002 3.186666667 Morning2 10 20180516-033-000002 3.283333333 Morning2 11 20180517-033-000004 3.12 Morning2 12 20180518-033-000002 3.29 Morning2 13 20180521-033-000003 2.993333333 Morning2 14 20180523-033-000002 3.026666667 Morning2 1 20180501-033-000008 2.946666667 Afternoon1 2 20180502-033-000020 3.076666667 Afternoon1 3 20180503-033-000011 3.106666667 Afternoon1 4 20180507-033-000015 3.043333333 Afternoon1 5 20180508-033-000015 3.156666667 Afternoon1 6 20180509-033-000007 3 Afternoon1 7 20180510-033-000007 3.336666667 Afternoon1 8 20180511-033-000013 2.98 Afternoon1 9 20180514-033-000007 2.766666667 Afternoon1 10 20180516-033-000008 3.133333333 Afternoon1 11 20180517-033-000008 3.06 Afternoon1 12 20180518-033-000007 3.006666667 Afternoon1 13 20180521-033-000008 2.983333333 Afternoon1 14 20180523-033-000007 3.3 Afternoon1 1 20180501-033-000009 3.133333333 Afternoon2 2 20180502-033-000021 3.013333333 Afternoon2 3 20180503-033-000012 2.9 Afternoon2 4 20180507-033-000016 3.116666667 Afternoon2 5 20180508-033-000017 3.343333333 Afternoon2 6 20180509-033-000008 3.156666667 Afternoon2 7 20180510-033-000008 3.066666667 Afternoon2 8 20180511-033-000014 2.94 Afternoon2 9 20180514-033-000008 3.1 Afternoon2 10 20180516-033-000009 3.04 Afternoon2 11 20180517-033-000009 3.033333333 Afternoon2 12 20180518-033-000008 2.973333333 Afternoon2 13 20180521-033-000009 3.02 Afternoon2 14 20180523-033-000008 3.21 Afternoon2
I want to do wide-to-long conversion such that the Ids and the values of 'Var' get stacked day wise. I also want an additional column named 'Time', which will depend upon the initial ids, namely 'Morning_1_id', 'Morning_2_id', 'Afternoon_1_id' and 'Afternoon_2_id'. How to do this? I tried using melt from reshape2 but couldn't get it done.
Here is a solution using dplyr
to transform your table into the requested format:
library(dplyr) mydata<- reshape(mydata, direction='long', varying=c('Morning_1_id', 'Var1', 'Morning_2_id', 'Var2', 'Afternoon_1_id', 'Var3', 'Afternoon_2_id', 'Var4'), timevar='Var', times=c('Morning1', 'Morning2', 'Afternoon1', 'Afternoon2'), v.names=c('Id', 'Var'), idvar='Day') mydata<- tibble::rownames_to_column(mydata) mydata$rowname<- gsub("^.*\\.","", mydata$rowname) names(mydata)<- c("Time", "Day", "Var", "Id") mydata<- mydata[,c(2,4,3,1)]
An Introduction to reshape2, Wide- to long-format data: the melt function. For this example we'll work with the airquality dataset that is built into R . First we'll change the The melt function takes data in wide format and stacks a set of columns into a single column of data. To make use of the function we need to specify a data frame, the id variables (which will be left at their settings) and the measured variables (columns of data) to be stacked.
Consider base R by building a list of every 2nd column of a sequence and then row binding all df elements:
df_list <- lapply(seq(3, length(df), 2), function(i) { sub <- df[c(1, (i-1):i)] # SUBSET BY COLS sub <- transform(sub, Time = sub("_id", "", names(df)[i-1])) # ADD TIME VAR setNames(sub, c("Day", "Id", "Var", "Time")) # RENAME COLS }) long_df <- do.call(rbind, df_list) head(long_df, 20) # Day Id Var Time # 1 1 20180501-033-000001 3.156667 Morning_1 # 2 2 20180502-033-000001 2.986667 Morning_1 # 3 3 20180503-033-000001 3.073333 Morning_1 # 4 4 20180507-033-000001 3.236667 Morning_1 # 5 5 20180508-033-000001 3.030000 Morning_1 # 6 6 20180509-033-000001 3.010000 Morning_1 # 7 7 20180510-033-000001 2.916667 Morning_1 # 8 8 20180511-033-000001 3.293333 Morning_1 # 9 9 20180514-033-000001 3.136667 Morning_1 # 10 10 20180516-033-000001 3.116667 Morning_1 # 11 11 20180517-033-000003 2.843333 Morning_1 # 12 12 20180518-033-000001 3.033333 Morning_1 # 13 13 20180521-033-000002 3.173333 Morning_1 # 14 14 20180523-033-000001 3.336667 Morning_1 # 15 1 20180501-033-000002 2.866667 Morning_2 # 16 2 20180502-033-000002 2.930000 Morning_2 # 17 3 20180503-033-000002 3.070000 Morning_2 # 18 4 20180507-033-000002 2.990000 Morning_2 # 19 5 20180508-033-000002 3.150000 Morning_2 # 20 6 20180509-033-000002 3.020000 Morning_2
Efficient reshaping using data.tables, Introduction. The melt and dcast functions for data.table s are for reshaping wide-to-long and long-to This the generic melt function. See the following functions for the details about different data structures: further arguments passed to or from other methods. Should NA values be removed from the data set? This will convert explicit missings to implicit missings. lixx5423@d.umn.edu at Feb 27, 2019 reshape2 v1.4.3.
Here is a tidyverse
option
corrected per comments from @Calum You
df %>% gather(Time, Var, -Day, -c(Var1, Var2, Var3, Var4)) %>% mutate(Time = gsub('.{3}$', '',Time), start = substr(Time, 1, 1), end = substr(Time, nchar(Time), nchar(Time)), id = paste0(start,end), Val = case_when(id=='M1' ~ Var1, id=='M2' ~ Var2, id=='A1' ~ Var3, id=='A2' ~ Var4)) %>% dplyr::select(Day, Id=Var, Val, Time)
Original incorrect code
df %>% gather(Time, Var, -Day, -c(Var1, Var2, Var3, Var4)) %>% gather( key, value, -Day, -Time, -Var) %>% mutate(Time = gsub('.{3}$', '',Time)) %>% dplyr::select(Day, Id=Var, Var=value, Time)
melt function, Arguments. data. Data set to melt further arguments passed to or from other methods. na.rm. Should NA values be removed from the data set? This will convert One of the most interesting aspects of R programming is about changing the shape of the data to get a desired shape.Melting and casting in R, are the functions that can be used efficiently to reshape the data. The functions used to do this are called melt() and cast(). Melt Function in R:
Here's another tidyverse
method. This is complicated by the fact that the different Var
columns correspond to a particular time, but the indication of the time is different from the way it is represented in the id
columns. So you need to have some way of matching the two. Here I do that with a named list inside var_renamer
. Once the columns are consistently named, it becomes possible to use gather
and separate
to generate the right variables to be spread
back out into the desired format. Note that I mutate
Time
into an ordered factor so it can be sorted by time rather than alphabetically with arrange
.
df <- structure(list(Day = 1:14, Morning_1_id = structure(1:14, .Label = c("20180501-033-000001", "20180502-033-000001", "20180503-033-000001", "20180507-033-000001", "20180508-033-000001", "20180509-033-000001", "20180510-033-000001", "20180511-033-000001", "20180514-033-000001", "20180516-033-000001", "20180517-033-000003", "20180518-033-000001", "20180521-033-000002", "20180523-033-000001"), class = "factor"), Var1 = c(3.156666667, 2.986666667, 3.073333333, 3.236666667, 3.03, 3.01, 2.916666667, 3.293333333, 3.136666667, 3.116666667, 2.843333333, 3.033333333, 3.173333333, 3.336666667), Morning_2_id = structure(1:14, .Label = c("20180501-033-000002", "20180502-033-000002", "20180503-033-000002", "20180507-033-000002", "20180508-033-000002", "20180509-033-000002", "20180510-033-000002", "20180511-033-000002", "20180514-033-000002", "20180516-033-000002", "20180517-033-000004", "20180518-033-000002", "20180521-033-000003", "20180523-033-000002"), class = "factor"), Var2 = c(2.866666667, 2.93, 3.07, 2.99, 3.15, 3.02, 3.103333333, 3.163333333, 3.186666667, 3.283333333, 3.12, 3.29, 2.993333333, 3.026666667), Afternoon_1_id = structure(1:14, .Label = c("20180501-033-000008", "20180502-033-000020", "20180503-033-000011", "20180507-033-000015", "20180508-033-000015", "20180509-033-000007", "20180510-033-000007", "20180511-033-000013", "20180514-033-000007", "20180516-033-000008", "20180517-033-000008", "20180518-033-000007", "20180521-033-000008", "20180523-033-000007"), class = "factor"), Var3 = c(2.946666667, 3.076666667, 3.106666667, 3.043333333, 3.156666667, 3, 3.336666667, 2.98, 2.766666667, 3.133333333, 3.06, 3.006666667, 2.983333333, 3.3), Afternoon_2_id = structure(1:14, .Label = c("20180501-033-000009", "20180502-033-000021", "20180503-033-000012", "20180507-033-000016", "20180508-033-000017", "20180509-033-000008", "20180510-033-000008", "20180511-033-000014", "20180514-033-000008", "20180516-033-000009", "20180517-033-000009", "20180518-033-000008", "20180521-033-000009", "20180523-033-000008"), class = "factor"), Var4 = c(3.133333333, 3.013333333, 2.9, 3.116666667, 3.343333333, 3.156666667, 3.066666667, 2.94, 3.1, 3.04, 3.033333333, 2.973333333, 3.02, 3.21)), class = "data.frame", row.names = c(NA, -14L)) library(tidyverse) var_renamer <- function(name) { time_list <- list( "1" = "Morning_1", "2" = "Morning_2", "3" = "Afternoon_1", "4" = "Afternoon_2" ) timenum = str_remove(name, "Var") timestr = map_chr(timenum, ~ time_list[[.x]]) str_c(timestr, "-Var") } df %>% rename_at(vars(starts_with("Var")), var_renamer) %>% rename_all(funs(str_replace(., "_id", "-Id"))) %>% gather(colname, val, -Day) %>% separate(colname, c("Time", "id_var"), sep = "-") %>% mutate(Time = factor( x = Time, levels = c("Morning_1", "Morning_2", "Afternoon_1", "Afternoon_2"), ordered = TRUE )) %>% spread(id_var, val) %>% arrange(Time, Day) #> Warning: attributes are not identical across measure variables; #> they will be dropped #> Day Time Id Var #> 1 1 Morning_1 20180501-033-000001 3.156666667 #> 2 2 Morning_1 20180502-033-000001 2.986666667 #> 3 3 Morning_1 20180503-033-000001 3.073333333 #> 4 4 Morning_1 20180507-033-000001 3.236666667 #> 5 5 Morning_1 20180508-033-000001 3.03 #> 6 6 Morning_1 20180509-033-000001 3.01 #> 7 7 Morning_1 20180510-033-000001 2.916666667 #> 8 8 Morning_1 20180511-033-000001 3.293333333 #> 9 9 Morning_1 20180514-033-000001 3.136666667 #> 10 10 Morning_1 20180516-033-000001 3.116666667 #> 11 11 Morning_1 20180517-033-000003 2.843333333 #> 12 12 Morning_1 20180518-033-000001 3.033333333 #> 13 13 Morning_1 20180521-033-000002 3.173333333 #> 14 14 Morning_1 20180523-033-000001 3.336666667 #> 15 1 Morning_2 20180501-033-000002 2.866666667 #> 16 2 Morning_2 20180502-033-000002 2.93 #> 17 3 Morning_2 20180503-033-000002 3.07 #> 18 4 Morning_2 20180507-033-000002 2.99 #> 19 5 Morning_2 20180508-033-000002 3.15 #> 20 6 Morning_2 20180509-033-000002 3.02 #> 21 7 Morning_2 20180510-033-000002 3.103333333 #> 22 8 Morning_2 20180511-033-000002 3.163333333 #> 23 9 Morning_2 20180514-033-000002 3.186666667 #> 24 10 Morning_2 20180516-033-000002 3.283333333 #> 25 11 Morning_2 20180517-033-000004 3.12 #> 26 12 Morning_2 20180518-033-000002 3.29 #> 27 13 Morning_2 20180521-033-000003 2.993333333 #> 28 14 Morning_2 20180523-033-000002 3.026666667 #> 29 1 Afternoon_1 20180501-033-000008 2.946666667 #> 30 2 Afternoon_1 20180502-033-000020 3.076666667 #> 31 3 Afternoon_1 20180503-033-000011 3.106666667 #> 32 4 Afternoon_1 20180507-033-000015 3.043333333 #> 33 5 Afternoon_1 20180508-033-000015 3.156666667 #> 34 6 Afternoon_1 20180509-033-000007 3 #> 35 7 Afternoon_1 20180510-033-000007 3.336666667 #> 36 8 Afternoon_1 20180511-033-000013 2.98 #> 37 9 Afternoon_1 20180514-033-000007 2.766666667 #> 38 10 Afternoon_1 20180516-033-000008 3.133333333 #> 39 11 Afternoon_1 20180517-033-000008 3.06 #> 40 12 Afternoon_1 20180518-033-000007 3.006666667 #> 41 13 Afternoon_1 20180521-033-000008 2.983333333 #> 42 14 Afternoon_1 20180523-033-000007 3.3 #> 43 1 Afternoon_2 20180501-033-000009 3.133333333 #> 44 2 Afternoon_2 20180502-033-000021 3.013333333 #> 45 3 Afternoon_2 20180503-033-000012 2.9 #> 46 4 Afternoon_2 20180507-033-000016 3.116666667 #> 47 5 Afternoon_2 20180508-033-000017 3.343333333 #> 48 6 Afternoon_2 20180509-033-000008 3.156666667 #> 49 7 Afternoon_2 20180510-033-000008 3.066666667 #> 50 8 Afternoon_2 20180511-033-000014 2.94 #> 51 9 Afternoon_2 20180514-033-000008 3.1 #> 52 10 Afternoon_2 20180516-033-000009 3.04 #> 53 11 Afternoon_2 20180517-033-000009 3.033333333 #> 54 12 Afternoon_2 20180518-033-000008 2.973333333 #> 55 13 Afternoon_2 20180521-033-000009 3.02 #> 56 14 Afternoon_2 20180523-033-000008 3.21
Created on 2018-08-07 by the reprex package (v0.2.0).
Chapter 8: Melting & Casting, 1 Melting and Casting. 1.1 Why Reshape Your Data. Reshape2 is a package that allows us to easily transform our data into whatever structure we How can I reshape my data in R? | R FAQ When there are multiple measurements of the same subject, across time or using different tools, the data is often described as being in "wide" format if there is one observation row per subject with each measurement present as a different variable and "long" format if there is one observation row per
How to reshape data in R: tidyr vs reshape2, Reshape your data from long to wide, split a column, aggregate: a from tidyr, with the functions melt() , colsplit() and dcast() , from reshape2. Reshape in R from wide to long and from long to wide Reshape function in R transforms the data from wide to long and also transforms back the data from long to wide. Reshape in R – reshape(), is one of the efficient function to transform the data.
Reshape R package, reshape2, melt and cast, In this tutorial on data frame we saw the reshape function that can be used to convert multiple columns of a dataframe to multiple rows. Reshape your data from long to wide, split a column, aggregate: a comparison between tidyr and reshape2 R packages to tidy data The post How to reshape data in R: tidyr vs reshape2 appeared first on MilanoR.
Melting and Casting in R, Melting and casting in R, are the functions that can be used efficiently to reshape the data. The functions used to do this are called melt() and cast(). Reshaping Data . R provides a variety of methods for reshaping data prior to analysis. Transpose . Use the t() function to transpose a matrix or a data frame. In the later case, rownames become variable (column) names. # example using built-in dataset mtcars t(mtcars) The Reshape Package
Comments
reshape
is a base R function. You are not using reshape2 which usesmelt
for long format. You could even remove thetibble
call:my_data$Time <- gsub(".*\\.", "", row.names(mydata)); row.names(mydata) <- NULL
- I think the two
gather
makes a lot of a incorrect ID-Var combinations here. The output should have 56 rows only