How to reshape a data frame with multiple values for each id? (like a pivot table in Excel)

I have this data frame:

data <- data.frame(id=sample(1:10,2000,replace = T),value=sample(100:10000,2000,replace = T))
> head(data)
  id value
1  4  2032
2  3  2512
3  9  8925
4  8  8527
5  6  5176
6  9  8182

Now I want each id to become a column name, with the values that belong to that id listed as the rows of that column.

I do not want to summarise the values. I want to group them by id and turn each id into its own column.

This should work:

library(tidyverse)

data %>% 
  group_by(id = paste("id", id, sep = "_")) %>%
  mutate(rn = row_number()) %>%
  spread(id, value) %>%
  select(-rn)

Output (first 10 rows):

    id_1 id_10  id_2  id_3  id_4  id_5  id_6  id_7  id_8  id_9
   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1  8161   576  4921  5965  8969  8419  7898  5724  6513  7475
 2  8526  8121  5200  7847  4033  9348  5051  4430  9320  2973
 3  4587  4505  1747  6179  6358   234  5649  5780  3579  4986
 4  2609  9058  5709  4284  4068   523  9156  3253  6753  5570
 5  1261  4533  5954  7703  2460  2171  4196  7576  7118  8702
 6  3125  8303  2364  9305  9094  1211  3439  8201  5268  6794
 7  3464   657  2917  4831  6154  3125  9964  9324  1917  7439
 8  6601  2297  4163  7866  6701  6336   262  6725  7646  5361
 9  3042  4296  9312  8990   366  5891  3984  4675  7289  9549
10  4829  5565  8841   775  5482  9519  1084  1845  4735  3467
# ... with 203 more rows

The tail of the dataset looks like:

   id_1 id_10  id_2  id_3  id_4  id_5  id_6  id_7  id_8  id_9
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1  2723    NA    NA    NA    NA    NA    NA    NA  7147    NA
2  7746    NA    NA    NA    NA    NA    NA    NA  1809    NA
3  4281    NA    NA    NA    NA    NA    NA    NA  8140    NA
4    NA    NA    NA    NA    NA    NA    NA    NA  6564    NA
5    NA    NA    NA    NA    NA    NA    NA    NA  6001    NA
6    NA    NA    NA    NA    NA    NA    NA    NA  3471    NA
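
In current tidyr releases, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same idea with pivot_wider(), assuming tidyr 1.0.0 or later. The seed and the name `wide` are illustrative assumptions and not part of the answer above:

```r
library(dplyr)
library(tidyr)

set.seed(42)  # assumption: seed chosen only to make the sketch reproducible
data <- data.frame(id = sample(1:10, 2000, replace = TRUE),
                   value = sample(100:10000, 2000, replace = TRUE))

wide <- data %>%
  group_by(id) %>%
  mutate(rn = row_number()) %>%   # running index within each id
  ungroup() %>%
  pivot_wider(names_from = id, values_from = value,
              names_prefix = "id_") %>%
  select(-rn)                     # drop the helper index

dim(wide)  # one row per running index, one column per id
```

As in the spread() version, ids with fewer values than the most frequent id are padded with NA at the bottom of their column.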

First split partial data frames by ID into a temporary list.

ls1 <- lapply(sort(unique(data$id)), function(x) data[data$id == x, ])

Second, number the values within each ID and bind everything back into a single data frame with the original structure.

data <- do.call(rbind, 
                lapply(1:(length(ls1)), 
                       function(x) transform(ls1[[x]], 
                                             time=1:length(ls1[[x]][[1]]))))
rm(ls1)  # remove tmp list

Finally use reshape().

result <- reshape(data, idvar="time", timevar="id", direction="wide")

Yields:

> head(result)
   time value.1 value.2 value.3 value.4 value.5 value.6 value.7 value.8 value.9 value.10
25    1    8097    8445    7029    3001    2823    7371    8359    6504    8902     9901
35    2     565    6701    6765    1187     116    9527    1680    3701    8514     4441
37    3    5383    5311    1073    9261    7899    6894    2297    1335    2910     5700
43    4    4885    6716    1608    6547    7379    5821    1295     866     702     8029
55    5    7721    8430    5324    6937     195    5758    1704    8017    9744     2062
71    6    4537    7004    8477    2071    9130    2072    4455    6628    6076     3888

> dim(result)
[1] 226  11

Data:

set.seed(42)
data <- data.frame(id=sample(1:10, 2000, replace=TRUE),
                   value=sample(100:10000, 2000, replace=TRUE))
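
The split-and-rbind step can probably be shortened with ave(), which numbers the rows within each id directly. This is only a sketch of an alternative under that assumption, not the method used in the answer above. Sorting by id first is an added step so that the wide columns come out in id order:

```r
set.seed(42)
data <- data.frame(id = sample(1:10, 2000, replace = TRUE),
                   value = sample(100:10000, 2000, replace = TRUE))

# sort by id so the reshaped columns appear as value.1, value.2, ...
data <- data[order(data$id), ]

# running index of each value within its id
data$time <- ave(data$value, data$id, FUN = seq_along)

result <- reshape(data, idvar = "time", timevar = "id", direction = "wide")
dim(result)  # rows = size of the largest id group, columns = time + one per id
```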

The catch is that spread() needs unique id values, i.e. each id may appear only once. Duplicated ids therefore have to be dropped first, which keeps only the first value for each id.

library(dplyr)  # distinct() and %>%
library(tidyr)  # spread()

set.seed(999)
data<-data.frame(id=sample(1:10,2000,replace = T),value=sample(100:10000,2000,replace = T))

# reshape to wide format
oo <- data %>% 
  distinct(id, .keep_all = TRUE) %>% 
  spread(id, value)

# rename columns, add prefix 'id'
colnames(oo) <- sapply(colnames(oo), function(x) paste0("id_", x))

Output

  id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10
1 9850 9160  407 4846 6612 9174 8294 1277 8854  9941

The first step is to create a list, where each element corresponds to one id:

l <- tapply(data$value, data$id, list)
l["2"]
# $`2`
#   [1] 3961 2644 4194 3630 2485  353 6801 4487 9770 5793 9291 7071 1842
#  [14] 1970 6200 6499 4067 2968 3879 1677 3964 4934 5891 7502 7333 7742
#  ....

Actually, for most purposes I would recommend using this list structure rather than the layout you are asking for in your question. If you still need the wide layout, we now have multiple vectors of unequal length that we want to cbind. There have been multiple proposals for how to do that (see, e.g., here). For instance,

library(qpcR)
result <- do.call(qpcR:::cbind.na, l)
head(result, 2)
#         1    2    3    4    5    6    7    8    9   10
# [1,] 3118 6938 2360 9680 1540 4900 1427  680 3020 3824
# [2,] 4430 9265 4275 3689  624 6713  196 4605 9439  190
tail(result, 2)
#         1  2  3  4    5  6  7  8  9 10
# [212,] NA NA NA NA 1775 NA NA NA NA NA
# [213,] NA NA NA NA 9398 NA NA NA NA NA
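
If installing qpcR only for cbind.na is not desirable, the padding can also be sketched in base R. This is an alternative under that assumption, not the approach used above. It uses split() instead of tapply() to get a plain list, and relies on the fact that indexing a vector past its end returns NA:

```r
set.seed(42)
data <- data.frame(id = sample(1:10, 2000, replace = TRUE),
                   value = sample(100:10000, 2000, replace = TRUE))

l <- split(data$value, data$id)   # one vector of values per id
n <- max(lengths(l))              # length of the longest vector

# indexing up to n pads the shorter vectors with NA
result <- sapply(l, function(v) v[seq_len(n)])
dim(result)  # n rows, one column per id
```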

Comments
  • Are the values sums for every id?
  • No, the original values are needed. I have tried dcast but it did not work.
  • Why prob in the question title?
  • sorry it was edited now
  • Thanks for the response, but I don't want to summarise, and I have added a bit more information to the title. I want to group the values according to id and convert the ids into columns.
  • In this case just skip the summarise part?
  • In simple words: the thread below has a clear example: stackoverflow.com/questions/53236989/…
  • @saisaran, in this case take a look at my edit, it should work fine now.