Reshape pivot table in pandas

I need to reshape a pivot table loaded from a CSV file. A small extract looks like:

      country          location  confirmedcases_10-02-2020  deaths_10-02-2020  confirmedcases_11-02-2020  deaths_11-02-2020
0   Australia   New South Wales                        4.0                0.0                          4                0.0
1   Australia          Victoria                        4.0                0.0                          4                0.0
2   Australia        Queensland                        5.0                0.0                          5                0.0
3   Australia   South Australia                        2.0                0.0                          2                0.0
4    Cambodia     Sihanoukville                        1.0                0.0                          1                0.0
5      Canada           Ontario                        3.0                0.0                          3                0.0
6      Canada  British Columbia                        4.0                0.0                          4                0.0
7       China             Hubei                    31728.0              974.0                      33366             1068.0
8       China          Zhejiang                     1177.0                0.0                       1131                0.0
9       China         Guangdong                     1177.0                1.0                       1219                1.0
10      China             Henan                     1105.0                7.0                       1135                8.0
11      China             Hunan                      912.0                1.0                        946                2.0
12      China             Anhui                      860.0                4.0                        889                4.0
13      China           Jiangxi                      804.0                1.0                        844                1.0
14      China         Chongqing                      486.0                2.0                        505                3.0
15      China           Sichuan                      417.0                1.0                        436                1.0
16      China          Shandong                      486.0                1.0                        497                1.0
17      China           Jiangsu                      515.0                0.0                        543                0.0
18      China          Shanghai                      302.0                1.0                        311                1.0
19      China           Beijing                      342.0                3.0                        352                3.0

I would like to reshape it into something like the table below. Is there a ready-to-use pandas tool for this?

      country          location        date  confirmedcases  deaths
0   Australia   New South Wales  2020-02-10             4.0     0.0
1   Australia          Victoria  2020-02-10             4.0     0.0
2   Australia        Queensland  2020-02-10             5.0     0.0
3   Australia   South Australia  2020-02-10             2.0     0.0
4    Cambodia     Sihanoukville  2020-02-10             1.0     0.0
5      Canada           Ontario  2020-02-10             3.0     0.0
6      Canada  British Columbia  2020-02-10             4.0     0.0
7       China             Hubei  2020-02-10         31728.0   974.0
8       China          Zhejiang  2020-02-10          1177.0     0.0
9       China         Guangdong  2020-02-10          1177.0     1.0
10      China             Henan  2020-02-10          1105.0     7.0
11      China             Hunan  2020-02-10           912.0     1.0
12      China             Anhui  2020-02-10           860.0     4.0
13      China           Jiangxi  2020-02-10           804.0     1.0
14      China         Chongqing  2020-02-10           486.0     2.0
15      China           Sichuan  2020-02-10           417.0     1.0
16      China          Shandong  2020-02-10           486.0     1.0
17      China           Jiangsu  2020-02-10           515.0     0.0
18      China          Shanghai  2020-02-10           302.0     1.0
19      China           Beijing  2020-02-10           342.0     3.0
20  Australia   New South Wales  2020-02-11             4.0     0.0
21  Australia          Victoria  2020-02-11             4.0     0.0
22  Australia        Queensland  2020-02-11             5.0     0.0
23  Australia   South Australia  2020-02-11             2.0     0.0
24   Cambodia     Sihanoukville  2020-02-11             1.0     0.0
25     Canada           Ontario  2020-02-11             3.0     0.0
26     Canada  British Columbia  2020-02-11             4.0     0.0
27      China             Hubei  2020-02-11         33366.0  1068.0
28      China          Zhejiang  2020-02-11          1131.0     0.0
29      China         Guangdong  2020-02-11          1219.0     1.0
30      China             Henan  2020-02-11          1135.0     8.0
31      China             Hunan  2020-02-11           946.0     2.0
32      China             Anhui  2020-02-11           889.0     4.0
33      China           Jiangxi  2020-02-11           844.0     1.0
34      China         Chongqing  2020-02-11           505.0     3.0
35      China           Sichuan  2020-02-11           436.0     1.0
36      China          Shandong  2020-02-11           497.0     1.0
37      China           Jiangsu  2020-02-11           543.0     0.0
38      China          Shanghai  2020-02-11           311.0     1.0
39      China           Beijing  2020-02-11           352.0     3.0

Use pd.wide_to_long:

import pandas as pd

print(pd.wide_to_long(df, stubnames=["confirmedcases", "deaths"],
                      i=["country", "location"], j="date", sep="_",
                      suffix=r"\d{2}-\d{2}-\d{4}").reset_index())

      country          location        date  confirmedcases  deaths
0   Australia   New South Wales  10-02-2020             4.0     0.0
1   Australia   New South Wales  11-02-2020             4.0     0.0
2   Australia          Victoria  10-02-2020             4.0     0.0
3   Australia          Victoria  11-02-2020             4.0     0.0
4   Australia        Queensland  10-02-2020             5.0     0.0
5   Australia        Queensland  11-02-2020             5.0     0.0
6   Australia   South Australia  10-02-2020             2.0     0.0
7   Australia   South Australia  11-02-2020             2.0     0.0
8    Cambodia     Sihanoukville  10-02-2020             1.0     0.0
9    Cambodia     Sihanoukville  11-02-2020             1.0     0.0
10     Canada           Ontario  10-02-2020             3.0     0.0
11     Canada           Ontario  11-02-2020             3.0     0.0
12     Canada  British Columbia  10-02-2020             4.0     0.0
13     Canada  British Columbia  11-02-2020             4.0     0.0
14      China             Hubei  10-02-2020         31728.0   974.0
15      China             Hubei  11-02-2020         33366.0  1068.0
16      China          Zhejiang  10-02-2020          1177.0     0.0
17      China          Zhejiang  11-02-2020          1131.0     0.0
18      China         Guangdong  10-02-2020          1177.0     1.0
19      China         Guangdong  11-02-2020          1219.0     1.0
20      China             Henan  10-02-2020          1105.0     7.0
21      China             Henan  11-02-2020          1135.0     8.0
22      China             Hunan  10-02-2020           912.0     1.0
23      China             Hunan  11-02-2020           946.0     2.0
24      China             Anhui  10-02-2020           860.0     4.0
25      China             Anhui  11-02-2020           889.0     4.0
26      China           Jiangxi  10-02-2020           804.0     1.0
27      China           Jiangxi  11-02-2020           844.0     1.0
28      China         Chongqing  10-02-2020           486.0     2.0
29      China         Chongqing  11-02-2020           505.0     3.0
30      China           Sichuan  10-02-2020           417.0     1.0
31      China           Sichuan  11-02-2020           436.0     1.0
32      China          Shandong  10-02-2020           486.0     1.0
33      China          Shandong  11-02-2020           497.0     1.0
34      China           Jiangsu  10-02-2020           515.0     0.0
35      China           Jiangsu  11-02-2020           543.0     0.0
36      China          Shanghai  10-02-2020           302.0     1.0
37      China          Shanghai  11-02-2020           311.0     1.0
38      China           Beijing  10-02-2020           342.0     3.0
39      China           Beijing  11-02-2020           352.0     3.0
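
Note that the date column in the output above still holds day-month-year strings, while the question asks for dates like 2020-02-10 with the rows grouped by date. A minimal sketch of that extra step, assuming a hypothetical two-row extract of the table (the frame and the variable name long_df are illustrative, not from the original answer):

```python
import pandas as pd

# Hypothetical two-row extract of the question's table.
df = pd.DataFrame({
    "country": ["Australia", "China"],
    "location": ["Victoria", "Hubei"],
    "confirmedcases_10-02-2020": [4.0, 31728.0],
    "deaths_10-02-2020": [0.0, 974.0],
    "confirmedcases_11-02-2020": [4, 33366],
    "deaths_11-02-2020": [0.0, 1068.0],
})

long_df = pd.wide_to_long(
    df,
    stubnames=["confirmedcases", "deaths"],
    i=["country", "location"],
    j="date",
    sep="_",
    suffix=r"\d{2}-\d{2}-\d{4}",
).reset_index()

# The suffix is day-month-year, so state the format explicitly
# instead of letting pandas guess between day-first and month-first.
long_df["date"] = pd.to_datetime(long_df["date"], format="%d-%m-%Y")

# Order rows by date first, as in the layout requested in the question.
# mergesort is stable, so rows keep their relative order within each date.
long_df = long_df.sort_values("date", kind="mergesort").reset_index(drop=True)

print(long_df)
```

Passing format="%d-%m-%Y" avoids any ambiguity for values such as 10-02-2020, which could otherwise be read as October 2nd.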

Yes, and you can achieve it by reshaping the dataframe.

First, you have to melt the columns so that their names become values:

df = df.melt(id_vars=['country', 'location'],
             value_vars=[c for c in df.columns if c not in ['country', 'location']],
             var_name='key',
             value_name='value')

#>       country         location                        key  value
#> 0   Australia  New South Wales  confirmedcases_10-02-2020      4
#> 1   Australia         Victoria  confirmedcases_10-02-2020      4
#> 2   Australia       Queensland  confirmedcases_10-02-2020      5
#> 3   Australia  South Australia  confirmedcases_10-02-2020      2
#> 4    Cambodia    Sihanoukville  confirmedcases_10-02-2020      1
#> ..        ...              ...                        ...    ...
#> 75      China          Sichuan          deaths_11-02-2020      1
#> 76      China         Shandong          deaths_11-02-2020      1
#> 77      China          Jiangsu          deaths_11-02-2020      0
#> 78      China         Shanghai          deaths_11-02-2020      1
#> 79      China          Beijing          deaths_11-02-2020      3

After that, you need to split the values in the key column into the measure name and the date:

key_split_series = df.key.str.split("_", expand=True)
df["key"] = key_split_series[0]
df["date"] = key_split_series[1]

#>       country         location             key  value        date
#> 0   Australia  New South Wales  confirmedcases      4  10-02-2020
#> 1   Australia         Victoria  confirmedcases      4  10-02-2020
#> 2   Australia       Queensland  confirmedcases      5  10-02-2020
#> 3   Australia  South Australia  confirmedcases      2  10-02-2020
#> 4    Cambodia    Sihanoukville  confirmedcases      1  10-02-2020
#> ..        ...              ...             ...    ...         ...
#> 75      China          Sichuan          deaths      1  11-02-2020
#> 76      China         Shandong          deaths      1  11-02-2020
#> 77      China          Jiangsu          deaths      0  11-02-2020
#> 78      China         Shanghai          deaths      1  11-02-2020
#> 79      China          Beijing          deaths      3  11-02-2020

In the end, you just need to pivot the table to have confirmedcases and deaths back as columns:

df = df.set_index(["country", "location", "date", "key"])["value"].unstack().reset_index()

#> key    country         location        date  confirmedcases  deaths
#> 0    Australia  New South Wales  10-02-2020               4       0
#> 1    Australia  New South Wales  11-02-2020               4       0
#> 2    Australia       Queensland  10-02-2020               5       0
#> 3    Australia       Queensland  11-02-2020               5       0
#> 4    Australia  South Australia  10-02-2020               2       0
#> ..         ...              ...         ...             ...     ...
#> 35       China         Shanghai  11-02-2020             311       1
#> 36       China          Sichuan  10-02-2020             417       1
#> 37       China          Sichuan  11-02-2020             436       1
#> 38       China         Zhejiang  10-02-2020            1177       0
#> 39       China         Zhejiang  11-02-2020            1131       0
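
The melt, split and unstack steps above can also be wrapped into one helper. This is only a sketch: the function name wide_to_long_via_melt, its parameters and the two-row sample frame are hypothetical, and it assumes every non-id column is named measure_date.

```python
import pandas as pd


def wide_to_long_via_melt(df, id_cols, sep="_"):
    """Melt, split 'measure<sep>date' column names, then unstack the measures."""
    melted = df.melt(id_vars=id_cols, var_name="key", value_name="value")
    # n=1 splits on the first separator only, so the date part stays intact.
    parts = melted["key"].str.split(sep, n=1, expand=True)
    melted["key"] = parts[0]
    melted["date"] = parts[1]
    out = (
        melted.set_index(id_cols + ["date", "key"])["value"]
        .unstack("key")
        .reset_index()
    )
    out.columns.name = None  # drop the leftover "key" label on the columns
    return out


# Hypothetical two-row extract of the question's table.
df = pd.DataFrame({
    "country": ["Australia", "China"],
    "location": ["Victoria", "Hubei"],
    "confirmedcases_10-02-2020": [4.0, 31728.0],
    "deaths_10-02-2020": [0.0, 974.0],
    "confirmedcases_11-02-2020": [4, 33366],
    "deaths_11-02-2020": [0.0, 1068.0],
})

result = wide_to_long_via_melt(df, ["country", "location"])
print(result)
```

Clearing columns.name removes the "key" label that shows up in the top-left corner of the printed result above. Unlike the step-by-step version, the helper does not overwrite the input frame.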

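Use {dataframe}.values.reshape((-1,1)) if there is only one feature and {dataframe}.values.reshape((1,-1)) if there is only one sample. Note that reshape is a NumPy array method (a DataFrame has no reshape), so this changes array dimensions rather than turning the wide table into the long one asked for.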

Comments
  • Yes, pd.wide_to_long.
  • Seems like a good solution. However, with this code I always get an empty dataframe, and I cannot find a way (if there is one) to specify the date format: pd.wide_to_long(df, ["confirmedcases_", "deaths_"], i="id", j="year")
  • awesome, thanks! So in my case I have to set up a regex to get the correct dates.
  • Yes, to capture what you want. You can look at the suffix parameter in the docs.
  • Hi @AjuDevus welcome to stackoverflow! You may want to look at the guidelines on how to answer questions at SO. You can check out the page here and provide a readable explanation for your answer to the posted question.
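
On the suffix discussion in the comments: the documented default is suffix=r'\d+', which cannot match a tail such as 10-02-2020 because of the hyphens, so a custom pattern has to be supplied. A minimal sketch, assuming a hypothetical two-row frame, that passes the stub names without the trailing underscore (the separator goes in sep) and uses a catch-all suffix instead of a date-specific regex:

```python
import pandas as pd

# Hypothetical two-row extract of the question's table.
df = pd.DataFrame({
    "country": ["Canada", "China"],
    "location": ["Ontario", "Hubei"],
    "confirmedcases_10-02-2020": [3.0, 31728.0],
    "deaths_10-02-2020": [0.0, 974.0],
    "confirmedcases_11-02-2020": [3.0, 33366.0],
    "deaths_11-02-2020": [0.0, 1068.0],
})

# Stub names are given without the separator; sep="_" is passed separately.
# suffix=r".+" accepts any non-empty tail after the separator.
loose = pd.wide_to_long(
    df,
    stubnames=["confirmedcases", "deaths"],
    i=["country", "location"],
    j="date",
    sep="_",
    suffix=r".+",
).reset_index()

print(sorted(loose["date"].unique()))
```

The catch-all is convenient but will also pick up any other column that starts with a stub name and the separator, so the stricter \d{2}-\d{2}-\d{4} pattern from the answer is the safer choice when the column names are known.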