How to transform a DataFrame so that column values become row values

I have the following dataframe, which looks like the below:

import pandas as pd

df = pd.DataFrame({'fruit': ['berries', 'berries', 'berries', 'tropical',
                             'tropical', 'tropical', 'berries', 'nuts'],
                   'code': [100, 100, 100, 200, 200, 300, 400, 500],
                   'subcode': ['100A', '100B', '100C', '200A', '200B', '300A',
                               '400A', '500A']})


     code  fruit     subcode
  0  100   berries   100A
  1  100   berries   100B
  2  100   berries   100C
  3  200   tropical  200A
  4  200   tropical  200B
  5  300   tropical  300A
  6  400   berries   400A
  7  500   nuts      500A

I want to transform the dataframe to this format:

     code  fruit     subcode1  subcode2  subcode3
  0  100   berries   100A      100B      100C
  3  200   tropical  200A      200B
  5  300   tropical  300A
  6  400   berries   400A
  7  500   nuts      500A

Unfortunately, I'm stuck as to how to proceed. I've consulted posts like "Unmelt Pandas DataFrame" and have tried combinations of stack and unstack. I suspect that some concatenation is involved, too. I would appreciate any advice to help point me in the right direction!

You can group by code and fruit, take each group's subcode values as an array, and expand those arrays into columns by converting them to Series.

(df.groupby(['code', 'fruit'])['subcode']
   .apply(lambda x: x.values)    # one array of subcodes per group
   .apply(pd.Series)             # expand each array into columns
   .add_prefix('subcode_'))

                subcode_0 subcode_1 subcode_2
code fruit                                 
100  berries       100A      100B      100C
200  tropical      200A      200B       NaN
300  tropical      300A       NaN       NaN
400  berries       400A       NaN       NaN
500  nuts          500A       NaN       NaN
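
As a comment below notes, apply(pd.Series) can be slow on larger frames. One possible variant, offered as a sketch rather than as part of the original answer, collects each group's subcodes into a list and builds the wide frame with a single DataFrame constructor call. The names lists and wide are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['berries', 'berries', 'berries', 'tropical',
              'tropical', 'tropical', 'berries', 'nuts'],
    'code': [100, 100, 100, 200, 200, 300, 400, 500],
    'subcode': ['100A', '100B', '100C', '200A', '200B', '300A',
                '400A', '500A'],
})

# Collect each group's subcodes into a plain Python list.
lists = df.groupby(['code', 'fruit'])['subcode'].agg(list)

# Build the wide frame in one constructor call; rows with fewer
# subcodes are padded with missing values.
wide = pd.DataFrame(lists.tolist(), index=lists.index).add_prefix('subcode_')
print(wide)
```

The result keeps code and fruit as the index, as in the output above; append .reset_index() to turn them back into ordinary columns.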

Play around a bit with set_index and unstack, and you'll get it.

(df.set_index(['code', 'fruit'])
   .set_index(df.subcode.str.extract('([a-zA-Z]+)', expand=False), append=True)
   .subcode
   .unstack()
   .fillna('')                  # these last three 
   .reset_index()               # operations are  
   .rename_axis(None, axis=1)   # not important
)

   code     fruit     A     B     C
0   100   berries  100A  100B  100C
1   200  tropical  200A  200B      
2   300  tropical  300A            
3   400   berries  400A            
4   500      nuts  500A            
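
The approach above relies on every subcode ending in a distinct letter within its group. If that does not hold for your data, a possible alternative (a sketch, not the original answerer's code) is to number the rows within each group using groupby(...).cumcount() and unstack on that counter instead. The name position is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['berries', 'berries', 'berries', 'tropical',
              'tropical', 'tropical', 'berries', 'nuts'],
    'code': [100, 100, 100, 200, 200, 300, 400, 500],
    'subcode': ['100A', '100B', '100C', '200A', '200B', '300A',
                '400A', '500A'],
})

# Number the rows within each (code, fruit) group: 0, 1, 2, ...
position = df.groupby(['code', 'fruit']).cumcount()

wide = (
    df.set_index(['code', 'fruit', position])['subcode']
      .unstack()                 # the per-group counter becomes the columns
      .add_prefix('subcode_')
      .reset_index()
)
print(wide)
```

Missing positions are left as NaN here; chain .fillna('') before .reset_index() if blank cells are preferred.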

With defaultdict

from collections import defaultdict


d = defaultdict(list)

for f, c, s in df.itertuples(index=False):
    d[(f, c)].append(s)

pd.DataFrame.from_dict(
    {k: dict(enumerate(v)) for k, v in d.items()}, orient='index'
).add_prefix('subcode').rename_axis(['fruit', 'code']).reset_index()

      fruit  code subcode0 subcode1 subcode2
0   berries   100     100A     100B     100C
1   berries   400     400A      NaN      NaN
2      nuts   500     500A      NaN      NaN
3  tropical   200     200A     200B      NaN
4  tropical   300     300A      NaN      NaN

Comments
  • I like this approach, but I dislike the apply(Series). Good effort though!
  • I do agree, it consumes a lot of time.
  • Is there any difference between applying ravel versus list?
  • @ALollz I just realized that's unnecessary.
  • Thanks so much! This totally works on my data set, and I learned that there is an .add_prefix/.add_suffix for row/column labels.
  • Thanks! I'll have to read up on defaultdict to see how it works. I definitely appreciate learning different approaches.