how to transform dataframe so that column values are row values
I have the following dataframe, which looks like the below:
    import pandas as pd

    df = pd.DataFrame({
        'fruit': ['berries', 'berries', 'berries', 'tropical',
                  'tropical', 'tropical', 'berries', 'nuts'],
        'code': [100, 100, 100, 200, 200, 300, 400, 500],
        'subcode': ['100A', '100B', '100C', '200A',
                    '200B', '300A', '400A', '500A']
    })

       code     fruit subcode
    0   100   berries    100A
    1   100   berries    100B
    2   100   berries    100C
    3   200  tropical    200A
    4   200  tropical    200B
    5   300  tropical    300A
    6   400   berries    400A
    7   500      nuts    500A
I want to transform the dataframe to this format:
       code     fruit subcode1 subcode2 subcode3
    0   100   berries     100A     100B     100C
    3   200  tropical     200A     200B
    5   300  tropical     300A
    6   400   berries     400A
    7   500      nuts     500A
Unfortunately, I'm stuck as to how to proceed. I've consulted posts like "Unmelt Pandas DataFrame" and have tried combinations of stack and unstack. I suspect that some concatenation is involved, too. I would appreciate any advice to help point me in the right direction!
You can use groupby, take the values of each group, and convert them to Series.
    df.groupby(['code', 'fruit'])['subcode'].apply(
        lambda x: x.values
    ).apply(pd.Series).add_prefix('subcode_')

                  subcode_0 subcode_1 subcode_2
    code fruit
    100  berries       100A      100B      100C
    200  tropical      200A      200B       NaN
    300  tropical      300A       NaN       NaN
    400  berries       400A       NaN       NaN
    500  nuts          500A       NaN       NaN
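If the apply(pd.Series) step turns out to be slow on larger data, one possible alternative is to number the rows within each group with cumcount and then unstack on that number. This is only a sketch of that idea, not the answerer's method; the names `position` and `wide` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['berries', 'berries', 'berries', 'tropical',
              'tropical', 'tropical', 'berries', 'nuts'],
    'code': [100, 100, 100, 200, 200, 300, 400, 500],
    'subcode': ['100A', '100B', '100C', '200A',
                '200B', '300A', '400A', '500A'],
})

# Position of each row inside its (code, fruit) group: 0, 1, 2, ...
# (`position` is a hypothetical helper name)
position = df.groupby(['code', 'fruit']).cumcount()

wide = (
    df.set_index(['code', 'fruit', position])['subcode']
      .unstack()                 # the position level becomes the columns
      .add_prefix('subcode_')    # 0, 1, 2 -> subcode_0, subcode_1, subcode_2
      .reset_index()
)
print(wide)
```

Groups with fewer subcodes than the largest group are padded with NaN, as in the groupby output above.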
Play around a bit with set_index and unstack, and you'll get it.
    (df.set_index(['code', 'fruit'])
       .set_index(df.subcode.str.extract('([a-zA-Z]+)', expand=False), append=True)
       .subcode
       .unstack()
       .fillna('')                  # these last three
       .reset_index()               # operations are
       .rename_axis(None, axis=1)   # not important
    )

       code     fruit     A     B     C
    0   100   berries  100A  100B  100C
    1   200  tropical  200A  200B
    2   300  tropical  300A
    3   400   berries  400A
    4   500      nuts  500A
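To see what the second set_index call is keyed on, it may help to look at the str.extract step in isolation. The sketch below rebuilds just the subcode column (the variable names `subcode` and `suffix` are illustrative) and pulls out the letter suffix that ends up as the column labels A, B and C.

```python
import pandas as pd

subcode = pd.Series(
    ['100A', '100B', '100C', '200A', '200B', '300A', '400A', '500A'],
    name='subcode',
)

# With a single capture group and expand=False, str.extract returns a Series
# holding the first run of letters found in each value.
suffix = subcode.str.extract('([a-zA-Z]+)', expand=False)
print(suffix.tolist())  # ['A', 'B', 'C', 'A', 'B', 'A', 'A', 'A']
```

Note that this approach assumes every subcode ends in a letter that identifies its position within the group; if the real data does not follow that pattern, the columns will not line up this neatly.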
With defaultdict
    from collections import defaultdict

    d = defaultdict(list)
    for f, c, s in df.itertuples(index=False):
        d[(f, c)].append(s)

    pd.DataFrame.from_dict(
        {k: dict(enumerate(v)) for k, v in d.items()},
        orient='index'
    ).add_prefix('subcode').rename_axis(['fruit', 'code']).reset_index()

          fruit  code subcode0 subcode1 subcode2
    0   berries   100     100A     100B     100C
    1   berries   400     400A      NaN      NaN
    2      nuts   500     500A      NaN      NaN
    3  tropical   200     200A     200B      NaN
    4  tropical   300     300A      NaN      NaN
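For anyone unfamiliar with defaultdict, the grouping step can be tried without pandas at all. The following is a minimal sketch using plain tuples in place of the dataframe rows (the names `rows` and `nested` are illustrative); it shows the intermediate dictionary that is later handed to from_dict.

```python
from collections import defaultdict

# Plain (fruit, code, subcode) tuples standing in for the dataframe rows.
rows = [
    ('berries', 100, '100A'), ('berries', 100, '100B'), ('berries', 100, '100C'),
    ('tropical', 200, '200A'), ('tropical', 200, '200B'), ('tropical', 300, '300A'),
    ('berries', 400, '400A'), ('nuts', 500, '500A'),
]

# defaultdict(list) creates an empty list the first time a key is seen,
# so there is no need to check whether the key already exists.
d = defaultdict(list)
for fruit, code, subcode in rows:
    d[(fruit, code)].append(subcode)

# dict(enumerate(v)) turns each list into {position: value}; those positions
# become the column labels once the result is passed to DataFrame.from_dict.
nested = {k: dict(enumerate(v)) for k, v in d.items()}
print(nested[('berries', 100)])  # {0: '100A', 1: '100B', 2: '100C'}
```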
Comments
- I like this approach, but I dislike the apply(Series). Good effort though!
- I do agree, it consumes a lot of time.
- Is there any difference between applying ravel versus list?
- @ALollz I just realized that's unnecessary.
- Thanks so much! This totally works on my data set, and I learned that there is an .add_prefix/.add_suffix for row/column labels.
- Thanks! I'll have to read up on defaultdict to see how it works. I definitely appreciate learning different approaches.