Pandas MultiIndex Vector Setting

I have a DataFrame with multiindex like this:

             0         1         2
 a 0  0.928295  0.828225 -0.612509
   1  1.103340 -0.540640 -0.344500
   2 -1.760918 -1.426488 -0.647610
   3 -0.782976  0.359211  1.601602
   4  0.334406 -0.508752 -0.611212
 b 2  0.717163  0.902514  1.027191
   3  0.296955  1.543040 -1.429113
   4 -0.651468  0.665114  0.949849
 c 0  0.195620 -0.240177  0.745310
   1  1.244997 -0.817949  0.130422
   2  0.288510  1.123550  0.211385
   3 -1.060227  1.739789  2.186224
   4 -0.109178 -1.645732  0.022480
 d 3  0.021789  0.747183  0.614485
   4 -1.074870  0.407974 -0.961013

What I want : array([1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0])

Now I want to generate a zero vector that has the same length as this DataFrame and has ones only on the first element of the level[1] index within each level[0] group. For example, here the df has a shape of (15, 3), so I want a vector of length 15 with 1 at (a, 0), (b, 2), (c, 0), (d, 3) and 0 everywhere else. How can I generate a vector like that? (If possible, without looping to get each sub-vector and then using np.concatenate().) Thanks a lot!

IIUC, you can use duplicated:

(~df.index.get_level_values(0).duplicated()).astype(int)
Out[726]: array([1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0])
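For anyone who wants to try this end to end, here is a minimal sketch that rebuilds an index shaped like the one in the question (the data values are random placeholders, not the original numbers) and applies the same duplicated idea:

```python
import numpy as np
import pandas as pd

# Rebuild the index from the question; the values are random placeholders.
tuples = [("a", 0), ("a", 1), ("a", 2), ("a", 3), ("a", 4),
          ("b", 2), ("b", 3), ("b", 4),
          ("c", 0), ("c", 1), ("c", 2), ("c", 3), ("c", 4),
          ("d", 3), ("d", 4)]
index = pd.MultiIndex.from_tuples(tuples)
df = pd.DataFrame(np.random.randn(len(index), 3), index=index)

# duplicated() is False on the first occurrence of each outer label,
# so inverting it marks the first row of every group.
first_mask = ~df.index.get_level_values(0).duplicated()
vector = first_mask.astype(int)
print(vector)
```

Note that duplicated only flags the first time a label is seen anywhere in the index, so this relies on each outer label forming one contiguous block, as it does in the question.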

Or using groupby and head

df.loc[df.groupby(level=0).head(1).index,'New']=1
df.New.fillna(0).values
Out[721]: array([1., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 0.])
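If adding a helper column and getting floats back is not desirable, a possible variant (a sketch, not part of the original answer) is to use cumcount on the same groupby. The index below is rebuilt to match the question, and the zero-filled data is just a placeholder:

```python
import numpy as np
import pandas as pd

# Same index shape as in the question; the data values are placeholders.
index = pd.MultiIndex.from_tuples(
    [("a", i) for i in range(5)] + [("b", i) for i in (2, 3, 4)]
    + [("c", i) for i in range(5)] + [("d", i) for i in (3, 4)]
)
df = pd.DataFrame(np.zeros((len(index), 3)), index=index)

# cumcount() numbers the rows inside each outer group starting from 0,
# so the rows numbered 0 are exactly the group heads.
vector = df.groupby(level=0).cumcount().eq(0).astype(int).to_numpy()
print(vector)
```

This leaves df unchanged and returns an integer array directly. It assumes pandas 0.24 or later for to_numpy().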

Get the integer codes (called labels before pandas 0.24) of the first MultiIndex level, turn them into a Series, then find where each one differs from the one before it:

labels = pd.Series(df.index.codes[0])  # df.index.labels[0] on pandas < 0.24

v = labels.ne(labels.shift()).astype(int).values

>>> v
array([1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0])
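Comparing each code with its neighbour is not quite the same thing as calling duplicated. The sketch below uses a small hypothetical index (not the one from the question) in which the outer label "a" shows up in two separate runs, and it assumes pandas 0.24 or later for the codes attribute:

```python
import numpy as np
import pandas as pd

# Hypothetical index where the outer label "a" appears in two separate runs.
index = pd.MultiIndex.from_tuples([("a", 0), ("a", 1), ("b", 0), ("a", 2)])

codes = np.asarray(index.codes[0])  # integer position of each outer label

# Compare each code with its predecessor; the first row always starts a run.
run_start = np.r_[True, codes[1:] != codes[:-1]].astype(int)

# duplicated() only flags the very first occurrence of each label.
first_seen = (~index.get_level_values(0).duplicated()).astype(int)

print(run_start)   # [1 0 1 1]
print(first_seen)  # [1 0 1 0]
```

On the index from the question both approaches agree, because every outer label there forms a single contiguous block.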

pd.Index(df.index.codes[0])  # df.index.labels[0] on pandas < 0.24
Int64Index([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3], dtype='int64')
res = pd.Index(df.index.codes[0]).duplicated(keep='first')
array([False,  True,  True,  True,  True, False,  True,  True, False,
       True,  True,  True,  True, False,  True])

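A MultiIndex has a labels attribute (renamed codes in pandas 0.24) that gives the integer position of each row's value within its level. False in res marks the first row of each outer group, so (~res).astype(int) is the vector the question asks for.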

Comments
  • The duplicated method is nice!
  • Thanks for your answer and I have just found a simpler solution. Thank you all the same.
  • @Kid Hi, you can upvote the answers and accept the one you like.