Using .loc on just second index in multiindex


I have a MultiIndex DataFrame that looks like this:

                value
year    name                
1921    Ah      40     
1921    Ai      90      
1922    Ah      100     
1922    Ai      7

Here year and name are the index levels. I want to select every row where name is Ai. I have tried df.loc[(:,'Ai')] and df.loc['Ai'], but both raise errors. How do I select rows using only the name level?

I would use .xs on the second level of your MultiIndex (note: level=1 refers to the "second" index level, name, because of Python's zero-based indexing; level 0 is year in your case):

df.xs('Ai', level=1, drop_level=False)
# or
df.xs('Ai', level='name', drop_level=False)

           value
year name       
1921 Ai       90
1922 Ai        7
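
For anyone who wants to reproduce this, here is a minimal sketch. It assumes the question's frame can be rebuilt with MultiIndex.from_tuples (the construction is my assumption, not taken from the question) and shows what drop_level changes:

```python
import pandas as pd

# Rebuild the frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# drop_level=False keeps both index levels in the result.
kept = df.xs("Ai", level="name", drop_level=False)

# The default (drop_level=True) removes the level that was selected on,
# leaving only "year" as the index.
dropped = df.xs("Ai", level="name")

print(list(kept.index.names))     # ['year', 'name']
print(list(dropped.index.names))  # ['year']
```

Use the default if you only need the values per year; keep drop_level=False if the result has to line up with the original index.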


@sacul has the most idiomatic answer, but here are a few alternatives.

MultiIndex.get_level_values
df[df.index.get_level_values('name') == 'Ai']

           value
year name       
1921 Ai       90
1922 Ai        7

DataFrame.query
df.query('name == "Ai"')

           value
year name       
1921 Ai       90
1922 Ai        7

DataFrame.loc(axis=0) with pd.IndexSlice

Similar to @liliscent's answer, but does not need the trailing : if you specify axis=0.

df.loc(axis=0)[pd.IndexSlice[:, 'Ai']]

           value
year name       
1921 Ai       90
1922 Ai        7
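
As a rough check that the three alternatives above select the same rows, here is a small sketch. The frame construction is again my assumption; the three selection expressions are the ones shown above:

```python
import pandas as pd

# Rebuild the frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# Boolean mask built from the values of the "name" level.
by_mask = df[df.index.get_level_values("name") == "Ai"]

# query() can refer to index level names as well as column names.
by_query = df.query('name == "Ai"')

# axis=0 tells .loc that the whole key addresses the rows.
by_slice = df.loc(axis=0)[pd.IndexSlice[:, "Ai"]]

for result in (by_mask, by_query, by_slice):
    print(result.index.tolist(), result["value"].tolist())
```

The mask version is probably the easiest to extend, for example by swapping == for .isin([...]) to match several names at once.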


If you prefer loc, you can use:

df.loc[(slice(None), 'Ai'), :]

           value
year name       
1921 Ai       90
1922 Ai        7
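
A short sketch of why this matches the pd.IndexSlice spelling: pd.IndexSlice only builds the same tuple with nicer syntax. The frame construction is my assumption; the trailing : marks the tuple as the row key and selects all columns.

```python
import pandas as pd

# Rebuild the frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# pd.IndexSlice[:, "Ai"] evaluates to the plain tuple (slice(None), "Ai").
key = pd.IndexSlice[:, "Ai"]
print(key == (slice(None), "Ai"))

# Row key first, then ":" for all columns.
with_tuple = df.loc[(slice(None), "Ai"), :]
with_index_slice = df.loc[key, :]
print(with_tuple.equals(with_index_slice))
```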





Comments
  • (I approve of this answer)
  • Thanks! I also approve of yours!
  • All good answers but this is the most straightforward solution I think. Thanks very much.
  • Faster, you deserve +1
  • Good answer, this is equivalent to df.loc[pd.IndexSlice[:, 'Ai'], :].
  • @coldspeed Thanks.