Using .loc on just second index in multiindex
I have a MultiIndex DataFrame that looks like this:

               value
    year name
    1921 Ah       40
    1921 Ai       90
    1922 Ah      100
    1922 Ai        7

year and name are the indices. I want to select every row where the name Ai appears. I have tried
df.loc['Ai'] but both give errors. How do I index using only the name level?
I would use .xs on the name level of your MultiIndex (note: level=1 refers to the "second" index, name, because of Python's zero-based indexing; level 0 is year in your case):

    df.xs('Ai', level=1, drop_level=False)
    # or
    df.xs('Ai', level='name', drop_level=False)

               value
    year name
    1921 Ai       90
    1922 Ai        7
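For anyone who wants to try this, here is a minimal runnable sketch. It assumes the frame can be rebuilt from the values printed in the question; the variable names `kept` and `dropped` are only illustrative. It also shows what `drop_level=False` changes compared with the default.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# Keep both index levels in the result.
kept = df.xs("Ai", level="name", drop_level=False)

# Default behaviour (drop_level=True): the selected level is removed,
# so the result is indexed by year only.
dropped = df.xs("Ai", level="name")

print(kept)
print(dropped)
```

Referring to the level by name rather than by position tends to be more robust if the order of the index levels changes later.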
@sacul has the most idiomatic answer, but here are a few alternatives.
Boolean mask on the level values:

    df[df.index.get_level_values('name') == 'Ai']

               value
    year name
    1921 Ai       90
    1922 Ai        7

query, which can refer to index levels by name:

    df.query('name == "Ai"')

               value
    year name
    1921 Ai       90
    1922 Ai        7

Similar to @liliscent's answer, but does not need the trailing : if you specify axis=0:

    df.loc(axis=0)[pd.IndexSlice[:, 'Ai']]

               value
    year name
    1921 Ai       90
    1922 Ai        7
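The three alternatives above can be checked against each other with a short sketch. It assumes the same frame as in the question, rebuilt here from the printed values; the names `by_mask`, `by_query`, and `by_axis` are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# 1. Boolean mask built from the values of the 'name' level.
by_mask = df[df.index.get_level_values("name") == "Ai"]

# 2. query() resolving the index level by its name.
by_query = df.query('name == "Ai"')

# 3. .loc restricted to the row axis, so no trailing ':' for columns.
by_axis = df.loc(axis=0)[pd.IndexSlice[:, "Ai"]]

print(by_mask)
```

On this small frame all three select the same two rows and keep both index levels.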
If you prefer loc, you can use:

    df.loc[(slice(None), 'Ai'), :]

               value
    year name
    1921 Ai       90
    1922 Ai        7
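As a rough sketch of how the tuple form relates to pd.IndexSlice, the snippet below rebuilds the frame from the values printed in the question (an assumption about the original data) and compares the two spellings. The names `with_slice` and `with_idx` are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [(1921, "Ah"), (1921, "Ai"), (1922, "Ah"), (1922, "Ai")],
    names=["year", "name"],
)
df = pd.DataFrame({"value": [40, 90, 100, 7]}, index=index)

# slice(None) means "everything on this level" (here: every year).
with_slice = df.loc[(slice(None), "Ai"), :]

# pd.IndexSlice lets the same key be written with ':' syntax.
with_idx = df.loc[pd.IndexSlice[:, "Ai"], :]

print(with_slice)
```

The trailing `, :` selects all columns and makes it explicit that the tuple is a row key rather than a (row, column) pair.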
- (I approve of this answer)
- Thanks! I also approve of yours!
- All good answers but this is the most straightforward solution I think. Thanks very much.
- Faster, you deserve +1
- Good answer, this is equivalent to df.loc[pd.IndexSlice[:, 'Ai'], :].
- @coldspeed Thanks.