multi-index – Page 2

Selecting rows from a Pandas dataframe with a compound (hierarchical) index

September 20, 2023 by Tarik

Try using xs to be very precise:

    In [5]: df.xs('a', level=0)
    Out[5]:
            value1  value2
    group2
    c          1.1     7.1
    c          2.0     8.0
    d          3.0     9.0

    In [6]: df.xs('c', level="group2")
    Out[6]:
            value1  value2
    group1
    a          1.1     7.1
    a          2.0     8.0
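The xs calls above can be reproduced with a small frame. This is a minimal sketch: the data is hypothetical, chosen only so that the index levels are named group1 and group2 as in the output shown.

```python
import pandas as pd

# Hypothetical data; only the level names (group1, group2) come from the excerpt.
index = pd.MultiIndex.from_tuples(
    [("a", "c"), ("a", "c"), ("a", "d"), ("b", "d")],
    names=["group1", "group2"],
)
df = pd.DataFrame(
    {"value1": [1.1, 2.0, 3.0, 4.0], "value2": [7.1, 8.0, 9.0, 10.0]},
    index=index,
)

# Select by position of the level: rows where the first level equals "a".
by_group1 = df.xs("a", level=0)

# Select by name of the level: rows where group2 equals "c".
by_group2 = df.xs("c", level="group2")

print(by_group1)
print(by_group2)
```

By default xs drops the level it selected on, so each result is indexed by the remaining level only.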

Filling in date gaps in MultiIndex Pandas Dataframe

August 30, 2023 by Tarik

You can make a new MultiIndex based on the Cartesian product of the levels of the existing MultiIndex. Then, reindex your data frame using the new index.

    new_index = pd.MultiIndex.from_product(df.index.levels)
    new_df = df.reindex(new_index)

    # Optional: convert missing values to zero, and convert the data back
    # to integers. See explanation below.
    new_df = …

Read more
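A minimal sketch of the reindexing approach described above, on hypothetical data. The column name count, the level names, and the fillna(0).astype(int) step are assumptions for illustration; the excerpt is truncated before it shows the author's own optional step.

```python
import pandas as pd

# Hypothetical frame with one missing (date, key) combination.
index = pd.MultiIndex.from_tuples(
    [("2023-01-01", "x"), ("2023-01-01", "y"), ("2023-01-02", "x")],
    names=["date", "key"],
)
df = pd.DataFrame({"count": [1, 2, 3]}, index=index)

# Cartesian product of the existing level values.
new_index = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Rows that did not exist before appear with NaN.
new_df = df.reindex(new_index)

# One possible way (an assumption) to zero-fill and restore integer dtype.
filled = new_df.fillna(0).astype(int)
print(filled)
```

Note that the product only covers values already present in each level. A date that is absent from the whole index is not created by this step.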

Pandas reset index on series to remove multiindex

July 31, 2023 by Tarik

Just call reset_index():

    In [130]: s
    Out[130]:
    0           1
    1999-03-31  SOLD_PRICE   NaN
    1999-06-30  SOLD_PRICE   NaN
    1999-09-30  SOLD_PRICE   NaN
    1999-12-31  SOLD_PRICE     3
    2000-03-31  SOLD_PRICE     3
    Name: 2, dtype: float64

    In [131]: s.reset_index()
    Out[131]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3

… Read more
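As a small self-contained sketch of the same call: the level names date and field and the series name value are hypothetical and stand in for the 0, 1, and 2 labels in the session above.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a two-level index.
index = pd.MultiIndex.from_tuples(
    [
        ("1999-03-31", "SOLD_PRICE"),
        ("1999-06-30", "SOLD_PRICE"),
        ("1999-12-31", "SOLD_PRICE"),
    ],
    names=["date", "field"],
)
s = pd.Series([np.nan, np.nan, 3.0], index=index, name="value")

# Each index level becomes an ordinary column; the result is a DataFrame.
flat = s.reset_index()
print(flat)
```

The result carries a plain integer index, and the series values end up in a column named after the series.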

Benefits of pandas’ MultiIndex?

July 19, 2023 by Tarik

Hierarchical indexing (also referred to as “multi-level” indexing) was introduced in the pandas 0.4 release. This opens the door to some quite sophisticated data analysis and manipulation, especially for working with higher dimensional data. In essence, it enables you to effectively store and manipulate arbitrarily high dimension data in a 2-dimensional tabular structure (DataFrame), for … Read more
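To make the idea of storing higher-dimensional data in a 2-dimensional frame concrete, here is a rough sketch. The region and year levels and the units and revenue columns are hypothetical names, not taken from the original post.

```python
import pandas as pd

# Three conceptual dimensions (region, year, measure) in one 2-D DataFrame.
index = pd.MultiIndex.from_product(
    [["north", "south"], [2021, 2022]], names=["region", "year"]
)
sales = pd.DataFrame(
    {"units": [10, 12, 7, 9], "revenue": [100.0, 130.0, 70.0, 95.0]},
    index=index,
)

# Slice along the outer level: all years for one region.
north = sales.loc["north"]

# Address a single cell with a full index tuple.
units_2022 = sales.loc[("south", 2022), "units"]

# Pivot an inner level into columns to get a region-by-year table.
wide = sales["units"].unstack("year")
print(wide)
```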

Pandas: Modify a particular level of Multiindex

July 9, 2023 by Tarik

Thanks to @cxrodgers’s comment, I think the fastest way to do this is:

    df.index = df.index.set_levels(df.index.levels[0].str.replace(' ', ''), level=0)

Old, longer answer: I found that the list comprehension suggested by @Shovalt works but felt slow on my machine (using a dataframe with >10,000 rows). Instead, I was able to use the .set_levels method, which was quite … Read more
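A minimal sketch of the set_levels approach on hypothetical data. The level names, the values, and the regex=False argument are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical frame whose first index level contains spaces.
index = pd.MultiIndex.from_tuples(
    [("a b", 1), ("a b", 2), ("c d", 1)], names=["name", "num"]
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# Transform only the unique values of level 0, not every row label.
cleaned_level = df.index.levels[0].str.replace(" ", "", regex=False)

# Swap the cleaned values in; the row-to-level mapping is unchanged.
df.index = df.index.set_levels(cleaned_level, level=0)
print(df)
```

This works on the unique level values rather than on every row, which is likely why it is faster than a per-row list comprehension. If the replacement makes two level values identical, set_levels raises an error because level values must stay unique.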

Nested dictionary to multiindex dataframe where dictionary keys are column labels

June 27, 2023 by Tarik

Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format before trying to pass it to DataFrame:

    >>> reform = {(outerKey, innerKey): values for outerKey, innerDict in dictionary.items() for innerKey, values in innerDict.items()}
    >>> reform
    {('A', 'a'): [1, 2, 3, 4, 5], ('A', … Read more
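The conversion above can be sketched end to end as follows. The contents of dictionary are hypothetical; only the tuple-key comprehension follows the excerpt.

```python
import pandas as pd

# Hypothetical nested dict: outer keys and inner keys both label columns.
dictionary = {
    "A": {"a": [1, 2, 3], "b": [4, 5, 6]},
    "B": {"a": [7, 8, 9]},
}

# Flatten to {(outer, inner): values} so each key is a tuple.
reform = {
    (outer_key, inner_key): values
    for outer_key, inner_dict in dictionary.items()
    for inner_key, values in inner_dict.items()
}

# Tuple keys are interpreted as MultiIndex column labels.
df = pd.DataFrame(reform)
print(df)
```

Selecting df["A"] then returns the sub-frame of all inner columns under the outer label A.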

Read excel sheet with multiple header using Pandas

June 8, 2023 by Tarik

[See comments for updates and corrections] Pandas already has a function that will read in an entire Excel spreadsheet for you, so you don’t need to manually parse/merge each sheet. Take a look at pandas.read_excel(). It not only lets you read in an Excel file in a single line, it also provides options to help solve … Read more

reading excel sheet as multiindex dataframe through pd.read_excel()

June 3, 2023 by Tarik

You can add the parameter index_col=[0,1] to read_excel, because the index is a MultiIndex too.

EDIT: You also need to change header from header=[0,1,2] to header=[0,1], and remove the empty rows (rows 5 and 7). You can also add the sheet_name parameter (spelled sheetname in older pandas versions):

    import pandas as pd

    df = pd.read_excel('test/ipsos_excel_tables_type_2_trimed_nosig.xlsx', header=[0,1], index_col=[0,1], sheet_name="0001")
    print(df)

    T \ Total Q1. Do you have … Read more

Change timezone of date-time column in pandas and add as hierarchical index

May 28, 2023 by Tarik

If you set it as the index, it’s automatically converted to a DatetimeIndex:

    In [11]: dat.index = pd.to_datetime(dat.pop('datetime'), utc=True)

    In [12]: dat
    Out[12]:
                        label  value
    datetime
    2011-07-19 07:00:00     a      0
    2011-07-19 08:00:00     a      1
    2011-07-19 09:00:00     a      2
    2011-07-19 07:00:00     b      3
    2011-07-19 08:00:00     b      4
    2011-07-19 09:00:00     b      5

Then do the tz_localize: In … Read more
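A minimal sketch of the whole flow on hypothetical data. The target zone America/Los_Angeles and the final set_index step are assumptions, since the excerpt is cut off before showing them. In current pandas an index parsed with utc=True is already timezone-aware, so this sketch uses tz_convert rather than tz_localize.

```python
import pandas as pd

# Hypothetical data shaped like the frame in the excerpt.
dat = pd.DataFrame(
    {
        "label": ["a", "a", "b", "b"],
        "value": [0, 1, 2, 3],
        "datetime": [
            "2011-07-19 07:00:00",
            "2011-07-19 08:00:00",
            "2011-07-19 07:00:00",
            "2011-07-19 08:00:00",
        ],
    }
)

# Parse as UTC and move the column into the index (becomes a DatetimeIndex).
dat.index = pd.to_datetime(dat.pop("datetime"), utc=True)

# Convert the timezone-aware index to the target zone (assumed here).
dat.index = dat.index.tz_convert("America/Los_Angeles")

# Append the label column as a second index level.
dat = dat.set_index("label", append=True)
print(dat)
```

Converting while the timestamps are still a plain DatetimeIndex, and only then appending the second level, avoids having to rebuild one level of a MultiIndex afterwards.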