Filling in date gaps in MultiIndex Pandas Dataframe

You can build a new MultiIndex from the Cartesian product of the levels of the existing MultiIndex, then reindex your DataFrame using the new index:

    new_index = pd.MultiIndex.from_product(df.index.levels)
    new_df = df.reindex(new_index)

    # Optional: convert missing values to zero, and convert the data back
    # to integers. See explanation below.
    new_df = …
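A minimal sketch of the reindexing approach described above, on made-up sample data. The frame, the level names `group` and `date`, and the column `count` are assumptions for illustration, and the final `fillna(0).astype(int)` step is only one plausible way to do the optional conversion.

```python
import pandas as pd

# Hypothetical sample data: two groups observed on an incomplete set of dates.
index = pd.MultiIndex.from_tuples(
    [
        ("a", pd.Timestamp("2020-01-01")),
        ("a", pd.Timestamp("2020-01-03")),
        ("b", pd.Timestamp("2020-01-02")),
    ],
    names=["group", "date"],
)
df = pd.DataFrame({"count": [1, 2, 3]}, index=index)

# Cartesian product of the existing level values, keeping the level names.
new_index = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Combinations that were absent from df show up as NaN rows.
new_df = df.reindex(new_index)

# Assumed optional step: treat missing rows as zero and restore integer dtype.
filled = new_df.fillna(0).astype(int)
```

Note that this only fills combinations of values that already occur somewhere in the index; a date missing from every group would still be absent.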

Pandas reset index on series to remove multiindex

Just call reset_index():

    In [130]: s
    Out[130]:
    0           1
    1999-03-31  SOLD_PRICE   NaN
    1999-06-30  SOLD_PRICE   NaN
    1999-09-30  SOLD_PRICE   NaN
    1999-12-31  SOLD_PRICE     3
    2000-03-31  SOLD_PRICE     3
    Name: 2, dtype: float64

    In [131]: s.reset_index()
    Out[131]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3
    …
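For a self-contained version of the same idea, here is a small sketch. The Series contents and the names `date`, `field` and `value` are invented for illustration; `drop=True` is shown as an alternative when the index levels are not needed at all.

```python
import pandas as pd

# Hypothetical Series with a two-level index.
index = pd.MultiIndex.from_tuples(
    [("1999-12-31", "SOLD_PRICE"), ("2000-03-31", "SOLD_PRICE")],
    names=["date", "field"],
)
s = pd.Series([3.0, 3.0], index=index, name="value")

# Move both index levels into ordinary columns; the result is a DataFrame.
flat = s.reset_index()

# Discard the index levels instead; the result stays a Series.
values_only = s.reset_index(drop=True)
```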

Benefits of pandas MultiIndex?

Hierarchical indexing (also referred to as “multi-level” indexing) was introduced in the pandas 0.4 release. This opens the door to some quite sophisticated data analysis and manipulation, especially for working with higher dimensional data. In essence, it enables you to effectively store and manipulate arbitrarily high dimension data in a 2-dimensional tabular structure (DataFrame), for …
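As a rough illustration of storing higher dimensional data in a 2-dimensional structure, the sketch below keeps three dimensions (region, year, quarter) in the row index of a single-column DataFrame. All names and values are made up for the example.

```python
import pandas as pd

# Three conceptual dimensions encoded as three index levels.
index = pd.MultiIndex.from_product(
    [["north", "south"], [2020, 2021], ["q1", "q2"]],
    names=["region", "year", "quarter"],
)
sales = pd.DataFrame({"sales": range(8)}, index=index)

# Partial indexing on the outer levels selects a sub-block.
north_2021 = sales.loc[("north", 2021), :]

# Cross-section on an inner level.
q1 = sales.xs("q1", level="quarter")

# Pivot one level into columns to get a wide view.
by_year = sales["sales"].unstack("year")
```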

Pandas: Modify a particular level of Multiindex

Thanks to @cxrodgers’s comment, I think the fastest way to do this is:

    df.index = df.index.set_levels(df.index.levels[0].str.replace(' ', ''), level=0)

Old, longer answer: I found that the list comprehension suggested by @Shovalt works but felt slow on my machine (using a dataframe with >10,000 rows). Instead, I was able to use the .set_levels method, which was quite …
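A small runnable sketch of the set_levels approach, using an invented frame whose first index level contains spaces. The level names `key` and `num` and the column `x` are assumptions; `regex=False` is added here so the space is treated as a literal string.

```python
import pandas as pd

# Hypothetical frame whose first index level has unwanted spaces.
index = pd.MultiIndex.from_tuples(
    [("a b", 1), ("a b", 2), ("c d", 1)], names=["key", "num"]
)
df = pd.DataFrame({"x": [10, 20, 30]}, index=index)

# Transform only the unique level values, not every row label.
cleaned = df.index.levels[0].str.replace(" ", "", regex=False)

# Swap the cleaned values in for level 0; the row codes are untouched.
df.index = df.index.set_levels(cleaned, level=0)
```

Because only the unique level values are rewritten, the work scales with the number of distinct labels and not with the number of rows. The sketch assumes the cleaned values remain distinct from one another.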

Nested dictionary to multiindex dataframe where dictionary keys are column labels

Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format before trying to pass it to DataFrame:

    >>> reform = {(outerKey, innerKey): values
    ...           for outerKey, innerDict in dictionary.items()
    ...           for innerKey, values in innerDict.items()}
    >>> reform
    {('A', 'a'): [1, 2, 3, 4, 5], ('A', …
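To see the conversion end to end, here is a minimal sketch with a made-up nested dictionary. The keys and values are illustrative only; the comprehension is the same flattening step as above.

```python
import pandas as pd

# Hypothetical nested dictionary: outer keys and inner keys are column labels.
dictionary = {
    "A": {"a": [1, 2, 3], "b": [4, 5, 6]},
    "B": {"a": [7, 8, 9]},
}

# Flatten to {(outer, inner): values} so each key is a tuple.
reform = {
    (outer_key, inner_key): values
    for outer_key, inner_dict in dictionary.items()
    for inner_key, values in inner_dict.items()
}

# Tuple keys become a two-level MultiIndex on the columns.
df = pd.DataFrame(reform)
```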

reading excel sheet as multiindex dataframe through pd.read_excel()

You can add the parameter index_col=[0,1] to read_excel, because the index is a MultiIndex too. EDIT: You also need to change header from header=[0,1,2] to header=[0,1], and remove the empty rows (rows 5 and 7). You can also add the parameter sheet_name (spelled sheetname in old pandas versions):

    import pandas as pd

    df = pd.read_excel('test/ipsos_excel_tables_type_2_trimed_nosig.xlsx',
                       header=[0,1],
                       index_col=[0,1],
                       sheet_name="0001")
    print(df)

    T \
    Total Q1. Do you have …

Change timezone of date-time column in pandas and add as hierarchical index

If you set it as the index, it’s automatically converted to an Index:

    In [11]: dat.index = pd.to_datetime(dat.pop('datetime'), utc=True)

    In [12]: dat
    Out[12]:
                        label  value
    datetime
    2011-07-19 07:00:00     a      0
    2011-07-19 08:00:00     a      1
    2011-07-19 09:00:00     a      2
    2011-07-19 07:00:00     b      3
    2011-07-19 08:00:00     b      4
    2011-07-19 09:00:00     b      5

Then do the tz_localize:

    In …
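The sketch below shows one way the full workflow might look on recent pandas versions. It is an assumption-laden variant and not the original answer's code: since `utc=True` already yields a timezone-aware index, it uses `tz_convert` where the excerpt mentions tz_localize, the target zone `America/Los_Angeles` is arbitrary, and `set_index(..., append=True)` is one possible way to add the hierarchical level.

```python
import pandas as pd

# Hypothetical data mirroring the shape of the excerpt above.
dat = pd.DataFrame(
    {
        "label": ["a", "a", "b", "b"],
        "value": [0, 1, 2, 3],
        "datetime": ["2011-07-19 07:00:00", "2011-07-19 08:00:00"] * 2,
    }
)

# Parse the column as UTC timestamps and use it as the index.
dat.index = pd.to_datetime(dat.pop("datetime"), utc=True)

# Convert the (already timezone-aware) index to the assumed target zone.
dat.index = dat.index.tz_convert("America/Los_Angeles")

# Append "label" as a second index level to get a hierarchical index.
dat = dat.set_index("label", append=True)
```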

Pandas groupby(),agg() – how to return results without the multi index?

The call below:

    >>> gr = df.groupby(['EVENT_ID', 'SELECTION_ID'], as_index=False)
    >>> res = gr.agg({'ODDS': [np.min, np.max]})
    >>> res
        EVENT_ID SELECTION_ID ODDS
                              amin amax
    0  100429300      5297529   18   25
    1  100429300      5297559   30   38

returns a frame with MultiIndex columns. If you do not want the columns to be a MultiIndex either, you can do:

    >>> res.columns = list(map(''.join, res.columns.values))
    >>> …
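Here is a self-contained sketch of the same flattening on invented data with the same column names. It uses the string names "min" and "max" in place of np.min and np.max and joins the column levels with an underscore, both of which are choices made for this example. Named aggregation is shown as an alternative that avoids the MultiIndex columns in the first place.

```python
import pandas as pd

# Hypothetical odds data, two rows per selection.
df = pd.DataFrame(
    {
        "EVENT_ID": [100429300] * 4,
        "SELECTION_ID": [5297529, 5297529, 5297559, 5297559],
        "ODDS": [18, 25, 30, 38],
    }
)

# A list of aggregations per column produces two-level columns.
res = df.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False).agg(
    {"ODDS": ["min", "max"]}
)

# Join the non-empty parts of each column tuple into a single flat name.
res.columns = ["_".join(part for part in col if part) for col in res.columns]

# Alternative: named aggregation yields flat column names directly.
named = df.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False).agg(
    odds_min=("ODDS", "min"),
    odds_max=("ODDS", "max"),
)
```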
