dataframe
Create DataFrame from multiple Series
You can use pd.concat:

    pd.concat([r, s], axis=1)
    Out:
       rrr  sss
    0    0    0
    1    3    5
    2    6   10
    3    9   15
    4   12   20
    5   15   25
    6   18   30
    7   21   35
    8   24   40
    9   27   45

Or the DataFrame constructor:

    pd.DataFrame({'r': r, 's': s})
    Out:
       r  s
    0  0  0
    1  …
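The two approaches above differ mainly in where the column names come from. The sketch below assumes r and s are Series named 'rrr' and 'sss' with the values implied by the output shown; the excerpt does not define them, so these definitions are a guess for illustration only.

```python
import pandas as pd

# Assumed inputs: reconstructed from the printed output, not given in the excerpt.
r = pd.Series(range(0, 30, 3), name="rrr")
s = pd.Series(range(0, 50, 5), name="sss")

# pd.concat keeps each Series' own name as the column label.
by_concat = pd.concat([r, s], axis=1)

# The DataFrame constructor takes column labels from the dict keys instead.
by_dict = pd.DataFrame({"r": r, "s": s})

print(by_concat.columns.tolist())
print(by_dict.columns.tolist())
```

Both calls align the Series on their index, so Series with different indexes would produce NaN where labels do not match.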
Pandas fillna throws ValueError: fill value must be in categories
Use Series.cat.add_categories to add the new category first, then fill the missing values:

    AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
    AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')
    AM_train['city_development_index'] = AM_train['city_development_index'].cat.add_categories('Missing')
    AM_train['city_development_index'] = AM_train['city_development_index'].fillna('Missing')

Sample:

    import numpy as np
    import pandas as pd

    AM_train = pd.DataFrame({'product_category_2': pd.Categorical(['a', 'b', np.nan])})
    # Register the fill value as a category, then assign the filled column back
    AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
    AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')
    print(AM_train)

      product_category_2
    0                  a
    1                  b
    2            Unknown
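To see why the extra step is needed, the following minimal sketch first attempts the fill without registering the category, then applies the fix. The variable names are hypothetical, and the exact exception type is an assumption: it has varied across pandas versions (ValueError in older releases, TypeError in newer ones), so both are caught.

```python
import numpy as np
import pandas as pd

col = pd.Series(pd.Categorical(["a", "b", np.nan]))

# Filling with a value that is not an existing category is rejected.
try:
    col.fillna("Unknown")
    raised = False
except (TypeError, ValueError):
    raised = True

# Registering the category first makes the same fill valid.
fixed = col.cat.add_categories("Unknown").fillna("Unknown")

print(raised)
print(fixed.tolist())
```

Chaining the two calls and assigning the result back avoids relying on inplace=True against a column selection.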
Python: Check if dataframe column contains string type
Four years after this question was asked, I believe there is still no definitive answer. I don't think strings were ever treated as first-class citizens in Pandas (even >= 1.0.0). As an example:

    import pandas as pd
    import datetime

    df = pd.DataFrame({
        'str': ['a', 'b', 'c', None],
        'hete': [1, 2.0, datetime.datetime.utcnow(), …
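One possible way to check a column's contents, offered as a sketch and not necessarily the approach the truncated answer goes on to describe, is to inspect the values with pandas.api.types.infer_dtype. The DataFrame, its column names, and the helper holds_only_strings below are hypothetical and chosen only for illustration.

```python
import datetime

import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical data: one text column and one heterogeneous column.
df = pd.DataFrame({
    "text": ["a", "b", "c", None],
    "mixed": [1, 2.0, datetime.datetime(2020, 1, 1), None],
})


def holds_only_strings(column: pd.Series) -> bool:
    """Return True if every non-missing value in the column is a str."""
    # infer_dtype looks at the values themselves, so an object-dtype column
    # of mixed types is reported as "mixed", not as "string".
    return infer_dtype(column, skipna=True) == "string"


print(holds_only_strings(df["text"]))
print(holds_only_strings(df["mixed"]))
```

Checking only the column's dtype would not separate these two cases when both are stored as object, which is why the sketch inspects the values.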
Subtract two columns in dataframe
Given the following dataframe:

    import pandas as pd

    df = pd.DataFrame([["Australia", 1, 3, 5],
                       ["Bambua", 12, 33, 56],
                       ["Tambua", 14, 34, 58]],
                      columns=["Country", "Val1", "Val2", "Val10"])

It comes down to a simple element-wise subtraction:

    >>> df["Val1"] - df["Val10"]
    0    -4
    1   -44
    2   -44
    dtype: int64

You can also store this into a …
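As a hedged illustration of keeping the result, the sketch below assigns the difference to a new column. The column name "Diff" is hypothetical; the excerpt is cut off before it names one.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Element-wise subtraction, aligned on the row index.
df["Diff"] = df["Val1"] - df["Val10"]

# Series.sub is the method form of the same operation.
alt = df["Val1"].sub(df["Val10"])

print(df[["Country", "Diff"]])
```

The method form is convenient when a fill_value is needed for missing entries; otherwise the operator is equivalent.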