Using Scikit-Learn OneHotEncoder with a Pandas DataFrame

OneHotEncoder encodes categorical integer features as a one-hot numeric array. Its transform method returns a sparse matrix if sparse=True; otherwise it returns a 2-D array. You can't cast a 2-D array (or sparse matrix) into a pandas Series. You must create a pandas Series (a column in a pandas DataFrame) for each category. I would …
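A minimal sketch of the "one Series per category" idea described above. The DataFrame, the column name color, and the color_ prefix are made up for illustration and are not from the original answer. The encoder is left at its defaults and the result is densified with toarray(), which avoids depending on the name of the sparse-output parameter.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical example data: a single categorical column.
df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; convert it to a dense 2-D array.
encoded = encoder.fit_transform(df[["color"]]).toarray()

# categories_ holds one array of category labels per input column.
# Each encoded column becomes its own Series (column) in the DataFrame.
for i, category in enumerate(encoder.categories_[0]):
    df[f"color_{category}"] = encoded[:, i]

print(df)
```

Each category ends up in its own column, in the sorted order that the encoder stores in categories_.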

One hot encoding of string categorical features

If you are on sklearn>0.20.dev0

In [11]: from sklearn.preprocessing import OneHotEncoder
    ...: cat = OneHotEncoder()
    ...: X = np.array([['a', 'b', 'a', 'c'], [0, 1, 0, 1]], dtype=object).T
    ...: cat.fit_transform(X).toarray()
    ...:
Out[11]:
array([[1., 0., 0., 1., 0.],
       [0., 1., 0., 0., 1.],
       [1., 0., 0., 1., 0.],
       [0., 0., 1., 0., 1.]])

If you are on …

Adding dummy columns to the original dataframe

In [77]: df = pd.concat([df, pd.get_dummies(df['YEAR'])], axis=1); df
Out[77]:
      JOINED_CO GENDER   EXEC_FULLNAME  GVKEY  YEAR    CONAME  BECAMECEO  \
5622        NaN   MALE  Ira A. Eichner   1004  1992  AAR CORP   19550101
5622        NaN   MALE  Ira A. Eichner   1004  1993  AAR CORP   19550101
5622        NaN   MALE  Ira A. Eichner   1004  1994  AAR CORP   19550101
5622        NaN   MALE  Ira A. …
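The same pattern on a small, self-contained frame may be easier to follow. This is only a sketch: the three rows below are a cut-down stand-in for the data in the excerpt, and the prefix="YEAR" argument is an addition, used here so the dummy columns get string names instead of bare integers.

```python
import pandas as pd

# Cut-down stand-in for the original data: one row per year.
df = pd.DataFrame({
    "CONAME": ["AAR CORP", "AAR CORP", "AAR CORP"],
    "YEAR": [1992, 1993, 1994],
})

# One indicator column per distinct YEAR value.
dummies = pd.get_dummies(df["YEAR"], prefix="YEAR")

# axis=1 appends the dummy columns next to the original ones, aligned on the index.
result = pd.concat([df, dummies], axis=1)

print(result)
```

Because concat aligns on the index, the dummy columns line up with their source rows as long as the index labels are unique.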

Feature names from OneHotEncoder

A list with the original column names can be passed to get_feature_names.

>>> encoder.get_feature_names(['Sex', 'AgeGroup'])
array(['Sex_female', 'Sex_male', 'AgeGroup_0', 'AgeGroup_15',
       'AgeGroup_30', 'AgeGroup_45', 'AgeGroup_60', 'AgeGroup_75'],
      dtype=object)

DEPRECATED: get_feature_names is deprecated in 1.0 and will be removed in 1.2. Please use get_feature_names_out instead. As per sklearn.preprocessing.OneHotEncoder:

>>> encoder.get_feature_names_out(['Sex', 'AgeGroup'])
array(['Sex_female', 'Sex_male', 'AgeGroup_0', 'AgeGroup_15',
       'AgeGroup_30', 'AgeGroup_45', 'AgeGroup_60', 'AgeGroup_75'],
      dtype=object)
