Correlation among multiple categorical variables

You can use pd.factorize:

    df.apply(lambda x: pd.factorize(x)[0]).corr(method='pearson', min_periods=1)
    Out[32]:
         a    c    d
    a  1.0  1.0  1.0
    c  1.0  1.0  1.0
    d  1.0  1.0  1.0

Data input:

    df = pd.DataFrame({'a': ['a', 'b', 'c'], 'c': ['a', 'b', 'c'], 'd': ['a', 'b', 'c']})

Update:

    from scipy.stats import chisquare
    df = df.apply(lambda x: pd.factorize(x)[0]) + 1
    pd.DataFrame([chisquare(df[x].values, f_exp=df.values.T, axis=1)[0] for x in df])
    Out[123]:
         0    1    2    3
    0  0.0  0.0  0.0  0.0
    1  0.0  0.0 … Read more
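For readers who want to run the first step of the excerpt on its own, here is a minimal self-contained sketch. It uses the same three-column input as the excerpt; the variable names `codes` and `corr` are introduced here only for illustration. Note that pd.factorize assigns integer codes in order of first appearance, so the Pearson correlation is computed on those codes rather than on the labels themselves.

```python
import pandas as pd

# Same toy input as in the excerpt: three identical categorical columns.
df = pd.DataFrame({
    "a": ["a", "b", "c"],
    "c": ["a", "b", "c"],
    "d": ["a", "b", "c"],
})

# Replace each label with an integer code (0, 1, 2, ... in order of appearance).
codes = df.apply(lambda col: pd.factorize(col)[0])

# Pearson correlation between the integer-coded columns.
corr = codes.corr(method="pearson", min_periods=1)
print(corr)
```

Because all three columns are identical here, every pairwise correlation comes out as 1.0, matching the Out[32] table in the excerpt.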

Custom Annotation Seaborn Heatmap

This feature was added in Seaborn 0.7.1. From the Seaborn update history: "The annot parameter of heatmap() now accepts a rectangular dataset in addition to a boolean value. If a dataset is passed, its values will be used for the annotations, while the main dataset will be used for the … Read more
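As a rough illustration of the behaviour described above, the sketch below passes a separate array of labels to `annot` while a numeric array drives the heatmap itself. The arrays `values` and `labels` are made-up example data, and `fmt=""` is used on the assumption that the annotations are strings and should be printed without numeric formatting.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical example data: numeric values determine the cell colours ...
values = np.array([[0.1, 0.9],
                   [0.5, 0.3]])
# ... while a same-shaped array of strings supplies the cell annotations.
labels = np.array([["low", "high"],
                   ["mid", "low"]])

fig, ax = plt.subplots()
# annot takes the label array; fmt="" prints the strings as they are.
sns.heatmap(values, annot=labels, fmt="", ax=ax)

# Collect the text drawn in the cells, e.g. for a quick sanity check.
annotation_texts = [t.get_text() for t in ax.texts]
```

The annotation dataset needs the same shape as the main dataset, since each annotation is placed in the cell at the matching position.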
