PCA for categorical features?

I disagree with the others.

While you can use PCA on binary data (e.g. one-hot encoded data), that does not mean it is a good idea, or that it will work very well.

PCA is designed for continuous variables. It tries to minimize squared deviations (the reconstruction error), which is the same as maximizing the variance captured by the components. The concept of squared deviations breaks down when you have binary variables.
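One way to see the problem, as a minimal sketch: for a 0/1 column, the variance is just p(1 - p), where p is the fraction of ones. So the "squared deviations" PCA works with carry no information beyond how frequent the category is. The column below is made-up illustration data, not taken from any real dataset.

```python
from statistics import mean, pvariance

# Hypothetical one-hot column: 1.0 where a row has the category, 0.0 otherwise.
column = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

p = mean(column)          # fraction of ones
var = pvariance(column)   # population variance = mean squared deviation

# For 0/1 data the variance is fully determined by the frequency p.
print(p, var, p * (1 - p))
```

A rare category and a very common one both get a small variance, while a 50/50 category gets the largest, regardless of what the category means.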

So yes, you can use PCA. And yes, you get an output. It is even a least-squares output; it’s not as if PCA would segfault on such data. It works, but it is much less meaningful than you’d want it to be, and presumably less meaningful than e.g. frequent pattern mining.
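To illustrate "it works, but the output is not very meaningful", here is a rough sketch that runs PCA (via an SVD of the mean-centered data, using NumPy) on a single one-hot encoded feature. The labels are invented for illustration, and keeping one component is an arbitrary choice.

```python
import numpy as np

# Hypothetical categorical feature with three levels, one-hot encoded.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
X = np.eye(3)[labels]              # shape (10, 3), entries are 0 or 1

# PCA via SVD of the mean-centered data.
col_mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - col_mean, full_matrices=False)

# Keep one component and map back to the original space.
k = 1
X_hat = (U[:, :k] * s[:k]) @ Vt[:k] + col_mean

print("singular values:", np.round(s, 3))
print("original row:     ", X[4])
print("reconstructed row:", np.round(X_hat[4], 3))
```

Two things are worth noticing when you run this. The last singular value is numerically zero, because the one-hot columns always sum to 1, so one direction is pure redundancy introduced by the encoding. And the least-squares reconstruction contains values that are neither 0 nor 1, which have no direct interpretation as category membership.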
