dendrogram
Extracting clusters from seaborn clustermap
While using result.dendrogram_col.linkage or result.dendrogram_row.linkage currently works, it appears to be an implementation detail. The safest route is to compute the linkages explicitly first and pass them to the clustermap function, which has row_linkage and col_linkage parameters for exactly that purpose. Replacing the last line in your example (result = …) with the following code … Read more
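A minimal sketch of the "compute the linkages first" approach, not the code from the original answer (which is truncated above). The random `data` array, the `average`/`euclidean` settings, the cluster counts, and the helper name `draw_clustermap` are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical stand-in for the data matrix in the question.
rng = np.random.default_rng(0)
data = rng.normal(size=(12, 5))

# Compute the linkages explicitly: rows are clustered on the data itself,
# columns on its transpose. Method and metric here are assumptions.
row_linkage = linkage(data, method="average", metric="euclidean")
col_linkage = linkage(data.T, method="average", metric="euclidean")

# With the linkages in hand, flat clusters can be extracted without
# touching any seaborn internals.
row_clusters = fcluster(row_linkage, t=3, criterion="maxclust")
col_clusters = fcluster(col_linkage, t=2, criterion="maxclust")


def draw_clustermap(data, row_linkage, col_linkage):
    """Hypothetical helper: pass the precomputed linkages to seaborn."""
    import seaborn as sns  # imported lazily; only needed for plotting

    return sns.clustermap(data, row_linkage=row_linkage, col_linkage=col_linkage)
```

Because the same linkage matrices drive both the plot and `fcluster`, the extracted cluster labels should correspond to the dendrograms drawn on the clustermap.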
sklearn agglomerative clustering linkage matrix
It’s possible, but it isn’t pretty. It requires (at a minimum) a small rewrite of AgglomerativeClustering.fit (source). The difficulty is that the method requires a number of imports, so it ends up getting a bit nasty looking. To add in this feature: Insert the following line after line 748: kwargs['return_distance'] = True Replace line 752 … Read more
how to plot and annotate hierarchical clustering dendrograms in scipy/matplotlib
The input to linkage() is either an n x m array, representing n points in m-dimensional space, or a one-dimensional array containing the condensed distance matrix. In your example, mat is 3 x 3, so you are clustering three 3-d points. Clustering is based on the distance between these points. Why does mat and 1-mat … Read more
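To illustrate the two accepted input forms, here is a small sketch using made-up random points (the array `points` and the `single`/`euclidean` choices are assumptions). It clusters the same data once from the raw observations and once from the condensed distance matrix produced by `pdist`.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Six hypothetical points in 3-dimensional space (an n x m array, n=6, m=3).
rng = np.random.default_rng(0)
points = rng.random((6, 3))

# Form 1: pass the observations; linkage() computes the distances itself.
Z_from_points = linkage(points, method="single", metric="euclidean")

# Form 2: pass the condensed distance matrix, a 1-D array of length n*(n-1)/2.
condensed = pdist(points, metric="euclidean")
Z_from_condensed = linkage(condensed, method="single")

# Note: a square 2-D array is always read as observations, so a 3 x 3
# distance matrix would be treated as three 3-d points, not as distances.
```

Both calls should yield the same linkage matrix, since the first form simply computes the pairwise distances internally.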
Plot dendrogram using sklearn.AgglomerativeClustering
Here is a simple function for taking a hierarchical clustering model from sklearn and plotting it using the scipy dendrogram function. Seems like graphing functions are often not directly supported in sklearn. You can find an interesting discussion of that related to the pull request for this plot_dendrogram code snippet here. I’d clarify that the … Read more
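The excerpt is cut off before the function itself, so the following is only a sketch of the general idea: build a scipy-style linkage matrix from a fitted model's `children_` and `distances_` attributes, then hand it to `dendrogram`. The helper name `build_linkage_matrix` is hypothetical, and the sketch assumes the model was fitted so that `distances_` is populated (for example with `distance_threshold=0, n_clusters=None`).

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram


def build_linkage_matrix(children, distances, n_samples):
    """Hypothetical helper: assemble a scipy linkage matrix.

    children  -- (n_samples - 1, 2) array of merged node indices
    distances -- merge distance for each row of children
    """
    children = np.asarray(children)
    counts = np.zeros(children.shape[0])
    for i, merge in enumerate(children):
        count = 0
        for child in merge:
            if child < n_samples:
                count += 1  # original sample (leaf)
            else:
                count += counts[child - n_samples]  # earlier merged cluster
        counts[i] = count
    # Columns: child a, child b, merge distance, number of samples in cluster.
    return np.column_stack([children, distances, counts]).astype(float)


def plot_dendrogram(model, **kwargs):
    """Plot a fitted sklearn AgglomerativeClustering model (needs distances_)."""
    Z = build_linkage_matrix(model.children_, model.distances_, len(model.labels_))
    return dendrogram(Z, **kwargs)
```

Extra keyword arguments such as `truncate_mode` or `p` are passed straight through to scipy's `dendrogram`.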
plotting results of hierarchical clustering on top of a matrix of data
The question does not define matrix very well: “matrix of values”, “matrix of data”. I assume that you mean a distance matrix. In other words, element D_ij in the symmetric nonnegative N-by-N distance matrix D denotes the distance between two feature vectors, x_i and x_j. Is that correct? If so, then try this (edited June … Read more
scipy linkage format
I agree with https://stackoverflow.com/users/1167475/mortonjt that the documentation does not fully explain the indexing of intermediate clusters, and I also agree with https://stackoverflow.com/users/1354844/dkar that the format is otherwise precisely explained. Using the example data from this question: Tutorial for scipy.cluster.hierarchy A = np.array([[0.1, 2.5], [1.5, .4 ], [0.3, 1 ], [1 , .8 ], [0.5, … Read more
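Since the example array above is truncated, here is a separate sketch on four made-up 1-D points (an assumption, not the data from the question) showing how intermediate clusters are indexed: with n original observations, the cluster formed in row i of the linkage matrix gets index n + i.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Four hypothetical 1-D observations, indexed 0..3.
X = np.array([[0.0], [1.0], [5.0], [7.0]])
n = X.shape[0]

Z = linkage(X, method="single")

# Each row of Z is: [cluster a, cluster b, distance, size of new cluster].
# Indices below n refer to original observations; an index n + i refers to
# the cluster created by row i of Z.
for i, (a, b, dist, size) in enumerate(Z):
    print(f"row {i}: merge {int(a)} and {int(b)} at distance {dist:g} "
          f"-> new cluster {n + i} containing {int(size)} points")
```

The last row always merges everything, so its size column equals n.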