Extracting clusters from seaborn clustermap

While using result.dendrogram_col.linkage or result.dendrogram_row.linkage will currently work, it seems to be an implementation detail. The safest route is to first compute the linkages explicitly and pass them to the clustermap function, which has row_linkage and col_linkage parameters just for that. Replacing the last line in your example (result = …) with the following code …
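A minimal sketch of that route, assuming a small random array in place of the data from the question. The helper names compute_linkages and plot_clustermap are hypothetical, and the choice of "average" linkage with Euclidean distance is an assumption rather than anything the answer prescribes. The seaborn import is kept inside the plotting helper so the linkage part runs with SciPy alone.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def compute_linkages(data, method="average", metric="euclidean"):
    """Return (row_linkage, col_linkage) for a 2-D array (hypothetical helper)."""
    row_linkage = linkage(data, method=method, metric=metric)
    col_linkage = linkage(data.T, method=method, metric=metric)
    return row_linkage, col_linkage


def plot_clustermap(data, row_linkage, col_linkage):
    """Draw the clustermap from precomputed linkages (hypothetical helper)."""
    import seaborn as sns  # imported lazily; only needed for plotting

    return sns.clustermap(data, row_linkage=row_linkage, col_linkage=col_linkage)


rng = np.random.default_rng(0)
data = rng.normal(size=(12, 5))  # stand-in data, not from the question

row_linkage, col_linkage = compute_linkages(data)

# Because the linkages are ordinary SciPy linkage matrices, flat clusters
# can be extracted directly, without touching the plot object.
row_clusters = fcluster(row_linkage, t=3, criterion="maxclust")
```

Calling plot_clustermap(data, row_linkage, col_linkage) then draws the heatmap with exactly the trees used to compute row_clusters.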

sklearn agglomerative clustering linkage matrix

It’s possible, but it isn’t pretty. It requires (at a minimum) a small rewrite of AgglomerativeClustering.fit (source). The difficulty is that the method requires a number of imports, so it ends up getting a bit nasty-looking. To add in this feature: Insert the following line after line 748: kwargs['return_distance'] = True Replace line 752 …

how to plot and annotate hierarchical clustering dendrograms in scipy/matplotlib

The input to linkage() is either an n x m array, representing n points in m-dimensional space, or a one-dimensional array containing the condensed distance matrix. In your example, mat is 3 x 3, so you are clustering three 3-d points. Clustering is based on the distance between these points. Why does mat and 1-mat …
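The two input forms can be compared side by side. The sketch below uses four made-up 2-d points (not the mat from the question) and shows that a square distance matrix has to be condensed with squareform before it is passed to linkage(); otherwise its rows are treated as coordinates.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Four hypothetical points in 2-d space (an n x m array with n=4, m=2).
points = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 3.0]])

# Condensed distance matrix: a 1-d array of length n*(n-1)/2.
condensed = pdist(points)

# Both input forms describe the same data and give the same tree.
Z_points = linkage(points, method="single")
Z_condensed = linkage(condensed, method="single")

# A square n x n distance matrix is NOT a valid distance input as-is:
# linkage() reads it as n points in n-dimensional space.
# SciPy may emit a ClusterWarning for this call.
square = squareform(condensed)
Z_wrong = linkage(square, method="single")

# Converting back to condensed form gives the intended result.
Z_right = linkage(squareform(square), method="single")
```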

Plot dendrogram using sklearn.AgglomerativeClustering

Here is a simple function for taking a hierarchical clustering model from sklearn and plotting it using the scipy dendrogram function. Seems like graphing functions are often not directly supported in sklearn. You can find an interesting discussion of that related to the pull request for this plot_dendrogram code snippet here. I’d clarify that the …
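One way such a function can look is sketched below. It is not the snippet from the answer, which is cut off here. It assumes a scikit-learn version in which a model fitted with distance_threshold set exposes distances_, and the helper name linkage_from_model is hypothetical. The dendrogram is computed with no_plot=True so the example runs without matplotlib; dropping that argument draws the tree.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering


def linkage_from_model(model):
    """Build a SciPy-style linkage matrix from a fitted model (hypothetical helper)."""
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        count = 0
        for child in merge:
            if child < n_samples:
                count += 1  # an original sample (leaf)
            else:
                count += counts[child - n_samples]  # an earlier merge
        counts[i] = count
    # Columns: child 1, child 2, merge distance, samples in the new cluster.
    return np.column_stack([model.children_, model.distances_, counts]).astype(float)


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))  # stand-in data

# distance_threshold=0 with n_clusters=None builds the full tree
# and makes the merge distances available.
model = AgglomerativeClustering(distance_threshold=0, n_clusters=None).fit(X)

Z = linkage_from_model(model)
tree = dendrogram(Z, no_plot=True)
```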

plotting results of hierarchical clustering on top of a matrix of data

The question does not define matrix very well: “matrix of values”, “matrix of data”. I assume that you mean a distance matrix. In other words, element D_ij in the symmetric nonnegative N-by-N distance matrix D denotes the distance between two feature vectors, x_i and x_j. Is that correct? If so, then try this (edited June …
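Under that reading, the core step is to reorder the rows and columns of D by the leaf order of the dendrogram, so that the matrix lines up with the tree drawn beside it. The sketch below is an illustration under that assumption, not the code from the answer; the feature vectors are random and "average" linkage is an arbitrary choice.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))  # hypothetical feature vectors x_i

# Symmetric, nonnegative N-by-N distance matrix with D[i, j] = dist(x_i, x_j).
D = squareform(pdist(x))

# linkage() expects the condensed form, so convert the square matrix back.
Z = linkage(squareform(D), method="average")

# Leaf order of the dendrogram, computed without drawing anything.
order = dendrogram(Z, no_plot=True)["leaves"]

# Apply the same permutation to rows and columns.
D_ordered = D[np.ix_(order, order)]
```

D_ordered can then be shown with an image plot next to the dendrogram, so that each row and column sits under its leaf.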

scipy linkage format

I agree with https://stackoverflow.com/users/1167475/mortonjt that the documentation does not fully explain the indexing of intermediate clusters, though I also agree with https://stackoverflow.com/users/1354844/dkar that the format is otherwise precisely explained. Using the example data from this question: Tutorial for scipy.cluster.hierarchy A = np.array([[0.1, 2.5], [1.5, .4 ], [0.3, 1 ], [1 , .8 ], [0.5, …
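The indexing rule can be seen on a tiny example. The sketch below uses five made-up 1-d points rather than the array A above, which is cut off here. With n original observations numbered 0 to n-1, the cluster created by row i of the linkage matrix gets the index n + i, and later rows refer to it by that number.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Five hypothetical 1-d points; original observations have indices 0..4.
points = np.array([[0.0], [1.0], [5.0], [7.0], [20.0]])
n = len(points)

Z = linkage(points, method="single")

# Each row is [cluster a, cluster b, merge distance, size of new cluster].
# The cluster formed by row i is referred to later as index n + i.
for i, (a, b, dist, size) in enumerate(Z):
    print(
        f"row {i}: clusters {int(a)} and {int(b)} merge at distance {dist:g} "
        f"into cluster {n + i} with {int(size)} points"
    )
```

Any index of n or greater in the first two columns therefore points back to an earlier row, namely row index - n.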