scikit-learn AgglomerativeClustering and connectivity
I'm trying to use AgglomerativeClustering from scikit-learn to cluster points in a plane. The coordinates (X, Y) of the points are stored in _XY.
The clustering is restricted to neighbors via a connectivity matrix defined by _C = kneighbors_graph(_XY, n_neighbors=20).
I want some points never to be part of the same cluster, even if they are neighbors, so I modified the connectivity matrix to put 0 between these points.
The algorithm runs without errors, but in the end some clusters contain points that should not be together, that is, pairs for which I imposed _C = 0.
From children_, I can see that the problem occurs when a cluster of two points (i, j) has already been formed and k then joins (i, j), even though _C[i, k] = 0.
So I was wondering how the connectivity constraint is propagated once a cluster contains more than one point, since _C is only defined between individual points.
Thank you!
So what is happening in your case is that, despite you actively disconnecting the points you do not want in one cluster, these points are still part of the same connected component, and the data associated with them still says that they should be joined into the same cluster from a certain level of the hierarchy upward.
In general, AgglomerativeClustering works as follows: at the beginning, every data point is its own cluster. Then, in each iteration, two adjacent clusters are merged such that the overall increase in discrepancy with the original data is minimal, where the discrepancy is measured as the L2 distance between the original data and the cluster means.
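The merge loop described above can be sketched in plain Python. This is only an illustration, not scikit-learn's actual code: the names ward_cost and constrained_agglomerate are made up for this post, and the rule that a merged cluster inherits the union of its members' neighbors is my assumption about how the constraint behaves once clusters have more than one point.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    dims = range(len(a[0]))
    mean_a = [sum(p[d] for p in a) / na for d in dims]
    mean_b = [sum(p[d] for p in b) / nb for d in dims]
    sq_dist = sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))
    return na * nb / (na + nb) * sq_dist


def constrained_agglomerate(points, edges):
    """Greedily merge adjacent clusters; return (id_a, id_b, new_id) triples."""
    members = {i: [i] for i in range(len(points))}
    neighbors = {i: set() for i in members}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)

    history = []
    next_id = len(points)
    while len(members) > 1:
        # Only pairs of clusters that are currently adjacent are candidates.
        candidates = [
            (ward_cost([points[m] for m in members[a]],
                       [points[m] for m in members[b]]), a, b)
            for a in members for b in neighbors[a] if a < b
        ]
        if not candidates:
            break  # the remaining clusters are not connected to each other
        _, a, b = min(candidates)

        # Assumption: the new cluster is adjacent to everything a or b touched.
        merged_neighbors = (neighbors[a] | neighbors[b]) - {a, b}
        members[next_id] = members.pop(a) + members.pop(b)
        del neighbors[a], neighbors[b]
        for n in merged_neighbors:
            neighbors[n] -= {a, b}
            neighbors[n].add(next_id)
        neighbors[next_id] = merged_neighbors

        history.append((a, b, next_id))
        next_id += 1
    return history


# Points 0 and 1 are linked, 1 and 2 are linked, but 0 and 2 are not.
history = constrained_agglomerate(
    [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0)], [(0, 1), (1, 2)]
)
print(history)
```

In this toy run, points 0 and 1 merge first, and the resulting cluster is then adjacent to point 2 through point 1, so the missing 0-2 link no longer keeps them apart.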
Therefore, although you cut the direct link between two nodes, they can still end up in the same cluster one level higher, joined through an intermediate node.
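A small reproduction of this effect, as a sketch: the three points and the hand-written connectivity matrix are made up for illustration and stand in for your _XY and your edited _C. Points 0 and 2 have no direct link, but both are linked to point 1.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.cluster import AgglomerativeClustering

# Three collinear points: 0 and 1 are close together, 2 is farther away.
_XY = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0]])

# Links 0-1 and 1-2 only; the entry _C[0, 2] is explicitly 0.
_C = csr_matrix(np.array([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]]))

model = AgglomerativeClustering(n_clusters=1, linkage="ward", connectivity=_C)
labels = model.fit_predict(_XY)

# Each row of children_ holds the two nodes merged at that step; indices
# >= n_samples refer to clusters created by earlier merges.
print(model.children_)
print(labels)
```

The second row of children_ merges leaf 2 with the node formed from leaves 0 and 1, so points 0 and 2 share a cluster even though _C[0, 2] = 0.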