Laboratory of Computer and Information Science > Teaching > T-61.5060 > Exercises 2007 > Solutions 7
BestClustering simply chooses the best of the given four clusterings here. For example choosing C1 gives zero disagreements with itself, five with C2, three with C3 and six with C4 thus resulting in the total count of disagreements 14. Finally and not surprisingly the C3 is decided to be the official best clustering.
For the Agglomerative algorithm we need to first generate the distances between elements. After this we come to notice that actually the 2-4 pair is the only one with distance less than half. We merge those two. But after that we actually can't do anything anymore.
The same distance table can be used in the case of the Balls algorithm. First we select the first element. Only the third element is at most one half away from it and so the average distance in the ball will too big, one half, and the first element will be a singleton cluster. After this we select the second element. Only the fourth element is less than half away from and so they form a cluster together. The leftover elements are easily seen to form singleton clusters too.
C1 | C2 | C3 | C4 | BestClustering | Agglomerative | Balls | |
---|---|---|---|---|---|---|---|
v1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 |
v2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
v3 | 2 | 1 | 1 | 1 | 1 | 3 | 3 |
v4 | 2 | 2 | 2 | 4 | 2 | 2 | 2 |
v5 | 3 | 3 | 3 | 3 | 3 | 4 | 4 |
v6 | 3 | 1 | 3 | 2 | 3 | 5 | 5 |
Each data row is generated independently from each other. The probability for a data row not having any of given K attributes is (1/2)^K. The probability of success across the whole data P(S) is thus (1-(1/2)^K)^|D| (each row should give the complement of the afore-mentioned case). The Bernoulli inequality gives us instantly P(S) > 1 - |D|(1/2)^K and to get this to be a constant we need K to be about O(log |D|).
Correlated attributes naturally reduce the expected size of X. This means also negatively correlated cases. One can vaguely look at this problem as a way of code construction. We must minimize the code (attribute set) size and at the same time maximize the mutual information (the descriptive capability of the attribute set).
We simply first calculate the pair-wise distances of all point-pairs. After that we iteratively select the point that is close to a largest share of left-over points. In this situation each point can be considered as a some sort of latent de-facto standard for the set of points. Now we want to find those consepts or standards that best fit the existing field of conventional ways of doing things.
Naturally now one can see each consept (center) and the area it fits (the circle) as a local cluster in the space. Therefore this algorithm is then nothing but a local clustering method.
You are at: CIS → T-61.5060 Exercises 2007, Solutions 7
Page maintained by t615060@james.hut.fi, last updated Monday, 19-Nov-2007 15:30:13 EET