The following is a graphical representation of the property values:
I will discuss only K-means (K-medians) and hierarchical clustering, because the stream clustering algorithm by Guha et al. is an extension of these algorithms.
Example: Food
|
Sample data:
Given the following properties of 4 medicines:

Medicine      Weight index   pH value
Medicine A    1              1
Medicine B    2              1
Medicine C    4              3
Medicine D    5              4
|
Graphical representation:
|
Problem:
|
(Easy examples make things easier to understand :-))
New centroid (c1, c2, ..., cn) for the cluster C is found through:
|
Example:
|
Result:
DONE
|
The new centroid (c1, c2, ..., cn) for the cluster C is found through:
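Assuming the usual definitions, the update can be written as below: for K-means each coordinate c_j is the mean of the j-th coordinates of the points in C, and for K-medians it is their median. The symbols x_j (the j-th coordinate of a point x) and |C| (the number of points in C) are notation introduced here for illustration.

```latex
c_j = \frac{1}{|C|} \sum_{x \in C} x_j \qquad (j = 1, \dots, n) \qquad \text{(K-means)}

c_j = \operatorname{median}\{\, x_j : x \in C \,\} \qquad (j = 1, \dots, n) \qquad \text{(K-medians)}
```

For instance, if a cluster held medicines B, C and D, its K-means centroid would be ((2+4+5)/3, (1+3+4)/3) = (11/3, 8/3), roughly (3.67, 2.67).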
|
Select K points as initial centroids;
repeat
{
Form K clusters by assigning each point to its nearest centroid;
Recompute the centroid of each cluster from its new membership;
} until (centroids do not change)
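
The loop above can be sketched in Python roughly as follows. This is a minimal illustration, not a tuned implementation: the function name `kmeans`, the `max_iter` safety cap, the use of Euclidean distance, and the choice of medicines A and B as the initial centroids are all assumptions made for this example.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Sketch of K-means: assign points to the nearest centroid, then recompute."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):  # max_iter is a safety cap (an assumption)
        # Form K clusters by assigning each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the coordinate-wise mean of its members;
        # an empty cluster keeps its old centroid.
        new_centroids = [
            tuple(sum(xs) / len(members) for xs in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # centroids did not change: stop
            break
        centroids = new_centroids
    return centroids, clusters

# The 4 medicines as (weight index, pH value) pairs.
medicines = [(1, 1), (2, 1), (4, 3), (5, 4)]
# Assumed starting centroids: medicines A and B.
centroids, clusters = kmeans(medicines, [(1, 1), (2, 1)])
print(centroids)  # [(1.5, 1.0), (4.5, 3.5)]
print(clusters)   # [[(1, 1), (2, 1)], [(4, 3), (5, 4)]]
```

With these starting centroids the loop settles after three passes, grouping medicines A and B in one cluster and C and D in the other. A different initial choice may need a different number of passes or end in a different grouping.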
for ( each x ∈ input set )
{
Cx = { x }; // Each data point is in its own cluster
}
Compute the Proximity Matrix between every pair of clusters;
repeat
{
Merge the closest 2 clusters;
Update Proximity Matrix;
} until (stop condition, e.g., min. distance > MIN or number of clusters = k)
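
A small Python sketch of this agglomerative procedure is given below. Several choices are assumptions made for illustration: the names `single_link` and `agglomerative`, the single-link (closest members) definition of proximity, Euclidean distance, and the stop condition "number of clusters = k". For brevity the sketch recomputes the proximities on every pass instead of updating a stored proximity matrix.

```python
import math

def single_link(c1, c2):
    """Proximity of two clusters: distance between their closest members."""
    return min(math.dist(p, q) for p in c1 for q in c2)

def agglomerative(points, k, proximity=single_link):
    """Sketch of bottom-up hierarchical clustering that stops at k clusters."""
    # Each data point starts in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the closest pair of clusters (recomputed each pass for brevity).
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: proximity(clusters[ab[0]], clusters[ab[1]]),
        )
        # Merge the closest 2 clusters; j > i, so deleting j leaves i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# The 4 medicines as (weight index, pH value) pairs, reused here as sample input.
medicines = [(1, 1), (2, 1), (4, 3), (5, 4)]
result = agglomerative(medicines, k=2)
print(result)  # [[(1, 1), (2, 1)], [(4, 3), (5, 4)]]
```

On this input A and B merge first (distance 1), then C and D (distance about 1.41), which leaves the two clusters shown. Passing a different `proximity` function, for example one based on the farthest members, changes the linkage without touching the loop.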