The silhouette value is used to assess the quality of the clusters.
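As a rough illustration, the sketch below computes the silhouette of one point with the standard definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its cluster and b(i) is the smallest mean distance from i to any other cluster. The function name `silhouette` and the precomputed distance matrix are assumptions made for this example, not the project's own code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: silhouette of point i, given a full distance matrix
// and the cluster label (0..k-1) of every point.
double silhouette(const std::vector<std::vector<double>>& dist,
                  const std::vector<int>& label, int k, std::size_t i) {
    assert(dist.size() == label.size() && i < label.size());
    std::vector<double> sum(k, 0.0);  // total distance from i to each cluster
    std::vector<int> count(k, 0);     // members of each cluster, excluding i
    for (std::size_t j = 0; j < label.size(); ++j) {
        if (j == i) continue;
        sum[label[j]] += dist[i][j];
        ++count[label[j]];
    }
    const int own = label[i];
    if (count[own] == 0) return 0.0;  // convention: a singleton scores 0
    const double a = sum[own] / count[own];
    double b = std::numeric_limits<double>::infinity();
    for (int c = 0; c < k; ++c)
        if (c != own && count[c] > 0) b = std::min(b, sum[c] / count[c]);
    if (b == std::numeric_limits<double>::infinity()) return 0.0;  // no other cluster
    const double m = std::max(a, b);
    return m == 0.0 ? 0.0 : (b - a) / m;
}
```

Averaging this value over the members of a cluster gives a per-cluster silhouette; values close to 1 indicate well-separated clusters.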
-d <input file>: The input file name
-c <configuration file>: Configuration file name
-o <output file>: Output file name
-complete: Output has the members of each cluster
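A minimal sketch of how these options could be parsed is shown below. The `Options` struct and the `parseArgs` function are hypothetical names chosen for illustration; the project's own parsing code may differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the command-line options listed above.
struct Options {
    std::string input;      // -d
    std::string config;     // -c
    std::string output;     // -o
    bool complete = false;  // -complete
};

// Parses the arguments (without the program name); throws on bad input.
Options parseArgs(const std::vector<std::string>& args) {
    Options opt;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == "-complete") { opt.complete = true; continue; }
        if (a != "-d" && a != "-c" && a != "-o")
            throw std::invalid_argument("unknown option: " + a);
        if (i + 1 >= args.size())
            throw std::invalid_argument("missing value for " + a);
        const std::string& value = args[++i];
        if (a == "-d") opt.input = value;
        else if (a == "-c") opt.config = value;
        else opt.output = value;
    }
    return opt;
}
```

For example, the argument list `-d input.txt -c cluster.conf -o output.txt -complete` (file names invented here) would fill all four fields.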
1) Initialize the centroids
2) Assign the rest of the points to the centroids
3) Update the centroids
4) Go to 2 until no better clusters are found
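The steps above can be sketched as a plain loop over a precomputed distance matrix. This is only an illustration under stated assumptions: the initial medoids are passed in by the caller, the assignment is a brute-force nearest-medoid search, and the update is a Lloyd's-style move to the best member of each cluster. The names `assign`, `update` and `kMedoidsSketch` are hypothetical and do not come from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Step 2: assign every point to its nearest medoid; returns the total cost.
double assign(const Matrix& dist, const std::vector<std::size_t>& medoids,
              std::vector<std::size_t>& cluster) {
    double cost = 0.0;
    for (std::size_t p = 0; p < dist.size(); ++p) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t c = 0; c < medoids.size(); ++c) {
            if (dist[p][medoids[c]] < best) {
                best = dist[p][medoids[c]];
                cluster[p] = c;
            }
        }
        cost += best;
    }
    return cost;
}

// Step 3: move each medoid to the member with the smallest total distance
// to the other members of its cluster.
void update(const Matrix& dist, const std::vector<std::size_t>& cluster,
            std::vector<std::size_t>& medoids) {
    for (std::size_t c = 0; c < medoids.size(); ++c) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t cand = 0; cand < dist.size(); ++cand) {
            if (cluster[cand] != c) continue;
            double total = 0.0;
            for (std::size_t q = 0; q < dist.size(); ++q)
                if (cluster[q] == c) total += dist[cand][q];
            if (total < best) { best = total; medoids[c] = cand; }
        }
    }
}

// Repeats steps 2 and 3 until the total cost stops improving.
// `medoids` holds the initial choice on entry and the final medoids on return.
std::vector<std::size_t> kMedoidsSketch(const Matrix& dist,
                                        std::vector<std::size_t>& medoids) {
    assert(!medoids.empty() && medoids.size() <= dist.size());
    std::vector<std::size_t> cluster(dist.size(), 0);
    double cost = assign(dist, medoids, cluster);
    while (true) {
        std::vector<std::size_t> candidate = medoids;
        update(dist, cluster, candidate);
        std::vector<std::size_t> newCluster(dist.size(), 0);
        const double newCost = assign(dist, candidate, newCluster);
        if (newCost >= cost) break;  // no better clustering found
        medoids = candidate;
        cluster = newCluster;
        cost = newCost;
    }
    return cluster;
}
```

Stopping when the cost no longer decreases guarantees termination, because there are finitely many medoid sets and each accepted step strictly lowers the cost.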
All initialization, assignment and update methods are used, and the resulting clusters are written to the output file along with the silhouette value of each cluster.
void KMedoids<T>::run(InitializationType initial, AssignmentType Assign, UpdateType update, int s = 2)
initial is one of {InitializationPP, InitializationConcentrate}
Assign is one of {PamAssign, LSHAssign}
update is one of {Lloyds, Clarans}
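To make "all methods are used" concrete, the sketch below enumerates every combination of the three parameters. The enum definitions are stand-ins that only mirror the names listed above, and `Combination` and `allCombinations` are hypothetical; the project's real declarations and driver code may look different.

```cpp
#include <initializer_list>
#include <vector>

// Stand-in enum definitions mirroring the names listed above; the project's
// real declarations may differ.
enum InitializationType { InitializationPP, InitializationConcentrate };
enum AssignmentType { PamAssign, LSHAssign };
enum UpdateType { Lloyds, Clarans };

// Hypothetical triple describing one way to call run().
struct Combination {
    InitializationType initial;
    AssignmentType assign;
    UpdateType update;
};

// Builds the 2 x 2 x 2 = 8 combinations of the three method choices.
std::vector<Combination> allCombinations() {
    std::vector<Combination> result;
    for (InitializationType i : {InitializationPP, InitializationConcentrate})
        for (AssignmentType a : {PamAssign, LSHAssign})
            for (UpdateType u : {Lloyds, Clarans})
                result.push_back({i, a, u});
    return result;
}
```

Each entry `c` could then be passed on as `run(c.initial, c.assign, c.update)`, following the signature given above.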
void KMedoids<T>::clara(int s = 5)
runs the CLARA algorithm; s is the number of iterations.
More information about the algorithms can be found in the papers.
Vector space
@metric_space vector
@metric {euclidean, cosine} //default: euclidean
item_id1 x11 x12 ... x1d
item_id2
.
.
item_idN xN1 xN2 ... xNd
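The following sketch shows how one data line of this format could be read and how the two metrics could be computed. The `Item` struct and the function names are hypothetical, and cosine distance is taken here as 1 minus the cosine similarity, which is a common convention but an assumption about this project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line "item_id x1 x2 ... xd".
struct Item {
    std::string id;
    std::vector<double> coords;
};

// Reads the id, then every following number as a coordinate.
Item parseItem(const std::string& line) {
    std::istringstream in(line);
    Item item;
    in >> item.id;
    double x = 0.0;
    while (in >> x) item.coords.push_back(x);
    return item;
}

double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(sum);
}

// Cosine distance, taken here as 1 - cos(angle between a and b).
double cosineDistance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    assert(na > 0.0 && nb > 0.0);  // undefined for zero vectors
    return 1.0 - dot / (std::sqrt(na) * std::sqrt(nb));
}
```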
Hamming
@metric_space hamming
item_id1 B1
....
item_idN
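For the Hamming space, the distance between two items is the number of positions at which they differ. The sketch below assumes each item B is stored as a string of '0' and '1' characters of equal length; the function name `hammingDistance` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the positions at which two equal-length bit strings differ.
std::size_t hammingDistance(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    std::size_t diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) ++diff;
    return diff;
}
```

If the bit strings fit in a machine word, the same result can be obtained by XOR-ing the two values and counting the set bits.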
Distance Matrix
@metric_space matrix
item_id1 x11 x12 ... x1d
item_id2
.
.
item_idQ xQ1 xQ2 ... xQd
Configuration file
number_of_clusters: int // k value
number_of_hash_functions: int //default 4
number_of_hash_tables: int //default 5
clarans_set_fraction: int //default max{0.12*k(N-k),250}, where k is the number of clusters and N the number of data points
clarans_iterations: int //default 2
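A minimal sketch of reading this configuration with the listed defaults follows. It assumes one "key: value" pair per line with whitespace after the colon; the `Config` struct, `parseConfig` and `resolveSetFraction` are hypothetical names, and the default for clarans_set_fraction is evaluated with integer arithmetic, so 0.12*k(N-k) is rounded down.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical holder for the settings listed above, with their defaults.
struct Config {
    int numberOfClusters = 0;      // required, no default
    int numberOfHashFunctions = 4;
    int numberOfHashTables = 5;
    int claransSetFraction = -1;   // -1 means "use the default formula"
    int claransIterations = 2;
};

// Parses "key: value" pairs; unknown keys are ignored.
Config parseConfig(const std::string& text) {
    Config cfg;
    std::istringstream in(text);
    std::string key;
    int value = 0;
    while (in >> key >> value) {
        if (!key.empty() && key.back() == ':') key.pop_back();
        if (key == "number_of_clusters") cfg.numberOfClusters = value;
        else if (key == "number_of_hash_functions") cfg.numberOfHashFunctions = value;
        else if (key == "number_of_hash_tables") cfg.numberOfHashTables = value;
        else if (key == "clarans_set_fraction") cfg.claransSetFraction = value;
        else if (key == "clarans_iterations") cfg.claransIterations = value;
    }
    return cfg;
}

// Returns the configured value, or max{0.12*k(N-k), 250} when none was given.
long long resolveSetFraction(const Config& cfg, long long n) {
    if (cfg.claransSetFraction >= 0) return cfg.claransSetFraction;
    const long long k = cfg.numberOfClusters;
    assert(k > 0 && n >= k);
    return std::max(12 * k * (n - k) / 100, 250LL);
}
```

With k = 5 and N = 1000 the formula gives 597, while for N = 100 it falls back to the floor of 250.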