The generate program creates a random data set in the following way.
It first creates a number of random noise points. It then creates clusters by choosing points that will be the centres of circles of a given radius.
For each cluster point being created, it randomly chooses a cluster and then a point within that cluster's circle.
No input is required on standard input, and the unclustered dataset is written to standard output.
The annotation of each point is the cluster number it belongs to, or -1 for outliers. This aids greatly in evaluating how well a given clustering algorithm performs on the dataset.
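
The procedure described above can be sketched roughly as follows. This is an illustrative Python sketch, not the generate program's actual source: the function name, the parameter names (num_noise, num_clusters, num_cluster_points, radius), the unit-square domain, and the uniform sampling inside each circle are all assumptions made for the example.

```python
import math
import random


def generate(num_noise, num_clusters, num_cluster_points, radius, seed=None):
    """Return (points, centres); each point is an (x, y, label) tuple.

    Assumptions for this sketch: noise points and centres are drawn
    uniformly from the unit square, and cluster points are drawn
    uniformly from the disc around the chosen centre.
    """
    rng = random.Random(seed)
    points = []

    # Noise points are annotated with -1, marking them as outliers.
    for _ in range(num_noise):
        points.append((rng.random(), rng.random(), -1))

    # Each cluster is defined by the centre of a circle.
    centres = [(rng.random(), rng.random()) for _ in range(num_clusters)]

    # For each cluster point: pick a cluster, then a point in its circle.
    for _ in range(num_cluster_points):
        label = rng.randrange(num_clusters)
        cx, cy = centres[label]
        # The square root makes the samples uniform over the disc's area.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((cx + r * math.cos(theta), cy + r * math.sin(theta), label))

    return points, centres
```

Because every point carries its true label, the output of a clustering algorithm can be compared directly against the third field of each tuple. Note that in this sketch cluster points near the border may fall slightly outside the unit square.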
Usage:
generate [OPTIONS]