MicroStrategy ONE

Cluster analysis

Cluster analysis attempts to group objects into clusters such that the objects in a particular cluster are similar to one another and dissimilar to objects in other clusters.

MicroStrategy supports one type of cluster analysis: k-means.

The k-means algorithm partitions a set of n objects into k clusters so that intra-cluster similarity is maximized and inter-cluster similarity is minimized. Similarity is measured against the mean value of the objects in a cluster, which can be viewed as the cluster's "center of gravity". The algorithm is iterative: in its standard formulation, it repeatedly assigns each object to the cluster with the nearest mean and then recomputes the means, until the assignments stop changing.
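The iteration can be illustrated with a minimal sketch of textbook k-means in Python. This is not MicroStrategy's implementation. The function name kmeans, the Euclidean distance measure, the random choice of initial centers, and the max_iter and seed parameters are all assumptions made for illustration.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Partition points (tuples of floats) into k clusters.

    Returns (centers, labels), where labels[i] is the cluster index of points[i].
    Assumes Euclidean distance and randomly chosen initial centers.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: k of the input points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the means would not change
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels
```

Calling kmeans(points, 2) on a list of coordinate tuples returns the two cluster means and one cluster index per point. Each returned center is the "center of gravity" of the points assigned to it.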

MicroStrategy supports two varieties of this algorithm.

  • Specify exact number of clusters: The algorithm partitions the data into exactly the number of clusters that you specify.

  • Specify maximum number of clusters: The algorithm determines the optimal number of clusters, up to a maximum that you specify. It creates multiple models, starting with two clusters and adding one cluster at a time, and assesses the quality of each model. Quality is measured as the total distance of all records to the centers of their assigned clusters (DCurr), and this value is compared to the same value for the previously generated model (DPrev). The process continues until the relative improvement, (DPrev - DCurr) / DPrev, is less than the amount specified by the Percent improvement parameter, or until the maximum number of clusters is reached. When the process completes, the model with the best quality is used in the predictive metric.
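The search over cluster counts described in the second option can be sketched in Python as follows. This is an illustrative sketch, not MicroStrategy's implementation. The helper names fit_kmeans, total_distance, and choose_cluster_count are hypothetical. The sketch also assumes Euclidean distance, assumes that Percent improvement is expressed as a percentage (for example, 5 for 5%), and assumes that "best quality" means the lowest total distance among the models generated.

```python
import math
import random

def fit_kmeans(points, k, max_iter=100, seed=0):
    """Hypothetical helper: textbook k-means, returns (centers, labels)."""
    centers = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels

def total_distance(points, centers, labels):
    """Total distance of all records to the centers of their assigned clusters."""
    return sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels))

def choose_cluster_count(points, max_clusters, percent_improvement):
    """Grow the model from 2 clusters upward; return (distance, k, centers, labels).

    Stops when (d_prev - d_curr) / d_prev falls below the threshold or when
    max_clusters is reached. Assumption: the model with the lowest total
    distance seen so far counts as the one with the best quality.
    """
    best = None
    d_prev = None
    for k in range(2, max_clusters + 1):
        centers, labels = fit_kmeans(points, k)
        d_curr = total_distance(points, centers, labels)
        if best is None or d_curr < best[0]:
            best = (d_curr, k, centers, labels)
        if d_prev is not None:
            improvement = (d_prev - d_curr) / d_prev
            if improvement < percent_improvement / 100:
                break  # adding a cluster no longer helps enough
        if d_curr == 0:
            break  # every record sits on its center; nothing left to improve
        d_prev = d_curr
    return best
```

For example, choose_cluster_count(points, 6, 5) would try models with two to six clusters and stop early once an added cluster improves the total distance by less than 5%.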