Guided Tour of MPCluster: Controlling the Clusters
MPCluster supports two clustering algorithms, and a range of constraints that control the number, size, and spacing of the resulting clusters.
MPCluster's Cluster Options
The various cluster options are set in the lower left of MPCluster's main panel.
MPCluster for Maptitude supports two algorithms, K-Means and Hierarchical clustering, whilst the original MapPoint version supported only K-Means. K-Means is a stochastic algorithm (i.e. one with a random element) that works by trying to find 'low energy' clusters. In contrast, the Hierarchical algorithm is deterministic, with no random component. It works by grouping neighboring data points together and adding more data points until the groups become clusters that meet the required constraints. Being deterministic, the Hierarchical algorithm always produces an identical result when used with identical data and parameters. K-Means tends to produce more compact, circular clusters, whilst the Hierarchical algorithm can produce concave clusters.
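The difference between the two approaches can be illustrated with a minimal Python sketch. This is not MPCluster's implementation: it assumes plain Lloyd's K-Means and single-linkage agglomerative merging on planar (x, y) points, and the function names `kmeans` and `agglomerate` are hypothetical. It only shows where the random element enters K-Means and why a merge-based method returns the same answer every time.

```python
import random

def sq_dist(a, b):
    """Squared straight-line distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, k, seed=None, iterations=50):
    """Lloyd's K-Means: random starting centers, then assign and re-center."""
    rng = random.Random(seed)          # the random element of the algorithm
    centers = rng.sample(points, k)    # starting centers are picked at random
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (empty groups stay put).
        moved = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if moved == centers:           # converged
            break
        centers = moved
    return sorted(sorted(g) for g in groups)

def agglomerate(points, k):
    """Deterministic merging: join the two closest groups until k remain."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(sq_dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)       # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return sorted(sorted(c) for c in clusters)

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, 2, seed=1))   # may differ between seeds on harder data
print(agglomerate(points, 2))      # identical on every run
```

Fixing the seed makes the K-Means sketch repeatable, which is a property of this sketch rather than a documented MPCluster option. Single linkage was chosen here because it chains neighboring points together, which is one way concave clusters can arise.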
Constraints include setting the minimum and/or maximum cluster size, in terms of the number of data points, radius, or diameter. The minimum cluster separation can also be set.
These cluster distances are typically 'straight line' ('great circle') distances, but the Hierarchical algorithm, when set to use Median centers, can also use an external distance table. This lets you provide your own distances, such as driving distances or a set of custom 'costs'.
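As a rough illustration of how size and radius constraints interact with the choice of distance, the Python sketch below checks one cluster against optional limits using a pluggable distance function. Everything in it is an assumption made for illustration: the haversine formula stands in for the great circle distance, the distance table is a plain dictionary keyed by pairs of point identifiers, a Median center is taken to be the member with the smallest total distance to the other members, and the function names are hypothetical rather than part of MPCluster.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def great_circle_km(a, b):
    """Haversine distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def table_distance(table):
    """Build a distance function from a dict keyed by (from_id, to_id)."""
    def lookup(a, b):
        if a == b:
            return 0.0
        # Assumes symmetric costs when only one direction is listed.
        return table[(a, b)] if (a, b) in table else table[(b, a)]
    return lookup

def median_center(members, distance):
    """Member with the smallest total distance to all other members."""
    return min(members, key=lambda m: sum(distance(m, o) for o in members))

def meets_constraints(members, distance,
                      min_points=None, max_points=None, max_radius=None):
    """Check one cluster; any constraint left as None is ignored."""
    if min_points is not None and len(members) < min_points:
        return False
    if max_points is not None and len(members) > max_points:
        return False
    if max_radius is not None:
        center = median_center(members, distance)
        if max(distance(center, m) for m in members) > max_radius:
            return False
    return True

# Hypothetical driving distances (km) between three sites.
driving = table_distance({("A", "B"): 12.0, ("A", "C"): 30.0, ("B", "C"): 20.0})
print(median_center(["A", "B", "C"], driving))
print(meets_constraints(["A", "B", "C"], driving, max_radius=25))
```

Because the center in this sketch is always one of the data points, every distance it needs is already present in the table. That is one plausible reason an external table would be tied to Median centers, though the source does not state it.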
There is no guarantee that either of these algorithms will find the requested number of clusters that meet the required constraints. All constraints are optional, and it is generally recommended that you limit the number of constraints, so as to increase the chances of finding a good, stable result.
The Professional license also adds the ability to pre-define your own fixed cluster centers. MPCluster will then try to use these cluster centers (if possible), and not move them. This feature is useful if your project covers an area where you already have resources allocated. For example, you might be trying to find the best locations for some new sales depots, but you already have operational sales depots in the study area.
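One way to picture fixed centers is a K-Means variant in which some centers take part in the assignment step but are never re-centered. The Python sketch below is an assumption about how such a feature could work, not a description of MPCluster's internals; `kmeans_fixed` and its parameters are hypothetical names, and it uses planar (x, y) coordinates for simplicity.

```python
import random

def sq_dist(a, b):
    """Squared straight-line distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans_fixed(points, fixed_centers, n_new, seed=0, iterations=50):
    """K-Means in which the first len(fixed_centers) centers never move."""
    rng = random.Random(seed)
    n_fixed = len(fixed_centers)
    # Fixed centers come first; the new centers start at random data points.
    centers = list(fixed_centers) + rng.sample(points, n_new)
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        updated = list(centers)
        # Only the non-fixed centers are moved to the mean of their group.
        for i in range(n_fixed, len(centers)):
            g = groups[i]
            if g:
                updated[i] = (sum(x for x, _ in g) / len(g),
                              sum(y for _, y in g) / len(g))
        if updated == centers:
            break
        centers = updated
    return centers, groups

customers = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
existing_depot = (0.3, 0.3)            # already operational, must not move
centers, groups = kmeans_fixed(customers, [existing_depot], n_new=1)
print(centers[0])                      # the existing depot, unchanged
```

Like any K-Means run, the position found for the new center depends on the random starting point, so a sketch like this would normally be repeated with several seeds and the best result kept.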
Number of Clusters
The number of clusters to find is an optional parameter for the Hierarchical algorithm. The K-Means algorithm, however, requires a number (or maximum number) of clusters to find. MPCluster can estimate the number of clusters for the K-Means algorithm using the current parameters. This can be time-consuming, but MPCluster has been optimized for speed on modern multi-core CPUs.
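The source does not say how the estimate is made. One simple approach, sketched below in Python purely as an assumption, is to try increasing cluster counts and stop at the first one for which a K-Means run satisfies the current constraints (here, only a maximum radius). The names `estimate_cluster_count`, `max_k`, and `restarts` are hypothetical. The repeated runs are what make this kind of search time-consuming, and they are independent of one another, which is why such a search suits multi-core CPUs.

```python
import math
import random

def kmeans(points, k, seed, iterations=50):
    """Compact Lloyd's K-Means on (x, y) points; returns (centers, groups)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        moved = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if moved == centers:
            break
        centers = moved
    return centers, groups

def estimate_cluster_count(points, max_radius, max_k=10, restarts=5):
    """Smallest k for which some K-Means run keeps every radius in bounds."""
    for k in range(1, min(max_k, len(points)) + 1):
        for seed in range(restarts):       # several random starts per k
            centers, groups = kmeans(points, k, seed)
            if all(math.dist(c, p) <= max_radius
                   for c, g in zip(centers, groups) for p in g):
                return k
    return None                            # no k up to max_k met the constraint

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(estimate_cluster_count(points, max_radius=2.0))
```

A `None` result reflects the point made earlier in this section: there is no guarantee that a clustering meeting the constraints exists within the allowed number of clusters.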
Next, we look at the MPCluster display settings.