MicroStrategy ONE

Predictive metric output

When a PMML model is imported, the user has the option to create a variety of predictive metrics, representing the different types of outputs supported by the model being imported. If the model defines specific outputs, these outputs are listed in the import dialog. However, if no outputs are defined, the user is given the option to create either a score (predicted value) or confidence (probability) predictive metric, or both.
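As a rough illustration of the choice described above, the following Python sketch lists the outputs declared in a PMML document and falls back to the score and confidence options when none are declared. It is not MicroStrategy's import logic. The PMML fragment, the field names, and the function `list_output_options` are hypothetical. The `OutputField` element and its `name` and `feature` attributes come from the PMML standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed PMML fragment; a real model file contains many more elements.
PMML_SNIPPET = """
<PMML version="4.2">
  <TreeModel functionName="classification">
    <Output>
      <OutputField name="Predicted_Response" feature="predictedValue"/>
      <OutputField name="Response_Label" feature="predictedDisplayValue"/>
      <OutputField name="Response_Probability" feature="probability"/>
    </Output>
  </TreeModel>
</PMML>
"""


def list_output_options(pmml_text):
    """Return (name, feature) pairs for each declared OutputField.

    If the model declares no outputs, return the two default choices
    (score and confidence). The default names are assumptions for this sketch.
    """
    root = ET.fromstring(pmml_text)
    # Compare local tag names so the sketch works with or without an XML namespace.
    fields = [
        (el.get("name"), el.get("feature"))
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "OutputField"
    ]
    if fields:
        return fields
    return [("Score", "predictedValue"), ("Confidence", "probability")]


for name, feature in list_output_options(PMML_SNIPPET):
    print(f"{name}: {feature}")
```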

The following is a list of possible output types currently supported by MicroStrategy (for all model types except Association Rules, which defines its own set of output types):

  • Predicted value: This output, commonly referred to as the score, is the raw predicted value of the model. For example, if the model predicts whether a customer is likely to respond to a marketing campaign, the predictive metric will return a prediction that says whether the customer is expected or not expected to respond (for example, Yes or No, 0 or 1, and so on).

  • Predicted display value: This output refers to the display value defined within the model that corresponds to the raw predicted value. This display value can be specified in the model's Target element. If it is not specified explicitly, the raw predicted value is used by default.

  • Probability: This output, also referred to as confidence, represents the probability associated with the raw predicted value, as defined by the model. For some types of models, there is no way to provide a probability output. In these cases, MicroStrategy displays either an empty cell, indicating that a confidence cannot be calculated, or some other useful information:

      • For some types of regression equations, confidence cannot be calculated and an empty result is returned. If the regression model was generated by MicroStrategy, the R-Squared of the regression equation is returned as the confidence.

      • For Clustering models for which confidence cannot be calculated, the calculated measure of the target cluster is returned.

      • For Neural Network models with continuous targets for which confidence cannot be calculated, the untransformed output of the neural network is returned. If this output is defined to be within the range of zero to one, this value can be interpreted as a probability. Depending on the transformation associated with the output layer, the predicted value and probability values may be identical.

      • For General Regression models, confidence cannot be calculated for continuous targets.

      • For Tree models, the record counts provided by the model are used to calculate confidence. If these optional counts are not provided, an empty result is returned.

  • Cluster ID (Clustering models only): This output refers to the ID of the predicted cluster. This will either be the name of the cluster, if such a name is defined within the model, or a 1-based index.

  • Cluster affinity (Clustering models only): This output refers to the distance from or the similarity to the winning cluster, depending on the context of the model.

  • Entity ID: Similar to Cluster ID, this output indicates the predicted cluster, tree node, neuron or rule, depending on the type of model.
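The Tree model case described under Probability can be sketched in Python as follows. This is a minimal illustration that assumes confidence is simply the predicted value's share of the record counts stored at the winning tree node. The text above does not spell out the exact calculation MicroStrategy performs. The function name `tree_confidence` and the sample counts are hypothetical.

```python
def tree_confidence(record_counts, predicted):
    """Confidence of `predicted` as its share of a node's record counts.

    `record_counts` maps each target value to the record count stored at the
    winning tree node. Returns None (an "empty result") when the optional
    counts are missing or unusable.
    """
    if not record_counts:
        return None
    total = sum(record_counts.values())
    if total <= 0 or predicted not in record_counts:
        return None
    return record_counts[predicted] / total


# Hypothetical node: 30 responders and 10 non-responders reached this leaf.
print(tree_confidence({"Yes": 30, "No": 10}, "Yes"))  # 0.75
print(tree_confidence({}, "Yes"))                     # None
```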

Association Rules models require a different set of output types. The following is a list of output types supported by MicroStrategy for this type of model:

  • Rule: This output returns a string representation of the rule, in the form "the antecedent itemset implies the consequent itemset".

  • Rule ID: This output refers to the ID of the output rule (similar to the Entity ID described above).

  • Antecedent: This output returns a list of items in the antecedent itemset of the output rule.

  • Consequent: This output returns a list of items in the consequent itemset of the output rule.

  • Confidence: This output returns the confidence of the output rule. Confidence is an estimate of the probability of a transaction having the consequent given the antecedent.

  • Support: This output returns the support of the output rule. Support is the relative frequency of transactions containing both the antecedent and the consequent.

  • Lift: This output returns the lift of the output rule. Lift is a ratio that describes whether the rule is more or less significant than random chance. Lift values greater than 1.0 indicate that transactions containing the antecedent tend to contain the consequent more often than transactions that do not contain the antecedent.

  • Leverage: This output returns the leverage of the output rule. Leverage compares the support of the antecedent and the consequent occurring together with the support that would be expected from their individual supports if the two were independent.

  • Affinity: This output returns the affinity of the output rule. Affinity is a measure of the similarity between the antecedent and consequent itemsets, which is referred to as the Jaccard Similarity in statistical analysis.
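The rule measures above can be made concrete with a small Python sketch. It uses the standard definitions, with support measured as a fraction of all transactions: confidence = support(A and C) / support(A), lift = confidence / support(C), leverage = support(A and C) - support(A) × support(C), and affinity = support(A and C) / (support(A) + support(C) - support(A and C)). The function `rule_metrics` and the sample transactions are hypothetical and are not part of any MicroStrategy API.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence, lift, leverage, and affinity for one rule.

    Measures whose denominator would be zero are returned as None.
    """
    a, c = frozenset(antecedent), frozenset(consequent)
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    if n == 0:
        raise ValueError("at least one transaction is required")

    # Relative frequency of transactions containing each itemset.
    supp_a = sum(a <= t for t in baskets) / n
    supp_c = sum(c <= t for t in baskets) / n
    supp_ac = sum((a | c) <= t for t in baskets) / n
    union = supp_a + supp_c - supp_ac

    return {
        "support": supp_ac,
        "confidence": supp_ac / supp_a if supp_a else None,
        "lift": supp_ac / (supp_a * supp_c) if supp_a and supp_c else None,
        "leverage": supp_ac - supp_a * supp_c,
        "affinity": supp_ac / union if union else None,
    }


# Hypothetical market-basket data.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

for measure, value in rule_metrics(TRANSACTIONS, {"bread"}, {"milk"}).items():
    print(f"{measure}: {value:.4f}")
```

For this sample data, the rule "bread implies milk" has a support of 0.5, a confidence of about 0.667, a lift of about 0.889, a leverage of -0.0625, and an affinity of 0.5. The lift below 1.0 and the negative leverage both indicate that transactions containing bread contain milk slightly less often than the overall transaction set does.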

For more details on these outputs, please refer to the Output chapter of the PMML standard, which can be found on the Data Mining Group website.