MicroStrategy ONE
Association rules analysis
Association rules analysis attempts to find relationships between items. The most common example is market basket analysis. For more information on this type of analysis, refer to the Data Mining Services chapter in the Advanced Reporting Help.
Each rule states that an antecedent itemset implies a consequent itemset, where an itemset is a set of items. The antecedent is the combination of items that is analyzed to determine which other items it implies. Those implied items form the consequent of the rule.
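As a rough illustration of the antecedent and consequent structure, a rule can be modeled as a pair of itemsets. The `Rule` class and its field names below are hypothetical and are not part of any MicroStrategy API; this is only a sketch of the concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one association rule."""
    antecedent: frozenset  # combination of items that is analyzed
    consequent: frozenset  # items implied by the antecedent

    def __str__(self):
        left = ", ".join(sorted(self.antecedent))
        right = ", ".join(sorted(self.consequent))
        return "{" + left + "} => {" + right + "}"


# "A basket containing beef and onions implies potatoes."
rule = Rule(frozenset({"beef", "onions"}), frozenset({"potatoes"}))
print(rule)  # {beef, onions} => {potatoes}
```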
The Training Metric Wizard provides access to the following settings, which limit the number of rules generated from the training data:
-
Maximum number of items per antecedent: This setting defines the maximum number of items that can be included in each antecedent itemset (consequent itemsets can only contain a single item). For example, if set to three, each antecedent itemset contains one, two, or three items. For a transaction that includes the four items beef, onions, potatoes, and soda, a maximum of two produces antecedents of no more than two items (such as beef alone, or beef and onions), while each item is still included in the analysis.
The default value for this setting is two. Increasing this number may lead to the generation of more rules and, as a consequence, longer execution times for training reports.
-
Minimum confidence: The minimum probability that qualifying rules should have. For example, if set to 10%, then an association rule must have a confidence of 10% or more to appear in the model.
The default value for this setting is 10%. Increasing this value may lead to the generation of fewer rules.
-
Minimum support: The minimum fraction of transactions in which an itemset must occur to be considered for an association rule. For example, if set to 1%, then an itemset must appear in at least one out of every 100 transactions.
The default value for this setting is 10%. Increasing this value may lead to the generation of fewer rules.
-
Maximum consequent support: The maximum support of the consequent allowed for qualifying rules. This can be used to avoid including obvious recommendations in the resulting rules. For example, if set to 99%, then rules that have a consequent support greater than 99% are not included in the resulting model.
For more information on the meanings of confidence and support, refer to the Advanced Reporting Help.
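To make the effect of these four settings concrete, the following minimal sketch applies them to a small, made-up set of transactions. It uses the standard definitions (support is the fraction of transactions containing an itemset; confidence is the support of antecedent plus consequent divided by the support of the antecedent) and a brute-force search. The function name `mine_rules`, its parameter names, and the choice to apply minimum support to the combined antecedent and consequent itemset are assumptions made for illustration; this is not how the product necessarily implements training, and the 1.0 default for `max_consequent_support` simply means "no filtering" in this sketch.

```python
from itertools import combinations


def mine_rules(transactions, max_antecedent_items=2, min_confidence=0.10,
               min_support=0.10, max_consequent_support=1.0):
    """Brute-force rule search; names and structure are illustrative only."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    rules = []
    # Antecedents hold 1..max_antecedent_items items; consequents hold one.
    for size in range(1, max_antecedent_items + 1):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            antecedent_support = support(antecedent)
            if antecedent_support == 0:
                continue
            for consequent in items:
                if consequent in antecedent:
                    continue
                # Maximum consequent support: skip "obvious" consequents.
                if support(frozenset([consequent])) > max_consequent_support:
                    continue
                # Minimum support (assumed to apply to the full itemset).
                rule_support = support(antecedent | {consequent})
                if rule_support < min_support:
                    continue
                # Minimum confidence.
                confidence = rule_support / antecedent_support
                if confidence < min_confidence:
                    continue
                rules.append({"antecedent": antecedent,
                              "consequent": consequent,
                              "support": rule_support,
                              "confidence": confidence})
    return rules


# Made-up training data for illustration.
transactions = [
    ["beef", "onions", "potatoes", "soda"],
    ["beef", "onions", "potatoes"],
    ["beef", "soda"],
    ["onions", "potatoes"],
    ["soda"],
]

rules = mine_rules(transactions, max_antecedent_items=2,
                   min_confidence=0.6, min_support=0.4)
for r in rules:
    print(sorted(r["antecedent"]), "=>", r["consequent"],
          round(r["support"], 2), round(r["confidence"], 2))
```

Raising `min_support` or `min_confidence`, or lowering `max_consequent_support`, shrinks the list of rules, while raising `max_antecedent_items` enlarges the search space, which mirrors the behavior described for the settings above.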
Rules To Return
Association rules analysis can look for a variety of relationships. The type of analysis is controlled by the Rules To Return dialog box, which is accessed via the Rules… button on the Select Output tab of the Training Metric Wizard. These criteria determine which rules will be returned as output for a particular input itemset.
From this dialog, the following selections can be made:
-
Rule selection criteria:
-
Rules to select:
-
Antecedent only (Recommendation): Select rules when the input itemset includes the antecedent itemset only. This is the default behavior. Its purpose is to provide a recommendation, based on the input itemset.
-
Antecedent and consequent (Rule matching): Select rules when the input itemset includes both the antecedent and consequent itemsets. Its purpose is to locate rules that match the input itemset.
-
Return the top ranked rules: These settings allow for the specification of exactly which rule is to be returned, based on its ranking among all selected rules.
-
Return the top ranked rules up to this amount: Defines the maximum number of rules that are to be returned. For instance, if set to three, the top three rules can be returned as output. In this case, three separate predictive metrics are created. The default value for this setting is one, and its maximum value is 10.
-
Rank selected rules by: Defines whether the rules are to be ranked by confidence, support, lift, leverage, or affinity. Multiple selections may be specified.
-
Select all rules (One rule per row): This option allows for the creation of a report which displays all rules found within the model. If the corresponding predictive metric is placed on a report, a single rule is returned for each row of the report. The rules are returned in the order they appear within the model, and are not based on an input itemset.
The combination of selections made above determines the number of predictive metrics that are generated.
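The selection and ranking options can be sketched as a small filter over a list of rules. Everything here is hypothetical: the function `select_rules`, the mode names, the rule dictionaries, and their metric values are invented for illustration. The sketch also makes two interpretive assumptions: in the recommendation mode only the antecedent is checked against the input itemset, and when several ranking criteria are given, later criteria act as tie-breakers for earlier ones.

```python
def select_rules(rules, input_items, mode="antecedent_only",
                 top_n=1, rank_by=("confidence",)):
    """Pick the top-ranked rules for one input itemset (illustrative only)."""
    if not 1 <= top_n <= 10:
        # The text above gives a default of one and a maximum of 10.
        raise ValueError("top_n must be between 1 and 10")
    basket = frozenset(input_items)
    selected = []
    for rule in rules:
        # Both modes require the antecedent to be contained in the input.
        if not rule["antecedent"] <= basket:
            continue
        # Rule matching additionally requires the consequent to be present.
        if mode == "antecedent_and_consequent" and rule["consequent"] not in basket:
            continue
        selected.append(rule)
    # Rank by the chosen metrics, highest first (tie-breaking is an assumption).
    selected.sort(key=lambda r: tuple(r[m] for m in rank_by), reverse=True)
    return selected[:top_n]


# Hypothetical rules with made-up metric values.
example_rules = [
    {"antecedent": frozenset({"beef"}), "consequent": "potatoes",
     "confidence": 0.67, "support": 0.4},
    {"antecedent": frozenset({"onions"}), "consequent": "potatoes",
     "confidence": 1.0, "support": 0.6},
    {"antecedent": frozenset({"beef", "onions"}), "consequent": "potatoes",
     "confidence": 1.0, "support": 0.4},
    {"antecedent": frozenset({"soda"}), "consequent": "beef",
     "confidence": 0.67, "support": 0.4},
]

# Recommendation: what do beef and onions suggest?
recommended = select_rules(example_rules, ["beef", "onions"], top_n=3,
                           rank_by=("confidence", "support"))

# Rule matching: which rules does this complete basket satisfy?
matched = select_rules(example_rules, ["beef", "onions", "potatoes"],
                       mode="antecedent_and_consequent", top_n=3)
```

In this sketch, each of the up to `top_n` returned rules would correspond to one predictive metric, in line with the description of the "Return the top ranked rules up to this amount" setting.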
Outputs
Unlike most other models, which use generic outputs such as predicted value and probability, association rules models require a special set of outputs. With the exception of the Select all rules option described above, all outputs represent the winning rule (default), or the rule specified by the rank value. For a description of the available outputs for association rules analysis, see Predictive metric output.