The confidence metric estimates how likely it is that one item or itemset appears in a transaction, given that another item or itemset is already present. For example, if a transaction contains two items, the presence of one item is assumed to lead to the other. The first item or itemset is the antecedent, and the second is the consequent. Confidence is thus defined as the ratio of the number of transactions containing both the antecedent and the consequent to the number of transactions containing the antecedent (whether or not the consequent is also present). This is represented as:

$$C(A, B) = \frac{\text{number of transactions containing both } A \text{ and } B}{\text{number of transactions containing } A}$$
where A is the antecedent, B is the consequent, and C(A,B) is the confidence that A leads to B.
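Extending the preceding example, assume that there are 150 transactions where apples and bananas were purchased together, out of 250 transactions that include apples. The confidence is calculated as:

$$C(\text{apple}, \text{banana}) = \frac{150}{250} = 0.6$$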
This result indicates a 60% chance that a transaction containing apples also contains bananas. Similarly, assuming a total of 500 transactions that include bananas, the confidence that a banana purchase leads to an apple purchase is calculated as:

$$C(\text{banana}, \text{apple}) = \frac{150}{500} = 0.3$$
Here, there is just a 30% chance that a banana purchase leads to an apple purchase.
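The two calculations can be reproduced with a short Python sketch. The `confidence` function and the synthetic transaction list are illustrative assumptions, not part of any particular library. The list is built so that 150 transactions contain both fruits, 250 contain apples (the count implied by the 60% figure), and 500 contain bananas.

```python
def confidence(transactions, antecedent, consequent):
    """Return C(antecedent, consequent) over a list of transactions (sets of items)."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    # Transactions containing the antecedent, with or without the consequent.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    # Transactions containing the antecedent and the consequent together.
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_antecedent


# Synthetic data (assumed): 150 with both, 100 apple-only, 350 banana-only.
transactions = (
    [frozenset({"apple", "banana"})] * 150
    + [frozenset({"apple"})] * 100
    + [frozenset({"banana"})] * 350
)

apple_to_banana = confidence(transactions, {"apple"}, {"banana"})  # 150 / 250
banana_to_apple = confidence(transactions, {"banana"}, {"apple"})  # 150 / 500
print(f"apple -> banana: {apple_to_banana:.2f}")
print(f"banana -> apple: {banana_to_apple:.2f}")
```

The sketch also makes the asymmetry visible: the numerator is the same in both directions, and only the denominator changes with the choice of antecedent.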
While confidence is a good measure of likelihood, it is not a guarantee of a clear association between items. The value of confidence might be high for other reasons, for example when the consequent is simply popular and appears in most transactions regardless of the antecedent. In practice, a minimum confidence threshold is applied to filter out weakly probable associations when mining association rules.
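As a minimal sketch of the thresholding step, the following Python snippet keeps only the rules whose confidence meets a chosen minimum. The `filter_rules` helper, the tuple layout, and the threshold value of 0.5 are assumptions made for illustration; a suitable threshold depends on the dataset.

```python
def filter_rules(rules, min_confidence):
    """Keep rules whose confidence is at least min_confidence.

    Each rule is a tuple: (antecedent, consequent, confidence).
    """
    return [rule for rule in rules if rule[2] >= min_confidence]


# Candidate rules using the confidence values from the apple/banana example.
candidate_rules = [
    ("apple", "banana", 0.6),
    ("banana", "apple", 0.3),
]

# Hypothetical threshold chosen for illustration only.
strong_rules = filter_rules(candidate_rules, min_confidence=0.5)
for antecedent, consequent, conf in strong_rules:
    print(f"{antecedent} -> {consequent} (confidence {conf:.2f})")
```

With this threshold, only the apple-to-banana rule survives. Passing the threshold does not by itself establish a meaningful association.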