Market Basket Analysis Using Association Rules

Berkay
2 min read · Jun 13, 2021

Association rule mining was one of the first techniques used in data mining. Today, it is often associated with recommendation systems. It is a rule-based machine learning technique that analyzes patterns in past transactions and uses them to make predictions about future sales. The rules it produces describe which items tend to occur together, and with what probability. In short, it helps us answer the question: “Which products are sold together?”

Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy.
Association rules are widely used to analyze retail basket or transaction data. The goal is to identify strong rules in that data using measures of interestingness.

There are many algorithms for association rule analysis. In this article, we will examine the Apriori algorithm.

Apriori Algorithm

Applied to transaction data, the Apriori algorithm reveals product associations. This application, known as market basket analysis, plays an important role in determining companies' market strategies.
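To make the idea concrete, here is a minimal pure-Python sketch of the level-wise search that Apriori is known for: find frequent single items first, then build larger candidate itemsets only from smaller ones that were already frequent. The function name apriori_frequent_itemsets, the toy baskets, and the 0.5 threshold are all assumptions made for illustration; this is not the code from the project, and a real analysis would more likely rely on an existing library.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Illustrative sketch only; names and structure are hypothetical.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count step: keep the candidates that meet the support threshold.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy baskets, invented for this example.
baskets = [
    {"milk", "butter", "bread"},
    {"milk", "butter"},
    {"milk", "bread"},
    {"butter", "bread"},
    {"milk", "butter", "bread"},
]
result = apriori_frequent_itemsets(baskets, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

In this toy data, every single item and every pair passes the threshold, while the three-item basket appears in only 2 of 5 transactions and is dropped.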

Key Metrics

An example:

  • Assume there are 100 customers
  • 10 of them bought milk, 8 bought butter and 6 bought both of them.
  • Rule: bought milk => bought butter
  • support = P(Milk & Butter) = 6/100 = 0.06
  • confidence = support/P(Milk) = 0.06/0.10 = 0.60
  • lift = confidence/P(Butter) = 0.60/0.08 = 7.5

Note: this example is extremely small. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
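The arithmetic in the milk and butter example can be checked with a few lines of Python. The sketch below uses the standard definitions, where confidence divides by the antecedent's count and lift divides confidence by the consequent's probability. The helper name rule_metrics is an assumption made for illustration.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift for the rule antecedent => consequent."""
    support = n_both / n_total                    # P(A & B)
    confidence = n_both / n_antecedent            # P(B | A)
    lift = confidence / (n_consequent / n_total)  # P(B | A) / P(B)
    return support, confidence, lift


# Counts from the example: 100 customers, 10 bought milk, 8 butter, 6 both.
milk_to_butter = rule_metrics(100, n_antecedent=10, n_consequent=8, n_both=6)
butter_to_milk = rule_metrics(100, n_antecedent=8, n_consequent=10, n_both=6)

for name, (s, c, l) in [
    ("milk => butter", milk_to_butter),
    ("butter => milk", butter_to_milk),
]:
    print(f"{name}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

Under these definitions, confidence depends on the direction of the rule (0.60 for milk => butter, 0.75 for butter => milk), while support and lift are the same either way. A lift well above 1 suggests the two products are bought together more often than independence would predict.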

Let’s get to the code. The dataset we are using today comes from the UCI Machine Learning repository. The dataset is called “Online Retail” and can be found here. It contains all the transactions occurring between 01/12/2009 and 09/12/2011 for a UK-based and registered online retailer.
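Transaction data of this kind is typically stored one line item per row, so a first step is to group rows into one basket per invoice. The sketch below assumes each row has already been reduced to an (invoice identifier, product description) pair. The function name rows_to_baskets and the sample rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def rows_to_baskets(rows):
    """Group (invoice_id, description) pairs into one set of items per invoice."""
    baskets = defaultdict(set)
    for invoice_id, description in rows:
        # Skip rows with a missing or empty description.
        if description and description.strip():
            baskets[invoice_id].add(description.strip())
    return dict(baskets)


# Hypothetical rows, invented for illustration only.
sample_rows = [
    ("A001", "MILK"),
    ("A001", "BUTTER"),
    ("A002", "MILK "),
    ("A002", None),
    ("A003", "BREAD"),
    ("A003", "BUTTER"),
    ("A003", "BUTTER"),
]
baskets_by_invoice = rows_to_baskets(sample_rows)
for invoice_id, items in baskets_by_invoice.items():
    print(invoice_id, sorted(items))
```

Using a set per invoice means repeated line items count once, which matches the "was it in the basket or not" view that support, confidence, and lift are computed from.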

You can access the project code here:

Berkay

Data Science Enthusiast — For more information check out my LinkedIn page here: www.linkedin.com/in/berkayihsandeniz