Customer segmentation – combining RFM and predictive algorithms

Recency-Frequency-Monetary (RFM) segmentation has been around for a while and provides a simple but effective way to segment customers. An RFM model can also be combined with other modeling techniques to gain further insight into customer behavior. In this post we’ll discuss three of them – k-means clustering, classification (for example Logistic Regression) and recommendation – and see how they enhance the results of an RFM analysis.

Here’s the high-level flow of the analysis.

  • Calculate R, F and M parameters
  • Apply the k-means clustering algorithm to these parameters to group similar customers.
    • Note that the input values to this algorithm have to be continuous variables.
    • K-means and its offshoots are a popular approach to clustering because they are simple to implement, and they have been widely used in market segmentation.
    • The number of clusters can be determined using the elbow method: plot the within-cluster sum of squares against the number of clusters and pick the point where the curve flattens out.
  • Apply classification algorithms such as Logistic Regression and Decision Trees to predict which segment a customer is likely to fall into.
    • This will be a multi-class classification problem with the number of classes corresponding to the number of clusters from the previous step.
    • Use customer attributes such as age, gender, region, etc. as independent variables in the model.
    • Compute variable importance scores to understand which variables have the most impact on the outcome.
  • Finally, apply recommendation algorithms such as collaborative or content-based filtering and Association Rules.
    • The Association Rules method identifies relationships between products.
    • This step identifies associations between customer segments and the product items purchased together.
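
To make the first two steps concrete, here is a minimal sketch in plain Python. It computes R, F and M from a handful of made-up transactions, standardizes them, runs a basic k-means (Lloyd's algorithm) and records the within-cluster sum of squares for several values of k, which is what you would plot for the elbow method. The transactions, the snapshot date and all function names are hypothetical and only for illustration, and standardizing the inputs is an assumption added here so that no single parameter dominates the distances. In practice a library such as scikit-learn would likely be a better choice than a hand-rolled k-means.

```python
import math
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("a", date(2024, 1, 5), 20.0),
    ("a", date(2024, 3, 1), 35.0),
    ("b", date(2023, 11, 20), 500.0),
    ("c", date(2024, 2, 27), 15.0),
    ("c", date(2024, 2, 28), 25.0),
    ("c", date(2024, 3, 2), 30.0),
    ("d", date(2023, 10, 1), 450.0),
]
snapshot = date(2024, 3, 3)  # assumed analysis date


def rfm_table(rows, today):
    """Return {customer: (recency_in_days, frequency, monetary)}."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in rows:
        last[cust] = max(last.get(cust, day), day)
        freq[cust] += 1
        money[cust] += amount
    return {c: ((today - last[c]).days, freq[c], money[c]) for c in last}


def standardize(points):
    """Z-score each column (assumption: R, F and M should weigh equally)."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(p, means, stds)) for p in points]


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50):
    """Basic Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    centers = list(points[:k])  # naive deterministic initialization
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centers[j])  # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    labels = assign(points, centers)
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, inertia


rfm = rfm_table(transactions, snapshot)
customers = list(rfm)
points = standardize([rfm[c] for c in customers])

# Elbow method input: within-cluster sum of squares for each candidate k
inertias = {k: kmeans(points, k)[1] for k in range(1, 5)}

# Segment labels for a chosen k (2 here, purely for illustration)
labels, _ = kmeans(points, 2)
segments = dict(zip(customers, labels))
```

The resulting segments dictionary maps each customer to a cluster label, and those labels are what the classification step would use as its target classes.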

Together, these steps make it possible to predict customer behavior so that businesses can take appropriate action to attract new customers, or to retain and upsell to existing ones.