| Main Author: | Myburg, Marius Errol |
|---|---|
| Other Authors: | Berman, Sonia |
| Format: | Thesis |
| Language: | English |
| Published: | Department of Computer Science, 2023 |
| Subjects: | computer science |
| _version_ | 1867613172189364224 |
|---|---|
| access_status_str | Open Access |
| author | Myburg, Marius Errol |
| author2 | Berman, Sonia |
| author_browse | Berman, Sonia Myburg, Marius Errol |
| author_facet | Berman, Sonia Myburg, Marius Errol |
| author_sort | Myburg, Marius Errol |
| collection | Thesis |
| description | CRM) will continue to gain prominence in the coming years. A commonly used CRM metric called Customer Lifetime Value (CLV) is the value a customer will contribute while they are an active customer. This study investigated the ability of supervised machine learning models constructed with XGBoost to predict future CLV, as well as the likelihood that a customer will drop to a lower CLV in the future. One approach to determining CLV, called the RFM method, isolates recency (R), frequency (F) and monetary (M) values. The models produced used these RFM variables, and the study also assessed whether including temporal, product, and other customer transaction information helped the XGBoost classifier make better predictions. The classification models were constructed by extracting each customer's RFM values and transaction information from a Fast Mover Consumer Goods dataset. Different variations of CLV were calculated through one- and two-dimensional K-means clustering of the M (Monetary), F and M (Profitability), F and R (Loyalty), as well as the R and M (Burgeoning) variables. Two additional CLV variations were also determined by isolating the M tercile segments and by a commonly used weighted-RFM approach. To test the effectiveness of XGBoost in predicting future timeframes, the dataset was divided into three consecutive periods, where the first period formed the features used to predict the target CLV variables in the second and third periods. Models that predicted whether CLV dropped to a lower value from the first to the second and from the first to the third periods were also constructed. It was found that the XGBoost models were moderately to highly effective in classifying future CLV in both the second and third periods. The models also effectively predicted whether CLV would drop to a lower value in both future periods. The ability to predict future CLV and CLV drop in the second period was only slightly better than the ability to predict the future CLV in the third period. Models constructed by adding temporal, product, and customer transaction information to the RFM values did not improve on those that used only the RFM values. These findings illustrate the effectiveness of XGBoost as a predictor of future CLV and CLV drop, and affirm the efficacy of utilising RFM values to determine future CLV. |
| format | Thesis |
| id | oai:open.uct.ac.za:11427/38088 |
| institution | University of Cape Town (South Africa) |
| language | eng |
| last_indexed | 2026-06-10T12:31:54.917Z |
| license_str | Not specified — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository |
| publishDate | 2023 |
| publishDateRange | 2023 |
| publishDateSort | 2023 |
| publisher | Department of Computer Science |
| publisherStr | Department of Computer Science |
| record_format | dspace |
| source_str | UCTD — University of Cape Town Open Access Repository |
| spelling | oai:open.uct.ac.za:11427/38088 Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost Myburg, Marius Errol Berman, Sonia computer science CRM) will continue to gain prominence in the coming years. A commonly used CRM metric called Customer Lifetime Value (CLV) is the value a customer will contribute while they are an active customer. This study investigated the ability of supervised machine learning models constructed with XGBoost to predict future CLV, as well as the likelihood that a customer will drop to a lower CLV in the future. One approach to determining CLV, called the RFM method, isolates recency (R), frequency (F) and monetary (M) values. The models produced used these RFM variables, and the study also assessed whether including temporal, product, and other customer transaction information helped the XGBoost classifier make better predictions. The classification models were constructed by extracting each customer's RFM values and transaction information from a Fast Mover Consumer Goods dataset. Different variations of CLV were calculated through one- and two-dimensional K-means clustering of the M (Monetary), F and M (Profitability), F and R (Loyalty), as well as the R and M (Burgeoning) variables. Two additional CLV variations were also determined by isolating the M tercile segments and by a commonly used weighted-RFM approach. To test the effectiveness of XGBoost in predicting future timeframes, the dataset was divided into three consecutive periods, where the first period formed the features used to predict the target CLV variables in the second and third periods. Models that predicted whether CLV dropped to a lower value from the first to the second and from the first to the third periods were also constructed. It was found that the XGBoost models were moderately to highly effective in classifying future CLV in both the second and third periods. The models also effectively predicted whether CLV would drop to a lower value in both future periods. The ability to predict future CLV and CLV drop in the second period was only slightly better than the ability to predict the future CLV in the third period. Models constructed by adding temporal, product, and customer transaction information to the RFM values did not improve on those that used only the RFM values. These findings illustrate the effectiveness of XGBoost as a predictor of future CLV and CLV drop, and affirm the efficacy of utilising RFM values to determine future CLV. 2023-07-12T10:20:29Z 2023-07-12T10:20:29Z 2023 2023-07-12T10:16:39Z Master Thesis Masters MSc http://hdl.handle.net/11427/38088 eng application/pdf Department of Computer Science Faculty of Science |
| spellingShingle | computer science Myburg, Marius Errol Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| thesis_degree_str | Master's |
| title | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| title_full | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| title_fullStr | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| title_full_unstemmed | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| title_short | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost |
| title_sort | using recency frequency and monetary variables to predict customer lifetime value with xgboost |
| topic | computer science |
| url | http://hdl.handle.net/11427/38088 |
| work_keys_str_mv | AT myburgmariuserrol usingrecencyfrequencyandmonetaryvariablestopredictcustomerlifetimevaluewithxgboost |
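
The description above mentions extracting each customer's recency, frequency and monetary values and segmenting customers into M terciles. The record does not give the thesis's actual preprocessing, so the following is only a minimal sketch under stated assumptions: the toy `transactions` list, the function names `rfm` and `monetary_terciles`, and the rank-based tercile rule are all hypothetical illustrations, not the author's method. In the thesis, features of this kind from the first period feed an XGBoost classifier, which is not shown here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy transactions: (customer_id, purchase_date, amount).
transactions = [
    ("a", date(2023, 1, 5), 20.0),
    ("a", date(2023, 3, 1), 35.0),
    ("b", date(2023, 2, 10), 5.0),
    ("c", date(2023, 1, 20), 60.0),
    ("c", date(2023, 2, 25), 40.0),
    ("c", date(2023, 3, 28), 50.0),
]


def rfm(rows, period_end):
    """Return {customer: (recency_days, frequency, monetary)} for one period."""
    last, freq, total = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in rows:
        # Recency is measured from the customer's most recent purchase.
        last[cust] = max(last.get(cust, day), day)
        freq[cust] += 1          # number of purchases
        total[cust] += amount    # total spend
    return {c: ((period_end - last[c]).days, freq[c], total[c]) for c in last}


def monetary_terciles(table):
    """Label each customer 0 (low), 1 (mid) or 2 (high) by rank of M.

    Assumption: terciles are formed by splitting the M ranking into three
    equal-sized groups; the thesis may define its segments differently.
    """
    ranked = sorted(table, key=lambda c: table[c][2])
    n = len(ranked)
    return {c: min(2, 3 * i // n) for i, c in enumerate(ranked)}


features = rfm(transactions, period_end=date(2023, 3, 31))
labels = monetary_terciles(features)
```

Here `features` maps each customer to an (R, F, M) tuple for the chosen period and `labels` gives a three-level monetary segment. Such a segment could serve as a classification target for a later period, in the spirit of the three-period design the description outlines.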