Enhancing customer segmentation through factor analysis of mixed data (FAMD)-based approach using K-means and hierarchical clustering algorithms

Journal article


Sattar, U., Ufeli, C. P., Hasan, R. and Mahmood, S. 2025. Enhancing customer segmentation through factor analysis of mixed data (FAMD)-based approach using K-means and hierarchical clustering algorithms. Information. 16 (6), pp. 1-25. https://doi.org/10.3390/info16060441
Authors: Sattar, U., Ufeli, C. P., Hasan, R. and Mahmood, S.
Abstract

In today’s data-driven business landscape, effective customer segmentation is crucial for enhancing engagement, loyalty, and profitability. Traditional clustering methods often struggle with datasets containing both numerical and categorical variables, leading to suboptimal segmentation. This study addresses this limitation by introducing a novel application of Factor Analysis of Mixed Data (FAMD) for dimensionality reduction, integrated with K-means and Agglomerative Clustering for robust customer segmentation. While FAMD is not new in data analytics, its potential in customer segmentation has been underexplored. This research bridges that gap by demonstrating how FAMD can harmonize mixed data types, preserving structural relationships that conventional methods overlook. The proposed methodology was tested on a Kaggle-sourced retail dataset comprising 3900 customers, with pre-processing steps including correlation ratio filtering (η ≥ 0.03), standardization, and encoding. FAMD reduced the feature space to three principal components, capturing 81.46% of the variance, which facilitated clearer segmentation.
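
To make the correlation ratio filter concrete, the following is a minimal sketch of how η between a categorical and a numerical variable could be computed and compared with the 0.03 threshold quoted above. The function name, the toy `season` and `spend` columns, and the choice of which variable pairs to test are illustrative assumptions and are not taken from the article.

```python
from collections import defaultdict
from math import sqrt

ETA_THRESHOLD = 0.03  # threshold quoted in the abstract


def correlation_ratio(categories, values):
    """Return eta for a categorical variable against a numeric one.

    eta^2 = between-group sum of squares / total sum of squares.
    """
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)

    grand_mean = sum(values) / len(values)
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # a constant numeric column carries no association
    return sqrt(ss_between / ss_total)


# Hypothetical toy columns, not data from the study.
season = ["Winter", "Winter", "Summer", "Summer", "Spring", "Spring"]
spend = [80.0, 90.0, 40.0, 50.0, 60.0, 70.0]

eta = correlation_ratio(season, spend)
keep = eta >= ETA_THRESHOLD  # retain the feature only if eta meets the threshold
print(f"eta = {eta:.3f}, keep feature: {keep}")
```

A value of η near 0 means the group means barely differ, so the categorical feature explains little of the numeric variable's variance; a value near 1 means almost all variance lies between groups.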

Comparative clustering analysis showed that Agglomerative Clustering (Silhouette Score: 0.52) outperformed K-means (0.51) at k = 4, revealing distinct customer segments such as seasonal shoppers and high spenders. Practical implications include the development of targeted marketing strategies, validated through heatmap visualizations and cluster profiling. This study not only underscores the suitability of FAMD for customer segmentation but also sets the stage for more nuanced marketing analytics driven by mixed-data methodologies.
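
The Silhouette Score used for the comparison above can be illustrated with a short sketch. It assumes Euclidean distance on points that stand in for the three FAMD components; the function name `mean_silhouette`, the toy points, and the two label assignments are hypothetical and do not reproduce the study's data or its reported scores.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point: a = mean distance to its own cluster,
    b = smallest mean distance to any other cluster,
    s = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute s = 0 by convention
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical 3-D points standing in for FAMD component scores.
points = [(0, 0, 0), (0, 1, 0), (5, 5, 5), (5, 6, 5)]
labels_a = [0, 0, 1, 1]  # groups nearby points together
labels_b = [0, 1, 0, 1]  # mixes distant points in each cluster

score_a = mean_silhouette(points, labels_a)
score_b = mean_silhouette(points, labels_b)
print(f"labelling A: {score_a:.3f}, labelling B: {score_b:.3f}")
```

Scores range from -1 to 1, and the labelling with the higher mean is the better separated one, which is the sense in which 0.52 is read as slightly better than 0.51 in the abstract.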

Keywords: customer segmentation; FAMD; K-means; agglomerative clustering; silhouette score; mixed data analysis
Year: 2025
Journal: Information
Journal citation: 16 (6), pp. 1-25
Publisher: MDPI
ISSN: 2078-2489
Digital Object Identifier (DOI): https://doi.org/10.3390/info16060441
Web address (URL): https://www.mdpi.com/2078-2489/16/6/441
Accepted author manuscript
File access level: Restricted
Publisher's version
License: CC BY 4.0
File access level: Open
Output status: Published
Publication dates
Online: 26 May 2025
Publication process dates
Accepted: 26 May 2025
Deposited: 27 May 2025
Permalink: https://repository.derby.ac.uk/item/qy494/enhancing-customer-segmentation-through-factor-analysis-of-mixed-data-famd-based-approach-using-k-means-and-hierarchical-clustering-algorithms

Download files


Publisher's version
information-16-00441.pdf
License: CC BY 4.0
File access level: Open

