Efficient resampling for fraud detection during anonymised credit card transactions with unbalanced datasets
Conference item
Authors | Mrozek, Petr, Panneerselvam, J. and Bagdasar, Ovidiu |
---|---|
Abstract | The rapid growth of e-commerce and online shopping has resulted in an unprecedented increase in the amount of money lost annually to credit card fraudsters. To address credit card fraud, researchers are applying various machine learning techniques to detect and prevent fraudulent credit card transactions efficiently. A prevalent issue in the analysis of credit card transactions is the highly unbalanced nature of the datasets, which is frequently associated with binary classification problems. This paper reviews, analyses and implements a selection of notable machine learning algorithms, such as Logistic Regression, Random Forest, K-Nearest Neighbours and Stochastic Gradient Descent, in order to evaluate empirically how efficiently they handle unbalanced datasets when detecting fraudulent credit card transactions. A publicly available dataset comprising 284,807 transactions by European cardholders is analysed and used to train the studied machine learning techniques to detect fraudulent transactions. Furthermore, this paper evaluates the incorporation of two notable resampling methods, namely Random Under-sampling and the Synthetic Minority Over-sampling Technique (SMOTE), into the aforementioned algorithms, in order to analyse their efficiency in handling unbalanced datasets. The resampling methods significantly increased detection ability: the most successful combination, Random Forest with Random Under-sampling, achieved a recall score of 100%, compared with a recall score of 77% for the model without resampling. The key contribution of this paper is the proposal of efficient machine learning algorithms, together with suitable resampling methods, for credit card fraud detection with unbalanced datasets. |
Keywords | Credit cards; Vegetation; Machine learning algorithms |
Year | 2020 |
Journal | 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC) |
Publisher | IEEE |
Digital Object Identifier (DOI) | https://doi.org/10.1109/ucc48980.2020.00067 |
Web address (URL) | http://hdl.handle.net/10545/625574 |
hdl:10545/625574 | |
ISBN | 9780738123943 |
File | File Access Level Open |
File | File Access Level Open |
Publication dates | 30 Dec 2020 |
Publication process dates | |
Deposited | 01 Feb 2021, 11:09 |
Accepted | 06 Oct 2020 |
Contributors | University of Derby and University of Leicester |
https://repository.derby.ac.uk/item/928z4/efficient-resampling-for-fraud-detection-during-anonymised-credit-card-transactions-with-unbalanced-datasets
Download files
File
license.txt | ||
File access level: Open |
(2020) UCC - Mrozek et al - Fraud detection.pdf | ||
File access level: Open |