research-article

Corporate residence fraud detection

Authors:
Enric Junqué de Fortuny, University of Antwerp, Antwerp, Belgium
Marija Stankova, University of Antwerp, Antwerp, Belgium
Julie Moeyersoms, University of Antwerp, Antwerp, Belgium
Bart Minnaert, Ghent University, Ghent, Belgium
Foster Provost, New York University, New York, NY, USA
David Martens, University of Antwerp, Antwerp, Belgium

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, August 2014, Pages 1650–1659
https://doi.org/10.1145/2623330.2623333

Published: 24 August 2014

ABSTRACT

With the globalisation of the world's economies and ever-evolving financial structures, fraud has become one of the main dissipaters of government wealth and perhaps even a major contributor to the slowdown of economies in general. Although corporate residence fraud is known to be a major factor, limited data availability and high sensitivity have left this domain largely untouched by academia. The current Belgian government has pledged to tackle this issue at large through a variety of in-house approaches and cooperation with institutions such as academia, the ultimate goal being a fair and efficient taxation system. This is the first data mining application specifically aimed at finding corporate residence fraud, and we show the predictive value of using both structured and fine-grained invoicing data. We further describe the problems involved in building such a fraud detection system, which are mainly data-related (e.g., data asymmetry, quality, volume, variety and velocity) and deployment-related (e.g., the need for explanations of the predictions made).

Supplemental Material

p1650-sidebyside.mp4 (mp4, 190.1 MB)


Index Terms

Corporate residence fraud detection
1. Computing methodologies
  1. Machine learning

Published in
KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330
General Chairs: Sofus Macskassy (Facebook), Claudia Perlich (Dstillery)
Program Chairs: Jure Leskovec (Stanford University), Wei Wang (UCLA), Rayid Ghani (University of Chicago)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
corporate residence fraud
fraud detection
structured data
transactional data
Acceptance Rates
KDD '14 Paper Acceptance Rate: 151 of 1,036 submissions, 15%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%