Efficient multi-classifier wrapper feature-selection model: Application for dimension reduction in credit scoring

Authors

DOI:

https://doi.org/10.7494/csci.2022.23.1.4120

Keywords:

multi-classifier, heuristic, dimensionality reduction, credit scoring

Abstract

Identifying the most relevant features for a credit scoring application is challenging. Removing redundant and irrelevant features is necessary to improve the performance of a credit scoring model. The wrapper approach is commonly used in credit scoring applications to identify the most relevant features. However, this approach suffers from two issues: subset generation and the use of a single classifier as the evaluation function. The problem is that different classifiers may give different results, which can be interpreted differently. Hence, we propose in this study an ensemble wrapper feature-selection model based on a combination of multiple classifiers. In the first stage, we address the problem of subset generation by reducing the search space through a customized heuristic. Then, a multi-classifier wrapper evaluation is applied, using two classifier arrangement approaches, to select a mutually approved set of relevant features. The proposed method is evaluated on four credit datasets and shows good performance compared with the results of individual classifiers.
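The idea of a wrapper whose evaluation function is shared by several classifiers can be illustrated with a minimal sketch. This is not the paper's algorithm: the greedy forward search, the unanimous-approval rule, the two toy classifiers (`nearest_centroid`, `one_nn`), the leave-one-out scoring, and the tiny dataset are all assumptions chosen for illustration. The abstract does not specify the customized heuristic or the two classifier arrangement approaches.

```python
from statistics import mean


def nearest_centroid(train_X, train_y, x):
    """Toy classifier: predict the class whose mean vector is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def one_nn(train_X, train_y, x):
    """Toy classifier: predict the label of the single closest training row."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, r)) for r in train_X]
    return train_y[dists.index(min(dists))]


def loo_accuracy(classifier, X, y, features):
    """Leave-one-out accuracy using only the given feature indices."""
    proj = [[row[f] for f in features] for row in X]
    hits = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += classifier(train_X, train_y, proj[i]) == y[i]
    return hits / len(proj)


def multi_classifier_forward_selection(X, y, classifiers, candidates):
    """Greedy forward search; a feature is added only if EVERY classifier
    improves (assumed approval rule, not taken from the paper)."""
    selected = []
    scores = [0.0] * len(classifiers)  # baseline for the empty subset
    while True:
        best = None
        for f in candidates:
            if f in selected:
                continue
            trial = selected + [f]
            trial_scores = [loo_accuracy(c, X, y, trial) for c in classifiers]
            approved = all(t > s for t, s in zip(trial_scores, scores))
            gain = mean(trial_scores) - mean(scores)
            if approved and (best is None or gain > best[0]):
                best = (gain, f, trial_scores)
        if best is None:
            return selected, scores
        selected.append(best[1])
        scores = best[2]


# Hypothetical toy data: feature 0 separates the classes,
# feature 1 is noise, feature 2 is constant.
X = [
    [0.10, 5, 1], [0.20, 1, 1], [0.30, 4, 1], [0.15, 2, 1],
    [0.90, 5, 1], [1.00, 1, 1], [0.80, 4, 1], [0.95, 2, 1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, scores = multi_classifier_forward_selection(
    X, y, [nearest_centroid, one_nn], candidates=[0, 1, 2]
)
print(selected, scores)
```

Requiring agreement from all classifiers is the strictest possible combination rule; a majority vote or a weighted average of classifier scores would be equally plausible alternatives under the same sketch.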

Published

2022-03-29

How to Cite

Bouaguel, W. (2022). Efficient multi-classifier wrapper feature-selection model: Application for dimension reduction in credit scoring. Computer Science, 23(1). https://doi.org/10.7494/csci.2022.23.1.4120

Section

Articles