Customer Segmentation of Credit Card Default by Self Organizing Map

In this paper we apply the Self Organizing Map (SOM) technique to segment individuals based on their credit information. SOM is an unsupervised machine learning method that reduces data complexity and dimensionality while keeping the original topology of the data, which makes it superior to other dimension reduction methods, especially when features in the data have unclear nonlinear relations. Through this method we provide a clearer and more intuitive segmentation than other traditional methods can achieve.


Introduction
SOM is an unsupervised machine learning method that uses an artificial neural network to reduce data complexity and dimensionality. First proposed by C. von der Malsburg (1973) [1] and later developed and refined by T. Kohonen (Finland, 1982) [2], it is based on the same principle as biological neural networks. As in the human brain, when a nerve cell is excited, it inhibits the nerve cells surrounding it. This effect triggers competition among nerve cells, and in the end only some winning cells are excited. SOM simulates this biological procedure and hence has topology-preserving properties similar to those of the human brain, which makes it superior to other dimension reduction methods, especially when features in the data have unclear nonlinear relations.
Credit card default prediction is a long-standing problem that many financial institutions and banks are interested in. With the rising capability of acquiring and processing big data, people naturally ask whether there are better forecasting models for credit card default prediction that take advantage of the new availa-

Algorithm of Self Organizing Map
SOM simulates the excitation, coordination and suppression of biological neurons. It uses competitive dynamics to process information and to guide the learning and operation of the network, unlike the multi-layer perceptron (MLP), which uses the network error as the criterion for the algorithm. The basic idea of a competitive neural network is that the neurons of the competitive layer compete for the opportunity to respond to the input pattern, and in the end only one neuron becomes the winner of the competition. This winning neuron represents the classification of the input pattern.
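The competition described above is usually written as a winner-selection step followed by a weight update. The following is the standard Kohonen formulation, given here only as a sketch; the symbols ($x$ for the input vector, $w_j$ for the weight vector of neuron $j$, $r_j$ for its grid position, $\alpha(t)$ for the learning rate, $\sigma(t)$ for the neighborhood radius) are notation assumed for illustration and do not appear in the original text.

```latex
% Competition: the winner c is the neuron whose weight vector is closest to x
c = \arg\min_{j} \, \lVert x - w_j(t) \rVert

% Adaptation: the winner and its grid neighbors move toward the input
w_j(t+1) = w_j(t) + \alpha(t)\, h_{cj}(t)\, \bigl( x - w_j(t) \bigr)

% One common (assumed) neighborhood function: a Gaussian on the map grid
h_{cj}(t) = \exp\!\left( - \frac{\lVert r_c - r_j \rVert^{2}}{2\,\sigma(t)^{2}} \right)
```

Both $\alpha(t)$ and $\sigma(t)$ are typically decreased over time, so that early iterations order the map globally and later iterations fine-tune individual neurons.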
In an artificial neural network, a neuron processing unit can represent different objects, such as features, letters, concepts, or some meaningful abstract pattern. The processing units in a network are divided into three categories: input units, output units and hidden units. The typical SOM network consists of two layers (an input layer and an output layer).
The input unit accepts signals and data from the outside world. The output unit responds to the information and outputs the processing result. The hidden unit lies between the input and output units and cannot be observed from outside the system. Connection weights between neurons reflect the connection strength between cells. The representation and processing of information are reflected in the connections among the processing units of the network.
Each neuron on the grid is an output neuron, and the grid maintains the topological properties of the training set. SOM usually operates as follows. First, a two-dimensional array (map) is created and its initial values are randomized. Then training data are presented to the network and the cells compete to win; the winner and some neighbors in its "neighborhood" are stimulated, and the corresponding neurons are updated. Repeating this process again and again forms the resulting 2-dimensional network.
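The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this paper: the function name `train_som`, the map size, the linear decay schedules, the Gaussian neighborhood and the random toy data are all assumptions made for the example.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=1000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Step 1: create the 2-D map and randomize its initial weights.
    weights = rng.random((rows, cols, n_features))
    # Grid coordinates of every neuron, used to measure neighborhood distance.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        # Learning rate and neighborhood radius shrink linearly (assumed schedule).
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 1e-3
        # Step 2: present one randomly chosen training sample.
        x = data[rng.integers(n_samples)]
        # Step 3: competition; the winner has the closest weight vector.
        dists = np.linalg.norm(weights - x, axis=-1)
        winner = np.unravel_index(np.argmin(dists), dists.shape)
        # Step 4: stimulate the winner and its grid neighborhood (Gaussian).
        grid_d2 = np.sum((grid - np.array(winner)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Step 5: move the stimulated neurons toward the sample.
        weights += lr * h[..., None] * (x - weights)
    return weights

# Toy data: 200 samples with 4 features in [0, 1] (random, for illustration only).
toy = np.random.default_rng(1).random((200, 4))
som_weights = train_som(toy)
```

Sampling one input at a time corresponds to the online variant of the algorithm; batch variants that update the map once per pass over the data also exist.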
SOM is divided into a training process and a testing process. In the training process, the weight vectors are trained to become the clustering centers of the input sample space. In the testing process, when an input vector is similar to the instar weight vector of a neuron in the competitive layer, it is assigned to the corresponding cluster.
A typical self-organizing neural network consists of the input layer and the competitive layer. Neural networks are mainly used for the basic tasks of "classification" and "clustering"; the former is supervised, while the latter is performed without supervision. Clustering can also be seen as sorting the target samples without any prior information; the purpose is to put similar samples together and to separate dissimilar ones.
The implementation of SOM algorithms to deal with complicated data has attracted considerable attention from many researchers [3]-[9]. [10] and [11] introduced the concept of SOM, and [11] went on to present developments and applications.
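The testing step can be illustrated by assigning each input vector to the neuron whose weight vector is closest to it. The sketch below assumes Euclidean distance as the similarity measure; the function name `assign_to_units` and the tiny hand-written 2 x 2 map are hypothetical and serve only as an example.

```python
import numpy as np

def assign_to_units(data, weights):
    """Return the (row, col) grid index of the closest unit for each sample."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(rows * cols, n_features)
    # Squared Euclidean distance between every sample and every unit.
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=-1)
    best = d2.argmin(axis=1)
    # Convert flat unit indices back to grid coordinates.
    return np.column_stack(np.unravel_index(best, (rows, cols)))

# Hypothetical trained 2 x 2 map over two features (values chosen by hand).
weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [1.0, 1.0]]])
samples = np.array([[0.1, 0.9], [0.8, 0.2], [0.95, 0.9]])
units = assign_to_units(samples, weights)
print(units.tolist())  # [[0, 1], [1, 0], [1, 1]]
```

Samples that map to the same unit, or to neighboring units, can then be treated as members of the same segment.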

Data and Methodology
We use a data set of credit card defaults by individuals. We used 4 variables of the data set, Age, Gender, Marriage and Education, to conduct numerical training through SOM. Figure 1 plots the distribution of the data by age, Figure 2 by gender, Figure 3 by marriage, and Figure 4 by education. The data are high-dimensional (more than 3 dimensions) and hard for humans to interpret. By applying SOM segmentation, the data can be reduced to a lower dimension while keeping their original topological properties. This is a much-desired property for multivariate clustering. Normally, multivariate clustering separates data in a higher-dimensional space, and its 2-dimensional projection can be chaotic. However, in the result of SOM (Figure 5), the 2-dimensional projection of the data clearly still possesses clear boundaries, and hence the data maintain their topological properties.
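Because SOM training compares samples by distance, variables on very different scales (for example age in years versus a small categorical code) are often rescaled first. The text does not state whether any scaling was applied here, so the following is only an assumed preprocessing sketch; the function name `min_max_scale` and the four rows of values are made up for illustration and are not taken from the data set.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (x - lo) / span

# Made-up rows with columns: age, gender code, marriage code, education code.
raw = np.array([[24, 2, 1, 2],
                [35, 1, 2, 1],
                [57, 2, 1, 3],
                [41, 1, 3, 2]])
scaled = min_max_scale(raw)
```

After scaling, every column lies in [0, 1], so no single variable dominates the distance used to pick the winning neuron.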

Conclusion
The SOM method has the advantage of data compression. That is, high-dimensional sample data are mapped into a low-dimensional space while keeping the topology unchanged. SOM has clear advantages in this respect, which other widely used methods such as PCA or LDA do not have. Regardless of how many spatial dimensions the input sample data have, they can be mapped onto one area of the SOM output layer. The SOM method extracts, grasps and retains features. After the simulation process, the vectors in the high-dimensional space can be expressed more clearly in the low-dimensional feature space. Therefore, the mapping is not only a simple data compression, but also a discovery of the underlying law.

Figure 5 gives the result of clusters through SOM. The cluster result has well preserved the topological properties of the original data, i.e. the clusters are well separated.

Figure 4. Data distribution by education.