An Improved Fuzzy ISODATA Algorithm for Credit Risk Assessment of the EIT Enterprises

This paper proposes an improved fuzzy ISODATA algorithm for assessing the credit risk of emerging information technology (EIT) enterprises. First, because the uncertainty surrounding EIT enterprises is relatively large, we set a reference sample system and an initial clustering center matrix, which overcomes the shortcomings of the traditional ISODATA algorithm and improves the reliability of the fuzzy clustering analysis. Second, we describe the steps for evaluating the credit risk of EIT enterprises with the improved fuzzy ISODATA algorithm. Finally, we assess the credit risk of 10 EIT enterprises in a certain city, which demonstrates the effectiveness and operability of the method.


Introduction
Emerging Information Technology (EIT) is defined as technology that innovates or upgrades the functions, products, or services of information technology by applying the basic principles and methods of information science together with the technical characteristics of emerging technologies [1]. Because of uncertain factors such as the EIT itself and the product market of the EIT enterprise, EIT enterprises face large credit risk [2,3].
Credit is an inevitable product of social and economic development and an essential part of the modern economy. Credit risk is the possibility that a bond issuer or borrower will default by failing to repay principal and interest in a timely manner, causing a loss to a bank or to investors. EIT enterprises are typical venture businesses. An objective and comprehensive assessment of their credit risk is not only a necessary foundation for smooth financing but also an essential part of the risk management of EIT enterprises.
Current credit risk assessment models include Credit Metrics, Credit Risk+, KMV, multi-objective decision making, non-parametric statistical methods, neural networks and so on. Fuzzy clustering analysis for corporate credit risk assessment belongs to the non-parametric statistical methods [4,5]. Such methods give good results particularly when the overall distribution function is unknown. Existing studies usually either stop after classifying the target objects with a fuzzy clustering method, without ranking the resulting classes, or treat all indicators as benefit-type (efficiency) indicators in the rating analysis, so that the classification depends on the properties of the specific set of targets being clustered. Under this approach, the same EIT enterprise may receive different credit ratings if it is clustered and rated separately with two (or more) groups of EIT enterprises of different qualifications as the target set. This is especially likely when a group contains very few objects, and it reduces the objectivity of the credit risk assessment of EIT enterprises. On the other hand, an EIT enterprise is usually in its initial period of foundation, so its growth and development are highly uncertain and the index data used to assess its credit risk are usually incomplete; a corresponding reference sample system therefore needs to be set up first. The reference sample system is an objective standard for the specific requirements of EIT enterprise credit risk assessment. It is defined by the ideal value of every characteristic for every grade and is used to guide the clustering of the targets. At present, there is little literature on credit risk assessment of EIT enterprises.
In view of this, we improve the traditional fuzzy clustering method by setting a reference sample system for EIT enterprises, and propose an improved fuzzy clustering algorithm for clustering the credit risk of EIT enterprises. An example shows that the improved algorithm alleviates, to a certain extent, the insufficient objectivity of the traditional fuzzy clustering method when it is applied to credit risk assessment.

Fuzzy Cluster Analysis
Fuzzy ISODATA (Iterative Self-Organizing Data Analysis Techniques Algorithm) is an iterative self-organizing data analysis technique for fuzzy clustering [6,7]. The standard fuzzy ISODATA algorithm works as follows: given a predetermined number of classes, choose an initial fuzzy classification matrix, compute the optimal fuzzy classification matrix and the optimal cluster center matrix by iteration, and then classify the inspected objects. The algorithm is sensitive to the choice of the initial fuzzy classification matrix; an inappropriate choice can distort the iterative process. The standard algorithm is also limited when it is used to rate target objects: it can only partition the objects into a specified number of classes and cannot judge whether the classes are separated by a "meaningful distance". For this reason, the reference sample system and the inspected samples are clustered together. The steps of the improved fuzzy ISODATA algorithm are as follows:
1) Establish the original characteristic indicator matrix U* that describes each attribute value of all inspected objects and reference samples, where u*_ij denotes the value of characteristic indicator j of object i.
2) Standardize the data of the original characteristic indicator matrix U* by the range method to obtain U.
3) Start the iteration from the original cluster center matrix of the reference sample system.
4) Calculate the fuzzy classification matrix using formula (2), where c denotes the number of classes.
5) Modify the cluster center matrix using formula (3).
6) Recompute the fuzzy classification matrix and compare the results of successive iterations. For a given precision ε, if they differ by no more than ε, stop the iteration and output the result; otherwise, continue iterating.
7) Obtain the fuzzy clustering from the optimal cluster center matrix by the following discrimination principle: if the optimal cluster center of class i is the one closest to object u_k, then u_k is classified into class i.
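The steps above can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it assumes that formulas (2) and (3) are the standard fuzzy ISODATA (fuzzy c-means) membership and center updates with Euclidean distance and fuzziness exponent q = 2, and that the range method is plain min-max scaling. The function names and the small data set are hypothetical.

```python
import math

def range_standardize(rows):
    """Scale every column to [0, 1] (assumed form of the range method)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)] for row in rows]

def distance(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def memberships(samples, centers, q=2.0):
    """Fuzzy classification matrix; result[k][i] is the membership of sample k in class i."""
    result = []
    for u in samples:
        d = [distance(u, v) for v in centers]
        if any(x == 0.0 for x in d):
            # The sample coincides with a center: share full membership there.
            hits = [x == 0.0 for x in d]
            result.append([1.0 / sum(hits) if h else 0.0 for h in hits])
        else:
            p = 2.0 / (q - 1.0)
            result.append([1.0 / sum((di / dj) ** p for dj in d) for di in d])
    return result

def update_centers(samples, r, q=2.0):
    """Membership-weighted mean of the samples for every class."""
    centers = []
    for i in range(len(r[0])):
        w = [row[i] ** q for row in r]
        total = sum(w)
        centers.append([sum(wk * u[j] for wk, u in zip(w, samples)) / total
                        for j in range(len(samples[0]))])
    return centers

def improved_fuzzy_isodata(samples, init_centers, q=2.0, eps=1e-3, max_iter=100):
    """Iterate from the reference-sample centers instead of a random initial matrix."""
    centers = [list(v) for v in init_centers]
    r = memberships(samples, centers, q)
    for _ in range(max_iter):
        centers = update_centers(samples, r, q)
        r_new = memberships(samples, centers, q)
        change = max(abs(a - b) for ra, rb in zip(r_new, r) for a, b in zip(ra, rb))
        r = r_new
        if change <= eps:
            break
    # Discrimination principle: assign each sample to its nearest optimal center.
    labels = [min(range(len(centers)), key=lambda i: distance(u, centers[i]))
              for u in samples]
    return centers, r, labels

# Hypothetical data: two reference samples (one per grade) and three inspected objects.
reference = [[0.0, 0.0], [10.0, 10.0]]
inspected = [[1.0, 2.0], [9.0, 8.0], [2.0, 1.0]]
data = range_standardize(reference + inspected)
centers, r, labels = improved_fuzzy_isodata(data, data[:len(reference)])
print(labels)
```

Because the reference samples are standardized together with the inspected objects and also supply the initial centers, the class indices keep the meaning of the reference grades throughout the iteration.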

Assessment Steps of Integration of Rough Set and Improved Fuzzy Clustering
When assessing the credit risk of EIT enterprises, the evaluation indicators should be screened first. Applying the improved fuzzy ISODATA algorithm to the screened indicators gives a better classification of the credit risk of EIT enterprises. The specific steps for assessing and classifying the credit risk of EIT enterprises by integrating rough-set attribute reduction with the improved fuzzy ISODATA algorithm are as follows:
1) Establish an initial set of assessment indicators and the sample set to be inspected, where a_ij denotes the value of assessment indicator j of sample i.
2) Discretize the data a_ij.
3) Reduce the attributes of the indicator set by applying rough set theory [8,9].
4) Establish a proper reference sample set and construct an initial clustering center matrix.
5) Add the reference sample set to the sample set to be inspected, and cluster all the samples with the improved fuzzy ISODATA algorithm described above to obtain the risk rating of the EIT enterprises.
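The attribute reduction in step 3) can be illustrated with a small sketch. It assumes a table without a decision attribute, so a reduct is taken to be a subset of indicators that induces the same indiscernibility partition of the samples as the full indicator set. It uses a simple greedy backward elimination, which is a substitute chosen for brevity and not the reduction procedure of any particular rough set tool. The function names and the discretized values are hypothetical.

```python
def partition(table, attrs):
    """Group sample indices that are indiscernible on the given attributes."""
    classes = {}
    for idx, row in enumerate(table):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(idx)
    return sorted(classes.values())

def greedy_reduct(table, attrs):
    """Drop attributes one at a time as long as the partition stays unchanged."""
    full = partition(table, attrs)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if trial and partition(table, trial) == full:
            kept = trial
    return kept

# Hypothetical discretized indicator values for four samples.
table = [
    {"P11": 0, "P12": 0, "P31": 1},
    {"P11": 1, "P12": 1, "P31": 1},
    {"P11": 1, "P12": 1, "P31": 0},
    {"P11": 0, "P12": 0, "P31": 0},
]
reduct = greedy_reduct(table, ["P11", "P12", "P31"])
print(reduct)
```

In this toy table P11 and P12 always take the same value, so one of them is redundant. A greedy pass finds one reduct only, and which one it finds depends on the order in which the attributes are tried.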

Case Analysis
We assessed the credit risk of 10 EIT enterprises (denoted by A, B, C, D, E, F, G, H, I and J) in a certain city. Firstly, we chose 4 primary-level indicators and 15 secondary indicators according to the principles of being systematic, scientific, operational and objective and of combining quantitative and qualitative analysis, with reference to the evaluation index system for the credit risk of emerging technology enterprises [1]. They are the financial benefits status indicator (P1): ROE (P11), ROA (P12), asset maintenance and appreciation indicator (P13), OPE (P14); the assets operating state indicator (P2): total asset turnover (P21), current assets turnover (P22), inventory turnover (P23), accounts receivable turnover (P24); the debt-paying ability indicator (P3): asset-liability ratio (P31), acid-test ratio (P32), cash flow debt ratio (P33); and the development state indicator (P4): sales growth rate (P41), capital accumulation rate (P42), the average growth rate of capital (P43) and the ability of technological innovation and application (P44). The data come from the financial statements of the 10 EIT enterprises, except for P44, which was graded by experts.
Last but not least, establish the 5-level reference sample system, denoted as K, L, M, N and O, with reference to the current 5-level credit risk standard of commercial banks [5] and the credit rating standard for IT corporations among EIT corporations [10], as Table 2 shows (V indicates the highest credit rating and corresponds to the lowest credit risk). Set the accuracy ε = 0.001 and, following the specific steps of the above-mentioned improved fuzzy ISODATA algorithm, obtain the clustering centers of the credit ratings shown in Table 3 through repeated iteration on a computer. The classification results are then obtained from these centers according to the principle of optimal clustering center. Without a reference sample system, the 10 investigated EIT enterprises would have to be assigned across all 5 credit grades; in the case of grade V, at least one enterprise would be assigned to this highest level. With the improved algorithm, by contrast, grade V contains only the pre-set reference sample, the 7 attribute values of every cluster center are close to the corresponding attribute values of the reference samples, and all 5 reference samples are assigned to the correct credit grade after being clustered together with the credit risk indicators of the 10 investigated EIT enterprises. This comparison shows that the results obtained by the improved algorithm presented in this paper are more consistent with the actual situation of EIT enterprises.
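The two checks used in this comparison, namely whether every reference sample ends up in its own grade and which grades contain no inspected enterprise, can be expressed as a short helper. This is only an illustrative sketch: the function name and the label vector are hypothetical and are not the results of the case study. It assumes that the first n_ref samples are the reference samples and that reference sample i is expected in class i.

```python
def check_reference_grades(labels, n_ref):
    """labels[k] is the class index assigned to sample k.

    Returns whether every reference sample sits in its own class, and the
    list of classes that contain no inspected (non-reference) sample.
    """
    refs_ok = all(labels[i] == i for i in range(n_ref))
    occupied = {g for k, g in enumerate(labels) if k >= n_ref}
    only_reference = [i for i in range(n_ref) if i not in occupied]
    return refs_ok, only_reference

# Hypothetical labels: 5 reference samples followed by 10 inspected enterprises.
labels = [0, 1, 2, 3, 4, 1, 2, 2, 3, 1, 0, 3, 2, 1, 3]
refs_ok, only_reference = check_reference_grades(labels, 5)
print(refs_ok, only_reference)
```

A grade that appears in the second list is represented by its reference sample alone, which is a result that clustering without reference samples cannot produce when the number of classes is fixed in advance.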

Conclusion
Setting a reference sample system avoids the situation in which the same EIT enterprise receives different assessment results when it is clustered together with different groups of EIT enterprises of differing qualifications, and thus increases the objectivity of the credit assessment of EIT enterprises. At the same time, the example shows that the distortion that may be caused by a random selection of the initial fuzzy classification matrix can be avoided by the improved method, which pre-sets the reference sample system and determines the initial clustering center matrix from it. This increases the reliability of fuzzy clustering analysis of the credit risk of EIT enterprises.
et al., "The Evolution r Emerging Technology ence Press, Beijing, 2010.(in Chinese)

Table 2: The 5-level reference sample system (K, L, M, N and O)
Secondly, discretize the index data of the EIT enterprises. This step is carried out with the Entropy Scalar tool of the Rosetta software, which measures entropy. The next step is attribute reduction. With the help of the Genetic Algorithm tool in Rosetta, we obtain the reduced set of credit risk indicators for these 10 EIT enterprises.

Table 3: Clustering centers of the credit ratings, used together with the principle of optimal clustering center to obtain the classification results