J. Software Engineering & Applications, 2010, 3, 1027-1031
doi:10.4236/jsea.2010.311120 Published Online November 2010 (http://www.SciRP.org/journal/jsea)
Copyright © 2010 SciRes. JSEA

A Design of Incremental Granular Network for Software Data Modeling

Keun-Chang Kwak
Department of Control, Instrumentation and Robot Engineering, Chosun University, Gwangju, Korea (South)
Email: kwak@chosun.ac.kr

Received September 6th, 2010; revised September 27th, 2010; accepted October 5th, 2010.

ABSTRACT

In this paper, we propose an incremental method of Granular Networks (GN) to construct a conceptual and computational platform of Granular Computing (GrC). The essence of this network is to describe the associations between information granules, including fuzzy sets, formed in both the input and output spaces. The context within which such relationships are formed is established by the system developer. Here, information granules are built using Context-driven Fuzzy Clustering (CFC). This clustering develops clusters by preserving the homogeneity of the clustered patterns associated with the input and output spaces. The experimental results on the software modules of the well-known Medical Imaging System (MIS) revealed that the incremental granular network performed well in comparison with previously reported models.

Keywords: Incremental Granular Network, Granular Computing, Information Granules, Context-Based Fuzzy Clustering

1. Introduction

Granular Computing (GrC) is a general computation theory for effectively using information granules such as classes, clusters, subsets, groups, and intervals to build an efficient computational model for complex applications with huge amounts of data, information and knowledge [1]. Furthermore, granular computing forms a unified conceptual and computing platform.
Yet, it directly benefits from the already existing and well-established concepts of information granules formed in the setting of set and interval theory, fuzzy sets, rough sets, and shadowed sets [2].

In order to form the conceptual and computing platform of granular computing, we introduce two types of granular network that directly use the fundamental idea of fuzzy clustering. Based on this network, we also develop and design an incremental granular network that combines linear regression and a local granular network [3]. First, we build a standard regression model, which can be treated as a preliminary construct that captures the linear part of the data and in this way forms the backbone of the entire construct. Next, all modeling discrepancies are compensated by a collection of rules that become attached to the regions of the input space in which the error is localized. Here the network is designed by the use of fuzzy granulation realized via context-based fuzzy clustering [4]. This clustering technique builds information granules in the form of fuzzy sets and develops clusters by preserving the homogeneity of the clustered patterns associated with the input and output spaces. The effectiveness of this clustering has been demonstrated on Linguistic Models (LM) [5,6], Radial Basis Function Neural Networks (RBFNNs) [7], and incremental models [8]. These models represented nonlinear and complex characteristics more effectively than conventional models based on context-free clustering.

This paper is organized as follows. Section 2 describes the architecture of the two types of granular network and the mechanism of context-based fuzzy clustering. In Section 3, we present the design of the incremental granular network. This network is applied to the software modules of the well-known Medical Imaging System (MIS) [9] in Section 4. Finally, conclusions are given in Section 5.

2. Granular Network

Let us first recall the mechanism of context-based fuzzy clustering. This clustering, an interesting variant of fuzzy c-means, is realized via individual contexts. Each context has clearly defined semantics that can be interpreted as a large negative error, medium negative error, etc. Consider a certain fixed context $W_l$ described by some membership function. The data point in the output space is then associated with the corresponding membership value. Let us introduce a family of the partition matrices induced by the $l$-th context and denote it by $U(W_l)$:

$$U(W_l) = \left\{ u_{ik} \in [0,1] \;\middle|\; \sum_{i=1}^{c} u_{ik} = w_{lk} \ \ \forall k \ \ \text{and} \ \ 0 < \sum_{k=1}^{N} u_{ik} < N \ \ \forall i \right\} \tag{1}$$

where $w_{lk}$ denotes a membership value of the $k$-th datum implied by the $l$-th context. The underlying objective function is as follows

$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \| \mathbf{x}_k - \mathbf{v}_i \|^2 \tag{2}$$

where $\mathbf{v}_i$ denotes the $i$-th prototype. $Q$ is minimized under the constraints imposed by (1):

$$\min Q \quad \text{subject to} \quad U(W_l), \quad l = 1, 2, \ldots, p \tag{3}$$

The minimization of $Q$ is realized by iteratively updating the values of the partition matrix and the cluster centers. The successive updates of the partition matrix are completed as follows

$$u_{ik} = \frac{w_{lk}}{\displaystyle\sum_{j=1}^{c} \left( \frac{\| \mathbf{x}_k - \mathbf{v}_i \|}{\| \mathbf{x}_k - \mathbf{v}_j \|} \right)^{\frac{2}{m-1}}} \tag{4}$$

where $i = 1, 2, \ldots, c$ and $k = 1, 2, \ldots, N$. Note that $u_{ik}$ denotes an entry of the partition matrix induced by the $l$-th context. The prototypes are determined as

$$\mathbf{v}_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, \mathbf{x}_k}{\sum_{k=1}^{N} u_{ik}^{m}} \tag{5}$$

We assume that the fuzzification factor $m$ is 2.0. In the design of the granular network, we consider the contexts to be described by triangular membership functions equally distributed in the error space E, with a 1/2 overlap between two successive fuzzy sets. Figure 1 visualizes an example of a blueprint of the incremental granular network for p = 3 and c = 2. Each context generates a number of induced clusters whose activation levels are afterwards summed up as shown in Figure 2. Denoting those by $\xi_1, \xi_2, \ldots, \xi_n$, the output of the network is granular.
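The update rules in Equations (4) and (5) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `context_fcm`, the random initialization, the stopping tolerance, and the synthetic data are all introduced here for demonstration only.

```python
import numpy as np

def context_fcm(X, w, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Context-based (conditional) fuzzy c-means for a single context.

    X: (N, d) input data; w: (N,) context memberships w_lk in [0, 1].
    Returns prototypes V of shape (c, d) and partition matrix U of shape (c, N).
    """
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Random start whose columns already satisfy sum_i u_ik = w_k (Eq. 1).
    U = rng.random((c, N))
    U = U / U.sum(axis=0) * w
    for _ in range(n_iter):
        Um = U ** m
        # Eq. (5): prototypes as membership-weighted means of the data.
        V = (Um @ X) / np.maximum(Um.sum(axis=1, keepdims=True), 1e-12)
        # Distances ||x_k - v_i||, shape (c, N); the floor avoids division by zero.
        D = np.maximum(np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2), 1e-12)
        # Eq. (4): u_ik = w_k / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        ratio = (D[:, None, :] / D[None, :, :]) ** (2.0 / (m - 1.0))
        U_new = w / ratio.sum(axis=1)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

# Hypothetical usage on synthetic data (not the MIS data set).
rng = np.random.default_rng(1)
X_demo = rng.random((50, 2))      # 50 points in a 2-D input space
w_demo = rng.random(50)           # memberships implied by one context
V_demo, U_demo = context_fcm(X_demo, w_demo, c=3)
```

The only difference from standard fuzzy c-means is the numerator of Eq. (4): each column of the partition matrix sums to the context membership of that datum instead of to 1, so data that barely belong to the context have little influence on the prototypes.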
Assuming the triangular form of the contexts, the result is a triangular fuzzy number E as follows

$$E = W_1 \otimes \xi_1 \oplus W_2 \otimes \xi_2 \oplus \cdots \oplus W_n \otimes \xi_n \tag{6}$$

We denote the algebraic operations by $\otimes$ and $\oplus$ to emphasize that the underlying computing operates on a collection of fuzzy numbers. As such, E is characterized by its three parameters: a modal value, a lower bound, and an upper bound.

Figure 1. Concept of context-based fuzzy clustering.

Figure 2. Architecture of the granular network (case 1).

On the other hand, we develop the advanced granular network with detailed linguistic contexts as shown in Figure 3. The consequent part is obtained by the Constrained Least Square Estimate (CLSE) method as follows

$$\min_{\theta} \| Y - U\theta \| \quad \text{subject to} \quad \min(Y) \le \theta \le \max(Y) \tag{7}$$

where $U$ and $Y$ denote the activation levels in layer 2 and the actual output, respectively. The parameters to be estimated, $\theta$, are the modal values of the detailed linguistic contexts. For further details on the CLSE method, see [10].

Figure 3. Architecture of the granular network (case 2).

Figure 4. Overall flow of the incremental granular network.

3. Design of Incremental Granular Network

The main design process of the incremental granular network is shown in Figure 4, which illustrates how the two functional modules operate.
First, we decide upon the granularity of information to be used in the development of the model, namely the number of contexts and the number of clusters formed for each context. The design procedure of the incremental granular network is as follows [8].

[Step 1] Design of a linear regression in the input and output space, $z = L(\mathbf{x}; \mathbf{b})$, with $\mathbf{b}$ denoting the vector of coefficients of the regression hyperplane, $\mathbf{b} = [\mathbf{a} \ a_0]^T$. On the basis of the original data set, a collection of input-error pairs $(\mathbf{x}_k, e_k)$ is formed, where $e_k = \text{target}_k - L(\mathbf{x}_k; \mathbf{b})$.

[Step 2] Construction of the collection of contexts in the space of error of the regression model, $E_1, E_2, \ldots, E_p$. The distribution of these fuzzy sets is optimized through the use of fuzzy equalization, while the fuzzy sets are characterized by triangular membership functions with a 0.5 overlap between neighboring fuzzy sets.

[Step 3] Context-based fuzzy clustering completed in the input space and induced by the individual fuzzy sets of context. For p contexts and c clusters per context, c × p clusters are obtained.

[Step 4] Summation of the activation levels of the clusters induced by the corresponding contexts and their overall aggregation through weighting by the fuzzy sets of the contexts, leading to the triangular fuzzy number of the output, $E = F(\mathbf{x}; E_1, E_2, \ldots, E_p)$, where $F$ denotes the overall transformation realized by the incremental granular network. Furthermore, note that we eliminate any systematic shift of the results by adding a numeric bias term.

[Step 5] The result of the incremental granular network is then combined with the output of the linear part. The result is a shifted triangular number $Y$, $Y = z \oplus E$.

4. Experimental Results

In order to evaluate the performance of the incremental granular network for data modeling in software engineering, we applied it to the well-known Medical Imaging System (MIS) subset of 390 software modules written in Pascal and FORTRAN [9]. These modules consist of approximately 40,000 lines of code.
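Before turning to the experimental settings, the five-step design procedure of Section 3 can be sketched end to end. The code below is an illustrative sketch under several assumptions that are not taken from the paper: the contexts are spaced uniformly over the residual range (instead of being placed by fuzzy equalization), activation levels are computed FCM-style across all c × p prototypes, the numeric bias is the mean remaining error on the training data, and the data are synthetic rather than the MIS modules. The names `cfcm`, `fcm_activation`, `fit_ign`, and `predict_ign` are hypothetical.

```python
import numpy as np

def fcm_activation(X, V, m=2.0):
    """FCM-style memberships of the rows of X in prototypes V; each row sums to 1."""
    D = np.maximum(np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2), 1e-12)
    inv = D ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)        # shape (N, n_prototypes)

def cfcm(X, w, c, m=2.0, n_iter=50, seed=0):
    """Context-based FCM for one context, Eqs. (4)-(5); returns prototypes (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U = U / U.sum(axis=0) * w                          # columns sum to w_k, Eq. (1)
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / np.maximum(Um.sum(axis=1, keepdims=True), 1e-12)  # Eq. (5)
        U = w * fcm_activation(X, V, m).T              # Eq. (4)
    return V

def predict_ign(model, X):
    """Return an (N, 3) array: lower bound, modal value, upper bound of Y."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    z = A @ model["b"]                                 # linear part
    act = fcm_activation(X, model["V"], model["m"])   # (N, c*p)
    xi = act.reshape(X.shape[0], -1, model["c"]).sum(axis=2)  # Step 4: (N, p)
    E = xi @ model["tri"]                              # Eq. (6), triangular numbers
    return z[:, None] + E + model["bias"]              # Step 5: Y = z (+) E

def fit_ign(X, y, p=3, c=2, m=2.0):
    """Fit the sketch; requires p >= 2 and a non-constant regression residual."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)          # Step 1: b = [a, a0]
    e = y - A @ b                                      # input-error pairs (x_k, e_k)
    modal = np.linspace(e.min(), e.max(), p)           # Step 2 (uniform, assumption)
    step = modal[1] - modal[0]
    # Triangular contexts with 1/2 overlap; W has shape (p, N).
    W = np.maximum(0.0, 1.0 - np.abs(e[None, :] - modal[:, None]) / step)
    V = np.vstack([cfcm(X, W[l], c, m, seed=l) for l in range(p)])  # Step 3
    tri = np.stack([modal - step, modal, modal + step], axis=1)     # (p, 3)
    model = {"b": b, "V": V, "tri": tri, "c": c, "m": m, "bias": 0.0}
    # Numeric bias (assumption): mean remaining error of the modal value.
    model["bias"] = float(np.mean(y - predict_ign(model, X)[:, 1]))
    return model

# Hypothetical usage on synthetic data with a nonlinear component.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = 3.0 * X[:, 0] - X[:, 1] + np.sin(6.0 * X[:, 0]) + 0.05 * rng.standard_normal(200)
model = fit_ign(X, y, p=3, c=2)
Y = predict_ign(model, X)
rmse = float(np.sqrt(np.mean((y - Y[:, 1]) ** 2)))     # RMSE of the modal value
```

Because the summed activation levels of each input add up to 1, the output E is a convex combination of the context triangles, so the lower bound, modal value, and upper bound of Y keep their order for every input.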
We use 11 system input variables: LOC, CL, TChar, TComm, MChar, DChar, N, Nh, NF, V(G), and BW. The output variable to be predicted is "Changes". The training and testing data sets are randomly selected with a 60%-40% split. The experiments are performed over 10 runs. The training data set is used for model construction, while the test set is used for model validation. Thus, the resultant model is not biased toward the training data set and is likely to have a better generalization capacity to new data. We obtained the best case (m = 3.0, p = c = 6) while varying the number of clusters ($2 \le p \le 6$) and the fuzzification factor (m = 1.5, 2.0, 2.5, 3.0).

Figure 5 and Figure 6 show the contexts (Case 1) and consequent parameters (Case 2) obtained from the linear regression error, respectively. Figure 7 shows the prediction performance of the incremental granular networks. Figure 8 visualizes the distribution of clusters and some input data. Table 1 lists the experimental comparison in terms of RMSE (root mean square error). In the design of the LM, we used six contexts and six clusters in each context for context-based fuzzy clustering. Although the LM has a structured knowledge representation in the form of fuzzy if-then rules, it lacks the adaptability needed to deal with nonlinear models. Moreover, we constructed the RBFN based on six contexts and six clusters in the same manner. Here the learning rate is 0.0001 and the number of epochs is 1000. As listed in Table 1, the proposed method (IGN with two cases) achieved lower training and testing RMSE than the linguistic model and the RBFNN based on context-based fuzzy clustering.

Figure 5. Contexts obtained from linear regression error (Case 1). Axes: error (horizontal), degree of membership (vertical).

Figure 6. Consequent parameters (Case 2).

Figure 7. Prediction performance for MIS data. Axes: number of data (horizontal), Changes (vertical); curves: model output, actual output.

Figure 8. Distribution of clusters and input data (DChar, N). Panels: W1 to W6 (c = 6 in each).

Table 1. Performance comparison (prediction performance).

| Methods | Train_RMSE | Check_RMSE |
|---|---|---|
| LM [4] | 6.266 | 7.981 |
| RBFN [6] | 6.631 | 7.772 |
| IGN (Case 1) | 4.626 | 6.624 |
| IGN (Case 2) | 3.770 | 6.532 |

5. Conclusions

We presented the design of the incremental granular network for software data of a medical imaging system. This network adopts a linear regression construct as a first-principle global model and refines it through a series of local fuzzy rules that capture the remaining and more localized nonlinearities of the system. More schematically, we could articulate the essence of the resulting incremental granular network by stressing the existence of the two essential modeling structures that are combined: linear regression and a local granular network. The experimental results revealed that the incremental granular network achieved lower RMSE than the LM and RBFN models used for comparison. The granular networks used in this paper can be applied to intelligent data analysis, nonlinear system modeling, adaptive hypermedia, e-commerce, and intelligent interfaces.

REFERENCES

[1] W. Pedrycz, A. Skowron and V. Kreinovich, "Handbook of Granular Computing," John Wiley & Sons, Hoboken, 2008.
[2] W. Pedrycz and F. Gomide, "Fuzzy Systems Engineering: Toward Human-Centric Computing," Wiley-Interscience, Hoboken, 2007.
[3] M. Y. Lee and K. C. Kwak, "An Incremental Granular Network for Data Modeling in Software Engineering," 2010 4th International Conference on New Trends in Information Science and Service Science (NISS), Gyeongju, Korea, May 2010, pp. 495-498.
[4] W. Pedrycz, "Conditional Fuzzy C-Means," Pattern Recognition Letters, Vol. 17, No. 6, May 1996, pp. 625-632.
[5] W. Pedrycz and A. V. Vasilakos, "Linguistic Models and Linguistic Modeling," IEEE Transactions on Systems, Man and Cybernetics-Part C, Vol. 29, No. 6, 1999, pp. 745-757.
[6] W. Pedrycz and K. C. Kwak, "Linguistic Models as Framework of User-Centric System Modeling," IEEE Transactions on Systems, Man and Cybernetics-Part A, Vol. 36, No. 4, 2006, pp. 727-745.
[7] W. Pedrycz, "Conditional Fuzzy Clustering in the Design of Radial Basis Function Neural Networks," IEEE Transactions on Neural Networks, Vol. 9, No. 4, 1999, pp. 745-757.
[8] W. Pedrycz and K. C. Kwak, "The Development of Incremental Models," IEEE Transactions on Fuzzy Systems, Vol. 15, No. 3, 2007, pp. 507-518.
[9] S. K. Oh, W. Pedrycz and B. J. Park, "Self-Organizing Neurofuzzy Networks in Modeling Software Data," Fuzzy Sets and Systems, Vol. 145, No. 1, July 2004, pp. 165-181.
[10] J. Abonyi, R. Babuska and F. Szeifert, "Fuzzy Modeling with Multivariate Membership Functions: Gray-Box Identification and Control Design," IEEE Transactions on Systems, Man and Cybernetics-Part B, Vol. 31, No. 5, 2001, pp. 755-767.