1. Introduction
In recent years, a growing number of complexity measures for UML class diagrams have been proposed in the literature. They play an important role in software development, testing and maintenance, and provide guidance for developing high-quality software. Some of these measures only count the numbers of attributes, methods and relationships among classes [1], and are therefore simple; others are based on entropy-distance [2]-[4] and are relatively complicated. Later researchers carried out empirical validations to demonstrate their advantages [2] [5]-[8]. Because simple and complex metrics each have their own advantages and disadvantages, it is difficult for users to choose a suitable one in practice.
To help users determine which kind is better, simple or complex metrics, this paper analyzes and compares four typical metrics for UML class diagrams from the viewpoint of experimental software engineering. Understandability, analyzability and maintainability were classified and predicted for 27 class diagrams related to a banking system [1] by means of algorithm C5.0 within the framework of the Weka [9] toolkit.
The remainder of this paper is organized as follows. Section 2 reviews related complexity measures for UML class diagrams and typical empirical validation work. Section 3 classifies and predicts understandability, analyzability and maintainability based on four typical UML class diagram metrics. Finally, conclusions are drawn in Section 4.
2. Measuring Complexities of UML Class Diagrams
To date, many complexity measures for UML class diagrams have been proposed. They can be divided into two groups, namely simple and complex. This paper considers one simple set of metrics, Genero’s metrics, and three complex ones, namely Zhou’s metric, Yi’s metrics and Wu’s metrics.
2.1. Genero’s Metrics [1]
Genero held that attributes, methods and relationships among classes all have an impact on the complexity of UML class diagrams; hence Genero’s metrics focus on counting the numbers of attributes, methods and relationships among classes, and on the depth of the class tree. Genero’s metrics comprise the numbers of classes, attributes, methods, association relationships, aggregation relationships, generalization relationships, dependency relationships, classes depended on by other classes, and classes that depend on other classes (abbreviated as NC, NA, NM, NAssoc, NAgg, NGen, NDep, NDepOut and NDepIn, respectively); the maximum length of the longest path from a class to its root and from a class to its leaves (abbreviated as MaxDIT and MaxHagg, respectively); and the numbers of generalization hierarchies and aggregation hierarchies (abbreviated as NGenH and NAggH, respectively).
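The counting nature of these metrics can be illustrated with a minimal sketch. The data model (`UmlClass`, relationship tuples) and the toy banking diagram below are assumptions made for illustration and are not taken from [1]; only a subset of the metrics is computed, and the generalization graph is assumed to be acyclic.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    """A class in the diagram (hypothetical, simplified representation)."""
    name: str
    attributes: list = field(default_factory=list)
    methods: list = field(default_factory=list)


def size_metrics(classes, relationships):
    """Count classes, attributes, methods and relationships by kind.

    `relationships` is a list of (kind, source, target) tuples, where kind is
    "association", "aggregation", "generalization" or "dependency".
    """
    kinds = Counter(kind for kind, _, _ in relationships)
    return {
        "NC": len(classes),
        "NA": sum(len(c.attributes) for c in classes),
        "NM": sum(len(c.methods) for c in classes),
        "NAssoc": kinds["association"],
        "NAgg": kinds["aggregation"],
        "NGen": kinds["generalization"],
        "NDep": kinds["dependency"],
    }


def max_dit(relationships):
    """Longest child-to-root path over generalization edges (child, parent)."""
    parents = {}
    for kind, child, parent in relationships:
        if kind == "generalization":
            parents.setdefault(child, []).append(parent)

    def depth(name):
        # A class without a parent is a root and has depth 0.
        return max((1 + depth(p) for p in parents.get(name, [])), default=0)

    return max((depth(child) for child in parents), default=0)


# Toy banking diagram, invented for illustration only.
classes = [
    UmlClass("Account", ["number", "balance"], ["deposit", "withdraw"]),
    UmlClass("SavingsAccount", ["rate"], ["addInterest"]),
    UmlClass("Customer", ["name"], ["openAccount"]),
    UmlClass("Bank", ["name"], []),
]
relationships = [
    ("generalization", "SavingsAccount", "Account"),
    ("association", "Customer", "Account"),
    ("aggregation", "Bank", "Customer"),
]

metrics = size_metrics(classes, relationships)
metrics["MaxDIT"] = max_dit(relationships)
print(metrics)
```

Because every value is a plain count or a path length, such metrics can be obtained directly from the diagram without any weighting decisions.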
2.2. Zhou’s Metric [2]
Unlike Genero, Zhou claimed that the attributes and methods of classes have little impact on the complexity of UML class diagrams, and held that relationships among classes are the key factor. Moreover, he assumed that relationships among classes are random. First, UML class diagrams are transformed into class dependency graphs by Baudry’s transformation rules [10]. Second, different weight values are manually assigned to the various kinds of relationships. Finally, the complexity of UML class diagrams is measured based on entropy-distance.
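The second and third steps can be pictured with a rough sketch. The exact entropy-distance formula is defined in [2] and is not reproduced here; the code below uses plain Shannon entropy over the weight share of each relationship as a stand-in, and the weight values in `WEIGHTS` are hypothetical.

```python
import math

# Hypothetical manual weights per relationship kind (not the values from [2]).
WEIGHTS = {
    "dependency": 1.0,
    "association": 2.0,
    "aggregation": 3.0,
    "generalization": 4.0,
}


def weighted_entropy(edges, weights=WEIGHTS):
    """Shannon entropy (in bits) of the weight share carried by each edge.

    `edges` is a list of (kind, source, target) tuples of a class dependency
    graph. This is an illustrative stand-in, not Zhou's entropy-distance.
    """
    edge_weights = [weights[kind] for kind, _, _ in edges]
    total = sum(edge_weights)
    if total == 0:
        return 0.0
    shares = [w / total for w in edge_weights if w > 0]
    return -sum(p * math.log2(p) for p in shares)


# Two equally weighted relationships share the total weight evenly.
edges = [("association", "Customer", "Account"),
         ("association", "Bank", "Customer")]
print(weighted_entropy(edges))
```

The sketch makes the dependence on manually chosen weights explicit: changing `WEIGHTS` changes the resulting value even though the diagram stays the same.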
2.3. Yi’s Metrics [3]
In contrast to Zhou, Yi held that not only relationships among classes but also attributes and methods affect the complexity of UML class diagrams, and furthermore considered their public, private and protected properties. Yi’s metrics are made up of three seed metrics, namely EDCRC, EDCAC and EDCMC, which together make up EDCC. These four metrics are collectively called Yi’s metrics. Zhou’s metric and EDCRC share the same framework; the latter is a modified version of the former. Yi’s metrics are also based on entropy-distance.
Despite the advantages of Zhou’s metric and Yi’s metrics, no consensus has yet been reached on what weight values should be assigned to relationships, and different researchers hold different views. Some regard the weight of an association relationship as smaller than that of an aggregation relationship; others believe that association and aggregation should have the same weight.
2.4. Wu’s Metrics [4]
To overcome the above-mentioned shortcoming of Zhou’s metric and Yi’s metrics, Wu proposed a novel method for measuring the complexity of UML class diagrams (abbreviated as UMLDMCN) based on data mining and complex networks. Wu’s metrics are also made up of three seed metrics, namely EDCRC, EDCAC and EDCMC, which together make up UMLDMCN. These four metrics are collectively called Wu’s metrics. Wu’s metrics and Yi’s metrics share the same framework. EDCAC and EDCMC in Wu’s metrics are exactly the same as those in Yi’s metrics, whereas EDCRC in Wu’s metrics is an improved version of that in Yi’s metrics: the weight values of relationships are computed automatically by the PageRank algorithm.
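For readers unfamiliar with PageRank, the following minimal sketch computes PageRank scores on a small class dependency graph by power iteration. The graph, the damping factor of 0.85 and the iteration count are assumptions for illustration; how the scores are turned into relationship weights is defined in [4] and is not shown here.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    `graph` maps each node to the list of nodes it points to.
    Returns a dict mapping each node to its score; scores sum to 1.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for v in nodes:
            targets = graph.get(v, [])
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


# Toy class dependency graph, invented for illustration only.
graph = {
    "Customer": ["Account"],
    "SavingsAccount": ["Account"],
    "Bank": ["Customer"],
    "Account": [],
}
scores = pagerank(graph)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy graph the class that most other classes point to receives the highest score, which is the property that makes PageRank a candidate for deriving weights from the diagram itself rather than from manual judgment.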
2.5. Related Comparative Research of Typical UML Class Diagram Metrics
Several empirical validations have been conducted to analyze and compare the above metrics systematically and in depth.
Reference [1] performed experiments and concluded that there is a statistically significant correlation between some of Genero’s metrics (namely NC, NA, NM, NAgg and NGen) and understandability; between NC, NA, NM and NGen and analyzability; and between NC, NA and NM and modifiability. NDep is the only one that has a lesser correlation.
Reference [5] compared Marchesi’s, Genero’s, In’s, Rufai’s and Zhou’s metrics in terms of viewpoints, types of relationships, types of metric values, complexity, and theoretical and empirical validation. The results showed that each of these metrics has shortcomings, while being effective or efficient for systems with certain characteristics.
Reference [6] validated Zhou’s metric using twenty-seven UML class diagrams related to bank information systems. The results showed that Zhou’s metric is perfectly positively correlated with understandability, analyzability and modifiability, respectively.
Reference [7] compared Marchesi’s, Genero’s and Yi’s metrics both theoretically and experimentally on an Internet banking system, in terms of viewpoints, types of relationships, types of metric values and complexity. The results showed that each of these metrics has shortcomings, while being effective or efficient for systems with certain characteristics.
Reference [8] compared the advantages and disadvantages of Genero’s and Zhou’s metrics on twenty-seven UML class diagrams related to bank information systems. Understandability, analyzability and maintainability were classified and predicted by means of algorithm C5.0 in the tool SPSS Clementine. The results showed that Genero’s metrics achieve higher classification accuracy than Zhou’s metric.
In short, existing empirical validations pay attention to particular metrics rather than to kinds of metrics. When a different particular metric is considered, users still do not know how to choose. This paper groups the numerous metrics into two kinds, namely simple and complex metrics. Once it is established that simple metrics are better than complex metrics, or vice versa, choosing a particular metric is no longer confusing.
3. Comparative Study on Classification and Prediction of Typical UML Class Diagram Metrics
3.1. Dataset
To allow a direct comparison with References [6]-[8], this paper also uses twenty-seven UML class diagrams related to bank information systems as its experimental objects. Table 1 shows the metric values computed by the four kinds of metrics. The values of understandability, analyzability and maintainability were determined manually, according to their own experience, by twenty-four third-year computer science students in the Department of Computer Science at the University of Castilla-La Mancha in Spain and twenty-six fourth-year computer science students in Italy. From Table 1 alone, we cannot determine which kind of metrics is better.
3.2. Classifier
This paper chose algorithm C5.0 within the framework of the Weka toolkit as the classifier. The default parameters of J48 were adopted, namely -C 0.25 -M 2 (a pruning confidence factor of 0.25 and a minimum of 2 instances per leaf).
3.3. Evaluation Criteria
Many evaluation criteria have been used in the literature. In this paper we chose the rate of correctly classified instances (Correctly), TP Rate, FP Rate, Precision, Recall, F-Measure and AUC [11].
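As a reminder of how these criteria are defined, the following sketch computes them for a two-class problem. The labels, predictions and scores are invented for illustration and are not taken from the experiments; the function names are hypothetical, and the multi-class averaging performed by a toolkit is not reproduced.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, TP rate, FP rate, precision, recall and F-measure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    precision = safe_ratio(tp, tp + fp)
    recall = safe_ratio(tp, tp + fn)
    return {
        "correctly": safe_ratio(tp + tn, len(pairs)),
        "tp_rate": recall,  # TP rate and recall coincide
        "fp_rate": safe_ratio(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": safe_ratio(2 * precision * recall, precision + recall),
    }


def auc(y_true, scores, positive=1):
    """Probability that a random positive is scored above a random negative.

    Ties count as one half. Raises ValueError if one class is missing.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example data.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

report = evaluate(y_true, y_pred)
report["auc"] = auc(y_true, y_score)
print(report)
```

Note that AUC is computed from the ranking of scores rather than from the hard predictions, which is why it can disagree with the other indicators.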
3.4. Experimental Parameters
Since the ultimate goal of this paper is to compare the performance of simple and complex metrics, 6*3 sets of experiments were conducted. The classification and prediction performance of Genero’s metrics, Zhou’s metric, EDCC and UMLDMCN was compared. In addition, the classification and prediction performance of Genero’s, Zhou’s, Yi’s and Wu’s metrics was compared.
3.5. Results and Discussion
This section provides a detailed report of our experimental results. This paper orders Genero’s metrics, Zhou’s metric, Yi’s metrics and Wu’s metrics from simple to complex. Table 2 describes their classification and prediction performance.
Table 2. Classification and prediction performance
It can be seen from Table 2 that, for classifying and predicting understandability, Genero’s metrics achieve the best values of all performance indicators (namely Correctly, TP Rate, Precision, Recall, F-Measure and AUC) among Genero’s metrics, Zhou’s metric, Yi’s metrics and Wu’s metrics. For classifying and predicting analyzability, Genero’s metrics achieve the best values of the main performance indicators (namely Correctly, TP Rate, Precision, Recall and F-Measure), with AUC the only exception. These results indicate that the performance of simple metrics is better than that of complex metrics.
Table 2 also shows that the performance of Genero’s metrics for classifying and predicting maintainability is neither the best nor the worst. This indicates that the performance of simple metrics is not inferior to that of some complex metrics.
In summary, the experimental results suggest that the performance of simple metrics is not inferior to that of complex metrics, and in some cases is even better than that of some complex metrics.
4. Conclusion
This paper empirically validated the ability of complexity measures of UML class diagrams to classify and predict understandability, analyzability and maintainability. The experimental results showed that the performance of simple metrics is not inferior to that of complex metrics, and in some cases is even better. This observation, which is also confirmed by the experiments reported in a previous study [8], can provide practical guidance for users in selecting suitable complexity measures for UML class diagrams.
Acknowledgements
This work has been partially supported by the Natural Science Foundation of China (Project No. 61163007, 61262010), Natural Science Foundation of Jiangxi (Project No. 20142BAB207010, 20114BAB211019, 20132BAB201036) and Scientific Research Foundation of Jiangxi Provincial Education Department (Project No. GJJ12731, GJJ13305).