Three design principles are prominent in software development-encapsulation, data hiding, and separation of concerns. These principles are used as subjective quality criteria for both procedural and object-oriented applications. The purpose of research is to quantify encapsulation, data hiding, and separation of concerns is quantified using cyclomatic-based metrics. As a result of this research, the derived design metrics, coefficient of encapsulation, coefficient of data hiding, and coefficient of separation of concerns, are defined and applied to production software indicating whether the software has low or high encapsulation, data hiding, and separation of concerns.
The importance of software development has never been more critical. Software is a driving force in product and information technology as it provides functionality for a broad range of computer platforms. Today, the sheer volume of software is immeasurable. Existing application portfolios grow annually through the introduction of new applications and maintenance of existing applications. An overriding force in this growth is software quality. When developing software, a software developer is aware of how critical high quality is to every software solution. Sustaining quality in today’s large software inventories is an ongoing challenge.
Three proven design principles, encapsulation, data hiding and separation of concern, have a long and successful history in the software development discipline. They have been used to assist software developers when designing software and assessing the quality of the software design. When packaging functionality, software developers seek to encapsulate process with data so that the functionality is constructed logically correct and highly maintainable. When process and data are constructed to hide data, the design properties promote protected access and reduced risks to data. Further, when multiple software components are developed with separation of concerns, the influence across software components is less, and these components standalone without creating ripple effects that are often unknown or unpredictable. Software with good encapsulation, data hiding and separation of concerns are more desirable than the alternative—poor encapsulation, data hiding and separation of concern.
The problem with these design principles is their inheritance subjectivity. While software engineers seek to achieve good encapsulation, data hiding, and separation of concern, the assessment of these design principles is subjective. What is good? What is poor? Multiple software engineers can look at the same software target and assess the design quality to be poor, medium, or high. Currently, these design principles lack a fact basis for assessing the quality level. In this research, McCabe metrics are utilized to derive objective measures for the level of encapsulation, data hiding and separation of concern in a software application. Using these software metrics, a software engineer can achieve the following:
1) An objective measurement for encapsulation, data hiding, and separation;
2) A repeatable measurement for a targeted software application;
3) A measurement understandable by software engineers and management;
4) A measurement derived from a classical set of McCabe metrics that have recognition and creditability in the software engineering discipline.
Software developers and engineers should consider proven design principles when designing and constructing software. Software design principles serve as the foundation for analytical reasoning regarding the quality of a software solution. A good software design facilitates the creation of software code and reduces the time to maintain the software while achieving execution performance. The objectives of this article include two elements. The first is to define three quantitative design metrics: encapsulation, data hiding, and separation of concerns coefficients. This quantification increases the value of these principles when creating and maintaining software. The second element is to illustrate these metrics using software examples and demonstrate how they provide valuable objective insight into software quality. To accomplish the stated objectives, McCabe metrics are applied to three proven design principles: encapsulation, data hiding, and separation of concern. Cyclomatic complexity, module design complexity, global data complexity, and specified data complexity for local data are used to quantify these design principles.
Important research has been performed on software metrics. Metrics have been shown to predict faults [
These software metrics are powerful, and the Chidamber and Kemerer metric suite has become the gold standard for object oriented metrics. Intuitively, they demonstrate measurable characteristics of an object oriented program. For the general design environment, they describe measurable attributes about complexity,
Metric | Acronym | Description | Author |
---|---|---|---|
Number of public methods | NPM | A count of public methods in a class | Bansiya & Davis |
Data access metric | DAM | The ratio of private attributes to the total attributes declared in a class | Bansiya & Davis |
Measure of aggregation | MOA | A count of the number of class fields who types are user defined classes; measures the part-whole relationship | Bansiya & Davis |
Measure of functional abstraction | MFA | The ratio of the number of methods inherited by a class to the total number of methods accessible by the member methods on the class | Bansiya & Davis |
Cohesion among methods of class | CAM | The summation of the number of different types of method parameters in every method divided by a multiplication of the number of different method parameter types in whole class and number of methods; the relatedness of methods of a class based on the parameter list of the methods | Bansiya & Davis |
Weighted methods per class | WMC | The number of methods in the class | Chidamber & Kemerer |
Depth of inheritance tree | WIT | The inheritance levels form the top object hierarchy top | Chidamber & Kemerer |
Number of children | NOC | The number of immediate descendants of the class | Chidamber & Kemerer |
Coupling between object classes | CBO | The number of classes coupled to a given class as a result of method calls, field accesses, inheritance, method arguments, return types, and exceptions | Chidamber & Kemerer |
Response for a class | RFC | The number of different methods that can be executed when an object of that class receives a message | Chidamber & Kemerer |
Lack of cohesion of methods | LCOM | The sets of methods a class that are not related through sharing of some of the class fields | Chidamber & Kemerer |
Inheritance coupling | IC | The number of parent classes to which a given class is coupled; coupling occurs when 1) an inherited method uses an attribute that is defined in a new or redefined method, 2) one of its inherited methods calls a redefined method or 3) one of its inherited methods is called by a redefined method and uses a parameter that is defined in the redefined model | Tang, Kao & Chen |
Coupling between methods | CBM | The total number of new or redefined methods to which all the inherited methods are coupled | Tang, Kao & Chen |
Average method complexity | AMC | The average method size for each class; size is equal to the number of Java binary codes in the method | Tang, Kao & Chen |
cohesion and coupling. A shortcoming is that they do not map directly to long-standing design principles—encapsulation, data hiding, and separation of concerns. In addition, these design principles apply to all software, not just object oriented code.
As software is designed, there are well-established principles applied by software developers. These principles guide the development effort with the goal of creating quality software that is understandable, testable, and measurable. Three of the most popular design principles are encapsulation, data hiding, and separation of concerns. In Meilir Page-Jones’ book, encapsulation is defined as [
The grouping of related ideas into one unit, which can thereafter be referred to by a single name.
In programming languages, encapsulation refers to one of two related but distinct notions, and sometimes to the combination thereof. First, it is a language construct that facilitates the bundling of data with the modules (or other functions) operating on that data. Secondly, it is language mechanism for restricting access to some of the object’s components.
Sometimes, encapsulation and data hiding are used interchangeably. However, data (information) hiding is a software design principle originally introduced by David L. Parnas. In his paper written almost 40 years ago, Parnas used data hiding to discuss the criteria leading to good modularity in structured programming [
Every module in the [...] decomposition is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings.
Specifically, in object-oriented programming, data hiding shields internal object details (data members). Data hiding ensures exclusive data access to class members and protects object integrity by preventing unintended or intended changes. Decision logic that does not use public global and parameter data indicates strong data hiding (since other objects cannot access this data) and decision logic that uses public global and parameter data indicates weak data hiding (since other objects can access this data).
The term, separation of concerns, was first used by Edsger W. Dijkstra in his 1974 paper “On the Role of Scientific Thought”. Dijkstra wrote that separation of concerns [
Even if not perfectly possible is yet the only available technique for effective ordering of one’s thoughts.
Separation of concerns has grown into a software design characteristic for separating a program into objects such that each object addresses a separate concern. A concern is a logic or a set of information that affects the code of a program. Decision logic that connects modules and uses public global and parameter data indicates weak separation of concerns, since this logic and data is influencing the operation of one or more other objects.
Encapsulation, data hiding, and separation of concerns are design principles that are, sometimes, misunderstood and, at other times, misapplied.
realized, it is accompanied by lower separation of concerns as less local data usage will promote more global data and program control logic that reaches into other modules. Finally, when the intersection of data hiding and separation of concerns is considered, software exhibiting high data hiding will result in high separation of concerns. When lower data hiding is realized, it is accompanied by lower separation of concerns as more global data usage will promote more data and program control logic that reach into other modules.
There are pragmatic implications associated with the relationship among encapsulation, data hiding, and separation of concerns. Note the 7 regions in
In
Region # | Encapsulation | Data Hiding | Separation of Concerns | Tactic When Enhancing, Modifying, Fixing or Reusing |
---|---|---|---|---|
1 | good | good | good | Best chance of success; try not to introduce global data references or call additional modules. |
2 | good | poor | poor | Pinpoint the used global data to see if the modifications affect it. Check how the called modules are affected by the prescribed changes. |
3 | poor | good | poor | Check the effects on called modules. Make sure the changes are not referencing global data. |
4 | poor | poor | good | These ‘leaf’ modules tend to be the most reused—keep it that way by not calling new modules. Be wary of the global data referenced—just pulling the module for reuse without its global data will not work. |
5 | good | poor | good | The main issue is global data—do not try to reuse without its global data environment. |
6 | poor | good | good | Good chance of success here. Not much local data used so this is probably highly computational. Also reusable since not dependent on called modules. |
7 | good | good | poor | This could be a transaction driven module calling several subordinates. Looks clean except for the substructure it calls. When reusing you may need all the modules it calls. |
concerns exist. Let’s examine region 2, this region represents a design with good encapsulation and poor data hiding and separation of concerns. Since this design has good encapsulation, global data and interdependent control logic among modules produce poor data hiding and separation of concerns. When work is conducted on this design type, modifications should pinpoint the risk of global data changes and the affected modules should be closely monitored for ripple effects. Consider another area, region 3. This region represents a design with good data hiding and poor encapsulation and separation of concerns. When implementing software changes to this class of software, it is important to check the effects on called modules and to ensure that changes are not referencing global data. In
The proof in the pudding is the usefulness of these design principles in doing what must be done with software modules—constructing, enhancing, modifying, reusing, understanding, and testing. When a module is modified or reused, design principles provide a sensitivity than can help avoid errors and increase productivity. The above Venn diagram has seven distinct regions delineating unique advice and reuse sensitivity based on their encapsulation, data hiding, and separation of concerns qualities.
Today, the McCabe metric set includes module, design and data metrics. Module level metrics are cyclomatic complexity and essential complexity. Design level metrics are module design complexity, design complexity, and integration complexity. There are two special object-oriented metric derived from the design level metrics. Object design complexity and object integration complexity are special cases of design level metrics calculated for object oriented programs. A third set is data metrics. These metrics are calculated by transferring mathematical concepts of cyclomatic complexity to the data world. Global data complexity is a data metric for public data. Specified data complexity is a metric that supports any selected set of data. The selected set can be public, private, or parameter data, or any combination of these three data types.
In 1976, McCabe introduced a number of software metrics. Cyclomatic complexity, v, is a measure of the number of paths through a program [
1. v(G) = number of predicates + 1
By examining the decision statements, the design predicates can be counted resulting in the cyclomatic complexity metric. It is important to recognize that many decision statements have compound conditions. An example is a compound IF statement [
If A = B and C = D and E = F then
In this example, v(G) is equal to 4 (three IF design predicates + 1). Cyclomatic complexity recognizes that compound predicates increase program logic complexity and integrates complex decision constructs in order to calculate v(G). An upper limit of 10 for a testable software unit is proposed because McCabe’s research found software with v(G) > 10 was less manageable, more difficult to test, and less reliable.
In McCabe’s original article [
2. ev = the cyclomatic complexity of a reduced flowgraph, v(G)
Reduction is completed by logically removing all structured constructs. Essential complexity can range from 1 to v. When ev is low, a module is well-structured and can easily be decomposed into multiple modules. In contrast, when ev is high, a module is not well-structured and cannot be easily be decomposed into multiple modules [
In 1989, McCabe and Butler extended cyclomatic complexity-based metrics into design principles. Module design complexity, iv, is a measurement of the decision structure of a module, G, which controls the invocation of G’s immediate subordinate modules.
3. iv = the cyclomatic complexity of a reduced flowgraph, v(G)
Reduction is completed by logically removing all decision constructs that do not significantly impact calls and returns from subordinate routines. It is a quantification of the testing effort of a module as it calls its subordinates [
After introducing these three metrics in 1989, McCabe and Butler developed two variations of S0 and S1 for testing of object-oriented code. These metrics took advantage of object oriented design efficiencies to reduce the number of design tests. Object design complexity (OS0) is a measurement of the decision structure which controls the invocation of modules within an object-oriented design. It is a quantification of the testing effort of all calls in the design, starting with the top module, trickling down through subordinates and exiting through the top. Object integration complexity (OS1) is a measurement of the integration tests that qualifies the design tree for an object-oriented program including any unresolved module references. A subtree is a sequence of calls from a module to descendant modules and of returns from them. Just as integration complexity design defines the number of test subtrees in the required basis set for that design, OS1 defines the number of linearly independent design subtrees in a basis set for an object-oriented design [
McCabe metrics are not limited to measurement of logic and decision structure. The data domain includes two classes of metrics [
4. gdv = the cyclomatic complexity of a reduced flowgraph, v(G)
5. sdv = the cyclomatic complexity of a reduced flowgraph, v(G)
Reduction is completed by logically removing all decision constructs that do not significantly the use of global or selected data [
Six specified data metrics are used in building design coefficients for encapsulation, data hiding, and separation of concern. Below is the definition of these six metrics.
· local data complexity (ldv): specified data complexity of local (private) data; 0 ≤ ldv ≤ v;
· ldvdensity (ldv/v): high ldvdensity is a design characteristic that indicates high use of local (private) data in module logic; low ldvdensity indicates low use of local (private) data in module logic; 0 ≤ ldvdensity ≤ 1;
· public global data complexity (pgdv): specified data complexity of global (public) data excluding parameter data; 0 ≤ pgdv ≤ v;
· pgdvdensity (pgdv/v): high pgdvdensity is a design characteristic that indicates high use of global (public) data in module logic; low pgdvdensity indicates low use of global (public) data in module logic; 0 ≤ pgdvdensity ≤ 1;
· parameter data complexity (pdv): specified data complexity of parameters (passed arguments) excluding public global data; 0 ≤ pdv ≤ v;
· pdvdensity (pdv/v): high pdvdensity is a design characteristic that indicates high use of parameters in module logic; low pdvdensityindicates low use of parameters in module logic; 0 ≤ pdvdensity ≤ 1.
These specified data metrics and their density metrics form the foundation for design coefficients.
In statistics, correlation is a measure of the degree of relatedness of variables. It can used by a business analyst to determine if variables, such as temperature and ice cream sales, rise and fall in a related manner. For a sample pair, correlation analysis can yield a numerical value that represents the relatedness of temperature and ice cream sales over time. The coefficient of correlation, r, is a popular measure of correlation. The statistic r is the Pearson product-moment correlation, named after Karl Pearson, an English statistician [
Encapsulation, data hiding, and separation of concerns are design concepts without ordinal or interval measurement. Being able to measure software’s design quality is fundamental when assessing the progress of design as it relates to construction. There is now widespread acknowledgement that using subjective design characteristics is a common practice when measuring overall software quality. In practice, these design concepts are measured using subject measurements such as a Likert Scale—very low, low, moderate, high, and very high. However, the Likert Scale is uni-dimensional and only gives limited choices, and the space between each choice is not equidistant. Therefore, a Likert Scale fails to measure the true value of these design concepts. Also, it is likely that software developers’ subjective values are influenced by previous development work or opinions of peers. Frequently, people tend to avoid choosing the extremes options on the scale (very low and very high) because of the negative social implications, even when an extreme choice would be best.
Cyclomatic complexity-based design metrics for encapsulation, data hiding, and separation of concerns are measured as coefficients. Each of these design characteristics has a representative coefficient, Cx, where x is either encapsulation (encap), data hiding (dh), or separation of concerns (soc), and each measures the spread of the design characteristics from low to high. For example, the encapsulation coefficient measures encapsulation based upon cyclomatic complexity-based measurements. Unlike the coefficient of correlation, these design measurements are not a measure of linear correlation between two variables. Rather, they are proxy variables used to replace previous subjective measures for encapsulation, data hiding, and separation of concerns. The coefficients range from 0 to +1, representing the spread of the design characteristics from low to high. When Cx is 0, the design characteristic is the lowest and when Cx is 1, the design characteristic is the highest. Using cyclomatic complexity-based measures to derive measurements for these design characteristics provides several advantages. First, the measurements are derived from McCabe metrics which have an acknowledged place in software metrics. McCabe metrics are also objective in that different software developers looking at the same software calculated the same result. McCabe’s cyclomatic complexity has a research implication with software quality. Research studies have shown that modules with higher cyclomatic complexity (v > 10) are harder to test and exhibit poorer quality [
The encapsulation coefficient is a measurement of the quality of data bundling within a module operating of that data; a measurement of the decision logic that uses (refers to) local data and global data. The coefficient measures the relative magnitude of local data manipulation to the magnitude of combined local and global manipulation. Decision logic that uses local data indicates high encapsulation (since the data is exclusively bundled with the module) and decision logic that uses global data indicates low encapsulation (since the data is not exclusively bundled with the module).
Module encapsulation coefficient:
C e n c a p = f ( cyclomatic complexity , local complexity , publiccomplexity , public global data complexity )
C e n c a p = f ( v , l d v , p g d v )
(rationale: ldv is a measurement of local data manipulation within a module; high ldv indicates that module logic is bundled using local data)
C e n c a p 1 = ( l d v d e n s i t y − p g d v d e n s i t y l d v d e n s i t y + p g d v d e n s i t y ) = ( l d v v − p g d v v ) ( l d v v + p g d v v ) = ( l d v − p g d v l d v + p g d v )
where ldv and pgdv not = 0 and 0 ≤ Cencap1 < |1|; −1 indicates low encapsulation and 1 indicates high encapsulation.
So that encapsulation, data hiding, and separation of concerns behave quantitatively with common magnitude, encapsulation is scaled to be:
C e n c a p = ( C e n c a p 1 + 1 ) / 2
where 0 ≤ Cencap ≤ 1; 0 indicates low encapsulation and 1 indicates high encapsulation.
Where ldv and pgdv = 0, Cencap = 0.
A peculiar and sometimes pathological case occurs when both ldv and pgdv are zero. This strange case implies that the algorithm references no data—no local data nor any global data. Even though the denominator is zero in the above formula, the convention of defining the metric value to be zero is used. This convention gives an encapsulation coefficient indicating the lowest (worse) value for encapsulating data. Over and above this mathematical anomaly, these bizarre modules deserve careful inspection. Often they are doing something strange that affects data below the syntactic level of the language being used. For example, in a real-time system, a module referencing no data may just be looping in order to waste some time so it can properly synchronize. It appears that it is using no data but that is not the case. It is affecting the system clock which is data below the syntactic level of language. Change the units in the system clock or get a faster processor and the low encapsulation in this logical construct would cause errors. Another case that has been witnessed is dividing by zero in Ada which triggers an exception recovery procedure. Although there is no syntactic data being used, there is compiler generated data keeping track of these anomalies—so, in fact, this is low encapsulation with risky properties. This same argument applies to data hiding (defined next) when ldv, pgdv and pdv = 0. When modules use no data, they are exhibiting the lowest encapsulation and data hiding characteristics.
The data hiding coefficient is a module measurement of the quality of hiding a module’s internal data members and limiting data access from external objects.
Module data hiding coefficient:
C d h = f ( cyclomatic complexity , local datacomplexity , public global data complexity , parameterdatacomplexity )
C d h = f ( v , l d v , p g d v , p d v )
(rationale: pgdv and pdv are measurements of public global and parameter data manipulation within a module; low pgdv and pdv indicate that module logic is hiding data from other modules)
C d h = 1 − ( ( p g d v d e n d i s y + p d v d e n s i t y ) / ( l d v d e n s i t y + p g d v d e n s i t y + p d v d e n s i t y ) )
C d h = 1 − ( ( ( p g d v / v ) + ( p d v / v ) ) / ( ( l d v / v ) + ( p g d v / v ) + ( p d v / v ) ) )
C d h = 1 − ( ( p g d v + p d v ) / ( l d v + p g d v + p d v ) )
where ldv and pgdv and pdv not = 0 and 0 ≤ Cdh≤ 1; 0 indicates low data hiding and 1 indicates high data hiding.
Where ldv and pgdv and pdv = 0, Cdh = 0.
Module separation of concerns coefficient: module separation of concerns (soc) is a design characteristic for partitioning a design into modules such that each module addresses a separate internal function concern, not relying on other modules.
C s o c = f ( cyclomatic complexity , module design complexity , global data complexity , parameter data complexity )
C s o c = f ( v , i v , p g d v , p d v )
(rationale: iv, pgdv and pdv are measurements of module logic influencing one or more other modules; low iv, pgdv and pdv indicate that module logic is not influencing other module logic or data manipulation)
C s o c = ( ( 1 − i v d e n s i t y ) + ( 1 − p g d v d e n s i t y ) + ( 1 − p d v d e n s i t y ) ) / 3 = ( ( 1 − ( i v / v ) ) + ( 1 − ( p g d v / v ) ) + ( 1 − ( p d b / v ) ) ) / 3 = ( 3 − ( ( i v + p g d v + p d v ) / v ) ) / 3 = 1 − ( ( i v + p g d v + p d v ) / 3 v )
where 0 ≤ Csoc ≤ 1; 0 indicates low separation of concerns and 1 indicates high separation of concerns.
In
Module Name | v | iv | ldv | pgdv | pdv | Cencap | Cdh | Csoc |
---|---|---|---|---|---|---|---|---|
( C e n c a p 1 + 1 ) / 2 | 1 − ( ( p g d v + p d v ) / ( l d v + g d v + p d v ) ) | 1 − ( ( i v + p g d v + p d v ) / 3 v ) | ||||||
A high Cencap | 10 | 1 | 10 | 1 | 0 | 0.909 | 0.909 | 0.933 |
B low Cencap | 10 | 1 | 1 | 10 | 0 | 0.091 | 0.091 | 0.633 |
C high Cdh | 10 | 1 | 10 | 0 | 0 | 1.000 | 1.000 | 0.967 |
D low Cdh | 10 | 1 | 0 | 10 | 10 | 0.000 | 0.000 | 0.300 |
E high Csoc | 10 | 0 | 10 | 0 | 0 | 1.000 | 1.000 | 1.000 |
F high Csoc | 10 | 10 | 10 | 10 | 10 | 0.500 | 0.333 | 0.000 |
G high Cencap | 20 | 1 | 20 | 1 | 0 | 0.952 | ||
H low Cencap | 20 | 1 | 1 | 20 | 0 | 0.048 |
Module B uses low quantities of local data and high quantities of global data which are poor encapsulation qualities. The encapsulation coefficient behaves mathematically consistent with these intuitive qualities. Values between these extremes are determined by the actual values for local data and public data complexities for a given module.
In
The final example illustrated in
There is one other important illustration of the encapsulation, data hiding, and separation of concerns coefficients in
The Kuali Foundation is a consortium of interested universities, colleges and commercial firms that joined together to produce an enterprise software solution for the academic business model. The Kauli Foundation utilizes a community source developed software acquisition module. Using Kuali Enterprise Resource Planning (ERP) modules, the encapsulation, data hiding, and separation of concerns coefficients are calculated using production code. For this study, the Kauli Coeus, Version 5.0.1, and Kauli Financial System, Version 4.1.1, application code was utilized. Kauli Financial System is financial software that meets the needs of Carnegie class institutions, and Kauli Coeus is an application for administration of grants to federal funding agencies [
Module/Module | v | ev | iv | ldv | pgdv | pdv | Cencap | Cdh | Csoc | Design Quality |
---|---|---|---|---|---|---|---|---|---|---|
A—(Coeus-S2S PHS398TrainingSub AwardBudgetV1_0Generator.getPHS398TrainingBudget_MODLTR-ANE_GRF-1) | 12 | 5 | 11 | 12 | 0 | 0 | 1.000 | Higher | ||
B—(Coeus-Budget- QueryList.getFieldValue_MODLTR-JB_GRF-1) | 12 | 7 | 5 | 10 | 3 | 0 | 0.769 | Lower | ||
C—(Coeus-S2S- SFLLLV1_1Generator.getReportEntity_MODLTR-BGE_GRF-1) | 14 | 1 | 14 | 0 | 1 | 0 | 0.933 | Higher | ||
D—(Coeus-S2S- GlobalLibraryV2_0Generator.getAddressDataType_MODLTR-YI_GRF-1) | 13 | 1 | 12 | 0 | 2 | 4 | 0.684 | Lower | ||
E—(Coeus-S2S- S2SUtilServiceImpl.getNKeyPersons_MODLTR-BOM_GRF-1) | 12 | 7 | 5 | 0 | 2 | 0 | 0.750 | Higher | ||
F—(Coeus-S2S- PHS398TrainingSubAwardBudgetV1_0Generator.getPHS398TrainingBudget_MODLTR-ANE_GRF-1) | 11 | 6 | 10 | 0 | 0 | 8 | 0.684 | Lower |
metrics for 6 Coeus modules including the two modules shown in
An extension of design coefficients is the measurement of class encapsulation, data hiding, and separation of concerns. Extending design coefficients to the class level is valuable because the metrics provide objective measurement for a higher abstraction of code. Since a class is composed of modules, the class modules are used to calculate the class design coefficients. Instead of a simple aggregation of individual module coefficients, a weight average based upon cyclomatic complexity is used. Using module cyclomatic complexity as a weight, the class design coefficients take into account a module’s granularity. When a large module (for example, v = 40) is poorly designed with low encapsulation, data hiding, and separation of concerns, that module’s weight will negatively impact the class design coefficient. The inverse is true when design coefficients are positive. Utilizing this approach, a small well-designed module and a poorly-designed module do not cancel out their contribution to the class design coefficients.
Class encapsulation coefficient: a measurement of the level of bundling of data with modules operating on that data; a measurement of the proportion of decision logic that uses (refers to) local data and global data. The class encapsulation coefficient is calculated by summarizing the module encapsulation coefficients that are encapsulated in the class. It is the weighted average of the class module encapsulation coefficients.
O e n c a p = ( ∑ m ( v m ) ( C e n c a p ) ) / ∑ v m
where m is a module.
Class data hiding coefficient: a class measurement of the level of hiding internal data members and limiting data access from external objects. The class data hiding coefficient is calculated by summarizing the module data hiding coefficients that are encapsulated in the class. It is the weighted average of the class module data hiding coefficients.
O d h = ( ∑ m ( v m ) ( C d h ) ) / ∑ v m
where m is a module.
Class separation of concerns coefficient: class separation of concerns is a design characteristic for separating a computer program into object classes, such that each class addresses a separate concern. The class separation of concerns coefficient is calculated by summarizing the module separation of concerns coefficients that are encapsulated in the class. It is the weighted average of the class module separation of concerns coefficients.
O s o c = ( ∑ m ( v m ) ( C s o c ) ) / ∑ v m
where m is a module.
Class Coefficient ExampleThere are software instances where encapsulation, data hiding, and separation of concerns behave unique to the module’s function. In these instances, data sharing is not always bad; logical dependency is not always bad.
v | Cencap | Cdh | Csoc | ||
---|---|---|---|---|---|
Class | temp List Lookup Action Class | ||||
Methods | perform Lookup | 3 | 0.330 | 0.500 | 0.000 |
build Lock Key Message | 4 | 0.500 | 0.570 | 0.330 | |
populate Lock Summary | 6 | 0.460 | 0.550 | 0.280 | |
perform Question Without Input | 4 | 0.330 | 0.500 | 0.170 | |
do Unlock Confirmation | 6 | 0.380 | 0.500 | 0.220 | |
unlock | 6 | 0.230 | 0.380 | 0.110 | |
submit Report | 6 | 0.230 | 0.380 | 0.110 | |
get NEW Incumbent | 1 | 0.330 | 0.500 | 0.000 | |
perform Extended Incumbent Search | 1 | 0.330 | 0.500 | 0.000 | |
search | 1 | 0.330 | 0.500 | 0.000 | |
get New Position | 2 | 0.500 | 0.670 | 0.330 | |
perform Extended Position Search | 12 | 0.330 | 0.380 | 0.360 | |
cancel | 2 | 0.400 | 0.500 | 0.170 | |
clear Values | 4 | 0.400 | 0.500 | 0.170 | |
start | 5 | 0.380 | 0.500 | 0.130 | |
Design Coefficients | 63 | Osoc = 0.355 | Odh = 0.469 | Osoc = 0.206 | |
Class | Ojb Pending BC Appointment Funding Active Indicator Conversion | ||||
Methods | sql ToJava | 4 | 0.8 | 1 | 0.67 |
java ToSql | 3 | 0.43 | 0.5 | 0.33 | |
Design Coefficients | 7 | Osoc = 0.641 | Odh = 0.786 | Osoc = 0.524 |
Method Name | v | iv | ldv | pgdv | pdv | Cencap | Cdh | Csoc |
---|---|---|---|---|---|---|---|---|
( C e n c a p 1 + 1 ) / 2 | 1 − ( ( p g d v + p d v ) / ( l d v + g d v + p d v ) ) | 1 − ( ( i v + p g d v + p d v ) / 3 v ) | ||||||
A—execution switch | 45 | 45 | 0 | 1 | 0 | 0.000 | 0.000 | 0.659 |
B—global data switch | 45 | 1 | 1 | 45 | 0 | 0.022 | 0.022 | 0.659 |
C—local data switch | 45 | 1 | 45 | 0 | 0 | 1.000 | 1.000 | 0.993 |
D—small well- structured | 10 | 1 | 5 | 0 | 0 | 1.000 | 1.000 | 0.967 |
E—large complex unstructured | 34 | 28 | 10 | 15 | 0 | 0.400 | 0.250 | 0.431 |
global data through an application, it should use high volumes of public global and parameter data resulting in a low Cencap and Cdh. Module B’s low Cencap (0.022) and Cdh (0.022) are not bad since they reflects the module’s functional responsibility.
Other module types can be identified and classified by design coefficient metrics. A local data switch should exhibited good design coefficients. In
Encapsulation, data hiding, and separation of concerns have been prominent design principles for over forty years. Subjectively, software developers have used them to assess the quality of designs. Quantifying encapsulation, data hiding, and separation of concerns design principles provide metrics valuable to software developers as software design tools. Our design metrics, coefficient of encapsulation, coefficient of data hiding, and coefficient of separation of concerns are three metrics which represent an important set of tools. Since the design structure of a program is an important component of logical complexity and data usage, the decision structure inherently contains the manner in which the design logic is implemented. These three design metrics address how software logic reflects the use of local, global, and parameter data in program logic. With these metrics, the quantification can be completed at two levels: module and class. Calculation of design coefficients represents a new analytical tool previously unavailable to software developers.
The coefficient of encapsulation, coefficient of data hiding, and coefficient of separation of concerns exhibit desired properties which support their applicability.
· The metrics are objective and mathematically rigorous. In addition to being intuitive consistent with low or high encapsulation, data hiding, and separation of concerns, it is critical that the metrics be objective. The same design principle viewed a two different times or by two different software developers yields the same coefficient values. If the metrics are not objective, the individual interest involved in a development effort will have differing interpretation reducing the effectiveness of the coefficients as tools.
· The metrics intuitively behave with the subjective degree of design quality. When a module or class is well-designed, the encapsulation, data hiding, and separation of concerns are determined to be high, good or positive depending on the subjective stratification used. Accordingly, when a module or class is well-designed, the coefficient of encapsulation, coefficient of data hiding, and coefficient of separation of concerns are valued from 0 to 1 with 1 corresponding to high, good, or positive.
· The metrics should be of operational help. Metrics that correlate and estimate characteristics such as quality and maintainability are useful. If the metrics can directly drive the design modularization process.
· The design metrics introduced in this article are for quantifying design principles in procedural and object-oriented code. They are valuable for assessing and controlling encapsulation, data hiding, and separation of concerns. They are also valuable for application portfolio management as they provide measures of design dynamics throughout the application life cycle. If better encapsulation, data hiding, and separation of concerns is achieved in during software design and maintenance, there will be positive impact on application costs, reliability, and performance.
The authors declare no conflicts of interest regarding the publication of this paper.
Butler, C.W. and McCabe, T.J. (2021) Cyclomatic Complexity-Based Encapsulation, Data Hiding, and Separation of Concerns. Journal of Software Engineering and Applications, 14, 44-66. https://doi.org/10.4236/jsea.2021.141004