
At first sight, the choice of a socially best economic policy and the choice of an optimal engineering design seem to be quite separate issues. A closer look, however, shows that both approaches, which aim at generating a (set of) best alternative(s), have much in common. We describe and characterize axiomatically an aggregation method that uses a set of evaluations arranged on a common scale. This scale establishes a common language, so to speak, which conforms to the criteria that are deemed relevant for comparing various design options. Two conditions characterize the proposed aggregation mechanism. One is a simple dominance requirement; the other, called cancellation independence, makes use of the fact that for any pair of objects, rank differences of opposite sign can be reduced without changing the aggregate outcome of the ranking procedure. The proposed method has its origin in voting theory but may prove useful in engineering design as well.

At first glance, problems in social choice and those in engineering design seem to be far apart. Social choice theory typically looks for the existence of a consistent social welfare function that is meant to represent the common will or social good of society.

Engineering design aims at choosing a particular design out of a set of alternative options, and this design has the property of best satisfying a certain number of criteria that are deemed relevant in the given context.

On second glance, however, it becomes clear that both approaches pursue roughly the same goal. In social choice theory, the preferences of different individuals, the members of a particular society, are amalgamated such that a collective preference is distilled. In engineering design, either a particular person or a team of similarly oriented individuals does the following: they collect information on how alternative design options fare under different criteria, evaluate the performance of these options and then aggregate this information in order to derive a (set of) best option(s). While in social choice theory the inputs are the preference rankings of different individuals, in engineering design the inputs are “preference” rankings that emanate from the application of various criteria or requirements.

During the last decade or so, there have been various papers that examine in some detail the relationship between the individual design engineer and a social planner who wishes to elicit the societal ranking of alternatives.

However, the views on this relationship are by no means congruent. While some authors claim (see, for example, Franssen [

Just to remind the reader, Arrow showed that there does not exist an aggregation mechanism that satisfies five intuitively plausible conditions, namely 1) that the social ranking is an ordering (i.e., complete and transitive), 2) that all individual preference rankings should be admissible, 3) that some social state x should, on the aggregate level, be strictly preferred to some other state y whenever every individual has exactly the same strict preference, 4) that the social outcome between x and y, say, should depend only on the individual preferences between x and y, and neither on the preferences towards any other alternative nor on the preferences between any other alternative and x or y, and finally 5) that no individual should have the power to declare his or her strict preference for any x over any y as the social preference.

If the former of the two positions described above is correct, rational choice of an optimal design is doomed to failure, at least as long as one remains in an ordinal framework so that preference intensities (on a cardinal scale, for example) are excluded and as long as one insists that all of the conditions that Arrow required be upheld. The latter position comes in various forms. Dym et al. [

Consider the following multi-criteria decision problem from engineering where the choice of engine is at issue and weight, power-to-weight ratio and cost are the relevant criteria (

In this example, criteria 1 and 3 follow the “philosophy” that smaller is better than larger, while criterion 2 is such that larger is preferred to smaller. So according to criterion 1, we obtain that a is better than b which is better than c which again is better than d. Criterion 2 prefers c to b, b to d, and d to a, and criterion 3 finds d best, then a which is followed by b, and c being last. These rankings, taken separately, are very intuitive; taken together, they cannot be arranged in a single-peaked fashion (see also Franssen [

Alternatives | Power (bhp) | Weight (kg), criterion 1 | Power/weight (bhp/t), criterion 2 | Cost, criterion 3 |
---|---|---|---|---|
a | 300 | 1000 | 300 | 6300 |
b | 330 | 1050 | 314 | 6600 |
c | 350 | 1100 | 318 | 7000 |
d | 400 | 1300 | 308 | 6200 |
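The three criterion-wise orderings stated for this example can be reproduced from the table with a short Python sketch. The dictionary layout and the names `engines`, `criteria` and `ranking` are illustrative assumptions, not part of the original example; only the numbers and the "smaller is better" or "larger is better" directions come from the text.

```python
# Engine data copied from the table: (weight in kg, power/weight in bhp/t, cost).
engines = {
    "a": (1000, 300, 6300),
    "b": (1050, 314, 6600),
    "c": (1100, 318, 7000),
    "d": (1300, 308, 6200),
}

# For each criterion: column index and whether larger values are preferred.
# Criteria 1 and 3 prefer smaller values, criterion 2 prefers larger values.
criteria = {
    "weight": (0, False),
    "power_to_weight": (1, True),
    "cost": (2, False),
}


def ranking(column, larger_is_better):
    """Return the alternatives ordered from best to worst for one criterion."""
    return sorted(engines, key=lambda alt: engines[alt][column],
                  reverse=larger_is_better)


rankings = {name: ranking(col, larger) for name, (col, larger) in criteria.items()}
for name, order in rankings.items():
    print(name, " > ".join(order))
```

The sketch yields purely ordinal information for each criterion, which is exactly the kind of input that the ordinal aggregation problem starts from.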

Our example shows that Scott and Antonsson’s [

It is interesting to note that the problem of an engineer to judge alternative design options is almost identical, from a theoretical perspective, to that of a scientist who has to evaluate alternative new theories that promise to solve a particular question in science, this exercise being based on a set of epistemic values or criteria. This problem was prominently formulated by Kuhn [

In the following Section 2, we shall start with an example of a voting procedure that is meant to illustrate our proposed method and then deliver arguments for introducing a common scale or language on which all criteria relevant for a particular design problem should base their preferential judgments. Section 3 discusses the comparability issue across criteria. Section 4 proposes a method for evaluation that is cardinal in character and is known in social choice theory as a scoring method. It has been successfully applied in social and political competition. An axiomatic underpinning is presented which uses the idea of a common scale with qualitative verdicts as a common language. It allows us to formulate in Section 5 a possibility result that, as we hope, may prove useful in engineering design. In this section, we also briefly discuss the situation that there is more than one evaluating engineer. We end with a few concluding remarks in Section 6.

Imagine you are a member of a recruitment committee that has to decide among a certain number of applications for a faculty position. Let us suppose that k candidates are being considered more closely. Let us further assume that the chairperson of your committee comes forward with the following procedure. There are m categories (from “excellent” to “not acceptable”, say, with m − 2 categories in between), with rank scores from m to 1 attached to these categories. The chairperson asks all members of your committee to allocate the k candidates to the m available categories. It is explicitly not required that every member come up with a strict ordering, nor that every category be filled by each committee member.

Furthermore, the chairperson announces that, as soon as each member has assigned the k candidates to the m categories, the rank scores assigned to each candidate will be summed and a ranking over the k applicants constructed from the highest rank sum to the lowest, the candidate with the highest aggregate sum being the winner. We claim that this aggregation procedure can be made fruitful for design choices in engineering.

In the above scenario, replace the candidates with alternative design options, the members of the committee with certain criteria to be satisfied, and the chairperson with an individual engineer. Furthermore, consider an interval scale with the following equidistant categories: “excellent”, “high”, “satisfactory”, “just sufficient”, and “insufficient”.

This structure provides the following set-up for design choice: our engineer considers alternative design options in the light of the set of given criteria. For each criterion, the engineer attaches a qualitative verdict to the alternative options (e.g. “design model D_{1} is ‘sufficient’ in relation to acceleration”, “design model D_{2}’s acceleration is ‘insufficient’”). The five qualitative verdicts correspond to an interval scale with rank scores. The overall ranking of the design models is determined by the sum of rank scores for each alternative option.
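The set-up just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the integer coding 5 to 1 of the five verdicts, the third design `D3`, the criteria names and all verdicts in `verdicts` are hypothetical and serve only to show the mechanics of summing rank scores.

```python
# Common scale from the text; the integer scores 5..1 are an assumed
# equidistant coding of the five qualitative verdicts.
SCALE = {
    "excellent": 5,
    "high": 4,
    "satisfactory": 3,
    "just sufficient": 2,
    "insufficient": 1,
}

# Hypothetical verdicts: one dict per criterion, mapping design -> verdict.
verdicts = {
    "acceleration": {"D1": "just sufficient", "D2": "insufficient", "D3": "high"},
    "fuel consumption": {"D1": "excellent", "D2": "satisfactory", "D3": "satisfactory"},
    "comfort": {"D1": "satisfactory", "D2": "high", "D3": "satisfactory"},
}


def score_sums(verdicts, scale):
    """Sum the rank scores of each design over all criteria."""
    totals = {}
    for per_design in verdicts.values():
        for design, verdict in per_design.items():
            totals[design] = totals.get(design, 0) + scale[verdict]
    return totals


def aggregate_ranking(totals):
    """Group designs into tiers from highest to lowest sum; ties share a tier."""
    tiers = {}
    for design, total in totals.items():
        tiers.setdefault(total, []).append(design)
    return [sorted(tiers[t]) for t in sorted(tiers, reverse=True)]


totals = score_sums(verdicts, SCALE)
print(totals)
print(aggregate_ranking(totals))
```

Returning tiers rather than a strict list reflects the fact that the aggregate relation is an ordering that may contain indifference between designs with equal rank sums.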

Notice the function that the qualitative verdicts are performing. The verdicts expressed in rank scores constitute a common qualitative scale and represent a cardinalization of the preference orderings over the set of criteria. However, this cardinality is imposed on the ordinal ranking which was formed according to the performance under the various criteria. It is not assumed that the criteria themselves necessarily provide more than ordinal information. It is the imposed cardinal representation which allows inter-criteria comparison and, hence, enables us to avoid, as we shall see, an impossibility result à la Arrow. In order to make the cardinal scores for each of the design objects comparable across the set of criteria and thereby to achieve inter-criteria comparability, the process of constructing the scale is of utmost importance. This process can be compared to the creation of a common language that constitutes a unifying basis for comparison. Balinski and Laraki [

In order to clarify our proposed approach more concretely, let us refer again to the committee and its members. Each individual has to transform his or her ordinal preference relation ⪰ over the alternative candidates into a cardinal ranking, with the requirement that if candidate x, say, is at least as good as applicant y, the cardinal rank or score attached to x is at least as high as the rank assigned to y. That is, for all x, y ∈ X, the set of all candidates, x ⪰ y implies s_{n}(x) ≥ s_{n}(y), where s_{n}(x) stands for the score assigned to alternative x by committee member n. Comparisons of score differences of the form s_{n}(x) − s_{n}(y) > s_{n}(z) − s_{n}(w) are then meaningful. Note that any affine transformation of these scores with a common positive scale factor over all n does not destroy this comparison of score differences. Coming back to our problem of design comparison, our engineer is assumed to examine the given designs in the light of the set of criteria that are relevant for the problem at stake. More explicitly, the engineer starts for each single criterion with an ordinal ranking over the design objects to be evaluated and then transforms this ranking into a sequence of cardinal scores according to the relationship specified above. We assume, as outlined above, that the engineer can translate the ordinal into the cardinal information for every given criterion in isolation.
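The invariance claim about affine transformations can be verified in one step. The sketch below assumes transformed scores of the form s'_n = a s_n + b_n with a scale factor a > 0 common to all n; the member-specific constant b_n and the second index k (which may equal n) are notational assumptions introduced here for illustration.

```latex
\begin{align*}
s'_n(x) &= a\, s_n(x) + b_n, \qquad a > 0, \\
s'_n(x) - s'_n(y) &= a \bigl( s_n(x) - s_n(y) \bigr), \\
s_n(x) - s_n(y) > s_k(z) - s_k(w)
  &\iff s'_n(x) - s'_n(y) > s'_k(z) - s'_k(w).
\end{align*}
```

The constants b_n cancel within each difference, and multiplying both sides by the same positive a preserves the inequality. With member-specific scale factors a_n the last equivalence would in general fail, which is why the common factor matters.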

The common scale that we are proposing could well be the zero-one interval of the reals, but in many situations, the scale is integer-valued and equidistant. Examples of such scenarios are manifold, ranging from wine tasting over figure skating (Balinski and Laraki [

To be clear, the grade “excellent”, for example, does not necessarily fulfil the same list of prerequisites for each and every criterion. For one criterion, the grade “excellent” may hardly ever be given or reached (e.g. susceptibility to a break-down in the case of a new car design), another criterion may achieve the predicate “excellent” fairly easily (e.g. noise reduction). What we have as information is a judgment that comes from each criterion. However, if two criteria attach the judgment “excellent” to a particular design object, then this should be taken at face value. To argue that “excellent” according to one criterion corresponds to only “sufficient” in relation to another criterion may critically undermine the required inter-criteria comparability and may also provoke a controversial debate, once not one but several engineers are engaged in the evaluation of alternative options. In other words, the grades that are assigned to the alternative design options have a meaning that is absolute across the criteria.

One could argue that in engineering, grades such as “excellent” or “sufficient” are not always useful since much more fine-grained verdicts are available. This is undoubtedly true for certain characteristics but not for others. In the case of judging alternative car designs, for example, velocity, acceleration and fuel consumption are features such that different performances can be assigned to particular points on a ratio scale. By contrast, comfort, image and aesthetic aspects in relation to body shape and colour, but also features such as the aforementioned susceptibility to break-down, are properties that are more difficult to quantify on some scale. What may make the evaluative exercise of the engineer more complex is that an index that reflects an outstanding acceleration, for example, will perhaps not be considered excellent in relation to an extremely high fuel consumption, and a low fuel consumption will not be viewed as excellent in the light of a very modest degree of acceleration. So the design engineer may have to consider interdependencies among various criteria, whereas for criteria such as, for example, fuel consumption and comfort, a separability argument may apply.

We believe, however, that these complications are not insurmountable.

To repeat, what is essential for the aggregation procedure that we shall describe and characterize in the next section is that the common scale is properly understood as an inter-attribute device to link or combine different dimensions. Without the latter, it would just make no sense to use integer values on a scale as an expression of qualitative verdicts that are assigned to a given set of criteria deemed relevant for alternative design options. Clearly, to conceive such a common scale is by no means an easy task that the decision maker has to perform, and it will, quite naturally, depend on the type of design considered. It may be more fine-grained in some cases of design evaluation and less fine-grained in others.

What we propose is that an engineer performs the following mental exercise. Given a common scale for all relevant criteria, the engineer assigns ratings to performance under the various criteria. If design D_{1} gets the attribute “excellent” for criterion 1 and “satisfactory” for criterion 2, and for design D_{2} the rating is just the other way round, the two designs are ranked equally. If criterion 1, life expectancy say, is considered more valuable than criterion 2, load capacity say, one may want the performance under criterion 1 to count twice as much. This can be achieved by putting this criterion twice on the list of relevant criteria or, even better if possible, by adding another criterion to the list which is closely related to criterion 1. In the case of cars as discussed, the expected price or value of the car on the second-hand market or the cost of maintenance may be such an additional criterion.
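The effect of listing a criterion twice can be illustrated with a small Python sketch. The numeric coding (excellent = 5, satisfactory = 3) and the helper `total` are assumptions made for illustration only.

```python
def total(scores, criteria):
    """Sum a design's scores over a list of criteria.

    A criterion that appears twice in the list is counted twice.
    """
    return sum(scores[c] for c in criteria)


# Assumed coding: excellent = 5, satisfactory = 3.
d1 = {"life expectancy": 5, "load capacity": 3}
d2 = {"life expectancy": 3, "load capacity": 5}

plain = ["life expectancy", "load capacity"]
doubled = ["life expectancy", "life expectancy", "load capacity"]

# With each criterion listed once, the mirrored ratings cancel out.
print(total(d1, plain), total(d2, plain))
# With life expectancy listed twice, D1 moves ahead of D2.
print(total(d1, doubled), total(d2, doubled))
```

Duplicating a criterion is arithmetically the same as giving it weight 2 in the sum, so the rule itself stays an unweighted scoring rule over the (extended) list of criteria.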

Let X be the universal set of design options containing a finite number of elements. Let N be the set of criteria deemed relevant, with n > 1 denoting the number of criteria. Let E = {1, …, m} be the set of rank scores on the common scale. A scoring function s_{i}: X → E is chosen for each criterion i ∈ N, such that, for all x ∈ X, s_{i}(x) indicates the score that criterion i assigns to x. Let S_{i} be the set of all possible scoring functions for criterion i. As explained in the last section, the statement of how well or how badly each criterion is met has to be placed on the common scale constituted by the set E so as to accurately represent its value for the design object considered.

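Let P be the set of all orderings over X. A profile s = (s_{1}, …, s_{n}) is a list of scoring functions, one for each criterion, and S denotes the set of all such profiles. An aggregation rule f assigns to each profile s ∈ S an ordering ⪰ = f(s) in P. The scoring rule, denoted f_{E}, is defined as follows: f = f_{E} iff, for any s ∈ S, and any x, y ∈ X,

x ⪰ y ↔ Σ_{i ∈ N} s_{i}(x) ≥ Σ_{i ∈ N} s_{i}(y).

The asymmetric and symmetric parts of ⪰ are denoted by ≻ and ~, respectively. For any s, s' ∈ S, any i, j ∈ N and any x, y ∈ X, we say that s and s' are (i, j)-variant with respect to (x, y) if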

We now introduce two properties to be imposed on an aggregation rule f.

Dominance (D). For all s ∈ S and all x, y ∈ X, if s_{i}(x) ≥ s_{i}(y) for all i ∈ N, then x ⪰ y, and if s_{i}(x) ≥ s_{i}(y) for all i ∈ N and s_{j}(x) > s_{j}(y) for some j ∈ N, then x ≻ y.

Cancellation Independence (CI). For all s, s' ∈ S, all x, y ∈ X and all i, j ∈ N, if s and s' are (i, j)-variant with respect to (x, y), then f(s) and f(s') rank x and y in the same way.

Condition D is a simple vector dominance condition. It requires that, in ranking two design models x and y, if the score assigned to x by each criterion i ∈ N is at least as large as the score assigned to y by the same criterion i, then x must be ranked at least as high as y by the evaluating engineer, and if in addition, some criterion assigns a higher score to x than to y, then x must be ranked higher than y.
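That a rule based on summed scores respects Condition D can be checked exhaustively on a small scale. The following Python sketch is only an illustration: the function name `dominance`, the three-point scale and the restriction to three criteria are assumptions chosen to keep the enumeration small.

```python
from itertools import product


def dominance(sx, sy):
    """Compare two score vectors criterion by criterion.

    Returns "strict" if x is at least as good everywhere and better somewhere,
    "weak" if the two vectors are identical, and None if Condition D is silent.
    """
    if all(a >= b for a, b in zip(sx, sy)):
        return "strict" if any(a > b for a, b in zip(sx, sy)) else "weak"
    return None


# Enumerate all pairs of score vectors on a 3-point scale with 3 criteria
# and count cases where the sum of scores contradicts Condition D.
violations = 0
for sx in product(range(1, 4), repeat=3):
    for sy in product(range(1, 4), repeat=3):
        verdict = dominance(sx, sy)
        if verdict == "strict" and not sum(sx) > sum(sy):
            violations += 1
        if verdict == "weak" and not sum(sx) >= sum(sy):
            violations += 1

print(violations)
```

The enumeration is a sanity check on one small instance, not a proof; the general argument is simply that adding up inequalities that all point in the same direction preserves their direction.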

Condition CI makes use of the fact that for any pair of alternatives, rank differences of opposite sign can be reduced without changing the aggregate outcome of the ranking procedure. This reduction procedure is performed in a stepwise fashion, starting with any two design options x and y, let’s say, and picking any two criteria whose rank differences for x and y are of opposite sign. The “net” rank difference between x and y for this pair of criteria is determined. Then another criterion is picked whose rank difference for x and y is opposite in sign to the net rank difference of the first two criteria. The new net rank difference for x and y is calculated and the next criterion is picked whose rank difference again is opposite in sign to the just determined net rank difference with respect to x and y, if there is still one such criterion, and so on.

Such a step-wise cancellation procedure can be illustrated by the following simple scheme where two designs D_{1} and D_{2} are evaluated via three criteria. One starts with criteria 1 and 2 whose rank differences in relation to D_{1} and D_{2} are of opposite sign, calculates the “net” rank difference and then moves on to the third criterion, where, again, a rank difference of opposite sign occurs. The final outcome in this case is an equivalence between the two designs (

In Condition CI, vectors s and s' define scoring profiles that are aggregate-rank equivalent with respect to any pair of objects to be evaluated. We call s' an s-reduced scoring profile. Condition CI therefore demands that f(s) and f(s') order any x and y in exactly the same way. Note that Condition CI makes an implicit assumption about an inter-criterion comparison of scores for which a common language is required.
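The step-wise cancellation of rank differences can be traced with a short Python sketch. It works directly on the rank differences between two designs rather than on reduced scoring profiles, so it is a simplification of Condition CI and not its formal statement. The score vectors (3, 1, 1) for D_{1} and (1, 2, 2) for D_{2} are one reading of the three-criterion scheme with scores 3 (high), 2 (sufficient) and 1 (insufficient); the function name is hypothetical.

```python
def stepwise_net_difference(sx, sy):
    """Cancel rank differences of opposite sign one criterion at a time.

    Returns the trace of running net differences; the last entry is the
    final net rank difference between x and y.
    """
    remaining = [a - b for a, b in zip(sx, sy) if a != b]
    net, trace = 0, []
    while remaining:
        # Prefer a criterion whose difference is opposite in sign to the net;
        # if none is left, the remaining differences are simply added.
        opposite = [d for d in remaining if d * net < 0]
        pick = opposite[0] if opposite else remaining[0]
        remaining.remove(pick)
        net += pick
        trace.append(net)
    return trace


d1 = (3, 1, 1)  # assumed scores of D1 under criteria 1, 2, 3
d2 = (1, 2, 2)  # assumed scores of D2 under criteria 1, 2, 3
trace = stepwise_net_difference(d1, d2)
print(trace)
```

Whatever order the criteria are picked in, the final net difference equals the difference of the two rank sums, which is why the reduction leaves the aggregate ranking of the pair unchanged.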

Dominance and Cancellation Independence allowed Gaertner and Xu [

THEOREM. f = f_{E} if and only if f satisfies the properties of Dominance and Cancellation Independence.

A proof of this result can be found in Gaertner and Xu [

Our model neither runs into Arrow’s impossibility result nor does it make restrictions which, on closer inspection, turn out to be highly implausible. The method uses cardinal scores, though, and if this is a knock-out criterion, then the procedure is ruled out. Clearly, the scoring rule proposed has to be weighed against the pairwise comparison charts (Dym et al. [

Our theorem establishes, for the evaluating engineer, an ordering over the set of alternative design objects once this person has consented to the common language as we assumed. Should there be more than one decision-maker, various methods are available in order to obtain an overall verdict among the group of evaluators. All of these rules violate at least one of Arrow's requirements (see, for example, Gaertner [

Qualitative verdicts | Assigned rank scores | Criterion 1 | Criterion 2 | Criterion 3 |
---|---|---|---|---|
High | 3 | D_{1} | | |
Sufficient | 2 | | D_{2} | D_{2} |
Insufficient | 1 | D_{2} | D_{1} | D_{1} |

It may be surprising for the reader to see that in the situation of several engineers who are about to evaluate alternative design objects, we propose methods which use ordinal information only. We could have proposed in this case that all evaluators apply the scoring method laid out above and characterized in the theorem. This would have required, however, that all individuals involved agree to the same grading structure, be it fine-grained or coarse-grained. One can, of course, make this assumption but this may be unduly restrictive. What our aggregation procedure leads to, namely that x ⪰ y ↔

The aggregation rule that is characterized in this paper is a general version of what Smith [

Our procedure treats criteria equally and also treats objects neutrally. The suggested procedure is sensitive towards the degree of criteria fulfilment. This is, in social choice theory, denoted as a form of positive responsiveness or positive association in the sense that a unilateral change in the fulfilment of some criterion in favour of design object x, let’s say, should be reflected on the aggregate level in the same and not in the opposite direction, in other words should be beneficial for x.

Our model also satisfies a property that is sometimes called consistency, at other times reinforcement, demanding that if the set of criteria is split up into two parts and a certain design object wins in both subsets, then this object must also win in relation to the complete set of criteria.
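For a rule based on summed scores, the reinforcement property follows from the additivity of the sums. As a sketch, assume the set of criteria N is split into two disjoint parts N_1 and N_2 (this notation is introduced here for illustration):

```latex
\begin{align*}
\sum_{i \in N_1} s_i(x) \ge \sum_{i \in N_1} s_i(y)
\ \text{ and } \
\sum_{i \in N_2} s_i(x) \ge \sum_{i \in N_2} s_i(y)
\;\Longrightarrow\;
\sum_{i \in N} s_i(x) = \sum_{i \in N_1} s_i(x) + \sum_{i \in N_2} s_i(x)
\ge \sum_{i \in N} s_i(y).
\end{align*}
```

If x attains the highest rank sum against every y in both parts, it therefore also attains the highest rank sum over the complete set of criteria.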

Let us emphasize again that different criteria can rank or rather assign scores to the given objects in completely different ways as explained in the previous sections. We consider this as an advantage since the single criterion has more flexibility “to articulate its preference”, i.e., it has more flexibility to express to what degree or extent it finds itself represented among the different design objects under consideration. It is our conviction and hope that the cardinal approach proposed in this paper will be a fruitful way to evaluate alternative design objects. The final verdict, of course, has to come from the engineering profession.

I am grateful for the comments and suggestions from an anonymous referee.

Wulf Gaertner (2016) Aggregating Qualitative Verdicts: From Social Choice to Engineering Design. Open Journal of Applied Sciences, 06, 319-326. doi: 10.4236/ojapps.2016.65032