Multi-criteria decision analysis deals with decision problems in which multiple criteria need to be considered. The criteria might be measured on different scales, which makes them difficult to compare. One approach that helps the user to organize the problem and to reflect on his or her assessment of the decision is Measuring Attractiveness by a Categorical Based Evaluation TecHnique (MACBETH). Here the user needs to provide qualitative judgments about differences of attractiveness regarding pairs of options. MACBETH was implemented in the M-MACBETH software using the additive aggregation model. The present article introduces the software tool “AniFair”, which combines the MACBETH approach with the Choquet integral as an aggregation function, because the Choquet integral enables the modeling of interaction between criteria. With the Choquet integral, the user can define constraints on the relative importance of criteria (Shapley value) and on the interaction between criteria. In contrast to M-MACBETH, every instance of “AniFair” provides the user with at least two aggregation levels. “AniFair” provides Graphical User Interfaces for entering information. The software tool is introduced via an example from the Welfare Quality Assessment protocol for pigs. For this purpose, “AniFair” is applied to real data that were collected on thirteen farms in Northern Germany by an animal welfare expert. The “AniFair” results enabled a division of the farms into five groups of comparable performance concerning the welfare principle “Good feeding”. The results differed in how much the interaction between criteria contributed to the Choquet integral values; the shares varied from 5% to 55%. This underlined how sensitive aggregation results are to the relative importance of and the interaction between criteria, as changes in the ranking due to the definition of constraints could be shown. All results were exported to human-readable txt or csv files for further analyses, and advice could be given to the farmers on how to improve their welfare situation.

Multi-criteria Decision Analysis (MCDA) is a general term for concepts that aim at supporting the user in dealing with decision problems involving multiple criteria [

In this article the software tool “AniFair” is presented. “AniFair” is a software tool for Multi-criteria Decision Analysis which, like the M-MACBETH software, was implemented based on the mathematical foundations of the MACBETH method. However, “AniFair” combines the MACBETH approach for the calculation of comparable scales with the Choquet integral as aggregation function instead of an additive model. The application of additive aggregation (weighted arithmetic mean) implicitly assumes that the criteria are mutually preferentially independent. In practice, this condition often does not hold, as interaction between criteria is rather to be expected. The Choquet integral was introduced by Murofushi and Sugeno [

The software tool provides a Graphical User Interface (GUI) and the choice between a ’Single instance’ and a ’Multiple instances’ version. As every “AniFair” instance already provides two aggregation levels, several “AniFair” instances can be used either to obtain an additional third aggregation level, since the results from multiple instances can themselves be aggregated, or to compare several decision problems.

Animal welfare is a complex and multidimensional concept, and its evaluation has been studied thoroughly in recent years [^{1}, for ’Sows and piglets’ no proposal for an aggregation system has been released yet. The authors chose ’Sows and piglets’ with respect to the welfare principle ’Good feeding’ as the main example to present the functionality of “AniFair”, because it was less likely that a direct comparison with a currently used aggregation system would cloud the judgment of the possibilities offered by “AniFair”. “AniFair” was used by an expert in the field of animal welfare and applied to real-world data.

The software tool “AniFair” is described in detail in Section 2.5 and was applied to a real-life example from the evaluation of animal welfare. “AniFair” was used with data collected on farms concerning the category ’Sows and piglets’ from the ’Welfare Quality Assessment protocol for pigs’ (Section 2.1).

Studies [

In the ’Welfare Quality® Assessment protocol for sows and piglets’, measures regarding the sows, measures regarding the piglets, and measures regarding both were described. To explain the handling and functionality of “AniFair”, the principle ’Good feeding’ was used. The remaining principles ’Good housing’, ’Good health’, and ’Appropriate behavior’ were added in order to present the ’Multiple instances’ version of “AniFair” (Section 2.5.4), but were not individually discussed in detail.

The animal welfare principle ’Good feeding’ in ’Sows and piglets’ consists of the criteria ’Absence of prolonged hunger’ and ’Absence of prolonged thirst’. These criteria were evaluated using the measures ’Body condition score’ (BCS), ’Age of weaning’, and ’Water supply’.

• Body condition score, as a measure of ’Absence of prolonged hunger’. The BCS measured the energy reserves of an animal. According to WQAP it was scored for the sows on a three-point scale. A score was given to every sow. The sows were scored ’0’ when their BCS was within a healthy range, i.e. firm pressure was needed to feel the hip bones and the backbone. The sows were scored ’1’ when they appeared obese or when the hip bones and backbone could easily be felt. The score ’2’ was given when the sows had prominent hip bones or backbone and a very thin visual appearance. The percentages of sows with BCS ’0’, ’1’, and ’2’ were calculated for every farm.

• Age of weaning, as a measure of ’Absence of prolonged hunger’. The age of weaning was a measure concerning the piglets. Legal specifications state that piglets need to be suckled by the sow for at least 28 days. The average number of days from birth to weaning was taken as the score for the farm.

• Water supply, as a measure of ’Absence of prolonged thirst’. The drinking places for sows and piglets were scored on a two-point scale. One score was given for the whole farm, taking into account the cleanliness and functionality of all drinkers. The score ’0’ was given when all drinkers were clean and functioning without restriction. The score ’2’ was given otherwise.

Data was collected on thirteen farms in Schleswig-Holstein in Northern Germany. The farms held 40 to 5000 sows (mean 663.1 ± 1331.9). An observer trained with regard to WQAP visited the farms repeatedly and scored 30 sows per visit according to WQAP. For this example the data from the first visit on every farm was used. These first visits took place from September to December 2016 and from April to July 2017.

The MACBETH approach presented the user with decisions about differences of attractiveness (DoA) that involve only qualitative judgments regarding two options at a time. In the “AniFair” implementation this was used for the calculation of comparable scales for all criteria (Section 2.5.2). In the following, different types of scales are defined.

Let $\nu \in \mathbb{N}$. For the remainder of this section let $X = \{x_1, \dots, x_\nu\} \neq \emptyset$ be a finite set.

Definition 1 (Ordinal scale). A function $S : X \to \mathbb{R}$ is called an ordinal scale on $X$ if the following conditions hold:

$$\forall x_i, x_j \in X : \quad x_i \text{ is more attractive than } x_j \;\Leftrightarrow\; S(x_i) > S(x_j) \tag{1}$$

$$\forall x_i, x_j \in X : \quad x_i \text{ is equally attractive as } x_j \;\Leftrightarrow\; S(x_i) = S(x_j) \tag{2}$$

An ordinal scale can easily be obtained by ranking the elements of X according to their attractiveness and assigning real numbers that satisfy conditions (1) and (2). However, the differences between the scores on an ordinal scale can be arbitrary, whereas MCDA requires scales that reflect not only the order of attractiveness of the elements, but also the differences of their attractiveness.

To create a scale with meaningful differences between its scores, the user of the M-MACBETH software needed to judge the DoA for pairs of elements of X with one of the following attributes: ’extreme’, ’very strong’, ’strong’, ’moderate’, ’weak’, and ’very weak’. Based on these judgments, it could be checked whether a scale S was precardinal.

Definition 2 (Precardinal scale (reflecting given user judgment)) An ordinal scale $S : X \to \mathbb{R}$ is called a precardinal scale on $X$ if for all $x_i, x_j, x_l, x_k \in X$ such that $x_i$ is more attractive than $x_j$ and $x_l$ is more attractive than $x_k$ the following implication holds: If the difference of attractiveness between $x_i$ and $x_j$ was judged to be larger than the difference of attractiveness between $x_l$ and $x_k$, then $S(x_i) - S(x_j) > S(x_l) - S(x_k)$.

A positive affine transformation applied to a precardinal scale results in a precardinal scale that reflects the same given user judgment. Large/small distances on a precardinal scale correspond to large/small DoA between the respective elements. Precardinal scales, however, do not necessarily fulfill that the relative distances between scores on the scale exactly represent the relative DoA as experienced by the user. This is the characteristic of cardinal scales.
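Definitions 1 and 2 can be checked mechanically for a small example. The following Python sketch is illustrative only: the option names, the numeric coding of the six judgment categories, and the helper names `is_ordinal` and `is_precardinal` are assumptions made for this example and are not taken from M-MACBETH or “AniFair”.

```python
from itertools import combinations

# Assumed numeric coding of the six categories (larger number = larger DoA).
CATEGORY = {"very weak": 1, "weak": 2, "moderate": 3,
            "strong": 4, "very strong": 5, "extreme": 6}


def is_ordinal(scale, ranking):
    """Check conditions (1) and (2).

    scale   -- dict: option -> score S(x)
    ranking -- dict: option -> rank value, larger means more attractive
    """
    for a, b in combinations(scale, 2):
        if ranking[a] > ranking[b] and not scale[a] > scale[b]:
            return False
        if ranking[a] < ranking[b] and not scale[a] < scale[b]:
            return False
        if ranking[a] == ranking[b] and scale[a] != scale[b]:
            return False
    return True


def is_precardinal(scale, ranking, judgments):
    """Check Definition 2 for all judged pairs.

    judgments -- dict: (more attractive, less attractive) -> category name
    """
    if not is_ordinal(scale, ranking):
        return False
    for (xi, xj), cat1 in judgments.items():
        for (xl, xk), cat2 in judgments.items():
            # A larger judged DoA must give a strictly larger score difference.
            if CATEGORY[cat1] > CATEGORY[cat2]:
                if not scale[xi] - scale[xj] > scale[xl] - scale[xk]:
                    return False
    return True


# Hypothetical options a > b > c with hypothetical judgments.
ranking = {"a": 3, "b": 2, "c": 1}
judgments = {("a", "b"): "weak", ("b", "c"): "strong", ("a", "c"): "very strong"}
scale = {"a": 100.0, "b": 70.0, "c": 0.0}

# A positive affine transformation keeps the scale precardinal.
transformed = {x: 2.0 * s + 5.0 for x, s in scale.items()}
print(is_precardinal(scale, ranking, judgments),
      is_precardinal(transformed, ranking, judgments))
```

A scale such as a = 100, b = 20, c = 0 would still be ordinal for this ranking, but it would not be precardinal, because the ’strong’ difference between b and c would be smaller than the ’weak’ difference between a and b.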

In both the M-MACBETH software and “AniFair”, cardinal scales were obtained by giving the user the possibility to modify the precardinal scale proposed by the software (supplementary material, Appendix: Background of ’Making criteria comparable’, Visualization and adaption of scales).

The Choquet integral can be seen as a natural extension of the weighted arithmetic mean for the case that mutual preferential independence between criteria cannot be assumed. In practice, interaction phenomena among criteria occur. In this case the aggregation function cannot be considered additive, and not only the importance of each criterion but also the importance of subsets of criteria needs to be taken into account. Instead of a vector of weights, a monotone set function, called capacity, is introduced. For the remainder of this section let $n \in \mathbb{N}$ and $N = \{1, \dots, n\}$.

Definition 3 (Capacity) A set function $\mu : \{Y \mid Y \subseteq N\} \to [0, 1]$ is called a capacity if the following conditions hold:

$$\mu(\emptyset) = 0 \tag{3}$$

$$\forall Y_1, Y_2 \subseteq N : \quad Y_1 \subseteq Y_2 \;\Rightarrow\; \mu(Y_1) \leq \mu(Y_2) \tag{4}$$

Based on the concept of a capacity, the Choquet integral can be defined.

Definition 4 (Choquet integral) Let $f : N \to \mathbb{R}_+$ be a function represented by the vector $(f_1, \dots, f_n)$. Let $\theta$ be a permutation on $\{1, \dots, n\}$ satisfying $f_{\theta(1)} \leq \dots \leq f_{\theta(n)}$. For all $i \in \{1, \dots, n\}$ let $A_{\theta(i)} := \{\theta(i), \dots, \theta(n)\}$, and $A_{\theta(n+1)} := \emptyset$. Then the Choquet integral of $f$ with respect to a capacity $\mu$ is defined by

$$C_\mu(f) := \sum_{i=1}^{n} f_{\theta(i)} \left( \mu(A_{\theta(i)}) - \mu(A_{\theta(i+1)}) \right). \tag{5}$$
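Definition 4 translates almost line by line into code. The following Python sketch is a minimal illustration, assuming that the capacity is stored as a dictionary from `frozenset` to value; the function name `choquet_integral` and the example capacity are hypothetical and do not reproduce the R implementation used by “AniFair”.

```python
def choquet_integral(f, mu):
    """Choquet integral (5) of f = (f_1, ..., f_n) with respect to a capacity mu.

    f  -- sequence of non-negative scores; criterion i is stored at index i - 1
    mu -- dict mapping frozensets of criteria (numbered from 1) to capacity
          values; it must contain an entry for every set A_theta(i) and for
          the empty set
    """
    n = len(f)
    # theta orders the criteria by increasing score: f_theta(1) <= ... <= f_theta(n)
    theta = sorted(range(1, n + 1), key=lambda i: f[i - 1])
    total = 0.0
    for pos, crit in enumerate(theta):
        a_i = frozenset(theta[pos:])         # A_theta(i)
        a_next = frozenset(theta[pos + 1:])  # A_theta(i+1), empty for i = n
        total += f[crit - 1] * (mu[a_i] - mu[a_next])
    return total


# Hypothetical non-additive capacity on two criteria.
mu = {frozenset(): 0.0, frozenset({1}): 0.2,
      frozenset({2}): 0.3, frozenset({1, 2}): 1.0}

# For f = (60, 90): 60 * (1 - 0.3) + 90 * 0.3 = 69
print(choquet_integral([60.0, 90.0], mu))
```

With an additive capacity, for instance μ({1}) = 0.25 and μ({2}) = 0.75, the same function returns the weighted arithmetic mean 0.25 · 60 + 0.75 · 90.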

In case the capacity μ is an additive function, the Choquet integral coincides with a weighted arithmetic mean. The exponential complexity due to the fact that a capacity is in general given by a set of 2^{n} coefficients has been a limiting condition, since Grabisch [

Definition 5 (Möbius transform of a set function) The Möbius transform of a set function $\mu : \{Y \mid Y \subseteq N\} \to \mathbb{R}$ is for all $Y \subseteq N$ defined by

$$\operatorname{moeb}_\mu(Y) := \sum_{Z \subseteq Y} (-1)^{|Y \setminus Z|} \, \mu(Z). \tag{6}$$

Definition 6 (k-additive capacity) Let $\mu : \{Y \mid Y \subseteq N\} \to \mathbb{R}$ be a capacity. Let $k \in \mathbb{N}$ with $k \leq n$. $\mu$ is called k-additive if $\operatorname{moeb}_\mu(Y) = 0$ for all $Y \subseteq N$ with $|Y| > k$, and if there is at least one $Y \subseteq N$ with $|Y| = k$ and $\operatorname{moeb}_\mu(Y) \neq 0$.
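Definitions 5 and 6 can be illustrated by brute-force enumeration of subsets, which is only feasible for small n. In the following Python sketch the helper names `subsets`, `moebius`, and `additivity_order`, the tolerance `tol`, and the example capacity are assumptions made for illustration; they are not part of ’Kappalab’ or “AniFair”.

```python
from itertools import combinations


def subsets(items):
    """Return all subsets of items as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def moebius(mu, n):
    """Moebius transform (6) of a set function mu on N = {1, ..., n}."""
    return {y: sum((-1) ** len(y - z) * mu[z] for z in subsets(y))
            for y in subsets(range(1, n + 1))}


def additivity_order(mu, n, tol=1e-12):
    """Largest |Y| with a non-zero Moebius coefficient, i.e. k in Definition 6."""
    m = moebius(mu, n)
    return max((len(y) for y, val in m.items() if abs(val) > tol), default=0)


# Hypothetical capacity on two criteria.
mu = {frozenset(): 0.0, frozenset({1}): 0.25,
      frozenset({2}): 0.25, frozenset({1, 2}): 1.0}
m = moebius(mu, 2)
print(m[frozenset({1, 2})])     # 1.0 - 0.25 - 0.25 + 0.0 = 0.5
print(additivity_order(mu, 2))  # the pair coefficient is non-zero, so k = 2
```

For an additive capacity all Möbius coefficients of sets with more than one element vanish, so the function returns 1.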

Every k-additive capacity can thus be represented by at most $\sum_{i=1}^{k} \binom{n}{i}$ coefficients, which is a significant reduction of complexity [

As capacities put weight on all subsets that contain a criterion instead of weighting only the single criteria, the importance of a criterion was no longer captured by a single coefficient. Therefore, the Shapley value, introduced by Shapley, was used to address the relative importance of each criterion with respect to the decision problem. With $n$ being the number of criteria, the Shapley value was a vector $v = (v_1, \dots, v_n)$. For all $i \leq n$ the entry $v_i$ was called the Shapley index of the $i$-th criterion. Without loss of generality, $\sum_{i=1}^{n} v_i = 1$ was assumed.

Interaction between criteria could roughly be divided into three cases. Firstly, two criteria $i, j$ were said to be complementary, or to interact positively, when the importance of the pair was considered considerably larger than the importance of each of the two single criteria. This was represented by interaction indices $I_{ij} \in \, ]0, 1]$. Secondly, two criteria were called redundant, or were said to interact negatively, when the union of the criteria did not contribute more to the decision problem than each criterion individually. This was represented by interaction indices $I_{ij} \in [-1, 0[$. Thirdly, two criteria were said to be independent when they did not interact, i.e. the importance of the single criteria more or less summed up to the importance of the combination of criteria. The formula for and the development of the interaction index can be found in Murofushi and Soneda [

For a 2-additive capacity $\mu$ the formula for the Choquet integral of a function $f : N \to \mathbb{R}_+$ represented by the vector $(f_1, \dots, f_n)$ transforms into

$$C_\mu(f) = \sum_{i=1}^{n} v_i f_i - \frac{1}{2} \sum_{I_{ij} \neq 0} I_{ij} \, |f_i - f_j|, \tag{7}$$

with the property $v_i - \frac{1}{2} \sum_{j \neq i} |I_{ij}| \geq 0$ for all $i \in N$. The second term of the sum,

$$-\frac{1}{2} \sum_{I_{ij} \neq 0} I_{ij} \, |f_i - f_j|,$$

could be seen as the part of the Choquet integral value that results from interaction of criteria [
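Formula (7) can be evaluated directly from the Shapley value and the interaction indices. The Python sketch below is a minimal illustration under two assumptions: the sum over $I_{ij} \neq 0$ is read as a sum over unordered pairs of criteria, and the function name `choquet_2additive` as well as the Shapley and interaction values of the first example are hypothetical. The second example uses the scores of farm 13 and the Shapley value from the aggregation of instances reported in this article.

```python
from itertools import combinations


def choquet_2additive(f, shapley, interaction):
    """Evaluate formula (7); return (Choquet value, signed interaction term).

    f           -- scores f_1, ..., f_n
    shapley     -- Shapley indices v_1, ..., v_n (summing to 1)
    interaction -- dict {(i, j): I_ij} with 0-based indices i < j;
                   pairs that are not listed are treated as I_ij = 0
    """
    linear = sum(v * x for v, x in zip(shapley, f))
    inter = -0.5 * sum(interaction.get((i, j), 0.0) * abs(f[i] - f[j])
                       for i, j in combinations(range(len(f)), 2))
    return linear + inter, inter


# Hypothetical example with two complementary criteria (I_12 = 0.5):
# 0.45 * 60 + 0.55 * 90 - 0.5 * 0.5 * |60 - 90| = 69
value, part = choquet_2additive([60.0, 90.0], [0.45, 0.55], {(0, 1): 0.5})
print(value, part)

# Farm 13 in the aggregation of instances: equal Shapley indices and no
# interaction, so the Choquet value equals the arithmetic mean of the scores.
value_13, part_13 = choquet_2additive([63.33, 80.50, 77.70, 68.98], [0.25] * 4, {})
print(round(value_13, 2), part_13)
```

Returning the interaction term separately makes it possible to relate it to the total Choquet value, as done for the shares of interaction discussed in the results.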

“AniFair” was implemented using R [

An installer for “AniFair” can be downloaded at https://www.anifair.uni-kiel.de/de/willkommen-bei-anifair. It comes with a portable version of R 3.4.1 and the above-mentioned packages, so that “AniFair” runs in its development environment and instabilities due to version conflicts are avoided. In addition, example status files are provided for trial runs (supplementary material, Saving and reloading ’AniFair’ status).

“AniFair” was designed to assist the user in deciding between objects of interest (OoI) when multiple criteria that are not directly comparable are involved. It was possible to run more than one instance of “AniFair” simultaneously. As all instances in the ’Multiple instances’ version worked in the same way as the single instance version, “AniFair” is explained with respect to a single instance.

The procedure associated with the software tool “AniFair” could be divided into the three sectors ’Creation of criteria tree’, ’Making criteria comparable’, and ’Choquet integral aggregation’ as illustrated in

In the GUI window opened by “AniFair” the topic of the decision problem could be inserted as root of the criteria tree (

framed box container for the entering of OoI and a respective framed box container for the building of the criteria tree. Both objects and criteria could be entered manually or uploaded from file. “AniFair” prevented the entering of object or criteria names that had already been used for other items. Entered objects and criteria were presented in the “AniFair” start window, each associated with buttons ’Alter’, ’.Delete’, and ’.Restore’. The ’.Delete’ button left the object or criterion greyed out, and it was not used in further processing unless it was restored. While the entered object names were all listed in one framed box container, each criterion had its own framed box container, because the definition of second level criteria (subcriteria) was possible. All subcriteria of one criterion were displayed in the same framed box container as the criterion. The entering of second level criteria was carried out within the framed box container of the corresponding first level criterion.

With each criterion or subcriterion, additional ’DA’ buttons were displayed. It had to be marked for which first or second level criteria data had been collected and which data, respectively (sub)criteria, should be used in the aggregation process. Thus, if a first level criterion was marked as ’Data Available’ (’DA’), none of its subcriteria could be marked, and if a subcriterion was marked as ’DA’, the corresponding first level criterion could not be marked at the same time. This gave the user the possibility to design his or her criteria tree as visualization of the decision problem, and then independently decide upon the criteria involved in the decision process.

Instead of entering OoI and criteria tree manually or uploading them from individual files, a complete “AniFair” status from a former “AniFair” application could be reloaded. The ’LOAD’ button opened a drop down menu from which an “AniFair” status file (Section 2.5.2, paragraph Saving and reloading “AniFair” status) could be chosen.

Independent and dependent subcriteria. “AniFair” distinguished between two types of subcriteria. The subcriteria of a first level criterion were considered dependent if the states of the subcriteria were affected by each other. For example, the criterion ’BCS_S’ was split into the dependent subcriteria ’BCS_S_1’ and ’BCS_S_2’, measured as percentages of sows with BCS ’1’ and ’2’, respectively (Sections 2.2.1 and 2.5.5). As an animal scored ’1’ could not be scored ’2’ simultaneously, these percentages (i.e. the states of the subcriteria) were not independent of each other. A pre-aggregation of the subcriteria had to take place within ’BCS_S’, and ’BCS_S’ was afterwards used in the main aggregation. With independent subcriteria, the state of one subcriterion did not influence the state of the remaining subcriteria. Independent subcriteria were used in the aggregation together with first level criteria; no pre-aggregation took place.

Before the processing could continue, the user had to tell “AniFair” which subcriteria should be considered independent and which dependent (supplementary material, Figure S.2).

Limiting the number of criteria per aggregation step. As the computational time for capacity calculation grew disproportionately with the number of criteria, “AniFair” allowed at most fifteen criteria per aggregation step. With sixteen or more criteria, very large objects burdened the working memory, or the calculation could not be carried out at all because the native code of lin.prog.capa.ident did not support long vectors (64-bit indices).
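The growth behind this limit can be made concrete by counting coefficients. The following Python sketch only evaluates the two counts given in Section 2.4, namely $2^n$ for a general capacity and the bound $\sum_{i=1}^{k} \binom{n}{i}$ for a k-additive capacity; the function names are hypothetical, and the sketch does not model the actual run time or memory use of lin.prog.capa.ident.

```python
from math import comb


def n_general(n):
    """Number of coefficients of a general capacity on n criteria."""
    return 2 ** n


def n_k_additive(n, k):
    """Upper bound sum_{i=1}^{k} C(n, i) for a k-additive capacity."""
    return sum(comb(n, i) for i in range(1, k + 1))


for n in (3, 15, 16):
    # columns: n, general capacity, 2-additive capacity
    print(n, n_general(n), n_k_additive(n, 2))
```

For fifteen criteria this gives 32768 coefficients for a general capacity but only 120 for a 2-additive one.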

The main user interaction occurred in the part of “AniFair” in which the criteria were made comparable. This was approached in a very similar way as in the M-MACBETH software [

The user had to define the different states (performance levels) the ’DA’ criteria could take (

In case of dependent subcriteria, pre-aggregations within the respective first level criterion took place, and the first level criterion was then used in the main aggregation (Section 2.5.1, Independent and dependent subcriteria). Thus, not only scales for the dependent subcriteria (’BCS_S_1’, ’BCS_S_2’), but also a scale for the respective first level criterion (’BCS_S’) was needed. As a basis for this, additional matrices of judgment had to be filled in by the user concerning the DoA between the dependent subcriteria (

As the final step before the aggregation with the Choquet integral could be carried out, the OoI needed to be assigned scores from the final criteria scales $S_1^{\text{final}}, \dots, S_n^{\text{final}}$ (paragraph Scoring of objects of interest).

Export of user entered information. “AniFair” offered to export all user-entered information to human-readable txt files. This included the criteria tree, the defined performance levels, the filled-in matrices of judgment, and the scales. In contrast to the uncommon mcb file format exported by the M-MACBETH software, these files can easily be viewed and can serve as a basis for discussion between groups of decision makers. Examples of these files are given in the supplementary material, Appendix: Data exported from “AniFair”. In addition, the status of “AniFair” could be saved to less human-readable files and reloaded in case a modeling session needed to be interrupted (supplementary material, Saving and reloading “AniFair” status).

Scoring of objects of interest. Every OoI was associated with one performance level per ’DA’ criterion according to the available data (as an example see

For the latter, the user needed to prepare a file organized as follows. OoI denoted the rows and ’DA’ criteria denoted the columns. It was important that the object and criteria names in the file matched the names entered in “AniFair” by the user. For criteria with qualitative performance levels, the fields of the table could only hold performance levels as defined by the user. For criteria with quantitative performance levels, the fields of the table contained the originally collected data, which “AniFair” internally compared with the defined quantitative performance levels. “AniFair” could cope with differences in the number or order of OoI or ’DA’ criteria between the user-entered information and the file, and it deleted duplicates. For the ’Good feeding’ example the beginning of the corresponding file is depicted in Listing 1.

If the upload was still not successful, a window was opened that showed an example of how to prepare the file, and “AniFair” allowed the user to choose another file or to switch to manual entering of object scores. For the manual entering of scores a notebook object with one tab per OoI was opened. On every tab the ’DA’ criteria were listed with drop-down menus holding the performance levels (

As the performance levels of the criteria corresponded to scores on the final criteria scales $S_1^{\text{final}}, \dots, S_n^{\text{final}}$, the collections of performance levels associated with the OoI were transformed within “AniFair” into vectors of scale entries between 0 and 100. These vectors formed the rows of an $m \times n$ matrix Scores(OoI) with one row per OoI and one column per criterion.
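The transformation of performance levels into the matrix Scores(OoI) amounts to a table lookup. The Python sketch below illustrates this under stated assumptions: the function name `build_scores` and the data layout are hypothetical, the performance level labels are those of the ’Good feeding’ example, and the scale values assigned to them are placeholders chosen for illustration, not the final scales obtained in “AniFair”.

```python
def build_scores(objects, criteria, levels, final_scales):
    """Return the m x n matrix Scores(OoI) as a list of rows.

    objects      -- list of OoI names (one row each)
    criteria     -- list of 'DA' criteria names (one column each)
    levels       -- dict: (object, criterion) -> performance level label
    final_scales -- dict: criterion -> {performance level label: score}
    """
    matrix = []
    for obj in objects:
        row = []
        for crit in criteria:
            score = final_scales[crit][levels[(obj, crit)]]
            if not 0 <= score <= 100:
                raise ValueError(f"score out of range for {obj}/{crit}")
            row.append(score)
        matrix.append(row)
    return matrix


# Placeholder scale values (assumed, for illustration only).
final_scales = {
    "Age_of_weaning": {">28": 100.0, "28-24.5": 60.0, "<24.5": 0.0},
    "Water_supply": {"0": 100.0, "2": 0.0},
}
# Hypothetical performance levels of two farms.
levels = {("1", "Age_of_weaning"): "28-24.5", ("1", "Water_supply"): "2",
          ("2", "Age_of_weaning"): ">28", ("2", "Water_supply"): "0"}

scores = build_scores(["1", "2"], ["Age_of_weaning", "Water_supply"],
                      levels, final_scales)
print(scores)  # [[60.0, 0.0], [100.0, 100.0]]
```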

There were several mathematical approaches to identify a k-additive capacity (Section 2.4, Definition 6) reflecting specific user given information. An overview on the methods provided by the R-package ’Kappalab’ [

The R-function lin.prog.capa.ident was based on linear programming and used the R-package ’lpSolve’. Given the input Acp created in Listing 2, an object of class Mobius.capacity was created which held the capacity μ as list of coefficients. μ could afterwards be used to calculate the corresponding Choquet integral (Definition 4) using the function Choquet.integral (Listing 3).

Visualization of the Choquet results of pre-aggregation steps. In case of dependent subcriteria, “AniFair” opened a notebook with one tab for each criterion that had dependent subcriteria, and each tab showed a table of results. Let $n_{\text{dep}}$ be the number of dependent subcriteria for the respective criterion; then the table had $n_{\text{dep}} + 2$ or $n_{\text{dep}} + 3$ columns. In the first column, the objects were listed. The following $n_{\text{dep}}$ columns held the scores of the objects for the dependent subcriteria, i.e. the respective columns of the matrix Scores(OoI) (Section 2.5.2, paragraph Scoring of objects of interest). These were followed by a column for the mean scores and, if Choquet integral values existed, one column for these values. The displayed results equaled the solution as provided by lin.prog.capa.ident without additional constraints on the Shapley value or the interaction (Listing 3). In order not to complicate the workflow, the definition of constraints (compare paragraph Application of constraints and re-calculation) and the calculation of a weighted mean were not supported for pre-aggregation steps. The main aggregation step needed to be initiated by the user by clicking ’OK’.

Visualization of the Choquet results of the main aggregation and adding of constraints. For the main aggregation step, the results of the Choquet integral calculation, the Shapley value, and the matrix of interaction indices were visualized in a window comprising three separate tables. If no dependent subcriteria were present, the matrix Scores(OoI) was included in the table at the top as the columns holding the scores for the OoI. In case of dependent subcriteria, the columns for these subcriteria were replaced by one column containing the pre-aggregated results for the respective first level criterion (

No Choquet integral solution: Weighted mean as alternative. If no solution existed for the capacity, no Choquet integral values could be calculated. In the main aggregation, “AniFair” then proposed the calculation of a weighted mean as an alternative and gave the user the opportunity to define or alter weights for the criteria. The results were presented in a table with a column for the weighted mean instead of the Choquet integral values (supplementary material, Figure S.7).

Application of constraints and re-calculation. The ’Add constraints on Shapley value and interaction’ button opened a notebook with four tabs, as two types of constraints could be defined for both the Shapley value and the interaction between criteria. On the one hand, interval boundaries could be set: between 0 and 1 for the Shapley indices and between −1 and 1 for the interaction indices. On the other hand, pre-orders could be defined, for which the corresponding notebook tabs presented interactive matrices. In case of the Shapley indices, the criteria were displayed along the rows and the columns (Figure 7). A right mouse click opened a drop-down menu holding the choices ’=’, ’<’, and ’>’, associated with the constraint that the Shapley index of the criterion naming the row should be equal to, lower than, or greater than the Shapley index of the criterion naming the column. In case of the interaction indices, all pairs of criteria (i.e. ’BCS_S’-’Age_of_weaning’, ’BCS_S’-’Water_supply’, ’Age_of_weaning’-’Water_supply’) were displayed along the rows and columns (supplementary material, Figure S.5(c)). Again, the relation between the interaction indices of the criteria pairs could be set to ’=’, ’<’, or ’>’ in drop-down menus. Preference thresholds $\delta_S$, $\delta_I$ were defined, and the matrices Asp and Asi (for the Shapley value) and Aip and Aii (for the interaction) were generated for the formalization of the constraints. For every constraint that defined an interval, one line consisting of the index (or indices in case of the interaction between two criteria) and the interval boundaries was added to Asi or Aii, respectively (Listing 4).

For every constraint defining the pre-order of Shapley or interaction indices two lines (equality of indices) or one line were added to Asp or Aip according to Listing 5.

The pre-order of the OoI given to lin.prog.capa.ident via the argument Acp (Listing 2) was based on the assumption that all criteria were equally important to the decision problem. In the re-calculation the pre-order needed to be reconsidered when constraints on the Shapley value were defined that suggested otherwise. The generation of a weight vector representing the defined Shapley value constraints was implemented using the function lp from the R-package ’lpSolve’. A weighted version of the matrix Scores(OoI) was calculated by multiplying each row element-wise with the weight vector. From this weighted version of Scores(OoI) a weighted version of Acp was created according to Listing 2 and passed to lin.prog.capa.ident for the re-calculation (Listing 6) together with the constraint-defining matrices Asp, Asi, Aip, and Aii.

Provided that a solution existed that satisfied the given constraints, the results were displayed in a two-sided window (supplementary material, Figure S.6). On the left the solution from the preceding calculation was presented, and on the right the re-calculated solution could be seen. If no solution existed, the user was asked to define less strict constraints.

Export of results. The windows displaying the results of aggregations were equipped with an ’Export’ button, in order to export the results to txt files (supplementary material, Listing Exported.3, Exported.5) and csv files (supplementary material, Listing Exported.4, Exported.6).

In the ’Multiple instances’ version of “AniFair” the instances appeared as tabs in the main window (

Creation of criteria tree. In the “AniFair” instance in the first tab (Figure 2) the topic ’Good feeding’ was entered as root of the criteria tree. Furthermore, the thirteen farms were entered as objects ’1’, ..., ’13’ as well as the first level criteria ’BCS_S’ (body condition score of sows), ’Age_of_weaning’, and ’Water_supply’. ’BCS_S’ was split up into the second level criteria ’BCS_S_1’ and ’BCS_S_2’. The second level criteria ’BCS_S_1’ and ’BCS_S_2’ and the first level criteria ’Age_of_weaning’ and ’Water_supply’ were marked as ’DA’ criteria. As BCS was scored on a three-point scale and ’BCS_S_0/1/2’ were measured in percentages of affected animals, all information concerning ’BCS_S_0’ was given with the information on ’BCS_S_1’ and ’BCS_S_2’. A visualization of the complete criteria tree can be found in the supplementary material (Figure S.1). The second level criteria ’BCS_S_1’ and ’BCS_S_2’ were marked as dependent after “Proceed calculation” was clicked (supplementary material, Figure S.2, Listing Exported.1).

Making criteria comparable. The definition of performance levels, the filling of the matrices of judgment, and the adaption of the scales were carried out by the same person who collected the data. This ensured a proper scientific background in the topic of animal welfare in the decision process. The bases of comparison were set to ’Quantitative performance level’ for the ’DA’ criteria ’BCS_S_1’, ’BCS_S_2’, and ’Age_of_weaning’. Being measured as percentages of sows and numbers of days, these criteria were scored on numerical scales (Section 2.2). Performance levels were inserted in order of decreasing attractiveness. The BCS value ’1’ was given to sows that were not in the healthy range; therefore, low percentages of sows with BCS ’1’ were desirable. The three performance levels ’<8.7’, ’8.7-18.6’, and ’>18.6’ were defined. ’2’ was the least desirable BCS value, as it described malnourished sows. The performance levels defined here were ’0’, ’0-0.3’, and ’>0.3’. For ’Age_of_weaning’ the average numbers of days from birth to weaning were grouped into the three performance levels ’>28’, ’28-24.5’, and ’<24.5’. As ’Water_supply’ was measured qualitatively by judging the cleanliness and functionality of the drinkers, ’Qualitative performance level’ was set. The descriptions ’0: adequate’ and ’2: cleanliness and/or functionality not adequate’ were entered with the abbreviations ’0’ and ’2’. Pictures of the inserted performance levels and the graphical visualizations of the adapted scales can be found in the supplementary material (Figures S.3 and S.4, respectively). All user-defined information was exported to txt files and can also be found in the supplementary material (Listings Exported.1 and Exported.2). The OoI were afterwards scored by uploading the information from the file depicted in Listing 1, and the following 13 × 4 matrix Scores(OoI) was built (Listing 7).

Choquet integral aggregation. Figure 6 shows the results for the Choquet integral calculation. Additional constraints were defined afterwards. The criterion ’BCS_S’ was a so-called animal-based measure [

Aggregation of instances. As can be seen in

According to the information given by the user, the following final scales were obtained for the ’DA’ criteria (supplementary material, Listing Exported.2):

The scores of the 13 farms under analysis with respect to these scales can be found in

Apart from the results delivered by “AniFair”, the part of the Choquet integral values which can be attributed to criteria interaction calculated according to Formula 7 (Section 2.4) can be found in the last column of

The final solution with regard to the aggregation of instances was associated with the Shapley value v = ( 0.25,0.25,0.25,0.25 ) and for all pairs of welfare principles the interaction indices equaled zero as set by the user. Choquet integral values ranged from 39.02 to 72.63 and coincided with the mean of the

Rank | Farm | BCS_S | Age_of_weaning | Water_supply | Mean score | Choquet values | Part of interaction |
---|---|---|---|---|---|---|---|
1 | 8 | 60.00 | 79.90 | 90.00 | 76.63 | 64.55 | 3.30 |
2 | 4 | 60.00 | 79.90 | 19.90 | 53.27 | 51.17 | 6.60 |
3 | 3 | 42.85 | 79.90 | 90.00 | 70.92 | 51.07 | 5.19 |
4 | 6 | 42.85 | 79.90 | 90.00 | 70.92 | 51.07 | 5.19 |
5 | 1 | 42.85 | 79.90 | 19.90 | 47.55 | 39.61 | 6.60 |
6 | 10 | 25.71 | 79.90 | 90.00 | 65.20 | 37.59 | 7.07 |
7 | 9 | 60.00 | 0.00 | 19.90 | 26.63 | 36.50 | 6.60 |
8 | 11 | 25.71 | 79.90 | 19.90 | 41.84 | 28.06 | 6.60 |
9 | 12 | 25.71 | 0.00 | 90.00 | 38.57 | 20.03 | 9.90 |
10 | 7 | 0.00 | 100.00 | 90.00 | 63.33 | 19.93 | 11.00 |
11 | 13 | 0.00 | 100.00 | 90.00 | 63.33 | 19.93 | 11.00 |
12 | 2 | 0.00 | 79.90 | 90.00 | 56.63 | 17.37 | 9.90 |
13 | 5 | 25.71 | 0.00 | 19.90 | 15.20 | 17.27 | 2.83 |

Rank | Farm | Good feeding | Good housing | Good health | Appropriate behaviour | Mean score | Choquet values |
---|---|---|---|---|---|---|---|
1 | 13 | 63.33 | 80.50 | 77.70 | 68.98 | 72.63 | 72.63 |
2 | 7 | 63.33 | 75.00 | 67.33 | 63.48 | 67.28 | 67.29 |
3 | 8 | 76.63 | 60.50 | 44.81 | 59.08 | 60.26 | 60.27 |
4 | 1 | 47.55 | 61.00 | 52.23 | 54.75 | 53.88 | 53.88 |
5 | 3 | 70.92 | 48.00 | 49.97 | 33.22 | 50.53 | 50.54 |
6 | 9 | 26.63 | 65.00 | 43.65 | 64.29 | 49.89 | 49.89 |
7 | 11 | 41.84 | 55.00 | 60.73 | 41.70 | 49.82 | 49.82 |
8 | 10 | 65.20 | 45.00 | 43.15 | 45.25 | 49.65 | 49.66 |
9 | 2 | 56.63 | 30.00 | 45.70 | 53.75 | 46.52 | 46.52 |
10 | 6 | 70.92 | 50.00 | 38.60 | 25.12 | 46.16 | 46.18 |
11 | 4 | 53.27 | 40.00 | 44.15 | 47.06 | 46.12 | 46.12 |
12 | 12 | 38.57 | 50.00 | 47.43 | 28.44 | 41.11 | 41.11 |
13 | 5 | 15.20 | 40.00 | 39.19 | 61.75 | 39.03 | 39.02 |

scores due to vanishing interaction indices (

In the present article, the software tool “AniFair” for Multi-criteria decision analysis was introduced and presented via an example of assessing animal welfare with regard to the principles and criteria from the Welfare Quality® Assessment protocol for pigs. In contrast to ’Growing and finishing pigs’, no proposal for an aggregation system regarding ’Sows and piglets’ has been released yet [

To establish a ranking of the farms considering all criteria associated with ’Good feeding’ was a difficult task. The decision maker was forced to compare and weight multilayered information, as both the quantitative criteria (’BCS_S_1’, ’BCS_S_2’, ’Age_of_weaning’), which were measured on incomparable scales, and the qualitative criterion ’Water_supply’ needed to be taken into account. The MACBETH approach [

Looking at other methods in multi-criteria decision, the UTA (UTilités Additives) method proposed by Jacquet-Lagreze and Siskos [

However, in the M-MACBETH software, a questioning-answering-protocol was analogously used to determine criteria weights for the additive aggregation function. “AniFair” used the MACBETH approach solely to generate scales on which the criteria can be addressed comparably, but not for the weighting of criteria. As another difference, on every aggregation level within “AniFair” the Choquet integral [

For the capacity calculation, the ’maximum split’ method was chosen that led to dispersed utilities and reached the maximal split that a Choquet integral solution can take for the given pre-order of objects [

Since independence of criteria was usually not given in real-life decision problems [

A further consequence of the recalculation concerns the distribution of Choquet integral values. The ’maximum split’ solution was associated with fairly balanced weighting between the criteria. After recalculation, the range of Choquet integral values was narrower, the mean difference between consecutive farms was smaller, and the differences showed stronger variation. Instead of nearly equidistantly ranked farms with the ’maximum split’ solution, the distribution after the recalculation showed a majority of negligible differences. In this way, the more specific adaption of the model to user preferences in terms of constraints led to a clear separation into groups of farms with comparable overall ’Good feeding’ scores, while the top-ranked farm, farm ’8’, was clearly superior to the following farms. Thus, a more pronounced statement towards the animal welfare status was made.
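The separation into groups can be made tangible with a short sketch that splits the ranking wherever two consecutive Choquet integral values differ by more than a chosen gap. The values are copied from the ’Good feeding’ results table; the function name and the threshold of 5 points are assumptions made for illustration, since the grouping rule is not stated in this section.

```python
# Choquet integral values for 'Good feeding' as (farm, value),
# copied from the results table and already in ranking order.
choquet = [
    (8, 64.55), (4, 51.17), (3, 51.07), (6, 51.07), (1, 39.61),
    (10, 37.59), (9, 36.50), (11, 28.06), (12, 20.03), (7, 19.93),
    (13, 19.93), (2, 17.37), (5, 17.27),
]


def group_by_gap(ranked, min_gap):
    """Split a descending ranking wherever neighbours differ by more than min_gap."""
    groups = [[ranked[0][0]]]
    for (_, prev_value), (farm, value) in zip(ranked, ranked[1:]):
        if prev_value - value > min_gap:
            groups.append([])  # a large gap starts a new group
        groups[-1].append(farm)
    return groups


# min_gap = 5 is an arbitrary illustrative threshold, not taken from the article.
for group in group_by_gap(choquet, min_gap=5.0):
    print(group)
```

With this threshold the sketch separates farm 8 from all others and yields five groups in total; a different threshold would give a different partition, so the output should be read as an illustration of the pattern of large and negligible differences rather than as the authors' grouping.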

Similar to the aforementioned AHP method, at least two aggregation levels were possible with “AniFair”. This supported the natural human tendency to break down selection processes and to split up decision making into several stages when the number of objects increased [

All information was entered in “AniFair” via Graphical User Interfaces. The user was guided through the decision process, and his or her content-related expertise was specifically queried when needed for the next step in decision making. Behind the red ’?’ buttons that were placed in all windows where user interaction was needed (^{4} are needed.

Regarding animal welfare, the main concern is its measurability, as a clear definition as well as how to address animal welfare with overall scores are heavily discussed topics and subjects to current scientific research [

Gratitude is expressed to the farmers that participated in the data collection.

This work was supported by the Federal Ministry of Food and Agriculture (funding code 2817200913).

The authors declare no conflicts of interest regarding the publication of this paper.

Salau, J., Friedrich, L., Czycholl, I. and Krieter, J. (2020) “AniFair”: A GUI Based Software Tool for Multi-Criteria Decision Analysis—An Example of Assessing Animal Welfare. Agricultural Sciences, 11, 278-331. https://doi.org/10.4236/as.2020.113018

On the Chosen Example

Animal welfare has come into the public eye and has become an important issue for consumers. Policymakers have introduced various legal requirements for farmers to look after and maintain the welfare status of their animals. To avoid emotionality in the discussion about this topic, it became essential to clearly define the terminology and to provide a conceptually sound assessment of animal welfare. For this reason, scientific work has been carried out, inter alia, by the Welfare Quality® project. The latter identified twelve welfare criteria that were partitioned into the four welfare principles ’Good feeding’, ’Good housing’, ’Good health’, and ’Appropriate behaviour’. The assessment of animal welfare according to Welfare Quality® was based on multiple indicators which, in the form in which they were gathered, were not necessarily comparable but were measured as binary decisions, on three-point scales, or on cardinal scales. Comparability of the collected information was achieved by decision trees, index calculation, and I-spline functions, before the scores were aggregated stepwise to an overall evaluation of the welfare standards of farms (Welfare Quality® 2009a; Welfare Quality® 2009b; Welfare Quality® 2009c).

When it came to the welfare of pigs, Welfare Quality® proposed aggregation systems for ’Growing and finishing pigs’ based on the above mentioned methods, which were implemented in an online calculator^{S1} to achieve overall welfare scores. For ’Sows and piglets’ no proposal for an aggregation system has been released yet. The authors chose ’Sows and piglets’ in terms of the welfare principle ’Good feeding’ as the main example to present the functionality of “AniFair”, because it was less likely that a direct comparison with a currently used aggregation system could cloud the judgment of the possibilities offered by “AniFair”. Furthermore, “AniFair” provided the choice between a ’Single instance’ version and a ’Multiple instances’ version. The remaining three welfare principles were used to present the ’Multiple instances’ version and the possibility to aggregate over multiple instances, but those were neither discussed nor illustrated in detail.

^{S1}http://www1.clermont.inra.fr/wq/index.php?id=simul&new=1.

An expert in animal welfare collected the data and made all decisions regarding criteria performance levels, the differences of attractiveness between them, the modification of precardinal scales, and the definition of constraints. However, it was not the intent of the main article to discuss the meaning of the user entered information or the resulting ranking of farms as final truth concerning the assessment of pig welfare. Rather, “AniFair” should be introduced as a tool to address the assessment of, e.g., animal welfare in a transparent way.

As not all user entered information or results could be displayed in the main article, the missing visualizations for the ’Good feeding’ example as well as graphical illustrations concerning the aggregation over all four instances ’Good feeding’, ’Good housing’, ’Good health’, and ’Appropriate behaviour’ were placed in this Supplementary material.

“AniFair” Application to ’Good Feeding’ in ’Sows and Piglets’

Creation of criteria tree. The welfare criteria ’Absence of prolonged hunger’ and ’Absence of prolonged thirst’ (main article Section 2.2.1) were assessed via the measures body condition score of sows (’BCS_S’), age of weaning (’Age_of_weaning’), and water supply (’Water_supply’). ’BCS_S’ was divided into the second level criteria (subcriteria) ’BCS_S_1’ and ’BCS_S_2’. At data collection, the percentages of sows scored ’1’ and ’2’ were calculated for every farm. The criterion ’Age_of_weaning’ was assessed as the averaged number of days from birth to weaning as stated by the farmer. The criterion ’Water_supply’ was given by a binary decision on whether the drinking places for sows and piglets were adequate regarding the cleanliness and functionality of all drinkers (score ’0’) or not (score ’2’). From these criteria, a criteria tree was built in the “AniFair” main window, which is fully displayed in Figure S.1. All criteria that were selected by the ’DA’ buttons were marked in bold and red font. Hitting the ’Proceed calculation’ button opened a GUI window where the User had to decide upon the in/dependence of subcriteria (main article Section 3.1, Independent and dependent subcriteria) and confirm the choices before further processing could be carried out (Figure S.2).

Making criteria comparable. Figure S.3 illustrates the definitions of the performance levels for the decision criteria. As a next step, the matrices of judgment could be filled in (Appendix: Background of ’Making criteria comparable’). The evaluation of the differences of attractiveness can be viewed for all criteria in Listing Exported.1, together with the export of the criteria tree and the definition of performance levels. All Listings are presented in the Appendix: Data exported from “AniFair” of this Supplementary material.

Based on these user preferences, “AniFair” scales were calculated. These were precardinal scales, i.e. the distances between the entries on the scale mirrored the qualitative attributes with which the User evaluated the pairwise differences between performance levels. However, the relative differences of attractiveness as experienced by the User might not be represented by the distances between entries on the “AniFair” scales. For this reason, the User was asked to modify the scales after inspecting the graphical visualization. Figure S.4 illustrates the “AniFair” scales on the left and the final criteria scales after user modification on the right, using ’Age_of_weaning’ and ’Water_supply’ as examples. For instance, the User felt that the performance level ’28 - 24.5’ of criterion ’Age_of_weaning’ needed to be scored closer to the maximum score 100 associated with the performance level ’>28’ than “AniFair” had suggested. Thus, the User modified the scale for ’Age_of_weaning’ by raising the score for ’28 - 24.5’ via the spin buttons of the thermometer. “AniFair” internally calculated boundaries (Appendix: Background of ’Making criteria comparable’, Dependent intervals) for the modification of the scale to prevent the user preferences entered earlier from being violated (Figure S.4(a) & Figure S.4(b)). The “AniFair” scale for ’Water_supply’ consisted of a straight line from 100 to 0, because only two performance levels were defined. One possible modification of this scale without violating the condition that the User judged the difference of attractiveness between ’0’ and ’2’ as ’very strong’ would be to lower the score of ’0’ to 90.0 and raise the score of ’2’ to 19.9, as visualized in the example (Figure S.4(c) & Figure S.4(d)). For ’BCS_S_1’, ’BCS_S_2’, and ’BCS_S’ no user modifications were made. For the sake of reproducibility, all “AniFair” and final scales were exported to a txt file and can be seen in Listing Exported.2 in the Appendix: Data exported from “AniFair”.

Choquet integral aggregation. For the calculation of the Choquet integral, the User needed to provide “AniFair” with information on how each of the thirteen farms performed with regard to the ’DA’ criteria. In this example, this scoring of objects took place via upload from file (main article Section 3.2, Scoring of objects of interest).

The performance levels assigned to the farms were transformed into scores on the final criteria scales, and in a first run Choquet integral values were calculated without any additional constraints (main article Section 3.3, Visualization of the Choquet results of the main aggregation and adding of constraints). In addition to the constraints with regard to the pre-order of the Shapley indices (main article Figure 7, Section 3.3, Application of constraints and re-calculation), the constraints displayed in Figure S.5 were defined. The User wanted the Shapley index of ’BCS_S’ to be higher, because animal-based measures such as ’Body condition score’ were considered more important in the assessment of animal welfare (Figure S.5(a)). Figure S.5(b) shows that all interaction indices had been set greater than zero. As the welfare of pigs was sensitive towards prolonged hunger as well as prolonged thirst, the importance of the union of criteria was considered larger than the importance of single criteria, and thus all criteria interacted positively (complementary criteria). Furthermore, the User considered it necessary that the interaction indices for all pairs of criteria coincided. These constraints were defined via the pre-order of interaction indices (Figure S.5(c)).
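The effect of positive (complementary) interaction can be sketched with the Choquet integral written in terms of Shapley indices and pairwise interaction indices. The sketch assumes a 2-additive capacity, which this section does not state, and uses the known representation C(x) = Σ φ_i x_i − ½ Σ_{i<j} I_ij |x_i − x_j| for that case. The Shapley value (0.5, 0.25, 0.25) and the common interaction index 0.2 are hypothetical numbers chosen only to satisfy the constraints described above; they are not the values calculated by “AniFair”, so the outputs differ from the reported Choquet values.

```python
from itertools import combinations


def choquet_2additive(scores, shapley, interaction):
    """Choquet integral for a 2-additive capacity (assumption of this sketch).

    scores      -- criterion scores x_i
    shapley     -- Shapley indices phi_i, summing to 1
    interaction -- dict {(i, j): I_ij} for i < j; missing pairs count as 0
    """
    value = sum(p * x for p, x in zip(shapley, scores))
    for i, j in combinations(range(len(scores)), 2):
        # Positive interaction penalises unbalanced scores on criteria i and j.
        value -= 0.5 * interaction.get((i, j), 0.0) * abs(scores[i] - scores[j])
    return value


# Hypothetical parameters: 'BCS_S' most important, equal positive interaction.
shapley = [0.5, 0.25, 0.25]
interaction = {(0, 1): 0.2, (0, 2): 0.2, (1, 2): 0.2}

# Scores (BCS_S, Age_of_weaning, Water_supply) of farms 8 and 7 from the main article.
balanced = choquet_2additive([60.0, 79.9, 90.0], shapley, interaction)
unbalanced = choquet_2additive([0.0, 100.0, 90.0], shapley, interaction)
print(round(balanced, 3), round(unbalanced, 3))

# With zero interaction and equal Shapley indices the integral is the plain mean.
print(choquet_2additive([63.33, 80.50, 77.70, 68.98], [0.25] * 4, {}))
```

The last call reproduces, up to rounding, the value 72.63 reported for farm 13 in the aggregation of instances, where the interaction indices equaled zero and all Shapley indices were 0.25.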

Figure S.6 shows that the ranking of farms with regard to the welfare principle ’Good feeding’ had changed for rank 2 and following. Furthermore, the Shapley value and the interaction indices had been adapted according to the defined constraints. The final results as well as the constraints were then exported to a txt file and a csv file (Listings Exported.3 and Exported.4 in Appendix: Data exported from “AniFair”).

’Multiple instances’ version and aggregation of instances. ’Good feeding’ and the remaining welfare principles ’Good housing’, ’Good health’, and ’Appropriate behaviour’ were run in the ’Multiple instances’ version of “AniFair” (main article Section 3.4). As no capacity solution existed for ’Good health’ and a weighted mean was calculated instead, these results are displayed in Figure S.7 as an example for the weighted mean alternative (main article Section 3.3, No Choquet integral solution: Weighted mean as alternative). However, all other results and user entered information regarding the welfare principles ’Good housing’, ’Good health’, and ’Appropriate behaviour’ are not illustrated, as this article was intended as a presentation of the “AniFair” software tool and not as a detailed discussion on the welfare of pigs.

For the aggregation of instances, the User was presented with the type of results available for each instance (main article Section 3.4, Figure 8(b)). As no Choquet integral values existed for ’Good health’, the unweighted mean was chosen for all welfare principles in this example for the sake of homogeneity. The results of aggregation prior to the definition of constraints are displayed in Section 3.4, Figure 8(c) in the main article. The following constraints were defined additionally: All interaction indices were limited to values between 0 and 1 to enforce complementary interaction between the welfare criteria. Via the pre-order of the Shapley indices, equality of all Shapley indices was determined. The results as well as the associated Shapley value and interaction indices were exported to a txt file (Listing Exported.5 in Appendix: Data exported from “AniFair”) and can be viewed in Figure S.8.

As a result, the thirteen farms were assigned overall scores for all individual welfare principles and an overall evaluation of the welfare standard. A ranking was formed that reflected the relative importance of the criteria and principles, respectively. As the scores were made comparable and displayed together with the final scores, targeted advice could be given to the farms with low rankings regarding the criterion or principle in which it was most pressing to improve the welfare status of the animals. All decisions could be looked up in the exported files and served as a basis for discussion for animal welfare experts.
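How such advice could be derived from the comparable scores is sketched below for the three lowest ranked farms of the aggregation of instances. The scores are copied from the corresponding table in the main article; equating the "most pressing" principle with the lowest score is an assumption of this sketch, as is the function name.

```python
PRINCIPLES = ["Good feeding", "Good housing", "Good health", "Appropriate behaviour"]

# Principle scores of the three lowest ranked farms (farm number -> scores),
# copied from the table on the aggregation of instances.
scores = {
    4: [53.27, 40.00, 44.15, 47.06],
    12: [38.57, 50.00, 47.43, 28.44],
    5: [15.20, 40.00, 39.19, 61.75],
}


def weakest_principle(farm_scores):
    """Return the name and score of the lowest scoring principle (assumed rule)."""
    idx = min(range(len(farm_scores)), key=farm_scores.__getitem__)
    return PRINCIPLES[idx], farm_scores[idx]


for farm, farm_scores in scores.items():
    name, value = weakest_principle(farm_scores)
    print(f"Farm {farm}: lowest score {value:.2f} in '{name}'")
```

Because all principle scores live on comparable scales, such a direct comparison across principles is meaningful; on the raw measurements it would not be.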

Saving and reloading “AniFair” status. Up to three ’SAVE’ buttons could be found in “AniFair”. With these buttons, the current “AniFair” status could be saved. This included OoI, criteria, subcriteria, information on which criteria are ’DA’, information on the (in)dependence of subcriteria, bases of comparison, performance levels, matrices of judgment, “AniFair” scales, and dependent intervals. In contrast to the export of user entered information, scales, or results to txt files, these “AniFair” status files were not designed for analysis or to be human readable, but to reload information into “AniFair”. Every “AniFair” instance was equipped with a ’LOAD’ button to restore all information in a respective “AniFair” status. Afterwards, criteria could be added without compromising any loaded information. However, the deletion of a criterion could compromise the mapping between the criteria and the information on performance levels, matrices of judgment, scales, and dependent intervals. “AniFair” might thus be obliged to ignore the information. Alteration of criteria also caused “AniFair” to neglect the information on the respective criteria.

Appendix: Data exported from “AniFair”

Performance levels. The term performance level referred to the different states that can occur regarding a criterion (Section 3.2 in the main article; Bana e Costa, Corte, and Vansnick (2003)). For qualitative criteria, the performance levels were characteristics like the colors of a car or existence versus non-existence of an illness. Quantitative criteria could be measured on a numerical scale like percentages of sick animals in a herd. With this, exemplary performance levels could be ’0 - 10’, ’10 - 50’, ’>50’. After a list of ’DA’ criteria had been confirmed, “AniFair” opened a window with one framed box container for every criterion. Next to the criterion name a drop down menu was placed, in which the basis of comparison could be set to either ’Quantitative performance level’ or ’Qualitative performance level’. In the quantitative case “AniFair” prepared two slots ’insert level 1’ and ’insert level 2’ (

Matrices of judgment and scale calculation. When the performance levels for a criterion CRIT were defined, the matrix of judgment needed to be filled in (

Inconsistency. With every modification of the interactive fields in the matrix of judgment, the SLI changed and was checked regarding the existence of a solution. When the judgments appeared to be inconsistent with a precardinal scale, a modal dialog opened to help the User to solve the inconsistency. The dialog specified the most recent judgment and showed a list of suggestions (

Dependent intervals. “AniFair” supported the refinement of the positions of scores on the precardinal “AniFair” scales in order to achieve cardinal scales (Section 2.3.1 in the main article). Repositioning could be carried out without violation of user preferences within the dependent interval of each score: We referred to the dependent interval associated with a score $s$ on an “AniFair” scale $S_i^{\text{AniFair}}$, $i \le n$, as the interval between the minimal and maximal possible value for $s$ such that $S_i^{\text{AniFair}}$ still reflected its underlying user preferences, given that all other scores on $S_i^{\text{AniFair}}$ were kept fixed. For the calculation of the dependent interval for $s$, the conditions that all scores except $s$ remain unchanged were included in the SLI described in paragraph Matrices of judgment and scale calculation. The function lp was then used twice to optimize the SLI with respect to minimal and maximal values of $s$. Dependent intervals were re-calculated with every user modification (paragraph Visualization and adaption of scales) of the “AniFair” scales.
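The idea of a dependent interval can be illustrated with a simplified sketch. Instead of solving the SLI with the function lp, it scans candidate values for one score on a grid of step 0.1 while all other scores stay fixed, and it only checks two conditions that are assumed here: more attractive levels receive strictly higher scores, and a difference judged in a stronger category corresponds to a strictly larger score distance. The SLI actually used may contain further conditions, and the numerical coding of the categories as well as all function names are hypothetical.

```python
from itertools import combinations


def is_consistent(scores, category):
    """Check the two assumed conditions for a scale.

    scores   -- scores of the performance levels, most attractive level first
    category -- dict {(i, j): k} for i < j, where k codes the judged
                difference of attractiveness between level i and level j
    """
    pairs = list(combinations(range(len(scores)), 2))
    # Condition 1: a more attractive level gets a strictly higher score.
    if any(scores[i] <= scores[j] for i, j in pairs):
        return False
    # Condition 2: a stronger judged difference means a larger score distance.
    for i, j in pairs:
        for p, q in pairs:
            if category[(i, j)] > category[(p, q)]:
                if scores[i] - scores[j] <= scores[p] - scores[q]:
                    return False
    return True


def dependent_interval(scores, k, category):
    """Scan the values 0.0, 0.1, ..., 100.0 for score k, all others fixed."""
    feasible = []
    for tenths in range(0, 1001):
        trial = list(scores)
        trial[k] = tenths / 10
        if is_consistent(trial, category):
            feasible.append(tenths / 10)
    return (feasible[0], feasible[-1]) if feasible else None


# Hypothetical criterion with three levels; categories coded as
# 2 = weak, 4 = strong, 5 = very strong (coding chosen for this sketch only).
category = {(0, 1): 2, (1, 2): 4, (0, 2): 5}
print(dependent_interval([100.0, 70.0, 0.0], k=1, category=category))
```

In this hypothetical example the middle score has to stay above 50, because the step to the worst level was judged stronger than the step from the best level. A linear programming solver returns the exact bounds of such an interval directly, whereas the grid scan only approximates them to the chosen resolution.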

Visualization and adaption of scales. A notebook opened with one tab per ’DA’ criterion to display the precardinal “AniFair” scales $S_1^{\text{AniFair}}, \dots, S_n^{\text{AniFair}}$. On the left, the scale was shown as a curve with the performance levels labeling the horizontal axis. On the right, the scale was presented as a thermometer, similar to the scale representation in the M-MACBETH software MMacBeth. The displayed scores in the thermometer were editable and could also be altered via spin buttons. Every alteration of scores in the thermometer led directly to an adaption of the graphic. In this way, the User could modify the “AniFair” scales within the dependent intervals (paragraph Dependent intervals). If, for example, the User evaluated all DoA between successive performance levels equally, e.g. by the attribute ’moderate’, the “AniFair” scale was presented by a straight line. However, all successive DoA could be ’moderate’ without the necessity of an equidistant scale, as ’moderate’ was a qualitative judgment and represented a range of DoA. The User could thus refine the relative DoA according to his or her experience. In this way, final cardinal criteria scales $S_1^{\text{final}}, \dots, S_n^{\text{final}}$ were obtained (Section 2.3.1 in the main article). When clicking the ’Ok’ button of the notebook, the User could export the “AniFair” scales $S_1^{\text{AniFair}}, \dots, S_n^{\text{AniFair}}$ as well as the final criteria scales $S_1^{\text{final}}, \dots, S_n^{\text{final}}$ of all criteria to a txt file and proceed to the scoring of objects. Figures of visualization and adaption as well as the exported txt file for the ’Good feeding’ example can be found in the Supplementary Material (Figure S.4, Listing Exported.2). The scales for criteria with dependent subcriteria were displayed the same way.