Analysis of the Accuracy of AncesTrees Software in Ancestry Estimation in an Identified Brazilian Sample

In the present study, AncesTrees, a software tool for craniometric ancestry estimation, was evaluated in an identified Brazilian skeletal sample with known self-reported ancestry. Twenty-three craniometric measures were obtained from each skull and analyzed using the AncesTrees software with two classification strategies: the tournamentForest and ancestralForest algorithms. The tournamentForest algorithm (53.54%) and the ancestralForest algorithm with three ancestry groups (50.96%) were more accurate in classifying Europeans, while the ancestralForest algorithm with six (50.00%) and two (67.64%) groups was more accurate in estimating the ancestry of individuals of African descent. Admixed-ancestry specimens were classified predominantly as being of European descent. The ancestralForest algorithm considering only European and African origin (58.42%) was the most accurate setup for ancestry estimation in Brazilian skulls. Because supervised classification tools such as AncesTrees rely on data analysis and pattern matching, and there is no Brazilian sample in its database, the software showed low accuracy for Brazilian samples. The incorporation of representative craniometric data obtained from Brazilian skulls into the software database may significantly increase the accuracy of ancestry estimates.


Introduction
Several programs and software tools have been developed in recent years to tackle the challenging task of ancestry estimation from skeletal remains in forensic anthropology (FA). These computational tools use statistical and machine learning algorithms to solve a mathematical problem that, abstractly speaking, involves allocating objects to predefined classes, hence the name classifiers or classification algorithms. Such software abstracts the modelling and computation away from the end user and renders complex mathematical formulas easy to use through graphical user interfaces. Prime examples of such tools are FORDISC (Ousley & Jantz, 2013), CRANID (Wright, 1992), COLIPR (Urbanová & Králík, 2008), 3D-ID (Slice & Ross, 2009) and AncesTrees (Navega et al., 2015). The latter, which is the focus of the present study, was developed in 2015 by Portuguese researchers to quantitatively estimate ancestry based on 23 craniometric measures. This tool classifies the human skull using the random forest algorithm (Breiman, 2001), a non-linear and non-parametric ensemble-based classification technique that uses hundreds to thousands of classification decision trees as base models.
One fundamental disadvantage shared by all of these methodologies and tools is the lack of reference data for each population worldwide, which inevitably results in lower accuracy for individuals from geographic regions that are not fully represented in the software databases (Cunha & Ubelaker, 2020; Kranioti et al., 2019). In its current state, AncesTrees uses as its reference database a large worldwide sample of craniometric data from individuals from major population clusters. Nonetheless, no individual of Brazilian origin is represented in the software, which raises obvious questions regarding its accuracy as a tool for ancestry estimation in Brazil.
Brazil is a country of continental dimensions with one of the most heterogeneous populations worldwide. Brazilian miscegenation and admixture are the result of economic, migratory, and ethnic-racial interrelationships, especially between ethnic groups of African, European and American Indian (Amerindian) ancestry (Carvalho-Silva et al., 2001), which makes the population unique and highly regionalized (Cuzzullin et al., 2020; Tinoco et al., 2016). The systematic development and curation of identified osteological collections in Brazil have in recent years provided a more reliable source of skeletal remains for the study of the Brazilian population, fostering advances in forensic anthropology both nationally and worldwide (Cunha et al., 2018; de Carvalho et al., 2020). These reference collections enable the systematic and rigorous analysis of published and available methods or protocols, which is vital to guarantee the accuracy and reliability of forensic analysis. In this study, we assess the accuracy of the AncesTrees software in estimating ancestry in a large sample of modern identified Brazilian human skulls of multi-ancestral origin. The significance of ancestry as a fundamental parameter of the biological profile in forensic identification, together with the complex population structure of Brazil, makes the validation of the tools available to forensic experts urgent.

Skeletal Sample
In the current study, the skulls of 266 identified Brazilian nationals were analyzed to assess the accuracy of a craniometric ancestry estimation tool. The study sample was composed of 144 males and 122 females, with known age at death between 20 and 100 years. Self-reported ancestry was obtained from official documentation, with 155 individuals reported as being of European descent, 34 of African descent, 76 of admixed ancestry and one of Asian descent.
All demographic information was collected from the death certificate, an official document required for inhumation in Brazil (Cunha et al., 2018).

Methods
Twenty-three measures were obtained from each skull following the definitions of Howells, as recommended by the AncesTrees software (Table 1) (Howells, 1973, 1989, 1994; Navega et al., 2015). In the case of bilateral structures, the left-side measurement was selected. The measurements were entered into the statistical program available from http://osteomics.com/AncesTrees/. After data entry, the "validation" tab was clicked to check the validity of the measurements and to re-assess values that diverged strongly from the mean. Subsequently, the algorithm (ancestralForest or tournamentForest) was chosen in the "analysis" tab, and the algorithm setup was determined according to computational and statistical parameters.
The tournamentForest algorithm takes a more automated approach and differs from the ancestralForest algorithm in that classification proceeds through round-robin tournaments, in which the best binary classifier is selected. This algorithm follows a divide-and-conquer approach: after each round of the tournament, the ancestry group with the least affinity with the skull under analysis is discarded, until only two possible groups remain. As the tournamentForest algorithm is more suitable for cases in which there is little or no prior knowledge about the specimen (Navega et al., 2015), the software was set up to consider 512 trees and all nine available ancestry patterns.
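The elimination scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AncesTrees implementation: the pairwise classifier is a hypothetical nearest-centroid rule standing in for the binary random forests, and the group names and centroid values are invented for the example.

```python
from itertools import combinations
from math import dist

def pairwise_winner(skull, a, b, centroids):
    """Toy binary classifier (assumption): the group with the nearer centroid wins."""
    return a if dist(skull, centroids[a]) <= dist(skull, centroids[b]) else b

def tournament(skull, centroids):
    """Round-robin elimination: drop the weakest group each round until two remain."""
    remaining = sorted(centroids)
    eliminated = []
    while len(remaining) > 2:
        wins = {group: 0 for group in remaining}
        for a, b in combinations(remaining, 2):
            wins[pairwise_winner(skull, a, b, centroids)] += 1
        # Ties are broken by alphabetical order, an arbitrary choice for this sketch.
        weakest = min(remaining, key=lambda group: wins[group])
        remaining.remove(weakest)
        eliminated.append(weakest)
    finalist_a, finalist_b = remaining
    return pairwise_winner(skull, finalist_a, finalist_b, centroids), eliminated

# Hypothetical two-measurement centroids (not real craniometric data).
CENTROIDS = {
    "A": (180.0, 130.0),
    "B": (175.0, 140.0),
    "C": (190.0, 135.0),
    "D": (170.0, 126.0),
}
winner, dropped = tournament((176.0, 138.0), CENTROIDS)
print(winner, dropped)
```

In this toy run the group with the fewest pairwise wins is discarded in each round, and the final pair decides the estimate.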
The second algorithm tested was ancestralForest, which estimates the likelihood that the skull under analysis matches each of the selected ancestry groups (Navega et al., 2015). After validation of the model parameters, the predicted ancestry was taken to be the group that ranked first in the tournamentForest tournament or that was the most likely outcome of the ancestralForest algorithm. The ancestry estimate was compared against the known ancestry obtained from the death certificate. The accuracy of the methods in identifying the ancestry of Brazilian skulls was calculated based on the classification of Europeans, Africans, and Asians into their corresponding groups. Because mixed-ancestry individuals are highly admixed and do not fit into any of the ancestry groups available in the AncesTrees software database, they were analyzed separately to avoid bias, following the protocol of Jacometti (2018).
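One common way for a tree ensemble to express how likely each group is consists of reporting the share of trees that vote for each group and taking the most voted group as the estimate. The sketch below assumes that reading; the vote counts are hypothetical, and the internals of AncesTrees may differ.

```python
from collections import Counter

def vote_shares(tree_votes):
    """Turn per-tree class votes into class proportions that sum to 1."""
    counts = Counter(tree_votes)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def most_likely(shares):
    """Return the group with the largest vote share."""
    return max(shares, key=shares.get)

# Hypothetical votes from a 512-tree ensemble (illustrative numbers only).
votes = ["European"] * 300 + ["African"] * 200 + ["East Asian"] * 12
shares = vote_shares(votes)
estimate = most_likely(shares)
print(estimate, round(shares[estimate], 3))
```

Restricting the analysis to fewer candidate groups, as in the three-group and two-group setups, changes which labels can receive votes at all, which is one reason the same skull can be assigned differently across setups.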
The data were analyzed descriptively and by statistical tests using SPSS 23 (SPSS Inc., Chicago, IL, USA). The data were normally distributed (Kolmogorov-Smirnov test), and pairwise comparisons between mean measurements were carried out by one-way ANOVA followed by the Tukey HSD post-hoc test. The chi-square test, with and without Yates correction, was used to check for an association between known and estimated ancestry, at a 0.05 significance level.
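As an illustration of the chi-square test with Yates correction, the sketch below applies the standard 2 x 2 formula to the counts reported in the results for the two-group ancestralForest setup (88 of 155 European and 23 of 34 African skulls correctly estimated). Arranging the counts this way is an assumption made for the example: with only two candidate groups, each misclassified skull falls into the other group. The paper does not state the exact tables that were tested in SPSS.

```python
from math import erfc, sqrt

def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

# Rows: known European, known African; columns: estimated European, estimated African.
chi2 = yates_chi_square(88, 67, 11, 23)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

With these counts the statistic is roughly 5.7 and the p-value falls below the 0.05 threshold used in the study.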
Results
Significant inter-group differences regarding metric variables were assessed by one-way ANOVA followed by the Tukey HSD post-hoc test (Table 3).
Table 3 notes: (1) One-way ANOVA with Tukey HSD post-hoc test. The same letter in the same row indicates no significant difference between ancestry groups, whereas different letters ("a" and "b") indicate statistically significant differences between groups. (*) Significant at 0.05. The Asian category was not included in the comparison because there was only one case; the test and post-hoc comparisons were therefore performed only between the European, African and admixed ancestry groups.
To assess the relationship between the known and estimated ancestry, the metric parameters of the Brazilian skulls were tested in the AncesTrees software.
The tournamentForest (for six ancestry groups) and ancestralForest (for six, three or two ancestry groups) algorithms were used. [...] with an overall accuracy of 50.52%.
Lastly, the ancestralForest algorithm was tested considering only two ancestry groups (European and African), as shown in Table 7. This setup correctly estimated the ancestry of 88 skulls cataloged as European and 23 cataloged as African. The Asian specimen was erroneously estimated to be of African origin. The accuracy for the total sample was 58.42%, which was statistically significant.
Figure 1 shows the accuracy of the tournamentForest and ancestralForest algorithms, in their different setups, in correctly estimating ancestry in Brazilian human skulls. When all groups in the collection except mixed ancestry (European, African, and Asian) were analyzed, the accuracy of the software ranged from 48.94% (ancestralForest with six groups) to 58.42% (ancestralForest with two groups). When only European skulls were considered, the best accuracy (56.77%) was obtained with the ancestralForest algorithm with two ancestry groups (European and African). The same setup was also the most accurate for African skulls, with an accuracy of 67.64%.
As the AncesTrees software was inconsistent in estimating the ancestry of individuals cataloged as admixed, this part of the sample was analyzed separately (Table 8). With the tournamentForest and ancestralForest setups with six groups, 30 of the 76 mixed-ancestry skulls were estimated to be European. However, when the ancestralForest algorithm was used with three ancestry groups, most of the admixed-ancestry skulls (n = 30) were estimated to be African. When the ancestralForest algorithm was tested considering only two ancestry groups, half of the admixed-ancestry skulls were estimated to be European (n = 38) and half African (n = 38).
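The accuracies quoted above for the two-group setup follow directly from the reported counts. The short sketch below recomputes them; the group sizes (155 European, 34 African, one Asian) come from the Skeletal Sample section, and the variable names are illustrative.

```python
# Correctly estimated skulls and group sizes for the two-group ancestralForest setup.
correct = {"European": 88, "African": 23, "Asian": 0}
total = {"European": 155, "African": 34, "Asian": 1}

def accuracy(n_correct, n_total):
    """Percentage of skulls assigned to their documented ancestry group."""
    return 100.0 * n_correct / n_total

per_group = {group: accuracy(correct[group], total[group]) for group in total}
overall = accuracy(sum(correct.values()), sum(total.values()))

for group, value in per_group.items():
    print(f"{group}: {value:.2f}%")
print(f"Overall (admixed skulls excluded): {overall:.2f}%")
```

The African figure evaluates to about 67.647%, which the text reports as 67.64%, apparently truncated rather than rounded.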

Discussion
The Brazilian population originates from a variety of geographic and ancestral origins, mainly American, European, and African (Cuzzullin et al., 2020; Tinoco et al., 2016; Urbanová et al., 2014). Despite this, few studies in the literature have investigated the ancestral patterns that are typical of the Brazilian population.
This scenario is aggravated by the fact that the criterion for determining human ancestry is based only on self-reported skin color (Petruccelli & Saboia, 2013). The association between skin color (white, black, yellow, mixed race and indigenous) and ancestry is not generally accurate, as one characteristic does not necessarily reflect the other.
To date, there is no method for estimating ancestry that is specific to the Brazilian population, which makes the challenge even greater, as Brazil is a country of continental size with differences even among populations from different regions of its territory. The establishment of osteological collections in Brazil has provided a reliable source of skeletal remains for characterizing the Brazilian population, facilitating the development of FA and the validation of methods for solving forensic cases (Cunha, 2019). The Osteological and Tomographic Collection Prof. Dr. Eduardo Daruge is one of the few contemporary collections in Brazil with specimens cataloged for ancestry, thus offering research opportunities to improve the accuracy of ethnic affinity estimates (Cunha & Ubelaker, 2020). Nevertheless, the classification of ancestry was based on the skin color information recorded in the death certificate. We note that this is a subjective, outdated, misleading and invalid procedure for FA practice. In fact, no skin color scale maps automatically onto ancestry, since skin color and ancestry are not synonymous.
In Brazil, estimation of ancestry is a complex task because the admixture and miscegenation of the Brazilian population produce features that are not typical of a specific ancestry group, such as European, African, or Asian. AncesTrees is a software tool used to estimate human ancestry based on craniofacial measurements. In our study, we examined the behavior of the AncesTrees software in estimating the ancestral pattern of an osteological collection according to the ancestry groups contained in the software database. As reported herein, the results were unsatisfactory, which may be due to the inaccuracy of assigning ancestry in the Brazilian collection based simply on skin color. Hence, the inclusion of this Brazilian sample in the software database is very important to help estimate Brazilian ancestry using metric standards that are widely accepted and recommended in FA.
The wide biological variety in humans has rendered ancestry estimation an increasingly challenging task. Hence, a holistic approach combined with technological advances in the field may contribute considerably to the accuracy of estimation methods. For instance, the analysis of genetic markers, also called ancestry informative markers (AIMs), together with the study of stable isotopes in tooth enamel via strontium level mapping and anthropological analysis, allows the population affinity of unidentified human remains to be estimated (Cunha & Ubelaker, 2020).
In the United States of America, the Daubert versus Merrell Dow Pharmaceuticals lawsuit encouraged the adoption of new international guidelines for better credibility of scientific evidence, which greatly impacted FA research (Grivas & Komar, 2008). Since this trial, a rigorous and valid scientific method, based on statistical analysis and known error rates, has been required for the determination of forensic outcomes in each population. Hence, population-specific methods are better accepted by the scientific and legal communities for being more accurate (Liebenberg et al., 2019).
The identification of unknown individuals without a critical analysis of the methodological assumptions used to estimate their biological profile is equivocal (Cuzzullin et al., 2020). In other words, it is useless to establish one's identity based on an unrealistic conjecture for a given population group, since regionally specific criteria are needed.
Due to the availability of morphological and metric information or a combination of both (Cunha & Ubelaker, 2020), the human skull is the most suitable anatomical structure for estimating ancestry in unidentified remains, especially the facial portion of the skull. More importantly, some authors argue that quantitative analyses should be preferred over visual examinations for greater reproducibility, repeatability, and objectivity (Kranioti et al., 2019; Urbanová et al., 2014).
The AncesTrees software has a database currently including almost 3,000 individuals from six main ancestry groups (Sub-Saharan African, Australo-Melanesian, East Asian, European, Native American and Polynesian) from the well-known W. W. Howells (1973, 1989, 1994) craniometric series. The software was tested on 128 adult human skulls from European and African osteological collections and on 114 Brazilian skulls (Jacometti, 2018; Navega et al., 2015). The method was accurate in determining the ancestral classification of European and African individuals owing to their strong representation in the database. European and African groups were correctly classified in 79.20% and 75.00% of cases, respectively, when all six ancestries were considered in the analysis. When only European and African ancestries were considered, the algorithm correctly estimated population affinity in 93.8% of cases. Despite these findings, the incorporation of representative data from different geographical origins across the globe, including Brazil, into the AncesTrees database is needed to confirm the accuracy and usefulness of the software for forensic practice.
AncesTrees is relatively new software, so there are not many studies reporting on the accuracy of its estimates. Skalic (2018) determined ancestry estimates using the AncesTrees software based on nine measures in 108 skeletons from the Terry Osteological Collection (United States of America) and the Coimbra Osteological Collection (Portugal). Both collections have skulls of men and women previously classified as white or black; two of the skulls are archaeological cases (one belongs to the Archaic First Nation and the other is a Peruvian skull intentionally altered for cultural reasons). The accuracy of the tournamentForest algorithm ranged from 37.00% to 40.70%. The software was also tested for its ability to allocate a sample that did not fit into any of the ancestry groups contained in the database: the archaeological cases were estimated to originate from South Western Europe and East Asia. Thus, the author argues that the AncesTrees software may not be appropriate for estimating population affinity in groups that are not well represented in the database.
Advances in Anthropology
In Brazil, Jacometti (2018) tested the AncesTrees software on a sample of 114 skulls from São Paulo State (Identified Skull Collection at UNIFESP) previously cataloged as European, African, and mixed ancestry. Using the same algorithms and setups as those tested in our study, the author found better performance in estimating European (73.0% accuracy) and African (66.0% accuracy) ancestries, with the ancestralForest algorithm with two ancestry groups (European and African) being the best strategy (70.0% accuracy). Mixed-ancestry individuals were mostly, albeit inconsistently, classified as Europeans. These results corroborate those observed in our study, showing that the applicability of the software for ancestral classification of this Brazilian population is poor.
Predictive models work by matching new cases against reference data, so the fact that the Brazilian population is not yet represented in the AncesTrees database may yield atypical outcomes if no similar metric standards can be retrieved by the software.
In contrast, when Portuguese researchers were invited to estimate the biological profile of a young adult exhumed from the cemetery attached to the Igreja Do Carmo (Do Carmo Church), in Lisbon, they observed that the cranial morphology and intentional dental changes were suggestive of African origin, indicating that cultural aspects found in the skeleton may be direct evidence of ancestry (Alves et al., 2016; Cunha & Ubelaker, 2020). When thirteen metric parameters were considered in the analysis of this case, the AncesTrees software indicated a probable Sub-Saharan African origin, with an accuracy of 92%. This indicates that when the likely ancestral origin of unidentified remains is included in the software database, the accuracy rate of the algorithm is considerably higher.
Another Portuguese study reported favorable results when using the AncesTrees software (Navega et al., 2015). A total of 158 individuals buried in the region of Lagos, Portugal, and of probable African origin, were metrically and genetically examined for their ancestry. Cultural artifacts associated with the skeletons were found, and the skull morphology and the presence of intentional changes in the teeth were analyzed. The ancestral affinity was confirmed as African, demonstrating a high accuracy of the algorithm for ancestries included in the database.
In 2016, Slovenian researchers analyzed the bones of an individual allegedly missing since the Second World War. Some morphological aspects of the individual's skull revealed typical Caucasian features, as follows: narrow nasal opening and jaws, prominent anterior nasal spine, round-shaped orbits, reduced inter-orbital distance, and the presence of malar tubercles. The European ancestry of the specimen was confirmed with an accuracy of 82.0% by a metric analysis in the AncesTrees software, which was set up to not consider Asian and African ancestry groups in the analysis. The software provided more accurate estimates when only the most likely ancestries were selected for comparison (Zupanič Pajnič et al., 2016).
In recent years, the supply of computational tools in forensic anthropology has been increasing (Lynch & Stephan, 2018), although most of these tools have a high financial cost. AncesTrees, however, is a free-to-use statistical program, one of a set of tools made available by the Osteomics project (d'Oliveira Coelho et al., 2020) to support forensic anthropologists, forensic experts, and scholars in estimating human ancestry in unidentified specimens. The accuracy of this software depends on the craniometric measurements obtained, the number of ancestry groups included in the analysis, and the statistical setup. Therefore, a more robust database comprising the variety of human populations is required to increase the reliability and accuracy of the algorithms for use in the resolution of forensic cases (Kranioti et al., 2019). The continuous miscegenation of human populations means that even the most complex forensic methods are challenged as to their effectiveness in determining a biological profile (Urbanová et al., 2014). The results of our study showed that, regardless of the algorithm and the statistical setup, the accuracy of the software in determining the real ancestry of Brazilian skulls varied from 48.94% to 58.42%.
In our study, the tournamentForest algorithm and the ancestralForest algorithm with three ancestry groups were more suitable for classifying Europeans, while the ancestralForest algorithm with six and two groups was more accurate for estimating African ancestry. Mixed-ancestry cases were predominantly classified as Europeans. The ancestralForest algorithm configured for European and African ancestries was the most accurate setup for estimating the ancestry of the Brazilian sample included in our study.

Conclusion
To date, data on Brazilian skulls have not been incorporated into the AncesTrees software database. Therefore, this program should undergo further validation studies by the forensic and scientific community to assess its accuracy as a tool for ancestry estimation in Brazil more rigorously and systematically. The incorporation of identified and documented forensic cases into the database, especially from recently identified osteological collections, such as the FOP/UNICAMP Osteological and Tomographic Collection-Prof. Dr. Eduardo Daruge, will allow the development and adaptation of population-specific approaches. Thus, the authors of the present study propose to upload the information on Brazilian skulls into the AncesTrees software database and to re-assess the accuracy of the ancestry estimates.
Forensic anthropology is a discipline with immense societal value and responsibility. Nonetheless, to fulfil its mission, experts need to assess and recognize the advantages and limitations of the methods employed in the field. Systematic and constant validation and improvement of all methodological aspects are crucial. The work presented here is a contribution to this endeavor, and is particularly relevant for forensic experts operating in Brazil.