Differentiation of wild boar and domestic pig populations based on the frequency of chromosomes carrying endogenous retroviruses

Analysis of the frequencies of chromosomes carrying various classes of porcine endogenous retroviruses (PERVs) and combinations of these classes was performed in the swine species Sus scrofa L. by using maps constructed in two principal component coordinates. Four population clusters can be recognized in the maps. Cluster 1 is formed by wild boars, cluster 2 by domestic meat breeds, cluster 3 by meat-and-lard (universal) breeds, and cluster 4 by miniature pigs. The maps indicate that modern domesticated swine meat breeds are the closest to the wild type. Meat-and-lard domestic swine breeds are more distant from wild boars, and miniature pigs have diverged the most. The maps showed that microevolution processes associated with PERV carriership frequency had two basic dimensions, or vectors: the vector of fat deposition variation and the "minus" selection vector (determination of commercial traits). Thus, PERVs may cause variation in pig physiology.


INTRODUCTION
Porcine endogenous retroviruses (PERVs) became an integral part of swine genomes, including Sus scrofa L. 1758 (Suidae, Mammalia), before the formation of the Sus genus. This is confirmed by their presence in bushpigs (Potamochoerus larvatus and P. porcus) and warthogs (Phacochoerus africanus) [1]. Three PERV classes are known: A, B, and C. Different classes are highly similar in the nucleotide sequences of the gag (group-specific antigens) and pol (polymerase) genes but differ in the nucleotide sequence of the receptor-binding domain of the env (envelope) gene, which encodes the envelope protein of the virus [2][3][4]. This difference is responsible for the host range in various virus classes.
Porcine endogenous retrovirus copies carried by different pig varieties are distinct in nucleotide composition, expression, and ability to produce infectious virions. It is believed that the pig genome can carry 6-10 replication-competent proviruses, 30-50 full-size PERV copies, and 100-200 loci carrying truncated virus sequences [1]. Comprehensive studies of pigs belonging to the Large White breed were undertaken to evaluate the number of genomic sequences coding for full-length replication-competent proviruses [5][6][7]. As a result, significant variations in the distribution and number of proviruses were found in this breed. Viral genomes were also analyzed in the following breeds: Westran [8,9], Duroc, Landrace, Yorkshire, Berkshire, and their hybrids [10,11]; the Chinese breeds Banna miniature pig, Wu-Zhi-Shan, Nei Jiang [12], and Meishan [11]; and west European wild boars [9]. In these breeds, as in Large White pigs, PERV sequences are dispersed throughout the genome. The breeds differ in PERV copy number, chromosomal distribution, and the presence of full-length sequences. These traits also varied within the breeds.
Previous work demonstrated differences in the prevalence of individuals with chromosomes carrying PERVs of various classes and their combinations between domestic pig breeds, between wild and domestic pigs, and between wild pigs of East Europe and Central Asia [13,14].
In this work, we analyze the differentiation between populations of wild and domestic pigs. For this purpose, we performed a statistical assessment of the population frequencies of chromosomes carrying certain PERV classes and class combinations, using maps constructed in two principal component coordinates.

MATERIALS AND METHODS
Experiments were performed with blood samples from three subspecies of wild boars, five commercial breeds of domestic pigs, and one breed of laboratory miniature pigs (Table 1). Wild boars of the European Sus scrofa scrofa variety (SSS) were obtained from the Voronezh Biosphere Reserve. Wild boars of the Romanian subspecies S. s. attila were taken from two southern Ukrainian populations (SAS and SAN). The Central Asian wild boar subspecies S. s. nigripes (SSN) was represented by animals hunted in the Chu Valley, Kyrgyzstan. Domestic pigs Sus scrofa domestica of the Large White breed included animals of the Achinsk (LWA) and Novosibirsk (LWN) types bred at the Inya Stud Farm. Landrace pigs were obtained from the Kudryashovskoe farm (LNK) and the Experimental Farm of the Siberian Branch of the Russian Academy of Sciences, hereafter referred to as the Experimental Farm (LNE). Duroc pigs were obtained from the Kudryashovskoe farm (DRK). Animals of the SM1 precocious meat breed were obtained from the Tulinskoe work-study unit. The Kemerovo breed included animals from the Yurginskii breeding farm (KMR). Miniature pigs (MS) were obtained from the Experimental Farm. A total of 636 blood samples were studied: 35 from mature wild boar males and females and 601 from domestic pigs (mature breeding males and females and young animals under two months of age). Blood was taken from the anterior vena cava of living domestic pigs or from the heart of killed wild boars.
The population frequencies of chromosomes carrying various PERV classes and their combinations were determined by the Bernstein method modified for a gene with multiple copies located on different chromosomes [14].
Phylogenetic relationships among the populations under study were examined in maps constructed in two principal component coordinates. Determination of genetic distances was followed by sample ordination [15]. After scaling, each population was defined as a point in a 500 × 500 arbitrary unit area. Three types of genetic distances were used: 1) Euclidean distances $d_1 = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$, where $n$ is the number of phenotypes, and $p_i$ and $q_i$ are the frequencies of phenotypes in the populations to be compared [16]; 2) Harpending-Jenkins distances $d_2$, defined in terms of $\bar{p}$, the weighted mean frequency of a certain phenotype in the populations to be compared, and $n$, the number of genes according to which the populations are compared [17]; 3) Nei's distances $d_3 = -\ln J_{pq} + \frac{1}{2}(\ln J_p + \ln J_q)$, where $J_p$ is the theoretical homozygosity in the first population, $J_q$ is the theoretical homozygosity in the second population, and $J_{pq}$ is the mutual identity of the populations under comparison [16]. Two models were considered for construction of maps in two principal component coordinates. In model M-1, the frequencies of chromosomes carrying PERV classes were presented as frequencies of three independent factors with two variables: envA+ and envA−, envB+ and envB−, envC+ and envC−. In model M-2, the frequencies of chromosomes carrying PERV class combinations were presented as frequencies of seven independent factors with two variables: envA+ and envA−, envB+ and envB−, envC+ and envC−, envAB+ and envAB−, envAC+ and envAC−, envBC+ and envBC−, envABC+ and envABC−. For each model, three versions of maps of the phylogenetic relationship among the varieties were constructed.
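As a rough illustration of the distance step, the sketch below computes the Euclidean distance and Nei's distance for two populations encoded under model M-1. The function names and the carrier frequencies are hypothetical and are not taken from Table 2. Nei's distance is written in its usual textbook form, D = −ln J_pq + ½(ln J_p + ln J_q), with identities averaged over factors; this is assumed, not confirmed, to match the variant used in [16]. The Harpending-Jenkins distance is left out of the sketch.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two equally long frequency vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def nei_distance(p_factors, q_factors):
    """Nei's standard distance, D = -ln J_pq + (ln J_p + ln J_q) / 2.

    Each argument is a list of per-factor frequency tuples, for example
    [(envA+, envA-), (envB+, envB-), (envC+, envC-)] under model M-1.
    """
    n = len(p_factors)
    j_p = sum(sum(x * x for x in f) for f in p_factors) / n
    j_q = sum(sum(y * y for y in f) for f in q_factors) / n
    j_pq = sum(
        sum(x * y for x, y in zip(fp, fq))
        for fp, fq in zip(p_factors, q_factors)
    ) / n
    return -math.log(j_pq) + 0.5 * (math.log(j_p) + math.log(j_q))


def m1_factors(freq_a, freq_b, freq_c):
    """Encode three carrier frequencies as two-state factors (model M-1)."""
    return [(f, 1.0 - f) for f in (freq_a, freq_b, freq_c)]


# Hypothetical chromosome frequencies for two populations (illustration only).
pop1 = m1_factors(0.10, 0.20, 0.05)
pop2 = m1_factors(0.30, 0.25, 0.15)

# Flatten the factors to apply the Euclidean distance to the same data.
flat1 = [x for factor in pop1 for x in factor]
flat2 = [x for factor in pop2 for x in factor]

print("Euclidean:", euclidean_distance(flat1, flat2))
print("Nei:", nei_distance(pop1, pop2))
```

Model M-2 would use the same functions with seven two-state factors per population instead of three.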

RESULTS
As shown in our previous study [14], frequencies of chromosomes carrying various PERV classes and their combinations vary significantly among the subspecies of wild boars and breeds of domesticated pigs (Sus scrofa L. 1758; Suidae, Mammalia), as well as among herds within a breed. One to three distinct PERV classes were detected in chromosomes of the populations under study. In addition to single provirus copies, such chromosomes contained combinations AB, AC, BC, and ABC (Table 2).
Maps in two principal component coordinates constructed on the basis of models M-1 and M-2 demonstrate features of the phylogenetic relationship between populations determined by microevolutionary processes (Figure 1). It should be emphasized that model M-1, which considers the frequencies of chromosomes carrying certain PERV classes, and model M-2, which deals with the frequencies of chromosomes carrying combinations of these classes, yield different results (Figure 1).
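The ordination and scaling steps described in Materials and Methods can be sketched as follows. The procedure of [15] is not spelled out in the text, so this example assumes classical principal coordinate analysis (double-centring of the squared distances followed by extraction of the two leading axes) and a uniform rescaling into a 500 × 500 square. All function names and the distance matrix are hypothetical. The eigenvectors are found by simple power iteration, which is adequate for a small illustration but not a substitute for a numerical library.

```python
import math


def double_center(d):
    """Gower-centre the squared distances: B = -1/2 * J * D^2 * J."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def leading_axes(b, k=2, iters=1000):
    """Top-k eigenpairs of a symmetric matrix (power iteration, deflation)."""
    n = len(b)
    b = [row[:] for row in b]
    axes = []
    for c in range(k):
        v = [float(i + c + 1) for i in range(n)]  # fixed, non-constant start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:
                break
            v = [x / norm for x in w]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        lam = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        axes.append((lam, v))
        # Remove the axis just found so the next pass finds the following one.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return axes


def principal_coordinates(d):
    """Two-dimensional coordinates whose distances approximate d."""
    (l1, v1), (l2, v2) = leading_axes(double_center(d))
    s1, s2 = math.sqrt(max(l1, 0.0)), math.sqrt(max(l2, 0.0))
    return [(s1 * a, s2 * b) for a, b in zip(v1, v2)]


def scale_to_map(points, size=500.0):
    """Shift and uniformly rescale points into a size x size square."""
    xs, ys = zip(*points)
    span = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - min(xs)) * size / span, (y - min(ys)) * size / span)
            for x, y in points]


# Hypothetical distance matrix for three populations (a 3-4-5 triangle).
d = [[0.0, 3.0, 4.0],
     [3.0, 0.0, 5.0],
     [4.0, 5.0, 0.0]]
coords = principal_coordinates(d)
map_points = scale_to_map(coords)
print(map_points)
```

A single scale factor is used for both axes so that relative distances between populations are preserved on the map; scaling each axis separately would distort them.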
Four population clusters can be recognized in maps constructed on the basis of frequencies of chromosomes carrying PERV classes. Cluster 1 is formed by wild boars, cluster 2 by domestic meat breeds, cluster 3 by meat-and-lard (universal) breeds, and cluster 4 by miniature pigs (Figure 1). These clusters form a certain logical order. Three clusters form one straight line: wild boars, meat breeds, and universal breeds. The fourth cluster, miniature pigs, is distant from this line. Thus, according to the M-1 model, the frequencies of PERV classes show the following trend associated with morphotypes: wild boar → commercial meat morphotype → commercial meat-and-lard morphotype. The maps also indicate that variation among populations within the clusters is nonrandom. The vector of this variation is directed to the point (0, 500) (bottom-right corner of the map). Thus, it is reasonable to suggest that cluster 4 (miniature pigs) is the farthest deviation from this vector and that these pigs originated from a population initially belonging to cluster 3 (meat-and-lard morphotype).
A markedly different pattern is seen in the maps constructed on the basis of frequencies of chromosomes carrying PERV class combinations (Figure 1). It resembles the pattern obtained with the M-1 model (Figure 1) in that the miniature pig population is distant from the other populations and the wild boar populations still form a compact cluster. However, the clusters of meat and universal meat-and-lard breeds extend parallel to the line from (0, 0) to (500, 500), so that the populations belonging to these clusters lie on opposite sides of the wild boar cluster (Figure 1). The populations can be combined into clusters according to their locations in the maps (Figure 1). Cluster 1 includes Landrace and Duroc domestic pigs from the Kudryashovskoe farm and the Novosibirsk subbreed of Large White pigs. Notably, these pigs have a history of the most intensive selection for meat yield. Cluster 2 includes Large White pigs of the Achinsk subbreed and the Kemerovo breed. These populations belong to the universal meat-and-lard morphotype. Cluster 3 is formed by miniature pigs. Cluster 4, located in the centres of the maps, is of special interest. It includes wild boars, SM-1 pigs, and Landrace pigs from the Experimental Farm. This combination is reasonable. The intensity of selection for meat yield in these two domestic populations was lower than in the populations of the first cluster, and this is the cause of the reversion to the wild morphotype observed in them. It is known that meat breeds were raised from meat-and-lard and lard morphotypes [18]; therefore, they deviated even more from wild boars. In Landrace pigs of the Experimental Farm, this shift is directed to the cluster of meat-and-lard populations, and in the SM-1 breed to miniature pigs, which can be considered closer to their ancestral Asian lard breeds [19,20]. The Kemerovo breed is located between the Achinsk Large White subbreed and miniature pigs in both maps. The most likely cause of this location is that the Kemerovo breed was at first raised as a lard breed [21], and later selection was directed to the universal meat-and-lard type [22]. Thus, maps constructed on the frequencies of chromosomes carrying PERV class combinations reveal finer features of population differentiation than maps of simple PERV class carriership. These features are associated with the differentiation among populations within large groups, such as wild boars and commercial domestic pig morphotypes, rather than with the differentiation among the groups.
Two questions were raised in our work: which wild boar population is closer to the present-day domestic pig according to PERV prevalence, and which domestic pig breeds are closer to wild boars? To answer these questions, we employed maps constructed in two principal component coordinates according to two models: model M-1, which considers frequencies of PERV classes, and model M-2, which deals with various PERV class combinations. The distances between populations seen in the maps (Figure 1) are presented as bar graphs (Figure 2). The graphs obtained on the basis of the M-1 model show that the European wild boar subspecies Sus scrofa scrofa from the Voronezh Biosphere Reserve is the closest to the domestic pig, and the Central Asian wild boar subspecies Sus scrofa nigripes is the farthest. The distances determined on the basis of the M-2 model show the same result, although less clearly. Both models indicate that modern domesticated swine meat breeds are the closest to the wild type. Meat-and-lard domestic swine breeds are farther from wild boars, and miniature pigs are the farthest.
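One simple way to turn map positions into a closeness ranking of the kind plotted in Figure 2 is sketched below. The text does not state how the map distances were aggregated, so the mean planar distance used here is an assumption, and the population names and coordinates are hypothetical placeholders, not values read from Figure 1.

```python
import math


def mean_map_distance(point, group):
    """Mean planar distance from one map point to every point in a group."""
    return sum(
        math.hypot(point[0] - g[0], point[1] - g[1]) for g in group
    ) / len(group)


# Hypothetical map coordinates in arbitrary units (0..500), illustration only.
wild = {"wild_1": (250.0, 240.0), "wild_2": (230.0, 300.0)}
domestic = {"breed_1": (300.0, 200.0), "breed_2": (400.0, 120.0)}

domestic_points = list(domestic.values())

# Wild populations ordered from closest to farthest from the domestic group.
ranking = sorted(
    wild, key=lambda name: mean_map_distance(wild[name], domestic_points)
)
print(ranking)
```

The same function can be applied in the other direction, ranking each domestic breed by its mean distance to the wild boar populations.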

DISCUSSION
Different aspects of PERV-associated features of differentiation of swine (Sus scrofa) populations were revealed by using the models presented in this study. Previous results [13,14] were summarized, and hypotheses concerning the role of PERVs in microevolutionary changes occurring in Sus scrofa populations were substantiated. The variation in the frequencies of chromosomes carrying certain PERV classes and class combinations follows two microevolutionary vectors. The first vector is more specific: wild boars → meat pig breeds → meat-and-lard pig breeds. It can be defined as an increase in fat deposition rate in pig subspecies and breeds. The second vector is of a more general nature: wild boars and commercial domestic pig breeds → miniature pigs. It is a "minus" selection vector. It can be suggested that in the first case PERVs tag loci affecting fat deposition rate and in the second case are associated with loci controlling some commercial and adaptive traits. However, the following hypothesis appears to be more plausible: retroviral copies are inserted into certain functionally important genome regions, thereby disrupting the normal function of genes located in the insertion sites or nearby. These aberrations may give rise to undesirable or, on the contrary, desirable traits. Thus, the vector of increasing fat deposition appears to be a special manifestation of the "minus" selection vector. The latter should be more accurately termed the genomic aberration vector. This hypothesis agrees with the previously reported data that the frequencies of individuals and chromosomes carrying PERVs in populations naturally selected for fitness (wild boars) or intensively selected for pork production (pigs of the Kudryashovskoe herd) are much lower than in populations undergoing a less intensive selection [14]. Also, it was shown that a class B PERV copy was inserted into the BAT1 gene (coding for RNA helicase) of Large White pigs [9]. It should be mentioned that the frequency of PERV carriers was higher among miniature pigs, the product of "minus" selection. They can be considered a pathological form because of their slow growth, small size, low fertility, high postnatal mortality, and tendency to obesity. Note that miniature pigs showed not only the highest PERV carriership frequency but also the highest frequency of chromosomes carrying all three PERV classes among the populations investigated (Table 2).
The proximity of wild boars from the Voronezh Biosphere Reserve, located in the centre of European Russia, to present-day domestic pigs (Figure 2) appears to be due to convergence rather than divergence. The Voronezh wild boar population was formed by a recent natural crossing between the subspecies Sus scrofa scrofa and Sus scrofa attila during migrations of wild boars from the West (from Europe) and South (from Ukraine and the Caucasus) [23,24]. Therefore, its earlier assignment to S. s. scrofa is entirely formal. Both these subspecies were among the ancestors of modern European breeds of Sus scrofa domestica [23,25,26]. Thus, the similarity between these two forms, the wild boar and the domestic pig, which originate from common ancestral subspecies, is natural and of a convergent nature.
It is reasonable to suggest that the graph series obtained by ranking the similarity of domestic pig populations to wild boars is related to the vector of genomic changes induced by PERVs. Breeds of the meat type are in the closest proximity to wild boars. They were raised by selection for less intense fat deposition. The elevated fat deposition in domestic pigs in comparison with wild or early domesticated forms is an obvious abnormality, which may have been caused by the breakdown of some genes owing to PERV insertion. Therefore, selection against these breaks favoured alleles characteristic of the original wild boar or similar to them. This may have resulted in convergent similarity between meat pig breeds and wild boars. Domestic breeds of the universal meat-and-lard morphotype should possess a certain number of loci in the genome that would determine the degree of fat deposition corresponding to this morphotype. Mutations in some of these loci caused by PERV insertion give rise to the desirable trait; therefore, this morphotype diverges more from wild boars than breeds of the meat type in the frequencies of chromosomes carrying certain PERV classes and class combinations. Miniature pigs were raised by intensive selection for a smaller adult body size with minimum selection for other traits. The development of irrelevant traits should be sufficient for no more than maintenance of the population. For this reason, it is likely that the miniature pig genome was enriched in loci whose function was disrupted by PERV insertions. In some cases, this favoured the desired trait (small size), and in other cases it was of no significance, because no selection for commercial traits was conducted.
In summary, in this study we analyzed patterns of differentiation of domestic and wild pigs in the frequencies of chromosomes carrying certain PERV classes and class combinations. With regard to this differentiation, we demonstrated that convergence processes were no less significant than divergence processes. It appears that PERVs were not neutral elements in the evolution of the pig genome.

Figure 2. Distances between wild boar populations and domestic pig breeds according to maps constructed in two principal component coordinates. For each breed, the first bar presents Euclidean distances; the second, Harpending-Jenkins distances; and the third, Nei's distances. Population designations follow Table 1.

Table 1. Populations of wild boars and domestic pigs.

Table 2. Frequencies of chromosomes carrying certain PERV classes and combinations of these classes in wild boar and domestic pig populations. The classes are identified by env gene sequences.