Antibody-Like Phosphorylation Sites. Theme for Studies of Cancer, Aging and Evolution

Sequence similarities were found between protein and DNA sequences encoding certain parts of conserved variable immunoglobulin domains (i.e. conserved IgV) and phosphorylation sites. Hypermutation motifs were then indicated in the majority of the corresponding non-IgV nucleotide sequences. According to database confirmations or double prediction of phosphorylation sites, 80% of the selected human and mouse IgV-related phosphorylation sites or their highly probable candidates exhibited a substrate relationship to the ataxia-telangiectasia-mutated kinase known as ATM. In accordance with literature data, inactivation of ATM by mutations can participate in the mechanisms of carcinogenesis, neurodegeneration and possibly also aging. In agreement with this relationship, some of the selected IgV-/ATM-related segments belonged to molecules specifically involved in carcinogenesis. The selected IgV-related sequence segments were also similar to certain segments of higher plants containing immunoglobulin-like repeats and related regions. Bioinformatic analysis of some selected plant sequences then indicated the presence of catalytic domains composing serine/threonine/tyrosine receptor/receptor-like kinases, which are considered important structures for the evolution of very early and part of later Ig-domain-related immunity. The analyzed conserved domain similarities also suggested certain interesting structural and phylogenic relationships, which need to be further investigated. This review briefly summarizes the findings on the subject from the last twenty years.


Introduction
Segments of immunoglobulin variable domains (IgV) of immunoglobulins (Ig) or T-cell receptors interacting with antigens as well as protein phosphorylation sites represent short peptides whose interactions can be considerably altered by one or a few non-synonymous mutations in the corresponding encoding DNA. Therefore, the question was whether at least some of the phosphorylation sites were not structurally close to IgV segments [1] [2]. In the initial examination of this question, our chain matrix was used [3], and intertwined and linear similarities were found between 1) certain model protein kinase substrates or inhibitors and 2) different IgV consensi [1] [2] [4]. The N-terminal IgV region containing the C-terminal part of FR1, the hypervariable CDR1 region and the N-terminal part of FR2 appeared to be the most structurally interesting in terms of the sought similarities and also frequent hypermutation [2] [4] [5], while the greatest similarities between non-vertebrate metazoan IgV and also between different conserved IgV were found in the FR3 framework (this fact could be rather important for the evolution of recombination of antibody genes [5] [6]). Nevertheless, not only the selected N-terminal regions but also such FR3 segments of different conserved IgV constructs exhibited dominant occurrence of predicted phosphorylation sites [7].
The corresponding more detailed search for IgV-related murine and human phosphorylation sites was then performed using 1) sequences of the conserved IgV domains corresponding to the N-terminal region mentioned above, 2) bioinformatic procedures analyzing simultaneously the corresponding RNA and protein sequences (i.e. a bilingual approach including mainly predictions based on artificial intelligence and searches for hypermutation motifs) and 3) databases of existing phosphorylation sites [6] (for details see the next chapter). This study and some related attempts mentioned above thus belonged to the medical studies of mutability or potential mutability of regulatory important phosphorylation sites. The number of such studies has increased mainly in the last ten years [8]-[13].
Existing and convincingly predicted phosphorylation sites also contributed to our attempt to search for Ig-domain-related structures in higher plants [14], extending the part of our research devoted to evolution [4] [5]. These Ig-domain-related aims were accompanied by our small contribution to the long-lasting research on the deep evolution of the catalytic domains of serine/threonine/tyrosine kinases [15] [16] [17] [18] [19] in plant kinase molecules containing Ig-like segments [14]. As is well known, some of these kinases often accompany IgV domains in non-vertebrate metazoan proteins (cf. [5] [14]). For more detailed comments on our phylogenic studies see chapter 3.

IgV-Like Phosphorylation Sites and Their Relatives Found in Human and Mouse Sequences
As was specifically described in our corresponding paper [6] (cf. also Figure 1), several types of BLAST searches for IgV-related segments occurring within sequences other than antigen receptors were performed using two multiple nucleotide sequence queries (MNQ) and score-derived or combined limits. In summary, these MNQ were formed by 149 preselected, different, conserved and non-redundant N-terminal IgV segments of reference mRNA sequences encoding Ig and T-cell receptors. Following the searches with MNQ, anti-redundancy procedures finally restricted the set to six hundred ten IgV-related nucleotide sequence segments.
To predict phosphorylation sites in the next selection step, the originally selected sequences were translated into the corresponding protein sequences. The prediction was carried out by means of two methods based on artificial intelligence [6]. For the selected associated results see chapters 2 and 3 of this paper. For the specific schemes and more extended comments see the previous papers [6] [14]. For a more detailed description of the necessary new approaches and applied statistics see the corresponding paper supplements accessible on http://www.papersatellitesjk.com/.
More precisely, we used the online programs KinasePhos2.0 and NetPhos2.0, which work on the principles of support vector machines and neural networks, respectively [20] [21]. These programs usually selected the most probable alternatives of substrate peptide segments for several kinases or phosphatases. This means that the programs mostly indicated fused chimerical regions of predicted phosphorylations (FCRP) with close amino acids predicted as phosphorylated in the tested sequences. Such regions looked like partially realized or only sleeping pluripotent germ-like structures rather than sole phosphorylation sites. Primarily, all peptide segments achieving at least one prediction score of 0.800 were selected. Subsequently, we required the presence of at least one tri- or tetranucleotide hypermutation motif, denoted here as HM*, in the regions encoding superiorly predicted phosphorylation sites. These HM* were defined by 1) their location at positions critical with respect to possible aa alteration, i.e. as motifs associated with non-synonymous mutations, and 2) the corresponding structure known from publications dealing with hypermutation of antigen receptors or hypermutation activities of the oncologically important APOBEC enzymes (cf. [6] [22] [23]).
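The two filters described above (a score limit and the presence of a hypermutation motif) can be illustrated with a short sketch. The motif list is an assumption chosen only for illustration: RGYW and WRCY are commonly cited hotspot motifs of antigen receptor hypermutation and TCW is a commonly cited APOBEC3 target, whereas the actual HM* set is defined in [6] and may differ. The function names are hypothetical, and the sketch does not test whether a motif sits at an aa-altering position.

```python
import re

# IUPAC nucleotide codes needed for the example motifs below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}

# Illustrative motifs only (assumption); the HM* set used in [6] may differ.
EXAMPLE_MOTIFS = ("RGYW", "WRCY", "TCW")


def motif_hits(dna, motifs=EXAMPLE_MOTIFS):
    """Return sorted (start, motif) pairs for all hits, overlapping ones included."""
    dna = dna.upper()
    hits = []
    for motif in motifs:
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = "(?=" + "".join(IUPAC[ch] for ch in motif) + ")"
        hits.extend((m.start(), motif) for m in re.finditer(pattern, dna))
    return sorted(hits)


def passes_score_limit(scores, limit=0.800):
    """True if at least one prediction score reaches the limit."""
    return any(score >= limit for score in scores)


if __name__ == "__main__":
    print(motif_hits("AGCTTCA"))
    print(passes_score_limit([0.41, 0.83]))
```

Positions are zero-based starts on the strand given as input; scanning the complementary strand would require a reverse-complement step that is omitted here.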
In the next step, we required either experimental confirmation published in databases (the databases PHOSIDA and Phospho.ELM [24] [25] were used in this case) or simultaneous achievement of the limiting score by both prediction methods. Seven protein sequences forming the set denoted as ALPS (i.e. antibody-like phosphorylation sites, sometimes interpreted as FCRP due to predictions) were selected by means of the databases. Thirty-five protein sequences different from ALPS composed the set called ALPS2 (a set of ALPS relatives mostly assembling FCRP), selected only by successful double predictions. This set included, among others, two pairs of identical human and mouse sequences occurring in the corresponding orthologues. Fusion of ALPS and ALPS2 then yielded a set of forty-two protein sequences entitled ALPS+, dealt with above all in the corresponding paper [6].
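The assignment rule of this step can be written down compactly. The following is a minimal sketch, assuming that a candidate has already passed the score and HM* filters; the function name and argument names are hypothetical, and the limit of 0.800 is taken from the text.

```python
def classify_site(db_confirmed, kinasephos_score, netphos_score, limit=0.800):
    """Assign a candidate segment to ALPS, ALPS2 or neither.

    db_confirmed: True if a phosphorylation database records the site.
    The two scores are the maximum scores of the two predictors.
    """
    if db_confirmed:
        return "ALPS"
    if kinasephos_score >= limit and netphos_score >= limit:
        return "ALPS2"
    return None


if __name__ == "__main__":
    print(classify_site(True, 0.50, 0.50))
    print(classify_site(False, 0.85, 0.90))
    print(classify_site(False, 0.85, 0.70))
```

The union of both labels corresponds to ALPS+ in the terminology used here.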
For each ALPS+, the product of the maximum scores obtained with KinasePhos2.0 and NetPhos2.0 was calculated. These products were distributed bimodally (formed two peaks), and not, as might be expected, with an exponential decrease. More detailed statistical evaluation then indicated a strong and significant prevalence (p < 0.05) of products higher than 0.880 (the product of two identical scores of 0.938), which constitutes the lower border of the dominant peak of product frequencies in the range of maximum possible product values [6]. This result suggests the existence of a selective evolutionary pressure that probably maintains the function of the ALPS+ of the dominant peak and thus also attests to the validity of their prediction. Consequently, ALPS and ALPS+ whose product values formed the dominant peak composed a set of thirty-seven functionally correlated ALPS+ items (including also the two identical sequence pairs described above in the case of ALPS2). The ALPS+ of this set are called ALPS++ here. IgV-related protein sequences including these ALPS++ were used in the subsequent phylogenic studies [14] (see also chapter 3). Some ALPS++ selected by the authors are shown in Table 1 (for the associated sequencing projects see [26]-[41] and the NCBI database of nucleotide sequences).
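The relation between the product limit 0.880 and the score 0.938 is simple arithmetic, shown in the sketch below. The function names are hypothetical, and treating the border value as inclusive is an assumption, since the text does not state whether a product exactly equal to the limit belongs to the dominant peak.

```python
def score_product(kinasephos_max, netphos_max):
    """Product of the two maximum prediction scores of one segment."""
    return kinasephos_max * netphos_max


def in_dominant_peak(kinasephos_max, netphos_max, border=0.880):
    """True if the score product reaches the lower border of the dominant peak."""
    return score_product(kinasephos_max, netphos_max) >= border


if __name__ == "__main__":
    # Two identical scores of 0.938 give the border value after rounding.
    print(round(score_product(0.938, 0.938), 3))
    print(in_dominant_peak(0.95, 0.93))
    print(in_dominant_peak(0.90, 0.90))
```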
Since IgV-related segments including ALPS+-encoding sequences were primarily searched by comparison with the representative nucleotide sequences of MNQ, not all of the corresponding peptide candidates for ALPS+ were found in the same reading frame as the antigen receptor sequences being compared. Nevertheless, a considerable number of the ALPS+ amino acid sequences translated outside the reading frame of antigen receptors were identified as substrates of almost the same phosphorylating enzymes as ALPS+ translated in the same frame as IgV [6].
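How one nucleotide segment yields different peptides depending on the reading frame can be shown with a minimal translation sketch based on the standard genetic code. The function names and the input sequence are hypothetical and serve only as an illustration; the sketch covers the three forward frames of the given strand.

```python
# Standard genetic code, codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate a nucleotide sequence starting at offset frame (0, 1 or 2)."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons with ambiguous letters are reported as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    print(three_frames("ATGGCCT"))
```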
In our opinion, this remarkable independence of the reading frame could be of interest for understanding the evolution of recognition plasticity by IgV of antigen receptors compared to ALPS+ (cf. for instance MHC context in recognition by T-cell receptors). In addition, the observed differences in the reading frame appear to be important in terms of possible epitope diversification after HM*-mediated mutation. While translation of the mutated code for ALPS+ in a different reading frame from antigen receptors could lead to neo-epitopes, the translation in the same frame more likely causes cross-interactivities with rheumatoid autoantibodies [6].
Despite independent selection for kinase specificity, 80 percent of the selected ALPS+ segments were predicted as substrates of the ataxia-telangiectasia-mutated kinase (ATM), which phosphorylates its substrates on serine. Interestingly, inactivating mutations of the gene encoding ATM inhibit the DNA damage response, concerning first of all double-strand breaks, and lead to oxidative stress, genomic instability, dysregulation of mitochondrial homeostasis and autophagy [42] [43] [44]. Such changes then cause cancer (mainly lymphomas), neurodegeneration, immunodeficiency, chronic lung disease and segmental premature aging [43] [45] [46] [47] [48]. Consistent with this fact, more than a quarter of the selected molecules containing ALPS+ were among those important in oncogenesis (cf. Table 1). Besides ATM-related ALPS+, we found, among others, ALPS+ related to the kinase Aurora, i.e. a kinase that directly regulates the activity of the most frequently predicted ATM via its phosphorylation. In addition to ATM-phosphorylated serine, phosphorylations of threonine and tyrosine were also predicted in the evaluated IgV-related protein sequence segments. In summary, the results concerning predictions of phosphorylating enzymes were consistent with the previously predicted occurrence of phosphorylation sites in the corresponding compared N-terminal parts of model conserved IgV [7].
Several notable relationships followed from the statistical evaluation of HM* occurrence in the corresponding DNA sequences [6]. Only six human and four mouse DNA segments encoding ALPS+ included HM* in template (RNA complementary, i.e. directly transcribed) strands. In addition, the ratio between the occurrence of HM* and the complementary group of hypermutation motifs was significantly and markedly higher in cases of non-template (lagging) DNA segments encoding ALPS+ (cf. Fig. 4c in [6]). These facts appeared to be interesting with respect to the observed predominant actions of mutator enzymes forming the APOBEC3 family in extended non-template DNA of model bacteria and cancer cells [49]-[54]. Hence, similar mutagenesis of oligonucleotide segments encoding actually phosphorylated ALPS+ could cause diminishing or loss of the corresponding phosphorylation or even an exchange (cf. FCRP in chapter 2) of the regulating phosphorylating enzyme. As is well known, such events sometimes lead to carcinogenesis [8] [9]. In addition, it is a question whether the considered mutation changes of at least some frequent ATM-related ALPS+ can also imitate the consequences of ATM failure mentioned above and thus lead not only to cancer but also to premature aging [ [48]. In accordance with the usual superior occurrence of HM* in segments of variable Ig genes encoding hypervariable regions 1 (CDR1), HM* occurred most frequently in ALPS+ encoded by nucleotide sequences similar to these CDR1. Twelve DNA segments encoding ALPS+ contained HM* whose specific change would directly remove phosphorylated serine from ALPS+.
Notes to Table 1.
a The order of items was sorted according to 1) database confirmed existence and 2) products of score values obtained with ALPS++ predictions. pos_aa: predicted or database confirmed phosphorylated aa is denoted by a single character accompanied by the number indicating the corresponding aa position in the corresponding IgV-related peptide; *F, *P, *PF: database records confirming the existence of phosphorylation sites were found using Phosida, Phospho.ELM or both databases, respectively; Hu: human; KinP, NetP: scores obtained with prediction realized by KinasePhos2.0 and NetPhos2.0, respectively; Mu: mouse; Sp: species origin.
b aa_HM*: number of amino acids (aa) which compose an existing or superior predicted phosphorylation site and are encoded by aa-altering nucleotide HM*; NS/PS: items in the two following rows concern nucleotide and protein sequences, respectively.
c The similarities between pre-selected conserved-IgV-domain-related nucleotide sequences of vertebrate antigen receptors (IgV-NS) and human or mouse non-Ig sequences were searched. A maximum local bit score higher than thirty, or a maximum bit score higher than twenty-five together with the presence of more than five supporting similarities, was required. The records of supporting similarities had to comprise a) almost the same segment of the subject sequence as the supported maximum similarity, b) different IgV-NS and c) a score higher than 22 bits. For more detailed information see the corresponding original paper [6]. FrS: frequency of supporting similarities; max BS: maximum bit score of similarity with IgV-NS; asterisk (*) after score values lower than 30: effective frequency of supporting similarities was required.
d Kinases specifically related to ALPS++ segments. #: autophosphorylation; Aur: Aurora-related kinase; ATM: ataxia-telangiectasia-mutated (kinase); CK1, CK2: casein kinases 1 and 2, respectively; GSK3: glycogen synthase kinase-3; MAP3K: MAPK kinase kinase; PKB, PKC: protein kinase B and C, respectively; PLK1: polo-like kinase 1; dash: confirmed phosphorylation was recorded by the database, though without knowledge of the phosphorylating enzyme.

Ig Domain Prehistory in the Focus of Conserved Domain Similarities
Higher plants represent Bikonta organisms. Consequently, they differ from the Unikonta group and its subgroup Opisthokonta, which includes among others the taxonomic group of animals, i.e. Metazoa [55] [56] [57] [58], which frequently encode IgV in their genomes (cf. [5]). Hierarchically, four types of similarities ponding to dense conserved domain similarities between fold length segments [14]. Forty-nine sequences achieving NRI similarities were found in our case. Nineteen of them were selected with the help of MQP including sequences of ALPS++. Significant conserved domain similarities represented the last type of the evaluated similarities. The Expect limit E* = 0.01 for these similarities, currently applied in the selected widely used CDD program [59] [60], was stricter than the usual validity limit p < 0.05. Hence this limit restricted the significant probability to p < w(E*) = 0.0099502 ≈ E*. In addition, independently verified fold similarities (fifteen sequences) and contexts selected by the authors (five sequences) also played a role in the final selection of the presented data (for the overall list of the selected twenty items see Table 3 in the previous paper [14]).
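The quoted value 0.0099502 can be reproduced numerically. The sketch below assumes that w denotes the standard conversion of an Expect value into the probability of at least one chance hit, p = 1 - exp(-E); this reading of w is an assumption that matches the number given in the text, and the function name is hypothetical.

```python
import math


def expect_to_p(expect):
    """Probability of at least one chance hit for a given Expect value.

    Uses p = 1 - exp(-E); expm1 keeps precision for small E.
    """
    return -math.expm1(-expect)


if __name__ == "__main__":
    # For the limit E* = 0.01 the probability is slightly below E*.
    print(round(expect_to_p(0.01), 7))
```

For small Expect values the two quantities nearly coincide, which is why the limit E* = 0.01 can be read as a probability limit of about 0.01.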
Serine/threonine protein kinase ALE2 of Hevea brasiliensis origin (sequence ID XP_021662681.1 [61]), containing a segment with significant conserved domain similarity to the vertebrate Ig1_Neogenin domain (sIg1N region at ALE2 positions 288-335), appeared to be the most interesting molecule in our study [14]. The identified conserved Ig1_Neogenin domain of this molecule usually indicates mammalian proteins involved in neurodegenerative processes, i.e. processes absent in plants. Since neurodegenerative processes in mammals are associated with aging, we can ask whether this domain can also participate in aging processes in plants [14]. In addition to the observed significant similarity, including ALPS++ were also found in the three cases of other molecules mentioned below (see Table 2). Among the molecules achieving the NRX level of conserved domain similarity with vertebrate conserved Ig domains and, at the same time, having the required fold similarities, we have to mention first of all 1) molecule XP_010937019.2 with a segment close to Ig-domains of IL1-receptors (a unique molecule selected by three independent methods and in addition exhibiting significant specificity of the top non-chimerical NRX similarity with the corresponding Ig domain), 2) an XP_013448976.1 molecule with maximum similarity to the IgV-H domain and two other co-localized Ig domain similarities at the NRX level, 3) a potassium channel molecule XP_018679675.1 selected by two NRX-limited Ig domain similarities. The similarities of potassium channel sequences to Ig domains have so far been described in mammals [63] [64]. Furthermore, Kubrycht and Sigler (2020) [14] considered five molecules with an interesting context. Four of them formed chimerical similarities with vertebrate conserved Ig domains at the NRX level. This means that minor Ig-domain similarities were usually part of segments achieving more extensive dominant significant conserved domain similarities with non-Ig domains. This was the case for the dominant similarities with conserved domains of 1) B-lectins in molecules XP_019152409.1 and XP_010228837.1, 2) self-incomp_S1, i.e. domains causing so-called self-incompatibility preventing inbred pollination of plants (cf. XP_017233469.1 in Table 2) and 3) leucine-rich-repeat-associated kinases PLN00113 (cf. RCH1 molecule with ID: XP_021677865.1). In the last case, the PLN00113-related segment contained two differently similar subsegments in two different parts. More precisely, this concerned a) a more N-terminally located subsegment similar to the Ig3_L1-CAML domain, and b) a subsegment simultaneously significantly similar to many catalytic conserved domains of STY-kinases including such Ig-domain-associated catalytic domains (these catalytic kinase domains are otherwise described in the three terminal paragraphs of this chapter). For further details regarding the description of the Ig-domain similarities mentioned here see Table 2 (for the sequencing projects associated with the displayed sequence items see [65]-[76] and the NCBI database of protein sequences).
Notes to Table 2.
Table 3 in [14]. **CoALIgdom: co-localized occurrence of 1) similarities with ALPS++ containing IgV-like segments (cf. Table 1) and 2) conserved Ig domain similarities.
b Together with the monitored conserved Ig domain similarities (Ig-cds), we show here the maximum conserved domain similarity present in each molecule-related CDD record and the molecular maxima related to similarities of catalytic kinase domains STYK-CD mentioned in chapter 3. For more detailed information about the displayed conserved domains see the special option in the menu on the NCBI web page.
c Pairs or triplets of restricted intervals present in grey elements: chimeras of co-locating recessive conserved Ig-cds and dominant non-Ig cds.
d AC-NRI: numbers and group-related maximum edge positions of all reciprocally co-locating or (if the number is one) solely found Ig-cds, whose Expects achieved at least the NRI level (i.e. E < 4.605) mentioned in chapter 3.
e We searched the maxima of HMMER-derived similarities between the selected Ig-domain-related segments of higher plant (Embryophyta) origin and the complementary set comprising sequences of all living organisms except for higher plants. WTG: well-known taxonomy group comprising species producing the most similar molecules restricted by HMMER searches; bold: Expect and taxonomy group of hot candidate segments for recent horizontal transfer (these segments were selected based on our empirical statistics; cf. chapter 3 and [14]).
A feedback database projection of all Ig-domain-related segments at the NRI similarity level was performed with the hidden-Markov-model-based program HMMER and provided a bimodal frequency distribution on the scale of the logarithms of the corresponding Expect values (cf. Table 2). In agreement with this bimodal (two-peak) distribution, the Expect limit (E < E# = 10^-26) was derived from the inferior values of the peak comprising the values determining superior similarities. Four Ig-domain-related segments displayed in Table 2 achieved the correspondingly limited very high similarities (VHS). Due to the unusually high similarities restricted by E#, these segments appeared to be hot candidates for products of recent horizontal transfer of genes or gene segments (cf. web supplement to [14] and Table 3 in the same paper). As can be expected based on simple skeptical objections assuming horizontal transfer of Ig domains from the metazoans, VHS to metazoan proteins included both selected peptide segments forming significant conserved domain similarities to vertebrate Ig domains, i.e. segments of ALE2 and Gamete expressed 2. More precisely, these important VHS were indicated when comparing the Ig-domain-related segments of ALE2 and Gamete expressed 2 with proteins of the stony corals Pocillopora (primitive metazoans belonging to the clade Cnidaria) and of insect origin, respectively [14]. On the other hand, two selected HMMER-derived VHS to sequences of Fungi origin can be classified as evolutionarily interesting with respect to future supplementary investigation of Ig-domain-related molecules in hitherto neglected clades of Opisthokonta [77] [78]. These VHS were composed of the already mentioned Ig-domain-related segments of the potassium channel and the RCH1 molecule (see the previous two paragraphs and Table 2).
An analysis of the sequences encoding multi-domain plant receptor/receptor-like kinases (RKs) also provided some orientation in the molecular evolution of signaling by cell-surface immune receptors [14]. These 121 sequences were restricted with the help of the Ig-like similarities described above. Concerning the relationship to ALPS++, the majority of RKs (104 of 121 sequences) were selected with the combinatory contribution of IgV-related sequences including ALPS++, i.e. always using one of the two MQP mentioned above. Consistent with the records of significant conserved domain similarities or with the literature, the majority or at least some of the selected RKs, respectively, appeared to be involved in plant antiviral immunity [14] [79]-[84]. In accordance with the possible signaling role and the name of RKs, their protein sequences frequently contained a region or even regions simultaneously achieving many co-localized robust (extremely significant) conserved domain similarities with different model serine/threonine/tyrosine kinase catalytic domains (STYK-CD; cf. [15] [17] [19]). This means that the corresponding plant protein segments of RKs represented common similarity regions (CSR), which gave evidence of their undiversified structure with respect to the indicated familial group of STYK-CD (a superfamily subset). Consequently, the conserved domain similarities indicated a slowed down (or even frozen) evolution of STYK-CD-related segments in plants.
The domain group of model STYK-CD included representative subset of nine special conserved tyrosine kinase domains (with robust average domain similarities in the range of 60 -120 bits), which were often associated with IgV (cdigvtk) forming joint peptide chains of metazoan proteins [6]. In accordance with our data, cdigvtk represented primarily a reasonably defined domain set due to their model-validated IgV association in primitive Metazoa and minimized redundancy of the domain set (cf. section 2.9 and Fig. 4 in [14]). In 112 cases of the Ig-like RKs, these segments were significantly similar to at least one cdigvtk and in 102 cases to all nine cdigvtk. However, the other conserved domains than cdigvtk attained superior similarities with the STYK-CD-related segments of molecules from the set of RKs. This concerned mainly maximum similarities of CSR to conserved catalytic kinase domain associated with the interleukin 1 receptor (IRAK, i.e. cd14066; presented robust maximum mean bit score 286 bits) which occurred in 113 cases comprising CSR of 112 molecules participating in significant conserved domain similarities selected by cdigvtk. In accordance with the slowed down evolution of the STYK-CDs mentioned above, the dominantly similar IRAK could be assumed as a domain close to the molecular ancestor of STYK-CD, probably occurring earlier than the last common ancestor of plants and animals [14]. In addition, CSR of 118 RKs molecules, including CSR of 113 molecules significantly similar to IRAK, were significantly similar to the conserved domain of the leucine-rich-repeat-associated kinase PLN00113. The corresponding robust mean similarities with RKs were smaller than for IRAK but higher than for cdigvtk. However, certain segments immediately N-terminally adjacent to CSR of the compared plant sequences were sometimes also similar to PLN00113. 
Such very robust extended similarities in some cases exceeded 450 bits, which corresponded to an overall similarity much higher than any IRAK domain similarity. This fact, together with the more frequent occurrence of the significant CSR-related conserved domain similarities in the set of RKs mentioned above, could indicate the similarity of PLN00113 with the domain ancestor of STYK-CD even older than the considered ancestor close to IRAK (cf. [14]). This would explain the lower conserved domain similarity of CSR to PLN00113 than to IRAK as a consequence of higher phylogenic distance when assuming scarcely maintained original functional importance of the critical N-terminally adjacent segments in some cases of PLN00113-related sequences.
The apparently controversial point in terms of the evolutionary relationship between the discussed IgV-associated cdigvtk domains and Ig domains was noticed. This point consisted in the fact that the set of forty nine molecules achieving superior Ig domain NRI similarities comprised only six or seven molecules with significant conserved domain similarities to all cdigvtk or at least one cdigvtk, respectively. The corresponding fraction was therefore significantly and markedly lower than the already mentioned fractions of significant cdigvtk similarities in the case of the set of RKs. Perhaps we could still consider the possibility of 1) slowed down development of STYK-CD and thus also their slower gradual evolution of gene-gene interactions with Ig domain ancestors or 2) interesting but not yet verified alternative of evolution from ancestor structures resembling some Ig-domain-related subsegment of PLN00113 to Ig domains (cf. RCH1 mentioned above and in Table 2).
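The difference between the two fractions can be checked with a simple 2x2 evaluation. The sketch below uses the counts given in the text (7 of 49 molecules of the NRI set and 112 of 121 RKs with significant similarity to at least one cdigvtk) and computes an odds ratio together with a one-sided Fisher exact probability. Treating the two sets as independent samples and using Fisher's test are assumptions made only for illustration; the evaluation actually applied is described in [14]. The function names are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_lower_tail(a, b, c, d):
    """One-sided Fisher exact probability of a or fewer in the top-left cell.

    The margins of the table [[a, b], [c, d]] are kept fixed, so the
    top-left count follows a hypergeometric distribution.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    favourable = sum(
        comb(col1, k) * comb(total - col1, row1 - k) for k in range(a + 1)
    )
    return favourable / comb(total, row1)


if __name__ == "__main__":
    # Rows: NRI set, RKs; columns: similar to a cdigvtk, not similar.
    table = (7, 42, 112, 9)
    print(odds_ratio(*table))
    print(fisher_lower_tail(*table))
```

An odds ratio far below one together with a very small tail probability would correspond to the markedly lower fraction described above.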

Perspectives
The volume of sequence data and the repertoire of tools important for their processing keep growing. In twenty years, super-smart computers may globally solve the problems we have listed here. Until then, however, we need to know a wide range of answers that will help physicians treat better and super-smart computers calculate, search or select better and more objectively.
We consider a more detailed study of the corresponding subset of cancer-related molecules to be a suitable continuation of our effort. In this more detailed study, we would like to include the re-evaluation of selected segments using 1) database mapping of cancer-related mutations [ [107] comprising also more specific cancer related neo-epitopes [108]. Based on literature data, the corresponding phylogenic studies could also include sequence sets of non-metazoan Unikonta (Amoebozoa, Apusomonadida, Breviata), more specifically evaluating non-metazoan Opisthokonta (including Fungi mentioned above). The reason for the proposed choice of the target organisms consists in the fact that a large number of early Ig domains and Ig-like segments occur in non-opisthokontal Unikonta, whereas structures close to IgV can be found in non-metazoan Opisthokonta [109] [110] [111] [112] (see also Figure 2 and cf. [113]-[120]). Due to our data and their statistical evaluation presented in the preceding chapter and in [14], we assume that the corresponding future investigation could moreover expand to certain additional taxonomic groups of species beyond those following from the literature. More precisely, this extension concerns a not yet considered part of Neozoa representing close descendants of the last common ancestor of higher plants and IgV-containing metazoans (cf. bottom branch following the limiting dark brown division in the tree displayed in Figure 2). The proposed taxonomy-based extensions of compared sequences would require a feasible usage of more invasive and specific search strategies than those used in our previous paper [14], when respecting the recent frames of taxonomy relationships following from taxonomic databases [121] [122] and recent knowledge [58] [77] [123]-[130]. The possible relationships of at least some ALPS+ to aging have also been discussed here and in our last but one paper [6]. We suppose that this context of our investigation will also meet with a better bioinformatic and more specific factual background in this decade.
Figure 2. Phylogenic tree illustrating divergent evolution of investigated higher plants (Embryophyta) and metazoans expressing structurally important IgV (domains). This tree was constructed when comparing NCBI taxonomy [122] and the trees recently published in Wikipedia (see also the corresponding recent papers [113]-[120]). Blue: notes and comments; dark brown: branches or leaves substantial for the illustrated divergence; green: simplified restriction of terminal parts, i.e. leaves, of the displayed tree; mya: million years ago.
We also hope that at least some of our methodological procedures and approaches, described so far mostly in publication supplements (e.g. our BLAST-related evaluation of multiple sequence alignments, an approximation determining the specificity of conserved domain similarities and several contributions to odds ratio evaluations), will be made publicly available in the future. In addition, an important methodological question for the future evidently consists in a deeper structural description of the observed sequence relationships.