Comparative structures and evolution of mammalian lipase I (LIPI) genes and proteins: A close relative of vertebrate phospholipase LIPH

Lipase I (enzyme name LIPI or LPDL) (gene name LIPI [human] or Lipi [mouse]) is a phospholipase which generates 2-acyl lysophosphatidic acid (LPA), a lipid mediator required for maintaining homeostasis of diverse biological functions and for activating cell surface receptors. Bioinformatic methods were used to predict the amino acid sequences, secondary and tertiary structures and gene locations for LIPI genes and encoded proteins using data from several mammalian genome projects. LIPI is located on human chromosome 21 and is distinct from other phospholipase A1-like genes (LIPH and PS-PLA1). Mammalian LIPI genes contained 10 (human) or 11 (mouse) coding exons transcribed predominantly on the negative DNA strand. Mammalian LIPI protein subunits shared 61%-99% sequence identities and exhibited sequence alignments and identities for key LIPI amino acid residues as well as extensive conservation of predicted secondary and tertiary structures with those previously reported for pancreatic lipase (PL), with “N-signal peptide”, “lipase” and “plat” structural domains. Comparative studies of mammalian LIPI sequences with LIPH, PS-PLA1 and pancreatic lipase (PL) confirmed predictions for LIPI N-terminal signal peptides (residues 1-15); predominantly conserved mammalian LIPI N-glycosylation sites (63NNSL and 396NISS for human LIPI); active site “triad” residues (Ser159; Asp183; His253); disulfide bond residues (238-251; 275-286; 289-297; 436-455); and a 12-residue “active site lid”, which is shorter than for the other lipases examined. Phylogenetic analyses supported a hypothesis that LIPI arose from a vertebrate LIPH gene duplication event within a mammalian common ancestral genome. In addition, LIPI, LIPH and PS-PLA1 genes were distinct from the vascular lipase (LIPG, LIPC and LPL) and pancreatic lipase (PL) gene families.

Human LIPI and mouse Lipi genes are expressed in the testis, with LIPI found at the connecting piece of primary spermatocytes [1,2], although a specific role for LIPI in LPA signalling has not as yet been reported [10]. LIPI has been identified as a Ewing tumor-associated cancer/testis antigen (called CTA17), which is also expressed in testis and thyroid tissues [11,12]. The human LIPI gene spans more than 98 kilobases of DNA, comprises 10 coding exons on the reverse strand and is localized on chromosome 21 near a gene encoding the RNA binding motif protein 11 (RBM11) [13,14]. Mouse Lipi spans more than 45 kilobases of DNA, comprises 11 coding exons on the reverse strand of chromosome 16, and is also located near the gene encoding RNA binding motif protein 11 (RBM11) [2,14]. Mutations of the human LIPI (or LPDL) and mouse Lipi (or Lpdl) genes have been associated with dyslipidemia in humans and hypertriglyceridemia in mice [2]. In addition, a monosomic deletion model for the mouse chromosome 16 Lipi-Usp25 region exhibited a significant increase in fat mass, thickened subcutaneous fat and liver steatosis when fed a high-fat diet [15]. Hesse and coworkers [16] have recently identified two LIPH/LIPI-like genes in the chicken genome with wide tissue expression profiles.
There have been few biochemical and structural studies of mammalian LIPI; however, two reports have described amino acid sequences for human and mouse LIPI, derived from cDNA sequences, encoding 460 and 476 amino acids, respectively, and containing N-terminal and lipase domains in each case [1,2]. Human LIPI (also called LPDL) exhibited 71% identity with mouse LIPI, ~44% identity with LIPH and ~35% identity with vascular lipase sequences [lipoprotein lipase (LPL), endothelial lipase (EL) and hepatic lipase (HL)]. In common with the human and mouse LIPH amino acid sequences, the LIPI sequences were characterized by shortened "active site lid" motifs, but LIPI exhibited a higher affinity for heparin than LIPH.
This paper reports the predicted gene structures and amino acid sequences for several mammalian LIPI genes and proteins, the predicted secondary and tertiary structures for mammalian LIPI protein subunits, and the structural, phylogenetic and evolutionary relationships for these genes and enzymes with the vertebrate LIPH and PS-PLA1 gene families and with vertebrate vascular and pancreatic lipase gene families.

Mammalian LIPI Gene and Protein Identification
BLAST (Basic Local Alignment Search Tool) studies were undertaken using web tools from the National Center for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi) [17]. Protein BLAST analyses used human and mouse LIPI amino acid sequences previously described [1,2] (Table 1). Non-redundant protein sequence databases for several mammalian and other vertebrate genomes were accessed from sources previously described [18]. Predicted LIPI-like protein sequences were obtained in each case and subjected to analyses of predicted protein and gene structures.
BLAT (BLAST-Like Alignment Tool) analyses were subsequently undertaken for each of the predicted vertebrate LIPI amino acid sequences using the University of California Santa Cruz (UCSC) Genome Browser [http://genome.ucsc.edu/cgi-bin/hgBlat] [14] with the default settings to obtain the predicted locations for each of the vertebrate LIPI genes, including predicted exon boundary locations and gene sizes. This browser was also used to show alignments of LIPI genes from several vertebrate genomes (called Multiz alignments). Structures for human and mouse LIPI isoforms were obtained using the AceView website to examine predicted gene and protein structures [19]. Mammalian LIPI sequences were aligned using the ClustalW2 multiple sequence alignment program [20].

Predicted Structures and Properties of Mammalian LIPI Protein Subunits
Predicted secondary and tertiary structures for mammalian LIPI-like subunits were obtained using the PSIPRED v2.5 web site tools [21] and the SWISS MODEL web tools [swissmodel.expasy.org], respectively [22]. The reported tertiary structure for horse pancreatic lipase (PDB: 1hpl) [23] served as the reference for the predicted human, mouse and rat LIPI tertiary structures, with a modeling range of residues 10 to 447. Theoretical isoelectric points and molecular weights for mammalian LIPI subunits were obtained using Expasy web tools (http://au.expasy.org/tools/pi_tool.html). SignalP 3.0 web tools were used to predict the presence and location of signal peptide cleavage sites (http://www.cbs.dtu.dk/services/SignalP/) for each of the predicted mammalian LIPI sequences [24]. The NetNGlyc 1.0 Server was used to predict potential N-glycosylation sites for mammalian LIPI subunits (http://www.cbs.dtu.dk/services/NetNGlyc/).
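As a rough illustration of what an N-glycosylation site search looks for, the following minimal Python sketch scans a protein sequence for the canonical Asn-X-Ser/Thr sequon (X being any residue except Pro). It does not reproduce the NetNGlyc scoring used in this study, which assigns a probability to each candidate site; it only lists candidates. The function name `find_sequons` and the toy sequence are hypothetical and are not taken from any LIPI sequence.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping candidates (e.g. "NNSS") each be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return (1-based Asn position, tripeptide) for each candidate sequon."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


# Hypothetical toy peptide (illustration only, not a real LIPI sequence).
toy = "MANNSLKPNPTANISS"
for position, motif in find_sequons(toy):
    print(f"candidate site at Asn{position}: {motif}")
```

A candidate found this way is only a necessary condition for glycosylation; whether a site is likely to be used still requires a predictor or experimental evidence.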

Phylogenetic Studies and Sequence Divergence
Alignments of mammalian LIPI; human, mouse and zebrafish LIPH (lipase H); human, mouse, rat, chicken and zebrafish PS-PLA1; HL (hepatic lipase), LPL (lipoprotein lipase) and EL (endothelial lipase) protein sequences; and human, mouse and frog pancreatic lipase (PL) sequences (Table 1s) were assembled using BioEdit v.5.0.1, as previously described [25]. Ambiguously aligned regions were excluded prior to phylogenetic analysis, yielding alignments of 379 residues for comparisons of sequences (Table 1). Evolutionary distances were calculated using the Kimura option as previously described [18]. Phylogenetic trees were constructed from evolutionary distances using the neighbor-joining method [25]. Tree topology was examined by the bootstrap method of resampling (100 bootstraps were applied), and only values that were highly significant (≥95) are shown [26].
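The distance calculation can be sketched as follows. This minimal Python example assumes that the "Kimura option" refers to Kimura's empirical correction for protein distances, d = -ln(1 - p - 0.2p^2), where p is the fraction of differing residues at ungapped aligned positions; the paper does not state the formula, so this is an assumption. The function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over ungapped aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def kimura_distance(p):
    """Kimura's empirical protein distance (assumed form): -ln(1 - p - 0.2*p^2)."""
    argument = 1.0 - p - 0.2 * p * p
    if argument <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(argument)


# Hypothetical aligned toy sequences (illustration only).
a = "ACDEFGHIKL"
b = "ACDEYGHIRL"
p = p_distance(a, b)
print(f"p = {p:.3f}, Kimura distance = {kimura_distance(p):.4f}")
```

A matrix of such pairwise distances is the input that a neighbor-joining implementation would then use to build the tree.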
Alignments of the human and other mammalian LIPI subunits examined showed between 61% and 99% sequence identities, suggesting that these are products of the same family of genes and proteins (Table 2). The amino acid sequences for human and other primate LIPI subunits contained 460 residues, while other mammalian LIPI subunits contained between 452 (dog LIPI) and 476 (mouse and rat LIPI) residues (Figure 1; Table 1). The mouse and rat LIPI sequences differed in length from the primate LIPI sequences as a result of 16 additional residues at the amino terminus. Several key amino acid residues for vertebrate LIPI were recognized (sequence numbers refer to human LIPI) (Figures 1 and 1s). These included the catalytic triad for the active site (Ser159; Asp183; His253) forming a charge relay network for substrate hydrolysis, similar to human lipoprotein lipase, other lipases and carboxylesterases [27,28]; a hydrophobic N-terminal signal peptide (residues 1-15), similar to that reported for rat hepatic lipase [29]; and disulfide bond forming residues (Cys238/Cys251; Cys275/Cys286; Cys289/Cys297; and Cys436/Cys455) [1,2]. Identical residues were observed for each of the mammalian LIPI subunits for the active site triad and disulfide bond forming residues; however, the N-terminal 15-residue signal peptide underwent some changes in sequence but retained predicted signal peptide properties (Figures 1 and 1s; Table 1). Two of the N-glycosylation sites predicted for human LIPI, at Asn63-Asn64-Ser65 and at Asn396-Ile397-Ser398 (designated as sites 3 and 9, respectively), were retained for most of the 11 mammalian LIPI sequences examined, although predicted N-glycosylation sites were observed at other positions for some sequences, including Asn23-Thr24-Thr25 (site 1) for mouse LIPI; Asn34-Arg35-Thr36 (site 2) for rat LIPI; and Asn61-Phe62-Ser63 and Asn71-Phe72-Ser73 (site 4) for dog and panda LIPI, respectively (Table 3). Given the reported role of the N-glycosylated carbohydrate group in contributing to the stability and maintaining the catalytic efficiency of a related enzyme (human carboxylesterase or CES1) [30], this property may be shared by mammalian LIPI as well. Human LIPI Ser25, which has been previously reported to undergo phosphorylation in response to DNA damage [31], has also been retained (but with Thr41 for mouse LIPI) for each of the mammalian LIPI sequences examined (Figure 1).

Figure 1. Amino acid sequence alignments for mammalian LIPI subunits. See Table 1 for sources of LIPI sequences; * shows identical residues for LIPI subunits; : similar alternate residues; . dissimilar alternate residues; N-signal peptide residues are in red; N-glycosylation residues are in green; active site triad residues Ser, Asp and His are in pink; phosphorylated human Ser25 is in khaki; disulfide bond Cys residues are in blue; predicted helix; predicted sheet; bold font shows known or predicted exon junctions; exon numbers refer to human LIPI gene; predicted "lid" residues covering the active site (human LIPI residues 239-250) are shown #####; Hu-human LIPI; Or-orangutan LIPI; Gi-gibbon LIPI; Ma-marmoset LIPI; Mo-mouse LIPI; Ra-rat LIPI.
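Checking whether a key residue is retained across species amounts to mapping a reference residue number (for example, human Ser159) onto an alignment column and comparing the residues in that column. The Python sketch below shows one way this could be done; the helper names and the three-sequence toy alignment are hypothetical and stand in for a real multiple alignment such as ClustalW2 output.

```python
def column_of(aligned_ref, residue_number, gap="-"):
    """Map a 1-based ungapped residue number to a 0-based alignment column."""
    count = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            count += 1
            if count == residue_number:
                return col
    raise ValueError("residue number exceeds reference sequence length")


def conserved_at(alignment, ref_name, residue_number):
    """Return (is_identical, residues_by_sequence) for one reference residue."""
    col = column_of(alignment[ref_name], residue_number)
    residues = {name: seq[col] for name, seq in alignment.items()}
    return len(set(residues.values())) == 1, residues


# Hypothetical toy alignment (illustration only, not real LIPI sequences).
toy_alignment = {
    "ref": "MK-SGHD",
    "sp2": "MKASGHD",
    "sp3": "-KTSAHD",
}
for number in (3, 4):
    identical, residues = conserved_at(toy_alignment, "ref", number)
    print(number, identical, residues)
```

Counting only non-gap characters in the reference row is what keeps the residue numbering tied to the reference sequence rather than to the alignment length.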

Alignments of Human LIPI and Other Lipase Subunits
Alignments of human LIPI, lipase H (LIPH) [32], phosphatidylserine specific phospholipase A1 (PS-PLA1) [8][9][10], endothelial lipase (EL) [18,33], hepatic lipase (HL) [34], lipoprotein lipase (LPL) [35] and pancreatic lipase (PL) [36] sequences are shown in Figure 2. The following key amino acid residues were observed for each of these lipases, consistent with those observed for the mammalian LIPI sequences (see Table 1). These included the N-terminal signal peptide sequences, which were distinct for each lipase but retained signal peptide properties; the active site triad residues aligning with LIPI Ser159, Asp183 and His253; and the disulfide cysteine residues reported for mammalian LIPH and LIPI sequences (human LIPI Cys238/Cys251; Cys275/Cys286; Cys289/Cys297; and Cys436/Cys455), although human PS-PLA1 did not contain the last disulfide pair, and additional disulfide bonds were observed for EL, LPL and HL (corresponding to human EL Cys64/Cys77) and for human PL (corresponding to Cys20/Cys26 and Cys107/Cys118). Distinct N-glycosylation sites were predominantly observed for each of the lipase sequences examined. The active site "lid" sequences showed that LIPI, LIPH and PS-PLA1 exhibited fewer residues in each case (12 amino acids) in comparison with the other lipases (EL [19 residues]; LPL and HL [22 residues]; and PL [23 residues]). A region with a high basic amino acid content was observed for human LIPI (Arg299→Arg321), aligning with a heparin binding site (human EL 324Lys→333Lys) reported for human EL, which binds the enzyme to heparan sulfate proteoglycans on the luminal side of endothelial cells [37,38]. The latter region for human LIPI may explain the enhanced heparin binding reported for LIPI, in comparison with LIPH, and the higher isoelectric point (pI) values observed for mammalian LIPI proteins (Table 1) [1].
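The notion of a "high basic amino acid content region" can be made concrete by counting Arg and Lys residues within a stretch of sequence. The Python sketch below is only illustrative: the function name `basic_fraction`, the choice to count only R and K (as highlighted in the Figure 2 legend), and the toy sequence are assumptions, not values from the study.

```python
def basic_fraction(seq, start, end):
    """Fraction of Arg/Lys residues in the 1-based inclusive range start..end."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range must lie within the sequence")
    region = seq[start - 1:end].upper()
    return sum(ch in "RK" for ch in region) / len(region)


# Hypothetical toy sequence (illustration only, not a real LIPI sequence).
toy = "AARKRKAAAA"
print(basic_fraction(toy, 3, 6))   # basic-rich stretch
print(basic_fraction(toy, 1, 10))  # whole sequence for comparison
```

Comparing the fraction inside a candidate region with the fraction over the whole sequence gives a simple sense of how enriched the region is.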

Predicted Secondary and Tertiary Structures of Mammalian LIPI Subunits
Predicted secondary structures for mammalian LIPI sequences were compared in Figure 1, and similar α-helix and β-sheet structures were observed for all of the mammalian LIPI subunits examined. Consistent structures were particularly apparent near key residues or functional domains, including the β-sheet and α-helix structures near the active site Ser159 (β4/α3) and His253 (α4) residues, and the conserved disulfide bonds at Cys275/Cys286 (near α4) and Cys436/Cys455 (near β14). Proximate to the active site histidine (His253) and located between two residues (Cys238/Cys251) forming intramolecular disulfide bonds are 12 amino acids forming the proposed "active site lid" structure for vertebrate LIPI [1,2]. In addition, eight β-sheets were observed at the LIPI C-terminus, which is consistent with PLAT domain structures previously reported for horse pancreatic lipase (PL) [23]. It is apparent from these studies that the LIPI subunits examined have highly similar secondary structures.

Figure 2. See Table 1 and Supplementary Table 1 for sources of human LIPI, LIPH, PS-PLA1, endothelial lipase (EL), lipoprotein lipase (LPL), hepatic lipase (HL) and pancreatic lipase (PL) sequences; * shows identical residues for subunits; : similar alternate residues; . dissimilar alternate residues; N-signal peptide residues are in red; known or predicted N-glycosylation residues are in green; active site triad residues Ser, Asp and His are in pink; disulfide bond Cys residues are shown (C); predicted helix; predicted sheet; bold font shows known or predicted exon junctions; exon numbers refer to human LIPI gene; basic amino acid residues (R and K) located in the heparin binding region of EL, LPL and HL are shown; predicted "lid" residues covering the active site (human LIPI residues 238-251) are shown #####; the following domains were recognized for human LIPI: "lipase": residues 18-300; "hinge": residues 301-321; and "plat": residues 322-460.
The tertiary structure for horse pancreatic lipase (PTL) is from [28]; predicted mouse, rat and human LIPI 3-D structures were obtained using the SWISS MODEL web site http://swissmodel.expasy.org and based on the reported structure for horse PTL (PDB: 1hpl); the rainbow color code describes the 3-D structures from the N- (blue) to C-termini (red); N refers to the amino terminus; C refers to the carboxyl terminus; the "lipase" and "plat" domains, the active site region and the "lid" covering the active site are shown.
Figure 3 describes predicted tertiary structures for human, mouse and rat LIPI sequences, in comparison with horse pancreatic lipase (PL) [23]. Identification of specific structures within the predicted LIPI sequences was based on the reported structure for horse PL, which identifies a sequence of twisted β-sheets interspersed with several α-helical structures typical of the alpha-beta hydrolase superfamily. The active site LIPI triad was centrally located, which is similar to that observed for other lipases [18,23,39] and carboxylesterase (human CES1) [40]. The major difference between LIPI and the other lipases examined (see Figures 2 and 3) is the much smaller size of the "lid" region at positions 239-250, which may act as a surface loop that partially covers the opening to the catalytic triad and allows access to the active site by LIPI substrates. This "lid" structure is readily apparent in the predicted structures for human, mouse and rat LIPI. These comparative studies of mammalian LIPI proteins suggest that the properties, structures and key sequences are substantially retained for the mammalian sequences examined.

Predicted Gene Locations and Exonic Structures for Mammalian LIPI Genes
Table 1 summarizes the predicted locations for mammalian LIPI genes based upon BLAT interrogations of several mammalian genomes using the reported sequences for human and mouse LIPI [1,2], the predicted sequences for other mammalian LIPI proteins and the UCSC Genome Browser [14]. Human and mouse LIPI genes were located on human chromosome 21 and mouse chromosome 16, which are distinct from the gene locations for other lipases, including lipase H (LIPH) (human chromosome 3 and mouse chromosome 16); phosphatidylserine specific phospholipase A1 (PS-PLA1) (human chromosome 21 and mouse chromosome 16); hepatic lipase (LIPC) (human chromosome 15 and mouse chromosome 9); endothelial lipase (LIPG) (both human and mouse genes on chromosome 18); lipoprotein lipase (LPL) (both human and mouse genes on chromosome 8); and pancreatic lipase (PL) (human chromosome 10 and mouse chromosome 19), respectively (Tables 1 and 1s). Figure 1 summarizes the predicted exonic start sites for several mammalian LIPI genes, with each having 10 or 11 (mouse and rat Lipi) exons in identical or similar positions to those reported for the human LIPI and mouse Lipi genes [1,2,14]. Human LIPG, LIPC, LPL and PL genes contained 10, 9, 9 and 12 exons, respectively, several of which are in similar positions to exons of vertebrate LIPI genes, suggesting that these are related genes (Figure 2). Figure 4 presents the structures for human LIPI and mouse Lipi transcripts, with the major transcript isoforms being designated as NM_198996 and NM_001252513, respectively [1,2,14]. The transcripts were 1.7 and 2.1 kilobases in length, respectively, with 10 introns and 11 exons in each case.
Figure 2s shows a UCSC Genome Browser Comparative Genomics track that shows evolutionary conservation and alignments of the nucleotide sequences for the human LIPI gene, including the 5'-flanking, 5'-untranslated, intronic, exonic and 3'-untranslated regions of this gene, with the corresponding sequences for 8 vertebrate genomes, including 4 eutherian mammals (rhesus monkey, mouse, elephant and dog), a marsupial (opossum), a bird (chicken), frog and zebrafish genomes. Extensive conservation was observed only for the mammalian LIPI genes, particularly for the rhesus LIPI gene and for exonic sequences for eutherian mammalian LIPI genes. In contrast with the eutherian mammalian genomes examined, other vertebrate genomes exhibited few conserved sequences, which indicates that only the mammalian LIPI genes were predominantly conserved throughout vertebrate evolution.
Derived from AceView website [31]; the major isoform variants are shown with capped 5'-and 3'-ends for the predicted mRNA sequences; introns and coding exons are shown; the direction for transcription is shown;

Phylogeny and Divergence of Mammalian LIPI and other Vertebrate Lipase Sequences
A phylogenetic tree (Figure 5) was calculated by the progressive alignment of human and other vertebrate LIPI amino acid sequences with human, mouse, rat, chicken and zebrafish LIPH; human, mouse, rat, chicken, lizard and zebrafish PS-PLA1; human, mouse and zebrafish hepatic lipase (HL) and endothelial lipase (EL); human, mouse and stickleback (fish) lipoprotein lipase (LPL); and human, mouse and frog pancreatic lipase (PL) sequences. The phylogram showed clustering of the mammalian LIPI sequences, which were distinct from the vertebrate LIPH and PS-PLA1 sequences and from the vertebrate vascular lipase (HL, EL and LPL) and pancreatic lipase (PL) families. In addition, LIPH and LIPI sequences clustered together, which is consistent with these genes being products of a recent duplication event during mammalian evolution. Overall, the data suggested that the mammalian LIPI gene arose from a gene duplication event of an ancestral LIPH-like gene, resulting in two separate lines of mammalian gene evolution, namely the LIPI and LIPH genes. This is supported by the comparative biochemical and genomic evidence for mammalian LIPI and vertebrate LIPH genes and encoded proteins, which share several key features of protein and gene structure, including similar alpha-beta hydrolase secondary and tertiary structures.
The tree is labeled with the gene name and the name of the vertebrate. Note the major cluster for the mammalian LIPI sequences and the separation of these sequences from vertebrate LIPH (lipase H), PS-PLA1, human, mouse and zebrafish HL (hepatic lipase), EL (endothelial lipase) and LPL (lipoprotein lipase) sequences and human, mouse and frog PL (pancreatic lipase) sequences. See Tables 1 and 1s for details of sequences and gene locations. A genetic distance scale is shown (% amino acid substitutions). The number of times a clade (sequences common to a node or branch) occurred in the bootstrap replicates is shown. Only replicate values of 95 or more, which are highly significant, are shown, with 100 bootstrap replicates performed in each case.

Conclusion
The results of this study suggest that mammalian LIPI genes and encoded LIPI enzymes represent a distinct alpha-beta hydrolase-like gene and enzyme family which shares key conserved sequences and structures that have been reported for human LIPH [5][6][7], PS-PLA1 [8][9][10], the vascular lipase gene families, hepatic lipase (HL) [41,42], endothelial lipase (EL) [18,33] and lipoprotein lipase (LPL) [35], and for pancreatic triglyceride lipase (PL) [36]. LIPI is a membrane-associated phosphatidic acid-selective phospholipase catalysing the production of fatty acids and lysophosphatidic acid. Bioinformatic methods were used to predict the amino acid sequences, secondary and tertiary structures and gene locations for LIPI genes and encoded proteins using data from several vertebrate genome projects. Mammalian LIPI protein subunits shared 61%-99% sequence identities and exhibited sequence alignments and identities for key LIPI amino acid residues as well as extensive conservation of predicted secondary and tertiary structures with those previously reported for horse pancreatic lipase, with "N-signal peptide", "lipase" and "plat" structural domains. Phylogenetic analyses demonstrated the relationships and potential evolutionary origins of the mammalian LIPI family of genes from a duplication of a vertebrate LIPH ancestral gene, for which duplicated genes have been reported for chicken LIPH genes [16]. These studies indicated that LIPI genes appeared early in eutherian mammalian evolution.

Figure 3. Tertiary structure for horse pancreatic lipase (PL) and predicted tertiary structures for mouse, rat and human LIPI subunits.

Figure 4. Gene structures and isoforms for the human LIPI and mouse Lipi genes.

3' UTR refers to 3'-untranslated region; a scale is shown in base pairs (bps); coding exons are in pink; untranslated 5' and 3' regions are shown as open rectangles.

Figure 2s. Comparative sequences for vertebrate LIPI genes. Derived from the UCSC Genome Browser using the Comparative Genomics track to examine alignments and evolutionary conservation of vertebrate LIPI gene sequences; genomic sequences aligned for this study included rhesus, mouse, dog, elephant, a marsupial (opossum), bird (chicken), amphibian (frog) and fish (zebrafish); sequence identity is indicated by the green color; exons are numbered; note the conservation of mammalian LIPI exons.

Table 3. Known or predicted N-glycosylation sites for mammalian LIPI subunits. Amino acid residues are shown for known or predicted N-glycosylation sites: N-Asn; A-Ala; T-Thr; S-Ser; M-Met; L-Leu; D-Asp; G-Gly; F-Phe; I-Ile; V-Val; sites with high probabilities for N-glycosylation are highlighted in bold.