Increased Electrophoretic Mobility of Long-Type GATA-6 Transcription Factor upon Substitution of Its PEST Sequence

The GATA-6 transcription factor gene produces two translational isoforms from a single mRNA through ribosomal leaky scanning. L-type GATA-6 has an extension of 146 amino acid residues at its amino terminus. This extension contains a unique PEST sequence (Glu31-Cys46), which is composed of an amino-terminal Pro-rich segment and a carboxyl-terminal Ser-cluster. Substitution of either half of the PEST sequence with Ala residues by cassette mutagenesis reduced the apparent molecular size of L-type GATA-6 on SDS-polyacrylamide gel-electrophoresis. However, the effect of substituting the Pro-rich segment was much greater; the increase in mobility on the gel was 13% for the Pro-rich segment substitution and 8% for the Ser-cluster substitution. Substitution of individual amino acid residues demonstrated that the effect of Pro substitution is greater than that of Ser and Thr substitution. This increased mobility of L-type GATA-6 in the presence of a detergent appears to correlate with a decrease in transcriptional activity in vivo, as determined by a luciferase reporter gene assay. The activity of ΔAla (with Ala residues instead of the PEST sequence) was reduced to one fifth of that of ΔA (with the PEST sequence). These results suggest that the PEST sequence of L-type GATA-6 does not function as a constitutive protein degradation signal, but rather plays structural and functional roles in the activation of gene expression on the GATA-responsive promoter.


Introduction
The GATA DNA-binding proteins recognize a canonical DNA motif, (A/T)GATA(G/A), and regulate the expression of various genes required for developmental processes and tissue-specific functions [1] [2]. In mammals, six GATA proteins have been found that contain a characteristic zinc finger region having two tandem zinc fingers separated by 29 amino acid residues, (CVNC-X17-CNAC)-X29-(CXNC-X17-CNAC) [1] [3] [4]. This zinc finger region is commonly located on the amino-terminal side of the last half of the primary structure of GATA proteins [1]-[4]. The carboxyl-terminal zinc finger of GATA proteins is required for specific DNA-binding, whereas the amino-terminal zinc finger may interact with protein cofactors [2]. The basic domain adjacent to the carboxyl-terminal zinc finger is suggested to be critical for both the DNA-binding and nuclear localization of GATA proteins [5] [6]. The transcriptional activation domains are located within the amino terminus of the protein [6].
GATA-6 is distinct from the five other GATA members in that it has an extension of 146 amino acid residues at the amino terminus [3]. Furthermore, two translational isoforms, L(long)-type and S(short)-type, are produced from a single mRNA through ribosomal leaky scanning upon transfection of an expression plasmid for GATA-6 into cultured cells [7]. These two translational isoforms are also produced through in vitro translation of GATA-6 mRNA [8], and are detected in colon cancer cells [9]. Interestingly, the transcriptional activity of L-type GATA-6 is higher than that of the S-type [7] [9]. However, the physiological role of L-type GATA-6 remains to be revealed.
L-type GATA-6 is expressed in humans and rodents, and the PEST sequence [10], which is composed of 16 amino acid residues (Glu31-Cys46) between the two arginine residues in the L-type specific sequence, is conserved [3]. Deletion of this PEST sequence reduced the apparent molecular size of the protein on SDS-polyacrylamide gel-electrophoresis to an unusual extent [7] [9]. However, this reduction of the molecular size is not ascribed to proteolytic degradation [9]. Furthermore, L-type GATA-6 containing the PEST sequence exhibits greater activation potential than that without this sequence [9].
In this study, we introduced mutations into this PEST sequence by means of cassette mutagenesis to determine which amino acid residues in this sequence contribute to the abnormal mobility on the gel, and we discuss the structure and function of the PEST sequence in L-type GATA-6.

Chemicals
Restriction enzymes were purchased from New England BioLabs and Toyobo. T4 DNA ligase and Agarose-LE Classic Type were supplied by TaKaRa. Klenow enzyme, T4 polynucleotide kinase and calf intestine phosphatase were obtained from New England BioLabs. AmpliTaq was from Roche. Oligonucleotides were purchased from Gene Design Inc. Phenylmethanesulfonylfluoride was provided by Sigma. MG115 and E-64d were from the Peptide Institute. All other chemicals used were of the highest grade commercially available.

Construction of Expression Plasmids for Human L-Type GATA-6 with Mutations in the PEST Sequence
pTA4-4ΔEXCTGm was constructed previously by cassette mutagenesis of pTA4-4ΔEX to introduce CTC or TTG codons instead of the CTG codons between the XmaI and Bpu1102I sites [7]. pTA4-4ΔEXCTGm was digested with XmaI, and an oligonucleotide cassette (Sense A/Antisense A) (Table 1) was inserted to regenerate the PEST sequence. The small XhoI-AccI fragment of the resulting plasmid, pTA4-4CTGm, was ligated between the corresponding sites of pME-hGT1LΔEX to produce pME-hGT1LCTGm (Figure 1(a)). The small BglII-BclI fragment of pTA4-4CTGm was replaced with an oligonucleotide cassette [(Sense Ala1/Antisense Ala1) or (Sense Ala2/Antisense Ala2)] to produce pTA4-4CTGmAla1 or Ala2 (Figure 1(a)). The XhoI-AccI fragments of both plasmids were inserted into the corresponding sites of pME-hGT1LΔEX. The resulting plasmids were named pME-hGT1LCTGmAla1 and pME-hGT1LCTGmAla2, respectively. pTA4-4ΔEX and pME-hGT1LΔEX were constructed by Takeda et al. [7] and are derivatives of pUC18 and pME18S, respectively.
To introduce further mutations into the first half of the PEST sequence (Figure 1(b)), the PvuI site between the SphI and NaeI sites of the vector sequence in pTA4-4CTGmAla2 was deleted by digestion with restriction enzymes SphI and NaeI, followed by Klenow enzyme treatment and self-ligation. The resulting plasmid, pTA4-4dPCTGmAla2, was digested with BglII and PvuI. The PvuI-BglII fragment and the PvuI-PvuI fragment were ligated together with one of the following cassettes: STA, PA, PG, TPAA and SPAA (Figure 2 and Table 1). The XhoI-AccI fragments of the constructs (pTA4-4dPCTGmAla2STA, PA, PG, TPAA and SPAA) were inserted into the corresponding sites of pME-hGT1L. The resulting plasmids were named pME-hGT1LCTGmAla2STA, PA, PG, TPAA and SPAA, respectively (Figure 1(b)). The DNA sequence after each manipulation step was confirmed by the dideoxy chain-termination method [11] using the sequence primers listed in our previous study [9]. The molecular biological techniques were performed by published methods [12]. Restriction enzyme digestion and ligation were carried out according to the manuals supplied by the manufacturers.

Construction of a Control Expression Plasmid
GATA-6 mRNA has an additional initiation codon located in frame upstream of the canonical one [7], the latter being located at a similar position in the other GATA proteins [4]. The nucleotide sequence around the upstream initiation codon does not match the Kozak sequence [17], resulting in the translation of S-type GATA-6 from the downstream initiation codon in addition to L-type GATA-6 from the upstream one [7]. It was further demonstrated that the two species are not produced through proteolysis [9], and that the CTG codons located in the intra-coding region of L-type GATA-6 mRNA do not function as translational initiation sites [7]. In this study, we used as a control an expression plasmid for the L-type (pME-hGT1LCTGm) (Figure 2) in which the CTG codons were substituted with CTC or TTG codons [7] to exclude any possibility that internal CTG codons could function as translational initiation sites [17].
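The rationale for the CTGm control can be illustrated with a minimal sketch. CTG, CTC and TTG all encode Leu in the standard genetic code, so replacing in-frame CTG codons leaves the protein sequence unchanged while removing potential non-AUG start sites. The DNA fragment below is hypothetical and is not taken from the GATA-6 sequence, and the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(dna):
    """Split an in-frame coding sequence into codons (a trailing partial codon is dropped)."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def remove_ctg(dna, replacement="CTC"):
    """Replace every in-frame CTG codon with a synonymous Leu codon."""
    return "".join(replacement if c == "CTG" else c for c in codons(dna))


# Hypothetical fragment (not the GATA-6 sequence): Met-Leu-Pro-Leu-Ser
fragment = "ATGCTGCCGCTGTCC"
mutated = remove_ctg(fragment)
print(mutated)
print(translate(fragment) == translate(mutated))
```

This sketch handles only CTG triplets that lie in the reading frame; a CTG that spans two codons would need a case-by-case choice of synonymous codons.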

Effect of Substitution of Either the Pro-Rich Segment or Ser-Cluster on Mobility
The PEST sequence (Glu31-Cys46) located in the L-type specific sequence can be divided into two sub-sequences: the amino-terminal Pro-rich segment and the carboxyl-terminal Ser-cluster (Figure 2 and Figure 3). We first examined the effect of introducing Ala residues into the whole of either sub-sequence (Ala1 and Ala2, respectively). Transient expression of these mutant L-type GATA-6 proteins in Cos-1 cells demonstrated that the mobility of both proteins (Ala1 and Ala2) on the gel was increased compared with that of the control (CTGm) (Figure 3(a), lanes 1-4). However, the effect of substitution of the Pro-rich segment (Ala1) was much greater; the increase in the mobility of Ala1 was 13%, while that of Ala2 was 8%.
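The comparison above amounts to a simple ratio of band positions. The sketch below assumes that mobility is taken as the migration distance of each band on the gel; the distances are hypothetical placeholders chosen only so that the arithmetic reproduces the reported 13% and 8% increases, and they are not measurements from Figure 3.

```python
def relative_mobility(distance_mm, reference_mm):
    """Mobility of a band as a percentage of a reference band."""
    if reference_mm <= 0:
        raise ValueError("reference distance must be positive")
    return 100.0 * distance_mm / reference_mm


def percent_increase(mutant_mm, control_mm):
    """Increase in mobility of a mutant relative to the control, in percent."""
    return relative_mobility(mutant_mm, control_mm) - 100.0


# Hypothetical migration distances in mm (placeholders, not measured values).
s_type = 50.0
bands = {"CTGm": 30.0, "Ala1": 33.9, "Ala2": 32.4}

for name, distance in bands.items():
    vs_s_type = relative_mobility(distance, s_type)
    vs_control = percent_increase(distance, bands["CTGm"])
    print(f"{name}: {vs_s_type:.1f}% of S-type, {vs_control:+.1f}% vs CTGm")
```

Normalizing to the S-type band within the same lane, as in Figure 3, reduces the influence of lane-to-lane variation before the mutants are compared with the control.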

Effects of Amino Acid Substitutions in the Pro-Rich Segment on Mobility
We further studied the effects of substitutions in the amino-terminal Pro-rich segment: S/T→A (Ser33, Thr34 and Ser37 were substituted with Ala residues; also denoted STA), P→A (Pro32, Pro35, Pro36 and Pro38 were all substituted with Ala residues; also denoted PA), P→G (Pro32, Pro35, Pro36 and Pro38 were all substituted with Gly residues; also denoted PG), TP→AA (Thr34 and Pro35 were substituted with Ala residues; also denoted TPAA) and SP→AA (Ser37 and Pro38 were substituted with Ala residues; also denoted SPAA), in addition to the introduction of Ala residues into the Ser-cluster (Figure 2). The results showed that all the substitutions increased the mobility of the resulting recombinant proteins (Figure 3(a), lanes 5-9). The TP and SP substitutions in the Pro-rich sequence had moderate effects similar to that of S/T→A. However, the substitution of all Pro residues (P→A and P→G) significantly increased the mobility (lanes 6 and 7), to an extent similar to that seen with Ala1.
When we calculated the PEST score [10] (determined with the genetic information processing software GENETYX MAC, Tokyo, Japan) of the sequence "KPGDLPSTPPSPISSSSSSCDH" in the control CTGm sequence (Figure 2), a significantly high score, 14.78, was obtained. However, the scores for all mutagenized sequences in Figure 2 were negative, and these sequences were not classified as PEST sequences. Thus, it seems likely that the control (CTGm) with the PEST sequence showed slow mobility and that the mobility was increased without the PEST sequence.
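The substitution scheme can be summarized programmatically. The sketch below applies each named substitution to the Pro-rich segment, taken here as Pro32-Pro38 (PSTPPSP) according to the residue numbers given above. The helper names are illustrative, and the Ala substitution of the Ser-cluster that forms the background of these mutants is omitted.

```python
# Pro-rich segment of the PEST sequence, residues 32-38 as numbered in the text.
SEGMENT_START = 32
PRO_RICH = "PSTPPSP"

# Residue positions replaced in each mutant, and the replacing residue.
SUBSTITUTIONS = {
    "STA": ((33, 34, 37), "A"),
    "PA": ((32, 35, 36, 38), "A"),
    "PG": ((32, 35, 36, 38), "G"),
    "TPAA": ((34, 35), "A"),
    "SPAA": ((37, 38), "A"),
}


def apply_substitution(segment, positions, new_residue, start=SEGMENT_START):
    """Return the segment with the given residue numbers replaced."""
    residues = list(segment)
    for pos in positions:
        residues[pos - start] = new_residue
    return "".join(residues)


mutants = {
    name: apply_substitution(PRO_RICH, positions, residue)
    for name, (positions, residue) in SUBSTITUTIONS.items()
}
for name, seq in mutants.items():
    print(f"{name:5s} {seq}")
```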

Effect of Substitution of the PEST Sequence on the Transcription Activity of L-Type GATA-6
Next, we determined the biological activity of L-type GATA-6 with or without the PEST sequence. For this purpose, the transcriptional activity of a reporter gene with the GATA-responsive promoter [9] was measured in the presence of the expression plasmid for GATA-6. To express only L-type GATA-6, a Kozak sequence was introduced ahead of the upstream initiation codon. As shown in Figure 4, the activity of the positive control with the PEST sequence (LΔA) was significantly higher than that of LΔAla, in which all the residues in the Pro-rich segment and the Ser-cluster were substituted with Ala residues; the activity of LΔA was as much as five times that of LΔAla. The activity of LΔAla was similar to those of LΔA(-2), LΔE and the S-type, which lacks the amino-terminal 146 residues of L-type GATA-6. These results suggest that the PEST sequence of L-type GATA-6 may play a functional role(s) in the transcriptional activation process other than as a constitutive degradation signal [10]. Furthermore, inhibitory sequence(s) for transcription may be present in the Arg13-Arg30 and Arg48-Pro102 regions within the L-type specific amino-terminal extension of 146 residues, since the L-type showed rather lower activity than LΔA.

Discussion
Since the PEST sequence was reported to be a signal for rapidly degrading proteins [10], it has been widely found in eukaryotic proteins [18]. Although PEST sequences contribute to protein disorder and are recognized to be targets for multiple protein degradation systems, they are also involved in phosphorylation and protein-protein interactions [18]. Indeed, the acidic residues in the IκBα PEST sequence competitively interact with the basic DNA-binding residues of NF-κB [19]. Protein-tyrosine phosphatase PTP-PEST interacts via its proline-rich sequence with Csk (a cytosolic tyrosine protein kinase) [20], Grb2 [21], and filamin [22]. Unstructured PEST motifs, which are enriched in tissue-specific proteins, enable these proteins to exhibit conformational variability, and hence they can interact with many partners at the same time or after different time intervals, playing vital roles in tissue-specific cell signaling and transcriptional regulation [23].
It is suggested that the poly-L-proline type II helices found in many folded and unfolded proteins are favorable for protein-protein and protein-nucleic acid interactions [24]. The PXXP motif of p53 binds directly to the transcriptional coactivator p300 [25]. This motif is present in the Pro-rich sequence of L-type GATA-6 (PSTP starting at Pro32 and PPSP starting at Pro35). Phosphorylation of the C-terminal Ser-rich sequence motif of the Na+/H+ exchanger (NHE3) by Akt and GSK-3 stimulates NHE3 activity through Ezrin binding to this motif [26]. The consensus sequence for the GSK-3 substrate, (S/T)XXX(S/T), is also present in the Pro-rich segment and the Ser-cluster of L-type GATA-6 (STPPS at Ser33-Ser37 and SSSSSS at Ser40-Ser45, respectively). Phosphorylation of multiple (S/T)P motifs in p53 is recognized by the prolyl isomerase Pin1, resulting in the genotoxic response of p53 [27]. The same motifs are repeated in tandem in the Pro-rich segment of GATA-6 (Thr34-Pro35 and Ser37-Pro38). It remains to be determined how these motifs of L-type GATA-6 function in transcriptional regulation in vivo.
Proline-rich proteins show aberrantly high apparent molecular weights on SDS-polyacrylamide gel-electrophoresis and size-exclusion chromatography [28] [29], which is ascribed to their rod-like conformation in the detergent [29]. Consistently, increased SDS-binding and helicity accompany retardation of protein mobility on SDS-polyacrylamide gel-electrophoresis [30]. Similarly, L-type GATA-6 with the PEST sequence moves slowly on the gel, and deletion or substitution of this sequence increases its mobility. The substitutions made in our study do not alter the charge of the amino acid side chains, suggesting that the change in mobility cannot be ascribed to alteration of the net negative charge [31]. Since this physicochemical property of GATA-6 with the PEST sequence correlates with its biological activity [7] [9], it would be of interest to examine which proteins associate directly or indirectly with this region in cells in which GATA-6 is expressed, and whether or not this sequence could be inserted into other proteins to confer a novel protein function.
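The motif positions mentioned above can be checked with a short pattern scan. The sketch below searches the control (CTGm) sequence quoted in the Results for the three motifs, using lookahead so that overlapping matches are reported. The numbering offset is an assumption derived from the text (the first Pro of PSTP is Pro32), and a pattern match by itself says nothing about whether a site is actually phosphorylated or bound in vivo.

```python
import re

# Sequence quoted in the text for the control (CTGm) construct.
SEQ = "KPGDLPSTPPSPISSSSSSCDH"
# Numbering assumption: the first Pro of "PSTP" is Pro32.
OFFSET = 32 - SEQ.index("PSTP")

MOTIFS = {
    "PXXP": r"P..P",
    "(S/T)P": r"[ST]P",
    "(S/T)XXX(S/T)": r"[ST]...[ST]",
}


def find_motif(pattern, seq=SEQ, offset=OFFSET):
    """Return the residue numbers at which the motif starts (overlaps allowed)."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [m.start() + offset for m in lookahead.finditer(seq)]


for name, pattern in MOTIFS.items():
    print(name, find_motif(pattern))
```

Because the scan is purely pattern-based, it also reports (S/T)XXX(S/T) matches starting at Ser37 and Ser41 in addition to those starting at Ser33 and Ser40.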

Conclusion
There is a unique PEST sequence (Glu31-Cys46) in the extension of 146 amino acid residues of L-type GATA-6. Substitution of the amino-terminal Pro-rich segment or the carboxyl-terminal Ser-cluster of the PEST sequence with Ala residues reduced the apparent molecular size of L-type GATA-6 on SDS-polyacrylamide gel-electrophoresis. However, the effect was much greater upon substitution of the Pro-rich segment, and especially upon substitution of the Pro residues. The PEST sequence had the potential to increase transcriptional activity. Thus, the PEST sequence of L-type GATA-6 does not function as a constitutive protein degradation signal, but rather plays structural and functional roles in the activation of transcription mediated by GATA-responsive promoters.

Figure 2. Nucleotide sequences encoding the PEST sequence in the expression plasmids. The nucleotide sequences encoding the PEST sequences in the expression plasmids (see Figure 1) are shown together with those in their vicinities. The nucleotide sequences in bold letters are the restriction enzyme sites introduced. The amino acid sequence of the PEST sequence is double underlined. The bold letters in the PEST sequence show the substituted amino acids. The underlined amino acid residues in bold letters with residue numbers (the upper) are derived from the native sequence of L-type GATA-6 [7].

Figure 3. Mobility of the L-type GATA-6 with an altered PEST sequence on an SDS-polyacrylamide gel. (a) Cos-1 cells were transfected with pME-hGT1LCTGmAla1, Ala2, Ala2 (STA, PA, PG, TPAA, or SPAA), or pME18S (mock transfection). After two days, nuclear extracts were prepared. An aliquot (10 μg) of each extract was subjected to SDS-polyacrylamide gel-electrophoresis (10%), followed by Western blotting with hGATA-6N antibodies. The open arrow and asterisk indicate the positions of S-type and L-type GATA-6, respectively. The S-type GATA-6 was detected due to leaky scanning by ribosomes of the Met codon for the L-type [7]. The mobility of each L-type GATA-6 is shown as a % value compared with that of S-type GATA-6. The values in parentheses are values relative to that of the control CTGm. (b) The amino acid sequences of the PEST sequences of the wild type (CTGm) and mutants are indicated.

Table 1. Oligonucleotides used for cassette mutagenesis and sequencing. The introduced restriction enzyme sites are underlined with bold letters.