High yield expression of proteins in E. coli for NMR studies

In recent years, high yield expression of proteins in E. coli has witnessed rapid progress with developments of new methodologies and technologies. An important advancement has been the development of novel recombinant cloning approaches and protocols to express heterologous proteins for Nuclear Magnetic Resonance (NMR) studies and for isotopic enrichment. Isotope labeling in NMR is necessary for rapid acquisition of high dimensional spectra for structural studies. In addition, higher yield of proteins using various solubility and affinity tags has made protein overexpression cost-effective. Taken together, these methods have opened new avenues for structural studies of proteins and their interactions. This article deals with the different techniques that are employed for over-expression of proteins in E. coli and different methods used for isotope labeling of proteins vis-à-vis NMR spectroscopy.


INTRODUCTION
The two most popular methods today for structural studies of biomolecular systems are X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy. The former requires the preparation of single crystals of the protein, whereas the latter involves the preparation of protein samples either in solution (for liquid-state NMR) or in micro-crystalline form (for solid-state NMR). In NMR studies of proteins, homonuclear one- and two-dimensional (1D/2D) techniques are sufficient if the molecular mass of the system is less than 10 kDa (fewer than about 90-100 amino acid residues) [1]. For larger proteins, however, advanced NMR techniques combined with isotopic enrichment or isotope labeling of the molecule (with 13C/15N/2H) are required [2]. This is because the sensitivity of NMR experiments decreases as the size of the molecule increases, and isotopic enrichment helps to alleviate, to some extent, the deleterious effects of large size.
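The correspondence between the 10 kDa limit and roughly 90-100 residues can be checked with a short calculation. The sketch below is illustrative only: the average residue mass of 110 Da is a common rule of thumb assumed here, not a value from this article, and the function names are hypothetical.

```python
# Rule-of-thumb assumption (not from the article): ~110 Da per amino acid residue.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Estimate the molecular mass of a protein (kDa) from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def likely_needs_isotope_labeling(n_residues: int, threshold_kda: float = 10.0) -> bool:
    """True if the estimated mass exceeds the ~10 kDa limit quoted in the text."""
    return approx_mass_kda(n_residues) > threshold_kda


if __name__ == "__main__":
    for n in (80, 90, 150):
        print(n, round(approx_mass_kda(n), 1), likely_needs_isotope_labeling(n))
```

The threshold is only a guide; spectral overlap and relaxation behaviour of the actual protein decide which experiments are feasible.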
Isotope labeling implies the replacement of 12C, 14N or 1H atoms of the backbone or side-chains of proteins by 13C, 15N or 2H, respectively, either uniformly throughout the protein (i.e., independent of the amino acid type) or in a selective manner (i.e., amino acid type dependent) [2]. This can be accomplished in one of three ways: 1) over-expression of proteins in E. coli, or in higher organisms if required; 2) use of cell-free protein expression; and 3) chemical synthesis. Of the three, over-expression of proteins in E. coli is by far the most popular and cost-effective method today. The gene for the protein to be expressed is first cloned using recombinant DNA methods. The protein is either expressed as such or fused with a suitable tag to increase the yield/solubility and/or to facilitate purification. Since the 1960s, when the first isotope enrichment of proteins was carried out, isotope labeling has come a long way through various advancements. Today it is possible to express proteins labeled with 13C and/or 15N at a fraction of the cost and/or with yields an order of magnitude higher than a decade ago.
In this article, we focus on the different techniques used today for over-expression and isotope labeling of proteins in E. coli for NMR studies. The various methods proposed recently for high-yield expression are discussed with suitable examples. The article is divided into three sections. In the first section, the different general protocols for protein expression are described. In the second section, different methods for isotope labeling for NMR studies are discussed. In the last section, special protocols for expressing difficult proteins in E. coli for NMR are covered.

RECENT METHODS FOR HIGH YIELD PROTEIN EXPRESSION IN E. COLI
High yield requires overproduction by the cells. A major impediment in this regard is that many proteins, especially those of eukaryotic origin, are poorly expressed and are sometimes produced in an insoluble form that is prone to degradation by cellular proteases. In recent years, high-level expression of proteins has been driven by various developments in understanding and manipulating the biological processes of E. coli. The factors paramount to obtaining high protein yields are the gene of interest, expression vector, gene dosage, transcriptional regulation, codon usage, translational regulation, host design, growth media and the culture or fermentation conditions available for manipulating expression, the specific or biological activity of the protein of interest, protein targeting, fusion proteins, molecular chaperones and protein degradation [3-6]. Recent methods to obtain high protein expression include high cell density shaking cultures, high cell density fermentation, fed-batch cultivation, cold shock induction, co-expression with chaperones or fusion tags, and cell-free protein synthesis, among others. Figure 1 depicts some of the methods used to achieve high protein yields. Each of these methods is described below in detail.

High Cell Density Shaking Cultures
High cell density shaking cultures based on IPTG induction, reported by Qianqian Li and others, use a regular incubator shaker to achieve a cell density (measured as optical density at 600 nm; OD600) of 10-20 in a normal laboratory setting instead of a fermenter [7]. Several parameters such as host strain selection, plasmid copy number, promoter selection, mRNA stability and codon usage were optimized to attain a high cell density [3,7,8]. In this IPTG induction based method, rich medium is used for the starter culture, which is grown at an optimized temperature for an optimized time period; when the cells are in the middle of the growth phase, the cell pellet is transferred into the same volume of minimal medium. After this medium exchange, the cells are cultured at a pre-determined, optimized temperature for target protein production [7]. At this point, the bacterial culture is induced with IPTG at the same temperature for an optimized time period. The bacterial expression parameters to be optimized in this method to attain a very high cell density in laboratory shaking cultures are: 1) double selection of colonies highly expressing the target protein; 2) optimization of the temperature and time period of the starter culture to avoid plasmid instability or loss; 3) optimization of the induction temperature and time course; and 4) glucose optimization. These expression parameters must be optimized for every new target protein [7].
Auto-induction in high cell density shaking cultures is an efficient, cost-effective method for high-level protein production under previously optimized physiological conditions and in an optimized minimal medium. Glycerol serves as the carbon source, since glucose prevents auto-induction, whereas lactose is necessary for it [4,9]. The auto-induction method can be an alternative to tedious fermenter cultures, as the latter require additional equipment to monitor various parameters, which is economically prohibitive. Additionally, the smaller media volumes required make this production method ideal for selective side-chain or amino acid labeling procedures [10,11]. Auto-induction medium contains NH4+ as the nitrogen source, together with glycerol, lactose and glucose at optimized levels so that glycerol is used as the carbon source. Once glucose, which is otherwise the most favorable carbon source, is depleted, lactose is metabolized and triggers auto-induction. Incorporation of an individual isotope (13C or 15N) is slightly more efficient than incorporation of 15N and 13C together. To overcome the decreased rate of incorporation of the appropriate isotopes, a longer period for clearance of unlabeled nutrients is allotted before induction. This also favors the metabolic rates of E. coli in deuterated media [11] and high yield expression. Generally, a lower temperature for a prolonged duration is preferred to obtain soluble target protein at high yield without any additional requirement for amino acid and vitamin supplements [9].

Fed Batch Cultivation Method
This refers to a fed-batch liquid-phase cultivation technology that uses enzymatic release of glucose for protein production in E. coli in shaken cultures in round-bottom Erlenmeyer flasks [12]. The medium, a mixture of mineral salts and complex additives, is optimized to provide high cell density and high protein yield. High cell density and favorable physiological conditions for the target protein are achieved in shaken cultures with EnBase® Flo (in which the glucose-releasing polymer is in soluble form rather than a gel) throughout the protein expression period. This period is a day longer than with other expression methods, but the culture reaches an OD600 of 30 to 50 [12] and yields a high amount of soluble protein in comparison with commonly used media. The method can be carried out in small volumes in 24-deep-well plates as well as in larger volumes in large shake flasks [12].
E. coli cultures induced progressively with IPTG under fed-batch conditions result in high-level target protein production. Once the glucose from the batch phase has been depleted, an exponential substrate feed is used to provide constant growth, followed by a continuous, linearly increasing inducer feed, in a bioreactor that maintains physiological conditions such as pH, temperature and aeration [13]. Experiments show that lowering the induction/process temperature below 37 °C effectively increases the solubility/specific activity of the target protein, achieving high cell density fed-batch cultures with increased protein production [4,7,13].
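The exponential substrate feed and the linearly increasing inducer feed can be written down compactly. The sketch below uses the standard textbook expression for an exponential feed that holds the specific growth rate constant (maintenance requirements neglected); it is not taken from reference [13], and all function names, parameter names and numerical values are illustrative assumptions.

```python
import math


def exponential_feed_rate(t_h, mu_set, x0_g_per_l, v0_l, yield_g_per_g, feed_conc_g_per_l):
    """Substrate feed rate (L/h) at t_h hours after the start of feeding.

    F(t) = mu_set * X0 * V0 / (Y_xs * S_feed) * exp(mu_set * t)
    keeps the specific growth rate at mu_set (1/h) when maintenance is neglected.
    """
    initial_rate = (mu_set * x0_g_per_l * v0_l) / (yield_g_per_g * feed_conc_g_per_l)
    return initial_rate * math.exp(mu_set * t_h)


def linear_inducer_feed_rate(t_h, start_rate, slope_per_h):
    """Inducer feed rate that increases linearly with time after induction starts."""
    return start_rate + slope_per_h * t_h


if __name__ == "__main__":
    # Hypothetical numbers: 10 g/L biomass in 1 L, Y = 0.5 g/g, 500 g/L glucose feed.
    for t in (0.0, 2.0, 4.0):
        rate = exponential_feed_rate(t, 0.2, 10.0, 1.0, 0.5, 500.0)
        print(f"t = {t:.0f} h, feed = {rate * 1000:.1f} mL/h")
```

With an exponential feed the rate doubles every ln(2)/mu_set hours, which is why the set growth rate is usually chosen well below the maximum growth rate of the strain.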

High Cell Density Fermentation
Fermentation at laboratory scale normally results in a high yield of protein, mostly by a fed-batch process. High-yield protein production by fermentation is reported to give higher protein yields than traditional shake flask cultures [14]. High cell density fermentation enhances the overall yield of the protein [15]. Parameters such as feeding strategy, aeration, temperature, pH, media composition, expression strain and plasmid stability need to be optimized [15]. High cell density fermentation is the preferred method for large-scale protein production. A drawback is that fermentation is somewhat expensive and is therefore not preferred over shake flask cultures. Achieving high cell density cultures in shake flasks is possible, as highlighted in the section above.

Cold Shock Induced High Yield Protein Production
Cold shock together with IPTG induces protein production in E. coli transformed with pCold vectors (cold shock expression vectors), in which expression is under the control of the cspA promoter, at optimized growth conditions and a cold temperature of 15 °C [16]. A unique feature of this technology, developed by Inouye and others, is the inhibition of non-target protein synthesis at cold temperatures, termed the LACE effect [16,17]. During this period most of the translational machinery is dedicated to producing the selectively isotope labeled target protein, so that no protein purification is required. In this method, transformed cells are grown in a rich-medium starter culture at an optimized growth temperature, then shifted to minimal medium and induced with IPTG at a previously optimized cold temperature for a prolonged period. The crude cell lysate thus obtained is amenable to NMR studies, making this a rapid, high-yield protein production method [16]. Enhanced protein expression is also observed on cold shock induction of E. coli transformed with pCold-PST vectors in a similar fashion, using PrS2 tags, which have affinity for myxospores [18].

Co-Expression with Fusion/Solubility Tag
The application of tags has been highly effective in the structural studies of proteins previously thought unapproachable by solution NMR techniques. These tags are important not only for enhancing solubility and stability, but also for their favorable effect on the folding of their fusion partners [4]. Tags used to improve the yield of recombinant proteins can be roughly divided into two categories: 1) affinity tags for rapid and efficient purification of proteins; and 2) solubility tags to enhance the proper folding and solubility of the recombinant protein [5]. Multiple tags can be combined in different ways for a particular protein to obtain better results on these issues [7].
Recent findings throw light on, among others, "rapid" expression [27], in which T7 RNA polymerase (T7RNAP) is controlled by the arabinose promoter (ParabAD) while the target gene is controlled by the T7 lac promoter (PT7lac). This paves the way for dual induction by both arabinose and IPTG, increasing the yield of T7RNAP, which in turn transcribes the target gene, resulting in rapid and high-level protein expression [27]. Other cost-effective methods for high-yield protein production include heat-cooling extraction coupled with ammonium precipitation, followed by chromatography optimized on the basis of the amino acid composition of the target protein [28].
To choose an effective combination of protein and tag, the advantages and disadvantages of the various tags must be considered with respect to yield, solubility, stability and ease of purification of the fusion partners [53] (Table 1). Additionally, as affinity tags have the potential to interfere with structural or functional studies in some cases, provision should also be made for removing them after purification [54]. Although these tags help in the crystallization of some proteins, for solution NMR studies it is often necessary to remove the tag before recording the spectra if the tag interferes with the NMR signals of the protein [33,55,56]. This is especially true for large tags such as GST and MBP. Endoproteases are widely used for tag removal [57] (Table 2).
An alternative strategy to endoprotease cleavage is the use of exopeptidases for tag removal after protein purification. Several aminopeptidases and carboxypeptidases are available from natural sources such as porcine kidney and bovine pancreas. However, the potential use of these enzymes for tag removal is somewhat limited because of the growing concern about contamination from animal sources, and because of the additional challenge of purifying the target protein after tag cleavage [58].
There is no common affinity or solubility tag that works for all proteins. Rather, the choice of a tag depends largely on factors related to the protein being expressed [67]. The use of tags has been demonstrated to overcome mainly solubility and stability issues, and in some cases also to improve the yield and proper folding of the recombinant protein. The recent development of NMR-invisible tags promises the implementation of more such tags in biomolecular NMR in the near future for the structural study of proteins [54].
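Before choosing an endoprotease for tag removal, it is useful to check whether the target protein itself contains the recognition sequence. The sketch below is a minimal illustration that scans a sequence for the canonical sites listed in Table 2; it does not model the secondary sites mentioned there, and the function name and test sequences are hypothetical.

```python
import re

# Canonical recognition sequences as listed in Table 2, written as regular
# expressions (Factor Xa accepts D or E at the second position).
RECOGNITION_SITES = {
    "Enterokinase": "DDDDK",
    "Factor Xa": "I[DE]GR",
    "Thrombin": "LVPRGS",
    "PreScission": "LEVLFQGP",
    "TEV protease": "ENLYFQG",
    "3C protease": "EVLFQGP",
}


def find_recognition_sites(protein_seq):
    """Return {protease: [1-based start positions]} for every site found."""
    seq = protein_seq.upper()
    hits = {}
    for protease, pattern in RECOGNITION_SITES.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        positions = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
        if positions:
            hits[protease] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing a TEV site.
    print(find_recognition_sites("MKTAENLYFQGSHM"))
```

A protease with no hit inside the target sequence is a safer first choice, although an experimental check remains necessary because cleavage at non-canonical sites is not captured by this search.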

Cell Free Protein Synthesis
Ever since its advent in the mid-1990s [68], cell-free expression has been a rapid and efficient method of protein synthesis [69], and recent advances have rendered it an even more powerful tool for simple, efficient and cost-effective in vitro protein synthesis [69]. It is used mainly for structural studies of proteins for which purification is not feasible, such as membrane proteins, which pose obstacles to work with [70], and viral proteins [69], as well as for the incorporation of non-natural amino acids [70,71], for studying protein interactions and for high-throughput proteomics [69,72]. In addition, the cell-free method provides a means by which target proteins can be decisively screened for NMR studies using multiple screening systems [73].

Table 2.
Protease        Cleavage site    Comments                                                       References
Enterokinase    DDDDK↓           Secondary sites at other basic amino acids (aa)                [58-61]
Factor Xa       I(D/E)GR↓        Secondary sites at GR                                          [59]
Thrombin        LVPR↓GS          Secondary sites; biotin labeled for removal of the protease    [59,62]
PreScission     LEVLFQ↓GP        GST tag for removal of the protease                            [63]
TEV protease    ENLYFQ↓G         His-tag for removal of the protease                            [64,65]
3C protease     EVLFQ↓GP         GST tag for removal of the protease                            [61,66]
The arrow (↓) indicates the position of the endoprotease cleavage site. For an N-terminal tag, the amino-acid residues that follow the arrow remain in the protein after endoprotease cleavage.
A wheat-germ extract based cell-free system that is highly condensed by polyethylene glycol precipitation has been found to increase protein yield [74]. This is coupled with additions to the wheat germ extract that prevent mRNA destruction and the decrease in ATP levels, both of which are important parameters for cell-free expression [74]. The method has been shown to increase the rate of protein synthesis, and thus the final protein concentration, in comparison with uncondensed cell-free extract [74]. Bernhard and others have reported cell-free synthesis of a few integral membrane proteins (IMPs) at high yields by optimizing various parameters such as macronutrients and amino acids, by modifying the S30 cell-free extract through the addition of detergents and lipids to obtain soluble IMPs (avoiding precipitation), and by monitoring production through GFP expression [75]. NMR studies confirmed that the proteins produced by this cell-free method were correctly folded [75].

Major drawbacks of the cell-free system are the change in pH and the accumulation of phosphates, both of which inhibit protein synthesis. Swartz and others have engineered a cell-free system that overcomes this problem by using pyruvate as the energy source in batch reactions that mimic the cytoplasm of the cell [76]. This also brings down the cost of the cell-free system by cutting down the requirement for expensive high-energy phosphate compounds [76].

Co-Expression with Molecular Chaperones
High-level expression of recombinant gene products in E. coli often results in misfolding of the protein of interest, followed by its degradation by cellular proteases or its deposition into biologically inactive aggregates known as inclusion bodies [77]. It has been established that in vivo protein folding is an energy-dependent process mediated by two classes of folding modulators. The first class comprises the molecular chaperones, namely the DnaK-DnaJ-GrpE and GroEL-GroES systems, which suppress off-pathway aggregation reactions and lead to proper folding through an ATP-coordinated cycle of binding and release of intermediates [78]. The second class, the folding catalysts, accelerates rate-limiting steps along the protein folding pathway, such as cis-trans isomerization of peptidyl-prolyl bonds and formation/reshuffling of disulfide bridges [79,80]. The two chaperone systems hold great promise for facilitating the production, purification and proper folding of heterologous proteins [81,82].
Molecular chaperones fall into three sub-classes based on their mechanism of action [83]. "Folding" chaperones (e.g., DnaK and GroEL) act as described above. "Holding" chaperones (e.g., IbpB) maintain partially folded proteins on their surface until folding chaperones become available once the stress slackens. The "disaggregating" chaperone ClpB promotes the solubilization of proteins that have aggregated as a result of stress. The general mechanism of protein folding assisted by molecular chaperones is depicted in Figure 2 [84]. In Figure 2, A represents the mRNA, which is translated into unfolded protein (U). In the absence of any chaperone, U can either fold to its native form (N) or enter the aggregation process that leads to inclusion bodies (Z). In the presence of a chaperone, U folds to its native form (N) solely via a chaperone-bound intermediate (I), without the aggregation path that leads to inclusion bodies. The path from I to N is governed by the hydrolysis of ATP to ADP and Pi. The symbols ktransl, kunfold and kfold represent the rate constants for translation, unfolding and folding, respectively.
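The competition between folding, aggregation and chaperone capture can be illustrated with a small kinetic simulation. The sketch below is a deliberately simplified first-order model, not the scheme of reference [84]: translation and unfolding are ignored, chaperone binding is treated as pseudo-first-order, and all rate constants (k_fold, k_agg, k_bind, k_release) are hypothetical values chosen only for illustration.

```python
def simulate_folding(k_fold, k_agg, k_bind=0.0, k_release=0.0, t_end=50.0, dt=0.001):
    """Integrate a simplified folding scheme with the explicit Euler method.

    U -> N  (spontaneous folding, k_fold)
    U -> Z  (aggregation into inclusion bodies, k_agg)
    U -> I  (capture by a chaperone, k_bind; pseudo-first-order assumption)
    I -> N  (ATP-driven release of native protein, k_release)
    Returns the final fractions of U, I, N and Z (they sum to 1).
    """
    u, i, n, z = 1.0, 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        du = -(k_fold + k_agg + k_bind) * u
        di = k_bind * u - k_release * i
        dn = k_fold * u + k_release * i
        dz = k_agg * u
        u, i, n, z = u + du * dt, i + di * dt, n + dn * dt, z + dz * dt
    return {"U": u, "I": i, "N": n, "Z": z}


if __name__ == "__main__":
    without = simulate_folding(k_fold=0.1, k_agg=0.4)
    with_chaperone = simulate_folding(k_fold=0.1, k_agg=0.4, k_bind=5.0, k_release=1.0)
    print("native fraction without chaperone:", round(without["N"], 3))
    print("native fraction with chaperone:   ", round(with_chaperone["N"], 3))
```

In this model the final native fraction is (k_fold + k_bind) / (k_fold + k_agg + k_bind), so aggregation is suppressed whenever chaperone capture is fast compared with aggregation.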
A set of co-expression vectors with different combinations of the DnaK-DnaJ-GrpE, GroEL-GroES and trigger factor (tig) systems is given in Table 3. Synthesis of DnaK-DnaJ-GrpE and GroEL-GroES is under the positive control of a minor σ factor (σ32) encoded by the rpoH gene [77]. These folding chaperone plasmids carry an origin of replication derived from pACYC and a chloramphenicol resistance gene (Cmr). This allows their use with E. coli expression systems that utilize ColE1-type plasmids containing the ampicillin resistance gene as a marker. Expression of target proteins and chaperones can be induced individually, as the chaperone plasmids contain either araC or tetR for each promoter [88]. It should be noted that this chaperone system cannot be used in combination with chloramphenicol-resistant E. coli host strains or with expression plasmids that carry the chloramphenicol resistance gene. An effective method for constructing a system for co-expression of target protein and chaperone involves transformation of E. coli with the chaperone plasmid followed by transformation with the expression plasmid for the target protein, which results in a higher transformation efficiency than any other order. In this regard, it is worth mentioning that E. coli Heat Shock Protein 40 (HSP40) and E. coli Heat Shock Protein 70 (HSP70) are also known as DnaJ and DnaK, respectively, and are widely used either separately or in combination with the GroEL-GroES chaperones [83,89-91] (Table 4).
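The compatibility rules stated above (different replication origins, different selection markers, and a host that is not already resistant to the chaperone plasmid marker) can be expressed as a simple check. The sketch below is illustrative; the class, the function and the plasmid names are hypothetical and do not correspond to entries in Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    origin: str       # replication origin family, e.g. "pACYC" or "ColE1"
    resistance: str   # selection marker, e.g. "Cm" or "Amp"


def compatibility_problems(chaperone, expression, host_resistances=frozenset()):
    """Return a list of reasons why the two plasmids cannot be combined in this host."""
    problems = []
    if chaperone.origin == expression.origin:
        problems.append("both plasmids share the same replication origin")
    if chaperone.resistance == expression.resistance:
        problems.append("both plasmids carry the same resistance marker")
    if chaperone.resistance in host_resistances:
        problems.append("host strain is already resistant to the chaperone plasmid marker")
    return problems


if __name__ == "__main__":
    chaperone = Plasmid("chaperone_plasmid", origin="pACYC", resistance="Cm")
    target = Plasmid("expression_plasmid", origin="ColE1", resistance="Amp")
    print(compatibility_problems(chaperone, target))                  # no problems
    print(compatibility_problems(chaperone, target, frozenset({"Cm"})))
```

An empty list means that the pair can be maintained together by double selection, as described for pACYC-derived chaperone plasmids and ColE1-type expression plasmids.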
Molecular chaperones are constitutively expressed and play an important role in the synthesis of properly folded proteins. In addition to protein folding, molecular chaperones control a wide range of cell functions such as transcription, protein assembly and membrane translocation [92-94]. Bacterial chaperonins have a cylindrical structure composed of two stacked rings of cpn60 subunits with 7-fold rotational symmetry, and the co-chaperonins have a dome-shaped structure composed of a ring of cpn10 subunits with 7-fold rotational symmetry [95]. These chaperonins assist protein folding in an ATP-dependent manner and are expected to improve refolding yields from inclusion bodies [96,97].
The demand for pure, soluble and properly folded functional protein is very high in modern technology as well as for drug delivery purposes. E. coli is a frequently used host for the expression and purification of recombinant proteins [98-101]. The problem of non-functional and insoluble protein expressed in E. coli can be overcome by using the chaperone systems. This is a promising way to recover the protein in its proper functional form with higher yield, and it is also a cheap and easy approach [102]. The typical chaperone target is a short unstructured stretch of hydrophobic amino acids flanked on either side by basic residues (and lacking acidic residues). In addition to proper de novo folding, GroEL and DnaK refold host proteins that become unfolded under stress conditions in the cell [103]. Holding chaperones assist in this process by stabilizing partially folded proteins rather than actively promoting them to their native states. In addition, the generic chaperones GroEL and DnaK assist in the incorporation of proteins synthesized in the cytoplasm into the inner membrane or in their translocation to the periplasm.

EXPRESSION OF ISOTOPE LABELED PROTEINS FOR NMR STUDIES
NMR studies of protein structure are performed with isotopically labeled samples enriched with 13C, 15N or 2H, obtained by growing E. coli cultures in a medium containing the appropriate isotope [104]. The various isotope labeling schemes for NMR are categorized in Figure 3 [2]. The choice of a particular isotope labeling scheme depends on the kind of information sought from the NMR experiments planned for that sample (Table 5).
The bacterial expression strain BL21, a frequently used E. coli strain that produces high yields of recombinant proteins, responds differently to different gluconeogenic carbon sources and salt contents [104]. The most frequently used medium, M9, does not support optimum growth of E. coli BL21 because it lacks ferrous sulphate. The addition of bivalent iron to M9 medium inoculated with E. coli increases the growth rate and the cell density in the stationary phase (iron is required by the enzymes of the tricarboxylic acid cycle and the aerobic respiratory chain) [105].
Two of the popular isotope labeling schemes (apart from uniform 13C/15N labeling) are selective labeling and selective unlabeling [2]. In the selective labeling approach, the bacteria are grown in M9 minimal medium supplemented with the 13C/15N-labeled form of the amino acid to be selectively labeled. In selective unlabeling, the cells are grown in M9 minimal medium supplemented with the unlabeled form of the amino acid that is to be selectively unlabeled. The biosynthesis of the different amino acids in E. coli is well known and is depicted in Figure 4. Selective amino acid labeling or unlabeling aids sequence-specific resonance assignment by helping to identify resonances that are otherwise buried in the crowded regions of 2D and 3D NMR spectra. However, a disadvantage of this method is "isotope scrambling" [106], which leads to mis-incorporation of 15N (in selective labeling) or 14N (in selective unlabeling) into undesired amino acids. This happens because of the metabolic conversion of one amino acid to another in the biosynthetic pathways of the cell (Figure 4). Isotope scrambling is greatest for Asp, Glu and Gln, as they lie higher up among the intermediates of the metabolic pathway. Isotope scrambling in E. coli can be reduced by lowering the activity of the enzyme(s) that catalyze the inter-conversion of amino acids, using specific (auxotrophic) strains [107] or enzyme inhibitors [108], or by using cell-free synthesis with in vitro systems that lack these enzymes [109].
Over the last few decades, deuterium labeling has played a vital role in solution NMR studies of proteins, in most cases improving the quality of spectra through both a reduction in the number of peaks and a narrowing of line widths [110]. To reduce the complexity of one-dimensional (1D) 1H spectra of proteins, Crespi H.L. and Jardetzky O. initially used the deuteration method in a set of refined experiments [111-113]. Since then, amino-acid selective labeling in a deuterated environment, or selective deuteration in an otherwise protonated molecule, has been regularly used for spectral simplification and residue-type assignment [113]. As the gyromagnetic ratio of deuterium (2H) is significantly lower than that of the proton (1H), replacement of 1H by 2H removes contributions to proton line widths from proton-proton dipolar relaxation and 1H-1H scalar coupling. A gain in sensitivity was initially demonstrated in 1D 1H spectra of the 43 kDa E. coli EF-Tu protein [114] and afterwards in 2D homonuclear spectra of E. coli thioredoxin used for chemical-shift assignment [115]. Significant improvements have also been noted in many heteronuclear NMR experiments. Later, it was also reported that substitution of aliphatic/aromatic protons with deuterons results in an impressive sensitivity gain in NOESY spectra that record NH-NH correlations [116].
The optimal level of deuteration, i.e., uniform or fractional labeling with deuterium, depends on the experiments that are planned for that particular protein. Uniform deuterium labeling has been used to eliminate signals from one component in a macromolecular complex [117]. Fractionally deuterated samples also have several applications, such as improving the sensitivity of 2D 1H-1H homonuclear spectra by reducing dipolar relaxation pathways, spin diffusion and passive scalar couplings. Fractional deuteration also significantly improves the sensitivity of many triple resonance experiments (15N, 13C, 1H), and the side-chain dynamics of proteins prepared in this manner can be studied using a number of new 13C and 2H relaxation experiments [118,119]. Nowadays, the combination of isotopic labeling and multidimensional multinuclear experiments has significantly expanded the range of problems in structural biology amenable to NMR.

EXPRESSION OF HETEROLOGOUS PROTEINS IN E. COLI FOR NMR STUDIES
High-yield expression of a heterologous protein in E. coli requires replacing the codons that are rare in the host, according to the host's codon usage frequency. The codon usage frequency refers to the tRNA levels proportional to the 61 amino acid codons within a functional mRNA molecule [120].
The improvement thus attained does not alter the amino acid sequence of the encoded protein; only the codons are changed, which the degeneracy of the genetic code permits. As reported by J. F. Kane (1995), clusters of the rare codons AGG/AGA, CUA, AUA, CGA and CCC cause translational errors. The greater problems are frameshifts and low-level expression of the heterologous protein; a mistranslation may go unrecognized [120]. Therefore, gene synthesis that takes into account E. coli codon usage frequency, GC content and unfavorable codon pairs eliminates codon bias [121,122], which favors over-expression of proteins [36,123].
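A coding sequence can be screened for the rare codons named above before deciding on gene synthesis or a tRNA-supplemented host. The sketch below is a minimal illustration that counts those six codons and reports runs of consecutive rare codons; the cluster threshold of two codons and all function names are assumptions made for this example.

```python
# Rare E. coli codons named in the text (RNA alphabet).
RARE_CODONS = {"AGG", "AGA", "CUA", "AUA", "CGA", "CCC"}


def split_codons(cds):
    """Split a DNA or RNA coding sequence into codons (RNA alphabet, frame 1)."""
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore an incomplete trailing codon
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_report(cds, cluster_size=2):
    """Report rare codon positions (1-based), their fraction, and clusters."""
    codons = split_codons(cds)
    positions = [idx for idx, codon in enumerate(codons, start=1) if codon in RARE_CODONS]
    clusters, run_start = [], None
    for idx, codon in enumerate(codons, start=1):
        if codon in RARE_CODONS:
            if run_start is None:
                run_start = idx
        else:
            if run_start is not None and idx - run_start >= cluster_size:
                clusters.append((run_start, idx - 1))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(codons) + 1 - run_start >= cluster_size:
        clusters.append((run_start, len(codons)))
    fraction = len(positions) / len(codons) if codons else 0.0
    return {"positions": positions, "fraction": fraction, "clusters": clusters}


if __name__ == "__main__":
    # Hypothetical coding sequence: AUG AGG AGA CUG CCC UAA
    print(rare_codon_report("ATGAGGAGACTGCCCTAA"))
```

Clusters near the start of the gene deserve particular attention, since the text notes that rare codons appearing as clusters or in the amino terminus are associated with low expression.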
Improving the solubility of recombinant proteins in E. coli commonly involves changing some of the expression factors, such as reduced temperature, a different E. coli expression strain, different promoters or induction conditions, reduced translational rates and co-expression of molecular chaperones and folding modulators. Heat shock can enhance protein solubility without affecting induction [124]. All of these have been shown to increase the chances of the protein folding into its native state before it aggregates with folding intermediates, leading to enhanced production of soluble protein.
Many factors are involved in the expression of target proteins, and codon bias plays an important role among them. Other factors, such as the selection of expression vectors and transcriptional promoters, are equally important. Codon bias varies within the same operon, and between genes expressed at high or low levels within the same or different organisms [125]. The expression of target proteins can be increased by over-expressing in the host the genes coding for rare tRNAs, which masks the problem of low expression that may arise from rare codons appearing as clusters or in the amino terminus of the target protein [123].
When a heterologous protein is over-expressed in E. coli, misfolding occurs frequently, resulting in the aggregation of the protein into inclusion bodies. The difficulty of expressing proteins of higher organisms or of eukaryotic origin in E. coli is due to the order of magnitude faster rate of translation and protein folding in the latter compared with the former system [24,124]. Incubating inclusion bodies with 10% sarkosyl effectively solubilizes more than 95% of the protein. Using a specific ratio of Triton X-100 and CHAPS, a high-yield recovery of protein is possible from sarkosyl-solubilized fusion proteins. A combination of these three detergents significantly improves the binding efficiency of GST and GST fusion proteins to glutathione (GSH) Sepharose. It is postulated that the sarkosyl molecules encapsulate proteins and disrupt aggregates, while Triton X-100 and CHAPS, with critical micelle concentrations of 0.25 mM and 6-10 mM, respectively, form mixed micelle or bicelle structures that incorporate sarkosyl molecules from the solution. This in turn facilitates proper protein refolding [126].

CONCLUSION
The development of new methods and technologies to over-express proteins in E. coli has opened up new avenues for structural studies of proteins. It is expected that this trend will continue and that many new vectors/methods will be developed for high yield expression of eukaryotic proteins (Table 6).

ACKNOWLEDGEMENTS
The facilities provided by NMR Research Centre at IISc supported by Department of Science and Technology (DST), India is gratefully ac

Figure 1. Depiction of different methods leading to high yield protein production in Escherichia coli.

Figure 2. Schematic depiction of folding pathway of a protein in presence of chaperone proteins.

Figure 3. Various isotope-labeling schemes in proteins for NMR studies.

Figure 4. Pathways of amino acid biosynthesis in E. coli.

Table 1. Advantages and disadvantages of using different tags with their size.
His-tag, poly-histidine tag; GST, Glutathione S-transferase; MBP, Maltose Binding Protein; NusA, N-utilization substance A; FLAG, FLAG-tag peptide; CBP, Calmodulin Binding Peptide; SET, Solubility Enhancing Tag; SUMO, Small Ubiquitin-like Modifier; STREPII, Streptavidin binding peptide; CBD, Chitin Binding Domain; BAP, Biotin Acceptor Peptide. Among these, MBP is used as both an affinity and a solubility tag; His, GST, FLAG, S-tag, CBP, STREPII, CBD and BAP are the common affinity tags, and all the remaining ones are common solubility tags.

Table 2. Use of some familiar proteases for tag removal.

Table 3. Details of the chaperone expression systems.
a Plasmid having the replication origin of pACYC184; b Genes encoding chaperones are under control of the promoters indicated in the parentheses; c Kmr, kanamycin resistance and Cmr, chloramphenicol resistance; d The level of DnaK-DnaJ-GrpE expression is low for unknown reason [85]. Trigger factor is represented as tig; it is a three-domain chaperone protein that binds to the ribosome with moderate affinity [86,87].

Table 4. Sizes of the different chaperone proteins.

Table 5. Representative examples of E. coli expressed proteins studied by NMR Spectroscopy.

Table 6. Comparison of different methods for high yield protein expression.