Targeted Genome Engineering and Its Application in Trait Improvement of Crop Plants

Targeted genome engineering refers to technologies used for site-specific genome modification, such as knockout, knockin and transcriptional regulation of genes of interest. Site-specific recombination systems, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system (CRISPR/Cas9) are representative targeted genome engineering technologies and have been widely used in basic and applied crop research. In this review, we introduce the basic principles and modes of action of these technologies, summarize recent progress in targeted genome engineering and its applications in crop improvement, and propose perspectives for genome engineering-mediated modification of crop plants in the future.


Introduction
In the past decades, global climate change and improper agricultural practices have caused serious environmental problems such as soil salinization and acidification [1], soil erosion [2], drought [3] and outbreaks of insect pests [4], which severely challenge agricultural production worldwide, while simultaneously the [8]. In contrast to the widespread use of transgenic technology in basic and applied crop research in the laboratory, commercialization of genetically modified (GM) crops is still strictly regulated by governments in countries and regions such as China, Japan and Europe because of strong concerns about the biosafety of GM crops [9]. Therefore, only a very small portion of GM crops have been released so far [10] [11] [12] [13].
Targeted genome engineering refers to technologies used for site-specific genome modification, including gene knockout, knockin and transcriptional regulation [14] [15]. Site-specific recombination systems, ZFNs, TALENs and CRISPR/Cas9 are the representative technologies. Compared with conventional transgenic technology, genome editing has clear advantages: easy design and construction, high precision and efficiency in modifying the genomic loci responsible for traits of interest, the capacity to stack multiple genes of interest simultaneously, and the generation of progeny free of transgenic elements. Targeted genome engineering has therefore attracted extensive attention from plant scientists and breeders and has been rapidly adopted in crop improvement, especially since the emergence of the CRISPR/Cas9 system in 2013 [16] [17]. In this review, we summarize recent progress in targeted genome engineering and its application in genetic modification of crop plants, and propose perspectives for future research on genome editing-based crop improvement.

Site-Specific Recombination System
Site-specific recombination refers to the reaction between two DNA molecules catalyzed by specific enzymes (recombinases) at their cognate sequences, or target sites [18]. It requires three components: a recombinase, such as Cre, that carries out the DNA rearrangement, and two DNA partners carrying the sites recognized by the recombinase, such as LoxP sites. Recombinases can be divided into two families, the tyrosine recombinases and the serine recombinases (Table 1) [18]. The tyrosine recombinases contain a conserved tyrosine active site and catalyze DNA rearrangement via formation and resolution of a Holliday junction intermediate [19], while the serine recombinases contain a conserved serine active site and catalyze site-specific DNA recombination through a concerted, four-strand cleavage and rejoining mechanism [20].
Among these site-specific recombination systems, Cre/LoxP is the most commonly used. It is composed of the Cre recombinase and the 34-bp LoxP sequences that Cre recognizes [21]. When Cre is expressed, recombination occurs in cells harboring LoxP sites in their genomes. In general, Cre/LoxP recombination has three possible outcomes, excision, inversion or translocation, depending on the initial arrangement of the LoxP sites (Figure 1).
Excision or inversion occurs when the two recombination sites are located on the same chromosome in the same orientation or in opposite orientations, respectively. Translocation results from the exchange of DNA segments when the two recombination sites are located on separate chromosomes in the same orientation [21].
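The dependence of the outcome on site arrangement can be illustrated with a minimal Python sketch. It assumes the standard Cre/LoxP behaviour: two sites in the same orientation on one chromosome give excision, two sites in opposite orientations on one chromosome give inversion, and sites on separate chromosomes give translocation. The `LoxPSite` class, its field names and both function names are hypothetical and introduced only for illustration; the LoxP sequences themselves are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoxPSite:
    """Hypothetical description of one LoxP site (illustration only)."""
    chromosome: str
    position: int
    orientation: str  # "+" or "-"


def predict_outcome(a: LoxPSite, b: LoxPSite) -> str:
    """Classify the expected recombination event for a pair of LoxP sites."""
    if a.chromosome != b.chromosome:
        return "translocation"
    if a.orientation == b.orientation:
        return "excision"
    return "inversion"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recombine(seq: str, start: int, end: int, same_orientation: bool) -> str:
    """Apply excision or inversion to seq[start:end], the segment between
    two sites on the same molecule (simplified: site remnants are ignored)."""
    if same_orientation:
        return seq[:start] + seq[end:]  # excision removes the segment
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]
```

For instance, `predict_outcome(LoxPSite("chr1", 100, "+"), LoxPSite("chr1", 500, "+"))` classifies the pair as an excision, which is the arrangement used to remove a selectable marker flanked by two LoxP sites.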
Site-specific recombination systems have two main applications in the genetic modification of crop plants: removal of undesirable transgenic elements, such as selectable marker genes, and site-specific integration of genes of interest.
For example, Cre/LoxP-mediated recombination has been used to eliminate the selectable marker gene HPT from insect-resistant transgenic mustard plants [22]. Elimination of selectable marker genes has also been reported in rice [23], potato [24] and tomato [25], among others. The Cre/LoxP system has also been adopted in the construction of maize and rice minichromosomes, on which genes of interest are expected to be stacked without limitation [26] [27]. The Flp/Frt recombination system has likewise been used to eliminate selectable marker genes in plants such as rice [28] and maize [29].
The representative GM crops that were generated via site-specific recombination recently are summarized in Table 2.
Although targeted genome editing has been realized via different site-specific recombination tools, there are still disadvantages limiting their applications in current crop improvement such as the failure of complete removal of transgenic elements, complicated design for vectors, time-consuming multiple transformation and genetic hybridization, and relatively low targeting efficiency [52].

ZFN Technology
ZF proteins are a common group of DNA-binding proteins in eukaryotic organisms. Each ZF is composed of about 30 amino acids in a conserved β-β-α configuration [53]. Its DNA-binding ability is determined by the specific amino acids on the surface of the α-helix, which vary in specificity from one ZF to another. Based on this sequence-specific binding, a targeted genome editing platform, ZFNs, has been constructed (Figure 2). A ZFN system is composed of two arrays of ZF proteins and a nuclease such as FokI. Each ZF array is linked to one subunit of FokI, and FokI becomes active only after the two ZF arrays bind to the DNA sites of interest and the two FokI subunits dimerize.
Several strategies are used to assemble ZFNs. The first, modular assembly, is based on a library of ZFs with well-characterized DNA-binding specificities. ZFNs can also be assembled with web-based tools, by combining random assembly of multi-finger libraries with specificity screening, or obtained from commercial suppliers.
In general, ZFNs are used for two types of DNA editing (Figure 2). The first is targeted gene knockout, whose purpose is to create a null mutant by disrupting the gene of interest at the DNA level. For example, HIV-1 resistance has been achieved in primary T cells and hematopoietic stem/progenitor cells through ZFN-mediated knockout of CC chemokine receptor 5 [54] [55]. A maize ipk1 mutant line has also been generated by ZFN-mediated gene knockout, leading to a modified phytate biosynthesis pathway in the resulting plants [56]. The second is targeted gene knockin, whose purpose is to create organisms expressing genes of interest by introducing exogenous genes at a specific site in the genome or by repairing mutation sites in endogenous genes. For example, ZFNs have been applied to repair mutation sites that are closely associated with diseases such as haemophilia B [57], sickle-cell disease [58] and Parkinson's disease [59].
In tobacco BY2 cells, a functional GFP gene has been successfully introduced by ZFNs into both a pre-integrated defective reporter construct and an endogenous locus [60]. In addition to targeted gene knockout and knockin, ZFs can be linked to transcription factors to regulate gene expression at the DNA level [61]. Table 3 summarizes representative GM crops recently generated via the ZFN system.
Despite the successes of ZF-associated technologies in previous studies, some disadvantages such as the complexity of assembly, context-dependent binding specificity and relatively low targeting efficiency have not been well addressed and thus limited the application of these technologies in current research [74].
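The requirement that two ZF arrays bind on either side of the cut site before FokI can dimerize can be sketched in Python as a simple site search. This is an illustration under stated assumptions, not a ZFN design tool: the function names are hypothetical, the right-hand array is assumed to bind the opposite strand, and the default spacer range of 5 to 7 bp between the two half-sites is an assumed typical value that does not come from this review.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_zfn_sites(genome: str, left_site: str, right_site: str,
                   spacer_range: tuple[int, int] = (5, 7)) -> list[tuple[int, int]]:
    """Find positions where both ZF arrays could bind in a cleavage-competent layout.

    left_site is read on the forward strand; right_site is read on the
    reverse strand, so its reverse complement is searched on the forward
    strand. Returns (start_of_left_site, spacer_length) pairs.
    """
    genome = genome.upper()
    left_site = left_site.upper()
    right_rc = reverse_complement(right_site)
    hits = []
    start = genome.find(left_site)
    while start != -1:
        left_end = start + len(left_site)
        # FokI subunits are assumed to dimerize only across an allowed spacer.
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            right_start = left_end + spacer
            if genome[right_start:right_start + len(right_rc)] == right_rc:
                hits.append((start, spacer))
        start = genome.find(left_site, start + 1)
    return hits
```

A site is reported only when both half-sites are present with an allowed spacer, mirroring the point that a single bound array is not expected to cut.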

TALEN Technology
TALEs are a group of natural proteins from the plant pathogenic bacterial genus Xanthomonas that bind specific DNA regions in the plant genome through a series of repeat domains of 33-35 amino acids, each of which recognizes a single base pair [75]. The binding specificity of each repeat depends on the repeat-variable di-residue (RVD), the two highly variable amino acids at positions 12 and 13 of the repeat (Table 4) [75].
TALEs can be designed to recognize specific DNA sequences in the genome. By fusing a TALE to a nuclease such as FokI, the TALEN system has been developed to make double-strand breaks (DSBs) at pre-selected genomic sites, generating knockout or knockin edits (Figure 3).
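The one-repeat-per-base logic behind TALE design can be sketched in a few lines of Python. The RVD-to-base mapping used here (NI for A, HD for C, NG for T, NN for G) is the commonly cited simplified code and is an assumption of this sketch; it is not copied from Table 4, and real RVDs can show relaxed specificity (NN, for example, can also bind A). The function names are hypothetical.

```python
# Simplified, commonly cited RVD code (assumption; see the note above).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def design_rvds(target: str) -> list[str]:
    """Return one RVD per base of the target DNA sequence."""
    rvds = []
    for base in target.upper():
        if base not in BASE_TO_RVD:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(BASE_TO_RVD[base])
    return rvds


def decode_rvds(rvds: list[str]) -> str:
    """Return the DNA sequence that an ordered list of RVDs is expected to bind."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)
```

Calling `design_rvds("TACG")` yields one RVD per base in target order, and `decode_rvds` inverts the mapping, which is a quick consistency check on a designed repeat array.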
For example, the effector-binding element for AvrXa7 lies in the promoter of the bacterial blight susceptibility gene Os11N3, and TALEN-mediated disruption of this element has generated rice with resistance to bacterial blight [76]. In another report, rice fragrance was improved by TALEN-mediated targeting of BADH2 to create a defective badh2 allele, which leads to accumulation of 2-acetyl-1-pyrroline (2AP), a major fragrance compound in rice [77]. TALE designer transcription factors have been used to regulate the OCT4 and NANOG loci by targeting their enhancers in mammalian cells, leading to stimulation or inhibition of the reprogramming of somatic cells into induced pluripotent stem cells [78], whereas similar applications have not yet been introduced into plants. Table 5 summarizes representative GM crops recently generated via TALENs.

CRISPR/Cas9 System
CRISPR/Cas is an adaptive immune system that defends bacteria and archaea against invading viral and plasmid DNA [99]. Three types of CRISPR/Cas systems (types I, II and III) have been found across a range of microbes, and each type includes Cas proteins and the corresponding CRISPR arrays. Among the three, the type II system was the first to be engineered for targeted genome editing in eukaryotic organisms (Figure 4). In this system, a DNA sequence adjacent to a 5'-NGG-3' protospacer adjacent motif (PAM) is recognized by a duplex of two non-coding RNAs, a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA), or by a single guide RNA (sgRNA), which is a synthetic fusion of crRNA and tracrRNA. The DNA is then cleaved by the Cas9 protein complexed with the crRNA-tracrRNA duplex or the sgRNA [16] [17]. The Cas9 protein from Streptococcus pyogenes (SpCas9) has two nuclease motifs, HNH and RuvC, which play critical roles in generating DSBs at target sites [101]. The target specificity of CRISPR/Cas9 is determined by a seed sequence, a roughly 12-base sequence immediately upstream of the PAM that must match the crRNA or sgRNA [102].
Figure 4 (caption excerpt). Targeted gene repression can also be achieved with dCas9 alone, guided by specific sgRNAs.
Compared with the earlier ZFNs and TALENs, the CRISPR/Cas9 system has advantages such as simplicity of design and assembly, high targeting efficiency and versatility of application, and is thus expected to be a powerful tool for targeted genome editing (Table 6).
genome-wide level [104]. Thereafter, the system was adopted for gene knockout or knockin in plants such as Arabidopsis, tobacco, sorghum, rice and wheat [105] [106]. By changing two amino acids, aspartate (D) at position 10 and histidine (H) at position 840, to alanine (A), two Cas9 variants have been developed: Cas9n (D10A), which has nickase activity, and dCas9 (D10A and H840A), which lacks catalytic activity. These variants have been further developed into the CRISPR/Cas9n [107] and CRISPR/dCas9 [108] systems.
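The targeting rule for SpCas9, a 5'-NGG-3' PAM immediately downstream of the target and a seed of about 12 bases next to the PAM, can be sketched as a simple scan in Python. This is a minimal illustration under stated assumptions: only the forward strand is searched, the 20-nt protospacer length is a commonly used value that is not given in this review, and the function names are hypothetical.

```python
def find_cas9_targets(seq: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """Scan the forward strand for protospacers followed by an NGG PAM.

    Returns (protospacer_start, protospacer, pam) tuples.
    """
    seq = seq.upper()
    hits = []
    for pam_start in range(guide_len, len(seq) - 2):
        pam = seq[pam_start:pam_start + 3]
        if pam[1:] == "GG":  # N can be any base
            protospacer = seq[pam_start - guide_len:pam_start]
            hits.append((pam_start - guide_len, protospacer, pam))
    return hits


def seed_matches(guide: str, protospacer: str, seed_len: int = 12) -> bool:
    """Check that the PAM-proximal seed of the guide matches the target.

    The guide is written as DNA (T instead of U) for easy comparison.
    """
    return guide.upper()[-seed_len:] == protospacer.upper()[-seed_len:]
```

In this simplified model a mismatch far from the PAM leaves `seed_matches` true, whereas a mismatch within the last 12 bases makes it false, which reflects why the seed region dominates specificity in the description above.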
Cas9-derived base editor systems have also been established by fusing an adenine or a cytidine deaminase to a Cas9n protein [109]. Unlike the classical Cas9 system, base editors catalyze two kinds of nucleotide replacement, adenine (A) to guanine (G) or cytosine (C) to thymine (T), depending on the deaminase linked to Cas9n [109].
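The two substitutions can be expressed as a small Python sketch. The labels "CBE" (cytosine base editor) and "ABE" (adenine base editor) and the function name are assumptions made for illustration, and the sketch edits a single chosen position on one strand rather than modelling guide targeting or an editing window.

```python
# Substrate and product base for each editor type (labels are assumptions).
SUBSTITUTIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def apply_base_edit(seq: str, position: int, editor: str) -> str:
    """Return seq with the base at `position` converted by the given editor.

    Raises ValueError if the base at that position is not the editor's substrate.
    """
    source, product = SUBSTITUTIONS[editor]
    seq = seq.upper()
    if seq[position] != source:
        raise ValueError(
            f"{editor} expects {source} at position {position}, found {seq[position]}"
        )
    return seq[:position] + product + seq[position + 1:]
```

For example, `apply_base_edit("GACT", 2, "CBE")` converts the C at index 2 to T, while the same call with "ABE" raises an error because an adenine deaminase has no substrate at that position.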
Applications of these Cas9-derived systems have been reported in targeted genome editing. For example, a study comparing gene expression regulation by CRISPR/dCas9 and TALE designer transcription factor systems found that the TALE activator system performed better in activating gene expression, while repression by CRISPR/dCas9 was similar to or better than that by the TALE repressor [78]. The successful use of base editors has also been well documented in crop plants such as rice, tomato, potato, wheat, maize, rape and watermelon [110]-[118]. To date, CRISPR/Cas9-mediated genome editing has been widely used for crop improvement, and a number of GM crops with desired traits have been created (Table 7).

Table 7 (excerpt)
Crop      Target gene(s)                        Trait or purpose                   Reference
          FAD2                                  Alteration of lipid compositions   [128]
Potato    ALS1                                  Herbicide resistance               [129]
Potato    GBSS
Tomato    ALS2                                  Chlorsulfuron resistance           [147]
Wheat     GFP                                   Insertion assessment               [148]
Transcriptional activation
Rice      GW7, ER1                              Activation assessment              [149]
Rice      Os03g01240, Os04g39780, Os11g35410    Activation assessment              [150]
Tobacco   PDS                                   Activation assessment              [151]
Tobacco   GUS                                   Activation assessment              [152]
Tobacco   LUC                                   Activation assessment              [153]
Transcriptional repression
Tobacco   PDS                                   Repression assessment              [151]
Tobacco   LUC                                   Repression assessment              [153]

of Arabidopsis, tobacco and rice ( Cas9 cleavage at target sites. The cohesive ends generated by Cpf1 offer potential advantages over blunt ends, such as improved knockin efficiency and an increased possibility of multiple editing events. The CRISPR/Cpf1 system is thus considered a promising tool and has been applied in the genetic modification of crop plants (Table 8). We believe that this system will be widely used for crop improvement in the near future.

Table 8 (excerpt)
Multigene mutagenesis assessment [166]
CRISPR/Cpf1; Lachnospiraceae bacterium; Rice; EPFL9; Stomata development [167]

Perspectives
Among the four targeted genome engineering technologies reviewed here, the CRISPR system is considered the most advantageous and has been the most widely used in the genetic modification of crops. Many new crop germplasms that do not exist in nature have been efficiently created via CRISPR-mediated knockout, knockin, transcriptional activation and transcriptional repression of genes of interest. However, challenges remain in the current CRISPR platform. Thus far, most reported crop modifications are CRISPR-mediated gene knockouts, while knockin events are rare and usually of low efficiency, even though knockin is very useful for crop breeding because it can confer novel traits by editing existing alleles or adding new ones. CRISPR/dCas9 is considered a promising tool for transcriptional regulation of gene expression, particularly for genes with highly methylated promoter regions [168] [169], but available data on this kind of modification are quite limited, which constrains the improvement of CRISPR-based transcriptional regulation systems and their application in crops. Transformation and tissue culture are crucial for CRISPR-mediated genome editing, yet their efficiency remains a challenge for most crop plants. Last but not least, off-target editing is still a major concern for plant scientists, although numerous studies have demonstrated the precision of CRISPR-mediated genome editing in plants [170]. All of these challenges should be addressed in future studies in order to promote the application of CRISPR systems in crop improvement.