Research Article
Identification of Genes Related to Photoperiod that Regulating the Circadian Clock in Legume Plants
2 Cuixi Academy of Biotechnology, Zhuji, 311800, China
3 Hainan Institute of Tropical Agricultural Resources, Sanya, 572025, China
4 College of Life Sciences and Technology, Guangxi University, Nanning, 530005, China
Molecular Plant Breeding, 2017, Vol. 8, No. 1 doi: 10.5376/mpb.2017.08.0001
Received: 01 Mar., 2017 Accepted: 03 Mar., 2017 Published: 30 Mar., 2017
Li Z.F., Zhang J., Wei F., Cai M.D., Liu Z.P., and Fang X.J., 2017, Identification of genes related to photoperiod that regulating the circadian clock in legume plants, Molecular Plant Breeding, 8(1): 1-20 (doi: 10.5376/mpb.2017.08.0001)
In this study, nine groups of genes involved in the photoperiod-mediated circadian clock were analyzed by bioinformatics methods in order to identify the photoperiod-related genes that regulate the circadian clock in legumes: Phytochrome, Cryptochrome, LKP1/FKF1/LKP2, PIF3/PIF4/PIF5, SRR1, CO/COL1/COL2, TIC, XCT and FHY3. The results showed that these genes had differentiated to different degrees between Arabidopsis thaliana and the three legume species. Using Arabidopsis thaliana as a reference, the degree of differentiation of the genes in the photoperiod-mediated circadian regulation network was higher in Lotus japonicus and Medicago truncatula than in Glycine max, which might be because the genomes of Lotus japonicus and Medicago truncatula are far smaller than that of Glycine max.
Introduction
The circadian clock is involved in almost all the regulatory processes of plant metabolism, growth and development, mainly by controlling gene expression. For example, some pigments regulate the oscillation of the circadian clock by sensing light signals in the external environment, while their own expression is in turn precisely controlled by the internal clock mechanism (Covington et al., 2008; Li et al., 2015a). The growth and development of plant organs are also controlled by the internal circadian clock. For example, the circadian clock is believed to regulate hypocotyl growth mainly by controlling the input of the light signal, in a manner similar to its control of CAB gene expression (Li et al., 2015b). The circadian clock coordinates a variety of physiological activities so that they take place at the appropriate time. It is therefore of great significance to study the regulatory networks of the biological clock in different plants.
The period of the plant circadian clock is close to the 24-hour light-dark cycle on Earth (Devlin and Kay, 2001). Light is an extremely important external signal for synchronizing the clock, so identifying the components of the light signaling pathways that are associated with the biological clock is an important lead. In Arabidopsis, two major classes of photoreceptors, phytochrome (PHY) and cryptochrome (CRY), are involved in the photoperiod-controlled circadian clock (Somers et al., 1998). Both phytochrome and cryptochrome can adjust the rhythm of the circadian clock by sensing light signals in the external environment. Meanwhile, their own expression is controlled by the internal circadian clock mechanism. This indicates that plants have a control loop that regulates light signal input and sets the circadian rhythm. In addition, a series of gating ("valve") mechanisms control the entry of external light signals into the circadian clock system.
Medicago truncatula and Lotus japonicus are model plants of the legume family, and Glycine max is an important economic crop. The whole genomes of these three species have been sequenced, which provides complete genomic data for our analysis. In this study, the photoperiodic regulation factors were compared with the genomes of Medicago truncatula, Lotus japonicus and Glycine max at the whole-genome level, and the corresponding bioinformatics analysis was carried out.
1 Results and Analysis
1.1 Identification of the orthologous genes of phytochromes in Legume
In Arabidopsis thaliana, the phytochrome family encodes five proteins. PHYB (Cantón, 1999) mainly perceives red light, while PHYA (Franklin et al., 2007) and PHYE (Clack et al., 1994) mainly perceive far-red light. Phytochromes exist in two forms: the red-absorbing form (Pr) and the far-red-absorbing form (Pfr). Pfr is the physiologically active form and is unstable, while Pr is the physiologically inactive form and is relatively stable. Pfr and Pr can be interconverted by light. PHYA is a type I phytochrome (Johnson et al., 1994): after absorbing red light it is converted into the active Pfr form and then undergoes rapid proteolysis. The other four phytochromes belong to type II and are highly light-stable.
The light-induced movement of phytochromes between the cytoplasm and the nucleus is related to the physiological rhythm of the plant (Hanano et al., 2006). A series of phytochrome-interacting factors are involved in this process, of which the best studied in the nucleus is PIF3 (phytochrome interacting factor 3) (McWatters et al., 2000). PIF3 binds to the G-box in the promoter regions of CCA1 and LHY and inhibits the expression of these genes. After red light irradiation, PHYB is activated and moves from the cytoplasm into the nucleus, where it binds PIF3 and releases the inhibition of CCA1, LHY and other genes by PIF3, which in turn induces the expression of downstream light-induced genes.
The PHYA protein is mainly composed of four parts, including the phytochrome domain near the N-terminus and two PAS domains in the middle, toward the C-terminus. The PAS domain is a common structure among circadian clock related factors and can bind G-box cis-acting elements. The other parts of PHYA are the HATPase domain at the C-terminus and the GAP and His domains (Figure 1).
Figure 1 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of PHY proteins. Note: A: Conserved domain analysis of PHY protein; B: PHY protein sequence alignment; C: Molecular evolutionary tree of PHY proteins; D: The distance of PHY proteins among four species
The molecular phylogenetic tree showed that the Arabidopsis PHY proteins could be divided into two groups: PHYA, PHYC and PHYE formed the first group, and PHYB and PHYD formed the second. Within the first group, PHYA and PHYC clustered into a subgroup.
PHYA had two orthologous candidate genes in Medicago truncatula and one in Lotus japonicus. PHYB and PHYE each had one orthologous candidate gene in each of the two species. However, the similarity between MtPHYB and PHYB was only 22.3%, and there was no EST expression data, indicating that this gene may have degenerated into a pseudogene. The similarity between MtPHYA1 and MtPHYA2 was 100%. Both were located on chromosome 1 of Medicago truncatula: the start position of MtPHYA1 was 19 823 651 bp, and MtPHYA2 started at 19 721 115 bp and ended at 19 726 501 bp. The two genes were therefore about 100 000 bp apart on chromosome 1. The duplication that produced the two genes occurred recently; it may have been a segmental duplication of the region of chromosome 1 containing MtPHYA, or a duplication of the individual MtPHYA gene mediated by transposon insertion.
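The spacing of the two MtPHYA copies can be checked directly from the coordinates quoted above. The following is a minimal sketch in Python; the variable and function names are illustrative, and treating the coordinates as inclusive is an assumption, since the coordinate convention is not stated in the text.

```python
# Coordinates quoted in the text for chromosome 1 of Medicago truncatula (bp).
MTPHYA1_START = 19_823_651
MTPHYA2_START = 19_721_115
MTPHYA2_END = 19_726_501


def start_offset(start_a: int, start_b: int) -> int:
    """Absolute distance between two gene start coordinates, in bp."""
    return abs(start_a - start_b)


def span(start: int, end: int) -> int:
    """Length of a gene interval, assuming inclusive coordinates."""
    return end - start + 1


offset = start_offset(MTPHYA1_START, MTPHYA2_START)
mtphya2_length = span(MTPHYA2_START, MTPHYA2_END)

print(f"Start-to-start offset: {offset} bp")
print(f"MtPHYA2 span: {mtphya2_length} bp")
```

Under these assumptions the start-to-start offset is 102 536 bp, in line with the figure of about 100 000 bp given in the text.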
In Glycine max, PHYA had four orthologous candidate genes, while PHYB and PHYE each had two orthologous candidate genes (Table 1).
Table 1 Candidate orthologous genes of PHY in three species of Legume
In these three species, the conservation of PHYA and PHYB was more than 70%, higher than that of PHYE, which was about 60%.
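The text does not state how the conservation percentages were computed. As an illustration only, a simple percent identity over the columns of a pairwise protein alignment could be calculated as in the Python sketch below. The function name, the rule of skipping columns that contain a gap, and the toy sequences are all assumptions and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (not real PHY sequences).
toy_a = "MVSGV-GGSR"
toy_b = "MVSGAAGG-R"
print(percent_identity(toy_a, toy_b))
```

Other definitions, for example counting gapped columns as mismatches or scoring similar residues with a substitution matrix, would give different percentages, so values computed this way are only comparable when the same definition is used throughout.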
No orthologous genes of PHYC and PHYD were found in the three legume species. This suggested that, in Arabidopsis thaliana, these two genes may have been produced by genome duplication after the separation of Cruciferae and Leguminosae.
The evolutionary history of the five PHY homologs in Cruciferae and Leguminosae can be summarized as follows. Before the separation of Cruciferae and Leguminosae, the ancestral PHY gene was duplicated and differentiated into PHYB and PHYE. PHYE was then duplicated and diverged into PHYA and PHYE. After the separation of Cruciferae and Leguminosae, PHYB was duplicated in Arabidopsis and differentiated into PHYB and PHYD, and PHYA was duplicated and diverged into PHYA and PHYC. In Medicago truncatula and Lotus japonicus, PHYA, PHYB and PHYE lost the extra copies after the whole genome duplication and were retained as single-copy genes. In Glycine max, PHYA, PHYB and PHYE experienced two rounds of duplication, but PHYB and PHYE lost a copy after the first duplication event. GmPHYA1 and GmPHYA2 were generated by the second duplication event, and the evolutionary distance between them was 0.056. The evolutionary distance between GmPHYB1 and GmPHYB2 was 0.063, and that between GmPHYE1 and GmPHYE2 was 0.053, which indicated that each of these pairs was also produced by the second duplication event.
1.2 Identification of orthologous genes of cryptochrome in Legume
As a light receptor of the circadian clock, CRY perceives blue light. CRY1 and CRY2 are functionally redundant as receptors for the blue light mediated circadian clock, but they respond to blue light of different intensities: CRY1 is stable in blue light and mainly perceives high-intensity blue light, while CRY2 is rapidly degraded in blue light and mainly perceives low-intensity blue light. The blue light induced degradation of CRY2 may be related to ZTL.
In Arabidopsis, the cryptochromes include CRY1, CRY2 and CRY3. The functions of CRY1 and CRY2 are well studied. CRY3 is only distantly related to CRY1 and CRY2: its similarity to each of them was below 30%, and the lack of DNS key domains indicated that CRY3 diverged from CRY1 and CRY2 long ago. CRY3 may have degenerated into a pseudogene or evolved other functions (Figure 2). CRY1 and CRY2 each had one orthologous gene in Medicago truncatula. In Lotus japonicus, CRY1 had three orthologous genes, only one of which was full length, and CRY2 had two (Table 2).
Figure 2 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of CRY proteins. Note: A: Conserved domain analysis of CRY protein; B: CRY protein sequence alignment; C: Molecular evolutionary tree of CRY proteins; D: The distance of CRY proteins among four species
Table 2 Predicted orthologous genes of CRY in three species of Legume
CRY1 had four orthologous genes in Glycine max, and CRY2 had two. The CRY1 and CRY2 genes were highly conserved between Arabidopsis and the three legume species, which showed that the CRY genes are very important for plants. CRY1 was more conserved than CRY2 between Arabidopsis thaliana and the three legume species: the former was more than 75% conserved, while the latter was about 60%. However, no orthologous genes of CRY3 were found in the three legume species.
Of the four CRY1 genes in Glycine max, GmCRY1a and GmCRY1b clustered into one group, and GmCRY1c and GmCRY1d into another. Analysis of the evolutionary distances between their amino acid sequences showed that the distance between GmCRY1a and GmCRY1b was 0.024, and that between GmCRY1c and GmCRY1d was 0.016. This suggested that GmCRY1c and GmCRY1d were more conserved because they were under greater selection pressure than GmCRY1a and GmCRY1b. The same trend could be seen in the evolutionary distances between the four genes and Arabidopsis CRY1, which were 0.160, 0.172, 0.148 and 0.145 for GmCRY1a, GmCRY1b, GmCRY1c and GmCRY1d, respectively. In this sense, GmCRY1c and GmCRY1d were more likely to regulate the blue light inhibition of hypocotyl elongation and the blue light circadian rhythm in Glycine max. The evolutionary distances of MtCRY1 and LjCRY1a from AtCRY1 were 0.133 and 0.139, respectively. This high conservation suggested that they may have the same functions in Medicago truncatula and Lotus japonicus.
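The comparison above amounts to ranking the legume CRY1 candidates by their evolutionary distance from Arabidopsis CRY1. A short Python sketch of that ranking, using only the distances quoted in the text, is given below; the dictionary and variable names are illustrative, and the assignment of 0.133 to MtCRY1 and 0.139 to LjCRY1a follows the order in which the text lists them.

```python
# Evolutionary distances to Arabidopsis CRY1 (AtCRY1), as quoted in the text.
distance_to_atcry1 = {
    "GmCRY1a": 0.160,
    "GmCRY1b": 0.172,
    "GmCRY1c": 0.148,
    "GmCRY1d": 0.145,
    "MtCRY1": 0.133,
    "LjCRY1a": 0.139,
}

# Sort candidates from the smallest distance (most conserved) to the largest.
ranked = sorted(distance_to_atcry1, key=distance_to_atcry1.get)

for gene in ranked:
    print(f"{gene}\t{distance_to_atcry1[gene]:.3f}")
```

With these values, GmCRY1d and GmCRY1c rank ahead of GmCRY1a and GmCRY1b among the soybean copies, which is the ordering the argument in the text relies on.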
1.3 Identification of the orthologous candidate genes of the LKP1/FKF1/LKP2 family in Legume
Related studies indicate that ZTL (ZEITLUPE, also known as LKP1), FKF1 (FLAVIN-BINDING, KELCH REPEAT, F-BOX 1) and LKP2 (LOV KELCH PROTEIN 2) may be involved in circadian clock pathways controlled by light signaling. The F-box domain occurs in receptor proteins that link specific substrates to ubiquitin protein ligases for degradation (Craig, 2001). The unique combination of domains in the ZTL protein family suggests that these proteins mediate the light-dependent degradation of circadian clock components, and TOC1 is one of the substrates of ZTL (Figure 3).
Figure 3 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of LKP1/LKP2/FKF1 proteins. Note: A: Conserved domain analysis of LKP1/LKP2/FKF1 protein; B: LKP1/LKP2/FKF1 protein sequence alignment; C: Molecular evolutionary tree of LKP1/LKP2/FKF1 proteins; D: The distance of LKP1/LKP2/FKF1 proteins among four species
The role of these genes in the light input pathway may be determined by their direct interaction with the light receptors. ZTL had been found to interact with PHYB and CRY1 in Legumes. ZTL can also bind TOC1 and PRR5, and this binding is related to their highly phosphorylated state. In vivo experiments showed that the degradation of the PRR5-ZTL complex was inhibited by blue light. TOC1 and PRR3 interact in vivo, and their phosphorylation promotes this interaction. In addition, both ZTL and PRR3 can bind the N-terminus of TOC1, which indicates that PRR3, through phosphorylation, may bind TOC1 and protect it from ZTL-mediated degradation.
LKP2 had no orthologous genes in the three legume species, so LKP2 may have arisen in Arabidopsis through duplication and functional differentiation of the LKP gene after the separation of Cruciferae and Leguminosae. The amino acid sequence similarity between LKP1 and LKP2 in Arabidopsis thaliana was 72.1% (Table 3).
Table 3 Predicted orthologous genes of LKP1/LKP2/FKF1 in three species of Legume
LKP1 had one orthologous gene each in Medicago truncatula and Lotus japonicus, with similarities of 82.8% and 75.5%, respectively, and four orthologous genes in Glycine max. Except for GmLKP1c, whose similarity was only 68%, the soybean genes had about 85% similarity to Arabidopsis LKP1. PUT sequence data showed that MtLKP1, GmLKP1a, GmLKP1b and GmLKP1c were expressed at full length in vivo. No EST expression data have yet been found for LjLKP1, and its similarity was also on the low side, at only 75.5%.
FKF1 had only one orthologous gene in Lotus japonicus and two each in Medicago truncatula and Glycine max. The orthologous genes of FKF1 were only partially expressed in Medicago truncatula and Lotus japonicus, while the two orthologous genes in Glycine max were expressed at full length in vivo.
The amino acid sequence alignment showed that the GmLKP1c protein sequence had two large deletions on the N-terminal side of the conserved region, and that MtFKF1a had three obvious deletions or mutations on the C-terminal side of the conserved region.
1.4 Identification of orthologous candidate genes of PIF3/PIF4/PIF5 in Legume
The PIF (PHYTOCHROME-INTERACTING FACTOR) family comprises factors that interact with phytochromes. Among them, PIF3, PIF4 and PIF5 interact with phytochromes in the nucleus. In Arabidopsis thaliana, PIF3, PIF4 and PIF5 act together with the phytochromes to regulate the response of plants to far-red light. The main domain of the PIF3, PIF4 and PIF5 proteins is the basic helix-loop-helix (bHLH) domain characteristic of this class of transcription factors.
The PIF3 protein also contains one PAS (Per-Arnt-Sim-like) domain, which can bind the G-box cis-acting element and mainly regulates downstream MYB transcription factor genes that contain a G-box, such as LHY/CCA1. Related studies indicate that PIF3 binds the G-box in the promoter regions of LHY/CCA1 to inhibit the expression of these genes, and that this inhibition can be released by PHYB.
PIF5 encodes a novel bHLH transcription factor that belongs to the PIF3-like transcription factor family. The interaction between PIF5 and TOC1 negatively regulates the expression of PHYB, and the protein level of PIF5 is regulated by PHYB. The role of PIF4 is similar to that of PIF5, and PIF5 is also involved in the shade avoidance response of plants.
PIF4 and PIF5 also play an important role in photoperiod-controlled hypocotyl elongation, and both can interact with phytochromes. At night, short-day conditions are needed to stimulate an increase in the expression of PIF4 and PIF5, while long-day conditions are not sufficient to promote their expression. Photoperiod-controlled hypocotyl elongation is therefore related to the accumulation of PIF4 and PIF5 under short-day conditions, which is determined jointly by the internal circadian clock and the external photoperiod.
PIF3 had two orthologous candidate genes in Medicago truncatula and Lotus japonicus, and four in Glycine max. However, the similarity of PIF3 between the legumes and Arabidopsis thaliana was low (Table 4). Moreover, phylogenetic analysis showed that MtPIF3a, MtPIF3b and LjPIF3b were all paralogous genes of PIF3, as analyzed later (Figure 4).
Table 4 Predicted orthologous genes of PIF3/PIF4/PIF5 in three species of Legume
Figure 4 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of PIF3/PIF4/PIF5 proteins. Note: A: Conserved domain analysis of PIF3/PIF4/PIF5 protein; B: PIF3/PIF4/PIF5 protein sequence alignment; C: Molecular evolutionary tree of PIF3/PIF4/PIF5 proteins; D: The distance of PIF3/PIF4/PIF5 proteins among four species
Of the four orthologous candidate genes of PIF3 in Glycine max, GmPIF3a and GmPIF3b were both full length, and their full coding sequences were covered by PUT sequences. GmPIF3c and GmPIF3d, however, were not full length and had no EST expression data, which suggested that these two genes may have degenerated into non-functional pseudogenes during long-term evolution.
No orthologous candidate gene of PIF4 was found in Medicago truncatula, and only one candidate orthologous gene fragment, without EST expression data, was found in Lotus japonicus.
PIF4 had three orthologous candidate genes in Glycine max. GmPIF4a and GmPIF4b were covered over their full length by PUT sequences, and GmPIF4c was partially supported by EST data.
No orthologous candidate genes of PIF5 were found in the three legume species, which suggested that the orthologous gene of PIF5 was lost during the long-term evolution of the legumes.
The evolutionary history of PIF3 in the legumes was more complex. MtPIF3a was only distantly related to the other PIF3/PIF4/PIF5 genes, and the similarity between MtPIF3a and PIF3 was only 16.1%. MtPIF3b and LjPIF3b were located on the outer branch of the PIF3 clade, and MtPIF3a may be another gene of the PIF family in Medicago truncatula whose orthologous gene was lost in Arabidopsis thaliana during long-term evolution. There is also another possibility. The evolutionary history of PIF3 in Cruciferae, represented by Arabidopsis, and in the legumes may have been as follows. Before the separation of Cruciferae and Leguminosae, the PIF3 gene was duplicated into PIF3A and PIF3B. After the separation, PIF3B was lost in Arabidopsis but preserved in the three legume species Medicago truncatula, Lotus japonicus and Glycine max as MtPIF3b, LjPIF3b, and GmPIF3c and GmPIF3d, respectively. PIF3A was retained as MtPIF3a in Medicago truncatula, where it underwent considerable functional differentiation through some unknown mechanism, and its function was taken over by MtPIF3b. In Glycine max, PIF3A was duplicated into GmPIF3a and GmPIF3b.
The evolutionary history of PIF4 was relatively simple. PIF4 was lost in Medicago truncatula and may also have diverged in Lotus japonicus. In Glycine max, it underwent two duplications, and one of the resulting copies was lost.
In short, during the long-term evolution of the legumes, no orthologous genes of Arabidopsis PIF5 were retained in the three legume species, and the orthologous genes of PIF3 and PIF4 were lost or evolved into genes with other functions in M. truncatula. Corresponding functional genes existed in Glycine max, and the situation of PIF3 and PIF4 in Lotus japonicus was intermediate between Medicago truncatula and Glycine max.
1.5 Identification of candidate genes of SRR1 in Legume
SRR1 (sensitivity to red light reduced 1) is similar to PIF3 in its circadian clock function. It is involved in PHYB-mediated light signal transduction and is a regulatory factor required by the central oscillator of the circadian clock to produce a normal rhythm. The first evidence for a connection between the light signaling pathway and the circadian clock came from the SRR1 gene. The srr1 mutant has defects in both the PHYB-mediated signaling pathway and the normal circadian clock output pathway, which indicates that SRR1 is necessary for the normal function of the circadian clock. Molecular genetic studies showed that, under the same light conditions, many components of the circadian clock output are altered in srr1 mutant lines.
No orthologous candidate gene of SRR1 was found in Medicago truncatula, but two were found in Lotus japonicus: LjSRR1a was covered over its full length by PUT sequences, while LjSRR1b was not full length (Table 5).
Table 5 Predicted orthologous genes of SRR1 in two species of Legume
SRR1 also had two orthologous genes in Glycine max; their similarity was above 53%, and both were covered over their full length by PUT sequences. The evolutionary distance between GmSRR1a and GmSRR1b was only 0.008, which indicated that the two genes were produced by the most recent duplication, after Glycine max separated from Lotus and Medicago (Figure 5).
Figure 5 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of SRR1 proteins. Note: A: Conserved domain analysis of SRR1 protein; B: SRR1 protein sequence alignment; C: Molecular evolutionary tree of SRR1 proteins; D: The distance of SRR1 proteins among four species
1.6 Identification of orthologous candidate genes of CO/COL1/COL2 in Legume
The CO gene encodes a nuclear protein that contains a CCT domain, which is shared by the TOC1 protein and CO-like proteins, and two B-box type zinc finger regions. These two regions may mediate protein-protein interactions. Under long-day conditions, the expression of CO peaks late in the day and at night, whereas under short-day conditions its expression pattern changes and most of the expression occurs at night. This photoperiod-dependent expression of CO is believed to play an important role in the induction of flowering under long-day conditions. The mechanism appears to involve control of the activity of FT (FLOWERING LOCUS T). FT is one of the important genes promoting flowering, and its expression is closely related to the activity of CO. In toc1-1 mutants, the expression pattern of CO is changed under both long-day and short-day conditions, which leads to photoperiod-insensitive early flowering.
In Arabidopsis thaliana, CO was much more closely related to COL2 than to COL1. Arabidopsis thaliana is a long-day plant, while Lotus japonicus, Glycine max and Medicago truncatula are short-day plants. CO is a very important gene in the flowering pathway controlled by the circadian clock in Arabidopsis thaliana. No orthologs of CO or COL1 were found in Medicago truncatula, Lotus japonicus or Glycine max, which also suggested that the CO gene is one of the key genes controlling flowering under short-day conditions.
No orthologous genes of COL1 or COL2 were found in Medicago truncatula; the most similar gene was AC127169_10, which corresponds to a CONSTANS-like B-box zinc finger protein in Arabidopsis (Table 6).
Table 6 Predicted orthologous genes of CO/COL1/COL2 in two species of Legume
In Lotus japonicus, COL2 had only one orthologous candidate gene fragment, with a similarity of 50.1% and supporting EST evidence.
The amino acid sequence alignment showed that CO, COL1 and COL2 each had a conserved region at both the N-terminus and the C-terminus. The PUT evidence indicated that GmCOL2a and GmCOL2b were likely to be expressed in Glycine max, while no expression data have yet been found for GmCOL2c and GmCOL2d (Figure 6).
Figure 6 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of CO/COL1/COL2 proteins. Note: A: Conserved domain analysis of CO/COL1/COL2 protein; B: CO/COL1/COL2 protein sequence alignment; C: Molecular evolutionary tree of CO/COL1/COL2 proteins; D: The distance of CO/COL1/COL2 proteins among four species
In Glycine max, COL2 had four orthologous candidate genes, each with a similarity of more than 50%. GmCOL2a and GmCOL2b were supported by full-length or nearly full-length PUT sequences.
1.7 Identification of orthologous candidate genes of TIC in Legume
TIC (TIME FOR COFFEE) mainly inhibits the input of phytochrome-mediated light signals into the plant circadian clock system, and its function is similar to that of ELF3. Acting as a valve of the circadian clock, TIC protects the rhythm of the central oscillator from interference by high-intensity external light. The main downstream regulatory target of TIC is LHY.
Further studies showed that some clock-controlled traits lost their rhythm in tic mutant strains, which indicated that TIC functions in the circadian clock pathway. Whereas ELF3 acts at night, TIC acts as a gating switch of the circadian clock during the day. The complete arrhythmicity of tic/elf3 double mutant strains showed that these two genes have important functions in the clock pathway. The first evidence for a link between the light signal transduction pathway and the circadian clock came from the SRR1 gene (sensitivity to red light reduced 1). The srr1 mutants have defects in the PHYB-mediated signaling pathway and in the general clock output pathway (Staiger et al., 2003), suggesting that the SRR1 gene is necessary for normal oscillation of the biological clock.
TIC had one orthologous candidate gene in Medicago truncatula, with a similarity of 40.4% and supporting EST evidence (Table 7).
Table 7 Predicted orthologous genes of TIC in three species of Legume
TIC had four orthologous candidate genes in Lotus japonicus, none of which was full length. Among them, LjTIC1 and LjTIC2 were supported by EST data. LjTIC1 and LjTIC4 were located close together in CM0357, so we speculated that a single TIC gene had been annotated as two gene fragments, LjTIC1 and LjTIC4, because of incomplete information for Lotus japonicus (Table 7). In Glycine max, TIC had four orthologous candidate genes, of which GmTIC1, GmTIC2 and GmTIC3 were full length and supported by EST data, while GmTIC4 was not full length and had no EST support (Table 7).
Although LjTIC1 was not full length, its longest translated protein sequence still contained 1 007 amino acids. We therefore considered that it could roughly represent the possible full-length gene sequence for constructing the phylogenetic tree and calculating the evolutionary distance (Figure 7).
Figure 7 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of TIC proteins. Note: A: Conserved domain analysis of TIC protein; B: TIC protein sequence alignment; C: Molecular evolutionary tree of TIC proteins; D: The distance of TIC proteins among four species
The amino acid sequence alignment clearly showed that a considerable part of the sequence variation among species occurred in the glutamine repeat region. In particular, the LjTIC protein sequence contained four more "QQQQH" repeats in the conserved region than the TIC protein sequences of the other species, and a number of additional glutamine repeats also occurred nearby.
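Repeat differences of this kind can be tallied directly from the protein sequences. The Python sketch below counts occurrences of the "QQQQH" motif and lists the lengths of glutamine runs in a sequence. The function names, the minimum run length of three residues and the toy sequence are assumptions made for illustration; the sequence is not a real TIC protein.

```python
import re


def count_motif(protein: str, motif: str = "QQQQH") -> int:
    """Count non-overlapping occurrences of a motif in a protein sequence."""
    return protein.upper().count(motif)


def glutamine_runs(protein: str, min_len: int = 3) -> list[int]:
    """Return the lengths of all runs of at least min_len consecutive Q residues."""
    pattern = re.compile(r"Q{%d,}" % min_len)
    return [len(run) for run in pattern.findall(protein.upper())]


# Toy sequence invented for illustration (not taken from any TIC protein).
toy_protein = "MSQQQQHQQQQHAPQQQLK"

print(count_motif(toy_protein))
print(glutamine_runs(toy_protein))
```

Applying the same two functions to each TIC ortholog and comparing the results would give a per-species view of the repeat region, provided that the sequences cover the same part of the protein.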
1.8 Identification of orthologous candidate genes of XCT in Legume
Martin (2008) identified the XCT (XAP5 CIRCADIAN TIMEKEEPER) gene in Arabidopsis thaliana, which plays an important role in the light-regulated circadian clock. Under various conditions, xct mutant strains showed a shortened oscillation period. More interestingly, the role of XCT in the response to light differed, and was even opposite, depending on the characteristics and wavelength of the light. The xct mutant strain was particularly sensitive to red light, but its response to blue light was normal. In contrast, the inhibition of hypocotyl elongation in the xct mutant strain was sensitive to blue light but not to red light. XCT may thus coordinate various physiological activities of plant growth and development in the overall response of the plant to light. XCT contains a conserved XAP5 domain, which is conserved among higher eukaryotes, indicating that it is very important for the function of higher eukaryotes.
The similarity of the XCT gene was very high between Arabidopsis thaliana and the legumes Medicago truncatula, Lotus japonicus and Glycine max, indicating that the gene plays a very important role in maintaining basic plant functions and is not limited to the circadian clock regulation pathway (Table 8). As shown in Figure 8, XCT had a highly conserved XAP5 domain, while GmXCT1 lacked a large part of the sequence in the conserved region.
Table 8 Predicted orthologous genes of XCT in three species of Legume
Figure 8 Conserved domain analysis, protein sequence alignment, molecular evolutionary tree and evolutionary distance of XCT proteins. Note: A: Conserved domain analysis of XCT protein; B: XCT protein sequence alignment; C: Molecular evolutionary tree of XCT proteins; D: The distance of XCT proteins among four species
1.9 Identification of orthologous candidate genes of FHY3 in Legume
FHY3 (far-red elongated hypocoty l3) was a member of the PHYA optical signal transduction pathway. FHY3 and FAR1 was a pair of homologous proteins, and they both had a distinct sequence that was homologous to the mutant. As newly discovered transcription factors, they played key roles in the activate expression of FHY1 and FHL. The expressions of FHY1 and FHL were necessary for the accumulation of PHYA in the nucleus and the subsequent light signal transduction induced by light.
FHY3 had three distinct structural domains. Both termini carried conserved zinc finger domains. The N-terminal C2H2-type zinc finger domain, characteristic of the FAR1 subfamily, played an important role in DNA binding and in transmitting light signals. The central core domain belonged to the MULE subfamily and, together with the C-terminal SWIM-type zinc finger domain, regulated the activity of FHY3 at the transcriptional level. FHY3 could form a homodimer with itself or a heterodimer with FAR1, and the ability to form homodimers and heterodimers was correlated with the activity of FHY3 (Lin et al., 2008) (Figure 9).
Figure 9 Conserved domain analysis and protein sequence alignment of FHY3 proteins Note: A: Conserved domain analysis of FHY3 protein; B: FHY3 protein sequence alignment
Under red light, two different fhy3 alleles showed significantly enhanced inhibition of embryonic axis elongation, and the circadian rhythm of embryonic axis elongation was disrupted.
FHY3 played a very important role in the resetting of the circadian clock by red light perception, especially in maintaining the circadian rhythm during the early part of the day (Allen et al., 2006).
No orthologous genes of FHY3 were found in Medicago truncatula or Lotus japonicus, but two orthologous candidate genes were found in Glycine max. The similarity between these two genes was 97.3%, and the similarity of each to FHY3 was 63.9%. This indicated that GmFHY3a and GmFHY3b were two homologous genes (Table 9) produced by a recent gene duplication event. The PUT data supported expression of the full length of GmFHY3a in vivo, but only partial expression of GmFHY3b.
Table 9 Predicted orthologous genes of FHY3 in three legume species
2 Discussion
In this study, bioinformatics methods were used to analyze the photoperiod-related regulatory network of the circadian clock in legumes. We identified 41 candidate genes related to the circadian clock in Glycine max, 22 in Lotus japonicus, and 13 in Medicago truncatula. We constructed 9 molecular phylogenetic trees from the identified candidate genes, and analyzed the function and evolutionary trend of these genes in combination with the results of functional domain analysis and multiple sequence alignment. The analysis showed that the circadian clock control pathway had differentiated to different degrees between Arabidopsis thaliana and the three legume species. With Arabidopsis thaliana as a reference, the genes of the photoperiod-related circadian clock regulatory network were more differentiated in Lotus japonicus and Medicago truncatula than in Glycine max, which may be because the genomes of Lotus japonicus and Medicago truncatula are far smaller than that of Glycine max.
From an evolutionary point of view, orthologous genes in different species are derived from the same ancestral gene, and in general their functions are the most similar. The candidate genes that we identified by homology are therefore the genes most likely to be functionally similar to their Arabidopsis counterparts, as far as can be inferred in the current absence of supporting experimental data. Compared with traditional molecular biology methods, bioinformatics is more convenient and more targeted, allows analysis at the whole-genome level, and yields more comprehensive conclusions. Our study may provide a useful reference for further experimental research on the circadian clock regulation network of legumes.
3 Materials and Methods
3.1 Genomic data sources and access
The original data of Medicago truncatula, Lotus japonicus and Glycine max were mainly obtained and analyzed online. The genome databases used for the bioinformatics analysis of the four species, including Arabidopsis thaliana, and their corresponding websites are listed in Table 10.
Table 10 Names and websites of the databases of the four species
3.2 Identification of orthologous genes
The amino acid sequences of the Arabidopsis thaliana genes were used as queries for Blastp or tBlastn searches against the genomic databases of the three legume species. Sequences with a score above 100 and an E-value below 1E-30 were selected as candidates. The candidate with the highest similarity was then used in a reverse Blastp search against the NCBI Arabidopsis protein library to determine whether it was an orthologous gene. If it was not, the gene was considered to have no orthologous gene in that species; if it was, the sequence was used as the standard for identifying the other orthologous candidate genes.
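The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the function names and the gene labels are hypothetical, hits are assumed to have been parsed from BLAST output already, and the reverse search is assumed to confirm orthology when its top-scoring hit is the original Arabidopsis query (the text does not spell out this criterion).

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record; fields parsed from BLAST output)."""
    query: str
    subject: str
    score: float
    evalue: float


MIN_SCORE = 100.0    # threshold stated in the text: score above 100
MAX_EVALUE = 1e-30   # threshold stated in the text: E-value below 1E-30


def passing_hits(hits: List[Hit]) -> List[Hit]:
    """Keep hits that meet both thresholds, highest score first."""
    kept = [h for h in hits if h.score > MIN_SCORE and h.evalue < MAX_EVALUE]
    return sorted(kept, key=lambda h: h.score, reverse=True)


def top_subject(hits: List[Hit]) -> Optional[str]:
    """Return the subject of the highest-scoring hit, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda h: h.score).subject


def is_reciprocal_best(at_gene: str, forward: List[Hit], reverse: List[Hit]) -> bool:
    """Assumed criterion: the best legume candidate must hit at_gene back as its top hit."""
    candidates = passing_hits(forward)
    if not candidates:
        return False
    best_candidate = candidates[0].subject
    back_hits = [h for h in reverse if h.query == best_candidate]
    return top_subject(back_hits) == at_gene


# Made-up numbers for illustration only.
forward_hits = [Hit("AtXCT", "GmXCT1", 350.0, 1e-90), Hit("AtXCT", "GmOther", 80.0, 1e-10)]
reverse_hits = [Hit("GmXCT1", "AtXCT", 340.0, 1e-88), Hit("GmXCT1", "AtOther", 120.0, 1e-31)]
print(is_reciprocal_best("AtXCT", forward_hits, reverse_hits))
```

Once a candidate is confirmed in this way, the remaining entries returned by `passing_hits` would be the natural pool for the "other orthologous candidate genes" mentioned in the text.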
3.3 Analysis of conservative structure functional domain
The conserved domains of the target sequences were analyzed with the Conserved Domain Database (CDD) search tool on the NCBI website (Marchler-Bauer, 2007).
3.4 Multiple sequence alignment, phylogenetic tree construction and evolutionary analysis
The amino acid sequences of the orthologous candidate genes from the four species were aligned with the ClustalX program using default parameters. Phylogenetic trees were constructed with the neighbor-joining (NJ) method in MEGA4 with 1000 bootstrap replicates, and the evolutionary distances were calculated. From the phylogenetic trees, we could also roughly infer the evolutionary history of each gene between the legumes and the Cruciferae.
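As a rough illustration of what a pairwise evolutionary distance between aligned protein sequences looks like, the sketch below computes the simple p-distance (the proportion of differing sites). This is an assumption made for illustration: the text does not state which distance model was used in MEGA4, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def distance_table(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every pair of sequences in an alignment."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Toy alignment for illustration only; these are not real XCT or FHY3 sequences.
toy_alignment = {"seqA": "MKT-LV", "seqB": "MRTALV", "seqC": "MKTALV"}
for pair, dist in distance_table(toy_alignment).items():
    print(pair, round(dist, 3))
```

A table of this kind is the input that a neighbor-joining method works from; the bootstrap step then repeats the tree construction on resampled alignment columns.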
3.5 Acquisition and analysis of expression data
The PUT (PlantGDB-assembled unique transcript) sequences of PlantGDB were derived from high-quality EST and full-length cDNA sequences, which were clustered to remove redundancy and then partially assembled. BLASTN was used to compare the nucleotide sequences of the orthologous candidate genes with the PUT sequences. If the similarity of a matching PUT sequence was greater than 95%, the matched length was longer than 200 bp, and the E-value was less than 1E-30, we considered that the PUT supported genuine expression of the matching orthologous candidate gene. We divided the EST evidence into 3 categories: F, E and N. F indicated that the entire candidate gene was covered by PUT sequences, or that both ends were covered by a PUT sequence and 80 percent of the entire gene sequence was covered by PUT sequences. N indicated that no matching PUT sequence was found for the gene. E indicated that one or more parts of the gene met the requirements for PUT sequence coverage, but the gene did not reach the F standard.
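The F/E/N rule can be sketched as a small Python function. This is a minimal sketch under stated assumptions: the `PutMatch` record and function names are hypothetical, match coordinates are given on the candidate gene (1-based, inclusive), the matched length is approximated by the span on the gene, and "80 percent" is read as at least 80 percent of gene positions covered.

```python
from typing import List, NamedTuple, Tuple


class PutMatch(NamedTuple):
    """One BLASTN match of a PUT to a candidate gene (hypothetical record)."""
    start: int        # first matched gene position, 1-based
    end: int          # last matched gene position, inclusive
    identity: float   # percent identity
    evalue: float


def merge_intervals(intervals: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def classify_est_support(gene_length: int, matches: List[PutMatch]) -> str:
    """Return 'F', 'E' or 'N' following the thresholds stated in the text."""
    good = [
        (max(m.start, 1), min(m.end, gene_length))
        for m in matches
        if m.identity > 95.0 and (m.end - m.start + 1) > 200 and m.evalue < 1e-30
    ]
    if not good:
        return "N"  # no PUT meets the matching criteria
    merged = merge_intervals(good)
    covered = sum(end - start + 1 for start, end in merged)
    both_ends = merged[0][0] <= 1 and merged[-1][1] >= gene_length
    if covered >= gene_length or (both_ends and covered / gene_length >= 0.8):
        return "F"
    return "E"


# Made-up coordinates for illustration only.
print(classify_est_support(1000, [PutMatch(1, 500, 98.0, 1e-50),
                                  PutMatch(600, 1000, 99.0, 1e-60)]))
print(classify_est_support(1000, [PutMatch(100, 400, 97.0, 1e-40)]))
print(classify_est_support(1000, []))
```

Merging the intervals before summing avoids counting overlapping PUT matches twice when computing coverage.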
3.6 Re-annotation of gene
Some of the orthologous candidate genes identified in the three legume species lacked a full-length ORF because of incomplete genome information or annotation errors. We re-annotated the genomic regions corresponding to these genes with GenScan to obtain full-length ORFs. Each ORF sequence obtained was then compared by BLAST with the PlantGDB genome sequence of the corresponding species to confirm that it corresponded to the original genomic region. In addition, the translated amino acid sequence of each ORF was used in a reverse Blastp search against the Arabidopsis thaliana protein database to confirm that it was the relevant orthologous gene.
Authors’ contributions
Li Zongfei carried out this study and was responsible for the experimental design, implementation, data analysis and drafting of the manuscript; Zhang Jie and Liu Zhenpeng participated in the data analysis and in the preparation and revision of the draft; Cai Mengdie and Fang Wei were responsible for proofreading the manuscript; Fang Xuanjun conceived the research project and guided the writing and revision of the paper. All authors have read and approved the final manuscript.
Acknowledgements
This research was sponsored by the Open Invention Fund of Life Science and Biotechnology (No. 20161201) of Cuixi Academy of Biotechnology. The authors thank Ms Jia Xuan for critically reviewing and editing the English manuscript.
References
Allen T., Koustenis A., Theodorou G., Somers D.E., Kay S.A., Whitelam G.C., and Devlin P.F., 2006, Arabidopsis FHY3 specifically gates phytochrome signaling to the circadian clock, Plant Cell, 18(10): 2506-16
https://doi.org/10.1105/tpc.105.037358
Cantón F.R., and Quail P.H., 1999, Both phyA and phyB mediate light-imposed repression of PHYA gene expression in Arabidopsis, Plant Physiol, 121: 1207-1215
Clack T., Mathews S., and Sharrock R.A., 1994, The phytochrome apoprotein family in Arabidopsis is encoded by five genes: the sequences and expression of PHYD and PHYE, Plant Mol. Biol, 25: 413-427
https://doi.org/10.1007/BF00043870
Covington M.F., Maloof J.N., Straume M., Kay S.A., and Harmer S.L., 2008, Global transcriptome analysis reveals circadian regulation of key pathways in plant growth and development, Genome Biol, 9(8): 1
https://doi.org/10.1186/gb-2008-9-8-r130
Craig K.L., and Tyers M., 2001, The F-box: A new motif for ubiquitin dependent proteolysis in cell cycle regulation and signal transduction, Prog. Biophys. Mol. Biol, 72(3): 299-328
https://doi.org/10.1016/S0079-6107(99)00010-3
Franklin K.A., Allen T., and Whitelam G.C., 2007, Phytochrome A is an irradiance-dependent red light sensor, Plant J., 50(1): 108-17
https://doi.org/10.1111/j.1365-313X.2007.03036.x
Hanano S., Domagalska M.A., Nagy F., and Davis S.J., 2006, Multiple phytohormones influence distinct parameters of the plant circadian clock, Genes Cells, 11(12): 1381-1392
https://doi.org/10.1111/j.1365-2443.2006.01026.x
Johnson E., Bradley J.M., Harberd N.P., and Whitelam G.C., 1994, Photoresponses of light-grown phyA mutants of Arabidopsis (phytochrome A is required for the perception of daylength extensions), Plant Physiol, 105: 141-149
https://doi.org/10.1104/pp.105.1.141
Li Z.F., Zhang J., Liu Z.P., and Fang X.J., 2015a, Gene Regulation Network of Biological Clock in Plant, Fenzi Zhiwu Yuzhong (online) (Molecular Plant Breeding), 13(1): 1001-1008
Li Z.F., Zhuo W., Liu Z.P., and Fang X.J., 2015b, Effects of Gene Regulation of Circadian Clock on Plant Growth and Development, Douke Jiyinzuxue Yu Yichuanxue (online) (Legume Genomics and Genetics), 6(1): 1-4
Lin R.C., Teng Y.B., Park H.J., Ding L., Black C., Fang P., and Wang H.Y., 2008, Discrete and Essential Roles of the Multiple Domains of Arabidopsis FHY3 in Mediating Phytochrome A Signal Transduction, Plant Physiology, 148: 981-992
https://doi.org/10.1104/pp.108.120436
Marchler-Bauer A., 2007, CDD: a conserved domain database for interactive domain family analysis, Nucleic Acids Res, 35(1): 237-240
https://doi.org/10.1093/nar/gkl951
McWatters H.G., Bastow R.M., Hall A., and Millar A.J., 2000, The ELF3 zeitnehmer regulates light signalling to the circadian clock, Nature, 408: 716-720
https://doi.org/10.1038/35047079
Staiger D., Allenbach L., Salathia N., Fiechter V., Davis S.J., Millar A.J., Chory J., and Fankhauser C., 2003, The Arabidopsis SRR1 gene mediates PHYB signaling and is required for normal circadian clock function, Genes Dev, 17: 256-26