Characterization of Genomic Microsatellite Markers and Analysis of Pollen Donors Number for Single Cone of Chinese fir

Chinese fir [Cunninghamia lanceolata (Lamb.) Hook] is one of the most important indigenous timber tree species in China. The aims of the present work were to characterize simple sequence repeat (SSR) loci derived from specific-locus amplified fragment sequencing (SLAF) data of the genome and to investigate the number of pollen donors per cone using novel SSR markers with low-frequency null alleles. A total of 58855 SLAF-SSR, at a frequency of 42.04 SSR/Mbp, were identified in about 1.40 Gb of the Chinese fir genome. Dinucleotide repeat SSR accounted for 66.4% of the total SSR from the SLAF data. The AT/AT and ATG/CAT motifs were predominant among the di- and trinucleotide repeat SSR, respectively. Low frequencies of null alleles (<5%) were detected at nine novel SSR markers, which had an average expected heterozygosity of 0.513 and a polymorphism information content score of 0.508. The number of tested progeny per cone ranged from 4 to 13. An estimated 67 pollen donors were inferred for 15 cones, and the average number of pollen donors per cone was 4.5. The study shows, for the first time, that a single cone in a gymnosperm can have multiple pollen donors.

Chinese fir [Cunninghamia lanceolata (Lamb.) Hook], a gymnosperm, is one of the most important indigenous timber tree species in China because of its high growth rate, good wood quality, and the versatility of its wood (Li et al., 2017). To the best of our knowledge, the basic characteristics of genomic SSR in Chinese fir are unknown, and the number of pollen donors contributing to the offspring of a single gymnosperm cone has not been reported. In the present study we describe: (1) characterization of SSR derived from specific-locus amplified fragment sequencing data (SLAF-SSR) of the Chinese fir genome; (2) evaluation of primer validity and diversity parameters of SLAF-SSR and expressed sequence tag SSR (EST-SSR); (3) development of SSR markers with low-frequency null alleles (<5%); and (4) analysis of the number of pollen donors of a Chinese fir cone.

SLAF-SSR survey
A total of 58855 SLAF-SSR, at a frequency of 42.04 SSR/Mbp, were identified in about 1.40 Gb of the Chinese fir genome. The number of dinucleotide repeat SSR was 399 9, accounting for 66.4% of the total SSR, followed by trinucleotide repeat SSR with 16,547, accounting for 28.1% of the total. The AT/AT and ATG/CAT motifs were predominant among the di- and trinucleotide repeat SSR, accounting for 43.9% and 19.9%, respectively (Figure 1). The numbers of di- and trinucleotide SLAF-SSR declined as the number of motif repeats increased.
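As a quick check of the survey figures, the SSR frequency and the trinucleotide share can be recomputed from the reported counts. The sketch below is illustrative only; it assumes 1 Gb = 1000 Mbp, and the variable names are not taken from the original analysis.

```python
# Values reported in the SLAF-SSR survey.
total_ssr = 58855   # SLAF-SSR identified
genome_gb = 1.40    # surveyed sequence, in Gb
tri_ssr = 16547     # trinucleotide repeat SSR

# Assumption: 1 Gb = 1000 Mbp.
ssr_per_mbp = total_ssr / (genome_gb * 1000)
tri_share = 100 * tri_ssr / total_ssr

print(f"{ssr_per_mbp:.2f} SSR/Mbp")
print(f"{tri_share:.1f}% trinucleotide repeat SSR")
```

Both values agree with the reported 42.04 SSR/Mbp and 28.1% after rounding.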

Validation of SLAF-SSR and EST-SSR
One hundred markers for which primers had been successfully designed were randomly chosen from each of the SLAF-SSR and EST-SSR sets of Chinese fir. Twelve individuals were used to validate amplification. Sixty-two SLAF-SSR markers yielded specific products, but 51 of them showed stutter bands. Only 3 SLAF-SSR markers yielded specific products without stutter bands and proved to be polymorphic. None of the EST-SSR markers showed stutter bands; 86 yielded specific products, and 13 of these proved to be polymorphic. Therefore, a total of 16 novel polymorphic SSR markers yielding specific products without stutter bands were developed.

Assessment of genetic diversity
Details of the 16 novel polymorphic SSR markers and their variability across 48 Chinese fir individuals are summarized in Table 1. Seventy-one alleles were identified, with an average of 4.44 observed alleles per locus (range two to nine). The three SLAF-SSR loci carried twenty-one alleles, with an average of 7.00 (range five to nine). The expected heterozygosity, ranging from 0.334 to 0.810 (average 0.566), was higher than the observed heterozygosity, ranging from 0.283 to 0.727 (average 0.499). This agrees with the mean fixation index (FIS=0.103; P<0.05) and indicates an excess of homozygotes, which is most often a result of inbreeding (Frankham et al. 2010). As a measure of genetic diversity, the mean PIC of the loci was 0.512. Eight loci were in Hardy-Weinberg equilibrium (HWE) and the other eight showed significant departure from HWE (P-value<0.05). Low null allele frequencies (F-Null<0.05) were found at nine loci (SSR4, SSR5, SSR6, SSR7, SSR8, SSR9, SSR11, SSR12 and SSR16), which had an average expected heterozygosity of 0.513 and a PIC score of 0.508.

Figure 1 Distribution of di- and trinucleotide motifs of SLAF-SSR in Chinese fir
Note: Trinucleotide motif types with a share of less than 0.01% are not shown in this figure
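The link between the two heterozygosity estimates and the fixation index in this section can be made explicit. Under the standard definition (given here as a reminder, not necessarily the exact estimator implemented in the software used), the within-population fixation index at a locus is

```latex
F_{IS} = 1 - \frac{H_O}{H_E}
```

so that an observed heterozygosity below the expected one gives a positive value, that is, fewer heterozygotes than expected under random mating. Software estimates may apply sample-size corrections and average over loci, so a reported mean need not equal the value obtained from the averaged heterozygosities.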

Discussion
Trinucleotide repeats were the most abundant repeats in both EST-SSR and SSR from transcription factor unigenes of Chinese fir (Wen et al., 2015; Li et al., 2019). The AT/AT and AAG/CTT repeats were predominant among the di- and trinucleotide repeats of EST-SSR of Chinese fir (Wen et al., 2015), whereas the AG/CT and AGC/GCT repeats were predominant among those of SSR from transcription factor unigenes (Xu et al., 2014; Li et al., 2019). In the present SLAF data, by contrast, dinucleotide repeats were the most abundant and the AT/AT and ATG/CAT motifs predominated. This indicates that the distribution of SSR motifs differs between the transcribed and non-transcribed regions of the Chinese fir genome. The many stutter bands observed for SLAF-SSR markers may arise because these markers lie mainly in non-transcribed regions of the genome, where the flanking sequences are not conserved (Angers et al., 1997; Grimaldi et al., 1997). Although SLAF-SSR loci may yield more alleles per locus than EST-SSR loci, the loci with low null allele frequencies were all EST-SSR. This indicates that EST-SSR have a lower frequency of null alleles than genomic SSR (Rungis et al., 2004; Ellis et al., 2007). Program-based sibship assignment has been applied in forest trees (Lalitha, 2000; Litkowiec et al., 2018). This is the first report that a single cone in a gymnosperm can have multiple pollen donors.

Material collection and DNA extraction
Fresh needles of 48 Chinese fir clones were collected as experimental materials in a clonal seed orchard located at Xishan Forest Farm, Rongan County, Guangxi Province, China. The clonal seed orchard was established with plus trees collected from Guangxi, Guangdong, Hunan, Guizhou, Fujian and Zhejiang. Twelve clones were randomly selected from the 48 clones for the SSR primer validity test. Polymorphism and null allele frequencies were assessed with all 48 clones.
Fifteen Chinese fir clones were randomly selected from the seed orchard, and one cone was picked from each clone and numbered. The seeds were separated according to the clone number and cultured for germination. The seeds of each cone were wrapped in gauze and labeled. They were sterilized in 0.5% potassium permanganate solution for 25 min, rinsed 3 to 5 times with distilled water, soaked in water for 24 hours, and then placed on pads of moistened cotton wool in germination boxes. The germination boxes were kept in a constant-temperature incubator at 25℃. When the seedlings had grown to about 10 cm in height, each whole seedling was placed in a 2 mL centrifuge tube and stored in an ultra-low temperature freezer at -80℃. A total of 108 seedlings germinated and were genotyped at the SSR loci for full-sib group analysis. The total genomic DNA of each experimental material was extracted using an Ezup Column Plant Genomic DNA Kit (Sangon Biotech, China) according to the manufacturer's instructions. DNA quality was checked by 1% agarose gel electrophoresis, and DNA concentration was measured using a Nanodrop 2000 Spectrophotometer (Thermo Scientific).
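Because all seedlings from one cone share the same mother, each full-sib group found within a cone corresponds to one pollen donor. The counting step can be sketched as follows; the function name, the record layout and the example records are hypothetical and only illustrate the bookkeeping, not the sibship inference itself.

```python
from collections import defaultdict

def donors_per_cone(assignments):
    """Count distinct full-sib groups (one per pollen donor) for each cone.

    assignments: iterable of (cone_id, seedling_id, full_sib_group_id).
    """
    groups = defaultdict(set)
    for cone, _seedling, group in assignments:
        groups[cone].add(group)
    return {cone: len(found) for cone, found in groups.items()}

# Hypothetical records, not data from the study.
records = [
    ("C1", "s1", "F1"), ("C1", "s2", "F1"), ("C1", "s3", "F2"), ("C1", "s4", "F3"),
    ("C2", "s5", "F4"), ("C2", "s6", "F4"), ("C2", "s7", "F5"),
]
counts = donors_per_cone(records)
mean_donors = sum(counts.values()) / len(counts)
print(counts, mean_donors)
```

The average number of pollen donors per cone is then the total number of groups divided by the number of cones.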

SSR loci development and primer design
One Chinese fir clone was randomly selected from each of six provinces for specific-locus amplified fragment sequencing (SLAF-seq). SLAF library construction was performed as described by Wang et al. (2018). SSR sites were searched in the SLAF-seq data using the MISA program (Beier et al., 2017). SSR containing di-, tri-, tetra-, penta-, or hexanucleotide units repeated at least 6, 5, 4, 3 or 3 times, respectively, were selected. SSR primers were designed from the SLAF-seq data and the transcriptome sequencing data (unpublished) with Primer 5 software (Lalitha, 2000). The primers were synthesized by Sangon Biotech (China).
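The repeat thresholds given above can be illustrated with a small regular-expression search. This is a simplified sketch and not the MISA implementation: it only finds perfect repeats, does not handle compound SSR, and the function names and test sequence are hypothetical.

```python
import re

# Minimum repeat number per motif length, as stated in the text above.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 3, 6: 3}

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSR."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical sequence: (AT)7 and (ATG)5 separated by short flanks.
example = "GGC" + "AT" * 7 + "CCA" + "ATG" * 5 + "TTA"
print(find_ssrs(example))
```

The primitive-motif check keeps a long AT run from being reported again as an ATAT or ATATAT repeat, and keeps mononucleotide runs out of the dinucleotide class.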

Statistical analysis
Alleles were recorded in alphabetical order according to molecular weight, from largest to smallest. Allele sizes were estimated by comparison with an M13 sequence ladder. The observed number of alleles per locus (NA), the effective number of alleles (NE), Shannon's information index (I), observed heterozygosity (Ho), expected heterozygosity (HE), the inbreeding coefficient of individuals within subpopulations (Fis), and the significance level for deviations from Hardy-Weinberg equilibrium (P-value) were calculated using POPGENE version 1.32 (Yeh et al., 1999). Polymorphism information content (PIC) and the frequency of null alleles (F-Null) were calculated using the program CERVUS (Kalinowski et al., 2007). Sibship identification was performed with COLONY (Jones and Wang, 2010).
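For readers unfamiliar with these parameters, the sketch below computes several of them for a single diploid locus from textbook formulas. It is an illustration under stated assumptions, not a reimplementation of POPGENE or CERVUS: no sample-size corrections are applied, so values from those programs may differ slightly, and the function name and example genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log

def locus_stats(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)   # expected homozygosity
    he = 1.0 - homozygosity                    # uncorrected expected heterozygosity
    ho = sum(a != b for a, b in genotypes) / len(genotypes)
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return {
        "NA": len(freqs),                      # observed number of alleles
        "NE": 1.0 / homozygosity,              # effective number of alleles
        "I": -sum(p * log(p) for p in freqs),  # Shannon's information index
        "Ho": ho,
        "He": he,
        "PIC": pic,
    }

# Hypothetical genotypes: two alleles at equal frequency.
stats = locus_stats([("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")])
print(stats)
```

With two equally frequent alleles, the effective number of alleles equals the observed number, and PIC is lower than the expected heterozygosity, as is always the case for this formula.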

Authors' contributions
LKP carried out the experimental research, the data analysis, and drafted the manuscript. CSC participated in the experimental research and the data analysis. DLM participated in the data analysis. LJ and CDX participated in the experimental design. HKY was in charge of the project, guided experiment design and draft revision. All authors read and approved the final manuscript.