Genetic mapping studies in Coffea sp. using molecular marker methods

Genetic analysis has become an important tool in plant breeding for crop improvement. One of its greatest potentials appears to be the identification of molecular markers useful for genetic mapping, which is one of the important steps in genetic analysis. The essence of all genetic mapping is to place a collection of molecular markers onto their respective positions on the genome. This leads to the identification of new quantitative trait loci (QTLs) by exploiting naturally available genetic diversity, and thereby to the improvement of important and valuable traits. To date, thirteen genetic maps have been published for Coffea sp., creating a large database for a genetic framework. The most recent, openly available reference genetic map for robusta coffee was generated by the International Coffee Genomics Network (ICGN); it comprises 3230 loci over a genetic size of 1471 cM (1 cM ~ 500 kb), with an average density close to one marker every 220 kb. The Coffea genetic maps have been used for purposes ranging from gene characterization to comparative genomic analysis with other plant species. Nowadays, the feasibility of next-generation sequencing (NGS) for DNA and RNA sequencing allows the validation of genetic maps for the prediction of QTLs and adjacent genes related to important traits in Coffea sp.


Introduction
Molecular markers have become available in plant systems for basic and applied studies. They have become an increasingly important tool in plant breeding for genetic analysis and improvement of important traits (Malik et al., 2014; Grover & Sharma, 2014). They can facilitate plant breeding programs in numerous ways, such as fingerprinting of elite germplasm, assessment of genetic diversity, identification and stacking of genes controlling quantitative traits, and development of a possibly environment-neutral selection. However, one of their greatest potentials appears to lie in exploiting the available natural genetic diversity, for example by constructing genetic maps for the improvement of economic traits through the identification and use of new quantitative trait loci (QTLs) (Zhang et al., 2014).
Marker-based genetic mapping is essential for effective manipulation of important genes (Jiang, 2013). While the possibilities appear limitless, the application of this knowledge is still in its infancy, and it may take some time before it can be used optimally in most plant breeding programs. Nevertheless, genotyping is becoming less expensive than phenotyping, which will increase the use of molecular markers in plant breeding (Kumar et al., 2009).
Recent progress in molecular genetics and statistical approaches makes it possible to construct genetic maps, to identify the chromosome regions in which QTLs lie, and to further identify the favourable allele(s) determining a trait (Elshire et al., 2011; Kesawat & Das Kumar, 2009; Moose & Mumm, 2008). In general, this progress will help plant breeders select superior plants by identifying the deoxyribonucleic acid (DNA) markers linked to the loci determining the desirable traits. This paper aims to review, in general terms, molecular markers and their application to genetic mapping in Coffea sp.

Definition of genetic mapping
In general, a genetic map is a graphic representation of the arrangement of genes on a chromosome. A genetic map is used to locate and identify the gene or genes that determine a particular inherited trait. It also indicates the position of and relative genetic distance between markers along the chromosomes (Triwitayakorn et al., 2011). Locating and identifying genes in a genetic map is called "genetic mapping". Genetic mapping is sometimes also called "chromosome mapping" or "linkage mapping". In this paper, these terms are used interchangeably.
A genetic or linkage map can be viewed as a "road map" of the chromosomes derived from a segregating population. The most important application of genetic maps is the identification of QTLs (chromosomal locations) associated with traits of interest (Leonforte et al., 2013). QTL mapping is based on the principle that traits and DNA markers segregate together during meiosis, allowing their analysis in the progeny (Paterson, 2002). Traits and genes or markers that are close together or tightly linked will be transmitted together from parent to progeny more frequently than genes or markers that are located further apart (Kearsey & Farquhar, 1998).
A segregating population is a mixture of genotypes recombining the genetic backgrounds of the parents (Rieseberg et al., 1999). The recombinant genotypes can be used to calculate the recombination frequencies and therefore the genetic distance between markers. By analysing marker segregation, one can determine the distribution of and distances between markers. The lower the frequency of recombination between two markers, the closer they are on a chromosome, and vice versa. Furthermore, recombination frequencies can be converted into centimorgans (cM) (Kearsey & Farquhar, 1998; Kearsey & Pooni, 1996). The map distance equals the recombination frequency for short distances (<10 cM), but this does not strictly apply for longer distances (>10 cM) (Hartl & Jones, 2001).
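As an illustration of this calculation, the following minimal Python sketch estimates the recombination frequency between two markers from progeny counts and, for a short interval, reads it directly as a map distance. The function name and the progeny counts are hypothetical and chosen only for illustration; they do not come from any of the Coffea populations discussed here.

```python
def recombination_frequency(n_recombinant, n_total):
    """Return the fraction of progeny that are recombinant for two markers."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_recombinant <= n_total:
        raise ValueError("n_recombinant must lie between 0 and n_total")
    return n_recombinant / n_total


# Hypothetical example: 8 recombinant individuals among 100 progeny.
r = recombination_frequency(8, 100)

# For short intervals (<10 cM) the map distance is approximately 100 * r.
approx_cm = 100 * r
print(f"r = {r:.2f}, approximately {approx_cm:.0f} cM")
```

For larger recombination frequencies this direct reading underestimates the map distance, which is why a mapping function is applied.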
A mapping function adjusts the observed recombination frequencies to better estimate the map distance by taking into account single and double cross-overs. Two mapping functions, Haldane's and Kosambi's, are well known (Van Ooijen, 2006). Haldane's mapping function assumes that there is no interference, which would increase or decrease the proportion of double cross-overs. Under Haldane's mapping function, the map distance is based on the observed rate of recombinant genotypes adjusted for the number of unobservable double cross-overs. The map distance between loci does not equal the observed recombination fraction, because double cross-overs result in undetectable recombination events (Huehn, 2011). Kosambi's mapping function, in contrast, is based on empirical data on how the proportion of double cross-overs changes as the distance between loci varies. Kosambi's function adjusts the map distance for interference, which changes the proportion of double cross-overs. The relative distance between two loci can thus be measured by the proportion of cross-overs (Vinod, 2011).
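The two functions can be compared numerically. The sketch below uses their standard textbook forms, d = -50 ln(1 - 2r) for Haldane and d = 25 ln((1 + 2r) / (1 - 2r)) for Kosambi, with d in cM and r the recombination fraction. The function names are illustrative, and mapping software may implement these functions with additional options, so this is only a minimal sketch.

```python
import math


def haldane_cm(r):
    """Haldane map distance in cM (assumes no cross-over interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM (allows for cross-over interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Compare the naive reading (100 * r) with both mapping functions.
for r in (0.05, 0.10, 0.20, 0.30):
    print(f"r = {r:.2f}: naive {100 * r:5.1f} cM, "
          f"Kosambi {kosambi_cm(r):5.1f} cM, Haldane {haldane_cm(r):5.1f} cM")
```

For small r all three values nearly coincide. As r grows, both functions give distances larger than 100 r, and Haldane's estimate exceeds Kosambi's because it assumes that double cross-overs occur without interference.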
The construction of linkage maps in plants was limited until the development of molecular mapping. The most significant limitation was the inability to incorporate many phenotypic markers into a single genotype for further genetic analysis, owing to the possibly cumulative deleterious effects of the mutant phenotypes (Bandyopadhyay, 2011). Compared with traditional phenotypic markers, the DNA or protein markers now used to score genetic material are phenotypically neutral (Grover & Sharma, 2014). Moreover, linkage maps can be constructed with different types of population, such as F2, recombinant inbred line (RIL), backcross (BC) or doubled haploid populations (Milczarski et al., 2011). The utility of DNA marker-based genetic maps in linkage studies and the genetic improvement of crops is now well established. The potential of such maps was expected to be even higher in perennial species such as coffee, where conventional breeding efforts were severely constrained by the lack of screening tools and the long generation cycle (Mishra & Slater, 2012). The first genetic maps, based on morphological markers or growth behaviours, allowed breeders to estimate the offspring's genotypes without, or prior to, field testing (Winter & Kahl, 1995; Hendre & Aggarwal, 2014).

Past and current status of genetic mapping
In the past, a number of phenotypic markers segregating within one cross was required to construct a complete linkage map. The development of biochemical markers, including isozymes and other polymorphic proteins, significantly increased the number of markers observable in a single segregating population. In the early 1930s, genetic linkage groups were available only in tomato and maize. These linkage groups consisted entirely of loci causing visible changes in morphological characteristics. Until the 1970s, genetic mapping was restricted to these phenotypic markers (Morton, 2004). In 1986, a plant genetic map based on isozymes and cDNA sequences was first reported in tomato (Bernatzky & Tanksley, 1986). Unfortunately, the scoring of biochemical markers was sensitive to the environment, developmental stage, and type of tissue or organ (Lidah et al., 2006). This sensitivity was one of the reasons why DNA molecular markers were then developed for genetic studies. Genome mapping of plant species was developed shortly after the genetic mapping of Drosophila sp. (Morton, 2004). These efforts were driven by the possibility, offered by genetic maps, of indirect selection by associating desired traits with more easily determined markers (Kraakman et al., 2004; Kraakman et al., 2006).
Generating high-density maps for an entire genome or a single chromosome requires the isolation and characterization of hundreds of markers (Delourme et al., 2013). Various types of molecular markers are used to assess DNA polymorphism. In recent years, different DNA marker systems have been developed, such as Restriction Fragment Length Polymorphisms (RFLPs), Random Amplified Polymorphic DNAs (RAPDs), Amplified Fragment Length Polymorphisms (AFLPs), Simple Sequence Repeats (SSRs, also called microsatellites) and Single Nucleotide Polymorphisms (SNPs) (Malik et al., 2014; Lv et al., 2014; Jiang, 2013). In Coffea canephora, RFLPs have been used to assess the level of polymorphism in several clones (Priyono et al., 2000). The genetic diversity between cultivated and wild accessions of Coffea arabica has been studied using RAPD markers (Lashermes et al., 1996). Even though only a moderate degree of genetic diversity was identified among the accessions, RAPD markers were considered effective for grouping germplasm in Coffea species (Sera et al., 2003; Lashermes et al., 1996). AFLP markers have been used successfully for genetic mapping of leaf rust resistance (Diola et al., 2011). Using this type of marker, several genes related to the disease have been localized (Prakash et al., 2004). SSR markers, in particular, have been widely used in Coffea species to study genetic diversity, genus structure and cross-species transferability, and to develop core collections (Leroy et al., 2014; Sumirat et al., 2012; Geleta et al., 2012; Cubry et al., 2008; Musoli et al., 2009; Hendre et al., 2008; Poncet et al., 2007). These microsatellite markers could also be used in association studies to target selected regions of the genome in order to accelerate Coffea breeding (Cubry et al., 2013).
Moreover, with the advance of next-generation sequencing (NGS), Single Nucleotide Polymorphism (SNP) markers have become commonly used to study genetic diversity on a more precise scale and at high throughput (Dereeper et al., 2015). RNA sequencing technology has recently been used as an alternative for SNP detection (Vidal et al., 2010). This technology, which led to the development of Expressed Sequence Tag (EST) databases, is also powerful for comprehensive genome-wide transcript profiling, for example between two different species of Coffea (Mondego et al., 2011). Herrera et al. (2014) used the same technology on 20,000 unigenes to examine the genomic relationships among three Timor hybrids. These successful analyses confirmed the effectiveness of RNA sequencing for genomic analysis in polyploid species.
Transferability of genetic map in Coffea sp.
It is generally known that each linkage map is unique, being the product of a given population and marker set. The molecular markers used to build a genetic map might not be polymorphic and reliable across populations (Cheema & Dicks, 2009). Accordingly, shared markers are required in order to relate the information from one map to another. Anchor markers, that is, common markers that are highly polymorphic in mapping populations and are typically SSRs or RFLPs, are used as shared markers (Studer et al., 2010; Motta et al., 2014). Finally, a "consensus" map can be generated by incorporating shared anchor markers into different genetic maps (Yu et al., 2014; Raman et al., 2013; Yang et al., 2013). In this review, this capability is referred to as the transferability of linkage maps, which also implies the transferability of molecular markers.
As an example, the first consensus genetic map using RFLP and SSR markers was developed in C. canephora var. Robusta in 2004 (Crouzillat et al., 2004). Most recently, two genetic linkage maps, A (FRT 58 x FRT 51) and B (FRT 67 x FRT 51), were built from C. canephora parents and comprised 369 F1 individuals (Mérot-L'Anthoëne et al., 2014). This map was stored in an integrative database for functional, comparative and diversity studies in the Rubiaceae family, named "MoccaDB" (Plechakova et al., 2009). Several studies using EST-SSR and SSR markers in Coffea species have revealed high transferability across distant Rubiaceae species, which demonstrates the practical value of these markers (Poncet et al., 2007; Poncet et al., 2006; Aggarwal et al., 2007; Cubry et al., 2008; Plechakova et al., 2009). The generalized use of an increasingly large set of transferable markers will allow faster and more precise investigation of QTL synteny among species, QTL validation across different genetic backgrounds, and positioning of a growing number of candidate genes co-localized with QTLs.
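The first step toward relating two maps, finding the markers they share, can be sketched in a few lines of Python. The marker names, linkage groups and positions below are entirely hypothetical, and the sketch only lists candidate anchors; building an actual consensus map requires dedicated mapping software and is not attempted here.

```python
def shared_anchor_markers(map_a, map_b):
    """Return the sorted names of markers present in both maps."""
    return sorted(set(map_a) & set(map_b))


# Hypothetical maps: marker name -> (linkage group, position in cM).
map_a = {"SSR01": ("A", 0.0), "SSR02": ("A", 12.5), "RFLP07": ("B", 3.1)}
map_b = {"SSR02": ("A", 10.8), "RFLP07": ("B", 4.0), "AFLP15": ("C", 22.4)}

anchors = shared_anchor_markers(map_a, map_b)
for name in anchors:
    # Show where each shared marker sits in the two maps.
    print(name, map_a[name], map_b[name])
```

Markers that appear in only one map (here the hypothetical SSR01 and AFLP15) cannot serve as anchors, which is why marker systems that transfer well between populations, such as SSRs, are preferred for this purpose.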

Currently available genetic maps for Coffea sp.
A number of genetic maps for Coffea species have been reported. These maps are generally considered genetic references by the scientific community. The established linkage maps of Coffea species have been used for gene tagging, genome organization and evolutionary studies, as well as in marker-assisted selection. For high-resolution mapping, markers spaced at less than 5 cM, and ideally less than 1 cM or even more tightly linked, are used to discriminate between a single gene and several genes involved in a single QTL (Sharma et al., 2008; Grover & Sharma, 2014). So far, thirteen genetic maps of Coffea sp. are available, comprising four maps of C. canephora, two of C. arabica, two of interspecific F1 and/or F2 populations, and four of interspecific backcross populations (BC1) (Table 1).
The first available map used 47 RFLPs and 100 PCR-based markers (RAPD) to construct a map of 1402 cM comprising 15 linkage groups, with an average marker density of 10 cM (Paillard et al., 1996). Because of the low DNA polymorphism, this map needed additional markers to cover the genome. Even so, at that time such a genetic map was understandably considered a breakthrough in the genetic analysis of Coffea species. It was developed to provide a starting point for genetic studies in Coffea species and to be applied to QTL analysis. The same doubled-haploid population was used to develop another map of 162 loci, comprising 97 AFLP, 11 RAPD, 36 RFLP and 18 SSR markers and covering 1041 cM with an average density of 6.5 cM (Lashermes et al., 2001). This genetic map showed few significant differences in recombination. The result suggested that crossing programmes for C. canephora could be carried out in the most convenient manner.
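The reported densities can be checked approximately by dividing map length by the number of markers. The Python sketch below does this for the two maps just described, using the map lengths and marker counts quoted above. It assumes that density is simply total length divided by marker count; the published figures may have been computed differently (for example per interval or per linkage group), so small discrepancies are expected. The function name is illustrative.

```python
def average_marker_spacing_cm(map_length_cm, n_markers):
    """Approximate average spacing as total map length / number of markers."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# (map length in cM, number of markers) as quoted in the text.
maps = {
    "Paillard et al. (1996)": (1402, 47 + 100),  # 47 RFLP + 100 RAPD
    "Lashermes et al. (2001)": (1041, 162),      # 162 loci
}

spacing = {name: average_marker_spacing_cm(length, n)
           for name, (length, n) in maps.items()}

for name, value in spacing.items():
    print(f"{name}: about {value:.1f} cM between markers")
```

Under this assumption the two maps come out at roughly 9.5 cM and 6.4 cM, close to the reported densities of 10 cM and 6.5 cM.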
In parallel with the map of Lashermes and colleagues, Ky et al. (2000) demonstrated the use of an AFLP marker-based genetic map for the diploid coffee genome. The interspecific backcross, using 181 DNA markers in total, produced a map length of 1144 cM with a marker density of 6.9 cM. This was the first Coffea interspecific linkage map, and it provided evidence of segregation distortion in Coffea species. The fourth map, by Coulibaly et al. (2003), followed the strategy of Ky et al. (2000) by using a backcross population derived from two Coffea species differing in genome size. A map of 1360 cM comprising 15 linkage groups, with a marker density of 7.2 cM, was generated from 190 AFLP and SSR markers and suggested a potential radiative speciation within the Coffea genus. With this map, Coulibaly et al. (2003) showed the dual function of AFLP markers: mapping qualitative morphological traits and serving as a core map for the subsequent placement of additional markers, in this case SSRs.
In 2004, the first robusta consensus map was developed (Crouzillat et al., 2004). It was also considered the first high-resolution genetic map, comprising 1258 cM of map length with a marker density of 1.7 cM. The large number of SSR markers increased the number of polymorphisms, which showed that this type of marker was suitable for creating a consensus map for genetic studies across the genus (Lefebvre-Pautigny et al., 2010). This consensus map was intended to guide and speed up plant breeding programs. In the same year, Pearl et al. (2004) constructed a map of 1803 cM using a large number of AFLP markers (464 loci). With a marker density of 10.2 cM, this sixth map could not be considered a high-resolution map. However, it was the first published map available for arabica coffee. Also in the same year, another partial genetic map was developed by Teixeira-Cabral (2004). The distinctive feature of this map was the use of 82 RAPD markers, which are dominant markers, to confirm the diploid-like meiotic behaviour of the allotetraploid Coffea species. However, the low level of polymorphism in arabica has been an obstacle to genome mapping.
The eighth genetic map was constructed by Hamon et al. (2005) using 222 AFLP and SSR markers. The length of this interspecific genetic map was 1360 cM, with a marker density of 7.2 cM. Another interspecific genetic map was developed by López & Moncada (2006) in C. liberica and C. eugenioides using 76 SSR markers, resulting in a short map of 378 cM with a marker density of 5.0 cM. Both of these maps were used to study marker-assisted breeding in the species tested. A linkage map of cultivated diploid robusta using a pseudo-testcross population was also reported (Hendre & Aggarwal, 2007). This map of 1234 cM was constructed using 374 markers covering 16 linkage groups. Its expected future use was the identification of QTLs for drought tolerance. Continuing the genetic studies of Coffea species, Priolli et al. (2008) created an interspecific genetic map from C. arabica and C. canephora comprising 1011 cM of map length with a moderate marker density (5.9 cM).
The next genetic map was derived from C. arabica by Priolli et al. (2008). This genetic map was the second for arabica coffee but the first saturated map for leaf rust resistance. A map of 137.4 cM with a low marker distance (9.4 cM) was constructed using AFLP markers complemented with SCAR markers. With respect to plant disease, the higher genotyping efficiency of SCARs allowed them to amplify DNA fragments related to the resistance and susceptibility of C. arabica to leaf rust. The aim of the study was therefore to introduce DNA markers for marker-assisted selection for resistance to Hemileia vastatrix, the pathogen causing leaf rust.
The most complete high-density genetic map was constructed in 2015 by the International Coffee Genomics Network (ICGN) in collaboration with IRD/CIRAD, the Nestlé R&D Centre and the Indonesian Coffee and Cocoa Research Institute (ICCRI). This map improved on the work of Crouzillat et al. (2004) and Lefebvre-Pautigny et al. (2010). As reported at the ICGN Meeting, final marker development proceeded in three stages and used thousands of molecular markers (ICGN, 2015). To date, this latest map serves as the international reference genetic map for genomic studies in Coffea sp.
The use of genetic maps in gene characterization for Coffea sp.
The use of genetic maps in gene characterization for Coffea sp. is well documented. Mahesh et al. (2006) used the genetic map generated by Ky et al. (2000) to locate the gene CcPAL1, encoding phenylalanine ammonia-lyase, within the genome. This gene was considered important because of its role in the accumulation of caffeoyl quinic acids, which contribute to the taste of coffee. Using a similar approach, Lepelley et al. (2012) used the genetic map of Lefebvre-Pautigny et al. (2010) to map three PAL genes (CcPAL1, CcPAL2 and CcPAL3) in C. canephora. The authors demonstrated differential expression of the CcPAL genes in the several tissues tested. Genetic mapping positioned the genes in three different coffee linkage groups, whereas one gene, CcPAL2, was considered to have an ancestral role in the evolution of the PAL gene family in Coffea.

The use of genetic maps in comparative analysis for Coffea sp.
The availability of genus-wide genetic maps has become a necessity for the effective advance of genomic undertakings (Brondani et al., 2006). Comparative mapping reveals a high degree of colinearity within species as well as between closely related species, which allows the transfer of markers between these maps. Using transcriptomic data from 39 genotypes of C. canephora, C. arabica, and other Coffea sp., Aggarwal et al. (2007) [...] et al., 2010). Using the same population and genetic maps, Guyot et al. (2012) demonstrated macrosynteny between coffee, tomato and grapevine. The coffee and tomato genomes share 318 orthologous markers, while coffee and grapevine share 299 markers. Comparative analysis was also useful for estimating the divergence between diploid and tetraploid genomes in Coffea species (Cenci et al., 2012).
Recent NGS technology has allowed the sequencing of three Coffea genomes (C. canephora, C. eugenioides and C. arabica) (ICGN, 2015; Dereeper et al., 2015; Denoeud et al., 2014). These genomic databases, alongside the transcriptomic ones, can make a significant contribution to the improvement and validation of genetic mapping in Coffea. All these available data are important for establishing Coffea as a model for studies of trait evolution, speciation and domestication in relation to adaptation to climate change (ICGN, 2015).

Conclusion and Remarks
Genetic mapping is an important step in genetic analysis for plant improvement programs. The essence of all genome mapping is to place a collection of molecular markers onto their respective positions on the genome. The next step consists of identifying QTLs, which can be related to adjacent genes associated with traits of interest. To date, thirteen genetic maps have been published for Coffea sp., creating a large database for a genetic framework. The latest high-density genetic map was generated by the International Coffee Genomics Network (ICGN) in collaboration with IRD/CIRAD, the Nestlé R&D Centre and the Indonesian Coffee and Cocoa Research Institute (ICCRI), and is considered the reference genetic map. Coffea genetic maps have been applied to purposes ranging from gene characterization to comparative genomic analysis with other plant species. Nowadays, RNA sequencing allows scientists to construct genetic maps faster and more accurately. Moreover, the feasibility of NGS for genome sequencing allows the validation of genetic maps for the prediction of QTLs and adjacent genes related to important traits in Coffea sp. On a larger scale, all available data are crucial for future coffee improvement strategies and also provide basic knowledge for studying the evolution of the euasterids.

Bernatzky R & SD Tanksley (1986)