The utility of the chloroplast genome in verifying food authenticity: a case study looking at Ecuadorian fine/flavour cocoa

Fine flavours

Chocolate and other cocoa-based products have long been popular foodstuffs, associated as they are with pleasure and enjoyment of the finer things in life. If we think for a minute about the product design, the choice of packaging – and also the price – of some dark chocolate brands, we can see how these are promoted as premium products, just as is the case for other foods such as wine or coffee. Accordingly, consumers often find that cocoa-based products include details of the place of origin for the beans used, as a sign of the product's quality. Since consumer demand for these kinds of foods continues to rise, production of fine/flavour (i.e. premium) cocoa varieties (currently about 5% of total cocoa production) must also increase to match this demand.

The popularity of fine/flavour cocoa products continues to rise

Arriba is probably the most important fine/flavour cocoa variety. Making up around half of total fine/flavour cocoa production, the variety is cultivated primarily in Ecuador and is known for its unique and full-bodied flavour. Alongside Arriba cocoa, another variety has also been cultivated in Ecuador since the 1970s – the bulk grade cocoa CCN 51 [1, 2]. A clonal variety that offers high yields and is more robust and resistant, CCN 51 nonetheless fails to match the flavour of the more fragile Arriba [3, 4]. Since cultivation of CCN 51 carries less risk of a poor harvest, this variety is grown preferentially by Ecuadorian cocoa farmers. Yet increased production of bulk cocoa is a trend that runs counter to the growing consumer demand for fine/flavour beans.

It's reasonable to assume, therefore, that the two Ecuadorian cocoa varieties become mixed together, ultimately resulting in a poorer-quality end product. This intermixing could be deliberate, so as to boost revenue by selling beans that only appear to be fine/flavour cocoa. Yet lower-grade beans could also become mixed with the premium beans inadvertently, since both varieties are often cultivated and processed alongside one another. For both of these reasons, there is an abiding interest in identifying a method capable of distinguishing between the two cocoa varieties (fig. 1).

Fig. 1 Mature cocoa pods from CCN 51 (left) and Arriba (right). While these pods are visually distinct, this is no longer true of the processed cocoa products.

DNA: a biomarker candidate

Of all the biomarkers that are available within a plant cell, DNA – unlike proteins or metabolic end products, for example – has the unusual property of being specific to an individual organism. DNA is also unchanging, i.e. it remains impervious to external factors for the period of time that is crucial for food analysis work. Alongside the nuclear genome, each plant cell also has a chloroplast genome (i.e. plastid DNA). Since CCN 51 and Arriba are very closely related, searching for sequence differences in plastid DNA rather than nuclear DNA is the more viable approach, since, according to [5], chloroplast genomes exhibit a greater level of variability than the nuclear genome.

The first task is to carry out comprehensive sequencing work. Ordinarily, sequencing the complete chloroplast genome would first require its isolation from the nuclear genome. Happily, Kane et al. (2012) have demonstrated how to avoid the time-consuming steps of separating the two genomes. Each cell holds a single copy of its nuclear DNA. The same cell also contains about three to ten chloroplasts, each of which possesses around 70 copies of its DNA. While the chloroplast genome is about 2,000 times smaller than the nuclear genome, it is therefore far more numerous within each cell. This fact is exploited by the method referred to as “low coverage whole genome shotgun sequencing”, as developed by Kane et al. Because sequencing coverage is kept low, it is chiefly the high-copy fractions that are recorded. Even with nuclear DNA present, this method can therefore be used to sequence the plastid DNA [6].
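The copy-number argument can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above (one nuclear genome copy, three to ten chloroplasts per cell, around 70 genome copies per chloroplast, a size ratio of about 2,000). The nuclear coverage of 0.1x is an assumed value chosen purely for illustration, and the function name is hypothetical.

```python
def plastid_stats(chloroplasts_per_cell, copies_per_chloroplast=70,
                  size_ratio=2000, nuclear_coverage=0.1):
    """Estimate plastid copy number, read share and coverage per cell.

    size_ratio is nuclear genome size / chloroplast genome size.
    nuclear_coverage is an assumed, illustrative sequencing depth.
    """
    copies = chloroplasts_per_cell * copies_per_chloroplast
    # Plastid DNA per cell, in units of one nuclear genome.
    plastid_mass = copies / size_ratio
    # Share of total DNA (and so, roughly, of shotgun reads).
    read_share = plastid_mass / (1 + plastid_mass)
    # Each plastid copy is sampled at the same depth as the nuclear genome.
    plastid_coverage = nuclear_coverage * copies
    return copies, read_share, plastid_coverage


for n in (3, 10):
    copies, share, coverage = plastid_stats(n)
    print(f"{n} chloroplasts: {copies} copies, "
          f"{share:.1%} of DNA, about {coverage:.0f}x plastid coverage")
```

Under these assumptions, plastid DNA makes up very roughly a tenth to a quarter of the total DNA, and a run that covers the nuclear genome only 0.1 times still covers the chloroplast genome some 21 to 70 times.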

SNPs in the chloroplast genome

A comparison of the chloroplast genomes of the two cocoa varieties CCN 51 and Arriba shows, as expected, only minimal deviation (about 0.03%). This deviation merely involves single base substitutions (single nucleotide polymorphisms, SNPs), which makes differentiation by straightforward PCR (polymerase chain reaction) more difficult. Since a few of these SNPs are located in the recognition sequence of a restriction enzyme, however, they can still be utilised as a means of cocoa variety differentiation [7]. The first step is to conduct variety-unspecific PCR, whereby the primers flank the SNP at a distance. For both cocoa varieties, this generates a PCR product that, in a subsequent stage termed “restriction fragment length polymorphism” (RFLP), is then digested by a restriction enzyme. Since restriction enzymes are highly specific endonucleases, they cut only a PCR product that contains the exact recognition sequence. In a final stage, detection is performed using AGE (agarose gel electrophoresis), CGE (capillary gel electrophoresis) or dHPLC (denaturing HPLC). If digestion is successful, i.e. the recognition sequence is intact, then two smaller fragments will be observed. If the amplicon does not contain the exact sequence, digestion does not occur and the undigested PCR product is detected instead of the two fragments.

In accordance with the stated objectives of the project AiF/FEI 16796N in relation to the mixing of Ecuadorian fine/flavour Arriba cocoa with the lower-cost bulk CCN 51 cocoa, the PCR-RFLP procedure can be designed to enable the detection of the CCN 51 variety in the final stage following enzymatic digestion. Accordingly, the presence of CCN 51 will result in two fragments, since the bulk cocoa sequence contains the correct recognition sequence for the restriction enzyme (fig. 2).

Fig. 2 Detection following PCR-RFLP via AGE and CGE. Following PCR, PCR products are obtained (1 and 2, before restriction, AGE). Amplicon 1 contains the recognition sequence of the restriction enzyme deployed and is digested: two fragments of around 170 bp and 260 bp are detected (1, after restriction, AGE and CGE). Amplicon 2 does not contain the recognition sequence, due to a SNP, and is not digested: instead of two fragments, the unfragmented PCR product is once again detected at around 420 bp even after restriction (2, after restriction, AGE and CGE). BE = blank reading.
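The digestion step of PCR-RFLP can be sketched in silico. The article does not name the restriction enzyme or give the amplicon sequences, so everything concrete here is an assumption made for illustration: the recognition site GAATTC (cut after the first base), the synthetic flanking sequences, and the fragment lengths, which were merely chosen to resemble those in fig. 2.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at each occurrence of site.

    cut_offset is the cut position within the site, counted from its
    first base (1 means the enzyme cuts after the first base).
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


# Hypothetical recognition site and synthetic flanks (not real cocoa data).
SITE = "GAATTC"
left, right = "AC" * 84 + "A", "CA" * 127 + "C"
amplicon_with_site = left + SITE + right      # stands in for CCN 51
amplicon_with_snp = left + "GAATCC" + right   # one SNP destroys the site

print(digest(amplicon_with_site, SITE, 1))  # [170, 260]
print(digest(amplicon_with_snp, SITE, 1))   # [430]
```

A single substituted base inside the site is enough to leave the second amplicon uncut, which is the effect the gel or capillary readout makes visible.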

A further deviation in the chloroplast genome sequence is found in the IRR (inverted repeat region), which was partially sequenced using the 27 primer pairs of Dhingra and Folta (2005) [8]. The IRR is a sequence whose reverse complement also occurs elsewhere in the same genome. In the chloroplast genome under consideration, it contains the five-base repeat (TAAAG)n, whose number of repetitions differs between the two varieties (fig. 3).

Since the two varieties differ here not in the sequence itself but only in the number of repeats, the problem cannot be resolved by designing a PCR method with variety-specific primers. The difference in repeat number can nonetheless be used for a detection method. For both cocoa varieties, PCR products were generated with primers that flank the repeat in the IRR at a distance. In a subsequent step, the PCR products obtained were detected using AGE, CGE or dHPLC. This revealed variety-specific amplicons that differed from one another by 40 bp, since the five-base sequence is repeated 6 times in one variety, (TAAAG)6, and 14 times in the other, (TAAAG)14, i.e. 8 additional repeats of 5 bp each (fig. 4).

Fig. 3 Section of alignment following sequencing of the IRR. The repeat (TAAAG)n occurs 6x in one cocoa variety and 14x in the other variety.

Fig. 4 Detection of the PCR product following a variety-unspecific PCR, in which the primers flank the (TAAAG)n repeat. Since the repetition frequency of the repeat differs between the varieties (1:(TAAAG)14 and 2:(TAAAG)6), PCR products are obtained that differ by 40 bp (1: roughly 150 bp and 2: roughly 110 bp).
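The 40 bp size difference follows directly from the repeat counts, as the short sketch below shows. The flanking sequence is hypothetical (40 bp on each side, chosen so that the amplicon lengths land near the roughly 110 bp and 150 bp of fig. 4), and the function name is an illustrative choice rather than part of the published method.

```python
import re


def max_tandem_repeats(seq, unit="TAAAG"):
    """Return the longest run of consecutive copies of unit in seq."""
    runs = re.finditer(f"(?:{re.escape(unit)})+", seq)
    return max((len(m.group(0)) // len(unit) for m in runs), default=0)


FLANK = "ACGT" * 10  # hypothetical 40 bp flank on each side
amplicon_short = FLANK + "TAAAG" * 6 + FLANK
amplicon_long = FLANK + "TAAAG" * 14 + FLANK

for name, seq in (("short", amplicon_short), ("long", amplicon_long)):
    print(name, max_tandem_repeats(seq), "repeats,", len(seq), "bp")

# 8 extra repeats of 5 bp each account for the size difference.
print(len(amplicon_long) - len(amplicon_short), "bp difference")
```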

Alternative methods

The sequence differences between CCN 51 and Arriba are restricted to SNPs and a variation in repeat number in the chloroplast genome. By using these deviations as a starting point, it was possible to design PCR methods capable of detecting an intermingling of fine/flavour cocoa with bulk cocoa. Alternatively, other methods could be developed on the basis of the sequence differences mentioned.

If a SNP is present, then LPA (ligation-dependent probe amplification) can be utilised. Two probes hybridise immediately adjacent to one another on a target sequence. The next stage involves the covalent ligation of these two probes by a highly specific ligase. Ligation can only take place, however, if both probes have hybridised fully to the target sequence, which does not happen where a SNP creates a mismatch. PCR then follows, for which the probes already carry the binding sites of the necessary primers. A PCR product is generated only from ligated probes, however: if a SNP is present, then neither ligation nor PCR occurs, and no PCR product can be detected [9].
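The ligation logic can be illustrated with a deliberately simplified model: the two half-probes are treated as ligatable only if, joined together, they pair with the target without any mismatch. The target sequence, the probe region and the SNP position below are all hypothetical, and real ligases are mainly sensitive to mismatches close to the ligation junction, which this sketch does not model.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def can_ligate(target, left_probe, right_probe):
    """True if both half-probes anneal perfectly and directly adjacent.

    Simplification: any mismatch anywhere in either probe blocks ligation.
    """
    return reverse_complement(left_probe + right_probe) in target


# Hypothetical target; the probes are designed against the middle region.
region = "CCGTAGCTAGGC"
target_match = "TTGA" + region + "TAAC"
# Same target with a single base substituted next to the ligation junction.
target_snp = target_match[:9] + "A" + target_match[10:]

joined = reverse_complement(region)
left_probe, right_probe = joined[:6], joined[6:]

print(can_ligate(target_match, left_probe, right_probe))  # True
print(can_ligate(target_snp, left_probe, right_probe))    # False
```

Only the first case would go on to yield a PCR product, so the absence of a product signals the SNP.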

Work on differentiation could also utilise a comparison of microsatellites. Microsatellites are short, non-coding sequence motifs repeated in tandem at a given locus, as already described for the repeat in the IRR of the cocoa chloroplast genome. As a rule, the number of repetitions varies. Microsatellites also occur in nuclear DNA [10, 11, 12]. Analysing PCR products obtained from the nuclear genomes of CCN 51 and Arriba, in accordance with the published research, could therefore also allow corresponding conclusions to be drawn about the varieties.


Verifying the authenticity of plant-derived foodstuffs constitutes an interesting challenge within the field of food analysis. Further innovations and techniques can be expected in sequencing. Since sequencing is already less time-consuming and costly than it was some years ago, it has become possible to sequence entire genomes in a short space of time. This has been accompanied by rapid advances in method development, which allow the problem of foodstuff authenticity to be viewed in an entirely new light. The procedure sketched out in relation to cocoa is merely one example: we may assume that a similar strategy can be adopted for a wide range of other raw materials and thus, in turn, for foodstuffs.


[1] Lieberei, R. (2006) Relations Gesellschaft für Kommunikation GmbH 2, 6–12
[2] Kakaoverein, Geschäftsbericht 2011/12, Hamburg
[3] Stern, J. G. (2011) The History of CCN-51 in Ecuador, www.jeffreygstern.com
[4] Turnbull, C. J. & Hadley, P. (2013) CCN 51, www.icgd.rdg.ac.uk
[5] Jansen, R. K. et al. (2010) Mol. Biol. Evol. 28, 835–847
[6] Kane, N. et al. (2012) Am. J. Botany 99 (2), 320–329
[7] Motamayor, J. C. et al. (2002) Heredity 89, 380–386
[8] Dhingra, A. & Folta, K. M. (2005) BMC Genomics 6, 176
[9] MRC-Holland MLPA (2012), Multiplex Ligation-dependent Probe Amplification (MLPA)
[10] Lanaud, C. et al. (1999) Molecular Ecology 8, 2141–2152
[11] Saunders, J. A. (2004) Theor. Appl. Genet. 110, 41–47
[12] Smulders, M. J. M. (2012) AgSci, Plant Science, INGENIC 12, 1–13

Photo: © panthermedia.net|Viet Doan

L&M int. 2 / 2014
