Gentron Research Unit, Arenales 1457 – 2° Piso, Buenos Aires C1061AAO, Argentina

Gene Therapy Laboratory, Leloir Institute, CONICET, University of Buenos Aires, Patricias Argentinas 435, Buenos Aires C1405BWE, Argentina

Neuroimmunomodulation and Gene Therapy Laboratory, Leloir Institute, CONICET, University of Buenos Aires, Patricias Argentinas 435, Buenos Aires C1405BWE, Argentina

Joint Centers for Systems Biology, Columbia University, 1130 St Nicholas Avenue, New York, NY 10032, USA

Abstract

Background

Reverse transcription followed by real-time PCR is widely used for quantification of specific mRNA, and with the use of double-stranded DNA binding dyes it is becoming a standard for microarray data validation. Despite the kinetic information generated by real-time PCR, most popular analysis methods assume constant amplification efficiency among samples, introducing strong biases when amplification efficiencies are not the same.

Results

We present here a new mathematical model based on the classic exponential description of the PCR, but modeling amplification efficiency as a sigmoidal function of the product yield. The model was validated with experimental results and used to develop a new method for real-time PCR data analysis. This model-based method showed the best accuracy and precision compared with previous methods when used for quantification of

Conclusion

The presented method showed the best accuracy and precision. Moreover, it does not depend on calibration curves, making it ideal for fully automated high-throughput applications.

Background

The reverse transcription polymerase chain reaction (RT-PCR) is the most sensitive method for the detection of specific mRNAs

Most currently used real-time PCR data analysis methods are based on determining the threshold cycle (

Results and discussion

The model

According to its discrete nature, the PCR process can be expressed by the difference equation,

T_{n+1} = T_{n}·(1 + E_{n}); E_{n} ∈ (0,1) (1)

where _{n }is the PCR product yield at cycle ^{-15 }and p < 10^{-4}, Wilcoxon paired test, respectively). Thus, the following two parameters sigmoid expression for
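As an illustration of the difference equation (1), the recursion can be iterated numerically. The following Python sketch is not the authors' R implementation. It writes T for the product yield and E for the amplification efficiency, and `example_efficiency` is a hypothetical sigmoid decline of efficiency with product yield, chosen only for demonstration; it is not the expression of Eq. (2), and all parameter values are made up.

```python
import math

def simulate_pcr(t0, efficiency, cycles):
    """Iterate T[n+1] = T[n] * (1 + E[n]), with E[n] = efficiency(T[n])."""
    yields = [t0]
    for _ in range(cycles):
        t = yields[-1]
        yields.append(t * (1.0 + efficiency(t)))
    return yields

def effective_efficiency(yields):
    """Recover E[n] = T[n+1] / T[n] - 1 from consecutive product yields."""
    return [b / a - 1.0 for a, b in zip(yields, yields[1:])]

def example_efficiency(t, e_max=0.9, t_half=50.0, steepness=10.0):
    # HYPOTHETICAL sigmoid in the product yield t (assumption, not Eq. (2)):
    # close to e_max for small yields, approaching 0 at the plateau.
    return e_max / (1.0 + math.exp((t - t_half) / steepness))

# Illustrative run: 40 cycles starting from T0 = 0.001 arbitrary units.
curve = simulate_pcr(t0=0.001, efficiency=example_efficiency, cycles=40)
eff = effective_efficiency(curve)
```

Passing the efficiency as a function keeps the recursion separate from any particular efficiency model, so a different expression can be substituted without touching `simulate_pcr`.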

**Models for PCR amplification efficiency**. The effective amplification efficiency for each PCR cycle was calculated as T_{n+1}/T_{n} – 1, where T_{n} and T_{n+1} were the PCR product yield at cycles n and n+1 ^{2}), corrected Akaike's Information Criterion (AIC) and the best fit value for

where

The intrinsic amplification efficiency _{0 }estimation (see below), and can be obtained from Eq. (2) as,

Model based estimation of the initial template amount (T0)

In dsDNA binding dye protocols, real-time thermocyclers generate fluorescence intensity data. For most applications in which the dsDNA binding dye is in great excess, it can be assumed that the fluorescence intensity is proportional to the amount of double stranded DNA (product yield). Thus, the product yield (T_{n}) will be expressed in arbitrary fluorescent units, as will the initial template amount (T_{0}). Since the fluorescent dye will be in great excess when compared to the initial template amount, we can assume that T_{0} is proportional to the initial template amount, and so the semiquantitative comparison between different samples using T_{0} is valid.

The simplest way to estimate T_{0} is assuming that amplification efficiency at cycle CT (E_{CT}) is very similar to

T_{CT} = T_{0}·(1 + E)^{CT} (4)

from which,

T_{0} = T_{CT}·(1 + E)^{-CT} (5)
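The threshold-based estimate of Eq. (5) can be sketched in a few lines of Python. Here T0 is the initial template amount, T_CT the product yield at the threshold cycle CT, and E a constant amplification efficiency. The numbers are illustrative only and are not taken from the experimental data.

```python
def template_from_threshold(t_ct, ct, efficiency):
    """Eq. (5): T0 = T_CT * (1 + E) ** (-CT), with a constant efficiency E."""
    return t_ct * (1.0 + efficiency) ** (-ct)

# Illustrative values (assumptions): threshold yield 10, CT = 20, E = 0.9.
t0 = template_from_threshold(t_ct=10.0, ct=20.0, efficiency=0.9)

# Round trip through Eq. (4): T_CT = T0 * (1 + E) ** CT.
t_ct_back = t0 * (1.0 + 0.9) ** 20.0

# Sensitivity to the efficiency value: the same curve analysed with
# E = 0.85 instead of E = 0.90 gives a T0 larger by (1.90 / 1.85) ** 20.
ratio = template_from_threshold(10.0, 20.0, 0.85) / t0
```

In this example a difference of 0.05 in the assumed efficiency changes the estimate by a factor of roughly 1.7, which shows why errors in the efficiency propagate strongly at large CT values.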

T_{0} is calculated assuming T_{0} estimation. These errors can be minimized using _{i }estimation accuracy. Replicates must be done by splitting a master-mix containing all components of the PCR to minimize the variability introduced by the operator.

To test this method, we amplified three serial dilutions of cDNA from mouse midbrain with β-actin specific primers. Pooled data from triplicate experiments were used for _{0 }from this data were accurate and precise for _{2}-microglobulin specific primers, but using different amounts of Taq DNA polymerase (Fig.

**Effect of CT on the initial template amount estimation**. (A) Product yield vs. cycle number for the amplification of three serial dilutions (0.1, 1 and 10) of cDNA from mouse midbrain with β-actin specific primers performed in triplicate. Horizontal lines show the values for the product yield at which _{0 }determination and _{0 }estimated for dilution 1 at the smallest _{0 }was calculated assuming constant amplification efficiency and using different amounts of PCR product for the estimation of _{2 }microglobulin specific primers using 0.1 and 0.25 units of Taq DNA polymerase. Horizontal lines show the values for the product yield at which

We formulated an alternative more robust method for _{0 }estimation that does not rely on product threshold determinations. This alternative method is based on the fit of Eq. (1) and (2) to real-time PCR data by non-linear regression to obtain the best-fit estimators for the parameters _{0}. This approach does not assume constant amplification efficiency and estimates _{0 }from all the data points that fall into the exponential-linear growth phase instead of using only the product yield at cycle _{0 }estimations were precise and accurate (Table _{0 }and both model 3 parameters _{0 }(see standard error of _{0 }in Table _{0}. _{0 }estimations by MoBPA were as precise and accurate as using the CT method for small PCR product yield-derived CT values (Fig.

Estimation of model parameters.

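| Dilution | (A) T_{0} | (A) SE | (A) Mean ± SEM | (A) Correlation T_{0} × | (A) Correlation T_{0} × | (B) T_{0} | (B) SE | (B) Mean ± SEM |
|---|---|---|---|---|---|---|---|---|
| 10 | 8.52 | 2.27 | 8.63 ± 0.98 | 0.963 | -0.97 | 12.4 | 0.29 | 12.17 ± 0.52 |
| 10 | 7 | 2 | | 0.963 | -0.977 | 11.2 | 0.34 | |
| 10 | 10.4 | 1.95 | | 0.963 | -0.969 | 12.9 | 0.36 | |
| 1 | 0.42 | 0.06 | 1 ± 0.32 | 0.967 | -0.962 | 0.981 | 0.018 | 1 ± 0.12 |
| 1 | 1.08 | 0.56 | | 0.964 | -0.948 | 0.804 | 0.024 | |
| 1 | 1.51 | 0.74 | | 0.965 | -0.977 | 1.21 | 0.043 | |
| 0.1 | 0.09 | 0.04 | 0.073 ± 0.0087 | 0.968 | -0.965 | 0.121 | 0.0032 | 0.12 ± 0.0028 |
| 0.1 | 0.065 | 0.015 | | 0.969 | -0.974 | 0.111 | 0.0026 | |
| 0.1 | 0.063 | 0.023 | | 0.968 | -0.921 | 0.117 | 0.0028 | |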

Estimation of _{0}, _{0 }by fitting Eq. (1) and (2) to experimental data using _{0}, the asymptotic estimation of the standard error (SE), and the _{0 }mean value ± standard error of the mean (SEM). For (A), the correlation between _{0 }and the other parameters estimated by non-linear regression is also shown.

Comparison of MoBPA with previous methods

Next, we evaluated the performance of different methods for quantification when amplification efficiency for different samples is not the same. For this, we analysed two different datasets: 1) _{2}-microglobulin specific primers, but with different amounts of Taq DNA polymerase (Fig.

Simulation data was generated β_{2}-microglobulin specific primers were used to estimate plausible values for the simulation parameters. The analysis of these simulation results must be interpreted with care, because we used the same model for generating and fitting the data, so some overfitting is expected. However, we also evaluated the performance of the different methods on real PCR data. To obtain different amplification efficiencies in our PCR runs, we used two different amounts of DNA polymerase. Efficiencies estimated with our approach from triplicate results were 0.855 and 0.915 for 0.1 and 0.25 units of Taq DNA polymerase, respectively. The use of more than 0.25 units of DNA polymerase did not lead to additional increments in amplification efficiency, while PCR with less than 0.1 units of DNA polymerase showed no product. We also tried to partially reduce the PCR amplification efficiency by adding Mg^{2+} chelating agents like EDTA, by increasing the amount of dNTPs, by lowering the amount of Mg^{2+}, and by adding known DNA polymerase inhibitors like phenol. However, we could not find appropriate conditions to achieve partial inhibition of the DNA polymerase in a reproducible way.

Both

where T_{0} is the initial template amount, and CT_{i} the threshold cycle for each sample. The simplest form of threshold-based methods even assumes an amplification efficiency equal to 1. Thus, the ratio of initial template amounts between two samples will be 2^{ΔCT }^{-1/slope}-1. However, sample contamination with salt, phenol, chloroform, etc. may result in a lower-than-expected PCR efficiency
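The threshold-based relations discussed above can be sketched as follows. This is a minimal Python illustration using the standard textbook forms, assuming a common product threshold for both samples and a dilution-curve slope defined as CT regressed on log10 of the template amount. The function names are hypothetical.

```python
import math

def efficiency_from_slope(slope):
    """E = 10 ** (-1 / slope) - 1, with slope of CT versus log10(template)."""
    return 10.0 ** (-1.0 / slope) - 1.0

def ratio_from_ct(ct_a, ct_b, efficiency=1.0):
    """Ratio T0_a / T0_b = (1 + E) ** (CT_b - CT_a) at a shared threshold.

    With the default efficiency of 1 this reduces to 2 ** (CT_b - CT_a).
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)

# A perfectly efficient reaction has a slope of -1 / log10(2), about -3.32.
ideal = efficiency_from_slope(-1.0 / math.log10(2.0))

# Sample A crosses the threshold three cycles earlier than sample B.
fold_ideal = ratio_from_ct(20.0, 23.0)                  # assumes E = 1
fold_real = ratio_from_ct(20.0, 23.0, efficiency=0.9)   # assumes E = 0.9
```

Comparing `fold_ideal` with `fold_real` shows how an overestimated efficiency inflates the apparent ratio between two samples.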

Analysis of _{0 }estimation from replicated samples was very precise (mean CV: 6%, range: 0.75–12%), possibly because these approaches are only affected by errors in

**Effect of amplification efficiency over the quantifications performed by different methods**. (A) _{0 }= 0.001 and different intrinsic amplification efficiencies (_{0 }estimations from each simulated reaction and efficiency 0.8 ones vs. the amplification efficiency bias as mean ± SEM of triplicates. (B) Analysis of experimental results by different methods (see below). Bars represent the error of quantifications as mean ± SEM for triplicates. Bars marked with (*) are under-estimations, conversely, the rest of the bars are over-estimations. Method 6 under-estimated _{0 }by 737%, note that it is out of scale in the graph. Data was analysed with the

Four different methods for single reaction amplification efficiency estimation have been proposed. All of them use the kinetic data generated by real-time PCR cyclers; they do not assume equivalent amplification efficiency among samples and have the additional advantage that no dilution curve is needed. Three of them assume constant amplification efficiency during the exponential phase of the PCR and estimate it from the few data points that fall into this phase (T_{1}/T_{2})^{1/(CT1-CT2)} - 1 T_{0} determinations depend on both efficiency and
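The two-threshold expression quoted above, (T1/T2)^(1/(CT1-CT2)) - 1, can be checked with a short Python sketch. It assumes that T1 and T2 are the product yields at two thresholds placed within the exponential phase and that CT1 and CT2 are the corresponding threshold cycles. The synthetic values below are generated with a known constant efficiency and are not experimental data.

```python
def efficiency_two_thresholds(t1, ct1, t2, ct2):
    """E = (T1 / T2) ** (1 / (CT1 - CT2)) - 1 for a single reaction."""
    return (t1 / t2) ** (1.0 / (ct1 - ct2)) - 1.0

# Synthetic exponential phase with T0 = 0.001 and E = 0.9 (assumed values).
true_e = 0.9
t1 = 0.001 * (1.0 + true_e) ** 18.0   # yield at CT1 = 18
t2 = 0.001 * (1.0 + true_e) ** 22.0   # yield at CT2 = 22

estimated = efficiency_two_thresholds(t1, 18.0, t2, 22.0)
```

On noise-free exponential data the estimate matches the true efficiency. With real curves only a few points lie in the exponential phase, so the estimate is sensitive to noise in both thresholds.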

Recently, Liu et al. proposed a three-parameter sigmoidal function for modelling the whole kinetic process of real-time PCR. Then, the initial template amount is estimated after fitting the sigmoidal model to background-subtracted real-time PCR results _{0 }estimation, leading to unreliable quantifications (Fig.

Finally, quantification from the same

To test the reliability of the different methods in conditions of similar amplification efficiency between samples, we amplified three dilutions of mouse midbrain cDNA with β-actin specific primers. Among threshold-based methods, the use of amplification efficiency estimated for each experiment from the dilution series

Analysis of real-time PCR results with similar amplification efficiency among samples. Data represent the quantification of dilutions 0.1 and 10 as the mean ± SEM for 12 experiments performed in triplicate.

| Dilution | Method 1 | Method 2 | Method 3 | Method 4 | Method 5 | Method 6 | Method 7 |
|---|---|---|---|---|---|---|---|
| 0.1 | 0.084 ± 0.003 | 0.11 ± 0.0033 | 0.57 ± 0.30 | 0.37 ± 0.19 | 0.75 ± 0.45 | 0.71 ± 0.63 | 0.12 ± 0.022 |
| 1 | 1 ± 0.019 | 1 ± 0.017 | 1 ± 0.11 | 1 ± 0.20 | 1 ± 0.12 | 1 ± 0.10 | 1 ± 0.018 |
| 10 | 14 ± 0.680 | 10 ± 0.35 | 242 ± 206 | 83 ± 52 | 15 ± 5.5 | 48 ± 1.0 | 10 ± 1.1 |

(1)

(2)

(3) Amplification efficiency estimated at two product yield thresholds [15].

(4) Amplification efficiency estimated with LinRegPCR software [13].

(5) Amplification efficiency estimated with the Tichopad et al. approach [16].

(6) Amplification efficiency estimated from the model proposed by Liu et al. [17].

(7) Our model based real-time PCR analysis method (MoBPA).

Our model assumes that the signal is proportional to the amount of product, which is often the case for SYBR-Green I real-time PCR performed with saturating concentrations of dye. In such conditions centrally symmetric amplification curves are expected. However, in TaqMan applications, where the Taq DNA polymerase digests a probe labelled with a fluorescent reporter and quencher dye, the signal diverges from the product resulting in non-symmetric amplification curves (Supplementary Fig.

supplementary figures

Conclusion

Following the emergence of functional genomic methodologies, the development of high-throughput methods for microarray-derived data validation is becoming indispensable. Nevertheless, high-throughput quantification by real-time PCR is difficult to achieve, primarily due to deficiencies of the threshold-based methodologies, which require reliable estimation of amplification efficiencies

Methods

RNA extraction and reverse transcription

Mice were killed by cervical dislocation and brains removed. Midbrains were dissected, snap frozen in liquid nitrogen and stored at -80°C until RNA extraction. Total RNA was isolated using TRIzol Reagent (Invitrogen, MD), genomic DNA contaminant was removed using DNAse I (Ambion, Inc., TX), and mRNA was purified by MicroPoly(A)Pure kit (Ambion, Inc., TX). First-strand complementary DNA was synthesized at 42°C by priming with oligo-dT_{12–18 }(Invitrogen, MD) and using SuperScriptII reverse transcriptase according to the protocol provided by the manufacturer (Invitrogen, MD).

Polymerase chain reaction

PCR amplifications were performed using an iCycler iQ Real-Time PCR Detection System (BioRad, CA). cDNA samples were assayed in triplicate. PCR reactions were performed in a final volume of 25 μl containing 1 μl of cDNA, 2.5 μl of the reaction buffer (200 mM Tris-HCl pH 8.4, 500 mM KCl), 3 mM MgCl_{2}, 0.3 mM of dNTP mix, 0.2 nM of each primer, 0.3 × SYBR-Green I (Molecular Probes, OR), 100 μg/ml BSA, 0.25 μl ROX Reference Dye (Invitrogen, MD), 1% glycerol, and 1.25 U of Taq Platinum Polymerase (Invitrogen, MD). The primer sequences used were: β2-microglobulin, sense: TGA CCG GCT TGT ATG CTA TC and antisense: CAG TGT GAG CCA GGA TAT AG; β-actin, sense: CAA TGT GGC TGA GGA CTT TG and antisense: ACA GAA GCA ATG CTG TCA CC. PCR was performed as follows: one initial cycle of 94°C for 2.5 min, followed by 40 cycles of 94°C for 30 sec, 58°C for 30 sec, and 72°C for 15 sec. TaqMan real-time PCRs were performed as the SYBR-Green I assays, but without SYBR-Green I and with the following primers and probes: β-actin sense: AGA AAA TCT GGC ACC ACA CC, antisense: CAG AGG CGT ACA GGG ATA GC, and probe: ACC GCG AGA AGA TGA CCC AGA TCA T; HPRT sense: AGA CTG AAG AGC TAT TGT AAT, antisense: CAG CAA GCT TGC GAC CTT GAC, and probe: TGC TTT CCT TGG TCA GGC AGT ATA.

Analysis of real-time PCR data

ROX base-line corrected real-time PCR results were analysed with different methods as described ^{2 }value. Then we inspected the fit for each PCR curve and manually corrected the windows of linearity when needed.

The implementation of our method comprises the following steps: 1) identification of ground, exponential, linear growth and plateau phases, 2) background subtraction, 3) effective amplification efficiency estimation and fitting of Eq. (2) to experimental data, 4) initial template amount calculation.

The ground phase was identified as described

Here,

The background level was calculated as the last ground phase data point, which in turn was estimated by a linear regression over the last five data points of this phase
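The background estimation described above can be sketched in Python as follows. This is an illustration rather than the authors' R code. It assumes that the end of the ground phase has already been identified and is passed in as `ground_end` (an exclusive index, a hypothetical parameter), and it uses an ordinary least-squares line over the last five ground-phase points.

```python
def background_level(fluorescence, ground_end, window=5):
    """Fitted value at the last ground-phase cycle.

    A least-squares line is fitted to the last `window` ground-phase
    readings, fluorescence[ground_end - window:ground_end], and evaluated
    at the final ground-phase cycle.
    """
    if ground_end < window:
        raise ValueError("ground phase is shorter than the regression window")
    start = ground_end - window
    xs = list(range(start, ground_end))
    ys = fluorescence[start:ground_end]
    x_mean = sum(xs) / window
    y_mean = sum(ys) / window
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean + slope * (xs[-1] - x_mean)

def subtract_background(fluorescence, ground_end, window=5):
    """Subtract the estimated background level from every reading."""
    level = background_level(fluorescence, ground_end, window)
    return [f - level for f in fluorescence]

# Made-up readings: a slowly drifting ground phase, then amplification.
readings = [100.0, 100.5, 101.0, 101.5, 102.0, 102.5, 110.0, 130.0]
level = background_level(readings, ground_end=6)
corrected = subtract_background(readings, ground_end=6)
```

Using the fitted value rather than the raw last reading smooths out measurement noise in the single point that sets the baseline.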

The effective amplification efficiency for each PCR cycle (E_{n}) was solved from Eq. (1) as E_{n} = T_{n+1}/T_{n} - 1 T_{0} was estimated by fitting our discrete model, shown below in R-code:

to background subtracted experimental data by the nls function of R-system, using the values for the parameters

Our method was implemented in the R-System. The source code and Windows binary of MoBPA package are available for non-commercial research use (see Additional files

Source files for the R-system package MoBPA.

Windows binary files for the R-system package MoBPA.

Quantification error and Akaike's Information Criterion

For calculating the quantification error, we defined a reference sample (

For comparing alternative models we used a corrected form of the Akaike's Information Criterion (

where
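The expression used for the corrected criterion did not survive in this text. As a hedged illustration only, the Python sketch below uses the common least-squares form of the corrected Akaike's Information Criterion, AICc = N·ln(SS/N) + 2K + 2K(K+1)/(N-K-1), where N is the number of data points, SS the residual sum of squares and K the number of estimated parameters. Whether K should also count the error variance is a convention that is assumed here, not taken from the paper.

```python
import math

def aicc(rss, n_points, n_params):
    """Corrected AIC for a least-squares fit (standard form, assumed).

    rss      -- residual sum of squares of the fitted model
    n_points -- number of data points used in the fit
    n_params -- number of estimated parameters (K)
    """
    k = n_params
    if n_points - k - 1 <= 0:
        raise ValueError("too few data points for the small-sample correction")
    aic = n_points * math.log(rss / n_points) + 2 * k
    return aic + 2 * k * (k + 1) / (n_points - k - 1)

# Made-up residuals: a 3-parameter and a 4-parameter model fitted to
# 20 points with the same residual sum of squares.
score_small = aicc(rss=2.0, n_points=20, n_params=3)
score_large = aicc(rss=2.0, n_points=20, n_params=4)
```

The model with the lower score is preferred; with equal residuals the extra parameter is penalised, so `score_small` is lower than `score_large`.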

Simulation data was generated β_{2}-microglobulin specific primers were used to estimate plausible values for the simulation parameters. In such a way, the initial template amount in fluorescent arbitrary units (T_{0}) was set to 10^{-3}, and the product amount at the PCR plateau phase (

Then, for _{n }= _{n }tends to zero, so we calculated approximated values for _{n }= _{n }= 0.001.

Zero mean, normally distributed random noise was added to the

Authors' contributions

MJA carried out the design of the study, participated in data analysis and drafted the manuscript. GJV carried out the real time PCR. MCS participated in data collection and analysis. OLP and FJP participated in the design of the study and critically revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank Dr. Andrea Califano for helpful comments on the manuscript. We especially thank Cynthia Serra and Daniela Celi for technical assistance. We are indebted to María Romina Girotti and Andrea Sabina Llera for providing their TaqMan PCR results.