Nuffield Department of Clinical Medicine, University of Oxford, The Peter Medawar Building for Pathogen Research, South Parks Road, Oxford, OX1 3SY, UK

Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA

Department of Statistics, University of Oxford, The Peter Medawar Building for Pathogen Research, South Parks Road, Oxford, OX1 3SY, UK

Department of Pediatrics, The Johns Hopkins Hospital, Baltimore, MD 21287, USA

Department of Pediatrics, Columbia University College of Physicians and Surgeons and Harlem Hospital Center, NY, USA

Department of Zoology, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK

Department of Computer Science, University of Auckland, Private Bag 92019, New Zealand

Abstract

Background

Genetic diversity of the human immunodeficiency virus type 1 (HIV-1) population within an individual is lost during transmission to a new host. The demography of transmission is an important determinant of evolutionary dynamics, particularly the relative impact of natural selection and genetic drift immediately following HIV-1 infection. Despite this, the magnitude of this population bottleneck is unclear.

Results

We use coalescent methods to quantify the bottleneck in a single case of homosexual transmission and find that over 99% of the

Conclusion

Assuming the bottleneck at transmission is selectively neutral, such a severe reduction in genetic diversity has important implications for adaptation in HIV-1, since beneficial mutations have a reduced chance of transmission.

Background

The size of the inoculum that initiates infection in HIV-1 is unknown, although the loss of diversity is thought to be substantial following both horizontal

In RNA viruses with a high deleterious mutation rate the majority of variants exhibit a replicative capacity lower than the mean

Conversely, natural selection may lower the susceptibility of HIV-1 to reductions of fitness associated with transmission. In acutely infected HIV-1 patients, the usually diverse envelope

Herein we estimate, using population genetic techniques, the proportion of genetic diversity that survives transmission in a single homosexual transmitter pair, with samples available before and after the transmission event. The demographic history of the virus population in both donor and recipient was reconstructed using coalescent methodology, allowing quantification of the diversity present close to the time of infection. The coalescent was implemented within a Bayesian framework, which enabled co-estimation of substitution and demographic parameters using serially sampled sequences

Through a comparison of different regions of the genome (namely

Results

To directly visualise the change in genetic diversity during horizontal HIV-1 transmission between the donor-recipient pair studied, we first inferred the phylogenetic relationships among their HIV-1 sequences using maximum likelihood methods. The phylogenies for

**Phylogenetic relationship of (a) env V1-V4 and (b) gag p24 sequences**. Maximum likelihood phylogenies depicting the relationship between sequences from donor and recipient, illustrating the reduction in genetic diversity at transmission. Horizontal branch lengths are drawn on a scale of nucleotide changes per site. Branches leading to recipient sequences are highlighted in red, with the day of sample collection relative to the first recipient sample (day 0) shown for each sequence.

To investigate the demographics of viral transmission in this transmitter pair more closely, four coalescent models were fitted to the sequence data. Crucially, samples were available both before and after the transmission event, allowing distinct demographic functions for donor and recipient HIV-1 populations (Equations 1 to 5), with the time of transition between them estimated from the data. In addition to a null model that constrains the population sizes of donor (N_{D}τ) and recipient (N_{R}τ) to be identical (so that there is no bottleneck at transmission), models with constant, exponential and logistic demographic functions for the recipient population were fitted. In all cases the donor population size was assumed to be constant.
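The piecewise structure described above (a constant donor population up to the time of transmission, then a constant, exponential or logistic recipient population) can be sketched as follows. This is a minimal illustration, not the paper's Equations 1 to 5: the function names, the forward-in-time day scale relative to day 0, the textbook logistic parameterisation and the growth rate of 0.5 per day are all assumptions made for the example. The population sizes are the posterior means listed first in the parameter table, used only as plausible inputs.

```python
import math


def recipient_size(model, days_since_transmission, size_at_transmission,
                   max_size=None, rate=None):
    """Recipient population size a given number of days after transmission.

    Hypothetical parameterisation: every model starts at
    `size_at_transmission` when days_since_transmission == 0.
    """
    s = days_since_transmission
    if model == "constant":
        return size_at_transmission
    if model == "exponential":
        return size_at_transmission * math.exp(rate * s)
    if model == "logistic":
        # Saturates at max_size as s grows.
        ratio = max_size / size_at_transmission
        return max_size / (1.0 + (ratio - 1.0) * math.exp(-rate * s))
    raise ValueError(f"unknown model: {model}")


def population_size(day, transmission_day, donor_size, model, **recipient_params):
    """Constant donor size before transmission, recipient model afterwards."""
    if day < transmission_day:
        return donor_size
    return recipient_size(model, day - transmission_day, **recipient_params)


# Illustrative inputs only; `rate` is a made-up value.
example = dict(transmission_day=-30.0, donor_size=1014.0, model="logistic",
               size_at_transmission=1.6, max_size=1216.7, rate=0.5)

before = population_size(-70.0, **example)           # donor sample day
at_transmission = population_size(-30.0, **example)  # bottleneck
late = population_size(237.0, **example)             # last recipient sample
print(before, at_transmission, late)
```

Under the null model the same sketch applies with `model="constant"` and `size_at_transmission` set equal to `donor_size`, so that the curve has no step at transmission.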

The relative Bayesian posterior scores for each demographic model are listed in Table

Fit of demographic models

| Recipient demographic model | Donor demographic model | Coalescent log-likelihood^{b} | AIC^{c} | ESS^{d} |
|---|---|---|---|---|
| **Constant** | **-**^{a} | -4155.385 | 8310.77 | 523.85 |
| **Constant** | **Constant** | -4144.993 | 8291.99 | 419.77 |
| **Exponential** | **Constant** | -4103.717 | 8211.43 | 643.10 |
| **Logistic** | **Constant** | -4090.154 | 8186.31 | 126.43 |
| | | | | |
| **Constant** | **-**^{a} | -3118.180 | 6236.36 | 483.84 |
| **Constant** | **Constant** | -3121.900 | 6245.80 | 440.34 |
| **Exponential** | **Constant** | -3116.760 | 6237.52 | 378.33 |
| **Logistic** | **Constant** | -3089.852 | 6185.70 | 202.69 |

^{a}Population size in Recipient constrained to be the same as that in Donor

^{b}Natural logarithm of the likelihood obtained from fitting the demographic model to the data

^{c}Akaike Information Criterion

^{d}Effective Sample Size (number of independent coalescent genealogies sampled from the posterior distribution)

**Reconstructed demographic profiles for (a) env V1-V4 and (b) gag p24**. Estimates of

To further test the extent of the transmission bottleneck, the demographic history of the population was reconstructed using the Bayesian skyline plot [see Methods,

The Bayesian skyline plot also justifies our use of the logistic-constant demographic model to estimate the diversity that survives during horizontal transmission of HIV-1. Using the logistic growth model (Equations 4 and 5) we were able to calculate diversity in the recipient _{R}_{trans}. We estimated _{trans }to be approximately 30 days prior to collection of the first recipient sample (day 0) for _{R}_{trans}) to be 1.6 for _{D}_{R}_{trans}) as a percentage ratio _{D}_{D}

Parameter estimates used to calculate the percentage diversity that survived transmission

| Parameter | Mean^{a} | HPD^{b} lower | HPD upper | ESS^{c} |
|---|---|---|---|---|
| N_{R}τ | 1216.7 | 534.4 | 2033.9 | 1338.94 |
| N_{D}τ | 1014.0 | 541.2 | 1538.0 | 955.71 |
| t_{trans}^{d} | 30.9 | 15.2 | 46.9 | 174.57 |
| N_{R}τ(t_{trans}) | 1.6 | 1.0 | 3.1 | 2456.87 |
| **δ** | 0.17 | 0.06 | 0.35 | 2043.08 |
| | | | | |
| N_{R}τ | 926.6 | 419.9 | 1512.6 | 1454.31 |
| N_{D}τ | 770.7 | 413.5 | 1184.7 | 1356.90 |
| t_{trans}^{d} | 42.4 | 27.5 | 53.0 | 274.51 |
| N_{R}τ(t_{trans}) | 2.0 | 1.0 | 4.5 | 3532.69 |
| **δ** | 0.29 | 0.07 | 0.67 | 3310.56 |

^{a}Mean of the marginal posterior probability distribution of parameter values

^{b}Highest Posterior Density interval encompassing 95% of the marginal posterior distribution of parameter values

^{c}Effective Sample Size (number of independent samples taken from the posterior distribution of values for a particular parameter)

^{d}Estimated time of transmission in days prior to the day of the first recipient sample (day 0)
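As a rough arithmetic check on the tabulated values, the sketch below recomputes the surviving diversity as a percentage ratio of the recipient size at transmission to the donor size. It assumes that δ is defined as 100 × N_{R}τ(t_{trans}) / N_{D}τ, which is inferred here from the phrase "percentage ratio" rather than taken from a stated formula; the function name is hypothetical. Ratios of posterior means give about 0.16% and 0.26%, close to but not identical to the tabulated means of δ (0.17 and 0.29), as expected because the mean of a ratio over posterior samples generally differs from the ratio of the means.

```python
def percent_surviving(recipient_size_at_transmission, donor_size):
    """Percentage of donor diversity present in the recipient at transmission.

    Assumed definition: 100 * N_R*tau(t_trans) / N_D*tau.
    """
    return 100.0 * recipient_size_at_transmission / donor_size


# Posterior means from the two blocks of the parameter table.
first_block = percent_surviving(1.6, 1014.0)
second_block = percent_surviving(2.0, 770.7)

for label, delta in (("first block", first_block), ("second block", second_block)):
    print(f"{label}: {delta:.2f}% survives, {100.0 - delta:.2f}% is lost")
```

Both values fall inside the 95% HPD intervals reported for δ and correspond to a loss of more than 99% of donor diversity.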

**Effective population size at transmission N _{R}τ(t_{trans})**. The marginal posterior probability density of

Importantly, if selection was acting on

We conclude that > 99% of genetic diversity in the donor viral population, in both

To generalise this result we next investigated diversity (

Estimates of viral diversity close to the time of transmission

| Patient | Best-fitting demographic model | Substitution rate^{a} | Nτ at most recent time point^{b} | Nτ close to transmission^{c}: mean^{d} | Nτ close to transmission^{c}: HPD^{e} upper |
|---|---|---|---|---|---|
| **Horizontal transmission** | | | | | |
| p1 | Logistic | 0.0123 | 2293 | 36.00 | 153.19 |
| p2 | Logistic | 0.0166 | 4441 | 27.98 | 148.32 |
| p3 | Logistic | 0.0175 | 1612 | 29.02 | 80.12 |
| p5 | Exponential | 0.0223 | 2439 | 287.78 | 670.67 |
| p6 | Logistic | 0.0195 | 1511 | 7.86 | 20.39 |
| p7 | Logistic | 0.0085 | 8632 | 253.78 | 1173.95 |
| p8 | Exponential | 0.0162 | 6003 | 1722.98 | 2911.65 |
| p9 | Logistic | 0.0071 | 7168 | 1283.11 | 3211.35 |
| p11 | Logistic | 0.0128 | 6505 | 9.76 | 34.48 |
| **Mean** | | 0.0148 | 4512 | 406.47 | 933.79 |
| **Vertical transmission** | | | | | |
| p1 | Logistic | 0.0201 | 4183 | 15.48 | 79.99 |
| p2 | Constant | 0.0560 | 275 | 275.46 | 511.97 |
| p3 | Constant | 0.0163 | 383 | 383.27 | 714.69 |
| p4 | Exponential | 0.0098 | 67696 | 1214.76 | 5810.63 |
| p5 | Exponential | 0.0133 | 7372 | 1360.46 | 3444.98 |
| p6 | Constant | 0.0251 | 183 | 181.98 | 323.10 |
| p7 | Exponential | 0.0226 | 1165 | 151.26 | 294.62 |
| p8 | Constant | 0.0145 | 521 | 521.84 | 857.01 |
| p9 | Exponential | 0.0120 | 2740 | 191.34 | 405.22 |
| p10 | Logistic | 0.0188 | 1050 | 384.47 | 877.21 |
| p11 | Logistic | 0.0163 | 740 | 410.67 | 926.82 |
| p12 | Exponential | 0.0206 | 1730 | 121.83 | 254.40 |
| p13 | Exponential | 0.0218 | 2603 | 81.27 | 173.89 |
| p14 | Exponential | 0.0164 | 1865 | 208.00 | 411.70 |
| p15 | Logistic | 0.0397 | 889 | 1.55 | 4.80 |
| p16 | Logistic | 0.0173 | 269904 | 261.90 | 577.67 |
| p18 | Logistic | 0.0097 | 146723 | 960.34 | 1978.63 |
| p19 | Exponential | 0.0095 | 2842 | 342.42 | 655.38 |
| p21 | Exponential | 0.0053 | 302840 | 1712.90 | 4936.78 |
| p22 | Exponential | 0.0046 | 547670 | 3159.60 | 11050.00 |
| p23 | Logistic | 0.0093 | 123360 | 371.85 | 706.09 |
| p24 | Logistic | 0.0102 | 97018 | 524.69 | 814.59 |
| p25 | Logistic | 0.0071 | 2508 | 2194.63 | 4321.04 |
| pa | Constant | 0.0280 | 640 | 638.53 | 895.76 |
| pb | Constant | 0.0076 | 2006 | 2000.48 | 3593.83 |
| pc | Constant | 0.0094 | 879 | 879.45 | 1626.67 |
| pd | Constant | 0.0146 | 254 | 254.26 | 526.95 |
| **Mean** | | 0.0169 | 58890 | 696.47 | 1732.39 |

^{a}Substitution rate in number of changes per site per year

^{b}Product of the effective population size and generation time in days at the most recent time point

^{c}Seroconversion or birth for horizontally and vertically infected patients respectively

^{d}Mean of the marginal posterior probability distribution of parameter values

^{e}Highest Posterior Density

Finally, to compare the diversity present close to the time of infection in patients infected via two different modes of transmission, we estimated

Discussion

From our analysis of a single donor and recipient transmission pair, we conclude that in this case the viral diversity sampled during homosexual transmission of HIV-1 was very small (< 1%). This result was consistent for both

It is possible that the diversity present in the inoculum itself was larger, and that selection acting on

Neutral transmission also means that the degree of genetic diversity passed between individuals is dependent on the diversity present in the donor at the time of transmission. Because diversity in their respective donors is likely to vary greatly depending on the stage of infection

We can conclude from our results that diversity of the founding population is similarly restricted during both horizontal and vertical transmission. However, it is also clear that further study is required to investigate the variability observed. For example, although a reduction in diversity is frequent

Conclusion

Our findings quantify the contraction in genetic diversity that occurs during horizontal transmission of HIV-1. It is clear from the severity of the bottleneck that further work is required to investigate the nature of the selective forces surrounding transmission, if we are to interpret the fitness consequences for HIV-1 in the newly infected individual. Furthermore, the analyses presented suggest that the mode of transmission may not be a significant influence on the genetic diversity transmitted.

Methods

Patient material

The donor and recipient patients of the transmitter pair analysed here were recruited as part of an on-going study of acute HIV-1 infection and have been described in detail elsewhere ^{+ }cell counts. The clinical data for both donor and recipient during sampling is given in

Clinical data for transmission pair

The first recipient sample (day 0) was collected six weeks after he last tested PCR (polymerase chain reaction) negative for HIV-1 DNA and RNA. Three additional samples from the recipient were available at days 11, 59 and 237. Donor samples were collected 70 and 155 days prior to the first sample from the recipient.

Envelope sequences were also obtained from 27 HIV-1 positive children. All were HIV negative by PCR at birth indicating that infection occurred

Clinical categorisation and sequencing profile of vertically infected infants

Phylogenetic inference

Sequences were first aligned manually using Se-Al _{4 }model of nucleotide substitution

Quantification of the diversity lost during horizontal transmission

Within a coalescent framework, and assuming the HKY85 + dΓ_{4 }model of nucleotide substitution

Null model: N_{t}τ = N_{R}τ = N_{D}τ [1]

All substitution and demographic parameters, including the time of transmission _{trans}, growth rate _{50}, were estimated from the data within a Bayesian coalescent framework by Markov chain Monte Carlo (MCMC), using the BEAST program

Uncertainty in the estimated parameter values is summarized by the highest posterior density (HPD) interval, which contains 95% of the marginal posterior distribution. The length of the MCMC chain was chosen so that the effective sample size (ESS) for each parameter was > 100, indicating that parameter space had been sufficiently explored _{R}_{trans}). The prior boundaries for the time of transmission t_{trans} were set from when the recipient was last confirmed HIV-1 negative (53 days before the first recipient sample) to the time at which the first recipient sample was collected (day 0). We placed a minimum prior bound of one on N_{R}τ(t_{trans}). With the exception of t_{trans} and N_{R}τ(t_{trans}), the MCMC chain did not impinge on any of the prescribed prior boundaries for the models tested.
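The two summaries used above, the 95% HPD interval and the ESS, can be sketched for a single parameter trace as follows. This is a generic illustration under stated assumptions, not the estimator implemented in BEAST or its companion tools: the HPD is taken as the shortest window of sorted samples holding the requested mass, and the ESS uses a simple autocorrelation sum truncated at the first non-positive lag. Both function names are hypothetical.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval that contains a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def effective_sample_size(trace):
    """n / (1 + 2 * sum of autocorrelations), stopping at the first lag <= 0."""
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    variance = sum(c * c for c in centred) / n
    if variance == 0.0:
        raise ValueError("trace is constant; ESS is undefined")
    correlation_time = 1.0
    for lag in range(1, n):
        rho = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n * variance)
        if rho <= 0.0:
            break
        correlation_time += 2.0 * rho
    return n / correlation_time


# Synthetic traces: independent draws versus a strongly autocorrelated chain.
random.seed(1)
independent = [random.gauss(0.0, 1.0) for _ in range(2000)]
correlated = [0.0]
for _ in range(1999):
    correlated.append(0.9 * correlated[-1] + random.gauss(0.0, 1.0))

print(hpd_interval(independent))
print(effective_sample_size(independent), effective_sample_size(correlated))
```

The autocorrelated chain yields a much smaller ESS than the independent draws for the same number of samples, which is why chain length is judged by ESS rather than by the raw number of MCMC states.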

The relative fit of each model to the data was assessed using the Akaike Information Criterion (AIC).
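The model comparison can be reproduced from the tabulated log-likelihoods with a few lines of arithmetic. The sketch below uses the standard definition AIC = 2k - 2 ln L and the first block of the "Fit of demographic models" table. The parameter counts k (0, 1, 2, 3) are an assumption: they are counted relative to the null model, which is inferred from the fact that the tabulated AIC of the null model equals -2 ln L exactly. The model labels are shorthand introduced here.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# (label, ln L, extra parameters relative to the null model: assumed)
models = [
    ("null (recipient = donor)", -4155.385, 0),
    ("constant-constant", -4144.993, 1),
    ("exponential-constant", -4103.717, 2),
    ("logistic-constant", -4090.154, 3),
]

scores = {label: aic(ln_l, k) for label, ln_l, k in models}
best = min(scores, key=scores.get)

for label, score in scores.items():
    print(f"{label}: AIC = {score:.2f}")
print("best-fitting model:", best)
```

With these inputs the computed scores agree with the tabulated AIC values to two decimal places, and the logistic-constant model has the lowest AIC.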

Selection of the appropriate demographic model allowed us to calculate N_{R}τ(t_{trans}) and quantify the amount of diversity lost at transmission through a comparison of N_{R}τ(t_{trans}) with N_{D}τ.

Bayesian skyline plot

The skyline plot is a piecewise-constant model of population size that estimates

Authors' contributions

CTTE collected the data, performed the analysis and wrote the paper. ECH contributed to the design of the study and the writing of the manuscript. DJW assisted with the analysis. RPV and EJA collected sequences for the infant data set. REP contributed to the design of the study and provided the funding. AJD contributed to the design of the study, the development of the software, the analysis and the writing of the article.

Acknowledgements

This work was supported by the Wellcome Trust (CTTE, AJD, ECH and REP) and Biotechnology and Biological Sciences Research Council (DJW).