Gene dosage alterations revealed by cDNA microarray analysis in cervical cancer: Identification of candidate amplified and overexpressed genes

Cervical cancer (CC) cells exhibit complex karyotypic alterations, which is consistent with deregulation of numerous critical genes in its formation and progression. To characterize this karyotypic complexity at the molecular level, we used cDNA array comparative genomic hybridization (aCGH) to analyze 29 CC cases and identified a number of overrepresented and deleted genes. The aCGH analysis revealed at least 17 recurrent amplicons and six common regions of deletions. These regions contain several known tumor-associated genes, such as those involved in transcription, apoptosis, cytoskeletal remodeling, ion-transport, drug metabolism, and immune response. Using the fluorescence in situ hybridization (FISH) approach we demonstrated the presence of high-level amplifications at the 8q24.3, 11q22.2, and 20q13 regions in CC cell lines. To identify amplification-associated genes that correspond to focal amplicons, we examined one or more genes in each of the 17 amplicons by Affymetrix U133A expression arrays and semiquantitative reverse-transcription PCR (RT-PCR) in 31 CC tumors. This analysis revealed frequent and robust upregulated expression in CC relative to normal cervix for genes EPHB2 (1p36), CDCA8 (1p34.3), AIM2 (1q22-23), RFC4, MUC4, and HRASLS (3q27-29), SKP2 (5p12-13), CENTD3 (5q31.3), PTK2, RECQL4 (8q24), MMP1 and MMP13 (11q22.2), AKT1 (14q32.3), ABCC3 (17q21-22), SMARCA4 (19p13.3), LIG1 (19q13.3), UBE2C (20q13.1), SMC1L1 (Xp11), KIF4A (Xq12), TMSNB (Xq22), and CSAG2 (Xq28). Thus, the gene dosage and expression profiles generated here have enabled the identification of focal amplicons characteristic for the CC genome and facilitated the validation of relevant genes in these amplicons. These data thus form an important step toward the identification of biologically relevant genes in CC pathogenesis. This article contains Supplementary Material available at http://www.interscience.wiley.com/jpages/1045-2257/suppmat. © 2007 Wiley-Liss, Inc.

In an effort to identify the molecular alterations associated with invasive CC, we performed microarray CGH (aCGH) analysis to identify gene dosage changes in CC. This analysis identified 17 amplified and six deleted chromosomal regions characteristic of CC. FISH analysis demonstrated high-level amplifications at 8q24.3, 11q22.2, and 20q13, and increased copies of 3q. Affymetrix gene expression profile and RT-PCR analyses identified overexpression of a number of genes mapped within the focal amplicons in CC and enabled the identification of relevant transcriptional targets.

Tumor Specimens and Cell Lines
A total of 54 CC cases (9 cell lines and 45 primary tumor specimens) and 16 normal cervical tissues obtained from hysterectomy specimens as controls were used in this study. Twenty-nine tumors (21 primary tumors: 6 stage IB, 8 stage IIB, and 7 stage IIIB; and 8 cell lines) were used in aCGH analysis and 25 additional tumors (24 primary tumors and one cell line) were used in expression studies. The cell lines (HT-3, ME-180, CaSki, MS751, C-4I, C-33A, SW756, HeLa, and SiHa) were obtained from American Type Culture Collection (ATCC, Manassas, VA) and grown in tissue culture as per the supplier's specifications. The tumors were obtained from patients evaluated at the Instituto Nacional de Cancerologia (Santa Fe de Bogota, Colombia) (Pulido et al., 2000), the Department of Obstetrics and Gynecology of Friedrich Schiller University (Jena, Germany), and Columbia University Medical Center, NY. All specimens were obtained after appropriate informed consent and approval of protocols by institutional review boards. The primary tumors were classified as FIGO stage IB (8 tumors), IIB (18 tumors) or IIIB/IV (19 tumors). Forty-two tumors were diagnosed as squamous-cell carcinoma (SCC) and three as adenocarcinoma. All tumor specimens were determined to contain at least 70% tumor cellularity by H&E staining. High molecular weight DNA and total RNA were isolated from tumor tissues, normal tissues, and cell lines by standard methods. DNA isolated from placenta was used as a reference in aCGH analysis.

aCGH Hybridization and Image Analysis
The EST arrays generated at the Albert Einstein College of Medicine microarray facility (www.aecom.yu.edu/cancer/new/cores/microarray/default.htm#) contained 9,206 T3/T7 PCR-amplified cDNA inserts of human I.M.A.G.E. consortium clones printed on glass slides. The slides were hybridized as previously described (Bourdon et al., 2002). Briefly, the slides were first incubated for 1-5 hr with 20 µl of prehybridization mix. A total of 5 µg of test and reference DNAs was digested with DpnII for 1 hr, purified using a PCR clean-up kit (Promega, Madison, WI), and extracted in 50 µl of water. Digested DNAs were concentrated by ultrafiltration (Microcon YM-30, Amicon; Millipore, Bedford, MA). Equal amounts of test and reference (placenta) DNAs (1.8-2.2 µg) were labeled separately in 50 µl reactions using Cy3-dUTP or Cy5-dUTP (Amersham, Piscataway, NJ), respectively. The reaction mixtures were pooled, purified, and hybridized in the presence of a blocking reagent (Pollack et al., 1999). The slides were prepared after posthybridization washes as described (Bourdon et al., 2002).
The arrays were scanned using an Axon dual color laser scanner (GenePix 4000A; Axon, Union City, CA). At the time of the scanning, the laser power was adjusted to have <5% features saturated; the digitized Cy3 and Cy5 signals were pseudocolored green and red, respectively (GenePix Pro 3.0; Axon). After gridding, each dot on the 24-bit ratio image was visually inspected and unsatisfactory dots were manually flagged if necessary. A GenePix results (*.gpr) file of the raw data (F635 median - B635 median, F532 median - B532 median) was used for further analyses.
The signals obtained after laser excitation of the dyes were digitized, and the raw data (median feature pixel intensity with the median local background intensity subtracted at each wavelength) were then subjected to statistical analysis. To correct for systematic errors introduced by the intensity-dependent dye efficiencies, the hybridization signal data from each slide were normalized using a local regression of the log-ratio variable $Y = \log_2(G/R)$ versus the log-product $X = \log_{10}(R \times G)/2$ (R and G represent the intensities of the Cy3 and Cy5, respectively). It was important to construct an indicator to identify ESTs that exhibited significant signal deviation from normal in a given slide. To this end, we computed the intensity-dependent (local) variance $\sigma(X_N)^2$ from a local regression of $Y_N^2$ vs. $X_N$ after normalization ($X_N$ and $Y_N$ represent the normalized X and Y variables) and attributed significance to amplified/deleted ESTs according to the values of $Y_N/\sigma(X_N)$ (LR/SD), independently for each slide (Bourdon et al., 2002). With the binomial distribution it is extremely unlikely to get more than two false positive calls out of 29 samples with a P < 0.001. Therefore, a sequence was called amplified or deleted when the value LR/SD was ≥3.1 or ≤−3.1, respectively, in at least three tumor samples and none of the controls (three placenta versus placenta experiments). All EST clones on the array were mapped in silico using NCBI genome map viewer build 34.3 (www.ncbi.nlm.nih.gov/mapview/) and assigned to subchromosomal regions. The normalized data have been deposited in the Gene Expression Omnibus (GEO) database (Accession GSE1715) (www.ncbi.nlm.nih.gov/geo/).
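The scoring and calling rules described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a simple running mean over spots ranked by X stands in for the local regression of Bourdon et al. (2002), the window size is arbitrary, "none of the controls" is read as no control score beyond the threshold in either direction, and all function names are hypothetical.

```python
import math
from math import comb


def log_product_and_ratio(r, g):
    """Return (X, Y) for one spot: X = log10(R*G)/2, Y = log2(G/R).

    r and g are background-subtracted intensities and must be > 0.
    """
    return math.log10(r * g) / 2.0, math.log2(g / r)


def running_mean(values, half_window):
    """Mean of each value's neighbours within +/- half_window positions."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def lr_sd_scores(r_vals, g_vals, half_window=50):
    """Return an LR/SD score per spot, in input order.

    Assumption: a running mean along X replaces the local regression,
    both for the trend of Y and for the local variance of Y_N.
    """
    xy = [log_product_and_ratio(r, g) for r, g in zip(r_vals, g_vals)]
    order = sorted(range(len(xy)), key=lambda i: xy[i][0])  # rank by X
    y_sorted = [xy[i][1] for i in order]
    trend = running_mean(y_sorted, half_window)
    y_norm = [y - t for y, t in zip(y_sorted, trend)]        # Y_N
    local_var = running_mean([v * v for v in y_norm], half_window)
    scores = [0.0] * len(xy)
    for rank, i in enumerate(order):
        sd = math.sqrt(local_var[rank])
        scores[i] = y_norm[rank] / sd if sd > 0 else 0.0
    return scores


def call_clone(tumor_scores, control_scores, threshold=3.1, min_tumors=3):
    """Call one clone from its LR/SD scores across tumors and controls."""
    if any(abs(s) >= threshold for s in control_scores):
        return "none"
    if sum(s >= threshold for s in tumor_scores) >= min_tumors:
        return "amplified"
    if sum(s <= -threshold for s in tumor_scores) >= min_tumors:
        return "deleted"
    return "none"


def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n trials."""
    return 1.0 - sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k))


# Chance of three or more false positive calls in 29 samples at p = 0.001
print(prob_at_least(3, 29, 0.001))
```

With a per-sample false positive rate of 0.001, the binomial tail for three or more calls among 29 samples is on the order of 10^-6, which is the reasoning behind requiring at least three tumors.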

HPV Typing
HPV types were determined as previously described (Narayan et al., 2003a).

Semiquantitative RT-PCR Analysis
Total RNA from normal cervix was obtained from three commercial sources (Ambion, Austin, TX; Stratagene, La Jolla, CA; BioChain, Hayward, CA). Total RNA isolated from nine cell lines (eight used in aCGH analysis), 18 primary tumors (all SCC; nine of these also studied by aCGH), and five normal cervix specimens was reverse transcribed using random primers and the Pro-STAR first strand RT-PCR kit (Stratagene, La Jolla, CA). A semiquantitative analysis of gene expression was performed in duplicate or triplicate experiments using 26-28 cycles of multiplex RT-PCR with β-actin (ACTB) as the control and gene-specific primers spanning at least two exons (Supplementary Table 1; supplementary material for this article can be found at http://www.interscience.wiley.com/jpages/1045-2257/suppmat).
The PCR products were run on 1.5% agarose gels, visualized by ethidium bromide staining and quantified using the Kodak Digital Image Analysis System (Kodak, New Haven, CT). The values obtained for each gene were normalized against ACTB. For each gene, at least three different normal cervix RNA samples were used to calculate the mean and SD. A gene was considered upregulated if the gene/control ratio was ≥ mean + 2 SD of the normal cervix.
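The upregulation criterion above can be written as a short Python sketch. The function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def normalize_to_actb(gene_band, actb_band):
    """Gene/control ratio: band intensity of the gene divided by ACTB."""
    return gene_band / actb_band


def is_upregulated(tumor_ratio, normal_ratios, n_sd=2.0):
    """True if the tumor ratio is >= mean + n_sd * SD of normal cervix.

    Assumption: SD is the sample standard deviation of the normals.
    """
    if len(normal_ratios) < 3:
        raise ValueError("at least three normal cervix samples are required")
    cutoff = mean(normal_ratios) + n_sd * stdev(normal_ratios)
    return tumor_ratio >= cutoff


# Hypothetical band intensities, for illustration only
normals = [normalize_to_actb(g, a) for g, a in [(20, 100), (25, 100), (30, 100)]]
print(is_upregulated(normalize_to_actb(50, 100), normals))
```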

Oligonucleotide Microarray Gene Expression Analysis
Biotinylated cRNA preparation and hybridization to the Affymetrix U133A oligonucleotide microarray (Affymetrix, Santa Clara, CA), which contains 14,500 genes, were performed on 22 primary CC cases (only one of these cases was studied by aCGH), nine CC cell lines (eight were studied by aCGH), and 16 normal cervical epithelium specimens by the standard protocols supplied by the manufacturer. Arrays were subsequently developed and scanned to obtain quantitative gene expression levels. Expression values for the genes were determined using the Affymetrix GeneChip Operating Software (GCOS) and the Global Scaling option, which allows a number of experiments to be normalized to one target intensity to account for the differences in global chip intensity. To perform the supervised gene expression analysis, we used the Genes@Work software platform, which is a gene expression analysis tool based on the pattern discovery algorithm SPLASH (Structural Pattern Localization Analysis by Sequential Histograms) (Califano, 2000).
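The idea behind global scaling can be sketched as follows. This is an illustration, not GCOS code: the target intensity of 500 and the 2% trimming at each end are assumed values chosen for the example, and the function name is hypothetical.

```python
def global_scale(signals, target=500.0, trim=0.02):
    """Scale one array so that its trimmed mean signal equals `target`.

    Assumptions: `target` and `trim` are illustrative defaults; the text
    does not state the settings used in this study.
    Returns (scaled_signals, scaling_factor).
    """
    ordered = sorted(signals)
    k = int(len(ordered) * trim)            # values dropped at each end
    kept = ordered[k:len(ordered) - k] if k else ordered
    factor = target / (sum(kept) / len(kept))
    return [s * factor for s in signals], factor


# Two hypothetical chips with different overall brightness
chip_a = [100.0] * 98 + [1.0, 10000.0]
chip_b = [200.0] * 98 + [2.0, 20000.0]
scaled_a, factor_a = global_scale(chip_a)
scaled_b, factor_b = global_scale(chip_b)
print(factor_a, factor_b)
```

After scaling, both hypothetical chips share the same trimmed mean, so a given probe set can be compared across arrays despite differences in overall chip intensity.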

RESULTS
Our previous molecular cytogenetic analyses of CC have identified complex chromosome alterations that include recurrent sites of high-level amplifications, +3q, and del(2q) (Harris et al., 2003; Narayan et al., 2003b; Rao et al., 2004). To characterize this karyotypic complexity at the molecular level, we performed cDNA array CGH (aCGH) analysis of a series of 29 CC cases that included 8 cell lines and 21 primary tumor biopsies. Of these, 27 (91%) were HPV positive (20 with HPV16/18; 7 harbored other HPV types) and two were HPV negative. Among the 9,206 cDNA sequences on the array, 445 (64.1 ± 35.2/tumor) were found to be overrepresented and 121 (16.8 ± 13.9/tumor) were deleted. A gene was considered either gained or lost if present in >10% of tumors based on the criteria described in the methods. Although the frequency of gene copy number gains was similar in cell lines and primary tumors, deletions were more common in cell lines than in primary tumors (28.4 ± 16.1 vs. 12.

Identification of Genes with Copy Number Deletions
A total of 121 cDNAs were underrepresented in the CC genome compared to normal (Fig. 1C). The underrepresented genes are referred to as deleted genes hereafter. The deleted cDNA clones were found to be preferentially localized to chromosomes 4 (21%), 2 (11%), 13q (8%), 8 (8%), 11 (7%), 3p (5%), and 12q (5%). This nonrandom distribution of deleted regions in the genome suggests that these chromosomal regions harbor candidate tumor suppressor genes relevant to CC.

Identification of Amplicons
The nonrandom clustering of the majority of overrepresented clones in the dataset to a few chromosomal regions prompted us to use an objective criterion to identify and define the amplicons. Toward this end, all 445 amplified clones identified by aCGH were mapped in silico to specific chromosomal sites at the sequence level using the MapView browser (http://www.ncbi.nlm.nih.gov/mapview/). A discrete locus of regional copy number increase represented by four or more clones within a 15 Mb genomic region in three or more tumors was considered a potential amplicon. This criterion identified 17 amplicons (four on the X chromosome, three on chromosome 1, and one each at chromosomal regions 3q27.3-29, 5p12-13, 5q31.3, 8q24.3, 11q22-23, 14q32, 17q21-22, 19p13, 19q13.3, and 20q13.1) (Table 3). A number of recurrent overrepresented cDNA clones, however, remained single genes at their specific chromosomal regions; these require further confirmation by other methods.
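The amplicon criterion above (four or more clones within 15 Mb) can be sketched as a greedy scan over clone positions on one chromosome. This is only an illustration under stated assumptions: the input is taken to be the map positions, in Mb, of clones already called amplified in three or more tumors, the greedy left-to-right grouping is one of several possible ways to delimit clusters, and the function name is hypothetical.

```python
def find_amplicons(positions_mb, min_clones=4, max_span_mb=15.0):
    """Group recurrently amplified clones on one chromosome into amplicons.

    positions_mb: map positions (Mb) of clones amplified in >= 3 tumors.
    Returns a list of (start_mb, end_mb, n_clones) tuples.
    Assumption: clusters are built greedily from the leftmost clone.
    """
    pos = sorted(positions_mb)
    amplicons = []
    i = 0
    while i < len(pos):
        j = i
        # Extend the window while the next clone stays within max_span_mb
        while j + 1 < len(pos) and pos[j + 1] - pos[i] <= max_span_mb:
            j += 1
        n_clones = j - i + 1
        if n_clones >= min_clones:
            amplicons.append((pos[i], pos[j], n_clones))
            i = j + 1          # continue after this amplicon
        else:
            i += 1             # too few clones; slide the window start
    return amplicons


# Hypothetical clone positions on a single chromosome
print(find_amplicons([1.0, 3.0, 5.0, 9.0, 40.0, 80.0, 82.0, 83.0]))
```

In this hypothetical input, the four clones between 1 and 9 Mb form one candidate amplicon, whereas the three clones near 80 Mb fall short of the four-clone minimum and remain isolated calls.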

Expression-Array Validation of Genes in Amplicons
In the present study, we restricted the validation to the genes present within the 17 amplicons identified above. To identify expression profiles of the genes within the amplicons, we used the Affymetrix U133A array data sets derived from 16 normal  As expected, this analysis identified a large number of genes in each of the amplicons (data not shown). These probe sets were examined in the normalized expression profiles derived from normal cervix and invasive CC to identify overexpressed genes. A gene was considered overexpressed if the expression levels exceeded mean + 2 SD of normal in >10% of tumors. This analysis identified a number of overexpressed genes in each of the amplicons: 1p36, 13; 1p34, 12; 6; 21; 13; 5q31.3, 2; 8q24.3, 14; 11q22.2, 9; 14q32.33, 16; 7; 19p13.13, 17; 19q13.3, 46; 20q13.1, 12; 5; Xq12, 2; Xq22, 2; and Xq28, 13 (

RT-PCR Validation of Genes Overexpressed by Array Expression Analysis
To further validate the genes that showed evidence of overexpression by Affymetrix expression profiles, we chose one to three genes functionally relevant in tumor development in each of the focal amplicons and analyzed them by semiquantitative RT-PCR. Thus, a total of 23 genes that mapped to 17 amplicons were examined by RT-PCR in 8 normal cervical epithelium, 9 CC cell lines, and 10 primary tumors. These analyses showed that all genes tested, except PTP4A2 at 1p34 and HMGB3 at Xq28, showed similar levels of increased expression in tumors compared to the corresponding normal cervix (Supplementary Table 3B). Thus, in a subset of tumors, the overexpression of genes within amplicons was confirmed by both Affymetrix expression and RT-PCR analyses, which showed a similar fold increase (Fig. 2, and Supplementary Table 3B). RT-PCR analysis of CDCA8, AIM2, ABCC3, RECQL4, SMARCA4, and CSAG2 genes showed no detectable levels of expression in normal cervix and the fold increase for these genes in tumor specimens was considered 100-fold (Fig. 2). Thus, using two different validation methods, we showed overexpression of a number of genes mapped within the amplicons identified by aCGH array.
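The fold-increase convention used for Figure 2 (average of the overexpressing tumors over the average of all normals, with a fixed value of 100 when normal expression is undetectable) can be sketched in Python. The function name and the treatment of an empty tumor list are assumptions made for this example.

```python
from statistics import mean


def fold_increase(normal_values, overexpressed_tumor_values,
                  undetectable_fold=100.0):
    """Fold increase of a gene in tumors relative to normal cervix.

    normal_values: expression in all normals analyzed.
    overexpressed_tumor_values: only tumors that met the overexpression
    criterion. Returns None when no tumor met it (an assumption).
    """
    if not overexpressed_tumor_values:
        return None
    normal_mean = mean(normal_values)
    if normal_mean == 0:
        # No detectable expression in normal cervix: fixed 100-fold value
        return undetectable_fold
    return mean(overexpressed_tumor_values) / normal_mean


# Hypothetical ACTB-normalized ratios, for illustration only
print(fold_increase([0.2, 0.3, 0.1], [0.8, 1.0]))
print(fold_increase([0, 0, 0], [0.4]))
```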

FISH Validation of Amplicons
Four of the amplicons (3q27.3-29, 8q24.3, 11q22.2, and 20q13.1) have been further examined by FISH to assess genomic copy number increase in eight CC cell lines. FISH analysis using RP11-480A16 BAC clone containing the SDHA-like gene at 3q29 showed four or more copies in three of the eight CC cell lines tested (Fig. 3A). By FISH analysis of a BAC clone RP11-374B7 mapped 2 Mb proximal to the 8q24.3 amplicon we found a homogeneously staining region (hsr) present on three different chromosomes in the SW756 cells (Fig. 3B). Three other cell lines (HT-3, MS751, and CaSki) showed 4-6 copies of signals. The 11q22.2 amplicon was studied using a BAC clone RP11-750P5 containing the MMP1, MMP10, MMP8, and MMP27 genes. We showed evidence of hsr-type amplification in two of eight cell lines. The cell line CaSki showed three copies of chromosome 11 containing highly amplified regions and C-4I had two amplified segments also on chromosome 11 (Fig. 3C). Both of these cell lines also exhibited amplification by aCGH analysis. A third cell line, HT-3, had four copies of the signal. The remaining five cell lines had only 2-3 copies of the signals. The 20q13.1 amplicon was tested by using RP11-30F23, which covers a region that is located approximately 200 kb distal to the YWHAB/14-3-3β gene. Hsr-type amplification of this region was found in the HT-3 cell line and 4-7 copies of signals were seen in SW756, SiHa, CaSki, and MS751 cell lines (Fig. 3D).

Figure 2. Comparison of expression levels by Affymetrix microarray and semiquantitative RT-PCR analyses of genes mapped to various amplicons in CC. Fold increase was calculated based on averages obtained for a given gene in all normals analyzed and only tumors that showed evidence of overexpression using the criteria defined in Materials and Methods. Note that the genes (CDCA8, AIM2, RECQL4, ABCC3, and SMARCA4) that did not show detectable expression in normal cervix by RT-PCR were considered as 100-fold overexpressed in tumors. Two genes (PTP4A2 and HMGB3) showed no increased levels of expression by RT-PCR in tumors compared to normal cervix.

DISCUSSION
Like many other epithelial cancers, invasive CC exhibits complex chromosomal changes (Atkin, 1997; Harris et al., 2003; Rao et al., 2004). The molecular consequence of this genomic complexity is poorly understood. Extensive genome-wide LOH studies have shown allelic deletions of chromosome arms 2q, 3p, 4, 5p, 6p, and 11q (Mitra et al., 1994a; Mullokandov et al., 1996; Narayan et al., 2003b). A number of studies have provided evidence for gain or amplification of chromosomal regions or genes (Mitra et al., 1994b; Heselmeyer et al., 1996; Narayan et al., 2003b; Rao et al., 2004). Gene dosage changes play a major role in tumor formation and progression (Albertson et al., 2003). The analyses presented here identified several such gene dosage alterations in CC. The copy number changes identified by aCGH showed a near concordance with the previously reported chromosomal CGH (cCGH) data on the same panel of tumors, validating the present data (Harris et al., 2003; Narayan et al., 2003b; Rao et al., 2004). Comparing the recurrent increased copy number of one or more cDNA clones by aCGH in the present study with those of the cCGH data showed common amplifications that correspond to the regions 1p31, 2q32, 7q22, 10q23, 11q22, 20q11.2, 20q13.1, and Xp (Fig. 4A). Analysis of these data sets also showed similar concordance of chromosomal gains at 1p, 3q, 5p, 9q, 14q, 17q, and X (Fig. 4A). In addition, analysis of the data on deletions showed a similar correlation between the cCGH and aCGH with common regions of deletions at 2q33-37, 3p, 4p, 6q, 8p, 10p, 11q22-25, 13q, and 18q (Fig. 4B). However, the chromosomal amplifications at 7p11.2, 10q21, 11q13, and 12q15 regions revealed by cCGH could not be confirmed by aCGH (Fig. 4A). This discrepancy between cCGH and aCGH data may be due to differences in coverage between the two techniques. Although cCGH will identify all genomic changes at megabase-level resolution, the array we used for aCGH has an average coverage of only 300 kb. First, because the cDNA array had low genomic representation in certain regions of the genome, we assume that the genomic regions of some of the amplicons identified by cCGH are underrepresented on the cDNA array. Second, the criteria that we applied to identify amplifications in the aCGH analysis will have eliminated amplifications present in fewer than three tumors.
The amplification of oncogenes is a known genetic mechanism underlying the development of a number of tumor types. Our previous studies suggested that gene amplification is a common event in CC (Mitra et al., 1994b; Harris et al., 2003; Narayan et al., 2003b). The present analysis identified increased copy number of cDNA clones on the entire X chromosome, suggesting gain of this chromosome (Rao et al., 2004). The gain of 3q26-29 has been commonly reported in invasive CC and was shown to occur during the progression from low- to high-grade cervical intraepithelial neoplasia (CIN) (Heselmeyer et al., 1996; Heselmeyer-Haddad et al., 2003; Hidalgo et al., 2005; Narayan et al., 2003b; Rao et al., 2004; Fitzpatrick et al., 2006). Here we identified an amplicon spanning 11.8 Mb in 3q27.3-29. Gain of distal 3q is commonly seen in many other tumor types, such as head and neck squamous-cell carcinomas, and lung and ovarian cancer. Potential target oncogenes at 3q26-29 such as PIK3CA, TP73L, CCNL1, and EIF5A2 have been reported (Redon et al., 2002). Previous studies have implicated PIK3CA and TERC as target genes in CC (Ma et al., 2000; Sugita et al., 2000; Heselmeyer-Haddad et al., 2003). Mapped to 3q26.3, these genes are, however, 10 Mb proximal to the 3q amplicon identified in the present study. The 3q27.3-29 region contains several genes of relevance to cancer (Table 3). We showed here a 2.5- to 10.9-fold increased expression of three genes (RFC4, MUC4, and HRASLS) by both microarray expression profiles and RT-PCR (Fig. 2). RFC4 (replication factor C, subunit 4) plays a critical role in DNA damage checkpoint pathways (Ellison and Stillman, 2003). Mucin 4 (MUC4), secreted by epithelial surfaces including cervix, is implicated in renewal and differentiation of these cells. MUC4 has been reported to be overexpressed in pancreatic cancer and cervical dysplasias, and acts as a ligand for ERBB2 and a target for the TGFB pathway (Lopez-Ferrer et al., 2001; Jonckheere et al., 2004; Singh et al., 2004). The mouse homologue of HRASLS encodes a ras-responsive gene, which modulates the HRAS-mediated signaling pathway.
Amplicons at 8q24.3 and 20q13.1 have been found in many tumor types, including CC (Zhang et al., 2002; Hodgson et al., 2003). The 8q24 region harbors a number of genes including MYC. In the present study, two genes, PTK2 and RECQL4, mapped to this amplicon were examined and shown to exhibit 3- to >9.7-fold increased expression in CC. The protein tyrosine kinase 2 (PTK2) gene, which encodes a cytoplasmic protein tyrosine kinase, is implicated in signaling pathways involved in cell motility, proliferation, and apoptosis (McLean et al., 2005). The RecQ protein-like 4 (RECQL4) gene encodes a DNA helicase involved in the maintenance of genomic integrity (Hickson, 2003). No RECQL4 amplification and/or overexpression in human cancer has been reported thus far, and it remains to be seen whether the overexpression of RECQL4 has any functional role in CC tumorigenesis or represents a bystander effect. The 20q13.1 region, known to be amplified in diverse tumor types, harbors several genes implicated in tumorigenesis, such as AIB1, BTAK, and PTPN1. Our Affymetrix gene expression profiles identified increased expression of 12 genes, including UBE2C, within the 20q13.1 amplicon. Of these, the overexpression of the UBE2C gene was further confirmed by RT-PCR analysis (Fig. 2). Ubiquitin-conjugating enzyme E2C (UBE2C) encodes a member of the E2 ubiquitin-conjugating enzyme family, which is essential for destruction of mitotic cyclins and for cell cycle progression. The ubiquitin-conjugase family genes are amplified and overexpressed in many human tumors, including CC (Wagner et al., 2004; Santin et al., 2005). The present study also identified a number of previously uncharacterized amplifications, which could include genes relevant to CC that may be revealed by positional approaches. For instance, the 820 kb 11q22.2 amplicon contains a number of matrix metalloproteinase (MMPs 1, 3, 12, and 13) genes, which are known to be overexpressed in many tumor types and to promote tumor growth, cell proliferation, and migration (Overall and Lopez-Otin, 2002).
This work represents the first high-resolution aCGH analysis of CC, which forms a basis for further studies on a subset of candidate genes in delineating the molecular mechanisms involved in its development. Identification of tumor-specific gene dosage profiles has important potential diagnostic and therapeutic implications. The distinct genetic losses and gains seen in the present study may be characteristic of CC, as some of these changes (e.g., gain of 5p and 5q, and loss of 2q, 4p, and 4q) are not commonly seen in other epithelial cancers. Detailed characterization of the amplified and deleted regions may facilitate the identification and functional characterization of genes involved in CC development. In addition, examination of these changes in CIN lesions may provide new insights into the role of these genes in the progression of CC and thus aid the diagnostic identification of lesions at high risk of progression to invasive cancer.