Structural variant detection in colorectal cancer

E van den Broek 1, JC Haan 1, MH Jansen 1, B Carvalho 1, MA van de Wiel 2, ID Nagtegaal 3, CJA Punt 4, B Ylstra 1, S Abeln 5, GA Meijer 1, RJA Fijneman 1

1 Dept. of Pathology, VU University Medical Center, Amsterdam, The Netherlands
2 Dept. of Epidemiology & Biostatistics, VU University Medical Center, Amsterdam, The Netherlands
3 Dept. of Pathology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
4 Dept. of Medical Oncology, Academic Medical Center, University of Amsterdam, The Netherlands
5 Dept. of Computer Science, VU University, Amsterdam, The Netherlands
Colorectal cancer (CRC)
• Colorectal cancer is a major health concern worldwide
• Second leading cause of cancer-related death
  – The incidence worldwide is 1,200,000
  – The incidence in the US is 144,000
• Mortality rates
  – Stage 1: < 10 %
  – Stage 2: 25-30 %
  – Stage 3: 45-50 %
  – Stage 4: > 90 %
CRC research: clinical needs for biomarker discovery
[Figure: CRC progression from normal colon, adenoma and progressive adenoma to localized CRC, CRC lymph node metastasis and CRC liver metastasis]
Key molecular features
• Activated Wnt signaling
• Genomic instability (~5% of adenomas)
• ~15% MIN+ / ~85% CIN+
• ~3% MIN+ / ~97% CIN+
Clinical needs
1. Screening: diagnostic biomarkers
2. Predict recurrence: prognostic biomarkers
3. Personalized therapy: predictive biomarkers
[Figure: 5-year survival rate (%) by CRC stage: Stage I+II, Stage III, Stage IV]
Chromosomal instability: a hallmark of CRC
• SKY (spectral karyotyping): numerical & structural aberrations
• M Hermsen et al., Oncogene 2005
CAIRO & CAIRO2 studies
• Phase III clinical trials
• In total 1575 patients were included
• CApecitabine, IRinotecan, Oxaliplatin in advanced colorectal cancer
  – CAIRO: Koopman et al. Lancet 2007
  – CAIRO2: Tol et al. N Engl J Med 2009
• DNA from 356 patients: primary tumor and matched normal
  – Representative group
  – Isolated from FFPE
Comparative Genomic Hybridization (CGH)
• Agilent 180k array CGH
• Array CGH: 356 CAIRO & CAIRO2 samples
  1. Segmentation
  2. Calling
  3. Copy numbers
• Output: numerical aberrations
Segmentation - array CGH
• Profile of one tumor with 180k probes
• Segmentation was performed using the Circular Binary Segmentation algorithm (DNAcopy; Olshen et al. 2004)
Calling - array CGH
• Profile of one tumor with 180k probes
• Calling was performed using CGHcall (van de Wiel et al. 2007)
Structural Variants (SV) in cancer
Hematological disorders
• Philadelphia chromosome
  – t(9;22)
  – Fusion gene: BCR-ABL
  – Drug: Imatinib / Gleevec
Epithelial cancers
• TMPRSS2-ERG in prostate cancers
• VTI1A-TCF7L2 is confirmed in 3% of 97 CRCs
  – Bass et al., Nature Genetics 2011
Aim: to identify recurrent somatic structural genomic variants that cause CRC
Breakpoint (BP) detection based on array CGH
• Array CGH: 356 CAIRO & CAIRO2 samples
  1. Segmentation
  2. Breakpoint detection
  3. Candidate genes
• Output: structural variants
BP detection in array CGH
• Profile of one tumor with 180k probes
• Breakpoints are defined by the start position of the first probe of each segment
• Breakpoint annotation per gene
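The breakpoint definition above can be illustrated with a minimal sketch. It assumes segments are available as (chromosome, first probe position, last probe position, log2 ratio) records and that the first segment on each chromosome is not counted as a breakpoint, since it only marks the start of the covered region. The `Segment` and `Gene` types, the gene names and all coordinates are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int    # position of the first probe in the segment
    end: int      # position of the last probe in the segment
    log2: float   # segment mean log2 ratio


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def breakpoints(segments):
    """Return (chrom, position) for the start of every segment except
    the first one on each chromosome (assumption, see lead-in)."""
    found = []
    seen_chroms = set()
    for seg in sorted(segments, key=lambda s: (s.chrom, s.start)):
        if seg.chrom in seen_chroms:
            found.append((seg.chrom, seg.start))
        seen_chroms.add(seg.chrom)
    return found


def genes_with_breakpoints(bps, genes):
    """Map gene name -> breakpoint positions that fall inside the gene."""
    hits = {}
    for gene in genes:
        inside = [pos for chrom, pos in bps
                  if chrom == gene.chrom and gene.start <= pos <= gene.end]
        if inside:
            hits[gene.name] = inside
    return hits


# Toy data with made-up coordinates.
segments = [
    Segment("20", 100, 500, 0.0),
    Segment("20", 600, 900, -0.8),
    Segment("20", 1000, 2000, 0.1),
    Segment("1", 50, 400, 0.0),
]
genes = [
    Gene("GENE_A", "20", 550, 700),
    Gene("GENE_B", "20", 1500, 1800),
    Gene("GENE_C", "1", 0, 1000),
]
bps = breakpoints(segments)
hits = genes_with_breakpoints(bps, genes)
```

Counting, per gene, the number of samples in which `hits` contains that gene gives the kind of per-gene recurrence table that the next slide summarizes.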
Results based on array CGH BP detection
• Total number of genes with BPs: 5,737 genes
• 482 candidate genes were identified with recurrent BP (FDR < 0.1)
[Figure: number of affected samples in array CGH (y-axis, 0 to 140) for the top 15 candidate genes; MACROD2 is labeled]
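The slide does not state which test or correction produced the FDR values. As a rough sketch only, the selection step could look as follows, assuming one p-value per gene from some recurrence test and assuming the Benjamini-Hochberg procedure for FDR control. The gene names and p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from an unspecified recurrence test.
gene_pvalues = {"GENE_A": 0.001, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.5}
names = list(gene_pvalues)
qvalues = benjamini_hochberg([gene_pvalues[n] for n in names])
candidates = [n for n, q in zip(names, qvalues) if q < 0.1]
```

With the toy values, three of the four genes pass the FDR < 0.1 threshold.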
Overall survival: MACROD2
Recurrent BP (1) versus no BP (0)
[Figure: survival curves for MACROD2; y-axis: survival probability (0.0 to 1.0); x-axis: overall survival (days, 0 to 1500)]
• No BP (0): 207 samples
• BP (1): 144 samples
• Log-rank P = 0.08
Results based on array CGH BP detection
• Total number of genes with BPs: 5,737 genes
• 482 candidate genes were identified with recurrent BP (FDR < 0.1)
Limitations of breakpoint determination using array CGH:
  – The BP location is an estimate (average probe distance is ~17 kb)
  – DNA structure is unknown
  – Balanced events will be missed
Validation array CGH BPs: NGS data from TCGA
• 482 candidate genes were identified with recurrent BP (FDR < 0.1) by array CGH
• Candidate validation is required
• The Cancer Genome Atlas (TCGA) CRC samples (COAD & READ)
• Whole-genome DNA sequencing from paired tumor-normal samples
Validation array CGH BPs: NGS data from TCGA
• 482 candidate genes were identified with recurrent BP (FDR < 0.1) by array CGH
• Negative control genes (no BP)
• Candidate validation is required
• Structural Variant (SV) detection: candidate driven
Computational methods: focus on candidate genes
Based on paired-end NGS data
• Read-pair approach
  – Discordance: location / bridge length / orientation of reads
Discordant pair (DP) types
• Translocation > different chromosomes
• Insertion > bridge length
• Deletion > bridge length
• Inversion > orientation
• Eversion > orientation
• Single mapped reads could indicate a breakpoint
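The discordant pair types listed above can be expressed as a small classifier. This is a sketch under stated assumptions, not the pipeline used in the study: the `ReadPair` record, the `min_bridge` and `max_bridge` thresholds and the example coordinates are hypothetical, and a normal pair is assumed to have the left read on the forward strand and the right read on the reverse strand.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str          # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str          # "+" or "-"
    mate_mapped: bool = True


def classify_pair(pair, min_bridge=200, max_bridge=500):
    """Classify one read pair by location, orientation and bridge length.

    min_bridge and max_bridge bound the expected distance between the
    two reads; both values are illustrative assumptions.
    """
    if not pair.mate_mapped:
        return "single_mapped"
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    # Order the two reads by position on the shared chromosome.
    ends = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    (left_pos, left_strand), (right_pos, right_strand) = ends
    if left_strand == right_strand:
        return "inversion"            # both reads point the same way
    if left_strand == "-" and right_strand == "+":
        return "eversion"             # reads point away from each other
    bridge = right_pos - left_pos
    if bridge > max_bridge:
        return "deletion"             # reads map further apart than expected
    if bridge < min_bridge:
        return "insertion"            # reads map closer together than expected
    return "concordant"


examples = [
    ReadPair("20", 1000, "+", "20", 1300, "-"),
    ReadPair("20", 1000, "+", "20", 5000, "-"),
    ReadPair("20", 1000, "+", "20", 1100, "-"),
    ReadPair("20", 1000, "+", "20", 1300, "+"),
    ReadPair("20", 1000, "-", "20", 1300, "+"),
    ReadPair("20", 1000, "+", "5", 1300, "-"),
    ReadPair("20", 1000, "+", "20", 0, "+", mate_mapped=False),
]
labels = [classify_pair(p) for p in examples]
```

In practice the bridge length thresholds would be derived from the insert size distribution of each sequencing library rather than fixed.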
Computational methods: focus on candidate genes
Based on paired-end NGS data
1. Read-pair approach
Combined with:
2. Read-depth
3. Define breakpoint location
4. Determine tumor-specific events
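Step 4 can be sketched as a comparison between tumor and matched normal. The example below assumes that discordant pairs are grouped into DP groups by binning both read positions into fixed windows, that a group needs a minimum number of supporting pairs, and that any group also seen in the matched normal is discarded. The `window` and `min_support` values and all coordinates are hypothetical.

```python
from collections import Counter


def dp_group_key(pair, window=1000):
    """Bin both ends of a discordant pair into fixed-size windows.

    pair is ((chrom, pos), (chrom, pos)); the ends are sorted so that
    the key does not depend on which read was listed first.
    """
    (chrom_a, pos_a), (chrom_b, pos_b) = sorted(pair)
    return (chrom_a, pos_a // window, chrom_b, pos_b // window)


def tumor_specific_groups(tumor_pairs, normal_pairs, window=1000, min_support=3):
    """Return DP groups with enough tumor support and no normal support."""
    tumor_counts = Counter(dp_group_key(p, window) for p in tumor_pairs)
    normal_keys = {dp_group_key(p, window) for p in normal_pairs}
    return {key: n for key, n in tumor_counts.items()
            if n >= min_support and key not in normal_keys}


# Toy discordant pairs with made-up coordinates.
tumor_pairs = [
    (("20", 15000100), ("5", 700200)),
    (("20", 15000300), ("5", 700400)),
    (("5", 700900), ("20", 15000800)),
    (("1", 100), ("2", 200)),
]
normal_pairs = [(("1", 150), ("2", 250))]
somatic = tumor_specific_groups(tumor_pairs, normal_pairs)
```

Fixed windows are the simplest grouping choice; pairs that straddle a window boundary end up in different groups, so a real implementation would more likely cluster by distance.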
Translocation: MACROD2
[Figure: IGV view of MACROD2 showing discordant pairs, breakpoints and the fusion partner]
Distribution of DP groups per type
Preliminary results for candidate genes in TCGA data
[Figure: distribution of DP groups per type]
• Deletion 50.8%
• Insertion 26.4%
• Eversion 8.4%
• Translocation 7.7%
• Inversion 6.7%
Approximately 5-fold higher number of translocation-DP groups for candidate genes compared to control genes
Translocation-DP groups per candidate gene in TCGA samples
Putative translocations
[Figure: frequency of translocation-DP groups (au) per candidate gene, in two panels: the top 20 candidate genes, with MACROD2 labeled, and all candidate genes]
Correlation per candidate gene
– Frequency of samples with BP based on array CGH
– Frequency of translocation-DP groups in TCGA data
[Figure: frequency of translocation-DP groups (au) plotted against the frequency of affected samples in array CGH analysis (0% to 40%); MACROD2 is labeled]
Conclusions
• 482 candidate genes with recurrent breakpoints were identified in a large cohort of 356 CRC samples, based on array CGH analysis
• The TCGA provided an essential CRC reference dataset (COAD, READ) to validate Structural Variants in candidate genes with recurrent breakpoints
• Identification of BPs based on array CGH is correlated with SV detection in TCGA CRC NGS data
• Further studies will be performed to investigate clinical and functional significance of validated candidate genes