
Short Tandem Repeat (STR)



  1. Short Tandem Repeat (STR). In Seok Yang, Dept. of Forensic Medicine, Yonsei University College of Medicine.
     - Simple sequence repeat
     - Located in non-coding regions of the genome
     - Variable among human individuals
     - Applied in human identification

  2. Capillary electrophoresis (CE) based assay
     - This method has been used for genotyping STR markers in forensic genetics for over a decade.
     - At present, several multiplex PCR kits for CE-based STR genotyping are commercially available.
     - During CE, alleles of STR markers are separated by their length and by their fluorescent dye label.

     [Figures: 3730xl DNA sequencer; sample profile]

     Limitations of the CE-based assay for STR genotyping
     - The number of STR loci that can be measured simultaneously is limited by the number of fluorescent dyes and the maximum size of STR amplicons.
     - STR alleles with the same length but different sequences cannot be distinguished from each other.

     Adapted from John M. Butler's presentation at ISHI 2013.
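The second limitation can be illustrated with a small sketch. The two allele sequences below are hypothetical and chosen only for illustration; they are not taken from the slides or from a real locus. Both are 40 bp long, so a length-based assay would report them as the same allele, while their sequences and repeat structures differ.

```python
def repeat_structure(seq, unit_len=4):
    """Collapse a repeat region into (motif, count) runs.

    Trailing bases that do not fill a whole repeat unit are ignored.
    """
    runs = []
    for i in range(0, len(seq) - unit_len + 1, unit_len):
        motif = seq[i:i + unit_len]
        if runs and runs[-1][0] == motif:
            runs[-1] = (motif, runs[-1][1] + 1)
        else:
            runs.append((motif, 1))
    return runs


def bracket(seq, unit_len=4):
    """Format a repeat region in bracket notation, e.g. "[TCTA]5 [TCTG]5"."""
    return " ".join(f"[{m}]{n}" for m, n in repeat_structure(seq, unit_len))


# Hypothetical alleles: same length, different internal sequence.
allele_a = "TCTA" * 5 + "TCTG" * 5
allele_b = "TCTA" * 6 + "TCTG" * 4

same_length = len(allele_a) == len(allele_b)  # what a length-based assay sees
same_sequence = allele_a == allele_b          # what sequencing sees

print(len(allele_a), len(allele_b), same_length, same_sequence)
print(bracket(allele_a))
print(bracket(allele_b))
```

Sequencing the amplicon exposes the difference that fragment length alone hides.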

  3. Next generation sequencing (NGS)
     - Recently, NGS has been in the spotlight as an ultimate genotyping tool to overcome the limitations of CE-based STR analysis in the forensic field.
     - STR profiling using NGS has become available with advances in bioinformatics software.
     - Therefore, an appropriate data analysis protocol may be required for STR profiling using NGS.

     Outline
     - Preparation of STR amplicons and DNA libraries
     - NGS platform and sequencing data generation
     - NGS data analysis for STR genotyping
       1. Design of STR reference sequences for alignment
       2. Analysis of alignment output
       3. Determination of STR alleles
       4. Determination of repeat structure of target STR region
       5. Estimation of mixture ratio
       6. Estimation of male/female ratio
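As a rough illustration of the first analysis step in the outline, the sketch below generates one reference sequence per candidate allele and writes them in FASTA format. This is an assumption about how such references could be designed; the slides do not spell out the design. The locus name `LOCUS1`, the flanking sequences, the repeat motif, and the allele range are all hypothetical placeholders.

```python
def build_references(locus, left_flank, motif, right_flank, min_repeats, max_repeats):
    """Return (name, sequence) pairs, one per repeat count.

    Names follow a hypothetical "<locus>_<repeat count>" convention.
    """
    records = []
    for n in range(min_repeats, max_repeats + 1):
        records.append((f"{locus}_{n}", left_flank + motif * n + right_flank))
    return records


def to_fasta(records, width=60):
    """Serialize (name, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Placeholder flanks and motif, not real locus sequences.
refs = build_references("LOCUS1", "ACGTAC", "TCTA", "GTCAGT", 8, 10)
fasta_text = to_fasta(refs)
print(fasta_text)
```

With one entry per allele, each read can align best to the reference whose repeat count matches its own, which is what makes later coverage counting per reference meaningful.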

  4. Preparation of STR amplicons and DNA libraries
     Workflow: Sample DNA → Multiplex PCR → Adapter ligation
     (1) PCR amplicon preparation
         1. 2800M and 9947A sample DNAs
         2. Primers of the PowerPlex 16 HS system, without fluorescent dye labeling
         → Single-source and 1:1 mixture samples were prepared.
     (2) Library preparation
         1. Use the Rapid Library Prep Method (without nebulization)
         2. Adaptors (with MIDs) are ligated non-directionally to PCR products
     (3) Advantage
         A multiplexed PCR system previously developed for CE-based STR typing can be used for amplicon generation.

     [Figure: Bioanalyzer profiles of DNA libraries (2800M single, 9947A single, 1:1 mixture)]
     Bioanalyzer profiles of DNA libraries were obtained after size selection using AMPure beads to remove fragments shorter than 100 bp.

  5. NGS data generation
     - NGS platform: GS Junior
     - Total number of reads: 164,468
     - Average read length: 183.64 base pairs

     [Figure: Size distribution of NGS reads for 2800M, 9947A, and the 1:1 mixture; x-axis: read length (bp), y-axis: no. of reads]

     NGS dataset

     | Sample name | MID sequence | No. of reads |
     |-------------|--------------|--------------|
     | 2800M       | ACACGACGACT  | 51,475       |
     | 9947A       | ACACGTAGTAT  | 33,213       |
     | 1:1 Mixture | ACGACACGTAT  | 76,943       |
     | Unsorted    |              | 2,837        |
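The sorting of reads into samples can be sketched as a prefix match on the MID sequences listed in the dataset table. This is a minimal illustration, assuming each read starts directly with its MID and that exact matching is used; the actual demultiplexing software and its settings are not described in the slides. The sketch also checks that the per-sample read counts add up to the reported total.

```python
from collections import Counter

# MID sequences and read counts as listed in the dataset table.
MIDS = {
    "ACACGACGACT": "2800M",
    "ACACGTAGTAT": "9947A",
    "ACGACACGTAT": "1:1 Mixture",
}
READ_COUNTS = {"2800M": 51475, "9947A": 33213, "1:1 Mixture": 76943, "Unsorted": 2837}


def assign_sample(read):
    """Return (sample, read without MID); reads with no known MID are 'Unsorted'.

    Assumption: the MID sits at the very start of the read and must match exactly.
    """
    for mid, sample in MIDS.items():
        if read.startswith(mid):
            return sample, read[len(mid):]
    return "Unsorted", read


def demultiplex(reads):
    """Count how many reads fall into each sample."""
    return Counter(assign_sample(read)[0] for read in reads)


total_reads = sum(READ_COUNTS.values())
print(total_reads)  # matches the reported total of 164,468 reads
```

For example, `assign_sample("ACACGTAGTAT" + "TCTATCTA")` assigns the read to 9947A and returns the sequence with the MID removed.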

  6. NGS data analysis protocol in this study
     - Input sequence format: FASTA or FASTQ
     - Tested NGS platform: 454 platform
     - OS: Linux and Windows systems

     Data analysis protocol

     | Step | Tool     | Task                                   | Files                                                   |
     |------|----------|----------------------------------------|---------------------------------------------------------|
     | 1    | MS Excel | Build STR reference sequences          | #.fasta file                                            |
     | 2    | Bowtie 2 | Indexing and alignment                 | #.fasta file and #.fastq file (NGS data) to #.sam file  |
     | 3    | SAMtools | Convert SAM into BAM                   | #.sam file to #.bam file                                |
     | 4    | BEDTools | Convert BAM into BED                   | #.bed file                                              |
     | 5    | MS Excel | Determine alleles by counting coverage | #.sam file or #.bed file                                |

     [Figures: Design of reference sequences; Example of STR genotyping result]
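The protocol above can be sketched in Python as a list of command lines plus a coverage count over the BED output. This is an illustrative sketch, not the exact commands or thresholds used in the study: the file names, the `<locus>_<allele>` reference naming, and the `min_fraction` cutoff are hypothetical, and Python stands in here for the MS Excel counting step.

```python
import subprocess
from collections import Counter


def pipeline_commands(ref_fasta, reads_fastq, prefix):
    """Command lines for indexing, alignment, SAM-to-BAM and BAM-to-BED conversion."""
    return [
        ["bowtie2-build", ref_fasta, prefix],
        # For reads in FASTA rather than FASTQ format, add "-f" to this command.
        ["bowtie2", "-x", prefix, "-U", reads_fastq, "-S", f"{prefix}.sam"],
        ["samtools", "view", "-b", "-o", f"{prefix}.bam", f"{prefix}.sam"],
        ["bedtools", "bamtobed", "-i", f"{prefix}.bam"],  # writes BED to stdout
    ]


def run_pipeline(ref_fasta, reads_fastq, prefix):
    """Run the steps in order; requires the three tools on PATH."""
    *steps, to_bed = pipeline_commands(ref_fasta, reads_fastq, prefix)
    for cmd in steps:
        subprocess.run(cmd, check=True)
    with open(f"{prefix}.bed", "w") as bed:
        subprocess.run(to_bed, check=True, stdout=bed)


def count_coverage(bed_lines):
    """Count aligned reads per reference name (first BED column)."""
    counts = Counter()
    for line in bed_lines:
        if line.strip():
            counts[line.split("\t")[0]] += 1
    return counts


def call_alleles(counts, min_fraction=0.3):
    """Keep alleles whose read count is at least min_fraction of the locus maximum.

    min_fraction is a hypothetical cutoff chosen for illustration only.
    """
    by_locus = {}
    for ref, n in counts.items():
        locus, allele = ref.rsplit("_", 1)
        by_locus.setdefault(locus, {})[allele] = n
    calls = {}
    for locus, alleles in by_locus.items():
        top = max(alleles.values())
        kept = [a for a, n in alleles.items() if n >= min_fraction * top]
        calls[locus] = sorted(kept, key=lambda a: -alleles[a])
    return calls


# Made-up BED lines: 5 reads on allele 9, 4 on allele 10, 1 on allele 8.
bed = (["LOCUS1_9\t0\t150\tr\t42\t+"] * 5
       + ["LOCUS1_10\t0\t154\tr\t42\t+"] * 4
       + ["LOCUS1_8\t0\t146\tr\t42\t+"])
calls = call_alleles(count_coverage(bed))
print(calls)
```

In the made-up input, the single read on allele 8 falls below the cutoff, so only alleles 9 and 10 are reported for the locus. A relative cutoff like this is one simple way to separate true alleles from low-level noise; a validated protocol would need thresholds established empirically.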
