
Blast summary. Basic ideas: Alignment - PowerPoint PPT Presentation



  1. Blast summary
     - Basic ideas:
       - Alignment (global/local/affine gaps)
       - Scoring matrices (DNA/AA (PAM, BLOSUM62)), position specific (later in the course)
       - p-value
       - Seed selection, algorithms for keyword search
     - Flavors: blastn, blastx, tblastn…
     - Other variants: psi-blast.. (later in the course)
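The "seed selection" and "keyword search" idea on this slide can be illustrated with a small sketch. It assumes the simplest possible variant: exact matches of fixed-length words (k-mers) between a query and a subject, found through a dictionary index. The function names and the choice of `k=3` are hypothetical, and real BLAST implementations differ in many details (for example, scored neighborhood words for proteins), so treat this only as an illustration of the indexing idea.

```python
from collections import defaultdict


def build_kmer_index(query, k):
    """Map every length-k word in the query to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where an exact k-mer match occurs."""
    index = build_kmer_index(query, k)
    seeds = []
    # Scan the subject once; each word is looked up in the query index.
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


# Toy sequences made up for illustration.
seeds = find_seeds("ACGTAC", "TTACGTT", k=3)
print(seeds)
```

Each seed is a candidate starting point that a later step would try to extend into a longer local alignment.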

  2. Assignment 2 schematic
     [Figure: the query is a genomic sequence (exons, 3' UTR); the subject is an aa seq; the predicted cDNA is shown.]
     Why does it not match the subject perfectly?

  3. Blast summary
     - Basic ideas:
       - Alignment (global/local/affine gaps)
       - Scoring matrices (DNA/AA (PAM, BLOSUM62)), position specific (later in the course)
       - p-value
       - Seed selection, algorithms for keyword search
     - Flavors: blastn, blastx, tblastn…
     - Other variants: psi-blast.. (later in the course)
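As a reminder of what "local alignment" means in the list above, here is a minimal sketch of a Smith-Waterman style score computation. It makes simplifying assumptions that are not from the slides: a flat match/mismatch score instead of a PAM or BLOSUM matrix, and a linear gap penalty instead of affine gaps. The function name and the default scores are hypothetical.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty, score only)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets an alignment restart anywhere: this is what
            # makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences made up for illustration; they share the substring "ACG".
score = local_alignment_score("TTACGTT", "GGACGAA")
print(score)
```

Only two rows of the dynamic programming table are kept, which is enough for the score; recovering the alignment itself would need the full table or a traceback structure.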

  4. Proteins

  5. CS view of a protein
     >sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor (Basic protease inhibitor) (BPI) (BPTI) (Aprotinin) - Bos taurus (Bovine).
     MKMSRLCLSVALLVLLGTLAASTPGCDT
     SNQAKAQRPDFCLEPPYTGPCKARIIRY
     FYNAKAGLCQTFVYGGCRAKRNNFKSA
     EDCMRTCGGAIGPWENL
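The "CS view" treats the protein as a header line plus a string over the amino acid alphabet. A minimal sketch of reading such a FASTA record into a (header, sequence) pair follows. The parser name is hypothetical, the line breaks in the sequence are assumed from the slide layout, and a real project would more likely use an established library.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


record_text = """>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor (Basic protease inhibitor) (BPI) (BPTI) (Aprotinin) - Bos taurus (Bovine).
MKMSRLCLSVALLVLLGTLAASTPGCDT
SNQAKAQRPDFCLEPPYTGPCKARIIRY
FYNAKAGLCQTFVYGGCRAKRNNFKSA
EDCMRTCGGAIGPWENL
"""

header, sequence = parse_fasta(record_text)[0]
accession = header.split("|")[1]  # second '|'-separated field of the header
print(accession, len(sequence))
```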

  6. Protein structure basics

  7. Bond angles form structural constraints

  8. Alpha-helix
     - 3.6 residues per turn
     - H-bonds between residues i and i+4 (for example, the 1st and 5th residue) stabilize the structure.
     - First discovered by Linus Pauling

  9. Beta-sheet
     - Each strand by itself has 2 residues per turn, and is not stable.
     - Adjacent strands hydrogen-bond to form stable beta-sheets, parallel or anti-parallel.
     - Beta sheets have long-range interactions that stabilize the structure, while alpha-helices have local interactions.

  10. Domains
      - The basic structures (helix, strand, loop) combine to form complex 3D structures.
      - Certain combinations are popular. Many sequences, but only a few folds.

  11. 3D structure
      - Predicting tertiary structure is an important problem in bioinformatics.
      - Premise: clues to structure can be found in the sequence.
      - While de novo tertiary structure prediction is hard, there are many intermediate and tractable goals.

  12. Protein Domains
      - An important realization (in the last decade) is that proteins have a modular architecture of domains/folds.
      - Example: The zinc finger domain is a DNA-binding domain.

  13. Zinc Finger domain

  14. Proteins containing zf domains
      How can we find a motif corresponding to a zf domain?

  15. The sequence analysis perspective
      - Zinc Finger motif
      - #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
      - 2 conserved C, and 2 conserved H
      - How can we search a database using these motifs?
      - The 'regular expression' motif is weak. How can we make it stronger?
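One direct way to search with the motif above is to translate it into a regular expression and scan each database sequence. The sketch below does that under explicit assumptions: `X` is taken to mean any residue, and `#` is not defined on the slide, so it is mapped here to a hypothetical hydrophobic residue class. The helper name and the test peptide are also made up for illustration; the peptide is not taken from a real protein record.

```python
import re

# ASSUMPTION: the slide does not define '#'; a hydrophobic class is used here.
HYDROPHOBIC = "[AFILMVWY]"


def motif_to_regex(motif, hash_class=HYDROPHOBIC):
    """Translate a motif such as '#-X-C-X(1-5)-C' into a regex string."""
    tokens = re.findall(r"#|X\(\d+-\d+\)|X\d*|\[[A-Z/]+\]|[A-Z]", motif)
    parts = []
    for tok in tokens:
        if tok == "#":
            parts.append(hash_class)
        elif tok.startswith("X("):          # X(1-5): between 1 and 5 residues
            low, high = tok[2:-1].split("-")
            parts.append(".{%s,%s}" % (low, high))
        elif tok.startswith("X"):           # X or X3: one or exactly n residues
            parts.append("." if tok == "X" else ".{%s}" % tok[1:])
        elif tok.startswith("["):           # [H/C]: one of the listed residues
            parts.append("[" + tok[1:-1].replace("/", "") + "]")
        else:                               # a conserved residue such as C or H
            parts.append(tok)
    return "".join(parts)


ZF_MOTIF = "#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]"
pattern = motif_to_regex(ZF_MOTIF)

finger = "YKCPECGKSFSQKSNLQKHQRTH"   # constructed example, not a real record
hit = re.search(pattern, "MSGG" + finger + "AAA")
print(pattern, hit.start(), hit.group())
```

A pattern like this either matches or does not, with no score attached, and most positions accept any residue. That is one way to see why the slide calls the regular expression motif weak.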
