CSCE 471/871 Lecture 0: Administrivia

Stephen Scott
sscott@cse.unl.edu

Welcome to 471/871!

Check your name on the roster, or write your name if you’re not listed
Introduce yourself
    1. Who are you?
    2. What are you?
    3. Why are you here?
    4. What is one thing about you that few others know about?
You should have the following handouts:
    1. Syllabus
    2. Copies of slides
Bring a laptop on Thursday!

CSCE 471/871 Lecture 1: Introduction

Stephen Scott
(With thanks to Andy Benson and Jitender Deogun)
sscott@cse.unl.edu

Outline

What is bioinformatics?
Relevant biology background
Fundamental questions in bioinformatics
What we will (and will not) cover in this course

What is Bioinformatics?

Bio = (molecular) biology
Informatics = computer science
Bioinformatics = using computer science tools and techniques for solving problems in (molecular) biology
(Loose) synonym: Computational Biology

What is Bioinformatics? (cont’d)

Original motivation comes from molecular biology
Sequence analysis
Most accurate analysis is via experimentation (“bench work”), but expensive and time-consuming (e.g., GenBank has > 1.5 × 10^11 base pairs from > 1.6 × 10^8 sequences)
Bio problems suggest computational problems, which then suggest new biological experiments

Relevant Biology Background

Basic idea: genes (chains of nucleotides) are converted into proteins (chains of amino acids)
Proteins are the “workhorses” of biological systems, governing metabolic processes
E.g., blood clotting is a process that consists of a chain reaction of numerous protein interactions

Relevant Biology Background: Flow of Information

DNA (coding region)
⇓ Transcription
RNA
⇓ Translation
Protein
⇓
Activity/Function: structure, physiology, gene regulation, cell division, differentiation

Relevant Biology Background: DNA and Genes

1. An organism’s DNA is a (long) sequence of nucleotides (bases, residues), from {Adenine (A), Guanine (G), Cytosine (C), Thymine (T)}
2. Cellular machinery transcribes the coding regions of DNA into RNA
    Has same alphabet, substituting U (uracil) for T
    Non-coding regions are not transcribed

... ATTGATA ATGCTGAACTACAAATTACGGCAGGCAACCGGAGCCTGGAAGTGA TAGGA ...
⇓
AUGCUGAACUACAAAUUACGGCAGGCAACCGGAGCCUGGAAGUGA

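
The transcription step in item 2 can be sketched in a few lines of Python. This is only an illustration of the slide's simplified model (copy the coding region, substituting U for T); it ignores strand complementarity and everything else real cellular machinery does. The names `transcribe`, `upstream`, `coding`, and `downstream` are hypothetical, and the region boundaries are taken from the spacing in the example sequence above.

```python
def transcribe(coding_dna):
    """Simplified transcription: same alphabet, with U (uracil) in place of T."""
    return coding_dna.upper().replace("T", "U")

# Example sequence from the slide, split into its three visible pieces.
upstream = "ATTGATA"      # non-coding, not transcribed
coding = "ATGCTGAACTACAAATTACGGCAGGCAACCGGAGCCTGGAAGTGA"
downstream = "TAGGA"      # non-coding, not transcribed

genome = upstream + coding + downstream

# Assumed coordinates of the coding region within the larger sequence.
start = len(upstream)
end = start + len(coding)

rna = transcribe(genome[start:end])
print(rna)  # AUGCUGAACUACAAAUUACGGCAGGCAACCGGAGCCUGGAAGUGA
```

Only the slice between `start` and `end` is transcribed, which mirrors the point that non-coding regions are left out.
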
Relevant Biology Background: DNA and Genes (cont’d)

3. Then introns (non-coding subsequences) are removed, yielding mRNA
    Adjacent triples are codons, each encoding an amino acid
4. mRNA is translated codon-by-codon into a polypeptide by ribosomes (organelles in cells’ cytoplasm)
5. Proteins are composed of one or more polypeptide chains

AUGCUG AA CUA C AAAUUACGGCAGGCAACCGGAGCCUGGAAGUGA
⇓
AUG CUG CUA AAA UUA CGG CAG GCA ACC GGA GCC UGG AAG UGA
⇓
M L L K L R Q A T G A W K [X]

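
Step 3 can be sketched as follows. The intron coordinates are an assumption read off the spacing in the example (the isolated `AA` and `C`); in practice, locating introns is itself a hard problem. The helper names `splice` and `codons` are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) half-open index pairs."""
    pieces, pos = [], 0
    for start, end in sorted(introns):
        pieces.append(pre_mrna[pos:start])  # keep the stretch before the intron
        pos = end                           # skip over the intron itself
    pieces.append(pre_mrna[pos:])           # keep whatever follows the last intron
    return "".join(pieces)

def codons(mrna):
    """Split mRNA into adjacent, non-overlapping triples."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

pre_mrna = "AUGCUGAACUACAAAUUACGGCAGGCAACCGGAGCCUGGAAGUGA"

# Assumed intron positions: "AA" at indices 6-7 and "C" at index 11.
mrna = splice(pre_mrna, [(6, 8), (11, 12)])

print(" ".join(codons(mrna)))
# AUG CUG CUA AAA UUA CGG CAG GCA ACC GGA GCC UGG AAG UGA
```

Removing the three intron bases shifts every downstream codon boundary, which is why the codons after splicing differ from a naive split of the unspliced RNA.
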
Relevant Biology Background: Translation

First position          Second position          Third position
(5’ end)            U      C      A      G       (3’ end)
U                   Phe    Ser    Tyr    Cys     U
U                   Phe    Ser    Tyr    Cys     C
U                   Leu    Ser    STOP   STOP    A
U                   Leu    Ser    STOP   Trp     G
C                   Leu    Pro    His    Arg     U
C                   Leu    Pro    His    Arg     C
C                   Leu    Pro    Gln    Arg     A
C                   Leu    Pro    Gln    Arg     G
A                   Ile    Thr    Asn    Ser     U
A                   Ile    Thr    Asn    Ser     C
A                   Ile    Thr    Lys    Arg     A
A                   Met    Thr    Lys    Arg     G
G                   Val    Ala    Asp    Gly     U
G                   Val    Ala    Asp    Gly     C
G                   Val    Ala    Glu    Gly     A
G                   Val    Ala    Glu    Gly     G

Genetic code is degenerate: 64 codons, 20 amino acids

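
As a rough sketch, the codon table can be encoded as a dictionary and used both to translate an mRNA and to count how many codons map to each amino acid. The table is written as a 64-character string in the same U, C, A, G ordering as the slide, using one-letter amino acid symbols with `*` for STOP. The names `CODON_TABLE`, `translate`, and `degeneracy` are hypothetical, and the input is the spliced mRNA from the running example.

```python
from collections import Counter
from itertools import product

# Standard genetic code: first base varies slowest, third base fastest,
# each position cycling through U, C, A, G ('*' marks a STOP codon).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first STOP codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

# Degeneracy: number of codons that encode each amino acid (or STOP).
degeneracy = Counter(CODON_TABLE.values())

print(translate("AUGCUGCUAAAAUUACGGCAGGCAACCGGAGCCUGGAAGUGA"))  # MLLKLRQATGAWK
print(degeneracy["L"], degeneracy["M"], degeneracy["*"])        # 6 1 3
```

The counts make the degeneracy concrete: leucine is reachable from six codons, methionine from only one, and three codons signal STOP, leaving 61 codons for 20 amino acids.
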
Relevant Biology Background: Symbols for Amino Acids

A   Ala   Alanine            M   Met   Methionine
C   Cys   Cysteine           N   Asn   Asparagine
D   Asp   Aspartic Acid      P   Pro   Proline
E   Glu   Glutamic Acid      Q   Gln   Glutamine
F   Phe   Phenylalanine      R   Arg   Arginine
G   Gly   Glycine            S   Ser   Serine
H   His   Histidine          T   Thr   Threonine
I   Ile   Isoleucine         V   Val   Valine
K   Lys   Lysine             W   Trp   Tryptophan
L   Leu   Leucine            Y   Tyr   Tyrosine

Relevant Biology Background: Protein Structure

Protein folding and structure: the biggest black box

1. Primary amino acid sequence: predicted from DNA sequence
2. Secondary structure: local structures within the polypeptide chain that are controlled by bond rotation angles of amino acids
    a. Alpha helices
    b. Beta sheets
3. Tertiary structure: global packing of the secondary structures of the entire polypeptide chain
4. Quaternary structure: 3-dimensional packing of multiple polypeptide chains (multisubunit protein complexes)

Some Fundamental Questions

Given an organism, what is its genetic sequence?
    ⇒ Sequence assembly
Given a sequence, what genes does it encode?
    ⇒ Gene finding
Given a protein:
    What is its structure?
        ⇒ Structure prediction
    What other proteins is it related to?
        ⇒ Homology prediction/phylogeny
    What is its function?
        ⇒ Function prediction
All this from (mainly) only sequences of letters!

What We Will Study

Pairwise alignment of sequences
Multiple alignment of sequences
Profiling (modeling) a multiple alignment
Building phylogenetic (evolutionary) trees (time permitting)
Predicting secondary structure and/or function of RNA and proteins (time permitting)

What We Will Not Study (but are still interesting problems)

Gene finding
Inferring metabolic pathways
Predicting tertiary structure of proteins