protein functions prediction

Protein functions prediction Swiss Institute of Bioinformatics - PDF document



  1. Protein functions prediction
     Swiss Institute of Bioinformatics (Institut Suisse de Bioinformatique), LF-2004.08

     Introduction
     - Signal peptides
     - Transmembrane regions and topology
     - PTM (post-translational modifications)
     - Low complexity and biased regions
     - Repeats
     - Coils
     - Secondary structure
     - Antigenic peptides
     - Domain/Motifs
     - Tools
     - The EMBOSS package

  2. Different techniques
     - Algorithms
       - Sliding window, Nearest Neighbor
     - Patterns, regular expression
     - Weight matrices
     - HMM, profiles
     - Neural Networks
     - Rules

     Sliding window
     THISISATESTSEQVENCETHATDISPLAYSTHESLIDINGWINDQW
     - Score 1, Score 2, ... Score n
     - Width or Size=11, Step=5
     - Results are usually displayed as a graph.
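The sliding-window scheme on this slide (width 11, step 5) can be sketched in Python as follows. The slide does not say which score is computed for each window; this sketch assumes the mean Kyte-Doolittle hydropathy, and the function name `sliding_window_scores` is hypothetical.

```python
# Kyte-Doolittle hydropathy scale (assumed scoring scheme; the slide names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_scores(seq, width=11, step=5):
    """Return (0-based start, mean score) for each full window along seq."""
    scores = []
    for start in range(0, len(seq) - width + 1, step):
        window = seq[start:start + width]
        mean = sum(KD[aa] for aa in window) / width
        scores.append((start, mean))
    return scores


if __name__ == "__main__":
    seq = "THISISATESTSEQVENCETHATDISPLAYSTHESLIDINGWINDQW"
    # One line per window; plotting these values gives the usual graph.
    for start, score in sliding_window_scores(seq):
        print(f"{start + 1:3d}-{start + 11:3d}  {score:6.2f}")
```

Windows that would run past the end of the sequence are skipped, so a sequence shorter than the width yields no scores.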

  3. Patterns / regular expression
     - Pattern: <A-x-[ST](2)-x(0,1)-{V}
     - Regexp: ^A.[ST]{2}.?[^V]
     - Text: The sequence must start with an alanine, followed by any amino acid, followed by a serine or a threonine, two times, followed by any amino acid or nothing, followed by any amino acid except a valine.
     - Only the syntax differs…

     Weight matrices (PSSM)
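The correspondence between the two notations can be sketched as a small converter. This is a minimal illustration covering only the elements used in the slide's example (residues, `x`, `[...]`, `{...}`, repeat counts and the `<` / `>` anchors), not the full PROSITE syntax; the name `prosite_to_regex` is hypothetical. It emits `.{0,1}` where the slide writes the equivalent `.?`.

```python
import re

# One PROSITE element: x, a residue, [allowed] or {forbidden},
# optionally followed by a repeat count (n) or range (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    prefix = suffix = ""
    if pattern.startswith("<"):      # anchored at the N-terminus
        prefix, pattern = "^", pattern[1:]
    if pattern.endswith(">"):        # anchored at the C-terminus
        suffix, pattern = "$", pattern[:-1]
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                          # any amino acid
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any amino acid except these
        if low and high:
            core += "{" + low + "," + high + "}"
        elif low:
            core += "{" + low + "}"
        parts.append(core)
    return prefix + "".join(parts) + suffix


if __name__ == "__main__":
    regex = prosite_to_regex("<A-x-[ST](2)-x(0,1)-{V}")
    print(regex)
    print(bool(re.match(regex, "AGSTL")), bool(re.match(regex, "AGSTV")))
```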

  4. HMM / profiles

     Neural Networks
     - General principle (figure)
     - Example (figure)

  5. Signals found in proteins
     - N-ter
       - exportation - secretion
       - mitochondria
       - chloroplast
     - internal
       - NLS (nuclear localization signal)
     - C-ter
       - GPI-anchor (Glycosyl Phosphatidyl Inositol)
       - other membrane anchors (see PTM)
       - other unknown ?

     Signals detection tools
     - SignalP
     - Big-PI
     - MitoProt
     - DGPI
     - ChloroP
     - Predotar
     - PSort
     - TargetP
     - Sigcleave (EMBOSS)
     - Phobius

  6. Transmembrane regions
     - Detection (signal peptide, hydropathy, helices)
     - Organisation (topology)

     Transmembrane detection tools
     - TMHMM
     - TMPred
     - TopPred2
     - DAS
     - HMMTop
     - Tmap (EMBOSS)
     - Mixture of tools
       - Phobius
       - ConPred II

  7. Post translational modifications
     - Phosphorylation
       - S - T - Y
     - N-glycosylation
       - N
     - O-glycosylation
       - S - T - (HO)K
     - Acetylation, methylation
       - D - E - K
     - Sulfation
       - Y
     - Farnesylation, myristylation, palmitoylation, geranylgeranylation, GPI-anchor
       - C - Nter - Cter
     - Ubiquitination and family
       - K - Nter
     - Inteins (protein splicing)
     - Pre-translational
       - Selenoprotein
       - C

     PTM detection
     - Pattern prediction (PROSITE)
       - Short or weak signal
       - Frequent hit producer
     - Best method is experimental
       - MS/MS detection
     - Most methods use « rules » joining pattern detection and knowledge to predict sites.
     - NetOGlyc - Prediction of type O-glycosylation sites in mammalian proteins
     - DictyOGlyc - Prediction of GlcNAc O-glycosylation sites in Dictyostelium
     - YinOYang - O-beta-GlcNAc attachment sites in eukaryotic protein sequences
     - NetPhos - Prediction of Ser, Thr and Tyr phosphorylation sites in eukaryotic proteins
     - NMT - Prediction of N-terminal N-myristoylation
     - Sulfinator - Prediction of tyrosine sulfation sites
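To illustrate why pattern-based PTM prediction is a frequent hit producer, here is a minimal sketch that scans a sequence for the N-glycosylation consensus N-{P}-[ST]-{P} (PROSITE PS00001). The choice of this pattern is an assumption made for illustration, since the slide names no specific pattern, and the function name is hypothetical. A match marks only a potential site; such a short motif occurs by chance in many proteins.

```python
import re

# N-{P}-[ST]-{P} written as a regular expression.
# The lookahead makes matches zero-width, so overlapping sites are all reported.
N_GLYC = re.compile(r"(?=N[^P][ST][^P])")


def n_glycosylation_sites(seq):
    """Return 1-based positions of Asn residues that start a potential site."""
    return [m.start() + 1 for m in N_GLYC.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the slides.
    print(n_glycosylation_sites("MKNGSALNPTQNNTSA"))
```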

  8. Low complexity regions
     - repeats
     - compositional bias
     - PEST

     Low complexity / Repeats
     - DUST (DNA) / SEG
       - de novo detection
     - RepeatMasker (DNA)
       - search collection
     - REP
       - search collection
     - REPRO, Radar
       - de novo detection
     - PEST, PESTFind
       - de novo detection
     - EMBOSS (DNA)
       - einverted
       - equicktandem
       - etandem
       - palindrome
     - EMBOSS (protein)
       - oddcomp
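Compositional bias can be illustrated with a windowed Shannon entropy: a window made of only a few distinct residues has low entropy. This is a simplified sketch of the idea, not the algorithm implemented by SEG or any other tool listed here; the window width, the threshold and the function names are assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def shannon_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    n = len(window)
    counts = Counter(window)
    return -sum(c / n * log2(c / n) for c in counts.values())


def low_complexity_windows(seq, width=12, max_entropy=2.2):
    """Return 0-based starts of windows whose entropy is at or below the threshold."""
    return [
        start
        for start in range(len(seq) - width + 1)
        if shannon_entropy(seq[start:start + width]) <= max_entropy
    ]


if __name__ == "__main__":
    # Hypothetical sequence with a poly-Q stretch in the middle.
    seq = "MKTAYIAKQRQQQQQQQQQQQQQQISFVKSHFSRQ"
    print(low_complexity_windows(seq))
```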

  9. Coils
     - Helix of helix
     - coiled-coil
     - Leu-zipper

     Coils detection
     - COILS
       - Weight matrices
     - Paircoil, Multicoil
       - Pairwise correlation
     - Marcoil
       - HMM
     - Pepcoil (EMBOSS)
       - Weight matrices

  10. Secondary structure
     - Structure to predict
       - Alpha-helices
       - Beta-sheets
       - Turns
       - Random coil
     - Garnier (EMBOSS)
     - PHD
     - DSC
     - PREDATOR
     - NNSSP
     - Jpred
     - Jnet
     - Many others

     Antigenic peptide
     - Use of experimental knowledge
     - Databases of known peptides
     - SYFPEITHI
     - HLA_Bind (BIMAS)
     - MAPPP combined expert
     - Antigenic (EMBOSS)
     - Many more
     - Prediction of proteasome cleavage sites
       - NetChop
       - PaProc
     - Peptides binding to MHC
       - class I
         - 8, 9, 10 mers
       - class II
         - 15 mers (3+9+3)
       - Depend highly on MHC type
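The peptide lengths quoted for MHC binding can be turned into a simple enumeration of candidate peptides. This sketch only lists candidates; it does not predict binding, which the tools above do with data specific to each MHC type. Reading "3+9+3" as three flanking residues, a nine-residue core and three flanking residues is an assumption, and the function names are hypothetical.

```python
def peptides(seq, lengths):
    """Yield (1-based start, peptide) for every window of the given lengths."""
    for k in lengths:
        for start in range(len(seq) - k + 1):
            yield start + 1, seq[start:start + k]


def class_i_candidates(seq):
    """All 8-, 9- and 10-mers of the sequence."""
    return list(peptides(seq, (8, 9, 10)))


def class_ii_candidates(seq):
    """All 15-mers, split as (start, left flank, 9-residue core, right flank)."""
    return [
        (pos, pep[:3], pep[3:12], pep[12:])
        for pos, pep in peptides(seq, (15,))
    ]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the slides.
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    print(len(class_i_candidates(seq)), "class I candidates")
    print(class_ii_candidates(seq)[0])
```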

  11. Domain / Motif
     - All the protein domain descriptors
     - Many techniques
       - Patterns, Regexp
       - PSSM (PSI-BLAST)
       - Profiles
       - HMM
     - Federation: InterPro
       - PROSITE
       - PFAM
       - SMART
       - PRODOM
       - BLOCKS
       - PRINTS
       - TIGRfam
       - …

     Other Tools
     - You can find some of them on our servers
       - www.ch.embnet.org
     - Or on ExPASy server
       - www.expasy.org/tools
     - Or ask Google!!
       - www.google.com

  12. European Molecular Biology Open Software Suite

     How to use EMBOSS/Jemboss at SIB

  13. EMBOSS
     - Free Open Source (for most Unix platforms)
     - GCG successor (compatible with GCG file format)
     - More than 150 programs (ver. 2.9.0)
     - Easy to install locally
       - but no interface, requires local databases
     - Unix command-line only
     - Interfaces
       - Jemboss, www2gcg, w2h, wemboss… (with account)
       - Pise, EMBOSS-GUI, SRSWWW (no account)
       - Staden, Kaptain, CoLiMate, Jemboss (local)
     - Access: www.emboss.org or emboss.sourceforge.net

     Some details
     - Format USA (Uniform Sequence Address)
       - 'asis'::Sequence[start:end:reverse]
       - Format::'@'ListFile[start:end:reverse]
       - Format::'list':ListFile[start:end:reverse]
       - Format::Database:Entry[start:end:reverse]
       - Format::Database-SearchField:Word[start:end:reverse]
       - Format::File:Entry[start:end:reverse]
       - Format::File:SearchField:Word[start:end:reverse]
       - Format::Program Program-parameters '|'[start:end:reverse]
       - Example: fasta::Swissprot:UBP5_HUMAN[200:300]
     - Databases
       - Any can be added, use showdb to display the available databases
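The structure of a USA can be made concrete with a small parser. This is a minimal sketch that handles only the Format::Database:Entry[start:end:reverse] form from the list above, with the format and the range both optional; it is not EMBOSS's own parser, the name `parse_usa` is hypothetical, and accepting any word as the reverse flag is an assumption.

```python
import re

# Covers only Format::Database:Entry[start:end:reverse]; other USA forms
# (asis, list files, search fields, programs) are not handled here.
_USA = re.compile(
    r"^(?:(?P<format>\w+)::)?"
    r"(?P<database>\w+):(?P<entry>\w+)"
    r"(?:\[(?P<start>-?\d+):(?P<end>-?\d+)(?::(?P<reverse>\w+))?\])?$"
)


def parse_usa(usa):
    """Split a database-style USA into its named parts."""
    m = _USA.match(usa)
    if m is None:
        raise ValueError(f"unsupported USA: {usa!r}")
    parts = m.groupdict()
    for key in ("start", "end"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    return parts


if __name__ == "__main__":
    print(parse_usa("fasta::Swissprot:UBP5_HUMAN[200:300]"))
```

For the slide's example this yields format `fasta`, database `Swissprot`, entry `UBP5_HUMAN` and the range 200 to 300.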
