practical bioinformatics

Practical Bioinformatics. Mark Voorhies, 5/14/2019


  1. Practical Bioinformatics. Mark Voorhies, 5/14/2019

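  2. Course platform: VirtualBox
     [Diagram] The host operating system (e.g., OS X) runs VirtualBox, which runs a Debian Linux guest.
     Web: guest port 8888 (Jupyter, which launches Python3 and Bash) is forwarded to host port 8088 (browser).
     SSH: guest port 22 (Bash) is forwarded to host port 8022 (Bash).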

  3. Starting the virtual machine
       1. Start VirtualBox
       2. Boot the VM guest
       3. Open a bash terminal on the host
       4. Log into the guest and start Jupyter:

              ssh-add ~/.ssh/VM_rsa
              ssh -p 8022 explorer@localhost
              jupyter notebook

       5. In a host web browser, go to https://localhost:8088/

  4. supp2data.csv: the CSV file

  5. open("supp2data.csv"): CSV file -> file object

  6. next(open("supp2data.csv")): CSV file -> file object -> single line
     (The slide writes .next(), the Python 2 spelling; Python 3 uses the built-in next().)

  7. open("supp2data.csv").read(): CSV file -> file object -> whole file

  8. next(csv.reader(open("supp2data.csv"))): CSV file -> file object -> reader -> list

  9. next(csv.reader(urlopen("http://example.com/csv"))): CSV file -> web service -> urllib object -> reader -> list
     (In Python 3, urlopen returns bytes, so the response must be decoded to text, e.g. wrapped in io.TextIOWrapper, before csv.reader can parse it.)
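The chain of calls on slides 5 to 9 can be tried out in Python 3 with the sketch below. It is only an illustration under stated assumptions: the course file supp2data.csv is not reproduced here, so a tiny made-up CSV (`sample_text`, with hypothetical column names) is written to a temporary file, and an `io.BytesIO` object stands in for the `urlopen` response so that no network access is needed.

```python
import csv
import io
import os
import tempfile

# Hypothetical stand-in for supp2data.csv so the sketch runs anywhere.
sample_text = "gene,t0,t1\ngeneA,0.1,0.5\ngeneB,-0.2,0.3\n"
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as handle:
    handle.write(sample_text)

# A file object yields one line (a string) at a time.
with open(path) as handle:
    first_line = next(handle)

# read() returns the whole file as a single string.
with open(path) as handle:
    whole_file = handle.read()

# csv.reader wraps any iterable of text lines and yields lists of fields.
with open(path, newline="") as handle:
    first_row = next(csv.reader(handle))

# A web response yields bytes; io.TextIOWrapper decodes it into text lines.
# io.BytesIO is an assumption standing in for the object urlopen() returns.
fake_response = io.BytesIO(sample_text.encode("utf-8"))
first_web_row = next(csv.reader(io.TextIOWrapper(fake_response, encoding="utf-8")))

print(first_row)
```

Using `with` closes each file as soon as the block ends, which the one-line expressions on the slides leave to the interpreter.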

  10. Anatomy of a Programming Language

  14. Talking to Python: Nouns

          # This is a comment
          # This is an int (integer)
          42
          # This is a float (rational number)
          4.2
          # These are all strings (sequences of characters)
          'ATGC'
          "Mendel's Laws"
          """>CAA36839.1 Calmodulin
          MADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAEL
          QDMINEVDADDLPGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDK
          DGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQ
          MMTAK"""

  15. Python as a Calculator

          # Addition
          1 + 1
          # Subtraction
          2 - 3
          # Multiplication
          3 * 5
          # Division
          5 / 3
          # Exponentiation
          2 ** 3
          # Order of operations
          2 * 3 - (3 + 4) ** 2
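Two details of the calculator slide are worth trying directly: how division behaves in Python 3 (the version on the course VM), and how the order-of-operations example is evaluated. The variable names below are chosen only for this sketch.

```python
# Python 3: / is true division, // is floor division, % is the remainder.
true_quotient = 5 / 3      # a float, about 1.667
floor_quotient = 5 // 3    # 1
remainder = 5 % 3          # 2

# Parentheses are evaluated first, then **, then *, then -:
# 2 * 3 - (3 + 4) ** 2  ->  2 * 3 - 7 ** 2  ->  6 - 49
result = 2 * 3 - (3 + 4) ** 2

print(true_quotient, floor_quotient, remainder, result)
```

In Python 2, `5/3` with two integers would instead give the floor quotient, which is one reason to check which interpreter a notebook is running.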

  16. Remembering objects

          # Use a single = for assignment:
          TLC = "GATACA"
          YFG = "CTATGT"
          MFG = "CTATGT"
          # A name can occur on both sides of an assignment:
          codon_position = 1857
          codon_position = codon_position + 3
          # Short-hand for common updates:
          codon += 3
          weight -= 10
          expression *= 2
          CFU /= 10.0
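The short-hand updates on this slide only work on names that already hold a value; otherwise Python raises a NameError. A minimal sketch, with starting values that are invented here purely for illustration:

```python
codon_position = 1857
codon_position += 3   # same as codon_position = codon_position + 3

# Hypothetical starting values; the slide does not define these names.
weight = 70
weight -= 10

expression = 1.5
expression *= 2

CFU = 2500
CFU /= 10.0           # / always produces a float in Python 3

print(codon_position, weight, expression, CFU)
```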

  17. Displaying values with print

          # Use print to show the value of an object
          message = "Hello, world"
          print(message)
          # Or several objects:
          print(1, 2, 3, 4)
          # Older versions of Python (Python 2) use a
          # different print syntax, which is a SyntaxError in Python 3:
          # print "Hello, world"

  18. Comparing objects

          # Use double == for comparison:
          YFG == MFG
          # Other comparison operators:
          # Not equal:
          TLC != MFG
          # Less than:
          3 < 5
          # Greater than, or equal to:
          7 >= 6

  19. Making decisions

          if (YFG == MFG):
              print("Synonyms!")

          if (protein_length < 60):
              print("Probably too short to fold.")
          elif (protein_length > 10000):
              print("What is this, titin?")
          else:
              print("Okay, this looks reasonable.")
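To see each branch of the if/elif/else chain fire, it helps to wrap the decision in a function and call it with different lengths. This is only a sketch: the function name `describe_length` and the test lengths are made up here, while the thresholds and messages are the ones on the slide.

```python
def describe_length(protein_length):
    """Return the slide's message for a protein of the given length."""
    if protein_length < 60:
        return "Probably too short to fold."
    elif protein_length > 10000:
        return "What is this, titin?"
    else:
        return "Okay, this looks reasonable."


# Exactly one branch runs for each value.
for length in [30, 150, 35000]:
    print(length, describe_length(length))
```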

  20. Collections of objects

          # A list is a mutable sequence of objects
          mylist = [1, 3.1415926535, "GATACA", 4, 5]
          # Indexing
          mylist[0] == 1
          mylist[-1] == 5
          # Assigning by index
          mylist[0] = "ATG"
          # Slicing (the first element is now "ATG")
          mylist[1:3] == [3.1415926535, "GATACA"]
          mylist[:2] == ["ATG", 3.1415926535]
          mylist[3:] == [4, 5]
          # Assigning a second name to a list
          alsomylist = mylist
          # Assigning to a copy of a list
          myotherlist = mylist[:]
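The last two assignments on this slide behave differently once the list is modified: a second name refers to the same list object, while a slice copy is a new list. A small sketch using the slide's names:

```python
mylist = [1, 3.1415926535, "GATACA", 4, 5]
alsomylist = mylist       # second name for the same list object
myotherlist = mylist[:]   # shallow copy: a new list with the same elements

mylist[0] = "ATG"         # modify the list through its first name

print(alsomylist[0])      # the change is visible through the second name
print(myotherlist[0])     # the copy still holds the original value
print(alsomylist is mylist, myotherlist is mylist)
```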

  21. Repeating yourself: iteration

          # A for loop iterates through a list one element
          # at a time:
          for i in [1, 2, 3, 4, 5]:
              print(i, i ** 2)
          # A while loop iterates for as long as a condition
          # is true:
          population = 1
          while (population < 1e5):
              print(population)
              population *= 2
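Loops become more useful when they accumulate a result rather than only printing. The sketch below reuses the two loops from the slide; the names `squares` and `generations` are additions made for this example.

```python
# Collect the squares instead of printing them.
squares = []
for i in [1, 2, 3, 4, 5]:
    squares.append(i ** 2)

# Count how many doublings it takes to reach 1e5.
population = 1
generations = 0
while population < 1e5:
    population *= 2
    generations += 1

print(squares)
print(generations, population)
```

The while loop stops at the first value that is no longer below 1e5, so the final population overshoots the threshold.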

  22. Verb that noun!
      return_value = function(parameter, ...)
      “Python, do function to parameter”

          # Built-in functions
          # Generate the integers from 0 to n-1
          # (a range object in Python 3; use list(range(5)) for a list)
          a = range(5)
          # Sum over an iterable object
          sum(a)
          # Find the length of an object
          len(a)
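Capturing the return values makes the "return_value = function(parameter)" pattern concrete. A short sketch in Python 3, where the names `as_list`, `total`, and `count` are chosen only for this example:

```python
a = range(5)

as_list = list(a)   # [0, 1, 2, 3, 4]
total = sum(a)      # 0 + 1 + 2 + 3 + 4
count = len(a)      # number of elements

print(as_list, total, count)
```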

  23. Verb that noun!
      return_value = function(parameter, ...)
      “Python, do function to parameter”

          # Importing functions from modules
          import numpy
          numpy.sqrt(9)

          import matplotlib.pyplot as plt
          fig = plt.figure()
          plt.plot([1, 2, 3, 4, 5], [0, 1, 0, 1, 0])

          from IPython.core.display import display
          display(fig)

  24. New verbs

          def function(parameter1, parameter2):
              """Do this!"""
              # Code to do this
              return return_value
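As one concrete instance of the template above, here is a small function built only from constructs covered in the earlier slides (a for loop, an if statement, and len). The function `gc_content` is a hypothetical example, not part of the original deck.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    if len(sequence) == 0:
        return 0.0
    gc = 0
    for base in sequence.upper():
        if base == "G" or base == "C":
            gc += 1
    return gc / len(sequence)


# The two sequences used on the assignment slide.
print(gc_content("GATACA"))
print(gc_content("CTATGT"))
```

The docstring is what `help(gc_content)` displays in an interactive session.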

  31. Summary
        - Python is a general purpose programming language.
        - We can extend Python’s built-in functions by defining our own functions (or by importing third party modules).
        - We can define complex behaviors through control statements like “for”, “while”, and “if”.
        - We can use an interactive Python session to experiment with new ideas and to explore data.
        - Saving interactive sessions is a good way to document our computer “experiments”.
        - Likewise, we can use modules and scripts to document our computer “protocols”.
        - Most of these statements are applicable to any programming language (Perl, R, Bash, Java, C/C++, FORTRAN, ...).

  32. Homework: Make your own fun
      Write functions for these calculations, and test them on random data:
        1. Mean: $\bar{x} = \frac{\sum_{i}^{N} x_i}{N}$
        2. Standard deviation: $\sigma_x = \sqrt{\frac{\sum_{i}^{N} (x_i - \bar{x})^2}{N - 1}}$
        3. Correlation coefficient (Pearson’s r): $r(x, y) = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2} \sqrt{\sum_{i} (y_i - \bar{y})^2}}$
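One possible way to approach the homework is sketched below; it is not the course's reference solution. Each function follows the corresponding formula term by term, with N - 1 in the standard deviation. The function names, the random seed, and the way the test data are generated are all assumptions made for this example.

```python
import math
import random


def mean(x):
    """Sum of the values divided by N."""
    return sum(x) / len(x)


def stdev(x):
    """Square root of the summed squared deviations over N - 1."""
    x_bar = mean(x)
    return math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (len(x) - 1))


def pearson(x, y):
    """Pearson's r for two equal-length sequences."""
    x_bar = mean(x)
    y_bar = mean(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    x_term = math.sqrt(sum((xi - x_bar) ** 2 for xi in x))
    y_term = math.sqrt(sum((yi - y_bar) ** 2 for yi in y))
    return numerator / (x_term * y_term)


# Random test data: y is a noisy linear function of x, so r should be near 1.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(100)]
y = [2 * xi + random.gauss(0, 0.1) for xi in x]
print(mean(x), stdev(x), pearson(x, y))
```

Checking the functions on small inputs with known answers, such as a perfectly linear pair of lists, is a quick sanity test before trusting them on random data.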
