Discussion Session Thursday May 15, 2008 1 Find out whos working - - PowerPoint PPT Presentation




SLIDE 1

Discussion Session

Thursday May 15, 2008

SLIDE 2

• Find out who’s working on bioinformatics at Miami
• Find out who has tools and expertise at Miami that can be applied to bioinformatics research
• Get biologists and non-biologists to talk to (and maybe even understand!) each other

SLIDE 3

• Two-hour session
• Brief introduction of attendees
  • Biologists: state research problems on which they desire collaboration
  • Non-biologists: give tools and expertise available for collaboration on bioinformatics/biology problems
• Break up into informal discussion session, with facilitation by
  • Chun Liang and Quinn Li, botany
  • Valerie Cross and John Karro, computer science

SLIDE 4

• Expertise in biostatistics
  • Analysis of dose-related tumorigenic trends in the presence of treatment-related toxicity
  • Analysis of pharmacokinetic data, particularly methods for testing the equivalence of the areas under concentration-time profile curves
  • Risk assessment
  • Inverse regression/calibration problems where the dose associated with a particular level of response is estimated and tested
  • Optimal design of experiments for simple compartmental models
  • Integration of model uncertainty in the generation of risk estimates
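
As an illustration of the "area under a concentration-time profile curve" mentioned above, the following is a minimal sketch that approximates the area with the trapezoidal rule. It is not the presenter's method and does not perform any equivalence test; the function name `auc_trapezoid` and the sample values are hypothetical.

```python
def auc_trapezoid(times, concentrations):
    """Approximate the area under a concentration-time curve.

    Uses the trapezoidal rule between consecutive sampling times.
    Both arguments are hypothetical inputs: equal-length sequences,
    with times in strictly increasing order.
    """
    if len(times) != len(concentrations):
        raise ValueError("times and concentrations must have equal length")
    if len(times) < 2:
        raise ValueError("at least two sampling points are required")

    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of one trapezoid: width times the mean of the two heights.
        total += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return total


if __name__ == "__main__":
    # Made-up sampling times (hours) and concentrations, for illustration only.
    print(auc_trapezoid([0, 1, 2, 4], [0.0, 2.0, 1.0, 0.0]))
```

Comparing two profiles for equivalence would need a statistical procedure on top of this, which the slide does not specify.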

SLIDE 5

• Analysis and optimization of algorithms
• Interested in developing efficient algorithms for finding similar sequences in genomic databases
• Work with problems that have a well-defined measure of similarity or difference between objects
• Improve problem solutions that currently use too much memory or take too much time
• Edit distance (number of operations to change one text/genomic string to another)
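
The edit distance named in the last bullet can be sketched as follows. This is one common formulation (Levenshtein distance, counting single-character insertions, deletions, and substitutions), chosen here as an assumption since the slide does not say which operations are allowed. The function name `edit_distance` is hypothetical. Only two rows of the dynamic-programming table are kept, which ties in with the memory concern raised above.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b.

    Counts the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b.
    Memory use is proportional to the shorter string.
    """
    # Keep the shorter string as the row dimension to save memory.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

The running time is still proportional to the product of the two lengths, so this sketch addresses memory but not speed for database-scale searches.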

SLIDE 6

• Study how insect viral genes (esp. baculovirus and ascovirus) are regulated in insect cells
• Baculovirus: would like bioinformatic prediction of which AATAAA is used in certain processing
• Ascovirus: would like bioinformatic search for a particular stem-loop structure, which could then be verified in the lab
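
A first step toward the baculovirus request could be to list every AATAAA occurrence in a sequence. The sketch below does only that: it locates candidate sites and makes no prediction about which one is actually used. The function name `find_motif` and the example sequence are hypothetical.

```python
def find_motif(sequence, motif="AATAAA"):
    """Return the 0-based start positions of every occurrence of motif.

    Matching is case-insensitive and overlapping occurrences are reported.
    """
    sequence = sequence.upper()
    motif = motif.upper()

    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    print(find_motif("GGAATAAACCAATAAAT"))
```

Ranking the candidates would require additional sequence context or experimental data, which is the open problem the slide describes.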

SLIDE 7

Ontology - a vocabulary that represents a set of concepts of a particular domain and the relationships between those concepts

• Gene Ontology (GO) guarantees the consistency of the referenced biological concepts in different databases
• Used to annotate genes in various databases
• Annotations used to determine similarity between genes and gene products
• Group has made various ontology software tools
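
One simple way to turn annotations into a similarity score is to compare the sets of ontology terms attached to two genes. The sketch below uses the Jaccard index as a stand-in; it is not the measure used by the group's tools, and it ignores the relationships between terms that an ontology provides. The function name and the term labels are hypothetical placeholders, not real GO identifiers.

```python
def jaccard_similarity(terms_a, terms_b):
    """Jaccard index of two collections of annotation terms.

    Returns the number of shared terms divided by the number of
    distinct terms overall, or 0.0 when both collections are empty.
    """
    set_a, set_b = set(terms_a), set(terms_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Placeholder term labels for two hypothetical genes.
    gene_x_terms = {"T1", "T2", "T3"}
    gene_y_terms = {"T2", "T3", "T4"}
    print(jaccard_similarity(gene_x_terms, gene_y_terms))
```

A measure that exploits the ontology structure would also credit two genes annotated with different but closely related terms, which plain set overlap cannot do.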

SLIDE 8

[Figure residue: panel labels "Multi-view FCA", "QUOTA", and "OntoSELF"; one panel entry reads "1: lagging and leading strand elongation, CDC2, DBP11, POL2"]

SLIDE 9

• Primary focus on computationally-based analysis of DNA and RNA sequences
• Develop tools to help with analysis
  • Example: Working on identification of functional genomic regions through comparison of genomes from related species
  • Example: Developed tools for the estimation of neutral substitution rates on a local scale
    • Study structure of rates
    • Study effect on evolution of genomic structure

SLIDE 10

• Software engineering
• Software risk management and assessment
• Probabilistic risk assessment
• Software design methodology
• Experimental verification of software design methodology effectiveness
• Visual programming languages

SLIDE 11

• DNA tiling microarrays
  • Massive data sets
  • Broad coverage of genome
  • Low signal/noise ratio
• Want to extract statistically significant information to justify validation experiments in a wet lab
• Seek collaboration from statisticians to develop appropriate statistics
• Seek collaboration from computer scientists to effectively implement statistical and data processing algorithms

SLIDE 12

• Looking for collaboration on genomic sequence assembly and clustering
• Work with expressed sequence tags (ESTs) from complementary DNA (cDNA)
• How to trace a given set of ESTs back to their original genes?
• New technologies can now very quickly sequence enormous amounts of short pieces of cDNA
• Want computational tools to do correct assembly and clustering

SLIDE 13

• Software development in C/C++/Fortran for numerical computation
• Conversion of software for parallel computation
• Application support for various physics and biophysics packages, e.g., ANSYS, Abaqus
• Modeling and simulation of vascular systems
  • Geometric model generation
  • Flow solving
  • Data visualization

SLIDE 14

• Expertise is applied probability
• Served on graduate committees in zoology
• Helped graduate students with data analysis
• Experience in
  • Analysis of variance
  • Markov chains
  • Hidden Markov Models

SLIDE 15

• Expertise in optimization and simulation of complex systems
• Bioinformatics experience
  • Sequencing by hybridization
  • Clustering the avian-flu viruses (with Henry Wan)
  • Working with Chun Liang (Botany) and CSA colleagues to cluster Expressed Sequence Tags (ESTs) to identify genes for conifers
• Would like to hear from other biologists with similar research, e.g., use of ESTs for gene identification and regulation

SLIDE 16

• Has taught classes in introductory statistics, regression analysis, and time series analysis
• Extensive experience applying statistics in business, social science, and natural science
  • Time series analysis to study chemical concentrations of stream flows into Acton Lake
  • Applications of regression techniques

SLIDE 17

• Scientific programming, especially C++ and MATLAB
• Parallel programs on cluster
• Graphical user interfaces (GUI) for programs
• Mathematical modeling
• Digital image processing
• Basic knowledge of a variety of mathematical techniques

SLIDE 18

• Works in Michael Kennedy’s lab
• Seek collaboration and support for
  • Nuclear Magnetic Resonance (NMR) data
  • Use of principal component analysis (PCA)
  • Use of partial least squares discriminant analysis (PLS-DA)

SLIDE 19

• Installation and configuration of bioinformatics applications on the cluster
• IT infrastructure planning and support - servers, network, storage, etc.
• Scripting (writing programs for cluster) and help with cluster batch system
• Database creation and advice on use
• General support of cluster users

SLIDE 20

• Principal expertise
  • Mathematical optimization (theory, algorithms, software)
  • Modeling of decision problems
• Research interests
  • Reformulating mathematical problems for efficiency
  • Applications of optimization to data-fitting
  • Parallel processing in optimization
  • Optimal design of experiments
• Areas of application (to date)
  • Crystallography, statistics, hydrology, econometrics, toxicology, engineering, ecology

SLIDE 21

• Knowledge of statistics useful in
  • Microarray studies (separating signal from noise, cluster analysis, missing data), image analysis
  • Clinical studies, forestry and wildlife, public health
• Specific statistical tools
  • Bayesian hierarchical modeling and Markov chain Monte Carlo (MCMC) algorithms
  • Spatial analysis (areal data and point-referenced data), including prediction and model checking
