Examples of online analysis tools for gene expression data
  Tools integrated in data repositories
  Tools for raw data analysis (CEL files or other scanner output)
  Processed data analysis tools
  Tools linking gene expression with gene function
  Tools linking gene expression with sequence analysis
Tools from the data repositories
  Advantages:
    Fast
    Already run on a huge amount of public data
    Allow a quick and dirty overview of “what's already known”
  Drawbacks:
    Not usable for custom data
    Not flexible, little tuning
  Examples:
    GEO
    ArrayExpress
    SAGEmap
GEO Tools
  Raw data retrieval (SOFT or matrix-formatted objects)
  GEO DataSet Cluster Analysis: a visualization tool for displaying precomputed cluster heat maps
  GEO Profiles: expression profiles for each gene/spot of one selected dataset
GEO DataSet cluster analysis : example
GEO differential expression analysis : example
ArrayExpress Tools
  Processed (matrix) or raw data retrieval
  Expression profiles (per gene and per experiment)
SAGE Anatomic Viewer (SAV)
  Displays gene expression results based on SAGE tag counts in normal and malignant human tissues
Tools for raw files transformation
  Input:
    Affymetrix CEL files
    GenePix or ScanAlyze output files
  Functions:
    Standard microarray corrections and normalization
      Background correction
      Spot filtering
      Intra- and inter-chip normalization
      Replicate scaling
    Data quality assessment and scoring
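To make two of the functions listed above concrete, here is a minimal sketch of background correction followed by a simple inter-chip normalization (log2 transform plus median centering). The slides do not say which algorithms any particular tool applies, so the method, the function names and the intensity values are all illustrative assumptions.

```python
import math
from statistics import median

def background_correct(foreground, background, floor=1.0):
    """Subtract the local background from each spot, clamping to a small
    positive floor so the later log transform stays defined."""
    return [max(f - b, floor) for f, b in zip(foreground, background)]

def log2_median_center(intensities):
    """Log2-transform one chip and shift it so that its median is 0.
    Median centering is only one simple normalization choice."""
    logs = [math.log2(x) for x in intensities]
    m = median(logs)
    return [v - m for v in logs]

# Two hypothetical chips (made-up intensities, four spots each).
# chip_b is simply twice as bright as chip_a after background correction.
chip_a = background_correct([1100, 600, 4200, 300], [100, 100, 200, 100])
chip_b = background_correct([2100, 1100, 8200, 500], [100, 100, 200, 100])

norm_a = log2_median_center(chip_a)
norm_b = log2_median_center(chip_b)
```

Because the second chip differs from the first only by a global brightness factor, the two normalized profiles coincide, which is the effect an inter-chip normalization is meant to achieve.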
Tools for raw files transformation : Express Yourself
Processed data analysis tools
  Drawbacks:
    Can be quite slow
    Input data format is very important
    You need to know your data well before using them
  Advantages:
    Usually offer many functionalities
    Usable for custom data
    Can be very flexible
  Examples:
    CIMminer
    GEDA
    Expression Profiler
    GEPAS
CIMminer
  Generates color-coded Clustered Image Maps (CIMs), i.e. "heat maps"
  Easy to use, but few tuning options
  A good starting point among online clustering tools
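The idea behind a clustered image map can be sketched as follows: rows (genes) are reordered by hierarchical clustering so that similar profiles sit next to each other before the matrix is color-coded. This is a generic average-linkage sketch with a Pearson-correlation distance, not CIMminer's actual implementation; the gene names and values are made up.

```python
import math

def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def cluster_order(profiles):
    """Average-linkage agglomerative clustering.
    Returns the row names in the order of the final merged cluster."""
    clusters = [[name] for name in profiles]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(pearson_distance(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]

# Hypothetical profiles: geneA/geneC rise across samples, geneB/geneD fall.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.1, 2.1, 2.9, 4.2],
    "geneD": [3.9, 3.1, 2.2, 0.8],
}
order = cluster_order(profiles)
```

In the resulting order the two rising genes are adjacent, as are the two falling ones; drawing the rows in that order is what produces the visible blocks in a heat map.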
GEDA
  Specifically designed for the integrated analysis of global gene expression patterns in cancer
  Easy to use, BUT be careful when interpreting the results
GEDA : A few Screenshots
Expression Profiler at EBI
GEPAS
Tools to retrieve gene functions and annotations
  Goals:
    Link Gene Ontology information to co-expressed genes
    Find pathway specificities under certain biological conditions
    Find promoter elements common to co-expressed genes
  Input files:
    Expression data matrix with classes AND gene names
    Gene lists to compare
    Promoter sequences in FASTA format
  Examples:
    CARRIE
    Babelomics
    DAVID: Database for Annotation, Visualization and Integrated Discovery
    INCLUSive: MotifSampler
    SSA
CARRIE
  Computational Ascertainment of Regulatory Relationships Inferred from Expression
  Input:
    Expression data matrix with gene IDs and sample classes
    Associated promoter sequences
  Output:
    Known transcription factors associated with co-expressed genes
    KEGG pathways associated with genes
    Gene Ontology for selected genes
CARRIE
Babelomics : FatiGO
  Linked to the GEPAS gene expression analysis tools
  Web tools for functional annotation and analysis of groups of genes in high-throughput experiments
Babelomics : FatiGO
  Input:
    Two gene lists to compare (differentially expressed genes)
    Different gene IDs supported (Entrez, HUGO, RefSeq, Affy...)
    Uses the GO (Gene Ontology) database
  Output:
    Summary of the input parameters
    Summary of the input data: initial number of genes, number of genes that have an Ensembl correspondence, and number of genes used for the analysis
    Links to the results for each selected repository, with the number of genes for which Gene Ontology annotation exists
    Graphical view of the GO terms represented in the gene lists
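Comparing how often a GO term appears in two gene lists comes down to testing a 2x2 contingency table. The slides do not describe the statistic FatiGO uses, so the sketch below assumes a two-sided Fisher's exact test, a common choice for this kind of comparison. The function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table
        [[a, b],   list 1: genes with the GO term / without it
         [c, d]]   list 2: genes with the GO term / without it
    The p-value sums the probabilities of all tables with the same
    margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x term-annotated genes in list 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts: the term annotates 8 of 10 genes in list 1
# and 1 of 10 genes in list 2.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
```

In practice one such test is run per GO term, so the resulting p-values would normally also be corrected for multiple testing.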
Babelomics : FatiGO
MotifSampler
  Description:
    Part of the INCLUSive suite, which also contains gene expression data analysis tools
    Tries to find motifs in a given list of sequences
  Input:
    Sequences in FASTA format
    An organism-specific background model (given)
    Motif length
    Number of motifs to retrieve
  Output:
    A list of motif instances for each input sequence
Other online tool : ArrayQuest
  Applies to data from GEO or to custom data
  Contains Bioconductor methods and BioPerl- and C++-based scripts
  Accepts submissions of new analysis methods