
Learning in Parallel Universes, Bernd Wiswedel, 15 September, 2008


  1. Learning in Parallel Universes
     Bernd Wiswedel, Nycomed Chair for Bioinformatics & Information Mining
     15 September, 2008

     Overview
     • What are Parallel Universes?
     • Application Scenarios
     • One sample approach: Neighborgrams
     • Connection to LeGo

  2. Motivation
     • Data mining as an application to analyse huge amounts of data
     • One focus of data mining: find interesting patterns in a data set, e.g. clusters
     • Data is often very complex; sometimes multiple representations of the data are available → Parallel Universes

     What are Parallel Universes?
     • Usually: data given in a single feature space
       – Mostly a high-dimensional, numeric representation
       – Definition of one global distance measure

  3. What are Parallel Universes?
     • Parallel Universes
       – Different object representations
       – Different distance measures

     Why Parallel Universes?
     • Example 1: Chemistry - universes encode, e.g.
       – shape (3D)
       – graph structure
       – properties
     see also: A. Bender, R. Glen: Molecular similarity: a key technique in molecular informatics, Org. Biomol. Chem., 2:3204-3218, 2004
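The idea of several object representations, each with its own distance measure, can be sketched as follows. This is only an illustration under stated assumptions: the universe names, the toy molecules, and the choice of Euclidean and Tanimoto distances are hypothetical and do not come from the slides.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tanimoto_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b| for two sets of (hypothetical) structure keys."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Assumption: each universe is a name paired with its own distance measure.
UNIVERSES = {
    "shape_3d": euclidean,
    "graph_structure": tanimoto_distance,
}


def distances(obj_a, obj_b):
    """Compare two objects separately in every universe; no joint space is built."""
    return {name: dist(obj_a[name], obj_b[name]) for name, dist in UNIVERSES.items()}


# Toy objects: one representation per universe (made-up values).
mol_a = {"shape_3d": (0.0, 0.0, 0.0), "graph_structure": {"ring", "OH", "CH3"}}
mol_b = {"shape_3d": (3.0, 4.0, 0.0), "graph_structure": {"ring", "OH"}}

per_universe = distances(mol_a, mol_b)
print(per_universe)
```

The two objects are far apart in one universe and fairly close in the other, which is the kind of disagreement a single global distance measure would hide.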

  4. Why Parallel Universes?
     • Example 2: Web - universes encode, e.g.
       – link structure
       – meta information (categories, tags)
       – content (bag of words…)

     Why Parallel Universes?
     • More examples:
       – Music - universes encode
         • semantic meta information (composer, artist, genre, …)
         • groupings (style, category, …)
         • other properties (tempo, beat, key, …)
       – Image or 3D object recognition - universes encode
         • properties (has door, has wheels…)
         • texture information
         • histogram or intensity/color distributions

  5. Learning in Parallel Universes
     • Naive approach:
       – Consider only one universe at a time: ignores information in other universes
       – Construct a joint feature space: often impossible, introduces artifacts
     • Better:
       – Consider all universes at once
       – Allow identification of (local) models that occur in only a few universes (or just one)

  6. Related Approaches: Subspace Clustering
     • choose a subset of data and attributes for each cluster
       – usually no interpretation of subspaces possible
       – selects from one, large universe
       – initially also finds overlapping clusters
       – most prominent approaches: CLIQUE, COSA

  7. Related Approaches: Multi-Instance Learning
     • each object has several possible representations in the same space (e.g. molecular conformations in 3D)
       – universes all possess the same semantics
       – two extremes: similar in all universes, similar in at least one universe
       – number of universes per object can vary

     Related Approaches: Multi-View Learning
     • each object has several possible representations in different spaces
       – universes with different semantics
       – independent and complete models in each universe (learning algorithms may assist each other)
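The "two extremes" named for multi-instance learning can be written as two aggregation rules over per-universe distances. This is a minimal sketch, assuming a hypothetical distance threshold decides what counts as "similar"; the function names and numbers are illustrative only.

```python
def similar_in_all(universe_distances, threshold):
    """Strict extreme: the objects must be close in every universe."""
    return max(universe_distances) <= threshold


def similar_in_any(universe_distances, threshold):
    """Lenient extreme: being close in at least one universe is enough."""
    return min(universe_distances) <= threshold


# Made-up distances of one object pair in three universes; the list may have
# a different length for other pairs, since the number of universes can vary.
pair_distances = [0.2, 0.9, 0.4]

strict = similar_in_all(pair_distances, threshold=0.5)
lenient = similar_in_any(pair_distances, threshold=0.5)
print(strict, lenient)
```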

  8. Learning in Parallel Universes
     • Clear separation of universes (given a priori)
     • No individual universe suffices for learning on its own
     • Allow identification of (local) models that occur in only a few universes (or just one)
     • Identify overlaps

     One sample approach: Neighborgrams
     • Supervised approach
     • Construct local neighborhood histograms ("Neighborgrams") for objects of interest in all universes
     • Derive quality values for individual neighborgrams
     • Covering-like approach to construct classification model
     • Intuitive visualization allows for interactive exploration and user-controlled model construction
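A rough sketch of the first two Neighborgram steps: build one neighbor list per universe for an object of interest, then derive a quality value for each. The slides do not define the quality measure, so the one used here (the number of same-class neighbors seen before the first neighbor of another class) is an assumption, as are all function names and the toy data.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_neighborgram(centroid_id, data, labels, distance, size=100):
    """Return the `size` closest neighbors of the centroid in one universe,
    as (distance, object id, class label) tuples sorted by distance."""
    center = data[centroid_id]
    neighbors = sorted(
        (distance(center, vec), obj_id, labels[obj_id])
        for obj_id, vec in data.items()
        if obj_id != centroid_id
    )
    return neighbors[:size]


def same_class_run(neighborgram, centroid_label):
    """Assumed quality value: same-class neighbors before the first conflict."""
    count = 0
    for _, _, label in neighborgram:
        if label != centroid_label:
            break
        count += 1
    return count


# Toy data: the same four objects described in two hypothetical universes.
universes = {
    "u1": {"a": (0.0,), "b": (1.0,), "c": (2.0,), "d": (3.0,)},
    "u2": {"a": (0.0,), "b": (1.0,), "c": (9.0,), "d": (2.0,)},
}
labels = {"a": "pos", "b": "pos", "c": "neg", "d": "pos"}

quality = {
    name: same_class_run(build_neighborgram("a", data, labels, euclidean), labels["a"])
    for name, data in universes.items()
}
print(quality)
```

In this toy case the same object yields a better Neighborgram in the second universe than in the first, which is why the Neighborgrams are built in all universes before one is chosen.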

  9. Neighborgrams on KN-DB
     [Figure: a Neighborgram with annotations: centroid of the Neighborgram; closest neighbor at distance d (same class as centroid); more neighbors of same class; the remainder of the 100 closest neighbors; best (fuzzy) cluster suggested by the interactive clustering algorithm.]

     Neighborgrams on KN-DB
     [Figure: three panels: first universe (Image Based), second universe (Surface Based), third universe (Volume Based).]

  10. Neighborgrams on KN-DB
      [Two figure-only slides; no text recoverable.]

  11. Neighborgrams on KN-DB
      [Two figure-only slides; no text recoverable.]

  12. Summary Neighborgrams
      • Visualization tool for interactive exploration of clusters
      • Works well for small data sets or to model a minority class
      • Manual clustering
      • Semi-automatic clustering
        – Inspect proposed cluster
        – Discard, accept or fine-tune cluster
      • Fully automatic clustering
        – Sequential covering approach
        – Greedily identify the next best cluster, remove covered objects, restart

      Connection to LeGo
      • Output is a selected set of Neighborgram clusters, spread over different universes
      • Such clusters can be considered as local patterns
      • Open problem: construction of a global model as opposed to a simple aggregation of clusters
      • Special focus on identifying overlaps among universes (often of special interest)
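The fully automatic mode (pick the next best cluster greedily, remove covered objects, restart) can be sketched as a plain covering loop. This is a simplification under stated assumptions: the candidate clusters are given up front as fixed sets of covered objects, "best" means "covers the most remaining objects", and the candidate data is made up. In the actual approach the candidates would come from Neighborgrams and their quality values.

```python
def sequential_covering(candidates, objects, min_cover=1):
    """Greedy covering over candidate clusters from any universe.

    candidates: list of dicts with keys "universe", "centroid", "covers"
                (hypothetical structure; "covers" is a set of object ids).
    Returns the selected clusters and the objects left uncovered.
    """
    remaining = set(objects)
    model = []
    while remaining:
        # Next best cluster: the one covering the most still-uncovered objects.
        best = max(candidates, key=lambda c: len(c["covers"] & remaining), default=None)
        if best is None or len(best["covers"] & remaining) < min_cover:
            break
        newly_covered = best["covers"] & remaining
        model.append((best["universe"], best["centroid"], newly_covered))
        remaining -= newly_covered  # remove covered objects, then restart
    return model, remaining


# Made-up candidates, one per universe.
candidates = [
    {"universe": "image", "centroid": "a", "covers": {"a", "b", "c"}},
    {"universe": "surface", "centroid": "d", "covers": {"c", "d"}},
    {"universe": "volume", "centroid": "e", "covers": {"e"}},
]
objects = {"a", "b", "c", "d", "e", "f"}

model, uncovered = sequential_covering(candidates, objects)
for universe, centroid, members in model:
    print(universe, centroid, sorted(members))
print("uncovered:", sorted(uncovered))
```

The result is a set of clusters spread over different universes, plus whatever objects no candidate could cover.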

  13. Summary
      • Learning in Parallel Universes as simultaneous analysis of multiple descriptor spaces
      • Encompasses identification of patterns that:
        – are specific to individual universes and
        – span multiple universes (not necessarily all)
      • Final model construction comprises all previously identified patterns

      Thanks!
