Applied Machine Learning in Biomedicine Enrico Grisan - PowerPoint PPT Presentation

  1. Applied Machine Learning in Biomedicine Enrico Grisan enrico.grisan@dei.unipd.it

  2. Course details Mon-Wed 10.30-12.00, Room 318, May 4th through May 27th Contact enrico.grisan@dei.unipd.it Exam: project assignment

  3. Cancer detection

  4. Face detection How would you detect a face? How does album software tag your friends?

  5. What do we do?

  6. What do we do?

  7. Speech recognition

  8. Brain-computer interface

  9. Recommender systems Amazon, Netflix, Spotify tell you what you might like. The Netflix Prize was an open competition: predict user ratings for films based on previous ratings, without any other information about the users or films. The grand prize of US$1,000,000 was given to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%

  10. The age of big data
      ~10^9 messages/day; ~30x10^6 messages/day
      CERN Collider: 320x10^12 bytes/s
      Personal connectome: ~10^18 bytes/person
      "Every day, people create the equivalent of 2.5 quintillion bytes of data from sensors, mobile devices, online transactions, and social networks; so much that 90 percent of the world's data has been generated in the past two years." The Huffington Post: Arnal Dayaratna: IBM Releases Big Data

  11. The role of machine learning Design and analyze algorithms that improve their performance, at some task, with experience. Data (experience) + learning algorithm -> knowledge (performance on task)

  12. Imagenet challenge

  13. Kaggle challenge $100,000 prize, 35,000 retinal images, 4 DR (diabetic retinopathy) classes. Ongoing!

  14. Machine learning in biomedicine Usually extreme conditions: Very few samples (with respect to the problem) Very large amount of descriptors per sample Very large amount of noise/uncertainty

  15. Categories – Supervised learning classification, regression – Unsupervised learning Density estimation, clustering, dimensionality reduction – Semisupervised learning – Active learning – Reinforcement learning – …

  16. Supervised learning
      Feature space -> Target space
      Gene expression -> discrete labels (Normal, Metaplastic, Benign neoplastic, Malign neoplastic): classification
      Demographic and clinical data -> continuous labels (CHD risk score): regression

  17. Roadmap Binary classification - Parametric and non-parametric prediction Other supervised settings Principles for learning

  18. Oranges and Lemons

  19. A two dimensional space

  20. Stars and galaxies Minor elliptical axis (y) against Major elliptical axis (x) for stars (red) and galaxies (blue)

  21. Coronary Heart Disease Patients with (red) and without (blue) coronary heart disease in South Africa (Rousseauw et al, 1983)

  22. Parametric model

  23. Linear classifier

  24. The weight vector

  25. Geometric meaning

  26. The weight vector
      % ww = Dx1 weight vector
      % Xstar = NxD test inputs
      y_pred = sign(Xstar*ww);   % Nx1 predicted labels

  27. Learning the weights Rosenblatt's Perceptron Learning
      Perceptron criterion (summed over the misclassified points M, targets y_n in {-1,+1}):
        E_P(w) = - sum_{n in M} y_n * (w' * x_n)
      Stochastic gradient descent, one misclassified point at a time (learning rate eta):
        w <- w + eta * y_n * x_n

  28. Learning the weights
      % xx = NxD training inputs
      % yy = Nx1 targets (-1,+1)
      [N, D] = size(xx);
      old_ww = []; ww = zeros(D,1);
      while ~isequal(ww, old_ww)
        old_ww = ww;
        for ct = 1:N
          pred = sign(xx(ct,:)*ww);
          ww = ww + (yy(ct) - pred)*xx(ct,:)';   % update only when pred ~= yy(ct)
        end
      end
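The slide's MATLAB loop can be sketched in pure Python on a toy linearly separable problem; the data, labels, and function names below are illustrative, not from the course:

```python
def sign(v):
    return 1 if v >= 0 else -1

def train_perceptron(xx, yy, max_epochs=100):
    """Perceptron learning: sweep the data, updating w whenever
    sign(w.x) disagrees with the -1/+1 target."""
    d = len(xx[0])
    ww = [0.0] * d
    for _ in range(max_epochs):
        changed = False
        for x, y in zip(xx, yy):
            pred = sign(sum(wi * xi for wi, xi in zip(ww, x)))
            if pred != y:
                # same update as the MATLAB slide: w <- w + (y - pred) * x
                ww = [wi + (y - pred) * xi for wi, xi in zip(ww, x)]
                changed = True
        if not changed:  # converged: every training point classified correctly
            break
    return ww

# toy linearly separable data; the trailing 1.0 implements the bias
xx = [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
yy = [-1, -1, -1, 1]  # logical AND with -1/+1 labels
ww = train_perceptron(xx, yy)
preds = [sign(sum(wi * xi for wi, xi in zip(ww, x))) for x in xx]
```

Because the data are linearly separable, the loop stops after a few epochs with all four points on the correct side, matching the perceptron convergence guarantee.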

  29. Learning the weights

  30. Implementing the bias

  31. Output of the perceptron

  32. Linear classifier revisited If the data are not linearly separable, we must either extend the model or add features

  33. Nonlinear basis function
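To illustrate what a nonlinear basis function buys, here is a small invented example (the data and the basis phi are assumptions for illustration, not from the slides): a 1-D problem with the positive class on both sides of the negative class has no separating threshold on x, but becomes linearly separable after mapping x to phi(x) = (x, x^2, 1):

```python
def sign(v):
    return 1 if v >= 0 else -1

# 1-D data: the positive class sits on both sides of the negative class,
# so no single threshold on x separates them...
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
ys = [1, 1, -1, -1, -1, 1, 1]

# ...but in the basis phi(x) = (x, x^2, 1) a linear rule works:
# w = (0, 1, -1) implements the test x^2 - 1 >= 0.
def phi(x):
    return (x, x * x, 1.0)

ww = (0.0, 1.0, -1.0)
preds = [sign(sum(wi * fi for wi, fi in zip(ww, phi(x)))) for x in xs]
# preds matches ys: the classifier is still linear, just in the new features
```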

  34. From model to no model
      Faith in previous knowledge: strong assumptions on the data structure and on the separating-boundary shape
      Faith in the data: no assumption on the underlying structure ("the data tell me everything I need")

  35. K-nearest neighbours classifier (Fix and Hodges, 1951)
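A minimal pure-Python sketch of the K-nearest-neighbours idea, with invented toy data: no model is fitted; a query point simply takes the majority label of its k closest training points.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Label the query by majority vote among the k training points
    closest in squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# toy 2-D data: two well-separated clusters
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ['A', 'A', 'A', 'B', 'B', 'B']
label = knn_predict(train_x, train_y, (0.5, 0.5), k=3)  # all 3 neighbours are 'A'
```

Note that every prediction scans the whole training set, which is the computational cost the later parametric-vs-non-parametric slide alludes to.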

  36. Decision boundaries: 1-nearest-neighbour classifier, 15-nearest-neighbour classifier, linear classification

  37. Brain MRI application MICCAI MS lesion challenge 2008 http://www.ia.unc.edu/MSseg/index.html

  38. LANDSAT application

  39. Identification via gait analysis Characterize each person by the way they move: the gait signature (Nowlan 2009, Choi 2014)

  40. Parametric vs non-parametric • Parametric: start by assuming the decision boundary is a plane • Non-parametric: KNN makes no fixed assumption, and its boundaries get more complicated with more data • Non-parametric methods may need more data and can be computationally intensive

  41. Batch supervised learning Given : example inputs and targets (training set) Task : predicting target for new inputs (test set) Examples: - classification (binary or multi-class) - regression - ordinal regression - Poisson regression - ranking …

  42. Batch supervised learning • Many ways of mapping inputs to outputs • How do we choose what to do? • How do we know if we are doing well?

  43. Algorithm’s objective cost Formal objective for algorithms: - minimize a cost function - maximize an objective function Proving convergence: - does objective monotonically improve? Considering alternatives : - does another algorithm score better?

  44. Loss function

  45. Choosing a loss function • Motivated by the application – 0-1 error, achieving a tolerance, business cost • Computational convenience: – Differentiability, convexity • Beware of loss dominated by artifacts: – Outliers – Unbalanced classes
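The "loss dominated by artifacts" point can be made concrete with a small invented example (the data and function names are illustrative): a classifier that puts every point on the correct side has zero 0-1 error, yet one extreme score makes its squared loss look terrible.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of misclassified points: often what the application cares about."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_loss(y_true, scores):
    """Mean squared error between -1/+1 targets and real-valued scores:
    differentiable and convex, but a single extreme score dominates the sum."""
    return sum((t - s) ** 2 for t, s in zip(y_true, scores)) / len(y_true)

y_true = [1, 1, 1, -1, -1]
scores = [0.9, 0.8, 25.0, -0.7, -0.9]  # third point: correct side, huge score
y_pred = [1 if s >= 0 else -1 for s in scores]

err_01 = zero_one_loss(y_true, y_pred)   # 0.0: every point is on the right side
err_sq = squared_loss(y_true, scores)    # ~115.2: dominated by the (1 - 25)^2 term
```

The same effect appears with outliers and unbalanced classes: the convenient loss and the loss you actually care about can disagree badly.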
