
@MagnusHyttsten: Meet Robin, Guinea Pig (An Awkward Social Experiment) - PowerPoint PPT Presentation



  1. @MagnusHyttsten

  2. Meet Robin

  3. Guinea Pig Meet Robin

  4. An Awkward Social Experiment (that I'm afraid you need to be part of...)

  5. Super ROCKS!

  6. Diagram: Input Data Examples (Train & Test Data), here "QCon", feed a Model (Your Brain), which produces an Output: <Awkward Silence>

  7. Diagram: Input Data Examples (Train & Test Data) "QCon" plus Labels (Correct Answers) "Super Rocks" feed the Model (Your Brain); the Model Output and the Label go into a Loss function, which drives an Optimizer

  8. Same diagram after training: for Input "QCon", the Model Output is now "Super Rocks", matching the Label "Super Rocks"

  9. Agenda Intro to Machine Learning Frontiers of Machine Learning Creating a TensorFlow Model Why are TPUs Great for Machine Learning Workloads Distributed TensorFlow Training

  10. Agenda Intro to Machine Learning Frontiers of Machine Learning Creating a TensorFlow Model Why are TPUs Great for Machine Learning Workloads Distributed TensorFlow Training

  11. Agenda Intro to Machine Learning Frontiers of Machine Learning Creating a TensorFlow Model Why are TPUs Great for Machine Learning Workloads Distributed TensorFlow Training

  12. Ophthalmology, Radiology: "The network performed similarly to senior orthopedic surgeons when presented with images at the same resolution as the network." Algorithm: 0.95 vs. Ophthalmologist (median): 0.91. www.tandfonline.com/doi/full/10.1080/17453674.2017.1344459

  13. Pathology https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html

  14. ImageNet Alaskan Malamute Siberian Husky

  15. http://news.stanford.edu/2017/01/25/artificial-intelligence-used-identify-skin-cancer

  16. Input Saturation Defocus

  17. Data, Data, Data Compute, Compute, Compute

  18. Data, Data, Data Compute, Compute, Compute Humans, Humans, Humans

  19. How long did it take for a human to construct this? "Improving Inception and Image Classification in TensorFlow", research.googleblog.com/2016/08/improving-inception-and-image.html

  20. AM!!!

  21. Current: Solution = ML expertise + data + computation

  22. Current: Solution = ML expertise + data + computation Can we turn this into: Solution = data + 100X computation

  23. Current: Solution = ML expertise + data + computation. Can we turn this into: Solution = data + 100X computation? Can We Learn How To Teach Machines To Learn???

  24. CIFAR-10

  25. ImageNet: "Learning Transferable Architectures for Scalable Image Recognition", Barret Zoph, Vijay Vasudevan, Jonathon Shlens and Quoc Le, https://arxiv.org/abs/1707.07012

  26. Agenda Intro to Machine Learning Frontiers of Machine Learning Creating a TensorFlow Model Why are TPUs Great for Machine Learning Workloads Distributed TensorFlow Training

  27. TensorFlow stack diagram, top to bottom: Premade Estimators; Estimator, tf.keras; tf.keras.layers, Datasets; Python Frontend, Java, C++; TensorFlow Distributed Execution Engine; CPU, GPU, Android, iOS, ...

  28. TensorFlow Estimator Architecture Estimator (tf.estimator) calls input_fn (Datasets, tf.data)

  29. Premade Estimators: Estimator (tf.estimator) calls input_fn (Datasets, tf.data); the premade subclasses are LinearRegressor, LinearClassifier, DNNRegressor, DNNClassifier, DNNLinearCombinedRegressor, DNNLinearCombinedClassifier, BaselineRegressor, BaselineClassifier

  30. Premade Estimators: pick one, e.g. estimator = DNNLinearCombinedRegressor(...) (or LinearRegressor(...), LinearClassifier(...), DNNRegressor(...), DNNClassifier(...), DNNLinearCombinedClassifier(...), BaselineRegressor(...), BaselineClassifier(...)); then train locally with estimator.train(input_fn=..., ...), estimator.evaluate(input_fn=..., ...), and estimator.predict(input_fn=..., ...), with Datasets supplying each input_fn

  31. Custom Models #1 - model_fn: Estimator (tf.estimator) calls model_fn (which uses Keras Layers, tf.keras.layers) and input_fn (Datasets, tf.data); the premade Estimators (LinearClassifier, DNNClassifier, DNNRegressor, ...) subclass Estimator

  32. Custom Models, tf.Estimator + tf.keras.layers: the model stacks Conv2D(32, kernel_size=(3, 3), activation='relu'), MaxPooling2D(pool_size=(2, 2)), Flatten(), Dense(128, activation='relu'), Dropout(0.2), Dense(10, activation='softmax'), then model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

  33. Train/Evaluate Model: convert a Keras model to a tf.estimator.Estimator with estimator = tf.keras.estimator.model_to_estimator(keras_model=model, ...); then train locally: estimator.train(input_fn=..., ...), estimator.evaluate(input_fn=..., ...), estimator.predict(input_fn=..., ...), with Datasets supplying each input_fn

  34. Summary - Use Estimators, Datasets, and Keras: Premade Estimators (tf.estimator) when possible; Custom Models via a model_fn in Estimator plus tf.keras.layers; Datasets (tf.data) for the input pipeline
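The input_fn / Datasets piece in this summary can be mimicked in plain Python to show what it conceptually does. This is a sketch only: the generator and its shuffle/batch/repeat steps mirror tf.data method names, but it is not the TensorFlow API, and the example data is made up.

```python
import random

# Conceptual stand-in for a tf.data input pipeline: an input_fn
# yields (features, label) batches that estimator.train() consumes.
def input_fn(examples, batch_size=2, epochs=2, seed=0):
    rng = random.Random(seed)
    for _ in range(epochs):                    # like .repeat(epochs)
        data = examples[:]
        rng.shuffle(data)                      # like .shuffle(...)
        for i in range(0, len(data), batch_size):
            yield data[i:i + batch_size]       # like .batch(batch_size)

examples = [({"x": float(i)}, i % 2) for i in range(6)]
batches = list(input_fn(examples))
print(len(batches))  # 6 examples / batches of 2 * 2 epochs = 6 batches
```

In real TensorFlow code the same shape is expressed as tf.data.Dataset transformations, and the Estimator pulls batches from whatever input_fn returns.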

  35. Agenda Intro to Machine Learning Frontiers of Machine Learning Creating a TensorFlow Model Why are TPUs Great for Machine Learning Workloads Distributed TensorFlow Training

  36. We may have a huge number of layers, and each layer can have a huge number of neurons --> there may be hundreds of millions or even billions of * and + ops. All the knobs are W values that we need to tune so that, given a certain input, they generate the correct output
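The scale claim on this slide can be sanity-checked with back-of-envelope arithmetic. The layer sizes below are hypothetical, chosen only to illustrate how quickly multiply-add counts grow:

```python
# Back-of-envelope count of multiply-add ops in a stack of dense
# (fully connected) layers: each layer needs inputs * outputs
# multiplies, and about as many adds.
def dense_macs(layer_sizes):
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# A toy network: 4096 inputs, three hidden layers of 8192, 10 outputs.
sizes = [4096, 8192, 8192, 8192, 10]
print(dense_macs(sizes))  # 167854080, ~168 million multiply-adds per forward pass
```

A handful of layers at these (modest) widths already lands in the hundreds of millions of ops, before counting convolutions or the backward pass.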

  37. "Matrix Multiplication is EATING (the computing resources of) THE WORLD": h_ij = [X_0, X_1, X_2, ...] * [W_0, W_1, W_2, ...] = X_0*W_0 + X_1*W_1 + X_2*W_2 + ...

  38. Matmul: X = [1.0, 2.0, ..., 256.0] (say we have 256 input values), W = [0.1, 0.1, ..., 0.1] (then we need 256 weight values); h_0,0 = X * W = 1*0.1 + 2*0.1 + ... + 256*0.1 == 3289.6
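The slide's dot product is easy to verify directly (a minimal sketch using the slide's own 256-element vectors):

```python
# Dot product from the slide: 256 inputs, all weights 0.1.
X = [float(i) for i in range(1, 257)]   # [1.0, 2.0, ..., 256.0]
W = [0.1] * 256                         # [0.1, 0.1, ..., 0.1]

h = sum(x * w for x, w in zip(X, W))    # 1*0.1 + 2*0.1 + ... + 256*0.1
print(round(h, 1))  # 3289.6, i.e. 0.1 * (256 * 257 / 2)
```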

  39. Single-threaded Execution

  40. Single-threaded Execution: X = [1.0, 2.0, ..., 256.0] and W = [0.1, 0.1, ..., 0.1] drawn as two 256-element columns to be multiplied; h_0,0 = X * W == 3289.6

  41. Animation: first multiply-accumulate step, 1*0.1 = 0.1

  42. Animation: the previous partial sum (0.1) is carried forward

  43. Animation: next step, 0.1 + 2*0.1 = 0.3

  44. Animation: ... and so on, until 3238.5 + 255*0.1 = 3264, then 3264 + 256*0.1 = 3289.6

  45. Animation: with one multiply-accumulate per time step t, single-threaded execution takes 256 * t
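The frame-by-frame accumulation above is just a sequential loop, one multiply-add per time step (here t is simply a step counter):

```python
# Single-threaded multiply-accumulate: one X[i]*W[i] per time step,
# exactly as the animation frames walk through it.
X = [float(i) for i in range(1, 257)]
W = [0.1] * 256

acc = 0.0
steps = 0
for x, w in zip(X, W):
    acc += x * w     # partial sums: 0.1, then 0.3, ..., finally 3289.6
    steps += 1

print(steps)          # 256 sequential steps -> total time 256 * t
print(round(acc, 1))  # 3289.6
```

This is the baseline the TPU slides that follow are contrasted against: the work is inherently parallel, but a single thread is forced to do it one step at a time.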

  46. Tensor Processing Unit (TPU) v2

  47. Matrix Unit (MXU) Systolic Array: computing y = Wx with W a 3x3 weight matrix and batch-size(x) = 3; inputs X_11..X_33 stream in from the left, weights W_11..W_33 stay resident in the array, and partial sums accumulate downward

  48. Animation: the skewed input columns approach the array, X_11 first

  49. Animation: X_11 enters the first cell, producing the partial product W_11*X_11

  50. Animation: inputs advance one cell per cycle; each cell adds its W*X term to the partial sum passed along, e.g. W_11*X_11 + W_21*X_21 + ...

  51. Animation: the first output emerges: Y_11 = W_11*X_11 + W_12*X_12 + W_13*X_13
