Ready, Set, Go! Using TensorFlow to prototype, train, and productionalize your models
Karmel Allison
Building a model is a multi-stage process.
Setting the stage
Data: Covertype (USFS + CSU)
Task: Classify wilderness area (4 classes)
Number of examples: ~500K
Features:
● Real: elevation, slope, etc.
● Binned: hillshade at a given hour (0 - 255)
● Categorical: soil type, tree cover type
Setting the stage
2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
2590,56,2,212,-6,390,220,235,151,6225,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
Setting the stage
2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
# Elevation, Aspect, Slope, Horizontal_Distance_To_Hydrology, Vertical_Distance_To_Hydrology,
# Horizontal_Distance_To_Roadways, Hillshade_9am, Hillshade_Noon, Hillshade_3pm,
# Horizontal_Distance_To_Fire_Points, Wilderness_Area (4), Soil_Type (40), Cover_Type
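For reference, a minimal sketch of the column-name lists used on later slides (the slides truncate them; the names below are assumed from the header comment above):

numeric_cols = [
    'elevation', 'aspect', 'slope',
    'horizontal_distance_to_hydrology', 'vertical_distance_to_hydrology',
    'horizontal_distance_to_roadways', 'hillshade_9am', 'hillshade_noon',
    'hillshade_3pm', 'horizontal_distance_to_fire_points']
# 10 numeric columns, then the parsed soil-type vector and the cover type.
col_names = numeric_cols + ['soil_type', 'cover_type']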
Prototype your model
import tensorflow as tf
tf.enable_eager_execution()
Eager execution is immediate
a = tf.constant(5)
b = a * 3
print(b)
<tf.Tensor: id=3, shape=(), dtype=int32, numpy=15>
[Diagram: Const (val: 5) and Const (val: 3) feed a Mul node for b, which evaluates immediately to 15]
Loading data
defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)
Loading data
defaults = [tf.int32] * 55
dataset = tf.contrib.data.CsvDataset(
    ['covtype.csv.train'], defaults)
print(list(dataset.take(1)))
[(<tf.Tensor: id=188, shape=(), dtype=int32, numpy=2596>,
  <tf.Tensor: id=189, shape=(), dtype=int32, numpy=51>,
  <tf.Tensor: id=190, shape=(), dtype=int32, numpy=3>...
Parsing data
def _parse_csv_row(*vals):
    return features, class_label
Parsing data
def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    return features, class_label
Parsing data
col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    return features, class_label
Parsing data
col_names = ['elevation', 'aspect', 'slope'...]

def _parse_csv_row(*vals):
    soil_type_t = tf.convert_to_tensor(vals[14:54])
    feat_vals = vals[:10] + (soil_type_t, vals[54])
    features = dict(zip(col_names, feat_vals))
    class_label = tf.argmax(vals[10:14], axis=0)
    return features, class_label
Parsing data
dataset = dataset.map(_parse_csv_row).batch(64)
Parsing data
dataset = dataset.map(_parse_csv_row).batch(64)
print(list(dataset.take(1)))
({'aspect': <tf.Tensor: shape=(64,), dtype=int32, array([ 47, ... 77, 184, 328])>,
  ...
  'soil_type': <tf.Tensor: shape=(64, 40), array([[0, 0, 0, ..., 0, 0, 0]...)>},
 <tf.Tensor: shape=(64,), dtype=int64, array([0, 0, 3, ... 1, 0, 2])>)
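The slides batch straight after map; for training you would typically also shuffle and prefetch. A minimal sketch (the helper name and buffer sizes are assumptions, not from the slides):

defaults = [tf.int32] * 55

def load_training_data(filenames):
    dataset = tf.contrib.data.CsvDataset(filenames, defaults)
    dataset = dataset.map(_parse_csv_row)
    dataset = dataset.shuffle(buffer_size=10000)  # assumed buffer size
    dataset = dataset.batch(64)
    return dataset.prefetch(1)  # overlap input processing with training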
[Pipeline diagram: Raw CSV (2596,51,3,...) → Dataset ({'aspect': <tf.Tensor: id=567, shape=...}, ...)]
Defining features
# Cover_Type / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
Defining features
# Cover_Type (7 types) / integer / 1 to 7
cover_type = tf.keras.feature_column.categorical_column_with_identity(
    'cover_type', num_buckets=8)
cover_embedding = tf.keras.feature_column.embedding_column(
    cover_type, dimension=10)
Defining features
numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]
Defining features
numeric_features = [
    tf.keras.feature_column.numeric_column(feat)
    for feat in numeric_cols]

# Soil_Type (40 binary columns)
soil_type = tf.keras.feature_column.numeric_column(
    'soil_type', shape=(40,))
Defining features
columns = numeric_features + [soil_type, cover_embedding]
feature_layer = tf.keras.feature_column.FeatureLayer(columns)
Coming soon to a release near you!
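Pulling those pieces together, a sketch of the full column list. The same column constructors exist under tf.feature_column if tf.keras.feature_column is not in your release yet (the FeatureLayer itself is the part flagged as "coming soon"):

fc = tf.feature_column

# 10 real-valued columns.
numeric_features = [fc.numeric_column(feat) for feat in numeric_cols]

# Soil_Type: 40 binary indicators parsed into one length-40 vector.
soil_type = fc.numeric_column('soil_type', shape=(40,))

# Cover_Type: integer 1 to 7, embedded in 10 dimensions.
cover_type = fc.categorical_column_with_identity('cover_type', num_buckets=8)
cover_embedding = fc.embedding_column(cover_type, dimension=10)

columns = numeric_features + [soil_type, cover_embedding]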
Building a model
model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])
Building a model
model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
Building a model
model.fit(dataset, steps_per_epoch=NUM_TRAIN_EXAMPLES/64)

Epoch 10
9000/9000 [====================] - 110s 12ms/step - loss: 0.8931 - acc: 0.7561
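steps_per_epoch needs to be a whole number of batches, and the dataset has to supply enough batches for every epoch. A sketch with both made explicit (the repeat() and epochs=10 are assumptions inferred from the "Epoch 10" output):

BATCH_SIZE = 64
steps_per_epoch = NUM_TRAIN_EXAMPLES // BATCH_SIZE  # integer number of batches
model.fit(dataset.repeat(),
          steps_per_epoch=steps_per_epoch,
          epochs=10)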
[Pipeline diagram: Raw CSV (2596,51,3,...) → Dataset ({'aspect': <tf.Tensor: id=567, ...}) → Data config layer → Model architecture → Optimizer and loss → training output (Epoch 10, 9000/9000)]
Validating our model
def load_data(*filenames):
    dataset = tf.contrib.data.CsvDataset(filenames, defaults)
    dataset = dataset.map(_parse_csv_row)
    dataset = dataset.batch(64)
    return dataset
Validating our model
test_data = load_data('covtype.csv.test')
loss, accuracy = model.evaluate(test_data, steps=50)
print(loss, accuracy)
0.926471548461914, 0.7402
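Once the evaluation numbers look reasonable, the same pipeline feeds per-example predictions; a minimal sketch:

test_data = load_data('covtype.csv.test')
# Softmax probabilities over the 4 wilderness-area classes, one batch.
probabilities = model.predict(test_data, steps=1)
predicted_classes = probabilities.argmax(axis=-1)
print(predicted_classes[:10])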
[Pipeline diagram: Raw CSV → Dataset → Data config layer → Model architecture → Optimizer and loss → training output (Epoch 10, 9000/9000) → Validation (loss: 0.926, acc: 0.7402)]
Export to SavedModel
export_dir = tf.contrib.saved_model.save_keras_model(model, 'keras_nn')

keras_nn/
  1536162174/
    saved_model.pb
    variables/
    assets/
Export to SavedModel
restored_model = tf.contrib.saved_model.load_keras_model(export_dir)
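A restored Keras model keeps its architecture and weights but needs to be compiled again before evaluate() or fit(); a sketch, assuming the same optimizer and loss as before:

restored_model = tf.contrib.saved_model.load_keras_model(export_dir)
restored_model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
print(restored_model.evaluate(load_data('covtype.csv.test'), steps=50))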
[Pipeline diagram: Raw CSV → Dataset → Data config layer → Model architecture → Optimizer and loss → training output (Epoch 10, 9000/9000) → Validation (loss: 0.926, acc: 0.7402) → SavedModel (saved_model.pb, variables/, assets/)]
Swapping the model
model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(256),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])
Swapping the model
model = tf.estimator.DNNLinearCombinedClassifier(
[Diagram: Wide models, Wide & Deep models, Deep models — sparse features, dense embeddings, rectified linear hidden layers, sigmoid output units]
Swapping the model
model = tf.estimator.DNNLinearCombinedClassifier(
    linear_feature_columns=[cover_type, soil_type],
    dnn_feature_columns=numeric_features,
    dnn_hidden_units=[256, 16, 8],
    n_classes=4)
Swapping the model
model.train(
    input_fn=lambda: load_data('covtype.csv.train'))
Swapping the model
model.train(
    input_fn=lambda: load_data('covtype.csv.train'))
model.evaluate(
    input_fn=lambda: load_data('covtype.csv.test'))
Swapping the model
for epoch in range(10):
    model.train(...)
    print('Epoch {}:'.format(epoch + 1))
    print(model.evaluate(...))

Epoch 10:
{'average_loss': 0.3369278, 'accuracy': 0.86519998, 'global_step': 90010, 'loss': 21.324545}
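Spelling out the arguments elided on this slide (they are the same input_fns shown on the previous two slides):

train_fn = lambda: load_data('covtype.csv.train')
test_fn = lambda: load_data('covtype.csv.test')

for epoch in range(10):
    model.train(input_fn=train_fn)
    print('Epoch {}:'.format(epoch + 1))
    print(model.evaluate(input_fn=test_fn))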
Swapping the model
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    ...)
Swapping the model
features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)
Swapping the model
features_sample = list(dataset.take(1))[0][0]
input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    features_sample)
model.export_saved_model(
    export_dir_base='wide_deep',
    serving_input_receiver_fn=input_receiver_fn)
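One way to sanity-check the exported directory (not shown on the slides; tf.contrib.predictor is an assumption here) is to load it back and build a predict function:

from tensorflow.contrib import predictor

export_path = model.export_saved_model(
    export_dir_base='wide_deep',
    serving_input_receiver_fn=input_receiver_fn)
# export_saved_model returns the timestamped directory it created (as bytes).
predict_fn = predictor.from_saved_model(export_path.decode('utf-8'))
# predict_fn takes a dict of raw feature batches matching input_receiver_fn.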
[Pipeline diagram: Raw CSV → Dataset → Data config layer → Model architecture → Optimizer and loss → training output (Epoch 10: loss: 0.336, acc: 0.8651, global_step: 90010) → SavedModel (saved_model.pb, variables/, assets/)]
Conclusions
● Prototype with Eager.
● Preprocess with Datasets.
● Transform with Feature Columns.
● Build with Keras.
● Borrow with Canned Estimators.
● Package with SavedModel.
Thank you! +Karmel Allison
Resources and Links
● Covertype dataset: https://archive.ics.uci.edu/ml/datasets/Covertype
● Eager execution: https://www.tensorflow.org/guide/eager
● Performance tuning datasets: https://www.tensorflow.org/performance/datasets_performance
● Feature columns: https://www.tensorflow.org/guide/feature_columns
● Wide & Deep publication: https://arxiv.org/abs/1606.07792
● Me: https://github.com/karmel