Complexity vs. Performance: Empirical Analysis of Machine Learning as a Service
Yuanshun Yao, Zhujun Xiao, Bolun Wang*, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao
The University of Chicago, *University of California, Santa Barbara
ysyao@cs.uchicago.edu
ML in Network Research
• Congestion control protocols: Sivaraman et al., SIGCOMM'14; Winstein & Balakrishnan, SIGCOMM'13
• Network link prediction: Liu et al., IMC'16
• User behavior analysis: Wang et al., IMC'14; Zhao et al., IMC'12; Zannettou et al., IMC'17
• …
Running ML is Hard
• Solution: Machine Learning as a Service (ML-as-a-Service)
[Diagram: dataset → trained model]
ML-as-a-Service
[Diagram: the user supplies training data and input (model choice, parameters, etc.) to the ML-as-a-Service platform]
Why Study ML-as-a-Service?
• Is my model good enough?
• Q: How well do these platforms perform?
• Q: How much does the amount of user control impact ML performance?
ML-as-a-Service Platforms
• ABM, Google Prediction, Amazon ML, PredictionIO (PIO), BigML, Microsoft (Azure ML)
• Arranged along an axis from less to more amount of user input
Control in ML
• Pipeline: training data → trained model, with four control dimensions (sketched in code below)
• Data Cleaning: handle invalid, duplicate, and missing data
• Feature Selection: mutual information, Pearson correlation, chi-square, …
• Classifier Choice: Logistic Regression, Decision Tree, kNN, …
• Parameter Tuning: e.g., for Logistic Regression: L1/L2 regularization, max_iter, …
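A minimal scikit-learn sketch of these four control dimensions. The CSV file, its "label" column, and the k=10 feature budget are hypothetical placeholders, not details from the talk.

```python
# Sketch of the four control dimensions named on this slide.
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("dataset.csv")  # hypothetical tabular binary-classification data

# 1. Data cleaning: drop duplicate and missing rows.
df = df.drop_duplicates().dropna()
X, y = df.drop(columns=["label"]), df["label"]

# 2. Feature selection: keep the k features with the highest mutual information.
X = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# 3. Classifier choice: pick one of several classifier families.
classifiers = {
    "logistic_regression": LogisticRegression(),
    "decision_tree": DecisionTreeClassifier(),
    "knn": KNeighborsClassifier(),
}

# 4. Parameter tuning: e.g., regularization type/strength and iteration budget
#    for logistic regression.
model = LogisticRegression(penalty="l2", C=1.0, max_iter=500)
model.fit(X, y)
```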
Control in ML-as-a-Service

                      ABM   Google   Amazon   PIO   BigML   Microsoft
  Data Cleaning        ✖      ✖        ✖       ✖      ✖        ✖
  Feature Selection    ✖      ✖        ✖       ✖      ✖        ✔
  Classifier Choice    ✖      ✖        ✖       ✔      ✔        ✔
  Parameter Tuning     ✖      ✖        ✔       ✔      ✔        ✔
                      low  ←  user control / complexity  →  high

Complexity vs. Performance?
Performance Measurement
Characterizing Performance
• Theoretical modeling is hard
  • The output of an ML model depends on the dataset
  • No access to the platforms' implementation details
• Instead: empirical, data-driven analysis
  • Simulate a real-world scenario from end to end
  • Needs a large number of diverse datasets
  • Focus on binary classification
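A minimal end-to-end sketch of one such measurement, assuming scikit-learn and the F-score metric used later in the talk; the dataset file name and its "label" column are placeholders.

```python
# End-to-end empirical evaluation of one binary-classification dataset:
# split, train, predict, score.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("some_uci_dataset.csv")          # placeholder dataset
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("F-score:", f1_score(y_test, model.predict(X_test)))
```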
Dataset
• 119 datasets from diverse application domains
• Sample size: 15 - 245K; number of features: 1 - 4K
• 79% are from the UCI ML Repository
• Domain breakdown: Life Science 37%, Computer Applications 15%, Artificial Test 14%, Other 11%, Social Science 9%, Physical Science 8%, Financial & Business 6%
Methodology
• Tune all available control dimensions on each platform (see the tuning-loop sketch below)
  • Classifier Choice (✔): Logistic Regression, kNN, SVM, …
  • Parameter Tuning (✔): L1_reg, L2_reg, Max_iter, …
  • Feature Selection: ✖ for the example platform shown
[Diagram: training data → platform API → trained model; testing data → API → predictions]
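A sketch of that tuning loop. scikit-learn serves here as a local stand-in for the platforms' training APIs, and the classifier and parameter grids are illustrative assumptions rather than the study's exact search space.

```python
# Exhaustively try classifier choices and parameter settings, the way the
# measurement tunes every control dimension a platform exposes.
from itertools import product
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative search space (assumed, not the paper's exact grid).
search_space = {
    LogisticRegression: {"penalty": ["l1", "l2"], "C": [0.01, 1, 100],
                         "solver": ["liblinear"], "max_iter": [1000]},
    KNeighborsClassifier: {"n_neighbors": [1, 5, 15]},
    SVC: {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]},
}

def all_configs(space):
    """Yield (classifier_class, kwargs) for every parameter combination."""
    for cls, grid in space.items():
        keys = list(grid)
        for values in product(*(grid[k] for k in keys)):
            yield cls, dict(zip(keys, values))

def tune(X_train, y_train, X_test, y_test):
    """Train every configuration and record its F-score on the test set."""
    scores = {}
    for cls, kwargs in all_configs(search_space):
        model = cls(**kwargs).fit(X_train, y_train)
        config = (cls.__name__, tuple(sorted(kwargs.items())))
        scores[config] = f1_score(y_test, model.predict(X_test))
    return scores
```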
Trade-offs between Complexity and Performance
Complexity vs. Performance
• Q: How does the complexity correlate with performance?
• A: high complexity -> high performance
[Chart: optimized average F-score (0.5 - 1.0) per platform, ordered from low to high complexity: ABM, Google, Amazon, BigML, PIO, Microsoft, Scikit]
Complexity vs. Risk
• Q: How does the risk correlate with complexity?
• A: high complexity -> high risk
[Chart: performance variance in F-score (0 - 0.5) per platform, ordered from low to high complexity: ABM, Google, Amazon, BigML, PIO, Microsoft, Scikit]
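Both plotted quantities can be read off the per-configuration F-scores returned by the tuning loop above; a small sketch, with function and field names of my own choosing.

```python
# Summarize one platform's scores on one dataset: the "optimized" performance
# is the best score across configurations; the spread across configurations
# is a proxy for risk.
from statistics import mean, pvariance

def summarize(scores):
    """scores: dict mapping configuration -> F-score."""
    values = list(scores.values())
    return {
        "optimized_f1": max(values),      # complexity pays off when tuned well
        "average_f1": mean(values),
        "variance_f1": pvariance(values), # wide spread = risk of a bad config
    }
```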
Understanding Server-side Optimization
Reverse-engineering Optimization
• Q: Does the server side adapt to different datasets?
• Reverse-engineer it using synthetic datasets
  • Create synthetic datasets
  • Use prediction results to infer classifier information
[Plots: two 2-D probe datasets, "Linear" and "Circular", each showing Class 0 and Class 1 over Feature #1 vs. Feature #2]
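A sketch of how such probe datasets can be generated with scikit-learn's dataset generators; the sample counts, noise levels, and file names are illustrative assumptions, not the study's exact settings.

```python
# Two synthetic probe datasets: one linearly separable, one circular, so that
# a linear classifier fits the first but not the second.
import numpy as np
import pandas as pd
from sklearn.datasets import make_circles, make_classification

# Linearly separable 2-D dataset.
X_lin, y_lin = make_classification(n_samples=1000, n_features=2,
                                   n_informative=2, n_redundant=0,
                                   class_sep=2.0, random_state=0)

# Circular dataset: one class inside, the other on the outer ring.
X_cir, y_cir = make_circles(n_samples=1000, noise=0.05, factor=0.4,
                            random_state=0)

# Save as CSV for upload to a platform's training API.
for name, (X, y) in {"linear": (X_lin, y_lin),
                     "circular": (X_cir, y_cir)}.items():
    pd.DataFrame(np.column_stack([X, y]),
                 columns=["feature_1", "feature_2", "label"]
                 ).to_csv(f"{name}.csv", index=False)
```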
Understanding Optimization
[Plots: Google's decision boundaries on the linear and circular probe datasets (Feature #1 vs. Feature #2, Class 0 vs. Class 1)]
• Google switches between classifiers based on the dataset
• Use supervised learning to infer the classifier family used
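One simple way to approximate this inference step (my own sketch, not necessarily the authors' exact procedure): query the service on a dense grid of points and measure how well its predicted boundary agrees with locally trained classifiers from known families. `query_service` is a hypothetical wrapper around a platform's prediction API.

```python
# Guess the classifier family behind a black-box service by comparing its
# predictions on a dense 2-D grid with those of locally trained candidates.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def infer_family(X_train, y_train, query_service):
    # Dense grid covering the probe dataset's feature space.
    xs = np.linspace(X_train[:, 0].min(), X_train[:, 0].max(), 100)
    ys = np.linspace(X_train[:, 1].min(), X_train[:, 1].max(), 100)
    grid = np.array([[x, y] for x in xs for y in ys])

    # Labels the remote service assigns to the grid (assumed returned as an array).
    service_labels = query_service(grid)

    candidates = {
        "linear": LogisticRegression(max_iter=1000),
        "tree": DecisionTreeClassifier(),
        "knn": KNeighborsClassifier(),
    }
    agreement = {}
    for family, clf in candidates.items():
        local_labels = clf.fit(X_train, y_train).predict(grid)
        agreement[family] = np.mean(local_labels == service_labels)
    # Best guess: the family whose boundary agrees most with the service's.
    return max(agreement, key=agreement.get)
```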
Takeaways
• ML-as-a-Service is an attractive tool to reduce workload
• But user control still has a large impact on performance
• Fully automated systems are less risky
Thank you! Questions?