Senska - Towards an Enterprise Streaming Benchmark
Dagstuhl Seminar 17441 - Big Stream Processing Systems
31st October 2017
Guenter Hesse
Motivation
• In a GE battery production plant in New York State, 10,000 different data attributes are captured, some as often as every 250 ms [3]
• Modern manufacturing equipment, e.g., injection molding machines, generates up to terabytes of data daily [2]
• By 2025, the total global worth of IoT technology is estimated at USD 6.2 trillion [1]
• Industrial manufacturing is one of the industry sectors investing most in IoT [1]
[1] http://www.intel.com/content/www/us/en/internet-of-things/infographics/guide-to-iot.html
[2] Huber, M.F., Voigt, M., Ngomo, A.N.: Big data architecture for the semantic analysis of complex events in manufacturing. In: Informatik 2016, 46. Jahrestagung der Gesellschaft für Informatik, 26.-30. September 2016, Klagenfurt, Österreich. pp. 353-360 (2016)
[3] Weiner, S., Line, D.: Manufacturing and the data conundrum - Too much? Too little? Or just right? https://www.eiuperspectives.economist.com/sites/default/files/Manufacturing_Data_Conundrum_Jul14.pdf (2014). Accessed: 2017-03-01.
Image source: http://3.bp.blogspot.com/-9uJ1ni0tb7g/UcgNtWqKYrI/AAAAAAAACe4/iwxNn-eiaKM/s1600/moulding+machine+lg.jpg
Motivation: Data Stream Processing Systems
[Figure: timeline of data stream processing systems with release years between 2010 and 2016, showing the logos of Apache Storm, Twitter Heron, Apache Flink, Apache Samza, IBM InfoSphere, Apache Apex, Azure Stream Analytics, and Spark Streaming, among others]
Related Work
Linear Road [1], StreamBench [2], RIoTBench [3]
[1] Arasu, A., Cherniack, M., Galvez, E., Maier, D., Maskey, A.S., Ryvkina, E., Stonebraker, M., Tibbetts, R.: Linear road: A stream data management benchmark. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases - Volume 30. pp. 480-491. VLDB '04, VLDB Endowment (2004)
[2] Lu, R., Wu, G., Xie, B., Hu, J.: Stream bench: Towards benchmarking modern distributed stream computing frameworks. In: Proceedings of the 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing. pp. 69-78. UCC '14, IEEE Computer Society, Washington, DC, USA (2014)
[3] Shukla, A., Chaturvedi, S., Simmhan, Y.: Riotbench: A real-time iot benchmark for distributed stream processing platforms. CoRR abs/1701.08530 (2017)
Related Work
Linear Road
- focus on single-node DSPS
- barely any use of historical data
- queries partly too complex
- limited metrics
StreamBench
- typical streaming operations missing (e.g., window functions)
- no validation
RIoTBench
- no tool support (data ingestion, result validation)
- no historical data
Currently, there is no satisfactory enterprise streaming benchmark.
Contributions/Scope of Senska
• Design of benchmark architecture
• Definition and validation of query set
• Design and development of benchmark toolkit for
  • Data ingestion
  • Result validation
  • Metric calculation
  • Systems’ setup
• Reference implementation that can be used for benchmarking various systems
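To make the toolkit tasks "result validation" and "metric calculation" more concrete, here is a minimal sketch of what such steps might look like. It is an illustration under assumptions, not the Senska toolkit: the function names (`validate_results`, `percentile`, `latency_metrics`), the record layout (dictionaries keyed by a record id), and the choice of mean latency, 90th-percentile latency, and throughput as metrics are all hypothetical.

```python
import math
from statistics import mean


def validate_results(expected, actual):
    """Compare expected and actual query results, both keyed by record id.

    Hypothetical helper: returns the ids that are missing from the actual
    results, the ids that should not be there, and the ids whose values differ.
    """
    missing = set(expected) - set(actual)
    unexpected = set(actual) - set(expected)
    mismatched = {k for k in set(expected) & set(actual) if expected[k] != actual[k]}
    return missing, unexpected, mismatched


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def latency_metrics(sent_at, received_at):
    """Derive simple metrics from send and receive timestamps (in seconds).

    Assumes at least one record was received and the run took longer than
    zero seconds.
    """
    latencies = sorted(received_at[k] - sent_at[k] for k in received_at if k in sent_at)
    duration = max(received_at.values()) - min(sent_at.values())
    return {
        "mean_latency": mean(latencies),
        "p90_latency": percentile(latencies, 90),
        "throughput": len(latencies) / duration,  # records per second
    }
```

A benchmark run would call `validate_results` first and only report the values from `latency_metrics` if all three returned sets are empty, so that fast but incorrect results are not rewarded.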
Architecture
General Architecture of a Streaming Benchmark
- Data Feeder
- System Under Test (Query Implementation)
- Result Validator
Architecture of Senska
- Data and Workload Generator (Toolkit)
- Input Data (Sensor Data)
- Data Sender (Toolkit)
- Message Broker (Apache Kafka)
- System Under Test
- Benchmark Query Implementation
- DBMS (Transactional Data)
- Result Validator and Metric Calculator (Toolkit)
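The data flow through these components can be illustrated with a small, self-contained sketch. It is a simplification under stated assumptions: an in-memory `queue.Queue` stands in for the message broker (the slides name Apache Kafka), the DBMS with transactional data is left out, and the query (keep sensor readings above a threshold) as well as all function names are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of the input stream


def data_sender(input_data, broker):
    """Toolkit side: push every input record to the broker, then signal the end."""
    for record in input_data:
        broker.put(record)
    broker.put(SENTINEL)


def query_implementation(broker, results, threshold):
    """System under test: hypothetical query keeping readings above a threshold."""
    while True:
        record = broker.get()
        if record is SENTINEL:
            break
        if record["value"] > threshold:
            results.append(record["id"])


def run_benchmark(input_data, threshold):
    """Wire sender, broker, and query together, then validate the output."""
    broker = queue.Queue()  # in-memory stand-in for the message broker
    results = []
    consumer = threading.Thread(
        target=query_implementation, args=(broker, results, threshold)
    )
    consumer.start()
    data_sender(input_data, broker)
    consumer.join()
    # Result validator: recompute the expected output directly from the input.
    expected = [r["id"] for r in input_data if r["value"] > threshold]
    return results, results == expected
```

Placing a broker between the data sender and the system under test decouples the two: the sender does not need to know which system consumes the data, which is what would allow the same toolkit to feed different systems.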
Architecture - In Detail
[Figure: detailed architecture diagram]
Thank you for your attention!
Guenter.Hesse@hpi.de