Efficient Storage Backends for IoT Data: Master Thesis, Mid Talk


  1. Chair of Network Architectures and Services, Department of Informatics, Technical University of Munich
     Efficient Storage Backends for IoT Data
     Master Thesis, Mid Talk
     Florian Kreitmair
     Advisors: Marc-Oliver Pahl, Cyrille Artho, Stefan Liebald
     February 19, 2018

  2. Contents
     • Problem Statement
     • Requirements
     • Database Survey
     • Integration
     • Evaluation
     • Bibliography

  3. Distributed Smart Space Orchestration System (DS2OS) [3]
     • Is a middleware for IoT networks.
     • Focuses on real-time interaction between IoT components (sensors, actuators, services).
     • Peer-to-peer system with a federated database.

  4. DS2OS (cont.)
     Provides a database abstraction:
     Figure 1: State of IoT components is reflected in the VSL

  5. DS2OS (cont.)
     • Very flexible design, but vulnerable to concurrency problems.
     Figure 2: Example: a room with two doors, one of which must be locked at all times
     • To open door 2:
         locked := GET /door1/isLocked
         if (locked == true) then SET /door2/open = true
     • What if another process opens door 1 between the GET and the SET?
     • ⇒ Use transaction isolation!
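
The door example can be sketched in Python. This is only an illustration under stated assumptions, not DS2OS code: the `DoorStore` class and its methods are hypothetical, each door is reduced to a single `isLocked` flag, and one lock stands in for whatever isolation mechanism the database provides.

```python
import threading


class DoorStore:
    """Toy in-memory stand-in for the VSL (hypothetical, not the DS2OS API)."""

    def __init__(self):
        # Simplification: each door is modelled by a single isLocked flag.
        self._data = {"/door1/isLocked": True, "/door2/isLocked": True}
        self._lock = threading.Lock()  # stands in for transaction isolation

    def get(self, key):
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value

    def unlock_if_other_locked(self, door, other):
        """Run the check and the write as one isolated step."""
        with self._lock:
            if self._data[f"/{other}/isLocked"]:
                self._data[f"/{door}/isLocked"] = False
                return True
            return False


def both_unlocked(store):
    return not store.get("/door1/isLocked") and not store.get("/door2/isLocked")


# Unisolated interleaving: both processes read before either one writes.
unsafe = DoorStore()
p1_saw_locked = unsafe.get("/door1/isLocked")  # process 1 wants to open door 2
p2_saw_locked = unsafe.get("/door2/isLocked")  # process 2 wants to open door 1
if p1_saw_locked:
    unsafe.set("/door2/isLocked", False)
if p2_saw_locked:
    unsafe.set("/door1/isLocked", False)
invariant_broken = both_unlocked(unsafe)

# Isolated version: the second request sees the first one's write and is refused.
safe = DoorStore()
first = safe.unlock_if_other_locked("door2", "door1")
second = safe.unlock_if_other_locked("door1", "door2")
invariant_kept = not both_unlocked(safe)
```

The unisolated interleaving is written out sequentially so that the failure is reproducible; with real threads the same ordering can occur, but only some of the time.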

  6. DS2OS with a distributed database

  7. Research Questions
     How can a distributed database be used to improve scalability, availability and consistency of data-centric IoT middleware?
     • Requirements on the datastore?
     • Suitable database products?
     • Integration?
     • Qualitative properties?
     • Quantitative performance in a realistic scenario?

  8. Distribution
     Figure 3: Distributed DBs store data on multiple nodes
     • Replication: Improves failure tolerance ⇒ Availability
     • Sharding (Partitioning): Improves scalability
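
How sharding and replication combine can be sketched as follows. This is a simplified illustration, not the placement scheme of any surveyed database: the function names, the node names, and the rule "primary by hash, replicas on the following nodes" are assumptions made for the sketch.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Process-independent hash (the built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def placement(key, nodes, replication_factor=2):
    """Nodes holding `key`: a primary chosen by hash (sharding),
    followed by the next nodes in list order (replication)."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    start = stable_hash(key) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def is_available(key, nodes, failed, replication_factor=2):
    """A key stays readable while at least one of its replicas is alive."""
    return any(n not in failed for n in placement(key, nodes, replication_factor))


nodes = ["db-a", "db-b", "db-c", "db-d"]  # hypothetical node names
holders = placement("/door1/isLocked", nodes)
survives_primary_failure = is_available("/door1/isLocked", nodes, failed={holders[0]})
survives_total_failure = is_available("/door1/isLocked", nodes, failed=set(holders))
```

Each key lives on only `replication_factor` of the nodes, so adding nodes spreads the data (scalability), while losing one holder leaves the key readable (availability).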

  9. Data structure
     Databases organise data in various ways:
     • Key-value
     • Relational
     • Wide column
     • Document
     • Graph
     • . . . and a lot of hybrids
     Consequences for DS2OS:
     • Few limitations because the VSL data model is simple.
     • But: Hierarchical structures must be queried efficiently.
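
One way a flat key-value layout can still answer hierarchical queries is to keep the keys sorted, so that a subtree becomes a contiguous key range. The sketch below assumes slash-separated paths as keys; the `SortedKeyStore` class and the example paths are hypothetical and not taken from the VSL.

```python
import bisect


class SortedKeyStore:
    """Toy ordered key-value store with subtree (prefix) queries."""

    def __init__(self):
        self._keys = []    # kept sorted
        self._values = {}

    def put(self, path, value):
        if path not in self._values:
            bisect.insort(self._keys, path)
        self._values[path] = value

    def subtree(self, prefix):
        """Return all (path, value) pairs strictly below `prefix`."""
        root = prefix.rstrip("/")
        low = root + "/"
        high = root + "0"  # "0" is the character directly after "/" in ASCII
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_left(self._keys, high)
        return [(k, self._values[k]) for k in self._keys[start:end]]


store = SortedKeyStore()
store.put("/room1/door1/isLocked", True)
store.put("/room1/door2/isLocked", True)
store.put("/room10/temperature", 21.5)
store.put("/room2/temperature", 19.0)
room1_paths = [path for path, _ in store.subtree("/room1")]
```

The range bounds exclude siblings that merely share a textual prefix, such as `/room10`, which a plain `startswith("/room1")` filter would wrongly include.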

  10. Concurrency Control
      • Services need an isolated view on the data; otherwise there is a risk of inconsistent data and control flaws.
      • ⇒ The database must isolate concurrent access.
      • Conventional RDBMS provide this through ACID transactions.
      • This is difficult to provide in distributed settings, so few distributed databases offer it.

  11. Other requirements
      • Consistency: Linearizability, conflict freedom
      • Latency: The lower the better
      • Interface: Java API or SQL over JDBC
      • Storage: Disk or in-memory possible, but should be durable
      • License: Open Source
      • Documentation
      • Ideally low resource consumption
      • Optionally: Offers triggers to handle subscriptions

  12. Database Survey

      | Database    | Access Semantics                 | Sharding + Replication | Transactions | Strong Consistency |
      |-------------|----------------------------------|------------------------|--------------|--------------------|
      | Zookeeper   | Key-value + key-prefixes         | -                      | X            | -                  |
      | CouchDB     | Documents                        | -                      | -            | -                  |
      | Cassandra   | Wide column + secondary indices  | X                      | (X)          | -                  |
      | HBase       | Wide column                      | X                      | X            | -                  |
      | VoltDB      | Relational                       | X                      | X            | X                  |
      | CockroachDB | Relational                       | X                      | X            | X                  |
      | Riak        | Key-value                        | X                      | (X)          | -                  |
      | Scalaris    | Key-value                        | X                      | X            | X                  |
      | Voldemort   | Key-value                        | X                      | -            | -                  |
      | Infinispan  | Typed key-value                  | X                      | X            | X                  |
      | Hazelcast   | Typed key-value                  | (X)                    | X            | X                  |
      | Ignite      | Typed key-value                  | X                      | X            | X                  |
      | Geode       | Typed key-value                  | (X)                    | X            | X                  |

      Table 1: Major Distributed Databases

  13. CockroachDB
      CockroachDB [1] is a “NewSQL” store similar to Google Spanner [2].
      • RDBMS semantics with additional distribution and fault tolerance.
      • Uses a directory, organised as a two-level hierarchy, to distribute and locate data.
      • Raft consensus for row-level linearizable consistency among replicas.
      • Uses Multi-Version Concurrency Control (MVCC) with snapshot isolation; timestamp order is provided by hybrid logical and physical clocks.
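
The idea behind MVCC with snapshot isolation can be sketched with a toy single-process store. This is a generic textbook-style illustration, not CockroachDB's implementation: the class names are hypothetical, a plain counter stands in for the hybrid clock, and conflicts are resolved with a first-committer-wins rule, which is an assumption of the sketch.

```python
class MVCCStore:
    """Toy multi-version store: every commit appends a new version."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending
        self._clock = 0      # logical counter standing in for a hybrid clock

    def begin(self):
        return Transaction(self, self._clock)

    def _read(self, key, snapshot_ts):
        # Newest version that was committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def _commit(self, txn):
        # First-committer-wins: abort if a written key changed after our snapshot.
        for key in txn.writes:
            versions = self._versions.get(key, [])
            if versions and versions[-1][0] > txn.start_ts:
                return False
        self._clock += 1
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return True


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store._read(key, self.start_ts)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        return self.store._commit(self)


store = MVCCStore()
setup = store.begin()
setup.set("/temp", 20)
setup_ok = setup.commit()

reader = store.begin()
writer = store.begin()
writer.set("/temp", 25)
writer_ok = writer.commit()

snapshot_value = reader.get("/temp")  # still the value from reader's snapshot
reader.set("/temp", 30)
reader_ok = reader.commit()           # conflicts with writer's newer version
latest_value = store.begin().get("/temp")
```

Checking only the write sets, as this sketch does, catches lost updates but not write skew, where two transactions read overlapping data and write different keys.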

  14. Infinispan
      Infinispan [4] is a key-value store for Java objects with additional indexing capabilities, similar to Hazelcast, Geode and Ignite.
      • Configurable modes of distribution (replication and/or sharding and/or caching).
      • 2-phase commit protocol for transactions (optimistic or pessimistic, configurable).
      • Consistent hashing to distribute and locate data.
      • Highest possible isolation level is repeatable read.
      • Consistency guarantees are weakened if node churn or network partitions occur.
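
Consistent hashing, mentioned above as the way data is distributed and located, can be sketched with a generic hash ring. This is not Infinispan's actual algorithm; the `HashRing` class, the use of virtual nodes, and the node names are assumptions made for illustration.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns several points on the ring to even out the load.
        self._ring = sorted(
            (ring_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect_right(self._points, ring_hash(key)) % len(self._ring)
        return self._ring[index][1]


keys = [f"/sensor{i}/value" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])

moved = [k for k in keys if before.owner(k) != after.owner(k)]
moved_only_to_new_node = all(after.owner(k) == "node-d" for k in moved)
```

When a node joins, keys either stay where they were or move to the new node, which is why node churn only affects a fraction of the data.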

  15. Transactions
      Transactions are not yet supported in DS2OS, so the API has to be extended.

  16. Load Generation
      Classification of services:
      • Sensors: Generate data. Write frequently to one VSL node.
      • Actuators: Consume data. Read frequently from one VSL node.
      • Complex device: Device with encapsulated logic. Transactional read and write access on one VSL node.
      • Knowledge Inference: Aggregate information. Transactionally read data from n sensor nodes, write to one VSL node.
      • Coordination: Implement control logic. Read from n inferred knowledge or sensor nodes, and write to m VSL nodes.
      Total load is a mixture of these.
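
A load generator that mixes these service classes might look like the following sketch. The function names, the weights, the node names, and the values of n and m are all hypothetical; whether coordination requests run transactionally is not stated on the slide and is marked as an assumption in the code.

```python
import random
from collections import Counter


def make_request(kind, rng, nodes, n=3, m=2):
    """Build one request as a dict of read set, write set and transaction flag."""
    if kind == "sensor":
        reads, writes, txn = [], [rng.choice(nodes)], False
    elif kind == "actuator":
        reads, writes, txn = [rng.choice(nodes)], [], False
    elif kind == "complex_device":
        node = rng.choice(nodes)
        reads, writes, txn = [node], [node], True
    elif kind == "knowledge_inference":
        reads, writes, txn = rng.sample(nodes, n), [rng.choice(nodes)], True
    elif kind == "coordination":
        # Assumption: treated as transactional; the slide does not say.
        reads, writes, txn = rng.sample(nodes, n), rng.sample(nodes, m), True
    else:
        raise ValueError(f"unknown service class: {kind}")
    return {"kind": kind, "reads": reads, "writes": writes, "transactional": txn}


def generate_load(mix, count, nodes, seed=0):
    """Draw `count` requests; `mix` maps service class to relative weight."""
    rng = random.Random(seed)
    kinds = list(mix)
    weights = [mix[k] for k in kinds]
    chosen = rng.choices(kinds, weights=weights, k=count)
    return [make_request(kind, rng, nodes) for kind in chosen]


nodes = [f"/node{i}" for i in range(10)]  # hypothetical VSL node addresses
mix = {"sensor": 50, "actuator": 30, "complex_device": 10,
       "knowledge_inference": 7, "coordination": 3}
load = generate_load(mix, count=1000, nodes=nodes, seed=42)
per_class = Counter(request["kind"] for request in load)
```

Varying one weight in `mix` while keeping the others fixed gives the kind of per-class load sweep that an evaluation of the backends would need.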

  17. Performance Evaluation
      Measure:
      • read latency
      • write latency
      • commit latency for transactions
      • commit success rate
      • resource consumption
      Dependent on:
      • Number of DB nodes
      • DB node failures
      • Load of each service class while fixing the load of the others
      For each DB backend.
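
Latency and commit success rate could be collected with a small harness like the one below. It is a sketch under assumptions: `measure` is a hypothetical helper, the operation is any callable that returns a truthy value on success, and the choice of median and 95th percentile as summary statistics is not taken from the slides.

```python
import statistics
import time


def measure(operation, repetitions=100):
    """Time `operation` repeatedly and summarise latency and success rate."""
    latencies = []
    successes = 0
    for _ in range(repetitions):
        start = time.perf_counter()
        ok = operation()
        latencies.append(time.perf_counter() - start)
        successes += bool(ok)
    return {
        "median_s": statistics.median(latencies),
        # quantiles(n=20) yields 19 cut points; the last is the 95th percentile.
        "p95_s": statistics.quantiles(latencies, n=20)[18],
        "success_rate": successes / repetitions,
    }


# Stand-in for a transactional commit that fails every second time.
outcomes = iter([True, False] * 50)
result = measure(lambda: next(outcomes), repetitions=100)
```

In a real run, `operation` would wrap a read, write or commit against the database backend under test, and the harness would be repeated for each node count and failure scenario.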

  18. Schedule
      Figure 4: Gantt Chart. Time axis: calendar weeks KW36 (September 2017) to KW13 (March 2018). Tasks: Analysis, Mid talk, Design and implementation, Design, Evaluation, Writing the thesis, Hand-in, Final talk and Colloquium.

  19. Bibliography
      [1] Cockroach Labs. CockroachDB docs – architecture, 2017.
      [2] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, et al. Spanner: Google’s globally distributed database. ACM Transactions on Computer Systems (TOCS), 31(3):8, 2013.
      [3] M.-O. Pahl. Distributed Smart Space Orchestration. Dissertation, Technische Universität München, München, 2014.
      [4] The Infinispan community. Infinispan 9.1 user guide, 2018.
