IoT Platform using Geode and ActiveMQ: Scalable IoT Platform



SLIDE 1

IoT Platform using Geode and ActiveMQ

Swapnil Bawaskar

@sbawaskar sbawaskar@apache.org

Scalable IoT Platform

SLIDE 2

Agenda

  • Introduction
  • IoT
  • MQTT
  • Apache ActiveMQ Artemis
  • Apache Geode
  • Real world use case
  • Q&A

SLIDE 3

IoT

  • Devices collect and send data to brokers
  • Clients process data to deliver business value
  • IoT data platform considerations
    • Protocol
    • How to read
    • How to analyze
    • How to scale

SLIDE 4

Protocol

  • MQTT
    • Message Queuing Telemetry Transport
    • Based on TCP/IP
    • Optimized binary protocol
    • No type system
    • Provides different QoS levels
    • Low energy consumption
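
To make "optimized binary protocol", "no type system" and "QoS levels" concrete, here is a minimal sketch of how an MQTT 3.1.1 PUBLISH packet is laid out on the wire. The function names, topic and payload are illustrative assumptions, not part of the talk; the payload is just opaque bytes, and the QoS level occupies two bits of the first header byte.

```python
import struct

def encode_remaining_length(n):
    """MQTT variable-length integer: 7 data bits per byte, high bit set when more bytes follow."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit
        out.append(digit)
        if n == 0:
            return bytes(out)

def encode_publish(topic, payload, qos=0, packet_id=1):
    """Build a PUBLISH packet (hypothetical helper; DUP and RETAIN flags left at 0)."""
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1 or 2")
    topic_bytes = topic.encode("utf-8")
    # Variable header: length-prefixed topic, plus a packet id only when QoS > 0.
    variable_header = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        variable_header += struct.pack("!H", packet_id)
    first_byte = (3 << 4) | (qos << 1)  # packet type 3 = PUBLISH
    body = variable_header + payload    # payload is untyped bytes
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body

packet = encode_publish("a/b", b"21.5")
```

With these inputs the whole message, headers included, is 11 bytes, which gives a sense of why the protocol suits constrained devices.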

SLIDE 5

ActiveMQ Artemis

  • Subproject of ActiveMQ
  • Non-blocking architecture
  • High performance
  • Multi-protocol
  • Embeddable
  • Clustered
  • Persistence
    • Journaled
    • Relational database

SLIDE 6

Scaling

  • When dealing with a large number of devices

SLIDE 7

Scaling

  • Cluster
    • Brokers form a cluster
    • Clients are load balanced

SLIDE 8

Scaling

  • Cluster
    • Need to scale the processors

SLIDE 9

Scaling

  • Cluster
    • Processors do not see all data
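
A small sketch of why this matters, assuming each message is handed to exactly one processor in round-robin order (the slides do not state the distribution policy, and the device and processor names are made up): a processor that sees only its share of one device's readings can miss the reading that matters.

```python
from itertools import cycle

def distribute(messages, processors):
    """Round-robin each message to one processor, as a load balancer might (assumed policy)."""
    seen = {p: [] for p in processors}
    for msg, p in zip(messages, cycle(processors)):
        seen[p].append(msg)
    return seen

# Four readings from one device; the last one is a spike.
readings = [("device-1", value) for value in (20, 22, 24, 90)]
seen = distribute(readings, ["processor-A", "processor-B"])

# processor-A receives the 1st and 3rd readings only, so it never sees the spike.
a_values = [value for _, value in seen["processor-A"]]
```

Any analysis that needs the full history of a device therefore needs a shared view of the data rather than whatever happened to land on one processor.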

SLIDE 10

Scaling

SLIDE 11

Scaling

GEODE

SLIDE 12

What is it?

SLIDE 13

What is it?

A distributed, memory-based data management platform for data-oriented apps that need:

  • high performance, scalability, resiliency and continuous availability
  • fast access to critical data sets
  • location-aware distributed data processing
  • event-driven data architecture

SLIDE 14

Numbers Everyone Should Know

Operation                                    ns       ms
L1 cache reference                          0.5
Branch mispredict                             5
L2 cache reference                            7
Mutex lock/unlock                           100
Main memory reference                       100
Compress 1K bytes with Zippy             10,000     0.01
Send 1K bytes over 1 Gbps network        10,000     0.01
Read 1 MB sequentially from memory      250,000     0.25
Round trip within same datacenter       500,000      0.5
Disk seek                            10,000,000       10
Read 1 MB sequentially from network  10,000,000       10
Read 1 MB sequentially from disk     30,000,000       30
Send packet CA->Netherlands->CA     150,000,000      150

http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
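
The case for keeping data in memory falls out of simple arithmetic on these figures. The sketch below only divides numbers taken from the slide; the dictionary keys are names chosen here for illustration.

```python
# Latencies in nanoseconds, copied from the slide.
LATENCY_NS = {
    "read_1mb_memory": 250_000,
    "round_trip_same_datacenter": 500_000,
    "disk_seek": 10_000_000,
    "read_1mb_disk": 30_000_000,
}

def ratio(slow, fast):
    """How many times slower one operation is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

disk_vs_memory = ratio("read_1mb_disk", "read_1mb_memory")
seek_vs_round_trip = ratio("disk_seek", "round_trip_same_datacenter")
```

By these numbers, reading 1 MB from disk is 120 times slower than reading it from memory, and a single disk seek costs as much as 20 round trips inside the datacenter, so fetching from another machine's memory can beat going to local disk.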

SLIDE 15

Who are the users?

  • 17 billion records in memory
    • GE Power & Water's Remote Monitoring & Diagnostics Center
  • 3 TB operational data in-memory, 400 TB archived
    • China Railways
  • 4.6 Million transactions a day / 40K transactions a second
    • China Railways
  • 120,000 Concurrent Users
    • Indian Railways

SLIDE 16

World population: ~7,349,000,000

  • China Railway Corporation (population of China: 1,401,586,609)
  • Indian Railways (population of India: 1,251,695,616)

Together: ~36% of the world population

SLIDE 17

Regions

  • Distributed key-value store (java.util.concurrent.ConcurrentMap)
  • Called a Region because of the old JSR-107 spec
  • Both keys and values can be domain objects

[Diagram: Server 1, Server 2 and Server 3 holding Key1/value1 and Key2/value2, shown for a Partitioned region and a Replicated region]
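
The difference between the two region types can be sketched with plain dictionaries. This is an illustration only, not Geode's API: the server names, the crc32-modulo placement and both helper functions are assumptions, and the sketch ignores buckets and redundant copies.

```python
import zlib

SERVERS = ["server1", "server2", "server3"]

def owner(key):
    """Pick one server per key with a stable hash (crc32 gives the same result on every run)."""
    return SERVERS[zlib.crc32(key.encode("utf-8")) % len(SERVERS)]

def put_partitioned(stores, key, value):
    """Partitioned: each entry lives on exactly one server."""
    stores[owner(key)][key] = value

def put_replicated(stores, key, value):
    """Replicated: every server holds a full copy."""
    for store in stores.values():
        store[key] = value

partitioned = {s: {} for s in SERVERS}
replicated = {s: {} for s in SERVERS}
for key, value in [("Key1", "value1"), ("Key2", "value2")]:
    put_partitioned(partitioned, key, value)
    put_replicated(replicated, key, value)

partitioned_entries = sum(len(store) for store in partitioned.values())
replicated_entries = sum(len(store) for store in replicated.values())
```

Two entries cost two slots in the partitioned layout and six in the replicated one: partitioning scales capacity with the number of servers, while replication makes every read local.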

SLIDE 18

Functions

  • Deploy Function on all servers
  • Runs in-process with the servers

[Diagram: a Function deployed to Server 1, Server 2 and Server 3, each holding Key1/value1 and Key2/value2]

SLIDE 19

Functions

[Diagram: a Function deployed to Server 1, Server 2 and Server 3, with entries Key1 through Key6 spread across the servers]

  • Deploy Function on all servers
  • Runs in-process with the servers
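
The idea of moving the computation to the data can be sketched as follows. This is not Geode's function-execution API: the server layout, the numeric values and both helpers are hypothetical, and the "servers" are just dictionaries in one process.

```python
# Hypothetical layout: each server holds its own slice of a partitioned region.
servers = {
    "server1": {"Key1": 10, "Key2": 20},
    "server2": {"Key3": 30, "Key4": 40},
    "server3": {"Key5": 50, "Key6": 60},
}

def local_sum(local_data):
    """The function: runs next to the data and returns a small partial result."""
    return sum(local_data.values())

def execute_on_all(function, servers):
    """Run the function against each server's local data and collect the partial results."""
    return {name: function(data) for name, data in servers.items()}

partials = execute_on_all(local_sum, servers)
total = sum(partials.values())
```

Only the three partial sums cross the network in this model, not the six entries, which is the point of running in-process with the servers.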

SLIDE 20

Query

  • Object Query Language (OQL)
    • Similar to SQL
    • SELECT DISTINCT * FROM /exampleRegion WHERE status = 'active'
  • You can drill down into domain objects
    • SELECT p.name FROM /person p WHERE p.pet.type = 'dino'
  • You can also invoke methods on your domain objects
    • SELECT DISTINCT * FROM /person p WHERE p.children.size >= 2
  • Joins possible
    • Between Replicate regions
    • Between one Partitioned region and Replicate regions
    • SELECT portfolio1.ID, portfolio2.status FROM /exampleRegion portfolio1, /exampleRegion2 portfolio2 WHERE portfolio1.status = portfolio2.status
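
For readers new to OQL, the drill-down and method-invocation queries read much like filters over nested objects. The sketch below is a plain-Python analogue with made-up sample data, not a way of running OQL.

```python
# Hypothetical contents of a /person region.
people = [
    {"name": "Fred", "pet": {"name": "Barney", "type": "dino"}, "children": ["C1"]},
    {"name": "Alice", "pet": {"name": "Tom", "type": "cat"}, "children": ["C2", "C3"]},
]

# SELECT p.name FROM /person p WHERE p.pet.type = 'dino'
dino_owner_names = [p["name"] for p in people if p["pet"]["type"] == "dino"]

# SELECT DISTINCT * FROM /person p WHERE p.children.size >= 2
large_families = [p for p in people if len(p["children"]) >= 2]
```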

SLIDE 21

Continuous Query

  • Enables event-driven apps
  • Register a query with the server
    • SELECT * FROM /tradeOrder t WHERE t.symbol = 'VMW' AND t.price > 100.00
  • The server then notifies the client when the query condition is met
  • Client implements the CqListener callback
  • HA support
  • Domain objects not required on the server's class-path
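
The register-then-notify flow can be sketched in a few lines. The class below is a hypothetical stand-in, not Geode's continuous query API; it only models puts that match the condition, with the listener playing the role of the client's CqListener.

```python
class ContinuousQueryServer:
    """Toy model: stores entries and pushes matching ones to registered listeners."""

    def __init__(self):
        self.region = {}
        self.queries = []  # list of (predicate, listener) pairs

    def register(self, predicate, listener):
        self.queries.append((predicate, listener))

    def put(self, key, value):
        self.region[key] = value
        for predicate, listener in self.queries:
            if predicate(value):
                listener(key, value)  # the server calls back; the client never polls

events = []
server = ContinuousQueryServer()

# Mirrors: SELECT * FROM /tradeOrder t WHERE t.symbol = 'VMW' AND t.price > 100.00
server.register(
    lambda t: t["symbol"] == "VMW" and t["price"] > 100.00,
    lambda key, t: events.append((key, t["price"])),
)

server.put("order-1", {"symbol": "VMW", "price": 99.50})
server.put("order-2", {"symbol": "VMW", "price": 101.25})
server.put("order-3", {"symbol": "IBM", "price": 150.00})
```

Only the second order satisfies both conditions, so the listener fires once.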

SLIDE 22

Fixed or flexible schema?

id | name | age | pet_id

or

{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }

SLIDE 23

Portable Data eXchange

  • C#, C++, Java, JSON
  • No IDL, no schemas, no hand-coding
  • Schema evolution (forward and backward compatible)
  • Domain object classes not required
  • No need to bring down cluster when domain objects change

| header                        | data             |
| pdx | length | dsid | typeid | fields | offsets |

SLIDE 24

Efficient for queries

{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }

SELECT p.name FROM /Person p WHERE p.pet.type = "dino"

single field deserialization
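
Single field deserialization becomes possible once the serialized form carries an offset per field. The layout below is a simplified assumption made for illustration (a field count, then an offset table, then the field bytes); it is not the actual PDX wire format, and both function names are hypothetical.

```python
import struct

def serialize(fields):
    """Assumed layout: uint16 field count | one uint32 offset per field | field bytes."""
    offsets, data = [], b""
    for field in fields:
        offsets.append(len(data))
        data += field
    header = struct.pack("!H", len(fields))
    table = struct.pack(f"!{len(fields)}I", *offsets)
    return header + table + data

def read_field(blob, index):
    """Return one field by jumping to its offset, without decoding the other fields."""
    (count,) = struct.unpack_from("!H", blob, 0)
    if not 0 <= index < count:
        raise IndexError("no such field")
    table_start = 2
    data_start = table_start + 4 * count
    (start,) = struct.unpack_from("!I", blob, table_start + 4 * index)
    if index + 1 < count:
        (end,) = struct.unpack_from("!I", blob, table_start + 4 * (index + 1))
    else:
        end = len(blob) - data_start  # last field runs to the end of the data
    return blob[data_start + start:data_start + end]

# Fields in order: id, name, age, pet type.
blob = serialize([b"1", b"Fred", b"42", b"dino"])
pet_type = read_field(blob, 3)
```

A query that filters on the pet type touches one slice of the bytes and never has to rebuild the whole object.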

SLIDE 25

But how fast is it?

Benchmark: https://github.com/eishay/jvm-serializers

SLIDE 26

Schema evolution

[Diagram: Member A, Member B, Distributed Type Definitions, v1 and v2 objects, Application #1, Application #2]

  • v2 objects preserve data from missing fields
  • v1 objects use default values to fill in new fields
  • PDX provides forwards and backwards compatibility, no code required
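
The two compatibility rules can be sketched with dictionaries standing in for serialized objects. The schemas, field names and helper functions are assumptions chosen for illustration; PDX applies these rules internally, without application code.

```python
# Hypothetical schema versions: field name -> default value.
V1_FIELDS = {"id": 0, "name": ""}
V2_FIELDS = {"id": 0, "name": "", "age": 0}

def read_as(schema, stored):
    """Read a stored record with one schema version.

    Fields the schema knows but the record lacks get defaults;
    fields the schema does not know are kept aside, not dropped.
    """
    obj = {field: stored.get(field, default) for field, default in schema.items()}
    unread = {field: value for field, value in stored.items() if field not in schema}
    return obj, unread

def write_back(obj, unread):
    """Re-attach the fields the reader did not understand."""
    return {**unread, **obj}

# A v1 application edits a record written by a v2 application: age survives.
obj, unread = read_as(V1_FIELDS, {"id": 1, "name": "Fred", "age": 42})
obj["name"] = "Frederick"
stored = write_back(obj, unread)

# A v2 application reads a record written by a v1 application: age defaults to 0.
upgraded, _ = read_as(V2_FIELDS, {"id": 2, "name": "Alice"})
```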

SLIDE 27

IoT Use Case

  • Telemetry data from machines
  • Predicting failure
    • Outside one standard deviation
    • Evaluating Markov model
  • Use functions to iterate over data
  • Use CQs to notify
  • Update CQs based on function results
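
The "outside one standard deviation" check is easy to sketch with the standard library. The readings and the function name are made up for illustration, and the talk does not say whether the sample or population standard deviation was used; the sample version is assumed here.

```python
from statistics import mean, stdev

def outliers(readings, k=1.0):
    """Return readings more than k sample standard deviations away from the mean."""
    mu = mean(readings)
    sigma = stdev(readings)
    return [x for x in readings if abs(x - mu) > k * sigma]

# Hypothetical temperature telemetry from one machine.
temperatures = [70, 71, 69, 70, 72, 70, 95]
flagged = outliers(temperatures)
```

In the flow described on the slide, a check like this would run inside a function over the stored telemetry, and its result would set the threshold that a continuous query then watches.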

SLIDE 28

Questions?