Lizard: A Linked Data Publishing Platform
Andy Seaborne, Epimorphics Ltd.
Outline
● The (real) world of service provision
● What to do about (some of) it
● How to do that
Who am I? Andy Seaborne
● Editor on SPARQL Query
● A committer on Apache Jena
● At Epimorphics Ltd
This work
➢ Epimorphics
➢ Funding: Innovate UK *
➢ Users
○ For the discussion and encouragement
* Formerly the Technology Strategy Board; part of the UK Department for Business, Innovation & Skills.
Example Services
● http://environment.data.gov.uk/
● http://landregistry.data.gov.uk/
Customer Requirements
● Maximise usage
● Publication, not application
Running Services
Data publishing != database-backed web site
● Different traffic patterns
○ Expensive queries, less control over them
○ Bot multiplier effect
● "Admin"
○ SLAs: e.g. responding to Heartbleed
Problem Statement
● Reacting to events
● Machine administration / SLAs
Goals
● 24x7 operation
● Consistency
About Consistency
Consistency makes the system easier to use
○ For users
○ For operators
Each query sees an unchanging database
… one that actually existed at some point; no "bit of this, bit of that"
Clients may conspire (and compare results)!
Apache Jena TDB
➢ Node table: NodeId ⇄ RDF term
○ Inline values (integers, date/dateTime, …)
➢ Indexes: SPO, POS, OSP
➢ Indexes are covering
○ Range scans
○ All key, no value
○ No "triple table"
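A minimal sketch of this layout (illustrative data structures, not Jena's actual API): a node table allocates an integer NodeId for each RDF term, and every triple is entered, as NodeIds, into three covering indexes, one per ordering. There is no separate triple table; the indexes are the data.

```python
# Sketch of a TDB-style triple store layout (illustrative, not Jena's API):
# a node table maps RDF terms to integer NodeIds, and three covering
# indexes (SPO, POS, OSP) each hold every triple as a tuple of NodeIds.

class NodeTable:
    def __init__(self):
        self.term_to_id = {}
        self.id_to_term = []

    def get_id(self, term):
        # Allocate a NodeId on first sight of a term.
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

class TripleStore:
    def __init__(self):
        self.nodes = NodeTable()
        # Each index is "all key, no value": the full triple is the key.
        self.spo, self.pos, self.osp = set(), set(), set()

    def add(self, s, p, o):
        si, pi, oi = (self.nodes.get_id(t) for t in (s, p, o))
        self.spo.add((si, pi, oi))
        self.pos.add((pi, oi, si))
        self.osp.add((oi, si, pi))

store = TripleStore()
store.add(":x1", ":p", "123")
store.add(":x1", ":q", ":y")
print(len(store.spo))  # 2
```

In real TDB, inline values (small integers, dates, …) are encoded directly into the NodeId bits and never touch the node table; the sketch above interns everything for simplicity.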
SPARQL Execution
{ ?x :p 123 . }
● Convert the constants to NodeIds (123 is an inline value in TDB)
● Scan POS for the prefix (:p, 123); each match binds ?x to the subject

{ ?x :p 123 . ?x :q ?v . }
● A database join
● Index join (loop + substitution): for each binding :x1 of ?x, look up (:x1, :q, ?v) in the index
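The loop-plus-substitution strategy can be sketched as follows (illustrative NodeIds and sorted-list indexes, not TDB internals): scan POS with the prefix (:p, 123) to bind ?x, then for each binding substitute ?x and scan SPO with the prefix (?x, :q).

```python
from bisect import bisect_left

# Sketch of index-join execution for { ?x :p 123 . ?x :q ?v }
# (illustrative data structures, not Jena's internals).
# Indexes are sorted lists of NodeId triples; a prefix scan is a
# binary search for the first match followed by a linear walk.

def prefix_scan(index, prefix):
    i = bisect_left(index, prefix)
    while i < len(index) and index[i][:len(prefix)] == prefix:
        yield index[i]
        i += 1

# Hypothetical NodeIds: :p=1, :q=2, 123=3, :x1=10, :x2=11, values 20/21
pos = sorted([(1, 3, 10), (1, 3, 11), (2, 20, 10)])               # (P, O, S)
spo = sorted([(10, 1, 3), (10, 2, 20), (11, 1, 3), (11, 2, 21)])  # (S, P, O)

results = []
# First pattern: scan POS with prefix (:p, 123) to bind ?x.
for _p, _o, x in prefix_scan(pos, (1, 3)):
    # Second pattern: substitute ?x and scan SPO with prefix (x, :q).
    for _s, _q, v in prefix_scan(spo, (x, 2)):
        results.append((x, v))

print(results)  # [(10, 20), (11, 21)]
```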
Index Implementation
➢ TDB uses threaded B+Trees for its indexes
○ 8K blocks; ~100-way branching
(Diagram: a B+Tree of SPO keys — internal blocks hold keys and block pointers; the leaf blocks hold the full SPO keys and are chained together.)
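A toy sketch of why this structure suits covering indexes (a static two-level tree with tiny blocks; real TDB trees are multi-level with 8K blocks and ~100-way branching): leaves hold the complete keys and are chained, so a range scan descends the tree once and then walks the leaf chain.

```python
from bisect import bisect_right

# Minimal sketch of a static two-level B+Tree over triple keys.
# Leaves hold the full keys ("all key, no value") and are threaded,
# so a range scan is: descend once, then walk the leaf chain.

BLOCK = 4  # tiny block size, for the sketch only

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # threaded leaf chain

def build(sorted_keys):
    leaves = [Leaf(sorted_keys[i:i + BLOCK])
              for i in range(0, len(sorted_keys), BLOCK)]
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    separators = [leaf.keys[0] for leaf in leaves]  # internal routing keys
    return separators, leaves

def range_scan(separators, leaves, lo, hi):
    # Descend: pick the rightmost leaf whose first key is <= lo.
    i = max(bisect_right(separators, lo) - 1, 0)
    leaf = leaves[i]
    while leaf:
        for k in leaf.keys:
            if k >= hi:
                return
            if k >= lo:
                yield k
        leaf = leaf.next

triples = sorted((s, p, o) for s in range(3) for p in range(2) for o in range(2))
seps, leaves = build(triples)
# All triples with subject 1: the key range [(1,), (2,)).
subject1 = list(range_scan(seps, leaves, (1,), (2,)))
print(subject1)  # [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
```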
Choices
Where to introduce distribution?
● Query and update
● Indexes / B+Trees
● Node table / objects
● Key → value store
● Blocks
This Does Not Work (very well)
Split at the block layer: keep query and update, the B+Trees, objects, and index access on the query processor, and put the blocks in a distributed key→value store.
➢ Easy to do (pick a KV store of your choice)
➢ Impedance mismatch
○ Too much data moving about
○ Little parallelism
○ Bad cold start
Distribute Query and Update
Split above the B+Trees, objects, blocks, and key→value layers:
➢ Distribute the indexes
○ With modified index access
➢ Distribute the node table
➢ Comms: Apache Thrift
Clustered Node Table
➢ Node table
○ N replicas; read quorum R, write quorum W
○ e.g. W=N and R=1 ⇒ a complete copy of the node table on each data server
○ Can shard
○ Replaceable
Requirement: stable NodeIds for naming
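A sketch of the replication scheme (illustrative only; this is not Lizard's actual protocol): with W=N every write reaches all replicas, so a read needs to consult only one replica (R=1) and still sees a complete copy of the node table.

```python
import random

# Sketch of an N-replica node table with write quorum W and read quorum R.
# W + R > N guarantees every read quorum overlaps every write quorum.
# With W=N and R=1, each data server holds a complete copy of the table.

class ReplicatedNodeTable:
    def __init__(self, n, w, r):
        assert w + r > n, "quorum overlap requires W + R > N"
        self.replicas = [{} for _ in range(n)]
        self.w, self.r = w, r

    def write(self, node_id, term):
        # Send the write to W replicas (here: the first W, for simplicity).
        for rep in self.replicas[:self.w]:
            rep[node_id] = term

    def read(self, node_id):
        # Ask R replicas; any one holding the entry can answer.
        for rep in random.sample(self.replicas, self.r):
            if node_id in rep:
                return rep[node_id]
        return None

table = ReplicatedNodeTable(n=3, w=3, r=1)  # W=N, R=1
table.write(42, "<http://example/x1>")
print(table.read(42))  # <http://example/x1>
```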
Clustered Indexes
➢ Indexes
○ Can shard by subject
○ Replicas of each shard (R=1, W=N)
○ Compound access operations
(Diagram: an index split into shards 1–3, with replicas of the shards placed across machines 1 and 2.)
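Sharding by subject can be sketched as a simple placement policy (illustrative; the shard function and counts here are assumptions, not Lizard's): every triple with the same subject lands on the same shard, so a compound access for one subject touches exactly one shard.

```python
# Sketch of sharding a triple index by subject: hash placement on the
# subject NodeId keeps all of a subject's triples on one shard.

NUM_SHARDS = 3

def shard_of(subject_id):
    return subject_id % NUM_SHARDS  # placement by subject

shards = [set() for _ in range(NUM_SHARDS)]

def add(s, p, o):
    shards[shard_of(s)].add((s, p, o))

def lookup_subject(s):
    # All of a subject's triples live on a single shard.
    return sorted(t for t in shards[shard_of(s)] if t[0] == s)

add(10, 1, 3)
add(10, 2, 20)
add(11, 1, 3)
print(lookup_subject(10))  # [(10, 1, 3), (10, 2, 20)]
```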
Modified SPARQL Execution
➢ Different unit of index access
○ Subject + several predicates: (subj, pred1, pred2, pred3, …)
➢ Different join algorithms
○ Merge join
○ Parallel hash join
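A sketch of the merge-join option (illustrative data and names): when two index scans both deliver their rows sorted by the shared join variable, e.g. subject-ordered shard scans for ?x, the join is a single synchronized pass over both streams.

```python
# Sketch of a merge join over two scans sorted by the join key ?x,
# as for { ?x :p ?a . ?x :q ?b } with subject-ordered index access.

def merge_join(left, right):
    """left: sorted (x, a) pairs; right: sorted (x, b) pairs."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lx, rx = left[i][0], right[j][0]
        if lx < rx:
            i += 1
        elif lx > rx:
            j += 1
        else:
            # Emit the cross product with the matching group on the right.
            j2 = j
            while j2 < len(right) and right[j2][0] == lx:
                out.append((lx, left[i][1], right[j2][1]))
                j2 += 1
            i += 1
    return out

scan_p = [(10, "a1"), (11, "a2"), (12, "a3")]  # ?x ↦ ?a, sorted by ?x
scan_q = [(10, "b1"), (12, "b2"), (12, "b3")]  # ?x ↦ ?b, sorted by ?x
joined = merge_join(scan_p, scan_q)
print(joined)  # [(10, 'a1', 'b1'), (12, 'a3', 'b2'), (12, 'a3', 'b3')]
```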
Configuration 1
(Diagram: a load balancer, or round-robin DNS, in front of two query servers; four data servers, across which the POS index, the PSO index, and the node table are each held in two copies.)
Configuration 2
(Diagram: a load balancer, or round-robin DNS, in front of two query servers; two data servers, each holding a copy of the POS index, the PSO index, and the node table.)
Status
● Working prototype
● Spin-off: TDB2
New Technology
● Copy-on-write indexes
● New transaction coordinator
● Apache Thrift-encoded node table
● Side effect: TDB2
○ Transactions of arbitrary size
○ Transactional-only operation
○ Space recovery
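The copy-on-write idea can be sketched with a persistent tree (a toy binary tree standing in for the B+Tree; illustrative, not TDB2's code): an insert copies only the nodes on the root-to-leaf path and shares everything else, so a reader holding the old root keeps an unchanging snapshot while writers build new versions.

```python
# Sketch of a copy-on-write index update: path copying in a persistent
# binary search tree. Old roots remain valid, unchanged snapshots.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(root, key):
    # Returns a NEW root; shares all untouched subtrees with the old one.
    if root is None:
        return Node(key)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        return Node(root.key, root.left, insert(root.right, key))
    return root  # key already present

def keys(root):
    return [] if root is None else keys(root.left) + [root.key] + keys(root.right)

v1 = None
for k in (5, 2, 8):
    v1 = insert(v1, k)
v2 = insert(v1, 7)          # a new version; v1 is untouched

print(keys(v1))             # [2, 5, 8]
print(keys(v2))             # [2, 5, 7, 8]
print(v1.left is v2.left)   # True: the unchanged subtree is shared
```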
(Image credit: Paul Hirst / CC BY-SA 2.5)