
  1. Secondary reads: the good and the bad Bartłomiej Nogaś

  2. Agenda • Read Preference configuration • Lagging secondaries and stale or missing/duplicated data • What queries can be safely run on secondaries? • Improving read throughput: sharding vs reading from secondaries

  3. Read preference configurations and the impact of step downs

  4.-6. Client Configuration options
  ELIGIBLE NODE: a node that satisfies all the conditions defined in the Read Preference. A client directs reads to eligible nodes at random.
  ● serverSelectionTimeout
  ○ How long to wait for an eligible node
  ○ Defaults to 30 seconds
  ● localThresholdMS (default 15 ms)
  ○ Size of the latency window for selecting among available replica set members
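  These options are typically set when constructing the client. A minimal sketch with the Node.js driver (host names and replica set name are placeholders; serverSelectionTimeoutMS and localThresholdMS are the standard option spellings):

      const { MongoClient } = require("mongodb");

      const client = new MongoClient(
        "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0",
        {
          serverSelectionTimeoutMS: 30000, // wait up to 30 s for an eligible node
          localThresholdMS: 15,            // latency window size (the default)
          readPreference: "secondaryPreferred",
        }
      );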

  7. Latency Window
  ● Every 10 seconds (in 3.2) the driver sends a heartbeat to measure network response time (the last RTT)
  ● Average RTT is a weighted moving function; the last observation has weight 0.2 (the last nine observations carry around 0.85 of the total weight)
  ● localThresholdMS is relative to the server with the lowest average RTT
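  A minimal sketch of that weighted moving average (constant and variable names are illustrative):

      // Exponentially weighted moving average of round-trip time:
      // the newest sample gets weight 0.2; older history decays by 0.8.
      const ALPHA = 0.2;

      function updateAverageRtt(previousAvg, lastRtt) {
        if (previousAvg === null) return lastRtt; // first sample seeds the average
        return ALPHA * lastRtt + (1 - ALPHA) * previousAvg;
      }

      // After nine samples the seed's weight has decayed to 0.8^9 ≈ 0.13,
      // i.e. the last nine samples carry about 0.87 of the total weight,
      // matching the "around 0.85" figure above.
      let avg = null;
      for (const rtt of [20, 22, 19, 21, 25, 18, 20, 23, 21]) {
        avg = updateAverageRtt(avg, rtt);
      }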

  8. Available Read Preference Modes
  ● Primary
  ● Primary preferred
  ● Secondary
  ● Secondary preferred
  ● Nearest

  9. Primary and Secondary
  PRIMARY
  - Read only from the Primary member
  - Exception if no Primary is available
  SECONDARY
  - Read only from secondary members within the latency window
  - Exception if there is no Secondary

  10. Primary and Secondary Preferred
  PRIMARY PREFERRED
  - Read from the Primary member
  - If no Primary is available, follow the procedure for the secondary read preference
  SECONDARY PREFERRED
  - Read from secondary members within the latency window
  - If no Secondary is available, read from the Primary

  11. Nearest
  NEAREST
  - Read from any member within the latency window
  WHEN TO USE
  - If you need the shortest response time
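  In the mongo shell, a mode can be set for the whole connection or per query; a quick sketch (collection and filter are placeholders):

      // Set the read preference for the current shell connection.
      db.getMongo().setReadPref("secondaryPreferred");

      // Or set it on a single cursor.
      db.test.find({ status: "active" }).readPref("nearest");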

  12. Read Preference Tags Multiple DC configuration

  13. Read preference tags
  ● A tag is a single key/value pair, e.g. {"dc": "A"}
  ● A tag set is a document containing zero or more such tags, e.g. {"dc": "A", "role": "backup"}
  ● Tags can't be used with the Primary read preference

  14.-16. Multiple DC configuration
  [Diagram: Primary (P) and Secondary (S1) tagged {"dc": "A"}; Secondary (S2) and Secondary (S3) tagged {"dc": "B"}]
  ● Nearest with tags {"dc": "A"} will choose between P and S1
  ● secondaryPreferred with tags {"dc": "B"} will read from S2, S3, or P
  ● Note: setting Mode: Secondary, Tags: {"dc": "A"} would allow only node S1; in case of failure of this node there will be no eligible members
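  In the mongo shell, tag sets are passed as an array to readPref(); a sketch matching the diagram above:

      // nearest with {"dc": "A"}: selects among P and S1.
      db.test.find().readPref("nearest", [{ dc: "A" }]);

      // secondaryPreferred with {"dc": "B"}: S2 or S3, falling back to the
      // primary (tags are not applied to the primary fallback).
      db.test.find().readPref("secondaryPreferred", [{ dc: "B" }]);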

  17. Agenda • Read Preference configuration • Lagging secondaries and stale or missing/duplicated data • What queries can be safely run on secondaries? • Improving read throughput: sharding vs reading from secondaries

  18. Lagging secondaries and stale or missing data

  19. Stale data
  [Diagram: Primary replicating to a Secondary]
  ● Replication lag
  ○ rs.printSlaveReplicationInfo() (or rs.status())
  ● Typically replication lag should not be bigger than a couple of seconds
  ● The lag can grow large, for example when secondaries use worse hardware than the primary
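  Checking lag from the shell, as a quick sketch (run against any member of the replica set):

      // Prints, for each secondary, how far it trails the primary's oplog.
      rs.printSlaveReplicationInfo();

      // Or inspect rs.status() directly: each member reports its optime.
      rs.status().members.forEach(function (m) {
        print(m.name + " (" + m.stateStr + ") optime: " + m.optimeDate);
      });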

  20.-24. Stale data
  [Diagram: Client (C); Primary (P); Secondary (S1) with lag 2 s; Secondary (S2) with lag 4 s]
  ● Take an example:
  ○ An update is made on P on a document
  ○ The write is replicated to S1
  ○ C reads the document from S1 (and gets the updated version)
  ○ C then reads the same document from S2 (and gets the old record)
  ● It is important to monitor replication lag

  25. Changes in MongoDB 3.4
  [Diagram: Primary (P); Secondary (S1) with lag 2 s; Secondary (S2) with lag 4 s]
  ● A maxStalenessMS parameter is added to the read preference
  ● This parameter defines the maximum replication lag a secondary may have and still be read from
  ● Example: if maxStalenessMS is set to 3000 ms:
  ○ S1, lag 2 s, will be eligible
  ○ S2, lag 4 s, will not be eligible
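  For reference, the option shipped in the final MongoDB 3.4 release as maxStalenessSeconds, expressed in seconds and required to be at least 90; the 3 s value above is illustrative of the pre-release proposal. A connection-string sketch (hosts are placeholders):

      mongodb://host1:27017,host2:27017/?replicaSet=rs0&readPreference=secondary&maxStalenessSeconds=90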

  26. Missing/Duplicated data in Sharded Cluster
  TWO PROBLEMS
  ● Duplicated/outdated data because of orphaned documents (SERVER-3645 - inaccurate count on the primary)
  ● Missing data because of a not yet replicated chunk migration (SERVER-5931 - inconsistent read from secondary in a sharded environment)

  27. Orphaned records and duplicated data
  ORPHANED DOCUMENT: in a sharded cluster, a document that also exists on a shard it does not belong to.
  ● Duplicated and outdated data with the secondary readPreference
  ● Orphaned documents can be left behind by:
  ○ Failed balancer rounds
  ○ Chunk migrations in progress
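  Orphaned documents can be removed with the cleanupOrphaned admin command available in this era of MongoDB; it must be run against each shard's primary (not through mongos). A sketch using the documented iteration pattern (namespace matches the example on the next slides):

      var nextKey = {};
      var result;
      while (nextKey != null) {
        result = db.adminCommand({
          cleanupOrphaned: "test.test",  // namespace to clean
          startingFromKey: nextKey       // resume point within the range
        });
        if (result.ok != 1) break;       // stop on failure or timeout
        nextKey = result.stoppedAtKey;   // null once the range is exhausted
      }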

  28.-30. Orphaned records and duplicated data
  db.test shard key: { "_id": 1 }
  { "_id": MinKey } -> { "_id": 10 } on: test-rs0
  { "_id": 10 } -> { "_id": MaxKey } on: test-rs1
  test-rs0/test, db.test.find():
  {"_id" : 1, "rs": 0}
  {"_id" : 2, "rs": 0}
  test-rs1/test, db.test.find():
  {"_id" : 12, "rs": 1}
  {"_id" : 2, "rs": 1}
  (here {"_id" : 2, "rs": 1} on test-rs1 is an orphan: _id 2 falls in test-rs0's range)
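  A setup like this can be sketched through mongos as follows (database and shard names follow the slides):

      // Shard db.test on _id.
      sh.enableSharding("test");
      sh.shardCollection("test.test", { "_id": 1 });

      // Split at _id 10 so test-rs0 owns _id < 10 and test-rs1 owns the rest.
      sh.splitAt("test.test", { "_id": 10 });
      sh.moveChunk("test.test", { "_id": 10 }, "test-rs1");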

  31. Orphaned records and duplicated data
  Query with readPreference=primary:
  db.test.find().readPref("primary")
  Returns:
  {"_id" : 1, "rs": 0}
  {"_id" : 2, "rs": 0}
  {"_id" : 12, "rs": 1}

  32. Orphaned records and duplicated data
  Query with readPreference=secondary:
  db.test.find().readPref("secondary")
  Returns:
  {"_id" : 1, "rs": 0}
  {"_id" : 2, "rs": 0}
  {"_id" : 12, "rs": 1}
  {"_id" : 2, "rs": 1}
  (_id 2 comes back twice: on the MongoDB versions discussed here, secondaries do not filter out orphaned documents)

  33.-34. Orphaned records and duplicated data
  Query on the shard key with readPreference=secondary:
  find({"_id" : 2}).readPref("secondary")
  Returns: {"_id" : 2, "rs": 0}
  (the shard key in the filter lets mongos target only test-rs0, which owns _id 2, so the orphan is never touched)
  Query without the shard key:
  find({"rs" : 1}).readPref("secondary")
  Returns: {"_id" : 12, "rs": 1} and {"_id" : 2, "rs": 1}
  (the query is broadcast to all shards, so the orphaned document on test-rs1 is returned)

  35. Missing data with an active balancer
  ● By default the balancer migrates chunks with "writeConcern": {"w": 2}
  ● The writeConcern for the balancer can be changed
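  One way to change it is the _secondaryThrottle balancer setting in the config database, as documented for this era of MongoDB; "majority" below is an example value:

      // Run through mongos; raises the write concern used during
      // chunk migration to a majority of the recipient shard's members.
      var configDb = db.getSiblingDB("config");
      configDb.settings.update(
        { "_id": "balancer" },
        { $set: { "_secondaryThrottle": { "w": "majority" } } },
        { upsert: true }
      );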
