Analysis of Derby Performance Staff Engineer Olav Sandst Senior - PowerPoint PPT Presentation

Analysis of Derby Performance Staff Engineer Olav Sandstå Senior Engineer Dyre Tjeldvoll Sun Microsystems Database Technology Group This is a draft version that is subject to change. The authors can be contacted at Olav.Sandstaa@sun.com or Dyre.Tjeldvoll@sun.com 1

Overview ● Introduction – What is a database (DBMS)? – Derby Architecture – What is performance? ● Performance Evaluation of Derby ● Performance Tips ● Comparison with MySQL and PostgreSQL 2

Introduction 3

Derby Architecture ● Embedded ● Network Server Appl. Network Appl. JDBC server JDBC JDBC SQL SQL Access Access Storage Storage 4

What Is “Performance”? ● How do you measure database performance? – Throughput – Response time ● Average? ● Max? ● Median? ● Out-of-the-box or carefully tuned? ● How to compare database systems with different tuning possibilities? ● Zero administration? 5

Performance Evaluation of Derby 6

Performance Evaluation of Derby ● Evaluation of disk and file system configurations ● Comparing Embedded and Network Server – Throughput and response times – Resource usage (CPU, network) ● The effects of adjusting the database buffer size (page cache size) ● The effect of keeping the log on a separate disk 7

Derby Performance Evaluation: Derby, OS and Hardware Configuration ● Derby configuration: – “Out of the box” – “Main memory” database – “Disk” database ● Log device: – Same as data device – Separate log disk ● Disk write cache: – Enabled – Disabled 8

Disk and File System Configurations ● Hard disk: – Write cache ● File system: – Logging/journaling in UFS – Direct I/O (partially evaluated) – File meta-data update (not evaluated) 9

Disk Write Cache: Throughput and Response Time 10

Disk and File System Configurations: Conclusions ● Disk Write Cache: Do users want high throughput and low response times, or durability/recoverability? Who are we optimizing for? ● Without write cache on disks a higher number of clients are needed to get maximum throughput ● Things to evaluate: – UFS logging – Java NIO – Direct I/O (not available from Java) – Avoid update of file meta-data 11

Comparing Embedded with Network Server: TPC-B: Throughput and Response Time 12

Comparing Embedded with Network Server: Single-record SELECT 13

Comparing Embedded with Network Server: Resource Usage Derby Embedded: Derby Network Server: CPU? Network? Appl. Network Appl. JDBC server JDBC JDBC CPU? SQL SQL Network? Access Access Storage Storage 14

Comparing Embedded and Network Server: CPU Usage CPU usage per transaction [ms] 1.4 1.3 1.2 1.1 1 0.9 0.8 User CPU 0.7 System CPU 0.6 0.5 0.4 0.3 0.2 0.1 0 TPC- TPC- Insert Insert Se- Se- Up- Up- B B CS Emb CS lect lect date date Emb Emb Cs Emb CS 15

Comparing Embedded and Network Server: CPU Usage, cont. Increase in CPU usage compared to Embedded: System CPU User CPU Total TPC-B 0.09 64% 0.25 32% 36% Insert 0.04 100% 0.05 21% 33% Select 0.04 133% 0.06 52% 65% Update 0.03 75% 0.04 21% 30% Message handling The Network (TCP/IP) Server Code 16

Comparing Embedded with Network Server: Conclusions ● Reduction in throughput of Network Server compared to Embedded: – Update operations: 5% – Select operations: 30-40% ● Response time: – Small increase of about 0.5 ms per SQL operation ● Utilizing the write cache on the disks increases the difference for update operations ● What can be done? – Profile to find CPU intensive parts in the Network Server code – Optimize 17

Comparing Embedded with Network Server: Conclusions ● The Network Server adds 30%-60% to the CPU usage compared to the Embedded version ● The main causes for increased CPU usage: – System CPU usage: ● Message sending and receiving – User CPU usage: ● Message parsing (Strings) ● Character set conversions 18

Performance Tips 19

Performance Tips ● Programming: – Prepared statements ● Database schema design: – Indices ● Derby configuration: – durability=test , if recovery is not important 20

Prepared Statements ● Compilation is expensive, especially in Derby which uses Java byte code generation ● Prepared statements virtually eliminates this cost ● Easy to fall into this trap for beginners (string concatenation is easy) ● <Example> 21

Indices ● Use indices to optimize much used access paths ● <Example> 22

Relaxed Durability ● Durability is expensive – Log must be written to disk prior to commit – This becomes a major bottleneck (cf. write cache, log on separate device) ● Durability can be disabled – Setting durability=test disables log flushing at commit – Dramatically increases throughput... – ... but database may be corrupt after a crash – Would be better if Derby could recover to a consistent state (with some transaction loss) – An alternative to a main memory database ● <Example> 23

Comparing the Performance of MySQL, PostgreSQL and Derby 24

Performance evaluation: MySQL, PostgreSQL and Derby Evaluated performance of: ● MySQL/InnoDB ● PostgreSQL ● Derby Embedded ● Derby Client/Server 25

What is a Database (DBMS)? ● Transactions ● ACID – Atomicity – Consistency – Isolation – Durability Database Atomicity Consistency Isolation Durability Derby Yes Yes Yes Yes PostgreSQL Yes Yes Yes Yes MySQL, Innodb Yes Yes Yes Yes MySQL, MyIsam No ? Table lock Yes HSQLDB No ? ? No 26

Performance Experiments ● Two configurations: – “Small” main-memory database: 10 MB data/50MB database buffer – “Big” disk-based database: 50 MB data/10MB database buffer ● Tests: – TPC-B – Single-tuple operations: select ● Load: – 1-100 concurrent clients 27

Throughput: TPC-B “Small” db “Big” db 1600 500 Derby embedded Derby embedded Derby client/server Derby client/server MySQL (InnoDB) MySQL (InnoDB) 450 1400 PostgreSQL PostgreSQL 400 1200 350 Transactions per second Transactions per second 1000 300 800 250 200 600 150 400 100 200 50 0 0 0 20 40 60 80 100 0 20 40 60 80 100 Number of clients Number of clients 28

Throughput: Select (“small” db) 25000 Derby embedded Derby client/server MySQL (InnoDB) PostgreSQL 20000 Transactions per second 15000 10000 5000 0 0 20 40 60 80 100 29 Number of clients

Conclusions ● Derby outperforms MySQL on large databases ● MySQL performs better on small main-memory databases ● No significant performance loss with client/server, except for SELECT operations ● Need to identify, and reduce the CPU overhead that limits throughput in CPU-bound configurations 30

Disk IO: “Small” Databases 2.5 2.25 2 1.75 1.5 Db writes per transaction 1.25 Log writes per transaction 1 0.75 0.5 0.25 0 Derby em- Derby cli- MySQL PostgreSQL bedded ent/server 31

Disk IO: “Big” Databases Why does Derby out-perform MySQL for disk-based databases? 2 1.8 1.6 - TPC-B like load - 50 MB database 1.4 - 10 MB buffer 1.2 Db writes per 1 transaction Log writes per transaction 0.8 0.6 0.4 0.2 0 Derby client/server MySQL 32

JDBC Driver: CPU Usage CPU usage per TPC-B transaction [ms]: 0.16 0.15 0.14 0.13 0.12 0.11 0.1 0.09 0.08 User System 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 33 Derby MySQL PostgreSQL

Conclusions: Resource Usage ● MySQL performs better than Derby when: – The database is small and fits in the database buffer – Throughput becomes CPU-bound ● Derby performs better than MySQL when: – The database is large and does not fit in the database buffer – Throughput becomes IO-bound ● Derby has focused on maintaining a low footprint 34

Analysis of Derby Performance Staff Engineer Olav Sandst Senior - PowerPoint PPT Presentation

Analysis of Derby Performance Staff Engineer Olav Sandst Senior Engineer Dyre Tjeldvoll Sun Microsystems Database Technology Group This is a draft version that is subject to change. The authors can be contacted at Olav.Sandstaa@sun.com or

VINTAGE DERBY WFTDA: DERBY 101 https://www.youtube.com/watch?v=OId6gTd2LCM 2012 - Yellowknife

Java In the Database Rick Hillegas Apache Derby August 28, 2006 General Overview Derby

Active School programme Derby County Community Trust, in partnership with Derby City Council

Derby City Council Project Water and Flood Defence Team Derby Nottingham Metro Biodiversity

decarbonise your business Liz Pasteur, Derby City Council DE-Carbonise Collaboration between

The Diabetes Rollercoaster Dr Emma Wilmot Consultant Diabetologist, Derby Nick Rycroft Derby

An Introduction to the Analysis of Rare Events Nate Derby Stakana Analytics Seattle, WA SUCCESS

ENSURING EFFICIENCY AND SAFETY IN THEATRES An insight into Derby Teaching Hospitals

Ansonia-Derby Temporary Regional School Study Committee Facilities July 2019 Facilities:

Derby High School Post-Secondary Planning Guide 2019-2020 CEEB CODE: 070155 @DHSCounselors

My name is Krista Gill. I am a parent of two children at Derby school. I attended the University

FARM MOTOR FLEET REQUESTED BY CCIB 66 Derby Road, Longridge, Preston, PR3 3FE Date: Due Date:

Managing and Monitoring Statistical Models Nate Derby Stakana Analytics Seattle, WA Winnipeg

Managing and Monitoring Statistical Models Nate Derby Stakana Analytics Seattle, WA Regina SAS

Maintaining Formats when Exporting Data from SAS into Microsoft Excel Nate Derby & Colleen

Getting Correct Results from PROC REG Nate Derby Stakana Analytics Seattle, WA, USA SUCCESS

Topics Grounding the mind in life Self-directed neuroplasticity The power of

Exploring the IoT attack surface Breaking || Liberating the brave new IoT world. doc.dr.sc.

Chapter 6: Noun Phrases and Agreement Syntactic Constructions in English Kim and Michaelis (2020)

Refactoring Your Code A Key Step to Agility Venkat Subramaniam (svenkat@cs.uh.edu)

Embedded Databases MicroBenchmark Team CodeBlooded Refinement of goals Try and understand

APGAS Programming in X10 http://x10-lang.org This tutorial was originally given by Olivier

Using Crowdsourced Data and Open Source Tools in Government Michael Schnuerle, Chief Data Officer

Knowledge Base Augmentation SIGIR 2019 tutorial - Part III Shuo Zhang and Krisztian Balog

Sambuz

Useful Links

Newsletter

Mail Us

Analysis of Derby Performance Staff Engineer Olav Sandst Senior - PowerPoint PPT Presentation

Analysis of Derby Performance Staff Engineer Olav Sandst Senior Engineer Dyre Tjeldvoll Sun Microsystems Database Technology Group This is a draft version that is subject to change. The authors can be contacted at Olav.Sandstaa@sun.com or

VINTAGE DERBY WFTDA: DERBY 101 https://www.youtube.com/watch?v=OId6gTd2LCM 2012 - Yellowknife

Java In the Database Rick Hillegas Apache Derby August 28, 2006 General Overview Derby

Active School programme Derby County Community Trust, in partnership with Derby City Council

Derby City Council Project Water and Flood Defence Team Derby Nottingham Metro Biodiversity

decarbonise your business Liz Pasteur, Derby City Council DE-Carbonise Collaboration between

The Diabetes Rollercoaster Dr Emma Wilmot Consultant Diabetologist, Derby Nick Rycroft Derby

An Introduction to the Analysis of Rare Events Nate Derby Stakana Analytics Seattle, WA SUCCESS

ENSURING EFFICIENCY AND SAFETY IN THEATRES An insight into Derby Teaching Hospitals

Ansonia-Derby Temporary Regional School Study Committee Facilities July 2019 Facilities:

Derby High School Post-Secondary Planning Guide 2019-2020 CEEB CODE: 070155 @DHSCounselors

My name is Krista Gill. I am a parent of two children at Derby school. I attended the University

FARM MOTOR FLEET REQUESTED BY CCIB 66 Derby Road, Longridge, Preston, PR3 3FE Date: Due Date:

Managing and Monitoring Statistical Models Nate Derby Stakana Analytics Seattle, WA Winnipeg

Managing and Monitoring Statistical Models Nate Derby Stakana Analytics Seattle, WA Regina SAS

Maintaining Formats when Exporting Data from SAS into Microsoft Excel Nate Derby &amp; Colleen

Getting Correct Results from PROC REG Nate Derby Stakana Analytics Seattle, WA, USA SUCCESS

Topics Grounding the mind in life Self-directed neuroplasticity The power of

Exploring the IoT attack surface Breaking || Liberating the brave new IoT world. doc.dr.sc.

Chapter 6: Noun Phrases and Agreement Syntactic Constructions in English Kim and Michaelis (2020)

Refactoring Your Code A Key Step to Agility Venkat Subramaniam (svenkat@cs.uh.edu)

Embedded Databases MicroBenchmark Team CodeBlooded Refinement of goals Try and understand

APGAS Programming in X10 http://x10-lang.org This tutorial was originally given by Olivier

Using Crowdsourced Data and Open Source Tools in Government Michael Schnuerle, Chief Data Officer

Knowledge Base Augmentation SIGIR 2019 tutorial - Part III Shuo Zhang and Krisztian Balog

Sambuz

Useful Links

Newsletter

Mail Us

Maintaining Formats when Exporting Data from SAS into Microsoft Excel Nate Derby & Colleen