Introduction to Amazon DocumentDB (with MongoDB compatibility)
Fast, scalable, and fully managed MongoDB-compatible database service
Joseph Idziorek, AWS Principal Product Manager
Purpose-built: The right tool for the right job
https://www.allthingsdistributed.com/2018/06/purpose-built-databases-in-aws.html
Data categories and common use cases
• Relational: Referential integrity, ACID transactions, schema-on-write. Use cases: lift and shift, EMR, CRM, finance.
• Key-value: Low-latency key lookups with high throughput and fast ingestion of data. Use cases: real-time bidding, shopping cart, social.
• Document: Indexing and storing documents with support for query on any attribute. Use cases: content management, personalization, mobile.
• In-memory: Microseconds latency, key-based queries, and specialized data structures. Use cases: leaderboards, real-time analytics, caching.
• Graph: Creating and navigating data relations easily and quickly. Use cases: fraud detection, social networking, recommendation engine.
• Search: Indexing and searching semistructured logs and data. Use cases: product catalog, help and FAQs, full text.
• Time-series: Collect, store, and process data sequenced by time. Use cases: IoT applications, event tracking.
• Ledger: Complete, immutable, and verifiable history of all changes to application data. Use cases: systems of record, supply chain, health care, registrations, financial.
AWS: Purpose-built databases
• Relational: Amazon RDS (Aurora, Community, Commercial)
• Key-value: Amazon DynamoDB
• Document: Amazon DocumentDB
• In-memory: Amazon ElastiCache (Redis, Memcached)
• Graph: Amazon Neptune
• Search: Amazon Elasticsearch Service
• Time-series: Amazon Timestream
• Ledger: Amazon Quantum Ledger Database
Agenda: What's the plan?
• What is a document database?
• Introduce Amazon DocumentDB
• Challenges and capabilities
• Demos
{ "Hello": "Amazon DocumentDB", "Getting Started": "https://aws.amazon.com/documentdb/getting-started/" }
What is a document database?
[Figure: timeline of databases by decade (1970, 1980, 1990, 2000, 2010), showing DB2, Oracle, Access, SQL Server, MySQL, PostgreSQL, Cassandra, MongoDB, Redis, Elasticsearch, DynamoDB, Aurora, Neptune, Timestream, QLDB, and DocumentDB]
Evolution of document databases
[Diagram: JSON (Client), JSON (App), != Relational (Database)]
• JSON became the de facto data interchange format
• Friction when converting JSON to the relational model
• Object-relational mappings (ORMs) were created to help with this friction
• Document databases solved the problem
Document databases
• Data is stored in JSON-like documents; JSON documents are first-class objects of the database
• Documents map naturally to how humans model data
• Flexible schema and indexing
• Expressive query language built for documents (ad hoc queries and aggregations)
Example document:
    {
      id: 1,
      name: "sue",
      age: 26,
      email: "sue@example.com",
      promotions: ["new user", "5%", "dog lover"],
      memberDate: 2018-2-22,
      shoppingCart: [
        {product:"abc", quantity:2, cost:19.99},
        {product:"edf", quantity:3, cost: 2.99}
      ]
    }
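As a rough illustration of "ad hoc queries and aggregations" over documents, the sketch below models the slide's example document as a plain Python dict. It does not use any database or driver: the `find` and `cart_total` helpers and the second customer document are hypothetical, written only to show the idea of querying by attribute and aggregating over an embedded array.

```python
# Illustrative only: plain Python dicts stand in for documents in a collection.
customers = [
    {
        "id": 1,
        "name": "sue",
        "age": 26,
        "email": "sue@example.com",
        "promotions": ["new user", "5%", "dog lover"],
        "memberDate": "2018-2-22",
        "shoppingCart": [
            {"product": "abc", "quantity": 2, "cost": 19.99},
            {"product": "edf", "quantity": 3, "cost": 2.99},
        ],
    },
    # Hypothetical second document with a different shape (no shopping cart).
    {"id": 2, "name": "bob", "age": 41, "promotions": []},
]


def find(docs, **criteria):
    """Ad hoc query: return documents whose top-level fields equal the criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def cart_total(doc):
    """Aggregation: sum quantity * cost over the embedded shoppingCart array."""
    items = doc.get("shoppingCart", [])
    return round(sum(i["quantity"] * i["cost"] for i in items), 2)


matches = find(customers, name="sue")
totals = {d["id"]: cart_total(d) for d in customers}
```

A document database would typically evaluate such filters and aggregations on the server side and with index support; the in-memory helpers here only mirror the shape of the idea.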
Document databases help developers build applications faster and iterate quickly
Use cases for document data
• Content management
• Personalization
• Mobile
• Catalog
• Retail and marketing
• User profiles
Use cases for document data: User profiles
A basic profile:
    {
      id: 181276,
      username: "sue1942",
      name: {first: "Susan", last: "Benoit"}
    }
The same profile after adding game data:
    {
      id: 181276,
      username: "sue1942",
      name: {first: "Susan", last: "Benoit"},
      ExploidingSnails: {
        hi_score: 3185400,
        global_rank: 5139,
        bonus_levels: true
      }
    }
The same profile after also adding promotions:
    {
      id: 181276,
      username: "sue1942",
      name: {first: "Susan", last: "Benoit"},
      ExploidingSnails: {
        hi_score: 3185400,
        global_rank: 5139,
        bonus_levels: true
      },
      promotions: ["new user","5%","snail lover"]
    }
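The user-profile slide shows a document gaining fields over time. A minimal sketch of that flexible-schema idea, using plain Python dicts rather than a real database, might look like the following. The `with_fields` helper is hypothetical; the field names and values are taken from the slide.

```python
import copy

# Starting profile, as on the slide.
base_profile = {
    "id": 181276,
    "username": "sue1942",
    "name": {"first": "Susan", "last": "Benoit"},
}


def with_fields(doc, **new_fields):
    """Return a copy of doc with extra fields added; the original is unchanged."""
    updated = copy.deepcopy(doc)
    updated.update(new_fields)
    return updated


# Add nested game data, then a promotions array, without any schema migration.
with_game = with_fields(
    base_profile,
    ExploidingSnails={"hi_score": 3185400, "global_rank": 5139, "bonus_levels": True},
)
with_promos = with_fields(with_game, promotions=["new user", "5%", "snail lover"])

# Documents of different shapes can sit side by side; readers check for a field.
versions = [base_profile, with_game, with_promos]
has_game_data = [("ExploidingSnails" in v) for v in versions]
```

The point of the sketch is that each version remains a valid document on its own, so an application can add attributes incrementally and handle their absence at read time.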
Challenges of existing document databases
• Hard to set up
• Hard to manage
• Hard to scale
• Hard to secure
• Hard to back up
What is Amazon DocumentDB? Fast, scalable, and fully managed MongoDB-compatible database service
Amazon DocumentDB
Fast, scalable, and fully managed MongoDB-compatible database service
• Fast: Millions of requests per second with millisecond latency; twice the throughput of MongoDB
• Scalable: Separation of compute and storage enables both layers to scale independently; scale out to 15 read replicas in minutes
• Fully managed: Managed by AWS: no hardware provisioning; auto patching, quick setup, secure, and automatic backups
• MongoDB compatible: Compatible with MongoDB 3.6; use the same SDKs, tools, and applications with Amazon DocumentDB
Challenges with traditional database architectures
[Diagram: Application on top of a single stack of API, Query processor, Caching, Logging, Storage]
• Single monolithic architectures
• Not designed for the cloud
• Scale monolithically
• Fail monolithically
Challenges with traditional databases: Scaling
Scenario: A spike in traffic, and you want to add read capacity quickly
[Diagram: replication among Node 1, Node 2, Node 3, and Node 4]
Challenges with traditional databases: Scaling
Scenario: Scale up to run large analytical workloads on a replica
[Diagram: replication among Node 1, Node 2, Node 3, and Node 4]
Challenges with traditional databases: Recovery
Scenario: An instance experiences a failure and you want to recover quickly
[Diagram: replication among Node 1, Node 2, Node 3, and Node 3']
Amazon DocumentDB: Modern cloud-native architecture
What would you do to improve scalability and availability?
1. Decouple compute and storage
2. Distribute data in smaller partitions
3. Increase the replication of data (6x)
Amazon DocumentDB: Modern cloud-native architecture
1. Decouple compute and storage
[Diagram: Compute layer (scale compute): API, Query processor, Caching. Storage layer (scale storage): Logging, Storage.]
Amazon DocumentDB: Modern cloud-native architecture
2. Distribute data in smaller partitions
[Diagram: Logging and Storage backed by a distributed storage volume spanning AZ1, AZ2, and AZ3]
Amazon DocumentDB: Modern cloud-native architecture
3. Increase the replication of data (6x)
[Diagram: Logging and Storage backed by a distributed storage volume spanning AZ1, AZ2, and AZ3]
Amazon DocumentDB: Modern cloud-native architecture
[Diagram: one AWS Region with three Availability Zones. Availability Zone 1 hosts the primary instance (reads and writes); Availability Zones 2 and 3 each host a replica instance (reads). All instances sit on a distributed storage volume spanning AZ1, AZ2, and AZ3.]
Amazon DocumentDB: Scaling
Scenario: A spike in traffic, and you want to add read capacity quickly
[Diagram: distributed storage volume spanning AZ1, AZ2, and AZ3]
Amazon DocumentDB: Failure recovery
Scenario: An instance experiences a failure and you want to recover quickly
[Diagram: distributed storage volume spanning AZ1, AZ2, and AZ3]
Amazon DocumentDB: Failure recovery
Scenario: Six-way replication across three Availability Zones provides the ability to handle AZ + 1 failures
[Diagram: distributed storage volume spanning AZ1, AZ2, and AZ3]
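To make the "AZ + 1" arithmetic concrete, the sketch below counts how many copies remain when one entire Availability Zone is lost along with one additional copy elsewhere. It assumes the six copies are spread evenly, two per AZ; the slides state six-way replication across three AZs but not the exact placement, so treat that layout (and the helper name) as an assumption. It says nothing about how the storage layer actually decides which copies to read or write.

```python
AZS = ["AZ1", "AZ2", "AZ3"]
COPIES_PER_AZ = 2  # assumption: six copies spread evenly over three AZs

# Each copy is identified by (availability zone, index within that zone).
copies = [(az, n) for az in AZS for n in range(COPIES_PER_AZ)]


def surviving(failed_az, failed_copy):
    """Copies left after losing a whole AZ plus one more copy in another AZ."""
    return [c for c in copies if c[0] != failed_az and c != failed_copy]


# Enumerate every "AZ + 1" failure and record the fewest surviving copies.
worst_case = min(
    len(surviving(az, extra))
    for az in AZS
    for extra in copies
    if extra[0] != az
)
```

Under this layout, every AZ + 1 failure leaves three of the six copies intact, which is the intuition behind the slide's durability claim.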
Demo: Getting started with Amazon DocumentDB
Fast
Fast, scalable, and fully managed MongoDB-compatible database service
• Fast: Millions of requests per second with millisecond latency
• More throughput: Separation of storage and compute layers offloads replication to the storage volume so that your instances can do more work; twice the throughput of MongoDB
• Optimizations: Database engine optimizations to reduce the number of IOs and minimize network packets in order to offload the database engine
• Flexible: Scale up an instance in minutes for analytical queries and scale down at the end of the day
Flexible
Durability and replication are handled by the distributed storage volume
• Scenario 1: Dev/test with a single instance
• Scenario 2: Read scaling in minutes
• Scenario 3: Scale-up and scale-out for analytics
[Diagram: each scenario shows instances on top of a distributed storage volume]
Scalable
Fast, scalable, and fully managed MongoDB-compatible database service
• Scale out in minutes: Scale out read capacity by adding replicas (up to 15 replicas); adding replicas takes minutes regardless of data size
• Scale up in minutes: Scale instances up and down in minutes (15.25 GiB memory to 244 GiB memory)
• Storage scales automatically: Storage volumes automatically grow from 10 GB to 64 TB without any user action
• Load balancing: Load balancing across instances with replica sets
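One way an application might take advantage of read replicas is to connect in replica set mode with a read preference that favors secondaries. The sketch below only builds a MongoDB-style connection URI using the Python standard library; it does not open a connection. The host name, user, and password are placeholders, the `build_uri` helper is hypothetical, and the option values (`replicaSet=rs0`, `readPreference=secondaryPreferred`) are assumptions that should be checked against the connection string shown for your own cluster. TLS options are omitted for brevity.

```python
from urllib.parse import quote_plus, urlencode


def build_uri(host, user, password, port=27017, read_preference="secondaryPreferred"):
    """Build a MongoDB-style URI that asks the driver to use replica set mode."""
    options = urlencode({"replicaSet": "rs0", "readPreference": read_preference})
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/?{options}"


# Placeholder endpoint and credentials, for illustration only.
uri = build_uri("docdb.example.internal", "appuser", "p@ss/word")
```

A MongoDB-compatible driver given a URI of this shape would typically route reads to replicas when they are available and fall back to the primary otherwise.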
Fully managed
Fast, scalable, and fully managed MongoDB-compatible database service
• Pay-as-you-go pricing; enterprise grade: On-demand, pay-as-you-go pricing enables you to pay only for the resources that you need and only when you use them
• Automatic failure recovery and failover: Replicas are automatically promoted to primary; failing processes are automatically detected and recovered; no cache warmup needed
• Point-in-time recovery: Automated backups are stored in Amazon S3, which is designed for 99.999999999% durability
• Durable, fault-tolerant, and self-healing storage: Data at rest is replicated six ways across three AZs; handles AZ + 1 failures