How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud
Santa Clara, California | April 23rd – 25th, 2018

Azure Data Service Architecture
• Share cluster with SQL DB
• Cluster is decomposed into Service Fabric applications (10+ applications)
• All applications and all tenants are individually deployable
• DB engine instances are "services" managed by Azure Service Fabric
[Diagram labels: Azure Infrastructure Services; Azure Service Fabric; Control Plane (Provisioning, Control, Connection Proxy, Telemetry, services, data store); Data Plane (Port Sharing Service, Resource Governance, Node health, MySQL/PG/SQL tenants); Azure Storage]

Azure Relational Database Services Platform
[Diagram layers, top to bottom]
• Services: SQL Database (SQL Server 2017), MySQL/MariaDB (MySQL/MariaDB Server), PostgreSQL (PostgreSQL Server), SQL DW (SQL Server 2017 + PDW)
• Database Services Platform: Continuous Delivery through Deployment Automation; Auto-mitigation of LiveSite incidents; Resource Governance and Resource Isolation per Server/Database; Connection Proxy and Connection Redirection; Database Jobs; Azure Resource Manager APIs, Client Tools, Portal integration; Cross-region and in-region; Data Security & Compliance; Backup Manager & Backups retention, PITR and Geo-Restore; Orchestration of Management Workflows; SMART data migration; Proactive Monitoring and Alerting; Analytics; Workload Insights; Local Replication for HA; Active Geo-Replication; A/B Testing
• Service Fabric: Health Monitoring; Location Services & Routing; High Availability; Stateful services; Fast startup & shutdown; Low Latency messaging; Self-healing; Container Orchestration & lifecycle management; High Density; Failure Detection & Failover; Automated Rollback; Rolling Upgrades; Placement Services; Hyper-Scale; Load balancing; Cluster Resizing Service
• Azure: Azure Storage, Azure Networking, Azure Compute, Azure Monitoring
• Global Azure with 38 Regions

Microsoft Azure Service Fabric
A platform for reliable, hyperscale, microservice-based applications
[Diagram labels: Microservices; Service Fabric]
• Container Orchestration & lifecycle management
• Health Monitoring
• Location Services & Routing
• High Availability
• Stateful services
• Fast startup & shutdown
• Low Latency messaging
• Self-healing
• Failure Detection & Failover
• High Density
• Hyper-Scale
• Load balancing
• Automated Rollback
• Rolling Upgrades
• Placement Services
• Cluster Resizing Service

Cluster: A federation of machines
• A set of machines that Service Fabric stitches together to form a cluster
• One cluster can scale to 1000+ machines
[Diagram: a ring of nodes forming one cluster]

Relational data services – Control Plane
• One cluster per region, managed by Service Fabric
• Provides front-end and cluster control services
• Front-end Services (GW): Database Connection Redirector/Proxy, Management Service
• Cluster Control Services (MN): Cluster Metadata (CMS)
[Diagram: a ring of nodes forming the control-plane cluster]

Relational data services – Data Plane
• One to many clusters per region, managed by Service Fabric
• Each node has application services (MySQL server) and platform services
[Diagram: Node 1 runs Application Services and Platform Services; Db tenant1 files, Db tenant2 files and Db log files are kept in Azure Storage]

Multi-tenancy
[Diagram: levels of sharing, from On-prem/stamp through VM/IaaS and OS sharing to Process sharing]
• Multi-tenancy is really hard
• Noisy neighbors; accidental or intentional abuse
• Different levels of multi-tenancy have different tradeoffs in cost, capacity and density
• More sharing leads to greater efficiencies but adds more points of contention
• Expectations on performance predictability need to be managed via minimum guarantees and maximum caps across different hardware SKUs
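
The last bullet (minimum guarantees and maximum caps) can be illustrated with a small sketch. This is not the service's actual governance logic; the `Tenant` fields, the percentage units and the first-come ordering of spare capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    # All fields are hypothetical; values are percentages of one node's CPU.
    name: str
    min_pct: int     # guaranteed minimum
    max_pct: int     # hard cap
    demand_pct: int  # what the tenant currently wants


def allocate(tenants, capacity_pct=100):
    """Grant each tenant its guaranteed minimum, then share spare capacity up to each cap."""
    # Step 1: every tenant gets its demand, limited to its guaranteed minimum.
    alloc = {t.name: min(t.demand_pct, t.min_pct) for t in tenants}
    spare = capacity_pct - sum(alloc.values())
    if spare < 0:
        raise ValueError("minimum guarantees exceed node capacity")
    # Step 2: hand out spare capacity in list order, never beyond demand or cap.
    for t in tenants:
        want = min(t.demand_pct, t.max_pct) - alloc[t.name]
        extra = max(0, min(want, spare))
        alloc[t.name] += extra
        spare -= extra
    return alloc
```

With three tenants on one node, a noisy neighbor that asks for 90% is held to its 50% cap, while the others still receive at least their guaranteed share of what they ask for.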

Our Solution
• Running a stripped-down version of the latest Windows in a security container (SQLPAL)
• Strong security isolation
• Strong resource isolation
• Smaller memory footprint (compared to a full OS)
• Smaller attack surface (locked down to the bare minimum for the engine)
• Leverage Microsoft SQL Server schedulers and memory management
• Resource governance combined with native Windows and SQLPAL
  • CPU
  • Memory
  • Disk
  • Network

SQL Platform Abstraction Layer (SQLPAL)
• Windows Host Extension has a driver for creating the Pico process and a monitor process that implements non-perf-related ABIs.
• ABI calls are handled by the driver and are either handled directly (like file I/O) or are marshalled to the monitor process for handling (like file open)
[Diagram: Windows and Non-Windows hosts. Ring 3: DBMS (user mode), Win32, SOSv2, SQLPAL, LibOS. Ring 0: Host Extension (HE), Windows Kernel]
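
The split between calls handled directly by the driver and calls marshalled to the monitor process can be sketched as a simple dispatcher. This is a toy model only: the call names and the `Driver` and `Monitor` classes are hypothetical, and the slide gives file I/O and file open as its only examples.

```python
# Hypothetical call names, grouped the way the slide describes.
DIRECT_CALLS = {"file_read", "file_write"}  # performance-critical: handled in the driver
MONITOR_CALLS = {"file_open"}               # not performance-critical: marshalled out


class Monitor:
    """Stand-in for the monitor process that serves non-perf-related calls."""

    def __init__(self):
        self.handled = []

    def handle(self, call, args):
        self.handled.append(call)
        return f"monitor:{call}"


class Driver:
    """Stand-in for the Host Extension driver that receives every ABI call."""

    def __init__(self, monitor):
        self.monitor = monitor

    def dispatch(self, call, args=()):
        if call in DIRECT_CALLS:
            # Fast path: no hop to another process.
            return f"driver:{call}"
        if call in MONITOR_CALLS:
            # Slow path: forward the call to the monitor.
            return self.monitor.handle(call, args)
        raise NotImplementedError(call)
```

The point of the split is that the hot path (reads and writes) avoids a cross-process hop, while rare calls such as opening a file can tolerate the marshalling cost.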

Decoupled Compute and Storage
• Remote storage built on top of commodity hardware
• Different optimizations for the I/O paths of log and data files
  • Log files require low-latency writes, and sequential reads during crash recovery
  • Data files require high-throughput random reads/writes
• Snapshot-based backup
  • Backing up huge amounts of data (TB+) would not be possible any other way
  • Snapshot support
  • PITR support
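
One common way to build point-in-time restore (PITR) on snapshots is to restore the latest snapshot taken at or before the target time and then replay log records up to that time. The slides do not say that this service works this way, so the sketch below is an assumption, and `plan_restore` and its arguments are hypothetical names.

```python
from bisect import bisect_right


def plan_restore(snapshot_times, log_records, target):
    """Pick a base snapshot and the log records to replay for a restore to `target`.

    snapshot_times: sorted list of snapshot timestamps
    log_records:    sorted list of (timestamp, change) tuples
    """
    # Index of the first snapshot strictly after the target time.
    i = bisect_right(snapshot_times, target)
    if i == 0:
        raise ValueError("no snapshot at or before the target time")
    base = snapshot_times[i - 1]
    # Replay only changes made after the snapshot and up to the target.
    replay = [rec for rec in log_records if base < rec[0] <= target]
    return base, replay
```

For example, with snapshots at times 100, 200 and 300, a restore to time 260 starts from the snapshot at 200 and replays only the log records between 200 and 260.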

Security Enhancement
• Network Security
  • VNET
  • Firewall support
  • Both inbound and outbound lockdown
• Port Sharing Service (one per node)
  • One port listens for each server
  • Duplicates the socket and SSL security context to the real instance
• Encryption at rest
• Threat Detection
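
The port-sharing handoff can be sketched in-process: a front service learns which server the client wants, duplicates the socket, and passes the duplicate to that server's handler. The one-line "server name" protocol and the `PortSharingService` class are assumptions for illustration. The real service hands the socket to a separate process and also transfers the SSL security context, neither of which is modeled here.

```python
import socket


def read_line(conn):
    """Read bytes up to a newline, one byte at a time, so nothing extra is consumed."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = conn.recv(1)
        if not chunk:
            break
        buf += chunk
    return buf.decode().strip()


class PortSharingService:
    """Toy per-node front door: one entry point, many server instances behind it."""

    def __init__(self):
        self.instances = {}  # server name -> handler callable (hypothetical)

    def register(self, server_name, handler):
        self.instances[server_name] = handler

    def accept(self, conn):
        # Assumed toy protocol: the first line names the target server.
        server_name = read_line(conn)
        handler = self.instances[server_name]
        handed_off = conn.dup()  # duplicate the socket for the real instance
        conn.close()             # the front door drops its own copy
        handler(handed_off)
```

After the handoff, the client talks to the instance directly over the duplicated socket, and the front service is no longer on the data path.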
Thank You!