Components of a Distributed System
Prateek Parekh, Software Engineer, @prparekh83
Module Overview
- Building the Front-end Layer
- Handling Business Logic with Web Services
- Understanding the Data Layer
- Asynchronous Processing
Building the Front-end Layer
Diagram (system architecture): Customer, Internet, DNS, CDN, Load Balancer, Front end servers, Back end servers, Cache Servers, Message Queue Servers, Batch job Servers, Search Servers, Database servers
State management: Resource, Memory, Local files, User Session, Locks
Keep your Servers Stateless. Scale Horizontally by adding more Servers.
State Management
- Client and server communicate over HTTP
- The HTTP protocol is stateless
- Establish an HTTP session with cookies in request and response headers
State management (sequence diagram: Client, Server): the client sends GET /home.html (Set-cookie:SID=EMPTY); the server returns the response header Set-cookie:SID=abc…; the client sends GET /forum.html with Cookie:SID=abc…; the server responds with HTML content.
State Management
- Cookies are sent with every individual request
- Store session data in an external datastore
- For local files, use a CDN or an object store
State management (sequence diagram: Client, Server, Redis): the client sends GET /home.html (Set-cookie:SID=EMPTY); the server saves the session to Redis (success) and returns the response header Set-cookie:SID=abc…; the client sends GET /forum.html with Cookie:SID=abc…; the server fetches/saves the session in Redis (success) and responds with HTML content.
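The flow above can be sketched in Python. This is a minimal sketch, not the course's implementation: it assumes a plain dict stands in for the external datastore (Redis in the diagram), and `SessionStore`, `handle_request` and the visit counter are hypothetical names introduced only for illustration.

```python
import secrets


class SessionStore:
    """Stand-in for an external datastore such as Redis (assumption: plain dict)."""

    def __init__(self):
        self._data = {}

    def save(self, sid, session):
        self._data[sid] = dict(session)

    def fetch(self, sid):
        return dict(self._data.get(sid, {}))


def handle_request(store, path, cookies):
    """Stateless handler: session state lives in `store`, none on the server.

    Returns (response_headers, body).
    """
    sid = cookies.get("SID")
    headers = {}
    if not sid or sid == "EMPTY":
        # First visit: create a session id and return it in Set-Cookie.
        sid = secrets.token_hex(8)
        headers["Set-Cookie"] = f"SID={sid}"
    session = store.fetch(sid)
    session["visits"] = session.get("visits", 0) + 1
    store.save(sid, session)
    return headers, f"{path} visit #{session['visits']}"


store = SessionStore()
# Any server could handle either call, because both share the external store.
headers, body1 = handle_request(store, "/home.html", {"SID": "EMPTY"})
sid = headers["Set-Cookie"].split("=", 1)[1]
_, body2 = handle_request(store, "/forum.html", {"SID": sid})
```

Because the handler keeps nothing in local memory, adding more servers does not require sticky sessions.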
Front-end Layer Components
- DNS: Domain Name System (e.g. Route 53, easydns.com)
- CDN: Content Delivery Network (e.g. Akamai, Amazon CloudFront)
- Load Balancer (e.g. Elastic Load Balancer, HAProxy, Nginx)
- Caching (e.g. Browser, Redis/Memcached)
Diagram (Deploying Globomantics Frontend on AWS). Components: Customer, Route 53 DNS, CloudFront CDN, Elastic Load Balancer (ELB), EC2 front end servers cluster (instance 1 to instance n), Backend servers, Amazon Auto Scale, Amazon CloudWatch Service, Private S3 buckets, Public S3 buckets. Labels: HTTP requests, distribute requests, dynamic content, static files.
Handling Business Logic with Web Services
Web services encapsulate the business logic and hide complexity behind an API contract.
Globomantics Web Service Layer: Account, Catalog, Search, Checkout, Payment, Order
Functional Partitioning: Split a large system into a set of smaller, loosely coupled, independent web services, each focusing on a subset of the functionality of the overall system.
Diagram (Functional partition): Mobile client, Front end servers, Partner Integration, Load balancer, Account Service with MySQL database, Catalog Service with Cassandra database
REST: Resource-oriented architectural style that defines a set of constraints used for creating web services.
HTTP GET Request URL: http://globomantics.com/order/123456
JSON Response: { "OrderID": "123456", "Date": "09/12/2020", "Status": "Delivered" }
Why REST?
- Leverage HTTP features (PUT, GET, POST)
- Status codes (200 OK, 400 Bad Request)
- Security controls
- Caching
- Ecosystem of monitoring tools
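To make the resource-plus-status-code idea concrete, here is a minimal Python sketch of how a GET on an order resource could map to a status code and a JSON body. The `ORDERS` dict and `handle_get` function are hypothetical; a real service would sit behind an HTTP framework and a datastore. The 404 case is an addition beyond the two status codes listed above.

```python
import json

# Hypothetical in-memory data, mirroring the example response above.
ORDERS = {
    "123456": {"OrderID": "123456", "Date": "09/12/2020", "Status": "Delivered"},
}


def handle_get(path):
    """Map GET /order/<id> to an (HTTP status code, JSON body) pair."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "order":
        return 400, json.dumps({"error": "Bad Request"})
    order = ORDERS.get(parts[1])
    if order is None:
        return 404, json.dumps({"error": "Not Found"})
    return 200, json.dumps(order)


status, body = handle_get("/order/123456")
```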
Scaling Web Services
- Stateless Servers: scale by adding replica backend servers (AutoScale)
- No Shared Store: use Cache, Database and MessageQueues
- Load Balancers: between clients and servers
- Easy Maintenance: zero downtime update of services
- Isolate Server Failures: servers routinely ping load balancer
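One way to picture how a load balancer isolates a failed server is a round-robin picker that skips servers marked unhealthy. This is a toy sketch under that assumption; the `LoadBalancer` class and its `mark` and `pick` methods are hypothetical and do not describe how ELB, HAProxy or Nginx work internally.

```python
import itertools


class LoadBalancer:
    """Round-robin over servers, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self.healthy = {server: True for server in servers}
        self._cycle = itertools.cycle(servers)

    def mark(self, server, healthy):
        # In a real system this would be driven by periodic health checks.
        self.healthy[server] = healthy

    def pick(self):
        # Look at each server at most once per call.
        for _ in range(len(self.healthy)):
            server = next(self._cycle)
            if self.healthy[server]:
                return server
        raise RuntimeError("no healthy servers")


lb = LoadBalancer(["a", "b", "c"])
lb.mark("b", False)                      # server b fails its health check
picks = [lb.pick() for _ in range(4)]    # traffic only reaches a and c
```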
Distributed Transactions: How do you implement an operation consisting of a set of web service calls that should either all pass or all fail?
Diagram (Distributed Transaction Failure). Components: Client, Order Service, Payment Service, Shipping Service. Labels: HTTP PUT Request, Create Order, charge, ship, rollback, Response.
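The rollback in the diagram can be approximated with compensating actions: each step is paired with an undo, and a failure triggers the undos of the steps that already succeeded. The sketch below assumes that approach (the slide itself only says "rollback"); `run_with_rollback`, `StepFailed` and the step names are hypothetical, and a production system would also need to handle a compensation that itself fails.

```python
class StepFailed(Exception):
    """Raised by a step that could not complete."""


def run_with_rollback(steps):
    """Run (action, compensate) pairs in order; undo completed steps on failure.

    Returns True if every step succeeded, False if the work was rolled back.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except StepFailed:
            # Undo in reverse order so the latest effects are reverted first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def ship():
    raise StepFailed("shipping service unavailable")


ok = run_with_rollback([
    (lambda: log.append("create order"), lambda: log.append("cancel order")),
    (lambda: log.append("charge"), lambda: log.append("refund")),
    (ship, lambda: log.append("cancel shipment")),
])
```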
Understanding the Data Layer
Replication
- Synchronize state between 2 servers (master and slave)
- Scale read throughput and provide higher availability
- Each server holds an identical copy of data
Diagram (Master-Slave Replication): the client writes to the Master, the Master syncs to the Slaves, and the client reads from the Slaves.
Replication Challenges
- Introduces complexity
- Replication lag
- Only applicable for scaling reads
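Replication lag is easier to see in a toy model. The sketch below assumes writes go to the master, reads are spread over the slaves, and synchronization happens as a separate step; `ReplicatedStore` and its methods are hypothetical and leave out failover and conflict handling.

```python
class ReplicatedStore:
    """Toy master-slave setup: writes go to the master, reads go to slaves."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        self._pending = []   # writes not yet copied to the slaves
        self._next = 0       # round-robin pointer for reads

    def write(self, key, value):
        self.master[key] = value
        self._pending.append((key, value))

    def read(self, key):
        slave = self.slaves[self._next % len(self.slaves)]
        self._next += 1
        return slave.get(key)

    def sync(self):
        """Apply pending writes to every slave (asynchronous in a real system)."""
        for key, value in self._pending:
            for slave in self.slaves:
                slave[key] = value
        self._pending.clear()


store = ReplicatedStore(slave_count=2)
store.write("order:1", "Delivered")
stale = store.read("order:1")   # replication lag: the slave has not synced yet
store.sync()
fresh = store.read("order:1")
```

Adding slaves raises read capacity only; every write still passes through the single master.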
Sharding (Partitioning)
- Divide dataset into smaller chunks
- Each server processes only a subset of data
- Isolate failure of servers from each other
Sharding key: Determines the distribution of the dataset among your servers (shards or partitions).
Diagram (Modulo-based Sharding): Input UserID 113321 passes through the Sharding Function, which selects one of Server 1 to Server 10.
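A modulo-based sharding function can be as small as one line. This sketch assumes numeric keys and zero-based shard indexes (0 to 9 for ten servers); how an index maps onto "Server 1" through "Server 10" is not stated on the slide, so `shard_for` is only an illustration.

```python
def shard_for(user_id, shard_count):
    """Modulo-based sharding: map a numeric sharding key to a shard index."""
    return user_id % shard_count


# UserID from the slide; ten shards indexed 0..9 (assumption).
index = shard_for(113321, 10)
```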
Sharding Challenges
- Sharding function can be complex
- Mappings can break
- Querying data across shards
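To illustrate how mappings can break, the sketch below counts how many keys land on a different shard when a modulo-based scheme grows from 10 to 11 servers. The key range and the `moved_keys` helper are hypothetical, chosen only to keep the arithmetic easy to check.

```python
def moved_keys(keys, old_count, new_count):
    """Return the keys whose shard changes when the shard count changes."""
    return [k for k in keys if k % old_count != k % new_count]


keys = range(110)
moved = moved_keys(keys, old_count=10, new_count=11)
fraction_moved = len(moved) / len(keys)
```

In this range only keys 0 to 9 stay on the same shard, so most of the data would have to be migrated.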
CAP Theorem: States that it's impossible to build a distributed system that simultaneously guarantees Consistency, Availability and Partition Tolerance.
CAP Theorem
- Consistency: all servers see the same data
- Availability: server can process requests even when other servers are down
- Partition Tolerance: system is functional even when servers cannot communicate
CAP Theorem and NoSQL
- CP Database (MongoDB): stores data as BSON documents; single master-slave configuration; server shuts down during partition
- AP Database (Cassandra): wide column database; masterless configuration; server remains available
Handling Asynchronous Communication
Synchronous Communication: The caller makes a call to a remote server and blocks until the operation completes.
Diagram (Synchronous Communication Example). Components: Browser, Load Item Page, Item Service, Database/Cache. Labels: load page, fetch data, blocking, return, unblock.
Asynchronous Communication: The caller doesn't wait for the operation to complete before returning.
Asynchronous Communication Patterns
- Fire and Forget: caller doesn't care if the operation completes or not
- Callback: register a callback to be notified when the operation completes
- Event-based: client emits events; subscribers react independently
Diagram (Asynchronous Communication Example). Components: Mobile, Cart, Checkout, Payment, Shipping, Order Confirmation Email. Labels: Place Order, async charge, Thanks for your Order, callback, payment approved.
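The three patterns can be sketched side by side in Python. This is a rough illustration that uses a thread pool for the asynchronous work and a small in-process subscriber table for events; `charge`, `subscribe`, `emit` and the event name are hypothetical, and a distributed system would use remote calls and a message broker instead.

```python
from concurrent.futures import ThreadPoolExecutor

results = []
subscribers = {}


def charge(order_id):
    """Stand-in for a slow remote call."""
    return f"payment approved for {order_id}"


def subscribe(event, handler):
    subscribers.setdefault(event, []).append(handler)


def emit(event, payload):
    # Each subscriber reacts independently of the others.
    for handler in subscribers.get(event, []):
        handler(payload)


with ThreadPoolExecutor(max_workers=2) as pool:
    # Fire and forget: submit the work and never look at the result.
    pool.submit(charge, "order-1")

    # Callback: be notified when the operation completes.
    future = pool.submit(charge, "order-2")
    future.add_done_callback(lambda f: results.append(f.result()))
# Leaving the `with` block waits for outstanding work to finish.

# Event-based: the emitter does not know who is listening.
subscribe("order_placed", lambda p: results.append(f"email for {p}"))
subscribe("order_placed", lambda p: results.append(f"shipping for {p}"))
emit("order_placed", "order-3")
```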
Message Queues
Message Queues
- Buffer and distribute asynchronous requests
- Messages are platform independent (JSON, XML)
- Producers and consumers run as separate processes and are hosted on separate servers
Producers
- Initiate asynchronous calls
- Create a message and push it to the queue
- Message format serves as the contract
Custom Message Format Example (serves as the contract between Producers and Consumers):
<?xml version="1.0"?>
<message>
<invoice date="01-15-2020" id="321332">
<item>1234432321</item>
<title>A Brief History of Nearly Everything – Bill Bryson</title>
<USPrice>12.99</USPrice>
. . .
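A producer for a message like the one above might look as follows. This sketch uses JSON rather than XML and Python's in-process `queue.Queue` as a stand-in for a broker; the `publish_invoice` function and the JSON field names are assumptions, and only the sample values come from the slide.

```python
import json
import queue

work_queue = queue.Queue()  # stand-in for a message broker (assumption)


def publish_invoice(q, invoice_id, item, price):
    """Producer: build a message in the agreed format and push it to the queue."""
    message = json.dumps({"invoice_id": invoice_id, "item": item, "USPrice": price})
    q.put(message)
    return message


publish_invoice(work_queue, "321332", "1234432321", 12.99)
# A consumer that knows the format can decode the message without knowing the producer.
received = json.loads(work_queue.get_nowait())
```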
Broker
- Provides the following features: message queuing, persisting messages, routing and delivery
- Configuration over customization
- Supports higher throughput and scalability
Consumers
- Process requests asynchronously
- Implemented in the application code
- Pull model
- Push model
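The difference between the two consumer models is who initiates delivery. In the sketch below, a pull consumer asks the queue for messages, while a push queue calls a handler the consumer registered. `pull_consumer` and `PushQueue` are hypothetical and run in-process; real brokers expose these models through their own client APIs.

```python
import queue


def pull_consumer(q):
    """Pull model: the consumer asks the queue for messages until it is empty."""
    handled = []
    while True:
        try:
            handled.append(q.get_nowait())
        except queue.Empty:
            return handled


class PushQueue:
    """Push model: the queue delivers each message to a registered handler."""

    def __init__(self, handler):
        self._handler = handler

    def publish(self, message):
        self._handler(message)


pull_q = queue.Queue()
for msg in ("a", "b"):
    pull_q.put(msg)
pulled = pull_consumer(pull_q)

pushed = []
push_q = PushQueue(pushed.append)
for msg in ("c", "d"):
    push_q.publish(msg)
```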
Message Routing
Direct Worker Queue Method
- Single work queue identified by name
- Multiple producers and consumers
- Each message routed to one consumer
- Ideal for long running tasks: video transcoding, data classification, image resizing
Diagram (Globomantics Batch Server). Components: Backend Servers (Producer 1 to Producer N), Work Queue, Batch Servers (Consumer 1 to Consumer N). Labels: Item Listed, Create Thumbnail.
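The work queue in the diagram can be imitated with threads: several consumers take messages from one shared queue, and each message is handled by exactly one of them. The sketch assumes `queue.Queue` as the queue and a `None` sentinel to stop each consumer; the message names and the `consumer` function are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue()
thumbnails = []
lock = threading.Lock()
STOP = None  # sentinel telling a consumer to exit


def consumer(name):
    while True:
        item = work_queue.get()
        if item is STOP:
            break
        with lock:
            thumbnails.append((name, f"thumbnail for {item}"))


workers = [threading.Thread(target=consumer, args=(f"consumer-{i}",)) for i in range(3)]
for w in workers:
    w.start()

for item_id in range(6):      # producers publish "item listed" messages
    work_queue.put(f"item-{item_id}")
for _ in workers:             # one stop signal per consumer
    work_queue.put(STOP)
for w in workers:
    w.join()
```

Which consumer handles which item varies between runs, but no item is processed twice.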
Summary
- Front-end Layer: state management; DNS, CDN, load balancer, caching
- Web Services Layer: functional partitioning; REST
- Data Layer: replication; sharding; CAP theorem
- Asynchronous Processing