Caching GraphQL: Approaches to automate caching data for GraphQL Tanmai Gopal | @tanmaigo

Hasura GraphQL engine
- Instant realtime GraphQL on Postgres
- Connect to services & get a unified GraphQL API
- Runs as a docker container in your infrastructure, or use hasura.io/cloud
- Open-source ❤ http://github.com/hasura/graphql-engine

Query caching vs Data caching
- Cache queries:
  - Cache query execution plan
- Cache data:
  - Don't hit the upstream data source

Query Caching
- Algorithm:
  - For each incoming GraphQL query, normalise it
  - Hash the normalised GraphQL query, and store the sequence of resolvers to be called in a map
  - Use an LRU strategy to bound the size of the cache
  - Run the resolvers and return data
  - If the same GraphQL query or a variation of it comes in, do a lookup on the map and run the resolvers
  - If the client supports making a query using a hash directly, even better, because no normalisation step is required
- graphql-jit / fastify-graphql
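The query-caching steps can be sketched roughly as follows. This is a minimal illustration, not how graphql-jit or fastify-graphql implement it: `normalise`, `PlanCache` and `build_plan` are hypothetical names, the "plan" is whatever object the caller builds, and normalisation here only collapses whitespace, where a real server would parse the query and re-print its AST.

```python
import hashlib
import re
from collections import OrderedDict


def normalise(query: str) -> str:
    # Crude stand-in for real normalisation: collapse all whitespace runs.
    return re.sub(r"\s+", " ", query).strip()


class PlanCache:
    """Maps hash(normalised query) -> execution plan, bounded with LRU eviction."""

    def __init__(self, max_size: int = 1000):
        self.max_size = max_size
        self._plans = OrderedDict()

    @staticmethod
    def key_for(query: str) -> str:
        return hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()

    def get_or_build(self, query, build_plan):
        key = self.key_for(query)
        if key in self._plans:
            # Cache hit: mark as most recently used and reuse the plan.
            self._plans.move_to_end(key)
            return self._plans[key]
        # Cache miss: build the plan (e.g. the sequence of resolvers) once.
        plan = build_plan(query)
        self._plans[key] = plan
        if len(self._plans) > self.max_size:
            # Evict the least recently used entry.
            self._plans.popitem(last=False)
        return plan
```

If the client sends the hash itself, the server can skip `key_for` and look the plan up directly.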

10x win: Pair with DB query caching (aka prepared statements)
- Instead of a pure resolver approach, consider a "pushdown" approach
- Take an incoming GraphQL query, extract the parts of it that only fetch from a single database
- Compile that into a single DB query (along with authorization rules)
- Databases cache their query plans as well! (Prepared statements in Postgres/MySQL)
- So session variables + query variables are passed straight through, directly & securely, to the database
- Normal: SQL query → Plan & optimise → Execute
- Prepared: (SQL query name, variables) → Execute
- Flow: Client → (GraphQL query-id + variables) → GraphQL server → (SQL query-id + variables) → Postgres, with JSON returned to the client
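A rough sketch of the pushdown idea, under loud assumptions: it uses Python's built-in `sqlite3` instead of Postgres so that it runs anywhere, and the table, rows, `COMPILED_SQL` and `run_compiled` are all made up for illustration. The SQL text is fixed per GraphQL query shape, the authorization rule is compiled in as a predicate, and only the variables change between requests, so the database driver can reuse the compiled statement (`sqlite3` keeps a per-connection statement cache, sized by `cached_statements`).

```python
import sqlite3

# In-memory database with illustrative rows (not real data).
conn = sqlite3.connect(":memory:", cached_statements=128)
conn.executescript(
    """
    CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT, city TEXT, owner_id INTEGER);
    INSERT INTO restaurants VALUES
        (1, 'sf-one', 'SF', 10),
        (2, 'dublin-one', 'Dublin', 20),
        (3, 'sf-two', 'SF', 10);
    """
)

# One SQL text per GraphQL query shape. The authorization rule
# (owner_id = session user) is embedded in the query, not applied afterwards.
COMPILED_SQL = (
    "SELECT id, name FROM restaurants "
    "WHERE city = :city AND owner_id = :session_user_id "
    "ORDER BY id"
)


def run_compiled(session_vars, query_vars):
    # Session variables and query variables are bound as parameters only;
    # they are never spliced into the SQL text.
    params = {
        "city": query_vars["city"],
        "session_user_id": session_vars["user_id"],
    }
    return conn.execute(COMPILED_SQL, params).fetchall()
```

In Postgres the equivalent would be a named prepared statement executed with the same two groups of variables.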

Data Caching
- Purpose:
  - Reduce load on upstream services: without a cache, 10k requests become 10k requests to the database
  - Identify HOT queries and cache their results instead of straining the upstream system
- Trade-off:
  - Consistency and stale results :(

Data Caching is hard
- Automatically caching API calls that fetch dynamic data is hard (not just for GraphQL)
- There are 2 problems to solve:
  - What to cache?
  - How do we update / invalidate the cache?

Data Caching - What to cache?
- /restaurants, User-id: 1 → Who is user-id 1? What city are they in? → User-id 1 is in SF → Load SF restaurants → SF restaurant cache
- /restaurants, User-id: 2 → Who is user-id 2? What city are they in? → User-id 2 is in Dublin → Load Dublin restaurants → Dublin restaurant cache
- /restaurants, User-id: 3 → Who is user-id 3? What city are they in? → User-id 3 is in SF → Load SF restaurants → SF restaurant cache

Data Caching - how do we invalidate & refresh the cache?
- Update restaurant (/restaurants?id=123), which may affect the SF restaurant cache
- #1: Cache for 60s
- #2: Is this an SF restaurant? Yes. Invalidate cache.
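Both invalidation options can be sketched together. Everything here is illustrative: `TTLCache`, `on_restaurant_updated` and the `("restaurants", city)` key layout are hypothetical names, and the clock is injectable only so that expiry can be exercised without waiting.

```python
import time


class TTLCache:
    """Option #1: entries expire after a fixed time-to-live (60s by default)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Stale entry: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key):
        self._entries.pop(key, None)


def on_restaurant_updated(cache, restaurant):
    # Option #2: work out which cached result the updated row belongs to
    # ("Is this an SF restaurant?") and invalidate only that entry.
    cache.invalidate(("restaurants", restaurant["city"]))
```

Option #1 needs no knowledge of the data but serves stale results for up to the TTL; option #2 is precise but requires knowing which cache entries a write affects.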

3 ways to cache data
1. Before it hits the GraphQL server
2. In GraphQL resolvers
3. At the model level (integrated with logic to fetch the data for a particular model)

1. Cache before the GraphQL server
- Similar to caching GET requests with a CDN
- API server doesn't know about caching at all
- Algorithm:
  - Look at the incoming query's identifier (or normalise and check identifier)
  - See if this query is cacheable (cache list, @cached directive on the client-side)
  - Load data from a cache instead of running resolvers
  - If data is not available, asynchronously populate the cache
- Caveats:
  - Only works if you know that the result of the query doesn't depend on the identity of the user. Eg: public APIs
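A minimal sketch of approach #1, with assumptions stated up front: `EdgeCache`, `query_id` and `is_cacheable` are hypothetical names, the `@cached` check is a plain substring test rather than a real directive parser, and the cache is populated synchronously on a miss for simplicity, whereas the slide suggests doing it asynchronously. Note that the key contains no user identity, which is exactly why this only suits public data.

```python
import hashlib
import json
import re


def query_id(query, variables):
    # Identifier = hash of the whitespace-normalised query plus its variables.
    normalised = re.sub(r"\s+", " ", query).strip()
    payload = normalised + "|" + json.dumps(variables, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_cacheable(query, allow_list):
    # Cacheable if the client opted in via @cached, or the query is on a cache list.
    normalised = re.sub(r"\s+", " ", query).strip()
    return "@cached" in normalised or normalised in allow_list


class EdgeCache:
    """Sits in front of the GraphQL server; the server knows nothing about it."""

    def __init__(self, execute, allow_list=frozenset()):
        self.execute = execute  # callable(query, variables) -> response
        self.allow_list = allow_list
        self.store = {}

    def handle(self, query, variables):
        if not is_cacheable(query, self.allow_list):
            return self.execute(query, variables)
        key = query_id(query, variables)
        if key not in self.store:
            # Miss: run the real server once and remember the response.
            self.store[key] = self.execute(query, variables)
        return self.store[key]
```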

Cache full API call by treating it like public data
- /restaurants?city=SF, User-id: 1 (SF) → No dependency on user identity. Load from cache. → SF restaurant cache
- /restaurants?city=Dublin, User-id: 2 (Dublin) → No dependency on user identity. Load from cache. → Dublin restaurant cache
- /restaurants?city=SF, User-id: 3 (SF) → No dependency on user identity. Load from cache. → SF restaurant cache

2. Cache at GraphQL resolvers
- Cache inside the GraphQL resolvers
- Algorithm:
  - Inside a resolver, create a cache key based on the upstream database query or API call
  - For any execution of the resolver, load the data from a cache using the cache key
  - Or populate the cache if there's a cache miss
- Caveats:
  - Hitting the cache for every resolver. N+1? Cache needs a data-loader also?
  - Potentially a lot of repeated code if multiple resolvers are fetching from the same model
  - Hard to automate
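Approach #2 might look like the sketch below. The helper names (`fetch_user_city`, `fetch_restaurants_from_db`, `restaurants_resolver`) and the in-process dict cache are assumptions for illustration; the user-to-city mapping mirrors the example used in these slides.

```python
CACHE = {}      # cache key -> cached upstream result
DB_CALLS = []   # records upstream fetches, to show when the cache is bypassed


def fetch_user_city(user_id):
    # Stand-in for a user lookup: users 1 and 3 are in SF, user 2 is in Dublin.
    return {1: "SF", 2: "Dublin", 3: "SF"}[user_id]


def fetch_restaurants_from_db(city):
    # Stand-in for the upstream database query.
    DB_CALLS.append(city)
    return [f"{city}-restaurant"]


def restaurants_resolver(user_id):
    city = fetch_user_city(user_id)
    # The cache key is derived from the upstream query, not from the user.
    key = ("restaurants_by_city", city)
    if key not in CACHE:
        # Cache miss: fetch from the source and populate the cache.
        CACHE[key] = fetch_restaurants_from_db(city)
    return CACHE[key]
```

The caveats show up even at this size: the key-building and miss-handling logic would have to be repeated in every resolver that reads the same model.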

Fetch from cache in resolver instead of fetching from source.
- /restaurants, User-id: 1 → Restaurants resolver → User-id 1 is in SF → Load SF restaurants from cache or DB → SF restaurant cache
- /restaurants, User-id: 2 → Restaurants resolver → User-id 2 is in Dublin → Load Dublin restaurants from cache or DB → Dublin restaurant cache
- /restaurants, User-id: 3 → Restaurants resolver → User-id 3 is in SF → Load SF restaurants from cache or DB → SF restaurant cache

3. Cache using model-level rules
- Algorithm:
  - Each model should have declarative authorization & relationship rules
  - Resolvers fetch data from a generic model data fetching layer
  - Data fetching layer embeds the authorization rules automatically
  - Knowing what to cache is not at the resolver level
  - When a query comes in, analyse the authorization rules of all the models that will be fetched in the query to determine its dependency on the user identity
  - For multiple user identities, we can determine if the query will result in fetching the same data
  - Use simple data caching at the full-query level (like in approach #1)

Cache-key includes the user's "group". Cache full query.
- /restaurants, User-id: 1 → User-id 1 is in SF → Use (SF, query) cache key and load from cache → SF restaurant cache
- /restaurants, User-id: 2 → User-id 2 is in Dublin → Use (Dublin, query) cache key and load from cache → Dublin restaurant cache
- /restaurants, User-id: 3 → User-id 3 is in SF → Use (SF, query) cache key and load from cache → SF restaurant cache
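One way to picture approach #3, as a sketch only: the rule format, the session variable names (`x-user-city`, `x-user-id`) and the functions below are invented for illustration and are not Hasura's actual rule syntax. The idea is that each model declares which session variables its authorization filter reads, and the cache key for a whole query includes only those variables, so users in the same "group" share an entry.

```python
# Hypothetical declarative rules: for each model, its authorization filter
# maps a column to the session variable it is compared against.
MODEL_RULES = {
    "restaurants": {"filter": {"city": "x-user-city"}},
    "orders": {"filter": {"user_id": "x-user-id"}},
    "cuisines": {"filter": {}},  # public: no dependency on the user
}


def session_dependencies(models):
    # Union of session variables referenced by the rules of all models in the query.
    deps = set()
    for model in models:
        deps.update(MODEL_RULES[model]["filter"].values())
    return deps


def cache_key(query, models, session):
    # The user's "group" is the value of every session variable the query
    # actually depends on; everything else about the user is ignored.
    deps = sorted(session_dependencies(models))
    group = tuple((name, session[name]) for name in deps)
    return (group, query)
```

With these rules, a restaurants query from users 1 and 3 (both in SF) maps to the same key, a query on a public model is shared by everyone, and a query on `orders` stays per-user.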

Caching on Hasura Cloud
- LRU cache
- @cached directive. Client controls tolerance for stale data.
- Use a combination of 2 strategies automatically:
  1. Use #1:
     a. Determine if query is independent of user identity
  2. Use #3:
     a. If data is from a database, use #3 approach
     b. If data is from an API source where business logic is not known, use #1 if applicable

hasura.io/cloud @tanmaigo

