Presto Summit NYC 2019 December 11, 2019 Slack handles: @cheolsoo; @abhonsule slack-corp.com
Mission Make people’s working lives simpler, more pleasant and more productive.
Slack
Data Engineering at Slack Custodian of all data generated within Slack, the product. We provide the infrastructure and tooling necessary for stakeholders to reliably access product data for user-facing features and for product and business insights. 215B +270M 700B 250B Logs Daily Messages Daily Records Messages Table
Presto at Slack ● Airflow: DAGs running on ETL scheduling system ● Databooks: Tool used by Analysts, Data scientists, Marketing, Sales, Finance ● Analytics.ts: Slack's internal analytics portal - Product Managers, Engineers, Analysts, Data scientists, Sales, Marketing, Finance ● AB Testing framework: Slack's AB testing/Experiments framework ● Sqooper: Batch ingestion system ● BI portal: BI tool used by Corp/Biztech ● clog queries: Query client logs
Presto at Slack ● Past: Presto on EMR, single cluster ● Present: Starburst on EC2, multiple clusters ● Future: Federated clusters
Query success rate
Query count
Multiple clusters ● Static load balancing ● Per cluster config properties ● Per cluster capacity planning
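Static load balancing can be read as a fixed mapping from a query source to a cluster. The following is a minimal sketch under that reading; the source names, host names and default are hypothetical and not taken from the talk.

```python
# Hypothetical static routing table: each query source is pinned to one cluster.
# Source names and coordinator URLs are illustrative assumptions.
STATIC_ROUTES = {
    "airflow": "http://presto-etl.example.internal:8889",
    "databooks": "http://presto-adhoc.example.internal:8889",
    "bi-portal": "http://presto-bi.example.internal:8889",
}
DEFAULT_CLUSTER = "http://presto-adhoc.example.internal:8889"


def route(source):
    """Return the coordinator URL for a query source.

    The mapping is fixed in configuration and ignores current cluster load,
    which is what makes the balancing static.
    """
    return STATIC_ROUTES.get(source.lower(), DEFAULT_CLUSTER)
```

Because the table never looks at load, each cluster can be sized and configured for the one workload pinned to it.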
Shadow clusters ● Read-only shadow cluster in parallel ● Useful for testing config changes or version upgrades
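One possible way to use a shadow cluster is to mirror read-only queries to it and record whether they still succeed. This is a sketch of that idea only, not a description of Slack's setup; `run_primary` and `run_shadow` are hypothetical callables supplied by the caller, and the keyword check is a crude heuristic.

```python
# Statements starting with these keywords are treated as read-only.
# This is a rough heuristic for illustration, not a SQL parser.
READ_ONLY_PREFIXES = ("select", "show", "describe", "with")


def is_read_only(sql):
    """Return True if the statement looks like a read-only query."""
    return sql.lstrip().lower().startswith(READ_ONLY_PREFIXES)


def run_with_shadow(sql, run_primary, run_shadow):
    """Run a query on the primary cluster and mirror it to the shadow cluster.

    Returns (primary_result, shadow_ok), where shadow_ok is None when the
    query was not mirrored, True when the shadow run succeeded and False
    when it raised. Shadow failures never affect the primary result.
    """
    result = run_primary(sql)
    shadow_ok = None
    if is_read_only(sql):
        try:
            run_shadow(sql)
            shadow_ok = True
        except Exception:
            shadow_ok = False
    return result, shadow_ok
```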
Terraform module ● Provision a cluster with 25 lines of code ● ASG optionally with spot ● Dedicated HMS per cluster
Resource groups ● Per cluster resource groups config ● Per group scheduling policies ● Fair (ad-hoc) vs weighted_fair (etl)
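A per-cluster resource groups file with a fair ad-hoc group and a weighted_fair ETL group might look roughly like the output of the sketch below. The field names follow Presto's file-based resource group configuration as commonly documented and should be checked against the version in use; the group names, limits, weights and selector patterns are assumptions.

```python
import json

# Illustrative values only: names, limits, weights and selectors are assumptions.
resource_groups = {
    "rootGroups": [
        {
            "name": "adhoc",
            "softMemoryLimit": "40%",
            "hardConcurrencyLimit": 20,
            "maxQueued": 200,
            # fair: eligible sub-groups take turns starting queries
            "schedulingPolicy": "fair",
        },
        {
            "name": "etl",
            "softMemoryLimit": "60%",
            "hardConcurrencyLimit": 40,
            "maxQueued": 500,
            # weighted_fair: sub-groups are picked according to schedulingWeight
            "schedulingPolicy": "weighted_fair",
            "subGroups": [
                {"name": "critical", "softMemoryLimit": "70%",
                 "hardConcurrencyLimit": 30, "maxQueued": 300,
                 "schedulingWeight": 3},
                {"name": "backfill", "softMemoryLimit": "30%",
                 "hardConcurrencyLimit": 10, "maxQueued": 200,
                 "schedulingWeight": 1},
            ],
        },
    ],
    # Selectors are evaluated in order; the last one is a catch-all.
    "selectors": [
        {"source": "airflow-critical.*", "group": "etl.critical"},
        {"source": "airflow.*", "group": "etl.backfill"},
        {"group": "adhoc"},
    ],
}


def render(config):
    """Serialize the configuration to the JSON text written to disk."""
    return json.dumps(config, indent=2)
```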
JVM JMX exporter -javaagent:/usr/local/jmx_exporter/jmx_exporter.jar=7071:/usr/local/jmx_exporter/exporter.yml Prometheus self.consul_job('presto', datacenters=[env + '-us-east-1-dw1'], services=['presto'])
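With the agent configured as above, the exporter serves metrics in the Prometheus text format on port 7071. The sketch below parses that text into a dictionary, for example to spot-check a node by hand. It assumes lines carry no trailing timestamp, and it does not assume any particular metric names.

```python
def parse_metrics(text):
    """Parse Prometheus text-format lines into {series: value}.

    Assumes each sample line is 'name{labels} value' without a timestamp.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        name, sep, value = line.rpartition(" ")
        if not sep:
            continue
        metrics[name] = float(value)
    return metrics
```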
Grafana dashboard
Graceful decommission Autoscaling curl -XPUT localhost:8889/v1/info/state -d '"SHUTTING_DOWN"' -H "Content-type: application/json" Chef role "auto_scaling_group": { "prepare_for_termination_cmd": "<cmd>" }
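The same shutdown call can be issued from Python, which may be convenient inside a termination hook. This is a sketch using only the standard library; the endpoint and port come from the curl command on the slide, while the function names and the timeout are assumptions. The request body is the JSON-encoded string, including its double quotes.

```python
import json
import urllib.request


def build_shutdown_request(host="localhost", port=8889):
    """Build the PUT request that asks a Presto worker to shut down gracefully."""
    body = json.dumps("SHUTTING_DOWN").encode("utf-8")  # JSON string, quotes included
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/info/state",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def request_graceful_shutdown(host="localhost", port=8889, timeout=10):
    """Send the request and return the HTTP status code."""
    request = build_shutdown_request(host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Splitting request construction from sending keeps the payload easy to inspect without a running worker.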
Federated clusters ● Dynamic load balancing ● High availability ● Minimize the impact of rogue queries
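Dynamic load balancing, in contrast to a fixed mapping, picks a cluster from its current state. The sketch below chooses the healthy cluster with the shortest queue; the `ClusterStatus` fields and the selection rule are assumptions for illustration, not the design presented in the talk.

```python
from dataclasses import dataclass


@dataclass
class ClusterStatus:
    """Hypothetical snapshot of one cluster, e.g. built from scraped metrics."""
    name: str
    healthy: bool
    running_queries: int
    queued_queries: int


def pick_cluster(clusters):
    """Return the healthy cluster with the fewest queued, then running, queries.

    Skipping unhealthy clusters gives basic high availability; preferring
    short queues steers new queries away from a cluster stalled by a rogue query.
    """
    candidates = [c for c in clusters if c.healthy]
    if not candidates:
        raise RuntimeError("no healthy Presto cluster available")
    return min(candidates, key=lambda c: (c.queued_queries, c.running_queries))
```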
Q&A