monitoring kubernetes with prometheus
play

Monitoring Kubernetes with Prometheus Henri Dubois-Ferriere - PowerPoint PPT Presentation

Monitoring Kubernetes with Prometheus Henri Dubois-Ferriere @henridf Percona Live, 2018-11-06 Hello. Henri Dubois-Ferriere Technical Director, Sysdig Doing observability for many many years, from network to web apps via many startups.


  1. Monitoring Kubernetes with Prometheus Henri Dubois-Ferriere @henridf Percona Live, 2018-11-06

  2. Hello. Henri Dubois-Ferriere Technical Director, Sysdig Doing “observability” for many many years, from network to web apps via many startups. PhD in CS from EPFL Repatriate from San Francisco to Switzerland

  3. Outline ● Kubernetes ● Prometheus ● Kubernetes metrics & sources ● Deployment

  4. Monitor why? ● Know about outages before users tell me ● Understand my production environment (or try…) ● Plan/trend/forecast

  5. Kubernetes

  6. Kubernetes - Container orchestration system - aka “OS for your cluster” - Abstracts away the underlying infra - declarative APIs with control loops

  7. https://commons.wikimedia.org/wiki/File:Kubernetes.png

  8. Prometheus

  9. Prometheus ❏ Started at SoundCloud in 2012 ❏ Motivated by challenges with monitoring dynamic environments ❏ Made public 2015, now second CNCF “graduate”

  10. More than a TSDB https://prometheus.io/assets/architecture.png

  11. It’s all about the pull - Prom scrapes targets to get metrics - Nice side effect: know when target down - Needs to know what to scrape

  12. What should Prometheus scrape? - Service discovery provides answer - Azure, Consul, GCE, K8S, EC2, ... - Can also watch a file containing target list

  13. Dimensional data model Query: http_requests_total{code=”200”, method=”get”} Metric name Selector (aka filter)

  14. Dimensional data model Query: http_requests_total{code=”200”, method=”get”} Response : http_requests_total {code="200", method=”get”, route="/api/users"} 1528706829.115 1741 http_requests_total {code="200", method=”get”, route="/api/objects"} 1528706829.115 1920 Label/value pairs (aka dimensions)

  15. Dimensional data model Query: http_requests_total{code=”200”, method=”get”} Response : http_requests_total{code="200", method=”get”, route="/api/users"} 1528706829.115 1741 http_requests_total{code="200", method=”get”, route="/api/objects"} 1528706829.115 1920 Timestamp value

  16. Metadata discovery - SD also provides metadata - Metadata can be mixed in with metrics - Powerful relabelling feature for label manipulation at ingest

  17. Instrumentation

  18. Off-the-shelf or write your own

  19. Kubernetes metrics

  20. Monitoring resources and methods - For resources like memory, queues, CPUs, disks… - USE Method: Utilization, Saturation, Errors - http://www.brendangregg.com/usemethod.html - For services - “RED” Method: Request rate, Error rate, Duration - https://www.weave.works/blog/the-red-method-key-metrics-for-micr oservices-architecture/

  21. node_exporter: node metrics - Host metrics - CPU - Memory - Disk - Network - ... - Not K8S specific, but useful as referential and for totals

  22. cAdvisor: container metrics - Runs in kubelet (usually, for now..) - Resource stats about running containers - Mostly container and node-level labels… - (k8s: plus namespace and pod_name)

  23. Sample cAdvisor metric queries Percent of total cluster memory used: sum(container_memory_rss) / sum(machine_memory_bytes) Memory used by kubernetes namespace: sum(container_memory_rss) by (namespace) Top 5 pods by network I/O: topk(5, sum by (pod_name) (rate(container_network_transmit_bytes_total[5m])))

  24. Kube-state metrics $ kubectl get deploy my-app -o yaml apiVersion: extensions/v1beta1 kind: Deployment metadata: name: my-app ... spec: replicas: 4 ... status: replicas: 4 ...

  25. Kube-state metrics $ kubectl get deploy my-app -o yaml apiVersion: extensions/v1beta1 kind: Deployment metadata: name: my-app ... spec: kube_deployment_spec_replicas{ deployment= "my-app" , ... } replicas: 4 ... Metrics created by kube-state-metrics With label set from this deployment status: replicas: 4 kube_deployment_status_replicas{ deployment= "my-app" , ... } ...

  26. Sample kube-state-metrics queries Deployments with issues kube_deployment_spec_replicas != kube_deployment_status_replicas_available Top 10 longest-running pods (“reverse uptime”) topk(10, sort_desc(time() - kube_pod_created))

  27. Kube core service metrics - API Server - etcd3 - kube-dns - scheduler, controller-manager

  28. Metrics recap Deployment mode How many Metrics about node_exporter daemonset 1 per node node resources cAdvisor inside kubelet 1 per node container resources kube-state-metrics deployment singleton k8s object state etcd, Api Server, core service singleton or HA group Itself controller manager, ...

  29. Deploying

  30. Monitoring from the inside - Monitoring runs inside thing being monitored? - Yes. It’s fine really. Really, it’s fine. - (And being outside has own challenges)

  31. Deployment outline - Metrics services - node_exporter - kube-state-metrics - (cAdvisor usually enabled out of box) - Prometheus running - Storage - Read access to API server (for service discovery) - Service discovery config for above - Service discovery config for apps/services

  32. Helm-based install helm fetch stable/prometheus vi prometheus/values.yaml # configure install helm upgrade -i # or manually deploy yaml

  33. Prometheus operator - Use Kubernetes API facilities to make Prometheus “native” - new Prometheus-related objects: ` kubectl get prometheus` - PrometheusRule, ServiceMonitor, AlertManager, AlertingSpec, ... - Prometheus configuration abstracted via all these objects - Young but promising - Consider more direct route first (hand-rolled or Helm), and Operator once more familiar with challenges of direct route

  34. Thank You. Henri Dubois-Ferriere @henridf

  35. Pointers Prometheus SD for Kubernetes: - https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config KSM metrics: https://github.com/kubernetes/kube-state-metrics/tree/master/Documentation - Prometheus Helm chart: https://github.com/helm/charts/tree/master/stable/prometheus - Prometheus operator: https://github.com/coreos/prometheus-operator - “A deep dive into Kubernetes metrics” blog series: - https://blog.freshtracks.io/a-deep-dive-into-kubernetes-metrics-66936addedae

Recommend


More recommend