Know more about your Ceph Cluster with ELK Stack, Cameron Seader


  1. Know more about your Ceph Cluster with ELK Stack Cameron Seader Technology Strategist cs@suse.com

  3. Ceph and Logging ● rsyslog, syslog-ng to forward logs to (Logstash / Fluent Bit) ● Filebeat ● Ceph has Graylog (GELF) support ● store logs for later use ● analyze logs for alerting ● analyze data with machine learning ● X-Pack machine learning ● R client ● troubleshooting / postmortem analysis

  4. Forwarding logs ● you can format your logs before forwarding ● there is an rsyslog tutorial on reformatting logs to GELF ● Logstash has a lot of pipeline input plugins ● syslog { } ● gelf { } (for Graylog/GELF messages) ● Filebeat

  5. Ceph GELF
     ● to forward logs in GELF, update ceph.conf:
         log_to_graylog = true
         log_graylog_host = 127.0.0.1
         log_graylog_port = 12201
     ● restart the Ceph services
     ● or use DeepSea
       ● has support for custom Ceph config options
       ● salt-run state.orch ceph.stage.3
       ● that would also restart the services correctly
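
Before pointing Ceph at a GELF listener, it can help to confirm that something is actually receiving on the configured host and port. The following is a minimal Python sketch, not part of the original slides, that builds a GELF 1.1 payload and sends it as a single uncompressed UDP datagram. The function names and the example field values are hypothetical; the address defaults simply mirror the `log_graylog_host` and `log_graylog_port` values above. Whether the receiver accepts uncompressed, unchunked datagrams depends on the input in use, so treat this as a smoke test under that assumption.

```python
import json
import socket
import time


def build_gelf_message(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload as UTF-8 bytes.

    Additional fields are prefixed with '_' as the GELF format requires.
    """
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }
    for key, value in extra.items():
        payload["_" + key] = value
    return json.dumps(payload).encode("utf-8")


def send_gelf_udp(message, address=("127.0.0.1", 12201)):
    """Send one uncompressed GELF datagram over UDP (assumed listener address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, address)
```

Calling `send_gelf_udp(build_gelf_message("test-node", "GELF smoke test", subsys="test"))` should make one test event show up in the receiving pipeline if the listener is reachable.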

  6. Parse and Manage ● Logstash provides methods to parse logs ● simple alerting can be done with Logstash ● use grok { } and other Logstash filters ● to add fields ● for better indexing and management of your data ● Ceph Logstash example ● Supportconfig Analyzer

  7. Logstash Pipeline filter example
     filter {
       if [type] == "cephlog" {
         grok {
           # https://github.com/ceph/ceph/blob/master/src/log/Entry.h
           match => { "message" => "(?m)%{TIMESTAMP_ISO8601:stamp}\s%{NOTSPACE:thread}\s*%{INT:prio}\s(%{WORD:subsys}|):?\s%{GREEDYDATA:msg}" }
           # https://github.com/ceph/ceph/blob/master/src/common/LogEntry.h
           match => { "message" => "%{TIMESTAMP_ISO8601:stamp}\s%{NOTSPACE:name}\s%{NOTSPACE:who_type}\s%{NOTSPACE:who_addr}\s%{INT:seq}\s:\s%{PROG:channel}\s\[%{WORD:prio}\]\s%{GREEDYDATA:msg}" }
         }
         date {
           match => [ "stamp", "yyyy-MM-dd HH:mm:ss.SSSSSS", "ISO8601" ]
         }
       }
     }
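
To see what the first grok pattern extracts, it can be approximated outside Logstash with a plain regular expression. The Python sketch below is an illustration only: the regex is a simplified stand-in for the grok macros (for example, it does not handle the time zone suffix that `TIMESTAMP_ISO8601` allows), and the sample log line is made up for the example rather than taken from a real cluster.

```python
import re
from datetime import datetime

# Simplified equivalent of the first grok pattern:
# stamp, thread, prio, optional subsys, msg.
CEPH_DAEMON_LINE = re.compile(
    r"(?P<stamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?)\s"
    r"(?P<thread>\S+)\s*"
    r"(?P<prio>[+-]?\d+)\s"
    r"(?:(?P<subsys>\w+)|):?\s"
    r"(?P<msg>.*)",
    re.DOTALL,
)


def parse_daemon_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = CEPH_DAEMON_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Counterpart of the date { } filter for the "yyyy-MM-dd HH:mm:ss.SSSSSS" case.
    fields["timestamp"] = datetime.strptime(fields["stamp"], "%Y-%m-%d %H:%M:%S.%f")
    return fields


# Hypothetical sample line, invented for this example.
SAMPLE = "2018-05-14 10:15:30.123456 7f3c2a7fc700  5 osd: handle_osd_map epochs [10,12]"
```

`parse_daemon_line(SAMPLE)` yields separate `thread`, `prio`, `subsys`, and `msg` fields, which is the kind of structure that makes the indexed data easier to filter and aggregate.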

  8. The Tools: ELK, EFK, Graylog

  9. How to get it ● ELK ● docker-compose is simple for dev needs ● EFK ● Helm charts for each component ● install individually in a cluster ● Graylog ● separately installed cluster ● Future ● Ceph in containers (Rook) converged with these tools on K8s

  14. Kibana Search

  15. RGW: object storage client to the Ceph cluster; exposes a RESTful S3 & Swift API

  16. RGW ● RESTful API access to object storage, à la S3 ● implements user accounts, ACLs, buckets ● the large ecosystem of S3/Swift client tooling can be leveraged against RGW

  17. RGW ● Supports a lot of S3-like features ● multipart uploads ● object versioning ● torrents ● lifecycle ● encryption ● compression ● static websites ● metadata search... ● Since Jewel, multisite is supported, which allows geographical redundancy

  18. Elasticsearch: "You know, for search" ● distributed ● horizontally scalable ● schemaless ● speaks REST ● easy configuration without setting your hair on fire

  19. RGW Metadata search with ES: Motivation ● Objects have metadata associated with them that is often interesting to analyze ● Since it is "object storage", you don't have any traditional filesystem tools at your disposal ● No du, df & friends, and either way these are hard on a distributed storage system

  20. Motivation ● Some support exists in the admin API, but it has problems: ● it returns specific metadata, which is not ideal for aggregation ● there are no notifications when new objects/buckets/accounts are created ● granting users permission to access the admin API is tricky, since the admin API was meant for administering ● As a storage administrator you'd be interested in finding out, e.g., the top 10 users, the average object size, the number of objects held by a user account, etc.

  21. Design ● Built atop the multisite architecture, where data & metadata are forwarded to multiple zones ● Since Kraken, there are sync plugins ● These allow data & metadata to be forwarded to external tiers, which allows building: ● interesting solutions analyzing bucket/object/user metadata (ES for starters) ● backup solutions (S3/cloud sync plugin for Mimic)

  22. Elastic Sync Plugin ● Forwards metadata from other zones to an ES instance ● Requires a read-only zone that doesn't cater to user requests & only forwards to ES ● No off-the-shelf authentication module that can work with RGW ● Recommendation: do not expose the ES endpoint to the public

  23. Elastic Sync Plugin: User Requests ● For normal user requests, RGW itself can authenticate the user, which ensures users don't see others' data ● An attribute recording the owner of each object is used to service user requests ● Also allows custom metadata fields to be set up per user

  24. [Diagram: S3 metadata, ES, RGW primary, RGW secondary, Ceph]

  25. Example Metadata
      {
        "_index" : "rgw-default-6cb1f916",
        "_type" : "object",
        "_id" : "86740559-297e-4487-b770-d3106b900a97.34125.1:american-gods:null",
        "_score" : 0.2876821,
        "_source" : {
          "bucket" : "s3bucket",
          "name" : "american-gods",
          "instance" : "null",
          "versioned_epoch" : 0,
          "owner" : {

  26. Example Query
      curl -XPOST 'localhost:9200/rgw-gold/_search?size=0&pretty' -d '
      {
        "aggs" : {
          "avg_size" : { "avg" : { "field" : "meta.size" } }
        }
      }'

  27. Response
      {
        "took" : 22,
        "timed_out" : false,
        "_shards" : { "total" : 10, "successful" : 10, "failed" : 0 },
        "hits" : { "total" : 22, "max_score" : 0.0, "hits" : [ ] },
        "aggregations" : {
          "avg_size" : { "value" : 177.72727272727272 }
        }
      }
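
The query and response on the two slides above can also be handled programmatically. The Python sketch below builds the same `avg` aggregation body and pulls the value out of a response shaped like the one shown. The helper names are hypothetical, and the response literal simply reproduces the slide; nothing here contacts a live Elasticsearch instance.

```python
import json


def build_avg_query(field, name="avg_size"):
    """Build the body of an 'avg' metric aggregation over one field."""
    return {"aggs": {name: {"avg": {"field": field}}}}


def extract_avg(response, name="avg_size"):
    """Read the aggregated value out of a parsed _search response."""
    return response["aggregations"][name]["value"]


# Response copied from the slide, parsed from JSON text.
SLIDE_RESPONSE = json.loads("""
{
  "took": 22,
  "timed_out": false,
  "_shards": {"total": 10, "successful": 10, "failed": 0},
  "hits": {"total": 22, "max_score": 0.0, "hits": []},
  "aggregations": {"avg_size": {"value": 177.72727272727272}}
}
""")

# JSON text suitable for the -d argument of the curl call.
QUERY_BODY = json.dumps(build_avg_query("meta.size"))
```

Because the request uses `size=0`, the `hits` array stays empty and only the aggregation result is returned, so `extract_avg(SLIDE_RESPONSE)` is the only part of the response that needs to be read.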

  28. Interesting queries possible ● Object storage PUT requests in a specific time range ● Stats on objects with specific metadata content ● It is possible to index metadata to non-string fields on a per-bucket basis
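
As a rough illustration of the first two query types, the Python sketch below builds two request bodies as plain dictionaries. Several things here are assumptions rather than facts from the slides: the modification-time field name `meta.mtime`, the use of modification time as a proxy for PUT time, the `interval` parameter of `date_histogram` (the spelling used by the older Elasticsearch releases this talk targets), and all function and aggregation names. Only `meta.size` appears in the original example query.

```python
def build_writes_in_range_query(start, end, time_field="meta.mtime", interval="day"):
    """Count objects whose (assumed) modification time falls in [start, end).

    time_field is an assumption about the index mapping; adjust as needed.
    """
    return {
        "query": {"range": {time_field: {"gte": start, "lt": end}}},
        "aggs": {
            "writes_over_time": {
                "date_histogram": {"field": time_field, "interval": interval}
            }
        },
    }


def build_size_stats_query(filter_field, value, size_field="meta.size"):
    """Compute min/max/avg/sum of object sizes for objects matching one term.

    filter_field is supplied by the caller, e.g. a custom metadata field.
    """
    return {
        "query": {"term": {filter_field: value}},
        "aggs": {"size_stats": {"stats": {"field": size_field}}},
    }
```

Either dictionary can be serialized with `json.dumps` and posted to the `_search` endpoint with `size=0`, in the same way as the average-size example.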

  29. Future Work ● Support for ES 6 for RGW ● Custom metadata fields for object tagging ● Elastic queries to analyze common system faults ● Integration into Ceph dashboard ● Analysis of meta and/or log data with machine learning

  30. Contribute ● https://github.com/denisok/elk_supportconfig - GitHub repo for ongoing ELK work ● https://ceph.com/IRC/ - Ceph upstream community mailing lists and IRC channels ● http://lists.suse.com/mailman/listinfo/deepsea-users - DeepSea upstream mailing list ● https://groups.google.com/forum/#!forum/openattic-users - openATTIC upstream mailing list ● https://github.com/ceph/ceph - upstream Ceph sources

  31. Questions?
