Core Intel – On the bank's secret service. Krzysztof Adamski & Mariusz Derela, Miami, 18th May 2017
Are security breaches common? https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/432412/bis-15-302-information_security_breaches_survey_2015-full-report.pdf
Carbanak https://securelist.com/blog/research/68732/the-great-bank-robbery-the-carbanak-apt/
CoreIntel
Core Intel is part of the ING Cyber Crime Resilience Programme to structurally improve the capabilities for cybercrime:
• prevention
• detection
• response
The reasoning
• Measures against e-banking fraud, DDoS and Advanced Persistent Threats (APTs)
• Threat intelligence allows us to respond to, or even prevent, a cybercrime attack (this kind of intelligence is available via internal and external parties and includes both open and closed communities)
• Monitoring, detection and response to “spear phishing”
• Detection/mitigation of infected ING systems
• Baselining network traffic / anomaly detection
• Response to incidents (knowledge, tools, IT environment)
• Automated feeds, automated analysis and historical data analysis
What is there on the market nowadays?
The world is not enough
So the challenge is …
Most of our data is within Europe [map of ING's footprint: market leaders, Benelux, challengers, growth markets, Commercial Banking]
…but we operate globally [the same market categories, shown worldwide]
Expect the unexpected – collect all the data
So there is a challenge to capture “all” the data
• What kind of data do we need?
• Where is our data located?
• How can we potentially capture it?
• What are the legal implications?
Core Intel architecture
So what you would like to see is … Photo credit: edgarpierce via Foter.com / CC BY
…In fact it is slightly more complicated
Everything has its own purpose. Let's look at the details. Photo credit: https://www.pexels.com/photo/dslr-camera-equipments-147462/
Local data collector
But how do we capture that data? (TAP vs SPAN) https://observer.viavisolutions.com/includes/popups/taps/tap-vs-span.php
Kafka producer configuration (as we don't like losing data)
Broker settings:
  replication factor >= 3
  min.insync.replicas = 2
  unclean.leader.election.enable = false
  replica.lag.time.max.ms
Producer settings:
  acks = all
  retries = Integer.MAX_VALUE
  max.block.ms = Long.MAX_VALUE
  block.on.buffer.full = true
To have data in order:
  max.in.flight.requests.per.connection = 1
Good overview here: https://www.slideshare.net/JayeshThakrar/kafka-68540012
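For illustration, a minimal Scala sketch of a producer configured with these durability settings; the bootstrap hosts, topic name and record contents are placeholders, not taken from the deck:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object DurableProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092") // placeholder hosts
    // Durability: wait for all in-sync replicas and retry "forever" instead of dropping data
    props.put("acks", "all")
    props.put("retries", Integer.MAX_VALUE.toString)
    props.put("max.block.ms", Long.MaxValue.toString)
    // Ordering: only one in-flight request per connection, so retries cannot reorder records
    props.put("max.in.flight.requests.per.connection", "1")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // "network-logs" is a hypothetical topic name used only for this example
      producer.send(new ProducerRecord[String, String]("network-logs", "sensor-1", "raw log line"))
    } finally {
      producer.flush()
      producer.close()
    }
  }
}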
Central data collector
Time is crucial here Photo credit: Cargo Cult via Foter.com / CC BY
But your business data matters even more, so proceed with caution Photo credit: https://www.pexels.com/photo/white-caution-cone-on-keyboard-211151/
Kafka mirror maker configuration
Network bandwidth control:
• quota.consumer.default
• quota.producer.default
Secure data in transit:
  listeners=SSL://host.name:port
  ssl.client.auth=required
  ssl.keystore.location
  ssl.keystore.password
  ssl.key.password
  ssl.truststore.location
  ssl.truststore.password
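As a sketch only (host names, ports, file paths and quota values below are placeholders), these settings would typically be split between the central brokers' server.properties and the client property files passed to kafka-mirror-maker.sh:

# server.properties on the central (target) brokers
listeners=SSL://host.name:9093
ssl.client.auth=required
ssl.keystore.location=/etc/kafka/ssl/broker.keystore.jks
ssl.keystore.password=********
ssl.key.password=********
ssl.truststore.location=/etc/kafka/ssl/truststore.jks
ssl.truststore.password=********
# default per-client network quotas, bytes/sec (illustrative values)
quota.producer.default=10485760
quota.consumer.default=10485760

# producer.properties passed to kafka-mirror-maker.sh (points at the central cluster)
bootstrap.servers=central-broker1:9093,central-broker2:9093
security.protocol=SSL
ssl.keystore.location=/etc/kafka/ssl/mirrormaker.keystore.jks
ssl.keystore.password=********
ssl.key.password=********
ssl.truststore.location=/etc/kafka/ssl/truststore.jks
ssl.truststore.password=********

The consumer.properties on the MirrorMaker side points at the source (local) cluster and carries the same kind of SSL client settings if that leg is also encrypted.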
Streaming data
Spark on YARN streaming configuration
  spark.yarn.maxAppAttempts
  spark.yarn.am.attemptFailuresValidityInterval
  spark.yarn.max.executor.failures
  spark.yarn.executor.failuresValidityInterval
  spark.task.maxFailures
  spark.hadoop.fs.hdfs.impl.disable.cache
  spark.streaming.backpressure.enabled=true
  spark.streaming.kafka.maxRatePerPartition
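A minimal Scala sketch of how these settings come together in a long-running streaming job; the concrete values and the application name are illustrative, not the ones used in production:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("core-intel-streaming")
  // Keep retrying the application master and executors instead of giving up after a few
  // failures, but only count failures inside a sliding validity window
  .set("spark.yarn.maxAppAttempts", "4")
  .set("spark.yarn.am.attemptFailuresValidityInterval", "1h")
  .set("spark.yarn.max.executor.failures", "8")
  .set("spark.yarn.executor.failuresValidityInterval", "1h")
  .set("spark.task.maxFailures", "8")
  // Avoid a cached, possibly stale HDFS FileSystem handle in a job that runs for weeks
  .set("spark.hadoop.fs.hdfs.impl.disable.cache", "true")
  // Let Spark throttle Kafka ingestion when batches start falling behind
  .set("spark.streaming.backpressure.enabled", "true")
  .set("spark.streaming.kafka.maxRatePerPartition", "10000")

val ssc = new StreamingContext(conf, Seconds(10))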
In memory data grid
val rddFromMap = sc.fromHazelcastMap("map-name-to-be-loaded")
Let’s find something in these logs Photo credit: https://www.flickr.com/photos/65363769@N08/12726065645/in/pool-555784@N20/
Matching
• Tornado – a Python web framework and asynchronous networking library – http://www.tornadoweb.org/
• MessagePack – a binary transport format – http://msgpack.org/
Hits, alerts and dashboards
• Automatically and continually match network logs <-> threat intel (a minimal sketch of this matching follows below)
• When new threat intel arrives, match it against the full history of network logs
• When new network logs arrive, match them against the full history of threat intel
• Alerts are shown in a hit dashboard
• The dashboard is a web-based interface that provides flexible charts, querying, aggregation and browsing
• Quality/relevance of an alert is subject to the quality of the IoC feeds and the completeness of internal log data
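As an illustration only (the deck's actual matching engine is a Tornado/MessagePack service), here is a minimal Scala/Spark sketch of one direction of the matching: newly arrived network logs checked against a broadcast set of IoC indicators. All paths and the record layout are hypothetical.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("ioc-matching"))

// Threat intel feed: one indicator (IP, domain, hash, ...) per line
val iocSet = sc.textFile("hdfs:///coreintel/threat-intel/current")
  .map(_.trim.toLowerCase)
  .filter(_.nonEmpty)
  .collect()
  .toSet
val iocs = sc.broadcast(iocSet)

// Network logs: assume tab-separated records with the destination host/IP in column 3
val hits = sc.textFile("hdfs:///coreintel/network-logs/2017-05-18")
  .map(_.split("\t"))
  .filter(f => f.length > 3 && iocs.value.contains(f(3).toLowerCase))
  .map(_.mkString("\t"))

// Hits feed the hit dashboard; here we simply persist them
hits.saveAsTextFile("hdfs:///coreintel/hits/2017-05-18")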
Be smart with your tooling Photo credit https://www.flickr.com/photos/12749546@N07/
…and leverage e.g. Elasticsearch index templates
Elasticsearch configuration
Data mapping:
  doc_values
  fielddata
  fields
Cluster settings to check:
  gateway.recover_after_nodes
  gateway.recover_after_master_nodes
  gateway.recover_after_data_nodes
  indices.recovery.max_bytes_per_sec
  indices.breaker.total.limit
  indices.breaker.fielddata.limit
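For instance, a hedged sketch of such an index template (Elasticsearch 5.x-style syntax; the index pattern, type and field names are made up for illustration):

PUT _template/netflow_template
{
  "template": "netflow-*",
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1
  },
  "mappings": {
    "flow": {
      "properties": {
        "@timestamp":  { "type": "date" },
        "src_ip":      { "type": "keyword" },
        "dst_ip":      { "type": "keyword" },
        "dst_port":    { "type": "integer" },
        "raw_message": { "type": "text", "index": false }
      }
    }
  }
}

The keyword fields get doc_values by default, which is what the mapping settings listed above are about: keeping aggregations on disk-backed doc_values rather than heap-hungry fielddata.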
For those who know how to use heavy equipment Photo credit: News Collection & Public Distribution @techpearce2 via Foter.com / CC BY
Long-term data storage – HDFS
Kafka offset management
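The slide itself gives no detail, but the usual pattern with the Kafka consumer API is to disable auto-commit and commit offsets only after a batch has been safely processed; a minimal Scala sketch, with placeholder broker, group and topic names:

import java.util.{Collections, Properties}
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")   // placeholder
props.put("group.id", "core-intel-matcher")      // placeholder
props.put("enable.auto.commit", "false")         // we decide when an offset counts as "done"
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(Collections.singletonList("network-logs"))

while (true) {
  val records = consumer.poll(1000L)
  records.asScala.foreach(r => println(s"${r.offset}: ${r.value}")) // process the batch here
  consumer.commitSync() // commit only after successful processing, so a crash replays the batch
}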
Advanced analytics
Core Intel allows users to perform advanced analytics on network logs using a set of powerful tools:
• Spark API to write code that processes large data sets on a cluster
  • perform complex aggregations to collect interesting statistics (a short example follows below)
  • run large-scale clustering algorithms with Spark's MLlib
  • run graph analyses on network logs using Spark's GraphX
  • transform and extract data for use in another system (which may be better suited for specific analytics or visualization purposes)
• Kafka, so you can write your own Consumers and Producers to work with your data
  • to perform streaming analysis on your data
  • to implement your own alerting logic
• Toolset
  • Programming languages: Scala, Java, Python
  • IDEs: Eclipse / Scala IDE, IPython Notebook and R Studio
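As a hedged sketch of the kind of aggregation meant here: the top (source IP, destination port) pairs over one day of logs. The path and field positions are hypothetical.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("top-talkers"))

// Hypothetical layout: tab-separated logs, source IP in field 1, destination port in field 4
val logs = sc.textFile("hdfs:///coreintel/network-logs/2017-05-18")
  .map(_.split("\t"))
  .filter(_.length > 4)

val topPairs = logs
  .map(f => ((f(1), f(4)), 1L))       // (srcIp, dstPort) -> count
  .reduceByKey(_ + _)
  .sortBy(_._2, ascending = false)
  .take(20)

topPairs.foreach { case ((srcIp, dstPort), count) =>
  println(s"$srcIp -> port $dstPort : $count connections")
}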
How do we schedule the jobs?
How to keep everything under control Photo credit: https://www.flickr.com/photos/martijn141
Monitoring crucial points in your data pipeline
Something for smart guys Photo credit: https://www.flickr.com/photos/jdhancock/5173498203/
Plenty of data to analyze
Challenges on the operations side. Are containers the answer?
OpenShift HA deployment http://playbooks-rhtconsulting.rhcloud.com/playbooks/installation/installation.html
OpenShift Architecture [diagram: tenant projects T1…Tn, each with its own namespace, VNID and pod subnet (10.1.x.2), attached to the OVS bridge br0 and carried between nodes over VXLAN VTEPs across the physical network (ISP ECF); per-tenant VLANs and node groups with affinity/anti-affinity rules, plus separate OSE masters and infra nodes]
OpenShift – Ingestion Layer
OpenShift – Ingestion Layer – Pros & Cons
• Rolling Update
• Triggers
• AutoScale
• Healthchecks (a minimal DeploymentConfig sketch covering these follows below)
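A minimal sketch of an OpenShift v3 DeploymentConfig showing where each of those features lives in the object model; every name, image, port and threshold below is hypothetical, not taken from the deck:

apiVersion: v1
kind: DeploymentConfig
metadata:
  name: ingest-collector              # hypothetical component name
spec:
  replicas: 3
  selector:
    app: ingest-collector
  strategy:
    type: Rolling                     # rolling updates
  triggers:
  - type: ConfigChange                # redeploy on configuration changes
  - type: ImageChange                 # redeploy when a new image is pushed
    imageChangeParams:
      automatic: true
      containerNames:
      - ingest-collector
      from:
        kind: ImageStreamTag
        name: ingest-collector:latest
  template:
    metadata:
      labels:
        app: ingest-collector
    spec:
      containers:
      - name: ingest-collector
        image: registry.example.com/coreintel/ingest-collector:latest   # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:               # healthcheck: only route traffic to ready pods
          httpGet:
            path: /healthz
            port: 8080
        livenessProbe:                # healthcheck: restart the container if it stops answering
          httpGet:
            path: /healthz
            port: 8080

Autoscaling is handled by a separate HorizontalPodAutoscaler that references this DeploymentConfig, e.g. oc autoscale dc/ingest-collector --min 3 --max 10 --cpu-percent=80 (the numbers again being placeholders).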
OpenShift – Elasticsearch Stack
OpenShift – Challenges • Persistent Storage • Rack Awareness http://dailypicksandflicks.com/2011/10/25/did-you-know-the-worlds-best-selling-toy/cat-with-rubiks-cube/
OpenShift – “PetSet” (Stateful Services)
OpenShift – Persistent Storage
OpenShift – Rack Awareness
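The deck does not show the settings here, but for an Elasticsearch cluster this is normally handled with shard allocation awareness; a sketch of the relevant elasticsearch.yml lines, where the attribute name and value are placeholders and, on a container platform, would be injected per node (e.g. from the node's rack/zone label):

# elasticsearch.yml on each data node
node.attr.rack_id: rack-a        # Elasticsearch 5.x syntax; 2.x used node.rack_id
cluster.routing.allocation.awareness.attributes: rack_id

With this in place, Elasticsearch avoids putting a primary shard and its replica on nodes that report the same rack_id.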
OpenShift – Capacity
Q&A
krzysztof.adamski@ingservicespolska.pl | @adamskikrzysiek | https://pl.linkedin.com/in/adamskikrzysztof
mariusz.derela@ingservicespolska.pl | @mariusz_derela | https://www.linkedin.com/in/mariusz-derela-30649a69