Chair of Network Architectures and Services
Department of Informatics
Technical University of Munich

MA Final Talk: Tamper-Evident Publication of Internet Measurements
Max Helm
Advisors: Oliver Gasser, Benjamin Hof, Quirin Scheitle
July 16, 2018

Motivation
• What?
  • Tamper-evident: cryptographically provable append-only property
  • Publish Internet measurements (e.g. certificates)
• Why?
  • Documentation of a measurement claim, timestamping service, attribution to a cryptographic identity: "I found and uploaded this specific certificate at this exact time"
  • Ease the joint use of different certificate sources: Certificate Transparency ↔ Active Scans [6]
  • Bind measurement results to their meta data
M. Helm — MA: Tamper-Evident Measurements 2

Motivation
Requirements:
• CT compatibility: Upload, download, proofs
• Extensibility: Add additional modules
• New spam protection mechanism

Related Work
• Data archival and reproducibility guidelines: BSI [2], TUM UB [5], Censys [3]
• Tamper-evident data structures:
  • Chain-based: Linear hash chains, ...
  • Tree-based: Merkle Trees, Certificate Transparency, Persistent Authenticated Dictionaries, ...
  • Merkle Tree extensions: Revocation Transparency [4], CT for DNSSEC [7], ...

Background
• Merkle Tree:
  • Binary hash tree (leaves are certificates, nodes are hashes of their children)
  • Root hash gets published
  • Efficient way of proving inclusion of certificates and consistency between tree versions
• Certificate Transparency (CT):
  • Public, tamper-evident logs of certificates
  • Everyone can submit certificates with a valid chain
  • Supplies simple HTTP(S) GET/POST endpoints for upload and download
  • Uses a Merkle tree to achieve the tamper-evident property

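The inclusion-proof property listed above can be illustrated with a minimal sketch. It assumes the hashing conventions of RFC 6962 (SHA-256, a 0x00 prefix for leaves, a 0x01 prefix for inner nodes, and a split at the largest power of two below the tree size). All function names are illustrative and are not taken from the talk or from trillian; consistency proofs are not covered.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Leaves and inner nodes use different prefixes (domain separation).
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_proof(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes for leaves[index], ordered from the leaf up to the root."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return inclusion_proof(leaves[:k], index) + [merkle_root(leaves[k:])]
    return inclusion_proof(leaves[k:], index - k) + [merkle_root(leaves[:k])]


def root_from_proof(leaf: bytes, index: int, size: int, proof: list[bytes]) -> bytes:
    """Recompute the root from a single leaf and its sibling hashes."""
    if size == 1:
        if proof:
            raise ValueError("proof too long")
        return leaf_hash(leaf)
    if not proof:
        raise ValueError("proof too short")
    k = split_point(size)
    sibling, rest = proof[-1], proof[:-1]
    if index < k:
        return node_hash(root_from_proof(leaf, index, k, rest), sibling)
    return node_hash(sibling, root_from_proof(leaf, index - k, size - k, rest))


def verify_inclusion(leaf: bytes, index: int, size: int,
                     proof: list[bytes], root: bytes) -> bool:
    try:
        return root_from_proof(leaf, index, size, proof) == root
    except ValueError:
        return False


certs = [b"cert 0", b"cert 1", b"cert 2", b"cert 3", b"cert 4"]
root = merkle_root(certs)
proof = inclusion_proof(certs, 2)
print(verify_inclusion(b"cert 2", 2, len(certs), proof, root))  # True
print(verify_inclusion(b"forged", 2, len(certs), proof, root))  # False
```

The verifier only needs the published root, the leaf, and a number of sibling hashes that grows logarithmically with the tree size, which is why the proof is efficient.
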
Approach
• Three leaf types:
  • Certificates (comparable to CT)
  • Scan Data (meta data about the whole scan as well as single scan items)
  • Derived Data (additional data, e.g. linter results for certificates)
• Use three Merkle trees in conjunction (additional data, keep CT compatibility)

Approach
Workflow for researchers:
• Upload scan meta data → Returns: Inclusion promise for scan + leaf hash of scan
• Upload scan results → Returns: Inclusion promise for each result
• Researcher has to store or publish inclusion promises → Possible to prove misbehavior of log
• Log triggers derived data collection

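The workflow above can be sketched with a toy in-memory log. Everything here is hypothetical: the class and field names, the use of an HMAC tag as a stand-in for the log's signature, and the choice to bind a result to its scan by prefixing the scan's leaf hash. The actual promise structure is the one shown in Figure 10 and is not reproduced here.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class InclusionPromise:
    leaf_hash: bytes
    timestamp: int
    tag: bytes  # stand-in for the log's signature (assumption)


class ToyLog:
    """Hypothetical in-memory log, not the trillian-based implementation."""

    def __init__(self, key: bytes):
        self._key = key
        self.leaves: set[bytes] = set()  # leaf hashes that were incorporated

    def _tag(self, leaf_hash: bytes, timestamp: int) -> bytes:
        msg = leaf_hash + timestamp.to_bytes(8, "big")
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def _accept(self, leaf: bytes, timestamp: int) -> InclusionPromise:
        h = hashlib.sha256(b"\x00" + leaf).digest()
        self.leaves.add(h)
        return InclusionPromise(h, timestamp, self._tag(h, timestamp))

    def add_scan(self, meta: bytes, timestamp: int):
        """Returns the inclusion promise and the leaf hash of the scan."""
        promise = self._accept(meta, timestamp)
        return promise, promise.leaf_hash

    def add_results(self, scan_leaf_hash: bytes, results: list[bytes], timestamp: int):
        """Returns one inclusion promise per result of an existing scan."""
        if scan_leaf_hash not in self.leaves:
            raise LookupError("unknown scan")
        return [self._accept(scan_leaf_hash + r, timestamp) for r in results]

    def promise_is_authentic(self, p: InclusionPromise) -> bool:
        return hmac.compare_digest(p.tag, self._tag(p.leaf_hash, p.timestamp))


def log_misbehaved(log: ToyLog, promise: InclusionPromise) -> bool:
    """An authentic promise whose leaf is missing is evidence against the log."""
    return log.promise_is_authentic(promise) and promise.leaf_hash not in log.leaves


log = ToyLog(key=b"demo key")
scan_promise, scan_hash = log.add_scan(b"scan 0 meta data", timestamp=1000)
result_promises = log.add_results(scan_hash, [b"cert 0", b"cert 1"], timestamp=1060)
print(any(log_misbehaved(log, p) for p in result_promises))  # False

log.leaves.discard(result_promises[0].leaf_hash)  # simulate a log dropping an entry
print(log_misbehaved(log, result_promises[0]))    # True
```

With a real public-key signature, anyone holding the stored promise could check it without the log's cooperation; the HMAC here only keeps the sketch within the standard library. In CT, a promise becomes evidence only after the maximum merge delay has passed, which this sketch ignores.
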
Design
Figure 1: Scan series tree (scan meta data). Leaves: scan 0, scan 1, scan 2.
Figure 3: Certificate tree (certificate data). Leaves: cert 0 to cert 3.
Figure 2: Derived data tree (additional data derived from certs). Leaves: derived 0 to derived 3.

Design
Figure 4: The three tree types (Scan Data, Certificates, Derived Data) and their final interconnections. Solid lines represent tamper-evident links, dashed lines represent non-tamper-evident links. Leaves shown: scan 0 to scan 2, cert 0 to cert 3, derived 0 to derived 3.

Implementation
• Based on github.com/google/trillian (log server) and github.com/google/certificate-transparency-go (log client)
• Extended server to support active scans with meta data and derived data
• Extended client to upload scan data and added different upload modes

Implementation
Figure 5: Overview of the implementation of the trillian personality. Labels in the figure: add-scan, add-cert, add-derived, createTree, verifyPGP, addToCache, checkCache, queueLeaf, startDerivation, Tree Mode, gRPC, DB, GPG Public Key Ring, Memcached Instance.

Implementation
Big scans cause problems (protobuf restrictions)
⇒ Two modes, depending on the size of the scan:
• Default (scans < 5M entries):
  • Three static trees
  • One scan consists of one node of the scan tree
• Dynamic Tree Generation (scans ≥ 5M entries):
  • Three static trees + dynamically generated trees
  • Each scan triggers the generation of a new tree
  • Those dynamic trees are sub trees of the scan tree
  • One scan consists of one node of the scan tree + one dynamic tree

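The choice between the two modes can be sketched as a simple threshold check. The 5M figure is from the slide; treating exactly 5M entries as "dynamic", and the names `UploadPlan` and `plan_scan_upload`, are assumptions made for illustration.

```python
from dataclasses import dataclass

DYNAMIC_TREE_THRESHOLD = 5_000_000  # entries per scan, from the slide


@dataclass(frozen=True)
class UploadPlan:
    mode: str
    scan_tree_nodes: int  # nodes added to the static scan tree
    dynamic_trees: int    # trees generated for this scan


def plan_scan_upload(num_entries: int) -> UploadPlan:
    """Pick the upload mode for one scan (boundary handling is an assumption)."""
    if num_entries < 0:
        raise ValueError("num_entries must be non-negative")
    if num_entries < DYNAMIC_TREE_THRESHOLD:
        # Default: the whole scan is one node of the scan tree.
        return UploadPlan("default", scan_tree_nodes=1, dynamic_trees=0)
    # Dynamic tree generation: one scan tree node plus one tree for the entries.
    return UploadPlan("dynamic", scan_tree_nodes=1, dynamic_trees=1)


print(plan_scan_upload(1_000_000).mode)  # default
print(plan_scan_upload(8_000_000).mode)  # dynamic
```
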
Implementation
Figure 6: The scan tree on top (scan 0, scan 1, scan 2) with three dynamically generated sub trees, one for each scan in the top tree. Each sub tree holds datum 0 to datum 3.

Implementation
Different upload modes for certificates:
• Default (like in CT):
  • One cert per request
• Batches:
  • 1,000 certs per request
• Concurrent batches:
  • Parallel upload of batches

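The three upload modes can be sketched as follows. The batch size of 1,000 is from the slide. The `send` callable stands in for one HTTP request to the log and is hypothetical, as is the worker count; the actual client is written in Go on top of certificate-transparency-go.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Sequence

BATCH_SIZE = 1000  # certs per request in the batch modes, from the slide

Send = Callable[[Sequence[bytes]], int]  # one request; returns number of certs accepted


def batches(certs: Sequence[bytes], size: int = BATCH_SIZE) -> Iterator[Sequence[bytes]]:
    for start in range(0, len(certs), size):
        yield certs[start:start + size]


def upload_default(certs: Sequence[bytes], send: Send) -> int:
    # Default (like in CT): one cert per request.
    return sum(send([cert]) for cert in certs)


def upload_batched(certs: Sequence[bytes], send: Send) -> int:
    # Batches: up to BATCH_SIZE certs per request, sent one after another.
    return sum(send(batch) for batch in batches(certs))


def upload_concurrent(certs: Sequence[bytes], send: Send, workers: int = 4) -> int:
    # Concurrent batches: the same batches, uploaded in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send, batches(certs)))


sent_sizes: list[int] = []  # records the size of every simulated request


def fake_send(batch: Sequence[bytes]) -> int:
    sent_sizes.append(len(batch))  # list.append is thread-safe in CPython
    return len(batch)


certs = [f"cert {i}".encode() for i in range(2500)]
print([len(b) for b in batches(certs)])     # [1000, 1000, 500]
print(upload_concurrent(certs, fake_send))  # 2500
```

Batching reduces the number of requests from one per certificate to one per thousand certificates; running the batches concurrently then overlaps the waiting time of the individual requests.
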
Evaluation
Setup:
• Single VM
• Debian Jessie, 64GB RAM, 8 × 2.67GHz cores
Scan upload (1M entries per scan):
• Default: 4 minutes
• Dynamic Trees: 165 minutes
Certificate upload (1M certs):
• Default (like in CT): 660 minutes
• Batches: 161 minutes
• Concurrent batches: 70 minutes

Evaluation
Figure 7: Measurement results for the scan upload in the default mode. (a) Memory and CPU usage. (b) Network usage and number of unsequenced rows.
• Short upload time
• Not resource bound

Evaluation
Figure 8: Measurement results for the scan upload in the dynamic trees mode. (a) Memory and CPU usage. (b) Network usage and number of unsequenced rows.
• Upload takes more time
• Batch pattern in unsequenced rows

Evaluation
Figure 9: CT logs over time. (a) Rocketeer log size over time [1]. (b) Argon 2017 log size over time [1].
• Rocketeer: mean of 300,000 per day
• Argon 2017: maximum of 2,100,000 per day
⇒ Only estimation of lower bound

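To relate these growth rates to the measured certificate upload times (1M certs in 660, 161, and 70 minutes), the arithmetic can be sketched as below. Extrapolating a 1M-cert run to a full day assumes a constant upload rate, which the talk does not claim, so the figures are rough.

```python
MINUTES_PER_DAY = 24 * 60
CERTS_UPLOADED = 1_000_000

# Measured upload times for 1M certs, in minutes (from the evaluation).
upload_minutes = {"default": 660, "batches": 161, "concurrent batches": 70}

# Growth of the two CT logs in entries per day (from Figure 9).
ct_growth_per_day = {"Rocketeer (mean)": 300_000, "Argon 2017 (maximum)": 2_100_000}


def certs_per_day(minutes: float) -> float:
    """Extrapolated daily throughput, assuming a constant upload rate."""
    return CERTS_UPLOADED / minutes * MINUTES_PER_DAY


for mode, minutes in upload_minutes.items():
    per_day = certs_per_day(minutes)
    ratios = ", ".join(
        f"{per_day / growth:.1f}x {name}" for name, growth in ct_growth_per_day.items()
    )
    print(f"{mode}: {per_day:,.0f} certs/day ({ratios})")
```

Under this extrapolation the default mode reaches about 2.18 million certs per day, only slightly above the Argon 2017 maximum, while concurrent batches reach about 20.6 million per day.
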
Future Work
• Improve performance:
  • Timeout issues of implementation
  • Contexts get canceled
• Gossiping:
  • Security goals like tamper-evidence, spam protection, accountability reached
  • Other goals require lots of assumptions → e.g. no split view attacks

Conclusion
• CT compatibility: Upload ✗, download ✓, proof ✓
• Extensibility: Add derived module via git link ✓
• New spam protection mechanism: GPG ✓
Questions?

Backup
Implementation details
• trillian implements a highly scalable Merkle tree in Go
• certificate-transparency-go implements the CT personality
• Altered CT personality to work with arbitrary data (not only certs)
• Added second tree, communication between trees
• Removed: root store contains valid root certs
• Added: PGP signature verification on scan upload
• User store: implemented as public key ring file
• Memcache for caching of cert hashes
• Only accept cert uploads where cert is part of existing scan

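The admission rules in this list (signature check on scan upload, cached cert hashes, cert uploads only for known scans) can be sketched as follows. This is a toy model: the class name, the `verify_signature` callable standing in for PGP verification against the key ring, the in-memory set standing in for memcached, and the assumption that a scan upload carries the hashes of its certificates are all hypothetical.

```python
import hashlib
from typing import Callable, Iterable

Verify = Callable[[bytes, bytes], bool]  # (scan meta data, signature) -> valid?


class ToyPersonality:
    """Hypothetical admission logic, not the actual trillian personality."""

    def __init__(self, verify_signature: Verify):
        self._verify = verify_signature
        self._cert_hash_cache: set[bytes] = set()   # stand-in for memcached
        self.queue: list[tuple[str, bytes]] = []    # stand-in for queued leaves

    def add_scan(self, meta: bytes, signature: bytes, cert_hashes: Iterable[bytes]) -> None:
        # Spam protection: only scans signed by a known contributor are accepted.
        if not self._verify(meta, signature):
            raise PermissionError("scan signature not accepted")
        self._cert_hash_cache.update(cert_hashes)
        self.queue.append(("scan", meta))

    def add_cert(self, cert: bytes) -> bytes:
        # Only accept certs that are part of an already uploaded scan.
        cert_hash = hashlib.sha256(cert).digest()
        if cert_hash not in self._cert_hash_cache:
            raise LookupError("cert is not part of an existing scan")
        self.queue.append(("cert", cert))
        return cert_hash


def demo_verify(meta: bytes, signature: bytes) -> bool:
    # Placeholder check; a real deployment would verify a PGP signature.
    return signature == b"signed:" + meta


personality = ToyPersonality(demo_verify)
cert = b"cert 0"
personality.add_scan(b"scan 0", b"signed:scan 0", [hashlib.sha256(cert).digest()])
personality.add_cert(cert)
print([kind for kind, _ in personality.queue])  # ['scan', 'cert']
```
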
Backup
Design details
Roles:
• Log owner (create and delete log, add admins)
• Log admin (freeze log, add contributors)
• Log contributor (upload data)
• Log user (download data)
Attacker model:
• External attacker: User of or contributor to log → Alter (own) contributions
• Internal attacker: Owner or admin of the log → Alter contributions to fit long-term measurement results
• Internal data corruption: Bit flips, accidental overwrites, ...

Backup
Figure 10: Structure of an inclusion promise.

Bibliography
[1] CT Monitor. ct.grahamedgecombe.com. Accessed: 2018-07-15.
[2] BSI. M 4.170. https://www.bsi.bund.de/DE/Themen/ITGrundschutz/ITGrundschutzKataloge/Inhalt/_content/m/m04/m04170.html.
[3] Z. Durumeric, D. Adrian, A. Mirian, M. Bailey, and J. A. Halderman. A search engine backed by Internet-wide scanning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 542–553. ACM, 2015.
[4] B. Laurie and E. Kasper. Revocation transparency. Google Research, September 2012.
[5] TUM UB. Forschungsdatenmanagement. https://www.ub.tum.de/forschungsdaten-archivieren.
[6] B. VanderSloot, J. Amann, M. Bernhard, Z. Durumeric, M. Bailey, and J. A. Halderman. Towards a Complete View of the Certificate Ecosystem. In Proceedings of the 2016 ACM on Internet Measurement Conference, pages 543–549. ACM, 2016.
[7] D. Zhang. Certificate Transparency for Domain Name System Security Extensions. Work in progress, 2016. https://tools.ietf.org/html/draft-zhang-trans-ct-dnssec-03.