
How risky is the software you use? (Cyber Independent Testing Lab)



  1. How risky is the software you use?
     Cyber Independent Testing Lab
     { Sarah Zatko, Tim Carstens, Patrick Stach, Parker Thompson, mudge } @ CITL
     https://shmoo18.cyber-itl.org

  2. We are CITL
     • A non-profit organization based in the USA
     • Founded by Sarah Zatko & mudge
     • Mission: to improve the state of software security by providing the public with accurate reporting on the security of popular software
     • Funding from the Ford Foundation
     • Partners with Consumer Reports https://www.consumerreports.org & The Digital Standard https://thedigitalstandard.org

  3. Something like this, but for software security.

  4. How do you do this for software security?

  5. Scores & Histograms
     [Figure: score histograms for Hardened Gentoo, Ubuntu 16 LTS, Samsung UN55KS9000, and LG 55UH8500]

  6. Security Today: You can lead the pack by mastering the fundamentals.

     | | Vizio P55-E1 | LG 49UJ7700 | Samsung UN55KS9000 | Ubuntu 16.04 |
     |---|---|---|---|---|
     | # binaries | 504 | 1740 | 4243 | 4991 |
     | aslr | 98% | 67% | 80% | 100% |
     | stack DEP | 99% | 99%* | 99%* | 99% |
     | 64 bit | 0% | 0% | 0% | 98% |
     | RELRO | 100% | 4% | 9% | 96% |
     | stack guards | 68% | 1% | 57% | 79% |
     | fully fortified | 7% | 0% | 6% | 11% |
     | partial fort | 43% | 1% | 37% | 42% |
     | has good | 3% | 3% | 25% | 4% |
     | has risky | 68% | 66% | 67% | 67% |
     | has bad | 28% | 34% | 23% | 28% |
     | has ick | 3% | 5% | 5% | 3% |

  7. Our goals
     1. Remain independent of vendor influence
     2. Automated, comparable, quantitative analysis
     3. Act as a user watchdog
     • Non-goal: find and disclose vulnerabilities
     • Non-goal: tell software vendors what to do
     • Non-goal: perform free security testing for vendors

  8. Three big questions
     1. What works?
     2. How do you recognize when it’s being done?
     3. Who’s doing it?

  9. The basic idea

  10. Information Theory Perspective
      • Given a piece of software, we can ask
        1. Overall, how secure is it?
        2. What are all of its vulnerabilities?
      • (1) appears to ask for less information than (2)
      • Our question: develop a heuristic which can efficiently answer (1) but not necessarily (2)

  11. Step One: Static Measurements
      • Complexity
      • Functions called
      • Safety features
      Years in the field give us a good starting point: look for the same things we’d look at when trying to pick a soft target to exploit. But this field doesn’t know enough about the impact and effectiveness of best practices.
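The per-feature percentages reported on the later slides can be read as simple rates over a set of binaries. The following is a minimal sketch of that aggregation, not CITL's tooling: the `BinaryReport` type, its field names, and the sample paths are all hypothetical, and extracting the flags from real binaries is out of scope here.

```python
from dataclasses import dataclass


@dataclass
class BinaryReport:
    """Static measurements for one binary (hypothetical field names)."""
    path: str
    aslr: bool
    dep: bool
    stack_guards: bool


def feature_rates(reports):
    """Return the percentage of binaries that have each safety feature."""
    features = ("aslr", "dep", "stack_guards")
    total = len(reports)
    if total == 0:
        return {name: 0.0 for name in features}
    return {
        name: 100.0 * sum(getattr(r, name) for r in reports) / total
        for name in features
    }


# Made-up sample data, for illustration only.
reports = [
    BinaryReport("/bin/a", aslr=True, dep=True, stack_guards=True),
    BinaryReport("/bin/b", aslr=True, dep=True, stack_guards=False),
    BinaryReport("/bin/c", aslr=False, dep=True, stack_guards=False),
    BinaryReport("/bin/d", aslr=True, dep=True, stack_guards=True),
]
rates = feature_rates(reports)
print(rates)
```

Keeping one record per binary means the same data can feed both the per-feature rates and any per-binary score.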

  12. Early Promise

      | Browser | “Underground” exploit price |
      |---|---|
      | Microsoft Edge | $80,000 |
      | Google Chrome | $80,000 |
      | Apple Safari | $50,000 |
      | Mozilla Firefox | $30,000 |

  13. Step 2: Fuzzing! Lots of it.
      • Fuzzing provides a testable, recognized way to roughly measure software’s “security”
      • The more robust software is when fuzzed, the less likely it is to be exploitable
      • If we could fuzz everything, we wouldn’t even necessarily need the heuristics
      • But we can’t, so

  14. Step 3: Profit! Bayes! (1/3)
      • For some software $s$, we know that we can’t compute $P(s \text{ is secure})$
      • As a surrogate, we can compute probabilities of different fuzzing outcomes, like:
        $P_{h,k} = P(h \text{ units of fuzzing against } s \text{ yields} < k \text{ unique crashes})$

  15. Step 3: Profit! Bayes! (2/3)
      • Fuzzing is expensive, so we “go Bayesian”
      • Let $M$ be an observable property of software
        • Examples: is compatible with RELRO, has “low complexity,” etc.
      • For random $s$ in $S$, consider the conditional probabilities
        $P_{h,k}(M) = P(h \text{ fuzzing on } s \text{ yields} < k \text{ unique crashes} \mid M \text{ is true of } s)$
      • What we want: Which $M$ have $P_{h,k}(M) > 0.5$ for large $\log(h)/k$? Which indicators ($M$) can be used to predict fuzzing performance?
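One way to read the conditional probability above is as an empirical frequency over software that has already been fuzzed. The sketch below assumes every record was fuzzed for the same budget h. The record layout, the indicator names, and the crash counts are invented for illustration and are not CITL data or CITL's model.

```python
def estimate_p_hk(records, k, indicator):
    """Empirical estimate of P_{h,k}(M).

    records: iterable of (indicators, unique_crashes) pairs, one per fuzzed
    target, all fuzzed for the same budget h (an assumption of this sketch).
    Returns the fraction of targets having `indicator` that produced fewer
    than k unique crashes, or None if no target has the indicator.
    """
    crashes_with_m = [
        crashes for indicators, crashes in records if indicator in indicators
    ]
    if not crashes_with_m:
        return None
    return sum(1 for c in crashes_with_m if c < k) / len(crashes_with_m)


# Made-up fuzzing outcomes: (set of indicators M true of s, unique crashes).
records = [
    ({"relro", "low_complexity"}, 0),
    ({"relro"}, 1),
    ({"relro"}, 7),
    ({"low_complexity"}, 2),
    (set(), 12),
    ({"relro", "low_complexity"}, 3),
]
p_relro = estimate_p_hk(records, k=3, indicator="relro")
p_lowc = estimate_p_hk(records, k=3, indicator="low_complexity")
print(p_relro, p_lowc)
```

An indicator whose estimate stays above 0.5 as h grows and k shrinks would be a candidate predictor in the sense the slide describes. With a sample this small the estimate carries no statistical weight.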

  16. Step 3: Profit! Bayes! (3/3)
      Indicators might not be causal, and that’s OK:
      • It could be that $M$’s presence literally prevents crashes
      • But it could also be that $M$ is mostly only found in software written by teams who ship reliable software
      • If you’re looking for security, what difference does it make?

  17. Indicator Minerals
      Want to find:
      • Diamond (US Geological Survey)
      Look for:
      • Garnet (Moha112100 @ Wikipedia)
      • Diopside (Rob Lavinsky)
      • Chromite (Weinrich Minerals, Inc.)

  18. Step 4: Reports
      While we work on gathering data and developing our model, we’re also
      • Developing reports
      • Building relationships with partner organizations like Consumer Reports
      • Looking for security orgs to share data with

  19. The Progression of CITL Tech
      [Figure: timeline with the labels Static (Prototype), Static (Extensible); AFL, CITL-fuzz, NEW FUZZER; First reports, First Data, Final Model & Reports; and a “Today” marker]

  20. Applied Static Analysis
      • Lots of architectures: x86-*, ARM-*, MIPS-*
      • Lots of operating systems: Windows, Linux, OS X
      • Lots of binary formats: PE, ELF, MachO
      • Each with their own app-armoring features
      • Lots of versions of each of the above!

  21. OS Comparisons
      • Windows lags in stack guards, but has good usage of CFI
      • Linux does more source fortification than OSX
      • Windows has the best function hygiene
      • Linux’s function hygiene is slightly worse than OSX’s

      | | Ubuntu 16.04 | Windows 10 | OSX 10.13.1 |
      |---|---|---|---|
      | 64 bit | 97% | 66% | 77% |
      | aslr | 100% | 99% | 100% |
      | dep | 99% | 98% | 100% |
      | stack_guards | 79% | 40% | 73% |
      | fully fortified | 11% | | 2% |
      | partial fort | 42% | | 33% |
      | cfi | | 92% | |
      | good | 4% | 19% | 29% |
      | risky | 67% | 30% | 60% |
      | bad | 28% | 3% | 24% |
      | ick | 3% | 0% | 2% |

  22. Linux Browsers – Ubuntu 16.04
      • Scores are all very close, Firefox wins by a nose in static analysis
      • Chrome’s sandbox isn’t factored into score yet
      • All have inconsistent function hygiene
      • Opera takes a hit for lack of RELRO
      • Chrome lags behind in fortification use

      | | Chrome | Firefox | Opera |
      |---|---|---|---|
      | version | 63.0.3239.13 | 57.0.4 | 50.0.2762.4 |
      | 64bit | 100% | 100% | 100% |
      | aslr | 100% | 100% | 100% |
      | dep | 100% | 100% | 100% |
      | relro | 86% | 100% | 11% |
      | stack_guards | 86% | 87% | 100% |
      | partial fortification | 29% | 70% | 56% |
      | functions: good | 12% | 4% | 22% |
      | functions: risky | 86% | 91% | 100% |
      | functions: bad | 62% | 61% | 89% |
      | score, 5th % | 35 | 64 | 43 |
      | score, 50th % | 58 | 78 | 48 |
      | score, 95th % | 71 | 86 | 65 |
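The 5th, 50th, and 95th percentile rows summarize the distribution of per-binary scores for each product. The slides do not say how the percentiles are computed, so the sketch below is only one plausible reading: it uses Python's `statistics.quantiles` with the inclusive method, and the sample scores are invented.

```python
import statistics


def score_percentiles(scores):
    """Summarize per-binary scores by their 5th, 50th, and 95th percentiles.

    Uses the 'inclusive' method, which treats the sample as the whole
    population (a choice made for this sketch, not taken from the slides).
    Needs at least two scores.
    """
    # quantiles(n=100) returns 99 cut points; cut point i is at index i - 1.
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    return {"5th": cuts[4], "50th": cuts[49], "95th": cuts[94]}


# Made-up per-binary scores for one hypothetical product.
sample_scores = [35, 41, 47, 52, 58, 58, 60, 63, 66, 71]
print(score_percentiles(sample_scores))
```

Reporting three percentiles rather than a mean keeps the weakest binaries visible, which matters when a single poorly hardened binary can be the soft target.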

  23. OSX Browsers
      • Firefox and Opera had all binaries 64 bit with ASLR, Stack DEP
      • Firefox also made most use of stack guards and fortification
      • Chrome is the only one to enable Heap protection flag
      • Safari isn’t using source fortification much
      • Scores are very close, all near 95th percentile for High Sierra (71)
      • Same general outcome as in Linux

      | | Chrome | Firefox | Opera | Safari |
      |---|---|---|---|---|
      | version | 63.0.3239.13 | 57.0.4 | 50.0.2762.45 | 11.0.1 |
      | count | 9 | 19 | 8 | 25 |
      | 64bit | 89% | 100% | 100% | 88% |
      | aslr | 89% | 100% | 100% | 100% |
      | dep | 100% | 100% | 100% | 100% |
      | heap | 11% | 0% | 0% | 0% |
      | stack_guards | 78% | 95% | 88% | 68% |
      | partial fortification | 33% | 47% | 38% | 4% |
      | good | 33% | 37% | 25% | 8% |
      | risky | 89% | 95% | 100% | 44% |
      | bad | 44% | 68% | 38% | 8% |
      | score, 5th % | 33 | 43 | 38 | 24 |
      | score, 50th % | 51 | 56 | 51 | 51 |
      | score, 95th % | 63 | 71 | 63 | 64 |

  24. Windows 10 Browsers
      • Scores are very close, but Edge wins by a hair
      • 95th percentile is 64 for Win 10
      • Chrome has more 32 bit binaries than the others
      • Edge is the only one with 100% CFI
      • Chrome and Opera do better on stack guards
      • Firefox takes a hit because it excels in neither, has more risky functions

      | | Chrome | Edge | Firefox | Opera |
      |---|---|---|---|---|
      | version | 63.0.3239 | 41.16299 | 57.0.4 | 50.0.2762 |
      | count | 31 | 7 | 31 | 16 |
      | 64bit | 62% | 100% | 94% | 100% |
      | dep | 100% | 100% | 100% | 100% |
      | aslr | 100% | 100% | 100% | 100% |
      | cfi | 13% | 100% | 13% | 38% |
      | stack guards | 94% | 57% | 61% | 94% |
      | functions: good | 0% | 0% | 3% | 0% |
      | functions: risky | 9% | 0% | 16% | 0% |
      | functions: bad | 9% | 0% | 0% | 0% |
      | score, 5th % | 23 | 44 | 7.5 | 44 |
      | score, 50th % | 44 | 64 | 44 | 44 |
      | score, 95th % | 64 | 64 | 44 | 64 |

  25. OSX Time Progression
      • Looked at four versions from 10.10.5 through 10.13.1
      • 7.7% increase in percent of binaries that are 64 bit
      • 2% increase in use of stack guards, good functions
      • Heap protection decrease correlates with ASLR increase?
      • High Sierra shows significant decrease in # of binaries (~400 fewer)

      | | OSX 10.10.5 | OSX 10.11.6 | OSX 10.12.6 | OSX 10.13.1 | total change |
      |---|---|---|---|---|---|
      | # binaries | 6449 | 6456 | 7017 | 6622 | |
      | 64bit | 69% | 71% | 73% | 77% | +8 |
      | aslr | 99% | 99% | 100%* | 100%* | +1 |
      | heap | 5% | 5% | 4% | 4% | -1 |
      | stack_guards | 71% | 71% | 72% | 73% | +2 |
      | good functions | 27% | 27% | 27% | 29% | +2 |
      | risky functions | 62% | 62% | 60% | 60% | -2 |
      | bad functions | 25% | 25% | 24% | 24% | -1 |

  26. Safari Time Progression
      • New binaries introduced in High Sierra generally decreased performance
      • Overall increases in 64bit and stack guards, but not consistently
      • Function hygiene got a bit worse, especially in High Sierra
      • Partial source fortification introduced in HS

      | Safari in OSX | 10.10.5 | 10.11.6 | 10.12.6 | 10.13.1 | total change |
      |---|---|---|---|---|---|
      | # binaries | 9 | 13 | 22 | 25 | |
      | 64bit | 83% | 92% | 86% | 88% | +5* |
      | stack_guards | 67% | 69% | 73% | 68% | +1 |
      | partial fortification | 0% | 0% | 0% | 4% | +4 |
      | good functions | 17% | 15% | 9% | 8% | -9 |
      | risky | 50% | 36% | 38% | 44% | -6* |
      | bad | 0% | 8% | 5% | 8% | +8 |

  27. Mining Useful Spectre Gadgets
      • Focus on BTB poisoning aka Variant 2 widgets
      • Use DFA to locate this pattern:
        • Op reg1,[base (+index)]
          • Base or Index either attacker controlled or useful data
        • … (anything that doesn’t destroy data in reg1)
        • Op [base (+index)],reg2 or Op reg2,[base (+index)]
          • Where base or index are reg1
      • Tl;dr: load, load or store
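The load, then load-or-store pattern above can be sketched as a scan over an already-decoded instruction sequence. This is a simplified, straight-line illustration and not CITL's analysis: the `Insn` record and its fields are hypothetical, there is no disassembler or control-flow graph, and the "attacker controlled or useful data" condition on the first load is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Insn:
    """A decoded instruction (hypothetical, heavily simplified model)."""
    text: str
    dst: Optional[str] = None        # register overwritten, if any
    addr_regs: Tuple[str, ...] = ()  # registers forming [base (+index)]
    reads_mem: bool = False          # True when the memory operand is a source


def find_gadgets(insns):
    """Return (i, j) pairs where insns[i] loads reg1 from memory and
    insns[j] is the first later memory access using reg1 as base or index,
    with reg1 not overwritten in between."""
    gadgets = []
    for i, first in enumerate(insns):
        if not (first.reads_mem and first.dst and first.addr_regs):
            continue  # not "Op reg1,[base (+index)]"
        reg1 = first.dst
        for j in range(i + 1, len(insns)):
            later = insns[j]
            if reg1 in later.addr_regs:
                gadgets.append((i, j))  # dependent load or store found
                break
            if later.dst == reg1:
                break  # data in reg1 was destroyed before being used
    return gadgets


# Made-up instruction sequence for illustration.
insns = [
    Insn("mov rax, [rdi+rsi]", dst="rax", addr_regs=("rdi", "rsi"), reads_mem=True),
    Insn("add rcx, 1", dst="rcx"),
    Insn("mov rdx, [rbx+rax]", dst="rdx", addr_regs=("rbx", "rax"), reads_mem=True),
    Insn("mov rdx, 0", dst="rdx"),
    Insn("mov [rdx], rcx", addr_regs=("rdx",)),
]
print(find_gadgets(insns))
```

In the sample, the load into `rax` feeds the address of the next load, so it is reported. The load into `rdx` is not, because `rdx` is overwritten before the store uses it.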

  28. Mining Useful Spectre Gadgets

  29. CITL: Impact
      • We’ve been reporting bugs
        • Firefox on OSX was missing ASLR (they fixed it quick!)
        • Several patches & bugs submitted to LLVM & Qemu
      • We’ve inspired others
        • Big shout-out to the Fedora Red Team
      • We’ve partnered to cover broader domains
        • Consumer Reports https://www.consumerreports.org
        • The Digital Standard https://thedigitalstandard.org
