Quantitative Security
Colorado State University
Yashwant K Malaiya
CS 559: Vulnerability Discovery Models
CSU Cybersecurity Center, Computer Science Dept
Modeling Vulnerability Discovery
• Quantitative Vulnerability Assessment (Alhazmi, 2004-2008)
• Seasonality in Vulnerability Discovery (Joh, 2008, 2009)
• Discovery in Multi-Version Software (Kim, 2006, 2007)
Vulnerabilities
Motivation
• For defects: reliability modeling and software reliability growth models (SRGMs) have been around for decades.
• Treating vulnerabilities as a special class of faults raises a question:
  – To what degree are reliability terms and models applicable to vulnerabilities and security? [Littlewood et al.]
  – The need for quantitative measurement and estimation is becoming more crucial.
Goal: Modeling Vulnerability Discovery
• Develop a quantitative model to estimate vulnerability discovery.
  – Using calendar time.
  – Using equivalent effort.
• Validate these measurements and models.
  – Testing the models using available data
• Identify security assessment metrics
  – Vulnerability density
  – Vulnerability to total defect ratio
Time–vulnerability discovery model
• What factors impact the discovery process?
  – The changing environment
    • The share of installed base.
    • Global internet users.
  – Discovery effort
    • Discoverers: developers, white hats or black hats.
    • Discovery effort is proportional to the installed base over time.
    • Vulnerability finders' reward: greater rewards, higher motivation.
  – Security level desired for the system
    • Server or client
Time–vulnerability discovery model
• Each vulnerability is recorded.
  – Available [NVD, vendor, etc.].
  – Needs compilation and filtering.
• Data show three phases for an OS.
• Assumptions:
  – The discovery is driven by the rewards factor.
  – Influenced by the change of market share.
[Figure: vulnerabilities vs. time, showing Phase 1, Phase 2 and Phase 3]
Time–vulnerability discovery model
• Three-phase, S-shaped model: $\frac{dy}{dt} = A y (B - y)$, with solution $y = \frac{B}{B C e^{-ABt} + 1}$
• Phase 1: installed base low.
• Phase 2: installed base higher and growing/stable.
• Phase 3: installed base dropping.
[Figure: vulnerability time growth model, vulnerabilities vs. time]
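The closed form follows from the rate equation by separation of variables. The sketch below fills in that step; the integration constant $k$ and the midpoint time $t_m$ are symbols introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
\frac{dy}{dt} &= A\,y\,(B - y)
  && \text{rate equation} \\
\frac{dy}{y\,(B - y)} = A\,dt
  \;&\Longrightarrow\;
  \frac{1}{B}\ln\frac{y}{B - y} = A t + k
  && \text{partial fractions, } k \text{ constant} \\
y(t) &= \frac{B}{B C e^{-ABt} + 1}
  && \text{writing } e^{-Bk} = BC \\
y(0) &= \frac{B}{BC + 1}, \qquad \lim_{t \to \infty} y(t) = B
  && \text{start value and saturation} \\
y(t_m) &= \frac{B}{2} \;\text{ at }\; t_m = \frac{\ln(BC)}{AB},
  \qquad \left.\frac{dy}{dt}\right|_{t_m} = \frac{A B^2}{4}
  && \text{peak discovery rate}
\end{align*}
```

Read this way, $B$ is the total number of vulnerabilities eventually found, and the three phases correspond to the slow start, the steep middle around $t_m$, and the saturating tail of the S-curve.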
AML Discovery model
• Alhazmi Malaiya Logistic model (AML): $\frac{dy}{dt} = A y (B - y)$
• Solution: $y = \frac{B}{B C e^{-ABt} + 1}$
[Figure: vulnerability time growth model, vulnerabilities vs. time]
O. H. Alhazmi and Y. K. Malaiya, "Quantitative Vulnerability Assessment of Systems Software," Proc. Ann. IEEE Reliability and Maintainability Symp., 2005, pp. 615-620
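A minimal Python sketch of the AML curve may make the formula easier to experiment with. The function names and the parameter values are hypothetical, chosen only for illustration; the finite-difference check simply confirms numerically that the closed form satisfies dy/dt = A y (B - y).

```python
import math

def aml(t, A, B, C):
    """Cumulative vulnerabilities y(t) = B / (B*C*exp(-A*B*t) + 1)."""
    return B / (B * C * math.exp(-A * B * t) + 1.0)

def aml_rate(t, A, B, C):
    """Discovery rate from the rate equation dy/dt = A * y * (B - y)."""
    y = aml(t, A, B, C)
    return A * y * (B - y)

# Hypothetical parameters, not fitted to any real system.
A, B, C = 0.002, 50.0, 0.4

# Central finite difference of the closed form at one time point.
t, h = 12.0, 1e-5
numeric_rate = (aml(t + h, A, B, C) - aml(t - h, A, B, C)) / (2.0 * h)
analytic_rate = aml_rate(t, A, B, C)

start_value = aml(0.0, A, B, C)     # equals B / (B*C + 1)
saturation = aml(1.0e6, A, B, C)    # approaches B for large t
```

Comparing `numeric_rate` with `analytic_rate` is a quick sanity check to run whenever the formula is retyped into a spreadsheet or script.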
Time–based model: Windows 98
• Fitted parameters: A = 0.004873, B = 37.7328, C = 0.5543
• Goodness of fit: $\chi^2$ = 7.365, $\chi^2_{critical}$ = 60.481, P-value = $1 - 7.6 \times 10^{-11}$
[Figure: Windows 98 total vulnerabilities and fitted curve, Jan-99 to Sep-02]
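As a worked example, the fitted Windows 98 values reported on this slide can be plugged into the AML formula. The sketch below is an illustration, not the authors' procedure. It assumes t is measured from the start of the data in the units used for the fit (the plot axis is monthly, so presumably months), and it assumes the usual convention that a fit is not rejected when the chi-square statistic is below the critical value.

```python
import math

# Fitted Windows 98 values as reported on the slide.
A, B, C = 0.004873, 37.7328, 0.5543
CHI2, CHI2_CRITICAL = 7.365, 60.481

def aml(t):
    """AML curve with the Windows 98 parameters."""
    return B / (B * C * math.exp(-A * B * t) + 1.0)

y_start = aml(0.0)                         # model value at t = 0
t_mid = math.log(B * C) / (A * B)          # time at which y reaches B / 2
peak_rate = A * B * B / 4.0                # discovery rate at t_mid
fit_not_rejected = CHI2 < CHI2_CRITICAL    # assumed acceptance rule
```

With these parameters the curve starts near 1.7 vulnerabilities, reaches half of B after roughly 16.5 time units, and saturates at B of about 37.7.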
Time–based model: Windows NT 4.0
• Fitted parameters: A = 0.000692, B = 136, C = 0.52288
• Goodness of fit: $\chi^2$ = 35.584, $\chi^2_{critical}$ = 103.01, P-value = 0.9999973
[Figure: Windows NT 4.0 total vulnerabilities and fitted curve, Aug-96 to Apr-03]
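The chi-square figures quoted for these fits can be reproduced in outline as follows. This is a sketch assuming the Pearson statistic and the usual rule that the fit is not rejected when the statistic is below the critical value. The observed and expected counts are hypothetical; the critical value is taken as an input, since it depends on the degrees of freedom and significance level, which the slide does not state.

```python
def chi_square(observed, expected):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def fit_not_rejected(statistic, critical_value):
    """Assumed decision rule: accept when the statistic is below the critical value."""
    return statistic < critical_value

# Hypothetical cumulative counts versus model predictions.
observed = [2, 5, 9, 14]
expected = [2.5, 4.5, 10.0, 13.0]
stat = chi_square(observed, expected)

# Values reported on the slide for Windows NT 4.0.
nt4_ok = fit_not_rejected(35.584, 103.01)
```

A statistic far below the critical value, as reported for both Windows 98 and Windows NT 4.0, is what produces P-values close to 1.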
Usage–vulnerability discovery model
• The data:
  – The global internet population.
  – The market share of the system during a period of time.
• Equivalent effort:
  – The real environment performs intensive testing.
  – Malicious activity is related to overall activity.
  – Defined as $E = \sum_{i=0}^{n} (U_i \times P_i)$, where $U_i$ is the number of users and $P_i$ is the system's market share in period $i$.
[Figure: Internet growth, millions of users, Dec. 1995 to May 2004]
[Figure: percentage of the installed base by OS (Windows 95, Windows 98, Windows XP, Windows NT, Windows 2000, Others), May-99 to May-04]
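The equivalent-effort sum can be sketched in a few lines of Python. The function name, the units (users in millions, market share in percent, one entry per month) and the sample numbers are all assumptions made for illustration; they are not data from the slide.

```python
from itertools import accumulate

def equivalent_effort(users_millions, share_percent):
    """Per-period effort U_i * P_i, in million user-periods."""
    if len(users_millions) != len(share_percent):
        raise ValueError("both series must cover the same periods")
    return [u * p / 100.0 for u, p in zip(users_millions, share_percent)]

# Hypothetical monthly data: global users (millions) and market share (%).
users = [300, 350, 400]
share = [10, 20, 25]

per_period = equivalent_effort(users, share)
cumulative = list(accumulate(per_period))   # E after each period
total_effort = cumulative[-1]               # E = sum of U_i * P_i
```

The cumulative series is the natural x-axis for an effort-based model, since each vulnerability count can be paired with the usage accumulated up to that date.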
Estimating number of users
[Figure: estimating the number of IE users. Source: HyunChul Joh, Quantitative Analyses of Software Vulnerabilities, 2011]
Software Reliability Modeling
• Applicable to general software bugs
• Key static software metrics
  – Software size (without comments, KLOC)
  – Defect density (total defects/size)
    • Typical range: 16 to 0.1 defects/KLOC
    • Software evolution/reuse, requirement volatility
    • Team capabilities, extent of testing
  – Defect finding efficiency
Exponential SRGM
Exponential Reliability Growth Model
• Assumption: the rate of finding and removing bugs is proportional to the number of bugs present at time $t$:
  $-\frac{dN(t)}{dt} = \beta_1 N(t)$
  which yields $N(t) = N(0)\,e^{-\beta_1 t}$
• Cumulative number of defects found is $N(0)\,(1 - e^{-\beta_1 t})$
• Defect finding rate is $\beta_1 N(0)\,e^{-\beta_1 t}$
• $N(0)$ may be estimated using defect density and size
• $\beta_1$ depends on defect finding efficiency
[Figure: two plots against time (sec.), with N(0) marked on the first]
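The three quantities of the exponential model can be computed directly. In the sketch below the function names, the defect density, the size and the value of beta_1 are hypothetical; N(0) is estimated as defect density times size, as the slide suggests.

```python
import math

def remaining(t, n0, beta1):
    """Defects still present at time t: N(t) = N(0) * exp(-beta1 * t)."""
    return n0 * math.exp(-beta1 * t)

def found(t, n0, beta1):
    """Cumulative defects found by time t: N(0) * (1 - exp(-beta1 * t))."""
    return n0 * (1.0 - math.exp(-beta1 * t))

def finding_rate(t, n0, beta1):
    """Defect finding rate -dN/dt = beta1 * N(0) * exp(-beta1 * t)."""
    return beta1 * n0 * math.exp(-beta1 * t)

# Hypothetical inputs: 5 defects/KLOC in a 30 KLOC program, beta1 per second.
n0 = 5.0 * 30.0
beta1 = 4e-5

t = 50000.0
left, done, rate = remaining(t, n0, beta1), found(t, n0, beta1), finding_rate(t, n0, beta1)
```

At any time the remaining and found counts add up to N(0), and the finding rate is simply beta_1 times the remaining count.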
Usage–vulnerability discovery model
• The model: growth with effort.
• Growth model based on the exponential SRGM [Musa].
• Time is eliminated.
• $y = N(0)\,(1 - e^{-\lambda_{vu} E})$, where $E$ is the equivalent effort (usage)
[Figure: vulnerabilities vs. usage (million user-months)]
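One way to read "time is eliminated" is that the exponential model is restated with effort $E$ as the independent variable. The short derivation below is a sketch under that assumption: the discovery rate per unit effort is taken to be proportional to the number of vulnerabilities not yet found, with no vulnerabilities found at zero effort.

```latex
\begin{align*}
\frac{dy}{dE} &= \lambda_{vu}\,\bigl(N(0) - y\bigr), \qquad y(0) = 0 \\
\ln\bigl(N(0) - y\bigr) &= -\lambda_{vu} E + \ln N(0) \\
y(E) &= N(0)\,\bigl(1 - e^{-\lambda_{vu} E}\bigr)
\end{align*}
```

The curve rises fastest at low usage and saturates at $N(0)$, so unlike the logistic time-based model it has no slow initial phase.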
Effort-based model: Windows 98
• Fitted parameters: B = 37, $\lambda_{vu}$ = 0.000505
• Goodness of fit: $\chi^2$ = 3.510, $\chi^2_{critical}$ = 44.9853, P-value = $1 - 3.3 \times 10^{-11}$
[Figure: Windows 98 actual vulnerabilities and fitted curve vs. usage (million user-months)]
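The fitted Windows 98 values on this slide can be used to evaluate the effort-based curve. The sketch below assumes B plays the role of the saturation value in y = B (1 - exp(-lambda_vu E)) and that E is in million user-months, as on the plot axis; the helper names are hypothetical.

```python
import math

# Fitted Windows 98 values as reported on the slide.
B = 37.0
LAMBDA_VU = 0.000505

def effort_model(effort):
    """Cumulative vulnerabilities after the given usage."""
    return B * (1.0 - math.exp(-LAMBDA_VU * effort))

def effort_for_fraction(fraction):
    """Usage at which y reaches the given fraction of B (0 < fraction < 1)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return -math.log(1.0 - fraction) / LAMBDA_VU

y_end = effort_model(7500.0)       # right edge of the plotted usage range
e_90 = effort_for_fraction(0.9)    # usage needed to reach 90% of B
```

Under these assumptions the model predicts about 36.2 of the 37 vulnerabilities by 7500 million user-months, with 90% of B reached at roughly 4560 million user-months.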
Effort-based model: Windows NT 4.0
• Fitted parameters: B = 108, $\lambda_{vu}$ = 0.003061
• Goodness of fit: $\chi^2$ = 15.05, $\chi^2_{critical}$ = 42.5569, P-value = 0.985
[Figure: Windows NT 4.0 actual vulnerabilities and fitted curve vs. usage (million user-months)]
Discussion
• Excellent fit for Windows 98 and NT 4.0.
• Model fits data for all OSs examined.
• Deviation from the model caused by overlap:
  – Windows 98 and Windows XP
  – Windows NT 4.0 and Windows 2000
• Vulnerabilities in shared code may be detected in the newer OS.
• Need: approach for handling such overlap
[Figure: Windows 98 total vulnerabilities and fitted curve, Jan-99 to Sep-02]
Non-linear regression with Solver
• Excel has the capability to solve linear (and often nonlinear) programming problems.
• The Solver tool in Excel:
  – May be used to solve linear and nonlinear optimization problems
  – Allows integer or binary restrictions to be placed on decision variables
  – Can be used to solve problems with up to 200 decision variables
  – Is a Microsoft Office Excel add-in that is available when you install Microsoft Office or Excel
  – Must first be loaded in Excel before use; the process is slightly different for Mac and PC users
Classic Optimization Problem
• Linear programming, non-linear programming, etc.
• Specified
  – Objective function: minimize or maximize
  – Constraints: equalities, inequalities
• Generally the solution is iterative
• Excel Solver algorithms
  – Simplex method for linear problems
  – GRG solver for smooth nonlinear problems
  – Evolutionary solver, which uses genetic algorithms
Initial Values
• Start with some initial values and then gradually iterate towards the optimum.
• When 3 or more parameters are used, it is best to start with some good initial guesses.
• The algorithm may get stuck at a local minimum/maximum.
• Repeat with diverse initial guesses.
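The effect of the starting point can be seen on a small toy problem. The sketch below uses plain gradient descent on a hypothetical one-variable function with two minima; it is not the algorithm Excel Solver uses, only an illustration of why several diverse starts are worth trying.

```python
def f(x):
    """Toy objective with a shallow and a deep local minimum."""
    return x ** 4 - 3.0 * x ** 2 + x

def grad(x):
    """Derivative of f."""
    return 4.0 * x ** 3 - 6.0 * x + 1.0

def descend(x, learning_rate=0.01, steps=2000):
    """Fixed-step gradient descent from the initial guess x."""
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Diverse initial guesses; each run settles in the basin it started in.
starts = [-2.0, 0.5, 2.0]
results = [descend(x0) for x0 in starts]

# Keep the run with the lowest objective value.
best = min(results, key=f)
```

Here the runs started at 0.5 and 2.0 settle in the shallower minimum on the right, while the run started at -2.0 finds the deeper one on the left; comparing objective values across runs is what exposes the difference.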
Example
• Example: w95exmple.xlsx
• Model: $y = \frac{B}{B C e^{-ABt} + 1}$
• Decision variables: the 3 parameter values (A, B, C).
• Objective function: sum of squares of errors between actual and predicted values
• Constraints: all parameters must be positive
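The same setup can be sketched outside Excel. The code below is a rough stand-in, not the spreadsheet's method: it swaps Solver's GRG algorithm for a simple coordinate pattern search, enforces positivity by searching over the logarithms of A, B and C, and uses synthetic, noise-free data generated from hypothetical parameters rather than the contents of the example workbook.

```python
import math

def aml(t, A, B, C):
    """AML curve y = B / (B*C*exp(-A*B*t) + 1)."""
    return B / (B * C * math.exp(-A * B * t) + 1.0)

def sse(params, times, counts):
    """Objective: sum of squared errors between actual and predicted values."""
    A, B, C = params
    return sum((y - aml(t, A, B, C)) ** 2 for t, y in zip(times, counts))

def fit(times, counts, start, step=0.5, tol=1e-6, max_iter=2000):
    """Coordinate pattern search over log-parameters (keeps A, B, C > 0)."""
    logs = [math.log(p) for p in start]
    best = sse([math.exp(v) for v in logs], times, counts)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(logs)):
            for delta in (step, -step):
                trial = list(logs)
                trial[i] += delta
                err = sse([math.exp(v) for v in trial], times, counts)
                if err < best:
                    logs, best, improved = trial, err, True
        if not improved:
            step /= 2.0  # no neighbor is better: refine the step size
    return [math.exp(v) for v in logs], best

# Synthetic, noise-free data from hypothetical parameters (not real OS data).
times = list(range(48))
counts = [aml(t, 0.002, 50.0, 0.4) for t in times]

# Several diverse initial guesses; keep the fit with the smallest SSE.
starts = [(0.001, 30.0, 1.0), (0.01, 80.0, 0.1), (0.005, 45.0, 0.5)]
fits = [fit(times, counts, s) for s in starts]
best_params, best_err = min(fits, key=lambda r: r[1])
```

A derivative-free search like this never increases the error but may stall short of the true optimum, which is one more reason to compare the results from several starting points before trusting a fit.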
Vulnerability density and defect density
• Defect density
  – Valuable metric for planning test effort
  – Used for setting release quality target
  – Some data is available
  – Depends on various factors, may be stable for a team/process
• Vulnerabilities are a class of defects
  – Vulnerability data is in the public domain.
  – Is vulnerability density a useful measure?
  – Is it related to defect density?
    • Vulnerabilities = 5% of defects [Longstaff]?
    • Vulnerabilities = 1% of defects [Anderson]?
• Can be a major step in measuring security.
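Both densities and their ratio are simple quotients, as the sketch below shows. The size and counts are hypothetical values chosen only to make the arithmetic concrete; vulnerability density is assumed to be defined per KLOC in the same way as defect density.

```python
def density(count, size_kloc):
    """Items per thousand lines of code."""
    if size_kloc <= 0:
        raise ValueError("size must be positive")
    return count / size_kloc

# Hypothetical system: 2000 KLOC, 6000 known defects, 90 known vulnerabilities.
size_kloc = 2000.0
known_defects = 6000
known_vulnerabilities = 90

defect_density = density(known_defects, size_kloc)
vulnerability_density = density(known_vulnerabilities, size_kloc)
vulnerability_to_defect_ratio = known_vulnerabilities / known_defects
```

In this made-up case the ratio comes out at 1.5%, which falls between the 1% and 5% figures quoted on the slide.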