Towards Analytical Techniques for Systems Engineering Applications
Griselda Valdepeñas Acosta
Systems Engineering Program
University of Texas at El Paso, El Paso, Texas 79968, USA
gvacosta@miners.utep.edu

Part I
Formulation of the Problem and a General Overview of the Results

1. Main Objectives of Systems Engineering: a Brief Reminder
• One of the main goals of systems engineering is to design, maintain, and analyze systems to help users.
• To design an appropriate system for an application domain, we need to know:
  – what the users’ desires and preferences are,
  – what the current state and the dynamics of this application domain are, and
  – how to use all this information to select the best alternatives for the system design and maintenance.

2. Need for Analytical Techniques
• Designing a system includes selecting numerical values for many of the parameters describing this system.
• At present, in many cases, this selection is made by following semi-heuristic recommendations.
• Experience shows that such imprecise heuristic recommendations often lead to less-than-perfect results.
• It is therefore desirable to come up with analytical techniques for system design, techniques based:
  – on valid numerical analysis and
  – on the solution of the corresponding optimization problems.

3. What We Do in This Dissertation: General Idea
• Systems engineering is a very broad discipline, with many different application domains.
• Each domain has its own specifics and requires its own analysis and, probably, its own analytical techniques.
• In this dissertation, we:
  – formulate and analyze general problems of system design, implementation, testing, and monitoring, and
  – show how the corresponding analytical techniques can be applied to different application domains.

4. Describing the User’s Preferences
• In the ideal world, we should be able to ask each user for an opinion about each of the alternatives.
• However, for large systems, with many possible alternatives, this is not realistic.
• Therefore, we need to extrapolate the user’s preferences based on available partial information.
• There are analytical techniques for such extrapolation, e.g., the widely used matrix factorization technique.
• However, this technique is purely empirical and thus not very reliable.
• We provide a theoretical explanation for this technique.
• The existence of such an explanation makes it more reliable.

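To make the extrapolation idea concrete, here is a minimal sketch of matrix factorization in Python. It is an illustration only, not the dissertation's own formulation: the rating table, the rank, the learning rate, and the names `factorize` and `predict` are all assumptions made for this example. Each known rating is approximated by the dot product of a user vector and an alternative vector, and the fitted vectors are then used to estimate the ratings that the users never gave.

```python
import random

# Hypothetical data: rows are users, columns are alternatives,
# None marks an alternative the user was never asked about.
RATINGS = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]

def factorize(ratings, rank=2, epochs=5000, lr=0.01, reg=0.02, seed=0):
    """Fit ratings[u][a] ~ sum_k P[u][k] * Q[a][k] on the known entries only."""
    rng = random.Random(seed)
    n_users, n_alts = len(ratings), len(ratings[0])
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_alts)]
    known = [(u, a, r) for u, row in enumerate(ratings)
             for a, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        for u, a, r in known:
            # Error of the current approximation for this known rating.
            err = r - sum(P[u][k] * Q[a][k] for k in range(rank))
            for k in range(rank):
                pu, qa = P[u][k], Q[a][k]
                # Stochastic gradient step with a small L2 penalty.
                P[u][k] += lr * (err * qa - reg * pu)
                Q[a][k] += lr * (err * pu - reg * qa)
    return P, Q

def predict(P, Q, u, a):
    """Estimated rating of alternative a by user u."""
    return sum(pk * qk for pk, qk in zip(P[u], Q[a]))

P, Q = factorize(RATINGS)
# Extrapolated preference of user 0 for alternative 2, which was never rated.
print(round(predict(P, Q, 0, 2), 2))
```

The fit uses plain stochastic gradient descent; other fitting procedures (for example, alternating least squares) would serve the same purpose.
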
5. Describing the User’s Preferences (cont-d)
• We need to take into account that the user’s preferences are usually not very detailed.
• Thus, because of their approximate nature, we should not waste time trying to fit them optimally.
• This approximate nature is usually captured by the empirical 7 ± 2 law.
• According to this law, in the first approximation, a user usually divides alternatives into 7 ± 2 groups.
• This law is purely empirical, and thus its use is not as reliable as we would like it to be.
• To make this law more reliable, we provide a partial theoretical explanation of this law.

Figure 1: Why Seven?

6. What Are the Current State and Dynamics of an Application Domain?
• We also need to know the current state and the dynamics of this application domain.
• This information comes from two main sources:
  – from measurements and
  – from expert estimates.

7. Analytical Techniques for Analyzing Probability Distributions
• It is important to take into account that many real-world processes are probabilistic.
• In many cases, the corresponding probability distributions are Gaussian (normal).
• This makes perfect sense, since such processes are affected by many independent factors.
• It is known that in such cases, the distributions should be close to normal.
• However, there are cases when the corresponding distribution is different, e.g., uniform.
• Using a practical example, we explain why such distributions appear.

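The effect of many independent factors can be checked numerically. The sketch below is an illustration under stated assumptions, not an example from the dissertation: each factor is modeled as an independent value uniformly distributed on [0, 1], and the function name `fraction_within_one_sigma` is hypothetical. It estimates the fraction of samples that fall within one standard deviation of the mean. For a single uniform factor, the theoretical value is 1/√3 ≈ 0.577; for a normal distribution, it is approximately 0.683.

```python
import math
import random

def fraction_within_one_sigma(n_factors, n_samples=20000, seed=0):
    """Fraction of sums of n_factors uniform[0, 1] values within one sigma of the mean."""
    rng = random.Random(seed)
    mean = n_factors * 0.5                 # mean of a sum of uniform[0, 1] values
    sigma = math.sqrt(n_factors / 12.0)    # each uniform[0, 1] value has variance 1/12
    inside = 0
    for _ in range(n_samples):
        total = sum(rng.random() for _ in range(n_factors))
        if abs(total - mean) <= sigma:
            inside += 1
    return inside / n_samples

# One factor: the result stays uniform. Many factors: the sum is close to normal.
print(fraction_within_one_sigma(1))
print(fraction_within_one_sigma(30))
```

With one factor, the estimate should land near the uniform value; with thirty factors, it should land near the normal value, in line with the central limit theorem.
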
8. Analytical Techniques for Analyzing How Systems Change with Time
• In general, systems change with time, and the corresponding probability distributions change.
• There are some general rules about such changes, some of them well-explained, some more empirical.
• It is well-known (and well-explained) that the entropy of a closed system increases with time.
• This is known as the Second Law of Thermodynamics.
• Interestingly, there is another empirical observation, which is not as well justified, that:
  – while the entropy increases,
  – its rate of increase is often the smallest possible.
• This is known as the minimum entropy production principle.

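The growth of entropy can be illustrated on a toy model. The sketch below is an assumption made for illustration, not the model used in the dissertation: the closed system is represented by a three-state Markov chain whose transition matrix `T` is doubly stochastic (every row and every column sums to 1), and all names are hypothetical. For such a chain, the Shannon entropy of the state distribution never decreases, and it approaches its maximum ln 3, which corresponds to the uniform distribution.

```python
import math

# Hypothetical doubly stochastic transition matrix: rows and columns sum to 1.
T = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

def entropy(p):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def step(p, transition):
    """One time step: new_p[j] = sum_i p[i] * transition[i][j]."""
    n = len(p)
    return [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]

def entropy_trajectory(p0, transition, n_steps):
    """Entropy of the distribution at times 0, 1, ..., n_steps."""
    p = list(p0)
    values = [entropy(p)]
    for _ in range(n_steps):
        p = step(p, transition)
        values.append(entropy(p))
    return values

# Start from a fully known state (entropy 0) and watch the entropy grow.
trajectory = entropy_trajectory([1.0, 0.0, 0.0], T, 50)
print([round(h, 3) for h in trajectory[:5]], round(trajectory[-1], 3))
```

This toy model illustrates only the increase of entropy; it says nothing about the rate of increase, which is the subject of the minimum entropy production principle.
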
Figure 2: Blame Entropy. Part 1

9. Minimum Entropy Production Principle
• This principle was first formulated and explained by the future Nobel laureate Ilya Prigogine.
• Since then, many possible explanations of this principle have appeared.
• However, all these explanations are very technical, based on complex analysis of differential equations.
• Since this phenomenon is ubiquitous, it is desirable to look for a general system-based explanation.
• We provide an explanation based on the importance of keeping as many solution options open as possible.
• In decision making, one of the main errors is to focus too quickly and to become blind to alternatives.