Need for Expert Knowledge (and Soft Computing) in Case of Cyberinfrastructure-Based Data Processing

Vladik Kreinovich
Department of Computer Science
University of Texas at El Paso
El Paso, TX 79968, USA
vladik@utep.edu
http://www.cs.utep.edu/vladik

1. Need for Cyberinfrastructure

• A large amount of data has been collected and stored at different locations.
• Researchers and practitioners need easy and fast access to all the relevant data.
• For example, a geoscientist needs access to:
  – a state geological map (which is usually stored at the state's capital),
  – NASA photos (stored at NASA Headquarters and/or at one of the corresponding NASA centers),
  – seismic data stored at different seismic stations, etc.
• An environmental scientist needs access:
  – to satellite radar data,
  – to data from bio-stations,
  – to meteorological data, etc.

2. What Is Cyberinfrastructure

• Cyberinfrastructure is a general name for hardware/software tools that facilitate such data transfer/processing.
• Ideally, this data transfer and processing should be as easy and convenient as a Google search.
• At present, the main challenges in cyberinfrastructure design are related to the actual development of:
  – the corresponding hardware tools and
  – the corresponding software tools.
• Most existing cyberinfrastructure tools use existing well-defined algorithms.
• The results of using cyberinfrastructure are exciting.
• However, there is still room for improvement.

3. Cyberinfrastructure: Expert Knowledge Is Needed

• Current cyberinfrastructure results are based only on data processing.
• Some of these results do not make geological sense.
• It is necessary to take into account expert knowledge.
• Specifically, we must incorporate expert knowledge directly into the cyberinfrastructure.
• Some expert knowledge is formulated in precise terms; these types of knowledge are easier to incorporate.
• A large part of expert knowledge is formulated by using imprecise (fuzzy) words (like "small").
• To deal with such knowledge, fuzzy techniques have been invented.
• So, to incorporate this knowledge, it is natural to use fuzzy techniques.

4. What We Do In This Talk

• In this talk, we describe several problems in which such incorporation is needed.
• These problems come from our experience with geo- and environmental applications of cyberinfrastructure.
• First, we show that expert knowledge is needed even when we "fuse" data from different sources.
• Then, we show how expert knowledge can be used in processing data.
• Finally, we show how expert knowledge can be used in selecting the best ways of getting the data.

5. Part 1: Need for Data Fusion

• In many practical situations, we have several results $\widetilde x^{(1)}, \ldots, \widetilde x^{(n)}$ of measuring the same quantity $x$.
• These results are different since measurements are never 100% accurate.
• It is known that by combining different measurement results, we increase accuracy.
• Simplest case: we use the same measuring instrument for all measurements.
• In this case, an arithmetic average reduces the standard deviation by a factor of $\sqrt{n}$:

$$\widetilde x = \frac{\widetilde x^{(1)} + \ldots + \widetilde x^{(n)}}{n}.$$

• When we fuse measurements of different accuracy, we need to use different weights for different values $\widetilde x^{(i)}$.
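
The $\sqrt{n}$ claim above can be checked with a small simulation. This is only an illustrative sketch: the function name `std_of_average`, the actual value 10.0, the error standard deviation 1.0, and the number of trials are all assumptions made for the example, and the measurement errors are assumed to be independent and normally distributed.

```python
import random
import statistics


def std_of_average(n, sigma=1.0, trials=20000, seed=0):
    """Empirical st. dev. of the average of n simulated measurements.

    Each measurement is the (hypothetical) actual value plus an
    independent normal error with standard deviation sigma.
    """
    rng = random.Random(seed)
    true_x = 10.0  # hypothetical actual value of the measured quantity
    averages = []
    for _ in range(trials):
        results = [rng.gauss(true_x, sigma) for _ in range(n)]
        averages.append(sum(results) / n)
    return statistics.pstdev(averages)


# With n = 16 measurements, the spread of the average should shrink
# by a factor close to sqrt(16) = 4 compared with a single measurement.
ratio = std_of_average(1) / std_of_average(16)
print(f"reduction factor for n = 16: {ratio:.2f}")
```

The printed factor is an empirical estimate, so it will be close to 4 rather than exactly 4.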

6. Data Fusion: Challenge

• When we fuse measurements of different accuracy, we need to use different weights for different values $\widetilde x^{(i)}$.
• Sometimes, we can find the actual values and thus estimate the accuracy of different measurements.
• In other cases, e.g., in geosciences, it is difficult to find the actual values, such as the actual density at depth 40 km.
• Hence, in geosciences, it is difficult to gauge the accuracy of seismic, gravity, and other techniques.
• In this case, we need to estimate the accuracies from the observations.
• We will show that in this case, seemingly reasonable statistical methods do not work well.
• Thus, statistical methods need to be supplemented with expert knowledge.
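
For the easy case, when the actual values are available, the two steps can be sketched as follows. The slides do not say which weights to use; the sketch assumes the standard inverse-variance weights $w_i \sim 1/\sigma_i^2$, and the function names `estimate_sigma` and `fuse` as well as all the numbers are hypothetical.

```python
import math


def estimate_sigma(measured, actual):
    """Accuracy of one instrument: RMS deviation of its readings
    from the known actual values."""
    n = len(measured)
    return math.sqrt(sum((m - a) ** 2 for m, a in zip(measured, actual)) / n)


def fuse(values, sigmas):
    """Weighted average with inverse-variance weights (an assumed,
    commonly used choice, not one prescribed by the slides)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical calibration data: readings of one instrument vs. actual values.
sigma_a = estimate_sigma([1.1, 1.9, 3.1], [1.0, 2.0, 3.0])

# Two instruments measure the same quantity; the second is twice less accurate,
# so its result gets a four times smaller weight.
fused = fuse([10.0, 12.0], [1.0, 2.0])
print(sigma_a, fused)
```

When the actual values are not available, as in the geosciences example, the first step is impossible, which is exactly the challenge described on this slide.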

7. Traditional Statistical Methods: Reminder

• In many cases, the measurement error is caused by many different causes.
• It is known that the sum of many small independent random variables is approximately normally distributed.
• So, we can conclude that the measurement errors are normally distributed, with probability density

$$\rho(\widetilde x) = \frac{1}{\sqrt{2\pi} \cdot \sigma} \cdot \exp\left(-\frac{(\widetilde x - x)^2}{2\sigma^2}\right).$$

• If we have $n$ results $\widetilde x^{(i)}$ of independent measurements, then the probability is proportional to

$$\rho = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi} \cdot \sigma} \cdot \exp\left(-\frac{(\widetilde x^{(i)} - x)^2}{2\sigma^2}\right).$$

• Maximum Likelihood Method: select the most probable $x$ and $\sigma$, for which the probability (hence $\rho$) is the largest.

8. Traditional Statistical Methods (cont-d)

• Maximizing

$$\rho = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi} \cdot \sigma} \cdot \exp\left(-\frac{(\widetilde x^{(i)} - x)^2}{2\sigma^2}\right)$$

is equivalent to minimizing

$$\psi = -\ln(\rho) = \mathrm{const} + n \cdot \ln(\sigma) + \sum_{i=1}^{n} \frac{(\widetilde x^{(i)} - x)^2}{2\sigma^2}.$$

• W.r.t. $x$, we get the Least Squares method, which leads to the arithmetic average $x = \dfrac{1}{n} \cdot \sum\limits_{i=1}^{n} \widetilde x^{(i)}$.
• Differentiating $\psi$ w.r.t. $\sigma$ and equating to 0, we get

$$\frac{n}{\sigma} - \sum_{i=1}^{n} \frac{(\widetilde x^{(i)} - x)^2}{\sigma^3} = 0.$$

• So, we get the usual estimate $\sigma^2 = \dfrac{1}{n} \cdot \sum\limits_{i=1}^{n} (\widetilde x^{(i)} - x)^2$.
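
The two estimates above can be checked numerically. The following minimal sketch computes them for a small made-up sample and confirms that nearby values of $x$ and $\sigma$ give a larger $\psi$; the function names and the data are assumptions made for the illustration.

```python
import math


def psi(x, sigma, data):
    """psi = -ln(rho) without the additive constant."""
    n = len(data)
    return n * math.log(sigma) + sum((v - x) ** 2 for v in data) / (2 * sigma ** 2)


def ml_estimates(data):
    """Maximum Likelihood estimates: arithmetic average and
    the usual (1/n) estimate for sigma."""
    n = len(data)
    x = sum(data) / n
    sigma = math.sqrt(sum((v - x) ** 2 for v in data) / n)
    return x, sigma


data = [9.8, 10.1, 10.3, 9.9, 10.4]  # hypothetical measurement results
x_hat, sigma_hat = ml_estimates(data)
best = psi(x_hat, sigma_hat, data)

# Any small shift of x or sigma away from the estimates increases psi.
for dx in (-0.05, 0.0, 0.05):
    for ds in (-0.05, 0.0, 0.05):
        if dx != 0.0 or ds != 0.0:
            assert psi(x_hat + dx, sigma_hat + ds, data) > best

print(x_hat, sigma_hat)
```

Note that the estimate for $\sigma$ divides by $n$, exactly as in the derivation, not by $n - 1$.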

9. Case of Different Measuring Instruments (MI): Surprising Problem

• Situation: for different quantities $x_j$, $j = 1, \ldots, m$, we have measurement results $\widetilde x_j^{(i)}$ corresponding to different MI $i = 1, \ldots, n$, with different standard deviations $\sigma_i$.
• The resulting probability is proportional to

$$\rho = \prod_{i=1}^{n} \prod_{j=1}^{m} \frac{1}{\sqrt{2\pi} \cdot \sigma_i} \cdot \exp\left(-\frac{(\widetilde x_j^{(i)} - x_j)^2}{2\sigma_i^2}\right).$$

• Seemingly natural idea: use the Maximum Likelihood method, i.e., find $x_j$ and $\sigma_i$ for which $\rho \to \max$.
• We tried, and found that at the maximum, one of the $\sigma_i$ is 0.
• We then theoretically confirmed that the maximum $\rho_{\max} = \infty$ is attained:
  – when $\sigma_{i_0} = 0$ for some $i_0$, and
  – when $x_j = \widetilde x_j^{(i_0)}$ for all $j$.
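
This degenerate maximum is easy to observe numerically. In the sketch below, the data, the function name `log_likelihood`, and the fixed value 0.3 for the second instrument's standard deviation are all assumptions made for the illustration. We set $x_j$ equal to the results of instrument $i_0 = 0$ and let $\sigma_{i_0}$ decrease: the exponent for this instrument is then exactly 0, and the remaining term $-m \cdot \ln(\sigma_{i_0})$ grows without bound.

```python
import math


def log_likelihood(x, sigmas, data):
    """ln(rho) without the additive constant.

    data[i][j] is the result of instrument i measuring quantity j.
    """
    total = 0.0
    for sigma_i, row in zip(sigmas, data):
        for x_j, measured in zip(x, row):
            total += -math.log(sigma_i) - (measured - x_j) ** 2 / (2 * sigma_i ** 2)
    return total


# Hypothetical data: n = 2 instruments, m = 3 quantities.
data = [[1.0, 2.1, 2.9],
        [1.2, 1.8, 3.3]]

# Take x_j equal to the results of instrument i0 = 0 ...
x = list(data[0])

# ... and let sigma_0 tend to 0 while the other sigma stays fixed.
values = [log_likelihood(x, [s0, 0.3], data) for s0 in (1e-1, 1e-3, 1e-6, 1e-9)]
for s0, value in zip((1e-1, 1e-3, 1e-6, 1e-9), values):
    print(f"sigma_0 = {s0:g}: ln(rho) = {value:.2f} + const")
```

Each decrease of $\sigma_{i_0}$ increases $\ln(\rho)$, so the likelihood has no finite maximum, in line with the theoretical result on this slide.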