Probability and Random Processes

Probability and Random Processes  ECS 315  Asst. Prof. Dr. Prapun Suksompong  prapun@siit.tu.ac.th  1 Probability and You  Office Hours: Rangsit Library: Tuesday 16:20-17:20; BKD3601-7: Thursday


  1. Actual Research  University of California San Diego  The researchers have shown that codes can be easily discerned from quite a distance (at least seven metres away) and image-analysis software can automatically find the correct code in more than half of cases even one minute after the code has been entered.  This figure rose to more than eighty percent if the thermal camera was used immediately after the code was entered.  K. Mowery, S. Meiklejohn, and S. Savage. 2011. “Heat of the Moment: Characterizing the Efficacy of Thermal-Camera Based Attacks”. Proceedings of WOOT 2011. http://cseweb.ucsd.edu/~kmowery/papers/thermal.pdf  http://wordpress.mrreid.org/2011/08/27/hacking-pin-pads-using-thermal-vision/

  2. The Birthday Problem (Paradox)  How many people do you need to assemble before the probability is greater than 1/2 that some two of them have the same birthday (month and day)?  Birthdays consist of a month and a day with no year attached.  Ignore February 29, which only comes in leap years.  Assume that every day is as likely as any other to be someone’s birthday.  In a group of r people, what is the probability that two or more people have the same birthday?

  3. Probability of birthday coincidence  Probability that there are at least two people who have the same birthday in a group of r persons:  $$P = \begin{cases} 1, & \text{if } r > 365, \\ 1 - \underbrace{\dfrac{365}{365} \cdot \dfrac{364}{365} \cdots \dfrac{365 - (r-1)}{365}}_{r \text{ terms}}, & \text{if } 0 \le r \le 365. \end{cases}$$
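
The product formula on this slide can be evaluated directly. The following is a minimal sketch, assuming 365 equally likely birthdays as stated in the problem; the function names are illustrative and not from the original slides.

```python
def p_shared_birthday(r, days=365):
    """Probability that at least two of r people share a birthday."""
    if r > days:
        return 1.0  # more people than days: a coincidence is certain
    p_all_distinct = 1.0
    for k in range(r):  # r terms: 365/365, 364/365, ..., (365-(r-1))/365
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group(threshold=0.5, days=365):
    """Smallest r for which the coincidence probability exceeds threshold."""
    r = 0
    while p_shared_birthday(r, days) <= threshold:
        r += 1
    return r


print(smallest_group())                  # 23
print(round(p_shared_birthday(23), 4))   # 0.5073
```

Under these assumptions, 23 people are enough for the probability to pass 1/2.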

  4. Probability of birthday coincidence

  5. The Birthday Problem (cont'd)  With 88 people, the probability is greater than 1/2 of having three people with the same birthday.  187 people gives a probability greater than 1/2 of four people having the same birthday.

  6. Birthday Coincidence: 2nd Version  How many people do you need to assemble before the probability is greater than 1/2 that at least one of them has the same birthday (month and day) as you?  In a group of r people, what is the probability that at least one of them has the same birthday (month and day) as you?
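
A short sketch of the second version, assuming the r people do not include you and that each of them independently matches your birthday with probability 1/365 (the slide does not spell out either assumption). The function names are illustrative.

```python
def p_someone_matches_you(r, days=365):
    """Probability that at least one of r other people shares your birthday."""
    # Each person misses your birthday with probability (days - 1) / days.
    return 1.0 - ((days - 1) / days) ** r


def smallest_group_matching_you(threshold=0.5, days=365):
    """Smallest r for which the matching probability exceeds threshold."""
    r = 0
    while p_someone_matches_you(r, days) <= threshold:
        r += 1
    return r


print(smallest_group_matching_you())  # 253
```

Matching one specific birthday needs far more people than finding any coincidence within the group.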

  7. Distinct Passcodes (revisit)  Unknown numbers: the number of different 4-digit passcodes $= 10^4$  Exactly four different numbers: the number of different 4-digit passcodes $= 4! = 24$  Exactly three different numbers: the number of different 4-digit passcodes $= 3 \times \binom{4}{2} \times 2 = 36$  Exactly two different numbers: the number of different 4-digit passcodes $= \binom{4}{3} + \binom{4}{2} + \binom{4}{1} = 14$  Exactly one number: the number of different 4-digit passcodes $= 1$  Check: $\binom{10}{4} \cdot 24 + \binom{10}{3} \cdot 36 + \binom{10}{2} \cdot 14 + \binom{10}{1} \cdot 1 = 10{,}000$
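
The counts on this slide can be checked by brute force. This sketch enumerates all 10,000 four-digit codes and groups them by how many distinct digits they use; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb

# Number of 4-digit codes that use exactly k distinct digits, for k = 1..4.
counts = Counter(len(set(code)) for code in product(range(10), repeat=4))

# Dividing by the number of ways to pick the k digits gives the number of
# codes per known set of k digits (the 1, 14, 36, 24 on the slide).
per_digit_set = {k: counts[k] // comb(10, k) for k in (1, 2, 3, 4)}

print(per_digit_set)          # {1: 1, 2: 14, 3: 36, 4: 24}
print(sum(counts.values()))   # 10000
```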

  8. Need more practice?  Ex: Poker Probability  [http://en.wikipedia.org/wiki/Poker_probability]

  9. Binomial Theorem  $(x_1 + y_1)(x_2 + y_2) = x_1 x_2 + x_1 y_2 + y_1 x_2 + y_1 y_2$  $(x_1 + y_1)(x_2 + y_2)(x_3 + y_3) = x_1 x_2 x_3 + x_1 x_2 y_3 + x_1 y_2 x_3 + x_1 y_2 y_3 + y_1 x_2 x_3 + y_1 x_2 y_3 + y_1 y_2 x_3 + y_1 y_2 y_3$  Setting $x_1 = x_2 = x_3 = x$ and $y_1 = y_2 = y_3 = y$:  $(x + y)(x + y) = xx + xy + yx + yy = x^2 + 2xy + y^2$  $(x + y)(x + y)(x + y) = xxx + xxy + xyx + xyy + yxx + yxy + yyx + yyy = x^3 + 3x^2 y + 3xy^2 + y^3$
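
The expansion above lists every "word" formed by picking x or y from each factor. As a small illustration (the function name is hypothetical), counting those words by their number of y's recovers the binomial coefficients:

```python
from collections import Counter
from itertools import product
from math import comb


def expansion_coefficients(n):
    """Coefficients of x^(n-k) y^k in (x + y)^n, found by listing all 2^n words."""
    words = product("xy", repeat=n)          # e.g. ('x', 'y', 'x') for the term xyx
    counts = Counter(word.count("y") for word in words)
    return [counts[k] for k in range(n + 1)]


print(expansion_coefficients(2))  # [1, 2, 1]
print(expansion_coefficients(3))  # [1, 3, 3, 1]
```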

  10. Success Runs (1/4)  Suppose that two people are separately asked to toss a fair coin 120 times and take note of the results. Heads is noted as a “one” and tails as a “zero”.  Results: two lists of compiled zeros and ones. [Tijms, 2007, p. 192]

  11. Success Runs (2/4)  Which list is more likely? [Tijms, 2007, p. 192]

  12. Success Runs (3/4)  Fact: One of the two individuals has cheated and has fabricated a list of numbers without having tossed the coin.  Which list is more likely to be the fabricated list? [Tijms, 2007, p. 192]

  13. Success Runs (4/4)  Fact: In 120 tosses of a fair coin, there is a very large probability that at some point during the tossing process, a sequence of five or more heads or five or more tails will naturally occur.  The probability of this is approximately 0.9865.  In contrast to the second list, the first list shows no such sequence of five heads in a row or five tails in a row. In the first list, the longest sequence of either heads or tails consists of three in a row.  In 120 tosses of a fair coin, the probability that the longest sequence consists of three or fewer in a row is 0.000053, which is extremely small.  Thus, the first list is almost certainly a fake.  Most people tend to avoid noting long sequences of consecutive heads or tails. Truly random sequences do not share this human tendency! [Tijms, 2007, p. 192]
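
The two probabilities quoted on this slide can be reproduced with exact counting. The sketch below is one possible approach, not necessarily the method used by Tijms: every sequence of n tosses splits into alternating runs, so sequences whose longest run is at most k correspond to ways of cutting n into run lengths between 1 and k, times 2 for the choice of the first symbol. The function name is illustrative.

```python
from fractions import Fraction


def p_longest_run_at_most(n, k):
    """P(longest run of heads or tails <= k) in n >= 1 fair coin tosses."""
    # c[m] = number of ways to split m tosses into consecutive runs of length 1..k
    c = [0] * (n + 1)
    c[0] = 1
    for m in range(1, n + 1):
        c[m] = sum(c[m - j] for j in range(1, min(k, m) + 1))
    # Factor 2: the first run is either heads or tails; later runs alternate.
    return Fraction(2 * c[n], 2 ** n)


p_run_of_five = 1 - p_longest_run_at_most(120, 4)   # some run of length >= 5
p_max_three = p_longest_run_at_most(120, 3)         # longest run is 3 or fewer

print(float(p_run_of_five))
print(float(p_max_three))
```

The printed values should agree with the 0.9865 and 0.000053 quoted above to the precision shown there.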

  14. Fun Reading …  Entertaining Mathematical Puzzles (1986)  By Martin Gardner (1914-2010)  It includes a mixture of old and new riddles covering a variety of mathematical topics: money, speed, plane and solid geometry, probability (Part VII), topology, tricky puzzles and more.  Carefully explained solutions follow each problem.

  15. Fun Books…

  16. Exercise from Mlodinow’s talk  At 10:14 into the video, Mlodinow shows three probabilities.  Can you derive the first two?  http://www.youtube.com/watch?v=F0sLuRsu1Do  [Mlodinow, 2008, pp. 180-181]

  17. Probability and Random Processes  ECS 315  Asst. Prof. Dr. Prapun Suksompong  prapun@siit.tu.ac.th  II. Events-Based Probability Theory  Office Hours: Rangsit Library: Tuesday 16:20-17:20; BKD3601-7: Thursday 14:40-16:00

  18. Probability and Random Processes  ECS 315  Asst. Prof. Dr. Prapun Suksompong  prapun@siit.tu.ac.th  5 Foundation of Probability Theory  Office Hours: Rangsit Library: Tuesday 16:20-17:20; BKD3601-7: Thursday 14:40-16:00

  19. Kolmogorov  Andrey Nikolaevich Kolmogorov  Soviet Russian mathematician  Advanced various scientific fields: probability theory, topology, classical mechanics, computational complexity.  1922: Constructed a Fourier series that diverges almost everywhere, gaining international recognition.  1933: Published the book Foundations of the Theory of Probability, laying the modern axiomatic foundations of probability theory and establishing his reputation as the world's leading living expert in this field.

  20. I learned probability theory from:  Eugene Dynkin, Philip Protter, Gennady Samorodnitsky, Terrence Fine, Xing Guo, Toby Berger, Rick Durrett

  21. Not too far from Kolmogorov  You can be a 4th-generation probability theorist.

  22. Probability and Random Processes  ECS 315  Asst. Prof. Dr. Prapun Suksompong  prapun@siit.tu.ac.th  Event-Based Properties

  23. Daniel Kahneman  Israeli-American psychologist  2002 Nobel laureate in Economics  Hebrew University, Jerusalem, Israel.  Professor emeritus of psychology and public affairs at Princeton University's Woodrow Wilson School.  With Amos Tversky, Kahneman studied and clarified the kinds of misperceptions of randomness that fuel many of the common fallacies.

  24. K&T: Q1  Imagine a woman named Linda, 31 years old, single, outspoken, and very bright. In college she majored in philosophy. While a student she was deeply concerned with discrimination and social justice and participated in antinuclear demonstrations.  [outspoken = given to expressing yourself freely or insistently]  K&T presented this description to a group of 88 subjects and asked them to rank the eight statements (shown on the next slide) on a scale of 1 to 8 according to their probability, with 1 representing the most probable and 8 the least.  [Daniel Kahneman, Paul Slovic, and Amos Tversky, eds., Judgment under Uncertainty: Heuristics and Biases (Cambridge: Cambridge University Press, 1982), pp. 90-98.]

  25. K&T: Q1 - Results  Here are the results, from most to least probable.  [feminist = of or relating to or advocating equal rights for women]

  26. K&T: Q1 - Results (2)  At first glance there may appear to be nothing unusual in these results: the description was in fact designed to be  representative of an active feminist and  unrepresentative of a bank teller or an insurance salesperson.  [Figure: ranked statements, from “Most probable” to “Least likely”]

  27. K&T: Q1 - Results (3)  Let’s focus on just three of the possibilities and their average ranks.  This is the order in which 85 percent of the respondents ranked the three possibilities:  If nothing about this looks strange, then K&T have fooled you.

  28. K&T: Q1 - Contradiction  The probability that two events will both occur can never be greater than the probability that each will occur individually!
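
In symbols, the statement above follows from monotonicity of a probability measure. Here $A$ and $B$ are generic events introduced for illustration (for instance, "Linda is a bank teller" and "Linda is active in the feminist movement"):

```latex
\[
A \cap B \subseteq A
\quad\text{and}\quad
A \cap B \subseteq B
\;\Longrightarrow\;
P(A \cap B) \le \min\{P(A),\, P(B)\}.
\]
```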

  29. K&T: Q2  K&T were not surprised by the result because they had given their subjects a large number of possibilities, and the connections among the three scenarios could easily have gotten lost in the shuffle.  So they presented the description of Linda to another group, but this time they presented only three possibilities:  Linda is active in the feminist movement.  Linda is a bank teller and is active in the feminist movement.  Linda is a bank teller.

  30. K&T: Q2 - Results  To their surprise, 87 percent of the subjects in this trial also incorrectly ranked the probability that “Linda is a bank teller and is active in the feminist movement” higher than the probability that “Linda is a bank teller”.  If the details we are given fit our mental picture of something, then the more details in a scenario, the more real it seems and hence the more probable we consider it to be,  even though any act of adding less-than-certain details to a conjecture makes the conjecture less probable.  Even highly trained doctors make this error when analyzing symptoms.  91 percent of the doctors fall prey to the same bias.  [Amos Tversky and Daniel Kahneman, “Extensional versus Intuitive Reasoning: The Conjunction Fallacy in Probability Judgment,” Psychological Review 90, no. 4 (October 1983): 293-315.]

  31. Related Topic  Page 34-37  Tversky and Shafir @ Princeton University

  32. K&T: Q3  Which is greater:  the number of six-letter English words having “n” as their fifth letter, or  the number of six-letter English words ending in “-ing”?  Most people choose the group of words ending in “-ing”. Why? Because words ending in “-ing” are easier to think of than generic six-letter words having “n” as their fifth letter.  But the group of six-letter words having “n” as their fifth letter includes all six-letter words ending in “-ing”, so it cannot be the smaller group.  Psychologists call this type of mistake the availability bias.  In reconstructing the past, we give unwarranted importance to memories that are most vivid and hence most available for retrieval.  [Amos Tversky and Daniel Kahneman, “Availability: A Heuristic for Judging Frequency and Probability,” Cognitive Psychology 5 (1973): 207-32.]

  33. Misuse of probability in law  It is not uncommon for experts in DNA analysis to testify at a criminal trial that a DNA sample taken from a crime scene matches that taken from a suspect.  How certain are such matches?  When DNA evidence was first introduced, a number of experts testified that false positives are impossible in DNA testing.  Today DNA experts regularly testify that the odds of a random person’s matching the crime sample are less than 1 in 1 million or 1 in 1 billion.  In Oklahoma a court sentenced a man named Timothy Durham to more than 3,100 years in prison even though eleven witnesses had placed him in another state at the time of the crime. [Mlodinow, 2008, pp. 36-37]

  34. Lab/Human Error  There is another statistic that is often not presented to the jury, one having to do with the fact that labs make errors, for instance, in collecting or handling a sample, by accidentally mixing or swapping samples, or by misinterpreting or incorrectly reporting results.  Each of these errors is rare but not nearly as rare as a random match.  The Philadelphia City Crime Laboratory admitted that it had swapped the reference sample of the defendant and the victim in a rape case.  A testing firm called Cellmark Diagnostics admitted a similar error. [Mlodinow, 2008, pp. 36-37]

  35. Timothy Durham’s case  It turned out that in the initial analysis the lab had failed to completely separate the DNA of the rapist and that of the victim in the fluid they tested, and the combination of the victim’s and the rapist’s DNA produced a positive result when compared with Durham’s.  A later retest turned up the error, and Durham was released after spending nearly four years in prison. [Mlodinow, 2008, pp. 36-37]

  36. DNA-Match Error + Lab Error  Estimates of the error rate due to human causes vary, but many experts put it at around 1 percent.  Most jurors assume that given the two types of error (the 1 in 1 billion accidental match and the 1 in 100 lab-error match), the overall error rate must be somewhere in between, say 1 in 500 million, which is still for most jurors beyond a reasonable doubt. [Mlodinow, 2008, pp. 36-37]
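
As a rough check on the jurors' "somewhere in between" intuition, suppose (an assumption not stated on the slide) that the two error sources are independent and that either one alone is enough to produce a false match. Then the overall error rate is

```latex
\[
P(\text{error})
= 1 - \left(1 - 10^{-9}\right)\left(1 - 10^{-2}\right)
= 10^{-2} + 10^{-9} - 10^{-11}
\approx 10^{-2}.
\]
```

Under these assumptions the combined rate is about 1 in 100: it is dominated by the larger of the two error rates rather than falling between them.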

  37. Wait! …  Even if the DNA match is extremely accurate and the lab error rate is very small,  there is another probability concept that should be taken into account.  More about this later.  Right now, back to the notes for more properties of probability measures.

  38. Probability and Random Processes  ECS 315  Asst. Prof. Dr. Prapun Suksompong  prapun@siit.tu.ac.th  6.1 Conditional Probability  Office Hours: Rangsit Library: Tuesday 16:20-17:20; BKD3601-7: Thursday 16:00-17:00

  39.

  40. Disease Testing  Suppose we have a diagnostic test for a particular disease which is 99% accurate.  A person is picked at random and tested for the disease.  The test gives a positive result.  Q1: What is the probability that the person actually has the disease?  Natural answer: 99%, because the test gets it right 99% of the time.

  41. 99% accurate test?  Two kinds of error:  If you use this test on many persons with the disease, the test will indicate correctly that those persons have the disease 99% of the time. False negative rate = 1% = 0.01 (1 → 0).  If you use this test on many persons without the disease, the test will indicate correctly that those persons do not have the disease 99% of the time. False positive rate = 1% = 0.01 (0 → 1).

  42. Disease Testing: The Question  Suppose we have a diagnostic test for a particular disease which is 99% accurate.  A person is picked at random and tested for the disease.  The test gives a positive result.  Q1: What is the probability that the person actually has the disease?  Natural answer: 99%, because the test gets it right 99% of the time.  Q2: Can the answer be 1% or 2%?  Q3: Can the answer be 50%?

  43. Disease Testing: The Answer  Q1: What is the probability that the person actually has the disease?  A1: The answer actually depends on how common or how rare the disease is!

  44. Why?  Let’s assume a rare disease.  The disease affects about 1 person in 10,000.  Try an experiment with $10^6$ people.  Approximately 100 people will have the disease.  What would the (99%-accurate) test say?

  45. Results of the test (approximately)  100 people w/ disease: 99 of them will test positive; 1 of them will test negative.  999,900 people w/o disease: 9,999 of them will test positive; 989,901 of them will test negative.

  46. Results of the test  100 people w/ disease: 99 of them will test positive; 1 of them will test negative.  999,900 people w/o disease: 9,999 of them will test positive; 989,901 of them will test negative.  Of those who test positive, only $\dfrac{99}{9{,}999 + 99} \approx 1\%$ actually have the disease!
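
The head count on this slide can be reproduced with integer arithmetic. This is a minimal sketch using the slide's numbers (one million people, 1 in 10,000 affected, 99% accuracy in both directions); the function and key names are illustrative.

```python
def expected_counts(population=10**6, one_in=10_000, accuracy_pct=99):
    """Expected test outcomes when 1 person in `one_in` has the disease."""
    diseased = population // one_in
    healthy = population - diseased
    true_pos = diseased * accuracy_pct // 100   # diseased, test positive
    false_neg = diseased - true_pos             # diseased, test negative
    true_neg = healthy * accuracy_pct // 100    # healthy, test negative
    false_pos = healthy - true_neg              # healthy, test positive
    return {
        "diseased": diseased, "healthy": healthy,
        "true_pos": true_pos, "false_neg": false_neg,
        "true_neg": true_neg, "false_pos": false_pos,
    }


c = expected_counts()
share_truly_diseased = c["true_pos"] / (c["true_pos"] + c["false_pos"])
print(c)
print(round(share_truly_diseased, 4))  # 0.0098
```

The false positives from the large healthy group outnumber the true positives by roughly 100 to 1, which is why the answer is near 1%.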

  47. Bayes’ Theorem  Using the concept of conditional probability and Bayes’ Theorem, you can show that the probability that a person will have the disease given that the test is positive is given by  $$\frac{(1 - p_{TE})\, p_D}{(1 - p_{TE})\, p_D + p_{TE}\,(1 - p_D)}$$  where, in our example, $p_D = 10^{-4}$ and $p_{TE} = 1 - 0.99 = 0.01$.

  48. Bayes’ Theorem  Using the concept of conditional probability and Bayes’ Theorem, you can show that the probability $P(D \mid T_P)$ that a person will have the disease given that the test result is positive is given by  $$P(D \mid T_P) = \frac{(1 - p_{TE})\, p_D}{(1 - p_{TE})\, p_D + p_{TE}\,(1 - p_D)}$$  When a different value of $p_D$ is assumed, we get a different value of $P(D \mid T_P)$.  Conclusion: Any value (between 0 and 1) can be obtained by varying the value of $p_D$.
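
To see how strongly the answer depends on $p_D$, the formula can be evaluated for a few prevalence values. This is a small sketch; the function name and the chosen sample values are illustrative, and it assumes the same error rate $p_{TE}$ for false positives and false negatives, as in the slides.

```python
def p_disease_given_positive(p_d, p_te):
    """P(D | T_P) for prevalence p_d and test error rate p_te (not both 0)."""
    true_positive_mass = (1 - p_te) * p_d       # has disease and tests positive
    false_positive_mass = p_te * (1 - p_d)      # no disease but tests positive
    return true_positive_mass / (true_positive_mass + false_positive_mass)


for p_d in (1e-4, 2e-4, 1e-2, 0.5):
    print(p_d, round(p_disease_given_positive(p_d, 0.01), 4))
```

With $p_{TE} = 0.01$, a prevalence of $10^{-4}$ gives roughly 1%, while a prevalence equal to the error rate ($p_D = 0.01$) gives exactly 50%.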

  49. In log scale…  [Figure: $P(D \mid T_P)$ versus $p_D$, both axes logarithmic from $10^{-6}$ to $10^{0}$]

  50. Effect of $p_{TE}$  [Figure: $P(D \mid T_P)$ versus $p_D$, both axes from 0 to 1, with curves for $p_{TE} = 1 - 0.99 = 0.01$, $p_{TE} = 1 - 0.9 = 0.1$, and $p_{TE} = 1 - 0.5 = 0.5$]

  51. Wrap-up  Q1: What is the probability that the person actually has the disease?  A1: The answer actually depends on how common or how rare the disease is! (The answer depends on the value of $p_D$.)  Q2: Can the answer be 1% or 2%?  A2: Yes.  Q3: Can the answer be 50%?  A3: Yes.

  52. Example: A Revisit  Roll a fair die.  Sneak peek:
