A Quantitative Measure of Relevance Based on Kelly Gambling Theory
Mathias Winther Madsen
Institute for Logic, Language, and Computation, University of Amsterdam
PLAN ● Why? ● How? ● Examples
Why?
Why not use Shannon information?
H(X) = E[ log 1/Pr(X = x) ]
Claude Shannon (1916 – 2001)
Why not use Shannon information?
Information Content = Prior Uncertainty – Posterior Uncertainty
(cf. Klir 2008; Shannon 1948)
Why not use Shannon information?
What is the value of X?
Pr(X = 1) = 0.15, Pr(X = 2) = 0.19, Pr(X = 3) = 0.23, Pr(X = 4) = 0.21, Pr(X = 5) = 0.22
H(X) = E[ log 1/Pr(X = x) ] = 2.31
Why not use Shannon information?
[Decision tree: the same distribution queried with yes/no questions such as "Is X = 2?", "Is X = 3?", "Is X in {4,5}?", "Is X = 5?", each answer recorded as a 0/1 bit.]
Expected number of questions: 2.34
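The two numbers above can be checked directly. Below is a small sketch in Python (my own illustration, not part of the talk); it assumes the question tree on the slide is an optimal binary prefix code and reconstructs one with the standard Huffman procedure.

import heapq
from math import log2

p = {1: 0.15, 2: 0.19, 3: 0.23, 4: 0.21, 5: 0.22}

# Entropy: H(X) = E[ log 1/Pr(X = x) ] -- about 2.31 bits
H = sum(q * log2(1 / q) for q in p.values())

# Huffman code lengths: each merge adds one yes/no question
# to every outcome in the merged group.
heap = [(q, [x]) for x, q in p.items()]
heapq.heapify(heap)
questions = {x: 0 for x in p}
while len(heap) > 1:
    q1, xs1 = heapq.heappop(heap)
    q2, xs2 = heapq.heappop(heap)
    for x in xs1 + xs2:
        questions[x] += 1
    heapq.heappush(heap, (q1 + q2, xs1 + xs2))

expected = sum(p[x] * questions[x] for x in p)
print(f"H(X) = {H:.2f} bits; expected number of questions = {expected:.2f}")
# Prints: H(X) = 2.31 bits; expected number of questions = 2.34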
What color are my socks?
H(p) = – ∑ p log p = 6.53 bits of entropy.
How?
Why not use value-of-information?
Value-of-Information = Posterior Expectation – Prior Expectation
Why not use value-of-information?
Rules:
● Your capital can be distributed freely
● Bets on the actual outcome are returned twofold
● Bets on all other outcomes are lost
Why not use value-of-information?
[Plot: expected payoff as a function of the betting strategy, from everything on Tails to everything on Heads. Optimal strategy by this criterion: degenerate gambling (everything on Heads).]
Why not use value-of-information?
[Plots: capital across successive rounds, and the probability distribution of the rate of return R.]
Why not use value-of-information?
Rate of return: R_i = (capital at time i + 1) / (capital at time i)
Long-run behavior: the product R_1 · R_2 · R_3 · · · R_n converges to 0 in probability as n → ∞
Optimal reinvestment
Daniel Bernoulli (1700 – 1782), John Larry Kelly, Jr. (1923 – 1965)
Optimal reinvestment
Doubling rate: W_i = log [ (capital at time i + 1) / (capital at time i) ]   (so R = 2^W)
Long-run behavior: R_1 · R_2 · R_3 · · · R_n = 2^(W_1 + W_2 + W_3 + · · · + W_n) ≈ 2^(n E[W]) for large n, by the law of large numbers
Optimal reinvestment
The logarithmic expectation E[W] = ∑ p log(b·o) is maximized by proportional gambling (b* = p).
The arithmetic expectation E[R] = ∑ p·b·o is maximized by degenerate gambling.
(Here b is the fraction of capital bet on each outcome and o the corresponding odds.)
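To illustrate the difference between the two expectations, here is a small simulation (Python, my own sketch rather than material from the talk). It plays the twofold-return game from the earlier slide with a coin assumed to land heads with probability 0.6, so that the two criteria visibly pull apart.

import random
from math import log2

random.seed(0)
p_heads = 0.6          # assumed bias, for illustration only
n_rounds = 1000

def run(bet_on_heads):
    """Final capital after n_rounds, starting from 1 unit.

    bet_on_heads is the fraction of capital placed on heads each round;
    the rest goes on tails.  Bets on the realized outcome are returned
    twofold, bets on the other outcome are lost.
    """
    capital = 1.0
    for _ in range(n_rounds):
        heads = random.random() < p_heads
        capital *= 2 * (bet_on_heads if heads else 1 - bet_on_heads)
    return capital

degenerate = run(1.0)        # everything on heads: maximizes E[R] = 1.2 per round
proportional = run(p_heads)  # Kelly: b* = p, doubling rate E[W] is about 0.029 bits/round

print(f"degenerate gambling:   final capital = {degenerate:.3g}")
print(f"proportional gambling: final capital = {proportional:.3g}")
# Typical output: the degenerate gambler is ruined (capital 0) long before
# round 1000, while the proportional gambler's capital has grown roughly
# like 2**(0.029 * 1000), i.e. by a factor of several hundred million.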
Measuring relevant information
Amount of relevant information = Posterior expected doubling rate – Prior expected doubling rate
Measuring relevant information
Definition (Relevant Information): For an agent with utility function u, the amount of relevant information contained in the message Y = y is
K(y) = max_s ∑_x Pr(x | y) log u(s, x) – max_s ∑_x Pr(x) log u(s, x)
(the posterior optimal doubling rate minus the prior optimal doubling rate)
Measuring relevant information K ( y ) == ∑ max s ∑ Pr( x | y ) log u ( s , x ) – max s ∑ Pr( x ) log u ( s , x ) ● Expected relevant information is non-negative . ● Relevant information equals the maximal fraction of future gains you can pay for a piece of information without loss. ● When u has the form u ( s , x ) == v ( x ) s( x ) for some non-negative function v , relevant information equals Shannon information .
Example: Code-breaking
Example: Code-breaking    ? ? ? ?    Entropy: H = 4    Accumulated information: I(X; Y) = 0
Example: Code-breaking    1 ? ? ?    (1 bit!)    Entropy: H = 3    Accumulated information: I(X; Y) = 1
Example: Code-breaking    1 0 ? ?    (1 bit!)    Entropy: H = 2    Accumulated information: I(X; Y) = 2
Example: Code-breaking    1 0 1 ?    (1 bit!)    Entropy: H = 1    Accumulated information: I(X; Y) = 3
Example: Code-breaking    1 0 1 1    (1 bit!)    Entropy: H = 0    Accumulated information: I(X; Y) = 4
Example: Code-breaking    1 0 1 1    (1 bit + 1 bit + 1 bit + 1 bit)    Entropy: H = 0    Accumulated information: I(X; Y) = 4
Example: Code-breaking
Rules:
● You can invest a fraction f of your capital in the guessing game
● If you guess the correct code, you get your investment back 16-fold: u = 1 – f + 16f
● Otherwise, you lose it: u = 1 – f
W(f) = (15/16) log(1 – f) + (1/16) log(1 – f + 16f)
Example: Code-breaking    ? ? ? ?
Optimal strategy: f* = 0    Optimal doubling rate: W(f*) = 0.00
W(f) = (15/16) log(1 – f) + (1/16) log(1 – f + 16f)
Example: Code-breaking    1 ? ? ?    (0.04 bits)
Optimal strategy: f* = 1/15    Optimal doubling rate: W(f*) = 0.04
W(f) = (7/8) log(1 – f) + (1/8) log(1 – f + 16f)
Example: Code-breaking    1 0 ? ?    (0.22 bits)
Optimal strategy: f* = 3/15    Optimal doubling rate: W(f*) = 0.26
W(f) = (3/4) log(1 – f) + (1/4) log(1 – f + 16f)
Example: Code-breaking    1 0 1 ?    (0.79 bits)
Optimal strategy: f* = 7/15    Optimal doubling rate: W(f*) = 1.05
W(f) = (1/2) log(1 – f) + (1/2) log(1 – f + 16f)
Example: Code-breaking    1 0 1 1    (2.95 bits)
Optimal strategy: f* = 1    Optimal doubling rate: W(f*) = 4.00
W(f) = (0/1) log(1 – f) + (1/1) log(1 – f + 16f)
Example: Code-breaking
Raw information (drop in entropy):                 1.00  1.00  1.00  1.00
Relevant information (increase in doubling rate):  0.04  0.22  0.79  2.95
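The optimal fractions and doubling rates on the preceding slides can be reproduced numerically. The sketch below (Python, my own illustration) uses the standard closed-form Kelly fraction f* = (16p – 1)/15 for a 16-for-1 payoff that wins with probability p = 2^(k – 4) when k of the 4 bits are known.

from math import log2

def doubling_rate(f, p):
    """W(f) = p log2(1 - f + 16f) + (1 - p) log2(1 - f); zero-probability terms are dropped."""
    terms = [(p, 1 - f + 16 * f), (1 - p, 1 - f)]
    return sum(q * log2(payoff) for q, payoff in terms if q > 0)

previous = 0.0
for k in range(5):                         # k = number of known code bits
    p = 2.0 ** (k - 4)                     # probability of guessing the remaining bits
    f_star = max(0.0, (16 * p - 1) / 15)   # closed-form Kelly fraction for 16-for-1 odds
    w_star = doubling_rate(f_star, p)
    print(f"{k} bits known: f* = {f_star:.3f}, W(f*) = {w_star:.2f}, "
          f"gain = {w_star - previous:.2f} bits")
    previous = w_star
# Prints W(f*) = 0.00, 0.04, 0.26, 1.05, 4.00 and gains 0.04, 0.22, 0.79, 2.95,
# matching the table above.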
Example: Randomization
Example: Randomization
Target distribution: (1/3, 1/3, 1/3). With two fair coin flips, this procedure achieves (1/2, 1/4, 1/4):

def choose():
    if flip():
        if flip():
            return ROCK
        else:
            return PAPER
    else:
        return SCISSORS
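To run the snippet, one possible harness (everything here besides choose() is an assumption, not from the slides) defines flip() as a fair coin and estimates the resulting distribution:

import random
from collections import Counter

ROCK, PAPER, SCISSORS = "rock", "paper", "scissors"

def flip():
    return random.random() < 0.5

def choose():
    if flip():
        if flip():
            return ROCK
        else:
            return PAPER
    else:
        return SCISSORS

counts = Counter(choose() for _ in range(100_000))
print({move: counts[move] / 100_000 for move in (ROCK, PAPER, SCISSORS)})
# Roughly {'rock': 0.25, 'paper': 0.25, 'scissors': 0.5}: two fair flips
# cannot produce the uniform target (1/3, 1/3, 1/3) exactly.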
Example: Randomization
Rules:
● You (player 1) and the adversary (player 2) both bet $1
● You move first
● The winner takes the whole pool
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Best accessible strategy: p* = (1, 0, 0)    Doubling rate: W(p*) = –∞
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Best accessible strategy: p* = (1/2, 1/2, 0)    Doubling rate: W(p*) = –1.00
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Best accessible strategy: p* = (2/4, 1/4, 1/4)    Doubling rate: W(p*) = –0.42
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Best accessible strategy: p* = (3/8, 3/8, 2/8)    Doubling rate: W(p*) = –0.19
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Best accessible strategy: p* = (6/16, 5/16, 5/16)    Doubling rate: W(p*) = –0.09
W(p) = log min { p1 + 2 p2, p2 + 2 p3, p3 + 2 p1 }
Example: Randomization
Coin flips   Distribution          Doubling rate   Gain over previous
0            (1, 0, 0)             –∞
1            (1/2, 1/2, 0)         –1.00           ∞
2            (1/2, 1/4, 1/4)       –0.42           0.58
3            (3/8, 3/8, 2/8)       –0.19           0.23
4            (6/16, 5/16, 5/16)    –0.09           0.10
...          ...                   ...
∞            (1/3, 1/3, 1/3)       0.00
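The "best accessible strategies" in this table can be recovered by brute force. The sketch below (Python, my own illustration) assumes that k fair coin flips make accessible exactly those distributions whose probabilities are integer multiples of 2^–k, and picks the one with the largest doubling rate.

from math import log2
from fractions import Fraction

def W(p):
    """W(p) = log2 min{p1 + 2p2, p2 + 2p3, p3 + 2p1}."""
    worst = min(p[0] + 2 * p[1], p[1] + 2 * p[2], p[2] + 2 * p[0])
    return log2(worst) if worst > 0 else float("-inf")

for k in range(5):
    n = 2 ** k
    # All distributions (p1, p2, p3) with probabilities that are multiples of 1/n.
    candidates = [(Fraction(i, n), Fraction(j, n), Fraction(n - i - j, n))
                  for i in range(n, -1, -1) for j in range(n - i, -1, -1)]
    best = max(candidates, key=W)
    print(f"{k} flips: p* = ({', '.join(str(q) for q in best)}),  W(p*) = {W(best):.2f}")
# Prints the doubling rates -inf, -1.00, -0.42, -0.19, -0.09 from the table
# (Fraction prints reduced forms, e.g. 1/4 rather than 2/8).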
January: Project course in information theory (now with MORE SHANNON!)

Day 1: Uncertainty and Inference
● Probability theory: random variables, generative Bayesian models, stochastic processes, mutual information
● Uncertainty and information: uncertainty as cost, the Hartley measure, Shannon information content and entropy, Huffman coding

Day 2: Counting Typical Sequences
● The law of large numbers
● Typical sequences and the source coding theorem
● Stochastic processes and entropy rates, the source coding theorem for stochastic processes
● Examples

Day 3: Guessing and Gambling
● Evidence, likelihood ratios, competitive prediction
● Kullback-Leibler divergence, examples of diverging stochastic models
● Expressivity and the bias/variance tradeoffs
● Doubling rates and proportional betting, card color prediction
● Semantics and expressivity

Day 4: Asking Questions and Engineering Answers
● Questions and answers (or experiments and observations)
● Coin weighing
● The maximum entropy principle
● The channel coding theorem

Day 5: Informative Descriptions and Residual Randomness
● The practical problem of source coding
● Kraft's inequality and prefix codes
● Arithmetic coding
● Kolmogorov complexity
● Tests of randomness
● Asymptotic equivalence of complexity and entropy