Counting Words — Populations & Samples
Type-rich populations, samples, and statistical models
Baroni & Evert

Outline:
  The type population: type probabilities, population models, ZM & fZM
  Sampling from the population: random samples, expectation, mini-example
  Parameter estimation: trial & error, automatic estimation
  A practical example


  1. . . . and its solution
  ➥ We need a model for the population
  ◮ This model embodies our hypothesis that the distribution of type probabilities has a certain general shape (more precisely, we speak of a family of models)
  ◮ The exact form of the distribution is then determined by a small number of parameters (typically 2 or 3)
  ◮ These parameters can be estimated with relative ease

  2. Examples of population models
  [Figure: four example population models, plotted as type probability π_k against population rank k for k = 1…50]

  3. The Zipf-Mandelbrot law as a population model
  What is the right family of models for lexical frequency distributions?
  ◮ We have already seen that the Zipf-Mandelbrot law captures the distribution of observed frequencies very well, across many phenomena and data sets
  ◮ Re-phrase the law for type probabilities instead of frequencies:

      π_k := C / (k + b)^a

  ◮ Two free parameters: a > 1 and b ≥ 0
  ◮ C is not a parameter but a normalization constant, needed to ensure that Σ_k π_k = 1 (see the sketch below)
  ➥ the Zipf-Mandelbrot population model
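The slides leave the computation of C implicit, so here is a minimal sketch of how the ZM type probabilities might be evaluated; using scipy's Hurwitz zeta function for the infinite normalizing sum is my own choice, not something stated in the slides.

```python
import numpy as np
from scipy.special import zeta  # zeta(a, q) = sum_{n>=0} 1/(n+q)^a  (Hurwitz zeta)

def zm_probabilities(a, b, k):
    """Type probabilities pi_k = C / (k + b)^a of the (infinite) ZM model.

    C is the normalization constant 1 / zeta(a, 1 + b), which makes the
    probabilities over k = 1, 2, 3, ... sum to 1 (requires a > 1).
    """
    C = 1.0 / zeta(a, 1.0 + b)
    return C / (np.asarray(k, dtype=float) + b) ** a

# two of the parameter settings shown on the following slides
k = np.arange(1, 51)
print(zm_probabilities(a=2.0, b=10.0, k=k)[:5])   # first five type probabilities
print(zm_probabilities(a=1.2, b=1.5, k=k).sum())  # < 1: the tail k > 50 carries the rest
```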

  4. The parameters of the Zipf-Mandelbrot model
  [Figure: π_k against k (k = 1…50) for four parameter settings: a = 1.2, b = 1.5; a = 2, b = 10; a = 2, b = 15; a = 5, b = 40]

  5. The parameters of the Zipf-Mandelbrot model
  [Figure: the same four parameter settings (a = 1.2, b = 1.5; a = 2, b = 10; a = 2, b = 15; a = 5, b = 40) plotted on double-logarithmic axes, π_k against k for k = 1…100]

  6. The finite Zipf-Mandelbrot model
  ◮ The Zipf-Mandelbrot population model characterizes an infinite type population: there is no upper bound on k, and the type probabilities π_k can become arbitrarily small
  ◮ π = 10^−6 (once every million words), π = 10^−9 (once every billion words), π = 10^−12 (once on the entire Internet), π = 10^−100 (once in the universe?)
  ◮ Alternative: finite (but often very large) number of types in the population
  ◮ We call this the population vocabulary size S (and write S = ∞ for an infinite type population)

  7. The finite Zipf-Mandelbrot model
  ◮ The finite Zipf-Mandelbrot model simply stops after the first S types (w_1, …, w_S)
  ◮ S becomes a new parameter of the model
  ➜ the finite Zipf-Mandelbrot model has 3 parameters
  ◮ NB: C will not have the same value as for the corresponding infinite ZM model (see the sketch below)
  Abbreviations: ZM for the Zipf-Mandelbrot model, and fZM for the finite Zipf-Mandelbrot model
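A small sketch illustrating the remark about C: in the finite model the normalization constant is the reciprocal of a finite sum over k = 1…S rather than of the infinite sum used above. The function and variable names are mine.

```python
import numpy as np
from scipy.special import zeta

def fzm_probabilities(a, b, S):
    """Type probabilities of a finite Zipf-Mandelbrot (fZM) model with S types."""
    k = np.arange(1, S + 1)
    weights = 1.0 / (k + b) ** a
    return weights / weights.sum()           # C = 1 / sum_{k=1}^{S} (k + b)^(-a)

a, b, S = 2.0, 10.0, 1000
C_fzm = 1.0 / np.sum(1.0 / (np.arange(1, S + 1) + b) ** a)
C_zm = 1.0 / zeta(a, 1.0 + b)                # normalization of the corresponding infinite ZM model
print(C_fzm > C_zm)                          # True: the finite sum is smaller, so C must be larger
```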

  8. The next steps
  Once we have a population model . . .
  ◮ We still need to estimate the values of its parameters
    ◮ we’ll see later how we can do this
  ◮ We want to simulate random samples from the population described by the model
    ◮ basic assumption: real data sets (such as corpora) are random samples from this population
    ◮ this allows us to predict vocabulary growth, the number of previously unseen types as more text is added to a corpus, the frequency spectrum of a larger data set, etc.
    ◮ it will also allow us to estimate the model parameters

  9. Outline
  The type population
  Sampling from the population
  Parameter estimation
  A practical example

  10. Sampling from a population model
  Assume we believe that the population we are interested in can be described by a Zipf-Mandelbrot model:
  [Figure: the model with a = 3, b = 50, shown as π_k against k on linear axes (k = 1…50) and on double-logarithmic axes (k = 1…100)]
  Use computer simulation to sample from this model:
  ◮ Draw N tokens from the population such that in each step, type w_k has probability π_k of being picked (see the sketch below)
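A minimal way to run such a simulation, under the simplifying assumption (mine, not the slides') that the infinite population is truncated at some large K so that the probability vector can be materialized explicitly:

```python
import numpy as np

def sample_tokens(a, b, N, K=100_000, rng=None):
    """Draw N tokens (as population ranks k = 1..K) from a ZM model truncated at K."""
    rng = rng or np.random.default_rng()
    k = np.arange(1, K + 1)
    pi = 1.0 / (k + b) ** a
    pi /= pi.sum()                       # renormalize over the truncated support
    return rng.choice(k, size=N, p=pi)   # each draw picks type w_k with probability pi_k

tokens = sample_tokens(a=3.0, b=50.0, N=1000, rng=np.random.default_rng(42))
print(tokens[:9])   # a listing of population ranks, as on the next slide
```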

  11. Sampling from a population model
  #1:   1  42  34  23 108  18  48  18   1 . . .
        time order room school town course area course time . . .
  #2: 286  28  23  36   3   4   7   4   8 . . .
  #3:   2  11 105  21  11  17  17   1  16 . . .
  #4:  44   3 110  34 223   2  25  20  28 . . .
  #5:  24  81  54  11   8  61   1  31  35 . . .
  #6:   3  65   9 165   5  42  16  20   7 . . .
  #7:  10  21  11  60 164  54  18  16 203 . . .
  #8:  11   7 147   5  24  19  15  85  37 . . .
  . . .

  12. Sampling from a population model
  In this way, we can . . .
  ◮ draw samples of arbitrary size N
    ◮ the computer can do it efficiently even for large N
  ◮ draw as many samples as we need
  ◮ compute type frequency lists, frequency spectra and vocabulary growth curves from these samples (see the sketch below)
    ◮ i.e., we can analyze them with the same methods that we have applied to the observed data sets
  Here are some results for samples of size N = 1000 . . .
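A sketch of these analyses on one simulated sample; the truncated sampler and the step size of the growth curve are my own illustrative choices.

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(1)
k = np.arange(1, 100_001)
pi = 1.0 / (k + 50.0) ** 3                  # ZM model with a = 3, b = 50, truncated at K = 100,000
pi /= pi.sum()
tokens = rng.choice(k, size=1000, p=pi)     # one sample of N = 1000 tokens

type_freqs = Counter(tokens)                # type k -> sample frequency f_k
spectrum = Counter(type_freqs.values())     # frequency class m -> V_m (number of types with f = m)

def vocabulary_growth(tokens, step=100):
    """V(N) for N = step, 2*step, ...: number of distinct types among the first N tokens."""
    seen, growth = set(), []
    for i, t in enumerate(tokens, start=1):
        seen.add(t)
        if i % step == 0:
            growth.append((i, len(seen)))
    return growth

print("V =", len(type_freqs), " V_1 =", spectrum[1], " V_2 =", spectrum[2])
print(vocabulary_growth(tokens)[-3:])       # the tail of the vocabulary growth curve
```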

  13. Samples: type frequency list & spectrum (sample #1)

     rank r   f_r   type k        m   V_m
          1    37        6        1    83
          2    36        1        2    22
          3    33        3        3    20
          4    31        7        4    12
          5    31       10        5    10
          6    30        5        6     5
          7    28       12        7     5
          8    27        2        8     3
          9    24        4        9     3
         10    24       16       10     3
         11    23        8        .     .
         12    22       14        .     .
          .     .        .

  14. Samples: type frequency list & spectrum (sample #2)

     rank r   f_r   type k        m   V_m
          1    39        2        1    76
          2    34        3        2    27
          3    30        5        3    17
          4    29       10        4    10
          5    28        8        5     6
          6    26        1        6     5
          7    25       13        7     7
          8    24        7        8     3
          9    23        6       10     4
         10    23       11       11     2
         11    20        4        .     .
         12    19       17        .     .
          .     .        .

  15. Random variation in type-frequency lists
  [Figure: top row — Zipf rankings r ↔ f_r for Sample #1 and Sample #2; bottom row — the same frequencies plotted in population order, k ↔ f_k]

  16. Random variation in type-frequency lists
  ◮ Random variation leads to different type frequencies f_k in every new sample
    ◮ particularly obvious when we plot them in population order (bottom row, k ↔ f_k)
  ◮ Different ordering of types in the Zipf ranking for every new sample
    ◮ Zipf rank r in the sample ≠ population rank k!
    ◮ leads to severe problems with statistical methods
  ◮ Individual types are irrelevant for our purposes, so let us take a perspective that abstracts away from them
    ◮ frequency spectrum
    ◮ vocabulary growth curve
  ➥ considerable amount of random variation still visible

  17. Random variation: frequency spectrum
  [Figure: frequency spectra V_m for Samples #1–#4]

  18. Random variation: vocabulary growth curve
  [Figure: vocabulary growth curves V(N) and V_1(N) for Samples #1–#4, N = 0…1000]

  19. Expected values
  ◮ There is no reason why we should choose a particular sample to make a prediction for the real data – each one is equally likely or unlikely
  ➥ Take the average over a large number of samples (a brute-force sketch follows below)
  ◮ Such averages are called expected values or expectations in statistics (frequentist approach)
  ◮ Notation: E[V(N)] and E[V_m(N)]
    ◮ indicates that we are referring to expected values for a sample of size N
    ◮ rather than to the specific values V and V_m observed in a particular sample or a real-world data set
    ◮ Usually we can omit the sample size: E[V] and E[V_m]
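Such expectations can be approximated directly, if inefficiently, by averaging over many simulated samples; the number of replicates and the truncated population below are illustrative assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
k = np.arange(1, 100_001)
pi = 1.0 / (k + 50.0) ** 3           # ZM model with a = 3, b = 50, truncated at K = 100,000
pi /= pi.sum()

def V_and_V1(sample):
    """Vocabulary size V and number of hapax legomena V_1 of one sample."""
    types, counts = np.unique(sample, return_counts=True)
    return len(types), int(np.sum(counts == 1))

N, runs = 1000, 200
stats = np.array([V_and_V1(rng.choice(k, size=N, p=pi)) for _ in range(runs)])
print("E[V(N)]   ≈", stats[:, 0].mean())   # average vocabulary size over 200 samples
print("E[V_1(N)] ≈", stats[:, 1].mean())   # average number of hapax legomena
```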

  20. The expected frequency spectrum
  [Figure: observed spectra V_m for Samples #1–#4, each compared with the expected spectrum E[V_m]]

  21. The expected vocabulary growth curve
  [Figure: V(N) and V_1(N) for Sample #1 compared with the expected curves E[V(N)] and E[V_1(N)]]

  22. Great expectations made easy
  ◮ Fortunately, we don’t have to take many thousands of samples to calculate expectations: there is a (relatively simple) mathematical solution (➜ Wednesday)
  ◮ This solution also allows us to estimate the amount of random variation ➜ variance and confidence intervals
    ◮ example: expected VGCs with confidence intervals
    ◮ we won’t pursue variance any further in this course

  23. Confidence intervals for the expected VGC
  [Figure: expected vocabulary growth curves E[V(N)] and E[V_1(N)] with confidence bands, compared with V(N) and V_1(N) for Sample #1]

  24. A mini-example
  ◮ G. K. Zipf claimed that the distribution of English word frequencies follows Zipf’s law with a ≈ 1
    ◮ a ≈ 1.5 seems a more reasonable value when you look at larger text samples than Zipf did
  ◮ The most frequent word in English is the with π ≈ .06
  ◮ A Zipf-Mandelbrot law with a = 1.5 and b = 7.5 yields a population model where π_1 ≈ .06 (by trial & error; see the check below)
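The trial-and-error claim for π_1 is easy to verify numerically; a minimal check (the Hurwitz-zeta normalization is my addition, as above):

```python
from scipy.special import zeta

a, b = 1.5, 7.5
C = 1.0 / zeta(a, 1.0 + b)       # normalization constant of the infinite ZM model
pi_1 = C / (1.0 + b) ** a
print(round(pi_1, 3))            # should come out close to .06, matching pi for "the"
```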

  25. A mini-example
  ◮ How many different words do we expect to find in a 1-million-word text?
    ◮ N = 1,000,000 ➜ E[V(N)] = 33026.7
    ◮ 95% confidence interval: V(N) = 32753.6 … 33299.7
  ◮ How many do we really find?
    ◮ Brown corpus: 1 million words of edited American English
    ◮ V = 45215 ➜ the ZM model is not quite right
  ◮ Physicists (and some mathematicians) are happy as long as they get the order of magnitude right . . .
  ☞ Model was not based on actual data!

  26. Outline
  The type population
  Sampling from the population
  Parameter estimation
  A practical example

  27. Estimating model parameters
  ◮ Parameter settings in the mini-example were based on general assumptions (claims from the literature)
  ◮ But we also have empirical data on the word frequency distribution of English available (the Brown corpus)
  ◮ Choose parameters so that the population model matches the empirical distribution as well as possible
  ◮ E.g. by trial and error . . .
    ◮ guess parameters
    ◮ compare model predictions for a sample of size N_0 with the observed data (N_0 tokens)
    ◮ based on the frequency spectrum or vocabulary growth curve
    ◮ change parameters & repeat until satisfied
  ◮ This process is called parameter estimation

  28. Parameter estimation by trial & error
  [Figures: observed frequency spectrum V_m vs. expected spectrum E[V_m], and observed vocabulary growth curve V(N) vs. expected curve E[V(N)] for N up to 10^6, shown for a sequence of guessed ZM parameter settings: a = 1.5, b = 7.5; a = 1.3, b = 7.5; a = 1.3, b = 0.2; a = 1.5, b = 7.5; a = 1.7, b = 7.5; a = 1.7, b = 80; a = 2, b = 550]

  29. Automatic parameter estimation
  ◮ Parameter estimation by trial & error is tedious
    ➜ let the computer do the work!
  ◮ Need a cost function to quantify the “distance” between model expectations and observed data
    ◮ based on vocabulary size and frequency spectrum (these are the most convenient criteria)
  ◮ The computer estimates parameters by automatic minimization of the cost function (see the sketch below)
    ◮ clever algorithms exist that find out quickly in which direction they have to “push” the parameters to approach the minimum
    ◮ implemented in standard software packages
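A minimal sketch of what such an automatic estimation might look like. The truncation at K types, the plain mean-squared-error cost over the first M spectrum elements, and the choice of scipy's Nelder–Mead optimizer are all simplifying assumptions of mine; dedicated packages implement this much more carefully.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

K, M = 200_000, 15                      # truncation of the type population; spectrum elements used

def expected_spectrum(a, b, N):
    """E[V_m(N)] for m = 1..M under a truncated ZM model: sum over types of P(f_k = m)."""
    k = np.arange(1, K + 1)
    pi = 1.0 / (k + b) ** a
    pi /= pi.sum()
    return np.array([binom.pmf(m, N, pi).sum() for m in range(1, M + 1)])

def cost(params, observed_spectrum, N):
    a, b = params
    if a <= 1.0 or b < 0.0:             # keep the search inside the valid parameter region
        return 1e12
    diff = expected_spectrum(a, b, N) - observed_spectrum[:M]
    return float(np.mean(diff ** 2))    # mean squared error over V_1 .. V_M

# observed_spectrum would be the V_m values counted from a real corpus of N_0 tokens:
# result = minimize(cost, x0=[1.5, 7.5], args=(observed_spectrum, N_0), method="Nelder-Mead")
# print(result.x)                       # estimated (a, b)
```

The observed spectrum would be counted from the corpus at hand (as on the trial-and-error slides); the cost function itself is the subject of the next slide.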

  30. Cost functions for parameter estimation
  ◮ Cost functions compare the expected frequency spectrum E[V_m(N_0)] with the observed spectrum V_m(N_0)
  ◮ Choice #1: how to weight differences
    ◮ absolute values of differences: Σ_{m=1}^{M} |V_m − E[V_m]|
    ◮ mean squared error: (1/M) Σ_{m=1}^{M} (V_m − E[V_m])²
    ◮ chi-squared criterion: scale the differences by their estimated variances
  ◮ Choice #2: how many spectrum elements to use
    ◮ typically between M = 2 and M = 15
    ◮ what happens if M < number of parameters?
  ◮ For many applications, it is important to match V precisely: additional constraint E[V(N_0)] = V(N_0)
    ◮ general principle: you can match as many constraints as there are free parameters in the model
  ◮ A felicitous choice of cost function and M can substantially improve the quality of the estimated model
    ◮ It isn’t a science, it’s an art . . .

  31. Goodness-of-fit
  ◮ The automatic estimation procedure minimizes the cost function until no further improvement can be found
    ◮ this is a so-called local minimum of the cost function
    ◮ not necessarily the global minimum that we want to find
  ◮ Key question: is the estimated model good enough?
  ◮ In other words: does the model provide a plausible explanation of the observed data as a random sample from the population?
  ◮ Can be measured by a goodness-of-fit test
    ◮ use special tests for such models (Baayen 2001)
    ◮ the p-value specifies whether the model is plausible
    ◮ small p-value ➜ reject the model as an explanation for the data
    ➥ we want to achieve a high p-value
  ◮ Typically, we find p < .001 – but the models can still be useful for many purposes!
