

  1. Table of contents

     Section 1: Design
       1. Introduction: You are already an experimentalist
       2. Conditions
       3. Items
       4. Ordering items for presentation
       5. Judgment Tasks
       6. Recruiting participants
       7. Pre-processing data (if necessary)

     Section 2: Analysis
       8. Plotting
       9. Building linear mixed effects models
       10. Evaluating linear mixed effects models using Fisher
       11. Neyman-Pearson and controlling error rates
       12. Bayesian statistics and Bayes Factors

     Section 3: Application
       13. Validity and replicability of judgments
       14. The source of judgment effects
       15. Gradience in judgments

  2. Before anything else — Look at your data!

     I cannot stress this enough. You have to look at your data. You can't just plop it into a statistical test and report that result. Well, you can, but you may miss something important. (And, to be fair, I am guilty of not looking at my data enough, so I say this with real experience behind it — look at your data!)

     There are a lot of different ways to "look at" your data, and there is no prescribed way that will work for all experiments. But there are two graphs that are going to be important for nearly all experiments: (i) the distribution of responses per condition, and (ii) the means and standard errors per condition.

     [Figure: the distribution of z-scored judgments by condition (left) and the mean z-score judgments with standard errors by condition (right), arranged by embedded structure (non-island vs. island) and dependency length (short vs. long)]

  3. Plotting in R: base vs ggplot2

     One of the major benefits of R is the ability to make publication-quality figures easily (and in the same environment as your statistical analysis). R's base installation comes with all of the functions that you might need to create beautiful figures. The primary function is plot(), with a long list of additional functions that will add tick marks, add labels, format the plotting area, draw shapes, etc. If you spend the time to become proficient at plotting with base functions, you will find that you end up drawing your figures in layers: you draw the plot area, you add points, you add lines, you add error bars, you add a legend, etc.

     There is a package, written by Hadley Wickham (also the creator of dplyr and tidyr), called ggplot2 that takes this fact to its logical conclusion. The two g's in the name stand for "grammar of graphics". The idea is that the functions in ggplot2 allow you to construct a beautiful figure layer by layer, without having to spend as much effort as you would with the base R functions. The received wisdom is that base R functions give you the most flexibility, but require the most effort to create good-looking figures, while ggplot2 requires the least effort to create good-looking figures, but you lose some flexibility (or rather, deviating substantially from the default look in ggplot2 will lead to complex code, just like base R).
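     The difference is easiest to see side by side. Below is a minimal sketch of the layered approach in both systems, using toy data; all of the names here are illustrative, not code from the course materials.

         # Base R: one function call per layer.
         x <- 1:10
         y <- x + rnorm(10)
         plot(x, y, type = "n", xlab = "x", ylab = "y")     # draw the (empty) plot area
         points(x, y, pch = 19)                             # add points
         abline(lm(y ~ x), lty = 2)                         # add a fitted line
         legend("topleft", legend = "linear fit", lty = 2)  # add a legend

         # ggplot2: the same figure, built by adding layers with `+`.
         library(ggplot2)
         ggplot(data.frame(x, y), aes(x = x, y = y)) +
           geom_point() +
           geom_smooth(method = "lm", se = FALSE, linetype = 2)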

  4. Why do we look at distributions?

     A distribution is simply a description of the number of times that an event (in this case, a judgment or rating) occurs relative to the other possible events. For each sentence type in our experiment, we assume that it has a single underlying acceptability value. However, there are other factors affecting its judgment — the lexical items and meaning of the specific item, the noise of the judgment process itself, any biases that the subject has, etc. So, in practice, we expect that there will be a distribution of judgments for a sentence type. The first thing we want to do is look at that distribution for each of our experimental conditions.

     In theory, we expect the distribution of judgments to be relatively normal (or Gaussian, or bell-shaped). The reason for this is that we expect the other factors that are influencing the judgments to be relatively random. When you mix a bunch of random factors together on top of a non-random factor (the sentence type), you get a normal (Gaussian, bell-shaped) distribution. So what we want to do is look at the distribution of each of our experimental conditions to make sure that they are roughly normally distributed. If they aren't roughly normal, then something might be wrong in our experiment (an outlier or two, a non-random bias for some number of participants, a non-random factor that we failed to control for, etc.).
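     To make this logic concrete, here is a toy simulation (not code from the course): one fixed underlying value plus several random influences yields an approximately normal distribution of judgments.

         set.seed(1)
         n <- 1000
         underlying    <- 0.5                  # the sentence type's underlying acceptability
         item.noise    <- rnorm(n, sd = 0.3)   # lexical/meaning differences across items
         subject.bias  <- rnorm(n, sd = 0.2)   # per-subject biases
         process.noise <- rnorm(n, sd = 0.4)   # noise in the judgment process itself
         judgments <- underlying + item.noise + subject.bias + process.noise
         hist(judgments, breaks = 30)          # roughly bell-shaped, centered near 0.5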

  5. Histograms

     A histogram shows the counts of each response type. The benefit of a histogram is that the y-axis, counts, is very intuitive, and shows you what the raw data looks like. One drawback of a histogram is that the shape of the distribution in a histogram is strongly dependent on the size of the bins that you choose (with continuous data, like z-scores, you have to define bins). If the bins are too small, a normal distribution will look non-normal, and if the bins are too big, a non-normal distribution can look normal. You can use the code in distribution.plots.r to generate histograms with different bin-widths and see the effect:

     [Figure: histograms of z-scored judgments for the four conditions at two different bin widths, faceted by embedded structure and dependency length]
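     The comparison can be sketched along these lines in ggplot2 (this is not distribution.plots.r itself; the data frame judgments and its columns zscores, dependency.length, and embedded.structure are assumed names):

         library(ggplot2)
         p <- ggplot(judgments, aes(x = zscores)) +
           facet_grid(embedded.structure ~ dependency.length)
         p + geom_histogram(binwidth = 0.1)  # narrow bins: even normal data looks jagged
         p + geom_histogram(binwidth = 0.5)  # wide bins: non-normal data can look smooth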

  6. Density plots

     A density plot shows you the probability density function for your distribution. The "curve" that people think of when they think about distributions is a probability density function. The idea behind a probability density function is that it shows the relative likelihood that a certain judgment will occur. Speaking more precisely, the total area under the curve of a pdf will be 1, and the area under the curve between two points will be the probability that a judgment will be between those two values. Much like binning, pdfs are necessary because there are an infinite number of possible values on a continuous scale (like z-scores), so the probability of any given judgment is infinitesimal. That isn't helpful. So we use the pdf to calculate the probability that a judgment is between two possible values.

     Like histograms and binning, pdfs will vary based on the kernel density estimation method that you use to calculate them. R tries its best to do this reasonably. You can use the code in the script to generate density plots using R's default kernel density estimation.

     [Figure: density plots of z-scored judgments for the four conditions, faceted by embedded structure and dependency length]
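     As a sketch (same assumed judgments data frame as above), the ggplot2 and base R versions look like this; both use R's default Gaussian kernel and bandwidth unless you override them.

         # ggplot2: one density panel per condition.
         ggplot(judgments, aes(x = zscores)) +
           geom_density() +
           facet_grid(embedded.structure ~ dependency.length)

         # Base R: density() exposes the estimation settings directly.
         d <- density(judgments$zscores)  # defaults: gaussian kernel, bw.nrd0 bandwidth
         plot(d)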

  7. Combining histograms and density plots

     You can combine histograms and density plots into one figure if you want. The code in distribution.plots.r shows you how to do this. One thing to note is that frequencies and density are typically on different scales. Frequency is typically much larger than density. So if you plot the two together, the density curve will be flattened.

     [Figure: histograms with an overlaid density curve on the frequency scale; the density curve is flattened against the x-axis]

     So what we probably want to do is use density alone for the y-axis, and scale the histogram to fit. R does this very easily (see the code). The result makes the histogram harder to interpret, but allows you to compare the raw responses to the estimated density function nicely.

     [Figure: the same plots with density on the y-axis and the histograms rescaled to match]
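     A sketch of the density-scaled overlay (same assumed data frame as above): mapping the histogram's y to the computed density statistic puts both layers on the same scale.

         ggplot(judgments, aes(x = zscores)) +
           geom_histogram(aes(y = after_stat(density)),  # rescale counts to density
                          binwidth = 0.25, fill = "grey80", colour = "grey40") +
           geom_density() +
           facet_grid(embedded.structure ~ dependency.length)
         # (in older versions of ggplot2, write aes(y = ..density..) instead)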

  8. Arranging the plots in different ways

     You may have noticed that the distribution plots have been arranged according to the two factors and their levels. This is called faceting, and is a very convenient way to organize multiple plots.

     You can organize faceting based on any factor you want. You can also do it based on one factor alone (creating a single column or a single row).

     The trick is to choose an arrangement that helps readers understand the data. For example, if you align the four conditions in a column, you can highlight the different locations of the distributions on the x-axis. This makes it clear that the fourth condition tends to have lower acceptability than the other three.

     [Figure: density plots faceted as a 2x2 grid by the two factors (left), and the four conditions (wh.non.sh, wh.non.lg, wh.isl.sh, wh.isl.lg) stacked in a single column (right)]
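     Both arrangements take one line of faceting code apiece; a sketch assuming the same judgments data frame, plus a condition column holding the four condition codes.

         # 2x2 grid: one factor defines the rows, the other the columns.
         ggplot(judgments, aes(x = zscores)) +
           geom_density() +
           facet_grid(embedded.structure ~ dependency.length)

         # Single column: stacked panels share the x-axis, so shifts in the
         # location of each distribution are easy to see.
         ggplot(judgments, aes(x = zscores)) +
           geom_density() +
           facet_wrap(~ condition, ncol = 1)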

  9. Plotting means and standard errors

     The second major plot type that you will (pretty much always) want to create is a plot of the condition means and their (estimated) standard errors. For any design that has more than one factor (two factors, three factors, etc.), you will probably want to create something called an interaction plot. An interaction plot is a line plot arranged by the levels of the factors.

     In a 2-D plot, you can only directly specify one axis. The other is the value of the responses. Typically, you specify the x-axis, and let the y-axis be the value of the responses. So, if we specify the x-axis to be the two levels of the DEPENDENCY LENGTH factor, we then need to use something else to specify the levels of EMBEDDED STRUCTURE. We can either use color or the type of line.

     [Figure: two interaction plots of mean z-score judgments by dependency length (short vs. long), with embedded structure (non-island vs. island) distinguished by color in one plot and by line type in the other]
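     Here is one way to build such a plot, sketched with dplyr and ggplot2 under the same assumed column names; the standard error is estimated as the standard deviation divided by the square root of the number of observations.

         library(dplyr)
         library(ggplot2)

         summary.df <- judgments %>%
           group_by(dependency.length, embedded.structure) %>%
           summarise(mean.z = mean(zscores),
                     se     = sd(zscores) / sqrt(n()),
                     .groups = "drop")

         ggplot(summary.df,
                aes(x = dependency.length, y = mean.z,
                    colour = embedded.structure, group = embedded.structure)) +
           geom_point() +
           geom_line() +   # connecting lines make non-parallelism (interactions) visible
           geom_errorbar(aes(ymin = mean.z - se, ymax = mean.z + se), width = 0.1) +
           labs(x = "dependency length", y = "mean z-score judgment")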
