Table of contents
1. Introduction: You are already an experimentalist
2. Conditions
3. Items
4. Ordering items for presentation
5. Judgment Tasks
6. Recruiting participants
7. Pre-processing data (if necessary)
8. Plotting
9. Building linear mixed effects models
10. Evaluating linear mixed effects models using Fisher
11. Neyman-Pearson and controlling error rates
12. Bayesian statistics and Bayes Factors
13. Validity and replicability of judgments
14. The source of judgment effects
15. Gradience in judgments
Section 1: Design
Section 2: Analysis
Section 3: Application
Before we get started
Getting in touch:
- I invite you to join the Experimental Syntax Slack Channel. You can join the ‘team’ and get an account by following this link: https://join.slack.com/expsyntax2017/shared_invite/MjA4ODE1MzExNjk3LTE0OTk0MjkzODgtYjQ1YWJiYmViMg
- I plan to hold office hours on Monday from 12-2 in the Library Starbucks; other appointments are available on request.
Before we get started
Think of Slack as a giant chatroom! I have set up several channels for class topics that you can use to chat. I will do my best to answer chats in these channels as fast as I can… but you can also help your fellow classmates!
Quick recap:
Up to now you have:
- Gotten a taste of what experimental syntax is all about.
- Seen two-way crossed factorial designs (the 2x2!).
- Worked through additive factors logic.
- Seen a variety of dependent measures used to measure sentence acceptability, including Likert scale ratings, magnitude estimation, and 2AFC tasks (both Yes-No and Forced choice).
- Discussed effect sizes and statistical power for observing effects.
Today we will:
- Discuss the ‘how-to’ of item construction for a sample 2x2 experiment.
- See how to arrange items in an experimental context.
- Talk about various ‘task effects’ and how to mitigate them.
- Work an example of Latin Square distribution of items into experimental lists by hand to understand the logic.
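As a preview of the Latin Square logic named in the last point, here is a minimal sketch in Python. It assumes 4 lexically matched item sets and the 4 conditions of a 2x2 design; the function name `latin_square_lists` and the rotation scheme are illustrative assumptions, not necessarily the procedure we will work through by hand.

```python
def latin_square_lists(n_items, conditions):
    """Rotate conditions across items so that each list shows every item
    exactly once, and each item appears in every condition across lists.
    (Hypothetical helper for illustration only.)"""
    k = len(conditions)
    if n_items % k != 0:
        raise ValueError("number of items must be a multiple of the number of conditions")
    lists = []
    for list_idx in range(k):
        # Shift the condition assignment by one position for each new list.
        one_list = [
            (item + 1, conditions[(item + list_idx) % k])
            for item in range(n_items)
        ]
        lists.append(one_list)
    return lists


# Condition labels follow the 2x2 whether-island example in this section.
conditions = ["non-island short", "non-island long", "island short", "island long"]
lists = latin_square_lists(4, conditions)

for number, one_list in enumerate(lists, start=1):
    print(f"List {number}: {one_list}")
```

With 4 items and 4 conditions, each participant list contains one item per condition, and no participant sees the same lexical material twice.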
Linguistics tends to use repeated measures
[Figure: two panels, "Repeated Measures" and "Independent Measures", each showing condition 1 and condition 2]
Repeated Measures: If each participant sees every condition, we call it repeated measures. It is also called a within-subjects design.
Independent Measures: If each participant sees only one condition, we call it independent measures. It is also called a between-subjects design.
Linguistics tends to use repeated measures
Repeated Measures:
- Requires fewer participants
- Individual differences between participants are not a confound
- Increased statistical power
- Interaction of two conditions is a potential confound
Independent Measures:
- Requires more participants
- Individual differences between participants are a possible confound
- Decreased statistical power
- Interaction of two conditions is impossible
There are four types of items to create
After you have designed your conditions, the next step is to actually make the items that will go in your experiment. There are four types of items that you will need to construct:
- Instruction items: These are the items that appear in your instructions. The goal is to illustrate the task and, if necessary, anchor the response scale.
- Practice items: These are items that occur at the beginning of the experiment. They help to familiarize the participant with the task. They are typically not analyzed in any way. They can be marked as separate (announced) or just part of the experiment (unannounced).
- Experimental items: These are your treatment and control conditions.
- Filler items: These are items that you add to the experiment for various reasons: filling out the scale, hiding the experiment’s purpose, and balancing types of items.
Instruction items
The number and type of instruction items depend on your task.
If the task is a scale task with an odd number of points (e.g., a 7-point scale), I recommend three instruction items: one at the bottom of the scale, one at the top, and one in the middle. Here are three that I use. They were pre-tested in my massive LI replication study:

Item                                                  LI-Mode   LI-Mean
The was insulted waitress frequently.                 1         1
Tanya danced with as handsome a boy as her father.    4         4
This is a pen.                                        7         7

If the scale has an even number of points, you would probably just use two: the bottom and top of the scale.
If the task is yes/no, you might use three: a clear yes, a clear no, and one in between.
If the task is forced-choice, you might use three pairs: a pair with a large difference, a pair with a medium difference, and a pair with a small difference.
Practice items
Practice items give participants a chance to work out any bugs before they respond to items that you actually care about (the experimental items).
For scale tasks, practice items give participants a chance to see the full range of variability in acceptability, so that they can use the scale appropriately. So in scale tasks, it is important to have practice items that span the range of acceptability.
Here are 9 that I have pre-tested in the LI study: one for each point on a 7-point scale, plus one more for each endpoint.

Item                                                                           LI-Mode   LI-Mean
She was the winner.                                                            7         7.00
Promise to wash, Neal did the car.                                             1         1.31
The brother and sister that were playing all the time had to be sent to bed   4         3.91
The children were cared for by the adults and the teenagers                    6         6.08
Ben is hopeful for everyone you do to attend.                                  2         2.00
All the men seem to have all eaten supper                                      5         4.92
They consider a teacher of Chris geeky.                                        3         3.09
It seems to me that Robert can’t be trusted.                                   7         6.92
There might mice seem to be in the cupboard.                                   1         1.25
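If you assemble your own practice set, it is worth checking that it really does span the scale. The sketch below, in Python, runs that check on the nine items and LI-Mode/LI-Mean values listed above; the variable names and the checking approach are my own illustration, not part of the LI study materials.

```python
from collections import Counter

# (sentence, LI-Mode, LI-Mean) for the nine practice items listed above.
practice_items = [
    ("She was the winner.", 7, 7.00),
    ("Promise to wash, Neal did the car.", 1, 1.31),
    ("The brother and sister that were playing all the time had to be sent to bed", 4, 3.91),
    ("The children were cared for by the adults and the teenagers", 6, 6.08),
    ("Ben is hopeful for everyone you do to attend.", 2, 2.00),
    ("All the men seem to have all eaten supper", 5, 4.92),
    ("They consider a teacher of Chris geeky.", 3, 3.09),
    ("It seems to me that Robert can't be trusted.", 7, 6.92),
    ("There might mice seem to be in the cupboard.", 1, 1.25),
]

scale_points = range(1, 8)  # a 7-point scale

# Count how many practice items have each scale point as their modal rating.
mode_counts = Counter(mode for _, mode, _ in practice_items)

# Scale points with no practice item at all (should be empty).
missing = [point for point in scale_points if mode_counts[point] == 0]

# Scale points covered by two practice items (should be the two endpoints).
doubled = [point for point in scale_points if mode_counts[point] == 2]

print("missing scale points:", missing)
print("points with two items:", doubled)
```

The same check works for any odd-numbered scale if you change `scale_points` and supply pre-tested ratings for your own items.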
Practice items
For non-scale tasks, the rationale behind the practice items might be different. For yes/no tasks, you may want to give a mix of clear yes’s, clear no’s, and intermediate sentences, so that participants can sharpen their own internal boundary. For forced-choice tasks, you may want to include a mix of large differences, small differences, and medium differences, so that participants can practice identifying each size of difference.
Announced practice is when you clearly indicate in the experiment that the items are practice items. This signals to the participants that it is ok to make mistakes. Announced practice is typical in psycholinguistic experiments, because it gives participants a chance to ask questions of the experimenter.
Unannounced practice is when the practice items simply appear as part of the main experiment. This is appropriate if the task is relatively intuitive, such that participants won’t have questions. This is what I do with all of my judgment studies.
I typically present the (unannounced) practice items in the same order for all participants. You could also counterbalance the order (more on this later).
Experimental items
Here is a starting set of experimental items for the whether island experiment we started to construct in the previous section. Let’s use these to see the issues that arise in creating experimental items.

Condition 1: non-island short
1. Who __ thinks that Jack stole the car?
2. Who __ thinks that Amy chased the bus?
3. Who __ thinks that Dale sold the TV?
4. Who __ thinks that Stacey wrote the letter?

Condition 2: non-island long
1. What do you think that Jack stole __?
2. What do you think that Amy chased __?
3. What do you think that Dale sold __?
4. What do you think that Stacey wrote __?

Condition 3: island short
1. Who __ wonders whether Jack stole the car?
2. Who __ wonders whether Amy chased the bus?
3. Who __ wonders whether Dale sold the TV?
4. Who __ wonders whether Stacey wrote the letter?

Condition 4: island long
1. What do you wonder whether Jack stole __?
2. What do you wonder whether Amy chased __?
3. What do you wonder whether Dale sold __?
4. What do you wonder whether Stacey wrote __?
Experimental items - Lexically matched sets
The first thing to note is that the items are created in lexically matched sets. The idea here is that the only thing you want varying between conditions is the syntactic manipulation. So, to the extent possible, you use the same lexical items in all 4 conditions.

Condition 1: non-island short
1. Who __ thinks that Jack stole the car?
2. Who __ thinks that Amy chased the bus?
3. Who __ thinks that Dale sold the TV?
4. Who __ thinks that Stacey wrote the letter?

Condition 2: non-island long
1. What do you think that Jack stole __?
2. What do you think that Amy chased __?
3. What do you think that Dale sold __?
4. What do you think that Stacey wrote __?

Condition 3: island short
1. Who __ wonders whether Jack stole the car?
2. Who __ wonders whether Amy chased the bus?
3. Who __ wonders whether Dale sold the TV?
4. Who __ wonders whether Stacey wrote the letter?

Condition 4: island long
1. What do you wonder whether Jack stole __?
2. What do you wonder whether Amy chased __?
3. What do you wonder whether Dale sold __?
4. What do you wonder whether Stacey wrote __?

This helps minimize confounds in the experiment. The only lexical confound left is if the syntactic manipulation interacts with the lexical items.
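One way to guarantee lexical matching is to generate all four conditions from a single set of lexical fillers. Here is a minimal Python sketch that reproduces the 16 sentences above; the factor names `structure` and `distance`, the dictionary layout, and the function `build_items` are my own assumptions for illustration, and the templates are simply read off the example items.

```python
# One entry per lexically matched set: the only material that varies across sets.
lexical_sets = [
    {"name": "Jack", "verb": "stole", "object": "the car"},
    {"name": "Amy", "verb": "chased", "object": "the bus"},
    {"name": "Dale", "verb": "sold", "object": "the TV"},
    {"name": "Stacey", "verb": "wrote", "object": "the letter"},
]

# One template per condition: the only material that varies across conditions.
# The long conditions replace the object with the gap, so they ignore "object".
templates = {
    ("non-island", "short"): "Who __ thinks that {name} {verb} {object}?",
    ("non-island", "long"): "What do you think that {name} {verb} __?",
    ("island", "short"): "Who __ wonders whether {name} {verb} {object}?",
    ("island", "long"): "What do you wonder whether {name} {verb} __?",
}


def build_items(lexical_sets, templates):
    """Cross every lexical set with every condition template."""
    items = []
    for set_id, lexical in enumerate(lexical_sets, start=1):
        for (structure, distance), template in templates.items():
            items.append({
                "set": set_id,
                "structure": structure,
                "distance": distance,
                # str.format ignores keyword arguments the template does not use.
                "sentence": template.format(**lexical),
            })
    return items


items = build_items(lexical_sets, templates)
for item in items:
    print(item["set"], item["structure"], item["distance"], item["sentence"])
```

Because each sentence is built from the same fillers, any difference between conditions within a set comes from the template, that is, from the syntactic manipulation.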