
Experimental design • Spring 2017 • Michelle Mazurek



  1. Experimental design • Spring 2017 • Michelle Mazurek • Some content adapted from Bilge Mutlu, Vibha Sazawal, Howard Seltman

  2. Administrative • No class Tuesday • Homework 1 • Plug for Tanu Mitra talk + grad student session

  3. Today’s class • Quick HCI background • Defining an experiment • Threats to validity

  4. QUICK HCI BACKGROUND

  5. “The old computing is about what computers can do; the new computing is about what people can do.” – Ben Shneiderman

  6. How would you define HCI? • What are the key goals/questions?

  7. Some questions • How to DESIGN a computer system? • How to EVALUATE a computer system? • What are the PSYCHOLOGICAL THEORIES governing interaction with technology? • How does emerging technology create SOCIETAL CHANGE? • How does technology intersect with ECONOMICS and POLICY?

  8. DEFINING AN EXPERIMENT

  9. The goal of an experiment is … • “The goal of any research design is to arrive at clear answers to questions of interest [about the populations] while expending a minimum of resources.” – Ramsey and Shafer • Avoid threats that reduce interpretability or limit generalizability – Control what you can’t prevent

  10. So what is an experiment? • Ideally, testing causality – Change in X causes a change in Y • Experimental setup: – Multiple levels of the independent variable (“conditions”, “treatments”) – Control for other things that might matter

  11. What do we mean by control? • Two basic options: • Control variables: Same for every condition – Plus: actually controlling – Minus: hard to get them all. Generalizability? • Random variables: Randomly assign to conditions – Law of large numbers: Unimportant differences will fall out in the noise – Minus: How large is large?
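
The "how large is large?" question on the slide above can be explored with a small simulation. This is only a sketch under stated assumptions: the nuisance variable (a standardized "prior experience" score), the sample sizes, and the function name `mean_imbalance` are all hypothetical and not from the lecture.

```python
import random
import statistics

def mean_imbalance(n_participants, trials=200, seed=0):
    """Average absolute gap in a nuisance variable between two random groups."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        # Hypothetical nuisance variable, e.g. standardized prior experience.
        experience = [rng.gauss(0, 1) for _ in range(n_participants)]
        rng.shuffle(experience)  # random assignment to two conditions
        half = n_participants // 2
        group_a, group_b = experience[:half], experience[half:]
        gaps.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(gaps)

imbalance_by_n = {n: mean_imbalance(n) for n in (10, 100, 1000)}
for n, gap in imbalance_by_n.items():
    print(f"n={n:5d}  mean |gap| between groups = {gap:.3f}")
```

With small samples the two groups can differ noticeably on the uncontrolled variable by chance alone; the gap shrinks as the sample grows, which is the sense in which differences "fall out in the noise".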

  12. Third option: Constrained variables • Also sometimes called blocking • Distribute variation across conditions: – Per condition: 1/3 novice, 1/3 intermediate, 1/3 expert • Pluses: Works in smaller samples • Minuses: What is not being controlled for?
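
One way to implement the blocking described above is to shuffle participants within each expertise level and then deal them out evenly to conditions. The sketch below assumes expertise is already known for every participant; the participant IDs, condition names, and the helper `blocked_assignment` are illustrative, not part of the lecture.

```python
import random
from collections import Counter, defaultdict

def blocked_assignment(participants, conditions, seed=None):
    """participants: list of (participant_id, block_label). Returns {id: condition}."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for pid, level in participants:
        blocks[level].append(pid)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize within the block
        for i, pid in enumerate(members):
            # Deal block members round-robin so each condition gets an equal share.
            assignment[pid] = conditions[i % len(conditions)]
    return assignment

# Hypothetical pool: 18 participants, 6 at each expertise level.
levels = ["novice", "intermediate", "expert"]
participants = [(f"P{i:02d}", levels[i % 3]) for i in range(18)]
conditions = ["A", "B"]

assignment = blocked_assignment(participants, conditions, seed=1)
per_cell = Counter((assignment[pid], level) for pid, level in participants)
for cell, count in sorted(per_cell.items()):
    print(cell, count)
```

Every condition ends up with the same mix of novices, intermediates, and experts, even in a small sample. Anything not used as a blocking variable is still left to chance.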

  13. THREATS TO VALIDITY

  14. 1. Internal validity • Experiment was properly designed – And can show causality! • Threatened by confounds – Multiple things that vary between conditions

  15. Avoiding confounds • Randomize condition assignment – Best option whenever possible • Change only one variable at a time – Use more conditions for more variables • Use blinding – Expectation is a confound! (on both sides) • Use a control group
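
"Use more conditions for more variables" can be read as a full factorial design: one condition per combination of levels, so that any two conditions can be compared while only one variable differs. The sketch below assumes that reading; the two variables and their levels are hypothetical examples.

```python
from itertools import product

# Hypothetical independent variables and levels, for illustration only.
factors = {
    "password_meter": ["off", "on"],
    "min_length": [8, 12, 16],
}

def full_factorial(factors):
    """Return one condition (a dict) per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

conditions = full_factorial(factors)
for condition in conditions:
    print(condition)
```

Two variables with 2 and 3 levels give 6 conditions. The number of conditions grows multiplicatively, which is the practical cost of studying several variables at once.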

  16. Threats to internal validity • Learning/ordering effects (much more later) • Placebo effect • Self-selection • Dropouts • Errors in measurement • Bad randomization

  17. 2. External validity • What population does your sample represent? – Race, gender, age, nationality, education, others • What environment does your sample represent? – Carefully controlled study vs. real world

  18. Sampling • Best: Truly random sample of the population – Hard, expensive • Worst: Convenience – Undergrads who want free pizza – Other grad students in my lab – My Facebook friends • Most good studies are somewhere in between • CS studies sometimes lean toward convenience
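
The gap between a random sample and a convenience sample can be illustrated with synthetic data. Everything in this sketch is an assumption made up for the illustration: the population, the "comfort with technology" score, and its dependence on age do not come from the lecture or from real data.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: the measured score declines with age, plus noise.
population = []
for _ in range(10_000):
    age = rng.randint(18, 80)
    comfort = 100 - age + rng.gauss(0, 5)
    population.append((age, comfort))

def mean_comfort(people):
    return statistics.mean(score for _, score in people)

population_mean = mean_comfort(population)

# Truly random sample of the whole population.
random_sample = rng.sample(population, 200)

# Convenience sample: only people aged 18 to 22 ("undergrads who want free pizza").
undergrads = [person for person in population if person[0] <= 22]
convenience_sample = rng.sample(undergrads, 200)

random_error = abs(mean_comfort(random_sample) - population_mean)
convenience_error = abs(mean_comfort(convenience_sample) - population_mean)
print(f"random sample error:      {random_error:.1f}")
print(f"convenience sample error: {convenience_error:.1f}")
```

The convenience sample is just as large as the random one, but its estimate is far from the population value because the sampled group differs systematically on a variable that drives the outcome.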

  19. Environment • Does your experiment reflect real-world conditions for the thing you are testing? • In medicine, taking medications in correct dosage and on time • In security research, security as a secondary task • In general: time constraints, realism of the synthetic task, competing incentives, etc.

  20. Threats to external validity • Poor sampling • Non-response/self-selection, dropout • Unrealistic environment

  21. Internal vs. external • In general, tension between them. Why? • More control variables – Internal: UP – External: DOWN

  22. 3. Construct validity • Often difficult to directly measure the concept(s) of interest – Independent and dependent variables • What do our metrics measure? – Is it what we intended?

  23. Analyzing your construct(s) • Is there a gold standard? – Use it – Or, correlate your construct with it • What else should it correlate with? • Is it reliable? – Inter-rater, test-retest • Floor and ceiling effects • Potential for circular reasoning
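
For the inter-rater reliability check mentioned above, one commonly used statistic is Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. The slides do not name a specific statistic, so treat this choice, the labels, and the ratings below as assumptions made for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two raters to the same ten responses.
rater_a = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
rater_b = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 items (0.8), chance agreement is 0.5, and kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6. Raw agreement alone would overstate reliability when one label dominates.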
