Conducting web-based experiments for numerical cognition research

Arnold Kochari
Institute for Logic, Language, and Computation, University of Amsterdam
Donders Institute for Brain, Cognition, and Behaviour, Radboud University in Nijmegen
Web-based data collection for psychology

What is possible:
- surveys
- reaction times
- mouse tracking
- audio/video recording

Advantages:
- fast:
  - simultaneous data collection
  - no need for appointment management
- reaching diverse populations; large samples
- cheaper (not because of underpaying)
- easy sharing of the experiment scripts (no special software needed)

Issues:
- experiment programming and participant recruitment
- no control of the environment in which the experiment is completed
- timing less accurate than in labs
Timings of RT experiments in web browsers

- More variability in RTs due to imprecise timings (different monitors, keyboards, browsers)
- Clear lag in RTs measured by JavaScript (2-45 ms: Reimers & Stewart 2015; 25 ms: de Leeuw & Motz 2016)

However:
- the extra variability is random
- this lag is stable across conditions / within systems
- we can compensate for that by having more participants

Not a big issue for within-participant designs.
Between-participant effects can still be reliable, just with more participants.

Replications of classical effects in web-based experiments:
- Crump et al. 2013 (Stroop task, task switching, flanker task, Simon task, attentional blink, masked priming, etc.)
- Zwaan and Pecher 2012 (mental simulation in language comprehension)
- Barnhoorn et al. 2015 (Stroop, masked priming, attentional blink)
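Why a stable lag matters little for within-participant designs can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `condition_difference`, the baseline RT of 500 ms, the SD of 50 ms, and the 30 ms effect are all hypothetical; only the 25 ms lag is taken from the figures quoted above.

```python
import random
import statistics

def condition_difference(lag_ms, true_effect_ms=30.0, n_trials=200, seed=1):
    """Mean RT difference (slow minus fast condition) for one simulated participant.

    lag_ms is a constant measurement lag added to every trial on this
    participant's system (hypothetical model: stable within a system).
    """
    rng = random.Random(seed)  # fixed seed: same underlying RTs for every lag value
    fast = [rng.gauss(500.0, 50.0) + lag_ms for _ in range(n_trials)]
    slow = [rng.gauss(500.0 + true_effect_ms, 50.0) + lag_ms for _ in range(n_trials)]
    return statistics.mean(slow) - statistics.mean(fast)

# The same participant measured without lag and with a constant 25 ms lag.
diff_no_lag = condition_difference(lag_ms=0.0)
diff_with_lag = condition_difference(lag_ms=25.0)
print(round(diff_no_lag, 3), round(diff_with_lag, 3))
```

Because the lag is added to both conditions, it cancels in the difference. In a between-participant comparison each group would carry its participants' different lags, which adds noise and is why more participants are needed there.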
Technical questions
Programming experiments for web browsers

Free and open source:
• jsPsych - scripting manually
• lab.js - graphical interface + scripting manually
• PsychoPy/PsychoJS - graphical interface + scripting manually
• PsyToolkit - also takes care of data collection

Commercial:
• Gorilla - graphical interface

Importantly: the same scripts can be used on lab computers
Data collection tools

Hosting data collection:
• combined with the experiment programming platform (e.g. Gorilla.sc)
• special dedicated service (e.g. JATOS, psiTurk)
• personal or university web hosting space

Participant recruitment:
• Amazon MTurk (not available from every country)
• Prolific.ac
• Qualtrics
• others
Results with 2 classical paradigms
Size-congruity effect

Close replication of Henik & Tzelgov 1982

Materials: 8 digit pairs
Experimental factors: congruity (congruent vs. incongruent) X numerical distance (2 vs. 4) X font size distance (small vs. large)
- 64 unique experimental trials + neutral and empty trials
- trial: fixation cross for 150 ms followed by digit pairs for max. 1850 ms
- participants used P and Q as response keys

Experiment 1a: number comparison task; N participants = 23
Experiment 1b: size comparison task; N participants = 24
- average time spent on task: 6-8 minutes
Size-congruity effect: numerical distance

[Figure panels: Current study; Henik & Tzelgov, Exp 2]

Semantic comparison:
* main effect of congruity
* main effect of numerical distance

Physical comparison:
* main effect of congruity
* interaction of congruity and physical size distance

Experiment scripts, data and analysis code are available at: http://osf.io/dy8kf
Size-congruity effect: physical size distance

[Figure panels: Henik & Tzelgov, Exp 2; Current study]

Physical comparison (physical size is the relevant dimension):
+ * main effect of physical size distance
+ * interaction of congruity and physical size distance

Semantic comparison (physical size is the to-be-ignored dimension):
+ * interaction of congruity and physical size distance
Distance and priming effects

Replication of Van Opstal, Gevers, De Moor, & Verguts 2008

[Figure: trial sequence; durations shown: 500 ms, 100 ms, 83 ms, 100 ms, max. 2000 ms]

Materials: all digits from 1 to 9 except 5 included as primes and targets; 64 unique combinations
Experimental factors:
- distance of the target from the standard (1-4)
- distance of the prime from the standard (1-4)
- congruity
- 256 trials in total
- participants N = 72
- participants used P and Q as response keys; 2 different possible mappings
- average time spent on task: 15 minutes
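The design above can be sketched in a few lines. This is an illustration, not the script used in the study: the standard of 5 is inferred from the fact that 5 is excluded and distances run from 1 to 4, and "congruent" is assumed to mean that prime and target fall on the same side of the standard. The names `STANDARD`, `DIGITS`, and `build_trials` are hypothetical.

```python
from itertools import product

STANDARD = 5  # inferred: the only digit excluded from primes and targets
DIGITS = [d for d in range(1, 10) if d != STANDARD]

def build_trials():
    """Return one record per unique prime-target combination."""
    trials = []
    for prime, target in product(DIGITS, repeat=2):
        trials.append({
            "prime": prime,
            "target": target,
            "target_distance": abs(target - STANDARD),
            "prime_distance": abs(prime - STANDARD),
            # Assumption: congruent = prime and target on the same side of the standard.
            "congruent": (prime < STANDARD) == (target < STANDARD),
            "identical": prime == target,
        })
    return trials

trials = build_trials()
print(len(trials))  # number of unique prime-target combinations
```

Eight primes crossed with eight targets give the 64 unique combinations reported on the slide. Presenting each combination four times would give 256 trials, but that repetition scheme is an assumption.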
Distance and priming effects: distance of the target

[Figure panels: Van Opstal et al. (Exp 1); Current study]

- only trials with identical prime and target analysed
- 2 (size: before/after the standard) X 4 (abs. comparison distance: 1, 2, 3, or 4) within-subjects ANOVA on median correct RTs
- main effect of comparison distance: F(3, 213) = 10.6, p < 0.001.
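The "median correct RTs" entering such an ANOVA are one value per participant per design cell. A minimal sketch of that aggregation step follows; the record fields (`participant`, `condition`, `correct`, `rt_ms`) and the example data are hypothetical, and the actual analysis code is the one shared on OSF.

```python
from collections import defaultdict
from statistics import median

def median_correct_rts(trial_records):
    """Median RT of correct trials for each (participant, condition) cell."""
    cells = defaultdict(list)
    for t in trial_records:
        if t["correct"]:  # error trials are dropped before taking the median
            cells[(t["participant"], t["condition"])].append(t["rt_ms"])
    return {cell: median(rts) for cell, rts in cells.items()}

# Hypothetical data: one participant, one cell (after the standard, distance 1).
example = [
    {"participant": "p01", "condition": ("after", 1), "correct": True, "rt_ms": 520},
    {"participant": "p01", "condition": ("after", 1), "correct": True, "rt_ms": 560},
    {"participant": "p01", "condition": ("after", 1), "correct": False, "rt_ms": 300},
    {"participant": "p01", "condition": ("after", 1), "correct": True, "rt_ms": 610},
]
cell_medians = median_correct_rts(example)
print(cell_medians)
```

Using the median rather than the mean limits the influence of occasional very slow trials, which are more likely when participants are not in a controlled lab setting.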
Distance and priming effects: congruity

Median RTs (ms):

                              congruent   incongruent   difference
Van Opstal et al. (Exp 1)
  before standard                405          427          22 ms
  after standard                 399          429          30 ms
Current study
  before standard                528          552          24 ms
  after standard                 538          560          22 ms

- only trials with non-identical prime and target analysed
- 2 (size: before/after the standard) X 2 (congruency) within-subjects ANOVA on median correct RTs
- main effect of congruency: F(1, 71) = 58.4, p < 0.001.
Distance and priming effects: distance of the prime

[Figure panels: Current study; Van Opstal et al. (Exp 1)]

- only congruent trials with non-identical prime and target analysed
- 2 (size: before/after the standard) X 4 (abs. priming distance: 1, 2, 3, or 4) within-subjects ANOVA on median correct RTs
- main effect of prime distance: F(3, 213) = 13.9, p < 0.001.

NB: Primes were not actually displayed for exactly 83 ms!
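The slide does not say why the prime duration deviated from 83 ms. One common contributor in browser experiments, assumed here purely for illustration, is that a stimulus can only change when the monitor refreshes, so on-screen durations are whole multiples of the frame duration. The sketch below shows what a requested 83 ms would become at a few refresh rates if the display rounded to the nearest frame. The function name, the rounding rule, and the list of refresh rates are all hypothetical, and real browsers may behave differently.

```python
def displayed_duration_ms(requested_ms, refresh_hz):
    """Duration actually shown if the display rounds to the nearest whole frame.

    Assumption: nearest-frame rounding, with at least one frame shown.
    """
    frame_ms = 1000.0 / refresh_hz
    n_frames = max(1, round(requested_ms / frame_ms))
    return n_frames * frame_ms

# Requested prime duration from the slide, on monitors with different refresh rates.
for hz in (50, 60, 75, 144):
    print(hz, "Hz:", round(displayed_duration_ms(83, hz), 2), "ms")
```

Under this assumption a 60 Hz monitor would show five frames (about 83.3 ms), while 50 Hz and 75 Hz monitors would show about 80 ms, so the real duration depends on each participant's hardware.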
Lessons learnt
Some tips

- Fair pay is important.
- Not too many trials. I try to keep experiments to a maximum of 15-20 minutes.
- Pre-register participant exclusion criteria: there are a lot of researcher degrees of freedom here.
- Ensuring participants put in effort:
  - I automatically exclude everyone who spent less than X s on reading the instructions
  - 50% error rate
  - a question at the end where they can give an honest answer
- Ensuring participants do not get distracted:
  - tell them their data will be lost if they switch windows/tabs
  - experiment has an automatic pace, so there is no opportunity to decide to do something else
  - inspect the duration of breaks (>3 minutes means they got distracted)
- Ensuring participant naiveté:
  - put a cap on the number of previous studies
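Pre-registered exclusion rules like the ones above are easy to encode as a single function, so that they are applied identically to every participant. This is a minimal sketch, not the author's script. The function and field names are hypothetical, the minimum instruction time is left as a required argument because the slide gives it only as "X", and reading "50% error rate" as "exclude at or above 50% errors" is an assumption.

```python
def screen_participant(instruction_time_s, n_errors, n_trials, break_durations_s,
                       min_instruction_time_s, max_error_rate=0.5, max_break_s=180.0):
    """Apply exclusion rules and flag long breaks for one participant."""
    reasons = []
    if instruction_time_s < min_instruction_time_s:
        reasons.append("instructions read too quickly")
    if n_errors / n_trials >= max_error_rate:
        reasons.append("error rate at or above threshold")
    # Breaks over 3 minutes are only flagged for inspection, not auto-excluded.
    long_breaks = [b for b in break_durations_s if b > max_break_s]
    return {"exclude": bool(reasons), "reasons": reasons, "long_breaks": long_breaks}

# Hypothetical participant; the 10 s instruction threshold is a placeholder value.
result = screen_participant(
    instruction_time_s=4.0, n_errors=10, n_trials=100,
    break_durations_s=[30, 200], min_instruction_time_s=10.0,
)
print(result)
```

Keeping the thresholds as named parameters makes it straightforward to state them in the pre-registration and to reuse the same values in the analysis.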
Experiment scripts, data and analysis code are available at: http://osf.io/dy8kf
References

• Barnhoorn, J. S., Haasnoot, E., Bocanegra, B. R., & van Steenbergen, H. (2015). QRTEngine: An easy solution for running online reaction time experiments using Qualtrics. Behavior Research Methods, 47(4), 918-929.
• Crump, M. J., McDonnell, J. V., & Gureckis, T. M. (2013). Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research. PLoS ONE, 8(3), e57410.
• de Leeuw, J. R., & Motz, B. A. (2016). Psychophysics in a Web browser? Comparing response times collected with JavaScript and Psychophysics Toolbox in a visual search task. Behavior Research Methods, 48(1), 1-12.
• Henik, A., & Tzelgov, J. (1982). Is three greater than five: The relation between physical and semantic size in comparison tasks. Memory & Cognition, 10(4), 389-395.
• Reimers, S., & Stewart, N. (2015). Presentation and response timing accuracy in Adobe Flash and HTML5/JavaScript Web experiments. Behavior Research Methods, 47(2), 309-327.
• Van Opstal, F., Gevers, W., De Moor, W., & Verguts, T. (2008). Dissecting the symbolic distance effect: Comparison and priming effects in numerical and nonnumerical orders. Psychonomic Bulletin & Review, 15(2), 419-425.
• Zwaan, R. A., & Pecher, D. (2012). Revisiting mental simulation in language comprehension: Six replication attempts. PLoS ONE, 7(12), e51382.