IN5060 Performance in distributed systems User studies (cntd)
Does blur hide asynchrony? Study by Ragnhild Eg (Simula) et al., 2011
Perception of synchrony
Sensitivity to perceptual synchrony is subjective and depends on the content
§ Spoken sentences (Grant et al., 2003)
  − Discrimination thresholds: ≈ 50 ms audio lead, ≈ 200 ms audio lag
§ Hitting table with wand (Levitin et al., 2000)
  − Synchrony thresholds set to 75 %: 41 ms audio lead to 45 ms audio lag
§ Music, baseball, speech (Vatakis & Spence, 2006)
  − Temporal order judgements (audio/video first)
Stimuli
§ 3 content types
  − Chess game
  − News broadcast
  − Drummer
§ 9 asynchrony levels
Stimuli
§ Visual distortion, 4 levels, Gaussian blur filter
  − Undistorted
  − Blur 2x2 pixels
  − Blur 4x4 pixels
  − Blur 6x6 pixels
Procedure
§ Carried out at the Speech Lab, NTNU
Audio streaming from PPT in Zoom is really bad. See the examples here: https://drive.google.com/drive/folders/1hxXFdh5xCeN1pMril2kZzNwPC3ZmuL-u?usp=sharing
Chess content - 200 ms audio lead
Chess content - 200 ms audio lag, blurred
News - 300 ms audio lag, blurred
Drums - 100 ms audio lag, blurred
Drums - 150 ms audio lead, slightly blurred
Design & Analysis
§ 2 independent studies
§ Full-factorial design
§ 2 repetitions of each condition
§ Binomial responses converted to percentages
§ Repeated-measures ANOVAs
§ Separate analyses for:
  − Audio lag and audio lead (different scales)
  − Content types (different response patterns)
Mean perceived synchrony, averaged across blur levels
[Figure; axes: mean % perceived synchrony, asynchrony times]
Assessment of relevance
Visual distortion: 5 settings, 18 participants

| Condition | Content | F-statistic |
|---|---|---|
| Audio lag | Chess | F(4,85)=88.79, p<.001 |
| Audio lag | TV2 | F(4,85)=232.54, p<.001 |
| Audio lag | Drums | F(4,85)=197.57, p<.001 |
| Audio lead | Chess | F(4,85)=71.77, p<.001 |
| Audio lead | TV2 | F(4,85)=100.26, p<.001 |
| Audio lead | Drums | F(4,85)=126.31, p<.001 |
Blur distortion, audio lag
[Figure panels: TV2, Chess, Drums]
§ TV2: F(13,204)=0.59, not significant
§ Chess: F(13,204)=0.73, not significant
§ Drums: F(13,204)=1.44, not significant
Blur distortion, audio lead
[Figure panels: TV2, Chess, Drums]
§ Reported statistics: F(13,204)=2.26, p<.01; F(13,204)=1.99, p<.05; F(13,204)=1.25, not significant
ANOVA Analysis of Variance
Analysis of Variance (ANOVA)
§ Partitions variation into a part that can be explained and a part that cannot be explained
§ Example:
  − Easy to see that a regression that explains 70% of the variation is not as good as one that explains 90% of the variation
  − But how much of the explained variation is good?
§ Enter: ANOVA
Before-and-After Comparison

| Candidate (i) | Audio lag ($b_i$) | Audio lead ($a_i$) | Difference ($d_i = b_i - a_i$) |
|---|---|---|---|
| 1 | 85 | 86 | -1 |
| 2 | 83 | 88 | -5 |
| 3 | 94 | 90 | 4 |
| 4 | 90 | 95 | -5 |
| 5 | 88 | 91 | -3 |
| 6 | 87 | 83 | 4 |

Mean of differences $\bar{d} = -1$, standard deviation $s_d = 4.15$
Before-and-After Comparison
Mean of differences $\bar{d} = -1$, standard deviation $s_d = 4.15$
§ From the mean of differences, it appears that audio lag reduced performance
§ However, the standard deviation is large
§ Is the variation between the two alternatives greater than the variation (error) in the measurements?
§ Confidence intervals can work, but what if there are more than two alternatives?
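The numbers on this slide can be reproduced with a few lines of Python. The sketch below recomputes the paired differences, their mean and their sample standard deviation from the table, and then forms a 95% confidence interval for the mean difference. The t quantile (2.571 for 5 degrees of freedom) is an assumption taken from a standard t table, and the variable names are illustrative only.

```python
import math
import statistics

# Scores copied from the before-and-after table (candidates 1..6).
audio_lag = [85, 83, 94, 90, 88, 87]   # b_i
audio_lead = [86, 88, 90, 95, 91, 83]  # a_i

# Paired differences d_i = b_i - a_i.
diffs = [b - a for b, a in zip(audio_lag, audio_lead)]

mean_d = statistics.mean(diffs)  # mean of differences
sd_d = statistics.stdev(diffs)   # sample standard deviation (n - 1 in the denominator)

# Assumption: two-sided 95% Student-t quantile for n - 1 = 5 degrees of
# freedom, read from a standard t table.
T_QUANTILE_95_DF5 = 2.571

half_width = T_QUANTILE_95_DF5 * sd_d / math.sqrt(len(diffs))
ci_low, ci_high = mean_d - half_width, mean_d + half_width

print(f"mean of differences = {mean_d}")
print(f"standard deviation  = {sd_d:.2f}")
print(f"95% CI              = ({ci_low:.2f}, {ci_high:.2f})")
print("interval contains zero:", ci_low < 0 < ci_high)
```

Under these assumptions the interval contains zero, which is the formal counterpart of the remark that the standard deviation is large relative to the mean difference.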
Comparing more than two alternatives
§ Naïve approach
  − Compare confidence intervals
  − Needs to be done for all pairs, and the number of pairs grows very quickly
  − Example: 7 alternatives would require 21 pair-wise comparisons
    • possible combinations: $\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}$
    • for our case: $\binom{7}{2} = \frac{7 \cdot 6}{2 \cdot 1} = \frac{42}{2} = 21$
  − Would not be surprising to find that 1 pair differed (at 95%)
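A short sketch of how quickly the number of pair-wise comparisons grows, and why one chance finding among 21 comparisons is unsurprising. The function name and the range of alternative counts are illustrative; the 5% figure assumes every comparison is made at a 95% confidence level.

```python
import math


def pairwise_comparisons(alternatives):
    """Number of distinct pairs among the given number of alternatives."""
    return math.comb(alternatives, 2)


# Hypothetical range of alternative counts, to show the growth.
for count in range(2, 11):
    print(f"{count:2d} alternatives -> {pairwise_comparisons(count):2d} comparisons")

pairs_for_seven = pairwise_comparisons(7)

# Assumption: each comparison is made at a 95% confidence level, so each one
# has a 5% chance of reporting a difference where none exists.
expected_chance_findings = pairs_for_seven * 0.05
print(f"expected chance findings for 7 alternatives: {expected_chance_findings:.2f}")
```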
ANOVA – Analysis of Variance
§ Separates total variation observed in a set of measurements into:
  1. Variation within one system due to uncontrolled measurement errors
  2. Variation between systems due to real differences + random error
§ Is variation (2) statistically greater than variation (1)?
ANOVA – Analysis of Variance
§ Make n measurements of each of k alternatives
§ $y_{ij}$ = i-th measurement on j-th alternative
§ Assumes errors are
  − independent
  − normally distributed
§ In user studies, each measurement is the set of responses by one participant
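All Measurements for All Alternatives
Columns are alternatives, rows are measurements.

| Measurement | 1 | 2 | … | j | … | k |
|---|---|---|---|---|---|---|
| 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $y_{1k}$ |
| 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $y_{2k}$ |
| … | … | … | … | … | … | … |
| i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $y_{ik}$ |
| … | … | … | … | … | … | … |
| n | $y_{n1}$ | $y_{n2}$ | … | $y_{nj}$ | … | $y_{nk}$ |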
Overall Mean
§ Average of all measurements made of all alternatives:
  $\bar{y}_{..} = \frac{\sum_{j=1}^{k} \sum_{i=1}^{n} y_{ij}}{kn}$
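Column Means
§ Column means are average values of all measurements within a single alternative:
  $\bar{y}_{.j} = \frac{\sum_{i=1}^{n} y_{ij}}{n}$
§ Average performance of a single alternative

| Measurement | 1 | 2 | … | j | … | k |
|---|---|---|---|---|---|---|
| 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $y_{1k}$ |
| 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $y_{2k}$ |
| … | … | … | … | … | … | … |
| i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $y_{ik}$ |
| … | … | … | … | … | … | … |
| n | $y_{n1}$ | $y_{n2}$ | … | $y_{nj}$ | … | $y_{nk}$ |
| Column mean | $\bar{y}_{.1}$ | $\bar{y}_{.2}$ | … | $\bar{y}_{.j}$ | … | $\bar{y}_{.k}$ |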
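Effect = Deviation From Overall Mean
§ $\alpha_j$: effect of alternative j = deviation of column mean from overall mean:
  $\alpha_j = \bar{y}_{.j} - \bar{y}_{..}$

| Measurement | 1 | 2 | … | j | … | k |
|---|---|---|---|---|---|---|
| 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $y_{1k}$ |
| 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $y_{2k}$ |
| … | … | … | … | … | … | … |
| i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $y_{ik}$ |
| … | … | … | … | … | … | … |
| n | $y_{n1}$ | $y_{n2}$ | … | $y_{nj}$ | … | $y_{nk}$ |
| Column mean | $\bar{y}_{.1}$ | $\bar{y}_{.2}$ | … | $\bar{y}_{.j}$ | … | $\bar{y}_{.k}$ |
| Effect | $\alpha_1$ | $\alpha_2$ | … | $\alpha_j$ | … | $\alpha_k$ |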
Error = Deviation From Column Mean
§ $e_{ij}$: error of each measurement = deviation from column mean:
  $e_{ij} = y_{ij} - \bar{y}_{.j}$
Effects and Errors
§ Effect is distance of column mean from overall mean
  − Horizontally across alternatives
§ Error is distance of sample from column mean
  − Vertically within one alternative
  − Error across alternatives, too
§ Note that neither effect nor error is an absolute value; both can be positive or negative
§ Individual measurements are then: $y_{ij} = \bar{y}_{..} + \alpha_j + e_{ij}$
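The decomposition of each measurement into overall mean, effect and error can be checked numerically. The sketch below uses a small made-up data set (k = 3 alternatives with n = 4 measurements each; the values are hypothetical and not taken from the study) and confirms that every measurement is recovered from the three parts.

```python
# Hypothetical data: one inner list per alternative (column),
# k = 3 alternatives with n = 4 measurements each.
columns = [
    [10, 12, 11, 11],
    [14, 15, 13, 14],
    [8, 9, 7, 8],
]
k = len(columns)
n = len(columns[0])

# Overall mean: average of all k * n measurements.
overall_mean = sum(sum(col) for col in columns) / (k * n)

# Column means: average of the n measurements of one alternative.
column_means = [sum(col) / n for col in columns]

# Effects alpha_j: deviation of each column mean from the overall mean.
effects = [mean - overall_mean for mean in column_means]

# Errors e_ij: deviation of each measurement from its own column mean.
errors = [[y - mean for y in col] for col, mean in zip(columns, column_means)]

# Every measurement should equal overall mean + effect + error.
for j, col in enumerate(columns):
    for i, y in enumerate(col):
        rebuilt = overall_mean + effects[j] + errors[j][i]
        assert abs(rebuilt - y) < 1e-9

print("overall mean:", overall_mean)
print("column means:", column_means)
print("effects:     ", effects)
print("errors:      ", errors)
```

Both effects and errors come out with mixed signs here, and the effects sum to zero, as they must when every alternative has the same number of measurements.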
Sum of Squares of Differences
§ SST = sum of squared differences between each measurement and the overall mean
  $SST = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{..})^2$
§ SSA = variation due to effects of alternatives
  $SSA = n \sum_{j=1}^{k} \alpha_j^2 = n \sum_{j=1}^{k} (\bar{y}_{.j} - \bar{y}_{..})^2$
§ SSE = variation due to errors in measurements
  $SSE = \sum_{j=1}^{k} \sum_{i=1}^{n} e_{ij}^2 = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{.j})^2$
§ $SSE = SST - SSA \iff SST = SSE + SSA$
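A minimal sketch of the three sums of squares, written directly from the definitions on this slide. The data set is hypothetical (k = 3 alternatives, n = 4 measurements each) and the function name is illustrative. The final check confirms that the total variation splits into the two parts.

```python
def sums_of_squares(columns):
    """Return (SST, SSA, SSE) for equally sized columns of measurements."""
    k = len(columns)
    n = len(columns[0])
    overall_mean = sum(sum(col) for col in columns) / (k * n)
    column_means = [sum(col) / n for col in columns]

    # SST: squared distance of every measurement from the overall mean.
    sst = sum((y - overall_mean) ** 2 for col in columns for y in col)
    # SSA: n times the sum of the squared effects.
    ssa = n * sum((mean - overall_mean) ** 2 for mean in column_means)
    # SSE: squared distance of every measurement from its column mean.
    sse = sum(
        (y - mean) ** 2
        for col, mean in zip(columns, column_means)
        for y in col
    )
    return sst, ssa, sse


# Hypothetical data: one inner list per alternative.
data = [
    [10, 12, 11, 11],
    [14, 15, 13, 14],
    [8, 9, 7, 8],
]

sst, ssa, sse = sums_of_squares(data)
print(f"SST = {sst}, SSA = {ssa}, SSE = {sse}")
print("SST == SSA + SSE:", abs(sst - (ssa + sse)) < 1e-9)
```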
ANOVA
§ Separates variation in measured values into:
  1. variation due to effects of alternatives
     • SSA – variation across column averages
  2. variation due to errors
     • SSE – variation within a single column
§ If differences among alternatives are due to real differences:
  → SSA statistically greater than SSE
Comparing SSE and SSA
§ Simple approach
  − $\frac{SSA}{SST}$ = fraction of total variation explained by differences among alternatives
  − $\frac{SSE}{SST} = \frac{SST - SSA}{SST}$ = fraction of total variation due to experimental error
§ But is it statistically significant?
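The simple approach amounts to two divisions. The sketch below uses hypothetical values SSA = 72 and SST = 78 (not results from the study) and an illustrative function name.

```python
def variation_fractions(ssa, sst):
    """Return (explained fraction, error fraction) of the total variation."""
    sse = sst - ssa  # SSE = SST - SSA
    return ssa / sst, sse / sst


# Hypothetical sums of squares, chosen only for illustration.
explained, unexplained = variation_fractions(ssa=72.0, sst=78.0)

print(f"explained by alternatives: {explained:.1%}")
print(f"due to experimental error: {unexplained:.1%}")
```

The two fractions always add up to 1, but a large explained fraction alone does not show that the difference is statistically significant.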
Comparing SSE and SSA
§ Is it statistically significant?
§ Variance = mean square value = total variation / degrees of freedom
  $s_x^2 = \frac{SSx}{df(SSx)}$
§ df(SSx):
  − degrees of freedom
  − this is the number of independent terms in the sum
Degrees of Freedom for Effects
§ $df(SSA) = k - 1$, since there are k alternatives (the k effects sum to zero, so only k − 1 of them are independent)
Degrees of Freedom for Errors
§ $df(SSE) = k(n - 1)$, since there are k alternatives, each with (n − 1) degrees of freedom
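The degrees of freedom turn the sums of squares into mean square values (variances), as defined above. The sketch below uses hypothetical inputs (k = 3, n = 4, SSA = 72, SSE = 6) and illustrative function names. The last step, taking the ratio of the two mean squares, is the conventional F ratio and is included as an assumption about where the comparison leads. With k = 5 and n = 18 the same formulas give 4 and 85, which appears to match the F(4,85) notation on the relevance slide.

```python
def degrees_of_freedom(k, n):
    """Return (df(SSA), df(SSE)) for k alternatives with n measurements each."""
    return k - 1, k * (n - 1)


def mean_square(sum_of_squares, df):
    """Variance estimate: sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


# Hypothetical inputs, chosen only for illustration.
k, n = 3, 4
ssa, sse = 72.0, 6.0

df_a, df_e = degrees_of_freedom(k, n)
ms_a = mean_square(ssa, df_a)  # variance due to alternatives
ms_e = mean_square(sse, df_e)  # variance due to measurement error

# Assumption: the two variances are compared through their ratio,
# the quantity conventionally reported as the F statistic.
f_ratio = ms_a / ms_e

print(f"df(SSA) = {df_a}, df(SSE) = {df_e}")
print(f"mean squares: {ms_a:.3f} (alternatives), {ms_e:.3f} (errors)")
print(f"ratio of mean squares = {f_ratio:.2f}")
print("k = 5, n = 18 gives:", degrees_of_freedom(5, 18))
```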