What We Learned By Using the Rigor Metric Emily S. Patterson, PhD (and numerous colleagues)
Need for Analytic Rigor: Avoid Surprise
August 7, 1998: Bombings of US Embassies in Africa
224 killed, including 12 US personnel
Need for Rigor Transparency: Be Calibrated
“Another lack of rigor cited by the panel is the widespread use of PowerPoint presentations in lieu of actual engineering data and analyses.”
NASA Columbia Accident Investigation
How to Increase Rigor: Broadening Checks (“up” arrows)
[Figure: diagram with the labels Down-Collect, Conflict & Corroboration, Hypothesis Exploration]
Elm, W., Potter, S., Tittle, J., Woods, D.D., Grossman, J., Patterson, E.S. (2005). Finding decision support requirements for effective intelligence analysis tools. Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting.
Rigor Metric
What We Did with the Rigor Metric
• Study 1: Plan to move troops and supplies
– 12 three-person teams
– Undergraduate students (security & intelligence specialization)
• Study 2: Causes and impacts of Ariane 501 accident
– 2 novice, 8 expert intelligence analysts
– 1 coder; entire session
• Cases: Low-Moderate-High for 8 attributes
– 1992, 2008: Separatist movements in Georgia
– 2009: Lebanese elections/pro-Western shift potential
– 1999-2009: Chavez manipulation of democracy to retain power
– 2009: Uyghur separatist movement/regional stability
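The Low-Moderate-High scoring of 8 attributes can be pictured as a small data structure. The sketch below is illustrative only: the `Level` enum, `make_profile`, and `attributes_at` are hypothetical helper names. The slides name only some of the attributes, so the full list of eight is an assumption taken from the published Rigor Metric. The example scores are made up and are not study data.

```python
from enum import IntEnum


class Level(IntEnum):
    """Three-point rating used for each rigor attribute."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Assumed attribute list: the slides mention only some of these by name.
ATTRIBUTES = (
    "hypothesis exploration",
    "information search",
    "information validation",
    "stance analysis",
    "sensitivity analysis",
    "specialist collaboration",
    "information synthesis",
    "explanation critiquing",
)


def make_profile(scores):
    """Check that every attribute is rated exactly once and return a profile dict."""
    missing = [a for a in ATTRIBUTES if a not in scores]
    unexpected = [a for a in scores if a not in ATTRIBUTES]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return {a: Level(scores[a]) for a in ATTRIBUTES}


def attributes_at(profile, level):
    """List the attributes rated at the given level, in canonical order."""
    return [a for a in ATTRIBUTES if profile[a] == level]


# Hypothetical profile: everything Moderate except one High attribute.
raw_scores = {a: Level.MODERATE for a in ATTRIBUTES}
raw_scores["explanation critiquing"] = Level.HIGH
example_profile = make_profile(raw_scores)
high_attributes = attributes_at(example_profile, Level.HIGH)
```

A query such as `attributes_at(profile, Level.HIGH)` returning an empty list for several attributes across all teams is the kind of pattern that could flag a task or participant fidelity issue.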
Study 1: 3-Person Team; Logistics Planning Task
Constraints:
Fastest (<2.5 hours)
Cheapest (least fuel)
Secure:
- Avoid railway (enemy agents)
- Avoid route C (most attacks)
Best answer:
Trucks with supplies on route A
Armored vehicles for troops on route B
Solution Scoresheet
Team 3 Rigor Measures
Explanation Critiquing: Example of High and Low
High (Team 3): Extensive error checking throughout task
Low (Team 10): Supportive throughout: “I think that’s probably the best way to go.”
Low (Team 12): Blanket agreement with dominant, intimidating leader
Task Fidelity: No Specialist Collaboration
Task/Participant Fidelity: No “High” Scores on 3 Attributes
Perfect Task Score: Team 3
Perfect Task Score: Team 10
Perfect Task Score: Team 11
Low Performance (32%): Team 6
Low Performance (40%): Team 7
Study 2: Individual Analysts; Ariane 501 Accident
• Time: 2-hour session (avg = 55 min)
• Task: Causes, impacts of Ariane 501 accident
• Participants: 10 NASIC analysts (avg = 13 yrs)
• Tools: Search/browse features of Pathfinder
• Data: “On topic” database (~2000 documents)
• Briefing: Verbal (video-taped)
• Protocol: Think-aloud, semi-structured interviews
• Analysis: Process tracing, briefing accuracy (κ = 0.84)
Embedded Risks for Inaccurate Statements
› Repeated inaccurate information
› Missed update that changed assessment
› Inapplicable assumption
[Figure: diagram with the labels “program cancelled”, “rebuild 1”, “rebuild all 4”, “lost satellites (no insurance)”, and the statement “The monetary loss can be recovered by the Cluster Satellite Program insurance...”]
Task Fidelity: No Specialist Collaboration or Explanation Critiquing
2 Novices vs. 8 Experts
Rely on Weak vs. ‘High Profit’ Documents
The Bottom Line: What We Learned
Study 1
• Reliable for 3-person team over session (κ = 0.92)
• Discovers task and participant fidelity issues
Study 2
• (Likely) Detects novice-expert differences
• New insights from old study data
Cases
• All attributes worked (all cases)
• Low vs. High easy; Moderate more variable
• Jargon (knowledge shields)
• Context-dependent risks:
– Linguistic barriers (information search)
– Polarized issues (information validation)
– Limited access (specialist collaboration)
– Deliberate deception (stance analysis)
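The reliability figures on these slides are reported as kappa values without naming the exact statistic. Assuming they refer to Cohen's kappa between two coders, the sketch below shows how such a value could be computed for Low/Moderate/High ratings. The function name `cohens_kappa` and the example ratings are hypothetical and are not the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Made-up ratings of eight attributes by two coders (assumption, not study data).
coder_1 = ["High", "Low", "Moderate", "Low", "Moderate", "Low", "High", "High"]
coder_2 = ["High", "Low", "Moderate", "Low", "Low", "Low", "High", "High"]
kappa = cohens_kappa(coder_1, coder_2)  # roughly 0.80 for these made-up ratings
```

Kappa corrects raw agreement (7 of 8 here) for the agreement expected by chance, which is why it comes out lower than the raw 0.875.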
Next Step: Guidance for When to Invest ‘More’