Logic or probability? An ERP study of defeasible reasoning

Michiel van Lambalgen
ILLC/Dept of Philosophy, University of Amsterdam
As a modelling tool in cognitive science, logic is on the back foot:

  "[M]uch of our reasoning with conditionals is uncertain, and may be overturned by future information; that is, they are non-monotonic. But logic based approaches to inference are typically monotonic, and hence are unable to deal with this uncertainty. Moreover, to the extent that formal logical approaches embrace non-monotonicity, they appear to be unable to cope with the fact that it is the content of the rules, rather than their logical form, which appears to determine the inferences that people draw. We now argue that perhaps by encoding more of the content of people's knowledge, by probability theory, we may more adequately capture the nature of everyday human inference. This seems to make intuitive sense, because the problems that we have identified concern how uncertainty is handled in human inference, and probability is the calculus of uncertainty."

  (Oaksford & Chater, Bayesian Rationality, OUP 2007)
The task which occasioned these remarks: the suppression effect (Byrne (1989))

(1) If Marian has an essay, she studies late in the library.
(2) Marian has an essay.
(a) Does Marian study late in the library?
(3) If the library is open, Marian studies late in the library.
(b) Does Marian study late in the library?

• The percentage of `yes' responses to (a) is around 90%; for (b) it is around 60% -- one says that `MP is suppressed'
• Some argue that subjects therefore do not reason `logically'; it is safer to say that they do not use a monotonic logic
• The supposed inability of `logic' to handle this phenomenon has given a boost to probabilistic analyses in which the conditional is represented by a conditional probability
Topics

• Logical and Bayesian explanations of the suppression task
• Non-monotonicity: logical and probabilistic
• We then address the question whether subjects' reasoning in the suppression task is Bayesian or closed world, by means of an EEG study
Formalising the suppression task
Formalisation in logic programming with the Closed World Assumption

• We consider first logic programs consisting of Horn clauses p_1 ∧ … ∧ p_n → q, such that no other clauses have the same consequent q
  - if all p_i are true, so is q
  - if some p_i is false, so is q [unrestricted closed world assumption]
  - we thus get p_1 ∧ … ∧ p_n ↔ q [p_1 ∧ … ∧ p_n is a definition of q]
• Using Kleene's 3-valued semantics, some propositional variables can be released from the closed world assumption
• If a clause contains a negation, say ¬s ∧ p_1 ∧ … ∧ p_n → q, apply the preceding to the clause α → s (where we may have α = ⊥) and replace s by its definition
• If there are several clauses p_i1 ∧ … ∧ p_in → q with q as consequent, the definition of q is given by ⋁_i (p_i1 ∧ … ∧ p_in) ↔ q [where the disjunction is taken over all such clauses]
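A minimal Python sketch of this completion construction (my illustration, with invented atom names; two-valued for simplicity, so the Kleene 3-valued refinement is left out): q heads the two hypothetical clauses p1 ∧ p2 → q and r → q, atoms released from the CWA take their values from a given valuation, and an atom heading no clause defaults to false.

```python
# Sketch of the completion under the CWA (hypothetical program).
from itertools import product

# head -> list of clause bodies (each body a conjunction of atoms):
#   p1 & p2 -> q,   r -> q
program = {"q": [["p1", "p2"], ["r"]]}

def completion_value(atom, valuation, program):
    """Truth value of `atom` under the completion: atoms in `valuation`
    are released from the CWA; any other atom is true iff some clause
    body for it is wholly true (no clauses => empty disjunction => False)."""
    if atom in valuation:
        return valuation[atom]
    bodies = program.get(atom, [])
    return any(all(completion_value(p, valuation, program) for p in body)
               for body in bodies)        # q <-> (p1 & p2) v r

for p1, p2, r in product([False, True], repeat=3):
    v = {"p1": p1, "p2": p2, "r": r}
    print(v, "=> q =", completion_value("q", v, program))
```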
Logical analysis of the suppression task

• Represent the conditional as A ∧ ¬ab → E: `if A and nothing abnormal is the case, then E'
• The meaning of the conditional is partially specified and depends on what abnormalities there are
• We take the CWA to apply to ab only
• Suppose we know A but nothing else; then closed world reasoning gives ¬ab, and we can draw the modus ponens conclusion E
• Now suppose a possible abnormality ¬C comes to light [`the library is not open']; then ¬C → ab, and there are no other abnormalities
• The completion then gives C ↔ ¬ab, so that the conditional becomes A ∧ C → E
• Now we can no longer infer anything from A alone
• The logical analysis led to the prediction that subjects with autism would suppress MP and MT significantly less often; this prediction was verified in Pijnacker et al., Neuropsychologia (2009)
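A toy three-valued sketch of this analysis (my own illustration, not from the study), with None playing the role of Kleene's `unknown': before the second conditional is processed, ab has no defining clause and the CWA sets it to false, so MP goes through; once ¬C → ab is added, the completion makes ab ↔ ¬C, and with C unknown the conclusion E is undecided.

```python
# Kleene's strong three-valued connectives; None = unknown.
def k_not(x):
    return None if x is None else (not x)

def k_and(x, y):
    if x is False or y is False:
        return False
    if x is True and y is True:
        return True
    return None

def modus_ponens(A, C, second_conditional):
    # Completion of ab: without the clause not-C -> ab, the CWA gives
    # ab = False; with it, ab <-> not-C.
    ab = k_not(C) if second_conditional else False
    # The conditional A & not-ab -> E, read as a definition of E:
    return k_and(A, k_not(ab))

print(modus_ponens(A=True, C=None, second_conditional=False))  # True: E follows
print(modus_ponens(A=True, C=None, second_conditional=True))   # None: E undecided
```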
probability in a nutshell

• For our purposes, a probability is a [0,1]-valued function P on a classical propositional logic, satisfying:
  - the probability of a tautology is 1, that of a contradiction is 0
  - logically equivalent formulas have the same probability
  - if ⊨ φ → ¬ψ, then P(φ ∨ ψ) = P(φ) + P(ψ)
• The conditional probability P(E|A) is defined as P(E ∧ A)/P(A), provided P(A) > 0
• Probability is not truth functional: P(φ ∧ ψ) is not determined by P(φ) and P(ψ), so the full joint distribution must be kept; hence tremendous storage requirements
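A small illustration of the last point (invented numbers): two joint distributions with identical marginals P(p) = P(q) = 0.5 disagree on P(p ∧ q), so conjunctions require the full joint table, whose size grows exponentially in the number of variables.

```python
# Two joints with the same marginals but different P(p & q):
joint_independent = {(1, 1): .25, (1, 0): .25, (0, 1): .25, (0, 0): .25}
joint_correlated  = {(1, 1): .50, (1, 0): .00, (0, 1): .00, (0, 0): .50}

for joint in (joint_independent, joint_correlated):
    P_p = sum(v for (p, q), v in joint.items() if p)
    P_q = sum(v for (p, q), v in joint.items() if q)
    print(f"P(p)={P_p:.2f}  P(q)={P_q:.2f}  P(p&q)={joint[(1, 1)]:.2f}")

# A full joint over n binary variables needs 2**n - 1 numbers:
for n in (10, 20, 30):
    print(f"{n} variables: {2**n - 1} parameters")
```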
(Non-)monotonicity in Bayesian probability

• Bayesian probability = axioms of probability + the rule of inference Bayesian conditionalisation: if E summarises all our evidence and E occurs, then for any S the a posteriori probability P_f(S) of S equals the a priori conditional probability P_i(S|E) [`probabilistic modus ponens', but controversial]
• In Bayesianism and some forms of formal semantics the conditional `if E then S' is represented as a conditional probability P(S|E)
• In theory, Bayesianism holds that probabilities are defined over all variables of (possible) interest [recall: probability is not truth functional]
• In practice, the sets of relevant variables grow, and the challenge is to find (rational) principles which govern the transfer of a probability from a set to an expansion of that set
• Non-monotonicity lite: in general P(X|Y) ≠ P(X|Y,Z)
• Problematic non-monotonicity: P_i(X|Y) ≠ P_f(X|Y)
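A toy check of `non-monotonicity lite' (all numbers invented), with X = `studies late', Y = `has an essay', Z = `library closed': conditioning on the extra evidence Z drastically changes P(X|Y).

```python
# Joint P(X, Y, Z) over binary triples (hypothetical values, summing to 1).
joint = {(1,1,0): .40, (0,1,0): .02, (1,1,1): .01, (0,1,1): .07,
         (1,0,0): .05, (0,0,0): .30, (1,0,1): .01, (0,0,1): .14}

def P(pred):
    return sum(p for xyz, p in joint.items() if pred(*xyz))

P_X_given_Y  = P(lambda x, y, z: x and y) / P(lambda x, y, z: y)
P_X_given_YZ = P(lambda x, y, z: x and y and z) / P(lambda x, y, z: y and z)
print(f"P(X|Y)   = {P_X_given_Y:.2f}")    # 0.82
print(f"P(X|Y,Z) = {P_X_given_YZ:.2f}")   # 0.12
```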
Prior and posterior probability

• Upon processing the first conditional, the subject sets the prior conditional probability P_i(E|A) ≈ 1
• The second conditional is supposed to lead to the posterior conditional probability P_f(E|A) << P_i(E|A)
• Is there a Bayesian explanation for this transition?
• Bayesian orthodoxy assumes there is a prior probability defined over all events; so we may assume there is a prior probability for the library being closed (written C on this slide):

  P_i(E|A) = P_i(E|CA) P_i(C|A) + P_i(E|¬CA) P_i(¬C|A)
           = P_i(E|CA) P_i(C) + P_i(E|¬CA) P_i(¬C)   [independence of A, C]

• Hence P_f(E|A) << P_i(E|A) if P_f(C) >> P_i(C); and P_i(C) must be small to get a high P_i(E|A). That is, the mere fact that C becomes salient increases its probability
• This is not very Bayesian, and the assumption of a universal prior imposes impossible demands on storage, since it rests on stored knowledge rather than computation
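A numerical instantiation of this decomposition (my numbers, chosen to roughly match the observed endorsement rates): with the library almost certainly open, P_i(E|A) ≈ 0.9; once the disabler becomes salient and P(C) jumps, P_f(E|A) drops to about 0.5.

```python
# C = "library is closed"; studying there is impossible when it is closed.
P_E_given_CA    = 0.0    # P(E | C, A)
P_E_given_notCA = 0.95   # P(E | not-C, A)

def P_E_given_A(P_C):
    # total probability, with A and C independent
    return P_E_given_CA * P_C + P_E_given_notCA * (1 - P_C)

print("prior,     P_i(C)=0.05:", P_E_given_A(0.05))   # ~0.90: MP endorsed
print("posterior, P_f(C)=0.50:", P_E_given_A(0.50))   # ~0.48: MP 'suppressed'
```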
The trouble with novelty

• Novel events: the validity of Bayesian conditionalisation requires that the prior P_i be defined on `all' events
• Cognitively this is an implausible assumption; a better model is provided by having multiple algebras of events
• However:
  - if both E and S belong to two distinct algebras, there need not be a unique P_f(S); hence at each time there is a single algebra of events
  - if there is a single algebra that grows over time, then we need Rényi's Axiom, P_i(S|AE) P_i(A|E) = P_i(AS|E), to ensure that P_i(S|E) is the same in the algebra with and without the event A
  - in which case Bayesian conditionalisation is invariant under the addition of novel events
• We have seen that Rényi's Axiom must be dropped if there is to be a probabilistic model of the suppression effect
• One question is whether there exists a rational justification for Bayesianism thus modified
• Another question: do subjects actually engage in probabilistic reasoning in the suppression task?
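A sketch of what Rényi's Axiom buys (invented numbers, and my reading of the slide): when the algebra is expanded by the novel event A, the axiom forces the expanded measure to marginalise back to the old conditional probability, so conditionalisation on E is unaffected; a probabilistic model of suppression needs exactly this invariance to fail.

```python
# Old (small) algebra: P(S|E) = 0.9 before the novel event A is considered.
P_old_S_given_E = 0.9

# Expand the algebra with A. If the expansion respects Renyi's Axiom,
# the chain-rule pieces must recombine to the old value:
P_A_given_E     = 0.2
P_S_given_AE    = 0.6
P_S_given_notAE = (P_old_S_given_E - P_S_given_AE * P_A_given_E) / (1 - P_A_given_E)

recovered = P_S_given_AE * P_A_given_E + P_S_given_notAE * (1 - P_A_given_E)
print(recovered)   # ~0.9 -- adding A leaves conditionalisation on E untouched

# A probabilistic model of the suppression effect instead requires
# P_f(S|E) << 0.9 once A becomes salient, i.e. the recovered value must differ.
```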
The time course of defeasible reasoning
Experimental comparison of Bayesian and closed world reasoning

• Bayesian probability is explicitly proposed as a computational model of higher cognitive phenomena (Oaksford & Chater (2007, 2009), Gopnik et al. (2004), …)
• As such, it should lead to both behavioural and neuroimaging predictions
• As an example, we want to compare the expected EEG signatures of Bayesian reasoning and closed world reasoning in a variant of the suppression task
• (Pijnacker, Geurts, van Lambalgen, Buitelaar, Hagoort: `Reasoning with Exceptions: An Event-related Brain Potentials Study', J. Cogn. Neurosci. 2010)
a more explicit form of the suppression task

• The difference with the standard suppression task is that the possible exception to the conditional is now given in explicit form: the disabler (e.g. `Lisa has lost a contact lens') is stated as a separate premiss alongside the conditional (see the next slide)
• The results are as usual: in the congruent condition 90% endorse MP, in the disabling condition only 45%
Bayesian model (1)

• The `inhibitory event' E: `Lisa has lost a contact lens' now forms part of the initial sample space, and P_i(E) is high (E is said to be `probable')
• The conditional probability corresponding to the conditional `If Lisa is going to play hockey (H), she will wear contact lenses (W)' must be evaluated by taking E into account:

  P_i(W|H) = P_i(W|EH) P_i(E|H) + P_i(W|¬EH) P_i(¬E|H)
           = P_i(W|EH) P_i(E) + P_i(W|¬EH) P_i(¬E)   [assuming independence of E and H]

• If the meaning of a conditional is in part given by a conditional probability, then P_i(W|H) has to be computed while processing the 2nd premiss
• The final probability of the conclusion, P_f(W), is obtained by Bayesian conditionalisation on H
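A numerical sketch of this computation (my numbers, chosen so the outcome lands near the 45% endorsement rate): with the disabler probable, the conditional probability computed at the second premiss is already low, and conditionalising on H then makes it the final probability of the conclusion.

```python
# H = Lisa plays hockey, W = she wears contact lenses, E = lens lost.
P_E             = 0.6    # the explicit premiss makes the disabler probable
P_W_given_EH    = 0.1    # lens lost: wearing lenses unlikely
P_W_given_notEH = 0.95   # otherwise the conditional holds almost surely

# P_i(W|H), computed while processing the conditional premiss
# (assuming E independent of H):
P_W_given_H = P_W_given_EH * P_E + P_W_given_notEH * (1 - P_E)

# Bayesian conditionalisation on the categorical premiss H:
P_f_W = P_W_given_H
print(P_f_W)   # ~0.44, close to the 45% MP endorsement observed
```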