


  1. Argument Mining: the Bottleneck of Knowledge and Reasoning
     Patrick Saint-Dizier, IRIT - CNRS, Toulouse, France. stdizier@irit.fr
     April 14, 2016

  2. The Relatedness Problem
     Given a controversial issue, argument mining from texts ⇒ besides linguistic aspects, domain knowledge + inferences are often required:
     ◮ Issue: the situation of women has improved in India,
     ◮ Support: (a) we now see long lines of happy young girls with school bags walking along the roads
     ◮ Then: (b) School buses must be provided so that schoolchildren do not reach the school totally exhausted after a long early morning walk.
     ◮ (b) is an attack on (a) (these young girls may not be so happy), but it is not an attack on the issue: the facet concerned in the relation between (b) and (a) does not concern women’s conditions in particular.
     ⇒ Knowledge and reasoning are useful to establish relatedness and polarity: the WHY, HOW and HOW MUCH.

  3. Problem Analysis and Research Questions
     ◮ Corpus construction: issues + arguments found in various texts
     ◮ How to tag arguments to characterize the need for knowledge and reasoning?
     ◮ How to categorize the knowledge involved?
     ◮ How to pair NLP with KR for argument mining?
     ◮ Knowledge-driven argument mining: how to account for the diversity of arguments w.r.t. an issue?
     ◮ The Qualia of the Generative Lexicon: a useful lexical and knowledge representation for argument mining?
     ◮ Case studies: knowledge vs. reasoning?

  4. Corpus Construction
     Issue                                               Corpus size             Nb. of arguments
     (1) Ebola vaccination is necessary                  16 texts, 8300 words    50
     (2) Women’s condition has improved in India         9 texts, 4600 words     24
     (3) The development of nuclear plants is necessary  7 texts, 5800 words     31
     (4) Organic agriculture is the future               19 texts, 5800 words    17
     Total                                               51 texts, 24500 words   122

  5. Arguments and argument compounds
     ⇒ Arguments seldom come in isolation.
     ⇒ They are often articulated within a context that indicates, e.g., circumstances, restrictions, illustrations, concessions, comparisons, purposes, and various forms of elaboration.
     ⇒ We call such a form an argument compound, where the argument is the kernel:
        ⇒ this allows for a larger diversity of arguments.

  6. Corpus Tagging
     The following tags have been identified, but need to be further elaborated:
     1. the text span involved, which delimits the argument compound and its kernel,
     2. the polarity of the argument w.r.t. the issue: support, attack, argumentative concession or contrast,
     3. the conceptual relation(s) with the issue,
     4. the knowledge involved in identifying the argument: list of the main concepts used,
     5. the a priori strength of the argument,
     6. the discourse structures associated with the argument kernel.

  7. Illustration for issue (1)
     <argument nb= 11, polarity= attack with concession,
       relationToIssue= limited proofs of efficiency and safety of vaccination,
       conceptsInvolved= efficiency measures, safety measures, test and evaluation methods,
       strength= moderate>
     <concession> Even if the vaccine seems 100% efficient and without any side effects on the tested population, </concession>
     <main_arg> it is necessary to wait for more conclusive data before making large vaccination campaigns. </main_arg>
     <elaboration> The national authority of Guinea has approved the continuation of the tests on targeted populations. </elaboration>
     </argument>
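As a rough illustration of how such an annotation could be held in a program, the sketch below stores the tags of argument 11 in a Python dataclass. The class name, field names and the `is_attack` helper are assumptions made for this example; they are not part of the tagging scheme itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArgumentCompound:
    """One annotated argument compound (class and field names are illustrative)."""
    nb: int
    polarity: str                 # support, attack, concession or contrast
    relation_to_issue: str        # conceptual relation(s) with the issue
    concepts_involved: List[str]  # main concepts needed to identify the argument
    strength: str                 # a priori strength
    kernel: str                   # text of the argument kernel
    discourse: Dict[str, str] = field(default_factory=dict)  # structures around the kernel

    def is_attack(self) -> bool:
        # Assumption: "attack with concession" still counts as an attack.
        return self.polarity.startswith("attack")


# Argument 11 of issue (1), copied from the illustration above.
arg11 = ArgumentCompound(
    nb=11,
    polarity="attack with concession",
    relation_to_issue="limited proofs of efficiency and safety of vaccination",
    concepts_involved=["efficiency measures", "safety measures",
                       "test and evaluation methods"],
    strength="moderate",
    kernel="it is necessary to wait for more conclusive data "
           "before making large vaccination campaigns.",
    discourse={
        "concession": "Even if the vaccine seems 100% efficient and without "
                      "any side effects on the tested population,",
        "elaboration": "The national authority of Guinea has approved the "
                       "continuation of the tests on targeted populations.",
    },
)
```

Keeping the kernel separate from the surrounding discourse structures mirrors the distinction between an argument and its compound.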

  8. Evidence for knowledge for argument mining
     Need of knowledge: total nb of arguments / nb of those that require knowledge.
     Issue    Need of knowledge:       Total number of concepts
              nb of cases (rate)       involved (estimate)
     (1)      44 (88%)                 54
     (2)      18 (75%)                 23
     (3)      18 (58%)                 19
     (4)      15 (88%)                 25
     Total    95 (78%)                 121
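The rates in this table follow from the argument counts given in the corpus table. The short check below recomputes them; the variable and function names are chosen for this sketch only.

```python
# Number of arguments per issue (corpus construction table).
arguments = {1: 50, 2: 24, 3: 31, 4: 17}
# Number of arguments per issue that require knowledge (table above).
need_knowledge = {1: 44, 2: 18, 3: 18, 4: 15}


def rate(part: int, total: int) -> int:
    """Percentage of `part` in `total`, rounded to the nearest integer."""
    return round(100 * part / total)


rates = {issue: rate(need_knowledge[issue], arguments[issue]) for issue in arguments}
overall = rate(sum(need_knowledge.values()), sum(arguments.values()))

for issue, r in rates.items():
    print(f"issue ({issue}): {need_knowledge[issue]}/{arguments[issue]} = {r}%")
print(f"total: {sum(need_knowledge.values())}/{sum(arguments.values())} = {overall}%")
```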

  9. Main concepts used in argument kernels and their expression in language (issue 1)
     Supports: efficiency is very good, 100% protection; avoids or reduces dissemination of disease; limited side-effects, etc.
     Attacks: limited number of cases and deaths compared to other diseases; limited risks of contamination, ignorance of contamination forms; toxicity and high side-effects, etc.
     Concessions or Contrasts: some side-effects; high production and development costs; vaccine not yet available; ethical problems, etc.
     The above arguments are expressed in various ways:
     - evaluative expressions: Vaccine development is very expensive,
     - comparatives: number of sick people much smaller than for Malaria,
     - facts related to properties of the main concept(s) of the issue: Vaccine is not yet available. There is no risk of dissemination,
     - facts related to the consequences, purposes, uses or goals of the issue: vaccine prevents bio-terrorism.

  10. From Concepts to Knowledge Representation
      The terms used in argument kernels concern: purposes, properties, parts, creation and development, etc. of the head terms of the issue or of derived concepts. These are relatively well defined in the Generative Lexicon.
      From issue (1):
      Vaccine(X):
        CONSTITUTIVE: [ACTIVE PRINCIPLE, ADJUVANT]
        TELIC: [MAIN: PROTECT_FROM(X,Y,D), AVOID(X, DISSEMINATION(D)),
                MEANS: INJECT(Z,X,Y)]
        FORMAL: [MEDICINE, ARTEFACT]
        AGENTIVE: [DEVELOP(T,X), TEST(T,X), SELL(T,X)]
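One possible way to encode such a Qualia structure is sketched below. The `Qualia` class, the flattening of the telic MAIN and MEANS sub-roles into a single list, and the `role_of` lookup are simplifying assumptions of this example, not the representation used by the author.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Qualia:
    """Four Qualia roles of a concept; predicates are kept as plain strings."""
    concept: str
    constitutive: List[str] = field(default_factory=list)
    telic: List[str] = field(default_factory=list)
    formal: List[str] = field(default_factory=list)
    agentive: List[str] = field(default_factory=list)


def predicate_name(term: str) -> str:
    # "test(T,X)" -> "test"; "adjuvant" -> "adjuvant"
    return term.split("(", 1)[0].strip().lower()


def role_of(q: Qualia, name: str) -> Optional[str]:
    """Return the first role whose entries include the predicate `name`, if any."""
    for role in ("constitutive", "telic", "formal", "agentive"):
        if any(predicate_name(t) == name.lower() for t in getattr(q, role)):
            return role
    return None


vaccine = Qualia(
    concept="vaccine(X)",
    constitutive=["active principle", "adjuvant"],
    # MAIN and MEANS sub-roles are flattened here for brevity (assumption).
    telic=["protect_from(X,Y,D)", "avoid(X,dissemination(D))", "inject(Z,X,Y)"],
    formal=["medicine", "artefact"],
    agentive=["develop(T,X)", "test(T,X)", "sell(T,X)"],
)

print(role_of(vaccine, "test"))      # agentive
print(role_of(vaccine, "adjuvant"))  # constitutive
```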

  11. Modeling the Diversity / the Generative Expansion of Arguments
      ◮ Arguments attack or support specific facets of the concepts of the controversial issue (called root concepts).
        □ (protect_from(X,Y, (infect(E1,ebola,Y) ⇒ get_sick(E2,Y) ⇒ ♦ die(E3,Y))) ∧ avoid(X,dissemination(ebola)).
      ◮ Arguments may also attack or support concepts derived from these initial concepts (related to functions, parts, etc.).
      ◮ For example, they may attack properties or purposes of the adjuvant or of the protocols used to test the vaccine. Arguments must however remain functionally close to the root.
      ⇒ Develop a network of Concepts and their Qualias derived from those involved in the controversial issue, with a limited depth.
      ⇒ Qualias structure the concepts in terms of parts, functions, etc.

  12. Generative expansion of arguments: a simple example
      - Consider: constitutive(vaccine(X)) = { active principle, adjuvant }.
      Network of Concepts and Qualias:
      - ‘active principle’: terminal concept in the network, associated with its lexicalizations (e.g. active principle, vaccine),
      - ‘Adjuvant’: non-terminal concept, included with its Qualia in the network:
        Adjuvant(Y,X1):
          FORMAL: [MEDICINE, CHEMICALS]
          TELIC: [DILUTE(Y,X1), ALLOW(INJECT(X1,P))]
      - Then the non-terminal concepts (e.g. medicine, dilute(Y,X1), inject(X1,P)) introduce new Qualias into the network. Natural language terms are associated with these concepts, e.g.: medicine, chemicals, inject, injection, dilute, dilution.
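The depth-limited construction of the network can be sketched as a breadth-first expansion from the root concept. The toy `NETWORK` dictionary below only contains the concepts named in this example, and the function name and the depth bounds are assumptions of the sketch.

```python
from collections import deque
from typing import Dict, List

# Toy network: each concept maps to the concepts introduced by its Qualia.
# Concepts without an entry (or with an empty list) are terminal.
NETWORK: Dict[str, List[str]] = {
    "vaccine": ["active principle", "adjuvant"],
    "active principle": [],
    "adjuvant": ["medicine", "chemicals", "dilute", "inject"],
}


def expand(root: str, network: Dict[str, List[str]], max_depth: int) -> Dict[str, int]:
    """Breadth-first expansion; returns {concept: depth} for depth <= max_depth."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        concept = queue.popleft()
        if depth[concept] == max_depth:
            continue  # stay functionally close to the root
        for child in network.get(concept, []):
            if child not in depth:
                depth[child] = depth[concept] + 1
                queue.append(child)
    return depth


print(expand("vaccine", NETWORK, max_depth=1))
print(expand("vaccine", NETWORK, max_depth=2))
```

With `max_depth=1` only the constitutive parts of the vaccine are reached; with `max_depth=2` the concepts introduced by the Qualia of the adjuvant are added.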

  13. Case study 1
      Utterance A1: The adjuvant of the Ebola vaccine is toxic
      Utterance A1 matches the language pattern: [np, ‘is’, evaluative expression]
      A1 negatively evaluates the adjuvant (lexical feature of the adjective ‘toxic’), but it does not explicitly say anything about the vaccine.
      Then, given:
      Ebola:
        FORMAL: [DISEASE]
        TELIC: [INFECT(E1,EBOLA,P) ⇒ GET_SICK(E2,P) ⇒ ♦ DIE(E3,P) ∧ E1 ≤ E2 ≤ E3]
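The pattern match on A1 can be approximated with a regular expression and a small polarity lexicon. Both the regular expression and the word lists below are hypothetical stand-ins for the linguistic resources a real system would use.

```python
import re
from typing import Optional, Tuple

# Hypothetical mini-lexicon of evaluative adjectives and their polarity.
EVALUATIVE = {
    "toxic": "negative",
    "expensive": "negative",
    "efficient": "positive",
    "safe": "positive",
}

# Rough approximation of the pattern [np, 'is', evaluative expression].
PATTERN = re.compile(r"^(?P<np>.+?)\s+is\s+(?P<adj>\w+)\.?$", re.IGNORECASE)


def match_evaluative(utterance: str) -> Optional[Tuple[str, str, str]]:
    """Return (np, adjective, polarity) if the utterance fits the pattern."""
    m = PATTERN.match(utterance.strip())
    if m is None:
        return None
    adjective = m.group("adj").lower()
    polarity = EVALUATIVE.get(adjective)
    if polarity is None:
        return None  # the final word is not a known evaluative expression
    return m.group("np"), adjective, polarity


print(match_evaluative("The adjuvant of the Ebola vaccine is toxic"))
```

The match only says that the adjuvant is evaluated negatively; relating this to the vaccine requires the Qualia-based reasoning that follows.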

  14. The constitutive role of vaccine(X) says that the adjuvant is part of the vaccine. The Qualia of ‘adjuvant’ indicates that the active principle X1 is mixed by dilution with the adjuvant Y:
      ◮ Adjuvant Qualia: dilute(Y,X1): Y and X1 are mixed together to form a single entity, the vaccine X.
      ◮ Upward inheritance of a property in a part-of relation: if a (major) constitutive part K1 of an object K has a property P, then (probably) the entire object K has P:
        has_property(K1,P) ∧ part_of(K1,K) ⇒ has_property(K,P).
      ◮ Since Y and X1 are parts of X and Y is toxic for humans, it follows that X is also toxic for humans.
      Therefore, A1 attacks the controversial issue. This statement may also be interpreted as a contrast to the controversial issue: ‘the vaccine is necessary BUT it is toxic’ (Winterstein 2012).
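The upward inheritance rule can be illustrated with a small forward-chaining loop over facts. This sketch applies the rule as if it were strict, whereas the rule above is only probable, and it ignores the restriction to major constitutive parts; the function name and the fact encoding are assumptions.

```python
from typing import Set, Tuple

Fact = Tuple[str, str]


def inherit_upwards(has_property: Set[Fact], part_of: Set[Fact]) -> Set[Fact]:
    """Close has_property under:
    has_property(K1,P) and part_of(K1,K) => has_property(K,P).
    Simplification: the rule is treated as strict, not defeasible."""
    facts = set(has_property)
    changed = True
    while changed:
        changed = False
        for part, prop in list(facts):
            for sub, whole in part_of:
                if sub == part and (whole, prop) not in facts:
                    facts.add((whole, prop))
                    changed = True
    return facts


# Facts for utterance A1: the adjuvant is toxic and is part of the vaccine.
part_of = {("adjuvant", "vaccine"), ("active principle", "vaccine")}
has_property = {("adjuvant", "toxic")}

derived = inherit_upwards(has_property, part_of)
print(("vaccine", "toxic") in derived)  # True
```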

  15. A2: Seven persons died during the Ebola vaccine tests
      ◮ In the GL structure of vaccine(X), the ‘test’ activity is related to the agentive role.
      ◮ Axiomatization of the GL structure: by definition, the agentive role is pre-telic: it occurs before the functions or the roles given in the telic role and their related properties are active:
        ∀ P(E) ∈ agentive-role, ∀ Q(E1) ∈ telic-role, E ≤ E1 ∧ ¬ (P ⇒ Q).
      ◮ From that point of view, A2 is about tests; it does not say anything about the vaccine roles, functions and consequences once it has been fully tested and approved.
      ◮ Argument A2 is irrelevant or neutral w.r.t. the controversial issue.
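A very reduced version of this relevance test is sketched below: an utterance whose activity belongs to the agentive role is classified as neutral, because the agentive role is pre-telic. The role sets repeat the Qualia of vaccine(X); the function name and the three labels are assumptions of the sketch.

```python
from typing import Dict, Set

# Predicate names taken from the Qualia of vaccine(X).
VACCINE_ROLES: Dict[str, Set[str]] = {
    "agentive": {"develop", "test", "sell"},
    "telic": {"protect_from", "avoid", "inject"},
}


def relevance(activity: str, roles: Dict[str, Set[str]]) -> str:
    """Classify an activity w.r.t. the issue (labels are illustrative).
    Agentive activities precede the telic ones and do not entail them,
    so an utterance about them is treated as neutral."""
    if activity in roles["agentive"]:
        return "neutral"
    if activity in roles["telic"]:
        return "relevant"
    return "unknown"


# A2 is about the 'test' activity of the vaccine.
print(relevance("test", VACCINE_ROLES))  # neutral
```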

  16. A3: No one is infected by Ebola in Europe
      ∀ Z, human(Z), in(Z,europe) ⇒ ¬ infect(E,ebola,Z).
      (1) Contradiction with telic_role(epidemic(ebola)): infect(E1,ebola,Z).
      (2) Therefore, from the telic role of epidemic, there is no dissemination at the moment and therefore, from the telic role of vaccine, no need for a vaccine.
      (3) Therefore, argument A3 is a partial attack on the controversial issue, with several restrictions: it is valid only in Europe and it reports a fact that holds at the present time (compositionality rule again).
      (4) A3 may be analyzed as a contrast: vaccine is indeed necessary as a general principle, BUT since there are no cases in Europe at the moment it may not be necessary in Europe.
