Knowledge Representation in Practice: Project Halo and the Semantic - PowerPoint PPT Presentation

Knowledge Representation in Practice: Project Halo and the Semantic Web
Mark Greaves
Vulcan, Inc.
markg@vulcan.com
(206) 342-2276

Talk Outline
The Halo Vision
Systems AI – Vulcan’s Halo Program
  – The Halo Pilot: The Limits of Expert Systems
  – Halo Phase II: Deep Reasoning over the AP problem
  – Halo Today: Leveraging the Web
The Future of Halo

KR&R Systems, Scaling, and the Google Property
We seek KR&R systems that have the “Google Property”: they get (much) better as they get bigger
  – Google’s PageRank™ yields better relevance judgments when it indexes more pages
  – Current KR&R systems have the antithesis of this property
So what are the components of a scalable KR&R system?
  – Distributed, robust, reliable infrastructure
  – Multiple linked ontologies and points of view
    • Single ontologies are feasible only at the program/agency level
  – Mixture of deep and shallow knowledge repositories
  – Simulations and procedural knowledge components
    • “Knowing how” and “knowing that”
  – Embrace uncertainty, defaults, and nonmonotonicity in all components
  – Uncertainty in the KB – you don’t know what you know, things go away, contradiction is rampant, resource-aware computing is necessary, surveying the KB is not possible
Scalable KR&R Systems should look just like the Web!! (coupled with great question-answering technology)
[Figure: KR&R Goals (Speed & Quality of Answers) plotted against KR&R System Scale (Number of Assertions, Number of Ontologies/Contexts, Number of Rules, Linkages to other KBs, Reasoning Engine Types, …), with curves labeled “Ideal KR&R” and “KR&R now”]

Envisioning the Digital Aristotle for Scientific Knowledge
Inspired by Dickson’s Final Encyclopedia, the HAL-9000, and the broad SF vision of computing
  – The “Big AI” Vision of computers that work with people
The volume of scientific knowledge has outpaced our ability to manage it
  – This volume is too great for researchers in a given domain to keep abreast of all the developments
  – Research results may have cross-domain implications that are not apparent due to terminology and knowledge volume
“Shallow” information retrieval and keyword indexing systems are not well suited to scientific knowledge management because they cannot reason about the subject matter
  – Example: “What are the reaction products if metallic copper is heated strongly with concentrated sulfuric acid?” (Answer: Cu²⁺, SO₂(g), and H₂O)
Response to a query should supply the answer (possibly coupled with conceptual navigation) rather than simply list 1000s of possibly relevant documents

How do we get to the Digital Aristotle?
What we want:
  – Technology to enable a global, widely-authored, very large knowledge base (VLKB) about human affairs and science,
  – Technology that answers questions and proactively supplies information,
  – Technology that uses powerful reasoning about rules and processes, and
  – Technology that can be customized in its content and actions for individual organizations or people
Vulcan’s Goals
  – Address the problem of scale in Knowledge Bases
    • Scaling by web-style participation
    • Incorporate large numbers of people in KB construction and maintenance
  – Have high impact
    • Show that the Digital Aristotle is possible
    • Change our experience of the Web
    • Have quantifiable, explainable metrics
  – Be a commercializable approach
Project Halo is a concrete research program that addresses these goals
[Figure: KB Effort (cost, people, …) plotted against KB size (number of assertions, complexity, …), with labels “Now”, “Vulcan”, and “Future”]

The Project Halo Pilot (2004)
In 2004, Vulcan funded a six-month effort to determine the state-of-the-art in fielded “deep reasoning” systems
  – Can these systems support reasoning in scientific domains?
  – Can they answer novel questions?
  – Can they produce domain appropriate answer justifications?
Three teams were selected, and used their available technology
  – SRI, with Boeing Phantom Works and UT-Austin
  – Cycorp
  – Ontoprise GmbH
No NLP in the Pilot
[Figure: diagram with labels FL, English, English, NLP, QA System, Answer & Justification]

The Halo Pilot Domain
70 pages from the AP-chemistry syllabus (Stoichiometry, Reactions in aqueous solutions, Acid-Base equilibria)
  – Small and self-contained enough to be do-able in a short period of time, but large enough to create many novel questions
  – Complex “deep” combinations of rules
  – Standardized exam with well understood scores (AP1-AP5)
  – Chemistry is an exact science, more “monotonic”
  – No undue reliance on graphics (e.g., free-body diagrams)
  – Availability of experts for exam generation and grading
Example: Balance the following reactions, and indicate whether they are examples of combustion, decomposition, or combination
    • C₄H₁₀ + O₂ → CO₂ + H₂O
    • KClO₃ → KCl + O₂
    • CH₃CH₂OH + O₂ → CO₂ + H₂O
    • P₄ + O₂ → P₂O₅
    • N₂O₅ + H₂O → HNO₃
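The balancing half of the example question can be sketched in a few lines of Python. This is an illustrative sketch only and is not how any of the Pilot systems worked: it treats balancing as a small linear-algebra problem (atom conservation per element) and solves it with exact fractions. The function names `parse_formula` and `balance` are hypothetical, formulas are assumed to be plain strings without parentheses or charges, and the classification step (combustion, decomposition, combination) is not attempted.

```python
import re
from fractions import Fraction
from functools import reduce
from math import gcd


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C4H10' (no parentheses or charges)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def balance(reactants, products):
    """Return the smallest positive integer coefficients, reactants first."""
    species = [parse_formula(f) for f in reactants + products]
    elements = sorted({e for s in species for e in s})
    n = len(species)
    # One row per element; product columns are negated so that a valid
    # coefficient vector makes every row sum to zero.
    rows = [[Fraction(s.get(e, 0) * (1 if i < len(reactants) else -1))
             for i, s in enumerate(species)] for e in elements]

    # Gauss-Jordan elimination to reduced row echelon form.
    pivots = []
    r = 0
    for c in range(n):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break

    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected exactly one degree of freedom")

    # Set the single free coefficient to 1 and read off the others.
    solution = [Fraction(0)] * n
    solution[free[0]] = Fraction(1)
    for row_index, c in enumerate(pivots):
        solution[c] = -rows[row_index][free[0]]

    # Scale the fractions to the smallest whole numbers.
    lcm = reduce(lambda a, b: a * b // gcd(a, b),
                 (x.denominator for x in solution), 1)
    ints = [int(x * lcm) for x in solution]
    g = reduce(gcd, ints)
    return [i // g for i in ints]


print(balance(["C4H10", "O2"], ["CO2", "H2O"]))  # [2, 13, 8, 10]
print(balance(["KClO3"], ["KCl", "O2"]))         # [2, 2, 3]
```

The first result reads as 2 C4H10 + 13 O2 → 8 CO2 + 10 H2O. A purely numeric solver like this produces an answer but no explanation, which is the gap the justification requirement in the Pilot was meant to probe.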

Halo Pilot Evaluation Process
Evaluation
  – Teams were given 4 months to formulate the knowledge in 70 pages from the AP Chemistry syllabus
  – Systems were sequestered and run by Vulcan against 100 novel AP-style questions (hand coded queries)
  – Exams were graded by chemistry professors using AP methodology
Metrics
  – Coverage: The ability of the system to answer novel questions from the syllabus
    • What percentage of the questions was the system capable of answering?
  – Justification: The ability to provide concise, domain appropriate explanations
    • What percentage of the answer justifications were acceptable to domain evaluators?
  – Query encoding: The ability to faithfully represent queries
  – Brittleness: What were the major causes of failure? How can these be remedied?
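As a rough illustration of the two percentage metrics, the sketch below computes coverage and justification rates from a list of graded questions. Everything here is an assumption made for illustration: the `GradedQuestion` record, the function names, the choice of denominator (all questions posed), and the toy data, which is not taken from the Pilot.

```python
from dataclasses import dataclass


@dataclass
class GradedQuestion:
    answer_accepted: bool          # grader accepted the system's answer
    justification_accepted: bool   # grader accepted the explanation


def coverage(graded):
    """Percentage of questions with an accepted answer."""
    return 100.0 * sum(q.answer_accepted for q in graded) / len(graded)


def justification_rate(graded):
    """Percentage of questions with an accepted justification.

    Assumption: the denominator is every question posed, not only
    the questions that were answered.
    """
    return 100.0 * sum(q.justification_accepted for q in graded) / len(graded)


# Hypothetical toy data for four questions, not Pilot results.
toy = [
    GradedQuestion(True, True),
    GradedQuestion(True, False),
    GradedQuestion(False, False),
    GradedQuestion(False, False),
]
print(coverage(toy))            # 50.0
print(justification_rate(toy))  # 25.0
```

With several graders, as in the Pilot, the same functions would be applied once per grader's marks.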

Halo Pilot Results
[Figure: Challenge Answer Scores (%) by grader (SME1, SME2, SME3) for CYCORP, ONTOPRISE, and SRI]
Best scoring system achieved roughly an AP3 (on our very restricted syllabus)
[Figure: Challenge Justification Scores (%) by grader (SME1, SME2, SME3) for CYCORP, ONTOPRISE, and SRI]
Cyc had issues with answer justification and question focus
Full Details in AI Magazine 25:4, “Project Halo: Towards a Digital Aristotle” ...and at www.projecthalo.com

From the Halo Pilot to the Halo Project
Halo Pilot Results
  – Much better than expected results on a very tough evaluation
  – Most failures attributed to modeling errors due to contractors’ lack of domain knowledge
  – Expensive: O($10,000) per page, per team
Project Halo Goal: To determine whether tools can be built to facilitate robust knowledge formulation, query and evaluation by domain experts, with ever-decreasing reliance on knowledge engineers
  – Can SMEs build robust question-answering systems that demonstrate excellent coverage of a given syllabus, the ability to answer novel questions, and produce readable domain appropriate justifications using reasonable computational resources?
  – Will SMEs be capable of posing questions and complex problems to these systems?
  – Do these systems address key failure, scalability and cost issues encountered in the Pilot?
Scope: Selected portions of the AP syllabi for chemistry, biology and physics
  – This allows us to expand the types of reasoning addressed by Halo
Two competing teams/approaches (F-Logic, Concept Maps/KM)
Evaluation and downselect in September 2006
