Applying Formal Verification to Reflective Reasoning
R. Kumar, Data61, CSIRO and UNSW (ramana@intelligence.org)
B. Fallenstein, Machine Intelligence Research Institute (benya@intelligence.org)
Artificial Intelligence for Theorem Proving, Obergurgl, 2017
Who am I?
Ramana Kumar
PhD, University of Cambridge
Researcher, Data61, CSIRO
Theorem Proving in HOL
Context: Beneficial AI
Source: Future of Humanity Institute, Oxford. See also: https://intelligence.org/why-ai-safety/
Context: Beneficial AI
Technical Agenda: Highly Reliable Agent Design
◮ Foundations
◮ Basic problems lacking in-principle solutions
(Note: This is not MIRI's only research agenda.)
One problem within MIRI's 2014 agenda happened to align with my expertise: theorem proving and self-verification.
Problem Statement
Design a system that
◮ always satisfies some safety property,
◮ but is otherwise capable of arbitrary self-improvement.
Problem of Self-Trust
Too little self-trust: cannot make simple self-modifications
Too much self-trust: unsound reasoning about successors
Overview
Reflective Reasoning
◮ Self-Modifying Agents
◮ Vingean Reflection
◮ Suggester-Verifier Architecture
◮ Problem and Partial Solutions
Implementation
◮ Botworld
◮ Formalisation in HOL
Reflective Reasoning
The Agent Framework
[Diagram: the agent (π) and the environment exchange observation+reward and action]
π(oa_{1:n}) = a_{n+1}
Cartesian boundary
◮ agent computed outside environment
Reality is not Cartesian
[Diagram: the agent is embedded inside the environment]
π_n(o_n) = (a_{n+1}, ⌜π_{n+1}⌝)
Vingean Principle
One can reason only abstractly about a stronger reasoner
Relevance
Self-improving system must reason about programs it cannot run: its successors
Approach
Formal logic as a model of abstract reasoning
Suggester-Verifier Architecture
[Diagram: observation → Suggester (sophisticated, untrusted) → (π, a) + proof → Verifier → (π, a) or default]
Verify: ⊢ u(h(π, a)) ≥ u(h(default))  (≈ Safe(a))
Problem with Self-Modification
Argument for Safety of Successor
◮ To create a successor, the agent must prove that the successor's actions will be safe
◮ If the successor follows the s-v architecture, it will only take actions it has proven to be safe
◮ However, concluding that an action is actually safe from a proof of its safety is problematic
This principle, T ⊢ □_T⌜ϕ⌝ ⟹ ϕ, is inconsistent. (Gödel/Löb)
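The inconsistency is an instance of Löb's theorem, spelled out here for completeness: any sufficiently strong theory that proves the reflection schema for every ϕ proves every ϕ, in particular ⊥.

\[
\text{L\"ob:}\quad
T \vdash \Box_T\ulcorner\varphi\urcorner \rightarrow \varphi
\;\Longrightarrow\;
T \vdash \varphi
\qquad\qquad
\varphi := \bot:\quad
T \vdash \Box_T\ulcorner\bot\urcorner \rightarrow \bot
\;\Longrightarrow\;
T \vdash \bot
\]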
Partial Solutions
Descending Trust
T_100 ⊢ □_{T_99}⌜ϕ⌝ ⟹ ϕ,  T_99 ⊢ □_{T_98}⌜ϕ⌝ ⟹ ϕ,  ...
Model Polymorphism
0 < κ,  T ⊢ ∀n. □_T⌜ϕ(n̄)⌝ ⟹ ϕ[κ − 1/κ](n)
Model Polymorphism
0 < κ,  T ⊢ ∀n. □_T⌜ϕ(n̄)⌝ ⟹ ϕ[κ − 1/κ](n)
If Safe(a) ≡ ∀n. Safe(a, n)
Take ϕ(n) ≡ n ≤ κ ⟹ Safe(a, n)
∀a. □⌜∀n ≤ t + 1 + κ. Safe(ā, n)⌝ ⟹ ∀n ≤ t + κ. Safe(a, n)
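One way to read the instantiation (my gloss, assuming ϕ[κ − 1/κ] simply replaces κ by κ − 1 in ϕ): each application of the reflection principle shortens the trusted horizon by one step of κ, and because κ is an arbitrary constant constrained only by 0 < κ, safety for every concrete n still follows.

\[
\varphi(n) \equiv \bigl(n \le \kappa \Rightarrow \mathrm{Safe}(a, n)\bigr)
\qquad
\varphi[\kappa - 1/\kappa](n) \equiv \bigl(n \le \kappa - 1 \Rightarrow \mathrm{Safe}(a, n)\bigr)
\]
\[
T \vdash \forall n.\;
\Box_T\ulcorner \bar n \le \kappa \Rightarrow \mathrm{Safe}(\bar a, \bar n)\urcorner
\;\Longrightarrow\;
\bigl(n \le \kappa - 1 \Rightarrow \mathrm{Safe}(a, n)\bigr)
\]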
Implementation
Botworld: Concrete Framework for Embedded Agents
Robots can construct/inspect/destroy/program other robots
Botworld Formalisation
Semantics
◮ step : state → state
◮ Robots run policies in CakeML
Counterfactuals
◮ state-with-hole for proposed action
◮ steph : state-with-hole → action → (obs, state) option
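To make the two interface functions concrete, here is a rough OCaml module signature. It is only an illustration: the actual formalisation defines these types and functions in HOL4, and the names below (BOTWORLD, state_with_hole) are hypothetical.

(* Illustrative OCaml signature for the Botworld semantics interface.
   The real development lives in HOL4; these names are placeholders. *)
module type BOTWORLD = sig
  type state            (* a complete Botworld configuration *)
  type state_with_hole  (* a state missing one robot's chosen output *)
  type action
  type obs              (* the observation handed back to the robot *)

  (* Total small-step semantics over complete states. *)
  val step : state -> state

  (* Plug a proposed action into the hole and take one step, yielding the
     robot's observation and the successor state; None when no result is
     produced (interpretation of the option assumed, not from the slides). *)
  val steph : state_with_hole -> action -> (obs * state) option
end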
Suggester-Verifier Implementation
sv(π_default, σ, obs):
  1. (π, a) = run π_default
  2. (π′, a′, thm) = run σ(obs, π, a)
  3. Check thm has the correct form
  4. Write (π, a) or (π′, a′) accordingly
Reflection Library
Automation for: □⌜LCA k̄ ⟹ P⌝ implies LCA(k + 1) ⟹ P
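A minimal runnable sketch of the sv wrapper above, written in OCaml rather than CakeML/HOL; all names and types here (policy, theorem, suggester, check_thm) are illustrative placeholders and not the actual verified implementation.

(* Hypothetical OCaml sketch of sv: run the default policy, ask the untrusted
   suggester for a better (policy, action) plus a theorem, and keep the
   suggestion only if the theorem passes the verifier's check. *)
type observation
type action
type policy = Policy of (observation -> policy * action)
type theorem  (* an LCF-style theorem value produced by the suggester *)

let sv (Policy default)
       (suggester : observation -> policy * action -> (policy * action * theorem) option)
       (check_thm : theorem -> policy * action -> bool)
       (obs : observation) : policy * action =
  (* 1. run the default policy *)
  let (pi, a) = default obs in
  (* 2. run the suggester on the observation and the default's proposal *)
  match suggester obs (pi, a) with
  (* 3. check the theorem has the correct form *)
  | Some (pi', a', thm) when check_thm thm (pi', a') -> (pi', a')
  (* 4. otherwise fall back to the default's (policy, action) *)
  | _ -> (pi, a)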
Implementation Challenge
Project Proposal
Build a Botworld agent that self-modifies into a provably safe agent of the same architecture.
Eventual Project
Discover how far theorem proving technology is from implementing the above...
Outlook
Implementing a Self-Improving Botworld Agent
◮ Looks possible, but with more effort than anticipated
◮ I would estimate 4 person-years (building on > 25 person-years of prerequisite work)
◮ Improvements on model polymorphism would be nice!
Theorem Proving for AI
◮ Specifications Needed!
◮ Novel Architectures for AI Systems, e.g., improve on Suggester-Verifier to support logical induction and non-proof-based reasoning
◮ Reducing Problems to Functional Correctness