
Hippocampal formation breaks combinatorial explosion for reinforcement learning: A conjecture
Andras Lorincz
Department of Information Systems, Faculty of Informatics, Eötvös Loránd University

Support and collaborators
Support
- AFOSR Information Directorate: on reinforcement learning
- EU Framework Program: on multiagent systems
Collaborators
- Barnabas Poczos
- Zoltan Szabo
- Gabor Szirtes
- Istvan Szita
Combinatorial Explosion AAAI FSS BICA 2008

Motivation: Symbols and symbol manipulation
[Figure labels: control, dynamical system, mixed observation, independent driving components]

Problem statement
- Artificial Intelligence started from computations
- Computations work by manipulating symbols
- The symbol grounding problem emerges
- Grounding of symbols: connect the symbols to experiences
  - Symbols represent parts (components) of the world and their relations
  - Symbol grounding corresponds to graph matching
  - It is exponentially hard
- It seems necessary to focus on polynomial-time learning tasks
- Then the symbol learning problem emerges (Lorincz, 2008)

The symbol learning task
Find high-entropy variables, or symbols, $x_i$ ($i = 1, 2, \dots, k$), and low-entropy random variables, or manifestations of the symbols, $z_{i,j_i}$ ($i = 1, 2, \dots, k$; $j_i = 1, 2, \dots, K_i$; $K_i \gg 1$ for all $i$), such that the transition probability between the low-entropy variables $z_{i,j_i}$ and $z_{l,j_l}$, i.e., $P(z_{l,j_l} \mid z_{i,j_i})$, is roughly determined by the transition probability between the high-entropy variables $x_i$ and $x_l$, i.e., by $P(x_l \mid x_i)$, for almost all manifestations.
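To make the definition above concrete, here is a toy sketch in Python of one case where the condition holds exactly. It assumes, purely for illustration, that each symbol spreads its probability uniformly over its manifestations; that uniformity and the names `manifestation_transitions` and `aggregate` are hypothetical and not taken from the talk.

```python
def manifestation_transitions(symbol_P, K):
    """Build P(z_{l,j_l} | z_{i,j_i}) = P(x_l | x_i) / K[l].

    Toy assumption: probability is spread uniformly over the K[l]
    manifestations of the target symbol x_l.
    """
    k = len(symbol_P)
    labels = [(i, j) for i in range(k) for j in range(K[i])]
    z_P = [[symbol_P[i][l] / K[l] for (l, _) in labels] for (i, _) in labels]
    return labels, z_P


def aggregate(labels, z_P, k):
    """For each source manifestation, sum the probabilities per target symbol."""
    rows = []
    for row in z_P:
        agg = [0.0] * k
        for (l, _), p in zip(labels, row):
            agg[l] += p
        rows.append(agg)
    return rows


# Two symbols with 3 and 4 manifestations, respectively (made-up numbers).
symbol_P = [[0.9, 0.1], [0.3, 0.7]]
K = [3, 4]
labels, z_P = manifestation_transitions(symbol_P, K)
agg = aggregate(labels, z_P, len(symbol_P))
```

In this construction every manifestation of $x_i$ has the same aggregated transition row, namely $P(x_l \mid x_i)$. The task described on the slide is the inverse problem: recovering such a grouping from the manifestation-level data alone.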

The symbol learning task
- The symbol learning task is possible: Tao (2005) rephrased the famous Szemerédi Regularity Lemma of extremal graph theory in the language of information theory
- The symbol learning task is polynomial: Frieze and Kannan (1999)

If we have the symbols
- Reinforcement learning is still exponential
- BUT IF variables factorize ('complementarity'), e.g., [color and shape], [position and speed], [where and what],
  - then factored RL is polynomial
  - with a novel sampling technique (I. Szita and A. Lorincz, 2008)
- No general method to find variables that factorize
- No solution to the factored symbol learning task
- Exception: control (position, speed, acceleration, force)
  - in linear approximation
  - Autoregressive Moving Average (ARMA) processes

ARMA processes
Steps
1. Remove temporal dependencies (ARMA removal, Gaussian assumption)
2. Compute ARMA innovations := driving causes of ARMA processes
3. Analyze the causes; they should be independent
4. Find the hidden independences: Independent Subspace Analysis
5. Learn the hidden processes driven by the hidden causes
- Independent Process Analysis: polynomial time algorithm (Poczos, Szabo, Lorincz, 2006-2007)
- Putting the steps into an ANN and insisting on Hebbian learning at each step, one receives an architecture that is similar to the hippocampal formation (HC). HC
  - is responsible for declarative memory (planning aspect)
  - holds representations of position and direction in rodents

Comparison:
1. Hebbian architecture for Autoregressive Independent Process Analysis
versus
2. hippocampal formation

The architecture we get
[Figure panels: Architecture; Hippocampal formation]
- with additional CA3-dentate gyrus loops serving moving average compensation

Conjecture repeated
Hippocampal formation breaks combinatorial explosion for reinforcement learning

Thank you!

Supplementary materials and references

Grids and place cells
- Grids and place fields emerge together in the model (Lorincz, Kiszlinger, Szirtes, 2008)
[Figure labels: inputs, hexagonal grids, place fields]

Independent Process Analysis
[Equations not recovered from the slide; labels: observed, input of ISA, estimated]

References-1
- Christian Jutten, Jeanny Hérault: Blind separation of sources: An adaptive algorithm based on neuromimetic architecture. Signal Processing, 24:1-10, 1991.
- Pierre Comon: Independent component analysis, a new concept? Signal Processing, 36(3):287-314, 1994.
- Jean-Francois Cardoso: Multidimensional independent component analysis. ICASSP'98, volume 4, 1941-1944.
- Zoltán Szabó, Barnabás Póczos, András Lőrincz: Undercomplete blind subspace deconvolution. Journal of Machine Learning Research, 8(May):1063-1095, 2007.

References-2
- Aapo Hyvarinen: Independent component analysis for time-dependent stochastic processes. ICANN'98, 541-546.
- Barnabás Póczos, Bálint Takács, András Lőrincz: Independent subspace analysis on innovations. ECML-2005, 698-706.
- Barnabás Póczos, András Lőrincz: D-optimal Bayesian interrogation for parameter and noise identification of recurrent neural networks, 2008 (submitted). Available at http://arxiv.org/abs/0801.1883
- Zoltán Szabó, András Lőrincz: Towards independent subspace analysis in controlled dynamical systems. ICARN-2008 (accepted).
