On Variational Inference and Optimal Control
Volkswagen Group AI Research
Patrick van der Smagt, Director of AI Research, Volkswagen Group, Munich, Germany
https://argmax.ai
control approach #1: feedback control

[block diagram: desired state x_d(t) → controller K → u(t) → plant → x(t+1), fed back through a unit delay z⁻¹]

problem: requires a very fast feedback loop
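The loop above can be sketched in a few lines. This is a hypothetical scalar example, not from the talk: the plant x(t+1) = a·x(t) + b·u(t) and the gain K are invented values chosen so the loop converges.

```python
import numpy as np

a, b = 1.0, 0.5      # assumed plant dynamics x(t+1) = a*x(t) + b*u(t)
K = 1.5              # assumed feedback gain
x_d = 1.0            # desired state
x = 0.0
history = []
for t in range(50):
    u = K * (x_d - x)          # controller acts on the tracking error
    x = a * x + b * u          # plant advances one step (the z^-1 delay)
    history.append(x)
# the state converges to the desired value x_d
```

Because the controller only reacts to the measured error, the loop must run much faster than the plant dynamics, which is the problem the slide points out.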
control approach #2: model-based feedback control

[block diagram: x_d(t) → controller K (LQR, using an inverse model) → u(t) → plant → x(t+1), fed back through z⁻¹]

problem: requires a fast feedback loop and an inverse model
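As a sketch of the LQR controller mentioned above: for a known linear model x(t+1) = A x(t) + B u(t), the gain K can be obtained by iterating the discrete Riccati equation. All matrices below are illustrative values, not from the talk.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed double-integrator-like plant
B = np.array([[0.0], [0.1]])
Q = np.eye(2)                            # state cost
R = np.array([[0.1]])                    # control cost

P = Q.copy()
for _ in range(500):                     # iterate the Riccati recursion to a fixed point
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# the closed loop A - B K should be stable: all eigenvalues inside the unit circle
eigs = np.linalg.eigvals(A - B @ K)
```

This is exactly where the slide's problem statement bites: the whole construction presumes an accurate, hand-engineered model (A, B).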
control approach #3: model-reference control

[block diagram: x_d(t) → controller K → u(t) → plant → x(t+1); in parallel, a simulator ("model") predicts the future states x(t+1:T)]

the simulator "dreams" the future, aka predictive coding
problem: how do I get this model?
problems
1) engineered models are expensive to set up
2) engineered models are expensive to compute
3) engineered models do not scale
we really want to represent p(x)

we can write

p(x) = ∫ p(x|z) p(z) dz

[graphical model: z → x]
we really want to represent p(x)

p(x) = ∫ p(x|z) p(z) dz

Two problems:
(1) how do we shape p(z) to carry the right information about x? A: We don't hand-design it; assume it is a Gaussian pd.
(2) how do we compute the integral? It is intractable (we only have the data; we would need MCMC)
we really want to represent p(x)

p(x) = ∫ p(x|z) p(z) dz — bummer, we don't have p(z)

Trick to do efficient MCMC:
(1) we choose a specific x and look in its neighbourhood of p(z) (to find the z that most likely produced x)
(2) use p(z|x) to sample the corresponding z
(3) evaluate p(x|z) there
we really want to represent p(x)

p(x) = ∫ p(x|z) p(z) dz

Trick to do efficient MCMC (but we don't have p(z|x), so approximate it):
(1) we choose a specific x and look in its neighbourhood of p(z) (to find the z that most likely produced x)
(2) use q(z|x) to sample the corresponding z
(3) evaluate p(x|z) there
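The three steps above can be sketched as importance sampling in one dimension. Everything here is an invented toy model (Gaussian prior, Gaussian likelihood, Gaussian proposal q), chosen so the true p(x) is known in closed form and the estimate can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss(v, mu, s):
    # Gaussian density N(v; mu, s^2)
    return np.exp(-0.5 * ((v - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

x = 2.0
# toy model: z ~ N(0,1), x|z ~ N(z, 0.5^2)  =>  analytically p(x) = N(x; 0, sqrt(1.25))
true_px = gauss(x, 0.0, np.sqrt(1.25))

# proposal q(z|x): a Gaussian centred near the z that most likely produced x
z = rng.normal(1.6, 0.5, size=200_000)          # step (2): sample z from q(z|x)
w = gauss(z, 0.0, 1.0) / gauss(z, 1.6, 0.5)     # importance weights p(z)/q(z|x)
est_px = np.mean(w * gauss(x, z, 0.5))          # step (3): E_q[ p(x|z) p(z)/q(z|x) ]
```

Sampling z from a proposal concentrated where x was likely produced, rather than from the broad prior, is what makes the estimate efficient.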
minimise the Kullback-Leibler divergence to make q look like p

KL[q(z|x) ‖ p(z|x)] = Σ_z q(z|x) log ( q(z|x) / p(z|x) )
                    = E[log q(z|x) − log p(z|x)]
                    = E[log q(z|x) − log p(x|z) − log p(z) + log p(x)]     (Bayes: p(z|x) = p(x|z) p(z) / p(x))

log p(x) − KL[q(z|x) ‖ p(z|x)] = E[log p(x|z) − (log q(z|x) − log p(z))]
                                = E[log p(x|z)] − KL[q(z|x) ‖ p(z)]
log p(x) − KL[q(z|x) ‖ p(z|x)] = E[log p(x|z)] − KL[q(z|x) ‖ p(z)]

left-hand side: I want q to be close to p(z|x) — please make q equal to the posterior.
right-hand side: I can compute this (maximise it by sampling z and gradient descent, while keeping q(z|x) close to the prior p(z)).
by maximising this instead of using MCMC, we can get the parameters θ of our generative model: argmax_θ …
this is why I chose argmax.ai for our lab website
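The identity on this slide can be verified numerically. Below is a toy discrete model with two latent values z ∈ {0, 1} and a fixed observation x; all numbers are invented, and both sides of the identity are computed exactly.

```python
import numpy as np

p_z = np.array([0.3, 0.7])          # prior p(z)
p_x_given_z = np.array([0.8, 0.2])  # likelihood p(x|z) for our fixed x
q = np.array([0.6, 0.4])            # some approximate posterior q(z|x)

p_x = np.sum(p_x_given_z * p_z)                 # marginal p(x)
p_z_given_x = p_x_given_z * p_z / p_x           # exact posterior via Bayes

kl_posterior = np.sum(q * np.log(q / p_z_given_x))   # KL[q || p(z|x)]
kl_prior = np.sum(q * np.log(q / p_z))               # KL[q || p(z)]

lhs = np.log(p_x) - kl_posterior                     # log p(x) - KL[q || p(z|x)]
elbo = np.sum(q * np.log(p_x_given_z)) - kl_prior    # E_q[log p(x|z)] - KL[q || p(z)]
# lhs and elbo agree to machine precision, for any choice of q
```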
efficient computation as a neural network: the Variational AutoEncoder

encoder q(z|x): maps the system state x to a probability density over the latent space z, with a (Gauss) prior — "nonlinear PCA"
decoder p(x|z): maps z to x̃, a reconstruction of x

loss = reconstruction loss + KL[q(z|x) ‖ prior]

Durk Kingma and Max Welling, 2013; Rezende, Mohamed & Wierstra, 2014
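A minimal numpy sketch of the loss pictured above, under the usual assumptions (diagonal-Gaussian encoder, standard-normal prior, squared-error reconstruction). No networks are trained here: the encoder and decoder outputs are stand-in arrays with invented shapes and values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))            # batch of "observations" (stand-in data)

# stand-in encoder output q(z|x) = N(mu, exp(log_var))
mu, log_var = rng.normal(size=(8, 2)), rng.normal(size=(8, 2))
eps = rng.normal(size=(8, 2))
z = mu + np.exp(0.5 * log_var) * eps   # reparameterisation trick: z ~ q(z|x)

x_rec = rng.normal(size=(8, 5))        # stand-in decoder output (reconstruction)

rec_loss = np.mean(np.sum((x - x_rec) ** 2, axis=1))   # reconstruction loss
# closed-form KL[ N(mu, exp(log_var)) || N(0, 1) ], per example, averaged:
kl = np.mean(-0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))
loss = rec_loss + kl                   # reconstruction loss + KL[q(z|x) || prior]
```

The reparameterisation z = mu + sigma·eps is what lets gradients flow through the sampling step when this is trained end to end.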
unsupervised preprocessing of sensor data with a VAE — emerging properties
Maximilian Karl, Nutan Chen, Patrick van der Smagt (2014)
video: SynTouch, LLC
[plot: taxel values (−150 to 200) over time (0–7 s)]
Deep Variational Bayes Filter
Maximilian Karl, Justin Bayer, Maximilian Sölch

Graphical model assumes latent Markovian dynamics:
i) observations depend only on the current state,
ii) the state depends only on the previous state and control signal:

p(x_{1:T}, z_{1:T} | u_{1:T}) = ρ(z_1) ∏_{t=1}^{T−1} p(z_{t+1} | z_t, u_t) ∏_{t=1}^{T} p(x_t | z_t)
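The factorisation above can be sampled ancestrally: draw z₁ from ρ, then alternate transition and emission. For illustration all distributions below are chosen linear-Gaussian with invented matrices and dimensions; the talk's actual model is learned, not hand-set.

```python
import numpy as np

rng = np.random.default_rng(1)
T, dz, dx = 10, 2, 3                 # horizon, latent dim, observation dim (assumed)
A = 0.9 * np.eye(dz)                 # assumed latent transition matrix
B = np.ones((dz, 1))                 # assumed control matrix
C = rng.normal(size=(dx, dz))        # assumed emission matrix
u = rng.normal(size=(T, 1))          # control signals

z = np.zeros((T, dz))
x = np.zeros((T, dx))
z[0] = rng.normal(size=dz)                        # z1 ~ rho(z1)
x[0] = C @ z[0] + 0.1 * rng.normal(size=dx)       # x1 ~ p(x1|z1)
for t in range(T - 1):
    z[t + 1] = A @ z[t] + B @ u[t] + 0.1 * rng.normal(size=dz)   # p(z_{t+1}|z_t,u_t)
    x[t + 1] = C @ z[t + 1] + 0.1 * rng.normal(size=dx)          # p(x_t|z_t)
```

Note how each x depends only on its own z, and each z only on its predecessor and the control — exactly assumptions i) and ii).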
Deep Variational Bayes Filtering: filtering in the latent space of a variational autoencoder

[diagram: latent states z(t), z(t+1), z(t+2) evolve as z(t+1) = A z(t) + B u(t) + C w(t), with control inputs u(t), u(t+1); w = process noise, inferred from the next system state x(t+1); each z(t) is decoded into a reconstruction x̃(t) of the system state]

Karl & Soelch & Bayer & van der Smagt, ICLR 2017
Deep Variational Bayes Filter: example — the pendulum

m l² φ̈(t) = −μ φ̇(t) + m g l sin φ(t) + u(t)

transition model: z(t+1) = A z(t) + B u(t) + C w(t), with w the process noise inferred from x(t+1)
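The pendulum dynamics on the slide can be simulated directly, which is how ground-truth data for such an example is typically generated. The constants (m, l, μ, g), the time step, and the initial condition below are assumptions, not values from the talk; note the +mgl sin φ sign convention places the unstable equilibrium at φ = 0 and the stable one at φ = π.

```python
import numpy as np

m, l, mu, g = 1.0, 1.0, 0.5, 9.81    # assumed mass, length, friction, gravity
dt, steps = 0.01, 4000               # 40 s of semi-implicit Euler integration
phi, phi_dot = 0.1, 0.0              # start just off the unstable equilibrium
for _ in range(steps):
    u = 0.0                          # no control input
    phi_ddot = (-mu * phi_dot + m * g * l * np.sin(phi) + u) / (m * l**2)
    phi_dot += dt * phi_ddot         # update velocity first (semi-implicit Euler)
    phi += dt * phi_dot
# with friction and no control, the pendulum settles at the hanging position phi = pi
```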
Formula E use case with Audi Motorsport
Philip Becker

Audi Motorsport is interested in optimal energy strategies. Knowing the future battery temperature is key.

Approach: learn a simulator of battery temperature given race conditions and control commands; use the simulator to choose the strategy with the best temperature for final performance.

[plot: predicted battery temperature, baseline vs. our method]

Project
… initiated end of August,
… started a week later,
… deployed to hardware during a test in November 2017,
… tested on the car during a race early December 2017.

Results: error < ±1 degree in 50% of the races