Learning Light Transport the Reinforced Way
Ken Dahm and Alexander Keller
Light Transport Simulation: How to do importance sampling

• compute functionals of a Fredholm integral equation of the second kind

  L(x, ω) = L_e(x, ω) + ∫_{S²₊(x)} L(h(x, ω_i), −ω_i) f_r(ω_i, x, ω) cos θ_i dω_i

• example: direct illumination

  L(x, ω) = L_e(x, ω) + ∫_{S²₊(x)} L_e(h(x, ω_i), −ω_i) f_r(ω_i, x, ω) cos θ_i dω_i

          ≈ L_e(x, ω) + (1/N) ∑_{i=0}^{N−1} L_e(h(x, ω_i), −ω_i) f_r(ω_i, x, ω) cos θ_i / p(ω_i)

• importance sampling densities: p ∝ f_r cos θ (sampling the BRDF) or, better, p ∝ L_e f_r cos θ (also accounting for the emitters)
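To make the estimator concrete, here is a minimal Python sketch, assuming a purely diffuse BRDF f_r = albedo/π so that the density p ∝ f_r cos θ reduces to cosine-weighted hemisphere sampling; the `trace_emitted` callback is a hypothetical stand-in for evaluating L_e(h(x, ω_i), −ω_i) against a scene.

```python
import math
import random

# Sketch of the one-sample-per-direction estimator above, assuming a diffuse
# BRDF f_r = albedo / pi, for which p ~ f_r * cos(theta) is cosine-weighted
# hemisphere sampling.  trace_emitted is a hypothetical stand-in for
# L_e(h(x, w_i), -w_i), i.e. the emission seen along w_i from x.

def sample_cosine_hemisphere(u1, u2):
    """Map two uniform numbers to a direction with pdf p(w) = cos(theta) / pi."""
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    x, y = r * math.cos(phi), r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))       # cos(theta) in the local frame
    return (x, y, z), z / math.pi            # direction, pdf

def direct_illumination(trace_emitted, albedo, n_samples=64):
    """Estimate (1/N) sum_i L_e * f_r * cos(theta_i) / p(w_i)."""
    f_r = albedo / math.pi
    estimate = 0.0
    for _ in range(n_samples):
        w_i, pdf = sample_cosine_hemisphere(random.random(), random.random())
        cos_theta = w_i[2]
        estimate += trace_emitted(w_i) * f_r * cos_theta / pdf
    return estimate / n_samples

# Toy usage: a constant "sky" of radiance 1 over the upper hemisphere integrates
# to the albedo (white furnace test), which the estimator reproduces exactly.
print(direct_illumination(lambda w_i: 1.0, albedo=0.8))
```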
Machine Learning
Machine Learning: Taxonomy

• supervised learning: learning from labeled data
  – goal: extrapolate/generalize the response to unseen data
  – example: artificial neural networks

• unsupervised learning: learning from unlabeled data
  – goal: identify structure in the data
  – example: autoencoder networks

• semi-supervised learning: reward-based supervision
  – goal: maximize the reward
  – example: reinforcement learning
The Reinforcement Learning Problem: Maximize the reward

• a policy π_t : S → A(S_t) selects an action A_t ∈ A(S_t) given the current state S_t ∈ S
• the state transition yields a reward R_{t+1}(A_t | S_t) ∈ ℝ
• classic goal: find a policy that maximizes the discounted cumulative reward

  V_π(S_t) ≡ ∑_{k=0}^{∞} γ^k R_{t+1+k}(A_{t+k} | S_{t+k}),  where 0 < γ < 1

[Figure: agent–environment loop — the agent in state S_t takes action A_t; the environment returns the reward R_{t+1}(A_t | S_t) and the next state S_{t+1}]
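As a small aside, the discounted cumulative reward can be evaluated for a finite episode by truncating the infinite sum; the reward trajectory in the sketch below is a made-up example, not part of the talk.

```python
# Sketch of the discounted cumulative reward V_pi(S_t) above, truncating the
# infinite sum at the end of a finite episode.  rewards[k] stands for
# R_{t+1+k}(A_{t+k} | S_{t+k}) along one trajectory under the policy pi.

def discounted_return(rewards, gamma=0.9):
    assert 0.0 < gamma < 1.0
    return sum(gamma ** k * r for k, r in enumerate(rewards))

# Example: a constant reward of 1 approaches 1 / (1 - gamma) = 10 for gamma = 0.9.
print(discounted_return([1.0] * 100, gamma=0.9))
```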
The Reinforcement Learning Problem: Q-Learning [Watkins 1989]

• finds an optimal action-selection policy for any given Markov decision process

  Q′(s, a) = (1 − α) · Q(s, a) + α · ( r(s, a) + γ V(s′) )   for a learning rate α ∈ [0, 1]

• with the following options for the discounted cumulative reward V(s′):

  V(s′) ≡ max_{a′∈A} Q(s′, a′)                     consider the best action
        ≡ ∑_{a′∈A} π(s′, a′) Q(s′, a′)             policy-weighted average over a discrete action space
        ≡ ∫_A π(s′, a′) Q(s′, a′) da′              policy-weighted average over a continuous action space
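A minimal tabular sketch of this update, using the "consider best action" choice V(s′) = max_{a′∈A} Q(s′, a′); the two-state toy MDP in the usage example is purely illustrative and not from the talk.

```python
import random
from collections import defaultdict

# Sketch of the tabular Q-learning update above with V(s') = max_a' Q(s', a').
# The environment (rewards, transitions, action set) is a hypothetical toy MDP.

def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Q'(s,a) = (1 - alpha) Q(s,a) + alpha (r + gamma max_a' Q(s',a'))."""
    v_next = max(Q[(s_next, a_next)] for a_next in actions)
    Q[(s, a)] = (1.0 - alpha) * Q[(s, a)] + alpha * (r + gamma * v_next)

# Toy usage on a two-state chain: action 1 in state 0 yields a reward of 1 and
# moves to the absorbing state 1; action 0 stays in state 0 with no reward.
Q = defaultdict(float)
for _ in range(1000):
    a = random.choice([0, 1])
    r, s_next = (1.0, 1) if a == 1 else (0.0, 0)
    q_learning_step(Q, 0, a, r, s_next, actions=[0, 1])
print(Q[(0, 1)], Q[(0, 0)])   # converges towards 1.0 and gamma * 1.0
```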
Light Transport Simulation and Reinforcement Learning
Light Transport Simulation and Reinforcement Learning: Structural equivalence of the integral equations

• matching terms:

  L(x, ω)  = L_e(x, ω)                        + ∫_{S²₊(x)} f_r(ω_i, x, ω) cos θ_i L(h(x, ω_i), −ω_i) dω_i

  Q′(s, a) = (1 − α) Q(s, a) + α ( r(s, a) + γ ∫_A π(s′, a′) Q(s′, a′) da′ )

• hints at learning the incident radiance

  Q′(x, ω) = (1 − α) Q(x, ω) + α ( L_e(y, −ω) + ∫_{S²₊(y)} f_r(ω_i, y, −ω) cos θ_i Q(y, ω_i) dω_i )

  as a policy for selecting an action ω in state x to reach the next state y := h(x, ω)

• the learning rate α is the only parameter left
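A possible sketch of this update for a single path vertex, assuming Q is already discretized into spatial cells × hemispherical patches (see the next slide) and that the emitted radiance L_e(y, −ω), the products f_r(ω_k, y, −ω) cos θ_k at the patch centers, and the patch solid angles arrive as precomputed arrays; these inputs are hypothetical stand-ins, not the talk's actual interface.

```python
import numpy as np

# Sketch of the radiance update above for one path vertex, on a Q table of
# shape (n_states, n_patches).  The scene-dependent inputs (L_e_y, frcos_y,
# patch_solid_angle) are hypothetical precomputed arrays for illustration.

def update_Q(Q, state_x, patch_w, state_y, L_e_y, frcos_y, patch_solid_angle,
             alpha=0.2):
    """Q'(x,w) = (1-a) Q(x,w) + a (L_e(y,-w) + sum_k f_r cos_k Q(y,w_k) dw_k)."""
    scattered = np.sum(frcos_y * Q[state_y] * patch_solid_angle)
    Q[state_x, patch_w] = (1.0 - alpha) * Q[state_x, patch_w] \
                          + alpha * (L_e_y + scattered)

# Toy usage with dummy values: 4 patches of equal solid angle 2*pi / 4 each,
# a diffuse surface at y (f_r cos approximated by a constant per patch), and
# a small positive initialization so Q remains a valid unnormalized pdf.
Q = np.full((16, 4), 1e-3)
update_Q(Q, state_x=0, patch_w=2, state_y=5, L_e_y=1.0,
         frcos_y=np.full(4, 0.5 / np.pi), patch_solid_angle=2.0 * np.pi / 4)
print(Q[0, 2])

# The updated row Q[state_x] then serves as the (unnormalized) discrete density
# for importance sampling the next scattering direction at points in that cell.
```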
Light Transport Simulation and Reinforcement Learning: Discretization of Q in analogy to irradiance representations

• action space: a ∈ S²₊(y)
  – equally sized patches via the equal-area mapping

    (x, y, z) = ( √(1 − u²) cos(2πv), √(1 − u²) sin(2πv), u )

• state space: s ∈ ∂V (the scene surfaces)
  – Voronoi diagram of a low-discrepancy sequence

    x_i = ( Φ₂(i), i/N )   for i = 0, ..., N − 1

  – nearest neighbor search, including the surface normal
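A minimal sketch of this discretization, assuming a 2D setting for illustration: the equal-area mapping from a grid over [0, 1)² to patch center directions, the Hammersley points x_i = (Φ₂(i), i/N), and a brute-force nearest-neighbor lookup of the containing Voronoi cell; in practice a spatial search structure and the surface-normal test mentioned above would replace the linear scan.

```python
import math

# Sketch of the discretization above: equal-solid-angle hemisphere patches,
# Hammersley points as Voronoi cell generators, and a brute-force nearest
# neighbor lookup (assumption: 2D points, no acceleration structure).

def patch_direction(j, k, n_u, n_v):
    """Center direction of patch (j, k) on an n_u x n_v grid over [0,1)^2."""
    u = (j + 0.5) / n_u                    # z = u is the equal-area mapping
    v = (k + 0.5) / n_v
    s = math.sqrt(max(0.0, 1.0 - u * u))
    return (s * math.cos(2.0 * math.pi * v),
            s * math.sin(2.0 * math.pi * v),
            u)

def radical_inverse_base2(i):
    """Van der Corput radical inverse Phi_2(i)."""
    result, f = 0.0, 0.5
    while i:
        result += f * (i & 1)
        i >>= 1
        f *= 0.5
    return result

def hammersley(i, n):
    """Point x_i = (Phi_2(i), i / n) of the Hammersley point set."""
    return (radical_inverse_base2(i), i / n)

def nearest_state(p, points):
    """Index of the Voronoi cell containing p, i.e. the nearest generator."""
    return min(range(len(points)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, points[i])))

points = [hammersley(i, 64) for i in range(64)]
print(patch_direction(0, 0, 8, 8), nearest_state((0.3, 0.7), points))
```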