Reinforcement Learning technique (contd.)
  Stochastic approximation
  Seed set [...]
  For each node 𝑗 in [...]
  Function sum inside a tour [...]
  Cost function [...]
  Algorithm [...]
  Samples: sample 1, sample 2, ..., sample 𝑙
  Jithin K. Sreedharan (jithin.sreedharan@inria.fr)
Which random walk method to select?
  Mixing time: not a good criterion here, because of the burn-in period.
  [Figure: sample path X 1, X 2, ..., X k, with a legend for rejected and accepted samples; regions labelled "Burn-in period" and "Approximate stationary regime"]
  The Reinforcement Learning technique does not require a burn-in period.
  Efficiency of the estimator: how many samples are needed to achieve a given accuracy?
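
To make the comparison concrete, here is a minimal sketch of the two classical walks named in this talk, Metropolis-Hastings sampling and Respondent Driven Sampling, under their usual textbook definitions (uniform target for Metropolis-Hastings, inverse-degree reweighting of a simple random walk for RDS). The function names `mh_estimate` and `rds_estimate`, the toy graph, and the choice of node degree as the function being averaged are illustrative assumptions, not taken from the slides. Both estimators use every sample within the budget; no burn-in prefix is discarded.

```python
import random

def mh_estimate(adj, f, start, budget, rng):
    """Average of f along a Metropolis-Hastings walk whose target is uniform over nodes."""
    x, total = start, 0.0
    for _ in range(budget):
        y = rng.choice(adj[x])  # propose a uniformly chosen neighbour
        # Accept with probability min(1, d(x)/d(y)); otherwise stay at x.
        if rng.random() < min(1.0, len(adj[x]) / len(adj[y])):
            x = y
        total += f(x)
    return total / budget

def rds_estimate(adj, f, start, budget, rng):
    """Simple random walk, with samples reweighted by 1/degree to undo the degree bias."""
    x, num, den = start, 0.0, 0.0
    for _ in range(budget):
        x = rng.choice(adj[x])
        w = 1.0 / len(adj[x])
        num += w * f(x)
        den += w
    return num / den

# Hypothetical toy graph (undirected, connected), given as adjacency lists.
toy_adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [3]}

def degree(v):
    return float(len(toy_adj[v]))

if __name__ == "__main__":
    print("MH :", mh_estimate(toy_adj, degree, 0, 50_000, random.Random(0)))
    print("RDS:", rds_estimate(toy_adj, degree, 0, 50_000, random.Random(0)))
```

Both calls estimate the average degree of the toy graph; running them for several budgets gives a rough feel for how quickly each walk settles.
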
Asymptotic Variance
  Asymptotic variance of the estimator [...]
  Also, from the Central Limit Theorem, the equivalent statement [...]
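
For reference, the standard forms of these two statements are written out below. The notation is assumed rather than taken from the slide: \hat{\mu}_n is the estimator built from n samples, \mu is the quantity being estimated, and \sigma^2 is the asymptotic variance. Under mild integrability conditions the two characterisations agree.

```latex
\sigma^2 \;=\; \lim_{n\to\infty} n \,\operatorname{Var}\!\left(\hat{\mu}_n\right),
\qquad
\sqrt{n}\,\left(\hat{\mu}_n - \mu\right) \;\xrightarrow{\;d\;}\; \mathcal{N}\!\left(0,\ \sigma^2\right)
\quad \text{as } n \to \infty .
```
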
Asymptotic Variance (contd.)
  For Metropolis-Hastings sampling: [...], where [...] is the fundamental matrix of the Markov chain
  For Respondent Driven Sampling: [...]
  For Reinforcement Learning based sampling: [...]
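
As an illustration of how the fundamental matrix enters, the sketch below evaluates the textbook asymptotic variance of an ergodic average on a finite, irreducible, aperiodic chain: with Z = (I - P + 1 pi^T)^{-1} and g = f - pi^T f, sigma^2 = 2 <g, Z g>_pi - <g, g>_pi. This is the general Markov-chain formula, not necessarily the exact expression shown on the slide. The names `stationary` and `asymptotic_variance` are hypothetical, and Z g is obtained from the series sum_k P^k g rather than by matrix inversion, which keeps the code dependency-free.

```python
def stationary(P, iters=2000):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def asymptotic_variance(P, f, terms=2000):
    """sigma^2 = 2<g, Zg>_pi - <g, g>_pi, with g = f - pi.f and Z the fundamental matrix."""
    n = len(P)
    pi = stationary(P)
    mean = sum(pi[i] * f[i] for i in range(n))
    g = [f[i] - mean for i in range(n)]

    # For centred g, Z g = sum_{k>=0} P^k g; truncate the series after `terms` powers.
    zg, v = g[:], g[:]
    for _ in range(terms):
        v = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        zg = [zg[i] + v[i] for i in range(n)]

    def inner(a, b):
        # Inner product weighted by the stationary distribution.
        return sum(pi[i] * a[i] * b[i] for i in range(n))

    return 2.0 * inner(g, zg) - inner(g, g)

if __name__ == "__main__":
    # Two-state example chain (an assumption for illustration only).
    P = [[0.9, 0.1], [0.2, 0.8]]
    print(asymptotic_variance(P, [0.0, 1.0]))
```

Passing the transition matrix of a Metropolis-Hastings walk on a small graph as `P` gives the asymptotic variance of the corresponding sample average; the truncated series only converges for aperiodic chains.
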
Numerical Studies
  Normalized Root Mean Square Error (NRMSE) vs budget B [...]
  Why MSE? [...]
  Budget B: number of allowed samples
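
A minimal sketch of the error metric follows, assuming the common normalisation by the true value, NRMSE = sqrt(E[(estimate - truth)^2]) / |truth|, with the expectation replaced by an average over independent runs at a fixed budget B. The helper `bias_and_variance` reflects one standard reason for preferring MSE, namely that it combines squared bias and variance; the slide's own answer to "Why MSE?" was not recovered, so treat this as an assumption. Both function names are hypothetical.

```python
import math

def nrmse(estimates, true_value):
    """Root mean square error over independent runs, normalised by the true value."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / abs(true_value)

def bias_and_variance(estimates, true_value):
    """Empirical bias and (population) variance; MSE equals bias**2 + variance."""
    mean = sum(estimates) / len(estimates)
    variance = sum((e - mean) ** 2 for e in estimates) / len(estimates)
    return mean - true_value, variance

if __name__ == "__main__":
    runs = [9.0, 11.0, 10.5, 9.5]  # made-up estimates from four runs
    print(nrmse(runs, 10.0), bias_and_variance(runs, 10.0))
```

Plotting `nrmse` against B for each sampling method reproduces the kind of comparison named in the slide title.
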
Les Misérables network
  Number of nodes: 77, number of edges: 254.
Les Misérables network (contd.)
  Study of asymptotic variance
Friendster network
  Number of nodes: ~65K, number of edges: ~1.25M.
Friendster network (contd.)
  Stability of sample paths: single path example
  Varying super-node size
  Varying step size
Conclusions
  Random walk based estimators of [...]
  Numerical and theoretical study of Mean Square Error and Asymptotic Variance of:
    Metropolis-Hastings sampling
    Respondent Driven Sampling (RDS)
    New Reinforcement Learning based sampling (RL)
  Reinforcement Learning technique:
    Tackles disconnected graphs
    A cross between deterministic iteration and MCMC
    Stability of the algorithm can be controlled with the step sizes
  RDS works better. The RL technique is comparable, yet more stable and needs no burn-in.