A unifying computational framework for teaching and active learning
Scott Cheng-Hsin Yang, Wai Keen Vong, Yue Yu & Patrick Shafto
Active learning: World, Learner
Teaching: Teacher, World, Learner
Self-teaching: Self as teacher, World, Learner
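Active learning
The learner intervenes by choosing x according to its active learning strategy $P_L(x)$, observes the consequence y from the world (true hypothesis $h^*$), and updates its belief to $P_L(h \mid x, y)$.
[Figure: the learner's belief over $h_1, h_2, h_3, h_4$ at step 0, step 1, and step 2.]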
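The update-belief step in the loop above can be sketched as a single application of Bayes' rule. This is a minimal illustration, not the authors' code: the function name `update_belief`, the four-hypothesis uniform prior, and the likelihood values are all assumptions chosen for the example.

```python
def update_belief(prior, likelihood):
    """Bayes' rule: P_L(h | x, y) is proportional to P(y | x, h) * P_L(h)."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Step 0: uniform belief over h1..h4 (assumed).
prior = [0.25, 0.25, 0.25, 0.25]
# Assumed likelihood P(y | x, h) of the observed consequence y under each hypothesis.
likelihood = [1.0, 1.0, 0.0, 0.0]

# Step 1: belief after intervening with x and observing y.
posterior = update_belief(prior, likelihood)
print(posterior)  # [0.5, 0.5, 0.0, 0.0]
```

Repeating the call with the posterior as the new prior gives the belief at step 2.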
Teaching
The teacher shows the learner a pair (x, y). Teaching strategy: $P_T(x, y \mid h^*)$. Active learning strategy: $P_L(x)$. The teacher knows y and $h^*$; the learner does not. The learner updates its belief to $P_L(h \mid x, y)$.
Learner's inference: $$P_L(h \mid x, y) \propto P_T(x, y \mid h)\, P_L(h)$$
Teacher's selection: $$P_T(x, y \mid h) \propto P_L(h \mid x, y)\, P_T(x, y)$$
(Shafto et al. 2008, 2014)
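The learner's inference and the teacher's selection are defined in terms of each other. One way to obtain numbers from such a pair of equations is to alternate them from a starting point for a fixed number of rounds. The sketch below does that under stated assumptions: the slides do not specify the solution procedure, and the starting point (unnormalised consistency scores), the two-hypothesis example, the uniform priors, and the round count are all hypothetical. Each data point d stands for one (x, y) pair.

```python
def normalize(values):
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)

def cooperative_inference(consistent, prior_h, prior_d, n_iter=50):
    """Alternate the learner and teacher equations for n_iter rounds.

    consistent[d][h] is 1 if data point d = (x, y) is consistent with h, else 0.
    Returns (learn, teach) with learn[d][h] = P_L(h | d), teach[d][h] = P_T(d | h).
    """
    n_d, n_h = len(consistent), len(prior_h)
    # Assumed starting point: unnormalised consistency scores.
    teach = [[float(consistent[d][h]) for h in range(n_h)] for d in range(n_d)]
    learn = []
    for _ in range(n_iter):
        # Learner: P_L(h | d) proportional to P_T(d | h) P_L(h), normalised over h.
        learn = [normalize([teach[d][h] * prior_h[h] for h in range(n_h)])
                 for d in range(n_d)]
        # Teacher: P_T(d | h) proportional to P_L(h | d) P_T(d), normalised over d.
        cols = [normalize([learn[d][h] * prior_d[d] for d in range(n_d)])
                for h in range(n_h)]
        teach = [[cols[h][d] for h in range(n_h)] for d in range(n_d)]
    return learn, teach

# d0 is consistent with both hypotheses, d1 only with h1 (hypothetical example).
consistent = [[1, 1],
              [0, 1]]
learn, teach = cooperative_inference(consistent, [0.5, 0.5], [0.5, 0.5])
print(learn[0])   # belief after seeing d0: most of the mass moves to h0
print(teach[1])   # P_T(d1 | h): the teacher favours d1 when the truth is h1
```

In this example the ambiguous data point d0 comes to signal h0, because a teacher who meant h1 would have shown d1 instead.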
Teaching (marginalize out y)
The teacher shows only x. Teaching strategy (y marginalized): $P_T(x \mid h^*)$. Active learning strategy: $P_L(x)$. The learner observes the consequence y from the world ($h^*$) and updates its belief to $P_L(h \mid x, y)$.
Learner's inference: $$P_L(h \mid x, y) \propto P(y \mid x, h)\, P_T(x \mid h)\, P_L(h)$$
Teacher's selection: $$P_T(x \mid h) = \sum_{y \in Y} P_T(x, y \mid h)$$
(Yang & Shafto 2017)
Knowledgeability (marginalize out h)
Teaching strategy (y marginalized): $P_T(x \mid h^*)$. Teaching strategy (y and $h^*$ marginalized) = active learning strategy: $P_T(x) = P_L(x)$.
$$P_T(x \mid h) = \sum_{g \in H} P_T(x \mid g)\, \delta(g \mid h)$$
$$P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g)$$

Teacher's belief $\delta(g \mid h)$ (rows: belief g; columns: truth h)
        h1    h2    h3    h4
g1      1     0     0     0
g2      0     1     0     0
g3      0     0     1     0
g4      0     0     0     1

Learner's belief $\delta_{ST}(g \mid h) = P_L(g)$ (rows: belief g; columns: truth h)
        h1    h2    h3    h4
g1      1/4   1/4   1/4   1/4
g2      1/4   1/4   1/4   1/4
g3      1/4   1/4   1/4   1/4
g4      1/4   1/4   1/4   1/4

(Shafto, Eaves, et al. 2012)
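The effect of the knowledgeability term can be seen by evaluating $P_T(x \mid h) = \sum_g P_T(x \mid g)\, \delta(g \mid h)$ once with a knowledgeable teacher (one column of the identity matrix) and once with the learner's own uniform belief. The sketch below is illustrative only: the table `p_x_given_g`, with four hypotheses and two interventions, holds made-up numbers, and the function name is hypothetical.

```python
def teacher_given_truth(p_x_given_g, delta_column):
    """P_T(x | h) = sum over g of P_T(x | g) * delta(g | h), for one fixed h."""
    n_x = len(p_x_given_g[0])
    return [sum(p_x_given_g[g][x] * delta_column[g] for g in range(len(delta_column)))
            for x in range(n_x)]

# Hypothetical teaching distributions P_T(x | g): rows g1..g4, columns x1..x2.
p_x_given_g = [[1.0, 0.0],
               [0.5, 0.5],
               [0.0, 1.0],
               [0.0, 1.0]]

knowledgeable = [1.0, 0.0, 0.0, 0.0]      # delta(g | h = h1): the teacher knows the truth
self_teacher = [0.25, 0.25, 0.25, 0.25]   # uniform belief, the same for every true h

print(teacher_given_truth(p_x_given_g, knowledgeable))  # [1.0, 0.0]
print(teacher_given_truth(p_x_given_g, self_teacher))   # [0.375, 0.625]
```

With the uniform column the result no longer depends on which h is true, which is why the strategy can be written as $P_T(x)$.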
Self-teaching
The learner selects x with the self-teaching strategy $P_T(x) = P_L(x)$, observes y from the world ($h^*$), and updates its belief to $P_L(h \mid x, y)$.
Learner's inference: $$P_L(h \mid x, y) = \frac{P(y \mid x, h)\, P_T(x)\, P_L(h)}{\sum_{h' \in H} P(y \mid x, h')\, P_T(x)\, P_L(h')}$$
Self-teacher's selection: $$P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g)$$
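A short worked step, added as a reading aid: the self-teacher's selection $P_T(x)$ does not depend on h, so it is a common factor of the numerator and the denominator in the learner's inference and cancels, assuming $P_T(x) > 0$.

```latex
\begin{align*}
P_L(h \mid x, y)
  &= \frac{P(y \mid x, h)\, P_T(x)\, P_L(h)}
          {\sum_{h' \in H} P(y \mid x, h')\, P_T(x)\, P_L(h')} \\
  &= \frac{P(y \mid x, h)\, P_L(h)}
          {\sum_{h' \in H} P(y \mid x, h')\, P_L(h')}
\end{align*}
```

Read this way, the belief update after observing y is the ordinary Bayesian one; under these equations self-teaching changes only how x is chosen.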
How is the Self-Teaching model different from the most common active learning objective, optimizing for expected information gain? Does the Self-Teaching model capture human active learning behavior?
Self-Teaching
$$P_T(x) = \sum_{g \in H} \sum_{y \in Y} \frac{P_L(g \mid x, y)\, P_T(x, y)}{Z(g)}\, P_L(g)$$
• Uses only the rules of probability
• Meta-reasons about oneself as the teacher
• Hypothesis testing for distinctive hypothesis
Expected information gain
$$EIG(x) = H(h) - \sum_{y \in Y} P_L(y \mid x)\, H(h \mid x, y)$$
• Also uses entropy and subtraction
• Reasons about the world
• Overall uncertainty reduction
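To make the expected information gain formula concrete, the sketch below evaluates $EIG(x)$ for two hypothetical interventions over four equally likely hypotheses. The likelihood tables, the use of base-2 logarithms (bits), and the function names are assumptions for illustration; $P_L(y \mid x)$ is computed as $\sum_h P(y \mid x, h)\, P_L(h)$.

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def expected_information_gain(prior, lik_x):
    """EIG(x) = H(h) - sum_y P_L(y | x) H(h | x, y), with lik_x[y][h] = P(y | x, h)."""
    eig = entropy(prior)
    for lik_y in lik_x:
        p_y = sum(l * p for l, p in zip(lik_y, prior))  # P_L(y | x)
        if p_y == 0:
            continue  # an outcome that cannot occur contributes nothing
        posterior = [l * p / p_y for l, p in zip(lik_y, prior)]
        eig -= p_y * entropy(posterior)
    return eig

prior = [0.25, 0.25, 0.25, 0.25]
# Hypothetical interventions: x_a splits the hypotheses two against two,
# x_b separates h1 from the other three.
x_a = [[1, 1, 0, 0], [0, 0, 1, 1]]
x_b = [[1, 0, 0, 0], [0, 1, 1, 1]]

eig_a = expected_information_gain(prior, x_a)
eig_b = expected_information_gain(prior, x_b)
print(eig_a, eig_b)  # the even split x_a yields 1 bit, more than x_b
```

The even split is preferred because it reduces overall uncertainty the most, regardless of which particular hypothesis ends up confirmed.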
Self-teaching: confirming distinctive h
$$P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g) = \sum_{g \in H} \sum_{y \in Y} P_L(g \mid x, y)\, P_T(x, y)\, P_L(g)\, Z(g)^{-1}$$
$$Z(g) = \sum_{y \in Y} \sum_{x \in X} P_L(g \mid x, y)\, P_T(x, y)$$
[Figure: three panels. Learner's posterior over $h_1, h_2, h_3, h_4$ for each pair from $(x_1, y_0)$ to $(x_3, y_1)$; Distinctiveness of each hypothesis; Self-teaching probability of $x_1$, $x_2$, $x_3$.]
A distinctive hypothesis is one that is, on average, less likely to be inferred if all interventions and observations are equally likely to occur.
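The two formulas above can be evaluated directly on a small example. The sketch below is a hedged illustration rather than the authors' implementation: it assumes a uniform $P_T(x, y)$ (matching "all interventions and observations are equally likely"), a uniform prior, and a made-up deterministic likelihood table `lik` with two interventions, two outcomes, and four hypotheses.

```python
def posterior(prior, lik_xy):
    """P_L(g | x, y) proportional to P(y | x, g) * P_L(g)."""
    unnormalised = [l * p for l, p in zip(lik_xy, prior)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised] if total > 0 else [0.0] * len(prior)

def self_teaching(prior, lik):
    """Return (Z, P_T(x)) for lik[x][y][g] = P(y | x, g), with uniform P_T(x, y)."""
    n_x, n_y, n_g = len(lik), len(lik[0]), len(prior)
    p_xy = 1.0 / (n_x * n_y)
    post = [[posterior(prior, lik[x][y]) for y in range(n_y)] for x in range(n_x)]
    # Z(g): how likely g is to be inferred on average over all (x, y).
    z = [sum(post[x][y][g] * p_xy for x in range(n_x) for y in range(n_y))
         for g in range(n_g)]
    # P_T(x) = sum_g sum_y P_L(g | x, y) P_T(x, y) P_L(g) / Z(g)
    p_x = [sum(post[x][y][g] * p_xy * prior[g] / z[g]
               for g in range(n_g) for y in range(n_y) if z[g] > 0)
           for x in range(n_x)]
    return z, p_x

prior = [0.25, 0.25, 0.25, 0.25]
lik = [
    [[0, 0, 1, 1], [1, 1, 0, 0]],  # x1: y0 and y1 each pick out two hypotheses
    [[0, 1, 1, 1], [1, 0, 0, 0]],  # x2: y1 picks out h1 alone
]
z, p_x = self_teaching(prior, lik)
print(z)    # h1 has the largest Z, so it is the least distinctive here
print(p_x)  # sums to 1; x1 is slightly favoured in this example
```

Dividing by $Z(g)$ up-weights hypotheses that are rarely inferred, which is how the selection comes to favour interventions that can confirm distinctive hypotheses.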
Boundary game task
Causal graph learning task (Coenen et al. 2015)
[Figure: panels labelled Self-Teaching model, Human choices, and Expected information gain.]
Coenen, Rehder, & Gureckis. (2015). Strategies to intervene on causal systems are adaptively selected. Cognitive Psychology, 79, 102-133.
Conclusions
• We derived a Self-Teaching model, a novel form of active learning.
• It depends only on the rules of probability (which may have implications for active machine learning).
• It unifies teaching and active learning under a single learning mechanism.
• It matches human active learning behavior in many cases.
Collaborators: Wai Keen Vong, Yue Yu, Patrick Shafto
Yang, Vong, Yu & Shafto. (2019). A unifying computational framework for teaching and active learning. Topics in Cognitive Science 11(2): 316-337.