Finding Similar Exercises in Online Education Systems


  1. Finding Similar Exercises in Online Education Systems
     - The 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2018/08/19-08/23, London
     - Reporter: Zai Huang
     - Date: 2018.07.22
     - Anhui Province Key Laboratory of Big Data Analysis and Application

  2. Outline
     1. Background and Related Work
     2. Problem Definition
     3. Study Overview
     4. MANN Framework
     5. Experiments
     6. Conclusion and Future Work

  3. Background
     - Online education systems
       - Examples: KhanAcademy, Knewton, Zhixue
       - Exercises: these systems have collected millions of exercises
       - Applications: similar exercise retrieval and recommendation, personalized cognitive diagnosis based on exercise similarities
     - Fundamental task
       - Finding Similar Exercises (FSE): finding the similar ones of each given exercise

  4. Exercise
     - An exercise contains multiple heterogeneous data: text content, images, knowledge concepts
     - Complex
     - Rich semantics

  5. What are similar exercises?
     - Following Educational Psychology, similar exercises are those having the same purpose embedded in exercise contents.
     - Example (figures omitted):
       - $E_1$: The front, top and side views of a geometric object are shown in figures (a), (b) and (c). Please calculate the volume of the object. Concepts: $C_1$ Solid geometry, $C_2$ Volume.
       - $E_2$: The stereogram of an object is shown in figure (a) and $AB^2 - AB - 2 = 0$. The front, top and side views of it are shown in figures (b), (c) and (d). The volume of the object is ( ). A. 3 B. 4 C. 5 D. 6. Concepts: $C_1$ Solid geometry, $C_2$ Volume, $C_3$ Quadratic equation.
       - $E_3$: A geometric object is shown in figure (a) and its volume is V. $\angle AOB = 90°$, and OB = 2. What is the relationship of AB and V? Concepts: $C_1$ Solid geometry, $C_2$ Volume.
     - $E_1$ and $E_2$ are marked similar (they share the same purpose); $E_3$ is marked dissimilar, although it has the same concepts as $E_1$.

  6. Background
     - Existing solutions for the Finding Similar Exercises (FSE) task
       - Manual labeling
         - Done on a small quantity of exercises
         - Requires strong expertise and takes much time
         - Not suitable for large-scale online education systems containing millions of exercises
       - Methods based on text similarity
         - Use the same concepts or similar words
         - Cannot exploit the rich semantics in the heterogeneous data
     - Urgent issue
       - Design an effective FSE solution for large-scale online education systems by exploiting the heterogeneous data to understand exercise semantics and purposes.

  7. Challenge 1 for FSE
     - Exercises contain multiple heterogeneous data: texts, images, knowledge concepts.
     - An FSE solution needs to integrate these heterogeneous data to understand and represent exercise semantics and purposes.
     - Example $E_2$ (figures omitted): The stereogram of an object is shown in figure (a) and $AB^2 - AB - 2 = 0$. The front, top and side views of it are shown in figures (b), (c) and (d). The volume of the object is ( ). A. 3 B. 4 C. 5 D. 6. Concepts: $C_1$ Solid geometry, $C_2$ Volume, $C_3$ Quadratic equation.

  8. Challenge 2 for FSE
     - In a single exercise, different parts/words of the text are associated with different concepts (text-concept) or images (text-image).
     - To understand each exercise better, it is necessary to capture these text-concept and text-image associations.
     - (Figure: the example exercises $E_1$ and $E_2$, marked similar, and $E_3$, marked dissimilar, each shown with its text, images and concepts.)

  9. Challenge 3 for FSE
     - A pair of similar exercises may consist of different texts, images and concepts.
     - Finding similar exercises needs to measure the similar parts in each exercise pair by deeply interpreting their semantic relations.
     - (Figure: the example exercises $E_1$ and $E_2$, marked similar, and $E_3$, marked dissimilar, each shown with its text, images and concepts.)

  10. Related Work
     - Studies on FSE
       - Methods based on text similarity: use the same concepts or the similar words of exercises; Vector Space Model (VSM)
       - Methods based on learners' performance data
       - Limitation: neglect semantics in heterogeneous materials
     - Multimodal Learning
       - Powerful approach to handle heterogeneous data
       - Sound-video, video-text, image-text
       - Limitation: cannot understand exercise purposes or measure similar parts between two exercises
     - Pair Modeling
       - Learn the relations between two instances in a pair
       - Sentence pair, image pair, video-sentence pair
       - Limitation: cannot handle instances having multiple heterogeneous data

  11. Outline (repeated as a section divider): Background and Related Work, Problem Definition, Study Overview, MANN Framework, Experiments, Conclusion and Future Work

  12. Problem Definition
     - Given: exercises with corresponding heterogeneous materials including texts, images and concepts
     - Goal: learn a model F to measure the similarity scores of exercise pairs and find similar exercises for any exercise E by ranking the candidate ones R with similarity scores
     - (Formula annotations on the slide: the model F, the parameters of F, the candidates for E, the similar exercises for E.)
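The goal stated above (score exercise pairs, then rank the candidates for E) can be sketched roughly as follows. This is only an illustration under assumptions: `rank_candidates`, `jaccard`, the candidate ids and the `top_k` parameter are hypothetical names, and the Jaccard overlap of concept sets is a toy stand-in for the learned model F (it is in fact the kind of concept-matching baseline the slides criticize).

```python
def jaccard(a, b):
    """Toy stand-in for the learned score: overlap of two concept sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(exercise, candidates, score, top_k=2):
    """Return the ids of the top_k candidates, highest similarity score first."""
    scored = [(score(exercise, concepts), cid) for cid, concepts in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:top_k]]


# Hypothetical exercise E and candidate set R, described only by their concepts.
E = {"solid geometry", "volume"}
R = {
    "R1": {"solid geometry", "volume", "quadratic equation"},
    "R2": {"probability"},
    "R3": {"solid geometry", "volume"},
}

top = rank_candidates(E, R, jaccard)
print(top)
```

Any function that maps an exercise pair to a number can be passed as `score`, so a trained model could replace the toy score without changing the ranking step.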

  13. Outline (repeated as a section divider): Background and Related Work, Problem Definition, Study Overview, MANN Framework, Experiments, Conclusion and Future Work

  14. Study Overview
     - Two-stage solution
       - Training stage: MANN, pairwise training
       - Testing stage: FSE for any exercise
     - Input: exercises with heterogeneous materials (text, images and concepts). Example: "The front, top and side views of a geometric object are shown in figures (a), (b) and (c). Please calculate the volume of the object." Concepts: $C_1$ Solid geometry, $C_2$ Volume.
     - Training: MANN is trained on pairs so that $S(E_i, E_{i,s}) > S(E_i, E_{i,ds})$ for $i = 1, 2, \ldots, n$, where $E_s \in Sim(E)$ and $E_{ds} \in DS(E)$.
       - $Sim(E)$: similar exercises of $E$
       - $DS(E)$: dissimilar exercises of $E$
     - Testing: for any exercise, the trained model returns ranked candidates, e.g. $E_a$: $(E_{a,1}^{s}, E_{a,2}^{s}, E_{a,3}^{s}, \ldots)$ and $E_b$: $(E_{b,1}^{s}, E_{b,2}^{s}, E_{b,3}^{s}, \ldots)$.
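The pairwise training idea on this slide (a similar exercise should score higher than a dissimilar one for the same exercise) is commonly turned into a margin-based hinge loss. The sketch below assumes that form; the slide does not state the loss, so the function name `pairwise_hinge_loss`, the `margin` value and the lookup-table scores are all illustrative assumptions rather than the authors' formulation.

```python
def pairwise_hinge_loss(triples, score, margin=0.5):
    """Sum of max(0, margin - (S(E, E_s) - S(E, E_ds))) over (E, E_s, E_ds) triples.

    A triple contributes zero once the similar exercise outscores the
    dissimilar one by at least `margin` (an assumed hyperparameter).
    """
    total = 0.0
    for e, e_s, e_ds in triples:
        total += max(0.0, margin - (score(e, e_s) - score(e, e_ds)))
    return total


# Hypothetical similarity scores, standing in for the output of a model.
toy_scores = {
    ("E1", "E2"): 0.9,   # E2 is similar to E1
    ("E1", "E3"): 0.2,   # E3 is dissimilar to E1
    ("E4", "E5"): 0.5,   # E5 is similar to E4
    ("E4", "E6"): 0.25,  # E6 is dissimilar to E4
}


def lookup(a, b):
    return toy_scores[(a, b)]


loss = pairwise_hinge_loss([("E1", "E2", "E3"), ("E4", "E5", "E6")], lookup)
print(loss)
```

In this toy run the first triple already satisfies the margin and contributes nothing, while the second is ranked correctly but by less than the margin, so it still contributes to the loss.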

  15. Outline (repeated as a section divider): Background and Related Work, Problem Definition, Study Overview, MANN Framework, Experiments, Conclusion and Future Work

  16. MANN Framework
     - Multimodal Attention-based Neural Network (MANN)
       - Learn a unified semantic representation of each exercise by handling its heterogeneous materials in a multimodal way (Challenge 1: multimodal exercise understanding and representation)
       - Propose two attention strategies to capture the text-image and text-concept associations in each single exercise (Challenge 2: learning text-image and text-concept associations)
       - Design a Similarity Attention to measure the similar parts in each exercise pair with their semantic representations (Challenge 3: learning similar parts)
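To give a feel for what an attention strategy over text-concept associations might compute, here is a minimal sketch of generic dot-product attention with a softmax. It is not MANN's actual formulation, which the slide does not spell out; the function names, the two-dimensional toy vectors and the use of a concept vector as the query are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Weight each key vector by its dot product with the query.

    Returns the attention weights and the weighted sum of the keys.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    context = [
        sum(w * key[i] for w, key in zip(weights, keys))
        for i in range(len(query))
    ]
    return weights, context


# Hypothetical 2-d word vectors for three words of an exercise text,
# and a hypothetical vector for one knowledge concept.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
concept_vector = [2.0, 0.0]

weights, context = attend(concept_vector, word_vectors)
print(weights, context)
```

The word whose vector points in the same direction as the concept receives the largest weight, which is the intuition behind associating parts of the text with a concept or an image.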
