The 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2018/08/19-08/23, London
Finding Similar Exercises in Online Education Systems
Reporter: Zai Huang
Date: 2018.07.22
Anhui Province Key Laboratory Of Big Data Analysis and Application
Outline
1. Background and Related Work
2. Problem Definition
3. Study Overview
4. MANN Framework
5. Experiments
6. Conclusion and Future Work
Background
Online education systems, such as Khan Academy, Knewton and Zhixue, have collected millions of exercises.
Applications: similar exercise retrieval and recommendation; personalized cognitive diagnosis based on exercise similarities.
Fundamental task: Finding Similar Exercises (FSE), i.e., finding the similar exercises for each given exercise.
Exercise
An exercise contains multiple heterogeneous data with complex, rich semantics: text content, images and knowledge concepts.
What are similar exercises?
Following Educational Psychology, similar exercises are those having the same purpose embedded in the exercise contents.
E1: "The front, top and side views of a geometric object are shown in figure (a), (b) and (c). Please calculate the volume of the object." Concepts: C1 Solid geometry, C2 Volume.
E2: "The stereogram of an object is shown in figure (a) and AB^2 - AB - 2 = 0. The front, top and side views of it are shown in figure (b), (c) and (d). The volume of the object is ( ). A. 3 B. 4 C. 5 D. 6" Concepts: C1 Solid geometry, C2 Volume, C3 Quadratic equation.
E3: "A geometric object is shown in figure (a) and its volume is V. ∠AOB = 90°, and OB = 2. What is the relationship of AB and V?" Concepts: C1 Solid geometry, C2 Volume.
E1 and E2 are marked similar (they share the same purpose); E3 is marked dissimilar.
[Exercise figures (a)-(d) not reproduced.]
Background
Existing solutions for the Finding Similar Exercises (FSE) task:
Manual labeling: done on a small quantity of exercises; requires strong expertise and takes much time; not suitable for large-scale online education systems containing millions of exercises.
Methods based on text similarity: use the same concepts or similar words; cannot exploit the rich semantics in the heterogeneous data.
Urgent issue: design an effective FSE solution for large-scale online education systems by exploiting the heterogeneous data to understand exercise semantics and purposes.
Challenge 1 for FSE
Exercises contain multiple heterogeneous data: texts, images and knowledge concepts.
An FSE solution must integrate these heterogeneous data to understand and represent exercise semantics and purposes.
Example (E2): "The stereogram of an object is shown in figure (a) and AB^2 - AB - 2 = 0. The front, top and side views of it are shown in figure (b), (c) and (d). The volume of the object is ( ). A. 3 B. 4 C. 5 D. 6" Concepts: C1 Solid geometry, C2 Volume, C3 Quadratic equation.
[Exercise figures (a)-(d) not reproduced.]
Challenge 2 for FSE
In a single exercise, different parts or words of the text are associated with different concepts (text-concept) or images (text-image).
To understand each exercise better, it is necessary to capture these text-concept and text-image associations.
[Figure: example exercises E1 and E2 (similar) and E3 (dissimilar), each shown with its text, images and concepts.]
Challenge 3 for FSE
A pair of similar exercises may consist of different texts, images and concepts.
Finding similar exercises requires measuring the similar parts in each exercise pair by deeply interpreting their semantic relations.
[Figure: example exercises E1 and E2 (similar) and E3 (dissimilar), each shown with its text, images and concepts.]
Related Work
Studies on FSE
Methods based on text similarity: use the same concepts or similar words of exercises, e.g., the Vector Space Model (VSM).
Methods based on learners' performance data.
Limitation: neglect semantics in heterogeneous materials.
Multimodal Learning
Powerful approach to handle heterogeneous data: sound-video, video-text, image-text.
Limitation: cannot understand exercise purposes or measure similar parts between two exercises.
Pair Modeling
Learn the relations between two instances in a pair: sentence pair, image pair, video-sentence pair.
Limitation: cannot handle instances having multiple heterogeneous data.
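To make the text-similarity baseline concrete, here is a minimal sketch of a Vector Space Model comparison between two exercise texts. The whitespace tokenization, raw term-frequency weighting and the function names are assumptions made for illustration; the slides do not specify how the VSM baseline was built.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one exercise text (assumed tokenization: whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_u * norm_v)


def vsm_similarity(text_a, text_b):
    """Hypothetical helper: VSM similarity of two exercise texts."""
    return cosine_similarity(tf_vector(text_a), tf_vector(text_b))
```

A score like this only sees shared words, which is why it cannot use the images or concepts attached to an exercise.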
Problem Definition
Given: exercises with corresponding heterogeneous materials, including texts, images and concepts.
Goal: learn a model F to measure the similarity scores of exercise pairs, and find similar exercises for any exercise E by ranking the candidate exercises R by similarity score.
[Annotated formula not recoverable from the extraction; its labels are: model F, parameters of F, candidates for E, similar exercises for E.]
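The ranking step in the goal above can be sketched as follows. The scoring function is passed in as a parameter; the `jaccard` function is only a stand-in used to make the example runnable, not the learned model F, and all names here are hypothetical.

```python
def jaccard(text_a, text_b):
    """Stand-in similarity score (word-set overlap); NOT the learned model F."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def find_similar_exercises(exercise, candidates, score_fn, top_k=3):
    """Rank the candidates for `exercise` by similarity score and keep the top_k."""
    scored = [(score_fn(exercise, c), c) for c in candidates]
    # Highest score first; Python's sort is stable, so ties keep candidate order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


candidates = [
    "solve the quadratic equation",
    "calculate the volume of the object",
    "area of a circle",
]
top = find_similar_exercises("volume of the object", candidates, jaccard, top_k=1)
```

Any pairwise scoring function can be dropped in for `score_fn`, which keeps the ranking logic independent of how similarity is learned.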
Study Overview
Two-stage solution
Training stage: pairwise training of MANN on exercises with heterogeneous materials (text, images and concepts). For each training exercise, a similar exercise must receive a higher similarity score S than a dissimilar one:
S(E_1, E_{1,s}) > S(E_1, E_{1,ds})
S(E_2, E_{2,s}) > S(E_2, E_{2,ds})
...
S(E_n, E_{n,s}) > S(E_n, E_{n,ds})
where Sim(E) denotes the similar exercises of E, DS(E) denotes the dissimilar exercises of E, E_s ∈ Sim(E) and E_ds ∈ DS(E).
Testing stage: FSE for any exercise. The trained MANN returns ranked candidates for each exercise:
E_a: (E_{a,1}^s, E_{a,2}^s, E_{a,3}^s, ...)
E_b: (E_{b,1}^s, E_{b,2}^s, E_{b,3}^s, ...)
Example exercise: "The front, top and side views of a geometric object are shown in figure (a), (b) and (c). Please calculate the volume of the object." Concepts: C1 Solid geometry, C2 Volume.
[Exercise figures (a)-(c) and pipeline diagram not reproduced.]
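The pairwise training constraint (a similar exercise should score higher than a dissimilar one) is commonly turned into a margin-based hinge loss. The sketch below assumes that form; the margin value, the function names and the triple layout are illustrative assumptions, since the slide only states the ordering constraint.

```python
def pairwise_hinge_loss(score_similar, score_dissimilar, margin=0.5):
    """Zero when the similar pair outscores the dissimilar pair by at least `margin` (assumed value)."""
    return max(0.0, margin - (score_similar - score_dissimilar))


def batch_loss(triples, score_fn, margin=0.5):
    """Sum the hinge loss over (exercise, similar, dissimilar) training triples.

    `score_fn(a, b)` is any similarity function; in the talk this role is played by MANN.
    """
    return sum(
        pairwise_hinge_loss(score_fn(e, e_s), score_fn(e, e_ds), margin)
        for e, e_s, e_ds in triples
    )
```

Minimizing such a loss pushes the model toward satisfying the ordering for every training exercise, without requiring absolute similarity labels.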
MANN Framework
Multimodal Attention-based Neural Network (MANN)
Challenge 1 (multimodal exercise understanding and representation): learn a unified semantic representation of each exercise by handling its heterogeneous materials in a multimodal way.
Challenge 2 (learning text-image and text-concept associations): propose two attention strategies to capture the text-image and text-concept associations in each single exercise.
Challenge 3 (learning similar parts): design a Similarity Attention to measure the similar parts in each exercise pair with their semantic representations.
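As a rough illustration of what an attention strategy computes, the sketch below lets one text-word vector attend over a set of concept (or image) vectors using generic dot-product attention. This is an assumption made for illustration: the slide does not give MANN's attention formulas, and the vectors, dimensions and function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys):
    """Generic dot-product attention (not necessarily MANN's exact formulation).

    Returns the attention weights of `query` over `keys` and the weighted sum of the keys.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context


# A word vector that aligns with the first concept vector gets more weight on it.
weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The weights indicate which concept (or image) a given part of the text is most associated with, which is the kind of text-concept and text-image association Challenge 2 refers to.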