Task Understanding From Confusing Multi-task Data
Yizhou Jiang, Shangqi Guo, Feng Chen, Xin Su (Tsinghua University)
2 Motivation: From Narrow AI to AGI
Narrow AI: a specific task in a fixed environment.
Multi-Task Learning: comprehensive problems in different semantic spaces. The same fruit images can be labeled under three tasks:
• Task 1 (Color): "Yellow", "Yellow", "Red", "Green"
• Task 2 (Name): "Banana", "Apple", "Lemon", "Apple"
• Task 3 (Taste): "Sweet", "Sour", "Sweet", "Sour"
Task annotation (manual task definition) and label annotation do not exist in natural raw data.
AGI problem: how can we learn task concepts from original raw data?
3 Confusing Supervised Learning (CSL)
Without task annotation, multi-task data contains mapping conflicts: the same input carries labels from different tasks (e.g. one apple image labeled "Red", "Apple", and "Sweet").
De-confusing splits the confusing data through Task Understanding (the deconfusing function) and Multi-Task Learning (the mapping function).
CSL: learning task concepts by reducing mapping conflicts.
4 Method: CSL-Net
CSL-Net couples a Deconfusing-Net h(x, y), which allocates each sample to one of n tasks, with n Mapping-Nets g_k(x), one per task. The two are trained alternately:
• Mapping-Net training (h fixed):  min_{g_k} M_map(g_k) = Σ_{i=1}^{m} h_k(x_i, y_i) · ||g_k(x_i) − y_i||²,  k = 1, …, n
• Deconfusing-Net training (g fixed):  min_h M_dec(h) = Σ_{i=1}^{m} || h(x_i, y_i) − h̃(x_i, y_i) ||²,  where the temporary assignment h̃(x_i, y_i) is one-hot at argmin_k ||g_k(x_i) − y_i||²
(Figure: sample x_i feeds the n Mapping-Nets; together with the ground-truth y_i it feeds the Deconfusing-Net, and the argmin over per-task losses gives the temporary assignment.)
5 Motivation: From Narrow AI to AGI
AI success: exceeded human-level performance on various problems.
Narrow AI: a specific task in a fixed environment.
6 Motivation: From Narrow AI to AGI
Multi-Task Learning: comprehensive problems in different semantic spaces. The same fruit images can be labeled under three tasks:
• Task 1 (Color): "Yellow", "Yellow", "Red", "Green"
• Task 2 (Fruit): "Banana", "Apple", "Lemon", "Apple"
• Task 3 (Taste): "Sweet", "Sour", "Sweet", "Sour"
Task annotation (manual task definition) and label annotation do not exist in natural raw data.
AGI problem: how can we learn task concepts from original raw data?
7 Confusing Data
Multiple tasks cannot be represented by a single mapping function, so task understanding is vital for multi-task learning.
Confusing data: multi-task data without task annotation. Labels from different tasks ("Yellow", "Banana", "Sweet", "Red", "Apple", "Sour", "Lemon", "Green") are mixed together with no record of which task produced each one. The mixed tasks are confusing!
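As a concrete illustration of how such data arises (a hypothetical toy encoding, not the paper's dataset), confusing data can be simulated by attaching to each input the label of one randomly chosen task and then discarding the task identity:

```python
import random

# Toy attribute table: each fruit has a color, name, and taste label.
FRUITS = [
    {"color": "Yellow", "name": "Banana", "taste": "Sweet"},
    {"color": "Red",    "name": "Apple",  "taste": "Sweet"},
    {"color": "Green",  "name": "Apple",  "taste": "Sour"},
    {"color": "Yellow", "name": "Lemon",  "taste": "Sour"},
]

def make_confusing_dataset(n_samples, seed=0):
    """Each sample keeps only ONE task's label; the task id is discarded."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = rng.choice(FRUITS)
        task = rng.choice(["color", "name", "taste"])  # the task annotation is thrown away
        data.append((x, x[task]))                      # (input, label), with no task id
    return data

pairs = make_confusing_dataset(6)
```

The resulting (input, label) pairs mix three label spaces in one dataset, which is exactly why a single mapping function cannot fit them.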
8 Comparison of Existing Methods
A novel learning problem!
• Supervised learning & latent-variable learning: the mapping becomes confused.
• Multi-task learning: task annotation is needed.
• Multi-label learning: multiple labels are allocated to each sample.
Confusing supervised learning requires neither task annotation nor sample allocation.
9 Confusing Supervised Learning (CSL)
Without task annotation, multi-task data contains mapping conflicts.
De-confusing the confusing data recovers Task 1 (Color: "Yellow", "Green", "Red"), Task 2 (Fruit: "Banana", "Apple", "Lemon"), and Task 3 (Taste: "Sweet", "Sour") through Task Understanding (the deconfusing function) and Multi-Task Learning (the mapping function).
Confusing Supervised Learning: learning task concepts by reducing mapping conflicts.
10 Learning Objective: Risk Functional of CSL Model
Traditional supervised learning fits a single mapping f(x); on confusing data its minimal risk stays strictly positive: min_f R(f) = R* > 0.
Confusing supervised learning jointly optimizes a deconfusing function h(x, y) and mapping functions g(x), and can drive the risk to zero: min_{g, h} R(g, h) = 0.
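Restated in standard notation (a reconstruction consistent with this deck's definitions, not a verbatim quote of the paper), the CSL risk functional weights each per-task squared error by the deconfusing function's one-hot allocation:

```latex
R(g, h) \;=\; \mathbb{E}_{(x,y)}\!\left[\, \sum_{k=1}^{n} h_k(x, y)\, \lVert g_k(x) - y \rVert^2 \right],
\qquad \text{s.t.}\;\; h_k(x, y) \in \{0, 1\},\;\; \sum_{k=1}^{n} h_k(x, y) = 1 .
```

Minimizing over both g and h lets each sample be charged only against the mapping function of its own task, which is why the joint minimum can reach zero while the single-function minimum cannot.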
11 Feasibility: Loss → 0
With too few functions, wrong allocation of confusing samples leads to unavoidable loss (Loss > 0). With the right number of functions (three in the figure), every confusing sample can be allocated to a function that fits it, and Loss ≈ 0.
The task concept is driven by the global loss: the empirical risk should go towards 0!
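This argument can be checked numerically. The sketch below (toy slopes of ±2, not the paper's experiment) mixes two conflicting linear tasks: a single least-squares fit keeps a large loss, while picking the per-sample minimum over the two correct slopes, which is what a converged deconfusing assignment would do, drives the loss to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
# Two conflicting tasks on the same inputs, y = 2x and y = -2x,
# mixed together with no task annotation.
task = rng.integers(0, 2, size=200)
y = np.where(task == 0, 2 * x, -2 * x)

# One mapping function (a least-squares line) cannot fit both tasks: loss stays large.
w_single = (x @ y) / (x @ x)
loss_single = np.mean((w_single * x - y) ** 2)

# Two candidate slopes with an argmin (deconfusing) allocation drive the loss to ~0.
w = np.array([2.0, -2.0])                        # assume the mapping functions have converged
residual = (np.outer(x, w) - y[:, None]) ** 2    # per-sample loss under each task
loss_deconfused = np.mean(residual.min(axis=1))  # each sample allocated to its best task
```

Here `loss_single` stays near the label variance (about 4/3 for this mixture), while `loss_deconfused` is essentially zero.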
12 Training Target & CSL-Net
Optimization target and expected result: minimize the global risk functional to zero.
Constraint: the output of the Deconfusing-Net must be one-hot.
Difficulty: approximating the one-hot constraint with a softmax leads to a trivial solution, so joint backpropagation is not available.
13 Training Algorithm of CSL-Net
The two sub-networks are trained alternately:
• Training of the Mapping-Nets (h fixed):  min_{g_k} M_map(g_k) = Σ_{i=1}^{m} h_k(x_i, y_i) · ||g_k(x_i) − y_i||²,  k = 1, …, n
• Training of the Deconfusing-Net (g fixed):  min_h M_dec(h) = Σ_{i=1}^{m} || h(x_i, y_i) − h̃(x_i, y_i) ||²,  where the temporary assignment h̃(x_i, y_i) is one-hot at argmin_k ||g_k(x_i) − y_i||²
(Figure: sample x_i feeds the n Mapping-Nets; together with the ground-truth y_i it feeds the Deconfusing-Net, and the argmin over per-task losses gives the temporary assignment.)
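The alternation above can be sketched in a few lines. This is a deliberately simplified illustration, with linear mapping functions g_k(x) = w_k·x and the Deconfusing-Net replaced by its own training target, the hard argmin assignment; it is not the paper's network implementation:

```python
import numpy as np

def csl_fit(x, y, n_tasks=2, n_iters=20, seed=0):
    """Alternating two-stage training: reassign samples, then refit each mapping.

    Mapping functions are linear, g_k(x) = w_k * x, and the Deconfusing-Net
    is replaced by its hard argmin training target (a simplification).
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_tasks)                       # slopes of the n mapping functions
    for _ in range(n_iters):
        # Deconfusing step: allocate each sample to the best-fitting mapping function.
        residual = (np.outer(x, w) - y[:, None]) ** 2
        assign = residual.argmin(axis=1)               # temporary one-hot assignment
        # Mapping step: refit each g_k by least squares on its allocated samples.
        for k in range(n_tasks):
            m = assign == k
            if m.any():
                w[k] = (x[m] @ y[m]) / (x[m] @ x[m])
    return np.sort(w)

# Confusing data from two hidden linear tasks, y = 2x and y = -2x.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=300)
y = np.where(rng.integers(0, 2, size=300) == 0, 2 * x, -2 * x)
slopes = csl_fit(x, y)
```

On this toy mixture the recovered slopes converge to the two hidden tasks, mirroring how the alternation reduces mapping conflicts.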
14 Experiment: Function Regression
Supervised learning fails to fit multiple functions; an incorrect task number leads to confused fitting results; CSL-Net learns reasonable task concepts and the complete multi-task mapping.
(Figure: results during the training process.)
15 Experiment: Pattern Recognition
Each sample carries the classification label of only one task. Label sets: Color ("Red", "Green", "Yellow"), Name ("Apple", "Banana", "Lemon"), Taste ("Sweet", "Sour", "Spicy").
Two learning goals, with two matching evaluation metrics:
• Task understanding
• Multi-task classification
16 Experiment: Pattern Recognition Results on two confusing supervised datasets.
17 Experiment: Pattern Recognition
Feature visualization of the Deconfusing-Net, before and after training: the Deconfusing-Net separates confusing samples into reasonable task groups.
18 Conclusion
A novel learning problem for general raw data:
• Task annotation is unknown in natural raw data.
• Understanding task concepts from raw data (confusing data).
A novel learning paradigm: Confusing Supervised Learning
• Deconfusing function: sample allocation to tasks.
• Mapping function: multi-task mappings.
• Global risk functional: overall risk of representing the raw data.
A novel network: CSL-Net
• An alternating two-stage training algorithm realizes the task constraint.
A novel application: a learning system towards general intelligence.
• The agent autonomously defines task concepts and learns multi-task mappings without manual task annotation.
19 Thanks! Xin Su, Tsinghua University suxin16@mails.tsinghua.edu.cn