Matching Guided Distillation
ECCV 2020
Kaiyu Yue, Jiangfan Deng, and Feng Zhou
Algorithm Research, Aibee Inc.
Motivation
Motivation
Distillation Obstacle
• The gap in semantic feature structure between the intermediate features of the teacher and the student
Classic Scheme
• Transform intermediate features by adding adaptation modules, such as a conv layer
Problems
• 1) The adaptation module brings more parameters into training
• 2) An adaptation module with random initialization or a special transformation is not friendly for distilling a pre-trained student
Matching Guided Distillation Framework
Matching Guided Distillation – Matching
Given two feature sets from the teacher and the student, we use the Hungarian method to compute the flow-guided matrix M. The flow-guided matrix M indicates the matched channel relationships, as sketched below.
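A minimal sketch of the matching step, assuming the cost is the pairwise L2 distance between flattened per-channel teacher and student responses and that student channels are tiled so every teacher channel receives an assignment; the function name `match_channels` and these details are illustrative, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_channels(f_t, f_s):
    """Match teacher channels to student channels with the Hungarian method (sketch).

    f_t: teacher features, shape (Ct, HW)  -- per-channel responses, flattened
    f_s: student features, shape (Cs, HW), with Cs <= Ct

    Returns M, a (Cs, Ct) binary matrix where M[i, j] = 1 means teacher
    channel j is routed to student channel i.
    """
    Ct, Cs = f_t.shape[0], f_s.shape[0]
    reps = int(np.ceil(Ct / Cs))
    f_s_tiled = np.tile(f_s, (reps, 1))[:Ct]          # copies of student channels, (Ct, HW)

    # cost[j, k]: L2 distance between teacher channel j and tiled student channel k
    cost = np.linalg.norm(f_t[:, None, :] - f_s_tiled[None, :, :], axis=-1)

    rows, cols = linear_sum_assignment(cost)          # one teacher channel per tiled slot
    M = np.zeros((Cs, Ct), dtype=np.float32)
    M[cols % Cs, rows] = 1.0                          # fold tiled slots back to real student channels
    return M
```

Tiling the student channels turns the many-to-one matching into a balanced assignment that `linear_sum_assignment` can solve directly; building the full (Ct, Ct, HW) difference tensor is fine for a sketch but would be chunked in practice.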
Matching Guided Distillation – Channels Reduction
One student channel can match multiple teacher channels. We reduce the matched teacher channels into a single tensor for guiding the student.
Matching Guided Distillation – Distillation
After reducing the teacher channels, we distill the student with a partial distance training loss, such as an L2 loss (see the sketch below).
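A minimal sketch of the distance loss, assuming a plain mean-squared L2 distance between the reduced teacher features and the student features; the normalization and the name `mgd_l2_loss` are assumptions, not the paper's exact loss.

```python
import torch.nn.functional as F

def mgd_l2_loss(f_t_reduced, f_s):
    """Partial distance loss between reduced teacher and student features (sketch).

    f_t_reduced: (N, Cs, H, W) teacher features after channel reduction,
                 aligned channel-by-channel with the student.
    f_s:         (N, Cs, H, W) student features.
    """
    # element-wise mean-squared (L2) distance; the teacher side is detached
    # so gradients only flow into the student
    return F.mse_loss(f_s, f_t_reduced.detach())
```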
Matching Guided Distillation – Coordinate Descent Optimization
The overall training takes a coordinate-descent approach that alternates between two optimization objectives: updating the flow-guided matrix M, and updating the student parameters with SGD on the distance loss (a sketch of the loop follows).
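A minimal sketch of the coordinate-descent loop, assuming a PyTorch-style setup. `collect_features`, `reduce_channels`, `teacher.features`, `student.features`, and `student.head` are hypothetical helpers; `match_channels` and `mgd_l2_loss` refer to the earlier sketches, and `update_every` (how often M is re-solved) is an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F

def train_with_mgd(teacher, student, loader, optimizer, epochs, update_every=10):
    """Coordinate-descent training loop (sketch, not the paper's exact recipe).

    Alternates between (a) re-solving the flow-guided matrix M from the current
    teacher/student features, and (b) training the student with SGD while M is
    held fixed.
    """
    M = None
    for epoch in range(epochs):
        if epoch % update_every == 0:
            # (a) update the flow-guided matrix M.
            # collect_features is a hypothetical helper gathering per-channel
            # teacher/student responses over a few batches (torch/numpy
            # conversions omitted for brevity).
            f_t, f_s = collect_features(teacher, student, loader)
            M = match_channels(f_t, f_s)               # Hungarian matching, see earlier sketch

        # (b) optimize student parameters with M fixed
        for images, labels in loader:
            with torch.no_grad():
                feat_t = teacher.features(images)       # intermediate teacher features
            feat_s = student.features(images)           # intermediate student features
            logits = student.head(feat_s)

            feat_t_reduced = reduce_channels(feat_t, M) # e.g. absolute max pooling, sketched later
            loss = F.cross_entropy(logits, labels) + mgd_l2_loss(feat_t_reduced, feat_s)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```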
Matching Guided Distillation Reduction Methods
Matching Guided Distillation – Channels Reduction We propose three efficient methods for reducing teacher channels: Sparse Matching, Random Drop and Absolute Max Pooling.
Matching Guided Distillation – Sparse Matching
Each student channel matches only the single most related teacher channel; unmatched teacher channels are ignored.
Matching Guided Distillation – Random Drop
For each student channel, we sample one random teacher channel from the set of teacher channels matched to it.
Matching Guided Distillation – Absolute Max Pooling
To keep both positive and negative feature information from the teacher, we propose a novel pooling mechanism that reduces the matched teacher channels according to the absolute value at each spatial location (sketched below).
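A minimal sketch of absolute max pooling over matched teacher channels, assuming M is a binary (Cs x Ct) torch tensor; the per-channel loop is for clarity, not efficiency.

```python
import torch

def absolute_max_pool(f_t, M):
    """Absolute max pooling over matched teacher channels (sketch).

    f_t: (N, Ct, H, W) teacher features.
    M:   (Cs, Ct) binary matching matrix (torch tensor); M[i, j] = 1 means
         teacher channel j is matched to student channel i.

    At every spatial location of every student channel, keep the matched
    teacher value with the largest absolute magnitude, sign preserved.
    """
    N, Ct, H, W = f_t.shape
    Cs = M.shape[0]
    out = f_t.new_zeros(N, Cs, H, W)
    for i in range(Cs):
        idx = M[i].nonzero(as_tuple=True)[0]            # teacher channels matched to student channel i
        if idx.numel() == 0:
            continue
        group = f_t[:, idx]                             # (N, k, H, W)
        amax = group.abs().argmax(dim=1, keepdim=True)  # index of the max |value| per location
        out[:, i] = group.gather(1, amax).squeeze(1)    # take the signed value at that index
    return out
```

Keeping the signed value of the largest-magnitude response preserves strong negative as well as strong positive teacher activations, which a plain max pooling over the matched channels would discard.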
Matching Guided Distillation Main Results
Results – Fine-Grained Recognition on CUB-200
+3.97% top-1 and +5.44% top-1 improvements
Results – Large-Scale Classification on ImageNet-1K
+1.83% top-1 and +2.6% top-1 improvements
Results – Object Detection and Instance Segmentation on COCO
Summary
• MGD is lightweight and efficient for various tasks
• MGD removes the channel-number constraint between teacher and student, so it is flexible to plug into a network
• MGD is friendly for distilling a pre-trained student
• Project webpage: http://kaiyuyue.com/mgd