Temporally Distributed Networks for Fast Video Semantic Segmentation
Ping Hu 1, Fabian Caba Heilbron 2, Oliver Wang 2, Zhe Lin 2, Stan Sclaroff 1, Federico Perazzi 2
1 Boston University, 2 Adobe Research
Challenge
Video Semantic Segmentation
[Figure: input frames {..., T-1, T, T+1, ...} and their per-frame segmentations]
❏ High data volume
❏ Content redundancy
❏ Spatial-temporal variations between frames
❏ Requiring: (1) High accuracy; (2) High speed; (3) Low latency
Main Contribution & Novelty
❏ Temporally distributed network ⇨ Low-latency video processing.
❏ Attention propagation mechanism ⇨ Robust feature aggregation.
❏ Grouped knowledge distillation ⇨ Effective model training.
❏ TDNet - SOTA in accuracy and speed.
Temporally Distributed Networks
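The slides do not spell out the scheduling, so the following is only a minimal sketch of the temporally distributed idea. It assumes that a group of lightweight sub-networks is applied in round-robin order, one sub-network per frame, and that the full representation at each step is assembled from the most recent sub-features. The names `make_subnet` and `TemporallyDistributedExtractor` are hypothetical and the sub-networks are stand-ins that only tag the frame.

```python
from collections import deque


def make_subnet(index):
    """Hypothetical stand-in for a lightweight sub-network: tags the frame with its index."""
    def subnet(frame):
        return (index, frame)
    return subnet


class TemporallyDistributedExtractor:
    """Runs one sub-network per frame and keeps the latest sub-features (assumed scheme)."""

    def __init__(self, subnets):
        self.subnets = subnets
        # Sliding window holding at most one sub-feature per sub-network.
        self.window = deque(maxlen=len(subnets))
        self.step = 0

    def process(self, frame):
        # Only one sub-network runs on the new frame, so per-frame cost stays low.
        subnet = self.subnets[self.step % len(self.subnets)]
        self.window.append(subnet(frame))
        self.step += 1
        # The full representation reuses sub-features computed on earlier frames.
        return list(self.window)


extractor = TemporallyDistributedExtractor([make_subnet(i) for i in range(4)])
outputs = [extractor.process(f"frame{t}") for t in range(6)]
```

In this sketch, every frame produces an output right after a single sub-network pass, which is the property the slides associate with low latency. How the collected sub-features are aligned and merged is a separate step.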
Attention Propagation
❏ Challenge: Pixelwise tasks are sensitive to the spatial misalignment caused by motion between frames.
❏ Attention Downsampling: Saves computation by downsampling the reference data in attention.
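To illustrate why downsampling the reference data saves computation, here is a hedged sketch of scaled dot-product attention in which the reference keys and values are average-pooled before the query is compared against them. It uses 1D lists of feature vectors for brevity, whereas real feature maps are 2D. The function names, the pooling choice, and the `stride` parameter are assumptions for illustration, not details taken from the slides.

```python
import math


def average_pool(features, stride):
    """Downsample a list of feature vectors by averaging each run of `stride` positions."""
    pooled = []
    for start in range(0, len(features), stride):
        chunk = features[start:start + stride]
        dim = len(chunk[0])
        pooled.append([sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)])
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propagate(queries, ref_keys, ref_values, stride=2):
    """Attend from current-frame queries to a downsampled reference frame (assumed scheme)."""
    # Pooling the reference cuts the number of query-key comparisons by about `stride`.
    keys = average_pool(ref_keys, stride)
    values = average_pool(ref_values, stride)
    scale = math.sqrt(len(queries[0]))
    dim = len(values[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        output.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return output


queries = [[1.0, 0.0], [0.0, 1.0]]
ref_keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
ref_values = [[10.0], [10.0], [20.0], [20.0]]
propagated = propagate(queries, ref_keys, ref_values, stride=2)
```

Because each query attends over all reference positions rather than a fixed location, the output can draw on content that has moved between frames. In the example, the first query is pulled toward the value 10 and the second toward 20.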
Grouped Knowledge Distillation
❏ Transfer knowledge at the subspace level.
❏ Enhance the complementarity of sub-feature maps in the full feature space.
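The slides do not give the loss used, so the sketch below only illustrates what matching at the subspace level could look like. It assumes the teacher feature is split into contiguous channel groups, one per student sub-feature, and uses a per-group mean squared error as a stand-in objective. All function names and the choice of MSE are hypothetical.

```python
def split_channels(feature, groups):
    """Split a flat channel vector into `groups` equal, contiguous sub-vectors."""
    if len(feature) % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    size = len(feature) // groups
    return [feature[i * size:(i + 1) * size] for i in range(groups)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def grouped_distillation_loss(teacher_feature, student_subfeatures):
    """Average per-group error between teacher channel groups and student sub-features."""
    # Assumption: the i-th student sub-feature is supervised by the i-th teacher group.
    teacher_groups = split_channels(teacher_feature, len(student_subfeatures))
    per_group = [mse(t, s) for t, s in zip(teacher_groups, student_subfeatures)]
    return sum(per_group) / len(per_group), per_group


teacher = [1.0, 2.0, 3.0, 4.0]
students = [[1.0, 2.0], [3.0, 6.0]]
loss, per_group = grouped_distillation_loss(teacher, students)
```

Reporting the per-group terms alongside the average makes it visible which sub-feature is lagging behind its share of the teacher representation.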
Approaches & Challenge

                    Previous Methods                  Our TDNet
                    Key-frame    Temporal-context
Overall-accuracy    ×            √                    √
Overall-speed       √            ×                    √
Low-latency         ×            ×                    √