Cascaded 3D Fully Convolutional Networks for Medical Image Segmentation
Holger Roth, Assistant Professor (Research), Nagoya University, Japan
Contributors: Hirohisa Oda (a), Xiangrong Zhou (b), Natsuki Shimizu (a), Ying Yang (a), Chen Shen (a), Yuichiro Hayashi (a), Masahiro Oda (a), Michitaka Fujiwara (c), Kazunari Misawa (d), Kensaku Mori (a)
(a) Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Japan
(b) Gifu University, Yanagido, Gifu, Japan
(c) Nagoya University Graduate School of Medicine, Nagoya, Japan
(d) Aichi Cancer Center, Kanokoden, Chikusa-ku, Nagoya, Japan
www.holgerroth.com 2018/03/26

Motivation
• Multi-organ segmentation is an important prerequisite for many computer-aided diagnosis (CADx) systems in medical imaging.
• Segmentation can provide quantitative analysis that is important for diagnosis and treatment
– Measuring organ volumes, discovering shape irregularities, 3D printing, and surgical navigation
• It is challenging because the appearance of organs varies widely, especially in the abdomen.
Figure: contrast-enhanced CT scans of the lower abdomen

Surgical navigation system for gastric cancer treatment
Hayashi, Yuichiro, et al. "Clinical application of a surgical navigation system based on virtual laparoscopy in laparoscopic gastrectomy for gastric cancer." International Journal of Computer Assisted Radiology and Surgery 11.5 (2016): 827-836.

Deep learning for image segmentation
• Recent advances in deep learning, like fully convolutional networks (FCN), have made it feasible to train deep models for dense semantic segmentation tasks. [Long et al., CVPR 2015]
• Extensions to 3D have been shown to work well for biomedical images (3D U-Net) [Çiçek et al., MICCAI 2016]
• In this work we present a cascaded 3D FCN approach trained on manually labelled data of several abdominal organs and vessels.
• We achieve competitive segmentation results on clinical CT images used in gastric surgery.

Background: Convolutional Neural Network (CNN)
Figure labels: Image; 3D filter kernel; Output image; First-layer 3D kernels
Kernel elements are trained from the data!
[Krizhevsky et al., NIPS 2012] [Roth et al., TMI 2015] https://github.com/vdumoulin/conv_arithmetic

Classification CNN
e.g. LeNet, AlexNet, VGG-Net, etc.
Figure (output label: “abdomen”) from [Roth et al., JAMIT 2018, arXiv:1803.08691]

Fully convolutional networks (FCN)
[Long et al., “Fully convolutional networks for semantic segmentation”, CVPR 2015]
Figure from [Roth et al., JAMIT 2018, arXiv:1803.08691]

Fully convolutional architectures (3D U-Net) [Çiçek et al., MICCAI 2016]*
Figure: 3D U-Net with an analysis path (encoder) and a synthesis path (decoder)
• Input: CT subvolume of [132,132,116] voxels; output: 3D probability maps for each class, [44,44,28] voxels
• Building blocks: Conv + BatchNorm + ReLU, max pool, de-conv, and concatenation (skip connection) with cropping
• Feature-map sizes along the analysis path: [132,132,116], [128,128,112], [64,64,56], [60,60,52], [30,30,26], [26,26,22], [13,13,11], [9,9,7]
• Feature-map sizes along the synthesis path: [18,18,14], [14,14,10], [28,28,20], [24,24,16], [48,48,32], [44,44,28]
• Channel counts shown in the figure range from 1 (input) through 32, 64, 128, and 256, with concatenated inputs of 256+512, 128+256, and 64+128 channels in the synthesis path
• 19 million learnable parameters
• Fits on one 12 GB GPU (NVIDIA TITAN X) for training; ~6 GB needed for inference
*Implementation in Caffe
Figure after 3D U-Net [Çiçek et al., “3D U-Net: learning dense volumetric segmentation from sparse annotation”, MICCAI 2016]

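The feature-map sizes in the figure can be reproduced with simple arithmetic. The sketch below assumes unpadded 3x3x3 convolutions (two per level), 2x max pooling, and 2x up-convolution over three resolution steps, as in the 3D U-Net paper; the function name and its parameters are hypothetical and not taken from the Caffe implementation.

```python
def unet_output_size(n, levels=3, convs_per_level=2, kernel=3):
    """Output edge length of an unpadded U-Net for an input edge length n."""
    shrink = convs_per_level * (kernel - 1)  # voxels lost to the convolutions of one level
    # Analysis path: convolutions, then 2x max pooling.
    for _ in range(levels):
        n -= shrink
        if n % 2:
            raise ValueError("size before pooling must be even")
        n //= 2
    n -= shrink  # convolutions at the bottom level
    # Synthesis path: 2x up-convolution, then convolutions.
    for _ in range(levels):
        n = 2 * n - shrink
    return n


# Input subvolume size from the slide (x, y, z).
print([unet_output_size(n) for n in (132, 132, 116)])  # [44, 44, 28]
```

Under these assumptions the 132 x 132 x 116 input shrinks to the 44 x 44 x 28 output listed on the slide, which is why each predicted tile needs a larger input context.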
A cascaded approach
Pipeline (figure):
• 1st stage: images, labels → detect patient’s body → train 3D FCN → multi-class prediction
• 2nd stage: dilate foreground → train 3D FCN → final prediction
H. Roth et al., “An application of cascaded 3D fully convolutional networks for medical image segmentation”, Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

A cascaded approach
• 1st stage: the 3D FCN sees ~40% of the voxels in the image
• 2nd stage: the 3D FCN sees ~10% of the voxels in the image
This approach encourages better segmentation around the boundary of organs.
H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

Subvolume tiling approach in training & testing
All CT volumes are downsampled 2x
• The down-sampling factor and the sub-volume size depend largely on the amount of available GPU memory
Figure panels: What the network sees; What the network learns

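A minimal sketch of how such a tiling could be enumerated, assuming output tiles of 44 x 44 x 28 voxels cut from 132 x 132 x 116 input subvolumes (the sizes given elsewhere in these slides). The volume shape, the half-tile stride for the overlapping case, and all function names are hypothetical.

```python
from itertools import product

OUT_TILE = (44, 44, 28)    # output tile size (x, y, z)
IN_TILE = (132, 132, 116)  # input subvolume size (x, y, z)


def tile_starts(length, tile, stride=None):
    """Start indices of tiles of size `tile` that cover the range [0, length)."""
    stride = stride or tile
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)  # last tile sits flush with the border
    return starts


def output_tiles(shape, tile=OUT_TILE, stride=None):
    """Start coordinates of all output tiles needed to cover a volume."""
    stride = stride or tile
    axes = [tile_starts(n, t, s) for n, t, s in zip(shape, tile, stride)]
    return list(product(*axes))


def input_window(start, tile=OUT_TILE, in_tile=IN_TILE):
    """Per-axis (begin, end) of the input subvolume centred on an output tile.
    Bounds can fall outside the volume, where padding would be needed."""
    return [(s - (i - t) // 2, s - (i - t) // 2 + i)
            for s, t, i in zip(start, tile, in_tile)]


shape = (256, 256, 200)  # hypothetical downsampled CT volume
plain = output_tiles(shape)
overlapping = output_tiles(shape, stride=(22, 22, 14))  # assumed half-tile stride
print(len(plain), len(overlapping), input_window(plain[0]))
```

With the tile size as stride, tiles only overlap at the volume border; a smaller stride yields overlapping tiles, at the cost of many more forward passes.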
Dataset creation
CTs acquired at Aichi Cancer Center, Nagoya, Japan
• semi-automated segmentation tools
– graph-cuts
– region growing
• data collected from 2009 to 2017 (331 cases)
http://pluto.newves.org

Candidate region
• Mask of patient’s body
– Thresholding
– Morphological opening
– Largest connected component
• Removes ~60% of voxels in the image

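The three steps named on this slide can be sketched with NumPy alone. The threshold of -500 HU, the 3x3x3 structuring element, the 6-connectivity, and all function names are assumptions for illustration; the slides do not state the actual parameters. For full-size volumes, `scipy.ndimage.binary_opening` and `scipy.ndimage.label` would be the more practical tools.

```python
from collections import deque

import numpy as np


def _shifted_stack(mask):
    """All 27 one-voxel shifts of a mask, padded with background."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    nz, ny, nx = mask.shape
    return np.stack([p[dz:dz + nz, dy:dy + ny, dx:dx + nx]
                     for dz in range(3) for dy in range(3) for dx in range(3)])


def erode(mask):
    return _shifted_stack(mask).all(axis=0)


def dilate(mask):
    return _shifted_stack(mask).any(axis=0)


def largest_component(mask):
    """Keep the largest 6-connected foreground component (breadth-first search)."""
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    seen = np.zeros(mask.shape, dtype=bool)
    best = []
    for seed in zip(*np.nonzero(mask)):
        if seen[seed]:
            continue
        seen[seed] = True
        queue, comp = deque([seed]), []
        while queue:
            z, y, x = queue.popleft()
            comp.append((z, y, x))
            for dz, dy, dx in steps:
                n = (z + dz, y + dy, x + dx)
                inside = all(0 <= c < s for c, s in zip(n, mask.shape))
                if inside and mask[n] and not seen[n]:
                    seen[n] = True
                    queue.append(n)
        if len(comp) > len(best):
            best = comp
    out = np.zeros(mask.shape, dtype=bool)
    if best:
        out[tuple(np.array(best).T)] = True
    return out


def body_mask(ct, threshold=-500.0):
    """Threshold, morphological opening, largest connected component."""
    mask = ct > threshold       # assumed threshold in Hounsfield units
    mask = dilate(erode(mask))  # opening removes thin structures
    return largest_component(mask)


# Toy volume: air, a cubic "body", and a one-voxel-thick "table" plane.
ct = np.full((12, 12, 12), -1000.0)
ct[2:10, 2:10, 2:10] = 0.0
ct[11, :, :] = 0.0
mask = body_mask(ct)
print("fraction of voxels removed:", 1.0 - mask.mean())
```

In this toy volume the opening deletes the thin plane and the component step keeps only the cube, which mirrors how a body mask shrinks the region the first-stage network has to process.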
Dealing with data imbalance
Balancing weight (smallest organ gets highest weight):
$\lambda_k = 1 - \frac{N_k}{N}$
$N$: total number of voxels in candidate region
$K$: number of classes
Weighted voxel-wise cross-entropy loss:
$\mathcal{L} = -\frac{1}{N} \sum_{k=1}^{K} \lambda_k \sum_{x \in N_k} \log \hat{p}_k(x)$
$\hat{p}_k$: softmax likelihood
$N_k$: number of voxels for class $k$
Figure labels: Testing; high, low

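A small worked example of the balancing idea, assuming the weight of class k has the form 1 - N_k / N (N_k voxels of class k out of N voxels in the candidate region) and that the loss is averaged over all voxels. Both the exact form and the function names are assumptions for illustration, not a transcription of the training code.

```python
import math


def balancing_weights(voxel_counts):
    """Weight 1 - N_k / N for each class, given per-class voxel counts N_k."""
    total = sum(voxel_counts)
    return [1.0 - n_k / total for n_k in voxel_counts]


def weighted_cross_entropy(probs, labels, weights):
    """Mean over voxels of -w_k * log p_k(x), where k is the true class of voxel x.

    probs:  one list of softmax outputs per voxel
    labels: true class index per voxel
    """
    loss = 0.0
    for p, k in zip(probs, labels):
        loss -= weights[k] * math.log(p[k])
    return loss / len(labels)


# Hypothetical counts: background, a large organ, a small organ.
weights = balancing_weights([900, 90, 10])
print(weights)  # the smallest class receives the largest weight

probs = [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]]
labels = [0, 1, 2]
print(weighted_cross_entropy(probs, labels, weights))
```

Because the background dominates the voxel count, its weight is close to zero here, so errors on small organs contribute far more to the loss than errors on background.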
Data augmentation
• random cropping
• random rotations
• elastic B-spline deformations
'CreateDeformation' and 'ApplyDeformation' layers from 3D U-Net [Çiçek et al., MICCAI 2016]

Feature maps
Learned feature kernels
Figure labels: CT image; Segmentation probability maps; concatenation (skip) connections at each level; 3D 3x3x3 kernels throughout the network

Experiments
• Network trained on abdominal contrast-enhanced CT images:
– 281/50 training/validation split
– 8 classes manually labeled: artery, vein, liver, spleen, stomach, gallbladder, pancreas + background
• Training on 281 cases can take 2-3 days for 200k iterations; inference takes 1.4-3.3 minutes (NVIDIA TITAN X)
• Compared approaches:
– Single 3D U-Net FCN
– Cascade (train one FCN to define the candidate region for a second FCN)

Inference (test case) – 1st stage
Figure legend: artery, vein, liver, spleen, stomach, gallbladder, pancreas
• 5-6 minutes using non-overlapping tiles
• 15-20 minutes using overlapping tiles
• NVIDIA GeForce GTX TITAN X with 12 GB memory

Figure panels: (a) Ground truth, (b) Stage 2 – Tiling, (c) Stage 2 result
Example from the validation set showing (a) the ground truth, (b) the tiling approach on the 2nd-stage candidate region, and (c) the resulting segmentation. The grid shows the output tiles of size 44 × 44 × 28 (x, y, z directions). Each predicted tile is based on a larger input of 132 × 132 × 116 that the network processes, a size dictated by the available GPU memory (12 GB).
H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

Validation: Dice scores (1st & 2nd stages)
Note: a marked improvement in performance of the 2nd stage can be observed.
Stage  artery  vein  liver  spleen  stomach  gallbladder  pancreas
1st    0.60    0.67  0.90   0.85    0.80     0.72         0.56
2nd    0.80    0.73  0.93   0.91    0.84     0.71         0.63

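For reference, the Dice score of one label is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and ground-truth voxel sets. A minimal sketch over flattened label sequences follows; the function name and the toy labels are hypothetical.

```python
def dice(pred, truth, label):
    """Dice score for one label, given flat sequences of per-voxel labels."""
    p = sum(1 for v in pred if v == label)
    t = sum(1 for v in truth if v == label)
    inter = sum(1 for a, b in zip(pred, truth) if a == label and b == label)
    if p + t == 0:
        return float("nan")  # label absent from both prediction and ground truth
    return 2.0 * inter / (p + t)


truth = [0, 1, 1, 1, 2, 2]
pred = [0, 1, 1, 0, 2, 1]
for label in (1, 2):
    print(label, round(dice(pred, truth, label), 3))
```

A score of 1.0 means perfect overlap and 0.0 means none, so the per-organ values on this slide can be read directly as overlap quality.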
Test on unseen dataset
150 abdominal contrast-enhanced CTs (ceCTs) from a different hospital and scanner
Figure panels: 1st stage; 2nd stage
H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

Comparison to other methods
H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

Visualization for surgical navigation
Figure panels:
• Segmentation result overlaid with anatomical labeling of blood vessels
• Anatomical name display on blood vessel surface in 3D views
• Anatomical name display in surgical navigation view
Matsuzaki, T., Oda, M., Kitasaka, T., Hayashi, Y., Misawa, K., & Mori, K. (2015). Automated anatomical labeling of abdominal arteries and hepatic portal system extracted from abdominal CT volumes. Medical Image Analysis.

Conclusions
• 3D fully convolutional architectures (3D U-Net) can achieve competitive results for multi-organ segmentation.
• They can be efficiently deployed on a single GPU.
– Larger GPU memory or multi-GPU processing is helpful, see Roth et al. SPIE 2018 (arXiv 1711.06439)
• A cascaded approach for training & testing was able to markedly improve the results.
Figure annotations: Same input/output size; Close to 90% Dice on average for pancreas!