
ImageNet, GPU · IGC-L16M32 20 17.7M 3.37 19.31 1.63 IGC-L450M2 20 19.3M 3.30 19.00 - IGC-L32M26 20 24.1M 3.31 18.75 1.56. Interleaved Structured Sparse Convolutional Neural Networks


• ImageNet, GPU


[Figure: four repeated blocks with labels T, H, x, 1−T and +, annotated 100+]


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run

• Convolution operations


• Integer

• Quantization


• Binarization

• Quantization


• Binarization

• Integer


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run

• Convolution operations

• Low-precision kernels


• Filter pruning

• Channel pruning


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run

• Convolution operations

• Low-precision kernels

• Low-rank kernels


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run

• Convolution operations

• Low-precision kernels

• Low-rank kernels

• Composition from low-rank kernels


• Non-structured sparse

• Structured sparse


• Going deeper
    • Stack multiple blocks: VGG

• Improve information flow by skip connections
    • GoogleNet, Highway, ResNet, Deeply-Fused Nets, FractalNets, DenseNets, Merge-and-run

• Convolution operations

• Low-precision kernels

• Low-rank kernels

• Composition from low-rank kernels

• Sparse kernels


[1] Ting Zhang, Guo-Jun Qi, Bin Xiao, and Jingdong Wang. Interleaved Group Convolutions. ICCV 2017: 4383-4392.

[2] Guotian Xie, Jingdong Wang, Ting Zhang, Jianhuang Lai, Richang Hong, and Guo-Jun Qi. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.

[3] Ke Sun, Mingjie Li, Dong Liu, and Jingdong Wang. IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. BMVC 2018.


large

Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang: Interleaved Group Convolutions, ICCV 2017.


[Figure: interleaved group convolution block. Primary group convolution (Split, two Conv. branches, Concat.) followed by a secondary group 1 × 1 convolution (Split, three Conv. branches, Concat.)]

L=2 primary partitions, 3 channels in each partition

M=3 secondary partitions, 2 channels in each partition


The path connecting each input channel with each output channel


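[Figure: input x, primary group convolution W^p (Split, Conv., Concat.), permutation P^T, secondary group convolution W^d (Split, Conv., Concat.), permutation P, output x′]

$\mathbf{x}' = \mathbf{P}\mathbf{W}^{d}\mathbf{P}^{T}\mathbf{W}^{p}\mathbf{x}$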


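$\mathbf{x}' = \mathbf{P}\mathbf{W}^{d}\mathbf{P}^{T}\mathbf{W}^{p}\mathbf{x}$

The composed kernel $\mathbf{P}\mathbf{W}^{d}\mathbf{P}^{T}\mathbf{W}^{p}$ is dense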
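The composition x′ = P W^d P^T W^p x can be checked numerically on the small case from the earlier slide (L = 2 primary partitions of 3 channels, M = 3 secondary partitions of 2 channels). The following is a minimal sketch, not the authors' implementation: it assumes spatial dimensions are ignored so each group convolution becomes a block-diagonal matrix, and the helper `block_diag` and all variable names are illustrative.

```python
import numpy as np

def block_diag(blocks):
    """Place square blocks on the diagonal of an otherwise zero matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    r = 0
    for b in blocks:
        k = b.shape[0]
        out[r:r + k, r:r + k] = b
        r += k
    return out

L, M = 2, 3          # primary partitions, channels per primary partition
C = L * M            # total number of channels
rng = np.random.default_rng(0)

# Assumption: positive random weights, so that no path cancels to zero.
W_p = block_diag([rng.uniform(0.5, 1.5, (M, M)) for _ in range(L)])  # primary
W_d = block_diag([rng.uniform(0.5, 1.5, (L, L)) for _ in range(M)])  # secondary

# P^T moves channel m of primary partition l (position l*M + m)
# to slot l of secondary partition m (position m*L + l).
P_T = np.zeros((C, C))
for l in range(L):
    for m in range(M):
        P_T[m * L + l, l * M + m] = 1.0
P = P_T.T

composed = P @ W_d @ P_T @ W_p
x = rng.standard_normal(C)
x_out = composed @ x

print("nonzeros in W_p and W_d:", np.count_nonzero(W_p) + np.count_nonzero(W_d))
print("nonzeros in composed kernel:", np.count_nonzero(composed), "of", C * C)
```

In this sketch the two sparse factors hold fewer nonzero weights than a full C × C kernel, while their composition still connects every input channel to every output channel.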


Advantages: Wider than regular convolutions

Thus, under the same #parameters, our IGC is wider than a regular convolution except when 𝐿 = 1
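The width claim can be illustrated with a small parameter count. This is a hedged sketch under stated assumptions: an IGC block with L primary partitions of M channels each and a spatial kernel of S = 3 × 3 = 9 weights is taken to have L·M·M·S primary parameters plus M·L·L secondary (1 × 1) parameters, a regular convolution of width C is taken to have C·C·S parameters, and biases are ignored. The function names are illustrative.

```python
import math

def igc_params(L, M, S=9):
    """Parameters of one IGC block: primary (L groups of M x M x S) + secondary (M groups of L x L)."""
    return L * M * M * S + M * L * L

def regular_width_for(params, S=9):
    """Width C of a regular convolution (C * C * S parameters) with the same budget."""
    return math.sqrt(params / S)

results = {}
for L, M in [(1, 64), (2, 32), (32, 2)]:
    igc_width = L * M
    budget = igc_params(L, M)
    reg_width = regular_width_for(budget)
    results[(L, M)] = (igc_width, reg_width)
    print(f"L={L:2d} M={M:2d}  #params={budget:6d}  IGC width={igc_width}  regular width={reg_width:.2f}")
```

With these assumptions, every configuration with 𝐿 > 1 gives an IGC width above the width of the equally sized regular convolution, while 𝐿 = 1 falls slightly below it.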


depth   RegConv-18      IGC
20      92.55 ± 0.14    92.84 ± 0.26
38      91.57 ± 0.09    92.24 ± 0.62
62      88.60 ± 0.49    90.03 ± 0.85

depth   RegConv-18      IGC
20      0.34            0.15
38      0.71            0.31
62      1.20            0.52

depth   RegConv-18      IGC
20      0.51            0.29
38      1.1             0.57
62      1.7             0.95

+1.43 (IGC over RegConv-18 at depth 62)


depth   RegConv-18      IGC
20      68.71 ± 0.32    70.54 ± 0.26
38      65.00 ± 0.57    69.56 ± 0.76
62      58.52 ± 2.31    65.84 ± 0.75

depth   RegConv-18      IGC
20      0.34            0.15
38      0.71            0.31
62      1.20            0.52

depth   RegConv-18      IGC
20      0.51            0.29
38      1.1             0.57
62      1.7             0.95

+7.32 (IGC over RegConv-18 at depth 62)


                      #params   FLOPS   Training error     Validation error
                                        Top-1    Top-5     Top-1    Top-5
ResNet (Reg. Conv.)   1.133     2.1     21.43    5.96      30.58    10.77
Our approach          0.861     1.3     13.93    2.75      26.95    8.92
                                                           +3.63    +1.85


• Primary group convolution

• Secondary group convolution

Regular convolutions are interleaved group convolutions

[Figure: primary and secondary group convolutions over kernel blocks W11, W12, W21, W22, combined by additions (+) into the kernel W]


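[Figure: accuracy plotted against width. Width values on the two axes: 23, 32, 39, 50, 54, 72, 84, 80, 64, 44 and 62, 88, 124, 154, 184, 196, 205, 192, 170, 128. Annotations: "M=2: Higher accuracy than Xception" and "Best accuracy"]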


                Xception        Our approach
testing error   36.93 ± 0.54    36.11 ± 0.62
#params         3.62 × 10^4     3.80 × 10^4
FLOPS           3.05 × 10^7     3.07 × 10^7

−0.82 (testing error, our approach vs. Xception)

                Xception        Our approach
testing error   32.87 ± 0.67    31.87 ± 0.58
#params         1.21 × 10^5     1.26 × 10^5
FLOPS           1.11 × 10^8     1.12 × 10^8

−1.00 (testing error, our approach vs. Xception)


               21    38.6M   5.22   23.30   2.01
               21    38.6M   4.60   23.73   1.87
               110   1.7M    6.41   27.22   2.01
               200   10.2M   4.35   20.42   -
               16    11.0M   4.81   22.07   -
               28    36.5M   4.17   20.50   -
               40    1.0M    5.24   24.42   1.79
               100   27.2M   3.74   19.25   1.59
DMRNet         56    1.7M    4.94   24.46   1.66
DMRNet-Wide    32    14.9M   3.94   19.25   1.51
DMRNet-Wide    50    24.8M   3.57   19.00   1.55
IGC-L16M32     20    17.7M   3.37   19.31   1.63
IGC-L450M2     20    19.3M   3.30   19.00   -
IGC-L32M26     20    24.1M   3.31   18.75   1.56


Interleaved Structured Sparse Convolutional Neural Networks

Guotian Xie, Jingdong Wang, Ting Zhang, Jianhuang Lai, Richang Hong, and Guo-Jun Qi. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.


Guotian Xie, Jingdong Wang, et al. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.

small


[Figure sequence: Split / Conv. / Concat diagrams built up over four slides, with sub-blocks labeled IGCV1]


[Figure: Split, Conv. and Concat stages for three group convolutions W1, W2, W3 with permutations P1, P2, P3, mapping x to x′]

𝐿 (e.g., 3 or >3) group convolutions

$\mathbf{x}' = \mathbf{P}_3\mathbf{W}_3\mathbf{P}_2\mathbf{W}_2\mathbf{P}_1\mathbf{W}_1\mathbf{x}$

Dense due to complementary condition
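A small numerical sketch can make the density claim concrete for 𝐿 = 3. The setup is an assumption chosen for illustration, not taken from the slides: C = K^L channels with K = 2, each channel indexed by L base-K digits, and group convolution l mixing exactly those channels that agree on every digit except digit l. Each such matrix is a permuted block-diagonal matrix, which plays the role of one P·W factor; spatial dimensions are ignored and `group_mask` is a hypothetical helper.

```python
import numpy as np

def group_mask(C, K, l):
    """Boolean C x C mask: True where two channels differ at most in base-K digit l."""
    idx = np.arange(C)
    digit = (idx // K**l) % K
    rest = idx - digit * K**l          # channel index with digit l zeroed out
    return rest[:, None] == rest[None, :]

K, L = 2, 3
C = K ** L                             # 8 channels
rng = np.random.default_rng(0)

masks = [group_mask(C, K, l) for l in range(L)]
# Assumption: positive random weights, so that no entry cancels to zero.
mats = [np.where(m, rng.uniform(0.5, 1.5, size=(C, C)), 0.0) for m in masks]

composed = mats[2] @ mats[1] @ mats[0]

# Number of distinct paths from each input channel to each output channel.
paths = masks[2].astype(int) @ masks[1].astype(int) @ masks[0].astype(int)

sparse_params = sum(int(m.sum()) for m in masks)
print("weights in the three sparse factors:", sparse_params)
print("weights in a dense C x C kernel:", C * C)
print("composed kernel fully dense:", bool(np.count_nonzero(composed) == C * C))
print("paths per input/output pair:", np.unique(paths))
```

Under this construction every input/output pair is joined by a single path, so the product of the three sparse factors has no zero entries even though the factors together hold fewer weights than a dense kernel.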


How to design 𝐿 group convolutions

Design:

Balance condition: the total #parameters is minimum when three conditions hold:

1) #parameters for (𝐿 − 1) group convolutions are the same

2) #branches for (𝐿 − 1) group convolutions are the same

3) #channels in each branch are the same

#channels in each branch: $K_2 = \cdots = K_L = C^{\frac{1}{L-1}} = K$, where $C$ is the width

IGC block: 1 channel-wise 3 × 3 convolution + (𝐿 − 1) group 1 × 1 convolutions


Meet complementary and balance conditions


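How many group convolutions?

#parameters when the balance condition holds: $Q = 9C + (L-1)\,C \cdot C^{\frac{1}{L-1}}$, where the first term comes from the 3 × 3 convolution and the second from the 1 × 1 convolutions

$L = \log C + 1$ so that $Q$ is minimum

Benefit: Further redundancy reduction by composing more (𝐿 > 2) sparse matrices

$Q(L = 3) < Q(L = 2)$ if $C > 4$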
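The two claims about 𝑄 can be checked with a few lines of arithmetic. This sketch assumes the parameter count implied by the block description (one channel-wise 3 × 3 convolution with 9 weights per channel, plus 𝐿 − 1 group 1 × 1 convolutions with K = C^(1/(L−1)) channels per branch), that is Q = 9C + (L − 1)·C·C^(1/(L−1)), and it reads "log" as the natural logarithm. The function name is illustrative.

```python
import math

def igcv2_block_params(C, L):
    """Assumed count: 9*C for the channel-wise 3x3, plus (L-1) group 1x1 convolutions of C*K weights each."""
    if L < 2:
        raise ValueError("L must be at least 2")
    K = C ** (1.0 / (L - 1))       # channels per branch under the balance condition
    return 9 * C + (L - 1) * C * K

C = 64
q2 = igcv2_block_params(C, 2)
q3 = igcv2_block_params(C, 3)
best_L = min(range(2, 12), key=lambda L: igcv2_block_params(C, L))

print("Q(L=2) =", q2)
print("Q(L=3) =", q3)
print("integer L minimizing Q:", best_L)
print("log(C) + 1 =", math.log(C) + 1)
```

For C = 64 the integer minimizer agrees with log C + 1 after rounding, and at the boundary C = 4 the counts for 𝐿 = 2 and 𝐿 = 3 coincide, which matches the condition C > 4 for a strict reduction.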


Depth   IGCV1           IGCV2
8       66.52 ± 0.12    67.65 ± 0.29
20      70.87 ± 0.39    72.63 ± 0.07
26      71.82 ± 0.40    73.49 ± 0.34

Depth   IGCV1           IGCV2
8       0.046           0.046
20      0.151           0.144
26      0.203           0.193

Depth   IGCV1           IGCV2
8       1.00            1.18
20      2.89            3.20
26      3.83            4.21

+1.67 (IGCV2 over IGCV1 at depth 26)


Depth   IGCV1           IGCV2
8       48.57 ± 0.53    51.49 ± 0.33
20      54.83 ± 0.27    56.40 ± 0.17
26      56.64 ± 0.15    57.12 ± 0.09

Depth   IGCV1           IGCV2
8       0.046           0.046
20      0.151           0.144
26      0.203           0.193

Depth   IGCV1           IGCV2
8       1.00            1.18
20      2.89            3.20
26      3.83            4.21

+0.48 (IGCV2 over IGCV1 at depth 26)


                     21    38.6M   5.22   23.30   -
                     110   1.7M    5.52   28.02   46.5
                     56    1.7M    4.94   24.46   -
                     18    10.3M   5.01   22.90   -
                     34    21.4M   -      -       46.9
                     18    25.7M   -      -       44.6
                     32    7.4M    5.43   23.55   39.63
                     40    8.9M    4.53   21.18   -
DenseNet (k=12)      40    1.0M    -      -       39.09
DenseNet (k=12)      40    1.0M    5.24   24.42   -
DenseNet-BC (k=12)   100   0.8M    4.51   22.27   -
IGC-V2*-C416         20    0.65M   5.49   22.95   38.81


Guotian Xie, Jingdong Wang, et al. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.

• Complementary condition

• Strict → loose

small


Strict: there is one and only one path

Loose: there are more paths

rich


Strict

channels lie in different branches and

super-channels

Loose


Guotian Xie, Jingdong Wang, et al. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.

small

IGCV2


Model #Params. (M) FLOPS (M) Accuracy (%)

MobileNetV1-1.0 4.2 569 70.6

IGCV2-1.0 4.1 564 70.7

MobileNetV1-0.5 1.3 149 63.7

IGCV2-0.5 1.3 156 65.5

MobileNetV1-0.25 0.5 41 50.6

IGCV2-0.25 0.5 46 54.9


Interleaved Low-Rank Group Convolutions

Ke Sun, Mingjie Li, Dong Liu, and Jingdong Wang. IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. BMVC 2018.


Ke Sun, Mingjie Li, Dong Liu, and Jingdong Wang. IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. BMVC 2018.

• Low-rank structured sparse matrix composition

• Structured sparse + low-rank

small


[Figure: block over channels made of 1 × 1 Gconv., 3 × 3 Cconv., 1 × 1 Gconv.]


Ke Sun, Mingjie Li, Dong Liu, and Jingdong Wang. IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. BMVC 2018.

small

IGCV3


Summary

• Small

• Fast

• High


[1] Guotian Xie, Jingdong Wang, et al. IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. CVPR 2018.

[2] Ting Zhang, Guo-Jun Qi, Bin Xiao, and Jingdong Wang. Interleaved Group Convolutions for Deep Neural Networks. ICCV 2017.

[3] Liming Zhao, Mingjie Li, Depu Meng, Xi Li, Zhuowen Tu, and Jingdong Wang. Deep Convolutional Neural Networks with Merge-and-Run Mappings. IJCAI 2018.

[4] Guotian Xie, Ting Zhang, Kuiyuan Yang, Jianhuang Lai, and Jingdong Wang. Decoupled Convolutions for CNNs. AAAI 2018.

[5] Ke Sun, Mingjie Li, Dong Liu, and Jingdong Wang. IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. BMVC 2018.

[6] Jingdong Wang, Zhen Wei, and Ting Zhang. Deeply-fused nets. 2016.


• Liming Zhao

• Ting Zhang

• Zhen Wei


Thanks!

Q&A

Homepage:

https://jingdongwang2017.github.io/


Deep fusion code:

https://github.com/zlmzju/fusenet

IGC Code:

https://github.com/homles11/IGCV3