CANN V100R020C20 TensorFlow Parser Scope Fusion Rules Reference Issue 01 Date 2021-02-08 HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The HUAWEI logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Issue 01 (2021-02-08) Copyright © Huawei Technologies Co., Ltd. i


Contents

1 Introduction

2 Fusion Rules
2.1 ScopeLayerNormPass
2.2 ScopeLayerNormGradPass
2.3 ScopeBasicLSTMCellPass
2.4 ScopeDynamicLSTMPass
2.5 ScopeClipBoxesPass
2.6 ScopeROIAlignPass
2.7 ScopeRpnProposalsPass
2.8 ScopeFastrcnnPredictionsPass
2.9 ScopeDecodeBboxPass
2.10 ScopeToAbsoluteBBoxPass
2.11 ScopeNormalizeBBoxPass
2.12 ScopeDecodeBboxV2Pass
2.13 ScopeBatchMultiClassNMSPass
2.14 ScopeKeepRatioResizeBilinearPass
2.15 ScopeBatchMultiClassNonMaxSuppressionPass
2.16 ScopeDynamicGRUPass
2.17 ScopeDynamicRNNPass


1 Introduction

Overview

Scope fusion is a scope-based fusion capability that replaces the small operators in a scope with one larger operator or a combination of operators to improve efficiency.

This document describes the built-in scope fusion patterns. A set of APIs is also available for developers to customize scope fusion patterns. For details about the APIs, see TensorFlow Parser Scope Fusion Pattern Developer Guide.
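The idea of replacing the small operators of a scope with one operator can be illustrated with a toy sketch. This is not CANN code: it models a graph as a list of (node name, operator type) pairs, treats the scope as the name prefix, and collapses every node under that prefix into one fused node. The function name fuse_scope and the sample node names are assumptions made for illustration only.

```python
def fuse_scope(nodes, scope, fused_type):
    """Replace every node whose name lies under `scope` with one fused node.

    `nodes` is a list of (name, op_type) pairs. The fused node takes the
    position of the first matched node; unmatched nodes keep their order.
    """
    scope_name = scope.rstrip("/")
    prefix = scope_name + "/"
    result = []
    emitted = False
    for name, op_type in nodes:
        if name.startswith(prefix):
            if not emitted:
                result.append((scope_name, fused_type))
                emitted = True
            continue  # drop the small operator that lives inside the scope
        result.append((name, op_type))
    return result


# Hypothetical node names, loosely modeled on a layer-normalization scope.
graph = [
    ("input", "Placeholder"),
    ("layernorm/moments/mean", "Mean"),
    ("layernorm/moments/variance", "Mean"),
    ("layernorm/batchnorm/Rsqrt", "Rsqrt"),
    ("layernorm/batchnorm/mul", "Mul"),
    ("layernorm/batchnorm/add_1", "Add"),
    ("output", "Identity"),
]

fused_graph = fuse_scope(graph, "layernorm", "LayerNorm")
```

A real fusion pattern also checks the operator types and counts inside the scope before fusing, as the pattern descriptions in this document show. The sketch skips that matching step.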

Built-in Fusion Pattern Summary

Table 1-1 Built-in scope fusion patterns

| No. | Fusion Pattern | Description | Applicable Network | Classification |
|---|---|---|---|---|
| 1 | ScopeLayerNormPass | Fuses the layernorm/batchnorm and layernorm/moments scopes generated by tf.layernorm into a LayerNorm operator. | BERT | General fusion pattern |
| 2 | ScopeLayerNormGradPass | Fuses the backward scope of tf.layernorm into a LayerNormGrad operator. | BERT | General fusion pattern |
| 3 | ScopeBasicLSTMCellPass | Fuses the small operators within the scope generated by tf.nn.rnn_cell.BasicLSTMCell into a BasicLSTMCell operator. | Non-loop inference network that uses a single BasicLSTMCell, such as NMT | Non-general fusion pattern |
| 4 | ScopeDynamicLSTMPass | Fuses the small operators within the scope generated by tf.nn.dynamic_rnn or tf.nn.bidirectional_dynamic_rnn into a DynamicLSTM operator. Currently, only the loop scenario where the cell is a BasicLSTMCell is supported, and only some shapes are supported. | Inference network that uses dynamic_rnn and a single BasicLSTMCell | Non-general fusion pattern |
| 5 | ScopeClipBoxesPass | Fuses the clip_boxes scope into a ClipBoxes operator. The scope includes the tf.Maximum, tf.ReverseV2, tf.Tile, and tf.Minimum operators, and does not include the Gather_2, TopKV2, Reshape_2, Split, Greater, Squeeze, Gather, boolean_mask, and decode_bbox_target operators. | 2D-H1 | Non-general fusion pattern |
| 6 | ScopeROIAlignPass | Fuses tf.AvgPool and tf.image.CropAndResize into an ROIAlign operator, excluding the Merge operator. | 2D-H1 | Non-general fusion pattern |
| 7 | ScopeRpnProposalsPass | Fuses the generate_rpn_proposals scope into an RpnProposals operator. The scope includes tf.NonMaxSuppressionV2 operators, tf.TopKV2 operators, a multiple of four tf.Where operators, and a multiple of six tf.Gather operators. The ExpandDims, Switch, and transpose operators are not included. | 2D-H1 | Non-general fusion pattern |
| 8 | ScopeFastrcnnPredictionsPass | Fuses the fastrcnn_predictions scope into a FastrcnnPredictions operator. The scope includes a multiple of two tf.TopKV2 operators, a multiple of three tf.Where operators, tf.NonMaxSuppressionV2 operators, tf.Less operators, and tf.LoopCond operators. The ExpandDims, clip_boxes, and decode_bbox_target operators are excluded. | 2D-H1 | Non-general fusion pattern |
| 9 | ScopeDecodeBboxPass | Fuses a scope containing the following operators into a DecodeBbox operator: a multiple of three tf.Reshape operators, a multiple of two tf.Split operators, tf.Minimum operators, a multiple of three tf.Add operators, tf.ConcatV2 operators, and a multiple of two tf.Sub operators. The Greater, Squeeze, Gather_2, TopKV2, and boolean_mask operators are excluded. | 2D-H1 | Non-general fusion pattern |
| 10 | ScopeToAbsoluteBBoxPass | Fuses a scope into a ToAbsoluteBBox operator. The scope contains a map that has a while node, ToAbsoluteCoordinates under the while node, and Scale under ToAbsoluteCoordinates. The scope also contains four Mul operators. | Fast R-CNN | Non-general fusion pattern |
| 11 | ScopeNormalizeBBoxPass | Fuses a scope into a NormalizeBBox operator. The scope contains a map that has a while node, ToNormalizedCoordinates under the while node, and Scale under ToNormalizedCoordinates. The scope also contains four Mul operators. | Fast R-CNN | Non-general fusion pattern |
| 12 | ScopeDecodeBboxV2Pass | Fuses either of the following two scopes into a DecodeBboxV2 operator. Scope 1 contains at least two Exp operators, four Mul operators, four Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, and three transpose operators, excluding the Softmax operator. Scope 2 contains at least two Exp operators, four Mul operators, 10 Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, three transpose operators, three Rank operators, and three Range operators, excluding the Sigmoid operator. | Fast R-CNN, SSD-Resnet34, SSD-Resnet50V1-FPN | Non-general fusion pattern |
| 13 | ScopeBatchMultiClassNMSPass | Fuses the following scope into a BatchMultiClassNonMaxSuppression operator. The scope path is map/while/MultiClassNonMaxSuppression/. | Fast R-CNN, SSD-Resnet50V1-FPN, Mask R-CNN | Non-general fusion pattern |
| 14 | ScopeKeepRatioResizeBilinearPass | Fuses a specific scope into a combination of KeepRatioResizeBilinear operators, Shape operators, a multiple of two Slice operators, ExpandDims operators, ConcatV2 operators, Tile operators, and a multiple of four Const operators. The scope includes the graph structure map/while/ResizeToRange/ and the operators Maximum, Minimum, Round, and ResizeBilinear. | Fast R-CNN | Non-general fusion pattern |
| 15 | ScopeBatchMultiClassNonMaxSuppressionPass | Fuses the specified scope into a BatchMultiClassNonMaxSuppression operator when any of the following four fusion patterns is matched. Scope1 contains one NonMaxSuppressionV3 operator and excludes the transpose operator. Scope2 contains one NonMaxSuppressionV3 operator, five Range operators, one ConcatV2 operator, and 80 Fill operators. | FaceBox, Retinanet | Non-general fusion pattern |
| 16 | ScopeDynamicGRUPass | Fuses a scope into a DynamicGRU operator. The scope includes five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators. | DeepSpeech2 | Non-general fusion pattern |
| 17 | ScopeDynamicRNNPass | Fuses a scope into a DynamicRNN operator. The scope contains a while subscope and does not contain Transpose operators. The while subscope contains a multiple of four BiasAdd operators, a multiple of two Tanh operators, eight MatMul operators, and a Split operator. | TacoTron | Non-general fusion pattern |

General and Non-General Fusion Patterns

● General fusion patterns are applicable to all networks. They are enabled by default and cannot be manually disabled.

● Non-general fusion patterns are applicable to specific networks. By default, they are disabled. You can enable the non-general fusion patterns as required.


Table 1-2 Enabling a non-general fusion pattern

Use case: TensorFlow model building using ATC

Method: Use the model conversion command parameter enable_scope_fusion_passes to specify the fusion patterns that need to take effect. Separate multiple fusion patterns with commas (,).

    --enable_scope_fusion_passes=DecodeBboxV2ScopeFusionPass

Use case: Execution in the TensorFlow framework

Method: Use the running configuration parameter enable_scope_fusion_passes of the TensorFlow framework to specify the fusion patterns that need to take effect. Separate multiple fusion patterns with commas (,).

    import tensorflow as tf
    from npu_bridge.estimator import npu_ops
    from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig

    config = tf.ConfigProto()
    custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
    custom_op.name = "NpuOptimizer"
    custom_op.parameter_map["use_off_line"].b = True
    # Comma-separated list of the fusion patterns to enable.
    custom_op.parameter_map["enable_scope_fusion_passes"].s = tf.compat.as_bytes("DecodeBboxV2ScopeFusionPass")
    config.graph_options.rewrite_options.remapping = RewriterConfig.OFF

    with tf.Session(config=config) as sess:
        # Placeholder: supply the fetches of your own graph.
        sess.run()
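When several non-general patterns are needed, the comma-separated value can be assembled programmatically. The following is a small sketch under stated assumptions: the helper names build_scope_fusion_value and build_atc_flag are hypothetical, and AnotherScopeFusionPass is a placeholder name rather than a real pattern. Only the parameter name enable_scope_fusion_passes and the pattern name DecodeBboxV2ScopeFusionPass come from this document.

```python
def build_scope_fusion_value(pass_names):
    """Join fusion pattern names into one comma-separated value."""
    cleaned = [name.strip() for name in pass_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one fusion pattern name is required")
    return ",".join(cleaned)


def build_atc_flag(pass_names):
    """Format the value as the model conversion command parameter."""
    return "--enable_scope_fusion_passes=" + build_scope_fusion_value(pass_names)


# "AnotherScopeFusionPass" is a placeholder, not a pattern defined by CANN.
flag = build_atc_flag(["DecodeBboxV2ScopeFusionPass", " AnotherScopeFusionPass "])
```

The same joined value can be passed through tf.compat.as_bytes when the TensorFlow running configuration is used instead of the command line.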


2 Fusion Rules


2.1 ScopeLayerNormPass

Description

Fuses the layernorm/batchnorm and layernorm/moments scopes generated by tf.layernorm into a LayerNorm operator.


Scope Details

batchnorm unfolded:

moments unfolded:


Result Operator Prototype

LayerNorm. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

When there are Cast nodes, the input of the first Cast node is used as the first input x of the fused operator.

The input gamma of the Mul node is used as the second input gamma of the fused operator.

The input beta of the last Add node is used as the third input beta of the fused operator.

The fourth parameter, begin_norm_axis, uses the default value 1.

The fifth parameter, begin_param_axis, uses the default value -1.
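The computation that the two scopes perform together can be sketched in plain Python. The sketch below is a reference for the arithmetic only, assuming a 2-D input so that begin_norm_axis = 1 and begin_param_axis = -1 both refer to the last axis. The epsilon value and the function names are assumptions, and the exact operator order inside the TensorFlow scopes may differ.

```python
import math


def moments(row):
    """Mean and population variance of one row (the moments scope)."""
    mean = sum(row) / len(row)
    variance = sum((v - mean) ** 2 for v in row) / len(row)
    return mean, variance


def layer_norm(x, gamma, beta, epsilon=1e-12):
    """Normalize each row of a 2-D list, then scale by gamma and shift by beta."""
    result = []
    for row in x:
        mean, variance = moments(row)
        inv_std = 1.0 / math.sqrt(variance + epsilon)  # the batchnorm scope
        result.append([(v - mean) * inv_std * g + b
                       for v, g, b in zip(row, gamma, beta)])
    return result


y = layer_norm([[1.0, 2.0, 3.0], [2.0, 2.0, 8.0]],
               gamma=[1.0, 1.0, 1.0],
               beta=[0.0, 0.0, 0.0])
```

With gamma set to ones and beta set to zeros, every output row has a mean of approximately zero, which is a quick way to sanity-check a fused result against the unfused graph.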

2.2 ScopeLayerNormGradPass

Description

Fuses the backward scope of tf.layernorm into a LayerNormGrad operator.


Scope Details

mul_grad unfolded:

sub_grad unfolded:

Result Operator Prototype

LayerNormGrad. For details, see CANN Operator List (Ascend 310).


Fusion Mapping

The backward input of LayerNorm is used as the first input dy of the fused operator.

The forward input of LayerNorm is used as the second input x of the fused operator.

The third forward output variance is used as the third backward input variance.

The second forward output mean is used as the fourth backward input mean.

The second forward input gamma is used as the fifth backward input gamma.

The first backward output connects to the output of the last AddN node in the backward graph.

The second backward output gamma_backprop connects to the output of the Mul node in mul_grad that connects to a Cast node.

The third backward output beta_backprop connects to the output of the Sum node in sub_grad that connects to a Cast node.

2.3 ScopeBasicLSTMCellPass

Description

Fuses the small operators within the scope generated by tf.nn.rnn_cell.BasicLSTMCell into a BasicLSTMCell operator.

Scope Details


Result Operator Prototype

BasicLSTMCell. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

Input 1 of the concat operator is used as input 1 x after fusion.

Input 2 of the concat operator is used as input 2 h after fusion.

Input 1 of the Mul operator is used as input 3 c after fusion.

Input 2 of the MatMul operator is used as input 4 w after fusion.

Input 2 of the BiasAdd operator is used as input 5 b after fusion.

The output of Add_1 is used as output 0 ct after fusion.

The output of Mul_2 is used as output 1 ht after fusion.
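The mapping above follows the data flow of one LSTM step: concat, MatMul, BiasAdd, a four-way split into gates, and the element-wise updates that produce ct and ht. The following sketch reproduces that arithmetic for a single sample in plain Python. The gate order (i, j, f, o) and the forget_bias default of 1.0 reflect the TensorFlow 1.x BasicLSTMCell as commonly implemented and are assumptions here. This document does not state the internal layout of the fused operator.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def basic_lstm_cell(x, h, c, w, b, forget_bias=1.0):
    """One LSTM step for a single sample. Returns (ct, ht).

    x: input vector, h: previous hidden state, c: previous cell state,
    w: weights with len(x) + len(h) rows and 4 * len(h) columns,
    b: bias with 4 * len(h) entries.
    """
    units = len(h)
    xh = list(x) + list(h)                                   # concat
    gates = [sum(xh[row] * w[row][col] for row in range(len(xh))) + b[col]
             for col in range(4 * units)]                    # MatMul + BiasAdd
    i, j, f, o = (gates[n * units:(n + 1) * units] for n in range(4))  # split
    ct = [c[n] * sigmoid(f[n] + forget_bias) + sigmoid(i[n]) * math.tanh(j[n])
          for n in range(units)]                             # Mul, Mul_1, Add_1
    ht = [math.tanh(ct[n]) * sigmoid(o[n]) for n in range(units)]  # Mul_2
    return ct, ht


# Zero weights and biases make the result easy to verify by hand.
ct, ht = basic_lstm_cell(x=[1.0], h=[0.5], c=[2.0],
                         w=[[0.0] * 4, [0.0] * 4], b=[0.0] * 4)
```

With all weights at zero, ct reduces to c multiplied by sigmoid(forget_bias), and ht to half of tanh(ct), which makes the two outputs named in the mapping easy to check.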

2.4 ScopeDynamicLSTMPass

Description

Fuses the small operators within the scope generated by tf.nn.dynamic_rnn or tf.nn.bidirectional_dynamic_rnn into a DynamicLSTM operator. Currently, only the loop scenario where the cell is a BasicLSTMCell is supported, and only some shapes are supported.

Scope Details

The scope structure corresponding to dynamic_rnn is as follows.

For bidirectional_dynamic_rnn, the scope contains two dynamic_rnn structures, FW and BW.


Result Operator Prototype

DynamicLSTM. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

When time_major is set to False:

Input 1 of the rnn/transpose node is used as input 1 x after fusion.

The input of the rnn/while/basic_lstm_cell/MatMul/Enter node is used as input 2 w after fusion.

The input of the rnn/while/basic_lstm_cell/BiasAdd/Enter node is used as input 3 b after fusion.

The output of the rnn/transpose_1 node is used as the output output_h after fusion.

When time_major is set to True:

Input 3 of the rnn/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 node is used as input 1 x after fusion.

The input of the rnn/while/basic_lstm_cell/MatMul/Enter node is used as input 2 w after fusion.

The input of the rnn/while/basic_lstm_cell/BiasAdd/Enter node is used as input 3 b after fusion.

The output of the rnn/TensorArrayStack/TensorArrayGatherV3 node is used as the output output_h after fusion.

NOTE

In the preceding scope example, time_major is set to True.
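The two mappings differ because, when time_major is False, the graph contains an rnn/transpose node that converts the batch-major input to time-major layout before the loop, and an rnn/transpose_1 node that converts the output back. The following sketch shows that layout change on nested lists. The function name to_time_major is an assumption used only for illustration.

```python
def to_time_major(batch_major):
    """Convert [batch, time, feature] nested lists to [time, batch, feature]."""
    return [list(step) for step in zip(*batch_major)]


# Two samples, three time steps, two features per step.
batch_major = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]],
]

time_major = to_time_major(batch_major)
```

Applying the same swap to time_major returns the original batch-major layout, which corresponds to the role of rnn/transpose_1 on the output side.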

2.5 ScopeClipBoxesPass

Description

Fuses the clip_boxes scope into a ClipBoxes operator. The scope includes the tf.Maximum, tf.ReverseV2, tf.Tile, and tf.Minimum operators, and does not include the Gather_2, TopKV2, Reshape_2, Split, Greater, Squeeze, Gather, boolean_mask, and decode_bbox_target operators.

Scope Details

Result Operator Prototype

ClipBoxes. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The input of clip_boxes/Maximum is used as the first input boxes_input of the fused operator.

The input of clip_boxes/ReverseV2 is used as the second input im_info of the fused operator.

The output of clip_boxes/fastrcnn_all_boxes (Minimum) is used as the output boxes_output of the fused operator.

The output of clip_boxes/ReverseV2 is used as the input of clip_boxes/Tile.

The output of clip_boxes/Tile is used as the input of clip_boxes/ToFloat.

The output of clip_boxes/ToFloat is used as the second input of clip_boxes/fastrcnn_all_boxes.

The output of clip_boxes/Maximum is used as the first input of clip_boxes/fastrcnn_all_boxes.
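The chain Maximum, ReverseV2, Tile, ToFloat, and Minimum amounts to clipping box coordinates to the image boundary. The sketch below reproduces that chain in plain Python. It assumes that boxes are stored as [x1, y1, x2, y2] and that im_info holds [height, width]. Neither layout is stated in this document, and the function name is chosen for illustration.

```python
def clip_boxes(boxes, im_info):
    """Clip [x1, y1, x2, y2] boxes to an image given as im_info = [height, width]."""
    # Maximum: no coordinate may be negative.
    clipped = [[max(v, 0.0) for v in box] for box in boxes]
    # ReverseV2 turns [height, width] into [width, height]; Tile repeats it to
    # [width, height, width, height]; ToFloat converts it to floating point.
    limits = [float(v) for v in list(reversed(im_info)) * 2]
    # Minimum: no coordinate may exceed the image size.
    return [[min(v, m) for v, m in zip(box, limits)] for box in clipped]


result = clip_boxes([[-5.0, 10.0, 700.0, 500.0]], [480, 640])
```

In this example the x coordinates are limited to the range 0 to 640 and the y coordinates to the range 0 to 480.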


2.6 ScopeROIAlignPass

Description

Fuses tf.AvgPool and tf.image.CropAndResize into an ROIAlign operator, excluding the Merge operator.

Scope Details

Result Operator Prototype

ROIAlign. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The input of crop_and_resize/Shape/Switch is used as the first input features of the fused operator.

The input of Shape is used as the second input rois of the fused operator.

The output of AvgPool is used as the output y of the fused operator.

The output of Shape is used as the input of StridedSlice.


The output of StridedSlice is used as the input of zeros.

The output of crop_and_resize/Shape/Switch is used as the input of crop_and_resize/Shape/Shape and crop_and_resize/transpose.

The output of crop_and_resize/Shape/Shape is used as the input of crop_and_resize/strided_slice.

The output of crop_and_resize/strided_slice is used as the input of crop_and_resize/transform_fpcoor_for_tf.

The outputs of crop_and_resize/transform_fpcoor_for_tf and crop_and_resize/transpose are used as the input of CropAndResize.

The output of crop_and_resize or CropAndResize is used as the input of crop_and_resize/transpose_1.

The output of crop_and_resize/transpose_1 is used as the input of AvgPool.

2.7 ScopeRpnProposalsPass

Description

Fuses the generate_rpn_proposals scope into an RpnProposals operator. The scope includes tf.NonMaxSuppressionV2 operators, tf.TopKV2 operators, a multiple of four tf.Where operators, and a multiple of six tf.Gather operators. The ExpandDims, Switch, and transpose operators are not included.


Scope Details

Result Operator Prototype

RpnProposals. For details, see CANN Operator List (Ascend 310).


Fusion Mapping

The input of transpose is used as the first input rois of the fused operator.

The inputs of filtered_boxes, filtered_scores, and Gather_1 are used as the second input cls_bg_prob of the fused operator.

The input of clip_boxes/ReverseV2 is used as the third input img_info of the fused operator.

The output of boxes is used as the output sorted_box of the fused operator.

The output of filtered_boxes is used as the input of Where.

The outputs of transpose and Where are used as the input of Gather.

The output of Gather is used as the input of Reshape.

The output of filtered_scores is used as the input of Where_1.

The output of Where_1 is used as the input of Gather_1.

The output of Gather_1 is used as the input of Reshape_1.

The output of Reshape_1 is used as the input of TopKV2 and Size.

The output of Size is used as the input of Minimum.

The output of Minimum is used as the input of TopKV2.

The output of TopKV2 is used as the input of Gather_2 and boolean_mask_1.

The output of Gather_2 is used as the input of clip_boxes/Maximum.

The output of clip_boxes/ReverseV2 is used as the input of clip_boxes/Tile.

The output of clip_boxes/Tile is used as the input of clip_boxes/ToFloat.

The outputs of clip_boxes/Maximum and clip_boxes/ToFloat are used as the input of clip_boxes/Minimum.

The output of clip_boxes/Minimum is used as the input of Reshape_2.

The output of Reshape_2 is used as the input of boolean_mask and split.

The output of Split is used as the input of sub.

The output of Sub is used as the input of Squeeze.

The output of Squeeze is used as the input of Greater.

The output of Greater is used as the input of All.

The output of All is used as the input of boolean_mask and boolean_mask_1.

The output of boolean_mask is used as the inputs of Reshape_3 and ReverseV2.

The output of ReverseV2 is used as the input of nms_input_boxes.

The output of nms_input_boxes is used as the input of non_max_suppression.

The output of boolean_mask_1 is used as the input of non_max_suppression.

The outputs of non_max_suppression and Reshape_3 are used as the input of Boxes.
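Read as a whole, the mapping above describes a proposal pipeline: keep the top-scoring boxes (TopKV2), drop degenerate boxes (Split, Sub, Squeeze, Greater, All, boolean_mask), and apply non-maximum suppression. The following plain-Python sketch illustrates that flow under stated assumptions. Boxes are [x1, y1, x2, y2], the clipping step is omitted because it is covered in 2.5, and all function and parameter names are hypothetical rather than taken from the RpnProposals operator prototype.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def rpn_proposals(boxes, scores, pre_nms_topk, post_nms_topk,
                  nms_threshold, min_size=0.0):
    """Select proposals: top-k by score, size filter, then greedy NMS."""
    # TopKV2: indices of the highest-scoring boxes, best first.
    order = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)
    order = order[:pre_nms_topk]
    # Size filter: keep boxes whose width and height both exceed min_size.
    order = [n for n in order
             if boxes[n][2] - boxes[n][0] > min_size
             and boxes[n][3] - boxes[n][1] > min_size]
    # Greedy NMS: keep a box only if it does not overlap a kept box too much.
    keep = []
    for n in order:
        if all(iou(boxes[n], boxes[k]) <= nms_threshold for k in keep):
            keep.append(n)
        if len(keep) == post_nms_topk:
            break
    return [boxes[n] for n in keep], [scores[n] for n in keep]


sample_boxes = [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0],
                [20.0, 20.0, 30.0, 30.0], [5.0, 5.0, 5.0, 9.0]]
sample_scores = [0.9, 0.8, 0.7, 0.95]
kept_boxes, kept_scores = rpn_proposals(sample_boxes, sample_scores,
                                        pre_nms_topk=4, post_nms_topk=2,
                                        nms_threshold=0.5)
```

In the sample, the zero-width box is dropped by the size filter and the second box is suppressed because it overlaps the first too heavily, so two proposals remain.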


2.8 ScopeFastrcnnPredictionsPass

Description

Fuses the fastrcnn_predictions scope into a FastrcnnPredictions operator. The scope includes a multiple of two tf.TopKV2 operators, a multiple of three tf.Where operators, tf.NonMaxSuppressionV2 operators, tf.Less operators, and tf.LoopCond operators. The ExpandDims, clip_boxes, and decode_bbox_target operators are excluded.


Scope Details


Result Operator Prototype

FastrcnnPredictions. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The inputs of fastrcnn_predictions/transpose and fastrcnn_predictions/GatherNd are used as the input rois after fusion.

The input of fastrcnn_predictions/strided_slice is used as the input score after fusion.

The output of fastrcnn_predictions/TopKV2 is used as the output sorted_rois after fusion.

The output of fastrcnn_predictions/GatherNd is used as the output sorted_scores after fusion.

The output of fastrcnn_predictions/Add is used as the output sorted_classes after fusion.

The output of fastrcnn_predictions/strided_slice is used as the input of fastrcnn_predictions/transpose_1.

The output of fastrcnn_predictions/transpose_1 is used as the inputs of fastrcnn_predictions/map and fastrcnn_predictions/boolean_mask.

The output of fastrcnn_predictions/map is used as the input of fastrcnn_predictions/Where.

The output of fastrcnn_predictions/Where is used as the input of fastrcnn_predictions/Gather.

The output of fastrcnn_predictions/boolean_mask is used as the inputs of fastrcnn_predictions/Size and fastrcnn_predictions/TopKV2.

The output of fastrcnn_predictions/Size is used as the input of fastrcnn_predictions/Minimum.

The output of fastrcnn_predictions/Minimum is used as the input of fastrcnn_predictions/TopKV2.

The output of fastrcnn_predictions/TopKV2 is used as the output sorted_rois after fusion and as the input of fastrcnn_predictions/Gather.

The output of fastrcnn_predictions/Gather is used as the input of fastrcnn_predictions/filtered_indices.

The output of fastrcnn_predictions/filtered_indices is used as the inputs of fastrcnn_predictions/GatherNd and fastrcnn_predictions/ToFloat.

The output of fastrcnn_predictions/ToFloat is used as the input of fastrcnn_predictions/strided_slice_1.

The output of fastrcnn_predictions/strided_slice_1 is used as the input of fastrcnn_predictions/Add.


2.9 ScopeDecodeBboxPass

Description

Fuses a scope containing the following operators into a DecodeBbox operator: a multiple of three tf.Reshape operators, a multiple of two tf.Split operators, tf.Minimum operators, a multiple of three tf.Add operators, tf.ConcatV2 operators, and a multiple of two tf.Sub operators. The Greater, Squeeze, Gather_2, TopKV2, and boolean_mask operators are excluded.

Scope Details

There are two types of scopes, depending on whether the transpose operator is included.

Scope without the transpose operator:

Scope with the transpose operator:


Result Operator Prototype

DecodeBbox. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

● If the transpose operator is not included:

The input of Reshape is used as the input box_predictions after fusion.

The inputs of Shape and Reshape_1 are used as the input anchors after fusion.

The output of Reshape_2 is used as the output decoded_boxes after fusion.

The output of Reshape is used as the input of Split.

The output of Split is used as the inputs of Minimum and Mul.

The output of Minimum is used as the input of Exp.

The output of Exp is used as the input of Mul.

The output of Reshape_1 is used as the input of split_1.

The output of split_1 is used as the inputs of Sub and Add.

The outputs of Sub and Add are used as the input of Mul.

The output of Mul is used as the inputs of Add_1, Sub_1, and Add_2.

The output of Add_1 is used as the inputs of Sub_1 and Add_2.

The outputs of Sub_1 and Add_2 are used as the input of concat.

The outputs of Shape and concat are used as the input of Reshape_2.

Operators such as Greater, Squeeze, Gather_2, TopKV2, and boolean_mask are excluded.

● If the transpose operator is included:

The input of transpose is used as the input box_predictions after fusion.

The input of transpose_1 is used as the input anchors after fusion.

The output of transpose_2 is used as the output decoded_boxes after fusion.

The output of transpose is used as the input of Reshape.

The output of Reshape is used as the input of Split.

The output of Split is used as the inputs of Minimum and Mul.

The output of Minimum is used as the input of Exp.

The output of Exp is used as the input of Mul.

The output of transpose_1 is used as the inputs of Reshape_1 and Shape.

The output of Reshape_1 is used as the input of split_1.

The output of split_1 is used as the inputs of Sub and Add.

The outputs of Sub and Add are used as the input of Mul.

The output of Mul is used as the inputs of Add_1, Sub_1, and Add_2.

The output of Add_1 is used as the inputs of Sub_1 and Add_2.

The outputs of Sub_1 and Add_2 are used as the input of concat.

The outputs of Shape and concat are used as the input of Reshape_2.

The output of Reshape_2 is used as the input of transpose_2.
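The operator chain in this mapping (Split, Minimum, Exp, Sub, Add, Mul, concat) matches the usual way of decoding box regression outputs against anchors. The following plain-Python sketch decodes a single box under stated assumptions: the prediction is (tx, ty, tw, th), the anchor is [x1, y1, x2, y2], and clip_value bounds tw and th before the exponential. These layouts and names are assumptions, and the node names in the comments are only a rough guide to the correspondence.

```python
import math


def decode_bbox(box_prediction, anchor, clip_value):
    """Decode one (tx, ty, tw, th) prediction against one [x1, y1, x2, y2] anchor."""
    txty, twth = box_prediction[:2], box_prediction[2:]       # Split
    a_x1y1, a_x2y2 = anchor[:2], anchor[2:]                    # split_1
    waha = [hi - lo for lo, hi in zip(a_x1y1, a_x2y2)]         # Sub: anchor size
    xaya = [(hi + lo) * 0.5 for lo, hi in zip(a_x1y1, a_x2y2)]  # Add: anchor center
    # Minimum, Exp, Mul: bounded exponential scaling of the anchor size.
    wbhb = [math.exp(min(t, clip_value)) * s for t, s in zip(twth, waha)]
    # Mul, Add: shift the anchor center by the predicted offset.
    xbyb = [t * s + c for t, s, c in zip(txty, waha, xaya)]
    x1y1 = [c - s * 0.5 for c, s in zip(xbyb, wbhb)]           # Sub
    x2y2 = [c + s * 0.5 for c, s in zip(xbyb, wbhb)]           # Add
    return x1y1 + x2y2                                         # concat


anchor = [10.0, 20.0, 30.0, 60.0]
unchanged = decode_bbox([0.0, 0.0, 0.0, 0.0], anchor, clip_value=math.log(2.0))
widened = decode_bbox([0.0, 0.0, 100.0, 0.0], anchor, clip_value=math.log(2.0))
```

A zero prediction returns the anchor itself, and an oversized tw is capped by clip_value, so the decoded width in the second call is at most twice the anchor width.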

2.10 ScopeToAbsoluteBBoxPass

Description

Fuses a scope into the ToAbsoluteBBox operator. The scope contains a map that has a while node, ToAbsoluteCoordinates under the while node, and Scale under ToAbsoluteCoordinates. In addition, there are four Mul operators under the scope.

Scope Details

while exists under map_1:

ToAbsoluteCoordinates exists under while:

Scale exists under ToAbsoluteCoordinates:

Result Operator Prototype

ToAbsoluteBBox. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The input of Shape or TensorArrayUnstack/Shape is used as the first input of the fused operator.

The input of while/strided_slice/Enter is used as the second input of the fused operator.

The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.

The fused scope contains the while/ToAbsoluteCoordinates/Scale subgraph structure.
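As a reading aid, the boundary nodes in this mapping can be expressed as a lookup from node name to a port of the fused operator. The sketch below is an assumption-laden illustration, not the parser's data structure: BOUNDARY_PORTS and fused_port are hypothetical names, port indexes are zero-based by choice, and node names are assumed to be given relative to the scope root (map_1 in the example).

```python
# Boundary nodes from the Fusion Mapping, relative to the scope root.
# The (direction, index) tuples are a hypothetical encoding; indexes are zero-based.
BOUNDARY_PORTS = {
    "Shape": ("input", 0),
    "TensorArrayUnstack/Shape": ("input", 0),
    "while/strided_slice/Enter": ("input", 1),
    "TensorArrayStack/TensorArrayGatherV3": ("output", 0),
}


def fused_port(node_name, scope):
    """Return the fused-operator port fed by (or fed from) `node_name`, or None.

    `scope` is the name of the scope being fused; nodes outside it never match.
    """
    prefix = scope.rstrip("/") + "/"
    if not node_name.startswith(prefix):
        return None
    return BOUNDARY_PORTS.get(node_name[len(prefix):])


print(fused_port("map_1/while/strided_slice/Enter", "map_1"))
print(fused_port("map_1/TensorArrayStack/TensorArrayGatherV3", "map_1"))
```

Nodes inside the scope that are not boundary nodes, such as those under while/ToAbsoluteCoordinates/Scale, return None because they are absorbed into the fused operator.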

2.11 ScopeNormalizeBBoxPass

Description

Fuses a scope into the NormalizeBBox operator. The scope contains a map that has a while node, ToNormalizedCoordinates under the while node, and Scale under ToNormalizedCoordinates. In addition, there are four Mul operators under the scope.

Scope Details

while exists under map:

Result Operator Prototype

NormalizeBBox. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The input of map/shape is used as the first input of the fused operator.

The input of map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.

The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.

The scope has the while/ToNormalizedCoordinates/Scale/ graph structure.

2.12 ScopeDecodeBboxV2Pass

Description

Fuses the following two scopes into a DecodeBboxV2 operator:

Scope 1 contains at least two Exp operators, four Mul operators, four Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, and three transpose operators, excluding the Softmax operator.

Scope 2 contains at least two Exp operators, four Mul operators, 10 Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, three transpose operators, three Rank operators, and three Range operators, excluding the Sigmoid operator.
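The two scope descriptions are rules over operator-type counts, which can be sketched as a simple counting check. The Python below is a minimal illustration under stated assumptions, not the parser's matching logic: the rule encoding (min, multiple_of, exclude) and the names SCOPE1_RULES, SCOPE2_RULES, and matches are hypothetical; "at least" is read as applying to every listed count; and "a multiple of two" is read as a nonzero multiple. Operator type names are spelled as in the text.

```python
from collections import Counter

# (operator type, rule kind, value). The encoding is hypothetical.
# Assumption: every listed count is a minimum ("at least").
SCOPE1_RULES = [
    ("Exp", "min", 2), ("Mul", "min", 4), ("Sub", "min", 4),
    ("RealDiv", "multiple_of", 2), ("Unpack", "min", 2),
    ("Pack", "min", 1), ("transpose", "min", 3),
    ("Softmax", "exclude", 0),
]
SCOPE2_RULES = [
    ("Exp", "min", 2), ("Mul", "min", 4), ("Sub", "min", 10),
    ("RealDiv", "multiple_of", 2), ("Unpack", "min", 2),
    ("Pack", "min", 1), ("transpose", "min", 3),
    ("Rank", "min", 3), ("Range", "min", 3),
    ("Sigmoid", "exclude", 0),
]


def matches(op_types, rules):
    """Return True if the operator types found in a scope satisfy every rule."""
    counts = Counter(op_types)
    for op_type, kind, value in rules:
        n = counts[op_type]
        if kind == "min" and n < value:
            return False
        # Assumption: zero occurrences do not count as "a multiple of".
        if kind == "multiple_of" and (n == 0 or n % value != 0):
            return False
        if kind == "exclude" and n > 0:
            return False
    return True


scope_ops = (["Exp"] * 2 + ["Mul"] * 4 + ["Sub"] * 4 + ["RealDiv"] * 2
             + ["Unpack"] * 2 + ["Pack"] + ["transpose"] * 3)
print(matches(scope_ops, SCOPE1_RULES), matches(scope_ops, SCOPE2_RULES))
```

In this sketch the sample scope satisfies the Scope 1 rules but not the Scope 2 rules, because it has only four Sub operators and no Rank or Range operators.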

Scope Details

Scope 1:

Scope 2:

Result Operator Prototype

DecodeBboxV2. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

Scope 1:

The input of the transpose operator is used as the first input of the fused operator.

The input of get_center_coordinates_and_sizes/transpose is used as the second input of the fused operator.

The output of transpose_1 is used as the first output of the fused operator.

Scope 2:

The input of transpose/Rank is used as the first input of the fused operator.

The input of get_center_coordinates_and_sizes/transpose/Rank is used as the second input of the fused operator.

The output of transpose_1 is used as the first output of the fused operator.

2.13 ScopeBatchMultiClassNMSPass

Description

Fuses the following scope into a BatchMultiClassNonMaxSuppression operator. The scope path is map/while/MultiClassNonMaxSuppression/.

Scope Details

Result Operator Prototype

BatchMultiClassNonMaxSuppression. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

map/TensorArrayUnstack/Shape is used as the first input of the fused operator.

map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.

map/TensorArrayUnstack_3/Shape, if any, is used as the third input of the fused operator.

map/TensorArrayUnstack_4/Shape, if any, is used as the fourth input of the fused operator.

map/TensorArrayStack/TensorArrayGatherV3, if any, is used as the first output of the fused operator.

map/TensorArrayStack_1/TensorArrayGatherV3, if any, is used as the second output of the fused operator.

map/TensorArrayStack_2/TensorArrayGatherV3, if any, is used as the third output of the fused operator.

map/TensorArrayStack_4/TensorArrayGatherV3, if any, is used as the fourth output of the fused operator.
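Because several of these boundary nodes are optional ("if any"), the mapping can be pictured as binding each node that is present to a fixed port position. The sketch below is illustrative only: INPUT_NODES, OUTPUT_NODES, and bind_ports are hypothetical names, port indexes are zero-based by choice, and it assumes that a missing optional node leaves its port unbound rather than shifting later nodes forward, which the text does not state explicitly.

```python
# Candidate boundary nodes in port order, transcribed from the Fusion Mapping.
INPUT_NODES = [
    "map/TensorArrayUnstack/Shape",
    "map/TensorArrayUnstack_1/Shape",
    "map/TensorArrayUnstack_3/Shape",   # optional
    "map/TensorArrayUnstack_4/Shape",   # optional
]
OUTPUT_NODES = [
    "map/TensorArrayStack/TensorArrayGatherV3",    # optional
    "map/TensorArrayStack_1/TensorArrayGatherV3",  # optional
    "map/TensorArrayStack_2/TensorArrayGatherV3",  # optional
    "map/TensorArrayStack_4/TensorArrayGatherV3",  # optional
]


def bind_ports(graph_nodes, candidates):
    """Map zero-based port index -> node name for each candidate present in the graph.

    Assumption: a port keeps its position even when an earlier optional node is absent.
    """
    present = set(graph_nodes)
    return {index: name for index, name in enumerate(candidates) if name in present}


graph = INPUT_NODES[:2] + OUTPUT_NODES[:3]
print(bind_ports(graph, INPUT_NODES))
print(bind_ports(graph, OUTPUT_NODES))
```

With this sample graph, only the first two inputs and the first three outputs are bound; the remaining ports stay empty.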

2.14 ScopeKeepRatioResizeBilinearPass

Description

Fuses a specific scope into a combination of KeepRatioResizeBilinear operators, Shape operators, a multiple of two Slice operators, ExpandDims operators, ConcatV2 operators, Tile operators, and a multiple of four Const operators. The scope includes the graph structure map/while/ResizeToRange/ and the operators Maximum, Minimum, Round, and ResizeBilinear.

Scope Details

Result Operator Prototype

KeepRatioResizeBilinear + Shape + Slice x 2 + ExpandDims + ConcatV2 + Tile + Const x 4. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

The output of the scope name+/Shape node is used as the input of the subgraph after fusion.

The output of TensorArrayStack/TensorArrayStack/TensorArrayGatherV3 is used as the first output of the subgraph after fusion.

The output of TensorArrayStack_1/TensorArrayStack_1/TensorArrayGatherV3 is used as the second output of the subgraph after fusion.

After the fusion, the output of the KeepRatioResizeBilinear operator of the subgraph connects to the first output of the scope.

After the fusion, the output of the Tile operator of the subgraph connects to the second output of the scope.

2.15 ScopeBatchMultiClassNonMaxSuppressionPass

Description

Fuses a scope into a BatchMultiClassNonMaxSuppression operator. The fusion scope contains the scope path.

The fusion pattern contains two child patterns:

ScopeFaceBoxesBatchMultiClassNMSPattern: includes one NonMaxSuppressionV3 operator and excludes the transpose operator.

ScopeFilteredBatchMultiClassNMSPattern: includes one NonMaxSuppressionV3 operator, five Range operators, one ConcatV2 operator, and 80 Fill operators.

Scope Details

Result Operator Prototype

BatchMultiClassNonMaxSuppression. For details, see CANN Operator List (Ascend 310).

Fusion Mapping

map/TensorArrayUnstack/Shape is used as the first input of the fused operator.

map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.

map/TensorArrayUnstack_3/Shape, if any, is used as the third input of the fused operator.

map/TensorArrayUnstack_4/Shape, if any, is used as the fourth input of the fused operator.

map/TensorArrayStack/TensorArrayGatherV3, if any, is used as the first output of the fused operator.

map/TensorArrayStack_1/TensorArrayGatherV3, if any, is used as the second output of the fused operator.

map/TensorArrayStack_2/TensorArrayGatherV3, if any, is used as the third output of the fused operator.

map/TensorArrayStack_4/TensorArrayGatherV3, if any, is used as the fourth output of the fused operator.

2.16 ScopeDynamicGRUPass

Description

Fuses a scope containing the following operators into a DynamicGRU operator: The scope includes five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators.

Scope Details

See the following scope example.

The while scope contains five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators. After fusion, all operators in the red box are fused into one DynamicGRU operator.

Result Operator Prototype

DynamicGRUV2. For details, see Operator List.

Fusion Mapping

The input of TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 is used as the first input of the fused operator.

The input of while/ReadVariableOp_1/Enter is used as the second input of the fused operator.

The input of while/ReadVariableOp_4/Enter is used as the third input of the fused operator.

The input of while/ReadVariableOp_00/Enter is used as the fourth input of the fused operator.

The input of while/ReadVariableOp_01/Enter is used as the fifth input of the fused operator.

The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.

2.17 ScopeDynamicRNNPass

Description

Fuses a scope containing the following operators into a DynamicRNN operator: The scope contains a while subscope and does not contain Transpose operators. The while subscope contains a multiple of 4 BiasAdd operators, a multiple of 2 Tanh operators, eight MatMul operators, and a Split operator.

Scope Details

See the following scope example.

The while subscope contains a multiple of 4 BiasAdd operators, a multiple of 2 Tanh operators, eight MatMul operators, and a Split operator.
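The rule for this pass distinguishes between the scope as a whole (no Transpose operators) and its while subscope (the operator counts). The following Python sketch shows one way to separate the two when nodes are identified by slash-separated names. It is an illustration under assumptions, not the parser's implementation: the function names are hypothetical, nodes are given as (name, operator type) pairs, a "multiple of" is read as a nonzero multiple, the MatMul count is treated as exact, and "a Split operator" is read as at least one.

```python
from collections import Counter


def subscope_op_counts(nodes, subscope="while"):
    """Count operator types of nodes that have `subscope` as a scope path component."""
    counts = Counter()
    for name, op_type in nodes:
        # Everything before the last "/" is the scope path of the node.
        if subscope in name.split("/")[:-1]:
            counts[op_type] += 1
    return counts


def looks_like_dynamic_rnn(nodes):
    """Apply the operator-count rule described above to a list of (name, op type) pairs."""
    if any(op_type == "Transpose" for _, op_type in nodes):
        return False  # the scope must not contain Transpose operators
    counts = subscope_op_counts(nodes)
    return (
        counts["BiasAdd"] > 0 and counts["BiasAdd"] % 4 == 0
        and counts["Tanh"] > 0 and counts["Tanh"] % 2 == 0
        and counts["MatMul"] == 8   # assumption: exact count
        and counts["Split"] >= 1    # assumption: at least one
    )


sample = (
    [("rnn/while/BiasAdd_%d" % i, "BiasAdd") for i in range(4)]
    + [("rnn/while/Tanh_%d" % i, "Tanh") for i in range(2)]
    + [("rnn/while/MatMul_%d" % i, "MatMul") for i in range(8)]
    + [("rnn/while/split", "Split")]
    + [("rnn/TensorArrayStack/TensorArrayGatherV3", "TensorArrayGatherV3")]
)
print(looks_like_dynamic_rnn(sample))
```

The scope name rnn and the node names in the sample are made up for the example; only the operator types and counts come from the description.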

Result Operator Prototype

DynamicRNN. For details, see Operator List.

Fusion Mapping

The input of TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 is used as the first input of the fused operator.

The input of while/split/ReadVariableOp/Enter is used as the second input of the fused operator.

The input of while/split_1/ReadVariableOp/Enter is used as the third input of the fused operator.

The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.
