16
Sensors 2020, 20, x; doi: FOR PEER REVIEW www.mdpi.com/journal/sensors Article 1 Deep Learning Methods for Fault Detection and 2 Classification in MMC-HVDC systems 3 Qinghua Wang 1,3 , Yuexiao Yu 2 , Hosameldin O. A. Ahmed 3 , Mohamed Darwish 3 , and Asoke K. 4 Nandi Fellow, IEEE 3, * 5 1 School of Mechatronic Engineering, Xi’an Technological University, Shaanxi 710021, China; 6 [email protected] 7 2 State Grid Sichuan Electric Power Research Institute of China, Chengdu 610094, China; 8 [email protected] 9 3 College of Engineering, Design and Physical Sciences, Brunel University London, UK; 10 [email protected]; [email protected]; 11 * College of Engineering, Design and Physical Sciences, Brunel University London, UK; 12 [email protected]; Tel.: +44 (0)1895 266119 13 Received: date; Accepted: date; Published: date 14 Abstract: As there is not much literature about deep learning-based fault diagnosis for modular 15 multilevel converters (MMCs) and comparison among deep learning methods used in fault 16 diagnosis for MMC, two deep learning methods, namely, Convolutional Neural Networks (CNN) 17 and Auto Encoder based Deep Neural networks (AE-based DNN) as well as stand-alone Softmax 18 classifier are explored for the detection and classification of faults of MMC-based high voltage direct 19 current converter (MMC-HVDC). Only AC-side three-phase current and the upper and lower 20 bridges’ currents of the MMCs are used directly by our proposed approaches without any explicit 21 feature extraction or feature subset selection. The two-terminal MMC-HVDC system is established 22 in PSCAD/EMTDC to verify and compare our methods. The simulation results indicate CNN, AE- 23 based DNN, and Softmax classifier can detect and classify faults with high detection accuracy and 24 classification accuracy. Compared with CNN and AE-based DNN, the Softmax classifier behaved 25 better in detection and classification accuracy as well as testing speed. The detection accuracy of 26 AE-based DNN is a little better than CNN, while CNN needs less training time than the AE-based 27 DNN and Softmax classifier. 28 Keywords: MMC-HVDC; fault detection; fault classification; CNN; AE-based DNN; Softmax 29 classifier; classification accuracy; speed 30 31 1. Introduction 32 With the increasing application of modular multilevel converter-based high-voltage direct 33 current (MMC-HVDC) systems, the reliability of MMC is of major importance in ensuring power 34 systems are safe and reliable. Topology configuration redundant strategies of fault-tolerant systems 35 are useful methods to improve reliability which can be achieved by using more semiconductor 36 devices as switches in an SM [1] or integrating redundant SMs into the arm submodule [2]. But, it is 37 well to remember that, fault detection is a precondition for fault-tolerant operation which is needed 38 to be as fast and accurate as possible to ensure converter continuous service. Therefore, Fault 39 detection and classification are one of challenging tasks in MMC-HVDC systems to improve its 40 reliability and thus reducing potential dangers in the power systems because there are a large number 41 of power electronic sub-modules (SMs) in the MMC circuit and each SM is a potential failure point 42 [3,4]. 43

Deep Learning Methods for Fault Detection and

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x; doi: FOR PEER REVIEW www.mdpi.com/journal/sensors

Article 1

Deep Learning Methods for Fault Detection and 2

Classification in MMC-HVDC systems 3

Qinghua Wang 1,3, Yuexiao Yu 2, Hosameldin O. A. Ahmed 3, Mohamed Darwish 3, and Asoke K. 4 Nandi Fellow, IEEE 3,* 5

1 School of Mechatronic Engineering, Xi’an Technological University, Shaanxi 710021, China; 6 [email protected] 7

2 State Grid Sichuan Electric Power Research Institute of China, Chengdu 610094, China; 8 [email protected] 9

3 College of Engineering, Design and Physical Sciences, Brunel University London, UK; 10 [email protected]; [email protected]; 11

* College of Engineering, Design and Physical Sciences, Brunel University London, UK; 12 [email protected]; Tel.: +44 (0)1895 266119 13

Received: date; Accepted: date; Published: date 14

Abstract: As there is not much literature about deep learning-based fault diagnosis for modular 15 multilevel converters (MMCs) and comparison among deep learning methods used in fault 16 diagnosis for MMC, two deep learning methods, namely, Convolutional Neural Networks (CNN) 17 and Auto Encoder based Deep Neural networks (AE-based DNN) as well as stand-alone Softmax 18 classifier are explored for the detection and classification of faults of MMC-based high voltage direct 19 current converter (MMC-HVDC). Only AC-side three-phase current and the upper and lower 20 bridges’ currents of the MMCs are used directly by our proposed approaches without any explicit 21 feature extraction or feature subset selection. The two-terminal MMC-HVDC system is established 22 in PSCAD/EMTDC to verify and compare our methods. The simulation results indicate CNN, AE-23 based DNN, and Softmax classifier can detect and classify faults with high detection accuracy and 24 classification accuracy. Compared with CNN and AE-based DNN, the Softmax classifier behaved 25 better in detection and classification accuracy as well as testing speed. The detection accuracy of 26 AE-based DNN is a little better than CNN, while CNN needs less training time than the AE-based 27 DNN and Softmax classifier. 28

Keywords: MMC-HVDC; fault detection; fault classification; CNN; AE-based DNN; Softmax 29 classifier; classification accuracy; speed 30

31

1. Introduction 32

With the increasing application of modular multilevel converter-based high-voltage direct 33 current (MMC-HVDC) systems, the reliability of MMC is of major importance in ensuring power 34 systems are safe and reliable. Topology configuration redundant strategies of fault-tolerant systems 35 are useful methods to improve reliability which can be achieved by using more semiconductor 36 devices as switches in an SM [1] or integrating redundant SMs into the arm submodule [2]. But, it is 37 well to remember that, fault detection is a precondition for fault-tolerant operation which is needed 38 to be as fast and accurate as possible to ensure converter continuous service. Therefore, Fault 39 detection and classification are one of challenging tasks in MMC-HVDC systems to improve its 40 reliability and thus reducing potential dangers in the power systems because there are a large number 41 of power electronic sub-modules (SMs) in the MMC circuit and each SM is a potential failure point 42 [3,4]. 43

Page 2: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 2 of 16

The research of fault detection and classification in MMC-HVDC systems applications can be 44 broadly categorized into three basic approaches that are mechanism-based, signal processing-based, 45 and artificial intelligence-based [5]. All the mechanism-based methods need many sensors 46 monitoring the inner characteristics (circulating current, arm currents, capacitor voltages, etc.). Signal 47 processing-based methods employ output characteristics rather than inner characteristics to detect a 48 fault. Signal processing-based methods have been deemed reliable and fast by researchers [6-9] with 49 the advancement of signal processing methods in recent years. But both of them need suitable 50 methods to obtain expected inner characteristics or threshold of certain derived features, such as zero-51 crossing current slope or harmonic content which degrades the robustness of fault detection and 52 classification. The artificial intelligent methods do not need any mathematical models of MMC 53 functionality and any threshold setting, yet, they can improve the accuracy of fault diagnosis due to 54 their advantage of nonlinear representations. 55

A neural network as the most basic artificial intelligence method is used by many researchers. 56 Khomfoi and Tolbert [10] propose a fault diagnosis and reconfiguration technique for a cascaded H-57 bridge multilevel inverter drive using principal component analysis (PCA) and neural network (NN). 58 In this method, the genetic algorithm is used to select valuable principal components. Simulation and 59 experimental results showed that the proposed method is satisfactory to detect fault type, fault 60 location, and reconfiguration. Wang et al. [11] propose an artificial NN-based robust DC fault 61 protection algorithm for MMC high voltage direct current grid. In which, the discrete wavelet 62 transform is used as an extractor of distinctive features at the input of the ANN. Furqan Asghar et al. 63 [12] present NN-based fault detection and diagnosis system for three-phase inverter using several 64 features extracted from the Clarke transformed output as an input of NNs. Merlin et al. [13] design 65 thirteen artificial NNs for the voltage-source converter-HVDC systems to detect a fault condition in 66 the whole HVDC system based only on voltage waveforms measured at the rectifier substation. 67

Although the NN based methods achieved some improvements in the diagnosis of failed 68 converters and identification of defective switches [14,15], the prerequisite for the successful 69 application of NNs is to have enough training data and long training time. Multi-class relevance 70 vector machines (RVM) and support vector machine (SVM) replace a neural network to classify and 71 locate the faults because of their rapid training speed and strongly regularized characteristic [5]. 72 Wang et al. [16] use a PCA and multiclass RVM approach for cascaded H-bridge multilevel inverter 73 system fault diagnosis. Wang et al. [17] propose and analyze a fault-diagnosis technique to identify 74 shorted switches based on features generated through the wavelet transform of the converter output 75 and subsequent classification in SVMs. The multi-class SVM is trained with multiple recordings of 76 the output of each fault condition as well as the converter under normal operation. Jiao et al. [18] 77 used the three-phase AC output side voltage of MMC as the fault characteristic signal, combined with 78 PCA data preprocessing and firefly algorithm optimized SVM (FA-SVM) for MMC fault diagnosis. 79 Zhang and Wang [19] proposes a least-squares-based ɛ-support vector regression scheme, which 80 captures fault features via the Hilbert–Huang transform. Fault features are used as the inputs of ɛ-81 support vector regression to obtain fault distance. Then, the least-squares method is utilized to 82 optimize the parameters of the model so that it can meet the demand on fault location for MMC–83 MTDC transmission lines. 84

To build the aforementioned artificial intelligence machine, feature extraction techniques such 85 as Fourier analysis [20,21], wavelet transform [14,15], Clarke transform [12] or feature subset selection 86 techniques such as Principal component analysis (PCA) [10,22] and multidimensional scaling (MDS) 87 plays an important role. Sometimes to select suitable sub-features, the Genetic Algorithm (GA) 88 [10,22,23] or particle swarm optimization (PSO) [24] are employed. It is well known that feature 89 extraction has always been a bottleneck in the field of fault diagnosis. Moreover, the feature extraction 90 and all the following post-operation increase the computation burden. 91

Deep learning methods have been explored to learn the features from the data which can be 92 generalized to different cases. Zhu et al. [25] proposed Convolutional Neural Networks (CNN) for 93 fault classification and fault location in AC transmission lines with back-to-back MMC-HVDC, in 94 which, two convolutional layers were used to extract the complex features of the voltage and the 95

Page 3: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 3 of 16

current signals of only one terminal of transmission lines. Serkan Kiranyaz et al. [26] use 1-D CNN to 96 detect and localize the switch open-circuit fault using four cell capacitor voltage, circulating current 97 and load current signals. This method can achieve a detection probability of 0.989 and an average 98 identification probability of 0.997 in less than 100ms. Qu et al. [27] propose CNN for MMC fault 99 detection using each capacitor’s voltage signal. Wang et al. [28] propose CNN for DC fault detection 100 and classification using wavelet logarithmic energy entropy of transient current signal. In the past 101 our research group proposed some related methods of NNs [29~31], AE-based DNN [32] and softmax 102 classifier [33] for bearing fault detection and classification, but not for MMC-HVDC. Moreover, to the 103 best of our knowledge, use of deep learning methods for MMC fault detection and classification have 104 been very limited and there is no comparison of two deep learning methods. Furthermore, Afore-105 mentioned CNNs have achieved success, but their advantages have not been explored completely, 106 e.g., the ability of feature extraction, the speed of processing, and its stability. In summary, up to now, 107 there is still much room to further improve the performance of the open-circuit fault diagnosis of 108 MMCs. 109

To shorten such a gap and achieve high fault classification accuracy with fewer sensors and 110 reduced computational time for fault diagnosis of MMCs, we propose two deep learning methods 111 and one stand-alone Softmax Classifier for MMCs faults detection and classification using raw data 112 collected from current sensors to recognize automatically the open-circuit failures of IGBT in MMCs. 113 The contributions of this paper are as follows: 114

a. Only current sensors data are used for fault diagnosis and achieved high accuracy of fault 115 detection and classification. 116

b. Multichannel current signals are used instead of a single channel to improve reliability 117 because the sensors may also cause some faults. 118

c. Excellent accuracy on fault detection and identification without data preprocessing or post-119 operation; 120

d. Two deep learning methods and a stand-alone Softmax Classifier are used with raw data 121 collected by current sensors to achieve improved classification accuracy and reduced computation 122 time. 123

e. Performance comparison of CNN, AE-based DNN, and Softmax Classifier in terms of fault 124 diagnosis accuracy, stability and speed for MMC-HVDC fault diagnosis. 125

This paper is organized as follows. Section 2 introduces the topology and data acquisition from 126 MMC. Section 3 proposes the framework of this paper and the design of CNN, AE-based DNN, and 127 Softmax Classifier. The feasibility and performance of the proposed approaches are evaluated in 128 Section 4. Section 5 compares the three deep learning methods. Conclusions are drawn in section 6. 129

2. MMC topology and data acquisition 130

The data for this study was simulated from a two-terminal model of the MMC-HVDC 131 transmission power system using PSCAD/EMTDC [34]. It solves the differential equations of the 132 entire power system and its controls. Figure 1 shows that each phase of the three-phase MMC consists 133 of two arms (upper and lower) that are connected to two inductors L. Each arm contains a series of 134 SMs, and each SM involves two IGBTs (i.e., T1 and T2 ), two diodes D, and a DC storage capacitor. 135

Page 4: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 4 of 16

Figure 1. Structure of a three-phase MMC with half-bridge submodules 136

In our simulation (Table 1), we recorded 9 channels of data for normal and 6 different locations 137 of IGBT break-circuit fault manually for each bridge (namely A-phase lower SMs, A-phase upper 138 SMs, B-phase lower SMs, B-phase upper SMs, C-phase lower SMs, and C-phase upper SMs). There 139 are 100 cases of IGBT break-circuit fault that happened at different IGBTs of the six bridges at 140 different times. The power system is depicted in Figure 2. The type of SMs is half-bridge and the 141 direction of the flow is shown as the arrow above. Ba-A1 and Ba-A2 are two AC bus bars. Bb-A1 and 142 Bb-A2 are two DC bus bars. E1 is an equivalent voltage source for an AC network. E2 is a wind farm. 143

Table 1. Parameters of MMC. 144

Parameters Value

number of SMs per arm 9

SM capacitor 1000uF

arm inductance 50mH

AC frequency 50Hz

145

Figure 2. Structure of the HVDC 146

The whole time period used is 0.1s while the time for the IGBT open circuit fault duration is 147 varied from 0.03s to 0.07s. The simulation time step is 2μs and the sampling frequency is 20μs. The 148 acquired data channels for fault diagnosis are AC-side three-phase current (Ia, Ib, Ic,) and three-phase 149 circulation current (Idiffa, Idiffb, Idiffc). 150

1apSM

2apSM

apnSM

1apSM

2apSM

apnSM

2bpSM

bpnSM

1bpSM

2bpSM

bpnSM

1cpSM

2cpSM

cpnSM

1cpSM

2cpSM

cpnSM

L

api

L L

LL

cpi

bnianicni

bpi

L

A

B

C

aI

bI

cI

dcV

Phase A converter Phase B converter Phase C converter

Up

per

arm

sL

ow

er a

rms

1bpSM1T

2T

1D

2DSMV

Page 5: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 5 of 16

3. The framework of fault classification and design of deep learning methods 151

3.1. The framework for fault detection and classification 152

This paper proposes three methods to complete both the fault detection and classification task 153 for MMC, as shown in Figure 3, which are CNN, AE-based DNN, and a stand-alone Softmax 154 classifier. CNN processes the raw sensors data which are nine current signals (Ia, Ib, Ic, iap, ibp, icp, ian, ibn, 155 and icn) and obtains the fault diagnosis results. AE-based DNN and Softmax process the combined 156 information which is concatenated the measurements of these nine parameters to form a vector of 157 samples that represent the current health condition of the MMCs, then obtain the fault diagnosis 158 results. 159

160

Figure 3. Framework for fault detection and classification for MMC 161

3.2. Design of CNN 162

Convolutional neural networks (CNNs) are widely used tools for deep learning which is 163 different from the traditional feed-forward ANN because of its three architectural properties of the 164 visual cortex cell: local receptive regions, shared weights, and subsampling. The crucial advantage of 165 CNNs is that both feature extraction and classification operations are fused into a single Machine 166 learning body to be jointly optimized to maximize the classification performances [26]. 167

CNN consists of multiple layers such as figure 4 which are the input layer, convolutional layer, 168 activation layer, pooling layer, full connect layer, softmax layer, and a classification layer. Among 169 these layers, there are two basic layers in CNN which are the convolutional layer and the pooling 170 layer. Convolution operation implements the first two properties that are local receptive regions and 171 shared weights. The pooling operation implements the subsampling property [35]. 172

173

Figure 4. Architecture of the signal-level CNN classifier 174

Page 6: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 6 of 16

A convolutional layer consists of neurons that connect to small regions of the input and operate 175 the convolution computation. The output feature map of the convolutional layer can be written as: 176

𝐹𝑗 = 𝜑(∑ 𝑊𝑖,𝑗𝑁𝑖=1 ⊗ 𝐼𝑖 + 𝑏𝑗), (1)

For the jth filter, the output is a new feature map 𝐹𝑗, Where 𝑊𝑖,𝑗 and 𝑏𝑗 denote the jth filter kernel 177

and bias, respectively. 𝐼𝑖 is the input matrix of the ith channel, ⊗. represents the convolutional 178 operation, and 𝐼𝑖 is convoluted with a corresponding filter kernel 𝑊𝑖,𝑗. The sum of all convoluted 179

matrices is then obtained and a bias term 𝑏𝑗 is added to each element of the resulting matrix. There 180

are many several choices we could make activation function 𝜑 be a non-linear. But in this paper, we 181 simply use a named leaky rectified linear unit (leaky ReLU). The function of leaky ReLU is given by: 182

𝜑(𝑥) = {𝑥, 𝑥 ≥ 0𝑠𝑐𝑎𝑙𝑒 ∗ 𝑥, 𝑥 < 0

(2)

It is a simple threshold that makes the negative value be zero. Then we can obtain the output feature 183 map 𝐹𝑗. 184

Pooling layers perform down-sampling operations. Pooling methods usually include max-185 pooling and average-pooling. In this paper, the average-pooling function is applied which outputs 186 the average values of rectangular regions of its input. In a fully connected layer, neurons between 187 two adjacent layers are fully pairwise connected but neurons within the same layer share no 188 connections. Then the Softmax function is commonly adopted for classification tasks. The 189 introduction of Softmax will be presented in the following subsection 3.4. 190

3.3. Design of AE-based DNN 191

192

Figure 5. Architecture of the AE-based DNN 193

An AE-based DNN (Deep Neural Network) is constructed by several autoencoders (AEs) stacked 194 with each other and a Softmax classifier on the output layer. In this paper, we stacked one AE with a 195 Softmax classifier as can be seen in figure 5. The AE needs to be pretrained by Greedy layer-wise 196 training algorithm. The simplest form of an AE includes three layers: the input layer, hidden layer, 197 and output layer. An AE network consists of an encoder and a decoder. The encoder maps the input 198 to a hidden representation and the decoder attempts to map this representation back to the original 199 input. given an unlabeled vector sample x, The encoder network can be explicitly defined as: 200

ℎ = 𝑓(𝑤1𝑥 + 𝑏1), (3)

Similarly, the decoder network can be defined as: 201

�̂� = 𝑔(𝑤2𝑥 + 𝑏2), (4)

...

...

...

...

...

..

...

...

..

Updatingweight

X1 X1 Y1 X2 Y2

WT

W W W

W1 W1Fine-tune

Auto-encoder Deep Neural Network(training) Testing for Classification

Softmax Softmax Normal

A-phase lower SMs

A-phase upper SMs

B-phase lower SMs

B-phase upper SMs

C-phase lower SMs

C-phase upper SMs

encode

uncode

Page 7: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 7 of 16

where �̂� are the approximate reconstruction of the inputs, and 𝜃 = {w, 𝑏} are the reconstructing 202 parameters, and f and g are the activation function of the encoder and decoder, respectively. The 203 reconstruction error E between the inputs x and output �̂� are defined as: 204

𝐸 =1

𝑁 ∑(𝑥𝑖 − �̂�𝑖)

2

𝑁

𝑖=1⏟ mean squared error

+ 𝜆 ∗ 𝛺𝑤𝑒𝑖gℎ𝑡𝑠⏟ 𝐿2

regularization

(5)

Where the first part is the mean square variance used to measure the average discrepancy and N is 205 the number of neurons in the output layer, and the second part is the regularization term used to 206 prevent overfitting. λ is the coefficient for the L2 regularization term. 207

𝛺𝑤𝑒𝑖gℎ𝑡𝑠 =1

2∑∑(𝑤𝑗

(𝑙))2

𝑁

𝑗

𝐿

𝑙

(6)

Where 𝐿 is the number of hidden layers. The following subsection introduces the softmax classifier. 208

3.4. Introduction of Softmax classifier 209

The Softmax function, also known as softargmax or normalized exponential function, is a 210 function that takes as input a vector of K real numbers and normalizes it into a probability 211 distribution consisting of K probabilities proportional to the exponentials of the input numbers. It is 212 calculated as: 213

𝑦𝑟(𝑥) = 𝑃(𝑐𝑟|𝑥, 𝜃) =𝑒𝑥𝑝 (𝑎𝑟(𝑥))

∑ 𝑒𝑥𝑝 (𝑎𝑗(𝑥))𝑘𝑗=1

, (7)

The loss function can use mean squared error function and the cross-entropy function. In this 214 paper, we used the cross-entropy function which is given by: 215

𝐸 = −∑ ∑ 𝑡𝑖𝑗𝑘𝑗=1

𝑁𝑖=1 𝑙𝑛𝑦𝑖𝑗, (8)

where 𝑡𝑖𝑗 is the indicator that the ith example belongs to the jth class, 𝑦𝑖𝑗 is the output for 216

example i, which here is the value from the softmax function. 217

4. Experimental study 218

Seven conditions of MMCs status have been recorded which include normal, A-phase lower 219 SMs, A-phase upper SMs, B-phase lower SMs, B-phase upper SMs, C-phase lower SMs, and C-phase 220 upper SMs faults. 100 examples were collected from each condition. So there are a total of 700 (100 x 221 7) raw data files to process with. All the nine parameters, i.e., Ia, Ib, Ic, iap, ibp, icp, ian, ibn, and icn, were 222 recorded to obtain 5001-time samples. 223

Experiments were conducted for testing data rates from 0.1 to 0.9 and 20 run times for each 224 testing data rate. We need to point out that the detection and classification results in the following 225 paper are the average of 20 run results. In order not to be influenced by the difference in data used, 226 it is important to ensure that these methods work with the same data at each run. The following code 227 is pseudo-code which can explain this scenario. 228

For TestingDataRate=0.1:0.1:0.9 229 For i=1:20 230

[trainData testData]=split(RawData,TestingDataRate); 231 CNN=trainCNN (trainData); 232 ResultsCNN=CNN(testData); 233 [trainDataCI testDataCI]=combined Information( trainData, testData); 234 AE-basedDNN=trainAE-basedDNN(trainDataCI); 235

ResultsAE= AE-basedDNN(testDataCI); 236 Softmax=trainSoftmax(trainDataCI); 237

ResultsSoftmax =Softmax(testDataCI) 238 End 239

Page 8: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 8 of 16

End 240

4.1. Implementation details and results of CNN 241

4.1.1. Implementation details of CNN 242

Figure 4 illustrates the architecture of CNN for fault detection and classification. The input data 243 is the raw sensor signals. Each channel denotes one sensor which records 5001-time samples. So the 244 size of input current signals is [5001x1x9], where the length is 5001 and the height is 1 as the signals 245 are one dimensional, and the depth is 9 as the signals come from 9 channels. The input is convolved 246 with 6 filters of size [30 1] with stride 9 and padding 3, then applied a leaky ReLU function, in which 247 the scalar multiplier for negative inputs is set as 0.01, resulting in a new feature map of size 554x1 248 and 6 channels. The sequence is pooling operation which is applied to each feature map separately. 249 Our pooling size is set 6x1 and stride is 6. Therefore, a convolution feature map is divided into several 250 disjoint patches and then the average value in each patch is selected to represent the patch and 251 transmit to the pooling layer, then the feature map is reduced to 94x1 by the pooling operation. 252

As stochastic gradient descent with momentum (SGDM) algorithm may reduce the oscillations 253 along the path of the steepest descent towards the optimum that is sometimes caused by stochastic 254 gradient descent algorithm [36], we use the SGDM algorithm to update the parameters of the deep 255 NN. The momentum is set at 0.95, the learning rate is 0.01 and the maximum number of epochs to 256 use for training is set at 30. 257

4.1.2. Results of CNN 258

The accuracy of the CNN fault detection is shown in Table 2. For fault detection, the output 259 network is divided into two types: fault and normal. We can see from Table 2 when the testing rate 260 is 0.1~0.5 and 0.7, the detection accuracy is 100%. The min of the detection accuracy is 99.7% at the 261 testing rate of 0.9. In which, there are 0.3% fault cases are misclassified as normal cases. 262

Table 2 Fault detection accuracy of CNN 263

Testing data rate 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Detection accuracy(%) 100 100 100 100 100 99.9 100 99.8 99. 7

The classification results of training and testing data using convolutional NNs are shown in 264 figure 6. From the viewpoint of trending, we can see that with the testing data rate increases, both 265 classification accuracy for training data and testing data decline. For the training dataset, the standard 266 deviation of classification accuracy increases with the increase of the testing data rate. For testing data 267 set, the max of mean accuracy is 98.6% with testing data rate 0.1 and the min of the average accuracy 268 is 93.0% with testing data rate 0.9. The standard deviation of classification accuracy in the middle of 269 the testing data rate is smaller than both ends of the testing data rate. Moreover, for each testing data 270 rate, the standard deviation of classification accuracy for the training data set is less than the standard 271 deviation of classification accuracy for the testing data set. 272

Page 9: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 9 of 16

273

Figure 6. The classification accuracy and the standard deviation of CNN 274

Table.3 provides a confusion matrix of the classification results for each condition with testing 275 data rates of 0.2, 0.5 and 0.8. As can be seen from Table.3 that the recognition of the normal condition 276 of the MMCs is 100% with 0.2, 0.5, and 0.8 testing data rates. With 0.2 testing data rate, our method 277 misclassified 3.2% of testing examples of condition 4 as condition 2 and 2% of testing examples of 278 condition 4 as condition 6; With 0.5 testing data rate, our method misclassified 1.6% of testing 279 examples of condition 4 as condition 2 and 3.4% of testing examples of condition 4 as condition 6; 280 Furthermore, with 0.8 testing data rate, our method misclassified 0.8% of testing examples of 281 condition 4 as condition 2 and 6.4% of testing examples of condition 4 as condition 6. 282

Table.3 Sample confusion matrix of the classification results of CNN 283

Testing data rate=0.2 Testing data rate=0.5 Testing data rate=0.8

100 0 0 0 0 0 0 100 0 0 0 0 0 0 100 0.2 0 0 0 0.5 1

0 97.8 0 3.2 0 0 0 0 95 0 1.6 0.2 0.7 1.3 0 91.6 0 0.8 0 0.9 2.3

0 0 97.3 0 0 0 0.8 0 0 97.2 0 0.9 0 1.1 0 0 94.4 0 2.5 0 2.2

0 0.7 0 94.8 0 2.2 0 0 1 0 95 0 1.9 0 0 3.8 0.4 92.8 0 0.6 0.2

0 0 2.2 0 99.8 0 3.2 0 0 1.9 0 96.1 0 3 0 0 3.2 0 90.8 0 2.5

0 0.7 0 2 0 97.8 0 0 3.6 0.2 3.4 0.3 96.9 0.4 0 4 0.3 6.4 0.6 97.1 0.9

0 0.2 0.5 0 0.2 0 96 0 0.4 0.7 0 2.5 0.5 94.2 0 0.4 1.7 0 6.1 0.9 90.9

4.2. Implementation details and results of AE-based DNN 284

4.2.1. Implementation details of AE-based DNN 285

First, the measurements of nine current signals were concatenated to form a vector of samples 286 that represent the current health condition of the MMCs. This gave a total of 45009 (5001 x 9) samples 287 dimension for each vector of health condition. Second, we used the AE with three layers: the input 288 layer, hidden layer, and output layer. In which, the number of neurons in the hidden layer is set as 289 250 which means the sample dimension will be reduced from 45009 to 250. An AE network consists 290 of an encoder and a decoder. The transfer function for the encoder and the decoder is the Satlin 291

Page 10: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 10 of 16

function and the logistic sigmoid function, respectively. Satlin function is a positive saturating linear 292 transfer function given as: 293

𝑓(𝑧) = {

0, 𝑖𝑓 𝑧 ≤ 0 𝑧, 𝑖𝑓 0 < 𝑧 < 1

1, 𝑖𝑓 𝑧 ≥ 1, (9)

The algorithm to use for training the autoencoder applied scaled conjugate gradient descent 294 (SCGD). The maximum number of training epochs for this autoencoder is set as 10. Third, the 250 295 features achieved by trained AE are used as the input of the Softmax classifier. The maximum number 296 of training epochs for the Softmax classifier is set as 20. Next, we stacked the trained AE and Softmax 297 classifier into a deep NN. Finally, we trained this deep NN using the training data. 298

4.2.2. Results of AE-based DNN 299

The fault detection results of the AE -based DNN are shown in table 4. When the testing rate 300 varies from 0.1 to 0.7, the detection accuracy is 100%. The lowest detection accuracy is 99.7% at the 301 testing rate of 0.9. In which, there are 0.3% fault cases are misclassified as normal cases. Compared 302 with Table 2 of CNN, AE-based DNN has better detection accuracy. 303

Table 4. Detection accuracy of AE -based DNN 304

Testing data rate 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Detection accuracy 100 100 100 100 100 100 100 99.9 99.7

Figure 7 shows the classification results of training and testing data using AE-based DNN. From 305 the viewpoint of trending analysis, we can see that with the testing data rate increase, the 306 classification mean accuracy for testing data declines but the classification accuracy for training data 307 increases. For the training data set, the highest average accuracy is 99.5% with testing data rate 0.8 308 and the lowest is 98.6% with testing data rate 0.1. The standard deviation of classification accuracy 309 increases with the increase of the testing data rate. For the testing data set, the max of mean accuracy 310 is 97.6% with testing data rate 0.1 and the min of mean accuracy is 92.1% with testing data rate 0.9. 311 The standard deviation of classification accuracy in the middle of the testing data rate is smaller than 312 both ends of the testing data rate. We also can see that for each testing data rate the standard deviation 313 of classification accuracy for the training data set is less than the standard deviation of classification 314 accuracy for the testing data set. 315

316

Figure 7. The classification accuracy and the standard deviation of AE-based DNN 317

Page 11: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 11 of 16

Table 5 provides a confusion matrix of the classification results for each condition with testing 318 data rates of 0.2, 0.5and 0.8. As can be seen from Table 5 that the recognition of the normal condition 319 of the MMCs is 100% with 0.2, 0.5, and 0.8 testing data rates. With 0.2 testing data rate, our method 320 misclassified 1.5% of testing examples of condition 3 as condition 5; With 0.5 testing data rate, our 321 method misclassified 1.8% of testing examples of condition 3 as condition 5 and 0.2% of testing 322 examples of condition 3 as condition 7. With 0.8 testing data rate, our method misclassified 0.7% of 323 testing examples of condition 3 as condition 4, 1.6% of testing examples of condition 3 as condition 5, 324 1% of testing examples of condition 3 as condition 6 and 1.9% of testing examples of condition 3 as 325 condition 7. 326

Table 5. Sample confusion matrix of the classification results of AE-based DNN 327

Testing data rate=0.2 Testing data rate=0.5 Testing data rate=0.8

100 0 0 0 0 0 0 100 0 0 0 0 0 0 100 0.1 0 0 0 0.2 0.4

0 97 0 3.2 0 1.5 0.3 0 96.3 0 2 0.3 0.8 1.3 0 96.1 0 1.4 0.7 0.8 2.4

0 0 98.5 0 0 0 0 0 0 98 0 0.4 0 0.5 0 0 94.8 0 2.1 0.1 1.8

0 0.7 0 95.5 0 1 0.2 0 1.5 0 97 0 1.8 0.1 0 2.5 0.7 96.1 0 2 1.4

0 0 1.5 0 97 0 2.5 0 0 1.8 0 97.2 0 3.8 0 0.1 1.6 0 92.6 0 2.7

0 2.3 0 1.3 0.5 97.5 0 0 1.5 0 1 1.2 96 0 0 0.6 1 2.5 1.2 96.4 1

0 0 0 0 2.5 0 97 0 0.7 0.2 0 0.9 1.4 94.3 0 0.6 1.9 0 3.4 0.5 90.3

4.3. Results of Softmax classifier 328

The accuracy of Softmax Classifier fault detection is shown in Table 6. The detection accuracy is 329 100% at all testing rates. 330

Table 6. Detection accuracy of AE -based DNN 331

Testing data rate 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Detection accuracy 100 100 100 100 100 100 100 100 100

Figure 8 shows the classification results of training and testing data using the Softmax classifier. 332 From the trending view, we can see that with the testing data rate increases, the classification average 333 accuracy for testing data declines but the classification average accuracy for training data keeps 334 steady which is 100%. The standard deviation of classification accuracy in the middle of the testing 335 data rate is smaller than both end of testing data rate for testing data set but the standard deviation 336 of classification accuracy keeps steady which is 0. For testing data set, the highest average accuracy 337 is 99.46% with testing data rate of 0.2 and the lowest average accuracy is 93.52% with testing data rate 338 of 0.9. It is obvious to see that for each testing data rate the standard deviation of classification 339 accuracy for training data set is less than the standard deviation of classification accuracy for testing 340 data set. 341

Page 12: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 12 of 16

342

Figure 8. The classification accuracy and the standard deviation of the stand-alone Softmax classifier 343

Table 7 provides a confusion matrix of the classification results for each condition with testing 344 data rates of 0.2, 0.5 and 0.8. As can be seen from Table 7 that the recognition of the normal condition 345 of the MMCs is 100% with 0.2, 0.5, and 0.8 testing data rate. With 0.2 testing data rate, our method 346 misclassified none of the testing examples of condition 4; With 0.5 testing data rate, our method 347 misclassified 0.4% of testing examples of condition 4 as condition 2. With 0.8 testing data rate, our 348 method misclassified 1.5% of testing examples of condition 4 as condition 2 and 1.88% of testing 349 examples of condition 4 as condition 6. 350

Table 7. Sample confusion matrix of the classification results of the stand-alone Softmax classifier 351

Testing data rate=0.2 Testing data rate=0.5 Testing data rate=0.8

100 0 0 0 0 0 0 100 0 0 0 0 0 0 100 0 0 0 0 0 0

0 98.5 0 0 0 0.5 0.5 0 98.1 0 0.4 0.2 0.8 0.4 0 97.1 0 1.5 0.4 1.4 3.7

0 0 100 0 0.2 0 0 0 0 99.4 0 0.7 0 0.2 0 0 95.6 0 1.3 0 0.5

0 0 0 100 0 0.8 0 0 0.7 0 99.6 0 0.6 0 0 2.2 0.3 96.6 0 2.3 0

0 0 0 0 99.8 0 0 0 0 0.6 0 99.1 0 0.6 0 0 1.5 0 94.3 0 2

0 1.5 0 0 0 98.5 0 0 0.6 0 0 0 97.4 0 0 0.5 0.3 1.9 0.8 95.9 0.7

0 0 0 0 0 0.2 99.5 0 0.6 0 0 0 1.2 98.8 0 0.2 2.3 0 3.2 0.4 93.1

Above all, for the training data set, with the increase of testing data rate, the average accuracy 352 of Softmax keeps steady which is 100% and the average accuracy of CNN decreases but the average 353 accuracy of AE-based increases. The standard deviation of accuracy for SoftMax keeps steady which 354 is 0 and the standard deviation of accuracy for other methods increases with the increase of the testing 355 data rate. For the testing data set, the average accuracy of all methods decreases with the increase of 356 the testing data rate. And the standard deviation of accuracy in the middle is less than both ends of 357 the testing data rate for all methods. 358

5. Comparisons 359

We compared the three methods on the classification accuracy and the standard deviation of 360 classification accuracy for the testing data with the testing data rate from 0.1 to 0.9 and compared the 361 three methods from the viewpoint of training time spent and testing time spent which are presented 362 in Figure 9~11 respectively. 363

Page 13: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 13 of 16

5.1. Comparison of average accuracy 364

From figure 9, we can see that the Softmax classifier behaves outstandingly on the testing data 365 rate from 0.1 to 0.9 compared to CNN and AE-based DNN. When the testing data rate is 0.1, 0.2 and 366 0.9 which locates both ends, the classification accuracy of CNN is better than AE-based DNN. 367

368

Figure 9. Comparison of classification accuracy for the three methods 369

5.2. Comparison of Standard deviation 370

We know that in statistics, the standard deviation is a measure that is used to quantify the 371 amount of variation or dispersion of a set of data values. A low standard deviation indicates that the 372 values tend to be close to the expected value of set, while a high standard deviation indicates that the 373 values are spread out over a wider range. From figure 10, it is clear that the standard deviation of 374 accuracy of Softmax is lower than other methods when the testing data rate is 0.1 to 0.6, which means 375 that for every run for different training data set and testing data set, the classification accuracy of 376 Softmax is more stable and other methods are more spread out. When the testing data rate varies 377 from 0.7 to 0.9, the AE-based DNN has the lowest standard deviation. AE-based DNN is the most 378 spread out when the testing data rate is from 0.1 to 0.5 and CNN is the most spread out when the 379 testing data rate is from 0.6 to 0.9. 380

381

Figure 10. Comparison of the standard deviation of classification accuracy for the three methods 382

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Cla

ssifi

catio

n a

ccu

racy

Testing data rate

Softmax AE-based DNN CNN

0

0.002

0.004

0.006

0.008

0.01

0.012

0.014

0.016

0.018

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Std

of

accu

racy

Testing data rate

Softmax AE-based DNN CNN

Page 14: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 14 of 16

5.3. Speed Comparison 383

Figure 11 describes the training time and testing time spent by three methods. It shows that for 384 each testing data rate, the AE-based DNN spends more training time than other methods, and the 385 CNN spends the least training time, and the Softmax takes the least testing time when the testing 386 data rate is from 0.3 to 0.9, and the AE-based DNN spends the most testing time. 387

(a)

(b)

Figure 11. Comparison of speed for the three methods. (a) Comparison of training time spent by the 388 three methods; (b) Comparison of testing time spent by the three method. 389

Over all, the stand-alone Softmax Classifier provides better functionality, including fault 390 detection accuracy, classification accuracy, least standard deviation and speed, as well as its strong 391 ability to dealing with high dimensional data. The AE-based DNN has the second best classification 392 ability, but it needs more training time and testing time. CNN has enough classification accuracy and 393 it needs the least training time. 394

6. Conclusions 395

Fault detection and classification are two of the challenging tasks in MMC-HVDC systems. This 396 paper presented two deep learning methods (CNN and AE-based DNN) and a stand-alone Softmax 397 classifier for fault detection and classification. CNN and AE-based DNN can fuse both feature 398 extraction and classification operations into a single machine learning scheme for joint optimization 399 to maximize the classification performance, which avoided the design of handcrafted features. In this 400 paper, we only use raw current sensor data as input to our proposed approaches to detect and classify 401 faults of MMC-HVDC. The simulation results in PSCAD/EMTDC show that three methods all have 402 high detection accuracy more than 99.7%, in which the stand-alone Softmax classifier has 100% 403 detection accuracy, and AE-based DNN is a little better than of CNN. Three methods also have high 404 classification accuracy, small standard deviation, and high speed. Softmax classifier behaved better 405 than others in classification accuracy and testing speed, while CNN needs the least training time. 406

Author Contributions: Qinghua Wang, Hosameldin O. A. Ahmed and Asoke K. Nandi conceived and designed 407 this paper. Yuexiao Yu performed the experiments and generated the raw data, Mohamed Darwish gave the 408 best suggestions about these experiments. Qinghua Wang analyzed the data and wrote a draft of the paper. All 409 authors contributed to discussing the results and revising the manuscript. 410

Funding: This research was funded by the National Natural Science Foundation of China, grant number No. 411 51105291, and the Shaanxi Provincial Science and Technology Agency, No. 2020GY-124. 412

Acknowledgments: This work is supported by Brunel University London (UK) as well as the National Fund for 413 Study Abroad (China). 414

Conflicts of Interest: The authors declare no conflict of interest. 415

References 416

1. Wang, C.; Li, Z.; Murphy, D.L.K., et al. Photovoltaic multilevel inverter with distributed maximum power 417 point tracking and dynamic circuit reconfiguration. 2017 Int. Future Energy Electronics and ECCE Asia, 418 Kaohsiung Taiwan, 2017; Publisher: IEEE; pp.1520-1525. (10.1109/IFEEC.2017.7992271) 419

0100200300400

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Trai

nin

g t

ime

(s)

Testing data rae

Softmax AE-based DNN CNN

0

1

2

3

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Test

ing

tim

e (s

)

Testing data rate

Softmax AE-based DNN CNN

Page 15: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 15 of 16

2. Ghazanfari, A.; Mohamed, Y.A.-R.I. A resilient framework for fault-tolerant operation of modular 420 multilevel converters. IEEE Trans. Ind. Electron 2016, Volume 63, (5), pp.2669-2678. 421 (10.1109/TIE.2016.2516968) 422

3. Shahbazi M.; Poure P.; Saadate S.; et al. FPGA-based Fast Detection with Reduced sensor Count for a Fault-423 tolerant Three-Phase Converter. IEEE Trans on Industrial Informatics 2013, Volume 9(3), pp.1343-1350. 424 (10.1109/TII.2012.2209665) 425

4. F Richardeau. T T L Pham. Reliability Calculation of Multilevel Converters: Theory and Applications. IEEE 426 Trans on Industrial Electron 2013, Volume 60, (10), pp. 4225-4233. (10.1109/TIE.2012.2211315) 427

5. Wang C.; Zhou L.; Li Z. Survey of switch fault diagnosis for modular multilevel converter. IET Circuits 428 Devices & Systems 2019, Volume 13, (2), pp. 117-124. (10.1049/iet-cds.2018.5136) 429

6. Nandi, R.; Panigrahi, B. K. Detection of fault in a hybrid power system using wavelet transform. Michael 430 Faraday IET Int. Summit 2015, Kolkata, India, September 2015; Publisher: IET; pp. 203–206. 431 (10.1049/cp.2015.1631) 432

7. Li, Y.; Shi, X.; Wang, F.; et al. Dc fault protection of multiterminal VSCHVDC system with hybrid dc circuit 433 breaker. 2016 IEEE Energy Conversion Congress and Exposition (ECCE), Milwaukee, WI, September 2016; 434 Publisher: IEEE; pp. 1–8. (10.1109/ECCE.2016.7854990) 435

8. Liu, L.; Popov, M.;Van Der Meijden, M.; et al. A wavelet transform-based protection scheme of multi-436 terminal HVDC system. 2016 IEEE Int. Conf. Power System Technology (POWERCON), Wollongong, 437 NSW, Sept 2016; Publisher: IEEE; pp. 1–6. (10.1109/POWERCON.2016.8048434) 438

9. Costa, F. B. Boundary wavelet coefficients for real-time detection of transients induced by faults and power-439 quality disturbances. IEEE Trans. Power Deliv 2014, Volume 29, (6), pp. 2674–2687. 440 (10.1109/TPWRD.2014.2321178) 441

10. Khomfoi, S.; Tolbert, L.M. Fault diagnosis and reconfiguration for multilevel inverter drive using AI-based 442 techniques. IEEE Transactions on Industrial Electronics 2007, Volume 54, (6), pp.2954-2968. 443 (10.1109/TIE.2007.906994) 444

11. Wang X.; Saizhao Y.; Jinyu W. ANN-based Robust DC Fault Protection Algorithm for MMC High-voltage 445 Direct Current Grid. IET Renewable Power Generation 2020, Volume 14, (2), pp.199-210. (10.1049/iet-446 rpg.2019.0733) 447

12. Furqan A.; Muhammad T.; Sung H. K. Neural Network Based Fault Detection and Diagnosis System for 448 Three-Phase Inverter in Variable Speed Drive with Induction Motor. Journal of Control Science and 449 Engineering 2016, Volume 2016, pp.1-13. 450

13. Merlin, V. L.; Santos, R. C. dos; Le Blond, S.; et al. Efficient and robust ANN-based method for an improved 451 protection of VSC-HVDC systems. IET Renew Power Gener 2018, Volume 12, (13), pp. 1555-1562. 452 (10.1049/iet-rpg.2018.5097) 453

14. Liu H.; Loh, P. C.; Blaabjerg, F. Sub-Module Short Circuit Fault Diagnosis in Modular Multilevel Converter 454 based on Wavelet Transform and Adaptive Neuro Fuzzy Inference System. Electric Power Components and 455 Systems 2015, Volume 43, (8-10), pp. 1080-1088. (10.1080/15325008.2015.1022668) 456

15. Parimalasundar, E.; N. Suthanthira, V. Identification of Open-Switch and Short-Switch Failure of Multilevel 457 Inverters through DWT and ANN Approach using LabVIEW. Journal of Electrical Engineering & Technology 458 2015, Volume 10, (6), pp. 2277-2287. (http://dx.doi.org/10.5370/JEET.2015.10.6.2277) 459

16. Wang, T.; Xu, H.; Han, J.; et al. Cascaded H-bridge multilevel inverter system fault diagnosis using a PCA 460 and multiclass relevance vector machine approach. IEEE Trans. Power Electron 2015, Volume 30, (12), pp. 461 7006–7018. (10.1109/TPEL.2015.2393373) 462

17. Wang, C.; Lizana F, R.; Li, Z.; et al. Submodule short-circuit fault diagnosis based on wavelet transform 463 and support vector machines for modular multilevel converter with parallel connectivity. 43rd Annual 464 Conf. of the IEEE Industrial Electronics Society, Beijing, China, 2017; Publisher: Technical Committee on 465 Control Theory, Chinese Association of Automation; pp. 3239–3244. (10.1109/IECON.2017.8216547) 466

18. Jiao W.; Liu Z.; Zhang Y. Fault Diagnosis of Modular Multilevel Converter with FA-SVM Algorithm. 467 Proceedings of the 38th Chinese Control Conference. China, July, 2019; Publisher: IEEE; pp.5093 – 5098. 468 (10.23919/ChiCC.2019.8866021) 469

19. Zhang M.; Wang H. Fault location for MMC–MTDC transmission lines based on least squares-support 470 vector regression. The Journal of Engineering 2019,Volume 2019, (16), pp. 2125-2130. 471 (10.1049/joe.2018.8640) 472

Page 16: Deep Learning Methods for Fault Detection and

Sensors 2020, 20, x FOR PEER REVIEW 16 of 16

20. Khomfoi, S.; Tolbert, L.M. Fault diagnostic system for a multilevel inverter using a neural network. IEEE 473 Trans on Power Electron 2007, Volume 22, (3), pp.1062–1069. (10.1109/TPEL.2007.897128) 474

21. Wang, T.; Xu, H.; Han, J.; et al. Cascaded H-bridge multilevel inverter system fault diagnosis using a PCA 475 and multiclass relevance vector machine approach. IEEE Trans. Power Electron 2015, Volume 30, (12), pp. 476 7006–7018. (10.1109/TPEL.2015.2393373) 477

22. Khomfoi, S.; Tolbert, L. M. A diagnostic technique for multilevel inverters based on a genetic-algorithm to 478 select a principal component neural network. Proc. IEEE Applied Power Electronics Conference and 479 Exposition, Anaheim, CA, Feb,2007; Publisher: IEEE; pp. 1497–1503. (10.1109/APEX.2007.357715) 480

23. Zhang, Y..; Hu, H.; Liu, Z.; et al. Concurrent fault diagnosis of modular multilevel converter with Kalman 481 filter and optimized support vector machine. Systems Science & Control Engineering: Optimization and Control 482 in Financial Engineering 2019, Volume 7, (3), pp.43 – 53. (10.1080/21642583.2019.1650840) 483

24. Geometry, V.; Selvaperumal, S. Fault detection and classification with optimization techniques for three 484 phase single inverter circuit. Journal of Power Electronics 2016, Volume 16, pp. 1–14. 485

(https://doi.org/10.6113/JPE.2016.16.3.1097) 486 25. Zhu B.; Wang H.; Shi S.; et al. Fault location in AC transmission lines with back-to-back MMC-HVDC using 487

ConvNets. Journal of Engineering 2019, pp.2430-2434. (10.1049/joe.2018.8706) 488 26. Kiranyaz S.; Gastli, A.; Ben-Brahim, L.; et al. Real-time fault detection and identification for MMC using 1-489

D Convolutional Neural Networks. IEEE Transactions on Industrial Electronics 2019, Volume 66, No.11, 490 pp.8760-8771. (10.1109/TIE.2018.2833045) 491

27. Qu, X.; Duan, B.; Yin, Q.; et al. Deep convolution neural network based fault detection and identification 492 for modular multilevel converters. 2018 IEEE Power & Energy Society General Meeting (PESGM), Portland, 493 or USA, August, 2018; Publisher: IEEE; pp.1-5. 494

28. Wang, J.; Zheng, X.; Tai, N. DC fault detection and classification approach of MMC-HVDC based on 495 convolutional neural network. 2018 2nd IEEE Conf. on Energy Internet and Energy System Integration; 496 Publisher: IEEE; pp.1-6. (10.1109/EI2.2018.8582391) 497

29. Jack L.B; Nandi A. K. Fault detection using support vector machines and artificial neural networks, 498 augmented by genetic algorithms. Mech. Syst. Signal Process 2002, Volume 16, (2–3), pp. 373–390. 499 (10.1006/mssp.2001.1454) 500

30. McCormick, A. C; Nandi, A. K. Real-time classification of rotating shaft loading conditions using artificial 501 neural networks. IEEE Trans. Neural Netw1997, Volume 8, (3), pp. 748–757. (10.1109/72.572110) 502

31. Guo, H.; Jack, L.B.; Nandi, A. K. Feature generation using genetic programming with application to fault 503 classification. IEEE Trans. Syst., Man, Cybern, Part B(Cybernetics) 2005, Volume 35, (1), pp. 89–99. 504 (10.1109/TSMCB.2004.841426) 505

32. Ahmed, H. O. A.; Wong, M. L. D.; Nandi, A. K. Intelligent condition monitoring method for bearing faults 506 from highly compressed measurements using sparse over-complete features. Mech. Syst. Signal Process 507 2018, Volume 99, pp. 459–477. (10.1016/j.ymssp.2017.06.027) 508

33. Ahmed, H.; Nandi, A. K. Compressive Sampling and Feature Ranking Framework for Bearing Fault 509 Classification With Vibration Signals. IEEE Access 2018, Volume 6, pp. 44731-44746. 510 (10.1109/ACCESS.2018.2865116) 511

34. Yang, S.; Tang, Y.; Wang, P. Seamless fault-tolerant operation of a modular multilevel converter with switch 512 open-circuit fault diagnosis in a distributed control architecture. IEEE Transactions on Power Electronics 2018, 513 Volume 33, (8), pp.7058-7070. (10.1109/TPEL.2017.2756849) 514

35. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, Volume 521, (7553), pp. 436–444. 515 (10.1038/nature14539) 516

36. Murphy, K. P. Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge, Massachusetts, 517 2012. 518

© 2020 by the authors. Submitted for possible open access publication under the terms

and conditions of the Creative Commons Attribution (CC BY) license

(http://creativecommons.org/licenses/by/4.0/).

519