Research Article
Multi-Information Flow CNN and Attribute-Aided Reranking for Person Reidentification
Haifeng Sang,1 Chuanzheng Wang,1 Dakuo He,2 and Qing Liu2
1School of Information Science and Engineering, Shenyang University of Technology, Shenyang, Liaoning 110870, China
2College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110004, China
Correspondence should be addressed to Chuanzheng Wang; qing_0304@outlook.com
Received 7 November 2018; Accepted 8 January 2019; Published 6 February 2019
Academic Editor: Elio Masciari
Copyright © 2019 Haifeng Sang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper presents a multi-information flow convolutional neural network (MiF-CNN) model for person reidentification (re-id). It contains several specific multilayer convolutional structures in which the input and output of a convolutional layer are concatenated along the channel dimension. With this design, the model can go deeper, and feature maps can be reused by each subsequent layer. Inspired by image captioning, a person attribute recognition network is proposed based on a long short-term memory network and an attention mechanism. By fusing the identification results of MiF-CNN and attribute recognition, this paper introduces an attribute-aided reranking algorithm to further improve the accuracy of person re-id. Experiments on the VIPeR, CUHK01, and Market1501 datasets verify that the proposed MiF-CNN can be trained sufficiently on small-scale datasets and obtains outstanding person re-id accuracy. Comparative experiments also confirm the effectiveness of the attribute-aided reranking algorithm.
1. Introduction
Person reidentification (re-id) refers to matching and recognizing the identities of pedestrians captured by multiple cameras with nonoverlapping views, which is significant for improving the efficiency of security systems. Owing to the low resolution of the cameras, it is hard to obtain discriminative face features, so current person re-id methods are mainly based on visual features of pedestrians such as color and texture [1]. In practice, changes in viewpoint, pose, and illumination among different camera views, as well as partial occlusions and background clutter, pose a great challenge to person re-id [2].
Two principal approaches to person re-id are feature representation and metric learning. Feature representation seeks features with stronger discrimination and better robustness to represent pedestrians. Many kinds of features have been used for this purpose, among which appearance features are the simplest and most popular. Color, texture, and shape features can be extracted from human appearance [3], for example, HSV color histograms, LBP texture, and Gabor features, and then used to reidentify people according to the similarity among pedestrian features. Attribute features are also widely used in person re-id. Common attributes include gender, length of hair, and clothing. These attributes are highly intuitive and understandable descriptors, which have proved successful in several tasks such as face recognition and activity recognition [4]. Although attribute features are complicated to extract and express, they contain rich semantic information and are more robust to illumination and viewpoint changes. Therefore, the combination of attribute features and low-level features can effectively improve the accuracy of person re-id [5]. Metric learning methods employ machine learning algorithms to learn a good similarity metric, which makes the feature similarity of the same pedestrian greater than that of different pedestrians.
In recent years, deep learning has shown great success in a variety of tasks in image classification and the frequency domain [6], where CNNs are particularly outstanding. Compared with traditional methods, a CNN has stronger feature learning ability, and the learned features are more intrinsically representative of the original data, so it performs better at extracting image features. Two types of CNN models are commonly employed in the community. The first type is the classification model, as used in image classification, and the second is the Siamese model, which uses image pairs or triplets as input [7]. Most existing public person re-id datasets contain only thousands of pedestrian image samples; such a small number of training samples can easily lead to overfitting, which limits the performance of the person re-id model. In addition, the deep neural networks used for person re-id are similar in structure; that is, the feature maps extracted by a convolutional layer are fed directly into the next convolutional layer [8–12]. Such a structure usually ignores the correlation among the features of each layer, thus reducing the mobility of feature information to some extent. During back propagation, as the number of layers in the neural network grows, the gradient update information may attenuate exponentially and cause the vanishing gradient problem.
Hindawi, Computational Intelligence and Neuroscience, Volume 2019, Article ID 7028107, 12 pages, https://doi.org/10.1155/2019/7028107
This work develops a modified deep neural network model for person re-id that reduces the overfitting caused by the lack of training samples. Moreover, it aims to improve the identification accuracy of the person re-id network with the assistance of pedestrian attribute recognition. To this end, the contribution of this paper is threefold. First, this paper designs a multi-information flow convolutional neural network (MiF-CNN) to solve the person re-id problem. The network contains a series of multi-information flow convolutional structures, which connect the input and output of each convolutional layer together, realize the reuse of features, and enhance the feature information flow and gradient back propagation of the entire network. Second, this paper designs a person attribute recognition network (PARN) based on a long short-term memory (LSTM) network and an attention mechanism. The PARN decodes the pedestrian visual features extracted by MiF-CNN into attribute features and outputs the attribute words of each person. Third, this paper presents an attribute-aided reranking algorithm, which rematches attribute features among samples to help more positive samples rank higher in the rank list, so as to further improve the identification accuracy.
The rest of this paper is organized as follows. Section 2 reviews the state of the art in person re-id. Section 3 introduces the details of MiF-CNN. Section 4 presents the principle of the PARN. The proposed attribute-aided reranking algorithm is detailed in Section 5. The experimental results and analysis are given in Section 6. Finally, the conclusion and future work are discussed in Section 7.
2. Related Works
Early person re-id methods extract manually designed features to represent pedestrians. Farenzena et al. divided pedestrian images into multiple areas and extracted three complementary kinds of features: weighted color histograms, maximally stable color regions, and recurrent high-structured patches. These features were then matched to measure the similarity between a pedestrian pair [13]. Yang et al. proposed a novel salient color names based color descriptor (SCNCD), which guarantees that a higher probability is assigned to the color name nearer to the color. Based on SCNCD, color distributions in different color spaces were fused into the feature representation of a person [14]. Bazzani et al. proposed the asymmetry-based HPE descriptor, which accumulated the HSV histograms of multiple pedestrian images as a global appearance feature and detected patches portraying highly informative recurrent ingredients in local regions as local features [15]. Wu et al. designed a novel gradient self-similarity (GSS) feature based on HOG to capture the patterns of pairwise similarities of local gradient patches. The combination of HOG and GSS improved person re-id accuracy [16].
Apart from manually designed low-level features, attribute features that represent mid-level semantic information apply to person re-id as well. Compared with low-level descriptors, attributes are more robust to image translations [7]. Layne et al. labeled 15 binary attributes for the VIPeR dataset and trained SVMs to detect attributes. They also learned a weighted L2-norm distance metric to fix each attribute and fused them with low-level visual features [17]. Wang et al. predicted a complete attribute vector by exploiting both visual features and marked attributes and obtained the overall ranking list by fusing the ranking results obtained separately from visual features and attribute vectors [18]. Chen et al. learned person attributes with a part-specific CNN and merged them with another identification CNN embedded in a triplet structure for the person re-id task [19]. Wang et al. proposed a deep neural network containing an autoencoder model to learn hidden attributes of a person from visual features in an unsupervised manner, which alleviated the requirement for massive annotation [20].
Deep learning has become popular for solving person re-id problems in recent years. Ahmed et al. presented a deep CNN architecture for person re-id. The architecture computed differences in feature values across the two views around a neighborhood of each feature location to add robustness to positional differences in corresponding features of the two input images [12]. Cheng et al. proposed a novel multichannel CNN. After the first layer of the CNN, features were divided into four equal parts that aimed to learn features for the respective body parts. The proposed CNN was trained with an improved triplet loss function [21]. Lin et al. used ResNet [22] as the base network to learn low-level features and attributes jointly and trained the network by combining the person re-id loss and the attribute prediction loss [23]. Yan et al. proposed an attention block that learned part-level attention on different local regions and integrated the proposed block into existing CNN structures for training with the identification loss [24]. Inspired by the above works, this paper proposes a multi-information flow convolutional neural network to extract discriminative pedestrian features. In addition, this paper designs a person attribute recognition network based on LSTM and an attention mechanism to improve person re-id results with the assistance of attribute recognition.
3. Multi-Information Flow Convolutional Neural Network
The proposed MiF-CNN treats person re-id as a classification problem. The overall network structure is shown in Figure 1. The structure includes 2 shallow convolutional layers, 3 multi-information flow convolutional structures with a novel connection pattern, fully connected layers, max pooling layers, and classification output layers. Low-level features of pedestrian images are first extracted by the 2 shallow convolutional layers. The deeper multi-information flow convolutional structures then extract higher-level features. The final discriminative pedestrian feature vectors are obtained after the pooling layers reduce the dimensions and the fully connected layers integrate the features.
3.1. Features Extraction. In MiF-CNN, all convolutional filters are 3 × 3 with stride 1. Batch normalization and the ReLU activation function are applied after each convolutional layer. The operation of a convolutional layer can be formulated as
\[
\begin{cases}
z_j^{(l)} = \sum_i x_i^{(l-1)} \otimes w_j^{(l)}, \\
x_j^{(l)} = \sigma\left(z_j^{(l)}\right),
\end{cases}
\quad l > 1, \tag{1}
\]
where $x_i^{(l-1)}$ is the $i$-th feature map from the $(l-1)$-th convolutional layer, $z_j^{(l)}$ is the convolutional output computed from $x_i^{(l-1)}$, $x_j^{(l)}$ is the $j$-th feature map from the $l$-th convolutional layer, $w_j^{(l)}$ is the filter for the $j$-th feature map in the $l$-th convolutional layer, and $\otimes$ represents the convolution operation. To extract features, each neuron on the $j$-th feature map of the $l$-th convolutional layer convolves the incoming feature maps with the filter $w_j^{(l)}$, sums the results, and maps the extracted features onto the $j$-th feature map of the $l$-th convolutional layer. $\sigma(\cdot)$ is the ReLU activation function, which is formulated as $\sigma(x) = \max(0, x)$. Because batch normalization is applied, the bias is omitted.
3.2. Multi-Information Flow Convolutional Structure. In this structure, both the output and the input of the current convolutional layer are concatenated and fed into the next convolutional layer; i.e., the input of each layer is the concatenation of the outputs from all previous layers. The details of the multi-information flow convolutional structures are shown in Figure 2.
This connection pattern allows the feature maps of each layer to be reused by all subsequent layers during forward propagation, which lets the whole CNN model learn more feature information from pedestrian images. It can be regarded as a special kind of "data augmentation" on feature maps that enhances the information mobility of the network. During back propagation, the gradient of the input of each layer contains the derivative of the loss function with respect to that input, which makes gradient propagation more effective and the network easier to train. In a multi-information flow convolutional structure, the number of feature maps that each layer outputs is a constant value $\rho$, so the number of feature maps that the $l$-th layer receives as input is $\rho_0 + \rho(l-1)$, where $\rho_0$ is the number of feature maps in the initial layer. Supposing the feature map of the $i$-th channel in the initial layer is $x_i^{(0)}$, where $i \in (1, \rho_0)$, the feature map of the $j$-th channel that the initial layer outputs can be expressed as
\[
z_j^{(1)} = \sum_i x_i^{(0)} \otimes w_j^{(1)}, \tag{2}
\]
where $w_j^{(1)}$ is the weight of the initial layer. The output after the activation function is
\[
a_j^{(1)} = \sigma\left(z_j^{(1)}\right), \tag{3}
\]
where $j \in (1, \rho)$. The feature map that the $l$-th layer outputs can be expressed as
\[
\begin{cases}
z_q^{(l)} = \sum_p x_p^{(l-1)} \otimes w_q^{(l)}, \\
a_q^{(l)} = \sigma\left(z_q^{(l)}\right),
\end{cases}
\quad l > 0, \tag{4}
\]
where $p \in (1, \rho_0 + \rho(l-1))$ and $q \in (1, \rho)$. The output of the $l$-th layer after the concatenate operation is
\[
x_r^{(l)} = \left[x_p^{(l-1)}, a_q^{(l)}\right], \quad l > 0, \tag{5}
\]
where $r \in (1, \rho_0 + \rho \cdot l)$ and $[\cdot, \cdot]$ represents the concatenate operation on the channel dimension.
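The channel bookkeeping of equations (2) to (5) can be illustrated with a minimal sketch. It assumes that each feature map is reduced to one value at a single spatial position, so the 3 × 3 convolution becomes a weighted sum over channels; the function name `mif_block`, the random filters, and the sizes are hypothetical and are not taken from the paper.

```python
import random

def relu(v):
    return max(0.0, v)

def mif_block(x, num_layers, rho, rng):
    """Forward pass through one multi-information flow structure.

    x holds one value per channel at a single spatial position, so the 3x3
    convolution is reduced to a weighted sum over channels (an assumption
    made only to keep the example short).
    """
    for _ in range(num_layers):
        in_channels = len(x)  # rho0 + rho * (l - 1) for the l-th layer
        # Hypothetical random filters: one weight vector per output channel q.
        w = [[rng.uniform(-1.0, 1.0) for _ in range(in_channels)]
             for _ in range(rho)]
        z = [sum(wq[p] * x[p] for p in range(in_channels)) for wq in w]  # eq. (4)
        a = [relu(zq) for zq in z]
        x = x + a  # eq. (5): concatenate input and output on the channel axis
    return x

rng = random.Random(0)
rho0, rho, num_layers = 4, 2, 3
x0 = [rng.uniform(0.0, 1.0) for _ in range(rho0)]
out = mif_block(x0, num_layers, rho, rng)
print(len(out))  # rho0 + rho * num_layers = 10 channels
```

The first `rho0` entries of `out` are the untouched input channels, which is the feature reuse that the concatenation provides.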
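In the process of back propagation, suppose $\Delta x_r^{(l)}$ is the derivative of the loss function with respect to $x_r^{(l)}$. Because $x_r^{(l)}$ contains $x_p^{(l-1)}$ and $a_q^{(l)}$, it produces two parts of the gradient, as shown below:
\[
\begin{cases}
\Delta a_q^{(l)} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial a_q^{(l)}}, \\[2ex]
\Delta x_p^{(l-1)} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial x_p^{(l-1)}},
\end{cases} \tag{6}
\]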
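where $\Delta a_q^{(l)}$ is the gradient of the output of the $l$-th layer after the activation function and $\Delta x_p^{(l-1)}$ is the gradient of the output of the $(l-1)$-th layer. The gradient of the weights in the $l$-th layer is
\[
\begin{cases}
\Delta z_q^{(l)} = \Delta a_q^{(l)} \cdot \dfrac{\partial a_q^{(l)}}{\partial z_q^{(l)}} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial a_q^{(l)}} \cdot \sigma'\left(z_q^{(l)}\right), \\[2ex]
\Delta w_q^{(l)} = \Delta z_q^{(l)} \cdot \dfrac{\partial z_q^{(l)}}{\partial w_q^{(l)}} = \Delta z_q^{(l)} \cdot x_p^{(l-1)},
\end{cases} \tag{7}
\]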
where $\Delta z_q^{(l)}$ is the gradient of the convolution result in the $l$-th layer, $\sigma'(z_q^{(l)})$ is the derivative of the activation function with respect to $z_q^{(l)}$, and $\Delta w_q^{(l)}$ is the gradient of the weights in the $l$-th layer. The network uses $\Delta w_q^{(l)}$ to update the weights of each layer, which is formulated as
\[
\left(w_q^{(l)}\right)_{\mathrm{new}} = w_q^{(l)} - \eta \cdot \Delta w_q^{(l)}, \tag{8}
\]
where $\eta$ is the learning rate. The gradient continues to propagate back to the $(l-1)$-th layer. The gradient of $x_p^{(l-1)}$ is
\[
\Delta x_p^{(l-1)} = \Delta z_q^{(l)} \cdot \dfrac{\partial z_q^{(l)}}{\partial x_p^{(l-1)}} = \Delta z_q^{(l)} \cdot w_q^{(l)}. \tag{9}
\]
As shown in equations (6) and (9), the loss function produces two gradient flows with respect to the output of each convolutional layer, which makes error information propagate more effectively through the network and restrains the vanishing gradient to a certain extent.
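The two gradient flows can be checked numerically on a toy case. The sketch below assumes a single input channel, a single output channel, a scalar weight in place of the convolution, and a linear loss whose coefficients play the role of the upstream gradient; all names and values are hypothetical.

```python
def relu(v):
    return max(0.0, v)

def forward(x0, w):
    z = w * x0          # eq. (4) with one input and one output channel
    a = relu(z)
    return [x0, a]      # eq. (5): concatenation keeps x0 next to a

def loss(x0, w, upstream):
    x1 = forward(x0, w)
    return upstream[0] * x1[0] + upstream[1] * x1[1]

x0, w = 0.5, 2.0
upstream = [0.3, -0.7]  # hypothetical gradient of the loss w.r.t. the concatenated output

z = w * x0
d_relu = 1.0 if z > 0 else 0.0
grad_direct = upstream[0]                      # eq. (6): identity path through the concatenation
grad_through_conv = upstream[1] * d_relu * w   # eqs. (7) and (9): path through the convolution
grad_x0 = grad_direct + grad_through_conv

# Central finite difference as an independent check of the analytic gradient.
eps = 1e-6
numeric = (loss(x0 + eps, w, upstream) - loss(x0 - eps, w, upstream)) / (2 * eps)
print(grad_x0, numeric)
```

Even if the convolutional path contributed nothing (for example, when the ReLU is inactive), the direct term would still carry the gradient to the earlier layer.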
The proposed MiF-CNN includes three multi-information flow convolutional structures. Between every two multi-information flow convolutional structures, a middle pooling layer is applied to compress redundant features. The hyperparameter $\rho$ is set to a small value, and it is doubled at each new multi-information flow convolutional structure. This design makes each convolutional layer learn a small number of features and reduces redundant features, so as to optimize the efficiency of the network. With deeper layers, the network can learn more high-level and complex pedestrian features and improve the final identification accuracy.
3.3. Loss Function. Current deep learning algorithms usually use cross-entropy loss as the cost function, which is formulated as
\[
L_S = -\frac{1}{M} \sum_{i=1}^{M} y^{(i)} \log \frac{e^{\theta_{y^{(i)}}^{T} x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}}, \tag{10}
\]
where $y^{(i)}$ is the ground truth of the pedestrian categories in the training set, $\theta$ is the parameter of the last fully connected layer, $x^{(i)}$ is the feature vector of the training samples, $k$ is the number of pedestrian categories in the training set, and $M$ is the batch size.
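Equation (10) can be written out as a short function. This is a sketch that assumes the ground truth is one-hot, so it is passed as a class index and only the log-probability of the true class contributes; the function name and toy values are hypothetical.

```python
import math

def softmax_cross_entropy(features, labels, theta):
    """Mean cross-entropy of eq. (10) over a batch.

    features: M feature vectors x^(i); labels: M class indices y^(i);
    theta: k weight vectors of the last fully connected layer (no bias).
    """
    total = 0.0
    for x, y in zip(features, labels):
        logits = [sum(t * v for t, v in zip(theta_j, x)) for theta_j in theta]
        m = max(logits)  # subtract the maximum for numerical stability
        log_sum = m + math.log(sum(math.exp(s - m) for s in logits))
        total += -(logits[y] - log_sum)
    return total / len(features)

features = [[1.0, 2.0], [0.5, -1.0]]
labels = [0, 2]
theta = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # all-zero weights: every class has probability 1/3
loss = softmax_cross_entropy(features, labels, theta)
print(loss)
```

With all-zero weights the loss equals log 3, the value for a uniform guess over three categories, which is a convenient sanity check.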
However, in practice, when only cross-entropy loss is used and the extracted features are not good enough, the intraclass distance can become greater than the interclass distance. To address this problem, Wen et al. proposed center loss in 2016 [25]. Combining cross-entropy loss with center loss can enhance the discrimination and generalization ability of the network. Center loss is defined as follows:
\[
L_C = \frac{1}{2M} \sum_{i=1}^{M} \left\| x_i - c_j \right\|_2^2, \tag{11}
\]
where $c_j$ is the center of the $j$-th pedestrian feature and $x_i$ is the feature vector of a pedestrian. Center loss minimizes the distance between each feature and its center in order to reduce the intraclass distance. The center $c_j$ is updated with equation (12):
\[
\begin{cases}
\Delta c_j^{t} = \dfrac{\sum_{i=1}^{M} \delta(y_i = j) \cdot \left(c_j^{t} - x_i\right)}{1 + \sum_{i=1}^{M} \delta(y_i = j)}, \\[2ex]
c_j^{t+1} = c_j^{t} - \beta \cdot \Delta c_j^{t},
\end{cases} \tag{12}
\]
where $\beta$ is the update rate and $\delta(y_i = j)$ is 1 if the prediction equals the ground truth and 0 otherwise. That is to say, the center is updated only when the network predicts correctly.
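Equations (11) and (12) can be sketched as follows. The sketch assumes that the indicator selects the samples in the batch whose label is j, as in the center loss of Wen et al. [25]; restricting the update to correctly predicted samples, as the text describes, would add one more filter on `members`. Function names and toy values are hypothetical.

```python
def center_loss(features, labels, centers):
    """Eq. (11): half the mean squared distance between features and their class centers."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xd - cd) ** 2 for xd, cd in zip(x, centers[y]))
    return total / (2 * len(features))

def update_centers(features, labels, centers, beta):
    """Eq. (12): move each center toward the features of its class in the batch."""
    new_centers = []
    for j, c in enumerate(centers):
        # Assumption: delta selects samples labeled j (see the lead-in).
        members = [x for x, y in zip(features, labels) if y == j]
        delta = [sum(cd - x[d] for x in members) / (1 + len(members))
                 for d, cd in enumerate(c)]
        new_centers.append([cd - beta * dd for cd, dd in zip(c, delta)])
    return new_centers

features = [[1.0, 0.0], [3.0, 0.0]]
labels = [0, 0]
centers = [[0.0, 0.0], [5.0, 5.0]]
loss_c = center_loss(features, labels, centers)
new_centers = update_centers(features, labels, centers, beta=0.5)
print(loss_c, new_centers)
```

The center of class 0 moves toward its two samples, while the center of class 1, which has no sample in the batch, stays where it is because the numerator of equation (12) is zero.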
Figure 1: Architecture for MiF-CNN, where the green rectangle denotes the multi-information flow convolutional structures, the orange rectangle denotes the middle pooling layer, and k is the number of pedestrian categories in the training set. Diagram labels: input image, Conv 3 × 3 (two layers), MiF, P, MiF, P, MiF, Pooling 2 × 2, FC 1024, Softmax k.
Figure 2: Multi-information flow convolutional structures, where "&" represents the concatenate operation on the channel dimension. Diagram labels: the input has ρ0 channels, each Conv 3 × 3 layer outputs ρ channels, and the concatenated width is ρ0 + ρ after the 1st layer, ρ0 + 2ρ after the 2nd layer, and ρ0 + l · ρ after the l-th layer.
4. Person Attributes Recognition Network
The rank list of MiF-CNN is shown in Figure 3. In the incorrect rank 1 identification results (pedestrian B and pedestrian C), the top-ranking negative samples differ greatly from the query image in attribute features, including gender, clothing, and whether a handbag is carried. Hence, this paper recognizes person attributes to improve the accuracy of person re-id.
Based on the encoder-decoder idea, Xu et al. [26] proposed a neural network model that can learn and generate descriptions of image content. The model used a CNN as the encoder to extract image features, which were then fed into a recurrent neural network (RNN) to be decoded into language captions of the images. Inspired by that work, this paper presents a person attribute recognition network (PARN) with LSTM and an attention mechanism. The proposed PARN takes the pedestrian features extracted by MiF-CNN as input and outputs the attribute information of pedestrian images. The architecture of PARN is shown in Figure 4.
4.1. Input of PARN. In PARN, the input is the feature maps before the last fully connected layer in MiF-CNN. The input feature is split into $n$ feature vectors, each of which corresponds to a part of the image. Each feature vector is a $D_X$-dimensional vector, which is represented as
\[
X = \{x_1, x_2, \ldots, x_n\}, \quad x_i \in \mathbb{R}^{D_X}. \tag{13}
\]
Following natural language processing practice, PARN transforms the words in person attribute labels into word embeddings. As a part of the input of PARN, each word embedding is a $D_Y$-dimensional vector:
\[
y = \{y_1, y_2, \ldots, y_m\}, \quad y_i \in \mathbb{R}^{D_Y}, \tag{14}
\]
where $m$ is the number of attribute words in each pedestrian image and $y_i$ is the word embedding corresponding to each attribute word.
For attribute words, the common one-hot encoding treats each word as an isolated symbol, which ignores the correlation among words. A word embedding, in contrast, represents each word as a continuous dense vector, which places correlated words closer together in the embedding space.
4.2. Attention Mechanism. The attention mechanism has been widely used in natural language processing and computer vision. By measuring the correlation between the output and different parts of the input, the attention mechanism gives different weights to different parts of the input, enabling the network to use the more important feature information for prediction and to reduce the dimension of the input data [27].
In practice, a given pedestrian attribute corresponds only to a certain part of the image. For example, when recognizing whether a pedestrian is wearing a hat, people usually pay attention only to the area above the head of the pedestrian instead of areas irrelevant to the attribute. Therefore, before the image features are fed into the LSTM network, an attention mechanism is introduced to calculate the correlation between different positions of the image features and the hidden state of the LSTM at the previous time step. The schematic diagram of the attention mechanism is shown in Figure 5.
At time $t$, a fully connected layer and the tanh function are used to integrate the information of the input feature vector and the hidden state of the LSTM at the previous time step, which is formulated as
\[
v_i = \tanh\left(W_{attX}\, x_i + W_{atth}\, h_{t-1}\right), \tag{15}
\]
where $W_{attX}$ and $W_{atth}$ represent the fully connected layer weights of $x_i$ and $h_{t-1}$, respectively. The attention weights $\alpha_i$ are obtained by applying the softmax function to the score vector $v_i$:
\[
\alpha_i = \mathrm{softmax}\left(W_{attv}\, v_i\right), \tag{16}
\]
where $W_{attv}$ represents the fully connected layer weights of $v_i$. The output $Z$ is the sum of all $x_i$ weighted by $\alpha_i$:
\[
Z = \sum_{i=1}^{n} \alpha_i x_i. \tag{17}
\]
The feature $Z$ highlights the local features that are helpful for prediction and suppresses the local features that contribute little to it. This reduces the dimension of the features to some extent and makes the LSTM network focus, while recognizing person attributes, on the part of the input features that is more strongly correlated with the prediction, which improves the efficiency and prediction accuracy of the network.
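Equations (15) to (17) can be sketched in a few lines. The shapes are assumptions: the two weight matrices map the feature vector and the hidden state to a common size, a weight vector `w_v` maps each $v_i$ to one scalar score, and the softmax is taken over the $n$ image parts. All names and values are hypothetical.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def attention(X, h_prev, W_x, W_h, w_v):
    """Return the attention weights alpha and the attended feature Z.

    X: n feature vectors x_i; h_prev: previous LSTM hidden state.
    W_x, W_h: weight matrices of eq. (15); w_v: weight vector of eq. (16)
    that turns each v_i into a scalar score (an assumption about shapes).
    """
    h_term = matvec(W_h, h_prev)
    scores = []
    for x in X:
        v = [math.tanh(a + b) for a, b in zip(matvec(W_x, x), h_term)]  # eq. (15)
        scores.append(sum(w * vi for w, vi in zip(w_v, v)))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    norm = sum(exps)
    alpha = [e / norm for e in exps]  # eq. (16): softmax over the n parts
    Z = [sum(a * x[d] for a, x in zip(alpha, X)) for d in range(len(X[0]))]  # eq. (17)
    return alpha, Z

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_prev = [0.5]
alpha, Z = attention(X, h_prev, W_x=[[1.0, -1.0]], W_h=[[0.2]], w_v=[2.0])
print(alpha, Z)
```

Note that $Z$ has the dimension of a single part feature $x_i$ rather than of the whole set $X$, which is the dimension reduction the text refers to.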
4.3. LSTM. A normal RNN updates its network parameters with back propagation, which easily suffers from the vanishing gradient when there is a long gap between the relevant information and the current position to predict. The LSTM network solves this problem well because of its structure. The architecture of the LSTM unit is shown in Figure 6.
Here, $c$ is the cell state for storing and transferring information; $f$ is the forget gate, which decides what should be discarded from the cell state; $i$ is the input gate, which decides what should be stored in the cell state; $g$ represents a vector of new candidate values that may be added to the cell state; $o$ is the output gate, which decides what parts of the cell state should be output to the next time step; $h$ is the hidden state of the LSTM; $x$ is the input of the LSTM; and $t$ denotes the current time.
In PARN, at any time $t$, the input of the LSTM consists of two parts: the word embedding $y_{t-1}$ at the previous time step and the image features $Z_t$ at the current time step. The operation of the LSTM in PARN is formulated as
\[
\begin{cases}
f = \mathrm{sigmoid}\left(W_f \left[h_{t-1}, y_{t-1}, Z_t\right] + b_f\right), \\
i = \mathrm{sigmoid}\left(W_i \left[h_{t-1}, y_{t-1}, Z_t\right] + b_i\right), \\
g = \tanh\left(W_g \left[h_{t-1}, y_{t-1}, Z_t\right] + b_g\right), \\
o = \mathrm{sigmoid}\left(W_o \left[h_{t-1}, y_{t-1}, Z_t\right] + b_o\right), \\
c_t = f \odot c_{t-1} + i \odot g, \\
h_t = o \odot \tanh\left(c_t\right),
\end{cases} \tag{18}
\]
where $W$ and $b$ represent the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that decides how much of each value passes through the gates. The tanh function is the activation function of the input. This design lets the LSTM give different weights to information at different times, so that it can choose which part of the information to remember or forget in a long-term sequence.
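One LSTM step of equation (18) can be sketched as below. The sketch assumes that every gate receives the concatenation of the previous hidden state, the previous word embedding, and the attended image feature, and that the candidate vector g uses tanh, in line with the statement that tanh is the activation function of the input. The `params` layout and the toy sizes are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, v):
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(h_prev, c_prev, y_prev, Z_t, params):
    """One step of eq. (18); params maps a gate name to its (W, b) pair."""
    u = h_prev + y_prev + Z_t  # concatenation [h_{t-1}, y_{t-1}, Z_t]
    f = [sigmoid(v) for v in affine(*params["f"], u)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], u)]    # input gate
    g = [math.tanh(v) for v in affine(*params["g"], u)]  # candidate values
    o = [sigmoid(v) for v in affine(*params["o"], u)]    # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

hidden, embed, feat = 2, 1, 2
zero = ([[0.0] * (hidden + embed + feat) for _ in range(hidden)], [0.0] * hidden)
params = {name: zero for name in ("f", "i", "g", "o")}
h, c = lstm_step([0.0, 0.0], [1.0, -2.0], [0.3], [0.1, 0.2], params)
print(h, c)
```

With all-zero weights every sigmoid gate equals 0.5 and the candidate is 0, so the cell state is simply halved; this makes the role of the forget gate easy to see.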
The number of time steps of the LSTM in PARN is set to the number of attributes of each pedestrian image, $m$. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and the softmax function to predict pedestrian attribute words.
It is worth noting that, at each time step, the input word embedding of the LSTM is the one that the LSTM learned at the previous time step, which takes advantage of the LSTM in solving long-term sequence problems. Because the attributes of a pedestrian are correlated (for example, a pedestrian with the attribute "male" generally also has the attributes "short hair" and "pants"), the LSTM can use the previous attribute results to predict the current attribute more accurately.
Figure 3: Identification results of MiF-CNN for queries A, B, and C with rank lists 1 to 10 (green bounding box represents positive samples in gallery).
Figure 4: Architecture of PARN.
4.4. Network Optimization. The loss function of PARN includes two parts. One is the cross-entropy between the prediction and the ground truth of the network, which is expressed as
\[
L_{sa} = \sum_{i=1}^{m} u' \log\left(\frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}}\right), \tag{19}
\]
where $u$ is the prediction value of the network, $u'$ is the ground truth of the pedestrian labels, $m$ is the number of LSTM time steps, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the loss accumulated over $m$ iterations. The other part is the loss function of the attention mechanism, as described in [40]:
\[
L_{\alpha} = \frac{1}{2n} \sum_{i=1}^{n} \left(1 - \sum_{j=1}^{m} \alpha_i^{j}\right)^{2}, \tag{20}
\]
where $\alpha_i^{j}$ represents the attention weight of the image feature $x_i$ at the $j$-th time step and $n$ is the number of parts in the image feature. The objective function that PARN optimizes is
\[
J = -\frac{1}{M} \sum_{i=1}^{M} \left(L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i}\right), \tag{21}
\]
where $M$ is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
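The attention term of equation (20) can be sketched as follows, assuming the attention weights are stored as one list per time step, so that `alpha[j][i]` is the weight of image part i at time step j. The function name and the toy weights are hypothetical.

```python
def attention_penalty(alpha):
    """Eq. (20): penalize image parts whose attention, summed over time, differs from 1.

    alpha[j][i] is the attention weight of part i at time step j
    (m time steps, n parts).
    """
    m, n = len(alpha), len(alpha[0])
    total = 0.0
    for i in range(n):
        attended = sum(alpha[j][i] for j in range(m))
        total += (1.0 - attended) ** 2
    return total / (2 * n)

balanced = [[1.0, 0.0], [0.0, 1.0]]   # each part is attended once
one_sided = [[1.0, 0.0], [1.0, 0.0]]  # part 0 is attended twice, part 1 never
print(attention_penalty(balanced), attention_penalty(one_sided))
```

The penalty is zero when every part receives a total attention of 1 over the m time steps and grows when some parts are ignored, which encourages the network to look at the whole image over the course of the attribute sequence.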
5. Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ including $N$ pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed with the Euclidean distance as
\[
d(q_0, g_i) = \sqrt{\sum \left(x_{q_0} - x_{g_i}\right)^{2}}, \tag{22}
\]
where $x_{q_0}$ and $x_{g_i}$ represent the features of $q_0$ and $g_i$, respectively, extracted by MiF-CNN, and the sum runs over the feature dimensions. The initial rank list $R_0 = \{g_1^{0\prime}, g_2^{0\prime}, \ldots, g_N^{0\prime}\}$ is obtained by sorting the similarity distances $d(q_0, g_i)$ in ascending order, where $d(q_0, g_i^{0\prime}) < d(q_0, g_{i+1}^{0\prime})$. If the top-1 gallery image $g_1^{0\prime}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top-5 gallery images, rank 5 is correct.
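The construction of the initial rank list can be sketched as below, assuming the features are plain lists of numbers and the Euclidean distance is taken over the feature dimensions. The function names and the toy gallery are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def initial_rank_list(query_feat, gallery_feats):
    """Return gallery indices sorted by ascending distance to the query (R_0)."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])

gallery = [[0.0, 3.0], [1.0, 1.0], [4.0, 4.0]]
rank_list = initial_rank_list([1.0, 0.0], gallery)
print(rank_list)  # [1, 0, 2]
```

Here gallery image 1 is closest to the query (distance 1), so rank 1 would be correct if image 1 shows the same person as the query.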
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute feature similarity between $q_0$ and $g_i$, so that more positive gallery images get a higher ranking in $R_0$, which can improve the performance of person re-id. Specifically, when rank 5 is correct but rank 1 is not, the proposed algorithm assigns an initial feature score $s_f$ to the top-5 gallery images $g_1^{0\prime} \sim g_5^{0\prime}$ according to their ranking in $R_0$: the higher the ranking, the higher $s_f$. Then the algorithm assigns an attribute score $s_a$ to $g_1^{0\prime} \sim g_5^{0\prime}$ according to the number of their attributes that match those of $q_0$: the more matching attributes, the higher $s_a$. The total score of each gallery image is $s = s_f(1 - c) + s_a c$, where $c$ is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0\prime} \sim g_5^{0\prime}$ in descending order of their total score $s$. For $M$ query images, the whole procedure of the attribute-aided reranking algorithm is shown in Algorithm 1.
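The reranking step for one query can be sketched as follows. This is not a reproduction of Algorithm 1: the text only states that both scores grow with rank and with the number of matching attributes, so the two normalizations below (a linear rank score and the fraction of matching attributes) are assumptions, as are the function name and the toy attribute lists.

```python
def rerank_top_k(rank_list, query_attrs, gallery_attrs, c=0.5, k=5):
    """Rerank the first k entries of rank_list by s = s_f * (1 - c) + s_a * c.

    s_f and s_a are hypothetical normalizations: s_f falls linearly with the
    initial rank, and s_a is the fraction of attributes matching the query.
    """
    top = rank_list[:k]
    scored = []
    for pos, g in enumerate(top):
        s_f = (len(top) - pos) / len(top)
        matches = sum(1 for a, b in zip(query_attrs, gallery_attrs[g]) if a == b)
        s_a = matches / len(query_attrs)
        scored.append((s_f * (1 - c) + s_a * c, -pos, g))
    scored.sort(reverse=True)  # higher score first; ties keep the initial order
    return [g for _, _, g in scored] + rank_list[k:]

query_attrs = ["male", "short hair", "backpack"]
gallery_attrs = {
    0: ["male", "short hair", "backpack"],
    1: ["female", "long hair", "no backpack"],
    2: ["female", "long hair", "backpack"],
}
reranked = rerank_top_k([2, 0, 1], query_attrs, gallery_attrs, c=0.5)
print(reranked)  # [0, 2, 1]
```

In this toy case gallery image 0 matches all three attributes of the query and overtakes image 2, which led the initial list but matches only one attribute. Setting `c=0` leaves the initial order unchanged.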
6. Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets and the effectiveness of the proposed attribute-aided reranking algorithm in improving the accuracy of person re-id. An analysis of the experimental results is given after comparing our results with state-of-the-art person re-id methods.
6.1. Datasets and Evaluation Protocol. This paper evaluates the proposed methods on three challenging public person re-id datasets: VIPeR [28], CUHK01 [29], and Market1501 [30].
Figure 5: Schematic diagram of attention mechanism.
Figure 6: Architecture of the LSTM unit.
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the small number of image samples, the low resolution, and the large changes in pose, view, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: the images of 316 persons for training and the images of the remaining 316 persons for testing.
CUHK01 Dataset. It consists of 3884 images of 971 persons, captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.
Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.
Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identifications [31]. For the multishot dataset Market1501, mean Average Precision (mAP) is used in addition to CMC to evaluate the performance of the proposed methods.
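The two evaluation measures can be sketched as follows, assuming each query comes with a ranked list of gallery identifiers and a set of identifiers that are correct matches, and that all correct matches appear in the ranked list. The function names and the toy rank lists are hypothetical; mAP is the mean of `average_precision` over all queries.

```python
def cmc(rank_lists, positives, max_rank):
    """Fraction of queries whose first correct match is within the top r, for r = 1..max_rank."""
    hits = [0] * max_rank
    for ranks, pos in zip(rank_lists, positives):
        first = next((r for r, g in enumerate(ranks) if g in pos), None)
        if first is not None:
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(rank_lists) for h in hits]

def average_precision(ranks, pos):
    """Mean of the precision values measured at the position of each correct match."""
    found, total = 0, 0.0
    for r, g in enumerate(ranks, start=1):
        if g in pos:
            found += 1
            total += found / r
    return total / found if found else 0.0

rank_lists = [[3, 1, 2], [2, 3, 1]]
positives = [{1}, {2}]
curve = cmc(rank_lists, positives, max_rank=3)
ap = average_precision([1, 5, 2, 6], {1, 2})
print(curve, ap)
```

In the toy example only the second query is matched at rank 1, so the CMC curve starts at 0.5 and reaches 1.0 at rank 2. CMC looks only at the first correct match, whereas average precision rewards ranking all correct matches early, which is why mAP is added for the multishot dataset.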
6.2. Experimental Results on Attribute Recognition. The attributes in PARN are pedestrian identity-level attributes. For the VIPeR dataset, the attributes include "gender," "length of hair," "lower clothing," "upper clothing," "backpack or not," and "carrying anything or not." For the CUHK01 dataset, the attributes include "gender," "length of hair," "backpack or not," "handbag or not," "color of upper," and "color of lower." For the Market1501 dataset, the attributes include "gender," "length of hair," "wearing hat or not," "lower clothing," "backpack or not," "handbag or not," "length of sleeve," and "length of lower clothing." Each person in each dataset has an attribute word list, as shown in Figure 7.
This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the outstanding person attribute recognition method APR [23]. The experimental results are reported in Table 1, where "L.slv" represents the "length of sleeve" and "L.low" represents the "length of lower clothing."
It can be observed from Table 1 that PARN obtains high recognition accuracy on the 8 attributes. In particular, the accuracy on "length of hair" and "handbag or not" and the mean accuracy are higher than those of APR, whereas the accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the strong attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3. Experimental Results on Person Re-id. This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.
As shown in Table 2, the proposed MiF-CNN model performs strongly among state-of-the-art methods. Moreover, the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps to increase person re-id accuracy. The improvement is especially obvious on the VIPeR and CUHK01 datasets, where the identification accuracy at rank 1, rank 5, and rank 10 improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. The proposed MiF-CNN model with the attribute-aided reranking algorithm achieves the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset among the compared methods.
6.4. Analysis of Experimental Results. Because of the small number of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. In the M3TCP method, the structure and hyperparameters of the deep CNN model need to be adjusted manually to adapt to the scale of different datasets so as to overcome the training problem of the deep neural network.
In contrast, the proposed MiF-CNN has a fixed structure and can be trained on smaller datasets directly without any fine-tuning. In spite of its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, at rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% at rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.
Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that the attribute-aided reranking method moves positive samples that are ranked lower in the rank list of MiF-CNN to higher positions, thereby improving the accuracy of person re-id. This verifies the effectiveness of the attribute-aided reranking algorithm for improving the performance of a person re-id model.
7 Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes for further improving the accuracy of person re-id. The proposed MiF-CNN model realizes the reuse of feature maps and gradient information, which enhances the feature mobility of the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step in order to reduce the dimension of features. It also employs LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list so as to improve the identification accuracy further.
The experimental results on three public person re-id datasets indicate the outstanding performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, more improvement and optimization will be done on the pedestrian attribute recognition network. Moreover, the caption of person images or videos
Input: Initial rank list $R_0$
Output: The reranked rank list $R^*$
(1) for $i = 1, 2, \ldots, M$ do
(2)   if rank 1 correct
(3)     $R^* = R_0$
(4)   end if
(5)   elif rank 5 correct
(6)     Distribute the initial feature score $s_f$ to the top-5 gallery samples $g_{i'1} \sim g_{i'5}$ according to their ranking in $R_0$
(7)     Distribute the attribute score $s_a$ to $g_{i'1} \sim g_{i'5}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f (1 - c) + s_a c$
(9)     Rerank $g_{i'1} \sim g_{i'5}$ in descending order of their total score $s$ and obtain $R^*$
(10)  end if
(11)  elif rank 10 correct
(12)    Change $g_{i'1} \sim g_{i'5}$ to $g_{i'1} \sim g_{i'10}$; repeat 6–10
(13)  end if
(14)  else
(15)    Change $g_{i'1} \sim g_{i'5}$ to $g_{i'1} \sim g_{i'M}$; repeat 6–10
(16)  end if
(17) end for
Algorithm 1: The attribute-aided reranking algorithm.
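Steps (6)–(9) of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the algorithm does not specify how $s_f$ and $s_a$ are scaled, so the sketch assumes a feature score that decreases linearly with the initial rank and an attribute score equal to the fraction of matching attributes. The function name, the argument names, and the default `c=0.5` are hypothetical. The choice of how many top entries to rerank (5, 10, or M) is left to the caller.

```python
def attribute_aided_rerank(ranked_ids, gallery_attrs, query_attrs, top_k=5, c=0.5):
    """Rerank the first top_k entries of an initial rank list R0.

    ranked_ids    -- gallery ids ordered by initial similarity (R0)
    gallery_attrs -- dict: gallery id -> tuple of predicted attribute words
    query_attrs   -- tuple of attribute words predicted for the query
    c             -- fusion weight between feature and attribute scores
    """
    head, tail = ranked_ids[:top_k], ranked_ids[top_k:]
    n = len(head)
    scored = []
    for position, gid in enumerate(head):
        # Assumption: feature score decreases linearly with the initial rank.
        s_f = (n - position) / n
        # Assumption: attribute score is the fraction of attributes shared with the query.
        matches = sum(a == b for a, b in zip(gallery_attrs[gid], query_attrs))
        s_a = matches / len(query_attrs)
        s = s_f * (1 - c) + s_a * c  # step (8)
        # -position keeps the initial order when two total scores tie.
        scored.append((s, -position, gid))
    scored.sort(reverse=True)  # step (9): descending total score
    return [gid for _, _, gid in scored] + tail


query = ("male", "short hair", "backpack")
gallery = {
    "g1": ("female", "long hair", "no backpack"),
    "g2": ("male", "short hair", "backpack"),
    "g3": ("male", "long hair", "no backpack"),
    "g4": ("male", "short hair", "backpack"),
}
reranked = attribute_aided_rerank(["g1", "g2", "g3", "g4"], gallery, query, top_k=3)
print(reranked)
```

In this toy example the sample whose attributes all match the query moves from second to first place, while the entry outside the reranked window keeps its position.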
Figure 7: Attribute labels of pedestrians in the datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.
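Table 1: Recognition accuracy of different attributes in PARN and APR (%).

| Method | Gender | Hair | Clothes | Hat | Backpack | Handbag | Lslv | Llow | Mean |
|---|---|---|---|---|---|---|---|---|---|
| APR | 85.78 | 83.46 | 91.36 | 88.21 | 86.32 | 76.01 | 94.12 | 92.64 | 87.23 |
| PARN | 84.93 | 79.10 | 89.35 | 96.67 | 83.86 | 85.18 | 92.84 | 88.45 | 87.55 |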
can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).

| Methods | VIPeR rank 1 | VIPeR rank 5 | VIPeR rank 10 | CUHK01 rank 1 | CUHK01 rank 5 | CUHK01 rank 10 | Market1501 rank 1 | Market1501 rank 5 | Market1501 rank 10 | Market1501 mAP |
|---|---|---|---|---|---|---|---|---|---|---|
| DNS [32] | 42.28 | 71.46 | 82.94 | 64.98 | 84.96 | 89.92 | 67.96 | - | - | 41.89 |
| DLDA [33] | 44.11 | 72.59 | 81.66 | 67.12 | 89.45 | 91.68 | 48.15 | - | - | 29.94 |
| FT-JSTL+DGD [11] | 38.6 | - | - | 66.60 | - | - | - | - | - | - |
| PDC [34] | 51.27 | 74.05 | 84.18 | - | - | - | 84.14 | 92.73 | 94.92 | 63.41 |
| K-means-CNN [35] | 46.50 | 69.30 | 80.70 | 53.50 | 82.50 | 91.20 | - | - | - | - |
| Spindle [9] | 53.80 | 74.10 | 83.20 | - | - | - | 76.90 | 91.50 | 94.60 | - |
| PersonNet [36] | - | - | - | 71.14 | 90.07 | 95.00 | 37.21 | - | - | 18.57 |
| CSBT [37] | 36.60 | 66.20 | 88.30 | 51.20 | 76.30 | 91.80 | 42.90 | - | - | 20.30 |
| SDH-CNN [38] | - | - | - | - | - | - | 58.12 | 68.50 | 80.82 | 48.20 |
| M3TCP [21] | - | - | - | 53.70 | 84.30 | 91.00 | - | - | - | - |
| DM3 [39] | 42.70 | 74.30 | 85.10 | 49.70 | 77.30 | 86.10 | 75.80 | 89.10 | 92.40 | 53.20 |
| MiF | 40.82 | 68.35 | 81.01 | 66.87 | 85.39 | 89.51 | 78.92 | 86.46 | 89.28 | 66.00 |
| MiF + PARN | 46.84 | 74.68 | 83.23 | 71.81 | 90.33 | 91.77 | 79.39 | 88.95 | 91.36 | 66.46 |

"MiF" represents the MiF-CNN method, and "MiF + PARN" represents the MiF-CNN with the attribute-aided reranking method.
Figure 8: Rank list comparison between MiF and MiF + PARN. (For each query, the figure shows the top-10 rank lists of MiF and MiF + PARN.)
References
[1] L. Li and J. Guo, "Pedestrian re-identification based on reinforced deep feature fusion," Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, "Cross-view semantic projection learning for person re-identification," Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu, et al., "Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method," Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, "Deep multi-task network for learning person identity and attributes," IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, "A survey of person re-identification," Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao, et al., "Deep adaptive feature embedding with local sample distributions for person re-identification," Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Zheng, Y. Yang, and A. G. Hauptmann, "Person re-identification: past, present and future," 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, "A deep model with combined losses for person re-identification," Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun, et al., "Spindle net: person re-identification with human body region guided feature decomposition and fusion," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, "Beyond triplet loss: a deep quadruplet network for person re-identification," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang, et al., "Learning deep feature representations with domain guided dropout for person re-identification," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, "An improved deep learning architecture for person re-identification," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina, et al., "Person re-identification by symmetry-driven accumulation of local features," in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan, et al., "Salient color names for person re-identification," in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Bazzani, M. Cristani, A. Perina, et al., "Multiple-shot person re-identification by chromatic and epitomic analyses," Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Wu, R. Laganiere, and P. Payeur, "Improving pedestrian detection with selective gradient self-similarity feature," Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong, et al., "Person re-identification by attributes," in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 8, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, "Multi-level fusion for person re-identification with incomplete marks," in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, October 2015.
[19] Y. Chen, S. Duffner, A. Stoian, et al., "Deep and low-level feature based attribute learning for person re-identification," Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Wang, X. Bai, M. Ye, and S. Satoh, "Incremental deep hidden attribute learning," in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou, et al., "Person re-identification by multi-channel parts-based CNN with improved triplet loss function," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren, et al., "Deep residual learning for image recognition," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng, et al., "Improving person re-identification by attribute and identity learning," 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, "Multi-level attention model for person re-identification," Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li, et al., "A discriminative feature learning approach for deep face recognition," in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros, et al., "Show, attend and tell: neural image caption generation with visual attention," in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, "Fine-grained attention mechanism for neural machine translation," Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, "Viewpoint invariant pedestrian recognition with an ensemble of localized features," in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, "Human reidentification with transferred metric learning," in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian, et al., "Scalable person re-identification: a benchmark," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao, et al., "DeepReID: deep filter pairing neural network for person re-identification," in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, "Learning a discriminative null space for person re-identification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, "Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification," Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang, et al., "Pose-driven deep convolutional model for person re-identification," in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, "Person re-identification via integrating patch-based metric learning and local salience learning," Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, "PersonNet: person re-identification with deep convolutional neural networks," 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin, et al., "Fast person re-identification via cross-camera semantic binary transformation," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge, et al., "Structured deep hashing with convolutional neural networks for fast person re-identification," Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Wang, R. Hu, C. Chen, Y. Yu, et al., "Person reidentification via discrepancy matrix and matrix metric," IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," 2014, http://arxiv.org/abs/1409.0473.
intrinsically representative of the original data, so it has better performance in extracting image features. Two types of CNN models are commonly employed in the community. The first type is the classification model, as used in image classification, and the second is the Siamese model, which uses image pairs or triplets as input [7]. Most of the existing public datasets for person re-id contain only thousands of pedestrian image samples; such a small number of training samples can easily lead to overfitting, which limits the performance of the person re-id model. In addition, the deep neural networks for person re-id are similar in structure; that is, the feature maps extracted by a convolutional layer are directly fed into the next convolutional layer [8–12]. Such a structure usually ignores the correlation among the features of each layer, thus reducing the mobility of feature information to some extent. In the process of back propagation, as the number of layers in the neural network increases, the gradient update information may attenuate exponentially and cause the vanishing gradient problem.
This work proposes a modified deep neural network model for person re-id that can reduce the overfitting caused by the lack of training samples. Moreover, this work aims to improve the identification accuracy of the person re-id network with the assistance of pedestrian attribute recognition. To this end, the contribution of this paper is threefold. First, this paper designs a multi-information flow convolutional neural network (MiF-CNN) to solve the person re-id problem. The network contains a series of multi-information flow convolutional structures, which connect the input and output of each convolutional layer together, realize the reuse of features, and enhance the feature information flow and gradient back propagation of the entire network. Second, this paper designs a person attribute recognition network (PARN) based on the long short-term memory (LSTM) network and the attention mechanism. The PARN decodes pedestrian visual features extracted by MiF-CNN into attribute features and outputs the attribute words of each person. Third, this paper presents an attribute-aided reranking algorithm, which rematches attribute features among samples to help more positive samples rank higher in the rank list so as to improve the identification accuracy further.
The rest of this paper is organized as follows. Section 2 reviews the state of the art for person re-id. Section 3 introduces the details of MiF-CNN. Section 4 shows the principle of the PARN. The proposed attribute-aided reranking algorithm is detailed in Section 5. The experimental results and analysis are given in Section 6. Finally, the conclusion and future works are discussed in Section 7.
2 Related Works
The early person re-id methods extract manually designed features to represent pedestrians. Farenzena et al. divided pedestrian images into multiple areas and extracted three complementary kinds of features: weighted color histograms, maximally stable color regions, and recurrent high-structured patches. These features were then matched to measure the similarity between a pedestrian pair [13]. Yang et al. proposed a novel salient color names based color descriptor (SCNCD), which was utilized to guarantee that a higher probability is assigned to the color name nearer to the color. Based on SCNCD, color distributions in different color spaces were fused into the feature representation of a person [14]. Bazzani et al. proposed an asymmetry-based HPE descriptor, which accumulated the HSV histograms of multiple pedestrian images as a global appearance feature and detected patches portraying highly informative recurrent ingredients in local regions as local features [15]. Wu et al. designed a novel gradient self-similarity (GSS) feature based on HOG to capture the patterns of pairwise similarities of local gradient patches. The combination of HOG and GSS improved person re-id accuracy [16].
Apart from manually designed low-level features, attribute features that represent mid-level semantic information apply to person re-id as well. Compared with low-level descriptors, attributes are more robust to image translations [7]. Layne et al. labeled 15 binary attributes for the VIPeR dataset and trained SVMs to detect the attributes. They also learned a weighted L2-norm distance metric to fix each attribute and fused the attributes with low-level visual features [17]. Wang et al. predicted a complete attribute vector by exploiting both visual features and marked attributes and obtained the overall ranking list by fusing the rank results from visual features and attribute vectors separately [18]. Chen et al. learned person attributes with a part-specific CNN and merged them with another identification CNN embedded in a triplet structure for the person re-id task [19]. Wang et al. proposed a deep neural network containing an auto-encoder model to learn hidden attributes of a person from visual features in an unsupervised manner, which alleviated the requirement of massive annotation [20].
Deep learning has become popular for solving person re-id problems in recent years. Ahmed et al. presented a deep CNN architecture for person re-id. The architecture computed differences in feature values across the two views around a neighborhood of each feature location to add robustness to positional differences in corresponding features of the two input images [12]. Cheng et al. proposed a novel multichannel CNN. After the first layer of the CNN, features were divided into four equal parts, each of which aimed to learn features for the respective body part. The proposed CNN was trained with an improved triplet loss function [21]. Lin et al. used ResNet [22] as the base network to learn low-level features and attributes jointly and trained the network by combining the person re-id loss and the attribute prediction loss [23]. Yan et al. proposed an attention block that learned part-level attention on different local regions and integrated the proposed block into existing CNN structures for training with the identification loss [24]. Inspired by the above works, this paper proposes a multi-information flow convolutional neural network to extract discriminative pedestrian features. In addition, this paper designs a person attribute recognition network based on LSTM and the attention mechanism for improving person re-id results with the assistance of attribute recognition.
3 Multi-Information Flow Convolutional Neural Network
The proposed MiF-CNN treats person re-id as a classification problem. The overall network structure is shown in Figure 1. The structure includes 2 shallow convolutional layers, 3 multi-information flow convolutional structures with a novel connection pattern, fully connected layers, max pooling layers, and classification output layers. Low-level features of pedestrian images are first extracted by the 2 shallow convolutional layers. The deeper multi-information flow convolutional structures then extract higher level features. The final discriminative pedestrian feature vectors are obtained after the pooling layers reduce the dimensions and the fully connected layers integrate the features.
3.1. Features Extraction. In MiF-CNN, all convolutional filters are 3 × 3 with stride 1. Batch normalization and the ReLU activation function are applied after each convolutional layer. The operation of a convolutional layer can be formulated as

$$z_j^{(l)} = \sum_i x_i^{(l-1)} \otimes w_j^{(l)}, \qquad x_j^{(l)} = \sigma\left(z_j^{(l)}\right), \qquad l > 1, \tag{1}$$

where $x_i^{(l-1)}$ is the $i$-th feature map from the $(l-1)$-th convolutional layer, $z_j^{(l)}$ is the convolutional output of $x_i^{(l-1)}$, $x_j^{(l)}$ is the $j$-th feature map from the $l$-th convolutional layer, $w_j^{(l)}$ is the filter on the $j$-th feature map in the $l$-th convolutional layer, and $\otimes$ represents the convolutional operation. To extract features, the neurons of the $j$-th feature map in the $l$-th convolutional layer convolve each input feature map with the filter $w_j^{(l)}$, sum the results, and map the extracted features onto the $j$-th feature map of the $l$-th convolutional layer. $\sigma(\cdot)$ is the ReLU activation function, which is formulated as $\sigma(x) = \max(0, x)$. Because of batch normalization, the bias is ignored.
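Equation (1) can be illustrated with a dependency-free sketch. The following is a toy illustration, not the network used in the experiments: it assumes no padding (so each 3 × 3 filter shrinks the map by 2 in each direction), omits batch normalization, and uses the sliding-window product common in deep learning frameworks. The function name `conv_layer` and the nested-list data layout are hypothetical.

```python
def conv_layer(inputs, filters):
    """Apply 3 x 3, stride-1 filters followed by ReLU, as in equation (1).

    inputs  -- list of C_in feature maps, each an H x W list of lists
    filters -- list of C_out filters; each filter holds one 3 x 3 kernel
               per input feature map
    Returns C_out feature maps of size (H - 2) x (W - 2) (no padding assumed).
    """
    height, width = len(inputs[0]), len(inputs[0][0])
    outputs = []
    for kernels in filters:  # one output map j per filter w_j
        fmap = []
        for r in range(height - 2):
            row = []
            for c in range(width - 2):
                z = 0.0
                for x, k in zip(inputs, kernels):  # sum over input maps i
                    for dr in range(3):
                        for dc in range(3):
                            z += x[r + dr][c + dc] * k[dr][dc]
                row.append(max(0.0, z))  # ReLU: sigma(z) = max(0, z)
            fmap.append(row)
        outputs.append(fmap)
    return outputs


ones = [[1.0] * 3 for _ in range(3)]
twos = [[2.0] * 3 for _ in range(3)]
k_pos = [[1.0] * 3 for _ in range(3)]
k_neg = [[-1.0] * 3 for _ in range(3)]
print(conv_layer([ones, twos], [[k_pos, k_pos], [k_neg, k_neg]]))
```

The second filter produces a negative pre-activation, so ReLU maps it to zero.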
3.2. Multi-Information Flow Convolutional Structure. In this structure, both the output and the input of the current convolutional layer are concatenated together and fed into the next convolutional layer; i.e., the input of each layer is the concatenation of the outputs of all previous layers. The details of the multi-information flow convolutional structures are shown in Figure 2.
This connection pattern allows the feature maps of each layer to be reused by all subsequent layers in the forward propagation process, which lets the whole CNN model learn more feature information from pedestrian images. It can be considered a special "data augmentation" on feature maps that enhances the information mobility of the network. In the back propagation process, the gradient of the input of each layer contains the derivative of the loss function with respect to that input, which makes gradient propagation more effective and the network easier to train. In the multi-information flow convolutional structure, the number of feature maps that each layer outputs is a constant value $\rho$, so the $l$-th layer receives $\rho_0 + \rho(l-1)$ feature maps as input, where $\rho_0$ is the number of feature maps in the initial layer. Supposing the feature map of the $i$-th channel in the initial layer is $x_i^{(0)}$, where $i \in (1, \rho_0)$, the feature map of the $j$-th channel that the initial layer outputs can be expressed as

$$z_j^{(1)} = \sum_i x_i^{(0)} \otimes w_j^{(1)}, \tag{2}$$

where $w_j^{(1)}$ is the weight of the initial layer. The output after the activation function is

$$a_j^{(1)} = \sigma\left(z_j^{(1)}\right), \tag{3}$$

where $j \in (1, \rho)$. The feature map that the $l$-th layer outputs can be expressed as

$$z_q^{(l)} = \sum_p x_p^{(l-1)} \otimes w_q^{(l)}, \qquad a_q^{(l)} = \sigma\left(z_q^{(l)}\right), \qquad l > 0, \tag{4}$$

where $p \in (1, \rho_0 + \rho(l-1))$ and $q \in (1, \rho)$. The output of the $l$-th layer after the concatenate operation is

$$x_r^{(l)} = \left[x_p^{(l-1)}, a_q^{(l)}\right], \qquad l > 0, \tag{5}$$

where $r \in (1, \rho_0 + \rho \cdot l)$ and $[\cdot, \cdot]$ represents the concatenate operation on the channel dimension.
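The channel bookkeeping implied by equations (4) and (5) can be checked with a short sketch. The function names and the example values $\rho_0 = 16$ and $\rho = 12$ are hypothetical and chosen only for illustration; the paper does not state them here.

```python
def concat_channels(x_prev, a_new):
    """Concatenate two lists of feature maps on the channel dimension, as in equation (5)."""
    return x_prev + a_new


def mif_channel_counts(rho0, rho, num_layers):
    """Return (input channels, channels after concatenation) for layers 1..num_layers."""
    counts = []
    channels = rho0
    for _ in range(num_layers):
        # The l-th layer reads rho0 + rho * (l - 1) maps and emits rho new maps,
        # so the concatenated output holds rho0 + rho * l maps.
        counts.append((channels, channels + rho))
        channels += rho
    return counts


# Hypothetical sizes: 16 initial feature maps and 12 new maps per layer.
print(mif_channel_counts(16, 12, 3))

# Feature maps are stood in for by labels to show the concatenation order.
x1 = concat_channels(["x0_a", "x0_b"], ["a1"])
x2 = concat_channels(x1, ["a2"])
print(x2)
```

Because every layer adds only $\rho$ maps, the channel count grows linearly with depth rather than multiplicatively.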
In the process of back propagation, suppose $\Delta x_r^{(l)}$ is the derivative of the loss function with respect to $x_r^{(l)}$. Because $x_r^{(l)}$ contains $x_p^{(l-1)}$ and $a_q^{(l)}$, it produces two parts of the gradient, as shown below:

$$\Delta a_q^{(l)} = \Delta x_r^{(l)} \cdot \frac{\partial x_r^{(l)}}{\partial a_q^{(l)}}, \qquad \Delta x_p^{(l-1)} = \Delta x_r^{(l)} \cdot \frac{\partial x_r^{(l)}}{\partial x_p^{(l-1)}}, \tag{6}$$
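where $\Delta a_q^{(l)}$ is the gradient of the output from the $l$-th layer after the activation function and $\Delta x_p^{(l-1)}$ is the gradient of the output from the $(l-1)$-th layer. The gradient of the weight in the $l$-th layer is

$$\Delta z_q^{(l)} = \Delta a_q^{(l)} \cdot \frac{\partial a_q^{(l)}}{\partial z_q^{(l)}} = \Delta x_r^{(l)} \cdot \frac{\partial x_r^{(l)}}{\partial a_q^{(l)}} \cdot \sigma'\left(z_q^{(l)}\right), \qquad \Delta w_q^{(l)} = \Delta z_q^{(l)} \cdot \frac{\partial z_q^{(l)}}{\partial w_q^{(l)}} = \Delta z_q^{(l)} \cdot x_p^{(l-1)}, \tag{7}$$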
where $\Delta z_q^{(l)}$ is the gradient of the convolution result in the $l$-th layer, $\sigma'\left(z_q^{(l)}\right)$ is the derivative of the activation function with respect to $z_q^{(l)}$, and $\Delta w_q^{(l)}$ is the gradient of the weights in the $l$-th layer. The network utilizes $\Delta w_q^{(l)}$ to update the weights of each layer, which is formulated as

$$\left(w_q^{(l)}\right)_{\text{new}} = w_q^{(l)} - \eta \cdot \Delta w_q^{(l)}, \tag{8}$$
where $\eta$ is the learning rate. The gradient keeps propagating back to the $(l-1)$-th layer. The gradient of $x_p^{(l-1)}$ is

$$\Delta x_p^{(l-1)} = \Delta z_q^{(l)} \cdot \frac{\partial z_q^{(l)}}{\partial x_p^{(l-1)}} = \Delta z_q^{(l)} \cdot w_q^{(l)}. \tag{9}$$
As shown in equations (6) and (9), the loss function produces two flows of gradient with respect to the output of each convolutional layer, which makes error information propagate more effectively through the network and restrains the vanishing gradient to a certain extent.
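The two gradient flows can be made concrete with a scalar toy model. This is a sketch under simplifying assumptions, not the paper's network: a single scalar input $x_0$, one scalar weight $w$ in place of a convolution, and a linear loss whose upstream gradients stand in for $\Delta x_r^{(l)}$. All names are illustrative. The analytic gradient adds the concatenation path of equation (6) to the convolution path of equations (7) and (9), and a central finite difference confirms it.

```python
def relu(z):
    return max(0.0, z)


def forward(x0, w):
    """One toy layer: a = relu(w * x0), then concatenate [x0, a]."""
    return [x0, relu(w * x0)]


def loss(x0, w, upstream):
    """Linear loss; `upstream` holds the gradients with respect to [x0, a]."""
    return sum(g * v for g, v in zip(upstream, forward(x0, w)))


def grad_x0(x0, w, upstream):
    g_skip, g_a = upstream
    dz = g_a * (1.0 if w * x0 > 0 else 0.0)  # equation (7): delta_a * relu'(z)
    conv_path = dz * w                       # equation (9): delta_z * w
    return g_skip + conv_path                # equation (6) adds the concatenation path


def numeric_grad_x0(x0, w, upstream, eps=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (loss(x0 + eps, w, upstream) - loss(x0 - eps, w, upstream)) / (2 * eps)


print(grad_x0(2.0, 0.5, (0.3, 0.7)), numeric_grad_x0(2.0, 0.5, (0.3, 0.7)))
# With a negative weight the ReLU is inactive, yet the concatenation path still carries gradient.
print(grad_x0(2.0, -0.5, (0.3, 0.7)))
```

The second call shows why the extra path helps: even when the convolution path contributes nothing, the gradient reaching the earlier layer does not vanish.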
The proposed MiF-CNN includes three multi-information flow convolutional structures. Between every two multi-information flow convolutional structures, a middle pooling layer is applied to compress redundant features. The hyperparameter $\rho$ is set to a small value and is doubled in each new multi-information flow convolutional structure. Such a design makes each convolutional layer learn a small quantity of features and reduces redundant features so as to optimize the efficiency of the network. With deeper layers, the network can learn more high-level and complex pedestrian features and improve the final identification accuracy.
3.3. Loss Function. Current deep learning algorithms usually use the cross-entropy loss as the cost function, which is formulated as

$$L_S = -\frac{1}{M} \sum_{i=1}^{M} y^{(i)} \log \frac{e^{\theta_{y^{(i)}}^{T} x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}}, \tag{10}$$
where $y^{(i)}$ is the ground truth of the pedestrian categories in the training set, $\theta$ is the parameter of the last fully connected layer, $x^{(i)}$ is the feature vector of a training sample, $k$ is the number of pedestrian categories in the training set, and $M$ is the batch size.
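A minimal sketch of equation (10) is shown below. It assumes that the logits $\theta_j^T x^{(i)}$ have already been computed and that each label is given as a class index, so the indicator $y^{(i)}$ simply selects the ground-truth class. The function name is illustrative, and the max-subtraction step is a standard numerical safeguard rather than part of the formula.

```python
import math


def softmax_cross_entropy(logits_batch, labels):
    """Mean cross-entropy over a batch, following equation (10).

    logits_batch[i][j] plays the role of theta_j^T x^(i);
    labels[i] is the ground-truth class index of sample i.
    """
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        m = max(logits)  # subtract the maximum for numerical stability
        log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_norm - logits[label]  # -log softmax(logits)[label]
    return total / len(labels)


# Two classes with equal logits: the loss equals log(2).
print(softmax_cross_entropy([[0.0, 0.0]], [0]))
```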
However, in practice, when only the cross-entropy loss is used and the quality of the extracted features is not good enough, the intraclass distance can become greater than the interclass distance. Aiming at this problem, Wen et al. proposed the center loss in 2016 [25]. The combination of the cross-entropy loss and the center loss can enhance the discrimination and generalization ability of the network. The center loss is defined as follows:

$$L_C = \frac{1}{2M} \sum_{i=1}^{M} \left\| x_i - c_j \right\|_2^2, \tag{11}$$
where $c_j$ is the center of the $j$-th pedestrian feature and $x_i$ is the feature vector of a pedestrian. The center loss minimizes the distance between a feature and its center in order to reduce the intraclass distance. The center $c_j$ is updated with equation (12):

$$\Delta c_j^t = \frac{\sum_{i=1}^{M} \delta(y_i, j) \cdot \left(c_j^t - x_i\right)}{1 + \sum_{i=1}^{M} \delta(y_i, j)}, \qquad c_j^{t+1} = c_j^t - \beta \cdot \Delta c_j^t, \tag{12}$$
where $\beta$ is the update rate and $\delta(y_i, j)$ is 1 if the prediction equals the ground truth and 0 otherwise. That is to say, a center is updated only when the network predicts correctly.
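Equations (11) and (12) can be sketched as follows. Two assumptions are made and should be read as such: each sample is compared with the center of its own class, and $\delta(y_i, j)$ is taken to be 1 when sample $i$ carries label $j$, which is the convention of the original center loss [25]. The correct-prediction condition described above is not modeled here. The function names and the value `beta=0.5` are illustrative.

```python
def center_loss(features, labels, centers):
    """Equation (11): half the mean squared distance between each feature and its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return total / (2 * len(features))


def update_centers(features, labels, centers, beta=0.5):
    """Equation (12), assuming delta(y_i, j) = 1 when sample i has label j."""
    new_centers = {}
    for j, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == j]
        delta = [
            sum(cd - x[d] for x in members) / (1 + len(members))
            for d, cd in enumerate(c)
        ]
        new_centers[j] = [cd - beta * dd for cd, dd in zip(c, delta)]
    return new_centers


features = [[1.0, 1.0], [3.0, 3.0]]
labels = [0, 0]
centers = {0: [0.0, 0.0], 1: [5.0, 5.0]}
print(center_loss(features, labels, centers))
print(update_centers(features, labels, centers))
```

The center of class 0 moves toward its two samples, while the center of class 1, which has no samples in the batch, stays where it is.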
[Figure 2 diagram: Input ($\rho_0$), Conv 3 × 3 ($\rho$), & ($\rho_0 + \rho$), Conv 3 × 3 ($\rho$), & ($\rho_0 + 2\rho$), ..., Conv 3 × 3 ($\rho$), & ($\rho_0 + l \cdot \rho$); 1st layer, 2nd layer, ..., $l$-th layer.]
Figure 2: Multi-information flow convolutional structures, where "&" represents the concatenate operation on the channel dimension.
[Figure 1 diagram blocks: input image, Conv 3 × 3 (two layers), MiF, P, MiF, P, MiF, Pooling 2 × 2, FC 1024, Softmax k.]
Figure 1: Architecture of MiF-CNN, where the green rectangle denotes the multi-information flow convolutional structures, the orange rectangle denotes the middle pooling layer, and k is the number of pedestrian categories in the training set.
4 Person Attributes Recognition Network
The rank list of MiF-CNN is shown in Figure 3. In the incorrect identification results at rank 1 (pedestrian B and pedestrian C), there is a big difference in attribute features between the top-ranking negative samples and the query image, including gender, clothing, and whether a handbag is carried. Hence, this paper recognizes person attributes to improve the accuracy of person re-id.
Based on the encoder-decoder idea, Xu et al. [26] proposed a neural network model that can learn and generate descriptions of image content. The model utilized a CNN as the encoder to extract image features, which were then fed into a recurrent neural network (RNN) for decoding into language captions of the images. Inspired by that, this paper presents a person attribute recognition network (PARN) with LSTM and an attention mechanism. The proposed PARN takes the pedestrian features extracted by MiF-CNN as input and outputs the attribute information of pedestrian images. The architecture of PARN is shown in Figure 4.
4.1. Input of PARN. In PARN, the input is the feature maps taken before the last fully connected layer in the MiF-CNN. The input feature is split into $n$ feature vectors, each of which corresponds to a part of the image. Each feature vector is a $D_X$-dimensional vector, which is represented as

$$X = \{x_1, x_2, \ldots, x_n\}, \qquad x_i \in \mathbb{R}^{D_X}. \tag{13}$$
Following natural language processing methods, PARN transforms the words in person attribute labels into word embeddings. As a part of the input of PARN, each word embedding is a $D_Y$-dimensional vector:

$$y = \{y_1, y_2, \ldots, y_m\}, \qquad y_i \in \mathbb{R}^{D_Y}, \tag{14}$$

where $m$ is the number of attribute words in each pedestrian image and $y_i$ is the word embedding corresponding to each attribute word.
For attribute words, the common one-hot encoding treats each word as an individual symbol, which ignores the correlation among words. Word embedding, however, represents each word as a continuous dense vector, which places correlated words closer together in space.
4.2. Attention Mechanism. The attention mechanism has been widely used in natural language processing and computer vision. By measuring the correlation between the output and different parts of the input, the attention mechanism gives different weights to different parts of the input, enabling the network to use more important feature information for prediction and to reduce the dimension of the input data [27].
In practice a certain attribute of pedestrian is onlycorresponding to a certain part of the image For examplewhen recognizing whether a pedestrian is wearing a hatpeople only pay attention to the area above the head of thepedestrian usually instead of other areas irrelevant to theattribute 0erefore before feeding the image features intothe LSTM network attention mechanism is introduced to
calculate the correlation between different positions ofimage features and the hidden state of LSTM at the previoustime 0e schematic diagram of attention mechanism isshown in Figure 5
At time $t$, a fully connected layer and the tanh function are used to integrate the information of the input feature vector and the hidden state of the LSTM at the previous time step, which is formulated as
$$v_i = \tanh\left(W_{attX} x_i + W_{atth} h_{t-1}\right), \tag{15}$$
where $W_{attX}$ and $W_{atth}$ are the fully connected layer weights of $x_i$ and $h_{t-1}$, respectively. The attention weight $\alpha_i$ is obtained by applying the softmax function to the score vector $v_i$:
$$\alpha_i = \mathrm{softmax}\left(W_{attv} v_i\right), \tag{16}$$
where $W_{attv}$ is the fully connected layer weight of $v_i$. The output $Z$ is the sum of all feature vectors $x_i$ weighted by $\alpha_i$:
$$Z = \sum_{i=1}^{n} \alpha_i x_i. \tag{17}$$
Feature $Z$ highlights the local features that are helpful for prediction and suppresses the local features that contribute little to it. This reduces the dimension of the features to some extent and makes the LSTM network focus, while recognizing person attributes, on the parts of the input features that are most correlated with the prediction, which improves the efficiency and prediction accuracy of the network.
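The attention step of equations (15)-(17) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that $W_{attv}$ maps each score vector $v_i$ to a scalar and that the softmax normalizes over the $n$ image parts; the matrix sizes and the helper names (`matvec`, `attend`, `rand_matrix`) are hypothetical.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(X, h_prev, W_attX, W_atth, w_attv):
    """Return the attention weights alpha and the attended feature Z."""
    h_proj = matvec(W_atth, h_prev)
    scores = []
    for x_i in X:
        x_proj = matvec(W_attX, x_i)
        v_i = [math.tanh(a + b) for a, b in zip(x_proj, h_proj)]  # eq. (15)
        # W_attv v_i, assumed here to reduce v_i to one scalar score per part
        scores.append(sum(w * v for w, v in zip(w_attv, v_i)))
    alpha = softmax(scores)  # eq. (16), normalized over the n parts (assumption)
    dim = len(X[0])
    Z = [sum(a * x_i[d] for a, x_i in zip(alpha, X)) for d in range(dim)]  # eq. (17)
    return alpha, Z


def rand_matrix(rows, cols):
    """Random weights, standing in for trained parameters."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (hypothetical): n = 4 parts, D_X = 3, hidden size 2, attention size 5.
random.seed(0)
X = rand_matrix(4, 3)
h_prev = [0.1, -0.2]
alpha, Z = attend(X, h_prev, rand_matrix(5, 3), rand_matrix(5, 2),
                  [random.uniform(-1, 1) for _ in range(5)])
```

Because the weights are positive and sum to one, $Z$ is a convex combination of the part features and has the dimension of a single part, which is the sense in which attention reduces the input dimension.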
4.3. LSTM. A plain RNN updates its network parameters with back propagation and easily suffers from vanishing gradients when there is a long gap between the relevant information and the current position to predict. The LSTM network handles this problem well because of its gated structure. The architecture of the LSTM unit is shown in Figure 6.
Here, $c$ is the cell state, which stores and transfers information; $f$ is the forget gate, which decides what should be discarded from the cell state; $i$ is the input gate, which decides what should be stored in the cell state; $g$ is a vector of new candidate values that may be added to the cell state; $o$ is the output gate, which decides which parts of the cell state are output to the next time step; $h$ is the hidden state of the LSTM; $x$ is the input of the LSTM; and $t$ denotes the current time step.
In PARN, at any time $t$, the input of the LSTM consists of two parts: the word embedding $y_{t-1}$ from the previous time step and the image feature $Z_t$ at the current time step. The operation of the LSTM in PARN is formulated as
$$\begin{cases}
f = \mathrm{sigmoid}\left(W_f [h_{t-1}, y_{t-1}, Z_t] + b_f\right), \\
i = \mathrm{sigmoid}\left(W_i [h_{t-1}, y_{t-1}, Z_t] + b_i\right), \\
g = \tanh\left(W_g [h_{t-1}, y_{t-1}, Z_t] + b_g\right), \\
o = \mathrm{sigmoid}\left(W_o [h_{t-1}, y_{t-1}, Z_t] + b_o\right), \\
c_t = f \odot c_{t-1} + i \odot g, \\
h_t = o \odot \tanh(c_t),
\end{cases} \tag{18}$$
Computational Intelligence and Neuroscience 5
where $W$ and $b$ denote the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that decides how much of each value passes through a gate. The tanh function is the activation function of the input. This design lets the LSTM weight information from different time steps differently, so that it can choose which information to remember or forget in a long sequence.
The number of time steps of the LSTM in PARN is set to $m$, the number of attributes of each pedestrian image. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and a softmax function to predict the pedestrian attribute word.
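A single LSTM step on the concatenated input $[h_{t-1}, y_{t-1}, Z_t]$ can be sketched as below. This is an illustrative sketch only: it assumes the standard LSTM formulation in which the candidate $g$ uses tanh, and the parameter dictionary `p`, the helper `affine`, and the toy sizes are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Compute W v + b for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(h_prev, c_prev, y_prev, Z_t, p):
    """One LSTM step; p holds hypothetical weights W* and biases b* per gate."""
    inp = h_prev + y_prev + Z_t  # list concatenation: [h_{t-1}, y_{t-1}, Z_t]
    f = [sigmoid(a) for a in affine(p["Wf"], p["bf"], inp)]    # forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], p["bi"], inp)]    # input gate
    g = [math.tanh(a) for a in affine(p["Wg"], p["bg"], inp)]  # candidate (tanh assumed)
    o = [sigmoid(a) for a in affine(p["Wo"], p["bo"], inp)]    # output gate
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


# With all-zero weights every gate equals sigmoid(0) = 0.5 and g = tanh(0) = 0,
# so the new cell state is half of the previous one.
hidden, d_y, d_z = 2, 3, 4
zeros = {name: [[0.0] * (hidden + d_y + d_z) for _ in range(hidden)]
         for name in ("Wf", "Wi", "Wg", "Wo")}
zeros.update({name: [0.0] * hidden for name in ("bf", "bi", "bg", "bo")})
h_t, c_t = lstm_step([0.0, 0.0], [1.0, -2.0], [0.0] * d_y, [0.0] * d_z, zeros)
```

In PARN this step would be repeated $m$ times, with the attended feature $Z_t$ recomputed from the previous hidden state before each step.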
It is worth noting that, at each time step, the input word embedding of the LSTM is the one that the LSTM produced at the previous time step, which exploits the strength of the LSTM on long-sequence problems. Because the attributes of a pedestrian are correlated (for example, a pedestrian with the attribute "male" generally also has the attributes "short hair" and "pants"), the LSTM can use the previous attribute result to predict the current attribute more accurately.

Figure 3: Identification results of MiF-CNN for queries A, B, and C, showing rank list positions 1 to 10 (green bounding box represents positive samples in gallery).

Figure 4: Architecture of PARN.
4.4. Network Optimization. The loss function of PARN consists of two parts. The first is the cross-entropy between the network prediction and the ground truth, which is expressed as
$$L_{sa} = \sum_{i=1}^{m} u' \log\left(\frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}}\right), \tag{19}$$
where $u$ is the prediction value of the network, $u'$ is the ground truth of the pedestrian labels, $m$ is the number of LSTM time steps, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the loss accumulated over $m$ iterations. The second part is the loss function of the attention mechanism, as described in [40]:
$$L_{\alpha} = \frac{1}{2n} \sum_{i=1}^{n} \left(1 - \sum_{j=1}^{m} \alpha_i^{j}\right)^{2}, \tag{20}$$
where $\alpha_i^{j}$ is the attention weight of the image feature $x_i$ at time step $j$ and $n$ is the number of parts in the image feature. The objective function that PARN optimizes is
$$J = -\frac{1}{M} \sum_{i=1}^{M} \left(L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i}\right), \tag{21}$$
where $M$ is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
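To make the attention term of equation (20) concrete, the following sketch computes it for a table of attention weights. It assumes that the inner sum runs over the $m$ time steps and the outer sum over the $n$ image parts; the function name `attention_penalty` and the toy inputs are hypothetical.

```python
def attention_penalty(alphas):
    """Attention loss of eq. (20); alphas[j][i] is the weight of part i at time step j."""
    m, n = len(alphas), len(alphas[0])
    total = 0.0
    for i in range(n):
        weight_sum = sum(alphas[j][i] for j in range(m))  # total attention paid to part i
        total += (1.0 - weight_sum) ** 2
    return total / (2 * n)


balanced = [[0.5, 0.5], [0.5, 0.5]]   # every part receives a total weight of 1
collapsed = [[1.0, 0.0], [1.0, 0.0]]  # all attention stays on part 0
print(attention_penalty(balanced), attention_penalty(collapsed))
```

The term is zero when every part receives a total weight of one over the sequence and grows when attention collapses onto a few parts, so it encourages the network to look at all parts of the image across the $m$ attribute predictions.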
5 Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ containing $N$ pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed with the Euclidean distance as
$$d(q_0, g_i) = \sqrt{\sum_{k} \left(x_{q_0,k} - x_{g_i,k}\right)^{2}}, \tag{22}$$
where $x_{q_0}$ and $x_{g_i}$ are the features of $q_0$ and $g_i$, respectively, extracted by MiF-CNN, and $k$ indexes the components of the feature vectors. The initial rank list $R_0 = \{g_1^{0\prime}, g_2^{0\prime}, \ldots, g_N^{0\prime}\}$ is obtained by sorting the similarity distances $d(q_0, g_i)$ in ascending order, so that $d(q_0, g_i^{0\prime}) < d(q_0, g_{i+1}^{0\prime})$. If the top-1 gallery image $g_1^{0\prime}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top 5, rank 5 is correct.
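The construction of the initial rank list can be sketched as follows, assuming that the features are plain lists of floats keyed by a gallery identifier. The names `euclidean` and `initial_rank_list` and the toy features are hypothetical stand-ins for MiF-CNN outputs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_rank_list(query_feat, gallery_feats):
    """Return gallery ids sorted by ascending distance to the query feature."""
    return sorted(gallery_feats, key=lambda gid: euclidean(query_feat, gallery_feats[gid]))


# Toy two-dimensional features (hypothetical values).
gallery = {"g1": [3.0, 4.0], "g2": [0.0, 0.0], "g3": [1.0, 1.0]}
R0 = initial_rank_list([0.0, 0.0], gallery)
```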
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute similarity between $q_0$ and $g_i$, so that more positive gallery images obtain a higher ranking, which can improve the performance of person re-id. Specifically, when rank 5 is correct but rank 1 is not, the proposed algorithm assigns an initial feature score $s_f$ to the top-5 gallery images $g_1^{0\prime} \sim g_5^{0\prime}$ according to their ranking in $R_0$: the higher the ranking, the higher $s_f$. The algorithm then assigns an attribute score $s_a$ to $g_1^{0\prime} \sim g_5^{0\prime}$ according to the number of their attributes that are the same as those of $q_0$: the more matching attributes, the higher $s_a$. The total score of each gallery image is $s = s_f(1 - c) + s_a c$, where $c$ is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0\prime} \sim g_5^{0\prime}$ in descending order of their total score $s$. For $M$ query images, the whole procedure of the attribute-aided reranking algorithm is given in Algorithm 1.
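The scoring step can be sketched as below. The text does not specify how $s_f$ and $s_a$ are assigned, so this sketch assumes a feature score that decreases linearly with rank and an attribute score equal to the fraction of matching attributes; both choices, the function name `rerank_top_k`, and the toy attribute lists are hypothetical. The window size `k` is passed in as a parameter, whereas Algorithm 1 selects it from the rank at which the positive sample appears.

```python
def rerank_top_k(ranked_ids, query_attrs, gallery_attrs, k=5, c=0.5):
    """Rerank the first k gallery ids by s = s_f * (1 - c) + s_a * c."""
    top = ranked_ids[:k]
    size = len(top)
    scored = []
    for pos, gid in enumerate(top):
        s_f = (size - pos) / size  # assumed: linear in rank, best rank scores 1.0
        matches = sum(1 for a, b in zip(query_attrs, gallery_attrs[gid]) if a == b)
        s_a = matches / len(query_attrs)  # assumed: fraction of matching attributes
        scored.append((s_f * (1 - c) + s_a * c, gid))
    # sorted() is stable, so galleries with equal scores keep their initial order.
    reranked = [gid for _, gid in sorted(scored, key=lambda item: -item[0])]
    return reranked + ranked_ids[size:]


query = ["male", "short hair", "pants", "backpack"]
attrs = {
    "g1": ["female", "long hair", "skirt", "no backpack"],
    "g2": ["male", "short hair", "pants", "backpack"],
    "g3": ["male", "long hair", "pants", "no backpack"],
    "g4": ["female", "short hair", "skirt", "no backpack"],
    "g5": ["female", "long hair", "skirt", "no backpack"],
}
R0 = ["g1", "g2", "g3", "g4", "g5", "g6"]
R_star = rerank_top_k(R0, query, attrs, k=5, c=0.5)
```

In this toy case the gallery image `g2`, which matches all four query attributes, moves from rank 2 to rank 1, while images outside the top-5 window keep their positions. Setting `c=0` reproduces the initial order.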
6 Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets, as well as the effectiveness of the proposed attribute-aided reranking algorithm for improving the accuracy of person re-id. An analysis of the experimental results is given after comparison with state-of-the-art person re-id methods.
6.1. Datasets and Evaluation Protocol. This paper evaluates the proposed methods on three challenging public person re-id datasets: VIPeR [28], CUHK01 [29], and Market1501 [30].

Figure 5: Schematic diagram of attention mechanism.

Figure 6: Architecture of the LSTM unit.
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the small number of image samples, the low resolution, and the large changes in pose, view, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: images of 316 persons for training and images of the remaining 316 persons for testing.

CUHK01 Dataset. It consists of 3884 images of 971 persons, which are captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.

Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images that are captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.

Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identifications [31]. For the multishot dataset Market1501, mean Average Precision (mAP) is used in addition to CMC to evaluate the performance of the proposed methods.
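The two metrics can be sketched as follows. This is a simplified illustration that assumes each query comes with a ranked list of gallery ids and a set of positive ids; it omits dataset-specific rules such as the handling of same-camera or distractor images in the Market1501 protocol, and the function names are hypothetical.

```python
def cmc(rank_lists, positives, ks=(1, 5, 10)):
    """Fraction of queries whose first k results contain at least one positive."""
    n = len(rank_lists)
    return {
        k: sum(any(g in pos for g in ranked[:k])
               for ranked, pos in zip(rank_lists, positives)) / n
        for k in ks
    }


def average_precision(ranked_ids, positive_ids):
    """Mean of the precision values measured at the rank of each positive."""
    hits, total = 0, 0.0
    for rank, gid in enumerate(ranked_ids, start=1):
        if gid in positive_ids:
            hits += 1
            total += hits / rank
    return total / len(positive_ids) if positive_ids else 0.0


def mean_average_precision(rank_lists, positives):
    """Average of the per-query average precision values."""
    aps = [average_precision(r, p) for r, p in zip(rank_lists, positives)]
    return sum(aps) / len(aps)


# Two toy queries: the first is correct at rank 1, the second only at rank 2.
lists = [["a", "b", "c"], ["b", "a", "c"]]
truth = [{"a"}, {"a"}]
scores = cmc(lists, truth, ks=(1, 2))
```

For single-shot datasets each query has one positive, so CMC alone describes the result; mAP additionally rewards ranking all positives of a query near the top, which matters for multishot galleries.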
6.2. Experimental Results on Attribute Recognition. The attributes in PARN refer to pedestrian identity-level attributes. For the VIPeR dataset, the attributes include "gender", "length of hair", "lower clothing", "upper clothing", "backpack or not", and "carrying anything or not". For the CUHK01 dataset, the attributes include "gender", "length of hair", "backpack or not", "handbag or not", "color of upper", and "color of lower". For the Market1501 dataset, the attributes are "gender", "length of hair", "wearing hat or not", "lower clothing", "backpack or not", "handbag or not", "length of sleeve", and "length of lower clothing". Each person in each dataset has an attribute word list, as shown in Figure 7.
This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the person attribute recognition method APR [23]. The experimental results are reported in Table 1, where "Lslv" denotes the "length of sleeve" and "Llow" denotes the "length of lower clothing".
It can be observed from Table 1 that PARN achieves competitive recognition accuracy on the 8 attributes. In particular, its accuracy on "wearing hat or not" and "handbag or not" and its mean accuracy are higher than those of APR, whereas its accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3. Experimental Results on Person Re-id. This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.
As shown in Table 2, the proposed MiF-CNN model performs well among state-of-the-art methods. In addition, the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps increase person re-id accuracy. The improvement is especially evident on the VIPeR and CUHK01 datasets, where the identification accuracy at rank 1, rank 5, and rank 10 improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. The proposed MiF-CNN model with the attribute-aided reranking algorithm obtains the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset among the compared methods.
6.4. Analysis of Experimental Results. Because of the small number of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. The structure and hyperparameters of the deep CNN model in the M3TCP method need to be adjusted manually to adapt to the scale of different datasets, so as to overcome the training problem of the deep neural network.
In contrast, the proposed MiF-CNN has a fixed structure and can be trained directly on smaller datasets without any fine-tuning. Despite its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.
Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that the attribute-aided reranking method enables a lower-ranked positive sample in the rank list of MiF-CNN to obtain a higher ranking, which improves the accuracy of person re-id and verifies the effectiveness of the attribute-aided reranking algorithm for improving the performance of the person re-id model.
7 Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes to further improve the accuracy of person re-id. The proposed MiF-CNN model reuses feature maps and gradient information, which enhances the flow of features through the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step in order to reduce the dimension of the features. It also employs an LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list and thus further improves identification accuracy.
The experimental results on three public person re-id datasets indicate the strong performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, the pedestrian attribute recognition network will be further improved and optimized. Moreover, captions of person images or videos could be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.

Algorithm 1: The attribute-aided reranking algorithm.
Input: Initial rank list $R_0$
Output: The reranked rank list $R^{*}$
(1) for $i = 1, 2, \ldots, M$ do
(2)   if rank 1 correct
(3)     $R^{*} = R_0$
(4)   end if
(5)   elif rank 5 correct
(6)     Distribute the initial feature score $s_f$ to the top-5 gallery images $g_1^{i\prime} \sim g_5^{i\prime}$ according to their ranking in $R_0$
(7)     Distribute the attribute score $s_a$ to $g_1^{i\prime} \sim g_5^{i\prime}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f(1 - c) + s_a c$
(9)     Rerank $g_1^{i\prime} \sim g_5^{i\prime}$ in descending order of their total score $s$ and obtain $R^{*}$
(10)  end if
(11)  elif rank 10 correct
(12)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_{10}^{i\prime}$ and repeat (6)–(10)
(13)  end if
(14)  else
(15)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_M^{i\prime}$ and repeat (6)–(10)
(16)  end if
(17) end for

Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.

Table 1: Recognition accuracy of different attributes in PARN and APR (%).

Method  Gender  Hair   Clothes  Hat    Backpack  Handbag  Lslv   Llow   Mean
APR     85.78   83.46  91.36    88.21  86.32     76.01    94.12  92.64  87.23
PARN    84.93   79.10  89.35    96.67  83.86     85.18    92.84  88.45  87.55
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).

                    VIPeR                   CUHK01                  Market1501
Methods             Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 mAP
DNS [32]            42.28   71.46   82.94   64.98   84.96   89.92   67.96   -       -       41.89
DLDA [33]           44.11   72.59   81.66   67.12   89.45   91.68   48.15   -       -       29.94
FT-JSTL+DGD [11]    38.6    -       -       66.60   -       -       -       -       -       -
PDC [34]            51.27   74.05   84.18   -       -       -       84.14   92.73   94.92   63.41
K-means-CNN [35]    46.50   69.30   80.70   53.50   82.50   91.20   -       -       -       -
Spindle [9]         53.80   74.10   83.20   -       -       -       76.90   91.50   94.60   -
PersonNet [36]      -       -       -       71.14   90.07   95.00   37.21   -       -       18.57
CSBT [37]           36.60   66.20   88.30   51.20   76.30   91.80   42.90   -       -       20.30
SDH-CNN [38]        -       -       -       -       -       -       58.12   68.50   80.82   48.20
M3TCP [21]          -       -       -       53.70   84.30   91.00   -       -       -       -
DM3 [39]            42.70   74.30   85.10   49.70   77.30   86.10   75.80   89.10   92.40   53.20
MiF                 40.82   68.35   81.01   66.87   85.39   89.51   78.92   86.46   89.28   66.00
MiF + PARN          46.84   74.68   83.23   71.81   90.33   91.77   79.39   88.95   91.36   66.46

"MiF" represents the MiF-CNN method and "MiF + PARN" represents the MiF-CNN with the attribute-aided reranking method. A dash marks a result that is not reported.
Figure 8: Rank list comparison between MiF and MiF + PARN (two queries, rank list positions 1 to 10).
References
[1] L. Li and J. Guo, "Pedestrian re-identification based on reinforced deep feature fusion," Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, "Cross-view semantic projection learning for person re-identification," Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu, et al., "Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method," Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, "Deep multi-task network for learning person identity and attributes," IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, "A survey of person re-identification," Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao, et al., "Deep adaptive feature embedding with local sample distributions for person re-identification," Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Zheng, Y. Yang, and A. G. Hauptmann, "Person re-identification: past, present and future," 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, "A deep model with combined losses for person re-identification," Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun, et al., "Spindle net: person re-identification with human body region guided feature decomposition and fusion," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, "Beyond triplet loss: a deep quadruplet network for person re-identification," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang, et al., "Learning deep feature representations with domain guided dropout for person re-identification," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, "An improved deep learning architecture for person re-identification," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina, et al., "Person re-identification by symmetry-driven accumulation of local features," in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan, et al., "Salient color names for person re-identification," in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Bazzani, M. Cristani, A. Perina, et al., "Multiple-shot person re-identification by chromatic and epitomic analyses," Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, "Improving pedestrian detection with selective gradient self-similarity feature," Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong, et al., "Person re-identification by attributes," in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, "Multi-level fusion for person re-identification with incomplete marks," in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, October 2015.
[19] Y. Chen, S. Duffner, A. Stoian, et al., "Deep and low-level feature based attribute learning for person re-identification," Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, "Incremental deep hidden attribute learning," in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou, et al., "Person re-identification by multi-channel parts-based CNN with improved triplet loss function," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren, et al., "Deep residual learning for image recognition," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng, et al., "Improving person re-identification by attribute and identity learning," 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, "Multi-level attention model for person re-identification," Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li, et al., "A discriminative feature learning approach for deep face recognition," in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros, et al., "Show, attend and tell: neural image caption generation with visual attention," in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, "Fine-grained attention mechanism for neural machine translation," Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, "Viewpoint invariant pedestrian recognition with an ensemble of localized features," in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, "Human reidentification with transferred metric learning," in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian, et al., "Scalable person re-identification: a benchmark," in Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1116–1124, Santiago, Chile, December 2015.
[31] W. Li, R. Zhao, T. Xiao, et al., "DeepReID: deep filter pairing neural network for person re-identification," in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, "Learning a discriminative null space for person re-identification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, "Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification," Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang, et al., "Pose-driven deep convolutional model for person re-identification," in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, "Person re-identification via integrating patch-based metric learning and local salience learning," Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, "PersonNet: person re-identification with deep convolutional neural networks," 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin, et al., "Fast person re-identification via cross-camera semantic binary transformation," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge, et al., "Structured deep hashing with convolutional neural networks for fast person re-identification," Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu, et al., "Person reidentification via discrepancy matrix and matrix metric," IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," 2014, http://arxiv.org/abs/1409.0473.
3 Multi-Information Flow Convolutional Neural Network
The proposed MiF-CNN treats person re-id as a classification problem. The overall network structure is shown in Figure 1. It includes 2 shallow convolutional layers, 3 multi-information flow convolutional structures with a novel connection pattern, fully connected layers, max pooling layers, and classification output layers. Low-level features of pedestrian images are first extracted by the 2 shallow convolutional layers. The deeper multi-information flow convolutional structures then extract higher-level features. The final discriminative pedestrian feature vectors are obtained after the pooling layers reduce the dimensions and the fully connected layers integrate the features.
3.1. Feature Extraction. In MiF-CNN, all convolutional filters are 3 × 3 with stride 1. Batch normalization and the ReLU activation function are applied after each convolutional layer. The operation of a convolutional layer can be formulated as
$$\begin{cases}
z_j^{(l)} = \sum_{i} x_i^{(l-1)} \otimes w_j^{(l)}, \\
x_j^{(l)} = \sigma\left(z_j^{(l)}\right),
\end{cases} \quad l > 1, \tag{1}$$
where $x_i^{(l-1)}$ is the $i$-th feature map from the $(l-1)$-th convolutional layer, $z_j^{(l)}$ is the convolutional output computed from the $x_i^{(l-1)}$, $x_j^{(l)}$ is the $j$-th feature map from the $l$-th convolutional layer, $w_j^{(l)}$ is the filter for the $j$-th feature map in the $l$-th convolutional layer, and $\otimes$ represents the convolution operation. In other words, to produce the $j$-th feature map of the $l$-th convolutional layer, each input feature map is convolved with the filter $w_j^{(l)}$ and the results are summed. $\sigma(\cdot)$ is the ReLU activation function, which is formulated as $\sigma(x) = \max(0, x)$. Because of batch normalization, the bias is omitted.
3.2. Multi-Information Flow Convolutional Structure. In this structure, the output and the input of the current convolutional layer are concatenated and fed into the next convolutional layer; i.e., the input of each layer is the concatenation of the outputs of all previous layers. The details of the multi-information flow convolutional structures are shown in Figure 2.
This connection pattern allows the feature maps of each layer to be reused by all subsequent layers during forward propagation, which lets the whole CNN model learn more feature information from pedestrian images. It can be regarded as a special form of "data augmentation" on feature maps that enhances the flow of information through the network. During back propagation, the gradient of the input of each layer contains the derivative of the loss function with respect to that input, which makes gradient propagation more effective and the network easier to train. In the multi-information flow convolutional structure, each layer produces a constant number $\rho$ of new feature maps, so the number of feature maps fed into the $l$-th layer is $\rho_0 + \rho(l-1)$, where $\rho_0$ is the number of feature maps in the initial layer. Supposing the feature map of the $i$-th channel in the initial layer is $x_i^{(0)}$, where $i \in (1, \rho_0)$, the feature map of the $j$-th channel output by the initial layer can be expressed as
$$z_j^{(1)} = \sum_{i} x_i^{(0)} \otimes w_j^{(1)}, \tag{2}$$
where $w_j^{(1)}$ is the weight of the initial layer. The output after the activation function is
$$a_j^{(1)} = \sigma\left(z_j^{(1)}\right), \tag{3}$$
where $j \in (1, \rho)$. The feature map that the $l$-th layer outputs can be expressed as
$$\begin{cases}
z_q^{(l)} = \sum_{p} x_p^{(l-1)} \otimes w_q^{(l)}, \\
a_q^{(l)} = \sigma\left(z_q^{(l)}\right),
\end{cases} \quad l > 0, \tag{4}$$
where $p \in (1, \rho_0 + \rho(l-1))$ and $q \in (1, \rho)$. The output of the $l$-th layer after the concatenation operation is
$$x_r^{(l)} = \left[x_p^{(l-1)}, a_q^{(l)}\right], \quad l > 0, \tag{5}$$
where $r \in (1, \rho_0 + \rho \cdot l)$ and $[\cdot, \cdot]$ represents the concatenation operation on the channel dimension.
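The channel bookkeeping implied by equations (4) and (5) can be checked with a short sketch. It treats a feature map as a single number and a layer as any function that returns new maps from the current list, which is a deliberate simplification; the function names and the toy layer are hypothetical.

```python
def mif_block_channels(rho0, rho, num_layers):
    """Return (input channels, channels after concatenation) for layers 1..num_layers."""
    counts = []
    channels = rho0
    for _ in range(num_layers):
        in_channels = channels            # rho0 + rho * (l - 1)
        channels = in_channels + rho      # eq. (5): input maps plus rho new maps
        counts.append((in_channels, channels))
    return counts


def mif_block_forward(x0, layers):
    """Apply each layer to everything produced so far and concatenate the result."""
    x = list(x0)
    for layer in layers:
        a = layer(x)   # new feature maps of this layer, eq. (4)
        x = x + a      # concatenation on the channel dimension, eq. (5)
    return x


def toy_layer(x):
    """Stand-in for convolution + ReLU: one new map that sums all inputs."""
    return [max(0, sum(x))]


counts = mif_block_channels(rho0=16, rho=8, num_layers=3)
out = mif_block_forward([1, 2], [toy_layer, toy_layer])
```

With $\rho_0 = 16$ and $\rho = 8$, the three layers receive 16, 24, and 32 maps and the structure ends with 40 maps, matching $\rho_0 + \rho \cdot l$. The toy forward pass shows that the original inputs remain in the output unchanged, which is what allows later layers to reuse them.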
In the process of back propagation, suppose $\Delta x_r^{(l)}$ is the derivative of the loss function with respect to $x_r^{(l)}$. Because $x_r^{(l)}$ contains $x_p^{(l-1)}$ and $a_q^{(l)}$, it produces two parts of gradient, as shown below:
$$\begin{cases}
\Delta a_q^{(l)} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial a_q^{(l)}}, \\
\Delta x_p^{(l-1)} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial x_p^{(l-1)}},
\end{cases} \tag{6}$$
where $\Delta a_q^{(l)}$ is the gradient of the output of the $l$-th layer after the activation function and $\Delta x_p^{(l-1)}$ is the gradient of the output of the $(l-1)$-th layer. The gradient of the weight in the $l$-th layer is
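$$\begin{cases}
\Delta z_q^{(l)} = \Delta a_q^{(l)} \cdot \dfrac{\partial a_q^{(l)}}{\partial z_q^{(l)}} = \Delta x_r^{(l)} \cdot \dfrac{\partial x_r^{(l)}}{\partial a_q^{(l)}} \cdot \sigma'\left(z_q^{(l)}\right), \\
\Delta w_q^{(l)} = \Delta z_q^{(l)} \cdot \dfrac{\partial z_q^{(l)}}{\partial w_q^{(l)}} = \Delta z_q^{(l)} \cdot x_p^{(l-1)},
\end{cases} \tag{7}$$
where $\Delta z_q^{(l)}$ is the gradient of the convolution result in the $l$-th layer, $\sigma'(z_q^{(l)})$ is the derivative of the activation function with respect to $z_q^{(l)}$, and $\Delta w_q^{(l)}$ is the gradient of the weights in the $l$-th layer. The network uses $\Delta w_q^{(l)}$ to update the weights of each layer, which is formulated as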
w(l)q1113872 1113873new w
(l)q minus η middot Δw(l)
q (8)
Computational Intelligence and Neuroscience 3
where η is the learning rate Gradient keeps back propa-gation to the (lminus 1)-th layer 0e gradient of x(lminus1)
p is
Δx(lminus1)p Δz(l)
q middotzz(l)
q
zx(lminus1)p
Δz(l)q middot w
(l)q (9)
As shown in equations (6) and (9), the loss function produces two flows of gradient with respect to the outputs of each convolutional layer, which makes error information propagate more effectively through the network and restrains the vanishing gradient to a certain extent.
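The two gradient flows can be checked numerically on a toy layer. The sketch below is an illustration only, not the paper's implementation: it replaces the convolution with a dense matrix product, assumes tanh for σ, and uses a made-up linear loss so that the analytic gradient can be compared against finite differences. The function names are hypothetical.

```python
import numpy as np

def forward(x_prev, w):
    z = w @ x_prev                      # dense stand-in for the convolution
    a = np.tanh(z)                      # sigma assumed to be tanh
    return np.concatenate([x_prev, a]), z

def backward(dx, x_prev, w, z):
    n = x_prev.size
    dx_skip, da = dx[:n], dx[n:]        # eq. (6): split along the concat
    dz = da * (1.0 - np.tanh(z) ** 2)   # eq. (7): through the activation
    dw = np.outer(dz, x_prev)           # eq. (7): weight gradient
    dx_conv = w.T @ dz                  # eq. (9): through the weights
    return dx_skip + dx_conv, dw        # both flows reach x_prev

rng = np.random.default_rng(0)
x_prev = rng.standard_normal(4)
w = rng.standard_normal((2, 4))
c = rng.standard_normal(6)              # toy loss L = c . x, so dL/dx = c

x, z = forward(x_prev, w)
grad_x, grad_w = backward(c, x_prev, w, z)

# Central finite differences on the same toy loss.
eps = 1e-6
num = np.zeros_like(x_prev)
for k in range(x_prev.size):
    e = np.zeros_like(x_prev)
    e[k] = eps
    num[k] = (c @ forward(x_prev + e, w)[0]
              - c @ forward(x_prev - e, w)[0]) / (2 * eps)
print(np.allclose(grad_x, num, atol=1e-6))
```

Dropping either `dx_skip` or `dx_conv` from the returned sum makes the comparison fail, which is the point of the two-flow argument.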
The proposed MiF-CNN includes three multi-information flow convolutional structures. Between every two multi-information flow convolutional structures, a middle pooling layer is applied to compress the redundant features. The hyperparameter $\rho$ is set to a small value, and $\rho$ is doubled whenever a new multi-information flow convolutional structure begins. Such a design makes each convolutional layer learn a small number of features and reduces redundant features, so as to optimize the efficiency of the network. With deeper layers, the network can learn more high-level and complex pedestrian features and improve the final identification accuracy.
3.3. Loss Function. Current deep learning algorithms usually use the cross-entropy loss as the cost function, which is formulated as
$$L_S = -\frac{1}{M}\sum_{i=1}^{M} y^{(i)} \log \frac{e^{\theta_{y^{(i)}}^{T} x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}} \tag{10}$$
where $y^{(i)}$ is the ground truth of pedestrian categories in the training set, $\theta$ is the parameter of the last fully connected layer, $x^{(i)}$ is the feature vector of a training sample, k is the number of pedestrian categories in the training set, and M is the batch size.
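As a rough illustration of equation (10), the following NumPy sketch computes the batch cross-entropy. It assumes that each label is an integer class index, so the leading $y^{(i)}$ factor acts as a one-hot selector of the true-class log-probability; the function name and array shapes are hypothetical.

```python
import numpy as np

def cross_entropy(features, theta, labels):
    """features: (M, D) vectors x^(i); theta: (k, D) last-layer weights;
    labels: (M,) integer class indices (assumed encoding of y^(i))."""
    logits = features @ theta.T                          # theta_j^T x^(i)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class and average over the batch.
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 3))   # M = 4 samples, 3-dim features
theta = np.zeros((5, 3))                 # k = 5 identities, untrained weights
labels = np.array([0, 1, 2, 3])
loss = cross_entropy(features, theta, labels)
print(loss)  # with all-zero weights every class is equally likely: log(5)
```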
However, in practice, when only the cross-entropy loss is used and the quality of the extracted features is not good enough, the intraclass distance can become greater than the interclass distance. To address this problem, Wen et al. proposed the center loss in 2016 [25]. The combination of cross-entropy loss and center loss can enhance the discrimination and generalization ability of the network. Center loss is defined as follows:
$$L_C = \frac{1}{2M}\sum_{i=1}^{M} \left\| x_i - c_j \right\|_2^2 \tag{11}$$
where $c_j$ is the center of the j-th pedestrian feature and $x_i$ is the feature vector of a pedestrian. Center loss minimizes the distance between a feature and its center in order to reduce the intraclass distance. The center $c_j$ is updated with equation (12):
$$\begin{cases} \Delta c_j^{t} = \dfrac{\sum_{i=1}^{M} \delta(y_i, j) \cdot \left(c_j^{t} - x_i\right)}{1 + \sum_{i=1}^{M} \delta(y_i, j)} \\ c_j^{t+1} = c_j^{t} - \beta \cdot \Delta c_j^{t} \end{cases} \tag{12}$$
where $\beta$ is the update rate and $\delta(y_i, j)$ is 1 if the prediction equals the ground truth and 0 otherwise. That is to say, the center is updated only when the network predicts correctly.
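Equations (11) and (12) can be sketched as follows. This is a minimal illustration under assumptions: $y_i$ is taken to be the class index associated with sample i, so $\delta(y_i, j)$ selects the samples that belong to center j, and the function names are hypothetical.

```python
import numpy as np

def center_loss(x, y, centers):
    """x: (M, D) features; y: (M,) class indices; centers: (k, D)."""
    diff = x - centers[y]
    # 0.5 * mean of squared norms equals (1 / 2M) * sum ||x_i - c||^2.
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1))

def update_centers(x, y, centers, beta):
    """One step of eq. (12) for every center."""
    new_centers = centers.copy()
    for j in range(centers.shape[0]):
        mask = (y == j)                  # delta(y_i, j) as a boolean mask
        delta = (centers[j] - x[mask]).sum(axis=0) / (1.0 + mask.sum())
        new_centers[j] = centers[j] - beta * delta
    return new_centers

x = np.array([[1.0, 0.0], [3.0, 0.0]])
y = np.array([0, 0])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
loss = center_loss(x, y, centers)
new_centers = update_centers(x, y, centers, beta=0.5)
print(loss, new_centers)
```

A center with no selected samples in the batch gets a zero update, because the numerator is empty and the denominator stays at 1.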
Figure 1: Architecture of MiF-CNN, where green rectangles denote the multi-information flow convolutional structures (MiF), orange rectangles denote the middle pooling layers (P), and k is the number of pedestrian categories in the training set. Blocks shown: input image, Conv 3 × 3, Conv 3 × 3, MiF, P, MiF, P, MiF, Pooling 2 × 2, FC 1024, Softmax k.
Figure 2: Multi-information flow convolutional structures, where “&” represents the concatenate operation on the channel dimension. Channel counts shown: $\rho_0$ at the input, $\rho$ at the output of each Conv 3 × 3 layer, and $\rho_0 + \rho$, $\rho_0 + 2\rho$, ..., $\rho_0 + l \cdot \rho$ after each concatenation.
4. Person Attributes Recognition Network
The rank list of MiF-CNN is shown in Figure 3. In the incorrect identification results at rank 1 (pedestrian B and pedestrian C), there is a big difference in attribute features between the top-ranking negative samples and the query image, including gender, clothing, whether a handbag is carried, and so on. Hence, this paper recognizes person attributes to improve the accuracy of person re-id.
Based on the encoder-decoder idea, Xu et al. [26] proposed a neural network model that can learn and generate descriptions of image content. The model used a CNN as the encoder to extract image features, which were then fed into a recurrent neural network (RNN) that decoded them into language captions. Inspired by that work, this paper presents a person attribute recognition network (PARN) with LSTM and an attention mechanism. The proposed PARN takes the pedestrian features extracted by MiF-CNN as input and outputs the attribute information of pedestrian images. The architecture of PARN is shown in Figure 4.
4.1. Input of PARN. In PARN, the input is the feature maps taken before the last fully connected layer of MiF-CNN. The input feature is split into n feature vectors, each of which corresponds to a part of the image. Each feature vector is a $D_X$-dimensional vector, which is represented as
$$X = \{x_1, x_2, \ldots, x_n\}, \quad x_i \in \mathbb{R}^{D_X} \tag{13}$$
Following natural language processing practice, PARN transforms the words in person attribute labels into word embeddings. As a part of the input of PARN, each word embedding is a $D_Y$-dimensional vector:
$$y = \{y_1, y_2, \ldots, y_m\}, \quad y_i \in \mathbb{R}^{D_Y} \tag{14}$$
where m is the number of attribute words in each pedestrian image and $y_i$ is the word embedding corresponding to each attribute word.
For attribute words, the common one-hot encoding treats each word as an isolated symbol, which ignores the correlation among words. Word embedding, in contrast, represents each word as a continuous dense vector, which places correlated words closer together in the embedding space.
4.2. Attention Mechanism. The attention mechanism has been widely used in natural language processing and computer vision. By measuring the correlation between the output and different parts of the input, the attention mechanism gives different weights to different parts of the input, enabling the network to use the more important feature information for prediction and to reduce the dimension of the input data [27].
In practice, a given pedestrian attribute corresponds only to a certain part of the image. For example, when recognizing whether a pedestrian is wearing a hat, people usually pay attention only to the area above the head of the pedestrian instead of other areas irrelevant to the attribute. Therefore, before the image features are fed into the LSTM network, an attention mechanism is introduced to calculate the correlation between different positions of the image features and the hidden state of LSTM at the previous time step. The schematic diagram of the attention mechanism is shown in Figure 5.
At time t, a fully connected layer and the tanh function are used to integrate the information of the input feature vector and the hidden state of LSTM at the previous time step, which is formulated as
$$v_i = \tanh\left(W_{attX}\, x_i + W_{atth}\, h_{t-1}\right) \tag{15}$$
where $W_{attX}$ and $W_{atth}$ represent the fully connected layer weights of $x_i$ and $h_{t-1}$, respectively. The attention weight $\alpha_i$ is obtained by applying the softmax function to the score vector $v_i$:
$$\alpha_i = \mathrm{softmax}\left(W_{attv}\, v_i\right) \tag{16}$$
where $W_{attv}$ represents the fully connected layer weights of $v_i$. The output Z is the sum of all $x_i$ weighted by $\alpha_i$:
$$Z = \sum_{i=1}^{n} \alpha_i x_i \tag{17}$$
Feature Z highlights the local features that are helpful for prediction and suppresses the local features that contribute little to it. This reduces the dimension of the features to some extent and makes the LSTM network focus on the parts of the input features that are more strongly correlated with the prediction while recognizing person attributes, which improves the efficiency and prediction accuracy of the network.
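Equations (15) to (17) can be sketched in NumPy as below. The sketch rests on two assumptions that the text does not spell out: $W_{attv}$ maps each $v_i$ to a scalar score, and the softmax is taken across the n image parts. The function name and all dimensions are hypothetical.

```python
import numpy as np

def attention(X, h_prev, W_attX, W_atth, W_attv):
    """X: (n, D_X) part features; h_prev: (D_h,) previous hidden state;
    W_attX: (D_a, D_X); W_atth: (D_a, D_h); W_attv: (D_a,)."""
    V = np.tanh(X @ W_attX.T + W_atth @ h_prev)   # eq. (15), one row per part
    scores = V @ W_attv                           # one scalar score per part
    scores = scores - scores.max()                # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum() # eq. (16), softmax over parts
    Z = alpha @ X                                 # eq. (17), weighted sum
    return Z, alpha

rng = np.random.default_rng(0)
n, D_X, D_h, D_a = 4, 3, 2, 5
X = rng.standard_normal((n, D_X))
h_prev = rng.standard_normal(D_h)
Z, alpha = attention(X, h_prev,
                     rng.standard_normal((D_a, D_X)),
                     rng.standard_normal((D_a, D_h)),
                     rng.standard_normal(D_a))
print(alpha.sum(), Z.shape)
```

Note that Z has the dimension of a single part feature rather than of all n parts together, which is the dimension reduction the text refers to.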
4.3. LSTM. A standard RNN updates its parameters with back propagation and easily suffers from the vanishing gradient when there is a long gap between the relevant information and the position to be predicted. The LSTM network handles this problem well because of its structure. The architecture of the LSTM unit is shown in Figure 6.
Here, c is the cell state for storing and transferring information; f is the forget gate, which decides what should be discarded from the cell state; i is the input gate, which decides what should be stored in the cell state; g represents a vector of new candidate values that could be added to the cell state; o is the output gate, which decides what parts of the cell state should be output to the next time step; h is the hidden state of LSTM; x is the input of LSTM; and t denotes the current time step.
In PARN, at any time t, the input of LSTM consists of two parts: the word embedding $y_{t-1}$ at the previous time step and the image features $Z_t$ at the current time step. The operation of LSTM in PARN is formulated as
$$\begin{cases} f = \mathrm{sigmoid}\left(W_f\,[h_{t-1}, y_{t-1}, Z_t] + b_f\right) \\ i = \mathrm{sigmoid}\left(W_i\,[h_{t-1}, y_{t-1}, Z_t] + b_i\right) \\ g = \tanh\left(W_g\,[h_{t-1}, y_{t-1}, Z_t] + b_g\right) \\ o = \mathrm{sigmoid}\left(W_o\,[h_{t-1}, y_{t-1}, Z_t] + b_o\right) \\ c_t = f \odot c_{t-1} + i \odot g \\ h_t = o \odot \tanh\left(c_t\right) \end{cases} \tag{18}$$
where W and b represent the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that decides how much of each value passes through a gate. The tanh function is the activation function of the input. This design lets LSTM weight information from different time steps differently, so that it can choose which information to remember or forget in a long-term sequence.
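A single LSTM step of this form can be sketched as follows. This is an illustration under assumptions, not the authors' code: the candidate g uses tanh, as in the standard LSTM and in line with the remark that tanh is the activation function of the input, and the weights are stored in hypothetical dictionaries keyed by gate name.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(h_prev, c_prev, y_prev, Z_t, W, b):
    """W[k]: (D_h, D_h + D_Y + D_X) and b[k]: (D_h,) for k in 'f', 'i', 'g', 'o'."""
    u = np.concatenate([h_prev, y_prev, Z_t])   # [h_{t-1}, y_{t-1}, Z_t]
    f = sigmoid(W["f"] @ u + b["f"])            # forget gate
    i = sigmoid(W["i"] @ u + b["i"])            # input gate
    g = np.tanh(W["g"] @ u + b["g"])            # candidate values (tanh assumed)
    o = sigmoid(W["o"] @ u + b["o"])            # output gate
    c = f * c_prev + i * g                      # new cell state
    h = o * np.tanh(c)                          # new hidden state
    return h, c

D_h, D_Y, D_X = 3, 2, 4
rng = np.random.default_rng(0)
W = {k: 0.1 * rng.standard_normal((D_h, D_h + D_Y + D_X)) for k in "figo"}
b = {k: np.zeros(D_h) for k in "figo"}
h, c = lstm_step(np.zeros(D_h), np.zeros(D_h),
                 rng.standard_normal(D_Y), rng.standard_normal(D_X), W, b)
print(h.shape, c.shape)
```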
The number of LSTM time steps in PARN is set to m, the number of attributes of each pedestrian image. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and a softmax function to predict a pedestrian attribute word.
It is worth noting that, at each time step, the input word embedding of LSTM is the one that LSTM learned at the previous time step, which takes advantage of the strength of LSTM in solving long-term sequence problems. Because pedestrian attributes are correlated (for example, a pedestrian with the attribute “male” generally also has the attributes “short hair” and “pants”), LSTM can use the previous attribute result to predict the current attribute more accurately.
Figure 3: Identification results of MiF-CNN for queries A, B, and C with rank lists 1 to 10 (the green bounding box represents positive samples in the gallery).
Figure 4: Architecture of PARN.
4.4. Network Optimization. The loss function of PARN includes two parts. One is the cross-entropy between the prediction and the ground truth of the network, which is expressed as
$$L_{sa} = \sum_{i=1}^{m} u' \log\left(\frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}}\right) \tag{19}$$
where u is the prediction value of the network, $u'$ is the ground truth of pedestrian labels, m is the time step of LSTM, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the cumulated loss value after m iterations. The other one is the loss function of the attention mechanism, as described in [40]:
$$L_{\alpha} = \frac{1}{2n}\sum_{i=1}^{n}\left(1 - \sum_{j=1}^{m} \alpha_i^{j}\right)^{2} \tag{20}$$
where $\alpha_i^{j}$ represents the attention weight of the image feature $x_i$ at the j-th time step and n is the number of parts in the image feature. The objective function of PARN to optimize is
$$J = -\frac{1}{M}\sum_{i=1}^{M}\left(L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i}\right) \tag{21}$$
where M is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
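The attention term in equation (20) is easy to compute once the attention weights of all time steps are collected. The sketch below assumes they are stored as an (m, n) array with one row per time step and that the inner sum runs over the m time steps; the function name is hypothetical.

```python
import numpy as np

def attention_regularizer(alpha):
    """alpha: (m, n) array; row j holds the weights over the n parts at step j."""
    n = alpha.shape[1]
    per_part_total = alpha.sum(axis=0)          # attention each part received
    return np.sum((1.0 - per_part_total) ** 2) / (2.0 * n)

uniform = np.full((4, 4), 0.25)                 # every part attended equally
peaked = np.array([[1.0, 0.0], [1.0, 0.0]])     # one part gets all attention
print(attention_regularizer(uniform), attention_regularizer(peaked))
```

The penalty is zero when every part receives a total weight of 1 across the sequence and grows when some parts are never attended to.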
5. Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ including N pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed with the Euclidean distance as
$$d(q_0, g_i) = \sqrt{\sum_{i=1}^{N}\left(x_{q_0} - x_{g_i}\right)^{2}} \tag{22}$$
where $x_{q_0}$ and $x_{g_i}$ represent the features of $q_0$ and $g_i$, respectively, extracted by MiF-CNN. The initial rank list $R_0 = \{g_1^{0\prime}, g_2^{0\prime}, \ldots, g_N^{0\prime}\}$ is obtained by sorting the similarity distance $d(q_0, g_i)$ in ascending order, where $d(q_0, g_i^{0\prime}) < d(q_0, g_{i+1}^{0\prime})$. If the top-1 gallery image $g_1^{0\prime}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top-5 gallery images, rank 5 is correct.
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute feature similarity between $q_0$ and $g_i$, so that more positive gallery images get a higher ranking in $R_0$, which could improve the performance of person re-id. To be specific, when rank 5 is correct but rank 1 is not, the proposed algorithm distributes the initial feature score $s_f$ to the top-5 gallery images $g_1^{0\prime} \sim g_5^{0\prime}$ according to their ranking in $R_0$: the higher the ranking, the higher the $s_f$. Then the proposed algorithm distributes the attribute score $s_a$ to $g_1^{0\prime} \sim g_5^{0\prime}$ according to the number of their attributes that are the same as those of $q_0$: the more attributes in common, the higher the $s_a$. The total score of each gallery image is $s = s_f(1 - c) + s_a c$, where c is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0\prime} \sim g_5^{0\prime}$ in descending order of their total score s. For M query images, the whole process of the attribute-aided reranking algorithm is shown in Algorithm 1.
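The scoring step can be sketched in plain Python. The text only states that a higher rank gives a higher $s_f$ and more shared attributes give a higher $s_a$, so the concrete formulas below are assumptions made for illustration: $s_f$ decreases linearly with rank position and $s_a$ is the fraction of matching attributes. The function name and the attribute encoding are hypothetical, and the choice of k (5, 10, or the whole list) is left to the caller.

```python
def rerank_top_k(ranked_ids, query_attrs, gallery_attrs, k=5, c=0.5):
    """Rerank the first k gallery ids by s = s_f * (1 - c) + s_a * c."""
    top = ranked_ids[:k]
    n = len(top)
    scored = []
    for pos, gid in enumerate(top):
        s_f = (n - pos) / n                      # assumed: linear in rank
        matches = sum(a == b for a, b in zip(query_attrs, gallery_attrs[gid]))
        s_a = matches / len(query_attrs)         # assumed: fraction of matches
        s = s_f * (1 - c) + s_a * c
        scored.append((s, -pos, gid))            # -pos keeps ties in old order
    scored.sort(reverse=True)                    # descending total score
    return [gid for _, _, gid in scored] + ranked_ids[k:]

query = ("male", "short hair", "pants")
gallery = {
    "g1": ("female", "long hair", "dress"),
    "g2": ("male", "short hair", "pants"),
    "g3": ("female", "long hair", "dress"),
    "g4": ("male", "short hair", "pants"),
}
reranked = rerank_top_k(["g1", "g2", "g3", "g4"], query, gallery, k=3, c=0.5)
print(reranked)
```

In this toy case the second-ranked image shares every attribute with the query and moves to the top, while the image outside the top k keeps its position.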
6. Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets and the effectiveness of the proposed attribute-aided reranking algorithm in improving the accuracy of person re-id. An analysis of the experimental results is given after the comparison with state-of-the-art person re-id methods.
6.1. Datasets and Evaluation Protocol. This paper evaluates the proposed methods on three challenging public person re-id datasets, including VIPeR [28], CUHK01 [29], and Market1501 [30].
Figure 5: Schematic diagram of attention mechanism.
Figure 6: Architecture of the LSTM unit.
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the lack of image samples, the low resolution, and the great changes in pose, view, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: the images of 316 persons for training and the images of the remaining 316 persons for testing.
CUHK01 Dataset. It consists of 3884 images of 971 persons, which are captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.
Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images that are captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.
Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identifications [31]. For the multishot dataset Market1501, mean Average Precision (mAP) is used in addition to CMC to evaluate the performance of the proposed methods.
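For readers unfamiliar with CMC, the following NumPy sketch shows one common way to compute the curve from a query-gallery distance matrix. It is an illustration only and assumes the single-shot setting, in which the rank of the first correct gallery match decides the score of a query; the function name is hypothetical.

```python
import numpy as np

def cmc(dist, query_ids, gallery_ids, max_rank=10):
    """dist: (Q, G) distances; returns matching rate at ranks 1..max_rank."""
    order = np.argsort(dist, axis=1)                    # nearest gallery first
    matches = gallery_ids[order] == query_ids[:, None]  # (Q, G) hit table
    has_match = matches.any(axis=1)
    first_hit = matches.argmax(axis=1)                  # index of first hit
    curve = np.zeros(max_rank)
    for r in first_hit[has_match]:
        if r < max_rank:
            curve[r:] += 1                              # counted from rank r+1 on
    return curve / has_match.sum()

dist = np.array([[0.1, 0.5, 0.9],
                 [0.8, 0.2, 0.3]])
query_ids = np.array([0, 2])
gallery_ids = np.array([0, 1, 2])
curve = cmc(dist, query_ids, gallery_ids, max_rank=3)
print(curve)
```

Here the first query is matched at rank 1 and the second at rank 2, so the rank-1 rate is 0.5 and the rate from rank 2 onward is 1.0.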
6.2. Experimental Results on Attribute Recognition. The attributes in PARN refer to pedestrian identity-level attributes. For the VIPeR dataset, attributes include “gender,” “length of hair,” “lower clothing,” “upper clothing,” “backpack or not,” and “carrying anything or not.” For the CUHK01 dataset, attributes include “gender,” “length of hair,” “backpack or not,” “handbag or not,” “color of upper,” and “color of lower.” For the Market1501 dataset, attributes contain “gender,” “length of hair,” “wearing hat or not,” “lower clothing,” “backpack or not,” “handbag or not,” “length of sleeve,” and “length of lower clothing.” Each person in each dataset owns an attribute word list, as shown in Figure 7.
This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the person attribute recognition method APR [22]. The experimental results are reported in Table 1, where “Lslv” represents the “length of sleeve” and “Llow” represents the “length of lower clothing.”
It can be observed from Table 1 that PARN obtains competitive recognition accuracy on the 8 attributes. In particular, the accuracy on “wearing hat or not” and “handbag or not” and the mean accuracy are higher than those of APR, whereas the accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3. Experimental Results on Person re-id. This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.
As shown in Table 2, the proposed MiF-CNN model obtains strong performance among state-of-the-art methods; moreover, the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps to increase person re-id accuracy. The improvement is especially obvious on the VIPeR and CUHK01 datasets, where the identification accuracy at rank 1, rank 5, and rank 10 improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. The proposed MiF-CNN model with the attribute-aided reranking algorithm gets the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset among the compared methods.
6.4. Analysis of Experimental Results. Because of the small quantity of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and fine-tuned the pretrained model on VIPeR and CUHK01 separately. The structure and hyperparameters of the deep CNN model in the M3TCP method need to be adjusted manually to adapt to the scale of different datasets, so as to overcome the training problem of the deep neural network.
In contrast, the proposed MiF-CNN has a fixed structure and can be trained on smaller datasets directly without any fine-tuning. In spite of the deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.
Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that MiF-CNN with the attribute-aided reranking method moves lower-ranked positive samples in the rank list of MiF-CNN to a higher ranking, which improves the accuracy of person re-id and verifies the effectiveness of the attribute-aided reranking algorithm for improving the performance of a person re-id model.
ALGORITHM 1: The attribute-aided reranking algorithm.
Input: Initial rank list $R_0$
Output: The reranked rank list $R^{*}$
(1) for i = 1, 2, ..., M do
(2)   if rank 1 correct
(3)     $R^{*} = R_0$
(4)   end if
(5)   elif rank 5 correct
(6)     Distribute the initial feature score $s_f$ to the top-5 gallery images $g_1^{i\prime} \sim g_5^{i\prime}$ according to their ranking in $R_0$
(7)     Distribute the attribute score $s_a$ to $g_1^{i\prime} \sim g_5^{i\prime}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f(1 - c) + s_a c$
(9)     Rerank $g_1^{i\prime} \sim g_5^{i\prime}$ in descending order of their total score s and obtain $R^{*}$
(10)  end if
(11)  elif rank 10 correct
(12)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_{10}^{i\prime}$, repeat 6–10
(13)  end if
(14)  else
(15)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_M^{i\prime}$, repeat 6–10
(16)  end if
(17) end for
Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.
Table 1: Recognition accuracy of different attributes in PARN and APR (%).
Method  Gender  Hair   Clothes  Hat    Backpack  Handbag  Lslv   Llow   Mean
APR     85.78   83.46  91.36    88.21  86.32     76.01    94.12  92.64  87.23
PARN    84.93   79.10  89.35    96.67  83.86     85.18    92.84  88.45  87.55
7. Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes for further improving the accuracy of person re-id. The proposed MiF-CNN model realizes the reuse of feature maps and gradient information, which enhances the feature mobility of the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of LSTM at the previous time step, which reduces the dimension of the features. It also employs LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list, so as to further improve the identification accuracy.
The experimental results on three public person re-id datasets indicate the outstanding performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, more improvement and optimization will be done on the pedestrian attribute recognition network. Moreover, captions of person images or videos can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.
Data Availability
The data (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).
Methods              | VIPeR                   | CUHK01                  | Market1501
                     | Rank 1  Rank 5  Rank 10 | Rank 1  Rank 5  Rank 10 | Rank 1  Rank 5  Rank 10  mAP
DNS [32]             | 42.28   71.46   82.94   | 64.98   84.96   89.92   | 67.96   -       -        41.89
DLDA [33]            | 44.11   72.59   81.66   | 67.12   89.45   91.68   | 48.15   -       -        29.94
FT-JSTL+DGD [11]     | 38.6    -       -       | 66.60   -       -       | -       -       -        -
PDC [34]             | 51.27   74.05   84.18   | -       -       -       | 84.14   92.73   94.92    63.41
K-means-CNN [35]     | 46.50   69.30   80.70   | 53.50   82.50   91.20   | -       -       -        -
Spindle [9]          | 53.80   74.10   83.20   | -       -       -       | 76.90   91.50   94.60    -
PersonNet [36]       | -       -       -       | 71.14   90.07   95.00   | 37.21   -       -        18.57
CSBT [37]            | 36.60   66.20   88.30   | 51.20   76.30   91.80   | 42.90   -       -        20.30
SDH-CNN [38]         | -       -       -       | -       -       -       | 58.12   68.50   80.82    48.20
M3TCP [21]           | -       -       -       | 53.70   84.30   91.00   | -       -       -        -
DM3 [39]             | 42.70   74.30   85.10   | 49.70   77.30   86.10   | 75.80   89.10   92.40    53.20
MiF                  | 40.82   68.35   81.01   | 66.87   85.39   89.51   | 78.92   86.46   89.28    66.00
MiF + PARN           | 46.84   74.68   83.23   | 71.81   90.33   91.77   | 79.39   88.95   91.36    66.46
“MiF” represents the MiF-CNN method and “MiF + PARN” represents the MiF-CNN with the attribute-aided reranking method.
Figure 8: Rank list comparison between MiF and MiF + PARN.
References
[1] L. Li and J. Guo, “Pedestrian re-identification based on reinforced deep feature fusion,” Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, “Cross-view semantic projection learning for person re-identification,” Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., “Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method,” Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, “Deep multi-task network for learning person identity and attributes,” IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, “A survey of person re-identification,” Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., “Deep adaptive feature embedding with local sample distributions for person re-identification,” Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Zheng, Y. Yang, and A. G. Hauptmann, “Person re-identification: past, present and future,” 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, “A deep model with combined losses for person re-identification,” Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., “Spindle net: person re-identification with human body region guided feature decomposition and fusion,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, “Beyond triplet loss: a deep quadruplet network for person re-identification,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., “Learning deep feature representations with domain guided dropout for person re-identification,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning architecture for person re-identification,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., “Person re-identification by symmetry-driven accumulation of local features,” in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., “Salient color names for person re-identification,” in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Bazzani, M. Cristani, A. Perina et al., “Multiple-shot person re-identification by chromatic and epitomic analyses,” Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, “Improving pedestrian detection with selective gradient self-similarity feature,” Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., “Person re-identification by attributes,” in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, “Personnet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.
where $\eta$ is the learning rate. The gradient continues to back-propagate to the $(l-1)$-th layer. The gradient of $x_p^{(l-1)}$ is

$$\Delta x_p^{(l-1)} = \Delta z_q^{(l)} \cdot \frac{\partial z_q^{(l)}}{\partial x_p^{(l-1)}} = \Delta z_q^{(l)} \cdot w_q^{(l)}. \tag{9}$$
As shown in equations (6) and (9), the loss function produces two flows of gradient with respect to the output of each convolutional layer, which makes error information propagate more effectively through the network and restrains the vanishing gradient to a certain extent.

The proposed MiF-CNN includes three multi-information flow convolutional structures. Between every two multi-information flow convolutional structures, a middle pooling layer is applied to compress the redundant features. The hyperparameter $\rho$ is set to a small value and is doubled each time a new multi-information flow convolutional structure begins. This design makes each convolutional layer learn only a small number of features and reduces redundant features, which improves the efficiency of the network. With deeper layers, the network can learn more high-level and complex pedestrian features and improve the final identification accuracy.
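As a rough illustration of how the channel dimension grows inside and across the three structures, the following Python sketch tracks channel counts only. It assumes, as Figure 2 suggests, that each convolutional layer outputs $\rho$ channels that are concatenated with its input, and that the middle pooling layer leaves the number of channels unchanged. The function names, the number of layers per structure, and the values of $\rho_0$ and $\rho$ are hypothetical and are not taken from the paper.

```python
def mif_block_channels(in_channels, rho, num_layers):
    """Channel count at the block input and after each concatenation."""
    return [in_channels + i * rho for i in range(num_layers + 1)]


def mif_cnn_channels(rho0, rho, layers_per_block, num_blocks=3):
    """Trace channel counts through successive MiF structures.

    Assumes pooling between structures keeps the channel count and that
    rho is doubled whenever a new structure starts.
    """
    trace = []
    channels = rho0
    for _ in range(num_blocks):
        counts = mif_block_channels(channels, rho, layers_per_block)
        trace.append(counts)
        channels = counts[-1]   # output of this structure feeds the next one
        rho *= 2                # rho is doubled for the next structure
    return trace


if __name__ == "__main__":
    # Hypothetical setting: 16 input channels, rho = 4, two layers per structure.
    for block in mif_cnn_channels(16, 4, 2):
        print(block)
```

Because every layer adds only $\rho$ channels, the per-layer cost stays small even though later layers see all earlier feature maps.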
3.3 Loss Function. Current deep learning algorithms usually use the cross-entropy loss as the cost function, which is formulated as

$$L_S = -\frac{1}{M} \sum_{i=1}^{M} y^{(i)} \log \frac{e^{\theta_{y^{(i)}}^{T} x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}}, \tag{10}$$

where $y^{(i)}$ is the ground truth of the pedestrian categories in the training set, $\theta$ is the parameter of the last fully connected layer, $x^{(i)}$ is the feature vector of the training samples, $k$ is the number of pedestrian categories in the training set, and $M$ is the batch size.
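The cross-entropy loss above can be sketched with NumPy as follows. This is a minimal sketch, not the authors' implementation: it reads $y^{(i)}$ as an indicator that selects the true class of sample $i$, so the loss is the mean negative log-probability of the true class. The function name and the array shapes are assumptions made for the example.

```python
import numpy as np


def softmax_cross_entropy(theta, x, y):
    """Mean cross-entropy over a batch.

    theta: (k, d) weights of the last fully connected layer
    x:     (M, d) feature vectors of the batch
    y:     (M,)   integer class labels in [0, k)
    """
    logits = x @ theta.T                                   # (M, k)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(y)), y].mean()
```

With all-zero weights every class receives probability $1/k$, so the loss equals $\log k$, which is a convenient sanity check.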
However, in practice, when only the cross-entropy loss is used, features of insufficient quality can lead to the intraclass distance being greater than the interclass distance. To address this problem, Wen et al. proposed the center loss in 2016 [25]. Combining the cross-entropy loss with the center loss can enhance the discrimination and generalization ability of the network. The center loss is defined as follows:

$$L_C = \frac{1}{2M} \sum_{i=1}^{M} \left\| x_i - c_j \right\|_2^2, \tag{11}$$

where $c_j$ is the center of the features of the $j$-th pedestrian and $x_i$ is the feature vector of a pedestrian. The center loss minimizes the distance between each feature and its center in order to reduce the intraclass distance. The center $c_j$ is updated with equation (12):

$$\begin{cases} \Delta c_j^{t} = \dfrac{\sum_{i=1}^{M} \delta(y_i = j) \cdot \left(c_j^{t} - x_i\right)}{1 + \sum_{i=1}^{M} \delta(y_i = j)}, \\[2ex] c_j^{t+1} = c_j^{t} - \beta \cdot \Delta c_j^{t}, \end{cases} \tag{12}$$

where $\beta$ is the update rate and $\delta(y_i = j)$ is 1 if the prediction equals the ground truth and 0 otherwise. That is to say, the center is updated only when the network predicts correctly.
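The center loss and the center update can be sketched as follows. This is an illustrative sketch under stated assumptions: each feature is compared with the center of its own class, and $\delta(y_i = j)$ is implemented as "sample $i$ carries label $j$", following the formulation of Wen et al. [25]. The function names and array layouts are hypothetical.

```python
import numpy as np


def center_loss(features, labels, centers):
    """1/(2M) times the summed squared distance to each sample's class center."""
    diffs = features - centers[labels]            # x_i - c_{y_i}, shape (M, d)
    return (diffs ** 2).sum() / (2 * len(features))


def update_centers(features, labels, centers, beta):
    """One update step of all class centers with update rate beta."""
    new_centers = centers.copy()
    for j in range(len(centers)):
        mask = labels == j                        # delta(y_i = j) for every i
        delta = (centers[j] - features[mask]).sum(axis=0) / (1 + mask.sum())
        new_centers[j] = centers[j] - beta * delta
    return new_centers
```

Classes that do not occur in the batch have an empty mask, so their centers are left unchanged, which matches the role of the $1 + \sum \delta$ denominator.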
Figure 1: Architecture for MiF-CNN, where the green rectangle denotes the multi-information flow convolutional structures, the orange rectangle denotes the middle pooling layer, and $k$ is the number of pedestrian categories in the training set. (Diagram labels: input image; Conv 3 × 3; Conv 3 × 3; MiF, P, MiF, P, MiF; Pooling 2 × 2; FC 1024; Softmax $k$.)

Figure 2: Multi-information flow convolutional structures, where “&” represents the concatenation operation on the channel dimension. (Diagram labels: an input with $\rho_0$ channels passes through the 1st, 2nd, ..., $l$-th Conv 3 × 3 layers, each producing $\rho$ channels; after each concatenation the channel counts are $\rho_0 + \rho$, $\rho_0 + 2\rho$, ..., $\rho_0 + l \cdot \rho$.)
4 Person Attributes Recognition Network
The rank list of MiF-CNN is shown in Figure 3. In the results where rank 1 is incorrect (pedestrian B and pedestrian C), there is a big difference in attribute features between the top-ranking negative samples and the query image, including gender, clothing, and whether the pedestrian carries a handbag. Hence, this paper recognizes person attributes to improve the accuracy of person re-id.

Based on the encoder-decoder idea, Xu et al. [26] proposed a neural network model that can learn and generate descriptions of the content of images. The model used a CNN as the encoder to extract image features, which were then fed into a recurrent neural network (RNN) to be decoded into language captions of the images. Inspired by that, this paper presents a person attribute recognition network (PARN) with LSTM and an attention mechanism. The proposed PARN takes the pedestrian features extracted by MiF-CNN as input and outputs the attribute information of pedestrian images. The architecture of PARN is shown in Figure 4.
4.1 Input of PARN. In PARN, the input is the feature maps before the last fully connected layer in MiF-CNN. The input feature is split into $n$ feature vectors, each of which corresponds to a part of the image. Each feature vector is a $D_X$-dimensional vector, which is represented as

$$X = \{x_1, x_2, \ldots, x_n\}, \quad x_i \in \mathbb{R}^{D_X}. \tag{13}$$

Following natural language processing practice, PARN transforms the words in person attribute labels into word embeddings. As a part of the input of PARN, each word embedding is a $D_Y$-dimensional vector:

$$y = \{y_1, y_2, \ldots, y_m\}, \quad y_i \in \mathbb{R}^{D_Y}, \tag{14}$$

where $m$ is the number of attribute words for each pedestrian image and $y_i$ is the word embedding corresponding to each attribute word.

For attribute words, the common one-hot encoding treats each word as an independent item, which ignores the correlation among words. Word embedding, in contrast, represents each word as a continuous dense vector, which places correlated words closer together in the embedding space.
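The difference between the two encodings can be made concrete with a small sketch. The vocabulary, the embedding dimension, and the randomly initialised embedding table below are hypothetical. In a trained network the table would be learned, so the random values only illustrate the mechanics.

```python
import numpy as np

# Hypothetical attribute vocabulary; the real word lists depend on the dataset.
vocab = ["male", "female", "short hair", "long hair"]
word_to_index = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Sparse encoding: a single 1 at the word's index."""
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec


def embed(word, table):
    """Dense encoding: row lookup, equivalent to one_hot(word) @ table."""
    return table[word_to_index[word]]


rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 3))   # D_Y = 3, assumed for the example
```

Any two distinct one-hot vectors have a dot product of zero, so they cannot express that two attribute words are related, whereas rows of a learned embedding table can move closer together during training.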
4.2 Attention Mechanism. The attention mechanism has been widely used in natural language processing and computer vision. By measuring the correlation between the output and different parts of the input, the attention mechanism assigns different weights to different parts of the input, enabling the network to use the more important feature information for prediction and to reduce the dimension of the input data [27].
In practice, a given attribute of a pedestrian corresponds only to a certain part of the image. For example, when recognizing whether a pedestrian is wearing a hat, people usually pay attention only to the area above the head of the pedestrian rather than to areas irrelevant to the attribute. Therefore, before the image features are fed into the LSTM network, an attention mechanism is introduced to calculate the correlation between different positions of the image features and the hidden state of the LSTM at the previous time step. The schematic diagram of the attention mechanism is shown in Figure 5.
At time $t$, a fully connected layer and the tanh function are used to integrate the information of the input feature vector and the hidden state of the LSTM at the previous time step, which is formulated as

$$v_i = \tanh\left(W_{attX} x_i + W_{atth} h_{t-1}\right), \tag{15}$$

where $W_{attX}$ and $W_{atth}$ represent the fully connected layer weights of $x_i$ and $h_{t-1}$, respectively. The attention weight $\alpha_i$ is obtained by applying the softmax function to the score vector $v_i$:

$$\alpha_i = \mathrm{softmax}\left(W_{attv} v_i\right), \tag{16}$$

where $W_{attv}$ represents the fully connected layer weights of $v_i$. The output $Z$ is the sum of all $x_i$ weighted by $\alpha_i$:

$$Z = \sum_{i=1}^{n} \alpha_i x_i. \tag{17}$$

The feature $Z$ highlights the local features that are helpful for prediction and suppresses the local features that contribute little, which reduces the dimension of the features to some extent. It makes the LSTM network focus on the parts of the input features that are more strongly correlated with the prediction when recognizing person attributes, which improves the efficiency and prediction accuracy of the network.
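Equations (15) to (17) can be sketched in NumPy as below. The sketch assumes that $W_{attv}$ maps each $v_i$ to a scalar score and that the softmax normalises these scores over the $n$ image parts. The attention dimension, the argument names, and the array shapes are assumptions made for illustration and are not specified in the paper.

```python
import numpy as np


def attention(X, h_prev, W_att_x, W_att_h, w_att_v):
    """Soft attention over n part features.

    X:        (n, D_X)   part feature vectors x_1..x_n
    h_prev:   (D_h,)     LSTM hidden state at the previous time step
    W_att_x:  (D_a, D_X) weights applied to each x_i
    W_att_h:  (D_a, D_h) weights applied to h_prev
    w_att_v:  (D_a,)     weights mapping each v_i to a scalar score
    """
    V = np.tanh(X @ W_att_x.T + W_att_h @ h_prev)    # eq. (15), shape (n, D_a)
    scores = V @ w_att_v                             # one score per part, shape (n,)
    scores = scores - scores.max()                   # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()    # eq. (16), weights sum to 1
    Z = alpha @ X                                    # eq. (17), shape (D_X,)
    return Z, alpha
```

The output $Z$ has the dimension of a single part feature rather than of all $n$ parts together, which is the dimension reduction mentioned in the text.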
4.3 LSTM. A plain RNN updates its network parameters with back propagation, which easily suffers from the vanishing gradient when there is a long gap between the relevant information and the current position to predict. The LSTM network handles this problem well because of its structure. The architecture of the LSTM unit is shown in Figure 6.
Here, $c$ is the cell state for storing and transferring information; $f$ is the forget gate, which decides what should be discarded from the cell state; $i$ is the input gate, which decides what should be stored in the cell state; $g$ represents a vector of new candidate values that should be added to the cell state; $o$ is the output gate, which decides what parts of the cell state should be output to the next time step; $h$ is the hidden state of the LSTM; $x$ is the input of the LSTM; and $t$ denotes the current time step.
In PARN, at any time $t$, the input of the LSTM consists of two parts: the word embedding $y_{t-1}$ at the previous time step and the image features $Z_t$ at the current time step. The operation of the LSTM in PARN is formulated as

$$\begin{cases} f = \mathrm{sigmoid}\left(W_f \left[h_{t-1}, y_{t-1}, Z_t\right] + b_f\right), \\ i = \mathrm{sigmoid}\left(W_i \left[h_{t-1}, y_{t-1}, Z_t\right] + b_i\right), \\ g = \tanh\left(W_g \left[h_{t-1}, y_{t-1}, Z_t\right] + b_g\right), \\ o = \mathrm{sigmoid}\left(W_o \left[h_{t-1}, y_{t-1}, Z_t\right] + b_o\right), \\ c_t = f \odot c_{t-1} + i \odot g, \\ h_t = o \odot \tanh\left(c_t\right), \end{cases} \tag{18}$$
where $W$ and $b$ represent the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that decides what proportion of the values passes through each gate. The tanh function is the activation function of the input. This design lets the LSTM assign different weights to information at different times, so that it can choose which information to remember or forget in a long-term sequence.
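A single LSTM time step with the concatenated input $[h_{t-1}, y_{t-1}, Z_t]$ can be sketched as follows. The candidate vector $g$ uses tanh here, as in the standard LSTM and in line with the description of tanh as the activation function of the input. Storing the weights and biases in dictionaries keyed by gate name is a choice made for this example only.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(h_prev, c_prev, y_prev, Z_t, W, b):
    """One LSTM step.

    W[k] has shape (D_h, D_h + D_Y + D_X) and b[k] has shape (D_h,)
    for each gate key k in {"f", "i", "g", "o"}.
    """
    u = np.concatenate([h_prev, y_prev, Z_t])   # [h_{t-1}, y_{t-1}, Z_t]
    f = sigmoid(W["f"] @ u + b["f"])            # forget gate
    i = sigmoid(W["i"] @ u + b["i"])            # input gate
    g = np.tanh(W["g"] @ u + b["g"])            # candidate values
    o = sigmoid(W["o"] @ u + b["o"])            # output gate
    c_t = f * c_prev + i * g                    # new cell state
    h_t = o * np.tanh(c_t)                      # new hidden state
    return h_t, c_t
```

In PARN this step would be repeated for each attribute word, with the attention output $Z_t$ recomputed from $h_{t-1}$ before every step.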
The number of time steps of the LSTM in PARN is set to $m$, the number of attributes of each pedestrian image. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and a softmax function to predict a pedestrian attribute word.
It is worth noting that, at each time step, the input word embedding of the LSTM is the one that the LSTM learned at the previous time step, which takes advantage of the strength of the LSTM in solving long-term sequence problems. Because of the correlation among pedestrian attributes (for example, a pedestrian with the attribute “male” generally also has the attributes “short hair” and “pants”), the LSTM can use the previous attribute results to predict the current attribute more accurately.

Figure 3: Identification results of MiF-CNN (green bounding boxes represent positive samples in the gallery). (Image grid: queries A, B, and C, each with a rank list of positions 1 to 10.)

Figure 4: Architecture of PARN. (Diagram labels: MiF-CNN produces the features $X$; at each time step $t = 1, 2, 3, \ldots, m$, an attention module feeds an LSTM that receives $y_{t-1}$ and $h_{t-1}$ and outputs $u_t$.)
4.4 Network Optimization. The loss function of PARN includes two parts. One is the cross-entropy between the prediction and the ground truth of the network, which is expressed as

$$L_{sa} = \sum_{i=1}^{m} u' \log \left( \frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}} \right), \tag{19}$$

where $u$ is the prediction value of the network, $u'$ is the ground truth of the pedestrian labels, $m$ is the number of time steps of the LSTM, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the loss accumulated over the $m$ time steps. The other part is the loss function of the attention mechanism, as described in [40]:

$$L_{\alpha} = \frac{1}{2n} \sum_{i=1}^{n} \left( 1 - \sum_{j=1}^{m} \alpha_i^{j} \right)^{2}, \tag{20}$$

where $\alpha_i^{j}$ represents the attention weight of the image feature $x_i$ at the $j$-th time step and $n$ is the number of parts in the image feature. The objective function of PARN to optimize is

$$J = -\frac{1}{M} \sum_{i=1}^{M} \left( L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i} \right), \tag{21}$$

where $M$ is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
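The two loss terms can be sketched as follows. This is a sketch under explicit assumptions rather than the authors' training code: the attribute term is computed as a log-likelihood summed over the $m$ time steps, the attention term penalises image parts whose attention weights do not sum to 1 over time, and the batch objective is taken to be the negative log-likelihood plus $\lambda$ times the attention penalty, averaged over the batch. That sign convention is an interpretation of the intent of equation (21), and all function names are hypothetical.

```python
import numpy as np


def attribute_log_likelihood(probs, targets):
    """Sum over time steps of log p(true attribute word).

    probs:   (m, n_w) softmax outputs, one row per time step
    targets: (m,)     index of the true attribute word at each step
    """
    return np.log(probs[np.arange(len(targets)), targets]).sum()


def attention_penalty(alpha):
    """1/(2n) * sum_i (1 - sum_j alpha[j, i])^2 for alpha of shape (m, n)."""
    n = alpha.shape[1]
    return ((1.0 - alpha.sum(axis=0)) ** 2).sum() / (2 * n)


def parn_objective(batch_probs, batch_targets, batch_alpha, lam):
    """Batch average of negative log-likelihood plus weighted attention penalty."""
    total = 0.0
    for probs, targets, alpha in zip(batch_probs, batch_targets, batch_alpha):
        total += -attribute_log_likelihood(probs, targets) + lam * attention_penalty(alpha)
    return total / len(batch_probs)
```

The penalty is zero only when every image part receives a total attention of exactly 1 across the $m$ steps, which discourages the network from ignoring parts of the image entirely.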
5 Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ including $N$ pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed with the Euclidean distance as

$$d(q_0, g_i) = \sqrt{\sum_{k} \left( x_{q_0, k} - x_{g_i, k} \right)^{2}}, \tag{22}$$

where $x_{q_0}$ and $x_{g_i}$ represent the features of $q_0$ and $g_i$, respectively, that are extracted by MiF-CNN, and $k$ indexes the feature dimensions. The initial rank list $R_0 = \{g_1^{0\prime}, g_2^{0\prime}, \ldots, g_N^{0\prime}\}$ is obtained by sorting the similarity distances $d(q_0, g_i)$ in ascending order, where $d(q_0, g_i^{0\prime}) < d(q_0, g_{i+1}^{0\prime})$. If the top-1 gallery image $g_1^{0\prime}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top 5, rank 5 is correct.
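The construction of the initial rank list can be sketched as below, assuming the query and gallery features are already available as NumPy arrays. The function names and the use of integer identity labels are choices made for the example.

```python
import numpy as np


def initial_rank_list(query_feature, gallery_features):
    """Sort gallery indices by ascending Euclidean distance to the query.

    query_feature:    (d,)   feature of the probe image
    gallery_features: (N, d) features of the N gallery images
    """
    distances = np.linalg.norm(gallery_features - query_feature, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances[order]


def rank_k_correct(order, gallery_ids, query_id, k):
    """True if a gallery image of the query identity is among the top k."""
    return query_id in gallery_ids[order[:k]]
```

For instance, `rank_k_correct(order, gallery_ids, query_id, 1)` checks whether rank 1 is correct, and the same call with `k=5` checks rank 5.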
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute feature similarity between $q_0$ and $g_i$, so that more positive gallery images obtain a higher ranking in $R_0$, which can improve the performance of person re-id. To be specific, when rank 5 is correct but rank 1 is not, the proposed algorithm assigns an initial feature score $s_f$ to each of the top-5 gallery images $g_1^{0\prime} \sim g_5^{0\prime}$ according to their ranking in $R_0$: the higher the ranking, the higher the $s_f$. Then, the proposed algorithm assigns an attribute score $s_a$ to $g_1^{0\prime} \sim g_5^{0\prime}$ according to the number of their attributes that are the same as those of $q_0$: the more attributes in common, the higher the $s_a$. The total score of each gallery image is $s = s_f (1 - c) + s_a c$, where $c$ is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0\prime} \sim g_5^{0\prime}$ in descending order of their total score $s$. For $M$ query images, the whole process of the attribute-aided reranking algorithm is shown in Algorithm 1.
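The score fusion described above can be sketched for a single query as follows. The paper does not give the exact form of $s_f$ and $s_a$, so this sketch assumes a feature score that decreases linearly with the initial rank and an attribute score equal to the fraction of attributes shared with the query. The window size `k` is passed in by the caller, and all names are hypothetical.

```python
def rerank_top_k(rank_list, gallery_attrs, query_attrs, k, c):
    """Rerank the first k entries of rank_list by a fused score.

    rank_list:     gallery identifiers in initial order (best first)
    gallery_attrs: dict mapping identifier -> tuple of predicted attributes
    query_attrs:   tuple of predicted attributes of the query
    c:             score weight in [0, 1]
    """
    scored = []
    for position, gid in enumerate(rank_list[:k]):
        s_f = (k - position) / k                       # assumed: linear in rank
        matches = sum(a == b for a, b in zip(gallery_attrs[gid], query_attrs))
        s_a = matches / len(query_attrs)               # assumed: share of equal attributes
        scored.append((s_f * (1 - c) + s_a * c, gid))
    # Stable sort in descending order of s; ties keep their initial order.
    reranked = [gid for _, gid in sorted(scored, key=lambda item: -item[0])]
    return reranked + list(rank_list[k:])
```

With `c = 0` the initial order is returned unchanged, and larger values of `c` let attribute agreement override the feature distance more strongly.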
6 Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets, as well as the effectiveness of the proposed attribute-aided reranking algorithm in improving the accuracy of person re-id. An analysis of the experimental results is given after comparing our results with state-of-the-art person re-id methods.
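Figure 5: Schematic diagram of the attention mechanism. (Diagram labels: each feature vector $x_1, x_2, x_3, \ldots, x_n$ of $X$ is combined with $h_{t-1}$ through $W_{att}$ and tanh to give $v_1, v_2, v_3, \ldots, v_n$; softmax gives the attention weights $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_n$; the weighted features form $Z$.)

Figure 6: Architecture of the LSTM unit. (Diagram labels: the input $x_t$ and $h_{t-1}$ pass through $W_f$, $W_i$, $W_g$, and $W_o$ to give $f$, $i$, $g$, and $o$, using three sigmoid ($\sigma$) units and two tanh units; the cell state goes from $c_{t-1}$ to $c_t$ and the hidden state from $h_{t-1}$ to $h_t$.)

6.1 Datasets and Evaluation Protocol. This paper evaluates the proposed methods on three challenging public person re-id datasets: VIPeR [28], CUHK01 [29], and Market1501 [30].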
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the scarcity of image samples, the low resolution, and the large changes in pose, viewpoint, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: images of 316 persons for training and images of the remaining 316 persons for testing.

CUHK01 Dataset. It consists of 3884 images of 971 persons, captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.

Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.

Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identifications [31]. For the multishot dataset Market1501, in addition to CMC, the mean average precision (mAP) is used to evaluate the performance of the proposed methods.
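The two evaluation measures can be sketched in plain Python as follows. The sketch uses the textbook definitions of CMC and average precision. Dataset-specific details of the official Market1501 protocol, such as the handling of distractor or same-camera images, are not modelled, and the function names are hypothetical.

```python
def cmc(first_hit_ranks, max_rank):
    """CMC curve: share of queries whose first correct match is within rank k.

    first_hit_ranks: 1-based rank of the first correct gallery match per query
    """
    n = len(first_hit_ranks)
    return [sum(r <= k for r in first_hit_ranks) / n for k in range(1, max_rank + 1)]


def average_precision(relevance):
    """Average precision of one rank list given 0/1 relevance flags."""
    hits, total = 0, 0.0
    for position, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / position      # precision at this correct match
    return total / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """mAP: mean of the per-query average precision values."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)
```

CMC only looks at the first correct match, which suits single-shot datasets, while mAP rewards ranking all correct matches of a query highly, which matters when the gallery holds several images per person.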
6.2 Experimental Results on Attribute Recognition. The attributes in PARN are pedestrian identity-level attributes. For the VIPeR dataset, the attributes include “gender,” “length of hair,” “lower clothing,” “upper clothing,” “backpack or not,” and “carrying anything or not.” For the CUHK01 dataset, the attributes include “gender,” “length of hair,” “backpack or not,” “handbag or not,” “color of upper,” and “color of lower.” For the Market1501 dataset, the attributes include “gender,” “length of hair,” “wearing hat or not,” “lower clothing,” “backpack or not,” “handbag or not,” “length of sleeve,” and “length of lower clothing.” Each person in each dataset has an attribute word list, as shown in Figure 7.
This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the strong person attribute recognition method APR [23]. The experimental results are reported in Table 1, where “Lslv” represents the “length of sleeve” and “Llow” represents the “length of lower clothing.”
It can be observed from Table 1 that PARN obtains competitive recognition accuracy on the 8 attributes. In particular, its accuracy on “wearing hat or not” (96.67% versus 88.21%) and “handbag or not” (85.18% versus 76.01%) and its mean accuracy (87.55% versus 87.23%) are higher than those of APR, whereas its accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the strong attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3 Experimental Results on Person re-id. This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.

As shown in Table 2, the proposed MiF-CNN model performs competitively among state-of-the-art methods, and the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps to increase person re-id accuracy. The improvement is especially obvious on the VIPeR and CUHK01 datasets, where the identification accuracy at rank 1, rank 5, and rank 10 improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. The proposed MiF-CNN model with the attribute-aided reranking algorithm achieves the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset among the compared methods.
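The reported improvements can be checked directly against Table 2. The short sketch below reads the MiF and MiF + PARN entries of Table 2 as percentages with two decimal places and computes the gain at rank 1, rank 5, and rank 10. The dictionary layout and the function name are chosen for the example.

```python
# Rank 1, rank 5, and rank 10 accuracy (%) read from Table 2.
mif = {"VIPeR": (40.82, 68.35, 81.01), "CUHK01": (66.87, 85.39, 89.51)}
mif_parn = {"VIPeR": (46.84, 74.68, 83.23), "CUHK01": (71.81, 90.33, 91.77)}


def gains(before, after):
    """Absolute accuracy gain per rank, rounded to two decimals."""
    return tuple(round(a - b, 2) for b, a in zip(before, after))


if __name__ == "__main__":
    for name in mif:
        print(name, gains(mif[name], mif_parn[name]))
```

The differences are absolute gains in percentage points, and they reproduce the figures quoted in the text for both datasets.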
6.4 Analysis of Experimental Results. Because of the small number of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. In the M3TCP method, the structure and hyperparameters of the deep CNN model need to be adjusted manually to adapt to the scale of different datasets, so as to overcome the training problem of the deep neural network.

In contrast, the proposed MiF-CNN has a fixed structure and can be trained directly on smaller datasets without any fine-tuning. In spite of its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.

Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that MiF-CNN with the attribute-aided reranking method enables lower-ranked positive samples in the rank list of MiF-CNN to obtain a higher ranking, which improves the accuracy of person re-id and verifies the effectiveness of the attribute-aided reranking algorithm for improving the performance of a person re-id model.
Algorithm 1: The attribute-aided reranking algorithm.
Input: initial rank list $R_0$
Output: the reranked rank list $R^{*}$
(1) for $i = 1, 2, \ldots, M$ do
(2)   if rank 1 is correct then
(3)     $R^{*} = R_0$
(4)   end if
(5)   elif rank 5 is correct then
(6)     Assign the initial feature score $s_f$ to the top-5 gallery images $g_1^{i\prime} \sim g_5^{i\prime}$ according to their ranking in $R_0$
(7)     Assign the attribute score $s_a$ to $g_1^{i\prime} \sim g_5^{i\prime}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f (1 - c) + s_a c$
(9)     Rerank $g_1^{i\prime} \sim g_5^{i\prime}$ in descending order of their total score $s$ and obtain $R^{*}$
(10)  end if
(11)  elif rank 10 is correct then
(12)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_{10}^{i\prime}$ and repeat steps 6–10
(13)  end if
(14)  else
(15)    Change $g_1^{i\prime} \sim g_5^{i\prime}$ to $g_1^{i\prime} \sim g_M^{i\prime}$ and repeat steps 6–10
(16)  end if
(17) end for

Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.

Table 1: Recognition accuracy of different attributes in PARN and APR (%).

Method  Gender  Hair    Clothes  Hat     Backpack  Handbag  Lslv    Llow    Mean
APR     85.78   83.46   91.36    88.21   86.32     76.01    94.12   92.64   87.23
PARN    84.93   79.10   89.35    96.67   83.86     85.18    92.84   88.45   87.55

7 Conclusion

This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes to further improve the accuracy of person re-id. The proposed MiF-CNN model reuses feature maps and gradient information, which enhances the flow of features through the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step in order to reduce the dimension of the features. It also employs an LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list, so as to further improve identification accuracy.

The experimental results on three public person re-id datasets indicate the strong performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, further improvement and optimization will be carried out on the pedestrian attribute recognition network. Moreover, captions of person images or videos can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).

Methods           VIPeR                   CUHK01                  Market1501
                  Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 mAP
DNS [32]          42.28   71.46   82.94   64.98   84.96   89.92   67.96   -       -       41.89
DLDA [33]         44.11   72.59   81.66   67.12   89.45   91.68   48.15   -       -       29.94
FT-JSTL+DGD [11]  38.6    -       -       66.60   -       -       -       -       -       -
PDC [34]          51.27   74.05   84.18   -       -       -       84.14   92.73   94.92   63.41
K-means-CNN [35]  46.50   69.30   80.70   53.50   82.50   91.20   -       -       -       -
Spindle [9]       53.80   74.10   83.20   -       -       -       76.90   91.50   94.60   -
PersonNet [36]    -       -       -       71.14   90.07   95.00   37.21   -       -       18.57
CSBT [37]         36.60   66.20   88.30   51.20   76.30   91.80   42.90   -       -       20.30
SDH-CNN [38]      -       -       -       -       -       -       58.12   68.50   80.82   48.20
M3TCP [21]        -       -       -       53.70   84.30   91.00   -       -       -       -
DM3 [39]          42.70   74.30   85.10   49.70   77.30   86.10   75.80   89.10   92.40   53.20
MiF               40.82   68.35   81.01   66.87   85.39   89.51   78.92   86.46   89.28   66.00
MiF + PARN        46.84   74.68   83.23   71.81   90.33   91.77   79.39   88.95   91.36   66.46

“MiF” represents the MiF-CNN method and “MiF + PARN” represents the MiF-CNN with the attribute-aided reranking method. A hyphen marks a result that is not reported.

Figure 8: Rank list comparison between MiF and MiF + PARN. (Image grid: for each of two queries, the rank lists of positions 1 to 10 produced by MiF and by MiF + PARN.)
References
[1] L. Li and J. Guo, “Pedestrian re-identification based on reinforced deep feature fusion,” Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, “Cross-view semantic projection learning for person re-identification,” Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., “Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method,” Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, “Deep multi-task network for learning person identity and attributes,” IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, “A survey of person re-identification,” Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., “Deep adaptive feature embedding with local sample distributions for person re-identification,” Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Li, Y. Yang, and A. G. Hauptmann, “Person re-identification: past, present and future,” 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, “A deep model with combined losses for person re-identification,” Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., “Spindle net: person re-identification with human body region guided feature decomposition and fusion,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, “Beyond triplet loss: a deep quadruplet network for person re-identification,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., “Learning deep feature representations with domain guided dropout for person re-identification,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning architecture for person re-identification,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., “Person re-identification by symmetry-driven accumulation of local features,” in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., “Salient color names for person re-identification,” in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Liao, M. Cristani, A. Perina et al., “Multiple-shot person re-identification by chromatic and epitomic analyses,” Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, “Improving pedestrian detection with selective gradient self-similarity feature,” Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., “Person re-identification by attributes,” in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 8, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. van den Hengel, “PersonNet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.
12 Computational Intelligence and Neuroscience
Computer Games Technology
International Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Journal ofEngineeringVolume 2018
Advances in
FuzzySystems
Hindawiwwwhindawicom
Volume 2018
International Journal of
ReconfigurableComputing
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
thinspArtificial Intelligence
Hindawiwwwhindawicom Volumethinsp2018
Hindawiwwwhindawicom Volume 2018
Civil EngineeringAdvances in
Hindawiwwwhindawicom Volume 2018
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawiwwwhindawicom Volume 2018
Hindawi
wwwhindawicom Volume 2018
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Engineering Mathematics
International Journal of
RoboticsJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Computational Intelligence and Neuroscience
Hindawiwwwhindawicom Volume 2018
Mathematical Problems in Engineering
Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Hindawiwwwhindawicom Volume 2018
Human-ComputerInteraction
Advances in
Hindawiwwwhindawicom Volume 2018
Scientic Programming
Submit your manuscripts atwwwhindawicom
4 Person Attributes Recognition Network
The rank list of MiF-CNN is shown in Figure 3. In the incorrect rank 1 identification results (pedestrian B and pedestrian C), the top-ranking negative samples differ markedly from the query image in attribute features, including gender, clothing, and whether a handbag is carried. Hence, this paper recognizes person attributes to improve the accuracy of person re-id.
Based on the encoder-decoder idea, Xu et al. [26] proposed a neural network model that can learn and generate descriptions of image content. The model used a CNN as the encoder to extract image features, which were then fed into a recurrent neural network (RNN) and decoded into natural-language captions. Inspired by this, this paper presents a person attribute recognition network (PARN) with LSTM and an attention mechanism. The proposed PARN takes the pedestrian features extracted by MiF-CNN as input and outputs the attribute information of pedestrian images. The architecture of PARN is shown in Figure 4.
4.1 Input of PARN. In PARN, the input is the feature maps taken before the last fully connected layer of MiF-CNN. The input feature is split into $n$ feature vectors, each of which corresponds to a part of the image. Each feature vector is a $D_X$-dimensional vector, and the set is represented as
$$X = \{x_1, x_2, \ldots, x_n\}, \quad x_i \in \mathbb{R}^{D_X}. \tag{13}$$
Following natural language processing practice, PARN transforms the words in person attribute labels into word embeddings. As a part of the input of PARN, each word embedding is a $D_Y$-dimensional vector:
$$y = \{y_1, y_2, \ldots, y_m\}, \quad y_i \in \mathbb{R}^{D_Y}, \tag{14}$$
where $m$ is the number of attribute words of each pedestrian image and $y_i$ is the word embedding corresponding to the $i$-th attribute word.
For attribute words, the common one-hot encoding treats each word as an isolated symbol, which ignores the correlation among words. In contrast, word embedding represents each word as a continuous dense vector, which places correlated words closer together in the embedding space.
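The difference between the two encodings can be illustrated with a small sketch. The vocabulary and the two-dimensional embedding values below are hypothetical and chosen by hand; in PARN the embeddings would be learned, and their dimension $D_Y$ is not specified here.

```python
import math

def one_hot(index, size):
    """One-hot vector with a single 1 at the given index."""
    return [1.0 if k == index else 0.0 for k in range(size)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

vocab = ["male", "short hair", "pants", "dress"]

# Hypothetical 2-D embeddings, hand-picked for illustration only.
embedding = {
    "male": [0.9, 0.1],
    "short hair": [0.8, 0.3],
    "pants": [0.7, 0.2],
    "dress": [-0.6, 0.8],
}

# Distinct one-hot vectors are always orthogonal, so they carry no similarity.
sim_one_hot = cosine(one_hot(0, len(vocab)), one_hot(1, len(vocab)))

# Dense embeddings can place correlated words close together.
sim_embed = cosine(embedding["male"], embedding["short hair"])
sim_far = cosine(embedding["male"], embedding["dress"])
```

With these illustrative values, "male" and "short hair" are nearly parallel while "male" and "dress" point in opposing directions, which is the kind of structure one-hot codes cannot express.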
4.2 Attention Mechanism. The attention mechanism has been widely used in natural language processing and computer vision. By measuring the correlation between the output and different parts of the input, the attention mechanism assigns different weights to different parts of the input, enabling the network to use the more important feature information for prediction and to reduce the dimension of the input data [27].
In practice, a given pedestrian attribute usually corresponds to only a certain part of the image. For example, when recognizing whether a pedestrian is wearing a hat, people usually pay attention only to the area above the head of the pedestrian rather than to areas irrelevant to the attribute. Therefore, before the image features are fed into the LSTM network, an attention mechanism is introduced to calculate the correlation between different positions of the image features and the hidden state of the LSTM at the previous time step. The schematic diagram of the attention mechanism is shown in Figure 5.
At time $t$, a fully connected layer and the tanh function are used to integrate the information of the input feature vector and the hidden state of the LSTM at the previous time step, which is formulated as
$$v_i = \tanh\left(W_{attX} x_i + W_{atth} h_{t-1}\right), \tag{15}$$
where $W_{attX}$ and $W_{atth}$ represent the fully connected layer weights of $x_i$ and $h_{t-1}$, respectively. The attention weight $\alpha_i$ is obtained by applying the softmax function to the score vector $v_i$:
$$\alpha_i = \mathrm{softmax}\left(W_{attv} v_i\right), \tag{16}$$
where $W_{attv}$ represents the fully connected layer weights of $v_i$. The output $Z$ is the sum of all $x_i$ weighted by $\alpha_i$:
$$Z = \sum_{i=1}^{n} \alpha_i x_i. \tag{17}$$
The feature $Z$ highlights the local features that are helpful for prediction and suppresses the local features that contribute little to it, which reduces the dimension of the features to some extent. It makes the LSTM network focus on the parts of the input features that are more strongly correlated with the prediction when recognizing person attributes, and thereby improves the efficiency and prediction accuracy of the network.
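Equations (15)-(17) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `attend`, the helper names, and the hidden size of the attention layer are hypothetical, $W_{attv}$ is assumed to map each $v_i$ to a scalar score, and the softmax is assumed to normalize across the $n$ image parts.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(X, h_prev, W_attX, W_atth, w_attv):
    """Soft attention over n local features.

    X      : list of n feature vectors x_i, each of length D_X
    h_prev : previous LSTM hidden state h_{t-1}
    W_attX : K x D_X matrix, W_atth : K x D_h matrix (K is an assumed size)
    w_attv : length-K vector, assumed to reduce v_i to a scalar score
    Returns (alpha, Z).
    """
    proj_h = matvec(W_atth, h_prev)  # shared by all positions
    scores = []
    for x in X:
        # Equation (15): v_i = tanh(W_attX x_i + W_atth h_{t-1})
        v = [math.tanh(a + b) for a, b in zip(matvec(W_attX, x), proj_h)]
        scores.append(sum(w * vi for w, vi in zip(w_attv, v)))
    # Equation (16): weights normalized over the n positions
    alpha = softmax(scores)
    # Equation (17): Z = sum_i alpha_i x_i
    dim = len(X[0])
    Z = [sum(a * x[d] for a, x in zip(alpha, X)) for d in range(dim)]
    return alpha, Z
```

Because the weights sum to one, $Z$ has the same dimension as a single $x_i$ regardless of $n$, which is the sense in which attention reduces the input passed to the LSTM.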
4.3 LSTM. An ordinary RNN updates its network parameters with backpropagation, which easily suffers from vanishing gradients when there is a long gap between the relevant information and the position to be predicted. The LSTM network handles this problem well because of its gated structure. The architecture of the LSTM unit is shown in Figure 6.
Here, $c$ is the cell state for storing and transferring information; $f$ is the forget gate, which decides what should be discarded from the cell state; $i$ is the input gate, which decides what should be stored in the cell state; $g$ represents a vector of new candidate values that may be added to the cell state; $o$ is the output gate, which decides what parts of the cell state should be output to the next time step; $h$ is the hidden state of the LSTM; $x$ is the input of the LSTM; and $t$ denotes the current time step.
In PARN, at any time $t$, the input of the LSTM consists of two parts: the word embedding $y_{t-1}$ from the previous time step and the image features $Z_t$ at the current time step. The operation of the LSTM in PARN is formulated as
$$\begin{cases}
f = \mathrm{sigmoid}\left(W_f [h_{t-1}, y_{t-1}, Z_t] + b_f\right), \\
i = \mathrm{sigmoid}\left(W_i [h_{t-1}, y_{t-1}, Z_t] + b_i\right), \\
g = \tanh\left(W_g [h_{t-1}, y_{t-1}, Z_t] + b_g\right), \\
o = \mathrm{sigmoid}\left(W_o [h_{t-1}, y_{t-1}, Z_t] + b_o\right), \\
c_t = f \odot c_{t-1} + i \odot g, \\
h_t = o \odot \tanh(c_t),
\end{cases} \tag{18}$$
where $W$ and $b$ represent the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that determines what proportion of each value passes through a gate. The tanh function is the activation function of the input. This design lets the LSTM weight information from different time steps differently, so that it can choose which information to remember or forget in a long sequence.
The number of time steps of the LSTM in PARN is set to $m$, the number of attributes of each pedestrian image. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and a softmax function to predict the pedestrian attribute word.
It is worth noting that, at each time step, the input word embedding of the LSTM is the one predicted by the LSTM at the previous time step, which takes advantage of the strength of LSTM on long-sequence problems. Because pedestrian attributes are correlated (for example, a pedestrian with the attribute “male” generally also has the attributes “short hair” and “pants”), the LSTM can use the previous attribute result to predict the current attribute more accurately.
Figure 3: Identification results of MiF-CNN for queries A, B, and C (green bounding box represents positive samples in gallery).
Figure 4: Architecture of PARN.
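A single time step of equation (18) can be sketched in plain Python as below. The sketch rests on a few assumptions: the candidate vector $g$ uses tanh, as in the standard LSTM and consistent with the statement that tanh is the activation function of the input; the three inputs are concatenated into one vector; and the names `lstm_step`, `affine`, and the `params` layout are hypothetical.

```python
import math

def sigmoid(a):
    """Logistic function, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def affine(W, b, v):
    """Compute W v + b for a matrix W (list of rows) and a bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(h_prev, c_prev, y_prev, Z_t, params):
    """One LSTM step with input [h_{t-1}, y_{t-1}, Z_t].

    params maps each gate name ('f', 'i', 'g', 'o') to a (W, b) pair, where W
    has one row per hidden unit and len(h_prev) + len(y_prev) + len(Z_t) columns.
    All vectors are Python lists.
    """
    v = h_prev + y_prev + Z_t  # list concatenation
    f = [sigmoid(a) for a in affine(*params["f"], v)]    # forget gate
    i = [sigmoid(a) for a in affine(*params["i"], v)]    # input gate
    g = [math.tanh(a) for a in affine(*params["g"], v)]  # candidate values
    o = [sigmoid(a) for a in affine(*params["o"], v)]    # output gate
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t
```

In PARN this step would be applied $m$ times per image, with `y_prev` taken from the attribute word predicted at the previous step and `Z_t` from the attention output at the current step.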
4.4 Network Optimization. The loss function of PARN includes two parts. One is the cross-entropy between the prediction and the ground truth of the network, which is expressed as
$$L_{sa} = \sum_{i=1}^{m} u' \log\left( \frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}} \right), \tag{19}$$
where $u$ is the prediction value of the network, $u'$ is the ground truth of the pedestrian labels, $m$ is the number of LSTM time steps, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the loss accumulated over $m$ iterations. The other part is the loss function of the attention mechanism, as described in [40]:
$$L_{\alpha} = \frac{1}{2n} \sum_{i=1}^{n} \left( 1 - \sum_{j=1}^{m} \alpha_i^{j} \right)^{2}, \tag{20}$$
where $\alpha_i^{j}$ represents the attention weight of the image feature $x_i$ at time step $j$ and $n$ is the number of parts in the image feature. The objective function of PARN to optimize is
$$J = -\frac{1}{M} \sum_{i=1}^{M} \left( L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i} \right), \tag{21}$$
where $M$ is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
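The two loss terms and their combination can be sketched as follows. The sketch adopts the usual minimization convention, that is, a negative log-likelihood over the $m$ time steps plus $\lambda$ times the attention penalty, averaged over the batch; this sign convention is an assumption about the intended reading of equations (19)-(21), and the function names and data layout are hypothetical.

```python
import math

def attribute_nll(step_probs, target_ids):
    """Negative log-likelihood summed over the m time steps.

    step_probs[t] is the softmax distribution over the n_w attribute words at
    step t; target_ids[t] is the index of the ground-truth word at that step.
    """
    return -sum(math.log(p[k]) for p, k in zip(step_probs, target_ids))

def attention_penalty(alphas):
    """(1 / 2n) * sum_i (1 - sum_j alpha_i^j)^2, following equation (20).

    alphas[j][i] is the attention weight of image part i at time step j.
    """
    n = len(alphas[0])
    totals = [sum(step[i] for step in alphas) for i in range(n)]
    return sum((1.0 - s) ** 2 for s in totals) / (2.0 * n)

def batch_objective(samples, lam):
    """Mean over the batch of NLL + lam * attention penalty.

    samples is a list of (step_probs, target_ids, alphas) triples.
    """
    losses = [
        attribute_nll(p, y) + lam * attention_penalty(a)
        for p, y, a in samples
    ]
    return sum(losses) / len(losses)
```

The penalty is zero when every image part receives a total attention of one across the $m$ steps, so it discourages the network from ignoring any part of the image entirely.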
5 Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ containing $N$ pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed as the Euclidean distance
$$d(q_0, g_i) = \sqrt{\sum_{k} \left( x_{q_0,k} - x_{g_i,k} \right)^{2}}, \tag{22}$$
where $x_{q_0}$ and $x_{g_i}$ represent the features of $q_0$ and $g_i$, respectively, extracted by MiF-CNN, and the sum runs over the feature components $k$. The initial rank list $R_0 = \{g_1^{0'}, g_2^{0'}, \ldots, g_N^{0'}\}$ is obtained by sorting the similarity distances $d(q_0, g_i)$ in ascending order, where $d(q_0, g_i^{0'}) < d(q_0, g_{i+1}^{0'})$. If the top-1 gallery image $g_1^{0'}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top 5, rank 5 is correct.
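The initial ranking and the rank-$k$ check can be sketched as below. The feature vectors stand in for the features extracted by MiF-CNN, and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def initial_rank_list(query_feat, gallery_feats):
    """Return gallery indices sorted by ascending distance to the query."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])

def rank_k_correct(rank_list, gallery_ids, query_id, k):
    """True if a gallery image with the query identity is among the top k."""
    return any(gallery_ids[i] == query_id for i in rank_list[:k])
```

For example, with a query at the origin and gallery features `[[3.0, 4.0], [0.0, 1.0], [1.0, 1.0]]`, the distances are 5, 1, and about 1.41, so the rank list is `[1, 2, 0]`.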
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute feature similarity between $q_0$ and $g_i$, so that more positive gallery images obtain a higher ranking, which can improve the performance of person re-id. Specifically, when rank 5 is correct but rank 1 is not, the proposed algorithm assigns an initial feature score $s_f$ to the top-5 gallery images $g_1^{0'} \sim g_5^{0'}$ according to their ranking in $R_0$: the higher the ranking, the higher the $s_f$. Then the algorithm assigns an attribute score $s_a$ to $g_1^{0'} \sim g_5^{0'}$ according to the number of their attributes that match those of $q_0$: the more attributes match, the higher the $s_a$. The total score of each gallery image is $s = s_f(1 - c) + s_a c$, where $c$ is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0'} \sim g_5^{0'}$ in descending order of their total score $s$. For $M$ query images, the whole procedure of the attribute-aided reranking algorithm is given in Algorithm 1.
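The scoring step can be sketched as follows. The text does not give the exact values of $s_f$ and $s_a$, so the sketch assumes that $s_f$ falls linearly from 1 with the initial position and that $s_a$ is the fraction of query attributes shared with the gallery image; both choices, the default `c=0.5`, and the function name `rerank_top_k` are illustrative only.

```python
def rerank_top_k(rank_list, query_attrs, gallery_attrs, k=5, c=0.5):
    """Rerank the first k entries of rank_list by s = s_f * (1 - c) + s_a * c.

    rank_list     : gallery indices in ascending order of feature distance
    query_attrs   : attribute words predicted for the query image
    gallery_attrs : gallery_attrs[i] is the attribute word list of image i
    """
    top = rank_list[:k]
    scored = []
    for pos, idx in enumerate(top):
        # Assumed feature score: 1 for the first position, decreasing linearly.
        s_f = (len(top) - pos) / len(top)
        # Assumed attribute score: share of query attributes that match.
        shared = len(set(query_attrs) & set(gallery_attrs[idx]))
        s_a = shared / len(query_attrs) if query_attrs else 0.0
        scored.append((s_f * (1 - c) + s_a * c, idx))
    # sorted() is stable, so ties keep their original relative order.
    reranked = [idx for _, idx in sorted(scored, key=lambda t: -t[0])]
    return reranked + rank_list[k:]
```

With `c = 0` the attribute score is ignored and the initial order is returned unchanged; larger values of `c` let attribute agreement move a gallery image past candidates that are closer in feature distance.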
6 Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets and the effectiveness of the proposed attribute-aided reranking algorithm in improving the accuracy of person re-id. An analysis of the experimental results is given after comparison with state-of-the-art person re-id methods.
Figure 5: Schematic diagram of attention mechanism.
Figure 6: Architecture of the LSTM unit.
6.1 Datasets and Evaluation Protocol. This paper evaluates the proposed methods on three challenging public person re-id datasets: VIPeR [28], CUHK01 [29], and Market1501 [30].
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the small number of image samples, the low resolution, and the large changes in pose, viewpoint, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: the images of 316 persons for training and the images of the remaining 316 persons for testing.
CUHK01 Dataset. It consists of 3884 images of 971 persons, captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.
Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.
Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identification [31]. For the multishot dataset Market1501, the mean average precision (mAP) is used in addition to CMC to evaluate the performance of the proposed methods.
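The two metrics can be sketched in plain Python as below. This is a simplified illustration: it treats every gallery image with the query identity as a correct match and omits dataset-specific filtering (for example, excluding same-camera or distractor images), which benchmark protocols typically apply. The function names are hypothetical.

```python
def cmc(rank_lists, gallery_ids, query_ids, max_rank=10):
    """CMC curve: fraction of queries whose first correct match is within rank k."""
    hits = [0] * max_rank
    for ranks, qid in zip(rank_lists, query_ids):
        for pos, idx in enumerate(ranks[:max_rank]):
            if gallery_ids[idx] == qid:
                # A match at position pos counts for every rank >= pos + 1.
                for k in range(pos, max_rank):
                    hits[k] += 1
                break
    return [h / len(query_ids) for h in hits]

def average_precision(ranks, gallery_ids, qid):
    """Average precision of one ranked list over all correct matches."""
    found, precisions = 0, []
    for pos, idx in enumerate(ranks, start=1):
        if gallery_ids[idx] == qid:
            found += 1
            precisions.append(found / pos)
    return sum(precisions) / found if found else 0.0

def mean_average_precision(rank_lists, gallery_ids, query_ids):
    """Mean of the per-query average precision values."""
    aps = [average_precision(r, gallery_ids, q) for r, q in zip(rank_lists, query_ids)]
    return sum(aps) / len(aps)
```

CMC only looks at the first correct match, which suits single-shot datasets with one positive per query, whereas mAP also rewards ranking all positives highly, which matters when each identity has several gallery images.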
6.2 Experimental Results on Attribute Recognition. The attributes in PARN refer to pedestrian identity-level attributes. For the VIPeR dataset, the attributes include “gender”, “length of hair”, “lower clothing”, “upper clothing”, “backpack or not”, and “carrying anything or not”. For the CUHK01 dataset, the attributes include “gender”, “length of hair”, “backpack or not”, “handbag or not”, “color of upper”, and “color of lower”. For the Market1501 dataset, the attributes contain “gender”, “length of hair”, “wearing hat or not”, “lower clothing”, “backpack or not”, “handbag or not”, “length of sleeve”, and “length of lower clothing”. Each person in each dataset owns an attribute word list, as shown in Figure 7.
This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the person attribute recognition method APR [23]. The experimental results are reported in Table 1, where “Lslv” represents the “length of sleeve” and “Llow” represents the “length of lower clothing”.
It can be observed from Table 1 that PARN achieves competitive recognition accuracy on the 8 attributes. In particular, its accuracy on “wearing hat or not” and “handbag or not” and its mean accuracy are higher than those of APR, whereas its accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3 Experimental Results on Person re-id. This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.
As shown in Table 2, the proposed MiF-CNN model performs competitively among state-of-the-art methods. In addition, the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps to increase the person re-id accuracy. The improvement is especially obvious on the VIPeR and CUHK01 datasets, where the identification accuracy at rank 1, rank 5, and rank 10 improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. The proposed MiF-CNN model with the attribute-aided reranking algorithm achieves the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset among the compared methods.
6.4 Analysis of Experimental Results. Because of the small number of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. In the M3TCP method, the structure and hyperparameters of the deep CNN model need to be adjusted manually to adapt to the scale of different datasets, so as to overcome the training problem of the deep neural network.
In contrast, the proposed MiF-CNN has a fixed structure and can be trained directly on smaller datasets without any fine-tuning. In spite of its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model resists overfitting, generalizes well, and performs strongly in person re-id.
Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that the attribute-aided reranking method moves lower-ranked positive samples in the rank list of MiF-CNN to higher positions, which improves the accuracy of person re-id and verifies the effectiveness of the attribute-aided reranking algorithm.
Algorithm 1: The attribute-aided reranking algorithm.
Input: Initial rank list $R_0$
Output: The reranked list $R^{*}$
(1) for $i = 1, 2, \ldots, M$ do
(2)   if rank 1 is correct then
(3)     $R^{*} = R_0$
(4)   else if rank 5 is correct then
(5)     Assign the initial feature score $s_f$ to the top-5 gallery images $g_1^{i'} \sim g_5^{i'}$ according to their ranking in $R_0$
(6)     Assign the attribute score $s_a$ to $g_1^{i'} \sim g_5^{i'}$ according to the number of their attributes that are the same as those of $q_i$
(7)     $s = s_f(1 - c) + s_a c$
(8)     Rerank $g_1^{i'} \sim g_5^{i'}$ in descending order of their total score $s$ and obtain $R^{*}$
(9)   else if rank 10 is correct then
(10)    Change $g_1^{i'} \sim g_5^{i'}$ to $g_1^{i'} \sim g_{10}^{i'}$ and repeat steps (5)–(8)
(11)  else
(12)    Change $g_1^{i'} \sim g_5^{i'}$ to $g_1^{i'} \sim g_N^{i'}$ and repeat steps (5)–(8)
(13)  end if
(14) end for
Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.
Table 1: Recognition accuracy of different attributes in PARN and APR (%).
| Method | Gender | Hair | Clothes | Hat | Backpack | Handbag | Lslv | Llow | Mean |
| APR | 85.78 | 83.46 | 91.36 | 88.21 | 86.32 | 76.01 | 94.12 | 92.64 | 87.23 |
| PARN | 84.93 | 79.10 | 89.35 | 96.67 | 83.86 | 85.18 | 92.84 | 88.45 | 87.55 |
7 Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes to further improve the accuracy of person re-id. The proposed MiF-CNN model reuses feature maps and gradient information, which enhances the flow of features through the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step, which reduces the dimension of the features. It also employs LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples so that more positive samples rank higher in the rank list, further improving the identification accuracy.
The experimental results on three public person re-id datasets indicate the strong performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, the pedestrian attribute recognition network will be further improved and optimized. Moreover, captions of person images or videos could be obtained by an LSTM model with natural language processing ideas, which would make person re-id methods more significant in practice.
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).
| Methods | VIPeR Rank 1 | VIPeR Rank 5 | VIPeR Rank 10 | CUHK01 Rank 1 | CUHK01 Rank 5 | CUHK01 Rank 10 | Market1501 Rank 1 | Market1501 Rank 5 | Market1501 Rank 10 | Market1501 mAP |
| DNS [32] | 42.28 | 71.46 | 82.94 | 64.98 | 84.96 | 89.92 | 67.96 | - | - | 41.89 |
| DLDA [33] | 44.11 | 72.59 | 81.66 | 67.12 | 89.45 | 91.68 | 48.15 | - | - | 29.94 |
| FT-JSTL+DGD [11] | 38.6 | - | - | 66.60 | - | - | - | - | - | - |
| PDC [34] | 51.27 | 74.05 | 84.18 | - | - | - | 84.14 | 92.73 | 94.92 | 63.41 |
| K-means-CNN [35] | 46.50 | 69.30 | 80.70 | 53.50 | 82.50 | 91.20 | - | - | - | - |
| Spindle [9] | 53.80 | 74.10 | 83.20 | - | - | - | 76.90 | 91.50 | 94.60 | - |
| PersonNet [36] | - | - | - | 71.14 | 90.07 | 95.00 | 37.21 | - | - | 18.57 |
| CSBT [37] | 36.60 | 66.20 | 88.30 | 51.20 | 76.30 | 91.80 | 42.90 | - | - | 20.30 |
| SDH-CNN [38] | - | - | - | - | - | - | 58.12 | 68.50 | 80.82 | 48.20 |
| M3TCP [21] | - | - | - | 53.70 | 84.30 | 91.00 | - | - | - | - |
| DM3 [39] | 42.70 | 74.30 | 85.10 | 49.70 | 77.30 | 86.10 | 75.80 | 89.10 | 92.40 | 53.20 |
| MiF | 40.82 | 68.35 | 81.01 | 66.87 | 85.39 | 89.51 | 78.92 | 86.46 | 89.28 | 66.00 |
| MiF + PARN | 46.84 | 74.68 | 83.23 | 71.81 | 90.33 | 91.77 | 79.39 | 88.95 | 91.36 | 66.46 |
“MiF” represents the MiF-CNN method and “MiF + PARN” represents the MiF-CNN with the attribute-aided reranking method.
Figure 8: Rank list comparison between MiF and MiF + PARN.
References
[1] L. Li and J. Guo, “Pedestrian re-identification based on reinforced deep feature fusion,” Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, “Cross-view semantic projection learning for person re-identification,” Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., “Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method,” Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, “Deep multi-task network for learning person identity and attributes,” IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, “A survey of person re-identification,” Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., “Deep adaptive feature embedding with local sample distributions for person re-identification,” Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Li, Y. Yang, and A. G. Hauptmann, “Person re-identification: past, present and future,” 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, “A deep model with combined losses for person re-identification,” Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., “Spindle net: person re-identification with human body region guided feature decomposition and fusion,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, “Beyond triplet loss: a deep quadruplet network for person re-identification,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., “Learning deep feature representations with domain guided dropout for person re-identification,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning architecture for person re-identification,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., “Person re-identification by symmetry-driven accumulation of local features,” in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., “Salient color names for person re-identification,” in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Liao, M. Cristani, A. Perina et al., “Multiple-shot person re-identification by chromatic and epitomic analyses,” Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, “Improving pedestrian detection with selective gradient self-similarity feature,” Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., “Person re-identification by attributes,” in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2008 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2016 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, “Personnet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.
12 Computational Intelligence and Neuroscience
where W and b represent the weights and biases of each gate, respectively. The sigmoid function outputs a number between 0 and 1 that decides how much of each value passes through these gates. The tanh function is the activation function of the input. This design lets the LSTM weight information from different times differently, so that it can choose which part of the information to remember or forget in a long-term sequence.

The time step of the LSTM in PARN is set to the number of attributes of each pedestrian image, $m$. At each time step, the output hidden state $h_t$ is passed through a fully connected layer and a softmax function to predict the pedestrian attribute words.

It is worth noting that, at each time step, the input word embedding of the LSTM is the one that the LSTM learned at the previous time step, which takes advantage of the strength of the LSTM in solving long-term sequence problems. Because of the correlation among pedestrian attributes (for example, a pedestrian with the attribute "male" generally also has the attributes "short hair" and "pants"), the LSTM can use the previous attribute result to predict the current attribute more accurately.

Figure 3: Identification results of MiF-CNN (green bounding box represents positive samples in gallery).

Figure 4: Architecture of PARN.
4.4 Network Optimization

The loss function of PARN includes two parts. The first is the cross-entropy between the prediction and the ground truth of the network, which is expressed as

$$L_{sa} = \sum_{i=1}^{m} u' \log\left(\frac{e^{\theta_{a_i}^{T} u}}{\sum_{j=1}^{n_w} e^{\theta_{a_j}^{T} u}}\right), \tag{19}$$

where $u$ is the prediction value of the network, $u'$ is the ground truth of the pedestrian labels, $m$ is the number of LSTM time steps, and $n_w$ is the total number of attribute words in each dataset. As shown in equation (19), the loss of a single image in a single epoch is the loss accumulated over $m$ iterations. The second part is the loss function of the attention mechanism, as described in [40]:

$$L_{\alpha} = \frac{1}{2n} \sum_{i=1}^{n} \left(1 - \sum_{j=1}^{m} \alpha_i^{j}\right)^{2}, \tag{20}$$

where $\alpha_i^{j}$ represents the attention weight of the image feature $x_i$ at time step $j$ and $n$ is the number of parts in the image feature. The objective function of PARN to optimize is

$$J = -\frac{1}{M} \sum_{i=1}^{M} \left(L_{sa}^{i} + \lambda \cdot L_{\alpha}^{i}\right), \tag{21}$$

where $M$ is the batch size and $\lambda$ is the weight of $L_{\alpha}$ in the objective function.
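The following is a minimal Python sketch of how equations (19) to (21) could be evaluated for a small batch. It is an illustration under stated assumptions, not the authors' implementation: the ground truth $u'$ is assumed to be one-hot, so each time step contributes the log-probability of the correct attribute word; the function names and the nested-list data layout are hypothetical; and the signs follow the equations exactly as printed.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    peak = max(scores)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]

def attribute_log_likelihood(step_scores, target_ids):
    """Eq. (19) with a one-hot ground truth (assumption):
    sum over the m time steps of log p(correct attribute word)."""
    return sum(log_softmax(scores)[t] for scores, t in zip(step_scores, target_ids))

def attention_penalty(alpha):
    """Eq. (20): alpha[j][i] is the weight of image part i at time step j."""
    n = len(alpha[0])
    return sum((1.0 - sum(step[i] for step in alpha)) ** 2 for i in range(n)) / (2 * n)

def parn_objective(batch, lam):
    """Eq. (21), signs as printed: J = -(1/M) * sum_i (L_sa^i + lam * L_alpha^i).
    Each batch item is (step_scores, target_ids, alpha)."""
    total = sum(
        attribute_log_likelihood(scores, targets) + lam * attention_penalty(alpha)
        for scores, targets, alpha in batch
    )
    return -total / len(batch)

# One image, one time step, two attribute words with equal scores,
# and attention weights that sum to 1 for every image part.
example = [([[0.0, 0.0]], [0], [[0.5, 0.5], [0.5, 0.5]])]
objective_value = parn_objective(example, lam=1.0)
```

In this example the attention penalty is zero because every part receives a total weight of 1 across time steps, so the objective reduces to the negative log-probability of the correct word.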
5 Attribute-Aided Reranking Algorithm
Given a probe image $q_0$ and a gallery image set $G = \{g_1, g_2, \ldots, g_N\}$ of $N$ pedestrian images, the initial similarity distance between $q_0$ and $g_i$ is computed with the Euclidean distance as

$$d(q_0, g_i) = \sqrt{\sum_{i=1}^{N} \left(x_{q_0} - x_{g_i}\right)^{2}}, \tag{22}$$

where $x_{q_0}$ and $x_{g_i}$ represent the features of $q_0$ and $g_i$, respectively, extracted by MiF-CNN. The initial rank list $R_0 = \{g_1^{0'}, g_2^{0'}, \ldots, g_N^{0'}\}$ is obtained by sorting the similarity distances $d(q_0, g_i)$ in ascending order, where $d(q_0, g_i^{0'}) < d(q_0, g_{i+1}^{0'})$. If the top-1 gallery image $g_1^{0'}$ is the positive sample, rank 1 is correct. If the positive sample is not the top-1 gallery image but is among the top 5, rank 5 is correct.
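As an illustration of the initial ranking step, the sketch below sorts a gallery by Euclidean distance to the probe feature and checks whether rank k is correct. It assumes that the squared differences are summed over the feature dimensions, which is the usual reading of a Euclidean distance; the function names, the dictionary layout, and the toy feature vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def initial_rank_list(query_feat, gallery_feats):
    """Return gallery ids sorted by ascending distance to the query feature."""
    return sorted(gallery_feats, key=lambda gid: euclidean(query_feat, gallery_feats[gid]))

def rank_k_correct(rank_list, positive_id, k):
    """True if the positive sample appears among the top-k gallery images."""
    return positive_id in rank_list[:k]

# Toy two-dimensional features; real MiF-CNN features would be much longer.
gallery = {"g1": [0.0, 0.0], "g2": [1.0, 1.0], "g3": [3.0, 4.0]}
ranked = initial_rank_list([1.0, 0.9], gallery)
```

Here `ranked` places `g2` first because it is closest to the query, so rank 1 is correct whenever `g2` is the positive sample.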
The attribute-aided reranking algorithm reranks the rank list $R_0$ according to the attribute feature similarity between $q_0$ and $g_i$, so that more positive gallery images get a higher ranking in $R_0$, which can improve the performance of person re-id. Specifically, when rank 5 is correct but rank 1 is not, the proposed algorithm assigns an initial feature score $s_f$ to the top-5 gallery images $g_1^{0'} \sim g_5^{0'}$ according to their ranking in $R_0$: the higher the ranking, the higher $s_f$. Then the algorithm assigns an attribute score $s_a$ to $g_1^{0'} \sim g_5^{0'}$ according to the number of their attributes that are the same as those of $q_0$: the more shared attributes, the higher $s_a$. The total score of each gallery image is $s = s_f(1 - c) + s_a c$, where $c$ is the score weight. The reranked list $R_0^{*}$ is obtained by sorting $g_1^{0'} \sim g_5^{0'}$ in descending order of their total score $s$. For $M$ query images, the whole procedure of the attribute-aided reranking algorithm is shown in Algorithm 1.
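The reranking of the top-k gallery images can be sketched as follows. The paper does not give the exact form of $s_f$ and $s_a$, so this sketch assumes that $s_f$ decreases linearly with the position in $R_0$ and that $s_a$ is the fraction of attributes shared with the query; these scoring forms, the function name `rerank_top_k`, and the default weight `c=0.5` are assumptions for illustration only.

```python
def rerank_top_k(rank_list, query_attrs, gallery_attrs, k=5, c=0.5):
    """Rerank the first k entries of rank_list by s = s_f * (1 - c) + s_a * c.

    s_f: assumed linear in the initial position (first place scores 1.0).
    s_a: assumed fraction of attributes equal to those of the query.
    """
    top = rank_list[:k]
    scored = []
    for pos, gid in enumerate(top):
        s_f = (len(top) - pos) / len(top)
        matches = sum(1 for q, g in zip(query_attrs, gallery_attrs[gid]) if q == g)
        s_a = matches / len(query_attrs)
        scored.append((s_f * (1 - c) + s_a * c, gid))
    # Stable sort in descending order of s, so ties keep their initial order.
    reranked = [gid for _, gid in sorted(scored, key=lambda pair: pair[0], reverse=True)]
    return reranked + rank_list[k:]

query = ["male", "short hair", "pants"]
attrs = {
    "a": ["female", "long hair", "dress"],
    "b": ["male", "short hair", "pants"],
    "c": ["male", "long hair", "dress"],
    "d": ["female", "long hair", "dress"],
    "e": ["male", "short hair", "dress"],
    "f": ["male", "short hair", "pants"],
}
new_list = rerank_top_k(["a", "b", "c", "d", "e", "f"], query, attrs, k=5, c=0.5)
```

In this toy case the gallery image `b`, which shares all three attributes with the query, moves from second to first place, while `f` stays where it was because it lies outside the top 5.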
6 Experiments
This paper first evaluates the attribute recognition accuracy of PARN on public person re-id datasets and compares the results with other person attribute recognition methods. It then evaluates the performance of MiF-CNN on three person re-id datasets and the effectiveness of the proposed attribute-aided reranking algorithm in improving the accuracy of person re-id. An analysis of the experimental results is given after comparing our results with state-of-the-art person re-id methods.

Figure 5: Schematic diagram of attention mechanism.

Figure 6: Architecture of the LSTM unit.

6.1 Datasets and Evaluation Protocol

This paper evaluates the proposed methods on three challenging public person re-id datasets: VIPeR [28], CUHK01 [29], and Market1501 [30].

VIPeR Dataset. It contains 1264 images of 632 persons. Because of the small number of image samples, the low resolution, and the large changes in pose, viewpoint, and illumination, it is one of the most difficult person re-id datasets. In the experiments, the VIPeR dataset is randomly split into two parts: the images of 316 persons for training and the images of the remaining 316 persons for testing.

CUHK01 Dataset. It consists of 3884 images of 971 persons, captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.

Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.

Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identification [31]. For the multishot dataset Market1501, mean Average Precision (mAP) is used in addition to CMC to evaluate the performance of the proposed methods.
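The two metrics named in the evaluation protocol can be computed from rank lists as in the sketch below. It is a simplified illustration: the dictionary-based inputs and function names are hypothetical, and dataset-specific rules such as discarding same-camera or junk gallery images are not modeled.

```python
def cmc(rank_lists, positives, max_rank):
    """Cumulative match characteristic: fraction of queries whose first
    correct match appears at or before each rank 1..max_rank."""
    hits = [0] * max_rank
    for query_id, ranked in rank_lists.items():
        first = next((i for i, gid in enumerate(ranked) if gid in positives[query_id]), None)
        if first is not None:
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(rank_lists) for h in hits]

def average_precision(ranked, positive_ids):
    """Mean of the precision values measured at each correct match."""
    found, precisions = 0, []
    for position, gid in enumerate(ranked, start=1):
        if gid in positive_ids:
            found += 1
            precisions.append(found / position)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(rank_lists, positives):
    """mAP: average precision averaged over all queries."""
    return sum(average_precision(rank_lists[q], positives[q]) for q in rank_lists) / len(rank_lists)

rank_lists = {"q1": ["a", "b", "c"], "q2": ["b", "a", "c"]}
positives = {"q1": {"a"}, "q2": {"a", "c"}}
cmc_curve = cmc(rank_lists, positives, max_rank=2)
map_score = mean_average_precision(rank_lists, positives)
```

With these toy inputs, half of the queries are matched at rank 1 and all of them by rank 2. CMC only looks at the first correct match, whereas mAP rewards ranking every correct gallery image highly, which is why it is used for the multishot setting.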
6.2 Experimental Results on Attribute Recognition

The attributes in PARN are pedestrian identity-level attributes. For the VIPeR dataset, the attributes include "gender," "length of hair," "lower clothing," "upper clothing," "backpack or not," and "carrying anything or not." For the CUHK01 dataset, the attributes include "gender," "length of hair," "backpack or not," "handbag or not," "color of upper," and "color of lower." For the Market1501 dataset, the attributes are "gender," "length of hair," "wearing hat or not," "lower clothing," "backpack or not," "handbag or not," "length of sleeve," and "length of lower clothing." Each person in each dataset has an attribute word list, as shown in Figure 7.

This paper evaluates the attribute recognition accuracy of PARN on the Market1501 dataset and compares the results with the person attribute recognition method APR [23]. The experimental results are reported in Table 1, where "Lslv" represents the "length of sleeve" and "Llow" represents the "length of lower clothing."

It can be observed from Table 1 that PARN achieves competitive recognition accuracy across the 8 attributes. In particular, its accuracy on "wearing hat or not" (96.67% versus 88.21%) and "handbag or not" (85.18% versus 76.01%) and its mean accuracy (87.55% versus 87.23%) are higher than those of APR, whereas its accuracy on the other attributes is close to that of APR. The comparison with APR demonstrates the attribute recognition performance of PARN, which plays an important role in improving the accuracy of person re-id.
6.3 Experimental Results on Person re-id

This paper evaluates the performance of the proposed methods on three datasets. The experimental results of the proposed methods and other state-of-the-art methods are presented in Table 2.

As shown in Table 2, the proposed MiF-CNN model performs well among state-of-the-art methods. In addition, the comparison between MiF and MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm helps to increase person re-id accuracy. The improvement is especially obvious on the VIPeR and CUHK01 datasets, where the rank 1, rank 5, and rank 10 identification accuracy improves by 6.02%, 6.33%, and 2.22% and by 4.94%, 4.94%, and 2.26%, respectively. Among the compared methods, the proposed MiF-CNN model with the attribute-aided reranking algorithm achieves the best rank 5 accuracy on the VIPeR dataset, the best rank 1 and rank 5 accuracy on the CUHK01 dataset, and the best mAP on the Market1501 dataset.
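The reported gains can be checked directly against the MiF and MiF + PARN rows of Table 2. The short sketch below is only an arithmetic check; the variable and function names are hypothetical, and the gains are differences in percentage points.

```python
# Rank 1, rank 5, and rank 10 accuracy (%) taken from Table 2.
mif = {"VIPeR": (40.82, 68.35, 81.01), "CUHK01": (66.87, 85.39, 89.51)}
mif_parn = {"VIPeR": (46.84, 74.68, 83.23), "CUHK01": (71.81, 90.33, 91.77)}

def gains(dataset):
    """Percentage-point gain of MiF + PARN over MiF at rank 1, 5, and 10."""
    return tuple(round(after - before, 2) for before, after in zip(mif[dataset], mif_parn[dataset]))

viper_gains = gains("VIPeR")
cuhk01_gains = gains("CUHK01")
```

Both tuples agree with the improvements quoted in the text for the two small datasets.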
6.4 Analysis of Experimental Results

Because of the small number of training samples in the VIPeR and CUHK01 datasets, a deep CNN model is hard to train and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. The structure and hyperparameters of the deep CNN model in the M3TCP method need to be adjusted manually to adapt to the scale of different datasets so as to overcome the training problem of the deep neural network.

In contrast, the proposed MiF-CNN has a fixed structure and can be trained on smaller datasets directly without any fine-tuning. In spite of its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.

Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that the attribute-aided reranking method moves lower-ranked positive samples in the rank list of MiF-CNN to a higher ranking and thus improves the accuracy of person re-id, which verifies the effectiveness of the attribute-aided reranking algorithm in improving the performance of the person re-id model.
7 Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes to further improve the accuracy of person re-id. The proposed MiF-CNN model realizes the reuse of feature maps and gradient information, which enhances the feature mobility of the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step in order to reduce the dimension of the features. It also employs the LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list and thus further improve the identification accuracy.

The experimental results on three public person re-id datasets indicate the outstanding performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, further improvement and optimization will be made to the pedestrian attribute recognition network. Moreover, captions of person images or videos can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.

Algorithm 1: The attribute-aided reranking algorithm.
Input: Initial rank list $R_0$
Output: The reranked list $R^{*}$
(1) for i = 1, 2, ..., M do
(2)   if rank 1 correct
(3)     $R^{*} = R_0$
(4)   end if
(5)   elif rank 5 correct
(6)     Assign the initial feature score $s_f$ to the top-5 gallery images $g_1^{i'} \sim g_5^{i'}$ according to their ranking in $R_0$
(7)     Assign the attribute score $s_a$ to $g_1^{i'} \sim g_5^{i'}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f(1 - c) + s_a c$
(9)     Rerank $g_1^{i'} \sim g_5^{i'}$ in descending order of their total score $s$ and obtain $R^{*}$
(10)  end if
(11)  elif rank 10 correct
(12)    Change $g_1^{i'} \sim g_5^{i'}$ to $g_1^{i'} \sim g_{10}^{i'}$; repeat 6–10
(13)  end if
(14)  else
(15)    Change $g_1^{i'} \sim g_5^{i'}$ to $g_1^{i'} \sim g_M^{i'}$; repeat 6–10
(16)  end if
(17) end for

Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.

Table 1: Recognition accuracy of different attributes in PARN and APR (%).

| Method | Gender | Hair | Clothes | Hat | Backpack | Handbag | Lslv | Llow | Mean |
|--------|--------|------|---------|-----|----------|---------|------|------|------|
| APR | 85.78 | 83.46 | 91.36 | 88.21 | 86.32 | 76.01 | 94.12 | 92.64 | 87.23 |
| PARN | 84.93 | 79.10 | 89.35 | 96.67 | 83.86 | 85.18 | 92.84 | 88.45 | 87.55 |
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%). A "-" marks a result that is not reported.

| Methods | VIPeR Rank 1 | VIPeR Rank 5 | VIPeR Rank 10 | CUHK01 Rank 1 | CUHK01 Rank 5 | CUHK01 Rank 10 | Market1501 Rank 1 | Market1501 Rank 5 | Market1501 Rank 10 | Market1501 mAP |
|---------|------|------|------|------|------|------|------|------|------|------|
| DNS [32] | 42.28 | 71.46 | 82.94 | 64.98 | 84.96 | 89.92 | 67.96 | - | - | 41.89 |
| DLDA [33] | 44.11 | 72.59 | 81.66 | 67.12 | 89.45 | 91.68 | 48.15 | - | - | 29.94 |
| FT-JSTL+DGD [11] | 38.6 | - | - | 66.60 | - | - | - | - | - | - |
| PDC [34] | 51.27 | 74.05 | 84.18 | - | - | - | 84.14 | 92.73 | 94.92 | 63.41 |
| K-means-CNN [35] | 46.50 | 69.30 | 80.70 | 53.50 | 82.50 | 91.20 | - | - | - | - |
| Spindle [9] | 53.80 | 74.10 | 83.20 | - | - | - | 76.90 | 91.50 | 94.60 | - |
| PersonNet [36] | - | - | - | 71.14 | 90.07 | 95.00 | 37.21 | - | - | 18.57 |
| CSBT [37] | 36.60 | 66.20 | 88.30 | 51.20 | 76.30 | 91.80 | 42.90 | - | - | 20.30 |
| SDH-CNN [38] | - | - | - | - | - | - | 58.12 | 68.50 | 80.82 | 48.20 |
| M3TCP [21] | - | - | - | 53.70 | 84.30 | 91.00 | - | - | - | - |
| DM3 [39] | 42.70 | 74.30 | 85.10 | 49.70 | 77.30 | 86.10 | 75.80 | 89.10 | 92.40 | 53.20 |
| MiF | 40.82 | 68.35 | 81.01 | 66.87 | 85.39 | 89.51 | 78.92 | 86.46 | 89.28 | 66.00 |
| MiF + PARN | 46.84 | 74.68 | 83.23 | 71.81 | 90.33 | 91.77 | 79.39 | 88.95 | 91.36 | 66.46 |

"MiF" represents the MiF-CNN method and "MiF + PARN" represents the MiF-CNN with the attribute-aided reranking method.
Figure 8: Rank list comparison between MiF and MiF + PARN.
References
[1] L. Li and J. Guo, "Pedestrian re-identification based on reinforced deep feature fusion," Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, "Cross-view semantic projection learning for person re-identification," Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., "Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method," Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, "Deep multi-task network for learning person identity and attributes," IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, "A survey of person re-identification," Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., "Deep adaptive feature embedding with local sample distributions for person re-identification," Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Li, Y. Yang, and A. G. Hauptmann, "Person re-identification: past, present and future," 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, "A deep model with combined losses for person re-identification," Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., "Spindle net: person re-identification with human body region guided feature decomposition and fusion," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, "Beyond triplet loss: a deep quadruplet network for person re-identification," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., "Learning deep feature representations with domain guided dropout for person re-identification," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, "An improved deep learning architecture for person re-identification," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., "Person re-identification by symmetry-driven accumulation of local features," in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., "Salient color names for person re-identification," in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Liao, M. Cristani, A. Perina et al., "Multiple-shot person re-identification by chromatic and epitomic analyses," Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, "Improving pedestrian detection with selective gradient self-similarity feature," Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., "Person re-identification by attributes," in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, "Multi-level fusion for person re-identification with incomplete marks," in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., "Deep and low-level feature based attribute learning for person re-identification," Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, "Incremental deep hidden attribute learning," in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., "Person re-identification by multi-channel parts-based cnn with improved triplet loss function," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., "Deep residual learning for image recognition," in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., "Improving person re-identification by attribute and identity learning," 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, "Multi-level attention model for person re-identification," Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li et al., "A discriminative feature learning approach for deep face recognition," in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., "Show, attend and tell: neural image caption generation with visual attention," in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, "Fine-grained attention mechanism for neural machine translation," Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, "Viewpoint invariant pedestrian recognition with an ensemble of localized features," in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, "Human reidentification with transferred metric learning," in Proceedings of the 2008 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., "Scalable person re-identification: a benchmark," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., "DeepReID: deep filter pairing neural network for person re-identification," in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, "Learning a discriminative null space for person re-identification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, "Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification," Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., "Pose-driven deep convolutional model for person re-identification," in Proceedings of the 2016 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, "Person re-identification via integrating patch-based metric learning and local salience learning," Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, "Personnet: person re-identification with deep convolutional neural networks," 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., "Fast person re-identification via cross-camera semantic binary transformation," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., "Structured deep hashing with convolutional neural networks for fast person re-identification," Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., "Person reidentification via discrepancy matrix and matrix metric," IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Jiang, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," 2014, http://arxiv.org/abs/1409.0473.
correlation among attributes of pedestrian like the pedestrianowning attribute ldquomalerdquo generally own attribute ldquoshort hairrdquoand attribute ldquopantsrdquo LSTM can utilize the previous attributeresult to predict the current attribute more accurately
44 Network Optimization 0e loss function of PARN in-cludes two parts one is the cross-entropy between pre-diction and ground truth of network which is expressed as
Lsa 1113944m
i1uprime log
eθTaiu
1113936nw
j1eθT
aju⎛⎝ ⎞⎠ (19)
where u is the prediction value of the network uprime is theground truth of pedestrian labels m is the time step ofLSTM and nw is the total number of attribute words in eachdataset As shown in equation (19) the loss of a single imagein single epoch is the cumulated loss value afterm iterations0e other one is the loss function of attention mechanism asdescribed in [40]
Lα 12n
1113944
n
i11minus 1113944
m
i1αj
i⎛⎝ ⎞⎠
2
(20)
where αji represents the attention weights of the image
feature xi and n is the number of parts in the image feature0e object function of PARN to optimize is
J minus1
M1113944
M
i1L
isa + λ middot L
iα1113872 1113873 (21)
whereM is the batch size and λ is the rate of Lα in the objectfunction
5 Attribute-Aided Reranking Algorithm
Given a probe image q0 and gallery image setG g1 g2 gN1113864 1113865 including N pedestrian images theinitial similarity distance between q0 and gi is computedwith the Euclidean distance as
d q0 gi( 1113857
1113944
N
i1xq0minusxgi
1113872 11138732
11139741113972
(22)
where xq0and xgi
represent the features of q0 and gi re-spectively that extracted by MiF-CNN 0e initial ranklist R0 g
0prime1 g
0prime2 g
0primeN is obtained by sorting similarity
distance d(q0 gi) in ascending order whered(q0 g
0primei )ltd(q0 g
0primei+1) If the top-1 gallery g
0prime1 is the positive
sample it means rank 1 is correct If the positive sample isnot the top-1 gallery but the top-5 gallery it means rank 5 iscorrect
0e attribute-aided reranking algorithm reranks therank list R0 according to the attribute feature similaritybetween q0 and gi so that more positive gallery getshigher ranking in R0 which could improve the perfor-mance of person re-id To be specific when rank 5 iscorrect but rank 1 is not the proposed algorithm dis-tributes the initial feature score sf for top-5 gallery g
0prime1 simg
0prime5
according to their ranking in R0 the higher the rankingthe higher the sf 0en the proposed algorithm dis-tributes attribute score sa for g
0prime1simg
0prime5 according to the
number of their attributes that are the same as q0 themore same attributes the higher sa 0e total score ofeach gallery is s sf(1minus c) + sac where c is the scoreweight 0e reranking rank list Rlowast0 is obtained byreranking g
0prime1simg
0prime5 in descending order of their total score
s For M query images the whole processing of theattribute-aided reranking algorithm is shown asAlgorithm 1
6 Experiments
0is paper evaluates the attribute recognition accuracy ofPARN on person re-id public datasets first and then com-pares our results with other person attribute recognitionmethods 0en this paper evaluates the performance ofMiF-CNN on three people re-id datasets and the availabilityof the proposed attribute-aided reranking algorithm forimproving the accuracy of person re-id 0e analysis ofexperiment results is also given after comparing our resultswith the state-of-the-art person re-id methods
61 Datasets and Evaluation Protocol 0is paper evaluatesthe proposed methods on three challenging person re-id
α3α2α1
Z
Xx1 x2 x3 xn
Softmax Softmax Softmax Softmax
tanh tanh tanh tanh
Watt Watt Watt Watt
htndash1 htndash1 htndash1 htndash1
v1 v2 v3 vn
αn
Figure 5 Schematic diagram of attention mechanism
Wf Wi Wg Wo
f i g o
htndash1 ht
tanh
tanh
ctndash1 ct
xt
σ σ σ
Figure 6 Architecture of the LSTM unit
Computational Intelligence and Neuroscience 7
public datasets including VIPeR [28] CUHK01 [29] andMarket1501 [30]
VIPeR Dataset It contains 1264 images of 632 per-sons Because of the lack of image samples low-resolution a great change in pose view and illu-mination it is one of the hardest difficult person re-iddatasets In experiment the VIPeR dataset is ran-domly split into two parts images of 316 persons fortraining and the images of remaining 316 persons fortestingCUHK01 Dataset It consists of 3884 images of 971persons which are captured by two cameras withdifferent views 485 pedestrians are randomlychosen for training and other 486 pedestrians fortestingMarket1501 Dataset It is one of the largest person re-id datasets that include 32268 images of 1501 personsEach person has a number of images that are capturedby six cameras with different views 0e dataset isdivided into two parts 12936 images of 751 personsfor training and 19732 images of 750 persons fortestingEvaluation Protocol For the single-shot datasets VI-PeR and CUHK01 cumulative match characteristic(CMC) is used to record the ranks of correct identify[31] For the multishot dataset Market1501 apartfrom CMC mean Average Precision (mAP) is uti-lized to evaluate the performance of the proposedmethods
62 Experimental Results on Attribute Recognition 0e at-tributes in PARN refer to pedestrian identity-level attri-butes For the VIPeR dataset attributes include ldquogenderrdquoldquolength of hairrdquo ldquolower clothingrdquo ldquoupper clothingrdquo ldquoback-pack or notrdquo and ldquocarrying anything or notrdquo For theCUHK01 dataset attributes include ldquogenderrdquo ldquolength ofhairrdquo ldquobackpack or notrdquo ldquohandbag or notrdquo ldquocolor of upperrdquoand ldquocolor of lowerrdquo For the Market1501 dataset attributescontain ldquogenderrdquo ldquolength of hairrdquo ldquowearing hat or notrdquoldquolower clothingrdquo ldquobackpack or notrdquo ldquohandbag or notrdquoldquolength of sleeverdquo and ldquolength of lower clothingrdquo Eachperson in each dataset owns an attribute word list as shownin Figure 7
0is paper evaluates the attribute recognition accuracy ofPARN on the Market1501 dataset and compares our resultswith outstanding person attribute recognition method APR[22] 0e experiment results are reported in Table 1 whereldquoLslvrdquo represents the ldquolength of sleeverdquo and ldquoLlowrdquo repre-sents the ldquolength of lower clothingrdquo
It can be observed from Table 1 that PARN obtainssuperior recognition accuracy of 8 attributes especiallyldquolength of hairrdquo ldquohandbag or notrdquo and mean accuracy arehigher than APR whereas accuracy of other attributes iscloser with APR Comparison with APR demonstrates theoutstanding attribute recognition performance of PARNwhich plays an important role in improving the accuracy ofperson re-id
63 Experimental Results on Person re-id 0is paper eval-uates the performance of the proposed methods on threedatasets 0e experimental results of the proposed methodsand other state-of-the-art methods are presented inTable 2
As shown in Table 2 the proposed MiF-CNN modelobtains a great performance among state-of-the-artmethods and on the contrary comparison between MiFand MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm is helpful to increase the personre-id accuracy 0e improvement effect is especially obviouson VIPeR and CUHK01 datasets where the identificationaccuracy of rank 1 rank 5 and rank 10 improves by 602633 and 222 and 494 494 and 226 respectively0e proposed MiF-CNN model with the attribute-aidedreranking algorithm gets the best rank 5 accuracy on theVIPeR dataset the best rank 1 and rank 5 accuracy on theCUHK01 dataset and the best mAP on the Market1501dataset among various methods
6.4. Analysis of Experimental Results. Because the VIPeR and CUHK01 datasets contain few training samples, a deep CNN model is hard to train on them and easily suffers from overfitting. To handle this, the FT-JSTL+DGD method, which is based on a deep CNN, learned deep features from multiple domains jointly by merging all the datasets together and then fine-tuned the pretrained model on VIPeR and CUHK01 separately. The structure and hyperparameters of the deep CNN model in the M3TCP method need to be adjusted manually to suit the scale of each dataset so as to overcome the training problem of the deep neural network.
In contrast, the proposed MiF-CNN has a fixed structure and can be trained directly on smaller datasets without any fine-tuning. In spite of its deep structure, the model converges quickly and well. MiF-CNN without the attribute-aided reranking algorithm outperforms FT-JSTL+DGD and M3TCP by 0.27% and 13.17%, respectively, in rank 1 accuracy on the CUHK01 dataset. It also outperforms FT-JSTL+DGD by 2.22% in rank 1 accuracy on the VIPeR dataset, which indicates that the proposed MiF-CNN model reduces overfitting, generalizes well, and performs strongly in person re-id.
Figure 8 shows the rank lists of MiF and MiF + PARN. It can be seen that the attribute-aided reranking method moves positive samples that are ranked lower in the rank list of MiF-CNN to higher positions, which improves the accuracy of person re-id and verifies the effectiveness of the attribute-aided reranking algorithm.
7 Conclusion
This paper studies the training problem of deep neural networks in the person re-id task and the use of pedestrian attributes for further improving the accuracy of person re-id. The proposed MiF-CNN model realizes the reuse of feature maps and gradient information, which enhances the feature mobility of the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step in order to reduce the dimension of the features. It also employs the LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples to help more positive samples rank higher in the rank list so as to further improve the identification accuracy.
The experimental results on three public person re-id datasets indicate the outstanding performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, more improvement and optimization will be done on the pedestrian attribute recognition network. Moreover, the caption of person images or videos
Input: Initial rank list R_0
Output: The reranked rank list R*
(1) for i = 1, 2, ..., M do
(2)   if rank 1 correct
(3)     R* = R_0
(4)   end if
(5)   elif rank 5 correct
(6)     Distribute the initial feature score s_f to the top-5 gallery samples g_{i′1} ∼ g_{i′5} according to their ranking in R_0
(7)     Distribute the attribute score s_a to g_{i′1} ∼ g_{i′5} according to the number of their attributes that are the same as those of q_i
(8)     s = s_f(1 − c) + s_a c
(9)     Rerank g_{i′1} ∼ g_{i′5} in descending order of their total score s and obtain R*
(10)  end if
(11)  elif rank 10 correct
(12)    Change g_{i′1} ∼ g_{i′5} to g_{i′1} ∼ g_{i′10} and repeat 6–10
(13)  end if
(14)  else
(15)    Change g_{i′1} ∼ g_{i′5} to g_{i′1} ∼ g_{i′M} and repeat 6–10
(16)  end if
(17) end for
Algorithm 1: The attribute-aided reranking algorithm.
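Steps (6) to (9) of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name and argument names are hypothetical, the feature score is assumed to decrease linearly with the initial rank, the attribute score is assumed to be the fraction of attributes that match the query, and the window size k (5, 10, or the full list) is supplied by the caller rather than chosen by the rank 1 / rank 5 / rank 10 tests of the algorithm.

```python
def attribute_aided_rerank(rank_list, query_attrs, gallery_attrs, k=5, c=0.5):
    """Rerank the top-k entries of rank_list by a fused feature/attribute score.

    rank_list     -- gallery ids ordered by the initial ranking R_0
    query_attrs   -- tuple of attribute labels predicted for the query
    gallery_attrs -- dict mapping gallery id -> tuple of attribute labels
    k             -- number of top candidates to rerank
    c             -- weight of the attribute score in s = s_f * (1 - c) + s_a * c
    """
    top = rank_list[:k]
    n = len(top)
    scored = []
    for position, gid in enumerate(top):
        # Assumed feature score: decreases linearly with the initial rank.
        s_f = (n - position) / n
        # Assumed attribute score: fraction of attributes equal to the query's.
        matches = sum(a == b for a, b in zip(query_attrs, gallery_attrs[gid]))
        s_a = matches / len(query_attrs)
        scored.append((s_f * (1 - c) + s_a * c, gid))
    # Stable sort in descending order of s; ties keep their initial order.
    scored.sort(key=lambda item: item[0], reverse=True)
    # Entries beyond the top-k window keep their initial positions.
    return [gid for _, gid in scored] + rank_list[k:]


# Toy example with made-up ids and attribute labels.
query = ("male", "short hair", "backpack", "no hat")
gallery = {
    "g1": ("female", "long hair", "no backpack", "no hat"),
    "g2": ("male", "short hair", "backpack", "no hat"),
    "g3": ("male", "long hair", "backpack", "no hat"),
}
initial = ["g1", "g2", "g3", "g4"]
reranked = attribute_aided_rerank(initial, query, gallery, k=3, c=0.5)
print(reranked)
```

In the toy example the candidate whose attributes all match the query moves ahead of the initially top-ranked candidate. With c = 0 the fused score reduces to the feature score, so the initial order is kept.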
Figure 7: Attribute labels of pedestrians in the datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.
Table 1: Recognition accuracy of different attributes in PARN and APR (%).
Method  Gender  Hair   Clothes  Hat    Backpack  Handbag  Lslv   Llow   Mean
APR     85.78   83.46  91.36    88.21  86.32     76.01    94.12  92.64  87.23
PARN    84.93   79.10  89.35    96.67  83.86     85.18    92.84  88.45  87.55
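As a reading aid for Table 1, the following Python sketch recomputes the per-attribute gap between PARN and APR from the values in the table. The variable names are illustrative, and nothing beyond the table values is assumed.

```python
# Per-attribute accuracies (%) copied from Table 1.
attributes = ["Gender", "Hair", "Clothes", "Hat",
              "Backpack", "Handbag", "Lslv", "Llow"]
apr = [85.78, 83.46, 91.36, 88.21, 86.32, 76.01, 94.12, 92.64]
parn = [84.93, 79.10, 89.35, 96.67, 83.86, 85.18, 92.84, 88.45]

# Gap in percentage points; a positive value means PARN is more accurate.
gap = {name: round(p - a, 2) for name, a, p in zip(attributes, apr, parn)}
parn_wins = [name for name, g in gap.items() if g > 0]

# Unrounded means over the 8 attributes, for comparison with the Mean column.
mean_apr = sum(apr) / len(apr)
mean_parn = sum(parn) / len(parn)

print(gap)
print(parn_wins)
```

On these values the positive gaps are for “Hat” and “Handbag”, and the recomputed means agree with the Mean column to within rounding.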
can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.
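The attention step summarized in this section, which weighs the input feature maps by their correlation with the previous LSTM hidden state and collapses them into a lower-dimensional vector, can be sketched as below. This is only an illustration under assumptions: the dot-product score, the function name, and the equal dimensionality of features and hidden state are choices made for the example and are not taken from the paper.

```python
import math


def soft_attention(features, hidden):
    """Collapse L feature vectors into one context vector.

    features -- list of L vectors (one per spatial location), each of length D
    hidden   -- previous LSTM hidden state, assumed here to also have length D
    """
    # Assumed correlation score: dot product of each location with the state.
    scores = [sum(f * h for f, h in zip(vec, hidden)) for vec in features]
    # Softmax over locations, shifted by the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum over locations: L x D features become one D-vector.
    dim = len(hidden)
    context = [
        sum(w * vec[d] for w, vec in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three locations with two-dimensional features.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [1.0, 0.0]
weights, context = soft_attention(feats, state)
print(weights, context)
```

Locations that correlate more strongly with the hidden state receive larger weights, and the output has the dimension of a single feature vector regardless of the number of locations.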
Data Availability
The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).
                     VIPeR                   CUHK01                  Market1501
Methods              Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 Rank 1  Rank 5  Rank 10 mAP
DNS [32]             42.28   71.46   82.94   64.98   84.96   89.92   67.96   -       -       41.89
DLDA [33]            44.11   72.59   81.66   67.12   89.45   91.68   48.15   -       -       29.94
FT-JSTL+DGD [11]     38.6    -       -       66.60   -       -       -       -       -       -
PDC [34]             51.27   74.05   84.18   -       -       -       84.14   92.73   94.92   63.41
K-means-CNN [35]     46.50   69.30   80.70   53.50   82.50   91.20   -       -       -       -
Spindle [9]          53.80   74.10   83.20   -       -       -       76.90   91.50   94.60   -
PersonNet [36]       -       -       -       71.14   90.07   95.00   37.21   -       -       18.57
CSBT [37]            36.60   66.20   88.30   51.20   76.30   91.80   42.90   -       -       20.30
SDH-CNN [38]         -       -       -       -       -       -       58.12   68.50   80.82   48.20
M3TCP [21]           -       -       -       53.70   84.30   91.00   -       -       -       -
DM3 [39]             42.70   74.30   85.10   49.70   77.30   86.10   75.80   89.10   92.40   53.20
MiF                  40.82   68.35   81.01   66.87   85.39   89.51   78.92   86.46   89.28   66.00
MiF + PARN           46.84   74.68   83.23   71.81   90.33   91.77   79.39   88.95   91.36   66.46
“MiF” represents the MiF-CNN method, and “MiF + PARN” represents the MiF-CNN with the attribute-aided reranking method.
Figure 8: Rank list comparison between MiF and MiF + PARN (for each query, the top-10 rank lists of MiF and MiF + PARN are shown).
References
[1] L. Li and J. Guo, “Pedestrian re-identification based on reinforced deep feature fusion,” Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, “Cross-view semantic projection learning for person re-identification,” Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., “Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method,” Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, “Deep multi-task network for learning person identity and attributes,” IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, “A survey of person re-identification,” Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., “Deep adaptive feature embedding with local sample distributions for person re-identification,” Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Li, Y. Yang, and A. G. Hauptmann, “Person re-identification: past, present and future,” 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, “A deep model with combined losses for person re-identification,” Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., “Spindle net: person re-identification with human body region guided feature decomposition and fusion,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, “Beyond triplet loss: a deep quadruplet network for person re-identification,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., “Learning deep feature representations with domain guided dropout for person re-identification,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning architecture for person re-identification,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., “Person re-identification by symmetry-driven accumulation of local features,” in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., “Salient color names for person re-identification,” in Proceedings of Computer Vision-ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Liao, M. Cristani, A. Perina et al., “Multiple-shot person re-identification by chromatic and epitomic analyses,” Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, “Improving pedestrian detection with selective gradient self-similarity feature,” Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., “Person re-identification by attributes,” in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based cnn with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, in press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2008 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2016 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, “PersonNet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Jiang, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.
public datasets including VIPeR [28], CUHK01 [29], and Market1501 [30].
VIPeR Dataset. It contains 1264 images of 632 persons. Because of the lack of image samples, the low resolution, and the great changes in pose, view, and illumination, it is one of the most difficult person re-id datasets. In the experiment, the VIPeR dataset is randomly split into two parts: images of 316 persons for training and images of the remaining 316 persons for testing.
CUHK01 Dataset. It consists of 3884 images of 971 persons, which are captured by two cameras with different views. 485 pedestrians are randomly chosen for training and the other 486 pedestrians for testing.
Market1501 Dataset. It is one of the largest person re-id datasets and includes 32668 images of 1501 persons. Each person has a number of images that are captured by six cameras with different views. The dataset is divided into two parts: 12936 images of 751 persons for training and 19732 images of 750 persons for testing.
Evaluation Protocol. For the single-shot datasets VIPeR and CUHK01, the cumulative match characteristic (CMC) is used to record the ranks of correct identification [31]. For the multishot dataset Market1501, apart from CMC, mean Average Precision (mAP) is utilized to evaluate the performance of the proposed methods.
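The two evaluation measures named in the protocol can be sketched in Python as follows. This is a minimal illustration and not the evaluation code used in the paper: the function names and the input format (one list of booleans per query, in ranked order, where True marks a gallery item with the query's identity) are assumptions made for the example, and average precision is computed over the correct matches that appear in the ranked list.

```python
def cmc_curve(ranked_matches, max_rank=10):
    """Fraction of queries whose first correct match is at rank <= r.

    Returns a list of length max_rank; entry r - 1 is the rank-r accuracy.
    """
    hits = [0] * max_rank
    for matches in ranked_matches:
        if True in matches:
            first = matches.index(True)
            # A hit at position `first` counts for every rank from there on.
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(ranked_matches) for h in hits]


def mean_average_precision(ranked_matches):
    """Mean over queries of the average precision at each correct match."""
    ap_values = []
    for matches in ranked_matches:
        correct = 0
        precisions = []
        for position, is_match in enumerate(matches, start=1):
            if is_match:
                correct += 1
                precisions.append(correct / position)
        ap_values.append(sum(precisions) / len(precisions) if precisions else 0.0)
    return sum(ap_values) / len(ap_values)


# Toy rank lists for three queries (made-up data).
toy = [
    [True, False, False],
    [False, True, True],
    [False, False, True],
]
cmc = cmc_curve(toy, max_rank=3)
m_ap = mean_average_precision(toy)
print(cmc, m_ap)
```

CMC only looks at the first correct match of each query, which suits single-shot datasets, whereas mAP also rewards ranking all correct matches early, which matters when a query has several correct gallery images.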
62 Experimental Results on Attribute Recognition 0e at-tributes in PARN refer to pedestrian identity-level attri-butes For the VIPeR dataset attributes include ldquogenderrdquoldquolength of hairrdquo ldquolower clothingrdquo ldquoupper clothingrdquo ldquoback-pack or notrdquo and ldquocarrying anything or notrdquo For theCUHK01 dataset attributes include ldquogenderrdquo ldquolength ofhairrdquo ldquobackpack or notrdquo ldquohandbag or notrdquo ldquocolor of upperrdquoand ldquocolor of lowerrdquo For the Market1501 dataset attributescontain ldquogenderrdquo ldquolength of hairrdquo ldquowearing hat or notrdquoldquolower clothingrdquo ldquobackpack or notrdquo ldquohandbag or notrdquoldquolength of sleeverdquo and ldquolength of lower clothingrdquo Eachperson in each dataset owns an attribute word list as shownin Figure 7
0is paper evaluates the attribute recognition accuracy ofPARN on the Market1501 dataset and compares our resultswith outstanding person attribute recognition method APR[22] 0e experiment results are reported in Table 1 whereldquoLslvrdquo represents the ldquolength of sleeverdquo and ldquoLlowrdquo repre-sents the ldquolength of lower clothingrdquo
It can be observed from Table 1 that PARN obtainssuperior recognition accuracy of 8 attributes especiallyldquolength of hairrdquo ldquohandbag or notrdquo and mean accuracy arehigher than APR whereas accuracy of other attributes iscloser with APR Comparison with APR demonstrates theoutstanding attribute recognition performance of PARNwhich plays an important role in improving the accuracy ofperson re-id
63 Experimental Results on Person re-id 0is paper eval-uates the performance of the proposed methods on threedatasets 0e experimental results of the proposed methodsand other state-of-the-art methods are presented inTable 2
As shown in Table 2 the proposed MiF-CNN modelobtains a great performance among state-of-the-artmethods and on the contrary comparison between MiFand MiF + PARN demonstrates that the proposed attribute-aided reranking algorithm is helpful to increase the personre-id accuracy 0e improvement effect is especially obviouson VIPeR and CUHK01 datasets where the identificationaccuracy of rank 1 rank 5 and rank 10 improves by 602633 and 222 and 494 494 and 226 respectively0e proposed MiF-CNN model with the attribute-aidedreranking algorithm gets the best rank 5 accuracy on theVIPeR dataset the best rank 1 and rank 5 accuracy on theCUHK01 dataset and the best mAP on the Market1501dataset among various methods
64 Analysis of Experimental Results Because of the smallquantity of training samples in VIPeR and CUHK01 data-sets a deep CNN model is hard to be trained and easilysuffers from overfitting To handle this the FT-JSTL+DGDmethod based on deep CNN learned deep features frommultiple domains jointly by merging all the datasets togetherand fine-tuned the pretrained model on VIPeR andCUHK01 separately 0e structure and hyperparameters ofthe deep CNN model in the M3TCP method need to beadjusted manually for adapting the scale of different datasetsso as to overcome the training problem of the deep neuralnetwork
In contrast the proposed MiF-CNN has an immobilestructure and can be trained on smaller datasets directlywithout any fine tuning In spite of the deep structure themodel can converge quickly and well MiF-CNNwithout theattribute-aided reranking algorithm outperforms FT-JSTL +DGD andM3TCP by 027 and 1317 respectivelyat rank 1 accuracy on the CUHK01 dataset It also out-performs FT-JSTL+DGD by 222 at rank 1 accuracy onthe VIPeR dataset which indicates that the proposed MiF-CNN model has an excellent ability of reducing overfittingand generalization and an outstanding performance inperson re-id
Figure 8 demonstrates the rank list of MiF andMiF + PARN It can be seen that the MiF-CNN with theattribute-aided reranking method enables the lower rankedpositive sample in rank list of MiF-CNN obtain a higherranking so as to improve the accuracy of person re-idwhich verifies the effectiveness of the attribute-aidedreranking algorithm for improving performance of per-son re-id model
7 Conclusion
0is paper studies and discusses the training problem ofdeep neural network in person re-id task and the using ofpedestrian attributes for further improving the accuracy of
8 Computational Intelligence and Neuroscience
person re-id 0e proposed MiF-CNN model realizes thereuse of feature maps and gradient information whichenhances the feature mobility of network and improves theefficiency of gradient propagation 0e designed person at-tribute recognition network uses an attention mechanism tomeasure the correlation between input feature maps and thehidden state of LSTM at previous time for reducing the di-mension of features It also employs LSTM to decode imagefeatures into pedestrian attributes On the basis of the at-tribute recognition results the attribute-aided reranking
algorithm is presented which rematches attribute featuresamong samples to aid more positive samples rank higher inrank list so as to improve the identify accuracy further
0e experimental results on three public person re-iddatasets indicate the outstanding performance of MiF-CNNmodel in person re-id 0e attribute-aided reranking algo-rithm makes a major contribution to improve the accuracyof person re-id In the future more improvement and op-timization will be done on pedestrian attribute recognitionnetwork Moreover the caption of person images or videos
Input Initial rank list R0Output 0e reranking rank list Rlowast
(1) for i 1 2 M do(2) if rank 1 correct(3) Rlowast R0(4) end if(5) elif rank 5 correct(6) Distributing initial feature score sf for top-5 gallery g
iprime1simg
iprime5 according to their ranking in R0
(7) Distributing attribute score sa for giprime1simg
iprime5 according to the number of their attributes that are the same as qi
(8) s sf(1minus c) + sac
(9) Reranking giprime1simg
iprime5 in descending order of their total score s and obtain Rlowast
(10) end if(11) elif rank 10 correct(12) Changing g
iprime1simg
iprime5 to g
iprime1simg
iprime10 repeat 6ndash10
(13) end if(14) else(15) Changing g
iprime1simg
iprime5 to g
iprime1simg
iprimeM repeat 6ndash10
(16) end if(17) end for
ALGORITHM 1 0e attribute-aided reranking algorithm
Male
Hat
Pants
Stripes
Backpack
Carrying nothing
(a)
Male
Short hair
Pants
No hat
No backpack
No handbag
Short sleeve
Short lower
(b)
Figure 7 Attribute label of the pedestrian in datasets
Table 1 Recognition accuracy of different attributes in PARN and APR ()
Method Gender Hair Clothes Hat Backpack Handbag Lslv Llow MeanAPR 8578 8346 9136 8821 8632 7601 9412 9264 8723PARN 8493 7910 8935 9667 8386 8518 9284 8845 8755
Computational Intelligence and Neuroscience 9
can be obtained by the LSTM model with natural languageprocess ideas which make person re-id methods moresignificant
Data Availability
0e (attributes recognition accuracy on the Market1501dataset and person re-id accuracy on VIPeR CUHK01 andMarket1501 datasets) data used to support the findings ofthis study are included within the article
Conflicts of Interest
0e authors declare that they have no conflicts of interest
Acknowledgments
0is work was supported by the National Natural ScienceFoundation of China (no 61773105) the Natural ScienceFoundation of Liaoning Province (no 20170540675) andthe Scientific Research Project of Liaoning EducationalDepartment (no LQGD2017023)
Table 2 Comparison of various methods with the proposed methods on three datasets ()
MethodsVIPeR CUHK01 Market1501
Rank 1 Rank 5 Rank 10 Rank 1 Rank 5 Rank 10 Rank 1 Rank 5 Rank 10 mAPDNS [32] 4228 7146 8294 6498 8496 8992 6796 mdash mdash 4189DLDA [33] 4411 7259 8166 6712 8945 9168 4815 mdash mdash 2994FT-JSTL +DGD [11] 386 mdash mdash 6660 mdash mdash mdash mdash mdash mdashPDC [34] 5127 7405 8418 mdash mdash mdash 8414 9273 9492 6341K-means-CNN [35] 4650 6930 8070 5350 8250 9120 mdash mdash mdash mdashSpindle [9] 5380 7410 8320 mdash mdash mdash 7690 9150 9460 mdashPersonNet [36] mdash mdash mdash 7114 9007 9500 3721 mdash mdash 1857CSBT [37] 3660 6620 8830 5120 7630 9180 4290 mdash mdash 2030SDH-CNN [38] mdash mdash mdash mdash mdash mdash 5812 6850 8082 4820M3TCP [21] mdash mdash mdash 5370 8430 9100 mdash mdash mdash mdashDM3 [39] 4270 7430 8510 4970 7730 8610 7580 8910 9240 5320MiF 4082 6835 8101 6687 8539 8951 7892 8646 8928 6600MiF + PARN 4684 7468 8323 7181 9033 9177 7939 8895 9136 6646ldquoMiFrdquo represents the MiF-CNN method and ldquoMiF + PARNrdquo represents the MiF-CNN with the attribute-aided reranking method
Query
1 2 3 4 5 6 7 8 9 10
Query
MiF
MiF + PARN
MiF
MiF + PARN
Rank list
Figure 8 Rank list comparison between MiF and MiF + PARN
10 Computational Intelligence and Neuroscience
References
[1] L Li and J Guo ldquoPedestrian re-identification based onreinforced deep feature fusionrdquo Information Technologyvol 42 no 7 pp 15ndash19 2018 in Chinese
[2] J Dai Y Zhang H Lu and H Wang ldquoCross-view semanticprojection learning for person re-identificationrdquo PatternRecognition vol 75 pp 63ndash76 2018
[3] T T T Pham T-L Le H Vu et al ldquoFully-automated personre-identification in multi-camera surveillance system with arobust kernel descriptor and effective shadow removalmethodrdquo Image and Vision Computing vol 59 pp 44ndash622017
[4] P DaoDao and H J Lee ldquoDeep multi-task network forlearning person identity and attributesrdquo IEEE Access vol 6pp 60801ndash60811 2018
[5] J Li L Zhuo J Zhang F Li and H Zhang ldquoA survey ofperson re-identificationrdquo Acta Automatica Sinica vol 44no 9 pp 1554ndash1568 2018 in Chinese
[6] L Wu Y Wang J Gao et al ldquoDeep adaptive feature em-bedding with local sample distributions for person re-iden-tificationrdquo Pattern Recognition vol 73 pp 275ndash288 2018
[7] L Li Y Yang and A G Hauptmann ldquoPerson re-identification past present and futurerdquo 2016 httparxivorgabs161002984
[8] D Wu S J Zheng C A Yuan and D S Huang ldquoA deepmodel with combined losses for person re-identificationrdquoCognitive Systems Research vol 54 pp 74ndash82 2018
[9] H Zhao M Tian S Sun et al ldquoSpindle net person re-identification with human body region guided feature de-composition and fusionrdquo in Proceedings of the 2017 IEEEConference on Computer Vision and Pattern Recognition(CVPR) pp 1077ndash1085 Honolulu HI USA 2017
[10] W Chen X Chen J Zhang and K Huang ldquoBeyond tripletloss a deep quadruplet network for person re-identificationrdquoin Proceedings of the 2017 IEEE Conference on ComputerVision and Pattern Recognition (CVPR) vol 28 Honolulu HIUSA July 2017
[11] T Xiao H Li W Ouyang et al ldquoLearning deep featurerepresentations with domain guided dropout for person re-identificationrdquo in Proceedings of the 2016 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 1249ndash1258 Las Vegas NV USA June 2016
[12] E Ahmed M Jones and T K Marks ldquoAn improved deeplearning architecture for person re-identificationrdquo in Pro-ceedings of the 2015 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 3908ndash3916 Boston MAUSA June 2015
[13] M Farenzena L Bazzani A Perina et al ldquoPerson re-identification by symmetry-driven accumulation of localfeaturesrdquo in Proceedings of the 2010 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 2360ndash2367 San Francisco CA USA June 2010
[14] Y Yang J Yang J Yan et al ldquoSalient color names for personre-identificationrdquo in Proceedings of Computer Vision-ECCV2014 pp 536ndash551 Zurich Switzerland September 2014
[15] L Liao M Cristani A Perina et al ldquoMultiple-shot person re-identification by chromatic and epitomic analysesrdquo PatternRecognition Letters vol 33 no 7 pp 898ndash903 2012
[16] S Murino R Laganiere and P Payeur ldquoImproving pedes-trian detection with selective gradient self-similarity featurerdquoPattern Recognition vol 48 no 8 pp 2364ndash2376 2015
[17] R Layne T M Hospedales S Gong et al ldquoPerson re-identification by attributesrdquo in Proceedings of the 23th
BritishMachine Vision Conference (BMVC) vol 2 no 3 p 83Surrey UK September 2012
[18] Z Wang R Hu Y Yu C Liang and W Huang ldquoMulti-levelfusion for person Re-identification with incomplete marksrdquoin Proceedings of the 23rd ACM international conference onMultimedia pp 1267ndash1270 Brisbane Australia March 2018
[19] Y Chen S Duffner A Stoian et al ldquoDeep and low-levelfeature based attribute learning for person re-identificationrdquoImage and Vision Computing vol 79 pp 25ndash34 2018
[20] Z Dufour X Bai M Ye and S Satoh ldquoIncremental deephidden attribute learningrdquo in Proceedings of the 26th ACMinternational conference on Multimedia pp 72ndash80 SeoulSouth Korea October 2018
[21] D Cheng Y Gong S Zhou et al ldquoPerson re-identification bymulti-channel parts-based cnn with improved triplet lossfunctionrdquo in Proceedings of the 2016 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 1335ndash1344 Las Vegas NV USA June 2016
[22] K He X Zhang S Ren et al ldquoDeep residual learning forimage recognitionrdquo in Proceedings of the 2016 IEEE Confer-ence on Computer Vision and Pattern Recognition (CVPR)pp 770ndash778 Las Vegas NV USA June 2016
[23] Y Lin L Zheng Z Zheng et al ldquoImproving person re-identification by attribute and identity learningrdquo 2017 httparxivorgabs170307220
[24] Y Yan B Ni J Liu and X Yang ldquoMulti-level attentionmodelfor person re-identificationrdquo Pattern Recognition Letters2018 In press
[25] Y Wen K Zhang Z Li et al ldquoA discriminative featurelearning approach for deep face recognitionrdquo in Proceedingsof 14th European Conference pp 499ndash515 AmsterdamNetherlands October 2016
[26] K Xu J Ba R Kiros et al ldquoShow attend and tell neuralimage caption generation with visual attentionrdquo in Pro-ceedings of the 32nd International Conference on MachineLearning (ICML) pp 2048ndash2057 Lille France July 2015
[27] H Choi K Cho and Y Bengio ldquoFine-grained attentionmechanism for neural machine translationrdquoNeurocomputingvol 284 pp 171ndash176 2018
[28] D Gray and H Tao ldquoViewpoint invariant pedestrian rec-ognition with an ensemble of localized featuresrdquo in Pro-ceedings of European Conference on Computer Vision LectureNotes in Computer Science pp 262ndash275 Marseille FranceOctober 2008
[29] W Li R Zhao and X Wang ldquoHuman reidentification withtransferred metric learningrdquo in Proceedings of the 2008 AsianConference on Computer Vision (ACCV) pp 31ndash44 DaejeonKorea November 2012
[30] L Zheng L Shen L Tian et al ldquoScalable person re-identification a benchmarkrdquo in Proceedings of the 2015IEEE Conference on Computer Vision and Pattern Recognition(CVPR) pp 1116ndash1124 Boston MA USA December 2015
[31] W Li R Zhao T Xiao et al ldquoDeepReID deep filter pairingneural network for person re-identificationrdquo in Proceedings ofthe 2014 Computer Vision and Pattern Recognition (CVPR)pp 152ndash159 Columbus OH USA June 2014
[32] L Zhang T Xiang and S Gong ldquoLearning a discriminativenull space for person re-identificationrdquo in Proceedings of theIEEE Conference on Computer Cision and Pattern Recognition(CVPR) pp 1239ndash1248 Las Vegas NV USA June 2016
[33] L Wu C Shen and A van den Hengel ldquoDeep linear dis-criminant analysis on Fisher networks a hybrid architecturefor person re-identificationrdquo Pattern Recognition vol 65pp 238ndash250 2017
Computational Intelligence and Neuroscience 11
[34] C Su J Li S Zhang et al ldquoPose-driven deep convolutionalmodel for person re-identificationrdquo in Proceedings of the 2016International Conference on Computer Vision (ICCV)pp 3980ndash3989 Venice Italy October 2017
[35] Z Zhao B Zhao and F Su ldquoPerson re-identification viaintegrating patch-based metric learning and local saliencelearningrdquo Pattern Recognition vol 75 pp 90ndash98 2018
[36] L Wu C Shen and A Hengel ldquoPersonnet person re-identification with deep convolutional neural networksrdquo2016 httparxivorgabs160107255
[37] J Chen Y Wang J Qin et al ldquoFast person re-identificationvia cross-camera semantic binary transformationrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5330ndash5339 Honolulu HIUSA July 2017
[38] L Wu Y Wang Z Ge et al ldquoStructured deep hashing withconvolutional neural networks for fast person re-identifica-tionrdquo Computer Vision and Image Understanding vol 167no 10 pp 63ndash73 2018
[39] Z Hu R Hu C Chen Y Yu et al ldquoPerson reidentification viadiscrepancy matrix and matrix metricrdquo IEEE Transactions onCybernetics vol 48 no 10 pp 3006ndash3020 2018
[40] D Jiang K Cho and Y Bengio ldquoNeural machine translationby jointly learning to align and translaterdquo 2014 httparxivorgabs14090473
person re-id. The proposed MiF-CNN model reuses feature maps and gradient information, which enhances the flow of features through the network and improves the efficiency of gradient propagation. The designed person attribute recognition network uses an attention mechanism to measure the correlation between the input feature maps and the hidden state of the LSTM at the previous time step, thereby reducing the dimension of the features. It also employs the LSTM to decode image features into pedestrian attributes. On the basis of the attribute recognition results, the attribute-aided reranking algorithm is presented, which rematches attribute features among samples so that more positive samples rank higher in the rank list, further improving the identification accuracy.
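The attention step summarized above can be illustrated with a minimal sketch. It assumes dot-product scoring between each spatial feature vector and the previous hidden state, followed by a softmax; the scoring function, the function name `soft_attention`, and the requirement that features and hidden state share one dimension are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def soft_attention(features: Sequence[Sequence[float]],
                   h_prev: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Weight L feature vectors by their correlation with the previous hidden state.

    Assumption: every feature vector has the same length as h_prev, so a dot
    product can serve as the correlation score.
    """
    # One score per spatial location: dot product with the previous hidden state.
    scores = [sum(f * h for f, h in zip(feat, h_prev)) for feat in features]
    # Softmax (shifted by the maximum score for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the attention-weighted sum, so L x D features
    # are reduced to a single D-dimensional vector.
    dim = len(h_prev)
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(dim)]
    return weights, context


weights, context = soft_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 0.0])
```

In this sketch the three input locations collapse into one two-dimensional context vector, which is the sense in which attention reduces the feature dimension before the LSTM decodes attributes.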
The experimental results on three public person re-id datasets indicate the strong performance of the MiF-CNN model in person re-id. The attribute-aided reranking algorithm makes a major contribution to improving the accuracy of person re-id. In the future, more improvement and optimization will be done on the pedestrian attribute recognition network. Moreover, the caption of person images or videos
Input: Initial rank list $R_0$
Output: The reranked rank list $R^{*}$
(1) for $i = 1, 2, \ldots, M$ do
(2)   if rank 1 correct
(3)     $R^{*} = R_0$
(4)   end if
(5)   elif rank 5 correct
(6)     Distribute the initial feature score $s_f$ to the top-5 gallery images $g^{i'}_{1} \sim g^{i'}_{5}$ according to their ranking in $R_0$
(7)     Distribute the attribute score $s_a$ to $g^{i'}_{1} \sim g^{i'}_{5}$ according to the number of their attributes that are the same as those of $q_i$
(8)     $s = s_f(1 - c) + s_a c$
(9)     Rerank $g^{i'}_{1} \sim g^{i'}_{5}$ in descending order of their total score $s$ and obtain $R^{*}$
(10)  end if
(11)  elif rank 10 correct
(12)    Change $g^{i'}_{1} \sim g^{i'}_{5}$ to $g^{i'}_{1} \sim g^{i'}_{10}$ and repeat steps 6–10
(13)  end if
(14)  else
(15)    Change $g^{i'}_{1} \sim g^{i'}_{5}$ to $g^{i'}_{1} \sim g^{i'}_{M}$ and repeat steps 6–10
(16)  end if
(17) end for
Algorithm 1: The attribute-aided reranking algorithm.
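Steps 6–9 of Algorithm 1 can be sketched as follows. The fusion rule s = s_f(1 − c) + s_a·c is taken from the algorithm; how the two scores are "distributed" is not specified here, so the sketch assumes a linearly decreasing feature score over the top-k positions and an attribute score equal to the fraction of matching attributes. The names `attribute_score` and `rerank_top_k`, the default `c=0.5`, and the tie-breaking by initial rank are hypothetical choices for illustration.

```python
from typing import Dict, List, Sequence


def attribute_score(query_attrs: Sequence[int], gallery_attrs: Sequence[int]) -> float:
    """Fraction of attributes on which a gallery image agrees with the query."""
    same = sum(1 for q, g in zip(query_attrs, gallery_attrs) if q == g)
    return same / len(query_attrs)


def rerank_top_k(rank_list: Sequence[str],
                 query_attrs: Sequence[int],
                 gallery_attrs: Dict[str, Sequence[int]],
                 k: int = 5,
                 c: float = 0.5) -> List[str]:
    """Rerank the first k entries of an initial rank list by a fused score."""
    k = min(k, len(rank_list))
    head, tail = list(rank_list[:k]), list(rank_list[k:])
    totals = []
    for pos, gid in enumerate(head):
        # Assumed feature score: 1.0 for the first position, decreasing linearly.
        s_f = (k - pos) / k
        # Assumed attribute score: share of attributes equal to the query's.
        s_a = attribute_score(query_attrs, gallery_attrs[gid])
        # Fusion rule from step 8 of Algorithm 1.
        totals.append(s_f * (1 - c) + s_a * c)
    # Descending total score; ties keep their initial order.
    order = sorted(range(k), key=lambda pos: (-totals[pos], pos))
    return [head[pos] for pos in order] + tail


query = [1, 0, 1, 1]
gallery = {"g1": [1, 0, 1, 1], "g2": [1, 0, 0, 1], "g3": [0, 1, 0, 0]}
reranked = rerank_top_k(["g3", "g1", "g2"], query, gallery, k=3, c=0.5)
```

Entries beyond position k keep their initial order, and setting c = 0 returns the initial rank list unchanged, since the total score then equals the rank-based feature score.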
Figure 7: Attribute label of the pedestrian in datasets. (a) Male, hat, pants, stripes, backpack, carrying nothing. (b) Male, short hair, pants, no hat, no backpack, no handbag, short sleeve, short lower.
Table 1: Recognition accuracy of different attributes in PARN and APR (%).

Method  Gender  Hair   Clothes  Hat    Backpack  Handbag  L.slv  L.low  Mean
APR     85.78   83.46  91.36    88.21  86.32     76.01    94.12  92.64  87.23
PARN    84.93   79.10  89.35    96.67  83.86     85.18    92.84  88.45  87.55
Computational Intelligence and Neuroscience 9
can be obtained by the LSTM model with natural language processing ideas, which would make person re-id methods more significant.
Data Availability

The data used to support the findings of this study (attribute recognition accuracy on the Market1501 dataset and person re-id accuracy on the VIPeR, CUHK01, and Market1501 datasets) are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61773105), the Natural Science Foundation of Liaoning Province (no. 20170540675), and the Scientific Research Project of Liaoning Educational Department (no. LQGD2017023).
Table 2: Comparison of various methods with the proposed methods on three datasets (%).

Methods             VIPeR                      CUHK01                     Market1501
                    Rank 1   Rank 5   Rank 10  Rank 1   Rank 5   Rank 10  Rank 1   Rank 5   Rank 10  mAP
DNS [32]            42.28    71.46    82.94    64.98    84.96    89.92    67.96    -        -        41.89
DLDA [33]           44.11    72.59    81.66    67.12    89.45    91.68    48.15    -        -        29.94
FT-JSTL + DGD [11]  38.6     -        -        66.60    -        -        -        -        -        -
PDC [34]            51.27    74.05    84.18    -        -        -        84.14    92.73    94.92    63.41
K-means-CNN [35]    46.50    69.30    80.70    53.50    82.50    91.20    -        -        -        -
Spindle [9]         53.80    74.10    83.20    -        -        -        76.90    91.50    94.60    -
PersonNet [36]      -        -        -        71.14    90.07    95.00    37.21    -        -        18.57
CSBT [37]           36.60    66.20    88.30    51.20    76.30    91.80    42.90    -        -        20.30
SDH-CNN [38]        -        -        -        -        -        -        58.12    68.50    80.82    48.20
M3TCP [21]          -        -        -        53.70    84.30    91.00    -        -        -        -
DM3 [39]            42.70    74.30    85.10    49.70    77.30    86.10    75.80    89.10    92.40    53.20
MiF                 40.82    68.35    81.01    66.87    85.39    89.51    78.92    86.46    89.28    66.00
MiF + PARN          46.84    74.68    83.23    71.81    90.33    91.77    79.39    88.95    91.36    66.46

“MiF” represents the MiF-CNN method, and “MiF + PARN” represents the MiF-CNN with the attribute-aided reranking method.
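The contribution of reranking can be read from the last two rows of Table 2. The short sketch below recomputes the gain of MiF + PARN over MiF at Rank 1, Rank 5, and Rank 10; the accuracy values are copied from the table, while the dictionary layout and the function name `rerank_gain` are illustrative only.

```python
from typing import Dict, Tuple

Scores = Dict[str, Tuple[float, float, float]]

# Rank-1, Rank-5, Rank-10 accuracy (%) from the last two rows of Table 2.
mif: Scores = {
    "VIPeR": (40.82, 68.35, 81.01),
    "CUHK01": (66.87, 85.39, 89.51),
    "Market1501": (78.92, 86.46, 89.28),
}
mif_parn: Scores = {
    "VIPeR": (46.84, 74.68, 83.23),
    "CUHK01": (71.81, 90.33, 91.77),
    "Market1501": (79.39, 88.95, 91.36),
}


def rerank_gain(before: Scores, after: Scores) -> Scores:
    """Per-dataset accuracy gain in percentage points, rounded to two decimals."""
    return {
        name: tuple(round(a - b, 2) for b, a in zip(before[name], after[name]))
        for name in before
    }


gains = rerank_gain(mif, mif_parn)
for name, (r1, r5, r10) in gains.items():
    print(f"{name}: Rank 1 +{r1}, Rank 5 +{r5}, Rank 10 +{r10}")
```

The Rank 1 gain is about 6 percentage points on VIPeR and about 5 on CUHK01, but under 0.5 on Market1501, where most of the improvement appears at Rank 5 and Rank 10.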
Figure 8: Rank list comparison between MiF and MiF + PARN. For each query, the rows show the rank lists of MiF and MiF + PARN, and the columns show ranks 1 to 10.
References
[1] L. Li and J. Guo, “Pedestrian re-identification based on reinforced deep feature fusion,” Information Technology, vol. 42, no. 7, pp. 15–19, 2018, in Chinese.
[2] J. Dai, Y. Zhang, H. Lu, and H. Wang, “Cross-view semantic projection learning for person re-identification,” Pattern Recognition, vol. 75, pp. 63–76, 2018.
[3] T. T. T. Pham, T.-L. Le, H. Vu et al., “Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method,” Image and Vision Computing, vol. 59, pp. 44–62, 2017.
[4] P. DaoDao and H. J. Lee, “Deep multi-task network for learning person identity and attributes,” IEEE Access, vol. 6, pp. 60801–60811, 2018.
[5] J. Li, L. Zhuo, J. Zhang, F. Li, and H. Zhang, “A survey of person re-identification,” Acta Automatica Sinica, vol. 44, no. 9, pp. 1554–1568, 2018, in Chinese.
[6] L. Wu, Y. Wang, J. Gao et al., “Deep adaptive feature embedding with local sample distributions for person re-identification,” Pattern Recognition, vol. 73, pp. 275–288, 2018.
[7] L. Li, Y. Yang, and A. G. Hauptmann, “Person re-identification: past, present and future,” 2016, http://arxiv.org/abs/1610.02984.
[8] D. Wu, S. J. Zheng, C. A. Yuan, and D. S. Huang, “A deep model with combined losses for person re-identification,” Cognitive Systems Research, vol. 54, pp. 74–82, 2018.
[9] H. Zhao, M. Tian, S. Sun et al., “Spindle net: person re-identification with human body region guided feature decomposition and fusion,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1077–1085, Honolulu, HI, USA, 2017.
[10] W. Chen, X. Chen, J. Zhang, and K. Huang, “Beyond triplet loss: a deep quadruplet network for person re-identification,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 28, Honolulu, HI, USA, July 2017.
[11] T. Xiao, H. Li, W. Ouyang et al., “Learning deep feature representations with domain guided dropout for person re-identification,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1249–1258, Las Vegas, NV, USA, June 2016.
[12] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning architecture for person re-identification,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, Boston, MA, USA, June 2015.
[13] M. Farenzena, L. Bazzani, A. Perina et al., “Person re-identification by symmetry-driven accumulation of local features,” in Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2360–2367, San Francisco, CA, USA, June 2010.
[14] Y. Yang, J. Yang, J. Yan et al., “Salient color names for person re-identification,” in Proceedings of Computer Vision–ECCV 2014, pp. 536–551, Zurich, Switzerland, September 2014.
[15] L. Liao, M. Cristani, A. Perina et al., “Multiple-shot person re-identification by chromatic and epitomic analyses,” Pattern Recognition Letters, vol. 33, no. 7, pp. 898–903, 2012.
[16] S. Murino, R. Laganiere, and P. Payeur, “Improving pedestrian detection with selective gradient self-similarity feature,” Pattern Recognition, vol. 48, no. 8, pp. 2364–2376, 2015.
[17] R. Layne, T. M. Hospedales, S. Gong et al., “Person re-identification by attributes,” in Proceedings of the 23rd British Machine Vision Conference (BMVC), vol. 2, no. 3, p. 83, Surrey, UK, September 2012.
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, In press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, “Personnet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Jiang, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.
References
[1] L Li and J Guo ldquoPedestrian re-identification based onreinforced deep feature fusionrdquo Information Technologyvol 42 no 7 pp 15ndash19 2018 in Chinese
[2] J Dai Y Zhang H Lu and H Wang ldquoCross-view semanticprojection learning for person re-identificationrdquo PatternRecognition vol 75 pp 63ndash76 2018
[3] T T T Pham T-L Le H Vu et al ldquoFully-automated personre-identification in multi-camera surveillance system with arobust kernel descriptor and effective shadow removalmethodrdquo Image and Vision Computing vol 59 pp 44ndash622017
[4] P DaoDao and H J Lee ldquoDeep multi-task network forlearning person identity and attributesrdquo IEEE Access vol 6pp 60801ndash60811 2018
[5] J Li L Zhuo J Zhang F Li and H Zhang ldquoA survey ofperson re-identificationrdquo Acta Automatica Sinica vol 44no 9 pp 1554ndash1568 2018 in Chinese
[6] L Wu Y Wang J Gao et al ldquoDeep adaptive feature em-bedding with local sample distributions for person re-iden-tificationrdquo Pattern Recognition vol 73 pp 275ndash288 2018
[7] L Li Y Yang and A G Hauptmann ldquoPerson re-identification past present and futurerdquo 2016 httparxivorgabs161002984
[8] D Wu S J Zheng C A Yuan and D S Huang ldquoA deepmodel with combined losses for person re-identificationrdquoCognitive Systems Research vol 54 pp 74ndash82 2018
[9] H Zhao M Tian S Sun et al ldquoSpindle net person re-identification with human body region guided feature de-composition and fusionrdquo in Proceedings of the 2017 IEEEConference on Computer Vision and Pattern Recognition(CVPR) pp 1077ndash1085 Honolulu HI USA 2017
[10] W Chen X Chen J Zhang and K Huang ldquoBeyond tripletloss a deep quadruplet network for person re-identificationrdquoin Proceedings of the 2017 IEEE Conference on ComputerVision and Pattern Recognition (CVPR) vol 28 Honolulu HIUSA July 2017
[11] T Xiao H Li W Ouyang et al ldquoLearning deep featurerepresentations with domain guided dropout for person re-identificationrdquo in Proceedings of the 2016 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 1249ndash1258 Las Vegas NV USA June 2016
[12] E Ahmed M Jones and T K Marks ldquoAn improved deeplearning architecture for person re-identificationrdquo in Pro-ceedings of the 2015 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 3908ndash3916 Boston MAUSA June 2015
[13] M Farenzena L Bazzani A Perina et al ldquoPerson re-identification by symmetry-driven accumulation of localfeaturesrdquo in Proceedings of the 2010 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 2360ndash2367 San Francisco CA USA June 2010
[14] Y Yang J Yang J Yan et al ldquoSalient color names for personre-identificationrdquo in Proceedings of Computer Vision-ECCV2014 pp 536ndash551 Zurich Switzerland September 2014
[15] L Liao M Cristani A Perina et al ldquoMultiple-shot person re-identification by chromatic and epitomic analysesrdquo PatternRecognition Letters vol 33 no 7 pp 898ndash903 2012
[16] S Murino R Laganiere and P Payeur ldquoImproving pedes-trian detection with selective gradient self-similarity featurerdquoPattern Recognition vol 48 no 8 pp 2364ndash2376 2015
[17] R Layne T M Hospedales S Gong et al ldquoPerson re-identification by attributesrdquo in Proceedings of the 23th
BritishMachine Vision Conference (BMVC) vol 2 no 3 p 83Surrey UK September 2012
[18] Z Wang R Hu Y Yu C Liang and W Huang ldquoMulti-levelfusion for person Re-identification with incomplete marksrdquoin Proceedings of the 23rd ACM international conference onMultimedia pp 1267ndash1270 Brisbane Australia March 2018
[19] Y Chen S Duffner A Stoian et al ldquoDeep and low-levelfeature based attribute learning for person re-identificationrdquoImage and Vision Computing vol 79 pp 25ndash34 2018
[20] Z Dufour X Bai M Ye and S Satoh ldquoIncremental deephidden attribute learningrdquo in Proceedings of the 26th ACMinternational conference on Multimedia pp 72ndash80 SeoulSouth Korea October 2018
[21] D Cheng Y Gong S Zhou et al ldquoPerson re-identification bymulti-channel parts-based cnn with improved triplet lossfunctionrdquo in Proceedings of the 2016 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 1335ndash1344 Las Vegas NV USA June 2016
[22] K He X Zhang S Ren et al ldquoDeep residual learning forimage recognitionrdquo in Proceedings of the 2016 IEEE Confer-ence on Computer Vision and Pattern Recognition (CVPR)pp 770ndash778 Las Vegas NV USA June 2016
[23] Y Lin L Zheng Z Zheng et al ldquoImproving person re-identification by attribute and identity learningrdquo 2017 httparxivorgabs170307220
[24] Y Yan B Ni J Liu and X Yang ldquoMulti-level attentionmodelfor person re-identificationrdquo Pattern Recognition Letters2018 In press
[25] Y Wen K Zhang Z Li et al ldquoA discriminative featurelearning approach for deep face recognitionrdquo in Proceedingsof 14th European Conference pp 499ndash515 AmsterdamNetherlands October 2016
[26] K Xu J Ba R Kiros et al ldquoShow attend and tell neuralimage caption generation with visual attentionrdquo in Pro-ceedings of the 32nd International Conference on MachineLearning (ICML) pp 2048ndash2057 Lille France July 2015
[27] H Choi K Cho and Y Bengio ldquoFine-grained attentionmechanism for neural machine translationrdquoNeurocomputingvol 284 pp 171ndash176 2018
[28] D Gray and H Tao ldquoViewpoint invariant pedestrian rec-ognition with an ensemble of localized featuresrdquo in Pro-ceedings of European Conference on Computer Vision LectureNotes in Computer Science pp 262ndash275 Marseille FranceOctober 2008
[29] W Li R Zhao and X Wang ldquoHuman reidentification withtransferred metric learningrdquo in Proceedings of the 2008 AsianConference on Computer Vision (ACCV) pp 31ndash44 DaejeonKorea November 2012
[30] L Zheng L Shen L Tian et al ldquoScalable person re-identification a benchmarkrdquo in Proceedings of the 2015IEEE Conference on Computer Vision and Pattern Recognition(CVPR) pp 1116ndash1124 Boston MA USA December 2015
[31] W Li R Zhao T Xiao et al ldquoDeepReID deep filter pairingneural network for person re-identificationrdquo in Proceedings ofthe 2014 Computer Vision and Pattern Recognition (CVPR)pp 152ndash159 Columbus OH USA June 2014
[32] L Zhang T Xiang and S Gong ldquoLearning a discriminativenull space for person re-identificationrdquo in Proceedings of theIEEE Conference on Computer Cision and Pattern Recognition(CVPR) pp 1239ndash1248 Las Vegas NV USA June 2016
[33] L Wu C Shen and A van den Hengel ldquoDeep linear dis-criminant analysis on Fisher networks a hybrid architecturefor person re-identificationrdquo Pattern Recognition vol 65pp 238ndash250 2017
Computational Intelligence and Neuroscience 11
[34] C Su J Li S Zhang et al ldquoPose-driven deep convolutionalmodel for person re-identificationrdquo in Proceedings of the 2016International Conference on Computer Vision (ICCV)pp 3980ndash3989 Venice Italy October 2017
[35] Z Zhao B Zhao and F Su ldquoPerson re-identification viaintegrating patch-based metric learning and local saliencelearningrdquo Pattern Recognition vol 75 pp 90ndash98 2018
[36] L Wu C Shen and A Hengel ldquoPersonnet person re-identification with deep convolutional neural networksrdquo2016 httparxivorgabs160107255
[37] J Chen Y Wang J Qin et al ldquoFast person re-identificationvia cross-camera semantic binary transformationrdquo in Pro-ceedings of the 2017 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 5330ndash5339 Honolulu HIUSA July 2017
[38] L Wu Y Wang Z Ge et al ldquoStructured deep hashing withconvolutional neural networks for fast person re-identifica-tionrdquo Computer Vision and Image Understanding vol 167no 10 pp 63ndash73 2018
[39] Z Hu R Hu C Chen Y Yu et al ldquoPerson reidentification viadiscrepancy matrix and matrix metricrdquo IEEE Transactions onCybernetics vol 48 no 10 pp 3006ndash3020 2018
[40] D Jiang K Cho and Y Bengio ldquoNeural machine translationby jointly learning to align and translaterdquo 2014 httparxivorgabs14090473
References
[1] L Li and J Guo ldquoPedestrian re-identification based onreinforced deep feature fusionrdquo Information Technologyvol 42 no 7 pp 15ndash19 2018 in Chinese
[2] J Dai Y Zhang H Lu and H Wang ldquoCross-view semanticprojection learning for person re-identificationrdquo PatternRecognition vol 75 pp 63ndash76 2018
[3] T T T Pham T-L Le H Vu et al ldquoFully-automated personre-identification in multi-camera surveillance system with arobust kernel descriptor and effective shadow removalmethodrdquo Image and Vision Computing vol 59 pp 44ndash622017
[4] P DaoDao and H J Lee ldquoDeep multi-task network forlearning person identity and attributesrdquo IEEE Access vol 6pp 60801ndash60811 2018
[5] J Li L Zhuo J Zhang F Li and H Zhang ldquoA survey ofperson re-identificationrdquo Acta Automatica Sinica vol 44no 9 pp 1554ndash1568 2018 in Chinese
[6] L Wu Y Wang J Gao et al ldquoDeep adaptive feature em-bedding with local sample distributions for person re-iden-tificationrdquo Pattern Recognition vol 73 pp 275ndash288 2018
[7] L Li Y Yang and A G Hauptmann ldquoPerson re-identification past present and futurerdquo 2016 httparxivorgabs161002984
[8] D Wu S J Zheng C A Yuan and D S Huang ldquoA deepmodel with combined losses for person re-identificationrdquoCognitive Systems Research vol 54 pp 74ndash82 2018
[9] H Zhao M Tian S Sun et al ldquoSpindle net person re-identification with human body region guided feature de-composition and fusionrdquo in Proceedings of the 2017 IEEEConference on Computer Vision and Pattern Recognition(CVPR) pp 1077ndash1085 Honolulu HI USA 2017
[10] W Chen X Chen J Zhang and K Huang ldquoBeyond tripletloss a deep quadruplet network for person re-identificationrdquoin Proceedings of the 2017 IEEE Conference on ComputerVision and Pattern Recognition (CVPR) vol 28 Honolulu HIUSA July 2017
[11] T Xiao H Li W Ouyang et al ldquoLearning deep featurerepresentations with domain guided dropout for person re-identificationrdquo in Proceedings of the 2016 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 1249ndash1258 Las Vegas NV USA June 2016
[12] E Ahmed M Jones and T K Marks ldquoAn improved deeplearning architecture for person re-identificationrdquo in Pro-ceedings of the 2015 IEEE Conference on Computer Vision andPattern Recognition (CVPR) pp 3908ndash3916 Boston MAUSA June 2015
[13] M Farenzena L Bazzani A Perina et al ldquoPerson re-identification by symmetry-driven accumulation of localfeaturesrdquo in Proceedings of the 2010 IEEE Conference onComputer Vision and Pattern Recognition (CVPR)pp 2360ndash2367 San Francisco CA USA June 2010
[14] Y Yang J Yang J Yan et al ldquoSalient color names for personre-identificationrdquo in Proceedings of Computer Vision-ECCV2014 pp 536ndash551 Zurich Switzerland September 2014
[15] L Liao M Cristani A Perina et al ldquoMultiple-shot person re-identification by chromatic and epitomic analysesrdquo PatternRecognition Letters vol 33 no 7 pp 898ndash903 2012
[16] S Murino R Laganiere and P Payeur ldquoImproving pedes-trian detection with selective gradient self-similarity featurerdquoPattern Recognition vol 48 no 8 pp 2364ndash2376 2015
[17] R Layne T M Hospedales S Gong et al ldquoPerson re-identification by attributesrdquo in Proceedings of the 23th
BritishMachine Vision Conference (BMVC) vol 2 no 3 p 83Surrey UK September 2012
[18] Z. Wang, R. Hu, Y. Yu, C. Liang, and W. Huang, “Multi-level fusion for person re-identification with incomplete marks,” in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1267–1270, Brisbane, Australia, March 2018.
[19] Y. Chen, S. Duffner, A. Stoian et al., “Deep and low-level feature based attribute learning for person re-identification,” Image and Vision Computing, vol. 79, pp. 25–34, 2018.
[20] Z. Dufour, X. Bai, M. Ye, and S. Satoh, “Incremental deep hidden attribute learning,” in Proceedings of the 26th ACM International Conference on Multimedia, pp. 72–80, Seoul, South Korea, October 2018.
[21] D. Cheng, Y. Gong, S. Zhou et al., “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1335–1344, Las Vegas, NV, USA, June 2016.
[22] K. He, X. Zhang, S. Ren et al., “Deep residual learning for image recognition,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, Las Vegas, NV, USA, June 2016.
[23] Y. Lin, L. Zheng, Z. Zheng et al., “Improving person re-identification by attribute and identity learning,” 2017, http://arxiv.org/abs/1703.07220.
[24] Y. Yan, B. Ni, J. Liu, and X. Yang, “Multi-level attention model for person re-identification,” Pattern Recognition Letters, 2018, In press.
[25] Y. Wen, K. Zhang, Z. Li et al., “A discriminative feature learning approach for deep face recognition,” in Proceedings of 14th European Conference, pp. 499–515, Amsterdam, Netherlands, October 2016.
[26] K. Xu, J. Ba, R. Kiros et al., “Show, attend and tell: neural image caption generation with visual attention,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057, Lille, France, July 2015.
[27] H. Choi, K. Cho, and Y. Bengio, “Fine-grained attention mechanism for neural machine translation,” Neurocomputing, vol. 284, pp. 171–176, 2018.
[28] D. Gray and H. Tao, “Viewpoint invariant pedestrian recognition with an ensemble of localized features,” in Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, pp. 262–275, Marseille, France, October 2008.
[29] W. Li, R. Zhao, and X. Wang, “Human reidentification with transferred metric learning,” in Proceedings of the 2012 Asian Conference on Computer Vision (ACCV), pp. 31–44, Daejeon, Korea, November 2012.
[30] L. Zheng, L. Shen, L. Tian et al., “Scalable person re-identification: a benchmark,” in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1116–1124, Boston, MA, USA, December 2015.
[31] W. Li, R. Zhao, T. Xiao et al., “DeepReID: deep filter pairing neural network for person re-identification,” in Proceedings of the 2014 Computer Vision and Pattern Recognition (CVPR), pp. 152–159, Columbus, OH, USA, June 2014.
[32] L. Zhang, T. Xiang, and S. Gong, “Learning a discriminative null space for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1239–1248, Las Vegas, NV, USA, June 2016.
[33] L. Wu, C. Shen, and A. van den Hengel, “Deep linear discriminant analysis on Fisher networks: a hybrid architecture for person re-identification,” Pattern Recognition, vol. 65, pp. 238–250, 2017.
[34] C. Su, J. Li, S. Zhang et al., “Pose-driven deep convolutional model for person re-identification,” in Proceedings of the 2017 International Conference on Computer Vision (ICCV), pp. 3980–3989, Venice, Italy, October 2017.
[35] Z. Zhao, B. Zhao, and F. Su, “Person re-identification via integrating patch-based metric learning and local salience learning,” Pattern Recognition, vol. 75, pp. 90–98, 2018.
[36] L. Wu, C. Shen, and A. Hengel, “PersonNet: person re-identification with deep convolutional neural networks,” 2016, http://arxiv.org/abs/1601.07255.
[37] J. Chen, Y. Wang, J. Qin et al., “Fast person re-identification via cross-camera semantic binary transformation,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5330–5339, Honolulu, HI, USA, July 2017.
[38] L. Wu, Y. Wang, Z. Ge et al., “Structured deep hashing with convolutional neural networks for fast person re-identification,” Computer Vision and Image Understanding, vol. 167, no. 10, pp. 63–73, 2018.
[39] Z. Hu, R. Hu, C. Chen, Y. Yu et al., “Person reidentification via discrepancy matrix and matrix metric,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 3006–3020, 2018.
[40] D. Jiang, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, http://arxiv.org/abs/1409.0473.