Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2013, Article ID 164810, 8 pages. http://dx.doi.org/10.1155/2013/164810
Research Article
Traffic Volume Data Outlier Recovery via Tensor Model
Huachun Tan,1 Jianshuai Feng,2 Guangdong Feng,3 Wuhong Wang,1 and Yu-Jin Zhang4
1 Department of Transportation Engineering, Beijing Institute of Technology, Beijing 100081, China
2 Integrated Information System Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
3 Civil Aviation Engineering Consulting Company of China, Beijing 100621, China
4 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Correspondence should be addressed to Huachun Tan; htan8@wisc.edu
Received 22 September 2012; Accepted 24 February 2013
Academic Editor: Huimin Niu
Copyright © 2013 Huachun Tan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Traffic volume data are routinely collected and used for a variety of purposes in intelligent transportation systems (ITS). However, the collected data may be abnormal due to outlier data caused by malfunctions in data collection and recording systems. To fully analyze and exploit the collected data, it is necessary to develop a valid method for handling the outlier data. Many existing algorithms have studied the problem of outlier recovery based on time-series methods. In this paper, a multiway tensor model is proposed for constructing the traffic volume data based on its intrinsic multilinear correlations, such as day-to-day and hour-to-hour. Then a novel tensor recovery method, called ADMM-TR, is proposed for recovering the outlier data of traffic volume. The proposed method is evaluated on synthetic data and real-world traffic volume data. Experimental results demonstrate the practicability, effectiveness, and advantage of the proposed method, especially on real-world traffic volume data.
1. Introduction
In order to alleviate the traffic congestion problem and facilitate mobility in metropolises, large amounts of traffic information are collected as a part of intelligent transportation systems (ITS), such as CVIS (Cooperative Vehicle Infrastructure System) in China. These collected traffic data have a wide range of applications. Real-time traffic information is provided to travelers to support their decision-making process on the optimal route choice [1]. As shown by the work of Kim et al. [2], real-time information can contribute to reducing operation costs and maximizing resource utilization. In addition to these applications, the collected data can be applied to maximize the utilization of the infrastructure for smooth traffic flow; one such application of real-time traffic data is traffic information control [3]. On the other hand, several data mining techniques have been applied to mine time-related association rules from traffic databases, and their results have been used for traffic prediction, such as the works of Williams et al. [4] and Xu et al. [5]. From the above discussion, it is concluded that the collected traffic data are essential for many potential applications in ITS.
In the real world, the collected data are often corrupted by noisy values, especially outlier values, which may be caused by detector failures, communication problems, or other hardware/software-related problems. The presence of outlier data in the database would significantly degrade the quality as well as the reliability of the data and might impede the effectiveness of ITS applications. Therefore, it is essential to fill the gaps caused by outlier data in order to fully explore the applicability of the data and realize the ITS applications.
While many different kinds of traffic data, such as traffic volume, speed, and occupancy, are collected, the focus of this research is on traffic volume outlier data recovery. It is supposed that the detectors collecting traffic information are set up at road sections and that the collected values represent the traffic volume for those road sections. The aim of this research is to recover the traffic volume outlier data for road sections.
A literature survey in the related field shows that several filtering recovery techniques have been applied to recover outlier traffic data [6–8]. Filtering methods include techniques such as singular value decomposition, wavelet analysis, immune algorithms, and spectrum subtraction. Filtering methods formulate the traffic volume as a time-series model and smooth the traffic waveform. These approaches recover the outlier data day by day through spectrum analysis and feature information extraction. However, the traffic data through the same location are significantly similar from day to day, and these approaches cannot utilize such a characteristic. Pei and Ma [6] show that similarity is an important factor impacting recovery performance. While the above methods consider only one mode of similarity, the recovery performance is mainly dependent on the smoothing threshold. Unfortunately, the smoothing threshold is empirically determined.
In order to improve the recovery performance and consider the multidimensional characteristics of traffic data, mining the multimode similarities will make a great contribution to recovering outlier values. Our approach is based on utilizing the multimode correlations of traffic data; that is, traffic data have different correlations on different modes, such as the week mode, day mode, and hour mode. More concretely, the feature of the proposed method is to recover the outlier value using the traffic volume information of the different modes. But the problem is not so simple, because traffic volumes from many days might be corrupted by outlier data simultaneously. In order to handle multiple outlier traffic volumes, we use a tensor to model the traffic volume.
In order to solve the traffic volume outlier data problem, we formulate the traffic volume recovery problem as a data recovery problem, based on the assumption that the essential traffic volume is of low n-rank (low rank) and the outliers are sparse. That is, the corrupted traffic volumes can be formulated as

𝒜 = 𝓛 + 𝓢,  (1)

where 𝒜 is the observed traffic volume, which is corrupted, 𝓛 is the recovered traffic volume, and 𝓢 represents the outliers. In this problem, the locations of the corrupted traffic data are unknown. One straightforward solution is optimizing the following problem under the assumption that the n-rank of 𝓛 is small and the corrupted outliers are sparse or bounded:
min_𝓛 Σ_i μ_i rank_i(𝓛)  s.t. ‖𝒜 − 𝓛 − 𝓢‖_F ≤ δ.  (2)
The tensor recovery problem (2) has been studied in recent years, as will be detailed in Section 3. In this paper, a new data recovery method based on the tensor model, called Alternating Direction Method of Multipliers for Tensor Recovery (ADMM-TR), is proposed to handle the outlier traffic volumes.
This paper makes three main contributions. (1) We use a tensor to model the traffic volume and take advantage of the multiway characteristics of tensors, which can capture the multiple correlations of different modes in traffic data. (2) We formulate the problem of traffic volume outlier data recovery as a tensor recovery problem. (3) We propose the ADMM-TR algorithm, extending ADMM from the matrix to the tensor case, to solve the formulated tensor recovery problem for traffic volume, and the convergence of ADMM-TR is proved. It should also be noted that the proposed ADMM-TR method is different from [9], which reported an extended ADM for tensor recovery. In fact, [9] presented ADM for tensor completion, in which the data are missing and the locations of the missing data are known, whereas in this research the objective is to recover data that are corrupted (including missing or noisy values) and the locations of the corrupted values are unknown.
The paper is organized as follows. We present the review of the tensor model in Section 2. Section 3 briefly reviews the related data recovery methods. In Section 4, the tensor model for traffic volume is constructed, the traffic data recovery problem is formulated, and an efficient algorithm is proposed to solve the formulation; a simple convergence guarantee for the proposed algorithm is also given. In Section 5, we evaluate the proposed method on synthetic data and real-world traffic volume data. Finally, we provide some concluding remarks in Section 6.
2. Notation and Review of Tensor Models
In this section, we adopt the nomenclature of Kolda and Bader's review on tensor decomposition [10] and partially adopt the notation in [11].
A tensor is the generalization of a matrix to higher dimensions. We denote scalars by lowercase letters (a, b, c, ...), vectors by bold lowercase letters (a, b, c, ...), and matrices by uppercase letters (A, B, C, ...). Tensors are written as calligraphic letters (𝒜, ℬ, 𝒞, ...).
An N-mode tensor is denoted as 𝒜 ∈ R^{I_1 × I_2 × ⋯ × I_N}. Its elements are denoted as a_{i_1 ⋯ i_k ⋯ i_N}, where 1 ≤ i_k ≤ I_k, 1 ≤ k ≤ N. The mode-n unfolding (also called matricization or flattening) of a tensor 𝒜 ∈ R^{I_1 × I_2 × ⋯ × I_N} is defined as unfold(𝒜, n) = A_(n). The tensor element (i_1, i_2, …, i_N) is mapped to the matrix element (i_n, j), where
j = 1 + Σ_{k=1, k≠n}^{N} (i_k − 1) J_k,  with  J_k = Π_{m=1, m≠n}^{k−1} I_m.  (3)
Therefore, A_(n) ∈ R^{I_n × J}, where J = Π_{k=1, k≠n}^{N} I_k. Accordingly, the inverse operator fold can be defined as fold(A_(n), n) = 𝒜.
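The unfold/fold pair above can be sketched in NumPy as follows. This is a minimal illustration with hypothetical function names; it uses C-order flattening, so the exact column ordering differs from the Fortran-order convention of Kolda and Bader, but the pair is still mutually inverse, which is all the later derivations require.

```python
import numpy as np

def unfold(A, n):
    # mode-n unfolding: bring axis n to the front, then flatten the rest
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def fold(M, n, shape):
    # inverse of unfold for a tensor of the given original shape
    rest = [s for k, s in enumerate(shape) if k != n]
    return np.moveaxis(M.reshape([shape[n]] + rest), 0, n)
```

For a tensor of shape (2, 3, 4), `unfold(A, 1)` yields a 3 × 8 matrix, and `fold(unfold(A, 1), 1, A.shape)` recovers the original tensor.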
The n-rank of an N-dimensional tensor 𝒜 ∈ R^{I_1 × I_2 × ⋯ × I_N}, denoted by r_n, is the rank of the mode-n unfolding matrix A_(n):

r_n = rank_n(𝒜) = rank(A_(n)).  (4)
The inner product of two same-size tensors 𝒜, ℬ ∈ R^{I_1 × I_2 × ⋯ × I_N} is defined as the sum of the products of their entries; that is,

⟨𝒜, ℬ⟩ = Σ_{i_1} Σ_{i_2} ⋯ Σ_{i_N} a_{i_1 i_2 ⋯ i_N} b_{i_1 i_2 ⋯ i_N}.  (5)

The corresponding Frobenius norm is ‖𝒜‖_F = √⟨𝒜, 𝒜⟩.
Besides, the ℓ_0 norm of a tensor 𝒜, denoted by ‖𝒜‖_0, is the number of nonzero elements in 𝒜, and the ℓ_1 norm is defined as ‖𝒜‖_1 = Σ_{i_1 ⋯ i_N} |a_{i_1 ⋯ i_N}|. It is clear that ‖𝒜‖_F = ‖A_(n)‖_F, ‖𝒜‖_0 = ‖A_(n)‖_0, and ‖𝒜‖_1 = ‖A_(n)‖_1 for any 1 ≤ n ≤ N.
The n-mode (matrix) product of a tensor 𝒜 ∈ R^{I_1 × I_2 × ⋯ × I_N} with a matrix M ∈ R^{J × I_n} is denoted by 𝒜 ×_n M and is of size I_1 × ⋯ × I_{n−1} × J × I_{n+1} × ⋯ × I_N. In terms of the flattened matrix, the n-mode product can be expressed as

𝒴 = 𝒜 ×_n M ⟺ Y_(n) = M A_(n).  (6)
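Relation (6) gives a direct way to implement the n-mode product: unfold, multiply, fold back. A minimal sketch (the helper name is ours, and the flattening order is an implementation choice that cancels out between unfold and fold):

```python
import numpy as np

def mode_n_product(A, M, n):
    # Y = A x_n M, implemented via Y_(n) = M @ A_(n) on the mode-n unfolding
    An = np.moveaxis(A, n, 0).reshape(A.shape[n], -1)   # unfold
    Yn = M @ An                                          # matrix product
    new_shape = [M.shape[0]] + [s for k, s in enumerate(A.shape) if k != n]
    return np.moveaxis(Yn.reshape(new_shape), 0, n)      # fold back

A = np.random.default_rng(0).standard_normal((4, 5, 6))
M = np.random.default_rng(1).standard_normal((3, 5))
Y = mode_n_product(A, M, 1)  # shape (4, 3, 6)
```

The result agrees with a direct tensor contraction of M against axis 1 of 𝒜.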
3. Review of Data Recovery Methods
Recently, the problem of recovering the sparse and low-rank components of a matrix, with no prior knowledge about the sparsity pattern of the sparse matrix or the rank of the low-rank matrix, has been well studied. The authors of [12] proposed the concept of "rank-sparsity incoherence" and solved the problem with an interior point solver after reformulating it as a semidefinite program. However, although interior point methods normally take very few iterations to converge, they have difficulty handling large matrices. This limitation prevents the use of the technique in computer vision and in the traffic volume recovery considered in this research.
To solve the problem for large-scale matrices, Wright et al. [13] adopted the iterative thresholding technique and obtained scalability properties. Lin et al. proposed an accelerated proximal gradient (APG) algorithm [14] and applied techniques of augmented Lagrange multipliers (ALM) [15] to solve the problem. Yuan and Yang [16] utilized the alternating direction method (ADM), which can be regarded as a practical version of the classical ALM method, to solve the matrix recovery problem. The ADM method has been proved to have a pleasing convergence speed, and the results in [16] demonstrated its excellent performance.
Inspired by the idea of [16], this paper extends the sparse and low-rank recovery problem to the tensor case, owing to the fact that multidimensional traffic data can be formulated in the form of a tensor.
4. ADMM-TR for Traffic Volume Outlier Recovery
In this section, we show the solution of problem (2). The tensor model is first constructed for traffic volume in Section 4.1. Then we present the tensor recovery problem in Section 4.2. In Section 4.3, the classical ADMM approach is introduced. In Section 4.4, we convert the original problem into a constrained convex optimization problem, which can be solved by the extended ADMM approach, and present the details of the proposed algorithm. The convergence guarantees of the proposed algorithm are also given in this section.
4.1. Tensor Model for Traffic Volume. The correlations of traffic volume data are critical for recovering corrupted traffic volume data. Traditional methods mostly exploit only part of the correlations, such as historical or temporal neighboring correlations. The classic methods usually utilize the temporal correlations of traffic data from day to day. For single-detector data, multiple correlations capture the relations of traffic data from day to day, hour to hour, and so forth. In addition, spatial correlations exist in multiple-detector data.

Table 1: The similarity coefficient of four modes.

Mode   Size      Similarity coefficient
Hour   6 × 12    0.9670
Day    7 × 288   0.8654
Week   7 × 288   0.9153
Link   4 × 288   0.8497
In this paper, a quantitative analysis of traffic data correlation is performed based on the traffic volume data downloaded from http://pems.dot.ca.gov. The correlation coefficient applied to measure the data correlation is given by [17]

s = ( Σ_{n ≥ i > j ≥ 1} R(i, j) ) / ( n(n − 1)/2 ),  (7)

where n refers to the number of data points and R(i, j) refers to the correlation coefficient matrix. Table 1 gives the resulting correlation coefficients of four modes, namely, hour, day, week, and link.
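The similarity coefficient in (7) averages the upper-triangular entries of the correlation matrix. A minimal sketch (the function name is ours; each row of X is one series being compared, e.g. one day's volume profile):

```python
import numpy as np

def similarity_coefficient(X):
    # eq. (7): average of the pairwise correlation coefficients R(i, j), i > j
    R = np.corrcoef(X)                  # n x n correlation coefficient matrix
    n = R.shape[0]
    i, j = np.triu_indices(n, k=1)      # strictly upper-triangular pairs
    return R[i, j].sum() / (n * (n - 1) / 2)
```

Perfectly correlated series give a coefficient of 1, matching the near-1 values reported in Table 1 for highly similar modes.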
Conventional methods usually use a day-to-day matrix pattern to model the traffic data. Although each mode of traffic data has a very high similarity, these methods do not utilize the multimode correlations, namely "Day × Hour", "Week × Hour", and "Link × Hour", simultaneously and thus may result in poor recovery performance.
To make full use of the multimode correlations and the traffic spatial-temporal information, the traffic data need to be constructed into a multiway data set. Fortunately, tensor-pattern-based traffic data can be well used to model the multiway traffic data. This helps keep the original structure and employ enough traffic spatial-temporal information.
4.2. The Tensor Recovery Problem. The problem of (2) is NP-hard since it is not convex. Thus we use an approximate formulation, as shown in (8):

min_{𝓛,𝓢} ‖𝓛‖_* + η‖𝓢‖_1  s.t. ‖𝒜 − 𝓛 − 𝓢‖_F ≤ δ,  (8)

where 𝒜 ∈ R^{I_1 × I_2 × ⋯ × I_N} is the given tensor to be recovered, 𝓛 ∈ R^{I_1 × I_2 × ⋯ × I_N} is the low-rank component of 𝒜, and 𝓢 ∈ R^{I_1 × I_2 × ⋯ × I_N} is the sparse component of 𝒜. Compared with (2), (8) relaxes the constraint, recovering a low-n-rank tensor from a high-dimensional data tensor despite both small entry-wise noise and gross sparse errors.
Recently, Liu et al. [18] have proposed the definition of the nuclear norm of an n-mode tensor:

‖𝒳‖_* = (1/n) Σ_{i=1}^{n} ‖X_(i)‖_*.  (9)
Based on this definition, the optimization in (8) can be written as

min_{𝓛,𝓢} ‖𝓛‖_* + η‖𝓢‖_1 ≡ (1/n) Σ_{i=1}^{n} λ_i ‖L_(i)‖_* + (1/n) Σ_{i=1}^{n} η_i ‖S_(i)‖_1
s.t. ‖𝒜 − 𝓛 − 𝓢‖_F ≤ δ.  (10)
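The tensor nuclear norm of (9), the average of the nuclear norms of all mode-i unfoldings, can be sketched as follows (the function name is ours):

```python
import numpy as np

def tensor_nuclear_norm(X):
    # eq. (9): average of the matrix nuclear norms of all mode-i unfoldings
    n = X.ndim
    total = 0.0
    for i in range(n):
        Xi = np.moveaxis(X, i, 0).reshape(X.shape[i], -1)  # mode-i unfolding
        total += np.linalg.norm(Xi, ord='nuc')             # sum of singular values
    return total / n
```

For a rank-1 tensor built from unit vectors, every unfolding has a single unit singular value, so the tensor nuclear norm is 1.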
In order to recover (𝓛, 𝓢), instead of directly solving (10), we solve the following dual problem:

min_{𝓛,𝓢} (1/(2γ)) ‖𝒜 − 𝓛 − 𝓢‖_F² + Σ_{i=1}^{n} λ_i ‖L_(i)‖_* + Σ_{i=1}^{n} η_i ‖S_(i)‖_1.  (11)
The problem in (11) is still difficult to solve due to the interdependent nuclear norm and ℓ_1 norm constraints. To simplify the problem, the formulation can be reformulated as follows:

min_{𝓛,𝓢,M_i,N_i} (1/(2γ)) ‖𝒜 − 𝓛 − 𝓢‖_F² + Σ_{i=1}^{n} λ_i ‖M_i‖_* + Σ_{i=1}^{n} η_i ‖N_i‖_1
s.t. P_i 𝓛 = M_i, P_i 𝓢 = N_i, ∀i,  (12)

where P_i is the matrix representation of mode-i unfolding (note that P_i is a permutation matrix, thus P_iᵀ P_i = I), and M_i and N_i are additional auxiliary matrices of the same size as the mode-i unfolding of 𝓛 (or 𝓢).
4.3. The Classical ADMM Approach. The classical alternating direction method of multipliers (ADMM) is for solving structured convex programs of the form

min_{x∈C_x, y∈C_y} f(x) + g(y)  s.t. Ax + By = c,  (13)

where f and g are convex functions defined on closed subsets C_x and C_y, and A, B, and c are matrices and a vector of appropriate sizes. The augmented Lagrangian function of (13) is

L_A(x, y, w) = f(x) + g(y) + ⟨w, Ax + By − c⟩ + (β/2) ‖Ax + By − c‖_2²,  (14)

where w is a Lagrangian multiplier vector and β > 0 is a penalty parameter.
The approach performs one sweep of alternating minimization with respect to x and y individually and then updates the multiplier w. At iteration k, the steps are given by [18, Equations (4.79)–(4.81)]:

x^(k+1) ← argmin_{x∈C_x} L_A(x, y^(k), w^(k)),
y^(k+1) ← argmin_{y∈C_y} L_A(x^(k+1), y, w^(k)),
w^(k+1) ← w^(k) + ρβ (Ax^(k+1) + By^(k+1) − c),  (15)
where ρ is the step length. A convergence proof for the above ADMM algorithm is as follows.

Theorem 1 (see [19, Proposition 5.2]). Assume that the optimal solution set X* of (13) is nonempty. Furthermore, assume that C_x is bounded, or else the matrix A*A is invertible. Then the sequence {x^(k), y^(k), w^(k)} generated by (15) is bounded, and every limit point of {x^(k)} is an optimal solution of the original problem (13).
4.4. ADMM Extension to Tensor Recovery. We observe that (12) is well structured in the sense that a separable structure emerges in both the objective function and the constraints. Thus we propose an algorithm based on an extension of the classical ADMM approach for solving the tensor recovery problem, taking advantage of this favorable structure.
The augmented Lagrangian of (12) is

L_A(𝓛, 𝓢, M_i, N_i)
 = (1/(2γ)) ‖𝒜 − 𝓛 − 𝓢‖_F²
 + Σ_{i=1}^{n} ( λ_i ‖M_i‖_* + ⟨Y_i, P_i 𝓛 − M_i⟩ + (α_i/2) ‖P_i 𝓛 − M_i‖_F² )
 + Σ_{i=1}^{n} ( η_i ‖N_i‖_1 + ⟨Z_i, P_i 𝓢 − N_i⟩ + (β_i/2) ‖P_i 𝓢 − N_i‖_F² ),  (16)

where Y_i, Z_i are Lagrangian multipliers and α_i, β_i > 0 are penalty parameters. We can now directly apply ADMM with this augmented Lagrangian function.
Computing M_i. The optimal M_i can be solved, with all other variables held constant, via the following subproblem:

min_{M_i} λ_i ‖M_i‖_* + ⟨Y_i, P_i 𝓛 − M_i⟩ + (α_i/2) ‖P_i 𝓛 − M_i‖_F².  (17)

As shown in [20], the optimal solution of (17) is given by

M_i = U_i D_{λ_i/α_i}(Λ) V_iᵀ,  (18)

where U_i Λ V_iᵀ is the singular value decomposition given by

U_i Λ V_iᵀ = P_i 𝓛 + Y_i/α_i,  (19)

and the "shrinkage" operator D_τ(x) with τ > 0 is defined as

D_τ(x) = { x − τ, if x > τ;  x + τ, if x < −τ;  0, otherwise.  (20)
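The shrinkage operator (20) and the singular value thresholding step (18)–(19) have short NumPy sketches (function names are ours):

```python
import numpy as np

def shrink(X, tau):
    # elementwise "shrinkage" (soft-thresholding) operator D_tau, eq. (20)
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    # singular value thresholding: SVD of X, then apply D_tau to the
    # singular values, as in eqs. (18)-(19)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt
```

For example, `shrink` maps 2.0 to 1.0 and ±0.5 to 0 under τ = 1, and `svt` with τ = 1 maps 3I to 2I.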
Computing N_i. The optimal N_i can be solved, with all other variables held constant, via the following subproblem:

min_{N_i} η_i ‖N_i‖_1 + ⟨Z_i, P_i 𝓢 − N_i⟩ + (β_i/2) ‖P_i 𝓢 − N_i‖_F².  (21)
Input: n-mode tensor 𝒜.
Parameters: α, β, γ, λ, η, ρ.
(1) Initialization: 𝓛^(0) = 𝒜, 𝓢^(0) = 0, M_i = L_(i), N_i = 0, k = 1.
(2) Repeat until convergence:
(3)   for i = 1 to n:
(4)     M_i^(k+1) = U_i D_{λ_i/α_i}(Λ) V_iᵀ, where U_i Λ V_iᵀ = P_i 𝓛^(k) + Y_i^(k)/α_i
(5)     N_i^(k+1) = D_{η_i/β_i}(P_i 𝓢^(k) + Z_i^(k)/β_i)
(6)     Y_i^(k+1) = Y_i^(k) + α_i (P_i 𝓛^(k) − M_i^(k+1))
(7)     Z_i^(k+1) = Z_i^(k) + β_i (P_i 𝓢^(k) − N_i^(k+1))
(8)   end for
(9)   𝓛^(k+1) = (𝒜 − 𝓢^(k) − γ Σ_{i=1}^{n} P_iᵀ (Y_i^(k+1) − α_i M_i^(k+1))) / (1 + γ Σ_{i=1}^{n} α_i)
(10)  𝓢^(k+1) = (𝒜 − 𝓛^(k+1) − γ Σ_{i=1}^{n} P_iᵀ (Z_i^(k+1) − β_i N_i^(k+1))) / (1 + γ Σ_{i=1}^{n} β_i)
(11)  α = ρα, β = ρβ
(12)  k = k + 1
(13) End
Output: n-mode tensors 𝓛, 𝓢.

Algorithm 1: ADMM-TR: ADMM for tensor recovery.
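Algorithm 1 can be sketched compactly in NumPy, realizing the unfolding operators P_i implicitly via reshapes. This is a minimal illustration, not the authors' implementation: function names are ours, the default λ = 1, the growth factor ρ = 1.05, and the iteration budget are assumptions, while the defaults for α, β, γ, and η follow the parameter choices reported in Section 5.

```python
import numpy as np

def unfold(A, i):
    return np.moveaxis(A, i, 0).reshape(A.shape[i], -1)

def fold(M, i, shape):
    rest = [s for k, s in enumerate(shape) if k != i]
    return np.moveaxis(M.reshape([shape[i]] + rest), 0, i)

def shrink(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def admm_tr(A, lam=None, eta=None, gamma=None, rho=1.05, tol=1e-6, max_iter=200):
    n, shape = A.ndim, A.shape
    Imax = max(shape)
    alpha = np.array([s / Imax for s in shape], dtype=float)  # Section 5 choice
    beta = alpha.copy()
    gamma = 1.0 / alpha.sum() if gamma is None else gamma      # Section 5 choice
    lam = np.ones(n) if lam is None else lam                   # assumed default
    eta = np.full(n, 1.0 / np.sqrt(Imax)) if eta is None else eta
    L, S = A.copy(), np.zeros_like(A)
    M = [unfold(L, i) for i in range(n)]
    N = [np.zeros_like(M[i]) for i in range(n)]
    Y = [np.zeros_like(M[i]) for i in range(n)]
    Z = [np.zeros_like(M[i]) for i in range(n)]
    for _ in range(max_iter):
        L_prev, S_prev = L.copy(), S.copy()
        for i in range(n):
            # steps (4)-(7): auxiliary matrices and multipliers
            M[i] = svt(unfold(L, i) + Y[i] / alpha[i], lam[i] / alpha[i])
            N[i] = shrink(unfold(S, i) + Z[i] / beta[i], eta[i] / beta[i])
            Y[i] = Y[i] + alpha[i] * (unfold(L, i) - M[i])
            Z[i] = Z[i] + beta[i] * (unfold(S, i) - N[i])
        # steps (9)-(10): closed-form L and S updates, eqs. (24) and (26)
        L = (A - S - gamma * sum(fold(Y[i] - alpha[i] * M[i], i, shape)
                                 for i in range(n))) / (1 + gamma * alpha.sum())
        S = (A - L - gamma * sum(fold(Z[i] - beta[i] * N[i], i, shape)
                                 for i in range(n))) / (1 + gamma * beta.sum())
        alpha *= rho
        beta *= rho
        # stopping rule: change in L and S between successive iterations
        if max(np.linalg.norm(L - L_prev), np.linalg.norm(S - S_prev)) < tol:
            break
    return L, S
```

The returned pair (𝓛, 𝓢) approximately separates the low-n-rank and sparse parts of the input tensor.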
By the well-known ℓ_1 minimization [21], the optimal solution of (21) is

N_i = D_{η_i/β_i}(P_i 𝓢 + Z_i/β_i),  (22)

where D_τ is the "shrinkage" operator.
Computing 𝓛. Now we fix all variables except 𝓛 and minimize L_A over 𝓛. The resulting problem is the minimization of a quadratic function:

min_𝓛 L_A(𝓛) = (1/(2γ)) ‖𝒜 − 𝓛 − 𝓢‖_F² + Σ_{i=1}^{n} ( ⟨Y_i, P_i 𝓛 − M_i⟩ + (α_i/2) ‖P_i 𝓛 − M_i‖_F² ).  (23)

The objective function is differentiable, so the minimizer 𝓛_min is characterized by ∂L_A(𝓛)/∂𝓛 = 0. Thus we obtain

𝓛_min = (𝒜 − 𝓢 − γ Σ_{i=1}^{n} P_iᵀ (Y_i − α_i M_i)) / (1 + γ Σ_{i=1}^{n} α_i).  (24)
Computing 𝓢. Now we fix all variables except 𝓢 and minimize L_A over 𝓢. The resulting problem is again the minimization of a quadratic function:

min_𝓢 L_A(𝓢) = (1/(2γ)) ‖𝒜 − 𝓛 − 𝓢‖_F² + Σ_{i=1}^{n} ( ⟨Z_i, P_i 𝓢 − N_i⟩ + (β_i/2) ‖P_i 𝓢 − N_i‖_F² ).  (25)

The objective function is also differentiable, so the minimizer 𝓢_min is characterized by ∂L_A(𝓢)/∂𝓢 = 0. Thus we have

𝓢_min = (𝒜 − 𝓛 − γ Σ_{i=1}^{n} P_iᵀ (Z_i − β_i N_i)) / (1 + γ Σ_{i=1}^{n} β_i).  (26)
For comparison with RSTD [22], we also choose the difference of 𝓛 and 𝓢 in successive iterations, checked against a certain tolerance, as the stopping criterion. The pseudocode of the proposed ADMM-TR algorithm is summarized in Algorithm 1.
Theorem 2 (convergence of ADMM-TR). Assume that the optimal solution set X* of (11) is nonempty. The sequence {𝓛^(k), 𝓢^(k), M_i^(k), N_i^(k), Y_i^(k), Z_i^(k)} generated by the proposed ADMM-TR algorithm is bounded, and every limit point of {𝓛^(k), 𝓢^(k)} is an optimal solution of the original problem (11).

Proof. We check the assumptions of Theorem 1. C_x is not bounded, but P_i* P_i = I is a constant multiple of the identity operator. Thus Theorem 1 can also be applied to ADMM-TR, and Theorem 2 follows.
5. Numerical Experiments
This section evaluates the empirical performance of the proposed algorithm on synthetic data and compares the results with RSTD (Rank Sparsity Tensor Decomposition) [22]. Also, experiments on traffic volume data outlier recovery illustrate the efficiency of the proposed method in the traffic research field.
We use the Lanczos algorithm for computing the singular value decomposition and adopt the same rule for predicting
Table 2: Comparison of ADMM-TR with RSTD for synthetic data, where 𝒜_0 ∈ R^{40×40×40}, n-rank = [5, 5, 5].

        ADMM-TR                                          RSTD
spr     RSE L_0 (e-3)  RSE S_0 (e-3)  iter  Time (s)     RSE L_0 (e-3)  RSE S_0 (e-3)  iter  Time (s)
0.05    4.3            4.3            133   13.7         4.6            4.6            208   17.8
0.15    9.8            4.2            162   20.8         10.5           4.7            235   33.6
0.25    12.3           4.3            514   44.9         15.9           5.2            676   51.0
0.35    56.5           9.5            654   53.3         67.5           11.4           737   57.5
Table 3: Comparison of ADMM-TR with RSTD for synthetic data, where 𝒜_0 ∈ R^{40×40×40}, n-rank = [10, 10, 10].

        ADMM-TR                                          RSTD
spr     RSE L_0 (e-3)  RSE S_0 (e-3)  iter  Time (s)     RSE L_0 (e-3)  RSE S_0 (e-3)  iter  Time (s)
0.05    4.4            2.0            236   26.4         4.7            2.5            338   29.9
0.15    4.7            2.1            417   42.1         5.2            2.5            603   50.8
0.25    8.9            2.9            664   60.8         12.0           4.2            663   53.6
0.35    17.3           5.2            981   80.6         21.1           6.5            1110  87.7
the dimension of the principal singular space as in [22]. The parameters are set as α = β = [I_1/I_max, I_2/I_max, …, I_n/I_max]ᵀ and γ = 1/sum([I_1/I_max, I_2/I_max, …, I_n/I_max]ᵀ) for all experiments, where I_max = max{I_i}; η is set to 1/√I_max, as suggested in [23].

All the experiments are conducted and timed on the same desktop with a Pentium (R) Dual-Core 2.50 GHz CPU and 4 GB memory, running Windows 7 and MATLAB.
5.1. Synthetic Data. ADMM-TR and RSTD are tested on synthetic data of size 40 × 40 × 40. We generate a "core tensor" 𝒞 ∈ R^{r × ⋯ × r} of dimension r, which we fill with Gaussian distributed entries (∼ N(0, 1)). Then we generate matrices U^(1), …, U^(N), with U^(i) ∈ R^{n_i × r}, whose elements are also i.i.d. Gaussian random variables (∼ N(0, 1)), and set

𝓛_0 = 𝒞 ×_1 U^(1) ×_2 ⋯ ×_N U^(N).  (27)

The entries of the sparse tensor 𝓢_0 are independently distributed, each taking on the value 0 with probability 1 − spr and an impulsive value with probability spr. We apply the proposed algorithm to the tensor 𝒜_0 = 𝓛_0 + 𝓢_0 to recover 𝓛 and 𝓢 and compare with RSTD. For these experiments, two cases of n-rank are investigated: n-rank = [5, 5, 5] and n-rank = [10, 10, 10]. Tables 2 and 3 present the average results (across 30 instances) for different spr.
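The synthetic construction of (27) can be sketched as follows. This is an illustrative sketch: the helper name is ours, and the scale of the impulsive values is an assumption, since the text does not specify their distribution.

```python
import numpy as np

def mode_n_product(T, M, n):
    # T x_n M: contract matrix M against mode n of tensor T
    Tn = np.moveaxis(T, n, 0)
    out = np.tensordot(M, Tn, axes=([1], [0]))
    return np.moveaxis(out, 0, n)

rng = np.random.default_rng(0)
r, dims, spr = 5, (40, 40, 40), 0.05

# core tensor C with i.i.d. N(0, 1) entries, eq. (27)
C = rng.standard_normal((r,) * len(dims))
L0 = C
for n, I in enumerate(dims):
    L0 = mode_n_product(L0, rng.standard_normal((I, r)), n)

# sparse outlier tensor: nonzero with probability spr
# (impulsive values drawn as 10 * N(0, 1) here -- scale assumed)
mask = rng.random(dims) < spr
S0 = np.where(mask, rng.standard_normal(dims) * 10.0, 0.0)

A0 = L0 + S0  # observed tensor, n-rank = [5, 5, 5] plus sparse corruption
```

By construction, every mode unfolding of 𝓛_0 has rank at most r = 5, and roughly a fraction spr of the entries of 𝓢_0 are nonzero.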
The quality of recovery is measured by the relative square error (RSE) to 𝓛_0 and 𝓢_0, defined to be

RSE L_0 = ‖𝓛 − 𝓛_0‖_F / ‖𝓛_0‖_F,  RSE S_0 = ‖𝓢 − 𝓢_0‖_F / ‖𝓢_0‖_F.  (28)
Tables 2 and 3 show that the proposed algorithm (ADMM-TR) is about 10 percent faster than the RSTD method proposed in [22] and achieves better accuracy in terms of relative square error. Though both algorithms compute an SVD per iteration, we observe that the proposed algorithm takes far fewer iterations than RSTD to converge to the optimal solution.
The more impulsive entries are added, that is, the higher the value of spr, the harder the tensor recovery problem becomes. In addition, the problem becomes more difficult when the n-rank is higher for ground truth tensors of the same size. In Tables 2 and 3, different spr values are set for the two tensor cases, and the results show that the recovery accuracy decreases as spr grows in each case. In particular, the recovery accuracy for the tensor 𝒜_0 ∈ R^{40×40×40} with n-rank = [5, 5, 5] decreases sharply when spr is up to 40%, while the phenomenon occurs for the tensor 𝒜_0 ∈ R^{40×40×40} with n-rank = [10, 10, 10] when spr is about 25%. This is because a relatively low rank and a low sparse ratio are preconditions of the tensor recovery problem.
5.2. Traffic Volume Data. To evaluate the performance of the proposed method in traffic volume data outlier recovery, a complete traffic volume data set is used as the test data set. We use the data of a fixed point in Sacramento County, which is downloaded from http://pems.dot.ca.gov. The traffic volume data are recorded every 5 minutes. Therefore, a daily traffic volume series for a loop detector contains 288 records, and the whole period of the data lasts for 16 days, that is, from August 2 to August 17, 2010.
Based on the multiple correlations of the traffic volume data, we model the data set as a tensor of size 16 × 24 × 12, which stands for 16 days, 24 hours in a day, and 12 sample intervals (i.e., 5-minute records) per hour. The ratios of outlier data are set from 5% to 15%, and the outlier data are produced randomly. All the results are averaged over 10 instances.
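Arranging the 5-minute series into the 16 × 24 × 12 tensor is a single reshape of the 16 × 288 = 4608 consecutive counts. A minimal sketch (the counts are simulated here; in practice they would come from the PeMS download described above):

```python
import numpy as np

# One count per 5-min interval: 16 days x 288 samples/day, in time order
# (simulated Poisson volumes stand in for the real detector data).
counts = np.random.default_rng(0).poisson(400, size=16 * 288).astype(float)

# day x hour x 5-min-interval tensor, matching the 16 x 24 x 12 model
A = counts.reshape(16, 24, 12)
```

With row-major ordering, entry A[d, h, m] is the count for the m-th 5-minute interval of hour h on day d.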
Figure 1: Comparisons with raw traffic volume data, data corrupted by outliers with 5% ratio, and data recovered by ADMM-TR (traffic volume in veh/5 min versus time interval in 5 min; curves: raw data, outlier data, recovered data).
Figure 2: Comparisons with raw traffic volume data, data corrupted by outliers with 10% ratio, and data recovered by ADMM-TR (traffic volume in veh/5 min versus time interval in 5 min; curves: raw data, outlier data, recovered data).
For the real-world data, we mainly pay attention to the correctness of the recovered traffic volume data. Thus the quality of recovery is measured by the relative square error (RSE) to 𝓛_0 and the Mean Absolute Percentage Error (MAPE) to 𝓛_0, defined to be

RSE L_0 = ‖𝓛 − 𝓛_0‖_F / ‖𝓛_0‖_F,
MAPE L_0 = (1/M) Σ_{m=1}^{M} | (t_r^(m) − t_e^(m)) / t_r^(m) |,  (29)
Figure 3: Comparison of the raw traffic volume data, the data corrupted by outliers with a 15% ratio, and the data recovered by ADMM-TR (traffic volume (veh/5 min) versus time interval (5 min); series shown: raw data, outlier data, recovered data).
Table 4: Traffic volume outlier data before and after being recovered.

spr     Before recovery                  After recovery
        RSE L0 (e-3)  MAPE L0 (e-3)      RSE L0 (e-3)  MAPE L0 (e-3)
0.05    0.5005        0.2136             0.0961        0.0314
0.10    0.7066        0.5008             0.1417        0.1027
0.15    0.8850        0.8777             0.2221        0.2351
where $t^{(m)}_r$ and $t^{(m)}_e$ are the $m$th elements, standing for the known real value and the recovered value, respectively, and $M$ denotes the number of recovered traffic volumes.
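A direct implementation of the two error measures in (29) might look as follows (a sketch assuming numpy; the function names are ours, not from the paper):

```python
import numpy as np

def rse(L, L0):
    # Relative square error: ||L - L0||_F / ||L0||_F.
    # np.linalg.norm on a flattened array gives the Frobenius norm.
    return np.linalg.norm(L - L0) / np.linalg.norm(L0)

def mape(t_r, t_e):
    # Mean absolute percentage error between real values t_r and estimates t_e.
    t_r, t_e = np.asarray(t_r, float), np.asarray(t_e, float)
    return np.mean(np.abs((t_r - t_e) / t_r))

t_r = np.array([100.0, 200.0, 400.0])
t_e = np.array([110.0, 190.0, 400.0])
print(mape(t_r, t_e))  # (0.1 + 0.05 + 0.0) / 3 = 0.05
```

Note that MAPE is only defined where the real values are nonzero, which holds for nontrivial traffic volume counts.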
Table 4 presents the relative errors of traffic volume outlier data before and after being recovered by ADMM-TR. The results show that the RSE $\mathcal{L}_0$ and MAPE $\mathcal{L}_0$ for traffic volume data corrupted by outliers are about 5 times those of the data after being recovered by ADMM-TR. Figures 1, 2, and 3 present the profiles of traffic volume data for a day. The results show that ADMM-TR can recover the traffic volume outlier data with excellent performance.
6. Conclusions

In this paper, we concentrated on the mathematical problem in traffic volume outlier data recovery and proposed a novel tensor recovery method based on the alternating direction method of multipliers (ADMM). The proposed algorithm can automatically separate the low-$n$-rank tensor data and the sparse part. The experiments show that the proposed method is more stable and accurate in most cases and has an excellent convergence rate. Experiments on real world traffic volume data demonstrate the practicability and effectiveness of the proposed method in the traffic research domain.
In the future, we would like to investigate how to automatically choose the parameters in our algorithm and explore additional applications of our method in the traffic research domain.
Acknowledgments
This research was supported by NSFC (Grants nos. 61271376, 61171118, and 91120015) and the Beijing Natural Science Foundation (4122067). The authors would like to thank Professor Bin Ran from the University of Wisconsin-Madison and Yong Li from the University of Notre Dame for the suggestive discussions.
References
[1] K. Tamura and M. Hirayama, "Toward realization of VICS—vehicle information and communications system," in Proceedings of the IEEE-IEE Vehicle Navigation and Information Systems Conference, pp. 72–77, October 1993.
[2] S. Kim, M. E. Lewis, and C. C. White III, "Optimal vehicle routing with real-time traffic information," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 178–188, 2005.
[3] P. Mirchandani and L. Head, "A real-time traffic signal control system: architecture, algorithms, and analysis," Transportation Research Part C, vol. 9, no. 6, pp. 415–432, 2001.
[4] B. M. Williams, P. K. Durvasula, and D. E. Brown, "Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models," Transportation Research Record, no. 1644, pp. 132–141, 1998.
[5] J. Xu, X. Li, and H. Shi, "Short-term traffic flow forecasting model under missing data," Journal of Computer Applications, vol. 30, no. 4, pp. 1117–1120, 2010.
[6] Y. Pei and J. Ma, "Real-time traffic data screening and reconstruction," China Civil Engineering Journal, vol. 36, no. 7, pp. 78–83, 2003.
[7] S. Chen, W. Wang, and W. Li, "Noise recognition and noise reduction of real-time traffic data," Dongnan Daxue Xuebao, vol. 36, no. 2, pp. 322–325, 2006.
[8] B.-G. Daniel, J. D.-P. Francisco, G.-O. David, et al., "Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks," Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 7, pp. 530–545, 2010.
[9] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, Article ID 025010, 2011.
[10] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
[11] A. S. Lewis and G. Knowles, "Image compression using the 2-D wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 244–250, 1992.
[12] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
[13] J. Wright, Y. Peng, Y. Ma, and A. Ganesh, "Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization," in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080–2088, Vancouver, Canada, 2009.
[14] A. Ganesh, Z. Lin, J. Wright, L. Wu, M. Chen, and Y. Ma, "Fast algorithms for recovering a corrupted low-rank matrix," in Proceedings of the 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '09), pp. 213–216, December 2009.
[15] Z. Lin, M. Chen, L. Wu, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Mathematical Programming, 2009.
[16] X. Yuan and J. Yang, "Sparse and low-rank matrix decomposition via alternating direction methods," Tech. Rep., Department of Mathematics, Hong Kong Baptist University, 2009.
[17] L. Qu, Y. Zhang, J. Hu, L. Jia, and L. Li, "A BPCA based missing value imputing method for traffic flow volume data," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 985–990, June 2008.
[18] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in Proceedings of the International Conference on Computer Vision (ICCV '09), 2009.
[19] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Prentice-Hall, 1989.
[20] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[21] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for ℓ1-minimization: methodology and convergence," SIAM Journal on Optimization, vol. 19, no. 3, pp. 1107–1130, 2008.
[22] Y. Li, J. Yan, Y. Zhou, and J. Yang, "Optimum subspace learning and error correction for tensors," in Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Crete, Greece, 2010.
[23] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," Journal of the ACM, vol. 58, no. 3, article 11, 2011.
model and smooth the traffic waveform. These approaches recover the outlier data day by day through spectrum analysis and feature information extraction. However, the traffic data through the same location are significantly similar from day to day, and these approaches cannot utilize this characteristic. Pei and Ma [6] show that similarity is an important factor impacting recovery performance. While the above methods consider only one-mode similarity, the recovery performance is mainly dependent on the smoothing threshold. Unfortunately, the smoothing threshold is empirically determined.
In order to improve the recovery performance and consider the multidimensional characteristic of traffic data, mining the multimode similarities will make a great contribution to recovering outlier values. Our approach is based on utilizing the multimode correlations of traffic data; that is, traffic data have different correlations on different modes, such as the week mode, day mode, and hour mode. More concretely, the feature of the proposed method is to recover the outlier value using the traffic volume information of the different modes. But the problem is not so simple, because traffic volumes from many days might be corrupted by outlier data simultaneously. In order to handle multiple outlier traffic volumes, we use a tensor to model the traffic volume.
In order to solve the traffic volume outlier data problem, we formulate the traffic volume recovery problem as a data recovery problem, based on the assumption that the essential traffic volume is low-$n$-rank and the outliers are sparse. That is, the corrupted traffic volume can be formulated as

$$ \mathcal{A} = \mathcal{L} + \mathcal{S}, \tag{1} $$

where $\mathcal{A}$ is the observed (corrupted) traffic volume, $\mathcal{L}$ is the recovered traffic volume, and $\mathcal{S}$ represents the outliers. In this problem, the locations of the corrupted traffic data are unknown. One straightforward solution is to optimize the following problem, under the assumption that the $n$-rank of $\mathcal{L}$ is small and the corrupted outliers are sparse or bounded:

$$ \min_{\mathcal{L}} \sum_{i} \mu_i \operatorname{rank}_i(\mathcal{L}) \quad \text{s.t. } \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F \le \delta. \tag{2} $$
The tensor recovery problem of (2) has been studied in recent years, as will be detailed in Section 3. In this paper, a new data recovery method based on a tensor model, called Alternating Direction Method of Multipliers for Tensor Recovery (ADMM-TR), is proposed to handle the outlier traffic volumes.

This paper makes three main contributions. (1) We use a tensor to model the traffic volume and take advantage of the multiway characteristics of tensors, which can explore the multicorrelations of different modes in traffic data. (2) We formulate the problem of traffic volume outlier data recovery as a tensor recovery problem. (3) We propose the ADMM-TR algorithm, by extending ADMM from the matrix to the tensor case, to solve the formulated tensor recovery problem for traffic volume, and the convergence of ADMM-TR is proved. It should also be noted that the proposed ADMM-TR method is different from [9], which reported an extended ADM for tensor recovery. In fact, [9] presented ADM for tensor completion, in which the data are missing and the locations of the missing data are known, while in this research the objective is to recover data which are corrupted (including missing or noised) and whose locations are unknown.
The paper is organized as follows. We review the tensor model in Section 2. Section 3 briefly reviews related data recovery methods. In Section 4, the tensor model for traffic volume is constructed, the traffic data recovery problem is formulated, and an efficient algorithm is proposed to solve the formulation; a simple convergence guarantee for the proposed algorithm is also given. In Section 5, we evaluate the proposed method on synthetic data and real world traffic volume data. Finally, we provide some concluding remarks in Section 6.
2. Notation and Review of Tensor Models

In this section, we adopt the nomenclature of Kolda and Bader's review on tensor decomposition [10] and partially adopt the notation in [11].

A tensor is the generalization of a matrix to higher dimensions. We denote scalars by lowercase letters ($a, b, c, \ldots$), vectors as bold lowercase letters ($\mathbf{a}, \mathbf{b}, \mathbf{c}, \ldots$), matrices as uppercase letters ($A, B, C, \ldots$), and tensors as calligraphic letters ($\mathcal{A}, \mathcal{B}, \mathcal{C}, \ldots$).
An $N$-mode tensor is denoted as $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$. Its elements are denoted as $a_{i_1 \cdots i_k \cdots i_N}$, where $1 \le i_k \le I_k$ for $1 \le k \le N$. The mode-$n$ unfolding (also called matricization or flattening) of a tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is defined as $\operatorname{unfold}(\mathcal{A}, n) = A_{(n)}$. The tensor element $(i_1, i_2, \ldots, i_N)$ is mapped to the matrix element $(i_n, j)$, where

$$ j = 1 + \sum_{k=1,\, k \ne n}^{N} (i_k - 1) J_k, \quad \text{with } J_k = \prod_{m=1,\, m \ne n}^{k-1} I_m. \tag{3} $$

Therefore, $A_{(n)} \in \mathbb{R}^{I_n \times J}$, where $J = \prod_{k=1,\, k \ne n}^{N} I_k$. Accordingly, its inverse operator fold can be defined as $\operatorname{fold}(A_{(n)}, n) = \mathcal{A}$.
The $n$-rank of an $N$-dimensional tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, denoted by $r_n$, is the rank of the mode-$n$ unfolding matrix $A_{(n)}$:

$$ r_n = \operatorname{rank}_n(\mathcal{A}) = \operatorname{rank}(A_{(n)}). \tag{4} $$
The inner product of two same-size tensors $\mathcal{A}, \mathcal{B} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is defined as the sum of the products of their entries, that is,

$$ \langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1} \sum_{i_2} \cdots \sum_{i_N} a_{i_1 i_2 \cdots i_N}\, b_{i_1 i_2 \cdots i_N}. \tag{5} $$

The corresponding Frobenius norm is $\|\mathcal{A}\|_F = \sqrt{\langle \mathcal{A}, \mathcal{A} \rangle}$. Besides, the $\ell_0$ norm of a tensor $\mathcal{A}$, denoted by $\|\mathcal{A}\|_0$, is the number of nonzero elements in $\mathcal{A}$, and the $\ell_1$ norm is defined as $\|\mathcal{A}\|_1 = \sum_{i_1 \cdots i_N} |a_{i_1 \cdots i_N}|$. It is clear that $\|\mathcal{A}\|_F = \|A_{(n)}\|_F$, $\|\mathcal{A}\|_0 = \|A_{(n)}\|_0$, and $\|\mathcal{A}\|_1 = \|A_{(n)}\|_1$ for any $1 \le n \le N$.
The $n$-mode (matrix) product of a tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with a matrix $M \in \mathbb{R}^{J \times I_n}$ is denoted by $\mathcal{A} \times_n M$ and is of size $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$. In terms of the flattened matrix, the $n$-mode product can be expressed as

$$ \mathcal{Y} = \mathcal{A} \times_n M \Longleftrightarrow Y_{(n)} = M A_{(n)}. \tag{6} $$
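For concreteness, the unfold/fold operators and the $n$-mode product of (6) can be sketched as follows (a numpy sketch; the exact column ordering of an unfolding differs between conventions, but the identities used in the paper hold for any fixed consistent ordering):

```python
import numpy as np

def unfold(A, n):
    # Mode-n unfolding: mode n becomes the rows, the remaining modes the columns.
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def fold(M, n, shape):
    # Inverse of unfold for a tensor of the given shape.
    full = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(full), 0, n)

def mode_n_product(A, M, n):
    # A x_n M, computed via the flattened identity Y_(n) = M A_(n) from (6).
    shape = list(A.shape)
    shape[n] = M.shape[0]
    return fold(M @ unfold(A, n), n, tuple(shape))

A = np.random.default_rng(1).standard_normal((3, 4, 5))
M = np.ones((2, 4))
assert unfold(A, 1).shape == (4, 15)                      # I_2 x (I_1 * I_3)
assert np.allclose(fold(unfold(A, 1), 1, A.shape), A)     # fold inverts unfold
assert mode_n_product(A, M, 1).shape == (3, 2, 5)
```

The mode-$n$ rank $r_n$ of (4) is then simply `np.linalg.matrix_rank(unfold(A, n))`.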
3. Review of Data Recovery Methods

Recently, the problem of recovering the sparse and low-rank components with no prior knowledge about the sparsity pattern of the sparse matrix or the rank of the low-rank matrix has been well studied. The authors of [12] proposed the concept of "rank-sparsity incoherence" and solved the problem with an interior point solver after reformulating it as a semidefinite program. However, although interior point methods normally take very few iterations to converge, they have difficulty handling large matrices; this limitation prevented the usage of the technique in computer vision and in the traffic volume recovery of this research.

To solve the problem for large scale matrices, Wright et al. [13] adopted the iterative thresholding technique and obtained scalability properties. Lin et al. proposed an accelerated proximal gradient (APG) algorithm [14] and applied techniques of augmented Lagrange multipliers (ALM) [15] to solve the problem. Yuan and Yang [16] utilized the alternating direction method (ADM), which can be regarded as a practical version of the classical ALM method, to solve the matrix recovery problem. The ADM method has been proved to have a pleasing convergence speed, and the results in [16] demonstrated its excellent performance.

Inspired by the idea of [16], this paper extends the sparse and low-rank recovery problem to the tensor case, since the multidimensional traffic data can be formulated in the form of a tensor.
4. ADMM-TR for Traffic Volume Outlier Recovery

In this section, we show the solution of problem (2). The tensor model is first constructed for traffic volume in Section 4.1. Then we present the tensor recovery problem in Section 4.2. In Section 4.3, the classical ADMM approach is introduced. In Section 4.4, we convert the original problem into a constrained convex optimization problem, which can be solved by the extended ADMM approach, and present the details of the proposed algorithm; the convergence guarantees of the proposed algorithm are also given in this section.
4.1. Tensor Model for Traffic Volume. The correlations of traffic volume data are critical for recovering corrupted traffic volume data. Traditional methods mostly exploit only part of the correlations, such as historical or temporal neighboring correlations; the classic methods usually utilize the temporal correlations of traffic data from day to day. For single-detector data, multiple correlations contain the relations of traffic data from day to day, hour to hour, and so forth. In addition, spatial correlations exist in multiple-detector data.

Table 1: The similarity coefficients of four modes.

Mode   Size      Similarity coefficient
Hour   6 × 12    0.9670
Day    7 × 288   0.8654
Week   7 × 288   0.9153
Link   4 × 288   0.8497
In this paper, quantitative analysis of traffic data correlation is performed based on traffic volume data downloaded from http://pems.dot.ca.gov. The correlation coefficient applied to measuring the data correlation is given by [17]:

$$ s = \frac{\sum_{n \ge i > j \ge 1} R(i,j)}{n(n-1)/2}, \tag{7} $$

where $n$ refers to the number of data points and $R(i,j)$ refers to the correlation coefficient matrix. Table 1 gives the results of the correlation coefficients of four modes, which are hour, day, week, and link.
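A small sketch of (7), assuming numpy, with each row of `X` holding one of the series being compared:

```python
import numpy as np

def similarity_coefficient(X):
    # X: data matrix with one series per row (e.g., one day per row).
    # s averages the strictly upper-triangular entries of the correlation
    # matrix R, matching s = sum_{i>j} R(i,j) / (n(n-1)/2) in (7).
    R = np.corrcoef(X)
    n = R.shape[0]
    iu = np.triu_indices(n, k=1)
    return R[iu].sum() / (n * (n - 1) / 2)

# Two identical series and one scaled copy: all pairwise correlations are 1,
# so s should be 1 up to floating point error.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
print(similarity_coefficient(X))  # -> 1.0 (up to floating point)
```

Values of $s$ near 1, as in Table 1, indicate that the series on that mode are highly redundant, which is exactly what the tensor model exploits.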
Conventional methods usually use a day-to-day matrix pattern to model the traffic data. Although each mode of traffic data has a very high similarity, these methods do not utilize the multimode correlations, namely "Day × Hour", "Week × Hour", and "Link × Hour", simultaneously, and thus may result in poor recovery performance.

To make full use of the multimode correlations and the traffic spatial-temporal information, traffic data need to be constructed into a multiway data set. Fortunately, tensor-pattern-based traffic data can model the multiway traffic data well. This helps keep the original structure and employ enough traffic spatial-temporal information.
4.2. The Tensor Recovery Problem. The problem of (2) is NP-hard since it is not convex. Thus we use the approximate formulation shown in (8):

$$ \min_{\mathcal{L},\mathcal{S}} \|\mathcal{L}\|_* + \eta \|\mathcal{S}\|_1 \quad \text{s.t. } \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F \le \delta, \tag{8} $$

where $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the given tensor to be recovered, $\mathcal{L} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the low-rank component of $\mathcal{A}$, and $\mathcal{S} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the sparse component of $\mathcal{A}$. Compared with (2), (8) relaxes the constraint, recovering a low-$n$-rank tensor from a high-dimensional data tensor despite both small entry-wise noise and gross sparse errors.
Recently, Liu et al. [18] proposed the following definition of the nuclear norm of an $n$-mode tensor:

$$ \|\mathcal{X}\|_* = \frac{1}{n} \sum_{i=1}^{n} \|X_{(i)}\|_*. \tag{9} $$
Based on this definition, the optimization in (8) can be written as

$$ \min_{\mathcal{L},\mathcal{S}} \|\mathcal{L}\|_* + \eta \|\mathcal{S}\|_1 \equiv \frac{1}{n} \sum_{i=1}^{n} \lambda_i \|L_{(i)}\|_* + \frac{1}{n} \sum_{i=1}^{n} \eta_i \|S_{(i)}\|_1 \quad \text{s.t. } \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F \le \delta. \tag{10} $$
In order to recover $(\mathcal{L}, \mathcal{S})$, instead of directly solving (10) we solve the following dual problem:

$$ \min_{\mathcal{L},\mathcal{S}} \frac{1}{2\gamma} \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n} \lambda_i \|L_{(i)}\|_* + \sum_{i=1}^{n} \eta_i \|S_{(i)}\|_1. \tag{11} $$
The problem in (11) is still difficult to solve due to the interdependent nuclear norm and $\ell_1$ norm terms. To simplify the problem, the formulation can be rewritten as follows:

$$ \min_{\mathcal{L},\mathcal{S},M_i,N_i} \frac{1}{2\gamma} \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n} \lambda_i \|M_i\|_* + \sum_{i=1}^{n} \eta_i \|N_i\|_1 \quad \text{s.t. } P_i \mathcal{L} = M_i,\ P_i \mathcal{S} = N_i\ \ \forall i, \tag{12} $$
where $P_i$ is the matrix representation of the mode-$i$ unfolding (note that $P_i$ is a permutation matrix, thus $P_i^T P_i = I$), and $M_i$ and $N_i$ are additional auxiliary matrices of the same size as the mode-$i$ unfolding of $\mathcal{L}$ (or $\mathcal{S}$).
4.3. The Classical ADMM Approach. The classical alternating direction method of multipliers (ADMM) is for solving structured convex programs of the form

$$ \min_{x \in C_x,\, y \in C_y} f(x) + g(y) \quad \text{s.t. } Ax + By = c, \tag{13} $$

where $f$ and $g$ are convex functions defined on closed subsets $C_x$ and $C_y$, and $A$, $B$, and $c$ are matrices and a vector of appropriate sizes. The augmented Lagrangian function of (13) is

$$ L_A(x, y, w) = f(x) + g(y) + \langle w, Ax + By - c \rangle + \frac{\beta}{2} \|Ax + By - c\|_2^2, \tag{14} $$

where $w$ is a Lagrangian multiplier vector and $\beta > 0$ is a penalty parameter.
The approach performs one sweep of alternating mini-mization with respect to 119909 and 119910 individually then updatesthe multiplier 119908 at the iteration 119896 the steps are given by [18Equations (479)ndash(481)]
119909(119896+1)larr997888 arg min
119909isin119862119909
119871119860(119909 119910(119896) 119908(119896))
119910(119896+1)larr997888 argmin
119909isin119862119910
119871119860(119909(119896+1) 119910 119908(119896))
119908(119896+1)larr997888 119908
(119896)+ 120588120573 (119860119909
(119896+1)+ 119861119910(119896+1)minus 119888)
(15)
where 120588 is the step length A convergence proof for the aboveADMM algorithm was shown as follows
Theorem 1 (See [19 Proposition 52]) Assume that theoptimal solution set 119883lowast of (13) is nonempty Furthermoreassume that119862
119909is bounded or else the matrix119860lowast119860 is invertible
Then a sequence 119909(119896) 119910(119896) 119908(119896) generated by (15) is boundedand every limit point of 119909(119896) is an optimal solution of theoriginal of problem (13)
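To make the three steps of (15) concrete, the toy sketch below applies the same alternating pattern to a simple lasso problem, $\min_x \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$, with the splitting $x = z$. This is only an illustrative example of classical ADMM, not the paper's ADMM-TR:

```python
import numpy as np

def soft(v, tau):
    # Entrywise shrinkage (soft-thresholding) operator.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(A, b, lam, beta=1.0, iters=200):
    # ADMM for min_x 0.5*||Ax - b||^2 + lam*||x||_1,
    # split as f(x) = 0.5||Ax - b||^2, g(z) = lam*||z||_1, constraint x - z = 0.
    n = A.shape[1]
    x = z = w = np.zeros(n)
    G = np.linalg.inv(A.T @ A + beta * np.eye(n))  # cached x-update factor
    for _ in range(iters):
        x = G @ (A.T @ b - w + beta * z)      # x-minimization step
        z = soft(x + w / beta, lam / beta)    # z-minimization step (shrinkage)
        w = w + beta * (x - z)                # multiplier update
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [3.0, -2.0]
b = A @ x_true
x_hat = admm_lasso(A, b, lam=0.1)
print(np.round(x_hat, 1))  # close to x_true; small coefficients shrunk to 0
```

Each iteration solves one smooth quadratic subproblem in closed form and one nonsmooth subproblem by shrinkage, which is exactly the structure ADMM-TR exploits for the tensor case.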
4.4. ADMM Extension to Tensor Recovery. We observe that (12) is well structured, in the sense that a separable structure emerges in both the objective function and the constraints. Thus we propose an algorithm based on an extension of the classical ADMM approach for solving the tensor recovery problem, taking advantage of this favorable structure.

The augmented Lagrangian of (12) is

$$ L_A(\mathcal{L}, \mathcal{S}, M_i, N_i) = \frac{1}{2\gamma} \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n} \left( \lambda_i \|M_i\|_* + \langle Y_i, P_i \mathcal{L} - M_i \rangle + \frac{\alpha_i}{2} \|P_i \mathcal{L} - M_i\|_F^2 \right) + \sum_{i=1}^{n} \left( \eta_i \|N_i\|_1 + \langle Z_i, P_i \mathcal{S} - N_i \rangle + \frac{\beta_i}{2} \|P_i \mathcal{S} - N_i\|_F^2 \right), \tag{16} $$

where $Y_i, Z_i$ are Lagrangian multipliers and $\alpha_i, \beta_i > 0$ are penalty parameters. We can now directly apply ADMM with this augmented Lagrangian function.
Computing $M_i$. The optimal $M_i$ can be obtained, with all other variables held constant, via the following subproblem:

$$ \min_{M_i} \lambda_i \|M_i\|_* + \langle Y_i, P_i \mathcal{L} - M_i \rangle + \frac{\alpha_i}{2} \|P_i \mathcal{L} - M_i\|_F^2. \tag{17} $$

As shown in [20], the optimal solution of (17) is given by

$$ M_i = U_i D_{\lambda_i / \alpha_i}(\Lambda) V_i^T, \tag{18} $$

where $U_i \Lambda V_i^T$ is the singular value decomposition

$$ U_i \Lambda V_i^T = P_i \mathcal{L} + \frac{Y_i}{\alpha_i}, \tag{19} $$

and the "shrinkage" operator $D_\tau(x)$ with $\tau > 0$ is defined as

$$ D_\tau(x) = \begin{cases} x - \tau & \text{if } x > \tau, \\ x + \tau & \text{if } x < -\tau, \\ 0 & \text{otherwise.} \end{cases} \tag{20} $$
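The shrinkage operator (20) and the singular value thresholding step (18)–(19) can be sketched as follows (assuming numpy; the matrix argument of `svt` stands for $P_i \mathcal{L} + Y_i/\alpha_i$):

```python
import numpy as np

def shrink(x, tau):
    # D_tau from (20): entrywise soft-thresholding toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    # Singular value thresholding: apply D_tau to the singular values,
    # as in the M_i update (18)-(19).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

# On a diagonal matrix the singular values are simply the diagonal entries,
# so thresholding at tau = 1 maps (5, 2, 0.5) to (4, 1, 0).
X = np.diag([5.0, 2.0, 0.5])
Y = svt(X, 1.0)
print(np.round(np.diagonal(Y), 2))  # [4. 1. 0.]
```

The same `shrink` function also serves the entrywise $N_i$ update (22), applied to the matrix rather than its spectrum.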
Computing $N_i$. The optimal $N_i$ can be obtained, with all other variables held constant, via the following subproblem:

$$ \min_{N_i} \eta_i \|N_i\|_1 + \langle Z_i, P_i \mathcal{S} - N_i \rangle + \frac{\beta_i}{2} \|P_i \mathcal{S} - N_i\|_F^2. \tag{21} $$
Input: $n$-mode tensor $\mathcal{A}$.
Parameters: $\alpha, \beta, \gamma, \lambda, \eta, \rho$.
(1) Initialization: $\mathcal{L}^{(0)} = \mathcal{A}$, $\mathcal{S}^{(0)} = 0$, $M_i = L_{(i)}$, $N_i = 0$, $k = 1$.
(2) Repeat until convergence:
(3)   for $i = 1$ to $n$
(4)     $M_i^{(k+1)} = U_i D_{\lambda_i/\alpha_i}(\Lambda) V_i^T$, where $U_i \Lambda V_i^T = P_i \mathcal{L}^{(k)} + Y_i^{(k)} / \alpha_i$
(5)     $N_i^{(k+1)} = D_{\eta_i/\beta_i}(P_i \mathcal{S}^{(k)} + Z_i^{(k)} / \beta_i)$
(6)     $Y_i^{(k+1)} = Y_i^{(k)} + \alpha_i (P_i \mathcal{L}^{(k)} - M_i^{(k+1)})$
(7)     $Z_i^{(k+1)} = Z_i^{(k)} + \beta_i (P_i \mathcal{S}^{(k)} - N_i^{(k+1)})$
(8)   end for
(9)   $\mathcal{L}^{(k+1)} = \dfrac{\mathcal{A} - \mathcal{S}^{(k)} - \gamma \sum_{i=1}^{n} P_i^T (Y_i^{(k+1)} - \alpha_i M_i^{(k+1)})}{1 + \gamma \sum_{i=1}^{n} \alpha_i}$
(10)  $\mathcal{S}^{(k+1)} = \dfrac{\mathcal{A} - \mathcal{L}^{(k+1)} - \gamma \sum_{i=1}^{n} P_i^T (Z_i^{(k+1)} - \beta_i N_i^{(k+1)})}{1 + \gamma \sum_{i=1}^{n} \beta_i}$
(11)  $\alpha = \rho \alpha$, $\beta = \rho \beta$
(12)  $k = k + 1$
(13) End
Output: $n$-mode tensors $\mathcal{L}$, $\mathcal{S}$.

Algorithm 1: ADMM-TR: ADMM for tensor recovery.
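A compact Python sketch of Algorithm 1 is given below. It is an illustrative reimplementation under simplifying assumptions (scalar $\lambda$, $\eta$, $\alpha$, $\beta$ shared across modes, a fixed iteration count instead of a convergence test, and reshape-based unfoldings in place of the permutation matrices $P_i$), not the authors' implementation:

```python
import numpy as np

def unfold(A, n):
    # Mode-n unfolding (any fixed ordering works in place of P_i).
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def fold(M, n, shape):
    # Inverse of unfold, playing the role of P_i^T.
    full = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(full), 0, n)

def shrink(x, tau):
    # Entrywise shrinkage D_tau from (20).
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    # Singular value thresholding for the M_i update, (18)-(19).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def admm_tr(A, lam=1.0, eta=None, gamma=1.0, alpha=1.0, beta=1.0,
            rho=1.05, iters=100):
    # Sketch of Algorithm 1 with scalar penalty parameters per mode.
    n, shape = A.ndim, A.shape
    eta = eta if eta is not None else 1.0 / np.sqrt(max(shape))
    L, S = A.copy(), np.zeros_like(A)
    M = [unfold(L, i) for i in range(n)]
    N = [np.zeros_like(m) for m in M]
    Y = [np.zeros_like(m) for m in M]
    Z = [np.zeros_like(m) for m in M]
    for _ in range(iters):
        for i in range(n):
            M[i] = svt(unfold(L, i) + Y[i] / alpha, lam / alpha)   # step (4)
            N[i] = shrink(unfold(S, i) + Z[i] / beta, eta / beta)  # step (5)
            Y[i] = Y[i] + alpha * (unfold(L, i) - M[i])            # step (6)
            Z[i] = Z[i] + beta * (unfold(S, i) - N[i])             # step (7)
        # Closed-form L and S updates, cf. (24) and (26).
        TL = sum(fold(Y[i] - alpha * M[i], i, shape) for i in range(n))
        L = (A - S - gamma * TL) / (1.0 + gamma * n * alpha)
        TS = sum(fold(Z[i] - beta * N[i], i, shape) for i in range(n))
        S = (A - L - gamma * TS) / (1.0 + gamma * n * beta)
        alpha *= rho                                               # step (11)
        beta *= rho
    return L, S

L_hat, S_hat = admm_tr(np.random.default_rng(0).standard_normal((8, 8, 8)),
                       iters=20)
print(L_hat.shape, S_hat.shape)  # (8, 8, 8) (8, 8, 8)
```

In practice the parameter choices of Section 5 and a tolerance-based stopping rule (the change of $\mathcal{L}$ and $\mathcal{S}$ between iterations) would replace the fixed defaults used here.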
By well-known $\ell_1$ minimization results [21], the optimal solution of (21) is

$$ N_i = D_{\eta_i / \beta_i}\left(P_i \mathcal{S} + \frac{Z_i}{\beta_i}\right), \tag{22} $$

where $D_\tau$ is the "shrinkage" operator.
Computing $\mathcal{L}$. Now we fix all variables except $\mathcal{L}$ and minimize $L_A$ over $\mathcal{L}$. The resulting problem is the minimization of a quadratic function:

$$ \min_{\mathcal{L}} L_A(\mathcal{L}) = \frac{1}{2\gamma} \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n} \left( \langle Y_i, P_i \mathcal{L} - M_i \rangle + \frac{\alpha_i}{2} \|P_i \mathcal{L} - M_i\|_F^2 \right). \tag{23} $$

The objective function is differentiable, so the minimizer $\mathcal{L}_{\min}$ is characterized by $\partial L_A(\mathcal{L}) / \partial \mathcal{L} = 0$. Thus we obtain

$$ \mathcal{L}_{\min} = \frac{\mathcal{A} - \mathcal{S} - \gamma \sum_{i=1}^{n} P_i^T (Y_i - \alpha_i M_i)}{1 + \gamma \sum_{i=1}^{n} \alpha_i}. \tag{24} $$
Computing $\mathcal{S}$. Now we fix all variables except $\mathcal{S}$ and minimize $L_A$ over $\mathcal{S}$. The resulting problem is again the minimization of a quadratic function:

$$ \min_{\mathcal{S}} L_A(\mathcal{S}) = \frac{1}{2\gamma} \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n} \left( \langle Z_i, P_i \mathcal{S} - N_i \rangle + \frac{\beta_i}{2} \|P_i \mathcal{S} - N_i\|_F^2 \right). \tag{25} $$

The objective function is also differentiable, so the minimizer $\mathcal{S}_{\min}$ is characterized by $\partial L_A(\mathcal{S}) / \partial \mathcal{S} = 0$. Thus we have

$$ \mathcal{S}_{\min} = \frac{\mathcal{A} - \mathcal{L} - \gamma \sum_{i=1}^{n} P_i^T (Z_i - \beta_i N_i)}{1 + \gamma \sum_{i=1}^{n} \beta_i}. \tag{26} $$
For comparison with RSTD [22], we also choose the change of $\mathcal{L}$ and $\mathcal{S}$ between successive iterations, measured against a certain tolerance, as the stopping criterion. The pseudocode of the proposed ADMM-TR algorithm is summarized in Algorithm 1.
Theorem 2 (convergence of ADMM-TR). Assume that the optimal solution set $X^*$ of (11) is nonempty. The sequence $\{\mathcal{L}^{(k)}, \mathcal{S}^{(k)}, M_i^{(k)}, N_i^{(k)}, Y_i^{(k)}, Z_i^{(k)}\}$ generated by the proposed ADMM-TR algorithm is bounded, and every limit point of $\{\mathcal{L}^{(k)}, \mathcal{S}^{(k)}\}$ is an optimal solution of the original problem (11).

Proof. We check the assumptions of Theorem 1. $C_x$ is not bounded, but $P_i^* P_i = I$ is a constant multiple of the identity operator. Thus Theorem 1 can also be applied to ADMM-TR, and Theorem 2 follows.
5. Numerical Experiments

This section evaluates the empirical performance of the proposed algorithm on synthetic data and compares the results with RSTD (Rank Sparsity Tensor Decomposition) [22]. Experiments on traffic volume data outlier recovery also illustrate the efficiency of the proposed method in the traffic research field.

We use the Lanczos algorithm for computing the singular value decomposition and adopt the same rule for predicting
Table 2: Comparison of ADMM-TR with RSTD for synthetic data, where $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$, $n$-rank = [5, 5, 5].

spr     ADMM-TR                                        RSTD
        RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)     RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)
0.05    4.3           4.3           133   13.7         4.6           4.6           208   17.8
0.15    9.8           4.2           162   20.8         10.5          4.7           235   33.6
0.25    12.3          4.3           514   44.9         15.9          5.2           676   51.0
0.35    56.5          9.5           654   53.3         67.5          11.4          737   57.5
Table 3: Comparison of ADMM-TR with RSTD for synthetic data, where $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$, $n$-rank = [10, 10, 10].

spr     ADMM-TR                                        RSTD
        RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)     RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)
0.05    4.4           2.0           236   26.4         4.7           2.5           338   29.9
0.15    4.7           2.1           417   42.1         5.2           2.5           603   50.8
0.25    8.9           2.9           664   60.8         12.0          4.2           663   53.6
0.35    17.3          5.2           981   80.6         21.1          6.5           1110  87.7
the dimension of the principal singular space, as in [22]. The parameters are set as $\alpha = \beta = [I_1/I_{\max}, I_2/I_{\max}, \ldots, I_n/I_{\max}]^T$ and $\gamma = 1 / \operatorname{sum}([I_1/I_{\max}, I_2/I_{\max}, \ldots, I_n/I_{\max}]^T)$ for all experiments, where $I_{\max} = \max\{I_i\}$; $\eta$ is set to $1/\sqrt{I_{\max}}$, as suggested in [23].

All the experiments are conducted and timed on the same desktop with a Pentium(R) Dual-Core 2.50 GHz CPU and 4 GB of memory, running Windows 7 and MATLAB.
5.1. Synthetic Data. ADMM-TR and RSTD are tested on synthetic data of size 40 × 40 × 40. We generate a "core tensor" C ∈ R^{r×···×r} filled with i.i.d. Gaussian entries (~ N(0, 1)). Then we generate matrices U^(1), ..., U^(N) with U^(i) ∈ R^{I_i×r}, whose elements are also i.i.d. Gaussian random variables (~ N(0, 1)), and set

L_0 = C ×_1 U^(1) ×_2 ··· ×_N U^(N).   (27)

The entries of the sparse tensor S_0 are independently distributed, each taking value 0 with probability 1 − spr and an impulsive value with probability spr. We apply the proposed algorithm to the tensor A_0 = L_0 + S_0 to recover L and S and compare with RSTD. For these experiments, two cases of n-rank are investigated: n-rank = [5, 5, 5] and n-rank = [10, 10, 10]. Tables 2 and 3 present the average results (across 30 instances) for different spr.
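The generator described above can be sketched in a few lines of NumPy; the function name and the impulse magnitude are illustrative assumptions, not part of the paper's setup:

```python
import numpy as np

def make_synthetic(shape=(40, 40, 40), rank=5, spr=0.05, magnitude=10.0, seed=0):
    """Build A0 = L0 + S0: a low-n-rank tensor plus sparse impulsive outliers."""
    rng = np.random.default_rng(seed)
    # Core tensor C ~ N(0, 1) of size rank x ... x rank.
    L0 = rng.standard_normal((rank,) * len(shape))
    # L0 = C x_1 U^(1) x_2 ... x_N U^(N) with U^(i) ~ N(0, 1), cf. (27).
    for mode, dim in enumerate(shape):
        U = rng.standard_normal((dim, rank))
        L0 = np.moveaxis(np.tensordot(U, L0, axes=(1, mode)), 0, mode)
    # Each entry of S0 is an impulse with probability spr and 0 otherwise.
    mask = rng.random(shape) < spr
    S0 = np.where(mask, rng.uniform(-magnitude, magnitude, size=shape), 0.0)
    return L0 + S0, L0, S0

A0, L0, S0 = make_synthetic()
```

Every mode-i unfolding of L_0 then has rank at most 5, which is exactly the low-n-rank assumption of the recovery problem.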
The quality of recovery is measured by the relative square error (RSE) to L_0 and S_0, defined as

RSE L_0 = ‖L − L_0‖_F / ‖L_0‖_F,  RSE S_0 = ‖S − S_0‖_F / ‖S_0‖_F.   (28)
Tables 2 and 3 show that the proposed algorithm (ADMM-TR) is about 10 percent faster than the RSTD method proposed in [22] and achieves better accuracy in terms of relative square error. Though both algorithms compute an SVD per iteration, we observe that the proposed algorithm takes much fewer iterations than RSTD to converge to the optimal solution.
As more impulsive entries are added, that is, for higher values of spr, the tensor recovery problem becomes less likely to succeed. In addition, the problem becomes more difficult when the n-rank is higher for ground-truth tensors of the same size. In Tables 2 and 3, different spr values are set for the two tensor cases, and the results show that the recovery accuracy decreases as spr grows in each case. In particular, the recovery accuracy for the tensor A_0 ∈ R^{40×40×40} with n-rank = [5, 5, 5] decreases sharply when spr approaches 40%, while the same phenomenon occurs for the tensor A_0 ∈ R^{40×40×40} with n-rank = [10, 10, 10] when spr is about 25%. This is because relatively low rank and a low sparsity ratio are preconditions of the tensor recovery problem.
5.2. Traffic Volume Data. To evaluate the performance of the proposed method in traffic volume data outlier recovery, a complete traffic volume data set is used as the test data set. We use the data of a fixed point in Sacramento County, downloaded from http://pems.dot.ca.gov. The traffic volume data are recorded every 5 minutes, so a daily traffic volume series for a loop detector contains 288 records; the whole period of the data lasts for 16 days, from August 2 to August 17, 2010.

Based on the multiple correlations of the traffic volume data, we model the data set as a tensor of size 16 × 24 × 12, which stands for 16 days, 24 hours in a day, and 12 sample intervals (i.e., 5-minute records) per hour. The ratios of outlier data are set from 5% to 15%, and the outlier entries are produced randomly. All results are averaged over 10 instances.
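As a minimal sketch (with a synthetic stand-in for the real PeMS counts), folding a 16-day series of 5-minute volumes into the 16 × 24 × 12 day × hour × interval tensor is a single reshape:

```python
import numpy as np

# 16 days x 288 five-minute counts per day; np.arange is a stand-in for real data.
counts = np.arange(16 * 288, dtype=float)
tensor = counts.reshape(16, 24, 12)   # (day, hour, 5-minute slot within the hour)

# Entry (d, h, s) holds the volume of slot s in hour h of day d.
```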
[Figure 1: Comparison of raw traffic volume data, data corrupted by outliers at a 5% ratio, and data recovered by ADMM-TR. x-axis: time interval (5 min); y-axis: traffic volume (veh/5 min).]
[Figure 2: Comparison of raw traffic volume data, data corrupted by outliers at a 10% ratio, and data recovered by ADMM-TR. Axes as in Figure 1.]
For the real-world data we mainly pay attention to the correctness of the recovered traffic volume data. Thus the quality of recovery is measured by the relative square error (RSE) to L_0 and the Mean Absolute Percentage Error (MAPE) to L_0, defined as

RSE L_0 = ‖L − L_0‖_F / ‖L_0‖_F,

MAPE L_0 = (1/M) Σ_{m=1}^{M} |(t_r^(m) − t_e^(m)) / t_r^(m)|,   (29)

where t_r^(m) and t_e^(m) are the m-th elements standing for the known real value and the recovered value, respectively, and M denotes the number of recovered traffic volumes.

[Figure 3: Comparison of raw traffic volume data, data corrupted by outliers at a 15% ratio, and data recovered by ADMM-TR. Axes as in Figure 1.]

Table 4: Traffic volume outlier data before and after being recovered.

spr  | Before recovery: RSE L_0 (e-3) | MAPE L_0 (e-3) | After recovery: RSE L_0 (e-3) | MAPE L_0 (e-3)
0.05 | 0.5005 | 0.2136 | 0.0961 | 0.0314
0.10 | 0.7066 | 0.5008 | 0.1417 | 0.1027
0.15 | 0.8850 | 0.8777 | 0.2221 | 0.2351
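A sketch of the two error measures in (29), assuming NumPy inputs and the paper's convention that t_r is the known real value and t_e the recovered one:

```python
import numpy as np

def rse(est, truth):
    """Relative square error: ||est - truth||_F / ||truth||_F."""
    return np.linalg.norm(est - truth) / np.linalg.norm(truth)

def mape(recovered, real):
    """Mean absolute percentage error over the M recovered volumes."""
    recovered = np.asarray(recovered, dtype=float)
    real = np.asarray(real, dtype=float)
    return np.mean(np.abs((real - recovered) / real))
```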
Table 4 presents the relative errors of traffic volume outlier data before and after being recovered by ADMM-TR. The results show that the RSE L_0 and MAPE L_0 for traffic volume data corrupted by outliers are about 5 times those of the data after recovery by ADMM-TR. Figures 1, 2, and 3 present the profiles of traffic volume data for one day. The results show that ADMM-TR recovers the traffic volume outlier data with very good performance.
6. Conclusions

In this paper we concentrate on the mathematical problem in traffic volume outlier data recovery and propose a novel tensor recovery method based on the alternating direction method of multipliers (ADMM). The proposed algorithm can automatically separate the low-n-rank tensor data from the sparse part. The experiments show that the proposed method is more stable and accurate in most cases and has an excellent convergence rate. Experiments on real-world traffic volume data demonstrate the practicability and effectiveness of the proposed method in the traffic research domain.
In the future we would like to investigate how to automatically choose the parameters in our algorithm and explore additional applications of our method in the traffic research domain.
Acknowledgments
This research was supported by NSFC (Grants nos. 61271376, 61171118, and 91120015) and the Beijing Natural Science Foundation (4122067). The authors would like to thank Professor Bin Ran from the University of Wisconsin-Madison and Yong Li from the University of Notre Dame for the suggestive discussions.
References
[1] K. Tamura and M. Hirayama, "Toward realization of VICS - vehicle information and communications system," in Proceedings of the IEEE-IEE Vehicle Navigation and Information Systems Conference, pp. 72-77, October 1993.
[2] S. Kim, M. E. Lewis, and C. C. White III, "Optimal vehicle routing with real-time traffic information," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 178-188, 2005.
[3] P. Mirchandani and L. Head, "A real-time traffic signal control system: architecture, algorithms, and analysis," Transportation Research Part C, vol. 9, no. 6, pp. 415-432, 2001.
[4] B. M. Williams, P. K. Durvasula, and D. E. Brown, "Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models," Transportation Research Record, no. 1644, pp. 132-141, 1998.
[5] J. Xu, X. Li, and H. Shi, "Short-term traffic flow forecasting model under missing data," Journal of Computer Applications, vol. 30, no. 4, pp. 1117-1120, 2010.
[6] Y. Pei and J. Ma, "Real-time traffic data screening and reconstruction," China Civil Engineering Journal, vol. 36, no. 7, pp. 78-83, 2003.
[7] S. Chen, W. Wang, and W. Li, "Noise recognition and noise reduction of real-time traffic data," Dongnan Daxue Xuebao, vol. 36, no. 2, pp. 322-325, 2006.
[8] B.-G. Daniel, J. D.-P. Francisco, G.-O. David, et al., "Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks," Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 7, pp. 530-545, 2010.
[9] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, Article ID 025010, 2011.
[10] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455-500, 2009.
[11] A. S. Lewis and G. Knowles, "Image compression using the 2-D wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 244-250, 1992.
[12] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572-596, 2011.
[13] J. Wright, Y. Peng, Y. Ma, and A. Ganesh, "Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization," in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080-2088, Vancouver, Canada, 2009.
[14] A. Ganesh, Z. Lin, J. Wright, L. Wu, M. Chen, and Y. Ma, "Fast algorithms for recovering a corrupted low-rank matrix," in Proceedings of the 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '09), pp. 213-216, December 2009.
[15] Z. Lin, M. Chen, L. Wu, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Mathematical Programming, 2009.
[16] X. Yuan and J. Yang, "Sparse and low-rank matrix decomposition via alternating direction methods," Tech. Rep., Department of Mathematics, Hong Kong Baptist University, 2009.
[17] L. Qu, Y. Zhang, J. Hu, L. Jia, and L. Li, "A BPCA based missing value imputing method for traffic flow volume data," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 985-990, June 2008.
[18] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in Proceedings of the International Conference on Computer Vision (ICCV '09), 2009.
[19] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Prentice-Hall, 1989.
[20] J.-F. Cai, E. J. Candes, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956-1982, 2010.
[21] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for l1-minimization: methodology and convergence," SIAM Journal on Optimization, vol. 19, no. 3, pp. 1107-1130, 2008.
[22] Y. Li, J. Yan, Y. Zhou, and J. Yang, "Optimum subspace learning and error correction for tensors," in Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Crete, Greece, 2010.
[23] E. J. Candes, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" Journal of the ACM, vol. 58, no. 3, article 11, 2011.
as ‖A‖_1 = Σ_{i_1···i_k···i_n} |a_{i_1···i_k···i_n}|. It is clear that ‖A‖_F = ‖A_(n)‖_F, ‖A‖_0 = ‖A_(n)‖_0, and ‖A‖_1 = ‖A_(n)‖_1 for any 1 ≤ n ≤ N.

The n-mode (matrix) product of a tensor A ∈ R^{I_1×I_2×···×I_N} with a matrix M ∈ R^{J×I_n} is denoted by A ×_n M and is of size I_1 × ··· × I_{n−1} × J × I_{n+1} × ··· × I_N. In terms of the flattened matrix, the n-mode product can be expressed as

Y = A ×_n M ⟺ Y_(n) = M A_(n).   (6)
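The identity in (6) is easy to check numerically; a sketch in NumPy (the helper names are ours):

```python
import numpy as np

def unfold(A, n):
    """Mode-n unfolding A_(n): mode n becomes the row index."""
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def mode_n_product(A, M, n):
    """Y = A x_n M, which satisfies Y_(n) = M @ A_(n) as in (6)."""
    return np.moveaxis(np.tensordot(M, A, axes=(1, n)), 0, n)

A = np.arange(24.0).reshape(2, 3, 4)   # I1 x I2 x I3
M = np.ones((5, 3))                    # J x I_n with n = 1
Y = mode_n_product(A, M, 1)            # size I1 x J x I3 = 2 x 5 x 4
```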
3. Review of Data Recovery Methods

Recently, the problem of recovering the sparse and low-rank components with no prior knowledge about the sparsity pattern of the sparse matrix or the rank of the low-rank matrix has been well studied. The authors of [12] proposed the concept of "rank-sparsity incoherence" and solved the problem with an interior-point solver after reformulating it as a semidefinite program. However, although interior-point methods normally take very few iterations to converge, they have difficulty handling large matrices, and this limitation prevented the use of the technique in computer vision and in the traffic volume recovery considered in this research.

To solve the problem for large-scale matrices, Wright et al. [13] adopted the iterative thresholding technique and obtained good scalability properties. Lin et al. proposed an accelerated proximal gradient (APG) algorithm [14] and applied techniques of augmented Lagrange multipliers (ALM) [15] to solve the problem. Yuan and Yang [16] utilized the alternating direction method (ADM), which can be regarded as a practical version of the classical ALM method, to solve the matrix recovery problem. The ADM method has been proved to have a pleasing convergence speed, and the results in [16] demonstrated its excellent performance.

Inspired by the idea of [16], this paper extends the sparse and low-rank recovery problem to the tensor case, because multidimensional traffic data can naturally be formulated as a tensor.
4. ADMM-TR for Traffic Volume Outlier Recovery

In this section we show the solution of problem (2). The tensor model for traffic volume is first constructed in Section 4.1. We then present the tensor recovery problem in Section 4.2. In Section 4.3 the classical ADMM approach is introduced. In Section 4.4 we convert the original problem into a constrained convex optimization problem, which can be solved by the extended ADMM approach, and present the details of the proposed algorithm together with its convergence guarantees.
4.1. Tensor Model for Traffic Volume. The correlations of traffic volume data are critical for recovering corrupted traffic volume data. Traditional methods mostly exploit only part of the correlations, such as historical or temporal neighboring correlations. The classic methods usually utilize the temporal correlations of traffic data from day to day. For single-detector data, the multiple correlations contain the relations of traffic data from day to day, hour to hour, and so forth. In addition, spatial correlations exist in multiple-detector data.

Table 1: The similarity coefficients of four modes.

Mode | Size    | Similarity coefficient
Hour | 6 × 12  | 0.9670
Day  | 7 × 288 | 0.8654
Week | 7 × 288 | 0.9153
Link | 4 × 288 | 0.8497
In this paper, a quantitative analysis of traffic data correlation is performed on traffic volume data downloaded from http://pems.dot.ca.gov. The correlation coefficient applied to measuring the data correlation is given by [17]

s = ( Σ_{n ≥ i > j ≥ 1} R(i, j) ) / ( n(n − 1)/2 ),   (7)

where n refers to the number of data points and R(i, j) refers to the correlation coefficient matrix. Table 1 gives the resulting correlation coefficients of the four modes, namely, hour, day, week, and link.
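Equation (7) is just the average of the pairwise correlation coefficients; a sketch, assuming each row of X is one series being compared (e.g., one day of volumes):

```python
import numpy as np

def similarity_coefficient(X):
    """s = sum_{n >= i > j >= 1} R(i, j) / (n(n - 1)/2), cf. (7)."""
    R = np.corrcoef(X)                 # n x n correlation coefficient matrix
    n = R.shape[0]
    pairs = np.triu_indices(n, k=1)    # each unordered pair (i, j) once
    return R[pairs].sum() / (n * (n - 1) / 2)
```

Perfectly correlated rows (equal up to positive scaling) give s = 1, the upper bound.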
Conventional methods usually use a day-to-day matrix pattern to model the traffic data. Although each mode of the traffic data has a very high similarity, these methods do not utilize the multimode correlations, namely "Day × Hour", "Week × Hour", and "Link × Hour", simultaneously and thus may result in poor recovery performance.

To make full use of the multimode correlation and the traffic spatial-temporal information, the traffic data need to be arranged into a multiway data set. Fortunately, tensor-pattern-based traffic data can model the multiway traffic data well. This helps keep the original structure and employ enough traffic spatial-temporal information.
4.2. The Tensor Recovery Problem. Problem (2) is NP-hard since it is not convex. We therefore use the approximate formulation shown in (8):

min_{L,S} ‖L‖_* + η‖S‖_1, s.t. ‖A − L − S‖_F ≤ δ,   (8)

where A ∈ R^{I_1×I_2×···×I_N} is the given tensor to be recovered, L ∈ R^{I_1×I_2×···×I_N} is the low-rank component of A, and S ∈ R^{I_1×I_2×···×I_N} is the sparse component of A. Compared with (2), (8) relaxes the constraint for recovering a low-n-rank tensor from a high-dimensional data tensor despite both small entry-wise noise and gross sparse errors.
Recently, Liu et al. [18] proposed the following definition of the nuclear norm of an n-mode tensor:

‖X‖_* = (1/n) Σ_{i=1}^{n} ‖X_(i)‖_*.   (9)
Based on this definition, the optimization in (8) can be written as

min_{L,S} ‖L‖_* + η‖S‖_1 ≡ (1/n) Σ_{i=1}^{n} λ_i ‖L_(i)‖_* + (1/n) Σ_{i=1}^{n} η_i ‖S_(i)‖_1,
s.t. ‖A − L − S‖_F ≤ δ.   (10)
In order to recover (L, S), instead of directly solving (10) we solve the following dual problem:

min_{L,S} (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} λ_i ‖L_(i)‖_* + Σ_{i=1}^{n} η_i ‖S_(i)‖_1.   (11)
The problem in (11) is still difficult to solve because of the interdependent nuclear-norm and l_1-norm terms. To simplify the problem, the formulation can be rewritten as

min_{L,S,M_i,N_i} (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} λ_i ‖M_i‖_* + Σ_{i=1}^{n} η_i ‖N_i‖_1,
s.t. P_i L = M_i, P_i S = N_i, ∀i,   (12)
where P_i is the matrix representation of the mode-i unfolding (note that P_i is a permutation matrix; thus P_i^T P_i = I), and M_i and N_i are additional auxiliary matrices of the same size as the mode-i unfolding of L (or S, resp.).
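The claim P_i^T P_i = I simply says that unfolding is lossless: refolding the mode-i unfolding returns the original tensor. A sketch (the helper names are ours):

```python
import numpy as np

def unfold(A, i):
    """P_i: the mode-i unfolding A_(i)."""
    return np.moveaxis(A, i, 0).reshape(A.shape[i], -1)

def fold(X, i, shape):
    """P_i^T: invert the mode-i unfolding for a tensor of the given shape."""
    lead = (shape[i],) + tuple(s for j, s in enumerate(shape) if j != i)
    return np.moveaxis(X.reshape(lead), 0, i)

A = np.arange(24.0).reshape(2, 3, 4)
```

Round-tripping through any mode leaves the tensor unchanged, which is the fact the convergence proof of the proposed algorithm relies on.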
4.3. The Classical ADMM Approach. The classical alternating direction method of multipliers (ADMM) solves structured convex programs of the form

min_{x∈C_x, y∈C_y} f(x) + g(y), s.t. Ax + By = c,   (13)

where f and g are convex functions defined on the closed subsets C_x and C_y, and A, B, and c are matrices and a vector of appropriate sizes. The augmented Lagrangian function of (13) is

L_A(x, y, w) = f(x) + g(y) + ⟨w, Ax + By − c⟩ + (β/2) ‖Ax + By − c‖_2^2,   (14)

where w is a Lagrangian multiplier vector and β > 0 is a penalty parameter.
The approach performs one sweep of alternating minimization with respect to x and y individually and then updates the multiplier w; at iteration k the steps are given by [18, Equations (479)-(481)]:

x^(k+1) ← argmin_{x∈C_x} L_A(x, y^(k), w^(k)),
y^(k+1) ← argmin_{y∈C_y} L_A(x^(k+1), y, w^(k)),
w^(k+1) ← w^(k) + ρβ (A x^(k+1) + B y^(k+1) − c),   (15)
where ρ is the step length. A convergence result for the above ADMM algorithm is the following.

Theorem 1 (see [19, Proposition 5.2]). Assume that the optimal solution set X^* of (13) is nonempty. Furthermore, assume that C_x is bounded or else that the matrix A^*A is invertible. Then the sequence {x^(k), y^(k), w^(k)} generated by (15) is bounded, and every limit point of {x^(k)} is an optimal solution of the original problem (13).
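As a toy instance of (13)-(15), consider min (1/2)(x − a)^2 + λ|y| subject to x − y = 0, whose known solution is the soft-thresholded value y* = sign(a)·max(|a| − λ, 0). The sketch below uses the scaled multiplier u = w/β and ρ = 1; all numeric values are illustrative:

```python
import numpy as np

a, lam, beta = 3.0, 1.0, 1.0
x = y = u = 0.0                     # u is the scaled multiplier w / beta
for _ in range(200):
    x = (a + beta * (y - u)) / (1.0 + beta)          # x-step: minimize L_A over x
    v = x + u
    y = np.sign(v) * max(abs(v) - lam / beta, 0.0)   # y-step: shrinkage
    u = u + (x - y)                                  # multiplier update (rho = 1)
# x and y converge to the known optimum 2.0 for a = 3, lam = 1.
```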
4.4. ADMM Extension to Tensor Recovery. We observe that (12) is well structured in the sense that a separable structure emerges in both the objective function and the constraints. Thus we propose an algorithm based on an extension of the classical ADMM approach, taking advantage of this favorable structure to solve the tensor recovery problem.

The augmented Lagrangian of (12) is

L_A(L, S, M_i, N_i) = (1/2γ) ‖A − L − S‖_F^2
  + Σ_{i=1}^{n} ( λ_i ‖M_i‖_* + ⟨Y_i, P_i L − M_i⟩ + (α_i/2) ‖P_i L − M_i‖_F^2 )
  + Σ_{i=1}^{n} ( η_i ‖N_i‖_1 + ⟨Z_i, P_i S − N_i⟩ + (β_i/2) ‖P_i S − N_i‖_F^2 ),   (16)

where Y_i and Z_i are Lagrangian multipliers and α_i, β_i > 0 are penalty parameters. We can now directly apply ADMM with this augmented Lagrangian function.
Computing M_i. The optimal M_i can be found, with all other variables held constant, from the subproblem

min_{M_i} λ_i ‖M_i‖_* + ⟨Y_i, P_i L − M_i⟩ + (α_i/2) ‖P_i L − M_i‖_F^2.   (17)

As shown in [20], the optimal solution of (17) is given by

M_i = U_i D_{λ_i/α_i}(Λ) V_i^T,   (18)

where U_i Λ V_i^T is the singular value decomposition

U_i Λ V_i^T = P_i L + Y_i/α_i,   (19)

and the "shrinkage" operator D_τ(x) with τ > 0 is defined as

D_τ(x) =
  x − τ, if x > τ,
  x + τ, if x < −τ,
  0, otherwise.   (20)
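The shrinkage operator in (20) is the usual soft-thresholding; one line in NumPy, applied elementwise:

```python
import numpy as np

def shrink(x, tau):
    """D_tau from (20): x - tau if x > tau, x + tau if x < -tau, 0 otherwise."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

Applied to the singular values of P_i L + Y_i/α_i it yields the M_i update (18); applied elementwise to P_i S + Z_i/β_i it yields the N_i update (22) below.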
Computing N_i. The optimal N_i can be found, with all other variables held constant, from the subproblem

min_{N_i} η_i ‖N_i‖_1 + ⟨Z_i, P_i S − N_i⟩ + (β_i/2) ‖P_i S − N_i‖_F^2.   (21)
Input: n-mode tensor A.
Parameters: α, β, γ, λ, η, ρ.
(1) Initialization: L^(0) = A, S^(0) = 0, M_i = L_(i), N_i = 0, k = 1.
(2) Repeat until convergence:
(3)   for i = 1 to n
(4)     M_i^(k+1) = U_i D_{λ_i/α_i}(Λ) V_i^T, where U_i Λ V_i^T = P_i L^(k) + Y_i^(k)/α_i
(5)     N_i^(k+1) = D_{η_i/β_i}(P_i S^(k) + Z_i^(k)/β_i)
(6)     Y_i^(k+1) = Y_i^(k) + α_i (P_i L^(k) − M_i^(k+1))
(7)     Z_i^(k+1) = Z_i^(k) + β_i (P_i S^(k) − N_i^(k+1))
(8)   end for
(9)   L^(k+1) = ( A − S^(k) − γ Σ_{i=1}^{n} P_i^T (Y_i^(k+1) − α_i M_i^(k+1)) ) / ( 1 + γ Σ_{i=1}^{n} α_i )
(10)  S^(k+1) = ( A − L^(k+1) − γ Σ_{i=1}^{n} P_i^T (Z_i^(k+1) − β_i N_i^(k+1)) ) / ( 1 + γ Σ_{i=1}^{n} β_i )
(11)  α = ρα, β = ρβ
(12)  k = k + 1
(13) End
Output: n-mode tensors L, S.

Algorithm 1: ADMM-TR (ADMM for tensor recovery).
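A compact NumPy sketch of Algorithm 1; the helper names, the scalar λ and η (the paper allows per-mode λ_i and η_i), and the default parameter values are our illustrative assumptions rather than the paper's tuned settings:

```python
import numpy as np

def unfold(A, i):
    return np.moveaxis(A, i, 0).reshape(A.shape[i], -1)

def fold(X, i, shape):
    lead = (shape[i],) + tuple(s for j, s in enumerate(shape) if j != i)
    return np.moveaxis(X.reshape(lead), 0, i)

def shrink(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding U D_tau(Lambda) V^T, cf. (18)-(19)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def admm_tr(A, gamma=1.0, lam=1.0, eta=0.1, alpha0=1.0, beta0=1.0,
            rho=1.05, max_iter=100, tol=1e-6):
    n, shape = A.ndim, A.shape
    L, S = A.copy(), np.zeros_like(A)
    alpha = np.full(n, alpha0)
    beta = np.full(n, beta0)
    Y = [np.zeros_like(unfold(A, i)) for i in range(n)]
    Z = [np.zeros_like(unfold(A, i)) for i in range(n)]
    for _ in range(max_iter):
        L_old, S_old = L, S
        # Steps (4)-(5): per-mode auxiliary updates.
        M = [svt(unfold(L, i) + Y[i] / alpha[i], lam / alpha[i]) for i in range(n)]
        N = [shrink(unfold(S, i) + Z[i] / beta[i], eta / beta[i]) for i in range(n)]
        # Steps (6)-(7): multiplier updates.
        for i in range(n):
            Y[i] = Y[i] + alpha[i] * (unfold(L, i) - M[i])
            Z[i] = Z[i] + beta[i] * (unfold(S, i) - N[i])
        # Steps (9)-(10): closed-form L and S updates, (24) and (26).
        L = (A - S - gamma * sum(fold(Y[i] - alpha[i] * M[i], i, shape)
                                 for i in range(n))) / (1.0 + gamma * alpha.sum())
        S = (A - L - gamma * sum(fold(Z[i] - beta[i] * N[i], i, shape)
                                 for i in range(n))) / (1.0 + gamma * beta.sum())
        # Step (11): increase the penalties.
        alpha = rho * alpha
        beta = rho * beta
        if max(np.linalg.norm(L - L_old), np.linalg.norm(S - S_old)) < tol:
            break
    return L, S
```

The stopping criterion mirrors the one used for the RSTD comparison: the change of L and S between successive iterations against a fixed tolerance.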
By well-known l_1-minimization results [21], the optimal solution of (21) is

N_i = D_{η_i/β_i}(P_i S + Z_i/β_i),   (22)

where D_τ is the "shrinkage" operator.
Computing L. Now we fix all variables except L and minimize L_A over L. The resulting problem is the minimization of a quadratic function:

min_L L_A(L) = (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} ( ⟨Y_i, P_i L − M_i⟩ + (α_i/2) ‖P_i L − M_i‖_F^2 ).   (23)

The objective function is differentiable, so the minimizer L_min is characterized by ∂L_A(L)/∂L = 0. Thus we obtain

L_min = ( A − S − γ Σ_{i=1}^{n} P_i^T (Y_i − α_i M_i) ) / ( 1 + γ Σ_{i=1}^{n} α_i ).   (24)
Computing S. Now we fix all variables except S and minimize L_A over S. The resulting problem is again the minimization of a quadratic function:

min_S L_A(S) = (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} ( ⟨Z_i, P_i S − N_i⟩ + (β_i/2) ‖P_i S − N_i‖_F^2 ).   (25)

The objective function is also differentiable, so the minimizer S_min is characterized by ∂L_A(S)/∂S = 0. Thus we have

S_min = ( A − L − γ Σ_{i=1}^{n} P_i^T (Z_i − β_i N_i) ) / ( 1 + γ Σ_{i=1}^{n} β_i ).   (26)
For comparison with RSTD [22], we also use the change of L and S between successive iterations, checked against a fixed tolerance, as the stopping criterion. The pseudocode of the proposed ADMM-TR algorithm is summarized in Algorithm 1.
Theorem 2 (convergence of ADMM-TR). Assume that the optimal solution set X^* of (11) is nonempty. The sequence {L^(k), S^(k), M_i^(k), N_i^(k), Y_i^(k), Z_i^(k)} generated by the proposed ADMM-TR algorithm is bounded, and every limit point of {L^(k), S^(k)} is an optimal solution of the original problem (11).

Proof. We check the assumptions of Theorem 1. C_x is not bounded, but P_i^* P_i = I is a constant multiple of the identity operator. Thus Theorem 1 can also be applied to ADMM-TR, and Theorem 2 follows.
L minusL0
10038171003817100381710038171003817119865
1003817100381710038171003817L0
1003817100381710038171003817119865
MAPE L0=
1
119872
119872
sum
119898=1
1003816100381610038161003816100381610038161003816100381610038161003816
119905(119898)
119903minus 119905(119898)
119890
119905(119898)
119903
1003816100381610038161003816100381610038161003816100381610038161003816
(29)
0
200
400
600
800
1000
1200
1400
1600
1800
0 50 100 150 200 250 300
Raw dataOutlier data
Recovered data
Time interval (5 min)
Traffi
c vol
ume (
veh5
min
)
Figure 3 Comparisons with raw traffic volume data data corruptedby outliers with 15 ratio and data recovered by ADMM-TR
Table 4 Traffic volume outlier data before and afterbeing recovered
119904119901119903
Before recovery After recoveryRSE L
0
(119890-3)MAPE L
0
(119890-3)RSE L
0
(119890-3)MAPE L
0
(119890-3)005 05005 02136 00961 00314010 07066 05008 01417 01027015 08850 08777 02221 02351
where 119905(119898)119903
and 119905(119898)119890
are the 119898th elements which stand forthe known real value and recovered value respectively 119872denotes the number of recovered traffic volumes
Table 4 presents the relative errors of traffic volumeoutlier data before and after being recovered by ADMM-TRThe results show that the RSE L
0and MAPE L
0for traffic
volume data corrupted by outlier data are about 5 times thanthe data after being recovered by ADMM-TR Figures 1 2and 3 present the profiles of traffic volume data for a day Theresults show that ADMM-TR could recover the traffic volumeoutlier data with perfect performance
6 Conclusions
In this paper we concentrate on the mathematical problemin traffic volume outlier data recovery and proposed anovel tensor recovery method based on alternating directionmethod of multipliers (ADMM) The proposed algorithmcan automatically separate the low-119899-rank tensor data andsparse partThe experiments show that the proposedmethodis more stable and accurate in most cases and has excellentconvergence rate Experiments on real world traffic volumedata demonstrate the practicability and effectiveness of theproposed method in traffic research domain
In the future we would like to investigate how to auto-matically choose the parameters in our algorithm and explore
8 Mathematical Problems in Engineering
additional applications of our method in traffic researchdomain
Acknowledgments
This research was supported by NSFC (Grant No 6127137661171118 and 91120015) Beijing Natural Science Foundation(4122067)The authors would like to thank Professor Bin Ranfrom the University ofWisconsin-Madison and Yong Li fromthe University of Notre Dame for the suggestive discussions
References
[1] K Tamura and M Hirayama ldquoToward realization of VICSmdashvehicle information and communications systemrdquo in Pro-ceedings of the IEEE-IEE Vehicle Navigation and InformationsSystems Conference pp 72ndash77 October 1993
[2] S Kim M E Lewis and C C White III ldquoOptimal vehiclerouting with real-time traffic informationrdquo IEEE Transactionson Intelligent Transportation Systems vol 6 no 2 pp 178ndash1882005
[3] P Mirchandani and L Head ldquoA real-time traffic signal controlsystem architecture algorithms and analysisrdquo TransportationResearch Part C vol 9 no 6 pp 415ndash432 2001
[4] B M Williams P K Durvasula and D E Brown ldquoUrbanfreeway traffic flow prediction application of seasonal autore-gressive integrated moving average and exponential smoothingmodelsrdquo Transportation Research Record no 1644 pp 132ndash1411998
[5] J Xu X Li and H Shi ldquoShort-term traffic flow forecastingmodel under missing datardquo Journal of Computer Applicationsvol 30 no 4 pp 1117ndash1120 2010
[6] Y Pei and J Ma ldquoReal-time traffic data screening and recon-structionrdquo China Civil Engineering Journal vol 36 no 7 pp78ndash83 2003
[7] S Chen W Wang and W Li ldquoNoise recognition and noisereduction of real-time trafficdatardquoDongnanDaxueXuebao vol36 no 2 pp 322ndash325 2006
[8] B-G Daniel J D-P Francisco G-O David et al ldquoWavelet-based denoising for traffic volume time series forecasting withself-organizing neural networksrdquo Computer-Aided Civil andInfrastructure Engineering vol 25 no 7 pp 530ndash545 2010
[9] S Gandy B Recht and I Yamada ldquoTensor completion andlowmdashrank tensor recovery via convex optimizationrdquo InverseProblems vol 27 no 2 Article ID 025010 2011
[10] T G Kolda and B W Bader ldquoTensor decompositions andapplicationsrdquo SIAM Review vol 51 no 3 pp 455ndash500 2009
[11] A S Lewis and G Knowles ldquoImage compression using the 2-Dwavelet transformrdquo IEEE Transactions of Image Processing vol1 no 2 pp 244ndash250 1992
[12] V Chandrasekaran S Sanghavi P A Parrilo and A S WillskyldquoRank-sparsity incoherence for matrix decompositionrdquo SIAMJournal on Optimization vol 21 no 2 pp 572ndash596 2011
[13] J Wright Y Peng Y Ma and A Ganesh ldquoRobust principalcomponent analysis exact recovery of corrupted low-rankmatrices by convex optimizationrdquo in Proceedings of the 23rdAnnual Conference on Neural Information Processing Systems(NIPS rsquo09) pp 2080ndash2088 Vancouver Canada 2009
[14] A Ganesh Z Lin J Wright L Wu M Chen and Y MaldquoFast algorithms for recovering a corrupted low-rank matrixrdquo
in Proceedings of the 3rd IEEE International Workshop onComputational Advances in Multi-Sensor Adaptive Processing(CAMSAP rsquo09) pp 213ndash216 December 2009
[15] Z Lin M Chen L Wu and Y Ma ldquoThe augmented Lagrangemultiplier method for exact recovery of a corrupted low-rankmatricesrdquoMathematical Programming 2009
[16] X Yuan and J Yang ldquoSparse and low-rank matrix decomposi-tion via alternating directionmethodsrdquo Tech Rep Departmentof Mathematics Hong Kong Baptist University 2009
[17] L Qu Y Zhang J Hu L Jia and L Li ldquoA BPCA basedmissing value imputing method for traffic flow volume datardquo inProceedings of the IEEE Intelligent Vehicles Symposium (IV rsquo08)pp 985ndash990 June 2008
[18] J Liu P Musialski P Wonka and J Ye ldquoTensor completion forestimating missing values in visual datardquo in Proceedings of theInternational Conference on Computer Vision (ICCV rsquo09) 2009
[19] D Bertsekas and J Tsitsiklis Parallel and Distributed Computa-tion Prentice-Hall 1989
[20] J-F Cai E J Candes and Z Shen ldquoA singular value thresh-olding algorithm for matrix completionrdquo SIAM Journal onOptimization vol 20 no 4 pp 1956ndash1982 2010
[21] E T Hale W Yin and Y Zhang ldquoFixed-point continuationfor ℓ1-minimization methodology and convergencerdquo SIAMJournal on Optimization vol 19 no 3 pp 1107ndash1130 2008
[22] Y Li J Yan Y Zhou and J Yang ldquoOptimum subspace learningand error correction for tensorsrdquo in Proceedings of the 11thEuropean Conference on Computer Vision (ECCV rsquo10) CreteGreece 2010
[23] E J Candes X Li Y Ma and J Wright ldquoRobust principalcomponent analysisrdquo Journal of the ACM vol 58 no 3 article11 2011
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
4 Mathematical Problems in Engineering
Based on this definition, the optimization in (8) can be written as
$$\min_{\mathcal{L},\,\mathcal{S}}\ \|\mathcal{L}\|_* + \eta\|\mathcal{S}\|_1 \equiv \frac{1}{n}\sum_{i=1}^{n}\lambda_i\|L_{(i)}\|_* + \frac{1}{n}\sum_{i=1}^{n}\eta_i\|S_{(i)}\|_1, \quad \text{s.t. } \|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F \le \delta. \tag{10}$$
In order to recover $(\mathcal{L}, \mathcal{S})$, instead of directly solving (10), we solve the following dual problem:
$$\min_{\mathcal{L},\,\mathcal{S}}\ \frac{1}{2\gamma}\|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n}\lambda_i\|L_{(i)}\|_* + \sum_{i=1}^{n}\eta_i\|S_{(i)}\|_1. \tag{11}$$
The problem in (11) is still difficult to solve due to the interdependent nuclear norm and $\ell_1$-norm terms. To simplify the problem, it can be reformulated as follows:
$$\min_{\mathcal{L},\,\mathcal{S},\,M_i,\,N_i}\ \frac{1}{2\gamma}\|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n}\lambda_i\|M_i\|_* + \sum_{i=1}^{n}\eta_i\|N_i\|_1, \quad \text{s.t. } P_i\mathcal{L} = M_i,\ P_i\mathcal{S} = N_i\ \ \forall i, \tag{12}$$
where $P_i$ is the matrix representation of the mode-$i$ unfolding (note that $P_i$ is a permutation matrix, thus $P_i^T P_i = I$), and $M_i$ and $N_i$ are additional auxiliary matrices of the same size as the mode-$i$ unfolding of $\mathcal{L}$ (or $\mathcal{S}$).
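The mode-$i$ unfolding and its inverse can be sketched in NumPy (hypothetical helper functions, not the authors' code); because unfolding only rearranges entries, it is invertible, which is the content of $P_i^T P_i = I$:

```python
import numpy as np

def unfold(T, mode):
    # Mode-i unfolding: bring axis `mode` to the front, then flatten
    # the remaining axes into columns.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    # Inverse of `unfold`: rebuild the tensor of the given shape.
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

T = np.arange(24.0).reshape(2, 3, 4)
M1 = unfold(T, 1)                                # shape (3, 8)
assert np.array_equal(fold(M1, 1, T.shape), T)   # unfolding is invertible
```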
4.3. The Classical ADMM Approach. The classical alternating direction method of multipliers (ADMM) solves structured convex programs of the form
$$\min_{x\in C_x,\ y\in C_y}\ f(x) + g(y), \quad \text{s.t. } Ax + By = c, \tag{13}$$
where $f$ and $g$ are convex functions defined on closed subsets $C_x$ and $C_y$, and $A$, $B$, and $c$ are matrices and a vector of appropriate sizes. The augmented Lagrangian function of (13) is
$$L_A(x, y, w) = f(x) + g(y) + \langle w,\ Ax + By - c\rangle + \frac{\beta}{2}\|Ax + By - c\|_2^2, \tag{14}$$
where $w$ is a Lagrange multiplier vector and $\beta > 0$ is a penalty parameter.
The approach performs one sweep of alternating minimization with respect to $x$ and $y$ individually, then updates the multiplier $w$; at iteration $k$ the steps are given by [18, Equations (4.79)-(4.81)]:
$$x^{(k+1)} \leftarrow \arg\min_{x\in C_x} L_A(x, y^{(k)}, w^{(k)}),$$
$$y^{(k+1)} \leftarrow \arg\min_{y\in C_y} L_A(x^{(k+1)}, y, w^{(k)}),$$
$$w^{(k+1)} \leftarrow w^{(k)} + \rho\beta\,(Ax^{(k+1)} + By^{(k+1)} - c), \tag{15}$$
where $\rho$ is the step length. A convergence result for the above ADMM iteration is the following.

Theorem 1 (see [19, Proposition 5.2]). Assume that the optimal solution set $X^*$ of (13) is nonempty. Furthermore, assume that $C_x$ is bounded, or else that the matrix $A^*A$ is invertible. Then the sequence $\{x^{(k)}, y^{(k)}, w^{(k)}\}$ generated by (15) is bounded, and every limit point of $\{x^{(k)}\}$ is an optimal solution of the original problem (13).
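As a minimal illustration of the three-step iteration in (15) (a toy problem chosen for illustration, not taken from the paper), consider $f(x) = \frac{1}{2}(x-a)^2$ and $g(y) = \frac{1}{2}(y-b)^2$ with the constraint $x - y = 0$, whose solution is $x = y = (a+b)/2$; both subproblems have closed-form minimizers:

```python
# Toy ADMM for: min 0.5*(x-a)^2 + 0.5*(y-b)^2  s.t.  x - y = 0.
a, b = 0.0, 4.0
beta, rho = 1.0, 1.0          # penalty parameter and step length
x = y = w = 0.0
for _ in range(100):
    x = (a - w + beta * y) / (1.0 + beta)   # argmin_x L_A(x, y, w)
    y = (b + w + beta * x) / (1.0 + beta)   # argmin_y L_A(x, y, w)
    w = w + rho * beta * (x - y)            # multiplier update
assert abs(x - (a + b) / 2) < 1e-8 and abs(y - (a + b) / 2) < 1e-8
```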
4.4. ADMM Extension to Tensor Recovery. We observe that (12) is well structured in the sense that a separable structure emerges in both the objective function and the constraints. Thus we propose an algorithm based on an extension of the classical ADMM approach for solving the tensor recovery problem, taking advantage of this favorable structure.
The augmented Lagrangian of (12) is
$$L_A(\mathcal{L}, \mathcal{S}, M_i, N_i) = \frac{1}{2\gamma}\|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n}\left(\lambda_i\|M_i\|_* + \langle Y_i,\ P_i\mathcal{L}-M_i\rangle + \frac{\alpha_i}{2}\|P_i\mathcal{L}-M_i\|_F^2\right) + \sum_{i=1}^{n}\left(\eta_i\|N_i\|_1 + \langle Z_i,\ P_i\mathcal{S}-N_i\rangle + \frac{\beta_i}{2}\|P_i\mathcal{S}-N_i\|_F^2\right), \tag{16}$$
where $Y_i$, $Z_i$ are Lagrange multipliers and $\alpha_i, \beta_i > 0$ are penalty parameters. We can now directly apply ADMM with this augmented Lagrangian function.
Computing $M_i$. With all other variables held constant, the optimal $M_i$ solves the subproblem
$$\min_{M_i}\ \lambda_i\|M_i\|_* + \langle Y_i,\ P_i\mathcal{L}-M_i\rangle + \frac{\alpha_i}{2}\|P_i\mathcal{L}-M_i\|_F^2. \tag{17}$$
As shown in [20], the optimal solution of (17) is given by
$$M_i = U_i D_{\lambda_i/\alpha_i}(\Lambda) V_i^T, \tag{18}$$
where $U_i \Lambda V_i^T$ is the singular value decomposition
$$U_i \Lambda V_i^T = P_i\mathcal{L} + \frac{Y_i}{\alpha_i}, \tag{19}$$
and the "shrinkage" operator $D_\tau(x)$ with $\tau > 0$ is defined as
$$D_\tau(x) = \begin{cases} x - \tau, & \text{if } x > \tau, \\ x + \tau, & \text{if } x < -\tau, \\ 0, & \text{otherwise.} \end{cases} \tag{20}$$
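The update (18)-(20) is the singular value thresholding operator of [20]: shrink the singular values, then rebuild the matrix. A NumPy sketch (illustrative, not the authors' implementation):

```python
import numpy as np

def shrink(x, tau):
    # Elementwise shrinkage operator D_tau from (20).
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    # Singular value thresholding: apply D_tau to the singular values.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

X = np.diag([3.0, 1.0])
# Thresholding at tau = 2 maps singular values (3, 1) to (1, 0).
assert np.allclose(svt(X, 2.0), np.diag([1.0, 0.0]))
```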
Computing $N_i$. With all other variables held constant, the optimal $N_i$ solves the subproblem
$$\min_{N_i}\ \eta_i\|N_i\|_1 + \langle Z_i,\ P_i\mathcal{S}-N_i\rangle + \frac{\beta_i}{2}\|P_i\mathcal{S}-N_i\|_F^2. \tag{21}$$
Input: $n$-mode tensor $\mathcal{A}$. Parameters: $\alpha, \beta, \gamma, \lambda, \eta, \rho$.
(1) Initialization: $\mathcal{L}^{(0)} = \mathcal{A}$, $\mathcal{S}^{(0)} = 0$, $M_i = L_{(i)}$, $N_i = 0$, $k = 1$.
(2) Repeat until convergence:
(3)   for $i = 1$ to $n$
(4)     $M_i^{(k+1)} = U_i D_{\lambda_i/\alpha_i}(\Lambda) V_i^T$, where $U_i \Lambda V_i^T = P_i\mathcal{L}^{(k)} + Y_i^{(k)}/\alpha_i$
(5)     $N_i^{(k+1)} = D_{\eta_i/\beta_i}(P_i\mathcal{S}^{(k)} + Z_i^{(k)}/\beta_i)$
(6)     $Y_i^{(k+1)} = Y_i^{(k)} + \alpha_i (P_i\mathcal{L}^{(k)} - M_i^{(k+1)})$
(7)     $Z_i^{(k+1)} = Z_i^{(k)} + \beta_i (P_i\mathcal{S}^{(k)} - N_i^{(k+1)})$
(8)   end for
(9)   $\mathcal{L}^{(k+1)} = \dfrac{\mathcal{A} - \mathcal{S}^{(k)} - \gamma \sum_{i=1}^{n} P_i^T (Y_i^{(k+1)} - \alpha_i M_i^{(k+1)})}{1 + \gamma \sum_{i=1}^{n} \alpha_i}$
(10)  $\mathcal{S}^{(k+1)} = \dfrac{\mathcal{A} - \mathcal{L}^{(k+1)} - \gamma \sum_{i=1}^{n} P_i^T (Z_i^{(k+1)} - \beta_i N_i^{(k+1)})}{1 + \gamma \sum_{i=1}^{n} \beta_i}$
(11)  $\alpha = \rho\alpha$, $\beta = \rho\beta$
(12)  $k = k + 1$
(13) End
Output: $n$-mode tensors $\mathcal{L}$, $\mathcal{S}$.

Algorithm 1: ADMM-TR (ADMM for tensor recovery).
By the well-known $\ell_1$ minimization [21], the optimal solution of (21) is
$$N_i = D_{\eta_i/\beta_i}\left(P_i\mathcal{S} + \frac{Z_i}{\beta_i}\right), \tag{22}$$
where $D_\tau$ is the "shrinkage" operator applied elementwise.
Computing $\mathcal{L}$. Now we fix all variables except $\mathcal{L}$ and minimize $L_A$ over $\mathcal{L}$. The resulting problem is the minimization of a quadratic function:
$$\min_{\mathcal{L}}\ L_A(\mathcal{L}) = \frac{1}{2\gamma}\|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n}\left(\langle Y_i,\ P_i\mathcal{L}-M_i\rangle + \frac{\alpha_i}{2}\|P_i\mathcal{L}-M_i\|_F^2\right). \tag{23}$$
The objective function is differentiable, so the minimizer $\mathcal{L}_{\min}$ is characterized by $\partial L_A(\mathcal{L})/\partial\mathcal{L} = 0$. Thus we obtain
$$\mathcal{L}_{\min} = \frac{\mathcal{A} - \mathcal{S} - \gamma \sum_{i=1}^{n} P_i^T (Y_i - \alpha_i M_i)}{1 + \gamma \sum_{i=1}^{n} \alpha_i}. \tag{24}$$
Computing $\mathcal{S}$. Now we fix all variables except $\mathcal{S}$ and minimize $L_A$ over $\mathcal{S}$. The resulting problem is again the minimization of a quadratic function:
$$\min_{\mathcal{S}}\ L_A(\mathcal{S}) = \frac{1}{2\gamma}\|\mathcal{A}-\mathcal{L}-\mathcal{S}\|_F^2 + \sum_{i=1}^{n}\left(\langle Z_i,\ P_i\mathcal{S}-N_i\rangle + \frac{\beta_i}{2}\|P_i\mathcal{S}-N_i\|_F^2\right). \tag{25}$$
The objective function is also differentiable, so the minimizer $\mathcal{S}_{\min}$ is characterized by $\partial L_A(\mathcal{S})/\partial\mathcal{S} = 0$. Thus we have
$$\mathcal{S}_{\min} = \frac{\mathcal{A} - \mathcal{L} - \gamma \sum_{i=1}^{n} P_i^T (Z_i - \beta_i N_i)}{1 + \gamma \sum_{i=1}^{n} \beta_i}. \tag{26}$$
For comparison with RSTD [22], we also choose the difference of $\mathcal{L}$ and $\mathcal{S}$ in successive iterations, checked against a certain tolerance, as the stopping criterion. The pseudocode of the proposed ADMM-TR algorithm is summarized in Algorithm 1.
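Algorithm 1 can be sketched compactly in NumPy. This is an illustrative reimplementation, not the authors' MATLAB code: `unfold`/`fold` are hypothetical helpers, the weights `lam` default to 1 (an assumption, since the paper does not list them here), and `rho = 1.05` is an assumed step length.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def shrink(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def admm_tr(A, lam=None, eta=None, rho=1.05, iters=100):
    # Sketch of ADMM-TR: separate A into low-n-rank L and sparse S.
    n = A.ndim
    I = np.array(A.shape, dtype=float)
    alpha = I / I.max()                  # paper's setting: alpha = beta
    beta = alpha.copy()
    gamma = 1.0 / alpha.sum()
    lam = np.ones(n) if lam is None else lam
    eta = np.full(n, 1.0 / np.sqrt(I.max())) if eta is None else eta
    L, S = A.copy(), np.zeros_like(A)
    Y = [np.zeros_like(unfold(A, i)) for i in range(n)]
    Z = [np.zeros_like(unfold(A, i)) for i in range(n)]
    for _ in range(iters):
        M = [svt(unfold(L, i) + Y[i] / alpha[i], lam[i] / alpha[i]) for i in range(n)]
        N = [shrink(unfold(S, i) + Z[i] / beta[i], eta[i] / beta[i]) for i in range(n)]
        for i in range(n):
            Y[i] = Y[i] + alpha[i] * (unfold(L, i) - M[i])
            Z[i] = Z[i] + beta[i] * (unfold(S, i) - N[i])
        L = (A - S - gamma * sum(fold(Y[i] - alpha[i] * M[i], i, A.shape)
                                 for i in range(n))) / (1 + gamma * alpha.sum())
        S = (A - L - gamma * sum(fold(Z[i] - beta[i] * N[i], i, A.shape)
                                 for i in range(n))) / (1 + gamma * beta.sum())
        alpha, beta = rho * alpha, rho * beta   # increase penalties
    return L, S
```

Each iteration performs one SVD per mode, matching the per-iteration cost discussed in Section 5.1.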
Theorem 2 (convergence of ADMM-TR). Assume that the optimal solution set $X^*$ of (11) is nonempty. Then the sequence $\{\mathcal{L}^{(k)}, \mathcal{S}^{(k)}, M_i^{(k)}, N_i^{(k)}, Y_i^{(k)}, Z_i^{(k)}\}$ generated by the proposed ADMM-TR algorithm is bounded, and every limit point of $\{\mathcal{L}^{(k)}, \mathcal{S}^{(k)}\}$ is an optimal solution of the original problem (11).

Proof. We check the assumptions of Theorem 1. $C_x$ is not bounded, but $P_i^* P_i = I$, so the constraint operator composed with its adjoint is a constant multiple of the identity operator and hence invertible. Thus Theorem 1 also applies to ADMM-TR, and Theorem 2 follows.
5 Numerical Experiments
This section evaluates the empirical performance of the proposed algorithm on synthetic data and compares the results with RSTD (Rank Sparsity Tensor Decomposition) [22]. In addition, experiments on traffic volume data outlier recovery illustrate the efficiency of the proposed method in the traffic research field.
We use the Lanczos algorithm for computing the singular value decomposition and adopt the same rule for predicting
Table 2: Comparison of ADMM-TR with RSTD for synthetic data, where $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$, $n$-rank = [5, 5, 5].

spr  | ADMM-TR: RSE L0 (e-3) | RSE S0 (e-3) | iter | Time (s) | RSTD: RSE L0 (e-3) | RSE S0 (e-3) | iter | Time (s)
0.05 | 4.3  | 4.3 | 133 | 13.7 | 4.6  | 4.6  | 208  | 17.8
0.15 | 9.8  | 4.2 | 162 | 20.8 | 10.5 | 4.7  | 235  | 33.6
0.25 | 12.3 | 4.3 | 514 | 44.9 | 15.9 | 5.2  | 676  | 51.0
0.35 | 56.5 | 9.5 | 654 | 53.3 | 67.5 | 11.4 | 737  | 57.5

Table 3: Comparison of ADMM-TR with RSTD for synthetic data, where $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$, $n$-rank = [10, 10, 10].

spr  | ADMM-TR: RSE L0 (e-3) | RSE S0 (e-3) | iter | Time (s) | RSTD: RSE L0 (e-3) | RSE S0 (e-3) | iter | Time (s)
0.05 | 4.4  | 2.0 | 236 | 26.4 | 4.7  | 2.5 | 338  | 29.9
0.15 | 4.7  | 2.1 | 417 | 42.1 | 5.2  | 2.5 | 603  | 50.8
0.25 | 8.9  | 2.9 | 664 | 60.8 | 12.0 | 4.2 | 663  | 53.6
0.35 | 17.3 | 5.2 | 981 | 80.6 | 21.1 | 6.5 | 1110 | 87.7
the dimension of the principal singular space as in [22]. The parameters are set as $\alpha = \beta = [I_1/I_{\max}, I_2/I_{\max}, \ldots, I_n/I_{\max}]^T$ and $\gamma = 1/\sum_{i=1}^{n}(I_i/I_{\max})$ for all experiments, where $I_{\max} = \max_i\{I_i\}$; $\eta$ is set to $1/\sqrt{I_{\max}}$, as suggested in [23].
All the experiments are conducted and timed on the same desktop with a Pentium(R) Dual-Core 2.50 GHz CPU and 4 GB of memory, running Windows 7 and MATLAB.
5.1. Synthetic Data. ADMM-TR and RSTD are tested on synthetic data of size $40\times40\times40$. We generate a "core tensor" $\mathcal{C} \in \mathbb{R}^{r\times\cdots\times r}$ filled with Gaussian distributed entries ($\sim\mathcal{N}(0,1)$). Then we generate matrices $U^{(1)}, \ldots, U^{(N)}$ with $U^{(i)} \in \mathbb{R}^{n_i\times r}$, whose elements are also i.i.d. Gaussian random variables ($\sim\mathcal{N}(0,1)$), and set
$$\mathcal{L}_0 = \mathcal{C} \times_1 U^{(1)} \times_2 \cdots \times_N U^{(N)}. \tag{27}$$
The entries of the sparse tensor $\mathcal{S}_0$ are independently distributed, each taking the value 0 with probability $1 - spr$ and an impulsive value with probability $spr$. We apply the proposed algorithm to the tensor $\mathcal{A}_0 = \mathcal{L}_0 + \mathcal{S}_0$ to recover $\mathcal{L}$ and $\mathcal{S}$, and compare with RSTD. For these experiments, two cases of $n$-rank are investigated: $n$-rank = [5, 5, 5] and $n$-rank = [10, 10, 10]. Tables 2 and 3 present the average results (across 30 instances) for different $spr$.
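The construction in (27) can be sketched as follows (a hypothetical script mirroring the setup at a smaller size for speed); each mode-$i$ unfolding of the resulting $\mathcal{L}_0$ has rank $r$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, spr = 20, 5, 0.05            # tensor size, n-rank, sparsity ratio

# Core tensor C in R^{r x r x r} with N(0,1) entries.
C = rng.standard_normal((r, r, r))

# Mode-i products with factor matrices U^(i) in R^{n x r}.
L0 = C
for i in range(3):
    U = rng.standard_normal((n, r))
    # Contract U against mode i, then move the new axis back to position i.
    L0 = np.moveaxis(np.tensordot(U, np.moveaxis(L0, i, 0), axes=(1, 0)), 0, i)

# Sparse impulsive corruption: each entry nonzero with probability spr.
S0 = rng.standard_normal(L0.shape) * 10 * (rng.random(L0.shape) < spr)
A0 = L0 + S0

# The mode-0 unfolding of L0 has rank exactly r.
assert np.linalg.matrix_rank(L0.reshape(n, -1)) == r
```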
The quality of recovery is measured by the relative square error (RSE) to $\mathcal{L}_0$ and $\mathcal{S}_0$, defined as
$$\text{RSE } \mathcal{L}_0 = \frac{\|\mathcal{L}-\mathcal{L}_0\|_F}{\|\mathcal{L}_0\|_F}, \qquad \text{RSE } \mathcal{S}_0 = \frac{\|\mathcal{S}-\mathcal{S}_0\|_F}{\|\mathcal{S}_0\|_F}. \tag{28}$$
Tables 2 and 3 show that the proposed algorithm (ADMM-TR) is about 10 percent faster than the RSTD algorithm proposed in [22] and achieves better accuracy in terms of relative square error. Though both algorithms compute an SVD per iteration, we observe that the proposed algorithm takes much fewer iterations than RSTD to converge to the optimal solution.
As more impulsive entries are added, that is, for higher values of $spr$, the tensor recovery problem becomes harder. In addition, the problem becomes more difficult when the $n$-rank is higher for ground-truth tensors of the same size. In Tables 2 and 3, different $spr$ values are set for the two tensor cases, and the results show that the recovery accuracy decreases as $spr$ grows in each case. In particular, the recovery accuracy for the tensor $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$ with $n$-rank = [5, 5, 5] decreases sharply when $spr$ reaches 40%, while the same phenomenon occurs for the tensor $\mathcal{A}_0 \in \mathbb{R}^{40\times40\times40}$ with $n$-rank = [10, 10, 10] when $spr$ is about 25%. This is because a relatively low rank and a low sparsity ratio are preconditions of the tensor recovery problem.
5.2. Traffic Volume Data. To evaluate the performance of the proposed method in traffic volume data outlier recovery, a complete traffic volume data set is used as the test data set. We use the data of a fixed point in Sacramento County, downloaded from http://pems.dot.ca.gov. The traffic volume data are recorded every 5 minutes; therefore, a daily traffic volume series for a loop detector contains 288 records. The whole period of the data lasts for 16 days, from August 2 to August 17, 2010.
Based on the multiple correlations of the traffic volume data, we model the data set as a tensor of size $16\times24\times12$, which stands for 16 days, 24 hours in a day, and 12 sample intervals (i.e., 5-minute records) per hour. The ratios of outlier data are set from 5% to 15%, and the outlier data are produced randomly. All the results are averaged over 10 instances.
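The day x hour x interval tensor model is a simple reshape of the 5-minute series (a hypothetical illustration with a stand-in series; entry $(d, h, m)$ is the $m$-th 5-minute count of hour $h$ on day $d$):

```python
import numpy as np

days, hours, per_hour = 16, 24, 12            # 16 days x 288 records/day
series = np.arange(days * hours * per_hour, dtype=float)  # stand-in volume series

tensor = series.reshape(days, hours, per_hour)

# Entry (d, h, m) is record d*288 + h*12 + m of the original series.
d, h, m = 3, 17, 5
assert tensor[d, h, m] == series[d * hours * per_hour + h * per_hour + m]
```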
Figure 1: Comparison of raw traffic volume data, data corrupted by outliers at a 5% ratio, and data recovered by ADMM-TR. [Plot: traffic volume (veh/5 min) versus time interval (5 min).]
Figure 2: Comparison of raw traffic volume data, data corrupted by outliers at a 10% ratio, and data recovered by ADMM-TR. [Plot: traffic volume (veh/5 min) versus time interval (5 min).]
For the real-world data we mainly pay attention to the correctness of the recovered traffic volume data. Thus the quality of recovery is measured by the relative square error (RSE) to $\mathcal{L}_0$ and the mean absolute percentage error (MAPE) to $\mathcal{L}_0$, defined as
$$\text{RSE } \mathcal{L}_0 = \frac{\|\mathcal{L}-\mathcal{L}_0\|_F}{\|\mathcal{L}_0\|_F}, \qquad \text{MAPE } \mathcal{L}_0 = \frac{1}{M}\sum_{m=1}^{M}\left|\frac{t_r^{(m)} - t_e^{(m)}}{t_r^{(m)}}\right|, \tag{29}$$
Figure 3: Comparison of raw traffic volume data, data corrupted by outliers at a 15% ratio, and data recovered by ADMM-TR. [Plot: traffic volume (veh/5 min) versus time interval (5 min).]
Table 4: Traffic volume outlier data before and after recovery.

spr  | Before: RSE L0 (e-3) | MAPE L0 (e-3) | After: RSE L0 (e-3) | MAPE L0 (e-3)
0.05 | 0.5005 | 0.2136 | 0.0961 | 0.0314
0.10 | 0.7066 | 0.5008 | 0.1417 | 0.1027
0.15 | 0.8850 | 0.8777 | 0.2221 | 0.2351
where $t_r^{(m)}$ and $t_e^{(m)}$ are the $m$th elements standing for the known real value and the recovered value, respectively, and $M$ denotes the number of recovered traffic volumes.
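The two error measures in (29) are straightforward to compute (hypothetical helper functions, shown for concreteness):

```python
import numpy as np

def rse(L_hat, L0):
    # Relative square error: ||L_hat - L0||_F / ||L0||_F.
    return np.linalg.norm(L_hat - L0) / np.linalg.norm(L0)

def mape(t_r, t_e):
    # Mean absolute percentage error of estimates t_e against real values t_r.
    t_r, t_e = np.asarray(t_r, float), np.asarray(t_e, float)
    return np.mean(np.abs((t_r - t_e) / t_r))

# Example: errors of 10 on 100 and 200 give (0.10 + 0.05) / 2 = 0.075.
assert abs(mape([100.0, 200.0], [110.0, 190.0]) - 0.075) < 1e-12
```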
Table 4 presents the relative errors of traffic volume outlier data before and after being recovered by ADMM-TR. The results show that the RSE $\mathcal{L}_0$ and MAPE $\mathcal{L}_0$ for traffic volume data corrupted by outliers are about 5 times those of the data after recovery by ADMM-TR. Figures 1, 2, and 3 present the profiles of traffic volume data for a day. The results show that ADMM-TR recovers the traffic volume outlier data with excellent performance.
6. Conclusions

In this paper, we concentrated on the mathematical problem of traffic volume outlier data recovery and proposed a novel tensor recovery method based on the alternating direction method of multipliers (ADMM). The proposed algorithm automatically separates the low-$n$-rank tensor data from the sparse part. The experiments show that the proposed method is more stable and accurate in most cases and has an excellent convergence rate. Experiments on real-world traffic volume data demonstrate the practicability and effectiveness of the proposed method in the traffic research domain.
In the future, we would like to investigate how to automatically choose the parameters in our algorithm and explore additional applications of our method in the traffic research domain.
Acknowledgments

This research was supported by NSFC (Grants nos. 61271376, 61171118, and 91120015) and the Beijing Natural Science Foundation (4122067). The authors would like to thank Professor Bin Ran from the University of Wisconsin-Madison and Yong Li from the University of Notre Dame for the suggestive discussions.
References
[1] K. Tamura and M. Hirayama, "Toward realization of VICS: vehicle information and communications system," in Proceedings of the IEEE-IEE Vehicle Navigation and Information Systems Conference, pp. 72–77, October 1993.
[2] S. Kim, M. E. Lewis, and C. C. White III, "Optimal vehicle routing with real-time traffic information," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 178–188, 2005.
[3] P. Mirchandani and L. Head, "A real-time traffic signal control system: architecture, algorithms, and analysis," Transportation Research Part C, vol. 9, no. 6, pp. 415–432, 2001.
[4] B. M. Williams, P. K. Durvasula, and D. E. Brown, "Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models," Transportation Research Record, no. 1644, pp. 132–141, 1998.
[5] J. Xu, X. Li, and H. Shi, "Short-term traffic flow forecasting model under missing data," Journal of Computer Applications, vol. 30, no. 4, pp. 1117–1120, 2010.
[6] Y. Pei and J. Ma, "Real-time traffic data screening and reconstruction," China Civil Engineering Journal, vol. 36, no. 7, pp. 78–83, 2003.
[7] S. Chen, W. Wang, and W. Li, "Noise recognition and noise reduction of real-time traffic data," Dongnan Daxue Xuebao, vol. 36, no. 2, pp. 322–325, 2006.
[8] B.-G. Daniel, J. D.-P. Francisco, G.-O. David, et al., "Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks," Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 7, pp. 530–545, 2010.
[9] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, Article ID 025010, 2011.
[10] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
[11] A. S. Lewis and G. Knowles, "Image compression using the 2-D wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 244–250, 1992.
[12] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
[13] J. Wright, Y. Peng, Y. Ma, and A. Ganesh, "Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization," in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080–2088, Vancouver, Canada, 2009.
[14] A. Ganesh, Z. Lin, J. Wright, L. Wu, M. Chen, and Y. Ma, "Fast algorithms for recovering a corrupted low-rank matrix," in Proceedings of the 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '09), pp. 213–216, December 2009.
[15] Z. Lin, M. Chen, L. Wu, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Mathematical Programming, 2009.
[16] X. Yuan and J. Yang, "Sparse and low-rank matrix decomposition via alternating direction methods," Tech. Rep., Department of Mathematics, Hong Kong Baptist University, 2009.
[17] L. Qu, Y. Zhang, J. Hu, L. Jia, and L. Li, "A BPCA based missing value imputing method for traffic flow volume data," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 985–990, June 2008.
[18] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in Proceedings of the International Conference on Computer Vision (ICCV '09), 2009.
[19] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Prentice-Hall, 1989.
[20] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[21] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for ℓ1-minimization: methodology and convergence," SIAM Journal on Optimization, vol. 19, no. 3, pp. 1107–1130, 2008.
[22] Y. Li, J. Yan, Y. Zhou, and J. Yang, "Optimum subspace learning and error correction for tensors," in Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Crete, Greece, 2010.
[23] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis," Journal of the ACM, vol. 58, no. 3, article 11, 2011.
Mathematical Problems in Engineering 5
Input: n-mode tensor A.
Parameters: α, β, γ, λ, η, ρ.
(1) Initialization: L^(0) = A, S^(0) = 0, M_i = L_(i), N_i = 0, k = 1.
(2) Repeat until convergence:
(3)   for i = 1 to n
(4)     M_i^(k+1) = U_i D_{λ_i/α_i}(Λ) V_i^T, where U_i Λ V_i^T = P_i(L^(k)) + Y_i^(k)/α_i
(5)     N_i^(k+1) = D_{η_i/β_i}(P_i(S^(k)) + Z_i^(k)/β_i)
(6)     Y_i^(k+1) = Y_i^(k) + α_i (P_i(L^(k)) − M_i^(k+1))
(7)     Z_i^(k+1) = Z_i^(k) + β_i (P_i(S^(k)) − N_i^(k+1))
(8)   end for
(9)   L^(k+1) = [A − S^(k) − γ Σ_{i=1}^{n} P_i^T(Y_i^(k+1) − α_i M_i^(k+1))] / (1 + γ Σ_{i=1}^{n} α_i)
(10)  S^(k+1) = [A − L^(k+1) − γ Σ_{i=1}^{n} P_i^T(Z_i^(k+1) − β_i N_i^(k+1))] / (1 + γ Σ_{i=1}^{n} β_i)
(11)  α = ρα, β = ρβ
(12)  k = k + 1
(13) End
Output: n-mode tensors L, S.

Algorithm 1: ADMM-TR (ADMM for tensor recovery).
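Throughout Algorithm 1, P_i denotes the mode-i unfolding operator and P_i^T its adjoint (refolding). A minimal numpy sketch of this pair, under one common unfolding convention (the column ordering may differ from the paper's, but P_i^T P_i = I holds either way; helper names are ours):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-i unfolding P_i: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Adjoint P_i^T: invert the unfolding back to a tensor of the given shape."""
    front = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape(front), 0, mode)
```

By construction, fold(unfold(A, i), i, A.shape) returns A exactly, which is the identity P_i^T P_i = I used in the convergence proof below.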
By the well-known ℓ1-minimization [21], the optimal solution of (21) is

N_i = D_{η_i/β_i}(P_i(S) + Z_i/β_i),  (22)

where D_τ is the "shrinkage" operator.
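The shrinkage operator D_τ of (22), and its singular-value version used in step (4) of Algorithm 1, can be sketched as follows (an illustrative implementation, not the authors' code):

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding: D_tau(X) = sign(X) * max(|X| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding U D_tau(Lambda) V^T, as in step (4)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt
```

Applying `shrink` to the singular values rather than the entries is what drives M_i toward a low-rank matrix while N_i is driven toward a sparse one.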
Computing L. Now we fix all variables except L and minimize L_A over L. The resulting minimization problem is the minimization of a quadratic function:

min_L L_A(L) = (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} ( ⟨Y_i, P_i(L) − M_i⟩ + (α_i/2) ‖P_i(L) − M_i‖_F^2 ).  (23)

The objective function is differentiable, so the minimizer L_min is characterized by ∂L_A(L)/∂L = 0. Thus we obtain

L_min = [A − S − γ Σ_{i=1}^{n} P_i^T(Y_i − α_i M_i)] / (1 + γ Σ_{i=1}^{n} α_i).  (24)
Computing S. Now we fix all variables except S and minimize L_A over S. The resulting minimization problem is the minimization of a quadratic function:

min_S L_A(S) = (1/2γ) ‖A − L − S‖_F^2 + Σ_{i=1}^{n} ( ⟨Z_i, P_i(S) − N_i⟩ + (β_i/2) ‖P_i(S) − N_i‖_F^2 ).  (25)

The objective function is also differentiable, so the minimizer S_min is characterized by ∂L_A(S)/∂S = 0. Thus we have

S_min = [A − L − γ Σ_{i=1}^{n} P_i^T(Z_i − β_i N_i)] / (1 + γ Σ_{i=1}^{n} β_i).  (26)
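Because P_i^T P_i = I, the closed-form update (24) reduces to a weighted combination of folded dual terms, and (26) is identical with (Z_i, N_i, β_i) in place of (Y_i, M_i, α_i). A numpy sketch of (24) under that convention (helper and variable names are ours):

```python
import numpy as np

def unfold(T, mode):
    # mode-i unfolding P_i
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    # adjoint P_i^T
    front = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(front), 0, mode)

def update_L(A, S, M_list, Y_list, alpha, gamma):
    """Eq. (24): L = (A - S - gamma * sum_i P_i^T(Y_i - alpha_i M_i)) / (1 + gamma * sum_i alpha_i)."""
    num = A - S
    for i, (Mi, Yi, ai) in enumerate(zip(M_list, Y_list, alpha)):
        num -= gamma * fold(Yi - ai * Mi, i, A.shape)
    return num / (1.0 + gamma * sum(alpha))
```

As a sanity check, if S = 0, every Y_i = 0, and every M_i = P_i(A), the update returns A itself, since each folded term contributes A back to the numerator.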
For comparison with RSTD [22], we also use the change in L and S between successive iterations, checked against a given tolerance, as the stopping criterion. The pseudocode of the proposed ADMM-TR algorithm is summarized in Algorithm 1.
Theorem 2 (convergence of ADMM-TR). Assume that the optimal solution set X* of (11) is nonempty. The sequence {L^(k), S^(k), M_i^(k), N_i^(k), Y_i^(k), Z_i^(k)} generated by the proposed ADMM-TR algorithm is bounded, and every limit point of {L^(k), S^(k)} is an optimal solution of the original problem (11).
Proof. We check the assumptions of Theorem 1. C_x is not bounded, but P_i^* P_i = I is a constant multiple of the identity operator. Thus Theorem 1 can also be applied to ADMM-TR, and Theorem 2 follows.
5. Numerical Experiments

This section evaluates the empirical performance of the proposed algorithm on synthetic data and compares the results with RSTD (Rank Sparsity Tensor Decomposition) [22]. Experiments on traffic volume data outlier recovery also illustrate the efficiency of the proposed method in the traffic research field.

We use the Lanczos algorithm for computing the singular value decomposition and adopt the same rule for predicting
Table 2: Comparison of ADMM-TR with RSTD for synthetic data, where A0 ∈ R^{40×40×40}, n-rank = [5, 5, 5].

        ---------------- ADMM-TR ----------------   ----------------- RSTD ------------------
spr     RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)   RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)
0.05    4.3           4.3           133   13.7       4.6           4.6           208   17.8
0.15    9.8           4.2           162   20.8       10.5          4.7           235   33.6
0.25    12.3          4.3           514   44.9       15.9          5.2           676   51.0
0.35    56.5          9.5           654   53.3       67.5          11.4          737   57.5
Table 3: Comparison of ADMM-TR with RSTD for synthetic data, where A0 ∈ R^{40×40×40}, n-rank = [10, 10, 10].

        ---------------- ADMM-TR ----------------   ----------------- RSTD ------------------
spr     RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)   RSE L0 (e-3)  RSE S0 (e-3)  iter  Time (s)
0.05    4.4           2.0           236   26.4       4.7           2.5           338   29.9
0.15    4.7           2.1           417   42.1       5.2           2.5           603   50.8
0.25    8.9           2.9           664   60.8       12.0          4.2           663   53.6
0.35    17.3          5.2           981   80.6       21.1          6.5           1110  87.7
the dimension of the principal singular space as in [22]. The parameters are set as α = β = [I_1/I_max, I_2/I_max, ..., I_n/I_max]^T and γ = 1/Σ([I_1/I_max, I_2/I_max, ..., I_n/I_max]^T) for all experiments, where I_max = max{I_i}; η is set to 1/√I_max, as suggested in [23].

All the experiments are conducted and timed on the same desktop with a Pentium(R) Dual-Core 2.50 GHz CPU and 4 GB of memory, running Windows 7 and MATLAB.
5.1. Synthetic Data. ADMM-TR and RSTD are tested on synthetic data of size 40 × 40 × 40. We generate a "core tensor" C ∈ R^{r×···×r}, which we fill with Gaussian-distributed entries (∼ N(0, 1)). Then we generate matrices U^(1), ..., U^(N) with U^(i) ∈ R^{n_i×r}, whose elements are also i.i.d. Gaussian random variables (∼ N(0, 1)), and set

L0 = C ×_1 U^(1) ×_2 ··· ×_N U^(N).  (27)

The entries of the sparse tensor S0 are independently distributed, each taking the value 0 with probability 1 − spr and an impulsive value with probability spr. We apply the proposed algorithm to the tensor A0 = L0 + S0 to recover L and S, and compare with RSTD. For these experiments, two cases of n-rank are investigated: n-rank = [5, 5, 5] and n-rank = [10, 10, 10]. Tables 2 and 3 present the average results (across 30 instances) for different spr.
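The generation procedure of (27), plus the sparse outlier tensor, can be sketched with numpy's `einsum` as follows (the impulse magnitude range is an assumption, since the paper does not specify it):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, spr = 40, 5, 0.05

# Core tensor C in R^{r x r x r} and factors U^(i) in R^{n x r}, i.i.d. N(0, 1)
C = rng.standard_normal((r, r, r))
U1, U2, U3 = (rng.standard_normal((n, r)) for _ in range(3))

# L0 = C x_1 U^(1) x_2 U^(2) x_3 U^(3): Tucker-style mode products, n-rank <= [r, r, r]
L0 = np.einsum('abc,ia,jb,kc->ijk', C, U1, U2, U3)

# Sparse outlier tensor S0: each entry impulsive with probability spr
mask = rng.random((n, n, n)) < spr
S0 = np.where(mask, rng.uniform(-10.0, 10.0, (n, n, n)), 0.0)  # magnitude is an assumption

A0 = L0 + S0  # the observed tensor fed to the recovery algorithm
```

Every mode-i unfolding of L0 built this way has rank at most r, which is exactly the low-n-rank structure the method exploits.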
The quality of recovery is measured by the relative square error (RSE) to L0 and S0, defined to be

RSE L0 = ‖L − L0‖_F / ‖L0‖_F,  RSE S0 = ‖S − S0‖_F / ‖S0‖_F.  (28)
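A direct transcription of (28) (hypothetical helper name `rse`):

```python
import numpy as np

def rse(X_hat, X0):
    """Relative square error, eq. (28): ||X_hat - X0||_F / ||X0||_F."""
    return np.linalg.norm(X_hat - X0) / np.linalg.norm(X0)
```

`np.linalg.norm` with no `ord` argument computes the 2-norm of the flattened array, which coincides with the Frobenius norm for tensors of any order.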
Tables 2 and 3 show that the proposed algorithm (ADMM-TR) is about 10 percent faster than the RSTD algorithm proposed in [22] and achieves better accuracy in terms of relative square error. Though both algorithms compute an SVD per iteration, we observe that the proposed algorithm takes far fewer iterations than RSTD to converge to the optimal solution.
The more impulsive entries are added, that is, the higher the value of spr, the harder the tensor recovery problem becomes. In addition, the problem becomes more difficult when the n-rank is higher for ground-truth tensors of the same size. In Tables 2 and 3, different spr values are set for the two tensor cases, and the results show that the recovery accuracy decreases as spr grows in each case. In particular, the recovery accuracy for the tensor A0 ∈ R^{40×40×40} with n-rank = [5, 5, 5] decreases sharply as spr rises toward 40%, while the same phenomenon occurs for the tensor A0 ∈ R^{40×40×40} with n-rank = [10, 10, 10] when spr is about 25%. This is because a relatively low rank and a low sparsity ratio are preconditions of the tensor recovery problem.
5.2. Traffic Volume Data. To evaluate the performance of the proposed method on traffic volume data outlier recovery, a complete traffic volume data set is used as the test data set. We use the data of a fixed point in Sacramento County, downloaded from http://pems.dot.ca.gov. The traffic volume data are recorded every 5 minutes; therefore, a daily traffic volume series for a loop detector contains 288 records, and the whole period of the data lasts for 16 days, from August 2 to August 17, 2010.
Based on the multiple correlations of the traffic volume data, we model the data set as a tensor of size 16 × 24 × 12, which stands for 16 days, 24 hours per day, and 12 sample intervals (i.e., 5-minute records) per hour. The ratios of outlier data are set from 5% to 15%, and the outlier data are produced randomly. All results are averaged over 10 instances.
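Arranging the 16-day, 5-minute series into the 16 × 24 × 12 tensor is a single reshape once the records are ordered chronologically; a sketch with a synthetic stand-in series:

```python
import numpy as np

days, hours, slots = 16, 24, 12  # 16 days, 24 hours/day, 12 five-minute slots/hour

# Stand-in for the 16 * 288 = 4608 chronologically ordered volume records
series = np.arange(days * hours * slots, dtype=float)

# Entry [d, h, s] is the volume on day d, hour h, five-minute slot s
volume = series.reshape(days, hours, slots)
```

This axis ordering is what exposes the day-to-day and hour-to-hour correlations as low-rank structure along the tensor modes.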
[Figure 1: Comparison of the raw traffic volume data, the data corrupted by outliers at a 5% ratio, and the data recovered by ADMM-TR; traffic volume (veh/5 min) versus time interval (5 min).]
[Figure 2: Comparison of the raw traffic volume data, the data corrupted by outliers at a 10% ratio, and the data recovered by ADMM-TR; traffic volume (veh/5 min) versus time interval (5 min).]
For the real-world data, we mainly pay attention to the correctness of the recovered traffic volume data. Thus, the quality of recovery is measured by the relative square error (RSE) to L0 and the mean absolute percentage error (MAPE) to L0, defined to be

RSE L0 = ‖L − L0‖_F / ‖L0‖_F,

MAPE L0 = (1/M) Σ_{m=1}^{M} | (t_r^(m) − t_e^(m)) / t_r^(m) |,  (29)
[Figure 3: Comparison of the raw traffic volume data, the data corrupted by outliers at a 15% ratio, and the data recovered by ADMM-TR; traffic volume (veh/5 min) versus time interval (5 min).]
Table 4: Traffic volume outlier data before and after being recovered.

        ---- Before recovery ----      ---- After recovery ----
spr     RSE L0 (e-3)  MAPE L0 (e-3)    RSE L0 (e-3)  MAPE L0 (e-3)
0.05    0.5005        0.2136           0.0961        0.0314
0.10    0.7066        0.5008           0.1417        0.1027
0.15    0.8850        0.8777           0.2221        0.2351
where t_r^(m) and t_e^(m) are the m-th elements standing for the known real value and the recovered value, respectively, and M denotes the number of recovered traffic volumes.
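The MAPE of (29) can be transcribed as follows (hypothetical helper name; assumes no zero real values in the denominator):

```python
import numpy as np

def mape(t_rec, t_real):
    """Mean absolute percentage error, eq. (29), over the M recovered volumes."""
    t_rec = np.asarray(t_rec, dtype=float)
    t_real = np.asarray(t_real, dtype=float)
    return np.mean(np.abs((t_real - t_rec) / t_real))
```

Unlike the Frobenius-norm RSE, MAPE weights each 5-minute interval equally, so it is less dominated by peak-hour volumes.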
Table 4 presents the relative errors of the traffic volume outlier data before and after recovery by ADMM-TR. The results show that the RSE L0 and MAPE L0 of the traffic volume data corrupted by outliers are about 5 times those of the data after recovery by ADMM-TR. Figures 1, 2, and 3 present the profiles of the traffic volume data for a day. The results show that ADMM-TR recovers the traffic volume outlier data with very good performance.
6. Conclusions

In this paper, we concentrated on the mathematical problem in traffic volume outlier data recovery and proposed a novel tensor recovery method based on the alternating direction method of multipliers (ADMM). The proposed algorithm automatically separates the low-n-rank tensor data from the sparse part. The experiments show that the proposed method is more stable and accurate in most cases and has an excellent convergence rate. Experiments on real-world traffic volume data demonstrate the practicability and effectiveness of the proposed method in the traffic research domain.

In the future, we would like to investigate how to automatically choose the parameters of our algorithm and to explore additional applications of our method in the traffic research domain.
Acknowledgments

This research was supported by the NSFC (Grants nos. 61271376, 61171118, and 91120015) and the Beijing Natural Science Foundation (4122067). The authors would like to thank Professor Bin Ran from the University of Wisconsin-Madison and Yong Li from the University of Notre Dame for the helpful discussions.
References
[1] K. Tamura and M. Hirayama, "Toward realization of VICS—vehicle information and communications system," in Proceedings of the IEEE-IEE Vehicle Navigation and Informations Systems Conference, pp. 72–77, October 1993.
[2] S. Kim, M. E. Lewis, and C. C. White III, "Optimal vehicle routing with real-time traffic information," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 178–188, 2005.
[3] P. Mirchandani and L. Head, "A real-time traffic signal control system: architecture, algorithms, and analysis," Transportation Research Part C, vol. 9, no. 6, pp. 415–432, 2001.
[4] B. M. Williams, P. K. Durvasula, and D. E. Brown, "Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models," Transportation Research Record, no. 1644, pp. 132–141, 1998.
[5] J. Xu, X. Li, and H. Shi, "Short-term traffic flow forecasting model under missing data," Journal of Computer Applications, vol. 30, no. 4, pp. 1117–1120, 2010.
[6] Y. Pei and J. Ma, "Real-time traffic data screening and reconstruction," China Civil Engineering Journal, vol. 36, no. 7, pp. 78–83, 2003.
[7] S. Chen, W. Wang, and W. Li, "Noise recognition and noise reduction of real-time traffic data," Dongnan Daxue Xuebao, vol. 36, no. 2, pp. 322–325, 2006.
[8] B.-G. Daniel, J. D.-P. Francisco, G.-O. David, et al., "Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks," Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 7, pp. 530–545, 2010.
[9] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, Article ID 025010, 2011.
[10] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
[11] A. S. Lewis and G. Knowles, "Image compression using the 2-D wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 244–250, 1992.
[12] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
[13] J. Wright, Y. Peng, Y. Ma, and A. Ganesh, "Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization," in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080–2088, Vancouver, Canada, 2009.
[14] A. Ganesh, Z. Lin, J. Wright, L. Wu, M. Chen, and Y. Ma, "Fast algorithms for recovering a corrupted low-rank matrix," in Proceedings of the 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '09), pp. 213–216, December 2009.
[15] Z. Lin, M. Chen, L. Wu, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Mathematical Programming, 2009.
[16] X. Yuan and J. Yang, "Sparse and low-rank matrix decomposition via alternating direction methods," Tech. Rep., Department of Mathematics, Hong Kong Baptist University, 2009.
[17] L. Qu, Y. Zhang, J. Hu, L. Jia, and L. Li, "A BPCA based missing value imputing method for traffic flow volume data," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 985–990, June 2008.
[18] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in Proceedings of the International Conference on Computer Vision (ICCV '09), 2009.
[19] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Prentice-Hall, 1989.
[20] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[21] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for ℓ1-minimization: methodology and convergence," SIAM Journal on Optimization, vol. 19, no. 3, pp. 1107–1130, 2008.
[22] Y. Li, J. Yan, Y. Zhou, and J. Yang, "Optimum subspace learning and error correction for tensors," in Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Crete, Greece, 2010.
[23] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" Journal of the ACM, vol. 58, no. 3, article 11, 2011.
Figure 1: Comparison of raw traffic volume data, data corrupted by outliers at a 5% ratio, and data recovered by ADMM-TR (traffic volume in veh/5 min versus time interval of 5 min).
Figure 2: Comparison of raw traffic volume data, data corrupted by outliers at a 10% ratio, and data recovered by ADMM-TR (traffic volume in veh/5 min versus time interval of 5 min).
For the real world data, we mainly pay attention to the correctness of the recovered traffic volume data. Thus, the quality of recovery is measured by the relative square error (RSE) to $\mathcal{L}_0$ and the mean absolute percentage error (MAPE) to $\mathcal{L}_0$, defined as

$$\mathrm{RSE}_{\mathcal{L}_0} = \frac{\left\| \mathcal{L} - \mathcal{L}_0 \right\|_F}{\left\| \mathcal{L}_0 \right\|_F}, \qquad \mathrm{MAPE}_{\mathcal{L}_0} = \frac{1}{M} \sum_{m=1}^{M} \left| \frac{t_r^{(m)} - t_e^{(m)}}{t_r^{(m)}} \right| \tag{29}$$
Figure 3: Comparison of raw traffic volume data, data corrupted by outliers at a 15% ratio, and data recovered by ADMM-TR (traffic volume in veh/5 min versus time interval of 5 min).
Table 4: Traffic volume outlier data before and after being recovered.

    s_pr     Before recovery                  After recovery
             RSE_L0 (e-3)   MAPE_L0 (e-3)    RSE_L0 (e-3)   MAPE_L0 (e-3)
    0.05     0.5005         0.2136           0.0961         0.0314
    0.10     0.7066         0.5008           0.1417         0.1027
    0.15     0.8850         0.8777           0.2221         0.2351
where $t_r^{(m)}$ and $t_e^{(m)}$ are the $m$th elements, which stand for the known real value and the recovered value, respectively, and $M$ denotes the number of recovered traffic volumes.
Table 4 presents the relative errors of traffic volume outlier data before and after recovery by ADMM-TR. The results show that the RSE to $\mathcal{L}_0$ and MAPE to $\mathcal{L}_0$ of the traffic volume data corrupted by outliers are roughly five times those of the data after recovery by ADMM-TR. Figures 1, 2, and 3 present the profiles of traffic volume data for a day; they show that ADMM-TR recovers the traffic volume outlier data with excellent performance.
6 Conclusions
In this paper, we concentrated on the mathematical problem of traffic volume outlier data recovery and proposed a novel tensor recovery method based on the alternating direction method of multipliers (ADMM). The proposed algorithm can automatically separate the low-$n$-rank tensor data from the sparse outlier part. The experiments show that the proposed method is more stable and accurate in most cases and has an excellent convergence rate. Experiments on real world traffic volume data demonstrate the practicability and effectiveness of the proposed method in the traffic research domain.
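For matrices, the low-rank plus sparse separation described above is the classical robust PCA decomposition studied in [15, 23]. As an illustrative analogue only (this is the inexact augmented Lagrange multiplier scheme for matrices, not the paper's tensor method ADMM-TR), the separation might be sketched as:

```python
import numpy as np

def shrink(X, tau):
    # Soft-thresholding: proximal operator of the l1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    # Singular value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_admm(D, lam=None, rho=1.5, n_iter=200, tol=1e-7):
    """Split D into low-rank L plus sparse S by minimizing
    ||L||_* + lam * ||S||_1 subject to L + S = D (inexact ALM)."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)   # spectral norm of D
    Y = np.zeros_like(D)               # Lagrange multiplier
    S = np.zeros_like(D)
    for _ in range(n_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)
        S = shrink(D - L + Y / mu, lam / mu)
        R = D - L - S                  # constraint residual
        Y += mu * R
        mu *= rho                      # increase penalty each sweep
        if np.linalg.norm(R) / np.linalg.norm(D) < tol:
            break
    return L, S
```

On a small synthetic test (a rank-2 matrix corrupted by 5% large sparse spikes), this sketch recovers the low-rank component to within about 1% relative error.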
In the future, we would like to investigate how to automatically choose the parameters in our algorithm and explore additional applications of our method in the traffic research domain.
Acknowledgments
This research was supported by the NSFC (Grants nos. 61271376, 61171118, and 91120015) and the Beijing Natural Science Foundation (4122067). The authors would like to thank Professor Bin Ran from the University of Wisconsin-Madison and Yong Li from the University of Notre Dame for the suggestive discussions.
References
[1] K. Tamura and M. Hirayama, "Toward realization of VICS—vehicle information and communications system," in Proceedings of the IEEE-IEE Vehicle Navigation and Information Systems Conference, pp. 72–77, October 1993.
[2] S. Kim, M. E. Lewis, and C. C. White III, "Optimal vehicle routing with real-time traffic information," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 178–188, 2005.
[3] P. Mirchandani and L. Head, "A real-time traffic signal control system: architecture, algorithms, and analysis," Transportation Research Part C, vol. 9, no. 6, pp. 415–432, 2001.
[4] B. M. Williams, P. K. Durvasula, and D. E. Brown, "Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models," Transportation Research Record, no. 1644, pp. 132–141, 1998.
[5] J. Xu, X. Li, and H. Shi, "Short-term traffic flow forecasting model under missing data," Journal of Computer Applications, vol. 30, no. 4, pp. 1117–1120, 2010.
[6] Y. Pei and J. Ma, "Real-time traffic data screening and reconstruction," China Civil Engineering Journal, vol. 36, no. 7, pp. 78–83, 2003.
[7] S. Chen, W. Wang, and W. Li, "Noise recognition and noise reduction of real-time traffic data," Dongnan Daxue Xuebao, vol. 36, no. 2, pp. 322–325, 2006.
[8] B.-G. Daniel, J. D.-P. Francisco, G.-O. David, et al., "Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks," Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 7, pp. 530–545, 2010.
[9] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, Article ID 025010, 2011.
[10] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
[11] A. S. Lewis and G. Knowles, "Image compression using the 2-D wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 244–250, 1992.
[12] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
[13] J. Wright, Y. Peng, Y. Ma, and A. Ganesh, "Robust principal component analysis: exact recovery of corrupted low-rank matrices by convex optimization," in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 2080–2088, Vancouver, Canada, 2009.
[14] A. Ganesh, Z. Lin, J. Wright, L. Wu, M. Chen, and Y. Ma, "Fast algorithms for recovering a corrupted low-rank matrix," in Proceedings of the 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '09), pp. 213–216, December 2009.
[15] Z. Lin, M. Chen, L. Wu, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Mathematical Programming, 2009.
[16] X. Yuan and J. Yang, "Sparse and low-rank matrix decomposition via alternating direction methods," Tech. Rep., Department of Mathematics, Hong Kong Baptist University, 2009.
[17] L. Qu, Y. Zhang, J. Hu, L. Jia, and L. Li, "A BPCA based missing value imputing method for traffic flow volume data," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 985–990, June 2008.
[18] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in Proceedings of the International Conference on Computer Vision (ICCV '09), 2009.
[19] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Prentice-Hall, 1989.
[20] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[21] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for ℓ1-minimization: methodology and convergence," SIAM Journal on Optimization, vol. 19, no. 3, pp. 1107–1130, 2008.
[22] Y. Li, J. Yan, Y. Zhou, and J. Yang, "Optimum subspace learning and error correction for tensors," in Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Crete, Greece, 2010.
[23] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," Journal of the ACM, vol. 58, no. 3, article 11, 2011.