Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
arX
iv:1
607.
0652
1v2
[cs.
NI]
3 F
eb 2
017
1
Joint Uplink/Downlink Optimization forBackhaul-Limited Mobile Cloud Computing with
User SchedulingAli Al-Shuwaili, Osvaldo Simeone, Alireza Bagheri, and Gesualdo Scutari
Abstract—Mobile cloud computing enables the offloading ofcomputationally heavy applications, such as for gaming, objectrecognition or video processing, from mobile users (MUs) tocloudlet or cloud servers, which are connected to wireless accesspoints, either directly or through finite-capacity backhaul links.In this paper, the design of a mobile cloud computing system isinvestigated by proposing the joint optimization of computing andcommunication resources with the aim of minimizing the energyrequired for offloading across all MUs under latency constraintsat the application layer. The proposed design accounts for multi-antenna uplink and downlink interfering transmissions, with orwithout cooperation on the downlink, along with the allocationof backhaul and computational resources and user selection.The resulting design optimization problems are non-convex, andstationary solutions are computed by means of successive convexapproximation (SCA) techniques. Numerical results illustratethe advantages in terms of energy-latency trade-off of the jointoptimization of computing and communication resources, aswellas the impact of system parameters, such as backhaul capacity,and of the network architecture.
Index Terms—Mobile cloud computing, 5G, Successive con-vex approximation, Application offloading, Backhaul, Cloudlet,Latency, User selection, Network MIMO.
I. I NTRODUCTION
Mobile cloud computing enables the offloading of com-putationally heavy applications from Mobile Users (MUs)to a remote server by means of a wireless communicationnetwork [1]–[5]. Given the battery-limited nature of mobiledevices, mobile cloud computing makes it possible to provideservices, such as gaming, object recognition, video processing,or augmented reality, that may otherwise not be available tomobile users. Offloading may take place to remote serversin the “cloud”, which is accessed by the wireless accesspoints via backhaul links, or to “cloudlet” servers that aredirectly connected to an access point [5]. In the latter case, theapproach is also known as mobile edge computing, which iscurrently seen as a key enabler of the so called tactile internet[6] and is subject to standardization efforts [7]. Offloadingto peer mobile devices is also being considered [8]. Exam-ples of architectures that are based on mobile cloud/cloudletcomputation offloading include MAUI [9], ThinkAir [10], and
A. Al-Shuwaili, O. Simeone, and A. Bagheri are with the Center for Wire-less Information Processing (CWiP), Department of Electrical and ComputerEngineering, New Jersey Institute of Technology, Newark, NJ 07102 USA(e-mail: [email protected], [email protected], [email protected]).
G. Scutari is with the Department of Industrial Engineering, PurdueUniversity, West Lafayette, IN 47907 USA (e-mail: [email protected]).
This work was partially supported by the U.S. NSF through grant 1525629.
MobiCoRE [11], while examples of commercial applicationbased on mobile cloud computing include Apple iCloud,Shazam, and Google Goggles [12], [13].
When optimized solely at the application layer, as in most ofthe earlier literature, the design of a mobile cloud computingsystems entail decision of whether a mobile can decreaseits energy expenditure by offloading the execution of theentire application [1], [2], [14]–[16] or of some selectedsubtasks (see, e.g., [17], [18]). Given that offloading requirestransmission and reception on the wireless interface, a moresystematic approach involves the joint optimization of offload-ing decisions and communication parameters, such as uplinkpower allocation [18], [19], link and subcarrier selection[20],[21], and uplink data rate [22], [23].
The joint optimization of the set of subtasks to be offloadedand of the uplink transmission powers was studied in [18], [19]under static channel condition. A related dynamic computationoffloading approach was presented in [14], [23] that deter-mines which application subtasks are to be executed remotelygiven the available wireless network connectivity. Assuminga simplified execution model, in which input data can bearbitrary partitioned for separate processing, references [22],[24] study the fraction of the mobile data to be offloaded tothe cloud, with the rest being executed locally at the mobiledevice. A framework that integrates wireless energy transferand cloud mobile computing was put forth in [25].
The works summarized thus far focus on the operation of asingleMU. In contrast to the single-MU problem formulation,in a scenario with multiple MUs transmitting over a sharedwireless medium across multiple cells, the design of a mobilecloud computing system requires: (i) the management ofinterference for theuplink, through which MUs offload thedata needed for computation in the cloud; (ii ) the managementof interference for thedownlink, through which the outcomeof the cloud computations are fed back to the MUs; (iii ) theallocation ofbackhaulresources for communication betweenwireless edge and cloud; and (iv) the allocation ofcomputingresources at the cloud. Furthermore, the optimization shouldincludeuser selection, or schedulingmechanisms whereby theoffloading users are guaranteed an energy consumption that issmaller than the amount required for local computing at thedevice.
The limited literature on resource allocation and offloadingdecisions for the multiuser case includes papers [26]–[34]. Theproblem of decentralized user scheduling when modeling theaspect (i) of uplink interference is considered in [26], [27] for
single-antenna elements within a game-theoretic framework.The joint allocation of radio and computational resourcesis considered in [28] by accounting for the elements (i)and (iv), in the presence of multiple clouds, with the aimof maximizing network operator revenue via resource poolsharing. A problem formulation including elements (i) and(iv) was studied in [29] with MIMO transceivers and for afixed set of scheduled users. Scheduling in a single-cell wasconsidered in [30] with the goal of minimizing the weightedsum mobile energy consumption; it was shown that the optimalscheduling and cloud resource allocation policy (element (iv))have a threshold-based structure. Another scheduling strategyfor multiuser offloading systems in a small-cell set-up ispresented in [31], where the resources are allocated under theobjective of minimizing the average latency experienced bythe worst-case user by accounting for element (iv) with theinclusion of uplink and downlink tasks schedulers. A schemethat jointly optimizes the computation offloading decisionsand the radio resource allocation in heterogeneous networksby accounting for element (i) so as to minimize the mobileenergy expenditure under latency constraints was proposedin[32]. An energy-efficient resource allocation for interference-free multi-users scenario was discussed in [33] with the aimofoptimizing uplink and downlink transmissions duration whileconsidering element (iv). Finally, the problem of schedulingtasks between cloud and edge processors was studied in [34]without modeling the physical layer.
In this paper, we account for all elements (i)-(iv) as well asfor user scheduling. Specifically, our main contributions are:
• We study the problem of minimizing the mobile energyconsumption under latency constraints over uplink anddownlink precoding, uplink and downlink backhaul re-source allocation, as well as cloud computing resourceallocation for general multi-antenna transceivers. Thiswork is mostly motivated by the increasing importance ofbackhaul capacity limitations, which are well understoodto be often the bottleneck in modern dense networkdeployments (see, e.g., [35], [36]). We emphasize that,unlike [29], the problem formulation explicitly modelsthe optimization of downlink communication for down-loading the outcome of the optimization at the MUs aswell as of the backhaul resource allocated to the activeMUs in both uplink and downlink, that is, it accounts forall elements (i)-(iv) listed above. The resulting problemrequires the development of a novel adaptation of the Suc-cessive Convex Approximation (SCA) scheme [37], [38]that accounts for downlink and backhaul transmissions.
• The optimization of users scheduling is tackled jointlywith the operation of uplink/downlink, backhaul and com-putational resources under the key constraint that eachoffloading MU should not consume more energy thanthat required for local computation. The resulting mixedinteger problem is tackled by a means of SCA coupledwith the smoothlp-norm approximation approach [39].We emphasize that this problem was not studied in [30],which instead considered a fixed set of scheduled users.
• A hybrid cloud-edge computing setup is studied in
Fig. 1: Basic system model: Mobile users (MUs) offload the execu-tion of their applications to a centralized cloud processorthrough awireless access network and finite capacity backhaul links.
which, beside a cloud server, “cloudlet” or “edge” serversare available locally at the wireless access points. Thecloudlet servers are able to execute offloaded applicationswithout incurring backhaul latency but with a generallysmaller CPU frequency [5]. For the first time, we inves-tigate here the optimal task allocation between cloudletand the cloud via SCA in the presence of all elements(i)-(iv) discussed above.
• We study the impact of cooperative downlink transmis-sion via network MIMO [40] on the achievable energy-latency trade-off by accounting for the backhaul overheadneeded to deliver user data to multiple access points fortransmission to the MUs. This has also not previouslystudied in the context of mobile cloud computing withbackhaul limitations.
The rest of the paper is organized as follows. Section IIintroduces the system model along with the basic problem for-mulation. The proposed SCA solution is described in SectionIII. Users scheduling is studied in Section IV. Hybrid edge,or cloudlet, and cloud computing is formulated and tackledin Section V. Cooperative downlink transmission is discussedin Section VI. Numerical results are presented in Section VII,while concluding remarks are finally provided in Section VIII.
II. SYSTEM MODEL AND PROBLEM FORMULATION
In this section, we describe the basic system model andproblem formulation that will be adopted in this paper. Thebasic model assumes fixed user scheduling, offloading to acentralized cloud processor, and non-cooperative transmissionat the wireless access points. Generalizations that address userscheduling, local computing capabilities at the access pointsand cooperative transmission will be treated later in SectionIV, Section V and Section VI, respectively.
A. System Model
We consider a network composed ofNc cells of possi-bly different sizes such as micro- or femto-cells. Each celln = 1, . . . , Nc includes a base station, referred to as cloud-enhanced e-Node B (ceNB) borrowing from LTE nomencla-ture, which is connected to a common cloud server that pro-vides computational resources. As shown in Fig. 1, each cell
2
Fig. 2: Timeline of offloading from an MUin. Note that the totaltwo-way backhaul latency is given as∆bh
in = ∆bh,ulin
+∆bh,dlin
.
containsK active mobile users (MUs) that have been sched-uled for the offloading of their local applications to the cloudprocessor. TheK MUs in the same cell transmit in orthogonalspectral resources, in the time or frequency domain. We denoteby in the MU in cell n that is scheduled on thei-th spectralresource, and byI , {in : i = 1, . . . ,K, n = 1, . . . , Nc} theset of all the active MUs in the system. Each MUin andceNB n is equipped withNTin
transmit andNRnreceive
antenna, respectively. Note that MUs in different cells thatare scheduled on the same spectral resources interfere witheach other.
Each MU in wishes to run an application within a givenmaximum latencyTin . The application to be executed ischaracterized by the numberVin of CPU cycles necessaryto complete it, by the numberBI
inof input bits, and by
the numberBOin of output bits encoding the result of the
computation. Table I summarizes the notations and parametersused in the system model.
We next derive energy and latency resulting from offloadingof the applications of all active MUs. The offloading latencyconsist of the time∆ul
inneeded for the MU to transmit the
input bits to its ceNB in the uplink; the time∆exein necessary
for the cloud to execute the instructions; the round-trip time∆bh
infor exchanging information between ceNB and the cloud
through the backhaul link; and the time∆dlin
to send the resultback to the MU in the downlink (see Fig. 2). We can hencewrite the total offloading latency for MUin as
∆in = ∆ulin +∆exe
in +∆bhin +∆dl
in . (1)
The energyEin of each MUin instead depends only on thepower used for transmission in the uplink and reception energyin the downlink. These latency and energy terms are computedas a function of the radio and computational resources asdetailed next.
1) Uplink transmission:The optimization variables at thephysical layer for the uplink are the users’ transmit covariancematricesQul ,
(Qul
in
)
in∈I, whereQul
in= E[xul
inxulH
in] with
xulin∼ CN (0,Qul
in) being the signal transmitted by the user
in. These matrices are subject to power budget constraints
Qulin ,
{
Qulin ∈ C
NTin×NTin : Qul
in � 0, tr(Qulin) ≤ Pul
in
}
,
(2)where Pul
inis the maximum allowed transmit energy per
symbol of MU in. For any given profileQul, the achievabletransmission rate, in bits per symbol, which corresponds tothemutual information evaluated on the MIMO Gaussian channel(see, e.g., [41]) of MUin when multi-user interference istreated as additive Gaussian noise, is
rulin(Q) = log2 det(
I+HHinR
uln (Qul
−in)−1HinQ
ulin
)
, (3)
TABLE I: System Parameters
Parameter DescriptionFc, F ceNB
n cloud and cloudlet computational capacityCul
n , Cdln uplink and downlink backhaul capacity of celln
Hin , Gin uplink and downlink channel matrix for userinPulin
, P dln uplink and downlink power budget constraint
Wul,W dl uplink and downlink bandwidthNo noise power spectral densityNTin
, NRnnumber of transmit and receive antenna
BIin
, BOin
number of input and output bits for userinVin number of CPU cycles for userinTin latency constraint for userinNc,K number of cell and number of users in each celldin reception energy constant for userinκ mobile device switched capacitanceα, δ step size constant and termination accuracy
where
Ruln (Qul
−in) , N0I+∑
jm∈I,m 6=n
HjmnQuljmHH
jmn (4)
is the covariance matrix of the sum of the noise and of theinter-cell interference affecting reception at then-th ceNB in
the i-th spectral resources;Qul−in
,
((Qul
jm
)K
j=1
)Nc
n6=m=1; N0
is the noise power spectral density; andHin is the uplinkchannel matrix for MUin to the ceNB in the celln, whereasHjmn is the cross channel matrix between the interferingMU jm in the cell m and the ceNB in celln. The channelmatrices account for path loss, slow and fast fading. The time,in seconds, necessary for useri in cell n to transmit the inputbits BI
in to its ceNB in the uplink is then
∆ulin
(Qul
)=
BIin
Wulrulin (Qul), (5)
whereWul is the uplink channel bandwidth allocated to eachone of the orthogonal spectral resources. The correspondingenergy consumption due to uplink transmission is then
Eulin
(Qul
)= BI
in
tr(Qul
in
)
rulin (Qul), (6)
since aroundBIin/rulin
(Qul
)channel uses are needed in order
to transmit reliably in the uplink.2) Downlink transmission:The optimization variables for
the downlink are the ceNBs’ transmit covariance matrices(Qdl
in
)K
i=1, which are subject to per-ceNB power constraints
P dln :
Qdln ,
{
(Qdl
in
)K
i=1∈ C
NTin×NTin : Qdl
in� 0,
K∑
i=1
tr(Qdlin) ≤ P dl
n
}
.
(7)Similar to the uplink, we can write the achievable rate in bitsper symbol for each MU in the downlink as
rdlin(Qdl) = log2 det
(
I+GHinR
dln (Qdl
−in)−1GinQ
dlin
)
, (8)
with
Rdln (Qdl
−in) , N0I+∑
jm∈I,m 6=n
GjmnQdljmGH
jmn; (9)
3
the corresponding required transmission time reads
∆dlin
(Qdl
)=
BOin
W dlrdlin (Qdl), (10)
whereGin is the downlink channel matrix between the ceNBin cell n and the MUin; Gjmn is the cross channel matrixbetween the interfering MUjm in the cellm and the ceNB incell n; W dl is the downlink channel bandwidth; andQdl
−in ,((
Qdljm
)K
j=1
)Nc
n6=m=1. The downlink energy consumption is
given by
Edlin
(Qdl
)= BO
in
dinrdlin (Qdl)
, (11)
wheredin is parameter that indicates the receiver energy ex-penditure for each symbol interval. Note that (7)-(8) implicitlyassume that the downlink spectral resources are allocated tothe MUs in the same way as for the uplink, so that MUsin for n = 1, . . . , Nc are mutually interfering in both uplinkand downlink. This assumption can be easily alleviated at thecost of introducing additional notation.
3) Cloud processing:Let Fc be the capacity in terms ofnumber of CPU cycles per second of the cloud; and letfin ≥ 0be the fraction of the processing powerFc assigned to userin, so that
∑
in∈I fin ≤ 1. The time needed to runVin CPUcycles for userin remotely is then
∆exein (fin) =
Vin
finFc. (12)
Define f , (fin)in∈I .4) Backhaul transmission:We denote byCul
n the capacityin bits per second of the backhaul connecting the ceNB in celln with the cloud, and byCdl
n the capacity in bits per secondof the backhaul connecting the cloud with the ceNB in celln. Let culin , c
dlin ≥ 0 be the fraction of the backhaul capacities
Culn andCdl
n , respectively, allocated to MUi in cell n. Wethen have the constraint
∑Ki=1 c
ulin≤ 1 and
∑Ki=1 c
dlin≤ 1 for
all n. Moreover, the time delay due to the backhaul transferbetween ceNBn and the cloud in both directions is given by
∆bhin (c
ulin , c
dlin) =
BIin
culinCuln
+BO
in
cdlinCdln
, (13)
where the first term represents the latency in the uplinkdirection, denoted as∆bh,ul
in(cf. Fig. 2), and the second
term represents the latency in the downlink direction, denotedas ∆bh,dl
inin the same figure. The uplink/downlink backhaul
allocation vectors are defined ascul , (culin)in∈I and cdl ,
(cdlin)in∈I , respectively.
B. Problem Formulation
The total energy consumption due to the offloading of userin data, i.e., transmittingBI
in bits and receivingBOin bits, is
then given by
Ein
(Qul,Qdl
)= Eul
in
(Qul
)+ Edl
in
(Qdl
)
= BIin
tr(Qul
in
)
rulin (Qul)+BO
in
dinrdlin (Qdl)
.(14)
The optimal offloading problem can be stated as the mini-mization of the sum of the energy spent by all MUs to runtheir applications remotely, subject to individual latency andpower constraint. Stated in mathematical terms, we have thefollowing
minQul,Qdl,f ,cul,cdl
E(Qul,Qdl
),∑
in∈I
Ein
(Qul,Qdl
)
=∑
in∈I
BIin
tr(Qulin)
rulin
(Qul)+BO
indin
rdlin
(Qdl)
s.t. C.1BI
in
Wulrulin
(Qul)+
BIin
culin
Culn
+Vin
finFc
+BO
in
cdlin
Cdln
+BO
in
Wdlrdlin
(Qdl)≤ Tin , ∀in ∈ I,
C.2 fin ≥ 0, ∀in ∈ I,∑
in∈I
fin ≤ 1,
C.3 culin , cdlin ≥ 0, ∀in ∈ I,
K∑
i=1
culin ≤ 1,K∑
i=1
cdlin ≤ 1, ∀n = 1, . . . , Nc,
C.4 Qulin∈ Qul
in, ∀in ∈ I,
(Qdl
in
)K
i=1∈ Qdl
n , ∀n = 1, . . . , Nc.(P.1)
Constraint C.1 enforces that the latency for any MUin tobe less than or equal to the maximum tolerable delay ofTin seconds; C.2 imposes the mentioned limit on the cloudcomputational resources; C.3 enforces the limited backhaulcapacities in uplink and downlink; and C.4 guarantee thatpower budget constraint on the radio interface of both uplinkand downlink is satisfied. Note that problem (P.1) dependsonly on the ratiosBI
in/Wul, BI
in/Culn , Vin/Fc, BO
in/Wdl and
BOin/C
dln . We denote byZ ,
(Qul,Qdl, f , cul, cdl
)the set of
all optimization variables, and byZ the feasible set of (P.1).Note that (P.1) is non-convex, due to the non-convexity of theobjective function and of the constraint C.1.
Remark 1. As discussed, the expression in C.1 for thelatency of each user assumes that the communication andprocessing steps, including uplink transmission, edge-to-cloudbackhaul transmission, cloud processing, cloud-to-edge back-haul transmission and downlink transmission take place oneafter the other for each user, as illustrated in Fig. 2. Giventhat the uplink, backhaul, execution and downlink latenciesmay be different across the users, the communication steps ofdifferent users may not be aligned. While this fact may beexploited by a sophisticated receiver that tracks the variationof the interference power within a communication block, theresulting achievable rates are extremely difficult to characterizeand optimize. In contrast, the expression of the uplink anddownlink rates (3) and (8), in which interference is assumedto be caused by all users, are achievable by means of standarddecoders (see, e.g., [42]) and can be efficiently computed andoptimized, as it will be shown in Sec. III.
Feasibility. Problem (P.1) has a non-empty feasible set ifthere exist matricesQul and Qdl such that the followinginequalities hold
a) Tin >BI
in
Wulrulin
(Qul)+
BOin
Wdlrdlin
(Qdl), ∀in ∈ I;
4
b)∑
in∈I
Vin/Fc
Tin −BI
in
Wulrulin
(Qul)+
BOin
Wdlrdlin(Qdl)
≤ α1;
c)K∑
i=1
BIin/Cul
n
Tin −BI
in
Wulrulin
(Qul)+
BOin
Wdlrdlin
(Qdl)
≤ α2, ∀n = 1, . . . , Nc;
d)K∑
i=1
BOin/C
dln
Tin −BI
in
Wulrulin
(Qul)+
BOin
Wdlrdlin
(Qdl)
≤ α3, ∀n = 1, . . . , Nc;
for some α1, α2, α3 ≥ 0 with α1 + α2 + α3 = 1. Infact, if above conditions are met, one can always choose
fin = α−11 (Vin/Fc)
(
Tin −BI
in
Wulrulin
(Qul)−
BOin
Wdlrdlin
(Qdl)
)−1
,
and similarly forculin andcdlin , so that conditions b), c) and d)are satisfied with equality.
Remark 2. When the goal is identifying the achievable trade-off curve between energy consumption and latency, assumingfor simplicity that all MUs have the same latency constraintT ,e.g.,Tin = T , the following problem may also be considered
minQul,Qdl,f ,cul,cdl,T
E(Qul,Qdl
)+ λT
s.t. C.1BI
in
Wulrulin
(Qul)+
BIin
culin
Culn
+Vin
finFc
+BO
in
cdlinCdln
+BO
in
Wdlrdlin(Qdl)
≤ T , ∀in ∈ I,
C.2−C.4 of (P.1),(P.2)
whereλ > 0 is a parameter identifying the desired relativeweight between energy and latency minimization.
Remark 3. The problem formulation (P.1) can be easilyextended to account for a more general backhaul topologyin which the ceNBs are connected to the cloud via a multi-hop network with predefined routes between ceNBs and cloud.We do not further elaborate on this model here, because of thespace limitation.
Remark 4. Problem (P.1) was tacked in [29] for the spe-cial case where the joint optimization was over radio andcomputational resources and only in the uplink direction.Problem (P.1), instead, considers a general setup in whichjoint optimization carried over backhaul capacities in bothuplink and downlink, as well as the optimization over thedownlink radio transmission from the ceNB to the MU. Thisgeneralization entails different formulations for the objectivefunction and for the latency constraint from that in [29] andthus calls for novel SCA-based solution methods.
III. SUCCESSIVE CONVEX APPROXIMATION OPTIMIZATION
Problem (P.1) is non-convex due to the non-convexity of theobjective function and of the constraint C.1. To address thisissue, we leverage the SCA method proposed in [37], [38]for general non-convex problems. The SCA algorithm in [37],[38] is proved to converge to a stationaty solution of the (NP-hard) non-convex probem by solving a sequence of convexsub-problems, each one of which can be solved in polynomialtime, e.g., by interior-point methods [43]. To do so, we needto identify convex approximants for the objective functionandfor the non-convex constraints C.1 that satisfy the conditionsspecified in [37], [38] as discussed next.
A. Convex Surrogate for the Objective Function
Define asK ⊇ Z any compact convex set containingthe feasible setZ such that all functions in (P.1) are welldefined on it. Note that such a set always exists. This setexists by the same arguments used in [29, Sec. IV-B]. LetZ (v) ,
(Qul (v) ,Qdl (v), f (v) , cul (v), cdl (v)
), with v be-
ing the current iterate index of the SCA algorithm. Accordingto [37], [38], in order to be used within the SCA scheme,a convex approximantE (Z;Z (v)) of the objective functionE(Qul,Qdl
)around the current feasible iterateZ (v) ∈ Z
must satisfy the following properties ([37, Sec. II]):A1: E (•;Z (v)) is uniformly strongly convex onK;A2: ∇Qul∗ E (Z (v) ;Z (v)) = ∇Qul∗E
(Qul (v) ,Qdl (v)
)
and ∇Qdl∗ E (Z (v) ;Z (v)) = ∇Qdl∗E(Qul (v) ,Qdl (v)
),
for all Z (v) ∈ Z;A3: ∇Z∗E (•; •) is Lipschitz continuous onK ×Z;where ∇x∗f(x;y) denotes the conjugate gradient of thefunction f(x;y) with respect to its first argumentx. Wenote that, besides the convexity and smoothness conditionsA1 and A3, A2 enforces that the first-order behavior of theapproximant be the same as for the original function. In whatfollows, we derive an approximant for the objective function(14) satisfying A1-A3 above. Let us write
E (Z;Z (v)) ,∑
in∈I
Ein (Zin ;Z (v)) + E (Z;Z (v))
=∑
in∈I
(
Eulin
(Qul;Qul(v)
)+ Edl
in
(Qdl;Qdl(v)
) )
+ E (Z;Z (v)) ,
(15)
where Zin ,(Qul
in,Qdl
in, fin , c
ulin, cdlin
)with the function
E (Z;Z (v)) ,γqul
2
∑
in∈I
∥∥Qul −Qul (v)
∥∥2
+γqdl
2
∑
in∈I
∥∥Qdl −Qdl (v)
∥∥2
+γf
2 ‖f − f (v)‖2 +γcul
2
∥∥cul − cul (v)
∥∥2
+γcdl
2
∥∥cdl − cdl (v)
∥∥2
is added tomake the approximant objective function (15) strongly convexonK, with γqul , γqdl , γf , γcul andγcdl being arbitrary positiveconstants (see [37], [38]).The functionsEul
in
(Qul;Qul(v)
)
and Edlin
(Qdl;Qdl(v)
)are the convex approximants for the
uplink and downlink energy terms, respectively, as derivednext.
1) Convex approximant forEulin
(Qul
): It is not difficult to
check that an approximant that utilize the “partial” convexityof the uplink energy function (6) can be obtained as (cf. [29,Sec. IV-B])
Eulin
(Qul;Qul(v)
)= tr
(Qul
in (v)) BI
in
rulin(Qul
in,Qul
−in(v))
+ tr(Qul
in
) BIin
rulin(Qul
in(v) ,Qul
−in(v))
+∑
jm∈I,m 6=n
⟨
∇Qulin
∗Ejm
(Qul (v)
),Qul
in −Qulin (v)
⟩
,
(16)
where〈A,B〉 , Re{
tr(AHB
)}, and the conjugate gradient
5
∇Qulin
∗Ejm
(Qul (v)
)is given by [29, eq. (18)]
∇Qulin
∗Ejm
(Qul (v)
)=
tr(Qul
jm(v))∆ul
jm
(Qul (v)
)
log (2) ruljm (Qul (v))
· [HHinm(Rul
m
(Qul
−jm (v))−1− (Rul
m
(Qul
−jm (v))
+HjmQuljm (v)HH
jm)−1)Hinm].(17)
2) Convex approximant forEdlin
(Qdl
): To obtain the desired
convex approximant of downlink energy in (11), we need toconstruct the convex surrogate for the downlink rate function,i.e., rdlin
(Qdl;Qdl (v)
). To obtain such an approximant, we
exploit first the concave-convex structure of the rate functionsrdlin(Qdl)
rdlin(Qdl
)= log2 det
(Rdl
n
(Qdl
−in
)+HH
inHinQdlin
)
︸ ︷︷ ︸
rdlin
+(Qdl)
− log2 det(Rdl
n
(Qdl
−in
))
︸ ︷︷ ︸
rdlin−(Qdl
−in)
,(18)
where rdlin+ (
Qdl)
and rdlin− (
Qdl−in
)are concave functions.
The convex approximant is then obtained as
rdlin(Qdl;Qdl (v)
)= rdl+in
(Qdl
)− rdl−in
(Qdl
−in (v))
−∑
jm∈I
⟨
∇Quljm
∗rdlin− (
Qdl−in (v)
),Qdl
jm −Qdljm (v)
⟩
,
(19)
with
∇Qdljm
∗rdl−in
(Qdl
−in (v))= HH
jmnRdln
(Qdl
−in (v))−1
Hjmn.
(20)Then, we can simply obtain the convex approximant forEdl
in
(Qdl)
as
Edlin
(Qdl;Qdl (v)
)= BO
in
dinrdlin (Qdl;Qdl (v))
. (21)
Remark 5. The key advantage of SCA [37], [38] as com-pared to more conventional approaches such as Difference-of-Convex (DC) programming [44] is that the convex surrogateE (Z;Z (v)) need not be a global upper bound on the functionE(Qul,Qdl) – a condition that appears to be difficult to ensurefor the objective function of (P.1).
B. Inner Convexification of the Constraints
Next, let us consider the latency constraint C.1. Note thatthis constraint differs from the corresponding latency con-straint in [29] by virtue of the contributions due to downlinkand backhaul transmissions. Let us define the non-convex partof the left-hand side of C.1 as
gin(Qul,Qdl
),
BIin
Wulrulin (Qul)+
BOin
W dlrdlin (Qdl). (22)
To apply SCA, we need to obtain an approximantgin(Qul,Qdl;Z(v)
)at the current iterateZ (v)) ∈ Z
that satisfies the following properties ([37, Sec. II]):B1: gin
(•,Z(v)
)is uniformly convex onK;
B2: ∇Z∗ gin(Qul (v) ,Qdl (v) ;Z(v)
)=
∇Z∗gin(Qul (v) ,Qdl (v)
), for all Z(v) ∈ Z;
B3: ∇Z∗ gin(•; •) is continuous onK ×Z;B4: gin
(Qul,Qdl;Z(v)
)≥ gin
(Qul,Qdl
), for all
(Qul,Qdl
)∈ K andZ(v) ∈ Z;
B5: gin(Qul (v) ,Qdl (v) ;Z(v)
)= gin
(Qul (v) ,Qdl (v)
),
for all Z(v) ∈ Z;B6: gin(•; •) is Lipschitz continuous onK ×Z.Besides the first-order behavior and smoothness conditionsB2, B3, B6, the key assumptions B1, B4 and B5 enforcethat the approximantgin
(Qul,Qdl;Z(v)
)be a locally tight
(condition B5) convex (condition B1) upper bound (conditionB4) on the original constraintgin
(Qul,Qdl
). The desired
surrogate approximationgin(Qul,Qdl;Z(v)
)is then obtained
from (19) as
gin(Qul,Qdl;Z(v)
),
BIin
Wulrulin (Qul;Qul (v))
+BO
in
W dlrdlin (Qdl;Qdl (v)),
(23)
The proposed approximant (23) is an upper bound (condi-tion B4) and is a convex function ofQul andQdl (conditionB1). This is because both terms in (23) are the reciprocal of aconcave and positive function, and the sum of the two convexfunctions is convex. Furthermore, it is easy to show that theoriginal constraint (22) and its convex approximant (23) havethe same first-order behavior (condition B2) by evaluating thegradient of both functions at the current iterate. The remainingproperties B3-B6 can also be checked in a similar manner to[29, Sec. IV-B]. For instance, since the approximant functions(16) and (19) are twice continuously differentiable over thecompact convex setK ⊇ Z, the Lipschitz continuity of theirconjugate gradients follows readily.
C. SCA Algorithm
The SCA algorithm operates by iteratively solving thefollowing problem around the current iterateZ (v) ∈ Z,
Z (Z (v)) , argminQul,Qdl,f ,cul,cdl
E (Z;Z (v))
s.t.
C.1 gin(Qul,Qdl;Z(v)
)+
BIin
culinCuln
+BO
in
cdlinCdln
+Vin
finFc≤ Tin , ∀in ∈ I,
C.2−C.4 of (P.1). (P.3)
The unique solution of the strongly convex optimization prob-lem (P.3) is denotedZ
(Z (v)
),(Qul, Qdl, f , cul, cdl
). Note
that E (Z;Z (v)) is a function ofZ, given the current iterateZ (v).
The SCA scheme is summarized in Algo-rithm 1. In step 2, the termination criterion is∣∣E(Qul (v + 1),Qdl (v + 1)
)− E
(Qul (v),Qdl (v)
)∣∣ ≤ δ,
whereδ > 0 is the desired accuracy. The step size rule weused isγ (v) = γ (v − 1) (1− αγ (v − 1)) with γ (0) ∈ (0, 1]andα ∈ (0, 1/γ (0)) (other step size rules can also be adopted,
6
see [37], [38]). Algorithm 1 converges to a stationary pointof the problem (P.1) in the sense of [29, Theorem 2].
Remark 6. The SCA scheme can also be easily adaptedto tackle the weighted sum problem (P.2) discussed in Re-mark 2. This alternative formulation has the key advantagethat the identification of an initial feasible pointZ (0) ,(Qul (0) ,Qdl (0) , f (0) , cul (0) , cdl (0) , T (0)
)for the SCA
is a trivial task. This is because one can always select a valueof T (0) that satisfies the constraint C.1 in (P.2) for given valuesof the other variables.
Remark 7. Each instance of the optimization prob-lem (P.3) tackled by SCA can be solved with complexityO(max{n3, n2m}) using interior-points methods [45, Ch. 1],where n is the size of the optimization variables, namely(
2N2Tin
+ 3)
KNc, and m is the number of constraints,namely m = 7KNc + 3Nc + 1. Note that the complexityscales polynomially with the numberK of users and withall system parameters. While here we have focused on acentralized implementation, the complexity could be furtherreduced by developing distributed solutions as described in[37, Sec. IV]. Finally, we would also like to mention that,in practice, rather than solving problem (P.3) using SCA ateach time slot for the given realization of the channels, itwould be possible to solve the problem for a number ofrepresentative channels so as to build a sufficiently dense look-up table. More interestingly, as recently explored in [46] forpower allocation in an interference channel, one could use suchrepresentative channels to train a neural network, or anotherlearning machine, to “interpolate” the solution to other channelrealizations. We leave these aspects for future investigations.
Algorithm 1: SCA Solution for (P.3)
Input: Parameters from Table I;Z (0) ∈ Z; v = 0;{γ (v)}v ∈ (0, 1]; γqul , γqdl , γf , γcul , γcdl > 0.
1: If Z (v) satisfies the termination criterion, stop.2: ComputeZ (Z (v)) from (P.3).
3: SetZ (v + 1) = Z (v) + γ (v)(
Z (Z (v))− Z (v))
.4: v ← v + 1, and return to step1.
Output: Z =(Qul,Qdl, f , cul, cdl
).
IV. USER SELECTION
In the previous section, we assumed that a given numberof active users, namelyK per cell, was scheduled for trans-mission. The premise of this section, is that, if too manyMUs simultaneously choose to offload their computationaltasks, the resulting interference on the wireless channel mayrequire an energy consumption at the mobile for wirelesstransmission that exceeds the energy that would be neededfor local computing at some MUs. Moreover, the backhauland computing delays may make the latency constraint inproblem (P.1) impossible to satisfy, and thus problem (P.1)infeasible. For theses reasons, in this section, we consider userselection with the aim of maximizing the number of MUs
that perform offloading while guaranteeing that the selectedMUs can satisfy their latency constraints and, at the sametime, consume less energy than with local computing. In therest of this section, the local computation energy model isfirst elaborated on and then the user scheduling problem isformulated and tackled by integrating SCA with smoothlp-norm approximation methods.
A. Local Computation Energy
When the application is executed at the mobile device, theenergy consumptionEM
in is determined by the number ofCPU cycles required by the application,Vin , and by the clockfrequency of the device chip, which is denoted here asFin .In particular, for CMOS circuits, the energy per operation isproportional to the square of the supply voltage to the chip,and when the supply voltage is low, the clock frequency ofthe chip is a linear function of the voltage supply [47]. As aresult, the mobile energy for computing can be expressed as
EMin = κVinF
2in , (24)
whereκ is the effective switched capacitance, which dependson the MU processor architecture, and the clock frequency isselected so as to meet the latency constraint, yielding
Fin =Vin
Tin
. (25)
By plugging this into (24), we obtain the total consumptionenergy for mobile execution as
EMin = κ
V 3in
T 2in
. (26)
For each MUin, offloading is advantageous when the energyfor local mobile computing is higher than the energy requiredfor offloading, i.e.,
EMin ≥ Eul
in
(Qul
), (27)
whereEulin
(Qul
)is given in (6). Note that here we consider
for brevity only the uplink energy contribution in (14).
B. User Scheduling
To proceed, we introduce the auxiliary slack variables(xin , yin) for each MU in measuring the violation of thelatency constraint C.1 in (P.1) and the energy constraint(27), respectively. Our system design becomes maximizing thenumber of MUs that can perform offloading, while satisfyingthe latency constraints and guaranteeing energy savings withrespect to local computation when offloading is performed.This amounts to maximizing the number of MUsin with noviolation of the mentioned constraints, i.e., withxin = 0 andyin = 0. This is done here by minimizing theℓ0-norm
‖x‖0 + ‖y‖0 =∑
in∈X
I(xin > 0) +∑
in∈X
I(yin > 0), (28)
whereI is an indicator function that returns1 for xin , yin > 0and 0 otherwise, also definex , [xin ]in∈X and y ,
[yin ]in∈X . The sum of thel0-norms of the slack vectorsxandy in (28) counts the number of constraints C.1 and C.2
7
that are violated by the users. Therefore, minimizing this sumenforces the selection of users that satisfy the largest numberof constraints. Accordingly, the problem, defined for generalityover any arbitrary subset of usersX ⊆ I, reads
minQul,Qdl,f ,cul,cdl,x,y
‖x‖0 + ‖y‖0
s.t. C.1BI
in
Wulrulin
(Qul)+
BIin
culin
Culn
+Vin
finFc+
BOin
cdlin
Cdln
+BO
in
Wdlrdlin
(Qdl)− Tin ≤ xin , ∀in ∈ X ,
C.2 Eulin
(Qul
)− EM
in ≤ yin , ∀in ∈ X ,C.3 xin ≥ 0, yin ≥ 0, ∀in ∈ X ,C.2−C.4 of (P.1), ∀in ∈ X .
(P.4)DefineZ ,
(Qul,Qdl, f , cul, cdl,x,y
).
The scheduling algorithm is described in Algorithm 2.The algorithm maximizes the number of scheduled MUs byprogressively removing the MUsin that have the largestentries in the vectorw , x/
∑
in∈X xin + y/∑
in∈X yin ,which measures the relative amount, with respect to all usersin X , by which a user violates the two constraints. Notethat the normalizations by
∑
in∈X xin and∑
in∈X yin ensurethat the two constraints are considered on an equal footing.The outlined iterative procedure is repeated until a subsetofusersX ∗ is found for which problem (P.4) returns vectorwthat is close to zero, signifying feasibility of offloading underconstraints C.1-C.4 of (P.1) as well as (27). We observe that,unlike the admission control scheme in [39, Algorithm 2], theproposed algorithm requires two set of auxiliary variablesinorder to account for the constraints C.1 and C.2.
Algorithm 2 returns the solutions∗, from which the setof MUs scheduled for offloading is obtained asX ∗ ,
{π1, . . . , πs∗}. In more details, upon obtaining the solutionof (P.4), the set of MUs is ordered according to the respectivevalues of the entries of vectorw. Then, the subsetX ∗
of scheduled users is computed by bisection. In particular,bisection searches for the minimum number of users inthe interval [0,KNc], where KNc is the total number ofusers, that should be removed, so that the rest of the userscan be scheduled for offloading while satisfying the desiredconstraints. Specifically, as described in Algorithm 2, setX ∗ is defined asX ∗ , {π1, . . . , πs∗}, where the value ofs∗ ∈ [0,KNc] is found by successively searching within theinterval [Klow,Kup], which is initialized as[0,KNc]. At eachstep, first, the search interval is halved using the midpoints. Then, a feasibility test is performed to check whether theconstraints C.1-C.4 of (P.1) can be met if the subset of MUsX [s] , {π1, . . . , πs} is scheduled and the limits[Klow,Kup]is updated accordingly. The feasibility test is carried outbysolving an instance of problem (P.4) over the subset of MUsX [s]. The feasibility status is determined by the value of theresulting auxiliary variables, i.e., the problem is considered tobe feasible if the slack variables are smaller than a positivevalueη close to zero.
We now discuss how to solve problem (P.4). Problem(P.4) is non-convex due to the non-convexity of the objectiveand of the constraints C.1 and C.2. Based on the limit‖x‖0 = limp→0 ‖x‖
pp = limp→0
∑
in∈X |xin |p, the objective
function of (P.4) can be approximated by a higher-order norm
to make the problem mathematically tractable. In particular,in a manner similar to [39], we adopt the smooth objective
f(x,y) ,∑
in∈X
(x2in + ǫ2)p/2 +
∑
in∈X
(y2in + ǫ2)p/2, (29)
whereǫ > 0 is a small fixed regularization parameter. Substi-tuting (29) as the objective in (P.4), we now apply the SCAapproach to obtain a local optimal solution of the resultingproblem.
To this end, a convex upper bound satisfying conditionsA1-A3 described in Sec. III-A for the smoothedℓp-normobjective function (29) can be obtained from the result in [39,Proposition 1] and is given by
f (Z;Z (v)) ,∑
in∈X
ω′inx
2in+
∑
in∈X
ω′′iny
2in+f (Z;Z (v)) , (30)
where ω′in = p
2
[
(xin(v))2 + ǫ2
] p
2−1
and
ω′′in
= p2
[
(yin(v))2 + ǫ2
] p
2−1
; Z (v) ,(Qul (v) ,Qdl (v), f (v) , cul (v), cdl (v),x (v),y (v)
); and
The functionf (Z;Z (v)) ,γqul
2
∑
in∈I
∥∥Qul −Qul (v)
∥∥2+
γqdl
2
∑
in∈I
∥∥Qdl −Qdl (v)
∥∥2
+γf
2 ‖f − f (v)‖2 +γcul
2
∥∥cul − cul (v)
∥∥2
+γcdl
2
∥∥cdl − cdl (v)
∥∥2
+γx
2 ‖x− x (v)‖2 +γy
2 ‖y − y (v)‖2 is added to realizethe strong convexity of (30) withγx, γy > 0.
The convexification of constraint C.1 is done as in (23).Lastly, to obtain an inner convexification for the energyconstraint C.2 that satisfies the conditions B1-B6 in Sec. III-B,we utilize the concave-convex structure of the rate functionrulin(Qul
)as in (18), to rewrite constraint C.2 as
tr(Qul
in
)−
EMin
BIin
rulin+ (
Qul)−
EMin
BIin
rulin− (
Qul−in
)≤ yin , (31)
whererulin+ (
Qul)
andrulin− (
Qul−in
)are given in (18). Using
the linearization (19), we then obtain the desired upper boundon C.2 as
tr(Qul
in
)−
EMin
BIin
rulin(Qul;Qul (v)
)≤ yin . (32)
Given a feasible pointZ (v), we define the following stronglyconvex problem
Z (Z (v)) , argminQul,Qdl,f ,cul,cdl,x,y
f (Z;Z (v))
s.t.
C.1 gin(Qul,Qdl;Z (v)
)+
BIin
culinCuln
+BO
in
cdlinCdln
+Vin
finFc− Tin ≤ xin , ∀in ∈ X ,
C.2 tr(Qul
in
)−
EMin
BIin
rulin(Qul;Qul (v)
)≤ yin , ∀in ∈ X ,
C.3 xin ≥ 0, yin ≥ 0, ∀in ∈ X ,
C.2−C.4 of (P.1), ∀in ∈ X , (P.5)
8
whereZ (Z) , (Qul, Qdl, f , cul, cdl, x, y) denote the uniquesolution of (P.5). The SCA scheme for solving (P.5) is de-scribed in Algorithm 3. As a technical note, we observe thathere, since the approximant (30) of the objective function (29)is an upper bound on (29), convergence of Algorithm 3 isguaranteed also by settingγ(v) = 1 [37, Sec. III-A].
Algorithm 2: User SchedulingInput: Parameters used by Algorithm 3.
1: Solve problem (P.4) using Algorithm 3 forX = I toobtainw ,
x∑
in∈I xin
+y
∑
in∈I yinand sort the MUs
in ascending order of the value ofw aswπ1≤ wπ2
≤ . . . ≤ wπKNc, whereπ is a permutation of
I.2: Set:Klow = 0, Kup = KNc.3: Repeat4: Sets← ⌊Klow+Kup
2 ⌋.5: Perform the feasibility test by solving (P.4) using
Algorithm 3 for X = X [s] , {π1, . . . , πs}: if it isfeasible, setKlow = s; otherwise, setKup = s.
6: Until Kup−Klow = 1.7: Sets∗ = Klow andX ∗ = {π1, . . . , πs∗}.
Output: Number of scheduled MUss∗.
Algorithm 3: SCA Solution for (P.5)
Input: Parameters from Table I;p = 0.5; v = 0; Z (0) ∈ Z;{γ (v)}v ∈ (0, 1]; γqul , γqdl , γf , γcul , γcdl , γx, γy, ǫ > 0;ω′
in(0) = ω′′in(0) = 1.
1: If∣∣∣f (Z;Z (v + 1))− f (Z;Z (v))
∣∣∣ ≤ δ, stop.
2: ComputeZ (Z (v)) from (P.5).
3: SetZ (v + 1) = Z (v) + γ (v)(
Z (Z (v))− Z (v))
.4: Update
ω′in(v + 1) =
p
2
[
(xin(v + 1))2 + ǫ2] p
2−1
,
ω′′in(v + 1) =
p
2
[
(yin(v + 1))2 + ǫ2] p
2−1
.
5: v ← v + 1, and return to step1.Output:
(Qul,Qdl, f , cul, cdl,x,y
).
V. HYBRID EDGE AND CLOUD COMPUTING
In the previous sections, we considered a scenario in whichthe MUs can offload applications to a cloud server. In thissection, we extend the analysis to a more general set-up inwhich the ceNBs are directly connected to local computingservers, also known as cloudlets [5], which may run some ofthe MUs’ applications. Specifically, each ceNB can either ex-ecute the computation task on the behalf of the MU or offloadit to the cloud. LetF ceNB
n be the computation capability inCPU cycles per second of ceNBn, and letf ceNB
in ≥ 0 bethe fraction of the ceNB’s computing power assigned to user
in, so that∑
i
f ceNBin
≤ 1. If implemented at the ceNB, the
execution time of the task of MUin is then given as
∆exe|ceNBin
=Vin
f ceNBin
F ceNBn
. (33)
In the same way, if the cloud processes the task of userin,the execution time∆exe|cloud
inis given by the right-hand side
of (12). The overall latency∆in experienced by each MUincan be expressed as
∆in = ∆ulin + (1− uin)∆
exe|ceNBin
+ uin∆exe|cloudin
+ uin∆bhin +∆dl
in ,(34)
where ∆ulin,∆bh
inand ∆dl
inhave the same definition as in
(5), (13) and (10), respectively;uin is a binary variable thatindicates whether if the task of MUin is processed on theceNB (uin = 0) or on the cloud (uin = 1).
To proceed, we relax the binary variableuin to be definedin the interval [0, 1]. This relaxation not only provides alower bound on the minimum energy expenditure that canbe obtained with a hard choice between cloudlet and cloudoffloading, perhaps more importantly, it also captures a systemin which the input data to the application of the MUin canbe split into two parts, of sizes1− uin and uin , that canbe processed separately at the ceNB and cloud, respectively.In the following, we will adopt this latter justification of themodel.
As in Sec. II and Sec. III, we aim at minimizing the totalenergy consumed by the MUs to execute their tasks remotelyunder latency and power constraints. The problem is given by
minQul,Qdl,u,fceNB,f ,cul,cdl
Eul(Qul
)=∑
in∈I
Eulin
(Qul
in,Qul
−in
)
=∑
in∈I
BIin
tr(Qulin)
rulin
(Qul)
s.t. C.1BI
in
Wulrulin
(Qul)+
(1−uin )Vin
fceNBin
F ceNBn
+uinBI
in
culin
Culn
+uinVin
finFc+
uinBOin
cdlin
Cdln
+BO
in
Wdlrdlin(Qdl)
≤ Tin , ∀in ∈ I,
C.2 0 ≤ uin ≤ 1, ∀in ∈ I,C.3 f ceNB
in, fin ≥ 0,
∑
in∈I
fin ≤ 1, ∀in ∈ I,
K∑
i=1
f ceNBin
≤ 1, ∀n = 1, . . . , Nc,
C.4 culin , cdlin ≥ 0, ∀in ∈ I,
K∑
i=1
culin ≤ 1,K∑
i=1
cdlin ≤ 1, ∀n = 1, . . . , Nc,
C.5 Qulin∈ Qul
in, ∀in ∈ I,
(Qdl
in
)K
i=1∈ Qdl
n , ∀n = 1, . . . , Nc.(P.6)
Note that, unlike (P.1), here we have the additional optimiza-tion variablesu , (uin)in∈I and fceNB ,
(f ceNBin
)
in∈I.
It can be observed that (P.6) is non-convex due to the non-convexity of the objective function and constraint C.1. Wetackle the problem by means of the SCA method convexifyingthe objective function as done in Sec. III-A. Furthermore, weneed to calculate a convex upper bound for the C.1 constraint
9
in (P.6) that satisfies conditions B1-B6 in Sec. III-B. Let uswrite the left-hand side of C.1 by
gin(Qul,Qdl,u, fceNB, f , cul, cdl
),
BIin
Wulrulin (Qul)
+BO
in
W dlrdlin (Qdl)+
(1− uin)Vin
f ceNBin
F ceNBn
+uinVin
finFc
+uinB
Iin
culinCuln
+uinB
Oin
cdlinCdln
.
(35)
To build the desired bound ongin , we observe that the first twoterms can be handled as in Sec. III-B, while for the last fourterms, we observe that they are all ratios of linear functions,we now observe that the relationship
x
y=
1
2
(
x+1
y
)2
−1
2
(
x2 +1
y2
)
, (36)
holds, where the right-hand side is the difference of twoconvex functions ifx ≥ 0 and y > 0. Therefore, a locallytight convex upper bound on the left-hand side of (36) can beobtained by linearizing the concave part as
x
y≤1
2
(
x+1
y
)2
−1
2
(
(xv)2+
1
(yv)2
)
− xv (x− xv)+
1
(yv)3(y − yv) ,
(37)
where superscriptv identifies the point(xv, yv) at which theupper bound is tight. Using (37) to the last four terms in (35),we define the desired approximants by substituting forx andy in (37) the numerator and denominator, respectively, of eachof the last four terms in (35). This yields the SCA procedurein Algorithm 1 with the difference that we defineZ (v) ,(Qul (v) ,Qdl (v) ,u (v) , fceNB (v) , f (v) , cul (v) , cdl (v)
)
andZ ,(Qul,Qdl,u, fceNB, f , cul, cdl
)
Z (Z (v)) , argminQul,Qdl,,u,fceNB,f ,cul,cdl
Eul (Z;Z (v))
s.t.
C.1 gin (Z;Z (v)) ≤ Tin , ∀in ∈ I,
C.2−C.5 of (P.6), (P.7)
where Z (Z (v)) , (Qul, Qdl, u, fceNB , f , cul, cdl) denotesthe unique solution of the strongly convex optimizationproblem (P.7); the objective functionEul (Z;Z (v)) ,∑
in∈I
(
Eulin
(Qul;Qul(v)
)+ Ein (Zin ;Zin (v))
)
where Eulin
(Qul;Qul(v)
)is given as in (16) and we
define E (Zin ;Zin (v)) ,γqul
2
∥∥Qul
in −Qulin (v)
∥∥2
+γqdl
2
∥∥Qdl
in−Qdl
in(v)∥∥2
+ γu
2 ‖uin − uin (v)‖2 +γfceNB
2 ‖f ceNBin
- f ceNBin
(v) ‖2 +γf
2 ‖fin − fin (v)‖2 +γcul
2
∥∥culin − culin (v)
∥∥2
+γcdl
2
∥∥cdlin − cdlin (v)
∥∥2
withγu, γfceNB > 0; and gin (Z;Z (v)) is the locally convexupper bound defined above.
VI. ENHANCED DOWNLINK VIA NETWORK MIMO
So far, we assumed that each ceNBn serves the users{in : i = 1, . . . ,K} in its cell, hence interfering with otherceNBs. In this section, instead, we consider enhanced down-link transmission based on network MIMO. Specifically, weassume that each MU can receive the result of the cloudexecution in the downlink not only from ceNB in the samecell, but also from other ceNBs that cooperatively transmitto the MU. Cooperation among ceNBs is enabled by thetransmission on the backhaul links of the outcome of the cloudexecution for a given MU to multiple ceNBs. We specificallyconsider here the extreme case of full ceNB cooperation, sothat all the ceNBs transmit cooperatively to each MU. Theextension to a more general model with clustered cooperationis straightforward and will not be pursued further.
To express the achievable downlink rate with networkMIMO, it is convenient to define user-centric transmit co-variance matricesQdl
j ∈ CNT×NT for every MU j ∈ I,
where we have definedNT ,∑Nc
n=1 NTn, such that then-th
NTn× NTn
block on the main diagonal, denoted by[Qdlj ]n,
represents the contribution of ceNBn to the transmission toMU j. Note that the out-of-diagonal blocks inQdl
j describethe correlation among signals sent by different ceNBs, whichdesignates cooperative transmission at the ceNBs. The set ofdownlink covariance matrices,Qdl =
{Qdl
j
}
j∈Iis
Qdl ,{Qdl
j ∈ CNT×NT , j ∈ I :
∑
j∈I
tr([Qdl
j ]n)≤ P dl
n , ∀n},
(38)
so that the per-ceNB power constraints are satisfied. Also, it isconvenient to define the channel matrix from all the ceNBs toMU j asGj ∈ C
NRj×NT , whereGj , [Gj1,Gj2, . . . ,GjNc
]and we have setGj,n = Gj . With these definitions, theachievable downlink rate is given by
rdlin(Qdl) = log2 det
(
I+ GHinR
dlin(Q
dl−in)
−1GinQdlin
)
, (39)
with
Rdlin(Q
dl−in) , σ2
wI+∑
im∈I\{in}
GinQdlimGH
in , (40)
andQdl−in
, (Qdljm
)jm 6=in .
In order to enable network MIMO, we need to ensure thatdownlink transmissions from all ceNBs take place at the sametime. To this end, we impose that uplink transmission, com-puting and backhaul transmissions for all MUs are constrainedto be completed by a given timeT1. At time T1, then, thedownlink transmission is initiated and takes a given timeT2.Accordingly, we formulate the optimization problem following
10
the weighted sum approach discussed in Remark 2 as
minQul,Qdl,f ,cul,cdl,T1,T2
∑
in∈I
BIin
tr(Qulin)
rulin
(Qul)+ λ(T1 + T2)
s.t. C.1BI
in
Wulrulin
(Qul)+
BIin
culin
Culn
+Vin
finFc
+BO
in
Cdlmcdl
in,m
≤ T1, ∀in ∈ I,
C.2BO
in
Wdlrdlin
(Qdl)≤ T2, ∀in ∈ I,
C.3 fin ≥ 0,∑
in∈I
fin ≤ 1, ∀in ∈ I,
C.4 culin , cdlin,m
≥ 0,K∑
i=1
culin ≤ 1,∑
j∈I
cdlj,m ≤ 1, ∀in ∈ I,
C.5 Qulin ∈ Q
ulin ,Q
dl ∈ Qdl, ∀in ∈ I,(P.8)
wherecdl ,(cdlin,m
)
in∈I,m, with cdlin,m being the fraction of
the backhaul capacity to ceNBm allocated to transmit theoutput bits intended for MUin; and λ > 0 is a parameterdefining the relative weight of energy and latency.
Problem (P.8) is non-convex due to the non-convexity ofthe objective function and constraints C.1 and C.2. We tacklethis problem using SCA in a way similar to that used for(P.1), i.e., we obtain a convex approximants for the objectivefunctions using (16) and for the latency constraints using (23).The problem is then solved using the SCA procedure describedin Algorithm 1.
VII. N UMERICAL RESULTS
In this section, we present numerical results validatingthe model and algorithms presented in the previous sections.Throughout, we consider a network composed of three cellswith five users in each cell, i.e.,Nc = 3 and K = 5. Alltransceivers are equipped withNTin
= NRn= 2 antennas.
The channel matrices are generated with independent andidentically distributed complex Gaussian entries having zeromean and variance equal to the path loss, which is assumed tobe identical for uplink and downlink. The path loss is givenby 170 dBm between an MU and the ceNB in the samesmall cell and 180 dBm between an MU and the ceNB in theother small cell. The values 170 dBm and 180 dBm can bejustified by considering the Walfish-Ikegami model [48] withdistances of500 meters between MU and ceNB in the samecell and700 meters for MU and ceNB in different cells. Theother parameters of the Walfish-Ikegami model are selectedso as to simulate a small-cell environment in a typical urbansetting as listed in Table II. Targeting a small-cell scenarios,we will explore values of the backhaul capacities as low asfew Mbits/s [35], [36]. Finally, note that settingN0 = −170dBm/Hz [48], yields an average signal-to-noise ratio (SNR)on the direct link of40 dBm per receive antenna and30 dBmon the interference link for a signal transmitted at0.01 Jouleper symbol from a single antenna in the uplink and downlink.Furthermore, unless stated otherwise, we select each numberof input bitsBI
inand output bitsBO
inuniformly at random in
the interval[0.1 − 1] Mbits and we set the number of CPUcycles asVin = 2640×BI
in CPU cycles. These choices reflectcomputational-intensive applications, as demonstrated by the
1 3 5 7 9 11 13 15 17
Iteration index (v)
3500
4000
4500
5000
5500
6000
6500
Mob
ilesum-energy
[J]
Equal Bh and Cloud Alloc.
Equal Cloud Alloc.
Equal Bh Alloc.
Joint Optim.
Fig. 3: Minimum average mobile energy consumption versus itera-tion index (Tin = 0.12 seconds,Nc = 3, K = 5, W ul
= W dl=
100 MHz, BIin andBO
in ∼ U{0.1, 1} Mbits, Vin = 2640×BIin CPU
cycles,Culn = Cdl
n = 100 Mbits/s, andFc = 1011 CPU cycles/s).
measurments in [15], [49]. The cloud capacity isFc = 1011
CPU cycles/s, which is, e.g., obtained by a four-core serverwith Intel Xeon processor with 3.3 GHz that is commerciallyused by Amazon elastic compute cloud (EC2) [16], [50].Other system parameters are set toWul = W dl = 10 MHz,Cul
n = Cdln = 100 Mbits/s, din = 10−5 J/symbol [51], and
Tin = 0.1 seconds. Throughout, averages are intended withrespect to the channel realizations.
parameter value descriptionf 1800 frequency (MHz)θ 45◦ road orientation angle (degree)hTx 3 height of transmitter (meters)hRx 1 height of receiver (meters)hRoof 5.5 mean value of buildings height (meters)sRoof 8 mean value of buildings separation (meters)str wid 5 mean value of street width (meters)ka 54 path loss penalty when ceNB below rooftopkd 5 correlation adjustment constant
TABLE II: Parameters in the Walfish-Ikegami path loss model[48].
We start by illustrating the typical convergence propertiesof the SCA algorithm by plotting in Fig. 3 the averageminimal mobile energy consumptionE
(Qul,Qdl
)versus the
iteration indexv. Besides the joint optimization across uplink,downlink, backhaul and computing resources, studied in Sec.II, we consider the following more conventional solutions:(i) Equal backhaul and cloud allocation:the computingand backhaul resources are equally allocated to all MUs,that is, fin = 1/(NcK) and culin = cdlin = 1/K for allin ∈ I, while the covariance transmit and receive matricesat the physical layer are optimized using SCA [29];(ii)Equal cloud allocation:computing resources at the cloudare equally allocated among the MUs, while the rest of theparameters are jointly optimized using SCA;(iii) Equal back-haul allocation: the backhaul resources are equally allocatedamong the MUs, while the rest of the parameters are jointly
11
0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18
Latency [s]
3500
4000
4500
5000
5500
6000
Mob
ilesum-energy
[J]
Equal Bh and Cloud Alloc.
Equal Cloud Alloc.
Equal Bh Alloc.
Joint Optim.
Fig. 4: Minimum average mobile energy consumption versus thelatency constraintTin , assumed to be the same for all MUs (Nc =
3, K = 5, W ul= W dl
= 100 MHz, BIin and BO
in ∼ U{0.1, 1}
Mbits, Vin = 2640 × BIin CPU cycles,Cul
n = Cdln = 100 Mbits/s,
andFc = 1011 CPU cycles/s).
optimized using SCA. We observe the fast convergence ofSCA, which requires around 17 iterations to obtain resultsclose to convergence (at iteration 17 the termination criterion∣∣E(Qul (v + 1),Qdl (v + 1)
)− E
(Qul (v),Qdl (v)
)∣∣ ≤ δ is
satisfied with δ = 10−3 for α = 10−5). Furthermore, itcan be seen that the proposed joint optimization methodshows a considerable gain compared to the equal allocationof computational and backhaul resources.
The gains that can be accrued by means of joint optimizationare further investigated in Fig. 4 (obtained in the same settingof Fig. 3), which depicts the minimum average mobile energyas a function of the latency constraintsTin , which are assumedto be the same for all MUs. The energy saving due to jointoptimization can be seen to be particularly pronounced in theregime in which the latency constraint is more stringent. Forinstance, atTin = 0.12 seconds, which is the smallest latencyfor which all schemes are feasible, joint optimization saves66% in terms of sum-energy as compared to equal allocation ofbackhaul and cloud resources, 48% as compared to equal cloudallocation, and 32% as compared to equal backhaul allocation.
To account for the case in which the network may oper-ate under asymmetrical bandwidth allocation for uplink anddownlink, we compare the mobile sum-energy consumptionof the schemes considered above in the same set-up of Fig.3 and Fig. 4, with the caveat that we setW dl = 10 MHzand we vary the uplink/downlink bandwidth ratioWul/W dl,in Fig. 5. We observe that joint optimization is especiallyadvantageous in term of mobile energy consumption whenthe uplink bandwidth is more constrained than the downlinkbandwidth. This is because, in this regime, it is particularlyuseful to allocate more computing and backhaul resources tousers with worse channel condition in order to meet the latencyrequirements with minimal energy expenditure. For example,whenWul is five times smaller than the downlink bandwidth,
the joint optimization scheme is50% more energy efficientthan the fixed allocation of backhaul and cloud resources.
0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
W ul/W dl
600
800
1000
1200
1400
1600
1800
2000
2200
2400
Mob
ilesum-energy
[J]
Equal Bh and Cloud Alloc.
Equal Bh Alloc.
Equal Cloud Alloc.
Joint Optim.
Fig. 5: Minimum average mobile sum-energy consumption versusbandwidth allocation ratioW ul/W dl (W dl
= 10 MHz, Nc =
2,K = 2, Tin = 0.09 s, BIin and BO
in ∼ U{0.1, 1} Mbits,Vin = 2640 × BI
in CPU cycles,Culn = Cdl
n = 100 Mbits/s, andFc = 10
11 CPU cycles/s).
While the previous results assume that all MUs performcomputation offloading, we next turn to the study of userselection that is presented in Sec. IV. To account for mobileenergy consumption, we set the effective switched capacitanceof the mobile asκ = 10−26 so that the mobile energyconsumption is consistent with the measurements made in [15]for a Nokia N900 mobile device operating at frequency 500MHz. To this end, we plot the average number of selectedMUs as a function of the latency constraintTin , assumed tobe the same for all MUs, for different values of backhaul inFig. 6. The figure shows the results obtained by means of theefficient algorithm proposed in Sec. IV withη = 10−5, as wellas by anexhaustive searchstrategy in which, for every subsetof users, a joint optimization problem is solved as per Sec. IIusing SCA (Algorithm 1) and then the subset of users whichyields the minimal energy consumption is selected. The figuredemonstrates how a less restrictive latency constraint and/or alarger backhaul increases the number of users that should beallowed to offload. For instance, forTin ≤ 0.07 seconds, thereis no MU on average offloads its applications since offloadingis more energy demanding. Instead, forTin > 0.2 seconds, aslong as the backhaul capacity is larger than500 Mbits/s, allMUs tend to offload. It is also observed that the proposedscheme selects a number of users close to that chosen byexhaustive search.
The corresponding overall energy consumption for the ex-ample of Fig. 6 with backhaul capacityCul
n = Cdln = 100
Mbits/s is shown in Fig. 7. Note that the overall energyconsumption is the sum of the mobile computing energy forthe MUs that perform local computing, namely (26), and ofthe energy used for transmission, namely (6), for the MUsthat offload. For reference, we also show the energy requiredwhen all MUs perform their tasks locally. The proposed user
12
0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2
Latency [s]
0
0.5
1
1.5
2
2.5
3
3.5
4Averagenumber
ofselected
MUs
Culn = 500 Mbits/s (ES)
Culn = 500 Mbits/s (US)
Culn = 100 Mbits/s (ES)
Culn = 100 Mbits/s (US)
Culn = 50 Mbits/s (ES)
Culn = 50 Mbits/s (US)
Culn = 10 Mbits/s (ES)
Culn = 10 Mbits/s (US)
US: Proposed User SelectionES: Exhaustive Search User Selection
Fig. 6: Average number of selected MUs versus latency constraintTin for both the proposed efficient scheme in Sec. IV and exhaustivesearch (Nc = 2,K = 2, BI
in = BOin = 1 Mbits, Vin = 10
9 CPUcycles,Fc = 10
11 CPU cycles/s, andCdln = Cul
n ).
0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17
Latency [s]
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
Mob
ilesum-energy
[J]
Local Computing
Proposed User Selection
Exhaustive Search User Selection
Fig. 7: Minimum average mobile energy consumption versus thelatency constraintTin (Nc = 2, K = 2, BI
in = BOin = 1 Mbits,
Vin = 109 CPU cycles,Cul
n = Cdln = 100 Mbits/s, andFc = 10
11
CPU cycles/s).
selection scheme is seen to achieve a near-optimal energyperformance compared to the exhaustive search method. Asan example, when the latency constraintTin = 0.17 seconds,around 83% energy saving can be obtained with the proposeduser offloading selection scheme as compared tolocal com-puting.
In the previous results, central cloud processing was as-sumed. We now investigate the optimal allocation of comput-ing tasks between the edge and the cloud in the set-up studiedin Sec. V. To this end, in Fig. 8, we plot the fractions(uin)in∈I
of the application of each MUin to be performed at the cloudas a function of the backhaul capacity when the computingcapability of the cloudlet is given byF ceNB
n = 0.1Fc.To highlight the impact of asymmetries in the offloading
2 2.5 3 3.5 4 4.5 5 5.5 6
Backhaul capacity [Mbits/s]
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Optimal
fraction
ofprocessingat
theclou
d(u
i n)
u11
u21
u12
u22
Fig. 8: Offloading cloud vs. cloudlet indicator versus the backhaulcapacity (Nc = 2,K = 2, Tin = 0.9 seconds,BI
11= BO
11= 1
Mbits, BI21
= BO21
= 0.7 Mbits, BI12
= BO12
= 0.5 Mbits, BI22
=
BO22
= 0.1 Mbits; V11= 10
9 CPU cycles,V21= 0.7V11
, V12=
0.5V11and V22
= 0.1V11, Cul
n = Cdln = 100 Mbits/s, F ceNB
n =
1010 CPU cycles/s, andFc = 10
11 CPU cycles/s).
requirements, we assume here that in the first cell the twousers haveBI
11 = BO11 = 1 Mbits, V11 = 109 CPU cycles, and
BI21 = BO
21 = 0.7 Mbits, V21 = 0.7V11 CPU cycles, while theusers in the second cell have smaller offloading requirementsfor data transfer and computing, normallyBI
12 = BO12 = 0.5
Mbits, V12 = 0.5V11 CPU cycles, andBI22 = BO
22 = 0.1Mbits, V22 = 0.1V11 CPU cycles. It is observed that, when thebackhaul capacity is limited, tasks tends to be performed atthe edge, especially for users with more stringent data transferand computing requirements, for example, atCul
n = Cdln = 3
Mbits/s, the two users in the first cell execute50% and80%percent of their offloaded tasks at the cloud, while the usersin the second cell have a complete tasks execution at thecloud due to the moderate number of input/output bits. Asthe backhaul capacity is increased, more tasks are executedatthe cloud, benefiting from the improved connection betweenceNBs and the cloud.
To obtain further insights, Fig. 9. shows the mobile sum-energy consumption versus the backhaul capacity for cloudmobile computing, whereby all users offload to the cloud,and edge mobile computing, whereby all users offload to thelocal cloudlet, forNc = 2,K = 2, andF ceNB
n = 1010 CPUcycles/s. We also include the performance of the solutionintroduced in Sec. V, which allows for hybrid edge-cloudoffloading. For cloud offloading, we consider different valuesfor computational capacityFc of the cloud. It is seen that whenthe backhaul capacity constraint is stringent, edge offloadingis more energy efficient as compared to cloud offloading, e.g.,by about 44% for cloud computational capacityFc = 1010
CPU cycles/s and backhaul capacityCuln = Cdl
n = 2.4Mbits/s. Furthermore, larger values of cloud capacityFc cancompensate for limited backhaul resources, making cloudcomputing preferable to edge computing. It is also observedthat the hybrid edge-cloud offloading scheme can significantly
13
2.4 2.5 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4
Backhaul capacity [Mbits/s] ×107
1000
1500
2000
2500
3000
3500
4000
4500
5000Mob
ilesum-energy
[J]
Edge offloading
Cloud offloading (Fc = 1010)
Hybrid offloading (Fc = 1010)
Cloud offloading (Fc = 1011)
Cloud offloading (Fc = 1012)
Fig. 9: Minimum average mobile sum-energy consumption forcloud and edge offloading, along with the proposed hybrid edge-cloud offloading scheme, versus backhaul capacityCul
n = Cdln
(Nc = 2, K = 2, Tin = 0.1 s, W ul= W dl
= 10 MHz, BIin
andBOin ∼ U{0.1, 1} Mbits, Vin = 2640 × BI
in CPU cycles, andF ceNBn = 10
10 CPU cycles/s).
outperform edge computing when backhaul capacity is limited.We finally turn to the performance of the downlink network
MIMO scheme presented in Sec. VI. In Fig. 10, we plotthe average mobile sum-energy with and without cooperativetransmission in the downlink. The performance of cooperativescheme is obtained from the solution of (P.8), while theperformance of non-cooperative scheme is obtained from thesolution of (P.1) under the indicated values of backhaul capac-ity. The users offloading requirements are identical to thatusedin Fig. 6. The key observation here is that the performanceof the downlink cooperative transmission is strongly limitedby the backhaul capacity. For instance, we can see that, withbackhaulCul
n = Cdln = 10 Mbits/s, the cooperative scheme
is about 85% more energy-consuming as compared to thenon-cooperative scheme at latency0.5 seconds. The energyperformance gap between the two schemes is diminished asthe backhaul increases as observed withCul
n = Cdln = 100
Mbits/s. With this value of backhaul capacity, the cooperativescheme is less energy efficient by45% aroundTin = 0.3seconds. However, with backhaulCul
n = Cdln = 10 Gbits/s,
as for a standard fiber optic channel, the cooperative schemestarts to have noticeable energy gains. At latency0.08 seconds,for example, the cooperative scheme attains57% energysaving as compared to the non-cooperative scheme.
VIII. CONCLUDING REMARKS
In this paper, we investigated the design of cloud mobilecomputing systems over MIMO cellular networks as a jointoptimization problem over radio, computational resourcesandbackhaul resources in both uplink and downlink directions.Aniterative algorithm based on successive convex approximationswas presented for solving the resulting non-convex problemunder latency and power constraints. Numerical results showthat the proposed joint optimization yields significant energy
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
Latency [s]
0
1000
2000
3000
4000
5000
6000
Mob
ilesum-energy
[J]
cooperative (Culn = 10 Gbits/s)
non-cooperative (Culn = 10 Gbits/s)
non-cooperative (Culn = 100 Mbits/s)
cooperative (Culn = 100 Mbits/s)
non-cooperative (Culn = 10 Mbits/s)
cooperative (Culn = 10 Mbits/s)
Fig. 10: Minimum average mobile energy consumption versus thelatency constraintTin , assumed to be the same for all MUs (Nc =
2,K = 2, BI11
= BI12
= 1 Mbits, BI21
= BI22
= 0.1 Mbits,BO
11= BO
12= 1 Mbits, BO
21= BO
22= 0.1 Mbits, V11
= V12= 10
9
CPU cycles,V21= V22
= 108 CPU cycles,Fc = 10
11 CPU cycles/s,andCdl
n = Culn ).
saving compared to the conventional solutions based on sepa-rate allocations of computing and/or backhaul resources. Thissaving is more pronounced in low latency regimes, where,in our results, it leads to energy saving as high as66%.The mentioned baseline problem was further generalized inseveral directions. First, we tackled user selection with theaim of ensuring an energy savings for all users that performoffloading. Then, a hybrid architecture that leverages bothedgeand cloud computing was studied by addressing the optimalallocation between cloud and edge. It was seen that the optimalallocation is significantly affected by the backhaul capacity.Finally, joint downlink transmission based on network MIMOwas considered, demonstrating the critical importance of thebackhaul for the viability of this technique. Among the prob-lems left open by this study, we mention here the considerationof higher-layer aspects such as queuing in the definition of thelatency.
REFERENCES
[1] K. Kumar, J. Liu, Y.-H. Lu, and B. Bhargava, “A survey of computationoffloading for mobile systems,”Mobile Netw. App., vol. 18, no. 1, pp.129–140, Feb. 2013.
[2] H. T. Dinh, C. Lee, D. Niyato, and P. Wang, “A survey of mobilecloud computing: architecture, applications, and approaches,” Wirelesscommun. mobile comput., vol. 13, no. 18, pp. 1587–1611, Dec. 2013.
[3] Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in mobilecloud computing: taxonomy and open challenges,”IEEE Commun.Surveys Tuts., vol. 16, no. 1, pp. 369–392, Feb. 2014.
[4] L. Lei, Z. Zhong, K. Zheng, J. Chen, and H. Meng, “Challenges onwireless heterogeneous networks for mobile cloud computing,” IEEEWireless Commun., vol. 20, no. 3, pp. 34–44, Jun. 2013.
[5] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case forVM-based cloudlets in mobile computing,”IEEE Pervasive Comput.,vol. 8, no. 4, pp. 14–23, Oct. 2009.
[6] M. Maier, M. Chowdhury, B. P. Rimal, and D. P. Van, “The tactileinternet: Vision, recent progress, and open challenges,”IEEE Commun.Magazine, vol. 54, no. 5, pp. 138–145, May 2016.
14
[7] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, “Mobileedge computing–A key technology towards 5G,” ETSI White Paper, no.11, tech. rep., Sep. 2015.
[8] B. Han, P. Hui, V. S. A. Kumar, M. V. Marathe, J. Shao, and A.Srini-vasan, “Mobile data offloading through opportunistic communicationsand social participation,”IEEE Trans. Mobile Comput., vol. 11, no. 5,pp. 821–834, May 2012.
[9] E. Cuervo, A. Balasubramanian, D.-k. Cho, A. Wolman, S. Saroiu,R. Chandra, and P. Bahl, “MAUI: making smartphones last longer withcode offload,” inProc. of the 8th international conference on Mobilesystems, applications, and services, San Francisco, CA, USA, Jun. 2010,pp. 49–62.
[10] S. Kosta, A. Aucinas, P. Hui, R. Mortier, and X. Zhang, “Thinkair:Dynamic resource allocation and parallel execution in the cloud formobile code offloading,” inProc. IEEE INFOCOM, orlando, FL, USA,Mar. 2012, pp. 945–953.
[11] M. Whaiduzzaman, A. Naveed, and A. Gani, “MobiCoRE: Mobiledevice based cloudlet resource enhancement for optimal task response,”IEEE Trans. Services Comput., vol. 9, no. 99, pp. 1–15, May 2016.
[12] N. Fernando, S. W. Loke, and W. Rahayu, “Mobile cloud computing:A survey,” Future Generation Computer Systems, vol. 29, no. 1, pp.84–106, Jan. 2013.
[13] J. E. Costa and J. J. Rodrigues,Mobile cloud computing: Technologies,services, and applications. IGI Global, 2013.
[14] D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithmfor mobile computing,”IEEE Trans. Wireless Commun., vol. 11, no. 6,pp. 1991–1995, Jun. 2012.
[15] A. P. Miettinen and J. K. Nurminen, “Energy efficiency ofmobile clientsin cloud computing,” inProc. the 2nd USENIX Conference on Hot Topicsin Cloud Computing, Boston, MA, USA, pp. 4-11, Jun. 2010.
[16] K. Kumar and Y. H. Lu, “Cloud computing for mobile users:Canoffloading computation save energy?”Computer, vol. 43, no. 4, pp. 51–56, Apr. 2010.
[17] Y.-H. Kao, B. Krishnamachari, M.-R. Ra, and F. Bai, “Hermes: Latencyoptimal task assignment for resource-constrained mobile computing,”in Proc. IEEE INFOCOM, Kowloon, Hong Kong, pp. 1894-1902, Apr.2015.
[18] M. Kamoun, W. Labidi, and M. Sarkiss, “Joint resource allocation andoffloading strategies in cloud enabled cellular networks,”in Proc. IEEEICC, London, UK, pp. 5529-5534, Jun. 2015.
[19] S. Khalili and O. Simeone, “Inter-layer per-mobile optimization of cloudmobile computing: A message-passing approach,”Submitted to IEEETrans. Commun., Aug. 2015.
[20] X. Xiang, C. Lin, and X. Chen, “Energy-efficient link selection andtransmission scheduling in mobile cloud computing,”IEEE WirelessCommun. Letters, vol. 3, no. 2, pp. 153–156, Apr. 2014.
[21] Y. Yu, J. Zhang, and K. Ben Letaief, “Joint subcarrier and CPU timeallocation for mobile edge computing,”arXiv preprint:1608.06128, aug.2016.
[22] O. Munoz, A. Pascual-Iserte, and J. Vidal, “Optimization of radio andcomputational resources for energy efficiency in latency-constrainedapplication offloading,”IEEE Trans. Veh. Technol., vol. 64, no. 10, pp.4738–4755, Oct. 2015.
[23] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. O. Wu, “Energy-optimal mobile cloud computing under stochastic wireless channel,”IEEE Trans. Wireless Commun., vol. 12, no. 9, pp. 4569–4581, Sep.2013.
[24] Y. Wang, M. Sheng, X. Wang, L. Wang, W. Han, Y. Zhang, and Y. Shi,“Energy-optimal partial computation offloading using dynamic voltagescaling,” inProc. IEEE ICCW, London, UK, pp. 2695–2700, Sep. 2015.
[25] C. You, K. Huang, and H. Chae, “Energy efficient mobile cloud com-puting powered by wireless energy transfer,”arXiv preprint:1507.04094,Jul. 2015.
[26] X. Chen, “Decentralized computation offloading game for mobile cloudcomputing,” IEEE Trans. Parallel Distributed Sys., vol. 26, no. 4, pp.974–983, Apr. 2015.
[27] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computationoffloading for mobile-edge cloud computing,”IEEE/ACM Trans. Netw.,vol. 9, no. 99, pp. 1–14, Sep. 2015.
[28] R. Kaewpuang, D. Niyato, P. Wang, and E. Hossain, “A framework forcooperative resource management in mobile cloud computing,” IEEE J.Sel. Areas Commun., vol. 31, no. 12, pp. 2685–2700, December 2013.
[29] S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radioand computational resources for multicell mobile-edge computing,”IEEE Trans. Signal Inf. Process. Over Netw, vol. 1, no. 2, pp. 89–103,Jun. 2015.
[30] C. You, K. Huang, H. Chae, and B. H. Kim, “Energy-efficient resourceallocation for mobile-edge computation offloading,”IEEE Trans. Wire-less Commun., vol. PP, no. 99, pp. 1–1, Dec. 2016.
[31] M. Molina, O. Muoz, A. Pascual-Iserte, and J. Vidal, “Joint schedulingof communication and computation resources in multiuser wirelessapplication offloading,” inProc. IEEE PIMRC, Washington, DC, USA,pp. 1093-1098, Sep. 2014.
[32] K. Zhang, Y. Mao, S. Leng, Q. Zhao, L. Li, X. Peng, L. Pan,S. Maharjan, and Y. Zhang, “Energy-efficient offloading for mobile edgecomputing in 5G heterogeneous networks,”IEEE Access, vol. 4, pp.5896–5907, Aug. 2016.
[33] J. Guo, Z. Song, and Y. Cui, “Energy-efficient resource allocation formulti-user mobile edge computing,”arXiv preprint:1611.01786, Nov.2016.
[34] T. Zhao, S. Zhou, X. Guo, Y. Zhao, and Z. Niu, “A cooperativescheduling scheme of local cloud and internet cloud for delay-awaremobile cloud computing,” inProc. IEEE Globecom Workshops, Dec2015, pp. 1–6.
[35] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K.Soong, and J. C. Zhang, “What will 5G be?”IEEE J. Sel. AreasCommun, vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[36] J. Oueis, E. Calvanese-Strinati, A. D. Domenico, and S.Barbarossa,“On the impact of backhaul network on distributed cloud computing,”in Proc. IEEE WCNCW, Istanbul, Turkey, pp. 12-17, Apr. 2014.
[37] G. Scutari, F. Facchinei, and L. Lampariello, “Parallel and distributedmethods for constrained nonconvex optimization-part I: Theory,” IEEETrans. Signal Processing, to appear, 2016. On-line: arXiv:1410.4754.
[38] G. Scutari, F. Facchinei, L. Lampariello, S. Sardellitti, and P. Song,“Parallel and distributed methods for nonconvex optimization–part II:Applications in communications and machine learning,”IEEE Trans.Signal Processing, to appear, 2016. On-line: arXiv:1601.04059.
[39] Y. Shi, J. Cheng, J. Zhang, B. Bai, W. Chen, and K. B. Letaief,“Smoothed Lp-minimization for green cloud-RAN with user admissioncontrol,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, Apr. 2016.
[40] D. Gesbert, S. Hanly, H. Huang, S. S. Shitz, O. Simeone, and W. Yu,“Multi-cell MIMO cooperative networks: A new look at interference,”IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1380–1408, Dec. 2010.
[41] D. Tse and P. Viswanath,Fundamentals of wireless communication.Cambridge university press, 2005.
[42] A. Lapidoth, “Nearest neighbor decoding for additive non-Gaussiannoise channels,”IEEE Trans. Information Theory, vol. 42, no. 5, pp.1520–1529, Sep. 1996.
[43] Y. Nesterov and A. Nemirovskii,Interior-point polynomial algorithmsin convex programming. Siam, 1994, vol. 13.
[44] T. Lipp and S. Boyd, “Variations and extension of the convex–concaveprocedure,”Optimization and Engineering, pp. 1–25, Nov. 2015.
[45] S. Boyd and L. Vandenberghe,Convex optimization. Cambridgeuniversity press, 2004.
[46] H. Sun, X. Chen, Q. Shi, M. Hong, and X. Fu, “Learning to optimize:Training deep neural networks for wireless resource management,”Submitted, Sep. 2016.
[47] T. D. Burd and R. W. Brodersen, “Processor design for portablesystems,”J. VLSI Signal Process. Syst., vol. 13, no. 2-3, pp. 203–221,Aug. 1996.
[48] A. R. Mishra,Advanced Cellular Network Planning and Optimisation:2G/2.5G/3G and Evolution to 4G, 1st ed. Wiley Publishing, 2006.
[49] E. Lagerspetz and S. Tarkoma, “Mobile search and the cloud: Thebenefits of offloading,” inProc. IEEE PERCOM Workshops, St. Louis,MO, USA, pp. 117-122, Mar. 2011.
[50] A. LLC, “Amazon elastic compute cloud (EC2) website,” 2016.[Online]. Available: https://aws.amazon.com/ec2/
[51] N. Balasubramanian, A. Balasubramanian, and A. Venkataramani, “En-ergy consumption in mobile phones: A measurement study and impli-cations for network applications,” inProc. ACM SIGCOMM InternetMeasurement Conference, Chicago, IL, Nov. 2009, pp. 280–293.
15