Joint Uplink/Downlink Optimization for Backhaul … › pdf › 1607.06521.pdfMobile cloud computing enables the ofﬂoading of com-putationally heavy applications from Mobile Users

arX

iv:1

607.

0652

1v2

[cs.

NI]

3 F

eb 2

017

1

Joint Uplink/Downlink Optimization forBackhaul-Limited Mobile Cloud Computing with

User SchedulingAli Al-Shuwaili, Osvaldo Simeone, Alireza Bagheri, and Gesualdo Scutari

Abstract—Mobile cloud computing enables the offloading ofcomputationally heavy applications, such as for gaming, objectrecognition or video processing, from mobile users (MUs) tocloudlet or cloud servers, which are connected to wireless accesspoints, either directly or through finite-capacity backhaul links.In this paper, the design of a mobile cloud computing system isinvestigated by proposing the joint optimization of computing andcommunication resources with the aim of minimizing the energyrequired for offloading across all MUs under latency constraintsat the application layer. The proposed design accounts for multi-antenna uplink and downlink interfering transmissions, with orwithout cooperation on the downlink, along with the allocationof backhaul and computational resources and user selection.The resulting design optimization problems are non-convex, andstationary solutions are computed by means of successive convexapproximation (SCA) techniques. Numerical results illustratethe advantages in terms of energy-latency trade-off of the jointoptimization of computing and communication resources, aswellas the impact of system parameters, such as backhaul capacity,and of the network architecture.

Index Terms—Mobile cloud computing, 5G, Successive con-vex approximation, Application offloading, Backhaul, Cloudlet,Latency, User selection, Network MIMO.

I. I NTRODUCTION

Mobile cloud computing enables the offloading of com-putationally heavy applications from Mobile Users (MUs)to a remote server by means of a wireless communicationnetwork [1]–[5]. Given the battery-limited nature of mobiledevices, mobile cloud computing makes it possible to provideservices, such as gaming, object recognition, video processing,or augmented reality, that may otherwise not be available tomobile users. Offloading may take place to remote serversin the “cloud”, which is accessed by the wireless accesspoints via backhaul links, or to “cloudlet” servers that aredirectly connected to an access point [5]. In the latter case, theapproach is also known as mobile edge computing, which iscurrently seen as a key enabler of the so called tactile internet[6] and is subject to standardization efforts [7]. Offloadingto peer mobile devices is also being considered [8]. Exam-ples of architectures that are based on mobile cloud/cloudletcomputation offloading include MAUI [9], ThinkAir [10], and

A. Al-Shuwaili, O. Simeone, and A. Bagheri are with the Center for Wire-less Information Processing (CWiP), Department of Electrical and ComputerEngineering, New Jersey Institute of Technology, Newark, NJ 07102 USA(e-mail: [email protected], [email protected], [email protected]).

G. Scutari is with the Department of Industrial Engineering, PurdueUniversity, West Lafayette, IN 47907 USA (e-mail: [email protected]).

This work was partially supported by the U.S. NSF through grant 1525629.

MobiCoRE [11], while examples of commercial applicationbased on mobile cloud computing include Apple iCloud,Shazam, and Google Goggles [12], [13].

When optimized solely at the application layer, as in most ofthe earlier literature, the design of a mobile cloud computingsystems entail decision of whether a mobile can decreaseits energy expenditure by offloading the execution of theentire application [1], [2], [14]–[16] or of some selectedsubtasks (see, e.g., [17], [18]). Given that offloading requirestransmission and reception on the wireless interface, a moresystematic approach involves the joint optimization of offload-ing decisions and communication parameters, such as uplinkpower allocation [18], [19], link and subcarrier selection[20],[21], and uplink data rate [22], [23].

The joint optimization of the set of subtasks to be offloadedand of the uplink transmission powers was studied in [18], [19]under static channel condition. A related dynamic computationoffloading approach was presented in [14], [23] that deter-mines which application subtasks are to be executed remotelygiven the available wireless network connectivity. Assuminga simplified execution model, in which input data can bearbitrary partitioned for separate processing, references [22],[24] study the fraction of the mobile data to be offloaded tothe cloud, with the rest being executed locally at the mobiledevice. A framework that integrates wireless energy transferand cloud mobile computing was put forth in [25].

The works summarized thus far focus on the operation of asingleMU. In contrast to the single-MU problem formulation,in a scenario with multiple MUs transmitting over a sharedwireless medium across multiple cells, the design of a mobilecloud computing system requires: (i) the management ofinterference for theuplink, through which MUs offload thedata needed for computation in the cloud; (ii ) the managementof interference for thedownlink, through which the outcomeof the cloud computations are fed back to the MUs; (iii ) theallocation ofbackhaulresources for communication betweenwireless edge and cloud; and (iv) the allocation ofcomputingresources at the cloud. Furthermore, the optimization shouldincludeuser selection, or schedulingmechanisms whereby theoffloading users are guaranteed an energy consumption that issmaller than the amount required for local computing at thedevice.

The limited literature on resource allocation and offloadingdecisions for the multiuser case includes papers [26]–[34]. Theproblem of decentralized user scheduling when modeling theaspect (i) of uplink interference is considered in [26], [27] for

http://arxiv.org/abs/1607.06521v2

single-antenna elements within a game-theoretic framework.The joint allocation of radio and computational resourcesis considered in [28] by accounting for the elements (i)and (iv), in the presence of multiple clouds, with the aimof maximizing network operator revenue via resource poolsharing. A problem formulation including elements (i) and(iv) was studied in [29] with MIMO transceivers and for afixed set of scheduled users. Scheduling in a single-cell wasconsidered in [30] with the goal of minimizing the weightedsum mobile energy consumption; it was shown that the optimalscheduling and cloud resource allocation policy (element (iv))have a threshold-based structure. Another scheduling strategyfor multiuser offloading systems in a small-cell set-up ispresented in [31], where the resources are allocated under theobjective of minimizing the average latency experienced bythe worst-case user by accounting for element (iv) with theinclusion of uplink and downlink tasks schedulers. A schemethat jointly optimizes the computation offloading decisionsand the radio resource allocation in heterogeneous networksby accounting for element (i) so as to minimize the mobileenergy expenditure under latency constraints was proposedin[32]. An energy-efficient resource allocation for interference-free multi-users scenario was discussed in [33] with the aimofoptimizing uplink and downlink transmissions duration whileconsidering element (iv). Finally, the problem of schedulingtasks between cloud and edge processors was studied in [34]without modeling the physical layer.

In this paper, we account for all elements (i)-(iv) as well asfor user scheduling. Specifically, our main contributions are:

• We study the problem of minimizing the mobile energyconsumption under latency constraints over uplink anddownlink precoding, uplink and downlink backhaul re-source allocation, as well as cloud computing resourceallocation for general multi-antenna transceivers. Thiswork is mostly motivated by the increasing importance ofbackhaul capacity limitations, which are well understoodto be often the bottleneck in modern dense networkdeployments (see, e.g., [35], [36]). We emphasize that,unlike [29], the problem formulation explicitly modelsthe optimization of downlink communication for down-loading the outcome of the optimization at the MUs aswell as of the backhaul resource allocated to the activeMUs in both uplink and downlink, that is, it accounts forall elements (i)-(iv) listed above. The resulting problemrequires the development of a novel adaptation of the Suc-cessive Convex Approximation (SCA) scheme [37], [38]that accounts for downlink and backhaul transmissions.

• The optimization of users scheduling is tackled jointlywith the operation of uplink/downlink, backhaul and com-putational resources under the key constraint that eachoffloading MU should not consume more energy thanthat required for local computation. The resulting mixedinteger problem is tackled by a means of SCA coupledwith the smoothlp-norm approximation approach [39].We emphasize that this problem was not studied in [30],which instead considered a fixed set of scheduled users.

• A hybrid cloud-edge computing setup is studied in

Fig. 1: Basic system model: Mobile users (MUs) offload the execu-tion of their applications to a centralized cloud processorthrough awireless access network and finite capacity backhaul links.

which, beside a cloud server, “cloudlet” or “edge” serversare available locally at the wireless access points. Thecloudlet servers are able to execute offloaded applicationswithout incurring backhaul latency but with a generallysmaller CPU frequency [5]. For the first time, we inves-tigate here the optimal task allocation between cloudletand the cloud via SCA in the presence of all elements(i)-(iv) discussed above.

• We study the impact of cooperative downlink transmis-sion via network MIMO [40] on the achievable energy-latency trade-off by accounting for the backhaul overheadneeded to deliver user data to multiple access points fortransmission to the MUs. This has also not previouslystudied in the context of mobile cloud computing withbackhaul limitations.

The rest of the paper is organized as follows. Section IIintroduces the system model along with the basic problem for-mulation. The proposed SCA solution is described in SectionIII. Users scheduling is studied in Section IV. Hybrid edge,or cloudlet, and cloud computing is formulated and tackledin Section V. Cooperative downlink transmission is discussedin Section VI. Numerical results are presented in Section VII,while concluding remarks are finally provided in Section VIII.

II. SYSTEM MODEL AND PROBLEM FORMULATION

In this section, we describe the basic system model andproblem formulation that will be adopted in this paper. Thebasic model assumes fixed user scheduling, offloading to acentralized cloud processor, and non-cooperative transmissionat the wireless access points. Generalizations that address userscheduling, local computing capabilities at the access pointsand cooperative transmission will be treated later in SectionIV, Section V and Section VI, respectively.

A. System Model

We consider a network composed ofNc cells of possi-bly different sizes such as micro- or femto-cells. Each celln = 1, . . . , Nc includes a base station, referred to as cloud-enhanced e-Node B (ceNB) borrowing from LTE nomencla-ture, which is connected to a common cloud server that pro-vides computational resources. As shown in Fig. 1, each cell

2

Fig. 2: Timeline of offloading from an MUin. Note that the totaltwo-way backhaul latency is given as∆bh

in = ∆bh,ulin

+∆bh,dlin

.

containsK active mobile users (MUs) that have been sched-uled for the offloading of their local applications to the cloudprocessor. TheK MUs in the same cell transmit in orthogonalspectral resources, in the time or frequency domain. We denoteby in the MU in cell n that is scheduled on thei-th spectralresource, and byI , {in : i = 1, . . . ,K, n = 1, . . . , Nc} theset of all the active MUs in the system. Each MUin andceNB n is equipped withNTin

transmit andNRnreceive

antenna, respectively. Note that MUs in different cells thatare scheduled on the same spectral resources interfere witheach other.

Each MU in wishes to run an application within a givenmaximum latencyTin . The application to be executed ischaracterized by the numberVin of CPU cycles necessaryto complete it, by the numberBI

inof input bits, and by

the numberBOin of output bits encoding the result of the

computation. Table I summarizes the notations and parametersused in the system model.

We next derive energy and latency resulting from offloadingof the applications of all active MUs. The offloading latencyconsist of the time∆ul

inneeded for the MU to transmit the

input bits to its ceNB in the uplink; the time∆exein necessary

for the cloud to execute the instructions; the round-trip time∆bh

infor exchanging information between ceNB and the cloud

through the backhaul link; and the time∆dlin

to send the resultback to the MU in the downlink (see Fig. 2). We can hencewrite the total offloading latency for MUin as

∆in = ∆ulin +∆exe

in +∆bhin +∆dl

in . (1)

The energyEin of each MUin instead depends only on thepower used for transmission in the uplink and reception energyin the downlink. These latency and energy terms are computedas a function of the radio and computational resources asdetailed next.

1) Uplink transmission:The optimization variables at thephysical layer for the uplink are the users’ transmit covariancematricesQul ,

(Qul

in

)

in∈I, whereQul

in= E[xul

inxulH

in] with

xulin∼ CN (0,Qul

in) being the signal transmitted by the user

in. These matrices are subject to power budget constraints

Qulin ,

{

Qulin ∈ C

NTin×NTin : Qul

in � 0, tr(Qulin) ≤ Pul

in

}

,

(2)where Pul

inis the maximum allowed transmit energy per

symbol of MU in. For any given profileQul, the achievabletransmission rate, in bits per symbol, which corresponds tothemutual information evaluated on the MIMO Gaussian channel(see, e.g., [41]) of MUin when multi-user interference istreated as additive Gaussian noise, is

rulin(Q) = log2 det(

I+HHinR

uln (Qul

−in)−1HinQ

ulin

)

, (3)

TABLE I: System Parameters

Parameter DescriptionFc, F ceNB

n cloud and cloudlet computational capacityCul

n , Cdln uplink and downlink backhaul capacity of celln

Hin , Gin uplink and downlink channel matrix for userinPulin

, P dln uplink and downlink power budget constraint

Wul,W dl uplink and downlink bandwidthNo noise power spectral densityNTin

, NRnnumber of transmit and receive antenna

BIin

, BOin

number of input and output bits for userinVin number of CPU cycles for userinTin latency constraint for userinNc,K number of cell and number of users in each celldin reception energy constant for userinκ mobile device switched capacitanceα, δ step size constant and termination accuracy

where

Ruln (Qul

−in) , N0I+∑

jm∈I,m 6=n

HjmnQuljmHH

jmn (4)

is the covariance matrix of the sum of the noise and of theinter-cell interference affecting reception at then-th ceNB in

the i-th spectral resources;Qul−in

,

((Qul

jm

)K

j=1

)Nc

n6=m=1; N0

is the noise power spectral density; andHin is the uplinkchannel matrix for MUin to the ceNB in the celln, whereasHjmn is the cross channel matrix between the interferingMU jm in the cell m and the ceNB in celln. The channelmatrices account for path loss, slow and fast fading. The time,in seconds, necessary for useri in cell n to transmit the inputbits BI

in to its ceNB in the uplink is then

∆ulin

(Qul

)=

BIin

Wulrulin (Qul), (5)

whereWul is the uplink channel bandwidth allocated to eachone of the orthogonal spectral resources. The correspondingenergy consumption due to uplink transmission is then

Eulin

(Qul

)= BI

in

tr(Qul

in

)

rulin (Qul), (6)

since aroundBIin/rulin

(Qul

)channel uses are needed in order

to transmit reliably in the uplink.2) Downlink transmission:The optimization variables for

the downlink are the ceNBs’ transmit covariance matrices(Qdl

in

)K

i=1, which are subject to per-ceNB power constraints

P dln :

Qdln ,

{

(Qdl

in

)K

i=1∈ C

NTin×NTin : Qdl

in� 0,

K∑

i=1

tr(Qdlin) ≤ P dl

n

}

.

(7)Similar to the uplink, we can write the achievable rate in bitsper symbol for each MU in the downlink as

rdlin(Qdl) = log2 det

(

I+GHinR

dln (Qdl

−in)−1GinQ

dlin

)

, (8)

with

Rdln (Qdl

−in) , N0I+∑

jm∈I,m 6=n

GjmnQdljmGH

jmn; (9)

3

the corresponding required transmission time reads

∆dlin

(Qdl

)=

BOin

W dlrdlin (Qdl), (10)

whereGin is the downlink channel matrix between the ceNBin cell n and the MUin; Gjmn is the cross channel matrixbetween the interfering MUjm in the cellm and the ceNB incell n; W dl is the downlink channel bandwidth; andQdl

−in ,((

Qdljm

)K

j=1

)Nc

n6=m=1. The downlink energy consumption is

given by

Edlin

(Qdl

)= BO

in

dinrdlin (Qdl)

, (11)

wheredin is parameter that indicates the receiver energy ex-penditure for each symbol interval. Note that (7)-(8) implicitlyassume that the downlink spectral resources are allocated tothe MUs in the same way as for the uplink, so that MUsin for n = 1, . . . , Nc are mutually interfering in both uplinkand downlink. This assumption can be easily alleviated at thecost of introducing additional notation.

3) Cloud processing:Let Fc be the capacity in terms ofnumber of CPU cycles per second of the cloud; and letfin ≥ 0be the fraction of the processing powerFc assigned to userin, so that

∑

in∈I fin ≤ 1. The time needed to runVin CPUcycles for userin remotely is then

∆exein (fin) =

Vin

finFc. (12)

Define f , (fin)in∈I .4) Backhaul transmission:We denote byCul

n the capacityin bits per second of the backhaul connecting the ceNB in celln with the cloud, and byCdl

n the capacity in bits per secondof the backhaul connecting the cloud with the ceNB in celln. Let culin , c

dlin ≥ 0 be the fraction of the backhaul capacities

Culn andCdl

n , respectively, allocated to MUi in cell n. Wethen have the constraint

∑Ki=1 c

ulin≤ 1 and

∑Ki=1 c

dlin≤ 1 for

all n. Moreover, the time delay due to the backhaul transferbetween ceNBn and the cloud in both directions is given by

∆bhin (c

ulin , c

dlin) =

BIin

culinCuln

+BO

in

cdlinCdln

, (13)

where the first term represents the latency in the uplinkdirection, denoted as∆bh,ul

in(cf. Fig. 2), and the second

term represents the latency in the downlink direction, denotedas ∆bh,dl

inin the same figure. The uplink/downlink backhaul

allocation vectors are defined ascul , (culin)in∈I and cdl ,

(cdlin)in∈I , respectively.

B. Problem Formulation

The total energy consumption due to the offloading of userin data, i.e., transmittingBI

in bits and receivingBOin bits, is

then given by

Ein

(Qul,Qdl

)= Eul

in

(Qul

)+ Edl

in

(Qdl

)

= BIin

tr(Qul

in

)

rulin (Qul)+BO

in

dinrdlin (Qdl)

.(14)

The optimal offloading problem can be stated as the mini-mization of the sum of the energy spent by all MUs to runtheir applications remotely, subject to individual latency andpower constraint. Stated in mathematical terms, we have thefollowing

minQul,Qdl,f ,cul,cdl

E(Qul,Qdl

),∑

in∈I

Ein

(Qul,Qdl

)

=∑

in∈I

BIin

tr(Qulin)

rulin

(Qul)+BO

indin

rdlin

(Qdl)

s.t. C.1BI

in

Wulrulin

(Qul)+

BIin

culin

Culn

+Vin

finFc

+BO

in

cdlin

Cdln

+BO

in

Wdlrdlin

(Qdl)≤ Tin , ∀in ∈ I,

C.2 fin ≥ 0, ∀in ∈ I,∑

in∈I

fin ≤ 1,

C.3 culin , cdlin ≥ 0, ∀in ∈ I,

K∑

i=1

culin ≤ 1,K∑

i=1

cdlin ≤ 1, ∀n = 1, . . . , Nc,

C.4 Qulin∈ Qul

in, ∀in ∈ I,

(Qdl

in

)K

i=1∈ Qdl

n , ∀n = 1, . . . , Nc.(P.1)

Constraint C.1 enforces that the latency for any MUin tobe less than or equal to the maximum tolerable delay ofTin seconds; C.2 imposes the mentioned limit on the cloudcomputational resources; C.3 enforces the limited backhaulcapacities in uplink and downlink; and C.4 guarantee thatpower budget constraint on the radio interface of both uplinkand downlink is satisfied. Note that problem (P.1) dependsonly on the ratiosBI

in/Wul, BI

in/Culn , Vin/Fc, BO

in/Wdl and

BOin/C

dln . We denote byZ ,

(Qul,Qdl, f , cul, cdl

)the set of

all optimization variables, and byZ the feasible set of (P.1).Note that (P.1) is non-convex, due to the non-convexity of theobjective function and of the constraint C.1.

Remark 1. As discussed, the expression in C.1 for thelatency of each user assumes that the communication andprocessing steps, including uplink transmission, edge-to-cloudbackhaul transmission, cloud processing, cloud-to-edge back-haul transmission and downlink transmission take place oneafter the other for each user, as illustrated in Fig. 2. Giventhat the uplink, backhaul, execution and downlink latenciesmay be different across the users, the communication steps ofdifferent users may not be aligned. While this fact may beexploited by a sophisticated receiver that tracks the variationof the interference power within a communication block, theresulting achievable rates are extremely difficult to characterizeand optimize. In contrast, the expression of the uplink anddownlink rates (3) and (8), in which interference is assumedto be caused by all users, are achievable by means of standarddecoders (see, e.g., [42]) and can be efficiently computed andoptimized, as it will be shown in Sec. III.

Feasibility. Problem (P.1) has a non-empty feasible set ifthere exist matricesQul and Qdl such that the followinginequalities hold

a) Tin >BI

in

Wulrulin

(Qul)+

BOin

Wdlrdlin

(Qdl), ∀in ∈ I;

4

b)∑

in∈I

Vin/Fc

Tin −BI

in

Wulrulin

(Qul)+

BOin

Wdlrdlin(Qdl)

≤ α1;

c)K∑

i=1

BIin/Cul

n

Tin −BI

in

Wulrulin

(Qul)+

BOin

Wdlrdlin

(Qdl)

≤ α2, ∀n = 1, . . . , Nc;

d)K∑

i=1

BOin/C

dln

Tin −BI

in

Wulrulin

(Qul)+

BOin

Wdlrdlin

(Qdl)

≤ α3, ∀n = 1, . . . , Nc;

for some α1, α2, α3 ≥ 0 with α1 + α2 + α3 = 1. Infact, if above conditions are met, one can always choose

fin = α−11 (Vin/Fc)

(

Tin −BI

in

Wulrulin

(Qul)−

BOin

Wdlrdlin

(Qdl)

)−1

,

and similarly forculin andcdlin , so that conditions b), c) and d)are satisfied with equality.

Remark 2. When the goal is identifying the achievable trade-off curve between energy consumption and latency, assumingfor simplicity that all MUs have the same latency constraintT ,e.g.,Tin = T , the following problem may also be considered

minQul,Qdl,f ,cul,cdl,T

E(Qul,Qdl

)+ λT

s.t. C.1BI

in

Wulrulin

(Qul)+

BIin

culin

Culn

+Vin

finFc

+BO

in

cdlinCdln

+BO

in

Wdlrdlin(Qdl)

≤ T , ∀in ∈ I,

C.2−C.4 of (P.1),(P.2)

whereλ > 0 is a parameter identifying the desired relativeweight between energy and latency minimization.

Remark 3. The problem formulation (P.1) can be easilyextended to account for a more general backhaul topologyin which the ceNBs are connected to the cloud via a multi-hop network with predefined routes between ceNBs and cloud.We do not further elaborate on this model here, because of thespace limitation.

Remark 4. Problem (P.1) was tacked in [29] for the spe-cial case where the joint optimization was over radio andcomputational resources and only in the uplink direction.Problem (P.1), instead, considers a general setup in whichjoint optimization carried over backhaul capacities in bothuplink and downlink, as well as the optimization over thedownlink radio transmission from the ceNB to the MU. Thisgeneralization entails different formulations for the objectivefunction and for the latency constraint from that in [29] andthus calls for novel SCA-based solution methods.

III. SUCCESSIVE CONVEX APPROXIMATION OPTIMIZATION

Problem (P.1) is non-convex due to the non-convexity of theobjective function and of the constraint C.1. To address thisissue, we leverage the SCA method proposed in [37], [38]for general non-convex problems. The SCA algorithm in [37],[38] is proved to converge to a stationaty solution of the (NP-hard) non-convex probem by solving a sequence of convexsub-problems, each one of which can be solved in polynomialtime, e.g., by interior-point methods [43]. To do so, we needto identify convex approximants for the objective functionandfor the non-convex constraints C.1 that satisfy the conditionsspecified in [37], [38] as discussed next.

A. Convex Surrogate for the Objective Function

Define asK ⊇ Z any compact convex set containingthe feasible setZ such that all functions in (P.1) are welldefined on it. Note that such a set always exists. This setexists by the same arguments used in [29, Sec. IV-B]. LetZ (v) ,

(Qul (v) ,Qdl (v), f (v) , cul (v), cdl (v)

), with v be-

ing the current iterate index of the SCA algorithm. Accordingto [37], [38], in order to be used within the SCA scheme,a convex approximantE (Z;Z (v)) of the objective functionE(Qul,Qdl

)around the current feasible iterateZ (v) ∈ Z

must satisfy the following properties ([37, Sec. II]):A1: E (•;Z (v)) is uniformly strongly convex onK;A2: ∇Qul∗ E (Z (v) ;Z (v)) = ∇Qul∗E

(Qul (v) ,Qdl (v)

)

and ∇Qdl∗ E (Z (v) ;Z (v)) = ∇Qdl∗E(Qul (v) ,Qdl (v)

),

for all Z (v) ∈ Z;A3: ∇Z∗E (•; •) is Lipschitz continuous onK ×Z;where ∇x∗f(x;y) denotes the conjugate gradient of thefunction f(x;y) with respect to its first argumentx. Wenote that, besides the convexity and smoothness conditionsA1 and A3, A2 enforces that the first-order behavior of theapproximant be the same as for the original function. In whatfollows, we derive an approximant for the objective function(14) satisfying A1-A3 above. Let us write

E (Z;Z (v)) ,∑

in∈I

Ein (Zin ;Z (v)) + E (Z;Z (v))

=∑

in∈I

(

Eulin

(Qul;Qul(v)

)+ Edl

in

(Qdl;Qdl(v)

) )

+ E (Z;Z (v)) ,

(15)

where Zin ,(Qul

in,Qdl

in, fin , c

ulin, cdlin

)with the function

E (Z;Z (v)) ,γqul

2

∑

in∈I

∥∥Qul −Qul (v)

∥∥2

+γqdl

2

∑

in∈I

∥∥Qdl −Qdl (v)

∥∥2

+γf

2 ‖f − f (v)‖2 +γcul

2

∥∥cul − cul (v)

∥∥2

+γcdl

2

∥∥cdl − cdl (v)

∥∥2

is added tomake the approximant objective function (15) strongly convexonK, with γqul , γqdl , γf , γcul andγcdl being arbitrary positiveconstants (see [37], [38]).The functionsEul

in

(Qul;Qul(v)

)

and Edlin

(Qdl;Qdl(v)

)are the convex approximants for the

uplink and downlink energy terms, respectively, as derivednext.

1) Convex approximant forEulin

(Qul

): It is not difficult to

check that an approximant that utilize the “partial” convexityof the uplink energy function (6) can be obtained as (cf. [29,Sec. IV-B])

Eulin

(Qul;Qul(v)

)= tr

(Qul

in (v)) BI

in

rulin(Qul

in,Qul

−in(v))

+ tr(Qul

in

) BIin

rulin(Qul

in(v) ,Qul

−in(v))

+∑

jm∈I,m 6=n

⟨

∇Qulin

∗Ejm

(Qul (v)

),Qul

in −Qulin (v)

⟩

,

(16)

where〈A,B〉 , Re{

tr(AHB

)}, and the conjugate gradient

5

∇Qulin

∗Ejm

(Qul (v)

)is given by [29, eq. (18)]

∇Qulin

∗Ejm

(Qul (v)

)=

tr(Qul

jm(v))∆ul

jm

(Qul (v)

)

log (2) ruljm (Qul (v))

· [HHinm(Rul

m

(Qul

−jm (v))−1− (Rul

m

(Qul

−jm (v))

+HjmQuljm (v)HH

jm)−1)Hinm].(17)

2) Convex approximant forEdlin

(Qdl

): To obtain the desired

convex approximant of downlink energy in (11), we need toconstruct the convex surrogate for the downlink rate function,i.e., rdlin

(Qdl;Qdl (v)

). To obtain such an approximant, we

exploit first the concave-convex structure of the rate functionsrdlin(Qdl)

rdlin(Qdl

)= log2 det

(Rdl

n

(Qdl

−in

)+HH

inHinQdlin

)

︸︷︷︸

rdlin

+(Qdl)

− log2 det(Rdl

n

(Qdl

−in

))

︸︷︷︸

rdlin−(Qdl

−in)

,(18)

where rdlin+ (

Qdl)

and rdlin− (

Qdl−in

)are concave functions.

The convex approximant is then obtained as

rdlin(Qdl;Qdl (v)

)= rdl+in

(Qdl

)− rdl−in

(Qdl

−in (v))

−∑

jm∈I

⟨

∇Quljm

∗rdlin− (

Qdl−in (v)

),Qdl

jm −Qdljm (v)

⟩

,

(19)

with

∇Qdljm

∗rdl−in

(Qdl

−in (v))= HH

jmnRdln

(Qdl

−in (v))−1

Hjmn.

(20)Then, we can simply obtain the convex approximant forEdl

in

(Qdl)

as

Edlin

(Qdl;Qdl (v)

)= BO

in

dinrdlin (Qdl;Qdl (v))

. (21)

Remark 5. The key advantage of SCA [37], [38] as com-pared to more conventional approaches such as Difference-of-Convex (DC) programming [44] is that the convex surrogateE (Z;Z (v)) need not be a global upper bound on the functionE(Qul,Qdl) – a condition that appears to be difficult to ensurefor the objective function of (P.1).

B. Inner Convexification of the Constraints

Next, let us consider the latency constraint C.1. Note thatthis constraint differs from the corresponding latency con-straint in [29] by virtue of the contributions due to downlinkand backhaul transmissions. Let us define the non-convex partof the left-hand side of C.1 as

gin(Qul,Qdl

),

BIin

Wulrulin (Qul)+

BOin

W dlrdlin (Qdl). (22)

To apply SCA, we need to obtain an approximantgin(Qul,Qdl;Z(v)

)at the current iterateZ (v)) ∈ Z

that satisfies the following properties ([37, Sec. II]):B1: gin

(•,Z(v)

)is uniformly convex onK;

B2: ∇Z∗ gin(Qul (v) ,Qdl (v) ;Z(v)

)=

∇Z∗gin(Qul (v) ,Qdl (v)

), for all Z(v) ∈ Z;

B3: ∇Z∗ gin(•; •) is continuous onK ×Z;B4: gin

(Qul,Qdl;Z(v)

)≥ gin

(Qul,Qdl

), for all

(Qul,Qdl

)∈ K andZ(v) ∈ Z;

B5: gin(Qul (v) ,Qdl (v) ;Z(v)

)= gin

(Qul (v) ,Qdl (v)

),

for all Z(v) ∈ Z;B6: gin(•; •) is Lipschitz continuous onK ×Z.Besides the first-order behavior and smoothness conditionsB2, B3, B6, the key assumptions B1, B4 and B5 enforcethat the approximantgin

(Qul,Qdl;Z(v)

)be a locally tight

(condition B5) convex (condition B1) upper bound (conditionB4) on the original constraintgin

(Qul,Qdl

). The desired

surrogate approximationgin(Qul,Qdl;Z(v)

)is then obtained

from (19) as

gin(Qul,Qdl;Z(v)

),

BIin

Wulrulin (Qul;Qul (v))

+BO

in

W dlrdlin (Qdl;Qdl (v)),

(23)

The proposed approximant (23) is an upper bound (condi-tion B4) and is a convex function ofQul andQdl (conditionB1). This is because both terms in (23) are the reciprocal of aconcave and positive function, and the sum of the two convexfunctions is convex. Furthermore, it is easy to show that theoriginal constraint (22) and its convex approximant (23) havethe same first-order behavior (condition B2) by evaluating thegradient of both functions at the current iterate. The remainingproperties B3-B6 can also be checked in a similar manner to[29, Sec. IV-B]. For instance, since the approximant functions(16) and (19) are twice continuously differentiable over thecompact convex setK ⊇ Z, the Lipschitz continuity of theirconjugate gradients follows readily.

C. SCA Algorithm

The SCA algorithm operates by iteratively solving thefollowing problem around the current iterateZ (v) ∈ Z,

Z (Z (v)) , argminQul,Qdl,f ,cul,cdl

E (Z;Z (v))

s.t.

C.1 gin(Qul,Qdl;Z(v)

)+

BIin

culinCuln

+BO

in

cdlinCdln

+Vin

finFc≤ Tin , ∀in ∈ I,

C.2−C.4 of (P.1). (P.3)

The unique solution of the strongly convex optimization prob-lem (P.3) is denotedZ

(Z (v)

),(Qul, Qdl, f , cul, cdl

). Note

that E (Z;Z (v)) is a function ofZ, given the current iterateZ (v).

The SCA scheme is summarized in Algo-rithm 1. In step 2, the termination criterion is∣∣E(Qul (v + 1),Qdl (v + 1)

)− E

(Qul (v),Qdl (v)

)∣∣ ≤ δ,

whereδ > 0 is the desired accuracy. The step size rule weused isγ (v) = γ (v − 1) (1− αγ (v − 1)) with γ (0) ∈ (0, 1]andα ∈ (0, 1/γ (0)) (other step size rules can also be adopted,

6

see [37], [38]). Algorithm 1 converges to a stationary pointof the problem (P.1) in the sense of [29, Theorem 2].

Remark 6. The SCA scheme can also be easily adaptedto tackle the weighted sum problem (P.2) discussed in Re-mark 2. This alternative formulation has the key advantagethat the identification of an initial feasible pointZ (0) ,(Qul (0) ,Qdl (0) , f (0) , cul (0) , cdl (0) , T (0)

)for the SCA

is a trivial task. This is because one can always select a valueof T (0) that satisfies the constraint C.1 in (P.2) for given valuesof the other variables.

Remark 7. Each instance of the optimization prob-lem (P.3) tackled by SCA can be solved with complexityO(max{n3, n2m}) using interior-points methods [45, Ch. 1],where n is the size of the optimization variables, namely(

2N2Tin

+ 3)

KNc, and m is the number of constraints,namely m = 7KNc + 3Nc + 1. Note that the complexityscales polynomially with the numberK of users and withall system parameters. While here we have focused on acentralized implementation, the complexity could be furtherreduced by developing distributed solutions as described in[37, Sec. IV]. Finally, we would also like to mention that,in practice, rather than solving problem (P.3) using SCA ateach time slot for the given realization of the channels, itwould be possible to solve the problem for a number ofrepresentative channels so as to build a sufficiently dense look-up table. More interestingly, as recently explored in [46] forpower allocation in an interference channel, one could use suchrepresentative channels to train a neural network, or anotherlearning machine, to “interpolate” the solution to other channelrealizations. We leave these aspects for future investigations.

Algorithm 1: SCA Solution for (P.3)

Input: Parameters from Table I;Z (0) ∈ Z; v = 0;{γ (v)}v ∈ (0, 1]; γqul , γqdl , γf , γcul , γcdl > 0.

1: If Z (v) satisfies the termination criterion, stop.2: ComputeZ (Z (v)) from (P.3).

3: SetZ (v + 1) = Z (v) + γ (v)(

Z (Z (v))− Z (v))

.4: v ← v + 1, and return to step1.

Output: Z =(Qul,Qdl, f , cul, cdl

).

IV. USER SELECTION

In the previous section, we assumed that a given numberof active users, namelyK per cell, was scheduled for trans-mission. The premise of this section, is that, if too manyMUs simultaneously choose to offload their computationaltasks, the resulting interference on the wireless channel mayrequire an energy consumption at the mobile for wirelesstransmission that exceeds the energy that would be neededfor local computing at some MUs. Moreover, the backhauland computing delays may make the latency constraint inproblem (P.1) impossible to satisfy, and thus problem (P.1)infeasible. For theses reasons, in this section, we consider userselection with the aim of maximizing the number of MUs

that perform offloading while guaranteeing that the selectedMUs can satisfy their latency constraints and, at the sametime, consume less energy than with local computing. In therest of this section, the local computation energy model isfirst elaborated on and then the user scheduling problem isformulated and tackled by integrating SCA with smoothlp-norm approximation methods.

A. Local Computation Energy

When the application is executed at the mobile device, theenergy consumptionEM

in is determined by the number ofCPU cycles required by the application,Vin , and by the clockfrequency of the device chip, which is denoted here asFin .In particular, for CMOS circuits, the energy per operation isproportional to the square of the supply voltage to the chip,and when the supply voltage is low, the clock frequency ofthe chip is a linear function of the voltage supply [47]. As aresult, the mobile energy for computing can be expressed as

EMin = κVinF

2in , (24)

whereκ is the effective switched capacitance, which dependson the MU processor architecture, and the clock frequency isselected so as to meet the latency constraint, yielding

Fin =Vin

Tin

. (25)

By plugging this into (24), we obtain the total consumptionenergy for mobile execution as

EMin = κ

V 3in

T 2in

. (26)

For each MUin, offloading is advantageous when the energyfor local mobile computing is higher than the energy requiredfor offloading, i.e.,

EMin ≥ Eul

in

(Qul

), (27)

whereEulin

(Qul

)is given in (6). Note that here we consider

for brevity only the uplink energy contribution in (14).

B. User Scheduling

To proceed, we introduce the auxiliary slack variables(xin , yin) for each MU in measuring the violation of thelatency constraint C.1 in (P.1) and the energy constraint(27), respectively. Our system design becomes maximizing thenumber of MUs that can perform offloading, while satisfyingthe latency constraints and guaranteeing energy savings withrespect to local computation when offloading is performed.This amounts to maximizing the number of MUsin with noviolation of the mentioned constraints, i.e., withxin = 0 andyin = 0. This is done here by minimizing theℓ0-norm

‖x‖0 + ‖y‖0 =∑

in∈X

I(xin > 0) +∑

in∈X

I(yin > 0), (28)

whereI is an indicator function that returns1 for xin , yin > 0and 0 otherwise, also definex , [xin ]in∈X and y ,

[yin ]in∈X . The sum of thel0-norms of the slack vectorsxandy in (28) counts the number of constraints C.1 and C.2

7

that are violated by the users. Therefore, minimizing this sumenforces the selection of users that satisfy the largest numberof constraints. Accordingly, the problem, defined for generalityover any arbitrary subset of usersX ⊆ I, reads

minQul,Qdl,f ,cul,cdl,x,y

‖x‖0 + ‖y‖0

s.t. C.1BI

in

Wulrulin

(Qul)+

BIin

culin

Culn

+Vin

finFc+

BOin

cdlin

Cdln

+BO

in

Wdlrdlin

(Qdl)− Tin ≤ xin , ∀in ∈ X ,

C.2 Eulin

(Qul

)− EM

in ≤ yin , ∀in ∈ X ,C.3 xin ≥ 0, yin ≥ 0, ∀in ∈ X ,C.2−C.4 of (P.1), ∀in ∈ X .

(P.4)DefineZ ,

(Qul,Qdl, f , cul, cdl,x,y

).

The scheduling algorithm is described in Algorithm 2.The algorithm maximizes the number of scheduled MUs byprogressively removing the MUsin that have the largestentries in the vectorw , x/

∑

in∈X xin + y/∑

in∈X yin ,which measures the relative amount, with respect to all usersin X , by which a user violates the two constraints. Notethat the normalizations by

∑

in∈X xin and∑

in∈X yin ensurethat the two constraints are considered on an equal footing.The outlined iterative procedure is repeated until a subsetofusersX ∗ is found for which problem (P.4) returns vectorwthat is close to zero, signifying feasibility of offloading underconstraints C.1-C.4 of (P.1) as well as (27). We observe that,unlike the admission control scheme in [39, Algorithm 2], theproposed algorithm requires two set of auxiliary variablesinorder to account for the constraints C.1 and C.2.

Algorithm 2 returns the solutions∗, from which the setof MUs scheduled for offloading is obtained asX ∗ ,

{π1, . . . , πs∗}. In more details, upon obtaining the solutionof (P.4), the set of MUs is ordered according to the respectivevalues of the entries of vectorw. Then, the subsetX ∗

of scheduled users is computed by bisection. In particular,bisection searches for the minimum number of users inthe interval [0,KNc], where KNc is the total number ofusers, that should be removed, so that the rest of the userscan be scheduled for offloading while satisfying the desiredconstraints. Specifically, as described in Algorithm 2, setX ∗ is defined asX ∗ , {π1, . . . , πs∗}, where the value ofs∗ ∈ [0,KNc] is found by successively searching within theinterval [Klow,Kup], which is initialized as[0,KNc]. At eachstep, first, the search interval is halved using the midpoints. Then, a feasibility test is performed to check whether theconstraints C.1-C.4 of (P.1) can be met if the subset of MUsX [s] , {π1, . . . , πs} is scheduled and the limits[Klow,Kup]is updated accordingly. The feasibility test is carried outbysolving an instance of problem (P.4) over the subset of MUsX [s]. The feasibility status is determined by the value of theresulting auxiliary variables, i.e., the problem is considered tobe feasible if the slack variables are smaller than a positivevalueη close to zero.

We now discuss how to solve problem (P.4). Problem(P.4) is non-convex due to the non-convexity of the objectiveand of the constraints C.1 and C.2. Based on the limit‖x‖0 = limp→0 ‖x‖

pp = limp→0

∑

in∈X |xin |p, the objective

function of (P.4) can be approximated by a higher-order norm

to make the problem mathematically tractable. In particular,in a manner similar to [39], we adopt the smooth objective

f(x,y) ,∑

in∈X

(x2in + ǫ2)p/2 +

∑

in∈X

(y2in + ǫ2)p/2, (29)

whereǫ > 0 is a small fixed regularization parameter. Substi-tuting (29) as the objective in (P.4), we now apply the SCAapproach to obtain a local optimal solution of the resultingproblem.

To this end, a convex upper bound satisfying conditionsA1-A3 described in Sec. III-A for the smoothedℓp-normobjective function (29) can be obtained from the result in [39,Proposition 1] and is given by

f (Z;Z (v)) ,∑

in∈X

ω′inx

2in+

∑

in∈X

ω′′iny

2in+f (Z;Z (v)) , (30)

where ω′in = p

2

[

(xin(v))2 + ǫ2

] p

2−1

and

ω′′in

= p2

[

(yin(v))2 + ǫ2

] p

2−1

; Z (v) ,(Qul (v) ,Qdl (v), f (v) , cul (v), cdl (v),x (v),y (v)

); and

The functionf (Z;Z (v)) ,γqul

2

∑

in∈I

∥∥Qul −Qul (v)

∥∥2+

γqdl

2

∑

in∈I

∥∥Qdl −Qdl (v)

∥∥2

+γf

2 ‖f − f (v)‖2 +γcul

2

∥∥cul − cul (v)

∥∥2

+γcdl

2

∥∥cdl − cdl (v)

∥∥2

+γx

2 ‖x− x (v)‖2 +γy

2 ‖y − y (v)‖2 is added to realizethe strong convexity of (30) withγx, γy > 0.

The convexification of constraint C.1 is done as in (23).Lastly, to obtain an inner convexification for the energyconstraint C.2 that satisfies the conditions B1-B6 in Sec. III-B,we utilize the concave-convex structure of the rate functionrulin(Qul

)as in (18), to rewrite constraint C.2 as

tr(Qul

in

)−

EMin

BIin

rulin+ (

Qul)−

EMin

BIin

rulin− (

Qul−in

)≤ yin , (31)

whererulin+ (

Qul)

andrulin− (

Qul−in

)are given in (18). Using

the linearization (19), we then obtain the desired upper boundon C.2 as

tr(Qul

in

)−

EMin

BIin

rulin(Qul;Qul (v)

)≤ yin . (32)

Given a feasible pointZ (v), we define the following stronglyconvex problem

Z (Z (v)) , argminQul,Qdl,f ,cul,cdl,x,y

f (Z;Z (v))

s.t.

C.1 gin(Qul,Qdl;Z (v)

)+

BIin

culinCuln

+BO

in

cdlinCdln

+Vin

finFc− Tin ≤ xin , ∀in ∈ X ,

C.2 tr(Qul

in

)−

EMin

BIin

rulin(Qul;Qul (v)

)≤ yin , ∀in ∈ X ,

C.3 xin ≥ 0, yin ≥ 0, ∀in ∈ X ,

C.2−C.4 of (P.1), ∀in ∈ X , (P.5)

8

whereZ (Z) , (Qul, Qdl, f , cul, cdl, x, y) denote the uniquesolution of (P.5). The SCA scheme for solving (P.5) is de-scribed in Algorithm 3. As a technical note, we observe thathere, since the approximant (30) of the objective function (29)is an upper bound on (29), convergence of Algorithm 3 isguaranteed also by settingγ(v) = 1 [37, Sec. III-A].

Algorithm 2: User SchedulingInput: Parameters used by Algorithm 3.

1: Solve problem (P.4) using Algorithm 3 forX = I toobtainw ,

x∑

in∈I xin

+y

∑

in∈I yinand sort the MUs

in ascending order of the value ofw aswπ1≤ wπ2

≤ . . . ≤ wπKNc, whereπ is a permutation of

I.2: Set:Klow = 0, Kup = KNc.3: Repeat4: Sets← ⌊Klow+Kup

2 ⌋.5: Perform the feasibility test by solving (P.4) using

Algorithm 3 for X = X [s] , {π1, . . . , πs}: if it isfeasible, setKlow = s; otherwise, setKup = s.

6: Until Kup−Klow = 1.7: Sets∗ = Klow andX ∗ = {π1, . . . , πs∗}.

Output: Number of scheduled MUss∗.

Algorithm 3: SCA Solution for (P.5)

Input: Parameters from Table I;p = 0.5; v = 0; Z (0) ∈ Z;{γ (v)}v ∈ (0, 1]; γqul , γqdl , γf , γcul , γcdl , γx, γy, ǫ > 0;ω′

in(0) = ω′′in(0) = 1.

1: If∣∣∣f (Z;Z (v + 1))− f (Z;Z (v))

∣∣∣ ≤ δ, stop.

2: ComputeZ (Z (v)) from (P.5).

3: SetZ (v + 1) = Z (v) + γ (v)(

Z (Z (v))− Z (v))

.4: Update

ω′in(v + 1) =

p

2

[

(xin(v + 1))2 + ǫ2] p

2−1

,

ω′′in(v + 1) =

p

2

[

(yin(v + 1))2 + ǫ2] p

2−1

.

5: v ← v + 1, and return to step1.Output:

(Qul,Qdl, f , cul, cdl,x,y

).

V. HYBRID EDGE AND CLOUD COMPUTING

In the previous sections, we considered a scenario in whichthe MUs can offload applications to a cloud server. In thissection, we extend the analysis to a more general set-up inwhich the ceNBs are directly connected to local computingservers, also known as cloudlets [5], which may run some ofthe MUs’ applications. Specifically, each ceNB can either ex-ecute the computation task on the behalf of the MU or offloadit to the cloud. LetF ceNB

n be the computation capability inCPU cycles per second of ceNBn, and letf ceNB

in ≥ 0 bethe fraction of the ceNB’s computing power assigned to user

in, so that∑

i

f ceNBin

≤ 1. If implemented at the ceNB, the

execution time of the task of MUin is then given as

∆exe|ceNBin

=Vin

f ceNBin

F ceNBn

. (33)

In the same way, if the cloud processes the task of userin,the execution time∆exe|cloud

inis given by the right-hand side

of (12). The overall latency∆in experienced by each MUincan be expressed as

∆in = ∆ulin + (1− uin)∆

exe|ceNBin

+ uin∆exe|cloudin

+ uin∆bhin +∆dl

in ,(34)

where ∆ulin,∆bh

inand ∆dl

inhave the same definition as in

(5), (13) and (10), respectively;uin is a binary variable thatindicates whether if the task of MUin is processed on theceNB (uin = 0) or on the cloud (uin = 1).

To proceed, we relax the binary variableuin to be definedin the interval [0, 1]. This relaxation not only provides alower bound on the minimum energy expenditure that canbe obtained with a hard choice between cloudlet and cloudoffloading, perhaps more importantly, it also captures a systemin which the input data to the application of the MUin canbe split into two parts, of sizes1− uin and uin , that canbe processed separately at the ceNB and cloud, respectively.In the following, we will adopt this latter justification of themodel.

As in Sec. II and Sec. III, we aim at minimizing the totalenergy consumed by the MUs to execute their tasks remotelyunder latency and power constraints. The problem is given by

minQul,Qdl,u,fceNB,f ,cul,cdl

Eul(Qul

)=∑

in∈I

Eulin

(Qul

in,Qul

−in

)

=∑

in∈I

BIin

tr(Qulin)

rulin

(Qul)

s.t. C.1BI

in

Wulrulin

(Qul)+

(1−uin )Vin

fceNBin

F ceNBn

+uinBI

in

culin

Culn

+uinVin

finFc+

uinBOin

cdlin

Cdln

+BO

in

Wdlrdlin(Qdl)

≤ Tin , ∀in ∈ I,

C.2 0 ≤ uin ≤ 1, ∀in ∈ I,C.3 f ceNB

in, fin ≥ 0,

∑

in∈I

fin ≤ 1, ∀in ∈ I,

K∑

i=1

f ceNBin

≤ 1, ∀n = 1, . . . , Nc,

C.4 culin , cdlin ≥ 0, ∀in ∈ I,

K∑

i=1

culin ≤ 1,K∑

i=1

cdlin ≤ 1, ∀n = 1, . . . , Nc,

C.5 Qulin∈ Qul

in, ∀in ∈ I,

(Qdl

in

)K

i=1∈ Qdl

n , ∀n = 1, . . . , Nc.(P.6)

Note that, unlike (P.1), here we have the additional optimiza-tion variablesu , (uin)in∈I and fceNB ,

(f ceNBin

)

in∈I.

It can be observed that (P.6) is non-convex due to the non-convexity of the objective function and constraint C.1. Wetackle the problem by means of the SCA method convexifyingthe objective function as done in Sec. III-A. Furthermore, weneed to calculate a convex upper bound for the C.1 constraint

9

in (P.6) that satisfies conditions B1-B6 in Sec. III-B. Let uswrite the left-hand side of C.1 by

gin(Qul,Qdl,u, fceNB, f , cul, cdl

),

BIin

Wulrulin (Qul)

+BO

in

W dlrdlin (Qdl)+

(1− uin)Vin

f ceNBin

F ceNBn

+uinVin

finFc

+uinB

Iin

culinCuln

+uinB

Oin

cdlinCdln

.

(35)

To build the desired bound ongin , we observe that the first twoterms can be handled as in Sec. III-B, while for the last fourterms, we observe that they are all ratios of linear functions,we now observe that the relationship

x

y=

1

2

(

x+1

y

)2

−1

2

(

x2 +1

y2

)

, (36)

holds, where the right-hand side is the difference of twoconvex functions ifx ≥ 0 and y > 0. Therefore, a locallytight convex upper bound on the left-hand side of (36) can beobtained by linearizing the concave part as

x

y≤1

2

(

x+1

y

)2

−1

2

(

(xv)2+

1

(yv)2

)

− xv (x− xv)+

1

(yv)3(y − yv) ,

(37)

where superscriptv identifies the point(xv, yv) at which theupper bound is tight. Using (37) to the last four terms in (35),we define the desired approximants by substituting forx andy in (37) the numerator and denominator, respectively, of eachof the last four terms in (35). This yields the SCA procedurein Algorithm 1 with the difference that we defineZ (v) ,(Qul (v) ,Qdl (v) ,u (v) , fceNB (v) , f (v) , cul (v) , cdl (v)

)

andZ ,(Qul,Qdl,u, fceNB, f , cul, cdl

)

Z (Z (v)) , argminQul,Qdl,,u,fceNB,f ,cul,cdl

Eul (Z;Z (v))

s.t.

C.1 gin (Z;Z (v)) ≤ Tin , ∀in ∈ I,

C.2−C.5 of (P.6), (P.7)

where Z (Z (v)) , (Qul, Qdl, u, fceNB , f , cul, cdl) denotesthe unique solution of the strongly convex optimizationproblem (P.7); the objective functionEul (Z;Z (v)) ,∑

in∈I

(

Eulin

(Qul;Qul(v)

)+ Ein (Zin ;Zin (v))

)

where Eulin

(Qul;Qul(v)

)is given as in (16) and we

define E (Zin ;Zin (v)) ,γqul

2

∥∥Qul

in −Qulin (v)

∥∥2

+γqdl

2

∥∥Qdl

in−Qdl

in(v)∥∥2

+ γu

2 ‖uin − uin (v)‖2 +γfceNB

2 ‖f ceNBin

- f ceNBin

(v) ‖2 +γf

2 ‖fin − fin (v)‖2 +γcul

2

∥∥culin − culin (v)

∥∥2

+γcdl

2

∥∥cdlin − cdlin (v)

∥∥2

withγu, γfceNB > 0; and gin (Z;Z (v)) is the locally convexupper bound defined above.

VI. ENHANCED DOWNLINK VIA NETWORK MIMO

So far, we assumed that each ceNBn serves the users{in : i = 1, . . . ,K} in its cell, hence interfering with otherceNBs. In this section, instead, we consider enhanced down-link transmission based on network MIMO. Specifically, weassume that each MU can receive the result of the cloudexecution in the downlink not only from ceNB in the samecell, but also from other ceNBs that cooperatively transmitto the MU. Cooperation among ceNBs is enabled by thetransmission on the backhaul links of the outcome of the cloudexecution for a given MU to multiple ceNBs. We specificallyconsider here the extreme case of full ceNB cooperation, sothat all the ceNBs transmit cooperatively to each MU. Theextension to a more general model with clustered cooperationis straightforward and will not be pursued further.

To express the achievable downlink rate with networkMIMO, it is convenient to define user-centric transmit co-variance matricesQdl

j ∈ CNT×NT for every MU j ∈ I,

where we have definedNT ,∑Nc

n=1 NTn, such that then-th

NTn× NTn

block on the main diagonal, denoted by[Qdlj ]n,

represents the contribution of ceNBn to the transmission toMU j. Note that the out-of-diagonal blocks inQdl

j describethe correlation among signals sent by different ceNBs, whichdesignates cooperative transmission at the ceNBs. The set ofdownlink covariance matrices,Qdl =

{Qdl

j

}

j∈Iis

Qdl ,{Qdl

j ∈ CNT×NT , j ∈ I :

∑

j∈I

tr([Qdl

j ]n)≤ P dl

n , ∀n},

(38)

so that the per-ceNB power constraints are satisfied. Also, it isconvenient to define the channel matrix from all the ceNBs toMU j asGj ∈ C

NRj×NT , whereGj , [Gj1,Gj2, . . . ,GjNc

]and we have setGj,n = Gj . With these definitions, theachievable downlink rate is given by

rdlin(Qdl) = log2 det

(

I+ GHinR

dlin(Q

dl−in)

−1GinQdlin

)

, (39)

with

Rdlin(Q

dl−in) , σ2

wI+∑

im∈I\{in}

GinQdlimGH

in , (40)

andQdl−in

, (Qdljm

)jm 6=in .

In order to enable network MIMO, we need to ensure thatdownlink transmissions from all ceNBs take place at the sametime. To this end, we impose that uplink transmission, com-puting and backhaul transmissions for all MUs are constrainedto be completed by a given timeT1. At time T1, then, thedownlink transmission is initiated and takes a given timeT2.Accordingly, we formulate the optimization problem following

10

the weighted sum approach discussed in Remark 2 as

minQul,Qdl,f ,cul,cdl,T1,T2

∑

in∈I

BIin

tr(Qulin)

rulin

(Qul)+ λ(T1 + T2)

s.t. C.1BI

in

Wulrulin

(Qul)+

BIin

culin

Culn

+Vin

finFc

+BO

in

Cdlmcdl

in,m

≤ T1, ∀in ∈ I,

C.2BO

in

Wdlrdlin

(Qdl)≤ T2, ∀in ∈ I,

C.3 fin ≥ 0,∑

in∈I

fin ≤ 1, ∀in ∈ I,

C.4 culin , cdlin,m

≥ 0,K∑

i=1

culin ≤ 1,∑

j∈I

cdlj,m ≤ 1, ∀in ∈ I,

C.5 Qulin ∈ Q

ulin ,Q

dl ∈ Qdl, ∀in ∈ I,(P.8)

wherecdl ,(cdlin,m

)

in∈I,m, with cdlin,m being the fraction of

the backhaul capacity to ceNBm allocated to transmit theoutput bits intended for MUin; and λ > 0 is a parameterdefining the relative weight of energy and latency.

Problem (P.8) is non-convex due to the non-convexity ofthe objective function and constraints C.1 and C.2. We tacklethis problem using SCA in a way similar to that used for(P.1), i.e., we obtain a convex approximants for the objectivefunctions using (16) and for the latency constraints using (23).The problem is then solved using the SCA procedure describedin Algorithm 1.

VII. N UMERICAL RESULTS

In this section, we present numerical results validatingthe model and algorithms presented in the previous sections.Throughout, we consider a network composed of three cellswith five users in each cell, i.e.,Nc = 3 and K = 5. Alltransceivers are equipped withNTin

= NRn= 2 antennas.

The channel matrices are generated with independent andidentically distributed complex Gaussian entries having zeromean and variance equal to the path loss, which is assumed tobe identical for uplink and downlink. The path loss is givenby 170 dBm between an MU and the ceNB in the samesmall cell and 180 dBm between an MU and the ceNB in theother small cell. The values 170 dBm and 180 dBm can bejustified by considering the Walfish-Ikegami model [48] withdistances of500 meters between MU and ceNB in the samecell and700 meters for MU and ceNB in different cells. Theother parameters of the Walfish-Ikegami model are selectedso as to simulate a small-cell environment in a typical urbansetting as listed in Table II. Targeting a small-cell scenarios,we will explore values of the backhaul capacities as low asfew Mbits/s [35], [36]. Finally, note that settingN0 = −170dBm/Hz [48], yields an average signal-to-noise ratio (SNR)on the direct link of40 dBm per receive antenna and30 dBmon the interference link for a signal transmitted at0.01 Jouleper symbol from a single antenna in the uplink and downlink.Furthermore, unless stated otherwise, we select each numberof input bitsBI

inand output bitsBO

inuniformly at random in

the interval[0.1 − 1] Mbits and we set the number of CPUcycles asVin = 2640×BI

in CPU cycles. These choices reflectcomputational-intensive applications, as demonstrated by the

1 3 5 7 9 11 13 15 17

Iteration index (v)

3500

4000

4500

5000

5500

6000

6500

Mob

ilesum-energy

[J]

Equal Bh and Cloud Alloc.

Equal Cloud Alloc.

Equal Bh Alloc.

Joint Optim.

Fig. 3: Minimum average mobile energy consumption versus itera-tion index (Tin = 0.12 seconds,Nc = 3, K = 5, W ul

= W dl=

100 MHz, BIin andBO

in ∼ U{0.1, 1} Mbits, Vin = 2640×BIin CPU

cycles,Culn = Cdl

n = 100 Mbits/s, andFc = 1011 CPU cycles/s).

measurments in [15], [49]. The cloud capacity isFc = 1011

CPU cycles/s, which is, e.g., obtained by a four-core serverwith Intel Xeon processor with 3.3 GHz that is commerciallyused by Amazon elastic compute cloud (EC2) [16], [50].Other system parameters are set toWul = W dl = 10 MHz,Cul

n = Cdln = 100 Mbits/s, din = 10−5 J/symbol [51], and

Tin = 0.1 seconds. Throughout, averages are intended withrespect to the channel realizations.

parameter value descriptionf 1800 frequency (MHz)θ 45◦ road orientation angle (degree)hTx 3 height of transmitter (meters)hRx 1 height of receiver (meters)hRoof 5.5 mean value of buildings height (meters)sRoof 8 mean value of buildings separation (meters)str wid 5 mean value of street width (meters)ka 54 path loss penalty when ceNB below rooftopkd 5 correlation adjustment constant

TABLE II: Parameters in the Walfish-Ikegami path loss model[48].

We start by illustrating the typical convergence propertiesof the SCA algorithm by plotting in Fig. 3 the averageminimal mobile energy consumptionE

(Qul,Qdl

)versus the

iteration indexv. Besides the joint optimization across uplink,downlink, backhaul and computing resources, studied in Sec.II, we consider the following more conventional solutions:(i) Equal backhaul and cloud allocation:the computingand backhaul resources are equally allocated to all MUs,that is, fin = 1/(NcK) and culin = cdlin = 1/K for allin ∈ I, while the covariance transmit and receive matricesat the physical layer are optimized using SCA [29];(ii)Equal cloud allocation:computing resources at the cloudare equally allocated among the MUs, while the rest of theparameters are jointly optimized using SCA;(iii) Equal back-haul allocation: the backhaul resources are equally allocatedamong the MUs, while the rest of the parameters are jointly

11

0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18

Latency [s]

3500

4000

4500

5000

5500

6000

Mob

ilesum-energy

[J]


Equal Cloud Alloc.

Equal Bh Alloc.

Joint Optim.

Fig. 4: Minimum average mobile energy consumption versus thelatency constraintTin , assumed to be the same for all MUs (Nc =

3, K = 5, W ul= W dl

= 100 MHz, BIin and BO

in ∼ U{0.1, 1}

Mbits, Vin = 2640 × BIin CPU cycles,Cul

n = Cdln = 100 Mbits/s,

andFc = 1011 CPU cycles/s).

optimized using SCA. We observe the fast convergence ofSCA, which requires around 17 iterations to obtain resultsclose to convergence (at iteration 17 the termination criterion∣∣E(Qul (v + 1),Qdl (v + 1)

)− E

(Qul (v),Qdl (v)

)∣∣ ≤ δ is

satisfied with δ = 10−3 for α = 10−5). Furthermore, itcan be seen that the proposed joint optimization methodshows a considerable gain compared to the equal allocationof computational and backhaul resources.

The gains that can be accrued by means of joint optimizationare further investigated in Fig. 4 (obtained in the same settingof Fig. 3), which depicts the minimum average mobile energyas a function of the latency constraintsTin , which are assumedto be the same for all MUs. The energy saving due to jointoptimization can be seen to be particularly pronounced in theregime in which the latency constraint is more stringent. Forinstance, atTin = 0.12 seconds, which is the smallest latencyfor which all schemes are feasible, joint optimization saves66% in terms of sum-energy as compared to equal allocation ofbackhaul and cloud resources, 48% as compared to equal cloudallocation, and 32% as compared to equal backhaul allocation.

To account for the case in which the network may oper-ate under asymmetrical bandwidth allocation for uplink anddownlink, we compare the mobile sum-energy consumptionof the schemes considered above in the same set-up of Fig.3 and Fig. 4, with the caveat that we setW dl = 10 MHzand we vary the uplink/downlink bandwidth ratioWul/W dl,in Fig. 5. We observe that joint optimization is especiallyadvantageous in term of mobile energy consumption whenthe uplink bandwidth is more constrained than the downlinkbandwidth. This is because, in this regime, it is particularlyuseful to allocate more computing and backhaul resources tousers with worse channel condition in order to meet the latencyrequirements with minimal energy expenditure. For example,whenWul is five times smaller than the downlink bandwidth,

the joint optimization scheme is50% more energy efficientthan the fixed allocation of backhaul and cloud resources.

0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

W ul/W dl

600

800

1000

1200

1400

1600

1800

2000

2200

2400

Mob

ilesum-energy

[J]


Equal Bh Alloc.

Equal Cloud Alloc.

Joint Optim.

Fig. 5: Minimum average mobile sum-energy consumption versusbandwidth allocation ratioW ul/W dl (W dl

= 10 MHz, Nc =

2,K = 2, Tin = 0.09 s, BIin and BO

in ∼ U{0.1, 1} Mbits,Vin = 2640 × BI

in CPU cycles,Culn = Cdl

n = 100 Mbits/s, andFc = 10

11 CPU cycles/s).

While the previous results assume that all MUs performcomputation offloading, we next turn to the study of userselection that is presented in Sec. IV. To account for mobileenergy consumption, we set the effective switched capacitanceof the mobile asκ = 10−26 so that the mobile energyconsumption is consistent with the measurements made in [15]for a Nokia N900 mobile device operating at frequency 500MHz. To this end, we plot the average number of selectedMUs as a function of the latency constraintTin , assumed tobe the same for all MUs, for different values of backhaul inFig. 6. The figure shows the results obtained by means of theefficient algorithm proposed in Sec. IV withη = 10−5, as wellas by anexhaustive searchstrategy in which, for every subsetof users, a joint optimization problem is solved as per Sec. IIusing SCA (Algorithm 1) and then the subset of users whichyields the minimal energy consumption is selected. The figuredemonstrates how a less restrictive latency constraint and/or alarger backhaul increases the number of users that should beallowed to offload. For instance, forTin ≤ 0.07 seconds, thereis no MU on average offloads its applications since offloadingis more energy demanding. Instead, forTin > 0.2 seconds, aslong as the backhaul capacity is larger than500 Mbits/s, allMUs tend to offload. It is also observed that the proposedscheme selects a number of users close to that chosen byexhaustive search.

The corresponding overall energy consumption for the ex-ample of Fig. 6 with backhaul capacityCul

n = Cdln = 100

Mbits/s is shown in Fig. 7. Note that the overall energyconsumption is the sum of the mobile computing energy forthe MUs that perform local computing, namely (26), and ofthe energy used for transmission, namely (6), for the MUsthat offload. For reference, we also show the energy requiredwhen all MUs perform their tasks locally. The proposed user

12

0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2

Latency [s]

0

0.5

1

1.5

2

2.5

3

3.5

4Averagenumber

ofselected

MUs

Culn = 500 Mbits/s (ES)

Culn = 500 Mbits/s (US)







US: Proposed User SelectionES: Exhaustive Search User Selection

Fig. 6: Average number of selected MUs versus latency constraintTin for both the proposed efficient scheme in Sec. IV and exhaustivesearch (Nc = 2,K = 2, BI

in = BOin = 1 Mbits, Vin = 10

9 CPUcycles,Fc = 10

11 CPU cycles/s, andCdln = Cul

n ).

0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17

Latency [s]

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

Mob

ilesum-energy

[J]

Local Computing

Proposed User Selection

Exhaustive Search User Selection

Fig. 7: Minimum average mobile energy consumption versus thelatency constraintTin (Nc = 2, K = 2, BI

in = BOin = 1 Mbits,

Vin = 109 CPU cycles,Cul

n = Cdln = 100 Mbits/s, andFc = 10

11

CPU cycles/s).

selection scheme is seen to achieve a near-optimal energyperformance compared to the exhaustive search method. Asan example, when the latency constraintTin = 0.17 seconds,around 83% energy saving can be obtained with the proposeduser offloading selection scheme as compared tolocal com-puting.

In the previous results, central cloud processing was as-sumed. We now investigate the optimal allocation of comput-ing tasks between the edge and the cloud in the set-up studiedin Sec. V. To this end, in Fig. 8, we plot the fractions(uin)in∈I

of the application of each MUin to be performed at the cloudas a function of the backhaul capacity when the computingcapability of the cloudlet is given byF ceNB

n = 0.1Fc.To highlight the impact of asymmetries in the offloading

2 2.5 3 3.5 4 4.5 5 5.5 6

Backhaul capacity [Mbits/s]

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Optimal

fraction

ofprocessingat

theclou

d(u

i n)

u11

u21

u12

u22

Fig. 8: Offloading cloud vs. cloudlet indicator versus the backhaulcapacity (Nc = 2,K = 2, Tin = 0.9 seconds,BI

11= BO

11= 1

Mbits, BI21

= BO21

= 0.7 Mbits, BI12

= BO12

= 0.5 Mbits, BI22

=

BO22

= 0.1 Mbits; V11= 10

9 CPU cycles,V21= 0.7V11

, V12=

0.5V11and V22

= 0.1V11, Cul

n = Cdln = 100 Mbits/s, F ceNB

n =

1010 CPU cycles/s, andFc = 10

11 CPU cycles/s).

requirements, we assume here that in the first cell the twousers haveBI

11 = BO11 = 1 Mbits, V11 = 109 CPU cycles, and

BI21 = BO

21 = 0.7 Mbits, V21 = 0.7V11 CPU cycles, while theusers in the second cell have smaller offloading requirementsfor data transfer and computing, normallyBI

12 = BO12 = 0.5

Mbits, V12 = 0.5V11 CPU cycles, andBI22 = BO

22 = 0.1Mbits, V22 = 0.1V11 CPU cycles. It is observed that, when thebackhaul capacity is limited, tasks tends to be performed atthe edge, especially for users with more stringent data transferand computing requirements, for example, atCul

n = Cdln = 3

Mbits/s, the two users in the first cell execute50% and80%percent of their offloaded tasks at the cloud, while the usersin the second cell have a complete tasks execution at thecloud due to the moderate number of input/output bits. Asthe backhaul capacity is increased, more tasks are executedatthe cloud, benefiting from the improved connection betweenceNBs and the cloud.

To obtain further insights, Fig. 9. shows the mobile sum-energy consumption versus the backhaul capacity for cloudmobile computing, whereby all users offload to the cloud,and edge mobile computing, whereby all users offload to thelocal cloudlet, forNc = 2,K = 2, andF ceNB

n = 1010 CPUcycles/s. We also include the performance of the solutionintroduced in Sec. V, which allows for hybrid edge-cloudoffloading. For cloud offloading, we consider different valuesfor computational capacityFc of the cloud. It is seen that whenthe backhaul capacity constraint is stringent, edge offloadingis more energy efficient as compared to cloud offloading, e.g.,by about 44% for cloud computational capacityFc = 1010

CPU cycles/s and backhaul capacityCuln = Cdl

n = 2.4Mbits/s. Furthermore, larger values of cloud capacityFc cancompensate for limited backhaul resources, making cloudcomputing preferable to edge computing. It is also observedthat the hybrid edge-cloud offloading scheme can significantly

13

2.4 2.5 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4

Backhaul capacity [Mbits/s] ×107

1000

1500

2000

2500

3000

3500

4000

4500

5000Mob

ilesum-energy

[J]

Edge offloading

Cloud offloading (Fc = 1010)

Hybrid offloading (Fc = 1010)



Fig. 9: Minimum average mobile sum-energy consumption forcloud and edge offloading, along with the proposed hybrid edge-cloud offloading scheme, versus backhaul capacityCul

n = Cdln

(Nc = 2, K = 2, Tin = 0.1 s, W ul= W dl

= 10 MHz, BIin

andBOin ∼ U{0.1, 1} Mbits, Vin = 2640 × BI

in CPU cycles, andF ceNBn = 10

10 CPU cycles/s).

outperform edge computing when backhaul capacity is limited.We finally turn to the performance of the downlink network

MIMO scheme presented in Sec. VI. In Fig. 10, we plotthe average mobile sum-energy with and without cooperativetransmission in the downlink. The performance of cooperativescheme is obtained from the solution of (P.8), while theperformance of non-cooperative scheme is obtained from thesolution of (P.1) under the indicated values of backhaul capac-ity. The users offloading requirements are identical to thatusedin Fig. 6. The key observation here is that the performanceof the downlink cooperative transmission is strongly limitedby the backhaul capacity. For instance, we can see that, withbackhaulCul

n = Cdln = 10 Mbits/s, the cooperative scheme

is about 85% more energy-consuming as compared to thenon-cooperative scheme at latency0.5 seconds. The energyperformance gap between the two schemes is diminished asthe backhaul increases as observed withCul

n = Cdln = 100

Mbits/s. With this value of backhaul capacity, the cooperativescheme is less energy efficient by45% aroundTin = 0.3seconds. However, with backhaulCul

n = Cdln = 10 Gbits/s,

as for a standard fiber optic channel, the cooperative schemestarts to have noticeable energy gains. At latency0.08 seconds,for example, the cooperative scheme attains57% energysaving as compared to the non-cooperative scheme.

VIII. CONCLUDING REMARKS

In this paper, we investigated the design of cloud mobilecomputing systems over MIMO cellular networks as a jointoptimization problem over radio, computational resourcesandbackhaul resources in both uplink and downlink directions.Aniterative algorithm based on successive convex approximationswas presented for solving the resulting non-convex problemunder latency and power constraints. Numerical results showthat the proposed joint optimization yields significant energy

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Latency [s]

0

1000

2000

3000

4000

5000

6000

Mob

ilesum-energy

[J]

cooperative (Culn = 10 Gbits/s)

non-cooperative (Culn = 10 Gbits/s)

non-cooperative (Culn = 100 Mbits/s)

cooperative (Culn = 100 Mbits/s)

non-cooperative (Culn = 10 Mbits/s)

cooperative (Culn = 10 Mbits/s)

Fig. 10: Minimum average mobile energy consumption versus thelatency constraintTin , assumed to be the same for all MUs (Nc =

2,K = 2, BI11

= BI12

= 1 Mbits, BI21

= BI22

= 0.1 Mbits,BO

11= BO

12= 1 Mbits, BO

21= BO

22= 0.1 Mbits, V11

= V12= 10

9

CPU cycles,V21= V22

= 108 CPU cycles,Fc = 10

11 CPU cycles/s,andCdl

n = Culn ).

saving compared to the conventional solutions based on sepa-rate allocations of computing and/or backhaul resources. Thissaving is more pronounced in low latency regimes, where,in our results, it leads to energy saving as high as66%.The mentioned baseline problem was further generalized inseveral directions. First, we tackled user selection with theaim of ensuring an energy savings for all users that performoffloading. Then, a hybrid architecture that leverages bothedgeand cloud computing was studied by addressing the optimalallocation between cloud and edge. It was seen that the optimalallocation is significantly affected by the backhaul capacity.Finally, joint downlink transmission based on network MIMOwas considered, demonstrating the critical importance of thebackhaul for the viability of this technique. Among the prob-lems left open by this study, we mention here the considerationof higher-layer aspects such as queuing in the definition of thelatency.

REFERENCES

[1] K. Kumar, J. Liu, Y.-H. Lu, and B. Bhargava, “A survey of computationoffloading for mobile systems,”Mobile Netw. App., vol. 18, no. 1, pp.129–140, Feb. 2013.

[2] H. T. Dinh, C. Lee, D. Niyato, and P. Wang, “A survey of mobilecloud computing: architecture, applications, and approaches,” Wirelesscommun. mobile comput., vol. 13, no. 18, pp. 1587–1611, Dec. 2013.

[3] Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in mobilecloud computing: taxonomy and open challenges,”IEEE Commun.Surveys Tuts., vol. 16, no. 1, pp. 369–392, Feb. 2014.

[4] L. Lei, Z. Zhong, K. Zheng, J. Chen, and H. Meng, “Challenges onwireless heterogeneous networks for mobile cloud computing,” IEEEWireless Commun., vol. 20, no. 3, pp. 34–44, Jun. 2013.

[5] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case forVM-based cloudlets in mobile computing,”IEEE Pervasive Comput.,vol. 8, no. 4, pp. 14–23, Oct. 2009.

[6] M. Maier, M. Chowdhury, B. P. Rimal, and D. P. Van, “The tactileinternet: Vision, recent progress, and open challenges,”IEEE Commun.Magazine, vol. 54, no. 5, pp. 138–145, May 2016.

14

[7] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, “Mobileedge computing–A key technology towards 5G,” ETSI White Paper, no.11, tech. rep., Sep. 2015.

[8] B. Han, P. Hui, V. S. A. Kumar, M. V. Marathe, J. Shao, and A.Srini-vasan, “Mobile data offloading through opportunistic communicationsand social participation,”IEEE Trans. Mobile Comput., vol. 11, no. 5,pp. 821–834, May 2012.

[9] E. Cuervo, A. Balasubramanian, D.-k. Cho, A. Wolman, S. Saroiu,R. Chandra, and P. Bahl, “MAUI: making smartphones last longer withcode offload,” inProc. of the 8th international conference on Mobilesystems, applications, and services, San Francisco, CA, USA, Jun. 2010,pp. 49–62.

[10] S. Kosta, A. Aucinas, P. Hui, R. Mortier, and X. Zhang, “Thinkair:Dynamic resource allocation and parallel execution in the cloud formobile code offloading,” inProc. IEEE INFOCOM, orlando, FL, USA,Mar. 2012, pp. 945–953.

[11] M. Whaiduzzaman, A. Naveed, and A. Gani, “MobiCoRE: Mobiledevice based cloudlet resource enhancement for optimal task response,”IEEE Trans. Services Comput., vol. 9, no. 99, pp. 1–15, May 2016.

[12] N. Fernando, S. W. Loke, and W. Rahayu, “Mobile cloud computing:A survey,” Future Generation Computer Systems, vol. 29, no. 1, pp.84–106, Jan. 2013.

[13] J. E. Costa and J. J. Rodrigues,Mobile cloud computing: Technologies,services, and applications. IGI Global, 2013.

[14] D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithmfor mobile computing,”IEEE Trans. Wireless Commun., vol. 11, no. 6,pp. 1991–1995, Jun. 2012.

[15] A. P. Miettinen and J. K. Nurminen, “Energy efficiency ofmobile clientsin cloud computing,” inProc. the 2nd USENIX Conference on Hot Topicsin Cloud Computing, Boston, MA, USA, pp. 4-11, Jun. 2010.

[16] K. Kumar and Y. H. Lu, “Cloud computing for mobile users:Canoffloading computation save energy?”Computer, vol. 43, no. 4, pp. 51–56, Apr. 2010.

[17] Y.-H. Kao, B. Krishnamachari, M.-R. Ra, and F. Bai, “Hermes: Latencyoptimal task assignment for resource-constrained mobile computing,”in Proc. IEEE INFOCOM, Kowloon, Hong Kong, pp. 1894-1902, Apr.2015.

[18] M. Kamoun, W. Labidi, and M. Sarkiss, “Joint resource allocation andoffloading strategies in cloud enabled cellular networks,”in Proc. IEEEICC, London, UK, pp. 5529-5534, Jun. 2015.

[19] S. Khalili and O. Simeone, “Inter-layer per-mobile optimization of cloudmobile computing: A message-passing approach,”Submitted to IEEETrans. Commun., Aug. 2015.

[20] X. Xiang, C. Lin, and X. Chen, “Energy-efficient link selection andtransmission scheduling in mobile cloud computing,”IEEE WirelessCommun. Letters, vol. 3, no. 2, pp. 153–156, Apr. 2014.

[21] Y. Yu, J. Zhang, and K. Ben Letaief, “Joint subcarrier and CPU timeallocation for mobile edge computing,”arXiv preprint:1608.06128, aug.2016.

[22] O. Munoz, A. Pascual-Iserte, and J. Vidal, “Optimization of radio andcomputational resources for energy efficiency in latency-constrainedapplication offloading,”IEEE Trans. Veh. Technol., vol. 64, no. 10, pp.4738–4755, Oct. 2015.

[23] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. O. Wu, “Energy-optimal mobile cloud computing under stochastic wireless channel,”IEEE Trans. Wireless Commun., vol. 12, no. 9, pp. 4569–4581, Sep.2013.

[24] Y. Wang, M. Sheng, X. Wang, L. Wang, W. Han, Y. Zhang, and Y. Shi,“Energy-optimal partial computation offloading using dynamic voltagescaling,” inProc. IEEE ICCW, London, UK, pp. 2695–2700, Sep. 2015.

[25] C. You, K. Huang, and H. Chae, “Energy efficient mobile cloud com-puting powered by wireless energy transfer,”arXiv preprint:1507.04094,Jul. 2015.

[26] X. Chen, “Decentralized computation offloading game for mobile cloudcomputing,” IEEE Trans. Parallel Distributed Sys., vol. 26, no. 4, pp.974–983, Apr. 2015.

[27] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computationoffloading for mobile-edge cloud computing,”IEEE/ACM Trans. Netw.,vol. 9, no. 99, pp. 1–14, Sep. 2015.

[28] R. Kaewpuang, D. Niyato, P. Wang, and E. Hossain, “A framework forcooperative resource management in mobile cloud computing,” IEEE J.Sel. Areas Commun., vol. 31, no. 12, pp. 2685–2700, December 2013.

[29] S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radioand computational resources for multicell mobile-edge computing,”IEEE Trans. Signal Inf. Process. Over Netw, vol. 1, no. 2, pp. 89–103,Jun. 2015.

[30] C. You, K. Huang, H. Chae, and B. H. Kim, “Energy-efficient resourceallocation for mobile-edge computation offloading,”IEEE Trans. Wire-less Commun., vol. PP, no. 99, pp. 1–1, Dec. 2016.

[31] M. Molina, O. Muoz, A. Pascual-Iserte, and J. Vidal, “Joint schedulingof communication and computation resources in multiuser wirelessapplication offloading,” inProc. IEEE PIMRC, Washington, DC, USA,pp. 1093-1098, Sep. 2014.

[32] K. Zhang, Y. Mao, S. Leng, Q. Zhao, L. Li, X. Peng, L. Pan,S. Maharjan, and Y. Zhang, “Energy-efficient offloading for mobile edgecomputing in 5G heterogeneous networks,”IEEE Access, vol. 4, pp.5896–5907, Aug. 2016.

[33] J. Guo, Z. Song, and Y. Cui, “Energy-efficient resource allocation formulti-user mobile edge computing,”arXiv preprint:1611.01786, Nov.2016.

[34] T. Zhao, S. Zhou, X. Guo, Y. Zhao, and Z. Niu, “A cooperativescheduling scheme of local cloud and internet cloud for delay-awaremobile cloud computing,” inProc. IEEE Globecom Workshops, Dec2015, pp. 1–6.

[35] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K.Soong, and J. C. Zhang, “What will 5G be?”IEEE J. Sel. AreasCommun, vol. 32, no. 6, pp. 1065–1082, Jun. 2014.

[36] J. Oueis, E. Calvanese-Strinati, A. D. Domenico, and S.Barbarossa,“On the impact of backhaul network on distributed cloud computing,”in Proc. IEEE WCNCW, Istanbul, Turkey, pp. 12-17, Apr. 2014.

[37] G. Scutari, F. Facchinei, and L. Lampariello, “Parallel and distributedmethods for constrained nonconvex optimization-part I: Theory,” IEEETrans. Signal Processing, to appear, 2016. On-line: arXiv:1410.4754.

[38] G. Scutari, F. Facchinei, L. Lampariello, S. Sardellitti, and P. Song,“Parallel and distributed methods for nonconvex optimization–part II:Applications in communications and machine learning,”IEEE Trans.Signal Processing, to appear, 2016. On-line: arXiv:1601.04059.

[39] Y. Shi, J. Cheng, J. Zhang, B. Bai, W. Chen, and K. B. Letaief,“Smoothed Lp-minimization for green cloud-RAN with user admissioncontrol,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, Apr. 2016.

[40] D. Gesbert, S. Hanly, H. Huang, S. S. Shitz, O. Simeone, and W. Yu,“Multi-cell MIMO cooperative networks: A new look at interference,”IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1380–1408, Dec. 2010.

[41] D. Tse and P. Viswanath,Fundamentals of wireless communication.Cambridge university press, 2005.

[42] A. Lapidoth, “Nearest neighbor decoding for additive non-Gaussiannoise channels,”IEEE Trans. Information Theory, vol. 42, no. 5, pp.1520–1529, Sep. 1996.

[43] Y. Nesterov and A. Nemirovskii,Interior-point polynomial algorithmsin convex programming. Siam, 1994, vol. 13.

[44] T. Lipp and S. Boyd, “Variations and extension of the convex–concaveprocedure,”Optimization and Engineering, pp. 1–25, Nov. 2015.

[45] S. Boyd and L. Vandenberghe,Convex optimization. Cambridgeuniversity press, 2004.

[46] H. Sun, X. Chen, Q. Shi, M. Hong, and X. Fu, “Learning to optimize:Training deep neural networks for wireless resource management,”Submitted, Sep. 2016.

[47] T. D. Burd and R. W. Brodersen, “Processor design for portablesystems,”J. VLSI Signal Process. Syst., vol. 13, no. 2-3, pp. 203–221,Aug. 1996.

[48] A. R. Mishra,Advanced Cellular Network Planning and Optimisation:2G/2.5G/3G and Evolution to 4G, 1st ed. Wiley Publishing, 2006.

[49] E. Lagerspetz and S. Tarkoma, “Mobile search and the cloud: Thebenefits of offloading,” inProc. IEEE PERCOM Workshops, St. Louis,MO, USA, pp. 117-122, Mar. 2011.

[50] A. LLC, “Amazon elastic compute cloud (EC2) website,” 2016.[Online]. Available: https://aws.amazon.com/ec2/

[51] N. Balasubramanian, A. Balasubramanian, and A. Venkataramani, “En-ergy consumption in mobile phones: A measurement study and impli-cations for network applications,” inProc. ACM SIGCOMM InternetMeasurement Conference, Chicago, IL, Nov. 2009, pp. 280–293.

15

https://aws.amazon.com/ec2/

Documents

Joint Uplink/Downlink Optimization for Backhaul … › pdf › 1607.06521.pdfMobile cloud computing enables the ofﬂoading of com-putationally heavy applications from Mobile Users