Fast algorithms for convex cost flow problems on circles, lines, and trees


James B. Orlin
Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Balachandran Vaidyanathan
Operations Research, FedEx Express, Memphis, Tennessee 38125

We develop efficient algorithms to solve convex cost flow problems where the underlying graph is a circle, a line, or a tree. Each node i has an associated supply/demand b(i). The cost of sending flow on arc (i, j) is a piecewise linear convex function f_ij defined over R. Let n be the number of nodes and m = O(n) be the total number of pieces of all the convex functions. A flow x is feasible if the imbalances on all nodes are nonnegative. Excess e_i(x) stored on node i has an associated linear cost c_i × e_i(x). We show that the problem on a circle can be transformed into an equivalent problem on a line in O(n) time. Thereafter, we develop an algorithm that solves the problem on a line in O(sort(n) + nα(n)) time, where sort(n) is the time to sort n real numbers and α(n) is the inverse Ackermann function. We also prove that when the nodes lie on a tree, the problem can be solved in O(n log n) time using the dynamic tree data structure. We describe applications in areas such as distributed computing, lot-sizing, computational biology, computational music, and transportation. © 2013 Wiley Periodicals, Inc. NETWORKS, Vol. 62(4), 288–296 2013

Keywords: network flow algorithms; convex cost flows; minimum cost flows; computational complexity; applications

1. INTRODUCTION

Convex cost flow problems belong to a class of nonlinear optimization problems that arise frequently in practice and can be solved efficiently. If the convex costs are separable and piecewise linear, the problems can be transformed into minimum cost flow problems with parallel arcs (Ahuja et al. [4]). This allows them to be solved using efficient network flow algorithms. In this article, we develop specialized algorithms to solve convex cost flow problems on circles, lines, and trees. Our algorithms are faster than general purpose algorithms by a factor of n log n or more. Our research is motivated by the applications these problems have in areas such as distributed computing, lot-sizing, computational biology, computational music, and transportation.

Received January, 2012; accepted July, 2013. Correspondence to: B. Vaidyanathan; e-mail: [email protected]. DOI 10.1002/net.21517. Published online 5 October 2013 in Wiley Online Library (wileyonlinelibrary.com). © 2013 Wiley Periodicals, Inc.

The current fastest strongly polynomial algorithm for the minimum cost flow problem on a general graph is due to Orlin [18]. Orlin’s algorithm has a run-time of O(m log n (m + n log n)) (where m is the number of arcs) and thus solves the convex cost flow problem (with piecewise linear costs and O(n) total pieces) on circles and trees in O(n^2 log^2 n) time. Several researchers have also studied special cases of the minimum cost flow problem. We summarize these results. Karp and Li [15] show that the matching problem (or assignment problem) on a Euclidean circle or line can be solved in O(n) time. Aggarwal et al. [1] prove that the transportation problem on a Euclidean circle or line can be solved in O(n log n) time. Colannino et al. [11] prove that the many-to-one matching problem on a Euclidean line can be solved in O(n) time. Ahuja and Hochbaum [2] developed an O(n log n) algorithm for the linear cost dynamic lot-sizing problem. Their problem is essentially a minimum cost flow problem on a line with a linear nonzero cost of using supplies (or production); their algorithm also applies when the arc costs are convex. Finally, Vaidyanathan and Ahuja [24] developed an O(n log n) algorithm for the minimum cost flow problem on a circle with nonnegative costs. There is also a rich literature on convex cost network flows. We refer the reader to Ahuja et al. [4] for details.

The problem on a circle is a generalization of that on a line, and therefore most of our exposition in this article will be with respect to circles. We first study the convex cost flow problem on a circle (CFPC). Let the nodes on the circle be numbered from 1 to n in clockwise order, and let p(i) (or q(i)) be the node immediately preceding (or succeeding) node i on the circle. Each node i has a supply/demand b(i), and Σ_{i∈[1,n]} b(i) = e∗ ≥ 0. The network contains arcs (p(i), i) for integer i ∈ [1, n]. The flows on these arcs can be negative. The cost of sending x_i units of flow on arc (p(i), i) is f_i(x_i), where f_i is convex, piecewise linear, and defined over R. The total number of pieces of all convex functions is denoted by m, and we assume that m = O(n). Also, the breakpoints of the cost function f_i are specified in sorted order. The imbalance at node i ∈ [1, n] with respect to flow x is e_i(x) = b(i) − x_{i,q(i)} + x_{p(i),i}. A flow x is feasible if e_i(x) ≥ 0 for i ∈ [1, n]. The excess stored on node i ∈ [1, n] has a linear cost c_i × e_i(x). In the case that e∗ > 0, this is modeled by adding a node t with b(t) = −e∗ and an arc (i, t) for i ∈ [1, n] with a linear cost c_i. After this addition, the total supply is equal to the total demand, and the convex cost flow problem can be transformed into a minimum cost flow problem in its standard form (Ahuja et al. [4]). Our algorithm is applied on this minimum cost flow problem.
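The imbalance definition can be checked on a small instance. The following is a minimal sketch, assuming 0-based node indices and a list `x` in which `x[i]` holds the flow on arc (p(i), i); the function names are illustrative and do not come from the article.

```python
def imbalances(b, x):
    """Return e_i(x) = b(i) - x_{i,q(i)} + x_{p(i),i} for every node on the circle.

    b[i] is the supply/demand of node i (0-based).
    x[i] is the flow on the arc entering node i, that is, arc (p(i), i).
    The arc leaving node i is the one entering its successor q(i) = (i + 1) mod n.
    """
    n = len(b)
    return [b[i] - x[(i + 1) % n] + x[i] for i in range(n)]


def is_feasible(b, x):
    """A flow is feasible when no node has a negative imbalance."""
    return all(e >= 0 for e in imbalances(b, x))
```

For example, with `b = [3, -1, -1]` and `x = [0, 2, 1]` the imbalances are `[1, 0, 0]`, so the single unit of excess (e∗ = 1) is stored at the first node.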

The other problem that we study is the convex cost flow problem on a tree (CFPT). This problem is defined similarly to the CFPC, the only difference being that the nodes i ∈ [1, n] are on a tree instead of a circle, and all arcs of the tree are directed away from a root node. We assume that node 1 is the root node, the nodes in the tree are numbered from 1 to n, and the tree contains arcs (p(i), i) for i ∈ [2, n], where p(i) is the predecessor of node i. The flow on tree arcs can be negative.

A pseudoflow refers to any assignment of the variables x, including assignments that violate the flow balance constraints. If the flow balance constraints are satisfied by pseudoflow x, it is referred to as feasible or simply as a flow. A pseudoflow (or flow) is said to be optimal if it satisfies the optimality conditions for the minimum cost flow problem (see Section 2, Theorem 1). Our algorithm for the problem on the circle has three phases and works as follows. In Phase 1, we construct an optimal pseudoflow x such that e_1(x) = e∗, e_i(x) = 0 for i ∈ [2, n], and e_t(x) = −e∗. In Phase 2, we start with the solution of Phase 1 and transform it into an optimal pseudoflow x′ such that:

1. A particular node j∗ ∈ [2, n] is split into two nodes j_1 and j_2

2. The modified residual network G∗(x′) is acyclic

3. e_{j_1}(x′) = e∗, e_t(x′) = −e∗, and all other nodes have zero imbalance

At the end of Phase 2, we have effectively transformed the CFPC into a convex cost flow problem on a line in O(n) time. In Phase 3, we transform the pseudoflow of Phase 2 into an optimal flow using an implicit implementation of the successive shortest path algorithm (SSPA). The run-time of our algorithm is O(sort(n) + nα(n)). We note that there are many cases in which sort(n) is faster than O(n log n). For example, radix sort takes O(n) time when all values are integers less than some polynomial in n. For the CFPT, our approach is similar; the run-time of our algorithm is O(n^2) using simple data structures and can be improved to O(n log n) with the use of dynamic trees. Below, we list the contributions of our article:

• We develop specialized algorithms for convex cost flow problems [with piecewise linear costs and O(n) total pieces] on circles and trees that are faster by a factor of n log n or more compared to general purpose algorithms.

• The algorithm for the CFPC followed by a linear time flow decomposition algorithm (Vaidyanathan [23]) solves transportation problems and many-to-one matching problems on lines and circles in O(sort(n) + nα(n)) time. This improves on the O(n log n) run-time for the transportation problem on a line or a circle (Aggarwal et al. [1]) that uses a priority queue (such as a red-black tree). It also almost matches the O(n) run-time for the many-to-one matching problem on a line (Colannino et al. [11]) when the numbers sorted are integers less than some polynomial in n (the many-to-one matching problem on a circle has not been studied previously).

• The CFPC is a generalization of the minimum cost flow problem on a circle studied by Vaidyanathan and Ahuja [24] (Section 5.2). The O(sort(n) + nα(n)) run-time of the algorithm presented in this article dominates the run-time of the Vaidyanathan and Ahuja algorithm, and our data structures are simpler. Their algorithm runs in O(n log n) time and relies on dynamic trees.

• The algorithm for the CFPC solves convex cost flow problems on a line in O(sort(n) + nα(n)) time. This dominates the run-time of the previous fastest algorithm, developed by Ahuja and Hochbaum [2], and our data structures are simpler. Their algorithm runs in O(n log n) time and relies on dynamic trees.

• The algorithm for the CFPT solves the CFPT in O(n log n) time. This algorithm, followed by a linear time flow decomposition algorithm (Vaidyanathan [23]), also solves transportation problems, matching problems, and many-to-one matching problems on trees in O(n log n) time. None of these problems on trees has been studied previously.

• We describe new applications of the problems studied.

2. PRELIMINARIES

Let f_i(x_i) be a piecewise linear convex function that represents the cost of sending x_i units of flow on arc (p(i), i) on the circle or the tree. We assume without loss of generality that f_i(x_i) is defined over all of R. If it were defined only over [L, U], we could add a piece from U to ∞ with slope M for some large M, and a piece from −∞ to L with slope −M. Each arc (i, t) has a linear cost c_i per unit flow and has an infinite capacity.

Function f_i(x_i) is described by specifying: (1) the breakpoints of the function, which are the x-coordinates at which the slope of the function changes; and (2) the slopes of the linear segments between adjacent breakpoints. Let w(i) represent the set of all breakpoints corresponding to f_i(x_i) in increasing order; let w(i, k) represent the k-th breakpoint in this set; and let m_i = |w(i)|. Because f_i(x_i) is defined over all of R, w(i, 1) = −∞ and w(i, m_i) = ∞ for i ∈ [1, n]. Let c(i) represent the set of slopes between adjacent breakpoints of f_i(x_i), and let c(i, k) be the slope between breakpoints w(i, k) and w(i, k + 1) for k ∈ [1, m_i − 1]. Because f_i(x_i) is convex, the slopes in c(i) are strictly increasing. As is well known (see, for example, Ahuja et al. [4]), f_i(x_i) can be linearized by replacing the associated arc (p(i), i) with |c(i)| parallel arcs, and the convex cost flow problem can be transformed into an equivalent minimum cost flow problem.
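The breakpoint and slope representation maps directly onto the parallel arcs mentioned above. The following is a minimal sketch, assuming each cost function is stored as two Python lists (0-based, with infinite first and last breakpoints); the function name and the triple layout are illustrative choices, not notation from the article.

```python
import math


def parallel_arcs(w, c):
    """Split one convex piecewise linear arc into parallel linear arcs.

    w: breakpoints in increasing order, with w[0] = -inf and w[-1] = +inf.
    c: slopes, where c[k] is the slope between w[k] and w[k + 1].
    Returns one (lower, upper, cost) triple per linear piece.
    """
    if len(c) != len(w) - 1:
        raise ValueError("need exactly one slope per pair of adjacent breakpoints")
    if any(c[k] >= c[k + 1] for k in range(len(c) - 1)):
        raise ValueError("slopes must be strictly increasing for a convex function")
    return [(w[k], w[k + 1], c[k]) for k in range(len(c))]
```

Because the slopes increase, a minimum cost solution uses the cheaper pieces before the more expensive ones, which is why the parallel arcs reproduce the convex cost.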


Our algorithm uses the concept of residual networks. The residual network G(x) corresponding to a flow x is defined as follows. Suppose that j = p(i). We replace each arc (j, i) by two arcs (i, j) and (j, i) in the following manner.

• If w(i, k) < x_ji < w(i, k + 1) for k ∈ [1, m_i − 1], then (1) the residual capacity of arc (j, i) is r_ji = w(i, k + 1) − x_ji, and its residual cost is c_ji = c(i, k); and (2) the residual capacity of arc (i, j) is r_ij = x_ji − w(i, k), and its residual cost is c_ij = −c(i, k).

• If x_ji = w(i, k) for k ∈ [2, m_i − 1], then (1) the residual capacity of arc (j, i) is r_ji = w(i, k + 1) − w(i, k), and its residual cost is c_ji = c(i, k); and (2) the residual capacity of arc (i, j) is r_ij = w(i, k) − w(i, k − 1), and its residual cost is c_ij = −c(i, k − 1).

Similarly, we replace each arc (i, t) by two arcs (i, t) and (t, i). The arc (i, t) has residual capacity r_it = ∞ and residual cost c_i, and the arc (t, i) has residual capacity r_ti = x_it and residual cost −c_i. The residual capacity measures the amount by which the flow on an arc can increase (or decrease) before its flow is equal to the next (or previous) breakpoint. The residual cost of an arc measures the cost per unit of increase (or decrease) of its flow until its flow is equal to the next (or previous) breakpoint. The residual capacity of a path is the minimum residual capacity among all arcs in it, and the residual cost of a path is the sum of the residual costs of all arcs in it.
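The two cases for a circle or tree arc can be written as one lookup. The following is a minimal sketch, assuming the same list representation as the text (0-based, infinite first and last breakpoints) and a finite flow value; the function name `residual` is hypothetical.

```python
import bisect
import math


def residual(w, c, x):
    """Residual data of arc (j, i) carrying flow x.

    w: sorted breakpoints with w[0] = -inf and w[-1] = +inf.
    c: c[k] is the slope between w[k] and w[k + 1].
    Returns ((r_ji, c_ji), (r_ij, c_ij)): forward and backward capacity and cost.
    """
    k = bisect.bisect_right(w, x) - 1      # largest k with w[k] <= x
    forward = (w[k + 1] - x, c[k])         # room up to the next breakpoint
    if x == w[k]:
        # Flow sits exactly on a breakpoint: decreasing it uses the previous piece.
        backward = (w[k] - w[k - 1], -c[k - 1])
    else:
        backward = (x - w[k], -c[k])
    return forward, backward
```

With `w = [-inf, 0, 2, inf]` and `c = [-1, 1, 3]`, a flow of 1 lies strictly inside the middle piece, while a flow of 2 lies on a breakpoint and so its backward arc takes the cost of the piece to its left.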

The SSPA is an algorithm for the linear minimum cost flow problem. The algorithm is applied on the residual network. In each iteration, the SSPA maintains a pseudoflow. For a pseudoflow x, the imbalance is e_i(x) = b(i) − (total flow out of node i) + (total flow into node i). If e_i(x) > 0, node i is an excess node, and if e_i(x) < 0, node i is a deficit node. The residual network for a pseudoflow is defined in the same manner as the residual network for a flow, except that one also keeps track of excesses and deficits. The SSPA starts with an optimal pseudoflow and repeatedly sends flow on shortest paths in the residual network until the pseudoflow becomes a flow. The correctness of the algorithm is based on the following theorem, which we state without proof (see Ahuja et al. [4] for details).

Theorem 1.

(a) A flow x is optimal if and only if G(x) contains no negative cost directed cycles (negative cycle optimality conditions).

(b) Suppose a pseudoflow x or a flow x satisfies the optimality condition and we obtain x′ from x by sending flow on a shortest path from node p to node q in G(x). Then, x′ also satisfies the optimality conditions.

The SSPA starts with a pseudoflow that satisfies the optimality condition and maintains the optimality condition throughout. Therefore, when the pseudoflow becomes a flow, the algorithm terminates with the optimal solution.

3. THE CONVEX COST FLOW PROBLEM ON A CIRCLE

In this section, we describe the algorithm for the CFPC.

3.1. Phase 1

In Phase 1 of the algorithm, we create a flow that satisfies the supply/demand of all nodes i ∈ [2, n] and the optimality conditions. We ignore node t and its incident arcs. We also replace the supply/demand at node 1 by b(1) − e∗. After these changes, the total supply/demand of all nodes on the circle is zero (see Figure 1). We next determine the unique feasible flow x with x_n1 = 0. Note that x may not satisfy the optimality conditions for this modified problem. To transform x into a flow that satisfies the optimality conditions, we send δ units of flow (δ may be negative) clockwise around the cycle (starting and ending at node 1) so as to minimize the total cost. Let g(δ) be the cost of sending δ units of flow clockwise around the cycle starting from flow x. Then, due to the convexity of f_i(x_i), g(δ) is a piecewise linear convex function with m = O(n) breakpoints. Let δ′ be the value that minimizes g(δ). Then, x will be transformed into an optimal flow for Phase 1 if one adds δ′ to the flow of each arc on the circle.

Phase 1 is described below. We subsequently show how to implement Phase 1 in O(n) time.

Algorithm 1 Phase 1 for the CFPC.
Algorithm Phase 1
begin
    delete arcs (i, t) and node t from the network;
    update b(1) := b(1) − e∗;
    determine the unique feasible flow x in which x_n1 = 0;
    perform binary search to find δ′ that minimizes g(δ);
    update x_{p(i),i} := x_{p(i),i} + δ′ for i ∈ [1, n];
    reinsert arcs (i, t) and node t into the network;
end

At the end of Phase 1, we have an optimal flow x with respect to the network with node t deleted, and with b(1) replaced by b(1) − e∗. With respect to the original supplies and demands, e_1(x) = e∗, e_i(x) = 0 for i ∈ [2, n], and e_t(x) = −e∗. By construction, x satisfies the negative cycle optimality conditions in Theorem 1.
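The effect of Phase 1 can be illustrated with a simplified sketch. It is not the linear time procedure of the article: instead of median-based binary search it sorts the breakpoints of the shifted cost function, which costs O(m log m). It also assumes 0-based nodes, supplies that already sum to zero, and cost functions given by their finite breakpoints only (so each function has one more slope than breakpoints). The name `phase1` and this input layout are assumptions made for the example.

```python
def phase1(b, w, c):
    """Return flows x[i] on arcs (p(i), i) that minimise total cost on the circle.

    b: supplies/demands summing to zero (b(1) already replaced by b(1) - e*).
    w[i]: finite breakpoints of f_i in increasing order.
    c[i]: slopes of f_i, with len(c[i]) == len(w[i]) + 1.
    """
    n = len(b)
    # Unique balanced flow whose first arc carries zero flow (the x_{n1} = 0 flow).
    x = [0.0] * n
    for i in range(1, n):
        x[i] = x[i - 1] + b[i - 1]
    # A breakpoint w of f_i becomes a breakpoint w - x[i] of the shifted cost,
    # where the slope of the total cost rises by the jump between adjacent slopes.
    events = sorted((wk - x[i], c[i][k + 1] - c[i][k])
                    for i in range(n) for k, wk in enumerate(w[i]))
    slope = sum(ci[0] for ci in c)          # slope of the total cost far to the left
    if slope >= 0:
        raise ValueError("sketch assumes the total cost decreases far to the left")
    for position, jump in events:
        slope += jump
        if slope >= 0:                      # right derivative turns nonnegative here
            return [xi + position for xi in x]
    raise ValueError("total cost is unbounded below")
```

As a small check, take three nodes with `b = [2, -1, -1]` and f_i(x) = |x| on every arc. The zero-anchored flow is `[0, 2, 1]`, the total cost of shifting by δ is |δ| + |2 + δ| + |1 + δ|, and the minimum is at δ′ = −1, giving the flow `[-1, 1, 0]`.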

We now consider the run-time of Phase 1. Creating the initial flow takes O(n) time because it is a spanning tree flow (Ahuja et al. [4]). We will next show how to determine δ′ in O(n) time. Because g(δ) is a convex function, the left and right derivatives of g(δ) at δ′ are nonpositive and nonnegative, respectively. Let L be a lower bound on δ′, and let U be an upper bound. Initially, L = −∞ and U = ∞.

A breakpoint of g() belongs to the set of breakpoints of f_i() for some i ∈ [1, n]. Let W be the unsorted list of breakpoints of g(δ) that are in the interval (L, U). Let w(i) for i ∈ [1, n] be a sorted list of breakpoints of f_i(δ) that are in the interval (L, U). (Note that no additional time is spent in sorting w(i) because the breakpoints are already provided in sorted order in the input data.) Let S be the set of arcs with no breakpoint in (L, U), and let D be the sum of the per-unit costs (slopes) of the arcs in S, each of which is linear over (L, U).

Let $g^{+}(\delta)$ and $g^{-}(\delta)$ denote the right and left derivatives of $g()$, and let $f_i^{+}(\delta)$ and $f_i^{-}(\delta)$ denote the right and left derivatives of $f_i()$. Then,

$$g^{+}(\delta) = D + \sum_{i \notin S} f_i^{+}(\delta), \qquad g^{-}(\delta) = D + \sum_{i \notin S} f_i^{-}(\delta).$$

The sums $\sum_{i \notin S} f_i^{+}(\delta)$ and $\sum_{i \notin S} f_i^{-}(\delta)$ can be determined in O(|W|) steps by traversing the lists w(i) for i ∉ S at most once. Hence, $g^{+}(\delta)$ and $g^{-}(\delta)$ are computed in O(|W|) steps.

FIG. 1. (a) Original network; (b) Phase 1 network.

We now show how to compute δ′ in O(n) steps. Initially L = −∞ and U = ∞. We initialize W, S, w(i), and D in O(n) time. We then perform binary search on W to determine δ′, as described next.

Suppose that |W| = K at the beginning of some iteration of the binary search. Let δ be the median value of the points in W, which can be computed in O(K) steps using the algorithm of Blum et al. [9]. If g^−(δ) ≤ 0 ≤ g^+(δ), then δ′ = δ. If g^−(δ) ≥ 0, then δ′ ≤ δ, and we can replace U by δ. If g^+(δ) ≤ 0, then δ′ ≥ δ, and we can replace L by δ. Subsequently, we update W, S, w(i), and D in O(K) steps. Thus, in one iteration of the binary search, we reduce the number of breakpoints by a factor of 2 in O(K) steps.

Suppose that h(K) is the running time of the binary search starting from an iteration with K breakpoints. Then, h(K) = h(K/2) + O(K). This recursive relationship implies that h(K) = O(K). Therefore, the time to solve the Phase 1 problem is O(n).
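The step from the recurrence to the linear bound is a geometric sum. As a short worked derivation, assuming the O(K) term is at most aK for some constant a > 0 (the constant a is introduced here only for illustration):

```latex
h(K) \;\le\; h\!\left(\tfrac{K}{2}\right) + aK
     \;\le\; aK\left(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right)
     \;\le\; 2aK \;=\; O(K).
```

Since the search starts with K = m = O(n) breakpoints, the whole binary search takes O(n) time.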

The following property is satisfied at the end of Phase 1.

Lemma 1. Let x′ be the flow obtained at the end of Phase 1. Then, x′ is an optimal pseudoflow for the original problem.

Proof. There is no negative cost directed cycle in G(x′). ■

3.2. Phase 2

In Phase 2, we start with the optimal pseudoflow x′ at the end of Phase 1. Let j∗ be the penultimate node in the shortest path between node 1 and node t in G(x′). Then, starting from x′, we prove that the SSPA can be implemented such that the flow on arc (j∗, q(j∗)) does not increase and the flow on arc (p(j∗), j∗) does not decrease. Based on this property, we push e∗ units of flow from node 1 to node j∗ and split node j∗ into two nodes, making the residual network acyclic.

In the SSPA, we assume that if there are multiple shortest paths from node 1 to node t, then the one with the fewest arcs is selected.

Lemma 2. Let P be the minimum cost path from node 1 to node t in G(x′). Let j∗ be the node that precedes node t on P. Suppose that the SSPA is run starting from flow x′. At each iteration of the SSPA, the minimum cost path from 1 to t does not include either of the arcs (j∗, q(j∗)) or (j∗, p(j∗)) (see Figure 2).

Proof. It is true at the first augmentation. We now consider a later augmentation. Assume that one of the arcs (j∗, q(j∗)) or (j∗, p(j∗)) is on the minimum cost path from 1 to t, and we will derive a contradiction. Let P′ denote the minimum cost path from node 1 to node t, and let i be the node that precedes t in P′. By assumption, j∗ ∈ P′, and j∗ ≠ i. Write P′ as [P_1, P_2], where P_1 is the subpath from 1 to j∗, and P_2 is the subpath from j∗ to t. Because P′ is the shortest path, (cost of P_2) < c_{j∗}. Otherwise, [P_1, (j∗, t)] is a better path. But then the cycle [(t, j∗), P_2] has a cost of (cost of P_2) − c_{j∗} < 0, which cannot happen with the SSPA. ■

FIG. 2. Illustration for Phase 2.

Corollary 1. Let P be the minimum cost path from node 1 to node t in G(x′). Let j∗ be the node that precedes node t on P. If j∗ = 1, then an optimal flow for the original problem can be obtained by sending e∗ units of flow on (1, t). Otherwise, let G∗ be obtained from G(x′) by splitting j∗ into two nodes j_1 and j_2, and including arcs (p(j∗), j_1) and (j_2, q(j∗)). The costs and flows of arcs (p(j∗), j_1) and (j_2, q(j∗)) are identical to those of arcs (p(j∗), j∗) and (j∗, q(j∗)), respectively. We also include arcs (j_1, t) and (j_2, t), each with a cost c_{j∗}. (There is no arc linking j_1 and j_2.) Then, an optimal flow in G∗ corresponds to an optimal flow in G(x′).

Proof. The SSPA carries out the same augmentations in G(x′) and in G∗. ■


Algorithm 2 Phase 2 for the CFPC.
Algorithm Phase 2
begin
    let P, j∗, j_1, j_2, and G∗ be defined as in Corollary 1;
    send e∗ units of flow clockwise from node 1 to node j_1;
end

Note that the flow can be sent from node 1 to node j_1 in the clockwise direction or from node 1 to node j_2 in the counterclockwise direction regardless of whether P was clockwise or counterclockwise. Without loss of generality, we choose to push flow in the clockwise direction to node j_1.

A natural implementation of Phase 2 runs in O(n) time. Note that after Phase 2, we are optimizing on a graph G∗ where all of the excess resides at j_1, which we now call the root node. The pseudoflow is optimal because there is no negative cost cycle in G∗. If we run the SSPA starting with G∗, all minimum cost paths are directed out of the root node.

3.3. Phase 3

In Phase 3, we transform the optimal pseudoflow of Phase 2 into an optimal flow by repeatedly sending flow on shortest paths from the excess node to node t. Note that because G∗ is acyclic, we have reduced the original problem to a convex cost flow problem on a line. For ease of discussion, let the nodes on the line be labeled from node 1 to node n + 1 such that all excess resides at node 1. If i is the node that precedes node t on an augmenting path, then we refer to the augmenting path as P_i. Further, let c_i = 0 for i ∈ [1, n + 1] (this is achieved without loss of generality by setting node potentials π(i) = −c_i for i ∈ [1, n + 1], π(t) = 0, and using the reduced costs instead of the original costs).

Because the flows on the arcs on the line never decrease, it follows that the number of flow augmentations is O(n). We can also identify the shortest path and update the arc flows in O(n) time. Thus, a direct application of the SSPA has an O(n^2) run-time and gives an O(n^2) algorithm for the CFPC. We show that this run-time can be further reduced. We develop an implicit implementation of the SSPA in which we only keep track of the flow on arcs (j, t). We also maintain data structures that are updated as if the flow were augmented on the other arcs (without actually updating their flows). Once we determine the optimal flow values x_jt, we then generate the optimal flow on other arcs using the Phase 1 algorithm.

Let d_j represent the cost of sending flow on path P_j, i.e., d_j = Cost(P_j). In each iteration, we augment flow on path P_i where i = argmin{d_j : j ∈ [1, n + 1]}. The quantity of flow augmented is equal to the residual capacity of P_i.

We call a node i ∈ [1, n] semioptimal if d_i < d_j for j ∈ [1, i − 1]. To carry out the operations efficiently, we maintain the following data structures.

Let L be a list of semioptimal nodes in increasing order of indices. Last(L) is the last node in L. By construction, the shortest augmenting path is P_Last(L). For node i ∈ L, Next(i) is the node following i in L and Prev(i) is the node preceding i in L (the values of Next(i) and Prev(i) are null if there is no such node). For a node i ∈ [1, Last(L)], Interval(i) is the node j ∈ L such that j has the largest index and j < i.

We also maintain the values Dist(i). If i ∈ L and i ≠ Last(L), then Dist(i) = d_k − d_i where k = Next(i). Hence, Dist(i) is the cost of the path from node i to node k. By the definition of L, Dist(i) < 0 for i ∈ L, i ≠ Last(L).
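The list L and the values Dist(i) follow from a single left-to-right scan of the path costs. The following is a minimal sketch, assuming 0-based node indices and a plain list `d` of path costs; the function name and the dictionary used for Dist are illustrative choices.

```python
def semioptimal(d):
    """Build the list L of semioptimal nodes and the Dist values.

    d[i] is the cost of augmenting path P_i, with nodes in line order (0-based).
    A node is semioptimal when its cost is strictly below every earlier cost,
    so it is enough to compare against the current last entry of L.
    Returns (L, dist) with dist[i] = d[Next(i)] - d[i] for i in L except Last(L).
    """
    L = [0]
    for i in range(1, len(d)):
        if d[i] < d[L[-1]]:
            L.append(i)
    dist = {L[k]: d[L[k + 1]] - d[L[k]] for k in range(len(L) - 1)}
    return L, dist
```

For `d = [0, -2, -1, -5, -5]` the scan keeps nodes 0, 1, and 3. Every stored Dist value is negative, as the definition of L requires, and the last entry of L identifies the shortest augmenting path.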

Finally, let h(δ) be the cost of sending δ > 0 units of flow from node 1 to node k = Last(L) in G∗. We do not store h(δ) explicitly. Instead, we maintain a sorted list W of the m = O(n) breakpoints of function h(δ), and we store Dist(i) for each i ∈ L. Without loss of generality, let W(1) = 0 and W(m) = ∞. For a breakpoint w ∈ W, let Node(w) be the node i ∈ [1, n + 1] such that w is a breakpoint of f_i, and let Increment(w) be the amount by which the slope of f_i increases at w.

The time to initialize W is O(n + sort(n)). We next show that all other data structures can be initialized in O(n) time. Initialize L = {1} and d_1 = 0. Traverse nodes i ∈ [2, n + 1] in increasing order of indices, compute d_i, and append node i to L if d_i < d_Last(L). Using d_i, Dist(i) for i ∈ L can also be computed in O(n) time. Node(w) and Increment(w) for w ∈ W can also be initialized in O(n) time. Thus, the total time to initialize all data structures is O(n + sort(n)). We are now ready to describe Phase 3.


Algorithm 3 Phase 3 for the CFPC.
Algorithm Phase 3
begin
    initialize L, Dist(i) for i ∈ L;
    initialize W, x;
    initialize Increment(w), Node(w) for w ∈ W;
    while e_1(x) > 0 do
    begin
        let f := min {e_1(x), W(2) − W(1)}, k := Last(L);
        delete W(1) from W;
        update x_kt := x_kt + f, e_1(x) := e_1(x) − f;
        update L, Dist(i) for i ∈ L, W;
    end
    set b(i) := b(i) − x_it for i ∈ [1, n];
    call Phase 1 to compute the optimal flow on all arcs;
end

The algorithm keeps track of the flow on arcs (i, t). Therefore, at the end of the while loop, we know the optimal flows on arcs (i, t). We then apply the Phase 1 algorithm to determine the optimal flow on all other arcs.

Consider the run-time of the algorithm. One breakpoint is deleted from W per iteration of the while loop; hence, the number of iterations is O(n).

The first three steps in the while loop can be performed in O(1) time. Note that the nodes are partitioned into |L| − 1 intervals by the set L. At each iteration, the number of intervals decreases and either two consecutive intervals are merged or a set of nodes is deleted from L (if Last(L) changes). The bottleneck step turns out to be keeping track of the merged intervals. We will show that the total time spent in maintaining the intervals is O(nα(n)), and the time spent in updating all other data structures is O(n).

Let w = W(1) and k = Node(w). Then, the cost of arc (k − 1, k) changes due to the flow augmentation. Let k′ = Interval(k). Then, we need to update Dist(k′) to Dist(k′) + Increment(w). If Dist(k′) < 0 after the update, we do not need to update L. However, if Dist(k′) > 0, then Next(k′) is no longer semioptimal.

Suppose Next(k′) is no longer semioptimal. Let l ∈ L be the first node after k′ such that the cost of the path between node k′ and node l is negative. (We use previously computed Dist values to determine the cost of the path between node k′ and node l.) If l does not exist, we delete all nodes in L after node k′ and k′ becomes Last(L). If l is defined, we delete all nodes in L strictly between k′ and l, and update Dist(k′) to the cost of the path between node k′ and node l. Once a node is deleted from L, it cannot re-enter. Hence, ignoring the time spent performing the Interval operations, the total time spent in updating Dist(i) and L in Phase 3 is O(n). Also, if Last(L) changes, the breakpoints corresponding to arcs in the path between the new value of Last(L) and the previous value of Last(L) are deleted from W. By storing the addresses of the breakpoints corresponding to each arc, this update can be performed in a total of O(n) time.
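The repair of L described above amounts to a forward walk that accumulates Dist values. The following is a minimal sketch, assuming L is stored as a successor list `nxt` and Dist as a dictionary, both hypothetical layouts. The text does not say how a Dist value of exactly zero is handled; the sketch treats it as no longer semioptimal, in line with the strict inequality in the definition, and that choice is an assumption.

```python
def repair(nxt, dist, k):
    """Restore the list L after dist[k] stopped being negative.

    nxt[i]: successor of node i in L, or None when i is the last node.
    dist[i]: cost of the path from i to nxt[i], stored for every non-last node.
    Walks forward from k, dropping nodes until the accumulated path cost is
    negative again. Returns the nodes removed from L.
    """
    removed = []
    total = dist[k]
    l = nxt[k]
    while l is not None and total >= 0:
        removed.append(l)                  # l is no longer semioptimal
        if nxt[l] is None:
            l = None                       # ran off the end: k becomes the last node
        else:
            total += dist[l]               # reuse the stored Dist values
            l = nxt[l]
    nxt[k] = l
    if l is None:
        dist.pop(k, None)
    else:
        dist[k] = total
    return removed
```

Each node is removed at most once over the whole run, which is the reason the total repair work stays linear apart from the Interval lookups.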

Finally, consider the time taken to perform Interval operations, for which we use Tarjan’s Union-Find algorithm (Tarjan [20]). The Union-Find algorithm performs two operations to manipulate a family of disjoint sets that partition a universe of n elements. Find(i) determines the unique set to which element i belongs and Union(A, B, C) combines set A and set B into a new set C. A sequence of O(n) Find and Union operations can be performed in O(nα(n)) time, where α(n) is the inverse Ackermann function (a function that grows very slowly and is almost a constant). Corresponding to each node j ∈ L with j ≠ Last(L), we maintain a set set_j that contains nodes i ∈ [j + 1, Next(j)]. Thus, Interval(i) can be determined using a Find(i) operation. Also, each time a node j is deleted from L, we perform a Union(set_Prev(j), set_j, set_Prev(j)) operation. The total number of Find and Union operations in Phase 3 is O(n), and therefore the time spent to perform Interval operations is O(nα(n)).
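The Interval bookkeeping can be sketched with a standard union-find structure. The following is a minimal illustration, assuming 0-based nodes, union by size with path compression, and a label stored at each set representative to name the owning node of L; the class and method names are hypothetical.

```python
class Intervals:
    """Disjoint sets where set_j holds the nodes j + 1 .. Next(j)."""

    def __init__(self, n, L):
        self.parent = list(range(n))
        self.size = [1] * n
        self.label = [None] * n            # label[root] = node of L owning the set
        for j, nxt in zip(L, L[1:]):
            for i in range(j + 1, nxt + 1):
                self.parent[i] = nxt       # Next(j) serves as the representative
            self.size[nxt] = nxt - j
            self.label[nxt] = j

    def find(self, i):
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[i] != root:      # path compression
            self.parent[i], i = root, self.parent[i]
        return root

    def interval(self, i):
        """Interval(i): the largest node of L that lies before i."""
        return self.label[self.find(i)]

    def delete(self, j, next_j):
        """Node j leaves L: merge set_Prev(j) (which contains j) with set_j."""
        a, b = self.find(j), self.find(next_j)
        keep = self.label[a]               # merged set stays named after Prev(j)
        if self.size[a] < self.size[b]:
            a, b = b, a                    # union by size
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.label[a] = keep
```

With eight nodes and `L = [0, 3, 5]`, node 4 initially belongs to the interval of node 3. After node 3 is deleted from L, the same lookup returns node 0.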

Based on the preceding discussion, we have the following result.

Theorem 2. Our algorithm solves the CFPC in O(sort(n) + nα(n)) time.

4. THE CONVEX COST FLOW PROBLEM ON A TREE

We use a similar approach to solve the convex cost flow problem on a tree. In Phase 1, we compute the unique spanning tree flow x′ in O(n) time. Because the tree does not contain cycles, x′ satisfies the optimality conditions. With respect to the original problem, at the end of Phase 1, we have an optimal pseudoflow x′ such that e_1(x′) = e∗, e_i(x′) = 0 for i ∈ [2, n], and e_t(x′) = −e∗. We skip Phase 2 and directly proceed to Phase 3.


We now describe the Phase 3 algorithm for the problem on a tree. Let P_i be the unique path in the tree between node 1 and node t with node i ∈ [1, n] being the penultimate node. Because all the excess is at node 1, the flow on arcs (p(i), i) for i ∈ [2, n] is nondecreasing and the algorithm has O(n) flow augmentations. The cost of path P_i for i ∈ [1, n] can be determined in O(n) time by performing a tree traversal starting from the root node and computing the cost to each node as it is visited. Thus, the shortest augmenting path can be determined in O(n) time, and this gives an O(n^2) algorithm for the CFPT. We now describe an application of the dynamic tree data structure to improve the run-time to O(n log n).
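As a point of reference before the dynamic tree version, the O(n) traversal used by the simple O(n^2) algorithm can be sketched as follows. It assumes 0-based nodes with node 0 as the root, a `parent` list, and lists holding the current residual cost of each tree arc and the cost of each arc (i, t); these names and the layout are assumptions for the example.

```python
def path_costs(parent, arc_cost, c):
    """Cost of every augmenting path P_i (root -> i -> t) in one tree traversal.

    parent[i]: predecessor p(i) of node i; parent[0] is None for the root.
    arc_cost[i]: current residual cost of tree arc (p(i), i); arc_cost[0] is unused.
    c[i]: cost of arc (i, t).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for i in range(1, n):
        children[parent[i]].append(i)
    to_node = [0.0] * n                    # cost of the tree path from the root to i
    stack = [0]
    while stack:
        u = stack.pop()
        for v in children[u]:
            to_node[v] = to_node[u] + arc_cost[v]
            stack.append(v)
    return [to_node[i] + c[i] for i in range(n)]
```

Taking the minimum of the returned list identifies the shortest augmenting path for one iteration. Repeating the traversal after each of the O(n) augmentations gives the quadratic bound stated above.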

Let h_i(δ) be the cost of sending δ > 0 units of additional flow on arc (p(i), i) for i ∈ [2, n], let W_i be a list of the m_i breakpoints of h_i(δ) in sorted order, and let Increment_i(w) be the amount by which the slope of h_i(δ) increases at breakpoint w ∈ W_i. Without loss of generality, let W_i(1) = 0 and W_i(m_i) = ∞.

The dynamic tree is a data structure that maintains a collection of node-disjoint rooted trees and allows certain operations on the tree to be performed efficiently. We maintain two dynamic trees which have the same structure as the input graph: (1) R, which stores the residual capacity of arc (p(i), i) for i ∈ [2, n]; and (2) C, which stores the cost of the path P_i for i ∈ [1, n]. R uses an arc-based implementation (Sleator and Tarjan [19]) and C uses a node-based implementation of the dynamic tree (Tarjan [21]). Because the flow on arc (p(i), i) cannot decrease, we only keep track of the residual capacities of arcs (p(i), i) for i ∈ [2, n]. R is a directed in-tree and is maintained such that each arc (i, p(i)) ∈ R holds residual capacity r_{p(i),i}. C is maintained such that each node i ∈ C contains the cost of path P_i. Note that three pairs of the operations below (link, value, and update) have the same name, but the steps they carry out depend on whether the dynamic tree is R or C.

The following operations can be performed on R and C in O(log n) time.

1. link(R, node i, node j, real x): The operation assumes that node i is a tree root, and that node i and node j belong to different trees in R. The operation combines the trees containing i and j by adding arc (i, j) with real value x and making j the parent of i.

2. value(R, node i): Returns/initializes the value of arc (i, p(i)) in R.

3. update(R, node i, real x): Adds a real number x to the values of all arcs in R in the path between i and the root.

4. minvalue(R, node i): Returns the node j closest to the root such that (j, p(j)) has the smallest value among the values of arcs on the path from i to the root.

5. link(C, node i, node j): Combines the trees containing nodes i and j by adding arc (i, j).

6. value(C, node i): Returns/initializes the value of node i in C.

7. update(C, node i, real x): Let T_i be the subtree that contains node i and is obtained by deleting arc (p(i), i) from C. Then, this operation adds a real number x to the values of all nodes in T_i.

8. minvalue(C): Returns a node in C with the minimum value.

W_i, Increment_i, and r_{p(i),i} for i ∈ [2, n] can be computed in a straightforward manner in O(n) time. Also, Cost(P_i) for i ∈ [1, n] can be determined in O(n) time by performing a tree traversal starting from node 1. With these values, R and C can be initialized in O(n log n) time using O(n) link operations. The Phase 3 algorithm is as follows.

Algorithm 4 Phase 3 for the CFPT.

Algorithm Phase 3
begin
    initialize Wi for i ∈ [2, n], R, C, x;
    while e1(x) > 0 do
    begin
        compute i := minvalue(C);
        compute j := minvalue(R, i);
        compute f := min{e1(x), value(R, j)};
        update xit := xit + f, e1(x) := e1(x) − f;
        if e1(x) = 0 then exit while loop;
        update(R, i, −f);
        delete Wj(1) from Wj;
        set value(R, j) := Wj(2) − Wj(1);
        update(C, j, Incrementj(W(1)));
    end
    set b(i) := b(i) − xit for i ∈ [1, n];
    call Phase 1 to compute the optimal flow on all arcs;
end

In each iteration, we identify the shortest path Pi by performing a minvalue operation on C, and the quantity of flow augmented by performing a minvalue and a value operation on R. Following this, we update the residual capacities of all arcs that belong to Pi using an update and a set value operation on R, and a delete operation on Wj. Finally, we update the costs using an update operation on C. Each iteration of the while loop (other than the last one) eliminates one breakpoint and the total time spent per iteration is O(log n). We therefore have the following result.
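The two operations on C used in each iteration, a subtree update and a global minimum, can be illustrated with a simplified static structure: an Euler tour maps every subtree to a contiguous interval, and a segment tree with additive tags supports interval addition and a global argmin in O(log n). This is a sketch under the assumption that the tree shape is fixed after initialization; it is not the dynamic structure of Tarjan [21] and it does not support link. The class name SubtreeAddMin is hypothetical.

```python
class SubtreeAddMin:
    """Static stand-in for C: subtree add and global argmin in O(log n).

    Assumption: the tree is fixed and rooted at node 1 (1-indexed);
    parent[i] = p(i) for i in 2..n and init[i] is the initial value of node i.
    """

    def __init__(self, n, parent, init):
        children = [[] for _ in range(n + 1)]
        for i in range(2, n + 1):
            children[parent[i]].append(i)
        # Euler tour: the subtree of i occupies positions tin[i]..tout[i].
        self.tin, self.tout, self.order = [0] * (n + 1), [0] * (n + 1), []
        stack = [(1, False)]
        while stack:
            u, done = stack.pop()
            if done:
                self.tout[u] = len(self.order) - 1
                continue
            self.tin[u] = len(self.order)
            self.order.append(u)
            stack.append((u, True))
            for v in children[u]:
                stack.append((v, False))
        self.size = 1
        while self.size < n:
            self.size *= 2
        self.mn = [float("inf")] * (2 * self.size)  # min of range, own tag included
        self.add = [0.0] * (2 * self.size)          # pending addition for the range
        for pos, u in enumerate(self.order):
            self.mn[self.size + pos] = init[u]
        for k in range(self.size - 1, 0, -1):
            self.mn[k] = min(self.mn[2 * k], self.mn[2 * k + 1])

    def update(self, i, x):
        """Add x to the value of every node in the subtree rooted at i."""
        self._add(1, 0, self.size - 1, self.tin[i], self.tout[i], x)

    def _add(self, k, lo, hi, a, b, x):
        if b < lo or hi < a:
            return
        if a <= lo and hi <= b:
            self.mn[k] += x
            self.add[k] += x
            return
        mid = (lo + hi) // 2
        self._add(2 * k, lo, mid, a, b, x)
        self._add(2 * k + 1, mid + 1, hi, a, b, x)
        self.mn[k] = min(self.mn[2 * k], self.mn[2 * k + 1]) + self.add[k]

    def value(self, i):
        """Return the current value of node i (leaf value plus ancestor tags)."""
        k = self.size + self.tin[i]
        total = self.mn[k]
        k //= 2
        while k >= 1:
            total += self.add[k]
            k //= 2
        return total

    def minvalue(self):
        """Return a node with the minimum current value."""
        k = 1
        while k < self.size:
            # siblings share all ancestor tags, so comparing mn is valid
            k = 2 * k if self.mn[2 * k] <= self.mn[2 * k + 1] else 2 * k + 1
        return self.order[k - self.size]
```

For instance, on the tree with arcs (1,2), (1,3), (2,4) and initial node values 4, 3, 1, 5, the first call to minvalue returns node 3; after update(3, 5.0) the minimum moves to node 2.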

Theorem 3. The dynamic tree implementation of the algorithm solves the CFPT in O(n log n) time.

5. APPLICATIONS

5.1. Distributed Computing Application

Distributed computing refers to the mechanism by which a divisible job is shared between multiple processors connected together on a network. A popular network configuration for processors is the tree or the bus (line or circle) configuration (Cheng and Robertazzi [10], Bataineh and Robertazzi [6], Kim et al. [16]). The load on the processing network varies as jobs are created or destroyed, and eventually some processors have excess loads while others have deficits. Hence, the load needs to be optimally rebalanced so that the computing resources are better utilized. This rebalancing has to be performed repeatedly and in real time. It is therefore important that the algorithm used is efficient.


The problem of rebalancing can be formulated as a convex cost flow problem. Each processor is a node in the network. Processors with excess processing loads are excess nodes with associated excesses, and processors with a deficit of loads are deficit nodes with associated deficits. The arcs in the network are the connections between the processors, and each arc has an associated capacity and cost of transmitting data on it. Retaining excess load at a particular node has an associated linear cost. It is easy to see that when the arc costs are separable piecewise linear convex functions and the underlying network is a circle or a tree, this problem can be solved efficiently using the algorithms in this article.

5.2. Dynamic Lot-Sizing Application

Dynamic lot-sizing addresses the problem of planning production in order to satisfy time-varying demands of an item. In the extensive lot-sizing literature, researchers solve lot-sizing problems over a fixed planning horizon where the inventories at the start and the end of the finite horizon are zero (Wagner and Whitin [29], Wagelmans et al. [28], Atamturk and Hochbaum [5], Ahuja and Hochbaum [2]). However, in the real world there are occasions where a firm wants to develop a base schedule that repeats weekly (or possibly repeats after some other fixed number of periods), assuming that the demand also repeats weekly; we refer to the repeating schedule as cyclic. In this generalization of the lot-sizing problem, the inventory at the end of an n-period planning cycle becomes the starting inventory for the next n-period planning cycle, which then repeats indefinitely. The problem of determining the optimum schedule that repeats every n periods is called the cyclic dynamic lot-sizing problem (CDLP). Using suitable transformations, our algorithm for the CFPC can be used to solve the CDLP under the assumption that production costs are identical and linear, and the inventory/backorder cost on each arc is a piecewise linear convex function with minimum value at x = 0.

5.3. Computational Biology and Computational Music Applications

It is easy to see that our algorithm, along with flow decomposition, can be used to solve many-to-one matching problems on circles and lines. We now describe two applications of the many-to-one matching problem on a line. The fastest algorithm for this problem runs in O(n log n) time (Colannino et al. [11]). The run-time of our algorithm almost matches this run-time while accommodating a more general network and cost structure.

DNA sequencing is a fundamental problem in computational biology and involves determining the order of base compounds in a DNA molecule. Shotgun sequencing is a method used for sequencing large DNA strands. In this method, the DNA strand is broken into numerous small segments, which are then sequenced to obtain reads. Computer algorithms are used to assemble the overlapping ends of different reads into a continuous sequence; this phase is called finishing and is computationally very intensive.

Ben-Dor et al. [7] proposed a new approach to help in the challenge of finishing sequencing projects. One of the important components in their approach involves solving a subproblem called the restriction scaffold assignment problem (RSAP). This subproblem needs to be solved millions of times and hence it is important to solve it efficiently. The RSAP is mathematically a many-to-one matching problem on a Euclidean line.

Measuring the similarity between musical rhythms is a fundamental problem in computational music theory (Toussaint [22]). There exist several similarity measures such as the Hamming distance (Hamming [13]), the Euclidean interval-vector space (Mont-Reynaud and Goldstein [17]), the interval-difference distance measure (Coyle and Shmulevich [12]), the swap distance (Toussaint [22]), and the chronotonic distance measures (Hofmann-Engl [14]).

The swap distance between two rhythms with unequal numbers of beats is measured using the directed swap distance (Diaz-Banez et al. [8]). The associated optimization problem that has to be solved is once again a many-to-one matching problem on a Euclidean line.

5.4. Transportation Network Application

We give examples in railroad transportation networks. Railroad scheduling (or planning) problems like the crew scheduling problem (Vaidyanathan et al. [27]) and the locomotive scheduling problem (Ahuja et al. [3], Vaidyanathan et al. [25, 26]) have been well studied. However, in real life, events do not always go according to plan due to inclement weather, breakdowns, and other unplanned disruptions. This creates imbalances, and the network may have excess resources in some locations and a deficit of resources at other locations when compared to the optimal plan. After the disruption is over, the railroad's objective is to transition back to the desired state by moving resources from excess points to deficit points, so that the optimal plan can once again be followed. The railroad uses its existing trains to reposition resources because this is the most cost-effective way to do so. Crews or locomotives that are being repositioned on trains are said to be deadheading. Similarly, empty freight cars are also repositioned on trains. Because the resources that are being repositioned are idle and not contributing to revenue, the objective of the railroad is to minimize the total cost of repositioning. Railroad networks are usually sparse due to their very high construction costs. When the underlying railroad network is a tree, a line, or a circle and arc costs are convex, the problem of repositioning locomotives, crews, or rail cars at minimal cost can be formulated as a CFPT or CFPC, and our algorithms can be used to solve these problems. We can envision similar applications in areas such as highway and river transportation.

Acknowledgments

The authors thank the anonymous referees for carefully reviewing the paper and making numerous suggestions that resulted in an improved presentation of our research. The authors also thank Editor-in-Chief Professor Douglas Shier for suggesting many editorial improvements.

REFERENCES

[1] A. Aggarwal, A. Bar-Noy, S. Khuller, D. Kravets, and B. Schieber, Efficient minimum cost matching and transportation using the quadrangle inequality, J Algorithms 19 (1995), 116–143.

[2] R.K. Ahuja and D.S. Hochbaum, Solving linear cost dynamic lot sizing problems in O(n log n) time, Oper Res 56 (2008), 255–261.

[3] R.K. Ahuja, J. Liu, J.B. Orlin, D. Sharma, and L.A. Shughart, Solving real-life locomotive scheduling problems, Transp Sci 39 (2005), 503–517.

[4] R.K. Ahuja, T.L. Magnanti, and J.B. Orlin, Network flows: Theory, algorithms, and applications, Prentice Hall, NJ, 1993.

[5] A. Atamturk and D.S. Hochbaum, Capacity acquisition, subcontracting, and lot sizing, Manag Sci 47 (2001), 1081–1100.

[6] S. Bataineh and T.G. Robertazzi, Bus-oriented load sharing for a network of sensor driven processors, IEEE Trans Syst Man Cybern 21 (1991), 1202–1205.

[7] A. Ben-Dor, R.M. Karp, B. Schwikowski, and R. Shamir, The restriction scaffold problem, J Comput Biol 10 (2003), 385–398.

[8] J.M. Diaz-Banez, G. Farigu, F. Gomez, D. Rappaport, and G.T. Toussaint, El compas flamenco: A phylogenetic analysis, Proceedings of BRIDGES: Mathematical Connections in Art, Music and Science, Winfield, Kansas, 2004, pp. 61–70.

[9] M. Blum, R.W. Floyd, V. Pratt, R.L. Rivest, and R.E. Tarjan, Time bounds for selection, J Comput Syst Sci 7 (1973), 448–461.

[10] Y.C. Cheng and T.G. Robertazzi, Distributed computing for a tree network with communication delays, IEEE Trans Aero Electron Syst 26 (1990), 511–516.

[11] J. Colannino, M. Damian, F. Hurtado, J. Iacono, H. Meijer, S. Ramaswami, and G.T. Toussaint, An O(n log n) algorithm for the restriction scaffold assignment problem, J Comput Biol 13 (2006), 979–989.

[12] E.J. Coyle and I. Shmulevich, A system for machine recognition of music patterns, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Seattle, Washington, 1998.

[13] R.W. Hamming, Coding and information theory, Prentice-Hall, Englewood Cliffs, 1986.

[14] L. Hofmann-Engl, Rhythmic similarity: A theoretical and empirical approach, Proceedings of the Seventh International Conference on Music Perception and Cognition, Sydney, Australia, 2002, pp. 564–567.

[15] R.M. Karp and S.Y.R. Li, Two special cases of the assignment problem, Discrete Math 13 (1975), 129–142.

[16] H.J. Kim, G. Jee, and J.G. Lee, Optimal load distribution from tree network processors, IEEE Trans Aero Electron Syst 32 (1996), 607–612.

[17] M. Mont-Reynaud and M. Goldstein, On finding rhythmic patterns in musical lines, Proceedings of the International Computer Music Conference, Vancouver, Canada, 1985, pp. 391–397.

[18] J.B. Orlin, A faster strongly polynomial minimum cost flow algorithm, Oper Res 41 (1993), 377–387.

[19] D. Sleator and R.E. Tarjan, A data structure for dynamic trees, J Comput Syst Sci 26 (1983), 362–391.

[20] R.E. Tarjan, Efficiency of a good but not linear set union algorithm, J Assoc Comput Mach 22 (1975), 215–225.

[21] R.E. Tarjan, Dynamic trees as search trees via Euler tours, applied to the network simplex algorithm, Math Program 78 (1997), 169–177.

[22] G.T. Toussaint, A comparison of rhythmic similarity measures, Technical report SOCS-TR-2004.6, School of Computer Science, McGill University, Montreal, Canada, 2004.

[23] B. Vaidyanathan, Simple linear flow decomposition algorithms on trees, circles, and augmented trees, Networks 60 (2012), 227–234.

[24] B. Vaidyanathan and R.K. Ahuja, Fast algorithms for specially structured minimum cost flow problems with applications, Oper Res 58 (2010), 1681–1696.

[25] B. Vaidyanathan, R.K. Ahuja, J. Liu, and L.A. Shughart, Real-life locomotive planning: New formulations and computational results, Transp Res B 42 (2008), 147–168.

[26] B. Vaidyanathan, R.K. Ahuja, and J.B. Orlin, The locomotive routing problem, Transp Sci 42 (2008), 492–507.

[27] B. Vaidyanathan, K.C. Jha, and R.K. Ahuja, Multi-commodity network flow approach to the railroad crew-scheduling problem, IBM J Res Develop 51 (2007), 325–344.

[28] A. Wagelmans, S. Van Hoesel, and A. Kolen, Economic lot sizing: An O(n log n) algorithm that runs in linear time in the Wagner-Whitin case, Oper Res 40 (1992), S145–S156.

[29] H.M. Wagner and T.M. Whitin, Dynamic version of the economic lot size model, Manag Sci 5 (1958), 89–96.

