https://doi.org/10.31449/inf.v42i3.1493 Informatica 42 (2018) 401–405 401
Improved Local Search Based Approximation Algorithm for Hard Uniform
Capacitated k-Median Problem
Sapna Grover, Neelima Gupta and Aditya Pancholi
Department of Computer Science, University of Delhi, India
E-mail: sgrover@cs.du.ac.in, ngupta@cs.du.ac.in, apancholi@cs.du.ac.in
Keywords: NP completeness, approximation algorithm, k-median problem
Received: January 7, 2017
In this paper, we study the hard uniform capacitatedk - median problem. We give (5 +  ) factor approxi-
mation for the problem using local search technique, violating cardinality by a factor of 3. Though better
results are known for the problem using LP techniques, local search algorithms are well known to be sim-
pler. There is a trade-off viz-a-viz approximation factor and cardinality violation between our result and
the result of Korupolu et al. [10] which is the only result known for the problem using local search. They
gave (1 +  ) approximation factor with (5 + 5=  ) factor loss in cardinality. In a sense, our result is an
improvement as they violate the cardinality by more than a factor of 6 to achieve 5 factor in approximation.
Though in their result, the approximation factor can be made arbitrarily small, cardinality loss is at least 5
and small approximation factor is obtained at a big loss in cardinality. Thus, we improve upon their result
with respect to cardinality.
Povzetek: Obravnavan je NP problem optimiranja iskanja k median in predlagana izvirna rešitev, ki dosega
boljše rezultate v doloˇ cenih primerjavah.
1 Introduction
k - Median Problem is one of the well studied NP-hard
optimization problem. The input instance consists of a set
of clients, a set of facilities, a non-negative numberk and
a non-negative cost of connecting a facility to a client. The
goal is to select a set of at mostk facilities as centers and
assign clients to them such that the total cost of serving the
clients from centers is minimum.
Several versions of the problem exist in literature
with different properties, the most common being Un-
capacitated k Median Problem (UkM) and Capacitated k
Median Problem (CkM). In the former case, each facility
has inﬁnite capacity (i.e. there is no limit on the amount
of demand it can serve) in comparison to ﬁnite capacity in
the latter case. In CkM, capacities may be soft or hard. In
soft capacitated version, multiple copies of a facility can
be opened at a location whereas in case of hard capacities,
each facility is either opened at some location or not. Also,
the capacities may be uniform or non-uniform. In the for-
mer case, all facilities have the same capacity in contrast
to the latter one where-in different facilities have different
capacities. Another variation of CkM is with respect to
assignments of clients to facilities: in un-splittable assign-
ments, the entire demand of a client has to be served by
only one facility, in comparison to splittable assignments
in which the demand of a client can be split among multi-
ple facilities.
Several techniques have been used to obtain results for
the problem. One of the most widely used technique to ap-
proximate the problem is LP Rounding ([4, 5, 7, 8, 9, 11,
12, 13, 14]). Charikar et al. [7] gave a 20=3 factor approxi-
mation algorithm for UkM, which was further improved to
3:25 factor by Charikar and Li in [8]. Li and Svensson [14]
further improved the ratio to 1 +
p
3 +  . Their algorithm
has a running time of O(n
(1= 
2
)
).
Obtaining a constant approximation factor for CkM
problem without violating capacity constraint and cardinal-
ity constraint is challenging as natural LP of the problem is
known to have an unbounded integrality gap. Approxima-
tion results violate either capacity constraint or cardinality
constraint, or both.
Cardinality violation: Li [12] gave a novel linear pro-
gram called rectangle LP and presented an improved ap-
proximation algorithm (exp(O(1= 
2
))) using at most (1 +
  )k facilities for hard uniform CkM problem. The running
time of the algorithm is n
O(1)
, where the constant in the
exponent does not depend on  . He then extended this re-
sult to non-uniform soft capacitated variant of the problem
in [13] and gave an (O(1= 
2
log(1=  ))) approximation fac-
tor bounding softness by a factor of 2. The algorithm has a
running time ofn
O(1=  )
.
Capacity violation: Charikar et al. [7] gave a 16 factor
approximation algorithm for hard uniform CkM violating
capacities by a factor of 3 in case of splittable demands
and 4 in case of un-splittable demands. In 2015, Byrka
et al. [4] gave anO(1=  ) approximation algorithm violat-
ing capacities by a factor of (3 +  ) for hard non-uniform
CkM. Demirci et al. [9] improved the approximation ratio
toO(1= 
5
) with capacity violation of (1 +  ) for the same
402 Informatica 42 (2018) 401–405 S. Grover et al.
version of the problem. The running time of their algorithm
isn
O(1=  )
. Recently, Byrka et al. [5] gave anO(1= 
2
) ap-
proximation violating capacities by a factor of (1 +  ) for
hard uniform CkM. The algorithm uses randomized round-
ing to round a fractional solution to the conﬁguration LP.
Aardal et al. [1] exploited the structure of an extreme
point solution to give a (7+  ) factor algorithm for hard non-
uniform Capacitated k- Facility Location Problem (Ck-
FLP) violating cardinality constraint by a factor of 2. As
a special case of CkFLP, their result applies on hard non-
uniform CkM with all facility costs being zero. In the same
manner, the CkFLP result (1= 
2
) of Byrka et al. [4] is appli-
cable on hard uniform CkM. The result violates capacities
by a factor of 2 +  .
The other commonly used technique for the problem is
local search [2, 6, 10]. Charikar and Guha [6] gave 4 factor
algorithm without violating cardinality constraint for the
un-capacitated variant of the problem. Korupolu et al. [10]
gave O(1 +  ) factor approximation algorithm for UkM us-
ing at most 3+5=  facilities. Arya et al. [2] gave an impro-
vised result of 3 + 2=p factor algorithm for UkM by using
p-swaps.
We present a (5 +  ) factor algorithm for hard uniform
CkM violating the cardinality by a factor of 3 using Lo-
cal Search. Algorithms based on local search are well
known to be simpler as compared to the LP-based algo-
rithms. The only result known for the problem using local
search is due to Korupolu et al. [10]. They give an algo-
rithm with a trade-off between approximation factor and
cardinality loss. They give (1 +  ) approximation factor
with (5 + 5=  ) factor loss in cardinality. To achieve 5 fac-
tor in approximation, cardinality violation is more than 6.
Though the approximation factor can be made arbitrarily
small, cardinality loss is at least 5. Note that small approx-
imation factor is obtained at a big loss in cardinality. For
example, for   anything less than 1, cardinality violation
is more than 10. Though we somewhat loose on the ap-
proximation factor, we surely improve upon the cardinality
violation. Thus, there is a trade-off between cardinality vi-
olation and approximation factor amongst their result and
ours. In particular, we present the following result:
Theorem 1. There is a polynomial time algorithm that
approximates hard uniform capacitatedk median problem
within 5 factor violating the cardinality by a factor of 3.
High Level Idea: We extend the idea of ‘mapping’ of
Arya et al. [2] to the capacitated version of the problem.
However, for the capacitated case, mapping needs to be
done a little intelligently. Mapping to an almost fully uti-
lized facility may not be able to accommodate all the clients
mapped to it and vice-versa. That is, a partially utilized
facility may not be able to accommodate the load of an
almost fully utilized facility. Thus, mapping is done only
between the partially utilized facilities. To ensure that there
are sufﬁcient number of partially utilized facilities, we need
to assume that we have sufﬁcient number (3k) of opened
centers.
2 Notation and preliminaries
2.1 Capacitatedk-median problem
In Capacitatedk-Median Problem, we are given a set ofF
of facilities, a setC of clients and a real valued distance
function c onF[C in metric space. Each client j 2C
has a non-negative demandd
j
and each facilityi2F has
a capacityu
i
indicating the amount of demand it can serve.
The cost of serving one unit of demand of a clientj2C
from facility i2F is denoted as c(i; j). The goal is to
select a subsetS  F of at most k facilities and assign
clients to them without violating the capacities such that the
total cost of serving all the clients by the opened facilities
is minimum.
We consider the hard uniform capacitatedk-median ver-
sion of the problem i.e.u
i
= U8i2F and at most one
instance of a facility can be opened at its location. We as-
sume unit demand at each client i.e.d
j
= 18j2C.
2.2 Local search paradigm
Given a Problem P, let S be any arbitrary feasible solution
to it. A new solution S
0
is called a neighborhood solution
of S if it can be obtained by performing local search
operations such as adding one or more facilities s = 2 S to
S or deleting one or more facilities s from S or swapping
one or more facilities of S with facilities not in S. We now
formally describe the steps of the algorithm.
The paradigm:
1. Compute an arbitrary feasible solutionS toP.
2. While S
0
is a neighborhood solution of S such that
cost(S
0
)<cost(S) do, setS = S
0
.
The solution S so obtained is called a locally optimal
solution. Note that cost(S
0
)  cost(S) for every neigh-
borhood solution S
0
, for otherwise S would not have been
locally optimal. More formally, a solutionS is said to be lo-
cally optimal if no further operation results in improvement
in cost.
3 (5+ ; 3) algorithm
For the k-median problems, we deﬁne an (a;b) -
approximation algorithm as a polynomial-time algorithm
that computes a solution using at mostbk number of facili-
ties with cost at mosta times the cost of an optimal solution
using at mostk facilities.
We select an arbitrary set of facilitiesS F such that
jSj = 3k. This set acts as our initial feasible solution. Note
that, deﬁning a subset of opened facilities completely spec-
iﬁes a solution. We can obtain the assignments by solving
an appropriately deﬁned instance of transportation prob-
lem.
Improved Local Search Based Approximation Algorithm for. . . Informatica 42 (2018) 401–405 403
The only operation permitted by our algorithm is
swap(s;o), deﬁned as follows: S = S f sg +fog,
o2FnS, s2S. Reassign all the clients served by o
in optimal too in our new solution.
We run the local search algorithm onS. SinceS is now
locally optimal, for all neighborhood solutionsS
0
ofS, we
have,cost(S
0
)  cost(S).
3.1 Analysis
LetO denote the optimal solution to the problem. We now
show that the local optimal solutionS is within 5 factor of
the optimal solution i:e:cost(S)  5cost(O).
For a clientj, let  S
(j) and  O
(j) denote the facilities
serving j inS andO respectively. Also, let S
j
and O
j
denote the service costs paid byj inS andO respectively.
Let s2S and o2O. Consider Figure 1. LetB
S
(s)
denote the ball ofs, that is, the set of clients served bys in
S. Similarly, letB
O
(o) denote the ball ofo2O. Also, let
B(s;o) be the set of clients served bys2S ando2O.
Figure 1: Balls of facilities
To deal with capacities, we classify the facilities inS
based on the number of clients served by them. A facility
s2S is said to be heavy if it serves more than U=2 clients
inS, else it is said to be light. Note that the number of
heavy facilities can be at most 2k. LetS
L
denote the set of
light facilities inS. SincejSj = 3k,jS
L
j  k.
LetB
L
O
(o) be the set of clients served by o in optimal
and by light facilities inS andM
o
=jB
L
O
(o)j. We say
that a facilitys2S
L
dominateso, if it serves more than
half the clients served by light facilities inS and byo2O,
i:e:B(s;o) >M
o
=2. A facility belonging toS
L
is called
bad if it dominates more than one facilities inO, it is called
good if it dominates exactly one facility inO, else it is
called nice
We now devise a 1  1 and onto mapping  :B
L
O
(o)!
B
L
O
(o). Order the clients inB
L
O
(o) asj
0
;j
1
;:::;j
Mo  1
such
that for everys2 S with a nonemptyB(s;o), the clients
inB(s;o) are consecutive; that is, there exists r;s; 0  r  s M
o
  1, such thatB(s;o) =fj
r
;:::;j
s
g. De-
ﬁne  (j
p
) = (j
q
), whereq = (p +
  M
o
=2
  )moduloM
o
.
Consider Figure 2a which shows the setB
O
(o). The corre-
sponding mapping is shown in Figure 2b.
The following claim holds on mapping:
Claim 1. Ifs2S
L
does not dominateo, then  (B(s;o))\
B(s;o) =  .
Proof. For contradiction, assume that both j
p
,   (j
p
) =
j
q
2 B(s;o) for some s, wherejB(s;o)j   M
o
=2. If
q = p +
  M
o
=2
  , then jB(s;o)j   q  p + 1 =
  M
o
=2
  + 1 >M
o
=2. Ifq = p +
  M
o
=2
   M
o
, then
jB(s;o)j  p  q + 1 =M
o
    M
o
=2
  + 1>M
o
=2. In
either case, we have a contradiction, and hence mapping  satisﬁes the claim.
Figure 2: Mapping
The notion of dominate can be used to construct a bipar-
tite graphH = (S;O;E). For each facility inS
L
, we have
a vertex on theS-side and for each facility inO, we have a
vertex on theO-side. We add an edge betweens2S
L
and
o2O ifs dominateso. Note that the degree of each vertex
onO-side is at most one while the vertices on theS-side
can have degree up tok.
We now consider allk swaps, one for each facility inO.
Ifs2S
L
is good, then we consider theswap(s;o), where
o is the facility inO dominated bys. Let  be the number
of facilities inO that did not participate in the above swaps.
Then the total number of bad and nice facilities inS
L
is at
least  and at least = 2 of them must be nice. The remain-
ing  facilities inO get swapped with the nice facilities in
S
L
such that each nice facility is considered in at most two
swaps. The bad facilities are not considered for swapping.
The swaps considered above satisfy the following proper-
ties:
1. Eacho2O is considered in exactly one swap.
2. Facilities inSnS
L
are not considered in any swap
operation.
3. Bad facilities inS
L
are not considered in any swap
operation.
4. Each nice facilitys2S
L
is considered in at most two
swap operations.
5. Ifswap(s;o) is considered thens does not dominate
any facilityo
0
6=o :o
0
2O.
404 Informatica 42 (2018) 401–405 S. Grover et al.
Lemma 1. Let cost(S) denote the cost of the local opti-
mal solutionS and,cost(O) denote the cost of the global
optimal solutionO. Then,cost(S)  5cost(O).
Proof. Considerswap(s;o). Letj2B
S
(s). We ﬁrst reas-
sign the clients inB
S
(s).
1. Ifj2B
O
(o), assignj too.
2. Ifj = 2B
O
(o), assignj tos
0
2S
L
such that  (j) =j
0
andj
0
2B
S
(s
0
).
In case 1, the change in cost is given by (O
j
  S
j
).
In case 2, the change in cost is (c(j; s
0
)  S
j
). Let
j2B
O
(o
0
). From triangle inequality, we get c(j; s
0
)  c(j; o
0
) +c(o
0
;   (j)) +c(  (j); s
0
) =O
j
+O
  (j)
+S
  (j)
.
AsS is a locally optimal solution, we have
X
j2B
S
(s)\B
O
(o)
(O
j
  S
j
)+
X
j2B
S
(s)nB
O
(o)
(O
j
+O
  (j)
+S
  (j)
  S
j
)> 0 (1)
Each facility o2O is considered in exactly one swap
operation. Thus the ﬁrst term of inequality when added
over all k swaps gives exactly cost(O)  cost(S). Each
s2S is considered in at most two swaps. The second term
of inequality when added over all k swaps is no greater
than 2(O
j
+O
  (j)
+S
  (j)
  S
j
). As  is a 1  1 and onto
mapping,
X
j2C
O
j
=
X
j2C
O
  (j)
and
X
j2C
(S
  (j)
  S
j
) = 0.
Thus, 2(O
j
+O
  (j)
+S
  (j)
  S
j
) = 4cost(O). Combining
the two terms, we getcost(O)  cost(S) + 4cost(O)  0.
Thus,cost(S)  5cost(O).
In the algorithm presented so far, we move to a new so-
lution if it gives some improvement in the cost, however
small that improvement may be. This may lead to an algo-
rithm taking lot of time. To ensure that the algorithm termi-
nates in polynomial time, a local search step is performed
only when the cost of the current solutionS is reduced by
at least
cost(S)
p(n;  )
, wheren is the size of the problem instance
andp(n;  ) is an appropriate polynomial inn and 1=  for
a ﬁxed  > 0. This modiﬁcation in the algorithm incurs a
cost of additive  in the approximation factor.
It is easy to see that if we have 3:5k facilities then the
total number of bad and nice facilities inS
L
is at least  +
k=2 and at least (  + k)=2     of them must be nice.
The remaining  facilities inO get swapped with the nice
facilities inS
L
such that each nice facility is considered
in at most one swap. This saves us factor 2 coming from
the second term of equation (1). Thus, we get (3 + ; 3:5)
algorithm. Also, usingp-swaps of Arya et al. [2], we can
get (3 + 2=p; 3) algorithm.
4 Conclusion and future work
We gave a (5 +  ) factor approximation algorithm for hard
uniform capacitatedk median problem using local search
technique, violating cardinality by a factor of 3. It improves
upon the existing results known for the problem using local
search, with respect to cardinality violation. It would be
interesting to obtain a constant factor algorithm reducing
the cardinality violation to (1 +  ). Though such a result
is known using LP-techniques, it would be interesting to
obtain similar result using local search. Another direction
to extend the work would be to consider the non-uniform
capacitated version of the problem using local search.
References
[1] Karen Aardal, Pieter L. van den Berg, Dion Gijswijt,
and Shanfei Li. Approximation algorithms for hard
capacitated k-facility location problems. European
Journal of Operational Research, 242(2):358–368,
2015. https://doi.org/10.1016/j.ejor.2014.10.011
[2] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam
Meyerson, Kamesh Munagala, and Vinayaka Pan-
dit. Local search heuristic for k-median and fa-
cility location problems. In Proceedings on 33rd
Annual ACM Symposium on Theory of Comput-
ing, Heraklion, Crete, Greece, pages 21–29, 2001.
https://doi.org/10.1145/380752.380755
[3] Manisha Bansal. Approximation algorithms for facil-
ity location problems- https://drive.google.com/ﬁle/d/
0bxmghjb2ede3nkvka2rzcfnethm/view. In PhD The-
sis, October, 2013.
[4] Jaroslaw Byrka, Krzysztof Fleszar, Bartosz Rybicki,
and Joachim Spoerhase. Bi-factor approximation al-
gorithms for hard capacitated k-median problems. In
Proceedings of the Twenty-Sixth Annual ACM-SIAM
Symposium on Discrete Algorithms, SODA 2015, San
Diego, CA, USA, January 4-6, 2015, pages 722–736,
2015. https://doi.org/10.1137/1.9781611973730.49
[5] Jaroslaw Byrka, Bartosz Rybicki, and Sumedha
Uniyal. An approximation algorithm for uniform ca-
pacitated k-median problem with (1 +  ) capacity vi-
olation. In Integer Programming and Combinatorial
Optimization - 18th International Conference, IPCO
2016, Liège, Belgium, June 1-3, 2016, Proceedings,
pages 262–274, 2016. https://doi.org/10.1007/978-3-
319-33461-5_22
[6] Moses Charikar and Sudipto Guha. Improved com-
binatorial algorithms for the facility location and k-
median problems. In Proceedings of the 40th Annual
IEEE Symposium on Foundations of Computer Sci-
ence (FOCS), New York, NY, USA, pages 378–388,
1999. https://doi.org/10.1137/s0097539701398594
Improved Local Search Based Approximation Algorithm for. . . Informatica 42 (2018) 401–405 405
[7] Moses Charikar, Sudipto Guha, Éva Tardos, and
David B. Shmoys. A constant-factor approximation
algorithm for the k-median problem (extended ab-
stract). In Proceedings of the Thirty-First Annual
ACM Symposium on Theory of Computing, May 1-
4, 1999, Atlanta, Georgia, USA, pages 1–10, 1999.
https://doi.org/10.1145/301250.301257
[8] Moses Charikar and Shi Li. A dependent lp-rounding
approach for the k-median problem. In Automata,
Languages, and Programming - 39th International
Colloquium, ICALP 2012, Warwick, UK, July 9-13,
2012, Proceedings, Part I, pages 194–205, 2012.
https://doi.org/10.1007/978-3-642-31594-7_17
[9] H. Gökalp Demirci and Shi Li. Constant approxima-
tion for capacitated k-median with (1 +  ) - capacity
violation. In Proceedings of the 43rd International
Colloquium on Automata, Languages, and Program-
ming, ICALP 2016, July 11-15, 2016, Rome, Italy,
pages 73:1–73:14, 2016.
[10] Madhukar R. Korupolu, C. Greg Plaxton, and
Rajmohan Rajaraman. Analysis of a local
search heuristic for facility location problems.
Journal of Algorithms, 37(1):146–188, 2000.
https://doi.org/10.1006/jagm.2000.1100
[11] Shanfei Li. An Improved Approximation Algorithm
for the Hard Uniform Capacitated k-median Problem.
In Approximation, Randomization, and Combinato-
rial Optimization. Algorithms and Techniques (AP-
PROX/RANDOM 2014), Germany, pages 325–338.
https://doi.org/10.1016/j.ejor.2014.10.011
[12] Shi Li. On uniform capacitated k-median be-
yond the natural LP relaxation. In Proceedings
of the Twenty-Sixth Annual ACM-SIAM Symposium
on Discrete Algorithms, SODA 2015, San Diego,
CA, USA, January 4-6, 2015, pages 696–707, 2015.
https://doi.org/10.1137/1.9781611973730.47
[13] Shi Li. Approximating capacitated k-median with
(1 +   )k open facilities. In Proceedings of the
Twenty-Seventh Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA 2016, Arlington, VA,
USA, January 10-12, 2016, pages 786–796, 2016.
https://doi.org/10.1137/1.9781611974331.ch56
[14] Shi Li and Ola Svensson. Approximating k-
median via pseudo-approximation. In Proceedings
of Symposium on Theory of Computing Conference
(STOC), Palo Alto, CA, USA, pages 901–910, 2013.
https://doi.org/10.1145/2488608.2488723
406 Informatica 42 (2018) 401–405 S. Grover et al.