Also available at http://amc.imfm.si
ISSN 1855-3966 (printed edn.), ISSN 1855-3974 (electronic edn.)
ARS MATHEMATICA CONTEMPORANEA 6 (2013) 83–88
On the minimum rainbow subgraph number
of a graph
Ingo Schiermeyer
Institut für Diskrete Mathematik und Algebra, Technische Universität Bergakademie
Freiberg, 09596 Freiberg, Germany
Received 19 October 2011, accepted 9 April 2012, published online 1 June 2012
Abstract
We consider the MINIMUM RAINBOW SUBGRAPH problem (MRS): Given a graph G
whose edges are coloured with p colours. Find a subgraph F ⊆ G of minimum order
and with p edges such that each colour occurs exactly once. This problem is NP-hard and
APX-hard.
For a given graph G and an edge colouring c with p colours we define the rainbow
subgraph number rs(G, c) to be the order of a minimum rainbow subgraph of G with size
p. In this paper we will show lower and upper bounds for the rainbow subgraph number of
a graph.
Keywords: Edge colouring, rainbow subgraph.
Math. Subj. Class.: 05C15, 05C35
1 Introduction and motivation
We use [2] for terminology and notation not defined here and consider finite and simple
graphs only.
Our research was motivated by the following problem from bioinformatics. The prob-
lem data consist in a set G of p genotypes g1 , g2 , . . . , gp corresponding to p individuals
in a population. Each genotype g is a vector with entries in {0, 1, 2}. Each position where
a 2 appears is called ambiguous position. For a genotype g we have to determine a pair
of haplotypes hP and hM (hP stands for the paternal haplotype and hM stands for the
maternal haplotype), which are binary vectors such that g = hP ⊕ hM .
Given two haplotypes h ′ and h ′′, their sum is defined as the vector g = h′ ⊕ h′′ with
g[i] = 0, if h′[i] = h′′[i] = 0, g[i] = 1, if h′[i] = h′′[i] = 1 and g[i] = 2, if h′[i] 6= h′′[i].
We say that a setH of haplotypes resolves G if for every g ∈ G there exist h1, h2 ∈ H
such that g = h1 ⊕ h2. Given a set G of genotypes, the haplotyping problem consists
E-mail address: Ingo.Schiermeyer@tu-freiberg.de (Ingo Schiermeyer)
Copyright c© 2013 DMFA Slovenije
84 Ars Math. Contemp. 6 (2013) 83–88
in finding a set H of haplotypes that resolves G. In the Pure Parsimony Haplotyping
problem (PPH problem) we are interested in finding a set H of smallest possible cardi-
nality. If each genotype has at most k ambiguous positions, then we denote this problem
by PPH(k). The PPH problem has been studied in ([3],[4],[7],[9]).
Matos Camacho et al. [8] have shown that the PPH(k) can be transformed to a graph
problem, the MINIMUM RAINBOW SUBGRAPH problem (MRS). Note that this edge-colou-
ring need not be proper.
Definition 1.1 (Rainbow subgraph).
Let G be a graph with an edge-colouring. A subgraph H of G is called rainbow subgraph
if H does not contain two edges of the same colour.
Definition 1.2 (Minimum Rainbow Subgraph problem (MRS)).
Given a graph G, whose edges are coloured with p colours, find a subgraph F ⊆ G of
minimum order and with p edges such that each colour occurs exactly once.
For a set G of p genotypes g1 , g2 , . . . , gp we will use p colours 1, 2, . . . , p. For each
haplotype we introduce a vertex. If two haplotypes h′ and h′′ resolve a genotype gi (gi =
h′ ⊕ h′′ ), then the corresponding vertices will be joined by an edge which receives colour
i. If a genotype is resolved by two identical haplotypes, then the corresponding vertex is
joined by an edge which is called a loop.
In this way we construct a graph G, whose edges are coloured with p colours. Note
that this is a proper edge colouring (no vertex is incident with two edges of the same
colour), since a haplotype h can be used at most once in a pair of haplotypes, which resolves
a genotype g. Furthermore, every set H of haplotypes that resolves G corresponds to a
rainbow subgraph F of G.
It has been shown in [8] that a graph G containing loops can be transformed into a
graph G′ without loops. Hence in the following we may assume that all graphs have no
loops.
Matos Camacho et al. [8] proved the MRS problem to be NP-hard and APX-hard.
In [5] it has been shown that the MRS problem remains NP-hard and APX-hard even for
graphs with maximum degree 2.
Remark: If we do not consider edge colourings, the analogous problem is known as the
(t, f(t)) dense subgraph problem ((t, f(t))-DSP), which asks whether there is a t-vertex
subgraph of a given graphGwhich has at least f(t) edges. When f(t) =
(
t
2
)
, (t, f(t))-DSP
is equivalent to the well-known t-clique problem (cf. [1]).
2 Lower bounds for the rainbow subgraph number
Definition 2.1. Let G be a graph and c be its edge colouring with p colours. The rainbow
subgraph number of G (with respect to the colouring c) is defined as the order of its mini-
mum rainbow subgraph of size p, and denoted by rs(G, c) (or rs(G), when the colouring
c is clear from the context).
Improved lower bounds for the rainbow subgraph number rs(G) will be of major
importance for the design of approximation algorithms with better approximation ratios
for the MRS problem (cf. [8, 5]). So far nothing better than the trivial lower bound
rs(G) ≥ 2p∆(G) is known. We can improve this lower bound by counting the number of
distinct colours among all edges incident to a vertex.
I. Schiermeyer: On the minimum rainbow subgraph number of a graph 85
Definition 2.2. Given an edge colouring of a graph G with colours 1, 2, . . . , p, we define
c(e) = i, if the edge e has colour i for 1 ≤ i ≤ p.
Let cd(v) (colour degree) denote the number of distinct colours among all edges incident
to the vertex v and let cd(i) = max{cd(v) | v ∈ V (G) has an incident edge with colour i}
be the maximum colour degree for every colour i, 1 ≤ i ≤ p.
Using the maximum colour degrees for all colours we can show the following improved
lower bound.
Proposition 2.3. Let G be a graph, whose edges are coloured with p colours. Then
rs(G) ≥
p∑
i=1
2
cd(i)
≥ 2p
∆(G)
.
Proof. Let F be a minimum rainbow subgraph of order k = rs(G). Then
rs(G) = k =
∑
v∈V (F )
dF (v)
dF (v)
=
∑
e=uw,e∈E(F )
1
dF (u)
+
1
dF (w)
≥
p∑
i=1
2
cd(i)
≥ 2p
∆(G)
.
The following example shows that this bound is sharp and improves the lower bound
of 2p∆(G) significantly.
Example 2.4. For p ≥ 4 and ∆ ≥ 2 let G = K1,∆ + Cp−1 (where G + H denotes the
disjoint union of two graphsG andH). All edges of the cycle Cp−1 are coloured distinctly,
say with colours 1, 2, . . . , p − 1, and all edges of K1,∆ are coloured with colour p. Then
rs(G) = p+ 1 = p− 1 + 2 =
∑p
i=1
2
cd(i) >
2p
∆(G) .
We can further improve this lower bound by counting the number of distinct colours
among all edges incident to the endvertices of an edge. For this purpose we define q(i) =
min{ 1cd(u) +
1
cd(w) | uw ∈ E(G) and c(uw) = i}.
Proposition 2.5. Let G be a graph, whose edges are coloured with p colours. Then
rs(G) ≥
p∑
i=1
q(i) ≥
p∑
i=1
2
cd(i)
≥ 2p
∆(G)
.
Proof. Let F be a minimum rainbow subgraph of order k = rs(G). For every colour i, 1 ≤
i ≤ p, let uiwi be an edge such that 1cd(ui) +
1
cd(wi)
= q(i). If uw ∈ E(F ) is an edge with
c(uw) = i, then 1cd(u) +
1
cd(w) ≥
1
cd(ui)
+ 1cd(wi) = q(i) ≥ 2 ·
1
max{cd(ui),cd(wi)} ≥
2
cd(i) .
Therefore,
rs(G) = k =
∑
v∈V (F )
dF (v)
dF (v)
=
∑
e=uw,e∈E(F )
1
dF (u)
+
1
dF (w)
≥
p∑
i=1
q(i) ≥
p∑
i=1
2
cd(i)
≥ 2p
∆(G)
.
86 Ars Math. Contemp. 6 (2013) 83–88
The following example shows that this bound is sharp and improves the previous two
lower bounds significantly.
Example 2.6. Let G ∼= K1,p for some p ≥ 2. Let the edges of G be coloured with p
colours. Then cd(i) = p and q(i) = 1 + 1p for 1 ≤ i ≤ p. Thus rs(G) = p + 1 =
p · (1 + 1p ) =
∑p
i=1 q(i) > 2 = p ·
2
p =
∑p
i=1
2
cd(i) =
2p
∆(K1,p)
.
3 Upper bounds for the rainbow subgraph number
First observe that the trivial upper bound rs(G) ≤ 2p is achieved if the rainbow subgraph
F is a matching. This upper bound has been improved towards rs(G) ≤ 2p + 1 −∆(G)
by Koch [6] for properly edge-coloured graphs and this bound is sharp. For instance, let
G = K1,∆ + (p −∆)K2, where p ≥ ∆, and all edges of G are coloured distinctly. Then
rs(G) = 2p+ 1−∆(G).
Similar to Brooks’ Theorem (cf. [2]) we can characterize all graphs achieving this
bound.
Theorem 3.1. Let G be a graph with maximum degree ∆ ≥ 2, whose edges are properly
coloured with p colours. If rs(G) = 2p+ 1−∆(G), then G has the following properties:
1. G contains a star K1,∆ with center vertex v0 and leaves v1, . . . , v∆ and G[N(v0)]
is edgeless. Let c(v0vi) = i for 1 ≤ i ≤ ∆ and H0 ∼= G[N [v0]].
2. If p > ∆, then let Hi be the subgraph spanned by the edges with colour i for ∆ +
1 ≤ i ≤ p. The subgraphs H∆+1, H∆+2, . . . ,Hp are pairwise vertex-disjoint and
V (H0) ∩ V (Hi) = ∅ for ∆ + 1 ≤ i ≤ p.
3. E(Hi, Hj) = ∅ for ∆ + 1 ≤ i < j ≤ p (where E(Hi, Hj) is the set of all edges
having one vertex in V (Hi) and the other vertex in V (Hj)).
4. E(vi, Hj) = ∅ for 1 ≤ i ≤ ∆ and ∆ + 1 ≤ j ≤ p (where E(vi, Hj) is the set of all
edges incident with vi and a vertex in V (Hj)).
5. If uv ∈ E(Hi) for some ∆ + 1 ≤ i ≤ p, then N(u) ∩N(v) = ∅.
6. N(vi) ∩N(vj) = ∅ for vi ∈ V (Hi), vj ∈ V (Hj),∆ + 1 ≤ i < j ≤ p.
Proof. 1. Suppose there is an edge vivj for some 1 ≤ i < j ≤ ∆. If c(vivj) = k for
some k with 1 ≤ k ≤ ∆, k 6= i, j, then rs(G) ≤ (∆+1)−1+(2p−2∆) = 2p−∆ <
2p + 1 −∆, a contradiction. If c(vivj) = k for some k with ∆ + 1 ≤ k ≤ p, then
rs(G) ≤ (∆ + 1) + (2p− 2∆− 2) = 2p−∆− 1 < 2p+ 1−∆, a contradiction as
well.
2. Suppose there are integers i, j with ∆ + 1 ≤ i < j ≤ p and two adjacent edges
e, f with c(e) = i, c(f) = j. Then rs(G) ≤ (∆ + 1) + (2p − 2∆ − 1) = 2p −
∆ < 2p + 1 − ∆, a contradiction. Suppose there are integers i, j with 1 ≤ i ≤
∆,∆ + 1 ≤ j ≤ p and two adjacent edges e, f with c(e) = i, c(f) = j. Then
rs(G) ≤ (∆ + 1) + (2p−2∆−1) = 2p−∆ < 2p+ 1−∆, a contradiction as well.
3. Suppose there is an edge vivj with vi ∈ V (Hi), vj ∈ V (Hj),∆ + 1 ≤ i < j ≤ p.
Then c(vivj) = k for some 1 ≤ k ≤ ∆. Hence rs(G) ≤ (∆+1)−1+(2p−2∆) =
2p−∆ < 2p+ 1−∆, a contradiction.
4. Suppose there is an edge vivj for two vertices vi ∈ V (H0) and vj ∈ V (Hj),∆+1 ≤
j ≤ p. Then rs(G) ≤ (∆+1)+(2p−2∆−1) = 2p−∆ < 2p+1−∆, a contradiction.
I. Schiermeyer: On the minimum rainbow subgraph number of a graph 87
5. Suppose there is an edge uv ∈ E(Hi) for some ∆+1 ≤ i ≤ p with N(u)∩N(v) 6=
∅. By 3. and 4. we conclude that N(u) ∩ N(v) ∩ V (H0) = ∅. Furthermore, for a
vertex w ∈ N(u)∩N(v), we have c(uw) = j, c(vw) = k for some 1 ≤ j < k ≤ ∆.
Then rs(G) ≤ (∆+1)−2+(2p−2∆+1) = 2p−∆ < 2p+1−∆, a contradiction.
6. SupposeN(vi)∩N(vj) 6= ∅ for two vertices vi ∈ V (Hi), vj ∈ V (Hj),∆+1 ≤ i <
j ≤ p.By 3. and 4. we conclude thatN(vi)∩N(vj)∩V (H0) = ∅. Furthermore, for a
vertexw ∈ N(vi)∩N(vj),we have c(uw) = k, c(vw) = l for some 1 ≤ k < l ≤ ∆.
Then rs(G) ≤ (∆+1)−2+(2p−2∆+1) = 2p−∆ < 2p+1−∆, a contradiction.
Another upper bound for the rainbow subgraph number follows from an approach pre-
sented in [8]. Observe that two adjacent edges of different colours together have three
vertices, whereas two edges of different colours in a matching have four vertices. Based on
this obervation the following algorithm has been proposed in [8].
Algorithm
Input: A graph G of order n whose edges are coloured with p colours
1. Construct a graph G′ with V (G′) = {v1, v2, . . . , vp} (vi corresponds to colour i)
and vivj ∈ E(G′) if there exist two adjacent edges e, f ∈ E(G) with c(e) = i and
c(f) = j (c(x) denotes the colour of the edge x).
2. Now compute a maximum matching M of order β(G′) in G′. This can be done in
polynomial time.
3. Next construct a graph H with V (H) ⊆ V (G) as follows: For each matching edge
of M choose two adjacent edges in G with these two colours. For each vertex of
V (G′) not in M choose an edge in G with this colour. In this way we obtain a
rainbow subgraph H ⊆ G with |E(H)| = p.
Correctness of the algorithm: Edges of the matching correspond to pairs of adjacent edges
in the original graph. Colours that are left out by this procedure are added greedily at the
end.
Claim 3.2. |V (H)| ≤ 2p− β(G′)
Proof. For each matching edge of G′ three vertices appear in H. Hence
|V (H)| ≤ 3β(G′) + 2(p− 2β(G′)) = 2p− β(G′)
Corollary 3.3. rs(G) ≤ 2p− β(G′).
Acknowledgement
We thank the referees for some valuable comments.
88 Ars Math. Contemp. 6 (2013) 83–88
References
[1] Y. Asahiro, R. Hassin and K. Iwama, Complexity of finding dense subgraphs, Discrete Appl.
Math. 121 (2002), 15–26.
[2] J. A. Bondy and U.S.R. Murty, Graph Theory, Springer, London, 2008.
[3] D. Catanzaro and M. Labbé, The pure parsimony haplotyping problem: overview and computa-
tional advances, International Transactions in Operational Research 16 (2009), 561–584.
[4] D. Gusfield, Haplotype inference by pure Parsimony, in: Proceedings ot the 14th annual confer-
ence on Combinatorial pattern matching (CPM’03), Springer, Berlin, Heidelberg, 2003, 144–
155.
[5] J. Katrenič and I. Schiermeyer, Improved approximation bounds for the minimum rainbow sub-
graph problem, Inf. Process. Lett. 111 (2011), 110–114.
[6] M. Koch, Das Population-Haplotyping-Problem: Graphentheoretische Ansätze, Diploma thesis,
TU Bergakademie Freiberg, 2008.
[7] G. Lancia, M. C. Pinotti and R. Rizzi, Haplotyping Populations by Pure Parsimony: Complexity
of Exact and Approximation Algorithms, INFORMS Journal on Computing 16 (2004), 348–359.
[8] S. Matos Camacho, I. Schiermeyer and Z. Tuza. Approximation algorithms for the minimum
rainbow subgraph problem, Discrete Math. 310 (2010), 2666–2670.
[9] L. S. Wang, Hapolotype inference by maximum parsimony, Bioinformatics 19 (2003), 1773–
1780.