Also available at http://amc.imfm.si
ISSN 1855-3966 (printed edn.), ISSN 1855-3974 (electronic edn.)
ARS MATHEMATICA CONTEMPORANEA 2 (2009) 77–92
Vulnerability bounds on the number of
spanning tree leaves
Gábor Salamon ∗
Department of Computer Science and Information Theory
Budapest University of Technology and Economics
1117 Budapest, Magyar tudósok körútja 2., Hungary
Received 28 September 2008, accepted 3 February 2008, published online 6 April 2009
Abstract
Hamiltonicity and vulnerability of graphs are in a strong connection. A basic necessary
condition states that a graph containing a 2-leaf spanning tree (that is, a Hamiltonian path)
cannot be split into more than k + 1 components by deleting k of its vertices. In this paper
we consider a more general approach and investigate the connection between the number
of spanning tree leaves and two vulnerability parameters, namely scattering number sc(G)
[10] and cut-asymmetry ca(G) [16]. We prove that any spanning tree of a graph G has at
least sc(G)+1 leaves. We also show that if X ⊂ V is a maximum cardinality independent
set of G = (V,E), such that the elements of X are all leaves of a particular spanning tree
then |X| = ca(G) + 1 = |V | − cvc(G), where cvc(G) is the size of a minimum connected
vertex cover of G. As a consequence we obtain a new proof for the following results: any
spanning tree with independent leaves provides a 2-approximation for both the MAXIMUM
INTERNAL SPANNING TREE [16] and the MINIMUM CONNECTED VERTEX COVER [17]
problems. We also consider the opposite point of view by fixing the number of leaves to q
and looking for a q-leaf subtree of G that spans a maximum number of vertices. Bermond
[2] proved that a 2-connected graph on n vertices always contains a path (a 2-leaf subtree)
of length min {n, δ2}, where δ2 is the minimum degree-sum of a 2-element independent
set. We generalize this result to obtain a sufficient condition for the existence of a large
q-leaf subtree.
Keywords: Spanning tree leaves, vulnerability, Hamiltonian path.
Math. Subj. Class.: 05C38, 05C85
∗Research is supported by Grant No. 67651 of the Hungarian National Science Fund (OTKA).
E-mail address: gsala@cs.bme.hu (Gábor Salamon)
Copyright c© 2009 DMFA
78 Ars Math. Contemp. 2 (2009) 77–92
1 Introduction
Hamiltonicity and vulnerability of graphs are closely related [5]. The most basic necessary
condition of hamiltonicity claims that a graph containing a Hamiltonian path cannot be
split into more than k + 1 components by deleting k of its vertices. This gives the inten-
tion for examining the existence of Hamiltonian cycles and paths as a function of different
vulnerability parameters, such as e.g. toughness or scattering number. For a survey on this
topic the reader is referred to [1]. A closely related approach is to generalize hamiltonicity
to explore bounds on the length of the longest cycle or path of a graph. As an example
we mention here a generalization of Ore’s theorem [13] given by Bermond [2]. The length
of the longest cycle in a 2-connected graph G on n vertices is at least min {n, δ2(G)},
where δ2(G) is the minimum degree of a 2-element independent set of G. As an imme-
diate consequence, a 2-connected graph G always has a path (a 2-leaf subtree) of length
min {n, δ2(G)}. In Section 6 we generalize the result of Bermond proving that if G is
an arbitrary graph on n vertices then there exists a q-leaf subtree of G spanning at least
min {n, ρq,2 + q − 1} vertices, where ρq,2 is the minimum degree-sum of the two highest
degree vertices of a q-element independent set. This is also a generalization of a result
of Broersma and Tuinstra [4] that gives a sufficient degree condition on the existence of a
q-leaf spanning tree.
The HAMILTONIAN PATH problem is among the first ones turned to be NP-hard [7].
However, its practical importance needs efficient approximation algorithms and heuristics
for finding different generalizations of Hamiltonian paths.
Several problems of this kind were investigated. Karger et al. [11] dealt with the
LONGEST PATH problem and proved that there is no constant factor approximation algo-
rithm unless P=NP. Furer and Raghavachari [6] considered the MINIMUM DEGREE SPAN-
NING TREE problem and gave an algorithm that approximates the optimal solution within
an additive constant of 1. Gargano et al. [9] looked at the MINIMUM BRANCHING SPAN-
NING TREE problem, a degree-based generalization of Hamiltonian paths. They proved
that it is NP-complete to decide whether a spanning tree with at most k branchings (ver-
tices of degree at least 3) exists (for any fix k). However, the approximability properties of
this approach are still undiscovered.
In this paper we consider the following optimization problems:
Problem 1: MINIMUM LEAF SPANNING TREE
Input: A connected graph G.
Goal: Find a spanning tree of G with a minimum number of leaves (1-degree vertices).
Problem 2: MAXIMUM INTERNAL SPANNING TREE
Input: A connected graph G.
Goal: Find a spanning tree ofG with a maximum number of internal vertices (non-leaves).
Clearly, both of these problems are generalizations of the HAMILTONIAN PATH prob-
lem, as a Hamiltonian path is, on one hand, a 2-leaf spanning tree, and on the other hand,
a spanning tree with n − 2 internal vertices. From the optimization point of view, these
problems are equivalent as only the cost function is complemented by counting internal
nodes instead of leaves.
However, the two problems have different approximability behavior. Using the result of
Karger et al. [11], Lu and Ravi [12] proved that there is no constant factor approximation
for the MINIMUM LEAF SPANNING TREE problem unless P=NP.
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 79
Complementing the cost function yields a better situation as the MAXIMUM INTERNAL
SPANNING TREE problem can be constant-factor approximated. A 2-approximation can
be done in linear time [16] and a 7/4-approximation in O(n4) time [15]. Both algorithms
are based on finding a spanning tree whose leaf-set has a large independent subset. This
pushes us to examine the spanning trees with many independent leaves. Our approach uses
vulnerability parameters to bound the number of leaves and the number of independent
leaves in a spanning tree.
We mention here that Prieto and Sloper [14] investigated the MAXIMUM INTERNAL
SPANNING TREE problem from a parameterized complexity point of view. They gave an
O(n3) kernel showing FPT-membership of the problem.
Besides dealing with the MAXIMUM INTERNAL SPANNING TREE problem, we also
consider the following problem:
Problem 3: MINIMUM CONNECTED VERTEX COVER
Input: A connected graph G.
Goal: Find a minimum size subset X of V (G) such that X spans a connected subgraph
and meets each edge of G.
Savage gave a 2-approximation algorithm for this problem [17]. We prove that any
spanning tree with independent leaves provides a 2-approximation for both the MAXIMUM
INTERNAL SPANNING TREE and the MINIMUM CONNECTED VERTEX COVER problems.
Although this does not improve directly the current approximation ratios, we believe that
our new approach leads to better approximations in the future.
The rest of the paper is organized as follows. In Section 3 we show a connection
between scattering number and the minimum number of leaves. In Section 4 we prove that
cut-asymmetry exactly determines the size of a maximum independent subset of leaves.
Several properties of cut-asymmetry are also presented, among others a sufficient condition
for traceability. Namely, we prove that if ca(G) ≤ 1 then G has a Hamiltonian path. We
believe that this result has its own interest as cut-asymmetry and scattering number are
very similar graph parameters and sc(G) ≤ 1 is a well-known necessary condition of
traceability. The algorithmic aspects of our results are presented in Section 5. We show
how our approach can be used to directly obtain linear-time 2-approximation algorithms
for both the MAXIMUM INTERNAL SPANNING TREE and the MINIMUM CONNECTED
VERTEX COVER problems. In Section 6 we deal with large q-leaf subtrees of the input
graph (given a fix parameter q). We provide a sufficient degree-based lower bound on the
number of nodes that can be spanned by a q-leaf subtree.
2 Notation
By a graph G = (V,E) we mean an undirected simple connected graph on n = |V |
vertices. We let dG(v) (or d(v)) denote the degree of vertex v in G, and for a set X
of vertices dG(X) =
∑
v∈X dG(v). We use eG(X,Y ) to denote the number of edges
between vertex disjoint sets X and Y of V (G). NG(v) is the set of neighbors of a vertex
v. Let Vi(G) (V≥i(G)) be the set of vertices having degree exactly i (at least i). Let G[X]
denote the subgraph of G induced by X . A vertex-set X is called independent if G[X] has
no edges. α(G) stands for the size of a maximum cardinality independent set of G. The
number of components of a subgraph H is denoted by comp(H).
Let T be a spanning tree of G. The edges of T are called tree-edges, all other edges
of G are non-tree edges. We denote by PT (x, y) the unique path in T between vertices x
80 Ars Math. Contemp. 2 (2009) 77–92
and y. A vertex in V1(T ) is called a leaf, a vertex in V2(T ) is called a forwarding vertex,
and a vertex in V≥3(T ) is called a branching. Branchings and forwarding vertices are the
internal vertices of T . For a leaf l of T the branching being closest to l is denoted by
b(l). If (l, u) is a non-tree edge then u→l is the neighbor of u along the path PT (u, l). The
path PT (l, b(l)) is called the branch of l, and b−(l) is the neighbor of b(l) in PT (l, b(l)).
The tree-edges that are not in branches are called trunk-edges. If the leaves of T form an
independent set of G then T is called an independence tree.
G is called traceable if it has a Hamiltonian path. The minimum number of leaves
of a spanning tree of G is denoted by ml(G). (Note that ml(G) = 2 if and only if G is
traceable.)
For the sake of simplicity we use X + x, and X − x instead of X ∪ {x}, and X \ {x},
respectively. Similarly, if X is a vertex-set of G = (V,E) then G−X denotes G[V −X].
3 The Scattering Number and the Minimum Number of Leaves
Scattering number is a vulnerability parameter which measures how much “structural dam-
age” can be caused by removing some “important” vertices of the graph [10, 18]. It is
defined as follows:
Definition 3.1. [10] The scattering number of a non-complete graph G = (V,E) is
sc(G) = max
X⊂V,X 6=∅
{comp(G[V \X])− |X| : comp(G[V \X]) ≥ 2} .
By definition, sc(Kn) = −∞.
This definition immediately implies that sc(G) ≤ 1 if G is traceable. This can be
interpreted as follows:
Proposition 3.2. If G has a 2-leaf spanning tree (i.e. a Hamiltonian path) then ml(G) ≥
sc(G) + 1.
We show that every spanning tree of an arbitrary graph G has at least sc(G) + 1 leaves.
Firstly, we prove this for trees.
Lemma 3.3. Let T be a tree with q leaves. Then q ≥ sc(T ) + 1.
Proof. To prove the upper bound let X = {x1, x2, . . . , xk} be the vertex-set providing
the maximum in Definition 3.1. If several such sets exist, let us choose one of minimum
cardinality. Thus by definition sc(T ) = comp(T [V \ X]) − |X|. Moreover, each vertex
xj ∈ X is a branching of T , that is d(xj) ≥ 3 (for j = 1, 2, . . . , k). Otherwise, if some
xj ∈ X had degree less than 3 then for X ′ = X − xj we would have comp(T [V \X ′]) ≥
comp(T [V \X])− 1, and so we should have chosen X ′ instead of X .
Now let X0 = ∅ and let Xj = Xj−1 + xj (for j = 1, 2, . . . , k). Furthermore let
Tj = T [V \ Xj ] (for j = 0, 1, . . . , k). That is, Tj is the forest obtained from T by
removing the vertices of Xj . Then comp(Tj) ≤ comp(Tj−1) + d(xj)− 1, and hence
comp(Tk) = comp(T [V \X]) ≤ 1 +
∑
xj∈X
d(xj)− |X|. (3.1)
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 81
On the other hand, let Y denote the set of branchings not in X , namely Y = V≥3(T ) \
X . Then counting the vertices of T according to their degree we obtain:
|V (T )| = q + |V2(T )|+ |X|+ |Y |
= q + |V2(T )|+
∑
i≥3
|Vi(T ) ∩X|+
∑
i≥3
|Vi(T ) ∩ Y |. (3.2)
Counting the total degree of vertices in T we have:
2|V (T )| − 2 = q + 2|V2(T )|+
∑
i≥3
i|Vi(T ) ∩X|+
∑
i≥3
i|Vi(T ) ∩ Y |, (3.3)
Equations (3.2) and (3.3) yield∑
xj∈X
d(xj) =
∑
i≥3
i|Vi(T ) ∩X| =
∑
i≥3
(i− 2)|Vi(T ) ∩X|+ 2|X| =
q − 2−
∑
i≥3
(i− 2)|Vi(T ) ∩ Y |+ 2|X| ≤ q − 2 + 2|X|.
Using (3.1) this implies
sc(T ) = comp(T [V \X])− |X| ≤ q − 1
proving the upper bound on the scattering number.
The following theorem generalizes Lemma 3.3 for arbitrary graphs. Note that Theorem
3.4 is also a generalization of Proposition 3.2.
Theorem 3.4. If G is a connected graph then ml(G) ≥ sc(G) + 1.
Proof. Let T be a minimum leaf spanning tree ofG, that is |V1(T )| = ml(G), and letX be
a set maximizing comp(G[V \X])− |X|. Then as comp(G[V \X]) ≤ comp(T [V \X]),
by Lemma 3.3 we have:
sc(G) = comp(G[V \X])− |X| ≤ comp(T [V \X])− |X| ≤
sc(T ) ≤ |V1(T )| − 1 = ml(G)− 1.
The scattering number of a tree T can also be used to form an upper bound on the
number of leaves of T as the following theorem shows.
Theorem 3.5. Let T be a q-leaf tree on at least 3 vertices. Then q ≤ 2 sc(T ).
Proof. We construct a subset X of internal vertices of T such that each component of
T [V \X] contains at most one leaf of T . Then we prove that |X| ≤ q/2. Combining these
with the definition of scattering number directly yields the theorem.
Let us call a spanning tree that has at most one branching a spider. We build a sequence
{T1 = T, T2, . . . , Tk} of trees such that Tk is a spider. If, for some i, Ti is not a spider
we construct Ti+1 as follows: let xi be a branching of Ti with exactly one trunk-edge
82 Ars Math. Contemp. 2 (2009) 77–92
l
xi = b(l)
yi
TiTi+1
Figure 1: Constructing Ti+1 from Ti by removing the branching xi
(xi, yi) incident to it (such an xi exists if Ti is not a spider). Let T ′i be the component
of Ti[V (Ti) − xi] that contains yi. If yi is not a leaf of T ′i then Ti+1 = T ′i . Otherwise,
Ti+1 is obtained by deleting the branch of yi from T ′i (Figure 1). Observe that the leaves
of Ti+1 form a subset of leaves of Ti, and that a leaf l is in V1(Ti) \ V1(Ti+1) if and only if
b(l) = xi.
If Ti is a spider (thus i = k) then we define xk to be the branching of the spider. (If
Tk is a path then xk is chosen to be any internal vertex of Tk.) Let X = {x1, x2, . . . , xk}.
Observe that this construction ensures that each leaf of T is in a different component of
T [V \X], thus q ≤ comp(T [V \X]).
To see that q ≥ 2|X| let bi denote the number of those branches of Ti that end in xi.
Then bi = dTi(xi)− 1 and bi ≥ 2 for 1 ≤ i ≤ k − 1. This implies that
k∑
i=1
bi ≥ 2k − 1.
As xi cuts down exactly bi leaves from Ti, and as Tk is a spider with bk + 1 leaves we
have |V1(Ti+1)| = |V1(Ti)| − bi, and |V1(Tk)| = bk + 1. Putting these together gives
q =
k∑
i=1
bi + 1 ≥ 2k = 2|X|.
Thus
sc(T ) ≥ comp(T [V \X])− |X| ≥ q − q
2
=
q
2
,
which finishes the proof.
Notice that the bounds of Lemma 3.3 and Theorem 3.5 are sharp. On one hand, if
T1 is a q-leaf spider then |V1(T1)| = q = sc(T1) + 1. On the other hand, let T2 be a
tree on vertices V (T2) = {v1, v2, . . . , vk, u1, u2, . . . , uk, w1, w2, . . . , wk} having edges
E(T2) = {(vi, vi+1)}i=1,2,...,k−1 ∪ {(vi, ui)}i=1,2,...,k ∪ {(vi, wi)}i=1,2,...,k (see Figure
2). Then sc(T2) = k =
|V1(T2)|
2 .
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 83
v1
u1
w1
u2
v2
w2 wk
vk
uk
Figure 2: A graph proving that the bound of Theorem 3.5 is sharp.
These examples show that a difference of factor 2 can arise between the number of
leaves and the scattering number of trees. In Section 4 we present cut-asymmetry as a
vulnerability parameter which exactly determines the number of leaves when applied on a
tree.
Finally, we give two basic properties of the minimum leaf spanning trees. They are
used throughout the next sections.
Lemma 3.6. If T is a minimum leaf spanning tree of G then T is either a Hamiltonian
path or an independence tree.
Proof. Suppose for a contradiction that T is not a Hamiltonian path and there exist two
leaves l1, l2 of T such that (l1, l2) ∈ E(G). The spanning tree T ′ with edge-set E(T ′) =
E(T ) + (l1, l2)− (b(l1), b−(l1)) has less leaves than T contradicting the minimality of T
(Figure 3(a)).
Recall that if T is a tree then x→l1 denotes the successor of x along the path PT (x, l1).
Lemma 3.7. Let G be a non-traceable graph, and T be a minimum leaf spanning tree of
G. Let l1 be an arbitrary leaf of T , and (l1, x) ∈ E(G) \ E(T ) be a non-tree edge. Then
dT (x→l1) = 2 and if l2 6= l1 is a leaf of T then (x→l1 , l2) 6∈ E(G).
Proof. Let us consider the spanning tree T ′ with edge-set E(T ′) = E(T ) + (l1, x) −
(x→l1 , x). Observe that x→l1 is an internal vertex of T . If dT (x→l1) > 2 then V1(T ′) =
V1(T )− l1 and so T ′ has less leaves than T , a contradiction (Figure 3(b)). If dT (x→l1) = 2
then V1(T ′) = V1(T )−l1+x→l1 , that is T and T ′ has the same number of leaves. However,
the edge (x→l1 , l2) connects two leaves of T ′ and thus (by Lemma 3.6) neither T ′ nor T is
a minimum leaf spanning tree (Figure 3(c)).
We have the following trivial corollary of Lemma 3.6:
Corollary 3.8. If G is a non-complete connected graph then ml(G) ≤ α(G).
4 The Cut-asymmetry and the Maximum Number of Independent
Leaves
In the previous section we have seen that scattering number provides both lower and upper
bounds on the number of leaves of a tree. In this section we use another vulnerability
measure, namely cut-asymmetry. Recall that scattering number shows how much structural
84 Ars Math. Contemp. 2 (2009) 77–92
l1
b−(l1)
b(l1)
l2
(a)
x→l1
x
l1
(b)
l1
x→l1
l2
x
(c)
Figure 3: Decreasing the number of leaves by local improvements
damage can be caused by removing some individual vertices from the graph. We define cut-
asymmetry in a similar way but we count the connected subgraphs (instead of individual
vertices) to be removed. We prove that this model exactly characterizes the number of
leaves in a tree and can be used to determine the maximum number of independent leaves
of a spanning tree of an arbitrary graph. Besides, we point out a connection between cut-
asymmetry and the minimum size of a connected vertex cover.
Let us recall the definition of cut-asymmetry.
Definition 4.1. [16] The cut-asymmetry of a graph G = (V,E) is
ca(G) = max
X⊂V,X 6=∅
{comp(G[V \X])− comp(G[X])} .
This definition immediately implies that ca(G) ≥ max {sc(G), 0}, and that ca(G) ≤
n − 2 (equality holds if and only if G is a star). We mention here some further properties
of cut-asymmetry. The following theorem is a consequence of Theorems 4.11 and 4.15,
however a direct proof is also given below.
Theorem 4.2. ca(G) = 0 if and only if G is either a complete graph or a cycle.
Proof. If G is complete or a cycle then its cut-asymmetry is trivially 0. To see the other
direction let G be a non-complete graph such that ca(G) = 0. Let Z = {z1, z2, . . . , zk}
be an arbitrary independent set. As ca(G) = 0 the graph G[V \ Z] has k components
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 85
C1, C2, . . . , Ck. Let us contract each component Ci to a single vertex ci. By this transfor-
mation we obtain the connected bipartite graph G′ with classes Z and C = {c1, c2, . . . ,
ck}.
Proposition 4.3. G′ is an even cycle.
Proof. For every 1 ≤ j ≤ k we have dG′(zj) = 2. Otherwise if dG′(zj) = 1 for some j
then comp(G[Z−zj ]) = k−1 and comp(G[V \(Z−zj)]) = k would imply ca(G) ≥ 1. If
dG′(zj) ≥ 3 for some j then comp(G[Z−zj ]) = k−1 and comp(G[V \(Z−zj)]) ≤ k−2
would imply ca(G) ≥ 1. A similar reasoning shows that dG′(cj) = 2 for each 1 ≤ j ≤ k.
As a result G′ is a 2-regular connected bipartite graph, that is, an even cycle.
Proposition 4.4. If vertex v is in the component Ci (for some i) such that |V (Ci)| ≥ 2
then v has at most one neighbor in Z (in graph G).
Proof. Suppose that zj1 ∈ Z and zj2 ∈ Z are both neighbors of such a v. Then
comp(G[Z + v]) = k − 1 and comp(G[V \ (Z + v)]) ≥ k
implies ca(G) ≥ 1, a contradiction.
Proposition 4.5. There is at most one edge in G between vertex zj ∈ Z and component Ci
for all i and j.
Proof. Suppose that there exist vertices u, v ∈ Ci and a vertex zj ∈ Z (for some i and j),
such that both (u, zj) and (v, zj) are edges of G. Then by Proposition 4.4, comp(G[Z −
zj+u]) = k as u has no other neighbors in Z. We also have comp(G[V \ (Z−zj+u)]) =
k − 1 as zj is neighboring to some other component Ci′ . This implies ca(G) ≥ 1, a
contradiction.
As a result of Propositions 4.3 and 4.5 we obtain that exactly two edges ofG leave each
component Ci. For a fixed i let these edges be (zj1 , ui) and (zj2 , vi) (ui, vi ∈ V (Ci)).
(Note that, by Proposition 4.4, ui = vi if and only if Ci has a single vertex.) Suppose that
Ci has a vertex xi such thatCi−xi has a ui−vi path. In this case comp(G[Z−zj1 +xi]) =
k using Propositions 4.3 and 4.4, and comp(G[V \ (Z − zj1 + xi)]) = k − 1 implying
ca(G) ≥ 1, a contradiction. We conclude that component Ci is either a single vertex
or a simple path connecting ui and vi. This, together with Proposition 4.3, proves the
theorem.
Let us recall that sc(G) ≤ 1 is a necessary condition for the existence of a Hamiltonian
path. The following theorem shows a sufficient condition for traceability by means of cut-
asymmetry. Its proof is postponed here, as the theorem is a direct corollary of Theorems
4.11 and 4.15.
Theorem 4.6. If ca(G) ≤ 1 then G has a Hamiltonian path.
Unfortunately the traceability of G does not imply a low value of ca(G) as shown by
the following theorem.
Theorem 4.7. If G = (V,E) is a traceable graph on n vertices then ca(G) ≤ bn−12 c.
Moreover, if k is an arbitrary integer such that 0 ≤ k ≤ bn−12 c then there exists a traceable
graph G on n vertices for which ca(G) = k.
86 Ars Math. Contemp. 2 (2009) 77–92
Proof. Property 3.2 implies that sc(G) ≤ 1, that is, for any proper subset X ⊂ V we have
comp(G[V \X]) ≤ |X|+1. As comp(G[V \X]) ≤ n−|X|we obtain comp(G[V \X]) ≤
bn+12 c. Thus ca(G) ≤ b
n−1
2 c.
To see the second part let us fix k ≤ bn−12 c. A graph on n vertices whose cut-
asymmetry is k is constructed as follows. If k = 0 then according to Theorem 4.2 the
graph G = Kn is a good example. So we suppose that k ≥ 1. Let P = x1, x2, . . . , xn
be a path of length n. Now we obtain G by adding the edges of the complete graph on
x2, x4, . . . , x2k to P . It is easy to check that ca(G) = k.
The following result of Salamon and Wiener determines the cut-asymmetry of a tree.
Lemma 4.8. [16] If T is a tree then |V1(T )| = ca(T ) + 1.
Thus for any tree T we obtain
sc(T ) + 1 ≤ |V1(T )| = ca(T ) + 1 ≤ 2 sc(T ).
For a connected graph G we have proved so far
sc(G) + 1 ≤ min {ml(G), ca(G) + 1} .
In what follows we prove that ml(G) ≤ ca(G)+1 holds for all graphs but the complete
graph Kn and the cycle Cn. For this purpose we introduce a new parameter called leaf-
independence.
Definition 4.9. Let G be any connected graph and T be a spanning tree of G. The leaf-
independence of T in G (denoted by liG(T )) is the cardinality of a maximum independent
subset of V1(T ). The leaf independence li(G) of the graph G is the maximum of liG(T )
over all spanning trees of G.
It is important to point out that a spanning tree T can have more than liG(T ) leaves.
Indeed, liG(T ) = |V1(T )| if and only if T is an independence tree. Note that li(G) is
defined for all connected graphs, however, not all connected graphs have an independence
tree. Böhme et al. [3] characterized the graphs that have no independence trees. (These
graphs are the complete graph, the cycle and the complete bipartite graph Kn,n.) If an
independence tree with αt leaves exists then obviously li(G) ≥ αt. The following conjec-
ture would imply that leaf independence is a generalization of the concept of independence
trees:
Conjecture 4.10. If a graph G has an independence tree then G has an independence tree
with li(G) leaves.
The definition of leaf-independence directly yields li(G) ≤ α(G) whenever G has at
least 3 vertices. Also, the leaf-independence of a tree T (on at least 3 vertices) is equal to
the number of its leaves. Thus, by Lemma 4.8, ca(T ) = li(T )− 1. The following theorem
states that this equality holds for any connected graph.
Theorem 4.11. Let G be a connected graph. Then ca(G) = li(G)− 1.
Proof. Let T be a spanning tree of G, such that liG(T ) = li(G). Let Z be a maximum
independent set of V1(T ), and let X = V \ Z. Then on one hand
comp(G[V \X]) = |V \X| = li(G),
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 87
and on the other hand
1 ≤ comp(G[X]) ≤ comp(T [X]) = 1.
Therefore
ca(G) ≥ comp(G[V \X])− comp(G[X]) = li(G)− 1.
To prove the reverse direction let us choose X∗ to be the set giving the maximum in
Definition 4.1. If several such sets exist we take one of maximum cardinality. First we
show that in this case V \ X∗ is an independent set of G. Assume that this is not the
case, and there is an edge (u, v) in G[V \X∗]. Then considering X ′ = X∗ + u we have
comp(G[X ′]) ≤ comp(G[X∗]), and comp(G[V \X ′]) ≥ comp(G[V \X∗]). This forms a
contradiction as comp(G[V \X ′])−comp(G[X ′]) ≥ comp(G[V \X∗])−comp(G[X∗]),
and |X ′| > |X∗|.
In the follows we show that X∗ spans a connected subgraph of G. Suppose for a
contradiction that G[X∗] has more than one components. Then, since V \ X∗ is an
independent set, there must exist two components C1 and C2 of G[X∗], and a vertex
u ∈ V \X∗ such that G[V (C1)∪ V (C2) + u] is connected. Hence for the independent set
V \X ′ = (V \X∗)− u we have:
comp(G[V \X ′]) = |V \X ′| = |V \X∗| − 1 = comp(G[V \X∗])− 1,
and
comp(G[X ′]) ≤ comp(G[X∗])− 1.
Thus comp(G[V \X ′])− comp(G[X ′]) ≥ comp(G[V \X∗])− comp(G[X∗]), that is a
contradiction as |X ′| > |X∗|.
To finish the proof let T ∗ be a spanning tree of G[X∗]. We then connect each vertex of
V \ X∗ to T ∗. Thus we obtain a spanning tree of G in which all vertices of V \ X∗ are
leaves. Thus
li(G) ≥ |V \X∗| = comp(G[V \X∗])− comp(G[X∗]) + 1 = ca(G) + 1,
using that V \X∗ is independent and that X∗ spans a connected subgraph of G.
In proof of Theorem 4.11 we have seen that there exists an independent setZ∗ = V \X∗
such that G[V \ Z∗] is connected and that comp(G[Z∗]) − comp(G[V \ Z∗]) = ca(G).
This yields us an alternative definition of cut-asymmetry, that is
Corollary 4.12. Let Z be a maximum size independent set of V for which G[V \ Z] is
connected. Then ca(G) = |Z| − 1.
Recall that a connected vertex cover ofG is a vertex-set that spans a connected subgraph
and that meets all edges of G. Let us denote by cvc(G) the size of a minimum cardinality
connected vertex cover of G. The set Z∗ = V \X∗ of the above proof is a maximum-size
vertex-set that is independent and whose complementX∗ spans a connected subgraph. This
means that X∗ is a minimum-size connected vertex cover of G, and proves the following:
Corollary 4.13. For any connected graph G on n vertices li(G) + cvc(G) = n.
88 Ars Math. Contemp. 2 (2009) 77–92
Note that actually the above arguments also show that the complement of a minimum
connected vertex cover is always a maximum size independent subset of leaves of a span-
ning tree.
A minimum connected vertex cover is NP-hard to compute [8]. Thus Corollary 4.13
and Theorem 4.11 imply that it is also NP-hard to compute both cut-asymmetry and leaf-
independence.
In what follows we show that the leaf-independence provides an upper bound on the
minimum number of leaves in almost all graphs. By Lemma 3.6 we obtain that if G has no
Hamiltonian path then the leaves of any minimum leaf spanning tree of G are independent.
In this case ml(G) ≤ li(G) is straightforward. If G is traceable then ml(G) = 2 and so
ml(G) ≤ li(G) is true if and only if G has a spanning tree with at least two independent
leaves. We use a result of Böhme et al. to show that this condition is satisfied whenever G
is neither the complete graph Kn nor the cycle Cn.
Theorem 4.14. [3] If G is traceable and each Hamiltonian path of G is the part of a
Hamiltonian cycle then G is either the complete graph Kn or the cycle Cn or the complete
bipartite graph Kn,n.
This theorem immediately implies that li(G) ≥ 2 if G is not isomorphic to Kn, Cn
or Kn,n. It is easy to see that for these special graphs we have li(Kn) = li(Cn) = 1 and
li(Kn,n) = n− 1. Thus we obtain the following theorem.
Theorem 4.15. Let G be any connected graph but the complete graph Kn and the cycle
Cn. Then ml(G) ≤ li(G).
Summarizing the above results we have
Corollary 4.16. For any connected graph G but Kn and Cn:
sc(G) + 1 ≤ ml(G) ≤ li(G) = ca(G) + 1 ≤ α(G)
In the next section we use these inequalities to prove approximation ratios on two NP-
hard optimization problems.
5 Algorithmic Aspects
In this section we consider two NP-hard optimization problems. The MAXIMUM INTER-
NAL SPANNING TREE problem [16] is about finding a spanning tree of the input graph
that maximizes the number of internal vertices. The MINIMUM CONNECTED VERTEX
COVER problem [17] is about finding a minimum size subset of vertices that spans a con-
nected subgraph and meets all edges of the graph. Both cited papers give a linear-time 2-
approximation algorithm for their problem under investigation. The common point is that
firstly an independence tree is found by a slight modification of depth first search. Then the
authors prove that such a tree provides a 2-approximation for their problem. It turned out
that this property of independence trees is a direct consequence of Corollary 4.16. Thus
we can hope that the approximation ratio of 2 can be improved in the future for some spe-
cial graph classes by further investigating the graph parameters present in Corollary 4.16.
Currently, the best known approximation factors are 2 for the MINIMUM CONNECTED
VERTEX COVER problem [17], and 7/4 for the MAXIMUM INTERNAL SPANNING TREE
problem [15].
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 89
We need the following proposition to show that independence trees provide 2-approxi-
mation for both problems.
Proposition 5.1. If G is a graph on n vertices then 2(n− α(G)) ≥ n− sc(G)
Proof. Let X be an independent set of size α(G). Then
sc(G) ≥ comp(G[X])− |V \X| = α(G)− (n− α(G)).
Subtracting both sides from n yields the proposition.
Therefore by Corollary 4.16 we have
Corollary 5.2. For any graph G on n nodes but Kn and Cn:
2(n− α(G)) ≥ n− (sc(G) + 1) ≥ n−ml(G) ≥ n− li(G) ≥ n− α(G)
Recall that by definition n − ml(G) is the maximum number of internal vertices in a
spanning tree, and that by Corollary 4.13 n − li(G) is the cardinality of a minimum size
connected vertex cover of G.
Now let T be any independence tree of G. Then ml(G) ≤ |V1(T )| ≤ li(G), and
Corollary 5.2 shows that n−|V1(T )| ≥ 12 (n−ml(G)) and that n−|V1(T )| ≤ 2(n−li(G)).
This means that T is a 2-approximation for both MAXIMUM INTERNAL SPANNING TREE
and MINIMUM CONNECTED VERTEX COVER problems.
6 Spanning Many Vertices with a q-Leaf Tree
Up to now we were focusing on spanning trees with few leaves. In this section we do
the contrary, we fix the number of leaves to q and examine how many vertices can be
spanned by a q-leaf subtree. This approach is a generalization of finding a long path in
a graph, as a path is just a 2-leaf subtree. Let δq(G) denote the minimum degree-sum of
a q-element independent subset of V (G), that is δq(G) = minX
{∑
v∈X d(v) : |X| =
q,X is independent
}
.
Let us recall Ore’s theorem:
Theorem 6.1. [13] LetG be a graph on n vertices. If δ2(G) ≥ n thenG has a Hamiltonian
path.
For 2-connected graphs, Bermond proved the following generalization:
Theorem 6.2. [2] Let G be a 2-connected graph on n vertices. Then there exists a path of
length min {n, δ2(G)} in G.
Broersma and Tuinstra used a different point of view to generalize Ore’s result. They
examined the existence of q-leaf spanning trees and obtained the following:
Theorem 6.3. [4] Let G be a graph on n vertices. If δ2 ≥ n − q + 1 for some integer
2 ≤ q ≤ n− 1 then G has a q-leaf spanning tree.
In what follows we give a common generalization of the above results. We give a
sufficient condition on the existence of a q-leaf subtree that spans many vertices.
90 Ars Math. Contemp. 2 (2009) 77–92
To formulate our statement on subtrees we use ρq,k to denote the minimum degree-
sum of the k highest degree vertices of a q-element independent set (2 ≤ k ≤ q). For-
mally, ρq,k = minX
{∑k
i=1 d(xi) : X = {x1, x2, . . . , xq} , d(x1) ≥ d(x2) ≥ · · · ≥
d(xq), X is independent
}
. Clearly δq(G) = ρq,q and for k ≤ q: δqq ≤
ρq,k
k .
Theorem 6.4. Let G = (V,E) be a connected graph and let 2 ≤ q < α(G) be an integer.
Then G has a subtree with at most q leaves that spans at least min {ρq,2 + q − 1, |V |}
vertices of G.
Proof. Let T be a maximum cardinality subtree of G with at most q leaves. If T spans G
then we are done. Otherwise let R = V (G) \ V (T ) 6= ∅. As T is maximal, none of its
leaves is adjacent to R and thus for each leaf l we have dG[V (T )](l) = dG(l). Moreover T
must be a minimum leaf spanning tree of G[V (T )]. Suppose that this is not the case and
there exists a tree T ′ spanning V (T ) with less than q leaves. Let e be an edge between
V (T ) and R. Then T ′ + e is a tree with at most q leaves that spans more vertices than T .
Thus T must have all properties of a minimum leaf spanning tree. According to Lemma
3.6, the set L of leaves of T is independent in G. Let l1 and l2 be the first and the
second highest degree vertices in L. For the leaf l2 we define the set S =
{
v : ∃u ∈
V (T ) s.t. (l2, u) ∈ E(G) \ E(T ), v = u→l2
}
. By Lemma 3.7 we obtain that the vertices
of S are forwarding vertices of T and so by the definition of S we have |S| = dG(l2)− 1.
Moreover, also by Lemma 3.7, the vertices of S are not adjacent to l1, that is S and NG(l1)
are disjoint. Therefore
|V (T )| ≥ |V1(T )|+ |S|+ |NG(l1)| = q + dG(l2)− 1 + dG(l1) ≥ ρq,2(G) + q − 1,
using the fact that the leaves of T form a q-size independent set.
The above bound is sharp as shown by the complete bipartite graph G = (A∪B,E) =
Kδ,n−δ (for any δ =
δq
q < n/2). To see this, let T be any non-spanning subtree of G
having exactly q leaves and t = |V (T )| vertices.
If the leaves (being independent in G) are all in B then |E(T )| = t− 1 = eT (A,B) =
eT (A,B ∩ V≥2(T )) + q, and each internal vertex of B has at least 2 neighbors in A,
so we have 2|B ∩ V≥2(T )| ≤ eT (A,B ∩ V≥2(T )) = t − q − 1. This, combined to
t ≤ q + |B ∩ V≥2(T )| + δ results t ≤ 2δ + q − 1 = 2 δqq + q − 1. If the leaves are all
in A then we take G′ = G − V1(T ) and a subtree T ′ = T − V1(T ). It is easy to see
that T ′ has all of its q′ ≤ q leaves in B and following to the above argument we have
|V (T ′)| ≤ 2(δ − q) + q′ − 1 ≤ 2δ − q − 1 and so t = |V (T ′)|+ q ≤ 2δ − 1.
As a result, at most 2 δqq + q − 1 ≤ ρq,2(G) + q − 1 vertices of G can be spanned by a
subtree of at most q leaves.
Putting Corollary 3.8 and Theorem 6.4 together yields the following:
Corollary 6.5. Given a connected graph G = (V,E) and an integer 2 ≤ q. If q ≥ α(G)
or ρq,2(G) ≥ n− q + 1 then G has a spanning tree with at most q leaves.
This is a generalization of the result of Broersma and Tuinstra [4] as ρq,2(G) (if defined)
is an upper bound on δ2(G).
G. Salamon: Vulnerability bounds on the number of spanning tree leaves 91
7 Concluding Remarks
In this paper we discussed the connection between vulnerability of a graph G and the num-
ber of leaves of its spanning trees. We gave a lower bound on the number of leaves by means
of scattering number sc(G) and an upper bound on the number of independent spanning
tree leaves by means of cut-asymmetry ca(G). These bounds are used to prove that any
independence tree provides a 2-approximation for both MAXIMUM INTERNAL SPANNING
TREE and MINIMUM CONNECTED VERTEX COVER problems. We also gave some new
cut-asymmetry related results. We proved that the leaf-set of an arbitrary spanning tree
can contain at most ca(G) + 1 independent vertices and that this bound is sharp, that is,
li(G) = ca(G) + 1. Clearly any independence tree has at most li(G) leaves. However, it
is an open question whether there always exists an independence tree with exactly li(G)
leaves. If the answer is positive then the notion of leaf-independence is a generalization of
the concept of independence trees.
At last, we gave a lower bound—by means of vertex degrees of q-element independent
sets—on the number of vertices spanned by a q-leaf subtree.
Acknowledgments
Author thanks to Gábor Wiener and to the anonymous referees for their valuable comments.
References
[1] D. Bauer, H. Broersma and E. Schmeichel, Toughness in Graphs – A Survey, Graphs and
Combinatorics 22 (2006), 1–35.
[2] J-C. Bermond, On Hamiltonian Walks, in: Proc. of the Fifth British Combinatorial Conference,
1975, 41–51.
[3] T. Böhme, H. J. Broersma, F. Göbel, A. V. Kostochka and M. Stiebitz, Spanning trees with
pairwise nonadjacent end-vertices, Discrete Mathematics 170 (1997), 219–222.
[4] H. Broersma and H. Tuinstra, Independence Trees and Hamilton Cycles, Journal of Graph
Theory 29 (1998), 227–237.
[5] V. Chvátal, Tough graphs and on Hamiltonian circuits, Discrete Mathematics 5 (1973), 215–
228.
[6] M. Furer and B. Raghavachari, Approximating the Minimum Degree Spanning Tree to within
One from the Optimal Degree, in: Proc. of the 3rd Annual ACM-SIAM Symp. on Discrete
Algorithms, 1992, 317–324.
[7] M. R. Garey and D. S. Johnson, Computers and intractability: a guide to NP-completeness,
Freeman, 1979.
[8] M. R. Garey and D. S. Johnson, The rectilinear Steiner tree problem is NP-complete, SIAM
Journal of Applied Mathematics 32 (1977), 826–834.
[9] L. Gargano, P. Hell, L. Stacho and U. Vaccaro, Spanning Trees with Bounded Number of
Branch Vertices, in: Proc. of ICALP’02, Springer LNCS Vol. 2380, 2002, 355–365.
[10] H. A. Jung, On a class of posets and the corresponding comparability graphs, Journal on Com-
binatorial Theory, Series B 24 (1978), 125–133.
[11] D. Karger, R. Motwani and G. D. S. Ramkumar, On Approximating the Longest Path in a
Graph, in: Proc. of WADS’93, 1993, 421–432.
92 Ars Math. Contemp. 2 (2009) 77–92
[12] H.-I. Lu and R. Ravi, The Power of Local Optimization: Approximation Algorithms for
Maximum-leaf Spanning Tree (DRAFT), Technical report number CS-96-05, Department of
Computer Science, Brown University, Providence, Rhode Island, 1996.
[13] O. Ore, Note on Hamiltonian Circuits, American Mathematical Monthly 67 (1960), 55.
[14] E. Prieto and C. Sloper, Either/or: Using vertex cover structure in designing FPT algorithms –
the case of k-internal spanning tree, in: Proc. of WADS 2003, LNCS 2748, 2003, 465–483.
[15] G. Salamon, Approximation Algorithms for the Maximum Internal Spanning Tree Problem, in:
Proc. of MFCS 2007, LNCS 4708, 2007, 90–102.
[16] G. Salamon and G. Wiener, On Finding Spanning Trees with Few Leaves, Information Pro-
cessing Letters 105 (2008), 164–169.
[17] C. Savage, Depth-first search and the vertex cover problem, Information Processing Letters 14
(1982), 233–235.
[18] S. Zhang and Z. Wang, Scattering number in graphs, Networks 37 (2001), 102–106.