IMFM Institute of Mathematics, Physics and Mechanics JADRANSKA 19, 1000 LJUBLJANA, SLOVENIA Preprint series Vol. 48 (2010), 1120 ISSN 2232-2094 ON THE ZAGREB INDEX INEQUALITY OF GRAPHS WITH PRESCRIBED VERTEX DEGREES Vesna Andova SaSo Bogoev Darko Dimitrov Marcin Pilipczuk Riste Skrekovski Ljubljana, May 25, 2010 o i-h o o On the Zagreb index inequality of graphs with prescribed vertex degrees April 13, 2010 Vesna Andovaa, Saso Bogoevb, Darko Dimitrovc, Marcin Pilipczukd and Riste Skrekovski6 aInstitute of Mathematics and Physics, Faculty of Electrical Engineering and Information Technologies, Ss Cyril and Methodius Univ., Ruger Boskovik, P. O. Box ,574, 1000 Skopje, Macedonia E-mail: vesna.andova@gmail.com bMinistry of Finance, IT Department Dame Gruev 14, 1000 Skopje, Macedonia E-mail: bogoja@finance.gov.mk cInstitut fur Informatik, Freie Universitat Berlin, Takustrafie 9, D-14195 Berlin, Germany E-mail: darko@mi.fu-berlin.de dInstitute of Informatics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland E-mail: malcin@mimuw.edu.pl 6Department of Mathematics, University of Ljubljana, Jadranska 21, 1111 Ljubljana, Slovenia co E-mail: skrekovski@gmail.com • 1 Abstract For a simple graph G with n vertices and m edges, the inequality M\ (G)/n < M2 (G)/m, where M\(G) and M2(G) are the first and the second Zagreb indices of G, is known as Zagreb indices inequality. According to this inequality, a set S of integers is good if for every graph whose degrees of vertices are in S, the inequality holds. We characterize that an interval [a, a + n] is good if and only if a > n(n—1'> or [a, a + n] = [1,4]. We also present an algorithm that decides if an arbitrary set S of cardinality s is good, which requires O(s2 log s) time and O(s) space. Keywords: First Zagreb index, second Zagreb index 1 Introduction lO Let G = (V, E) be a simple graph with n = \V\ vertices and m = \E\ edges. For v E V, d(v) is its degree. The first Zagreb index M1 (G) and the second Zagreb M2 (G) index are defined as follows: Mi(G) = d(v)2 and M2(G) = ^ d(u)d(v). vEV uvEE For the sake of simplicity, we often use M1 and M2 instead of M1(G) and M2(G), respectively. The first and second Zagreb indices are among the oldest topological indices [2, 6, 8], defined in 1972 by Gutman and Trinajstic [7], and are given different names in the literature, such as the Zagreb group indices, the Zagreb group parameters and most often, the Zagreb indices. These indices were among the first indices introduced, and have since been used to study molecular complexity, chirality, ZE-isomerism and hetero-systems. Overall, Zagreb indices exhibit a potential applicability for deriving multi-linear regression models. In 2003 the article [12] repopularized Zagreb indices, and since then a lot of work was done on this topic. For more results on this topic see [4, 5, 11, 16, 17]. Comparing the values of these indices on the same graph is a natural issue and gives interesting results. The following observation suggests that it is more reasonable to compare M1/n with M2/m, instead of comparing M1 with M2. Namely, for general graphs, m is bounded from above by n2, and thus, the orders of magnitude of M1 and M2 are O(n3) and O(n4) respectively. At first, the next conjecture was proposed [3]: Conjecture 1.1. For all simple graphs G, M1(G) < M2(G) n ~ m and the bound is tight for complete graphs. One can easily see that this relation becomes an equality on regular graphs, but also when G is a star. Besides, the inequality is true for trees [14], graphs of maximum degree four, so called chemical graphs [9] and unicyclic graphs [15]. Graphs with only two types of vertex degrees, graphs with vertex degrees in any interval of length three also satisfy the inequality (1), as well as graphs such that their vertices degrees are in the set {s — c,s,s + c} or in the interval [c,c + \ -,/c ]] for any integers c, s [1]. Here we will determine when a graph with vertices degrees in the interval [a, a + n], satisfies the inequality (1). On the other side there are graphs that do not satisfy the inequality (1), even more, there is an infinite family of graphs of maximum degree A > 5 such that the inequality is false. See [1, 9, 14, 10] for various examples of graphs dissatisfying this inequality. We also present an algorithm for deciding if a given set of integers S of cardinality s satisfy (1), which requires O(s2 log s) time CD and O(s) space. We denote by Ka,b the complete bipartite graph with a vertices in one class and b vertices in the other one. Let D(G) be the set of the vertex degrees of G, i.e., D(G) = {d(v) \ v E V} . A set S of integers is good if for every graph G with D(G) C S, the inequality (l) holds. Otherwise, S is a bad set. Since we discuss necessary conditions for (1) to hold, we denote for the sake of simplicity by mi,j the number of edges that connect vertices of degrees i and j in the graph G. Then, as shown in [9]: lO M _ Ml = y. m n ' m n i kl and — + - < —+ -, or k l i j 1111 (b) ij < kl and - + - > - + -. k l i j The next lemma determines the orderings of the integers i, j, k, and l, for which f (i, j, k, l) can be negative [1]. Lemma 2.2. If f (i,j, k,l) < 0 for some integers i < j and k < l, then, o i < k < l < j or k < i < j < l. The following simple lemma will be used in the proof of Lemma 2.4. Lemma 2.3. Let i < k < l < j and k + l > i + j for some four integers i, j, k, l. Then, kl > ij. Proof. Since k — i > j — l and j is the largest, we infer kl — ij = (k — i)j — (j — l)k > (j — l)(j — k) > 0. Hence, kl > ij. □ Now, we will present a condition of integers i, j, k, l, for which f (i,j, k, l) is nonnegative. Lemma 2.4. Let k + l > i + j and i < k < l < j for some four integers i, j, k, l. Then, f (i,j,k,l) > 0. Proof. Since k — i > j — l by Lemma 2.3, we have kl > ij , and so 1 1 1 1 k — i l — j . „ / 1 1 \ c^ i + j — k — l =^T + j > (j —1){ Fi — Jl) > 0- Now the proof is straightforward by Lemma 2.1. □ The following proposition gives an equivalence between the sign of f (i, j, k, l) and a relation of the integers i, j, k, l. CO Proposition 2.1. Let i,j,k,l be integers satisfying i < k < l < j. Then, f (i,j,k,l) < 0 if , ., k + l kl and only if-< — < 1. i + j ij Proof. First, suppose that k +l < — < 1. From the left inequality, it follows 1 + 1 < i + j ij k l —I—, and from the right inequality, it follows that ij > kl. Thus, by Lemma 2.1, we have i j f (i,j,k,l) < 0. cd then Suppose now that f (i,j, k, l) < 0. By Lemma 2.4, we have that k + l < i + j. If ij < kl, 1 1 1 1 k +1 i + j - + - ^^ = — — ^ < 0, k l i j kl ij and by Lemma 2.1, we have f (i, j, k, l) > 0, which is a contradiction to the assumption. Thus, kl we may assume that ij > kl, i.e., — < 1. Since f (i,j,k,l) is negative, by Lemma 2.1(a), p ij 1 1 1 1 k +1 kl^. , n — + - < —I—, and hence-< —. This concludes the proof. k l i j i + J ij □ o 1—1 o 10 £ o 0 o 1 00 ^ CO CO 3 Good intervals It is known that all intervals of lengths 1,2,3, and 4, except [2, 5], are good, see [13]. In [1], it was shown that for every integer c, the interval [c,c + [ yfc ]] is good. An immediate corollary of that result is that there are arbitrarily long good intervals. In this section we characterize the good intervals. First we show, if the function is negative for some values of an interval, than that interval is bed. Proposition 3.1. If f (i,j,k,l) < 0 for some integers i,j,k,l £ [a,b], then [a,b] is a bad interval. Proof. Whenever f(i,j,k,l) < 0 and i,j,k,l £ [a,b], a > 1, we can construct a connected graph Gxy, with D(Gxy) = {i,j,k,l}, that does not satisfies (1). An illustration of Gxy is given in Figure 1. The construction of Gxy is adapted from [1] and it is as follows: • Make a sequence of x copies of Kij and then continue that sequence with y copies of Kk,i. • Choose an edge from the first Kij graph and an edge from the second Kij graph. Let denote these edges by vjv) and vf v2, respectively. Replace v/v) and vf v2 by edges v/vf and vfvj. Continue this kind of replacement between all consecutive copies of K ij- Notice that these replacements do not change the degrees of the vertices. Next, choose an edge from the last Kitj graph and an edge from the first Kkil graph. Let denote these edges by vXvX and v/vf, respectively. Replace vXvX and v/v/ by edges vXvf and v1vx. Apply the same procedure between all consecutive graphs Kk l in the sequence. This completes the construction of Gxy. co cd Jh CD co u a CD U j j j l l l Figure 1: A connected graph Gxy with D(Gxy) = {i,j, k, l} constructed from x copies of Ki, j and y copies of Kk, i. The dashed edges are those that are removed from the corresponding complete bipartite graphs. From the construction, it follows that m^ j = x ■ i ■ j — 1, mk, i = y ■ k ■ l — 1 , m^ , l = mj, k = 1 k k k o 1—1 o 10 £ and mi i = mi k = mj j = mj i = mk k = mi i = 0. Thus, M2 Mi m n ^ f (q, r, s, t)mq, r ms,t q^r,sKt = 2 [f (i, j, k, l)mimk,i + f]i, j, i, l) + f (i, j, j, k)] m + [f (k, l, i, l) + f (k, l, j, k)] mk ,i + f (i, l, j, k)] . i,j o 0 o 1 CO ^ CO CO If we increase the number of Ki,j and Kkii graphs, i.e., x and y, in the graph Gxy , shown in Fig. 1, then mij and mkj will increase as well. Since f (i,j,k,l) < 0, it follows that for mj,,j and mk,i big enough, the difference M2/m — M\/n will be negative. Notice that if a = 1 and min(i,j,k,l) = 1, then Gxy is disconnected. Thus, we consider the intervals [1,6], b > 1, separately. For 1 < b < 4, these intervals are good. The function f (2, 5, 3, 3) is negative, and by the above construction interval [1,b], is bad for every b > 5. This establish the proposition. □ Theorem 3.1. For every integer n, the interval [a, a + n] is good if and only if a > or [a, a + n] = [1, 4]. n( n — 1) Proof. As [1,4] is good interval, in order to prove "if" direction of the theorem, it suffices n( n 1) to show that f (i,j,k,l) > 0 whenever i, j, k, l £ [a, a + n] and a > -^-. Suppose in contrary that, by Proposition 3.1 for some i, j, k, l from such an interval, f (i, j, k, l) < 0. By Lemma 2.2 and (4), we can assume that a < i < k < l < j < a + n. Let k = i + s, l = i \ t, j = i \ q, where 0 < s < t < q < n. Now, 1 1 = 2i + s + t k + 1 = (i + s)(i + t) and 1 \ 1 = 2i \ q and —\— = —--. i j i(i + q) m cd $h CD m u a CD U Since f (i,j, k, l) < 0, by Proposition 2.1, it follows that kl < ij and k +1 < i + j. Thus, (s +1)2 ( we obtain s +1 < q. As, st <-, we obtain st < - 1 4 1 1 1 it follows that f (i, j,k,l) < 0 if — + - < —\ -. Hence, k l i j 1)2 (n 1)2 --. By Lemma 2.1, 2i \ s \ t < 2i \ q (i + s)(i + t) i(i + q) (2i + s + t)(i2 + iq) < (2i + q)(i2 + (s + t)i + st) 2i3 + (s + t + 2q)i2 + (s + t)iq < 2i3 + (2s + 2t + q)i2 + 2sti + (s + t)iq + stq q i2 < (s \ t) i2 \ 2s t i \ s t q qi ;2 < (q — 1) i2 + 2sti + stq, o 1—1 o 10 £ o 0 o 1 CO ^ CO CO from here co cd $h CD co $h a CD $h i2 < 2s ti + stg ' n — 1\ 2 f n — 1 < 2 ( — ) i + I — | n n — 1 2 n — 1 < 2 ( — | i + —— i n( n — 1) < i 2 which is clearly impossible. Thus, we have shown that f(i, j, k, l) > 0 for arbitrary i, j, k, l n( n 1) from an interval [a, a + n] with a >-. Therefore, such an interval is a good one. Now, we prove the opposite direction. For n = 0,1, 2 and an arbitrary integer a, all inequalit n( n — 1) n( n 1) intervals [a, a + n] are good, see [13], and they satisfy the inequality a >---. For n = 3, [1,4] is the only good interval that does not satisfy a > For n > 4, we proceed as follows. By Proposition 3.1, it suffices to show that f (i, j, k,l) < 0 n — ^ , ■ ■ ■, ( n ) — 1 >. We show this by for some integers i,j, k,l 4 and a & Ara_i, the interval [a, a + n — 1] is bad. Now, we show that for n and a & An the interval [a, a + n] is bad. First, notice that An-1 c An. Suppose first a & An-1. Since [a, a + n — 1] is a subinterval of [a, a + n], and by induction hypothesis [a, a + n — 1] is a bad interval, it follows that also [a, a + n] is bad. It remains to show that [a, a + n] is a bad interval for a & An \ An-1, i.e., for a & (n — 1)(n — 2) n(n — 1)\ TTT -2-,-2-j. We consider two cases regarding the parity oi n: • n = 2s + 1: Then, 2s2 — s < a < 2s2 + s. Now, it can be easily verified that (a — s — 2s2)(a — s2) f (a, a + 2s + 1, a + s,a + s) = v ( ) < 0. a(a + s)(a + 2s + 1) • n = 2s: Then, (s — 1)(2s — 1) < a < s(2s — 1). Again, an easy verification shows that (a — s2 + s)(a2 + 2as + 2s2 — 2as2 — 2s3) f (a, a + 2s,a + s — 1, a + s) = Thus, the proof is completed. a(a + 2s)(a + s — 1)(a + s) < 0. □ o 4 Decision algorithm O CSI u CD co In this section, we consider a problem of deciding if a given set of integers S of cardinality s is a good one. A straightforward algorithm that solves the above problem is to check if f (i,j,k,l) < 0 for all 4-tuples (i,j,k,l) from S, where the 4-tuples (i,j,k,l) are variations with repetitions. Checking that suffices to determine if S is good or not, since by Proposition 3.1, if f (i, j, k, l) < 0, we can construct a graph that does not satisfy (1). Verifying if f (i,j, k,l) < 0 can be done in constant time by Lemma 2.1. Thus, the time complexity of this approach is O(s4). We now show that, using quite simple algorithmic tricks, one can obtain an O(s2 log s) algorithm. C^ Lemma 4.1. There exists an algorithm that checks whether a given set of positive integers S is good and requires O(s2 log s) time and O(s2) space. Proof. Let us denote P1(i, j) = ij and P2(i, j) = 1 +1. From (3) we obtain that f (i, j, k,l) < 0 if and only if P1(i,j) < P1(k,l) and P2(i,j) < P2(k,l) or P1(i,j) > P1(k,l) and P2(i,j) > P2(k, l). Using the symmetry of the function f, the set S is good if and only if for each pair (k,l) £ S x S there does not exist a pair (i,j) £ S x S such that both P1(i,j) < P1(k, l) and P2(i,j) P2(k,l)). Note that now the set S is good if and only if for each two pairs (i,j), (k,l) £ S x S if P(i,j) < P(k,l) then P2(i,j) > P2(k,l). 2 The algorithm, sketched in Pseudocode 4.1, simply checks the above condition. We sort all pairs (i,j) £ S x S increasingly according to the value of P(k,l) and then iterate over the sorted array T and check if there exist consecutive values (k',lr) and (k,l) such that P2(k',l') < P2(k,l). If we found such values, we have P1(k',l') < P1(k,l) since P(k',l') < P(k, l), and the set S is not good. Otherwise, we find that the array is sorted nonincreasingly according to P2 and, thus, if for any (i,j), (k,l) £ S x S we have P1(i,j) < P1(k,l) then P(i,j) < P(k,l) and P2(i,j) > P2(k,l), and, finally, the set S is good. Note that in Pseudocode 4.1 we do not keep the indices (k', l') of the previously considered pair, but we only store the value P2(k', l') in the variable p2. Let us now analyze consumed time and space. The sorting according to the value of P consumes O(s2 log s) time if we use Merge Sort or Heap Sort. We use O(s2) space to store the sorted pairs in the array T. □ The space complexity can be further improved to O(s). Note that in the algorithm from Lemma 4.1 we do not need to actually store the table T — we need only to iterate over pairs (k, l) in the increasing order of P(k, l) = (P1 (k, l), —P2(k, l)). The following technical lemma shows that such iterator can be constructed using O(s) space and O(s2 log s) total time. & Lemma 4.2. There exists an algorithm that, given a set S of s positive integers, generates a sequence of pairs (k,l) £ S x S in the increasing order of P(k,l) = (P1(k,l), —P2(k,l)). It uses O(s) additional space and O(s2 log s) total time. o 1—1 o 10 £ o 0 o 1 CO ^ CO CO Algorithm 4.1 An 0(s2 log s) algorithm that checks whether S is a good set. 1: procedure CheckGoodSet(S) 2 3 4 5 6 7 8 9 T — the set of all pairs (k,l) & S x S sort T in the increasing order of P(k, l) = (P1(k, l), —P2(k, l)) P2 —