IMFM Institute of Mathematics, Physics and Mechanics Jadranska 19, 1000 Ljubljana, Slovenia Preprint series Vol. 50 (2012), 1173 ISSN 2232-2094 PARITY INDEX OF BINARY WORDS AND POWERS OF PRIME WORDS Aleksandar Ilic Sandi Klavzar Yoomi Rho Ljubljana, April 12, 2012 i-H o u a CO CD U a CD U Parity index of binary words and powers of prime words Aleksandar Ilic Faculty of Sciences and Mathematics University of Nis, Serbia e-mail: aleksandari@gmail.com Sandi Klavzar Faculty of Mathematics and Physics IN University of Ljubljana, Slovenia and Faculty of Natural Sciences and Mathematics University of Maribor, Slovenia e-mail: sandi. klavzar@fmf . uni-lj . si Yoomi Rho* Department of Mathematics University of Incheon, Korea e-mail: rho@incheon.ac.kr Abstract Let f be a binary word and let Fd(f) be the set of words of length d which do not contain f as a factor (alias words that avoid the pattern f). A word is called even/odd if it contains an even/odd number of 1s. The parity index of f (of dimension d) is introduced as the difference between the number of even words and the number of odd words in Fd(f). A word f is called prime if every nontrivial suffix of f is different from the prefix of f of the same length. It is proved that if f is a power of a prime word, then the absolute value of the parity index of f is at most 1. We conjecture that no other word has this property and prove the conjecture for words 0r 1s0t, r, s,t > 1. The conjecture has also been verified by computer for all words f of length at most 10 and all d < 31. Keywords: binary words, combinatorics on words, words avoiding a pattern, parity index, generalized Fibonacci cubes. CD AMS Subject Classification (2010): 05A05, 68R15, 68W32. * Corresponding author 1 Introduction Elements of B = {0,1} are called bits and an element of Bd is a binary word of length d. Since all words considered here are binary, we will simply speak about words. A word u e Bd will be written in the coordinate form as u = uiu2 ... ud. A word f is a factor of a word x if f appears as a sequence of |f | consecutive bits of x. A word u is called f -free if it does not contain f as a factor. For a word f and positive integer d, let cm cm £ CO CO CO CD $H ) = {u 6 Bd | u is f-free} . The product notation will mean concatenation, for example, 1r is the word of length r with all bits equal 1. A word b is a power of a word c if b = ck for some k > 1. A word is called even if it contains an even number of 1s and odd otherwise. i—l Suppose that f is a word and d is a positive integer. Then the generalized Fibonacci cube, Qd(f), is the graph obtained from the d-dimensional cube Qd by removing all vertices that contain f as a factor. In other words, V(Qd(f)) = Fd(f), two vertices being adjacent if they differ in exactly one bit. These graphs were studied for the first time in [3], but special cases were extensively studied earlier. The most notable special case is formed by Fibonacci cubes rd = Qd(11), d > 1, see the survey [4]. The special case of Qd(1s) was introduced in [2] (under the same name of generalized cubes) and further investigated in [6, 9]. 00 The definition of the generalized Fibonacci cubes naturally leads to different problems on words. The most fundamental problem is to determine the order of these graphs. This problem was studied earlier under the notion of words avoiding a pattern. Calling f a pattern, then the number of words avoiding f is just the number of f-free words. Baccherini, Merlini and Sprugnoli [1] were interested in the number of f-free words that contain prescribed numbers of 0s and 1s and established that they are closely related to proper Riordan arrays. This work was extended in [7]. Another natural problem about generalized Fibonacci cubes is when they embed isometrically into hypercubes. This question naturally leads to the concept of the so called good and bad words. A word f is said to be d-good if for any f-free words u and v of length d, v can be obtained from u by complementing one by one the bits of u on which u and v differ, such that all intermediate words are f-free. Then f is good if it is d-good for any d > 1. The main result of [5] asserts that about eight CD percent of all words are good. Our principal motivation for the present paper is a result of [6] asserting that each Qd(1r) contains a hamiltonian path. This in particular implies that the bipartition of Qd(1r) is balanced. (By the way, it is not difficult to see that every generalized a 2 cu 2 cm Fibonacci cube is connected.) Clearly, the bipartition sets of Qd(f) are formed by even and odd words, respectively. Hence, for a set of words X, let e(X) and o(X) be the number of even and odd words in X, respectively. Let in addition A(X) = e(X) - o(X), in particular write A(x) = A({x}) for a word x. That is, A(x) = 1 if x is even and A(x) = -1 if x is odd. Then we define the parity index of f of dimension d as PId(f ) = A(Fd(f)). Using this notation, a necessary condition for Qd(f) to contain a hamiltonian path is that |PId(f)|< 1. In the next section we introduce prime words and prove that if f is a power of IN a prime word then |PId(f )| < 1 holds for any d. In Section 3 we consider the parity index of the words 0r 1s0t and prove that for any d large enough, |PId(0r 1s0*)| > 2. For the special case of 0r 10r a more precise result is obtained, in particular it is noted that {|P1d(010)|}d>3 is the so-called Padovan sequence. In the final section we pose a conjecture that powers of prime words are the only words with the property |PId(f )| < 1 for any d and verify the conjecture for all words of length < 10 and for all d < 31. 2 Powers of prime words cm A word f of length d is prime if for any k, 1 < k < d - 1, the suffix of f of length k is different from the prefix of f of the same length. In particular, words 0 and 1 are prime, and if d > 2, then the first bit and the last bit of a prime word are different. For instance, 001101 is a prime word which easily follows from the fact that the factor 00 appears only at its beginning. On the other hand the word 01101011 is not prime as it starts and ends with 011. For a word f of length £ let Sd(f) = Bd \ Fd(f), that is, cm cm £ CO CO Sd(f) = {b = b1 b2 ...bd | b contains factor f} . For i = 1,2,..., d - £ + 1 let in addition Sd (f ) = {b = bib2 - bd | b e S,bibi+i ••• bi+*_i = f} . m d r. Then Sd(f )= Ud=-i+1 S^f). / X\ By (k J we denote the set of all k-subsets of the set X. Lemma 2.1 Let f be a word of length £. Then d-l+1 A(Sd(f))= E (-1)k-i E A (n^sWf)) k=1 J s(Nd-£+1) a k CD r3 cu 3 CM i-H o CM cm u a 00 IN 0 o cm 1 cm 00 cm cm £ CO CO m CD $H CD m u a CD U Proof. Let xa be the characteristic function of a set A: 1; x e A, XA(x) = 0; otherwise. Since Sd(/) = Ud=i+1 Sdl)(/), the inclusion and exclusion principle implies that for every x e Sd(/), d-i+1 E (-i)k-1 E x, k= 1 n ui S^if ) ,(x) = 1. Therefore, A(Sd(/)) = E A(x) xsSdif) E A(x) xsSdif) d-i+1 d-i+1 E (-i)k-1 E x, k= 1 I^+i) n*i S(i)if ) ,(x) = E (-1)k-1 E E A(x)Xni6iS«if)(x) k=1 l£(Ndfc£+i) x^Sdif ) d d-i+1 E k=1 = E (-1)k-1 E A(n*,S«(/)). isf^1) Theorem 2.2 Let f be a power of a prime word. Then |PId(/)| < 1 for any d > 1. Proof. Let d > 1. Suppose first that f is a prime word. When d < £, we have Fd(f) = and if d = £, then Fd(f) contains all but the word f. Hence we may assume in the rest that d > £. Since e(Bd) = o(Bd), we have e(Fd(f)) + e(Sd(f)) = o(Fd(f)) + o(Sd(f)). Hence PId(f) = A(Fd(f)) = -A(Sd(f)). It thus suffices to prove that |A(Sd(f))| < 1. We first note that A(S((i)(f)) = 0. Indeed, the first i - 1 bits and the last d - £ - i +1 bits of the words from S^f) are arbitrary, hence S^f) contains even words and the same number of odd words. Consider now X = n^/S(l)(f) where I = {i1, i2, • • •, ifc} and i1 < i2 < — < ik. Because f is a prime word, X = 0 as soon as for some index j, ij+1 - ij < £. Moreover, by the same argument as the one used for A(S((i)(f)), A(X) = 0 as soon as for some index j, ij+1 -ij > £. Hence A(X) can be nonzero only when k£ = d and ij = (j - 1)£ + 1 for each 1 < j < k. Therefore, applying Lemma 2.1, O A(Sd(/)) = u a CO IN s CO CO -1 1 £ { d, £\d, k odd, f contains odd number of 1s, otherwise. The proof is complete for a prime word f. Assume now that f = (f ')r, where f' is a prime word and r > 2. Let |f '| = . The proof continues similarly as in the case when f was prime. The only difference is that now X = 0 as soon as for some index j, the difference j+i - j is not a multiple of £' and so A(X) can be nonzero only when d is a multiple of £'. □ 3 Non-prime words In this section we study the parity index of words consisting of three blocks, that is, of words 0r 1s0t, r,s,t > 1. Clearly, none of these words is prime. In our main result (Theorem 3.2) we prove that for no such word f, |PId(f)| < 1 holds for all d. Before that we separately give a more precise result for the special case of 0r 10r. The obtained results in particular imply that Qd(0r 1s0t) does not contain a hamiltonian path as soon as d is large enough. CO |PI (0r 10r)| = J d < 2r 2r + 2 < d < 3r + 1 , |PId(0 10 )| = 1 1; d = 2r + 1,3r + 2 < d < 4r + 3 . Theorem 3.1 Let r > 1. Then Moreover, for any d > 4r + 4, \PId(0r 10r)\ > 2. Proof. Suppose first that d < 2r. Then Fd(0r 10r) = Bd and hence PId(0r 10r) = 0. Since F2r+i(0r 10r) = Bd \ {0r 10r} we have PI2r+i(0r 10r) = 1. Let d > 2r + 2. Recall that -PId(0r 10r) = A(Sd(f)) = E^/) A(b). By Lemma 2.1, d-i+i -PId(f)= E (-1)k-1 E A(nje/Sdl)(f)). k=1 j c(Nd-A5+i) Suppose that for a set X = nie/Sdl)(f) there exists an index i such that if w e X then also w + a e X. Then A(X) = 0. It follows that A(X) + 0 if and only if there exist k > 0 and r < rj < 2r for all 1 < j < k, such that X = {0r 10ri 10r21 ••• 0rk 10r} . Moreover, in that case A(X) = A(0r 10ri 10r21 • 0rfc 10r) = (-1)k+1 CD cu 5 0 Hence let k > 0 and r < r, < 2r, 1 < j < k, and set b = 0r 10ri 10r21 - 0rk 10r. Let v = bri+2btl+3 — bd be the word obtained from b by omitting the first ri + 1 bits, so that v e Bd-ri-i. Since b has one more bit of 1 than v does, A(v) = -A(b). Note that v starts with 0r 1. Then 00 in CSF 00 CSF CSF iß CD CD Cß U a CD U v e n SÜ-i1_1(/) if and only if 6 e S^f) Q jS(dj+ri+1) Now we can compute as follows: PId(0r 10r) = -A(Sd(f )) = - E A(6) b£Sd(f) / E A(6) + E A(6) + — + E A(6) ) bESd (f) beSd (f) y ri=r ri=r+1 ri=2r = -( E -Ad_r_l(v)+ E -Ad_r_2(v)+' \ V€Sd-r-l(f ) V€Sd-r-2(f ) + E -Ad_2r_l(v) V£Sd-2r-l(f) ) o = E Ad_r_1 (v)+ E Ad_r_2 (v) + ' veSd-r-l(f) V£Sd_r-2 (f) + E Ad_2r_1 (v) veSd-2r-l(f ) = E E Ad_ri _1(v) r 2r + 2. If d < 3r + 1 and there is a word 6 e Sd(f), then there is an index i such that if w e X then also w + e^ e X and hence A(X) = 0. Assume d > 3r + 2. When 3r + 2 < d < 4r + 2, PId(0r 10r) = A(0r10d-2r-210r) = 1 or -1 and hence ad = 1. When d = 4r + 3, ad = a2r+2 + — + a3r+2 = 1. Let d = 4r + 3 + u for some u > 1. Then by Equation (2), a4r+3+u = a2r+2+u + — + a3r+2+u. If u < r, then 2r+2+u < 3r+2 < 3r+2+u < 4r+2 and therefore ad > 2. Assume u > r+1. Let u' = u-r. Then d = 5r + 3 + u' for u' > 1. By Equation (2), a5r+3+u = a3r+2+u + — + a4r+2+u. If u' < r, then 3r + 2 < 3r + 2 + u' < 4r + 3 < 4r + 2 + u' and therefore ad > 2. Assume u' > r + 1. Then let u'' = u' - r. Then d = 6r + 3 + u'' for u'' > 1. By Equation (2), a6r+3+M" = a4r+2+U" + — + a5r+2+U". As 3r + 2 < 4r + 2 + u'' < 5r + 2 + u'', ad > 2. Thus when d > 4r + 4, ad > 2. □ CSI 00 in u a CD U The special case of Theorem 3.1 when r = 1 deserves a special attention. In that case, |PId(010)| = |PId-2 (010)| + |PId-3(010)| with initial conditions |PI3(010)| = 1, |PI4 (010) | = 0, |PI5(010)| = 1 which is the Padovan sequence, see sequence A000931 from [8]. Theorem 3.2 Let r,s,t > 1. Let z be the integer such that (z- 1)t + 2 < r + s < zt +1. Then o I |PId(0r 1s 04 )| Moreover, for any d > (z + 1)(r + s +1) + 2, |PId (0r 1s04)| > 2 = 0; d < r + s +t, y (r + s +t) < d < (y + 1)(r + s) +1 for 1 < y < z, > 1; d = r + s +t, (y + 1)(r + s) +1 < d < (y + 1)(r + s +t) for 1 < y < z, d = (z + 1)(r + s +t) + 1. £ CO Proof. Since PId(0r 1s0*) = PId(0* 1s0r), it suffices to prove the result for words 0r 1s0t with r > t. By the same argument as in the proof of Theorem 3.1, A(X) + 0 if and only if there exist k > 0 and r < rj < r +1 for all 1 < j < k such that X = {0r 1s0ri 1s 0r21s — 0rk 1s0t} where A(X) = (-1)(k+1)s. Also m |PId(0r 1s0*)| = E |PId-ri-s(0r 1s0*)| . (3) r r + s +1 + 1. In the first part of the proof, we prove the theorem for d < (z + 1)r + (z + 1)s + (z + 1)t by induction on y for 1 < y < z. Then we prove the theorem for d > (z + 1)r + (z + 1)s + (z + 1)t + 1. The idea of the proof CM is as follows. From the first part of the proof, we notice that for each y > 1, ad = 0 for r + s - (y - 1)t - 1 consecutive numbers of d and then ad > 1 for the next yt + 1 consecutive numbers of d. As y increases, r + s - (y - 1)t - 1 decreases to zero and yt + 1 increases. While, by Equation (3), ad = ad-r-s-t + — + ad-r-s, which is a sum of t + 1 consecutive numbers, where t + 1 is a constant for given 0r 1s0t. Therefore for large enough d, ad > 2. By a similar argument as in the proof of Theorem 3.1, ad = 0 if d < 2r + 2s +1 and a ad = 1 if 2r + 2s +1 < d < 2r + 2s + 2t. Thus the statement is true for y = 1. Let y > 2. Suppose the statement is true for all 1 < yo < y. Let d = yr + ys + yt + u for some u > 1. Then by Equation (3) ayr+ys+yt+M = a(y-1)r + (y-1)s + (y-1)t+M + ••• + a(y_1)r + (y_1)s+yt+M. When d < (y + 1)r + (y + 1)s +1, i.e., u < r + s - (y - 1)t, (y - 1)r + (y - 1)s + (y - 1)t < (y - 1)r + (y - 1)s + (y - 1)t + u < (y - 1)r + (y - 1)s + yt + u < yr + ys +1 and hence by the induction assumption, ad = 0. When d = (y + 1)r + (y + 1)s +1, i.e., u = r + s - (y - 1)t, (y - 1)r + (y - 1)s + yt + u = yr + ys + t and hence by the induction assumption, ad > ayr+ys+t > 1. Assume d > (y + 1)r + (y + 1)s +1 + 1, i.e., u > r + s - (y - 1)t + 1. Let u' = u - r - s + (y - 1)t. Then d = (y + 1)r + (y + 1)s +1 + u' where u' > 1. By Equation (3), ad = ayr+ys+u + ••• + ayr+ys+t+u. If d < (y + 1)r + (y + 1)s + (y + 1)t, i.e., u' < yt, then yr + ys + u' < yr + ys + yt. Considering that yr + ys +1 < yr + ys +1 + u', cm ad > ayr+ys+t + ayr+ys+t+1 or ad > ayr+ys+-u' + ayr+ys+-u' +1 depending on whether yr + ys + u' < yr + ys +1 or not. Therefore by the induction assumption, ad > 1. Thus the theorem is proved for all d < (z + 1)r + (z + 1)s + (z + 1)t. Assume d > (z + 1)r + (z + 1)s + (z + 1)t + 1 > (z + 2)r + (z + 2)s + t. Let d = (z+2)r +(z+2)s+t+u'' where u'' > 0. Then by Equation (3), ad = a(z+2)r+(z+2)s+t+u" = a(z+1)r+(z+1)s+u" + -+a(z+1)r+(z+1)s+t+u". Note that (z+1)r+(z+1)s+t+u'' > (z+1)r+ (z+1)s+t. Assume d = (z+2)r +(z+2)s+t, i.e., u'' = 0. Then (z+1)r+(z+1)s+t+u'' = (z + 1)r + (z + 1)s +1. If r + s > zt, then (z + 1)r + (z + 1)s > zr + zs + zt and hence ad = a(z+1)r+(z+1)s+t > 1. If r + s < zt, then (z + 1)r + (z + 1)s < zr + zs + zt and hence ad > azr+zs+zt + a(z+1)r+(z+1)s+t > 2. Let d > (z + 2)r + (z + 2)s + t, i.e., u'' > 0. Then (z + 1)r + (z + 1)s + t + u'' > (z+1)r+(z+1)s+t. First assume d < (z+2)r+(z+2)s+(z+2)t, i.e., u'' < (z+1)t. Then (z + 1)r + (z + 1)s + u'' < (z + 1)r + (z + 1)s + (z + 1)t. Therefore ad > a(z+1)r+(z+1)s+t + a(z+1)r+(z+1)s+t+1 or ad > a(z+1)r+(z+1)s+u" + a(z+1)r+(z+1)s+u" + 1 depending on whether (z + 1)r + (z + 1)s + u'' < (z + 1)r + (z + 1)s +1 or not. Thus ad > 2 in any case. Second assume d = (z + 2)r +(z + 2)s + (z + 2)t, i.e., u'' = (z + 1)t. Then (z + 1)r +(z + 1)s + u'' = (z + 1)r + (z + 1)s + (z + 1)t and hence ad > a(z+1)r+(z+1)s+u" + a(z+1)r+(z+1)s+u"+1 > 2 considering that (z + 1)r + (z + 1)s + (z + 1)t < (z + 1)r + (z + 1)s + u'' +1 < (z + 2)r + (z + 2)s + (z+2)t. Finally assume d > (z+2)r + (z+2)s + (z+2)t, i.e., u'' > (z + 1)t. Suppose cm cm ^ CO CO m CD w u a CD U CM i-H o cm CM there is u" > (z + 1)t such that ad < 1. Let u0 be the smallest such an integer and d0 = (z + 2)r + (z + 2)S + t + u0'. Then ado = a(z+1)r+(z+1)s+u0 + "• + a(z+1)r+(z+1)s+i+«0'. Since (z+1)r+ (z+1)s+u0' > (z+1)r+(z+1)s+(z+1)t and (z+1)r+(z+1)s+t+u0' < d0, ado > 2, which is a contradiction. Thus the statement is true for all d. □ J-H a < CO 0 £ o cm 1 cm CO cm cm £ CO CO 4 Computer evidence and conjecture Using computer we obtained the parity index for all words f of length at most 10 and all d < 31. Since Qd(f) is isomorphic to Qd(f), where / is the binary complement of f, we have restricted the computation to words f that contain not more 1s than 0s. From the same reason reversed words need not to be considered. In Table 1 all words f of length at most 8 and with |PId(f )| < 1 for d < 31 are collected. length / 3 001 4 0001, 0011, 0101 5 00001, 00011, 00101 6 000001, 000011, 000101, 000111 001001, 001011, 001101, 010101 7 0000001, 0000011, 0000101, 0000111 0001001, 0001011, 0001101, 0010011 0010101, 0011101 8 00000001, 00000011, 00000101, 00000111 00001001, 00001011, 00001101, 00001111 00010001, 00010011, 00010101, 00010111 00011001, 00011011, 00011101, 00100011 00100101, 00101011, 00101101, 00110011 00110101, 00111101, 01010101 Table 1: List of words f with |f | < 8 and |PId(f )| < 1 for d < 31 m CD u CD m u a CD U It can be checked that every word from the table is a power of a prime word. Moreover, the same was verified also for the obtained words of length 9 and 10 (not given in the table). Based on this experiment and Theorems 2.2 and 3.2 we pose: Conjecture 4.1 Let f be a word such that |PId(f )| < 1 holds for any d. Then f is a power of a prime word. CM i-H o CM cm u a 00 IN A possible approach to the conjecture would be to prove that if f is not a power of a prime word, then the sequence {|PId(f )|}d satisfies a certain recurrence relation from which we can deduce the behavior of the sequence. For instance, one can establish the recurrent formula |PId(01110)| = |PId-4(01110)| + |PId-5(01110)|, with initial conditions |PI5(01110)| = 1, |PIa(01110)| = |PIz(01110)| = |PIg(01110)| = 0 and |PI9 (01110) | = 1. Similarly, either by applying Equation (3) or by a tedious case analysis yields, one can get: |PId (000001000) | = |PId-6 (000001000) | +|PId-7 (000001000)| +|PId-8 (000001000) | + |PId-9 (000001000) |, 0 o cm 1 cm 00 cm cm £ CO CO CO CD 5H CD CO with initial conditions |PIg (000001000) | = 1, |PIio (000001000) | = |PIn (000001000) | = |PIi2(000001000)| = |PIi3(000001000)| = |PIi4(000001000)| = 0, |PIi5(000001000)| = |PIi6(000001000)| = |PIi7(000001000)| = 1. In Fig. 1 the values of |PId(01110)| and |PId(000001000)| for 5 < d < 55 are plotted. Note that the sequence |P1d(f )| does not need to be monotone, but it seems that starting from some large enough dimension the sequence is strictly increasing. Figure 1: Values of |PId(f)| for f = 01110 and f = 000001000 U a CD U Acknowledgements This work was supported by the research Grants 174010 and 174033 of Serbian Ministry of Education and Science, by the research grant P1-0297 of Ministry of Higher Education, Science and Technology Slovenia, and by Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology grant 2011-0025319. U a CO CD $H CD CO u a CD U References [1] D. Baccherini, D. Merlini, R. Sprugnoli, Binary words excluding a pattern and proper Riordan arrays, Discrete Math. 307 (2007) 1021-1037. [2] W.-J. Hsu, Fibonacci cubes—a new interconnection technology, IEEE Trans. Parallel Distrib. Syst. 4 (1993) 3-12. [3] A. Ilic, S. Klavzar, Y. Rho, Generalized Fibonacci cubes, Discrete Math. 312 (2012) 2-11. [4] S. KlavZar, Structure of Fibonacci cubes: a survey, to appear in J. Comb. Optim., DOI: 10.1007/s10878-011-9433-z. CM [5] S. KlavZar, S. Shpectorov, Asymptotic number of isometric generalized Fibonacci cubes, European J. Combin. 33 (2012) 220-226. CM [6] J. Liu, W.-J. Hsu, M. J. Chung, Generalized Fibonacci cubes are mostly Hamil-tonian, J. Graph Theory 18 (1994) 817-829. [7] D. Merlini, R. Sprugnoli, Algebraic aspects of some Riordan arrays related to binary words avoiding a pattern, Theoret. Comput. Sci. 412 (2011) 2981-3001. [8] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, Published electronically at http://oeis.org. N. Zagaglia Salvi, On the existence of cycles of every even length on generalized Fibonacci cubes, Matematiche (Catania) 51 (1996) 241-251.