IMFM, Institute of Mathematics, Physics and Mechanics, Jadranska 19, 1000 Ljubljana, Slovenia
Preprint series, Vol. 48 (2010), 1119
ISSN 2232-2094

ASYMPTOTIC NUMBER OF ISOMETRIC GENERALIZED FIBONACCI CUBES

Sandi Klavžar, Sergey Shpectorov

Ljubljana, May 21, 2010

Asymptotic number of isometric generalized Fibonacci cubes

Sandi Klavžar
Faculty of Mathematics and Physics, University of Ljubljana, Slovenia
and
Faculty of Natural Sciences and Mathematics, University of Maribor, Slovenia
e-mail: sandi.klavzar@fmf.uni-lj.si

Sergey Shpectorov
School of Mathematics, University of Birmingham, United Kingdom
e-mail: s.shpectorov@bham.ac.uk

Abstract

For a binary word f, let Q_d(f) be the subgraph of the d-dimensional cube Q_d induced on the set of all words that do not contain f as a factor. Let G_n be the set of words f of length n that are good in the sense that Q_d(f) is isometric in Q_d for all d. It is proved that lim_{n→∞} |G_n|/2^n exists. Estimates show that the limit is close to 0.08, that is, about eight percent of all words are good.

Key words: hypercube; generalized Fibonacci cube; isometric embedding; combinatorics on words

AMS subject classifications: 05A16, 05C12, 68R15

1 Introduction

The n-cube Q_n, also known as the n-dimensional hypercube, has the binary strings of length n as its vertices. Hence it is not surprising that the n-cubes form one of the most studied classes of graphs; that they are one of the fundamental models used in computer science, in particular for designing interconnection networks; and that they are applicable elsewhere. The edges of Q_n intrinsically capture the (binary) Hamming distance between the binary strings. Consequently, hypercubes form one of the central classes in metric graph theory.

However, a serious limitation of hypercubes for practical (and theoretical) purposes is that they are very rare: there is only one such graph for each power of two. Hence different modifications of hypercubes and different subgraphs of hypercubes were proposed such that the important properties (for whatever purpose) of hypercubes are preserved. The most important such class of graphs, introduced by Graham and Pollak [5] back in 1971, is the class of partial cubes, the graphs defined as isometric subgraphs of hypercubes. These graphs have metric properties analogous to those of hypercubes and can thus be equally well used for routing and similar tasks.

Fibonacci cubes, introduced by Hsu [8] in 1993, are the graphs obtained from n-cubes by removing all vertices that contain two consecutive ones. In this way significantly smaller graphs than hypercubes are obtained, but they still have nice "interconnection" properties, cf. [8]. Moreover, Fibonacci cubes are partial cubes, see [10], and have been studied from numerous points of view [6, 8, 11, 14, 17, 18].
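The name comes from the well-known fact that the number of binary strings of length d avoiding two consecutive ones is the Fibonacci number F_{d+2} (with F_1 = F_2 = 1). The following minimal Python sketch (ours, not part of the paper; the helper name vertices is our own choice) lists the vertices of Q_d(f) for a forbidden factor f and confirms the Fibonacci counts for f = 11.

```python
from itertools import product

def vertices(d, f):
    """Binary words of length d (as strings) that avoid the factor f."""
    return [w for w in (''.join(b) for b in product('01', repeat=d)) if f not in w]

# |V(Q_d(11))| = F_{d+2}: 2, 3, 5, 8, 13, 21, 34 for d = 1, ..., 7.
counts = [len(vertices(d, '11')) for d in range(1, 8)]
print(counts)                       # [2, 3, 5, 8, 13, 21, 34]
assert all(counts[i] + counts[i + 1] == counts[i + 2] for i in range(len(counts) - 2))
```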
Instead of removing all vertices that contain the factor 11, one may remove all vertices that contain an arbitrary fixed factor f. We say that the word f is good if the subgraph Q_d(f), obtained from Q_d by removing the vertices with factor f, is isometric in Q_d for all d. Note that if Q_d(f) is not isometric in Q_d then the same holds for all d' > d, hence for a bad word f the subgraph Q_d(f) can be isometric in Q_d only for a finite number of dimensions d. Denoting by G_n the set of good words of length n and by B_n the set of bad words of length n, we are interested in the asymptotic behavior of |G_n|/2^n and |B_n|/2^n. In other words, if the words are long, is the number of good/bad words negligible compared to the number of all words? We prove that neither of these is the case by demonstrating that there is a considerable number of both good and bad words. In fact, we show that, as the length n goes to infinity, the proportion of good words has a limit strictly between 0 and 1 (and we also provide more precise estimates for the limit). Thus the generalized Fibonacci cubes Q_d(f) for good words f constitute a large new explicit family of partial cubes; see [1, 2, 3, 4, 15] for a sample of other classes of partial cubes.

In order to estimate the limit proportion of good/bad words we will study r-error overlaps of words and demonstrate that, in particular, 2-error overlaps play a crucial role. Note that a similar concept has been studied before: the words that admit a 0-error overlap are known in the literature as non-bifix-free words, also called bordered words, see [7, 13]. The numbers of such words of length n are given as sequence A094536 in [16], while A003000 of [16] gives the numbers of bifix-free words.

The rest of the paper is organized as follows. In the next section the concepts needed in this paper are introduced and some preliminary observations are given. In Section 3 we introduce words with an r-error overlap. We focus on the words called stutters, defined as words that have an r-error overlap, r ≤ 2, of length at least half the length of the word. We show that the proportion of stutters among all words becomes negligible for large n. Then, in Section 4, we prove that the density of the set of all words of length n having a 2-error overlap converges to a limit value α. In the following section we prove that every bad word has a 2-error overlap, and from this fact we deduce that the density of bad words converges to α as well. In Section 6 we then prove that α lies between 0.919975 and 0.924156.

2 Preliminaries

Let B = {0, 1} and call the elements of B bits. An element of B^d is called a binary word (or simply a word) of length d. A word f is a factor of a word x if f appears as a sequence of |f| consecutive bits of x. We will use product notation for words, meaning concatenation; for example, 1^s means 11⋯1, the word of length s.

Recall that all words of length d ≥ 1 are the vertices of the d-dimensional cube Q_d, where two words are adjacent whenever they differ in exactly one position. For a word f of length n, let Q_d(f) be the subgraph of Q_d induced on the set of all words of length d that do not contain the factor f, see [9]. The graph Q_d(f) is called the generalized Fibonacci cube defined by the forbidden word f. In this notation, Q_d(11), d ≥ 1, are the Fibonacci cubes. (We note that the name "generalized Fibonacci cubes" was earlier used for the restricted family Q_d(1^s), s ≥ 1, see [12].)

We will consider the usual shortest path distance and write d_G(u, v) for the distance between u and v in a graph G. Recall that a subgraph H of a graph G is called isometric if the distance between any two vertices of H is independent of whether it is computed in H or in the entire graph G. We will write H ↪ G to denote that H is an isometric subgraph of G.

Call a word f bad if there exists a dimension d such that Q_d(f) is not isometric in Q_d. For instance, it is proved in [9] that (10)^s 1 (s ≥ 1) and 1^r 0^s (r, s ≥ 2) are bad words.
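For small dimensions the definition of good and bad words can be tested directly by brute force. The following Python sketch (ours; avoiding_words and is_isometric are our own helper names) builds Q_d(f) and compares distances inside it with Hamming distances; for example it reports that Q_4(11) is isometric in Q_4 while Q_4(101) is not, in line with the examples above.

```python
from itertools import product
from collections import deque

def avoiding_words(d, f):
    """Binary words of length d (as strings) that avoid the factor f."""
    return [w for w in (''.join(b) for b in product('01', repeat=d)) if f not in w]

def is_isometric(d, f):
    """Return True iff Q_d(f) is an isometric subgraph of Q_d, i.e. iff for every
    pair of avoiding words the distance inside Q_d(f) equals their Hamming distance."""
    verts = avoiding_words(d, f)
    vset = set(verts)

    def neighbours(w):
        return [w[:i] + ('1' if w[i] == '0' else '0') + w[i + 1:] for i in range(d)]

    for u in verts:
        dist = {u: 0}                      # BFS distances inside Q_d(f)
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y in neighbours(x):
                if y in vset and y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        for v in verts:
            if dist.get(v) != sum(a != b for a, b in zip(u, v)):
                return False
    return True

print(is_isometric(4, '11'))    # True  (Fibonacci cube)
print(is_isometric(4, '101'))   # False (101 = (10)^1 1 is a bad word)
```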
Lemma 2.1 Suppose that Q_d(f) is not isometric in Q_d for some dimension d. Then Q_{d'}(f) is not isometric in Q_{d'} for all d' > d.

Proof. Let d' = d + r, r ≥ 1. Since Q_d(f) is not isometric in Q_d, there exist vertices u, v of Q_d(f) such that d_{Q_d(f)}(u, v) > d_{Q_d}(u, v). If the first bit of f is 1, set x = 0^r u and y = 0^r v, otherwise set x = 1^r u and y = 1^r v. Then x, y ∈ Q_{d'}(f) and

d_{Q_{d'}(f)}(x, y) = d_{Q_d(f)}(u, v) > d_{Q_d}(u, v) = d_{Q_{d'}}(x, y),

hence Q_{d'}(f) is not isometric in Q_{d'}. ■

So for every bad f there exists a smallest dimension d = d(f) for which non-isometricity holds. If we are not interested in the exact value of d(f), we can always speak of a sufficiently large dimension d.

The word f is called good if it is not bad. That is, f is good if Q_d(f) ↪ Q_d for all dimensions d. For instance, 1^s, 10^s, and (10)^s are good words for any s ≥ 1, see [9]; hence in particular the Fibonacci cube Q_d(11) isometrically embeds into Q_d. Set finally

G_n = {f ∈ B^n : f is good}  and  B_n = {f ∈ B^n : f is bad}.

3 r-Error overlaps and stutters

For a word f of some length n, let f_{s,k} be the factor of f of length k starting from position s + 1. Here, clearly, k ≤ n and s ∈ {0, 1, ..., n − k}. For example, b_k(f) := f_{0,k} is the beginning of f of length k. Similarly, e_k(f) := f_{n−k,k} is the end part of f of the same length k. Suppose that, for some k with 1 ≤ k < n, the words b_k(f) and e_k(f) agree in all but r positions. Then we say that f has an r-error overlap of length k. If f has an r-error overlap for some length k then we simply say that f has an r-error overlap. We say that f is a stutter if f has an r-error overlap of length k, where r ≤ 2 and k ≥ n/2. Let S_n be the set of all stutters of length n.

Lemma 3.1 The proportion of stutters among all words of length n tends to zero as n goes to infinity. That is, lim_{n→∞} |S_n|/2^n = 0.

Proof. Suppose n/2 ≤ k < n and let s(k) be the number of stutters f ∈ S_n that have an r-error overlap of length exactly k for some r ≤ 2. It is easy to see that such a word f is fully specified by the r locations of the "errors" within b_k(f) and by the bits in the last n − k positions of f. The number of choices of r ≤ 2 positions within the k positions of b_k(f) is 1 + k + k(k−1)/2 = (k^2 + k + 2)/2. Hence s(k) ≤ (k^2 + k + 2) 2^{n−k−1}. We can now estimate the number of stutters from above as

|S_n| ≤ Σ_{k=⌈n/2⌉}^{n−1} s(k) ≤ Σ_{k=⌈n/2⌉}^{n−1} (k^2 + k + 2) 2^{n−k−1} ≤ n (2n^2) 2^{n/2−1} = n^3 2^{n/2}.

Therefore the density of stutters tends to zero, since |S_n|/2^n ≤ n^3 2^{n/2}/2^n = n^3 2^{−n/2}. ■

4 2-Error overlaps

In this section we study the density of the set T_n of all words of length n having a 2-error overlap. We say that a word f ∈ T_n is split if f has a 2-error overlap of length k ≤ n/2. In this case the beginning and the end part of f that realize the 2-error overlap are disjoint. Let T_n^s be the set of all split words from T_n.

Lemma 4.1 If f ∈ T_n \ T_n^s then f is a stutter.

Proof. Follows directly from the definitions. ■

The next result allows us to conclude that the set of split words with a 2-error overlap has a limit density as the length n goes to infinity.

Proposition 4.2 We have |T_{n+1}^s| ≥ 2 |T_n^s|.

Proof. Consider the mapping φ : B^{n+1} → B^n defined by erasing the bit in position ⌊n/2⌋ + 1 = ⌈(n+1)/2⌉. Clearly, if f ∈ B^{n+1} and φ(f) has a 2-error overlap of some length k ≤ n/2, then f also has a 2-error overlap of the same length k. In particular, φ^{−1}(T_n^s) ⊆ T_{n+1}^s. Since every fiber of this mapping has size two, the claim follows. ■

Corollary 4.3 The sequence α_n = |T_n^s|/2^n is monotonically increasing and bounded from above by 1. In particular, it has a limit α ≤ 1.
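The densities α_n can be computed by exhaustive enumeration for small n. The sketch below (ours; exponential in n and meant only as an illustration of the definition) should reproduce the first entries of Table 1, for example α_4 = 0.25.

```python
from itertools import product

def has_split_2_error_overlap(f):
    """True iff some prefix b_k(f) and suffix e_k(f) with k <= n/2
    differ in exactly two positions (a 'split' 2-error overlap)."""
    n = len(f)
    for k in range(2, n // 2 + 1):
        mismatches = sum(a != b for a, b in zip(f[:k], f[n - k:]))
        if mismatches == 2:
            return True
    return False

def alpha(n):
    """Density alpha_n = |T_n^s| / 2^n of split words among all words of length n."""
    words = (''.join(b) for b in product('01', repeat=n))
    return sum(map(has_split_2_error_overlap, words)) / 2 ** n

print(alpha(4))   # 0.25, the first entry of Table 1
print(alpha(6))   # 0.53125
```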
In fact, this number α is also the limit density of the words with a 2-error overlap.

Corollary 4.4 The sequence β_n = |T_n|/2^n converges to the same limit value α.

Proof. Let γ_n = |S_n|/2^n. According to Lemma 3.1, the sequence γ_n tends to zero. Hence both α_n and α_n + γ_n converge to the same limit α. On the other hand, clearly α_n ≤ β_n, since T_n^s ⊆ T_n, and also β_n ≤ α_n + γ_n by Lemma 4.1. So the claim follows. ■

5 Density of bad words

In this section we establish that the density of the set of bad words also tends to the same limit value α. For this purpose the following result is crucial.

Theorem 5.1 If f is bad then f has a 2-error overlap.

Proof. Suppose f is bad and choose d sufficiently large, so that Q_d(f) is not isometric in Q_d. Let w and w' be vertices of Q_d(f) such that d_{Q_d(f)}(w, w') > d_{Q_d}(w, w'). We will assume that m = d_{Q_d}(w, w') is as small as possible. Clearly, m ≥ 2. Let i_1 < ⋯ < i_m be the positions in which w and w' differ. [...]

Note that, by Theorem 5.1, every word without a 2-error overlap is good. For instance, 1^s, 10^s, and (10)^s have no 2-error overlap and are therefore good. To prove these facts much more work was needed in [9].

According to Theorem 5.1, B_n ⊆ T_n. We next show that in fact B_n nearly coincides with T_n.

Proposition 5.2 If f has a 2-error overlap and it is good, then f is a stutter.

Proof. Let n be the length of f and suppose it has a 2-error overlap of length k. Then b_k(f) and e_k(f) disagree in some two positions i and j (out of k). Set d = 2n − k and let w, w' ∈ B^d be defined as follows. Let w be b_{n−k}(f) followed by f(i) and, similarly, let w' be the same b_{n−k}(f) followed by f(j); here u(p) denotes the word obtained from a word u by complementing its bit in position p. It is easy to see that w and w' disagree in positions n − k + i and n − k + j. So they are at distance two in Q_d. Notice, furthermore, that the two common neighbours of w and w' in Q_d are the words w(n − k + i) = w'(n − k + j) and w(n − k + j) = w'(n − k + i). Both these words contain f; indeed, w(n − k + i) contains f as its end part (suffix of length n), while w(n − k + j) contains f as its beginning (prefix of length n). In particular, the common neighbours of w and w' are not contained in Q_d(f). Since f is good, Q_d(f) ↪ Q_d, which means that either w ∉ Q_d(f) or w' ∉ Q_d(f).

We can now show that f is a stutter. In view of the symmetry between i and j (and hence between w and w') we may assume that w ∉ Q_d(f), that is, w contains a factor f. This factor overlaps with either b_n(w) or with e_n(w) (or with both) in s ≥ n/2 positions. As both of these words differ from f in one position, we conclude that f has an r-error overlap of length s with r ≤ 1 (in fact, one can see that r = 1). As s ≥ n/2, we have that f is a stutter, as claimed. ■

As a corollary we obtain that the limit density of bad words exists and is equal to the same α as in the preceding section. Let δ_n = |B_n|/2^n.

Corollary 5.3 The density δ_n converges to α, that is, lim_{n→∞} δ_n = α.

Proof. By Theorem 5.1, δ_n ≤ β_n. On the other hand, by Proposition 5.2, δ_n ≥ β_n − γ_n. Since γ_n tends to zero, it follows that the sequence δ_n has the same limit as β_n, namely α. ■
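The construction used in the proof of Proposition 5.2 is explicit and easy to reproduce. The following sketch (our illustration; the helper names are ours and positions are 0-based) builds the pair w, w' from a 2-error overlap of f. For the bad word 101 it yields the vertices 1001 and 1111 of Q_4(101), whose two common neighbours in Q_4 both contain 101, so the distance between them inside Q_4(101) exceeds two; this is the kind of distance-2 witness discussed in the remark below.

```python
def two_error_overlaps(f):
    """All (k, [i, j]) such that the prefix b_k(f) and the suffix e_k(f)
    differ in exactly the two (0-based) positions i and j."""
    n = len(f)
    return [(k, diff) for k in range(2, n)
            if len(diff := [p for p in range(k) if f[p] != f[n - k + p]]) == 2]

def flip(word, p):
    """Complement the bit of 'word' in (0-based) position p."""
    return word[:p] + ('1' if word[p] == '0' else '0') + word[p + 1:]

def witness_pair(f, k, i, j):
    """The pair w, w' from the proof of Proposition 5.2: w = b_{n-k}(f) f(i),
    w' = b_{n-k}(f) f(j), both of length d = 2*len(f) - k."""
    prefix = f[:len(f) - k]
    return prefix + flip(f, i), prefix + flip(f, j)

f = '101'                          # (10)^1 1, a bad word by [9]
k, (i, j) = two_error_overlaps(f)[0]
w, w2 = witness_pair(f, k, i, j)
print(w, w2)                       # 1001 1111: both avoid 101, Hamming distance 2
diff = [p for p in range(len(w)) if w[p] != w2[p]]
print([flip(w, p) for p in diff])  # ['1101', '1011']: the common neighbours,
                                   # each containing 101 as a factor
```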
Before we leave this section, we remark that the proof of Proposition 5.2 has a further consequence. Suppose f is a bad word, but it is not a stutter. By Theorem 5.1, f has a 2-error overlap of some length k. Set d = 2n − k and select vertices w and w' as in the proof of Proposition 5.2. Since f is not a stutter, the final argument in that proof shows that both w and w' are contained in Q_d(f), while, again as in the proof, the two common neighbours of w and w' are not. That is, we have 2 = d_{Q_d}(w, w') < d_{Q_d(f)}(w, w'). This means that the badness of nearly every bad word f can be established in a cube Q_d, with d < 2n, by a pair of vertices at distance two from each other. This confirms observations from [9] where good and bad words of small length were considered. Namely, in most of the cases for which non-isometricity was established, the distance condition was shown to fail for distance 2. However, there exist bad words f for which distance 2 is preserved in Q_d(f) but Q_d(f) is still not isometric in Q_d.

6 Estimates for the limit value α

We now derive some estimates for the limit density α. Estimates from below are easy: we can use the fact that the sequence α_n (the densities of split words with a 2-error overlap) is monotonically increasing. Furthermore, it is easy to see that α_{2k+1} = α_{2k}, so we only need to consider even n. Manifestly, α_4 = 0.25. With the use of a computer a few further values were found, see Table 1. In particular, α_30 = 0.919975..., and hence we can say that α ≥ 0.919975. So over 90% of all long words are bad. In fact, the data in Table 1 seem to suggest that the value of α is close to 0.92.

n     |T_n^s|       α_n = |T_n^s|/2^n
4     4             0.250000
6     34            0.531250
8     182           0.710938
10    830           0.810547
12    3518          0.858887
14    14538         0.887329
16    59074         0.901398
18    238534        0.909935
20    958714        0.914301
22    3845886       0.916931
24    15408114      0.918395
26    61689006      0.919238
28    246881258     0.919704
30    987815218     0.919975

Table 1: Some values of α_n

The starting values of the sequence β_n (involving the words of length n having a 2-error overlap) are shown in Table 2. By Corollary 4.4 this sequence also converges to α. Note, however, that it is not monotone.

To get an upper bound for α we use similar ideas as above. Again it suffices to consider words of even length n = 2k. Let U_n be the number of nonsplit words of length n. If w is such a word, then inserting two new bits in the middle produces a word of length n + 2 which is either again nonsplit or has a 2-error overlap of length exactly k + 1. The number of words of the latter sort is 2^{k+1} · k(k+1)/2, because we can choose the first k + 1 bits arbitrarily and then the second half must be the same as the first half but with two of the k + 1 positions changed. Therefore we can write

4 U_n ≤ U_{n+2} + 2^{k+1} · k(k+1)/2.

Switching to the densities ν_n = U_n/2^n (notice that ν_n = 1 − α_n) we get

ν_n ≤ ν_{n+2} + k(k+1)/(2·2^{k+1}),

which implies that

ν_{n+2} ≥ ν_n − k(k+1)/(2·2^{k+1}).

Since α_n = 1 − ν_n we get

α_{n+2} ≤ α_n + k(k+1)/(2·2^{k+1}).

Combining these relations from n to n + 2m we get

α_{n+2m} ≤ α_n + Σ_{i=k}^{k+m−1} i(i+1)/(2·2^{i+1}).

Now we let m go to infinity and obtain the following estimate for the limit α:

α ≤ α_n + Σ_{i=k}^{∞} i(i+1)/(2·2^{i+1}).

Using a computer with k = 15 we get that the sum here is at most 0.004181. Together with the value α_30 = 0.919975 from Table 1 this yields the upper bound 0.924156. Hence we have the following result:

Theorem 6.1 The value of the limit α is between 0.919975 and 0.924156.

As a consequence we see that for large n the number of good words is approximately 8% of all words of that length.
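The numerical part of this upper bound is easy to check. The sketch below (ours) evaluates the tail of the series for k = 15 and adds it to the value of α_30 taken from Table 1.

```python
def tail(k, terms=200):
    """Partial sum of sum_{i >= k} i*(i+1) / (2 * 2**(i+1)); the series
    decreases geometrically, so 200 terms are far more than enough."""
    return sum(i * (i + 1) / (2 * 2 ** (i + 1)) for i in range(k, k + terms))

alpha_30 = 987815218 / 2 ** 30        # = 0.919975..., last entry of Table 1
print(tail(15))                       # about 0.004181
print(alpha_30 + tail(15))            # about 0.924156, the bound in Theorem 6.1
```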
n     |T_n|        β_n = |T_n|/2^n
4     8            0.500000
5     22           0.687500
6     46           0.718750
7     98           0.765625
8     210          0.820313
9     430          0.839844
10    886          0.865234
11    1790         0.874023
12    3638         0.888184
13    7350         0.897217
14    14830        0.905151
15    29758        0.908142
16    59802        0.912506
17    119802       0.914017
18    240362       0.916908
19    480966       0.917370
20    963302       0.918676
21    1927382      0.919047
22    3857746      0.919758
23    7715446      0.919753
24    15437078     0.920122
25    30873042     0.920088
26    61759618     0.920290
27    123512490    0.920240
28    247051278    0.920338
29    494077866    0.920292
30    988213906    0.920346

Table 2: Some values of β_n

Acknowledgements

We are grateful to Ciril Petr and Marko Petkovšek for the computation of the data in Tables 1 and 2, and to Marko also for pointing out the connection with bifix-free words. The work of the first author was supported by the Research Grant P1-0297 of the Ministry of Higher Education, Science and Technology of Slovenia. He is also grateful to the University of Birmingham for the hospitality during his stay there.

References

[1] L. Beaudou, S. Gravier, K. Meslem, Isometric embeddings of subdivided complete graphs in the hypercube, SIAM J. Discrete Math. 22 (2008) 1226-1238.

[2] B. Brešar, Characterizing almost-median graphs, European J. Combin. 28 (2007) 916-920.

[3] B. Brešar, W. Imrich, S. Klavžar, H. M. Mulder, R. Škrekovski, Tiled partial cubes, J. Graph Theory 40 (2002) 91-103.

[4] D. Eppstein, Cubic partial cubes from simplicial arrangements, Electron. J. Combin. 13 (2006) #R79, 14 pp.

[5] R. L. Graham, H. Pollak, On the addressing problem for loop switching, Bell System Tech. J. 50 (1971) 2495-2519.

[6] P. Gregor, Recursive fault-tolerance of Fibonacci cube in hypercubes, Discrete Math. 306 (2006) 1327-1341.

[7] L. J. Guibas, A. M. Odlyzko, Periods in strings, J. Combin. Theory Ser. A 30 (1981) 19-42.

[8] W.-J. Hsu, Fibonacci cubes—a new interconnection topology, IEEE Trans. Parallel Distrib. Syst. 4 (1993) 3-12.

[9] A. Ilić, S. Klavžar, Y. Rho, Generalized Fibonacci cubes, manuscript, 2009.

[10] S. Klavžar, On median nature and enumerative properties of Fibonacci-like cubes, Discrete Math. 299 (2005) 145-153.

[11] S. Klavžar, P. Žigert, Fibonacci cubes are the resonance graphs of Fibonaccenes, Fibonacci Quart. 43 (2005) 269-276.

[12] J. Liu, W.-J. Hsu, M. J. Chung, Generalized Fibonacci cubes are mostly Hamiltonian, J. Graph Theory 18 (1994) 817-829.

[13] M. Lothaire, Algebraic Combinatorics on Words, Encyclopedia of Mathematics and its Applications 90, Cambridge University Press, Cambridge, 2002.

[14] E. Munarini, N. Zagaglia Salvi, Structural and enumerative properties of the Fibonacci cubes, Discrete Math. 255 (2002) 317-324.

[15] N. Polat, Netlike partial cubes I. General properties, Discrete Math. 307 (2007) 2704-2722.

[16] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, http://www.research.att.com/~njas/sequences/

[17] A. Taranenko, A. Vesel, Fast recognition of Fibonacci cubes, Algorithmica 49 (2007) 81-93.

[18] H. Zhang, L. Ou, H. Yao, Fibonacci-like cubes as Z-transformation graphs, Discrete Math. 309 (2009) 1284-1293.