ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.01
https://doi.org/10.26493/2590-9770.1252.e71
(Also available at http://adam-journal.eu)

Cograph editing: Merging modules is equivalent to editing P4s∗

Adrian Fritz
Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, D-38124 Braunschweig

Marc Hellmuth
School of Computing, University of Leeds, EC Stoner Building, Leeds LS2 9JT, England

Peter F. Stadler †
Bioinformatics Group, Department of Computer Science, Universität Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany

Nicolas Wieseke
Swarm Intelligence and Complex Systems Group, Department of Computer Science, Leipzig University, Augustusplatz 10, D-04109 Leipzig, Germany

Received 23 May 2018, accepted 10 September 2019, published online 27 March 2020

Abstract

The modular decomposition of a graph G = (V,E) does not contain prime modules if and only if G is a cograph, that is, if no quadruple of vertices induces a simple connected path P4. The cograph editing problem consists in inserting into and deleting from G a set F of edges so that H = (V, E △ F) is a cograph and |F| is minimum. This NP-hard combinatorial optimization problem has recently found applications, e.g., in the context of phylogenetics. Efficient heuristics are hence of practical importance. The simple characterization of cographs in terms of their modular decomposition suggests that instead of editing G one could operate directly on the modular decomposition. We show here that editing the induced P4s is equivalent to resolving prime modules by means of a suitably defined merge operation on the submodules. Moreover, we characterize so-called module-preserving edit sets and demonstrate that optimal pairwise sequences of module-preserving edit sets exist for every non-cograph. This eventually leads to an exact algorithm for the cograph editing problem as well as fixed-parameter tractable (FPT) results when cograph editing is parameterized by the so-called modular-width. In addition, we provide two heuristics with time complexity O(|V|^3), resp., O(|V|^2).

Keywords: Cograph editing, modular decomposition, module merge, prime modules, P4.
Math. Subj. Class. (2020): 05C75, 05C05, 92B10

∗We thank the anonymous referees for their helpful comments. This work was funded in part by the Deutsche Forschungsgemeinschaft (DFG), Proj. Nr. 432974470 (to PFS).
†PFS is also affiliated with the Interdisciplinary Center for Bioinformatics, the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, the Competence Center for Scalable Data Services and Solutions Dresden-Leipzig, the Leipzig Research Center for Civilization Diseases, and the Centre for Biotechnology and Biomedicine at Leipzig University; the Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; the Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria; the Center of noncoding RNA in Health and Technology (RTH) at the University of Copenhagen; Facultad de Ciencias of the National University of Colombia in Bogotá, Colombia; and the Santa Fe Institute, Santa Fe, NM.
This work is licensed under https://creativecommons.org/licenses/by/4.0/

1 Introduction

Cographs are of particular interest in computer science because many combinatorial optimization problems that are NP-complete for arbitrary graphs become polynomial-time solvable on cographs [4, 8, 20]. This makes them an attractive starting point for constructing heuristics that are exact on cographs and yield approximate solutions on other graphs.
In this context it is of considerable practical interest to determine “how close” an input graph is to a cograph.

An independent motivation recently arose in biology, more precisely in molecular phylogenetics [14, 21, 35, 36, 37, 47]. In particular, orthology, a key concept in evolutionary biology and phylogenetics, is intimately tied to cographs [35]. Two genes in a pair of related species are said to be orthologous if their last common ancestor was a speciation event. The orthology relation on a set of genes forms a cograph [30], see [33] for a detailed discussion and [21, 22, 23, 31, 47] for generalizations of these concepts. This relation can be estimated directly from biological sequence data, albeit in a necessarily noisy form. Correcting such an initial estimate to the nearest cograph thus has recently become a computational problem of considerable practical interest in computational biology [35]. However, the (decision version of the) problem to edit a given graph with a minimum number of edits into a cograph is NP-complete [32, 34, 38, 39].

As noted already in [7], the input for several combinatorial optimization problems, such as exam scheduling or several variants of clustering problems, is naturally expected to have few induced paths on four vertices (P4s). Since graphs without an induced P4 are exactly the cographs, available cograph editing algorithms focus on efficiently removing P4s, see e.g. [16, 24, 25, 38, 39, 53]. The FPT-algorithm introduced in [38, 39] takes as input a graph that is first edited to a so-called P4-sparse graph and then to a cograph. The basic strategy is to destroy the P4s in the subgraphs by branching into six cases, which eventually leads to an O(4.612^k |V|^{9/2})-time algorithm, where k is the number of required edits.
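Since the algorithms above all revolve around locating induced P4s, the following minimal sketch illustrates the P4-free characterization by brute force. It is not taken from any of the cited algorithms; the adjacency representation (a dict mapping each vertex to the set of its neighbors) and the function names `induces_p4`, `find_p4s` and `is_cograph` are assumptions made for illustration only.

```python
from itertools import combinations

def induces_p4(adj, quad):
    """True if the four vertices in `quad` induce a path on four vertices.

    Among all graphs on four vertices, the P4 is the only one whose
    degree sequence is (1, 1, 2, 2).
    """
    inside = set(quad)
    degrees = sorted(len(adj[u] & inside) for u in quad)
    return degrees == [1, 1, 2, 2]

def find_p4s(adj):
    """All vertex quadruples inducing a P4 (O(|V|^4) brute force)."""
    return [q for q in combinations(sorted(adj), 4) if induces_p4(adj, q)]

def is_cograph(adj):
    """A graph is a cograph if and only if it has no induced P4."""
    return not find_p4s(adj)

# Hypothetical toy inputs: a path, a 4-cycle and a 5-cycle.
path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
cycle5 = {i: {(i + 1) % 5, (i - 1) % 5} for i in range(5)}

print(find_p4s(path4))       # the path itself is the only P4
print(is_cograph(cycle4))    # a 4-cycle is P4-free
print(len(find_p4s(cycle5))) # removing any vertex of a 5-cycle leaves a P4
```

The quartic enumeration is only meant to make the definition concrete; linear-time recognition requires the dedicated algorithms cited in the text.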
Algorithms that compute the kernel of the (parameterized) cograph editing problem [24, 25] as well as the exact O(3^{|V|} |V|)-time algorithm [53] use the modular-decomposition tree as a guide to locate the forbidden P4s using the fact that these are associated with prime modules. Nevertheless, the basic operation in all of these algorithms is still the direct destruction of the P4s.

E-mail addresses: adrian.fritz@helmholtz-hzi.de (Adrian Fritz), mhellmuth@mailbox.org (Marc Hellmuth), studla@bioinf.uni-leipzig.de (Peter F. Stadler), wieseke@informatik.uni-leipzig.de (Nicolas Wieseke)

Cographs are recursively defined as follows: K1 is a cograph, the disjoint union of cographs is a cograph, and the join of cographs is a cograph. This recursive definition associates a vertex labeled tree, the cotree, with each cograph, where a vertex label “0” denotes a disjoint union, while “1” indicates that the join of the children is formed. It has been shown in [7] that each cograph has a unique cotree and conversely, every tree whose interior vertices are labeled alternatingly defines a unique cograph. A simple recognition algorithm starts with an input graph G. If G is connected, then a node labeled “1” is inserted into the tree, the complement graph $\overline{G}$ is formed and the algorithm proceeds recursively on the connected components of $\overline{G}$. If G is not connected, the tree-node is labeled “0”, and the algorithm recurses on the components of G. If both G and $\overline{G}$ are connected, then G is not a cograph, and the algorithm terminates with a negative answer. A natural heuristic for cograph editing proceeds by finding a minimal cut in G or $\overline{G}$, removing the cut-edges, and proceeding with the modified graph. This idea is pursued in [14, 15].

A very different heuristic for cograph modification was recently proposed by Crespelle [11]. It corrects the neighborhood of each vertex separately.
More precisely, an inclusion-minimal cograph editing Hi of the induced subgraph Gi := G[{x1, …, xi}] is computed from the correction Hi−1 of Gi−1 in such a way that only edges involving xi are inserted or deleted. It has the useful property that in each step the number of inserted or deleted edges is minimum, and it inserts or deletes no more than |E(G)| edges in total. It is based on a general property of single-vertex augmentations in hereditary graph classes that are stable under the addition of universal vertices and isolated vertices, see e.g. [48]. A key advantage is that it has linear time complexity, i.e., O(|V| + |E|).

Cotrees are a special case of the much more general modular decomposition tree, which is well-defined for every graph and conveys detailed information about its structure in a hierarchical manner [19]. A subset M ⊆ V is called a module of a graph G = (V,E) if all members of M share the same neighbors in V \ M. A prime module is a module that is characterized by the property that both the induced subgraph G[M] and its complement $\overline{G[M]}$ are connected. Cographs play a particular role in this context as their modular decompositions are of a special form: they are characterized by the absence of prime modules. In particular, the cotree of a cograph coincides with its modular decomposition tree [19]. It is natural to ask, therefore, whether the modular decomposition tree can be manipulated in such a way that all prime modules of a given graph are converted into “series” or “parallel” modules, for which either G[M] or $\overline{G[M]}$ is disconnected. This is equivalent to converting G into a cograph G∗.

Every minimum edit set clearly is inclusion-minimal. However, not every minimum edit set, and in particular not every inclusion-minimal edit set, respects the module structure of G. Figure 1 below shows a pertinent example.
In contrast to the editing approach of [11], we pursue an approach that is module-preserving in the sense that each module of G is also a module of the edited graph G∗. We argue that this property is desirable in the context of orthology detection, because the corrected modular decomposition tree, i.e., the cotree of G∗, has a direct interpretation as an event-labeled gene tree [30, 35].

An alternative way of looking at the connection between cographs and their modular decomposition trees is to interpret the destruction of all P4s in a cograph editing algorithm as the resolution of all prime modules in the edited graph G∗. This simple observation suggests editing the modules of G. The min-cut approach of [14] is one possibility to achieve this. Here, we consider the merging of modules instead. Every union ⋃_{i∈I} Mi of the connected components M1, …, Mk of the edited graph G∗[M] or $\overline{G^*[M]}$ forms a module of G∗, while ⋃_{i∈I} Mi was not a module in the graph G before editing. In this situation, we say that “the modules Mi, i ∈ I of G are merged w.r.t. G∗”. Vertices within a module ⋃_{i∈I} Mi share the same neighbors in V \ (⋃_{i∈I} Mi). It is sufficient therefore to adjust the neighbors of certain submodules Mi of M to merge the Mi in a way that resolves the prime module M to obtain G∗.

In this setting, it seems natural to edit the modular decomposition tree of a graph directly with the aim of converting it step-by-step into the closest modular decomposition tree of a cograph. To this end, one would like to break up individual prime modules by means of the module merge operation. The key results of this contribution are that (1) every prime node M can be resolved by a sequence of pairwise merges of modules that are children of M in the modular decomposition tree, and (2) optimal cograph editing can be expressed as optimal pairwise module merging.
To prove these statements, we start with an overview of important properties of cographs and the modular decomposition (Sections 2 and 3). In Section 4, we then show that so-called module-preserving edit sets are characterized by resolving any prime node by module-merges. In particular, we show that any graph has an optimal edit set that can be entirely expressed by merging modules that are children of prime modules in the modular decomposition tree. Finally, in Section 5, we summarize the results and show how they can be used for establishing efficient heuristics for the cograph editing problem. We provide an exact algorithm that allows us to optimally edit a graph into a cograph via pairwise module-merges. As a by-product, we obtain an FPT algorithm for the case that cograph editing is parameterized by the so-called modular-width [1, 18]. We finish this paper with a short discussion on how the latter method can be used to obtain a simple O(|V|^2)-time heuristic.

2 Basic definitions

We consider simple finite undirected graphs G = (V,E) without loops. The complement $\overline{G}$ of a graph G = (V,E) has vertex set V and edge set E($\overline{G}$) = {xy | x, y ∈ V, x ≠ y, xy ∉ E}. The notation G △ F is used to denote the graph (V, E △ F), where △ denotes the symmetric difference. The disjoint union G ⊍ H of two distinct graphs G = (V,E) and H = (W,F) is simply the graph (V ⊍ W, E ⊍ F). The join G ⊕ H of G and H is defined as the graph (V ⊍ W, E ⊍ F ⊍ {xy | x ∈ V, y ∈ W}). A graph H = (W,F) is a subgraph of a graph G = (V,E), in symbols H ⊆ G, if W ⊆ V and F ⊆ E. If H ⊆ G and xy ∈ F if and only if xy ∈ E for all x, y ∈ W, then H is called an induced subgraph. We will often denote an induced subgraph H = (W,F) by G[W]. A connected component of G is a connected induced subgraph that is maximal w.r.t. inclusion. We write G ≃ H for two isomorphic graphs G and H.

Let G = (V,E) be a graph. The neighborhood N(v) of v ∈ V is defined as N(v) = {x | vx ∈ E}.
If there is a risk of confusion we will write N_G(v) to indicate that the respective neighborhood is taken w.r.t. G. The degree deg(v) of a vertex is defined as deg(v) = |N(v)|.

A tree is a connected graph that does not contain cycles. A path is a tree where every vertex has degree 1 or 2. A rooted tree T = (V,E) is a tree with one distinguished vertex ρ ∈ V. We distinguish two further types of vertices in a tree: the leaves, which are distinct from the root and are contained in only one edge, and the inner vertices, which are contained in at least two edges. The first inner vertex lca(x, y) that lies on both unique paths from two vertices x, resp., y to the root is called the lowest common ancestor of x and y. We say that a rooted tree T displays the triple xy|z if x, y, and z are leaves of T and the path from x to y does not intersect the path from z to the root of T.

It is well-known that there is a one-to-one correspondence between (isomorphism classes of) rooted trees on V and so-called hierarchies on V. For a finite set V, a hierarchy on V is a subset C of the power set P(V) such that (i) V ∈ C, (ii) {x} ∈ C for all x ∈ V and (iii) p ∩ q ∈ {p, q, ∅} for all p, q ∈ C.

Theorem 2.1 ([51]). Let C be a collection of non-empty subsets of V. Then, there is a rooted tree T = (W,E) on V with C = {L(v) | v ∈ W} if and only if C is a hierarchy on V.

3 Cographs and the modular decomposition

3.1 Introduction to cographs

Cographs are defined as the class of graphs formed from a single vertex under the closure of the operations of union and complementation, namely:
(i) a single-vertex graph K1 is a cograph;
(ii) the disjoint union G = (V1 ⊍ V2, E1 ⊍ E2) of cographs G1 = (V1, E1) and G2 = (V2, E2) is a cograph;
(iii) the complement $\overline{G}$ of a cograph G is a cograph.
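Read backwards, rules (i) to (iii) yield a simple recursive recognition procedure: split a disconnected graph into its components, complement a connected one, and give up when both the graph and its complement are connected. The following is a minimal sketch of that idea, not an efficient algorithm; the dict-of-sets representation, the nested-tuple tree encoding and the names `components` and `cotree` are assumptions made for illustration.

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend((adj[u] & vertices) - comp)
        seen |= comp
        comps.append(comp)
    return comps

def cotree(vertices, adj):
    """Cotree of the induced subgraph as nested tuples (label, children).

    Leaves are the vertices themselves. Returns None if the induced
    subgraph is not a cograph. Assumes `vertices` is non-empty.
    """
    vertices = set(vertices)
    if len(vertices) == 1:
        return next(iter(vertices))
    comps, label = components(vertices, adj), "parallel"
    if len(comps) == 1:
        # Connected: look at the components of the complement instead.
        co_adj = {u: vertices - adj[u] - {u} for u in vertices}
        comps, label = components(vertices, co_adj), "series"
        if len(comps) == 1:
            return None  # graph and complement both connected
    children = [cotree(c, adj) for c in comps]
    if any(child is None for child in children):
        return None
    return (label, children)

# Hypothetical toy inputs.
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(cotree(cycle4, cycle4))
print(cotree(path4, path4))
```

For the 4-cycle the sketch returns a “series” root with two “parallel” children, reflecting that the 4-cycle is the join of two edgeless graphs on two vertices each; for the path on four vertices it returns None.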
Condition (ii) can be replaced by the equivalent condition that the join G1 ⊕ G2 is a cograph, since G1 ⊕ G2 is the complement of $\overline{G_1}$ ⊍ $\overline{G_2}$.

The name cograph originates from complement reducible graphs, as by definition, cographs can be “reduced” by stepwise complementation of connected components to totally disconnected graphs [50].

It is well-known that for each induced subgraph H of a cograph G either H is disconnected or its complement $\overline{H}$ is disconnected [4]. This, in particular, allows representing the structure of a cograph G = (V,E) in an unambiguous way as a rooted tree T = (W,F), called cotree: If the considered cograph is the single vertex graph K1, then output the tree ({u}, ∅). Else if the given cograph G is connected, create an inner vertex u in the cotree with label “series”, build the complement $\overline{G}$ and add the connected components of $\overline{G}$ as children of u. If G is not connected, then create an inner vertex u in the cotree with label “parallel” and add the connected components of G as children of u. Proceed recursively on the respective connected components that consist of more than one vertex. Eventually, this cotree will have leaf-set V ⊆ W and the inner vertices u ∈ W \ V are labeled with either “parallel” or “series” such that xy ∈ E if and only if u = lca_T(x, y) is labeled “series”.

The complement of a path on four vertices P4 is again a P4 and hence, such graphs are not cographs. Intriguingly, cographs have indeed a quite simple characterization as P4-free graphs, that is, no four vertices induce a P4. A number of further equivalent characterizations are given in [4] and Theorem 3.2. Determining whether a graph is a cograph can be done in linear time [5, 8].

3.2 Modules and the modular decomposition

The concept of modular decompositions (MD) is defined for arbitrary graphs G and allows us to present the structure of G in the form of a tree that generalizes the idea of cotrees.
However, in general much more information needs to be stored at the inner vertices of this tree if the original graph has to be recovered.

The MD is based on the notion of modules. These are also known as autonomous sets [43, 44], closed sets [19], clans [17], stable sets, clumps [2] or externally related sets [27]. A module of a given graph G = (V,E) is a subset M ⊆ V with the property that for all vertices x, y ∈ M it holds that N(y) \ M = N(x) \ M. Therefore, the vertices within a given module M are not distinguishable by the part of their neighborhoods that lie “outside” M. We denote with MD(G) the set of all modules of G = (V,E). Clearly, the vertex set V and the singletons {v}, v ∈ V are modules, called trivial modules. A graph G is called prime if it only contains trivial modules. For a module M of G and a vertex v ∈ M, we define the out_M-neighborhood of v as N(v) \ M. Since for any two vertices contained in M the out_M-neighborhoods are identical, we can equivalently define N(v) \ M as the out_M-neighborhood of the module M, where v ∈ M. We say that a module M of G is parallel, resp., series if the induced subgraph G[M], resp., the complement $\overline{G[M]}$ is disconnected. If both G[M] and $\overline{G[M]}$ are connected, then M is called prime.

For a graph G = (V,E) let M and M′ be disjoint subsets of V. We say that M and M′ are adjacent (in G) if each vertex of M is adjacent to all vertices of M′; the sets are non-adjacent if none of the vertices of M is adjacent to a vertex of M′. Two disjoint modules are either adjacent or non-adjacent [43]. One can therefore define the quotient graph G/P for an arbitrary subset P ⊆ MD(G) of pairwise disjoint modules: G/P has P as its vertex set and MiMj ∈ E(G/P) if and only if Mi and Mj are adjacent in G.

A module M is called strong if for any module M′ ≠ M either M ∩ M′ = ∅, or M ⊆ M′, or M′ ⊆ M, i.e., a strong module does not overlap any other module.
The set of all strong modules MDs(G) ⊆ MD(G) thus forms a hierarchy, the so-called modular decomposition of G. While arbitrary modules of a graph form a potentially exponential-sized family, the sub-family of strong modules has size O(|V(G)|) [26].

Let P = {M1, …, Mk} be a partition of the vertex set of a graph G = (V,E). If every Mi ∈ P is a module of G, then P is a modular partition of G. A non-trivial modular partition P = {M1, …, Mk} that contains only maximal (w.r.t. inclusion) strong modules is a maximal modular partition. We denote the (unique) maximal modular partition of G by Pmax(G). We will refer to the elements of Pmax(G[M]) as the children of M. This terminology is motivated by the following considerations: The hierarchical structure of MDs(G) gives rise to a canonical tree representation of G, which is usually called the modular decomposition tree TMDs(G) [28, 44]. The root of this tree is the trivial module V and its |V| leaves are the trivial modules {v}, v ∈ V. The set of leaves Lv associated with the subtree rooted at an inner vertex v induces a strong module of G. In other words, each inner vertex v of TMDs(G) represents the strong module Lv. An inner vertex v is then labeled “parallel”, “series”, resp., “prime” if Lv is a parallel, series, resp., prime module. The strong module Lv of the induced subgraph G[Lv] associated to a vertex v labeled “prime” is called prime module. Note, the latter does not imply that the graph G[Lv] is prime, however, in all cases the quotient graph G[Lv]/Pmax(G[Lv]) is prime [28]. Similar to cotrees it holds that xy ∈ E if u = lca_{TMDs(G)}(x, y) is labeled “series”, and xy ∉ E if u = lca_{TMDs(G)}(x, y) is labeled “parallel”. However, to trace back the full structure of a given graph G from TMDs(G) one has to store additionally the information of the subgraph G[Lv]/Pmax(G[Lv]) in the vertices v labeled “prime”.
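The definitions of modules and strong modules can be checked directly on small graphs. The sketch below enumerates all vertex subsets, so it is exponential and only meant to make the definitions concrete; it is not one of the decomposition algorithms from the literature. The dict-of-sets representation and the names `is_module`, `modules`, `overlap` and `strong_modules` are assumptions made for illustration.

```python
from itertools import combinations

def is_module(adj, M):
    """M is a module if all its vertices have the same neighbors outside M."""
    M = set(M)
    outside = [adj[v] - M for v in M]
    return all(nb == outside[0] for nb in outside)

def modules(adj):
    """All non-empty modules, by exhaustive enumeration of vertex subsets."""
    V = sorted(adj)
    return [frozenset(c)
            for r in range(1, len(V) + 1)
            for c in combinations(V, r)
            if is_module(adj, c)]

def overlap(A, B):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(A & B) and not A <= B and not B <= A

def strong_modules(adj):
    """Modules that do not overlap any other module."""
    mods = modules(adj)
    return [M for M in mods if not any(overlap(M, N) for N in mods)]

# Hypothetical toy inputs: a path on four vertices and a triangle.
path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

print(len(modules(path4)))            # only the trivial modules: P4 is prime
print(len(modules(triangle)))         # every non-empty subset is a module
print(len(strong_modules(triangle)))  # but only the trivial ones are strong
```

The triangle illustrates the gap between the two families: all seven non-empty subsets are modules, yet the two-element subsets overlap each other, so only the singletons and the full vertex set are strong.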
Although MDs(G) ⊆ MD(G) does not represent all modules, we state the following remarkable fact [12, 43]: Any subset M ⊆ V is a module if and only if M ∈ MDs(G) or M is the union of children of non-prime modules. Thus, TMDs(G) represents at least implicitly all modules of G.

A simple polynomial time recursive algorithm to compute TMDs(G) is as follows [28]: (1) compute the maximal modular partition Pmax(G); (2) label the root node according to the parallel, series or prime type of G; (3) for each strong module M of Pmax(G), compute TMDs(G[M]) and attach it to the root node and proceed with Pmax(G[M]). The first polynomial time algorithm to compute the modular decomposition is due to Cowan et al. [10], and it runs in O(|V|^4). Improvements are due to Habib and Maurer [27], who proposed a cubic time algorithm, and to Müller and Spinrad [45], who designed a quadratic time algorithm. The first two linear time algorithms appeared independently in 1994 [9, 40]. Since then a series of simplified algorithms has been published, some running in linear time [13, 41, 52], and others in almost linear time [13, 26, 29, 42].

For later reference we give the following lemma.

Lemma 3.1. Let M be a module of a graph G = (V,E) and M′ ⊆ M. Then, M′ is a module of G[M] if and only if M′ is a module of G. If M is a strong module of G, then M′ is a strong module of G[M] if and only if M′ is a strong module of G. Moreover, if M1 and M2 are overlapping modules in G, then M1 \ M2, M1 ∩ M2 and M1 ∪ M2 are also modules in G.

Proof. The first and the last statement were shown in [43]. We prove the second statement. Let M ∈ MDs(G). Assume that M′ ⊆ M is a strong module of G[M]. Assume for contradiction that M′ is not a strong module of G. Hence M′ must overlap some module M′′ in G.
This module M′′ cannot be entirely contained in M as otherwise, M′′ and M′ overlap in G[M], implying that M′ is not a strong module of G[M], a contradiction. But then M and M′′ must overlap, contradicting that M is strong in G.

If M′ ⊆ M is a strong module of G then it does not overlap any module of G. Since every module of G[M] is also a module of G, there cannot be a module of G[M] that overlaps M′ and thus, M′ is a strong module of G[M].

3.3 Useful properties of modular partitions

First, we briefly summarize the relationship between cographs G and the modular decomposition MDs(G). While the first three items are from [4, 7], the proof of the fourth item can be found in [3, 30].

Theorem 3.2 ([4, 7, 30]). Let G = (V,E) be an arbitrary graph. Then the following statements are equivalent.
1. G is a cograph.
2. G does not contain induced paths on four vertices.
3. TMDs(G) is the cotree of G and hence, has no inner vertices labeled with “prime”.
4. Define a set R(G) of triples as follows: For any three vertices x, y, z ∈ V we add the triple xy|z to R(G) if either xz, yz ∈ E and xy ∉ E or xz, yz ∉ E and xy ∈ E. There is a tree T that displays all triples in R(G).

For later explicit reference, we summarize in the next theorem several results that we already implicitly referred to in the discussion above.

Theorem 3.3 ([25, 28, 43]). The following statements are true for an arbitrary graph G = (V,E):
(T1) The maximal modular partition Pmax(G) and the modular decomposition MDs(G) of G are unique.
(T2) Let Pmax(G[M]) be the maximal modular partition of G[M], where M denotes a prime module of G, and let P′ ⊊ Pmax(G[M]) be a proper subset of Pmax(G[M]) with |P′| > 1. Then, ⋃_{M′ ∈ P′} M′ ∉ MD(G).
(T3) Any subset M ⊆ V is a module of G if and only if M is either a strong module of G or M is the union of children of a non-prime module of G.

Statements (T1) and (T3) are clear.
Statement (T2) explains that no union of at least two, but not all, elements of the maximal modular partition of G[M] is a module of G, whenever M is a prime module of G. Moreover, Statement (T3) can be used to show that all prime modules are strong.

Lemma 3.4. Let G = (V,E) be an arbitrary graph. Then, every prime module M of G is strong.

Proof. Let M be a prime module of G. Assume for contradiction that M is not strong in G. Theorem 3.3(T3) implies that M is the union of children of some non-prime module M′. Hence, there is a subset ℳ ⊊ Pmax(G[M′]) such that M = ⋃_{M′i ∈ ℳ} M′i. Note that 1 < |ℳ| < |Pmax(G[M′])|, since all M′i ∈ Pmax(G[M′]) are strong and ⋃_{M′i ∈ Pmax(G[M′])} M′i = M′ is non-prime. As M′ is non-prime, it is either parallel or series. Since M is a non-trivial union of elements in Pmax(G[M′]), G[M] is either disconnected (if M′ is parallel) or its complement $\overline{G[M]}$ is disconnected (if M′ is series). But then M is non-prime; a contradiction. Thus, M is a strong module of G.

In what follows, whenever the term “prime module” is used it refers therefore always to a strong module.

3.4 Cograph editing

Given an arbitrary graph we are interested in understanding how the graph can be edited into a cograph. A well-studied problem is the following optimization problem.

Problem 3.5 (Optimal Cograph Editing). Given a graph G = (V,E). Find a set F ⊆ $\binom{V}{2}$ of minimum cardinality such that H = (V, E △ F) is a cograph.

We will simply call an edit set of minimum cardinality an optimal (cograph) edit set. For later reference we recall Lemma 9 of [35]. It shows that it suffices to solve the cograph editing problem separately for each connected component of G.

Lemma 3.6 ([35]). Let G = (V,E) be a graph with optimal edit set F. Then {x, y} ∈ F \ E implies that x and y are located in the same connected component of G.

Let G = (V,E) be a graph and F be an arbitrary edit set that transforms G to the cograph H = (V, E △ F).
If every module of G is also a module of H, then F is called module-preserving.

Proposition 3.7 ([25]). Every graph has an optimal module-preserving cograph edit set.

The importance of module-preserving edit sets lies in the fact that they update either all or none of the edges between any two disjoint modules. It is worth noting that module-preserving edit sets do not necessarily preserve the property of modules being strong, i.e., although M might be a strong module in G it need not be strong in H.

Definition 3.8. Let G = (V,E) be a graph, F a cograph edit set for G and M be a non-trivial module of G. The induced edit set in G[M] is F[M] := {{x, y} ∈ F | x, y ∈ M}.

The next result shows that any optimal edit set F can be entirely expressed by the union of edits within prime modules and that F[M] is an optimal edit set of G[M] for any module M of G. Hence, if F[M] is not optimal for some module M of G, then F cannot be an optimal edit set for G.

Lemma 3.9 ([25]). Let G = (V,E) be an arbitrary graph and let M be a non-trivial module of G. If F′ is an optimal edit set of the induced subgraph G[M] and F is an optimal edit set of G, then (F \ F[M]) ∪ F′ is an optimal edit set of G. Thus, |F[M]| = |F′|. Moreover, the optimal cograph editing problem can be solved independently on the prime modules of G.

4 Module merge is the key to cograph editing

Since cographs are characterized by the absence of induced P4s, we can interpret every optimal cograph-editing method as the removal of all P4s in the input graph with a minimum number of edits. A natural strategy is therefore to detect P4s and then to decide which edges must be edited. Optimal edit sets are not necessarily unique, see Figure 1. The computational difficulty arises from the fact that editing an edge of a P4 can produce new P4s in the updated graph.
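The effect just described, and Problem 3.5 itself, can be reproduced on a toy instance. The following sketch counts induced P4s before and after a single edit and finds an optimal edit set by exhaustive search over all subsets of vertex pairs. It is exponential and purely illustrative, not the exact algorithm developed in this paper; the example graph and the names `p4_count` and `optimal_edit_set` are assumptions.

```python
from itertools import combinations

def p4_count(vertices, edges):
    """Number of vertex quadruples inducing a P4 (degree sequence 1,1,2,2)."""
    count = 0
    for quad in combinations(vertices, 4):
        deg = {v: 0 for v in quad}
        for u, v in combinations(quad, 2):
            if frozenset((u, v)) in edges:
                deg[u] += 1
                deg[v] += 1
        count += sorted(deg.values()) == [1, 1, 2, 2]
    return count

def optimal_edit_set(vertices, edges):
    """Smallest F such that (V, E △ F) is P4-free, by exhaustive search."""
    pairs = [frozenset(p) for p in combinations(vertices, 2)]
    for size in range(len(pairs) + 1):
        for F in combinations(pairs, size):
            if p4_count(vertices, edges ^ set(F)) == 0:
                return set(F)

# Hypothetical example: the path 1-2-3-4 plus a vertex 5 adjacent to 2 and 3.
V = [1, 2, 3, 4, 5]
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (2, 5), (3, 5)]}

print(p4_count(V, E))                        # one induced P4: 1-2-3-4
print(p4_count(V, E ^ {frozenset((2, 3))}))  # deleting edge 23 leaves 1-2-5-3-4
F = optimal_edit_set(V, E)
print(len(F))                                # a single edit suffices here
```

Deleting the middle edge of the only P4 destroys it but turns the graph into a path on five vertices with two induced P4s, whereas deleting the edge {1, 2} yields a cograph directly.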
Hence, we cannot expect a priori that local properties of G alone will allow us to identify optimal edits.

By Lemma 3.9, on the other hand, it is sufficient to edit within the prime modules. Moreover, as shown in Figure 1, there are strong modules M⋆ in an optimal edited cograph H that are not modules in G. Hence, instead of editing P4s in G, it might suffice to edit the out_{Mi}-neighborhoods for some Mi ∈ Pmax(G[M]) in such a way that they result in the new module M⋆ in H. The following definitions are important for the concepts of the “module merge process” that we will extensively use in our approach.

Definition 4.1 (Module Merge). Let G and H be arbitrary graphs with V(H) ⊆ V(G) and let MD(G) and MD(H) denote their corresponding sets of all modules. Consider a set ℳ := {M1, M2, …, Mk} ⊆ MD(G). We say that the modules in ℳ are merged (w.r.t. H) if
(i) M1, …, Mk ∈ MD(H),
(ii) M := ⋃_{i=1}^{k} Mi ∈ MD(H), and
(iii) M ∉ MD(G).
We use the symbols ⊔+ and → as operations that allow us to illustrate the merge process, that is, we write M1 ⊔+ ··· ⊔+ Mk = ⊔+_{i=1}^{k} Mi → M, whenever the modules M1, M2, …, Mk are merged w.r.t. H resulting in the module M = ⋃_{i=1}^{k} Mi of H.

The intuition is that the modules M1 through Mk of G are merged into a single new module M, their union, that is present in H but not in G. This, in particular, already defines all required edits to adjust the neighbors of the vertices in ⋃_{i=1}^{k} Mi in G resulting in the module M = ⋃_{i=1}^{k} Mi of H. It is easy to verify that ⊔+ is commutative in the sense that if M1 ⊔+ M2 → M, then M2 ⊔+ M1 → M. However, ⊔+ is not necessarily associative. To see this, consider the example in Figure 2. Although the module M3⋆ in H is obtained by merging the modules {3}, {4} and {5}, the set {3} ∪ {4} does not form a module in H. Hence, although {3} ⊔+ {4} ⊔+ {5} → M3⋆, it does not hold that {3} ⊔+ {4} → M⋆ for any module M⋆ in H. Thus, we cannot write ({3} ⊔+ {4}) ⊔+ {5} → M3⋆.

Figure 1: Shown are three graphs G, H1, H2 (from left to right). Maximal non-trivial strong modules are indicated by gray ovals in each graph and edges are used to show whether two modules are adjacent or not. The dots/lines within the modules are used to depict the vertices/edges within the modules. The modular decomposition trees up to a certain level are depicted below the respective graphs. These trees differ from the modular decomposition trees of the original graphs G, H1, and H2, respectively, only in the unresolved leaf-nodes (gray boxes). Left: A non-cograph G is shown. An optimal edit set F has cardinality 4. Center: An optimal edited cograph H1 = G △ F is shown, where F is not module-preserving. None of the new strong modules of H1 that are not modules of G can be expressed as the union of the sets M1, …, M4. Hence, none of these modules are the result of a module merge process. Right: An optimal edited cograph H2 = G △ F is shown, where F is module-preserving. The new strong modules M1⋆, M2⋆ of H2 that are not modules of G are two parallel modules. They can be written as M1⋆ = M1 ∪ M3 and M2⋆ = M2 ∪ M4. Hence, they are obtained by merging modules of G, in symbols: M1 ⊔+ M3 → M1⋆ and M2 ⊔+ M4 → M2⋆. Here we have F_{H2}(M1 ⊔+ M3 → M1⋆) = F_{H2}(M2 ⊔+ M4 → M2⋆) = F = {{x, y} | x ∈ M1, y ∈ M4}.

It follows directly from Definition 4.1 that every new module M of H that is not a module of G can be obtained by merging trivial modules: simply set M = ⋃_{x∈M} {x} and ⊔+_{x∈M} {x} → M follows immediately. In what follows we will show, however, that each strong module of H that is not a module of G can be obtained by merging the modules that are contained in Pmax(G[M]) of some prime module M of G.

When modules M1, …, Mk of G are merged w.r.t.
$H$ then all vertices in $M = \bigcup_{h=1}^{k} M_h$ must have the same $\mathrm{out}_M$-neighbors in $H$, while at least two vertices $x \in M_i$, $y \in M_j$, $1 \le i \ne j \le k$ must have different $\mathrm{out}_M$-neighbors in $G$. Hence, in order to merge these modules it is necessary to change the $\mathrm{out}_M$-neighbors in $G$. However, edit operations between vertices within $M$ are dispensable for obtaining the module $M$.

Definition 4.2 (Module Merge Edit). Let $G = (V, E)$ be an arbitrary graph and $F$ be an arbitrary edit set resulting in the graph $H = (V, E \mathbin{\triangle} F)$. Let $H' \subseteq H$ be an induced subgraph of $H$ and suppose $M_1, \dots, M_k \in \mathrm{MD}(G)$ are modules that have been merged w.r.t. $H'$ resulting in the module $M = \bigcup_{i=1}^{k} M_i \in \mathrm{MD}(H')$. We then call

\[ F_{H'}\bigl(\sqcup\!\!\!\!+_{i=1}^{k} M_i \to M\bigr) := \{\{x, v\} \in F \mid x \in M,\ v \in V(H') \setminus M\} \tag{4.1} \]

the module merge edits associated with $\sqcup\!\!\!\!+_{i=1}^{k} M_i \to M$ w.r.t. $H'$.

By construction, the edit set $F_{H'}(\sqcup\!\!\!\!+_{i=1}^{k} M_i \to M)$ comprises exactly those (non)edges of $F$ that have been edited so that all vertices in $M$ have the same $\mathrm{out}_M$-neighborhood in $H' = (V', E')$. In particular, it contains only (non)edges of $F$ that are not entirely contained in $G[M]$, but entirely contained in $H'$. Moreover, (non)edges of $F$ that contain a vertex in $V(H')$ and a vertex in $V \setminus V(H')$ are not considered either.

Let $G$ be an arbitrary graph and $F$ be an optimal edit set that applied to $G$ results in the cograph $H$. We will show that every optimal module-preserving edit set $F$ can be expressed completely by means of module merge edits. To this end, we will consider the prime modules $M$ of the given graph $G$ (in particular certain children of $M$ that do not share the same out-neighborhood) and adjust their out-neighbors to obtain new modules. Illustrative examples are given in Figures 1 and 2.

We are now in the position to derive the main results, Theorems 4.3 – 4.7.
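Definitions 4.1 and 4.2 can be checked mechanically on small graphs. The following is a minimal sketch, not part of the original development: the helper names (`adjacency`, `is_module`, `is_merged`, `merge_edits`) are hypothetical, and the edge list of $G$ is an assumption, reconstructed so as to be consistent with the description of Figure 2 (vertices 0 to 7, $F = \{\{1,2\},\{5,6\}\}$).

```python
def adjacency(vertices, edges):
    """Build an adjacency map from an iterable of 2-element frozensets."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    return adj

def is_module(adj, M, universe=None):
    """M is a module of the graph induced on `universe` if every vertex
    outside M is adjacent to all or to none of the vertices of M."""
    universe = set(adj) if universe is None else set(universe)
    M = set(M)
    for v in universe - M:
        inside = adj[v] & M
        if inside and inside != M:
            return False
    return True

def is_merged(parts, adj_G, adj_H, universe_H=None):
    """Definition 4.1: the parts are modules of G, each part and their union
    are modules of H (restricted to universe_H), and the union is no module of G."""
    union = set().union(*parts)
    return (all(is_module(adj_G, P) for P in parts)
            and all(is_module(adj_H, P, universe_H) for P in parts)
            and is_module(adj_H, union, universe_H)
            and not is_module(adj_G, union))

def merge_edits(F, M, universe_H):
    """Equation (4.1): edits of F with one endpoint in M and the other in V(H') minus M."""
    M = set(M)
    outside = set(universe_H) - M
    return {e for e in F if len(e & M) == 1 and len(e & outside) == 1}

# Assumed edge list, consistent with the text accompanying Figure 2.
V = range(8)
E_G = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (2, 6),
                              (2, 7), (3, 4), (4, 5), (5, 6)]}
F = {frozenset({1, 2}), frozenset({5, 6})}
E_H = E_G ^ F                       # H = G symmetric-difference F
adj_G, adj_H = adjacency(V, E_G), adjacency(V, E_H)
M3 = {3, 4, 5, 6}                   # prime module of G

print(is_merged([{3}, {4}, {5}], adj_G, adj_H, M3))   # True
print(is_merged([{3}, {4}], adj_G, adj_H, M3))        # False: {3,4} is no module of H
print(merge_edits(F, {3, 4, 5}, M3))                  # {frozenset({5, 6})}
```

The second call reflects the non-associativity remark: $\{3\}$, $\{4\}$ and $\{5\}$ can be merged together, but $\{3\} \cup \{4\}$ alone is not a module of $H$.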
We begin by showing that each strong module of $H$ that is not a module of $G$ can be obtained by merging some children of a suitably chosen prime module of $G$. Moreover, we prove that any strong module of $H$ that is a module of $G$ must also be strong in $G$.

Theorem 4.3. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal module-preserving cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Then, each strong module $M^\star$ of $H$ is either a module in $G$ or there exists a prime module $P_{M^\star}$ of $G$ that contains $M^\star$ and is minimal w.r.t. inclusion, i.e., there is no prime module $P'_{M^\star}$ of $G$ with $M^\star \subseteq P'_{M^\star} \subsetneq P_{M^\star}$. In the latter case $M^\star$ is obtained by merging some modules in $\mathbb{P}_{\max}(G[P_{M^\star}])$.
Furthermore, if a strong module $M^\star$ of $H$ is a module in $G$, then $M^\star$ is a strong module of $G$.

Proof. Let $M^\star$ be an arbitrary strong module of $H$ that is not a module of $G$. We show first that for the module $M^\star$ there is a prime module $P_{M^\star}$ of $G$ with $M^\star \subseteq P_{M^\star}$ such that there is no other prime module $P'_{M^\star}$ of $G$ with $M^\star \subseteq P'_{M^\star} \subsetneq P_{M^\star}$. Since $M^\star$ is a module of $H$ but not of $G$ there are vertices $x \in M^\star$ and $y \in V \setminus M^\star$ with $\{x, y\} \in F$. Now, let $P_{M^\star}$ be the strong module of $G$ containing $x$ and $y$ that is minimal w.r.t. inclusion, that is, there is no other strong module of $G$ that is properly contained in $P_{M^\star}$ and that contains $x$ and $y$. Thus $\{x, y\} \in F[P_{M^\star}]$. Lemma 3.9 implies that $F[P_{M^\star}]$ is an optimal edit set of $G[P_{M^\star}]$. Since $P_{M^\star}$ is minimal w.r.t. inclusion it holds that $x$ and $y$ are from distinct children $M_x, M_y \in \mathbb{P}_{\max}(G[P_{M^\star}])$. We continue to show that this strong module $P_{M^\star}$ is indeed prime. Assume for contradiction that $P_{M^\star}$ is a non-prime module of $G$. If $P_{M^\star}$ is parallel, then editing $\{x, y\}$ would connect the two connected components $M_x, M_y$ of $G[P_{M^\star}]$. Then, it follows by Lemma 3.6 that $F[P_{M^\star}]$ is not optimal; a contradiction. By similar arguments for the complement $\overline{G[P_{M^\star}]}$ it can be shown that $P_{M^\star}$ cannot be a series module. Thus $P_{M^\star}$ must be prime. Since $F$ is module-preserving, $P_{M^\star}$ is a module in $H$. Hence, $P_{M^\star}$ and $M^\star$
cannot overlap, since $M^\star$ is strong in $H$. However, since $x \in P_{M^\star} \cap M^\star$ and $y \in P_{M^\star}$ but $y \notin M^\star$ we have $M^\star \subseteq P_{M^\star}$.

Figure 2: Illustration of the main results. Consider the non-cograph $G$, the cograph $H = G \mathbin{\triangle} F$ and the module-preserving edit set $F = \{\{1, 2\}, \{5, 6\}\}$. The modular decomposition trees are depicted to the right of the respective graphs. According to Theorem 4.3, both strong modules $M_1$ and $M_2$ of $H$ that are modules of $G$ are also strong modules of $G$ and correspond to the prime module $M_1$ and the parallel module $M_2$ in $G$, respectively. Moreover, each of the new strong modules $M^\star_1, \dots, M^\star_4$ of $H$ is obtained by merging children of a prime module of $G$. To be more precise, $M^\star_1$ and $M^\star_2$ are obtained by merging children of the prime module $M_1$ of $G$: $M_2 \sqcup\!\!\!\!+ \{2\} \to M^\star_1$ and $\{0\} \sqcup\!\!\!\!+ \{1\} \to M^\star_2$ with $F_{H[M_1]}(M_2 \sqcup\!\!\!\!+ \{2\} \to M^\star_1) = F_{H[M_1]}(\{0\} \sqcup\!\!\!\!+ \{1\} \to M^\star_2) = \{\{1, 2\}\}$. The new strong modules $M^\star_3$ and $M^\star_4$ are obtained by merging children of the prime module $M_3$ of $G$: $\{3\} \sqcup\!\!\!\!+ \{5\} \to M^\star_4$ and $\{3\} \sqcup\!\!\!\!+ \{4\} \sqcup\!\!\!\!+ \{5\} \to M^\star_3$ with $F_{H[M_3]}(\{3\} \sqcup\!\!\!\!+ \{5\} \to M^\star_4) = F_{H[M_3]}(\{3\} \sqcup\!\!\!\!+ \{4\} \sqcup\!\!\!\!+ \{5\} \to M^\star_3) = \{\{5, 6\}\}$. According to Theorem 4.7, the set $F$ can be written as the union of the edit sets used to obtain the new merged modules of $H$. It is worth noting that not all strong modules of $G$ remain strong in $H$ (e.g. the prime module $M_3$) and that there are (non-strong) modules in $H$ (e.g. the module $\{6, 7\}$) that are not obtained by merging children of prime modules of $G$.

Finally, since $P_{M^\star}$ is chosen to be minimal w.r.t. inclusion, there exists in particular no prime module $P'_{M^\star}$ of $G$ with $M^\star \subseteq P'_{M^\star} \subsetneq P_{M^\star}$.

We continue to show that $M^\star$ is obtained by merging some child modules of $P_{M^\star}$ in $G$, say $M_1, \dots, M_k \in \mathbb{P}_{\max}(G[P_{M^\star}])$. Note that we just formally prove the existence of such a subset $\{M_1, \dots, M_k\} \subset \mathbb{P}_{\max}(G[P_{M^\star}$
$])$ without explicitly constructing it. To this end, we need to verify the three conditions of Definition 4.1, i.e., (i) $M_1, \dots, M_k \in \mathrm{MD}(H)$, (ii) $M^\star := \bigcup_{i=1}^{k} M_i \in \mathrm{MD}(H)$, and (iii) $M^\star \notin \mathrm{MD}(G)$. Since each $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$ is a module of $G$ and $F$ is module-preserving, Condition (i) is always satisfied. Moreover, by assumption $M^\star \notin \mathrm{MD}(G)$ and thus Condition (iii) is satisfied. It remains to show that Condition (ii) is satisfied. To this end, we show that there are modules $M_1, \dots, M_k$ of $G$ (without explicitly constructing them) such that $M^\star = \bigcup_{i=1}^{k} M_i$. We prove this by showing that each module in $\mathbb{P}_{\max}(G[P_{M^\star}])$ is either completely contained in, or disjoint from $M^\star$.

First, note that $M^\star \ne P_{M^\star}$, since $M^\star$ is not a module of $G$. Second, $M^\star$ cannot overlap any $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$, since $M_i$ is a module of $H$ and $M^\star$ is strong in $H$.

We continue to show that there is no $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$ such that $M^\star \subseteq M_i$. Assume for contradiction that there is a module $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$ with $M^\star \subseteq M_i$. Note that $M_i$ cannot be prime in $G$, as otherwise $M^\star \subseteq M_i = P'_{M^\star} \subsetneq P_{M^\star}$, contradicting the minimality of $P_{M^\star}$. Moreover, $M^\star$ cannot overlap any $M^i_j \in \mathbb{P}_{\max}(G[M_i])$, since $M^\star$ is strong in $H$ and any $M^i_j$ is a module of $H$, since $F$ is module-preserving. Furthermore, since $M_i$ is non-prime in $G$, for any subset $\{M^i_1, \dots, M^i_l\} \subsetneq \mathbb{P}_{\max}(G[M_i])$ it holds that the set $M' = \bigcup_{j=1}^{l} M^i_j$ is a module of $G$ (cf. Theorem 3.3(T3)). Since $M^\star$ is no module of $G$ it cannot be a union of elements in $\mathbb{P}_{\max}(G[M_i])$. Note that this especially implies that $M^\star \ne M_i$ and $M^\star \ne M^i_j$ for all $M^i_j \in \mathbb{P}_{\max}(G[M_i])$. Now it follows that $M^\star \subset M^i_j$ for some $M^i_j \in \mathbb{P}_{\max}(G[M_i])$. Repeating the latter arguments and since $G$ is finite, there must be a minimal set $M^a_b$ with $M^\star \subset M^a_b \subset \cdots \subset M^i_j \subset M_i$. Now we apply the latter arguments again and obtain that $M^\star$
$\subset M' \in \mathbb{P}_{\max}(G[M^a_b])$ which is not possible, since $M^a_b$ is chosen to be the minimal module that contains $M^\star$. Thus, there is no $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$ such that $M^\star \subseteq M_i$.

Now, since $M^\star \ne P_{M^\star}$, and $M^\star$ does not overlap any $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$, and there is no $M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])$ such that $M^\star \subseteq M_i$, there must be a set $\{M_1, \dots, M_k\} \subsetneq \mathbb{P}_{\max}(G[P_{M^\star}])$ such that $M^\star = \bigcup_{i=1}^{k} M_i$. Thus, Condition (ii) is satisfied and therefore $M^\star$ is obtained by merging modules in $\mathbb{P}_{\max}(G[P_{M^\star}])$. Hence, any strong module of $H$ is either a module of $G$ or obtained by merging the children of a prime module of $G$.

Finally, assume that there is a strong module $M^\star$ in $H$ that is a module of $G$. Assume that $M^\star$ is not strong in $G$. Then there is a module $M$ in $G$ that overlaps $M^\star$. Since $F$ is module-preserving, $M$ is a module in $H$ and thus, $M$ overlaps $M^\star$ in $H$; a contradiction. Thus, any strong module $M^\star$ of $H$ that is also a module of $G$ must be strong in $G$.

Theorem 4.3 allows us to give the following definitions that we will use in the subsequent part.

Definition 4.4. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal module-preserving cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Let $M^\star$ be a strong module of $H$ but no module of $G$.
We denote by $P_{M^\star}$ the prime module of $G$ that contains $M^\star$ and is minimal w.r.t. inclusion, i.e., there is no prime module $P'_{M^\star}$ of $G$ with $M^\star \subseteq P'_{M^\star} \subsetneq P_{M^\star}$. Furthermore, we denote by $\mathcal{C}(M^\star) \subset \mathbb{P}_{\max}(G[P_{M^\star}])$ the set of children of $P_{M^\star}$ that satisfies $\bigcup_{M_i \in \mathcal{C}(M^\star)} M_i = M^\star$.

The next result provides a characterization of module-preserving edit sets by means of module merge of the children of prime modules.

Theorem 4.5. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Then $F$ is module-preserving for $G$ if and only if each new strong module $M^\star$ of $H$ that is not a module of $G$ is obtained by merging the modules in $\mathcal{C}(M^\star) \subset \mathbb{P}_{\max}(G[P_{M^\star}])$, in symbols $\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M^\star)} M_i \to M^\star$.

Proof.
If $F$ is an optimal and module-preserving edit-set for $G$, we can apply Theorem 4.3. For the converse, assume for contraposition that $F$ is not module-preserving. Then, there is a module $M_i$ in $G$ that is not a module in $H$. Hence, there is a vertex $z \in V \setminus M_i$ and two vertices $x, y \in M_i$ such that $xz \in E(H)$ and $yz \notin E(H)$ and thus, either $\{x, z\} \in F$ or $\{y, z\} \in F$. There are two cases, either $xy \in E(H)$ or $xy \notin E(H)$. Since $H$ is a cograph we can apply Theorem 3.2 and conclude that either $yz|x \in R(H)$ or $xz|y \in R(H)$. Assume that $xz|y \in R(H)$ and let $T$ be the cotree of $H$. Since $T$ displays $xz|y$, the strong module $M^\star$ of $H$ located at the $\mathrm{lca}_T(x, z)$ contains the vertices $x$ and $z$ but not $y$. Moreover, since there is an edit $\{x, z\}$ or $\{y, z\}$ in $F$ there is a strong prime module $P_{M^\star}$ in $G$ that contains $x, y, z$ and is minimal w.r.t. inclusion. Note, $M_i \ne P_{M^\star}$ since $x, y \in M_i$ and $z \notin M_i$. Moreover, since $M_i$ is a module in $G$, but none of the unions of the children of $P_{M^\star}$ is a module of $G$ (cf. Theorem 3.3(T3)), we can conclude that $M_i \subseteq M'$, where $M'$ is a child of $P_{M^\star}$ in $G$. Since $P_{M^\star}$ is the minimal prime module that contains $x, y, z$ and there is an edit $\{x, z\}$ or $\{y, z\}$ in $F$, the vertex $z$ must be located in a module different from the module $M'$ that contains both $x$ and $y$. Thus, $z \notin M'$. Therefore, there is no module in $G$ that contains $x$ and $z$ but not $y$. Thus, $M^\star$ is no module of $G$. Since there is no module in $G$ that contains $x$ and $z$ but not $y$, the set $M^\star$ cannot be written as the union of children of any strong prime module $P_{M^\star}$ and thus, $M^\star$ is not obtained by merging modules of $\mathbb{P}_{\max}(G[P_{M^\star}])$. The case $yz|x \in R(H)$ is shown analogously.

Combining the latter results, it can be shown that for every graph $G$ there is always an optimal edit set such that the resulting cograph $H$ contains all modules of $G$ and any newly created strong module $M^\star$ of $H$ is obtained by merging the respective modules in $\mathcal{C}(M^\star)$.

Theorem 4.6.
Any graph $G = (V, E)$ has an optimal edit-set $F$ such that each strong module $M^\star$ in $H = (V, E \mathbin{\triangle} F)$ that is not a module of $G$ is obtained by merging modules in $\mathbb{P}_{\max}(G[P_{M^\star}])$, where $P_{M^\star}$ is a prime module of $G$.

Proof. Proposition 3.7 implies that any graph has a module-preserving optimal edit set. Hence, we can apply Theorem 4.5 to derive the statement.

Finally, the following result shows that each module-preserving edit set can indeed be derived by considering the module merge edits only.

Theorem 4.7. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal module-preserving cograph edit set, $H = (V, E \mathbin{\triangle} F)$ the resulting cograph, and $\mathcal{M}$ the set of all strong modules of $H$ that are no modules of $G$. Then,

\[ F = \bigcup_{M^\star \in \mathcal{M}} F_{H[P_{M^\star}]}\bigl(\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M^\star)} M_i \to M^\star\bigr). \]

Proof. We set $F^\star = \bigcup_{M^\star \in \mathcal{M}} F_{H[P_{M^\star}]}\bigl(\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M^\star)} M_i \to M^\star\bigr)$. Clearly, it holds that $F^\star \subseteq F$. It remains to show that $F \subseteq F^\star$.

First, observe that every edit $\{x, y\} \in F$ is between distinct children $M_x, M_y \in \mathbb{P}_{\max}(G[P_{M^\star}])$ of a prime module $P_{M^\star}$ of $G$. To see this, let $P_{M^\star}$ be a strong module of $G$ such that $x$ and $y$ are in distinct children $M_x, M_y \in \mathbb{P}_{\max}(G[P_{M^\star}])$ and assume for contradiction that $P_{M^\star}$ is non-prime in $G$. Let $F' := \bigcup_{M_i \in \mathbb{P}_{\max}(G[P_{M^\star}])} F[M_i]$. Since $P_{M^\star}$ is non-prime in $G$ it follows that $F'$ is an edit set for $G[P_{M^\star}]$, that is, $G[P_{M^\star}] \mathbin{\triangle} F'$ is a cograph. But $|F'| < |F[P_{M^\star}]|$; contradicting Lemma 3.9. Thus, every edit $\{x, y\} \in F$ is between distinct children $M_x, M_y \in \mathbb{P}_{\max}(G[P_{M^\star}])$ of a prime module $P_{M^\star}$ of $G$.

Assume that $\{x, y\} \in F$, but $\{x, y\} \notin F^\star$. By the latter arguments, there is a prime module $P_{M^\star}$ of $G$ with $x \in M_x$ and $y \in M_y$ and $M_x, M_y \in \mathbb{P}_{\max}(G[P_{M^\star}])$. Now let $M'_x$ be the strong module of $H$ that contains $x$ but not $y$ and that is maximal w.r.t. inclusion. Since $F$ is module-preserving, $M_x$ is a module in $H$. Moreover, since $M'_x$ is a strong module of $H$, the modules $M'_x$ and $M_x$ do not overlap in $H$. Therefore, either $M_x \subsetneq M'_x$ or $M'_x \subseteq M_x$. We show first that the case $M_x \subsetneq M'_x$ is not possible.
Assume for contradiction that $M_x \subsetneq M'_x$. Thus, there is a vertex $z \in M'_x \setminus M_x$. Since $P_{M^\star}$ is prime in $G$ and $M_x \in \mathbb{P}_{\max}(G[P_{M^\star}])$, we can apply Theorem 3.3(T2) and conclude that there is no other module than $M_x$ in $G$ that entirely contains $M_x$ but not $y$. Since $M_x \subsetneq M'_x \subsetneq P_{M^\star}$ it follows that $M'_x$ is a new strong module of $H$ and therefore, by Theorem 4.3, obtained by merging modules $M_1, \dots, M_k \in \mathcal{C}(M'_x) \subsetneq \mathbb{P}_{\max}(G[P_{M^\star}])$. But then $\{x, y\} \in F_{H[P_{M^\star}]}\bigl(\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M'_x)} M_i \to M'_x\bigr) \subseteq F^\star$; contradicting that $\{x, y\} \notin F^\star$. Hence, $M'_x \subseteq M_x$. Similarly, $M'_y \subseteq M_y$ for the strong module $M'_y$ of $H$ that contains $y$ but not $x$ and that is maximal w.r.t. inclusion.

Consider now the strong module $M^\star$ of $H$ that is identified with the lowest common ancestor of the modules $\{x\}$ and $\{y\}$ within the cotree of $H$. Then, there are distinct children in $\mathbb{P}_{\max}(H[M^\star])$, containing $x$ and $y$, respectively. Since $M'_x$ is the strong module of $H$ that contains $x$ but not $y$ and that is maximal w.r.t. inclusion, we have $M'_x \in \mathbb{P}_{\max}(H[M^\star])$. Analogously, $M'_y \in \mathbb{P}_{\max}(H[M^\star])$.

Both $M_x$ and $M_y$ are modules in $H$ and $G$. Since $F$ is module-preserving, either all or none of the edges between $M_x$ and $M_y$ are edited. Since $\{x, y\} \in F$ we have, therefore, $\{x', y'\} \in F$ for all $x' \in M'_x \subseteq M_x$ and $y' \in M'_y \subseteq M_y$. Let $F' := \{\{x', y'\} \mid x' \in M'_x, y' \in M'_y\}$. By the latter argument $F' \ne \emptyset$ and $F' \subseteq F$. Note, the subgraphs $H[M'_x]$ and $H[M'_y]$ are cographs. Since $M^\star$ is either a parallel or a series module in $H$, we have either (i) $H[M'_x \cup M'_y] = H[M'_x] \mathbin{\dot\cup} H[M'_y]$ or (ii) $H[M'_x \cup M'_y] = H[M'_x] \oplus H[M'_y]$, respectively. Since $F'$ comprises the edits $\{x', y'\}$ between all vertices $x' \in M'_x$ and $y' \in M'_y$, the graph $H[M'_x \cup M'_y] \mathbin{\triangle} F'$ is in case (i) the graph $H[M'_x] \oplus H[M'_y]$ and in case (ii) $H[M'_x] \mathbin{\dot\cup} H[M'_y]$. By definition, in both cases $H[M'_x \cup M'_y] \mathbin{\triangle} F'$ is a cograph.
Note that $F'$ did not change the $\mathrm{out}_{M'_x \cup M'_y}$-neighborhood and thus, the graph $H[M^\star] \mathbin{\triangle} F' = G[M^\star] \mathbin{\triangle} (F[M^\star] \setminus F')$ is a cograph as well. Since $\{x, y\} \in F' \cap F[M^\star]$ it holds that $|F[M^\star] \setminus F'| < |F[M^\star]|$. But then, $F[M^\star]$ is not optimal, and therefore, by Lemma 3.9 the set $F$ is not optimal; a contradiction.

In summary, there exists no edit $\{x, y\} \in F$ with $\{x, y\} \notin F^\star$. Hence, $F \subseteq F^\star$ and the statement follows.

From an algorithmic perspective, Theorem 4.7 implies that it is sufficient to correctly determine the set of strong modules of a resulting cograph $H$ that are no modules of the given graph $G$. Afterwards, the module-preserving edit set $F$ is obtained by taking all the edits needed for the corresponding module merge operations. On the other hand, by Theorem 4.6 it is ensured that such a closest cograph $H$ that contains all modules of $G$ always exists.

5 Pairwise module merge and algorithmic issues

So far, we have shown that for an arbitrary graph $G = (V, E)$ there is an optimal module-preserving edit set $F$ that transforms $G$ into the cograph $H = (V, E \mathbin{\triangle} F)$ (cf. Theorem 4.6). Moreover, this edit set $F$ can be expressed in terms of edits derived by module merge operations on the strong modules of $H$ that are no modules of $G$ (cf. Theorem 4.7). In what follows, we show that there is an explicit order in which these individual merge operations can be consecutively applied to $G$ such that all intermediate edit-steps result in graphs that contain all modules of $G$, and, moreover, all new strong modules produced in this edit-step are preserved in any further step.

In Section 5.1, we show that an optimal edit set can always be obtained by a series of "ordered" pairwise merge operations. In Section 5.2, we show that the latter "order"-condition can even be relaxed and that particular modules can be merged pairwise in an arbitrary order to obtain an optimal edited graph.
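The decomposition of $F$ in Theorem 4.7 and the idea of applying the merge edits in an inclusion-respecting order can be tried out on a small instance. The sketch below is illustrative only and rests on stated assumptions: the edge list of $G$ is reconstructed to be consistent with the description of Figure 2, the list of new strong modules and their minimal prime modules is taken from that figure's caption rather than computed, and all function names are hypothetical. Modules of $G$ are enumerated by brute force, which is only feasible for very small graphs.

```python
from itertools import combinations

def adjacency(vertices, edges):
    """Adjacency map from an iterable of 2-element frozensets."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    return adj

def is_module(adj, M):
    """Every vertex outside M sees all of M or none of M."""
    M = set(M)
    return all(not (adj[v] & M) or (adj[v] & M) == M for v in set(adj) - M)

def edits_for(F, M, P):
    """Edits of F with one endpoint in M and the other in P minus M."""
    return {e for e in F if len(e & M) == 1 and len(e & (P - M)) == 1}

# Assumed instance, consistent with the text accompanying Figure 2.
V = list(range(8))
E_G = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (2, 6),
                              (2, 7), (3, 4), (4, 5), (5, 6)]}
F = {frozenset({1, 2}), frozenset({5, 6})}
ALL, M3 = frozenset(V), frozenset({3, 4, 5, 6})

# (new strong module of H, minimal prime module of G containing it)
new_modules = [(frozenset({0, 1}), ALL), (frozenset({3, 5}), M3),
               (frozenset({3, 4, 5}), M3), (frozenset({2, 3, 4, 5, 6, 7}), ALL)]

# Union of all module merge edits; Theorem 4.7 states that this equals F.
union = set().union(*(edits_for(F, M, P) for M, P in new_modules))

# Sorting by size yields an order compatible with inclusion.
new_modules.sort(key=lambda mp: len(mp[0]))
sigmas, used = [], set()
for M, P in new_modules:
    sigma = edits_for(F, M, P) - used      # edits not applied in earlier steps
    sigmas.append(sigma)
    used |= sigma

# Apply the steps one by one; check that all modules of G and all new
# modules created so far are modules of every intermediate graph.
adj_G = adjacency(V, E_G)
modules_G = [set(c) for r in range(1, len(V) + 1)
             for c in combinations(V, r) if is_module(adj_G, c)]
edges, preserved = set(E_G), True
for i, sigma in enumerate(sigmas):
    edges ^= sigma
    adj_i = adjacency(V, edges)
    preserved &= all(is_module(adj_i, M) for M in modules_G)
    preserved &= all(is_module(adj_i, M) for M, _ in new_modules[:i + 1])

print(union == F, [len(s) for s in sigmas], preserved)   # True [1, 1, 0, 0] True
```

In this instance the two edits are consumed by the two smallest new modules, so the later steps create the modules $\{3,4,5\}$ and $\{2,\dots,7\}$ without any further edits.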
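The next Lemma shows that the number of edits in an optimal edit set $F$ can be expressed as the sum of individual edits based on the $\sqcup\!\!\!\!+$-operator to obtain the strong modules in a cograph $H = G \mathbin{\triangle} F$ that are no modules in $G$.

Lemma 5.1. Let $G = (V, E)$ be a graph, $F$ an optimal module-preserving cograph edit-set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Let $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ be the set of all strong modules of $H$ that are no modules of $G$ and assume that the elements in $\mathcal{M}$ are partially ordered w.r.t. inclusion, i.e., $M^\star_i \subseteq M^\star_j$ implies $i \le j$.
Let $M^\star \in \mathcal{M}$. We set $F_{M^\star} := \{\{x, v\} \in F \mid x \in M^\star, v \in P_{M^\star} \setminus M^\star\}$, that is, the set $F_{M^\star} \subseteq F$ comprises all edits in $F$ that are used to obtain the module $M^\star$ within $G[P_{M^\star}]$. Furthermore, we set $\sigma_{M^\star_1} = F_{M^\star_1}$ and $\sigma_{M^\star_i} = F_{M^\star_i} \setminus \bigl(\bigcup_{j=1}^{i-1} F_{M^\star_j}\bigr)$, $2 \le i \le n$. Then

\[ F = \dot{\bigcup_{i=1}^{n}}\, \sigma_{M^\star_i} \quad\text{and, thus,}\quad |F| = \sum_{i=1}^{n} |\sigma_{M^\star_i}|. \]

Moreover, for each intermediate graph $G_j = G \mathbin{\triangle} \bigl(\bigcup_{i=1}^{j} \sigma_{M^\star_i}\bigr)$ and any $M^\star_i \in \mathcal{M}$ with $i - 1 \le j$ we have

\[ G_j[M^\star_i] = H[M^\star_i]. \]

In each step $j$ the induced subgraphs $G_j[M^\star_i]$ are already cographs for all sets $M^\star_i$ with $i - 1 \le j$ and hence $F[M^\star_i] \setminus \bigcup_{k=1}^{j} \sigma_{M^\star_k} = \emptyset$, for all $i - 1 \le j$.

Proof. By Theorem 4.3, for each $M^\star \in \mathcal{M}$ there is an inclusion-minimal prime module $P_{M^\star}$ in $G$ and a set of children $\mathcal{C}(M^\star) \subseteq \mathbb{P}_{\max}(G[P_{M^\star}])$ such that $\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M^\star)} M_i \to M^\star$. Thus, $P_{M^\star}$ and $\mathcal{C}(M^\star)$ exist and $\mathcal{C}(M^\star)$ is not empty.

Now, we show that $|F|$ can be expressed by the sum of the size of the edits in $\sigma_{M^\star_i}$. To this end, observe that by Theorem 4.7,

\[ F = \bigcup_{M^\star \in \mathcal{M}} F_{H[P_{M^\star}]}\bigl(\sqcup\!\!\!\!+_{M_i \in \mathcal{C}(M^\star)} M_i \to M^\star\bigr). \]

Thus, $F = \bigcup_{M^\star \in \mathcal{M}} F_{M^\star}$. By construction of $\sigma_{M^\star_i}$ it holds first that $\bigcup_{i=1}^{n} \sigma_{M^\star_i} = \bigcup_{i=1}^{n} F_{M^\star_i}$ and second that $\sigma_{M^\star_i} \cap \sigma_{M^\star_j} = \emptyset$ for all $i \ne j$. Hence, $F = \dot{\bigcup}_{i=1}^{n} \sigma_{M^\star_i}$ and thus, $|F| = \sum_{i=1}^{n} |\sigma_{M^\star_i}|$.

By construction, $\mathcal{M}$ is partially ordered w.r.t. inclusion. We want to show that $G_j[M^\star_i] = H[M^\star_i]$ for all $i - 1 \le j$.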
To this end, we show that $F[M^\star_i] \setminus \bigcup_{k=1}^{j} \sigma_{M^\star_k} = \emptyset$, in which case after each step $j$ there are no more edits left to modify an edge between vertices within $M^\star_i$. We show first that the latter is satisfied for all $1 \le i \le n$ and a fixed $j = i - 1$. Assume for contradiction that $\{x, y\} \in F[M^\star_i] \setminus \bigcup_{k=1}^{i-1} \sigma_{M^\star_k}$ and thus, $x, y \in M^\star_i$. Since $\{x, y\} \in F = \bigcup_{k=1}^{n} F_{M^\star_k}$, there must be a module $M^\star_\ell \in \mathcal{M}$ such that $\{x, y\} \in F_{M^\star_\ell}$. By construction, $F_{M^\star_\ell}$ contains only the edits that affect the $\mathrm{out}_{M^\star_\ell}$-neighborhood. Thus, w.l.o.g. we can assume that $x \in M^\star_\ell$ and $y \notin M^\star_\ell$. Since $M^\star_\ell$ and $M^\star_i$ are strong modules, they do not overlap, and therefore, $M^\star_\ell \subsetneq M^\star_i$. However, since $\mathcal{M}$ is partially ordered, we can conclude that $\ell < i$ and therefore, $\{x, y\} \in \bigcup_{k=1}^{i-1} \sigma_{M^\star_k}$. Hence, $\{x, y\} \notin F[M^\star_i] \setminus \bigcup_{k=1}^{i-1} \sigma_{M^\star_k}$; a contradiction. Thus, $F[M^\star_i] \setminus \bigcup_{k=1}^{i-1} \sigma_{M^\star_k} = \emptyset$ for all $1 \le i \le n$. But then, clearly $F[M^\star_i] \setminus \bigcup_{k=1}^{j} \sigma_{M^\star_k} = \emptyset$ holds for any $j \ge i - 1$. Thus, $G_j[M^\star_i] = H[M^\star_i]$ for all $i - 1 \le j$.

The following Lemma shows that, given the explicit order $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ from Lemma 5.1, in which the edits are applied to the graph $G$, the intermediate graphs $G_i$ retain all modules of $G$ and also all new modules $M^\star_j$, $j \le i$.

Lemma 5.2. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal module-preserving cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Moreover, let $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ be the partially ordered (w.r.t. inclusion) set of all strong modules of $H$ that are no modules of $G_0 := G$, and choose $\sigma_{M^\star_i}$, $F_{M^\star_i}$ and the intermediate graphs $G_i$, $1 \le i \le n$ as in Lemma 5.1.
Then, any module $M'$ of $G$ is a module of $G_i$ and the set $M^\star_j$ is a module of $G_i$ for $1 \le i \le n$ and any $j \le i$.

Proof. First note that $\sigma_{M^\star_i}$ affects only modules that are entirely contained in $P_{M^\star_i}$ and only their out-neighbors within $P_{M^\star_i}$. Moreover $M^\star_j \subseteq M^\star_i$ implies that $P_{M^\star_j} \subseteq P_{M^\star_i}$.
The partial ordering of the elements in $\mathcal{M}$ implies that $P_{M^\star_i}$ remains a module in $G_i$.

Before we prove the main statement, we show first that the following statement is satisfied:

Claim 1. For every $M'$ with $M^\star_i \subsetneq M' \subsetneq P_{M^\star_i}$ we have $M' \ne M^\star_j \in \mathcal{M}$, $j \le i$ and $M'$ cannot be a module of $G$.

Proof of Claim 1. Let $M'$ be an arbitrary set with $M^\star_i \subsetneq M' \subsetneq P_{M^\star_i}$. By the partial order of the elements in $\mathcal{M}$ we immediately observe that $M' \ne M^\star_j \in \mathcal{M}$ for any $j \le i$. Now assume for contradiction that $M'$ is a module of $G$. Note, all elements in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$ are strong modules of $G$, and thus, do not overlap the module $M'$. Moreover, since $P_{M^\star_i}$ is prime in $G$, we can apply Theorem 3.3(T2) and conclude that the union of elements of any proper subset $\mathbb{P}' \subsetneq \mathbb{P}_{\max}(G[P_{M^\star_i}])$ with $|\mathbb{P}'| > 1$ is not a module of $G$. Taking the latter arguments together and because $M' \subsetneq P_{M^\star_i}$, we have $M' \subseteq M_\ell \in \mathbb{P}_{\max}(G[P_{M^\star_i}])$ for some $\ell$. Hence, $M^\star_i \subsetneq M' \subseteq M_\ell$. However, since $M^\star_i$ is the union of some children $\mathbb{P}' \subseteq \mathbb{P}_{\max}(G[P_{M^\star_i}])$ of $P_{M^\star_i}$ it follows that $M_\ell \subseteq M^\star_i$; a contradiction. This proves Claim 1. $\triangleleft$

We continue with proving the main statement by induction over $i$. Since $G_0 = G$, the statement is satisfied for $G_0$. We continue to show that the statement is satisfied for $G_{i+1}$ under the assumption that it is satisfied for $G_i$.

For further reference, we note that $P_{M^\star_{i+1}}$ is a module of $G_i$, since $P_{M^\star_{i+1}}$ is a module of $G$ and by induction assumption. Moreover, $P_{M^\star_{i+1}}$ remains a module of $G_{i+1}$, since $G_{i+1} = G_i \mathbin{\triangle} \sigma_{M^\star_{i+1}}$ and $\sigma_{M^\star_{i+1}}$ does not affect the $\mathrm{out}_{P_{M^\star_{i+1}}}$-neighborhood. Furthermore, $M^\star_{i+1}$ is a module of $H$ and thus, of $H[P_{M^\star_{i+1}}]$. Since $\sigma_{M^\star_{i+1}}$ contains all such edits to adjust $M^\star_{i+1}$ to a module in $H[P_{M^\star_{i+1}}]$, we can conclude that $M^\star_{i+1}$ is a module in $G_{i+1}[P_{M^\star_{i+1}}]$. Therefore, Lemma 3.1 implies that $M^\star_{i+1}$ is a module of $G_{i+1}$.

Now, let $M'$ be an arbitrary module of $G$. We proceed to show that $M'$ is a module of $G_{i+1}$. By induction assumption, each module $M'$ of $G$ is a module of $G_i$.
Since $F$ is module-preserving, $M'$ is also a module of $H$. Hence, $M' \in \mathrm{MD}(G) \cap \mathrm{MD}(G_i) \cap \mathrm{MD}(H)$. Moreover, by Claim 1 the case $M^\star_{i+1} \subsetneq M' \subsetneq P_{M^\star_{i+1}}$ cannot occur for any module $M'$ of $G$. Note, the module $M'$ cannot overlap $P_{M^\star_{i+1}}$, since $P_{M^\star_{i+1}}$ is strong in $G$. Hence, for $M'$ one of the following three cases can occur: either $P_{M^\star_{i+1}} \subseteq M'$, $P_{M^\star_{i+1}} \cap M' = \emptyset$, or $M' \subsetneq P_{M^\star_{i+1}}$. In the first two cases, $M'$ remains a module of $G_{i+1}$, since $\sigma_{M^\star_{i+1}}$ contains only edits between vertices within $P_{M^\star_{i+1}}$, and thus, the $\mathrm{out}_{M'}$-neighborhood is not affected. Therefore, assume that $M' \subsetneq P_{M^\star_{i+1}}$. The module $M'$ cannot overlap $M^\star_{i+1}$, since $M^\star_{i+1}$ is strong in $H$. As shown above, the case $M^\star_{i+1} \subsetneq M' \subsetneq P_{M^\star_{i+1}}$ cannot occur, and thus we have either (1) $M' \subseteq M^\star_{i+1}$, or (2) $M^\star_{i+1} \cap M' = \emptyset$.

Case (1): Since $\sigma_{M^\star_{i+1}}$ affects only the $\mathrm{out}_{M^\star_{i+1}}$-neighborhood, there is no edit between vertices in $M'$ and $M^\star_{i+1} \setminus M'$ and, moreover, $G_{i+1}[M^\star_{i+1}] = G_i[M^\star_{i+1}]$. By assumption, $M'$ is a module of $G_i$. Thus, $M'$ is a module in any induced subgraph of $G_i$ that contains $M'$ and hence, in particular in $G_i[M^\star_{i+1}]$. Hence, $M'$ is a module of $G_{i+1}[M^\star_{i+1}]$. Now, we can apply Lemma 3.1 and conclude that $M'$ is also a module of $G_{i+1}$.

Case (2): Assume for contradiction that $M'$ is no module of $G_{i+1}$. Thus, there must be an edge $xy \in E(G_{i+1})$, $x \in M'$, $y \in V \setminus M'$ such that for some other vertex $x' \in M'$ we have $x'y \notin E(G_{i+1})$. Since $M'$ is a module of $G_i$ it must hold that $\{x, y\} \in \sigma_{M^\star_{i+1}}$ or $\{x', y\} \in \sigma_{M^\star_{i+1}}$. Since $x, x' \notin M^\star_{i+1}$ and each edit in $\sigma_{M^\star_{i+1}}$ affects a vertex within $M^\star_{i+1}$, we can conclude that $y \in M^\star_{i+1}$. Now, by construction of $F_{M^\star_{i+1}}$ and since $M' \subsetneq P_{M^\star_{i+1}}$, all edits between vertices of $M^\star_{i+1}$ and $M'$ are entirely contained in $F_{M^\star_{i+1}}$. But this implies that none of the sets $\sigma_{M^\star_\ell}$ with $\ell > i + 1$ contains $\{x, y\}$ or $\{x', y\}$. Hence, it holds that $xy \in E(H)$ and $x'y \notin E(H)$, which implies that $M'$ is no module of $H$; a contradiction.
Therefore, each module $M'$ of $G$ is a module of $G_{i+1}$.

We proceed to show that $M^\star_j \in \mathcal{M}$ is a module of $G_{i+1}$ for all $j \le i + 1$. As we have already shown this for $j = i + 1$, we proceed with $j < i + 1$. By induction assumption, each module $M^\star_j$ is a module of $G_i$ for all $j < i + 1$. Note, the module $M^\star_j$ cannot overlap $P_{M^\star_{i+1}}$, since $M^\star_j$ is strong in $H$ and $P_{M^\star_{i+1}}$ is a module of $H$, because $F$ is module-preserving. Hence, for $M^\star_j$ one of the following three cases can occur: either $P_{M^\star_{i+1}} \subseteq M^\star_j$, $P_{M^\star_{i+1}} \cap M^\star_j = \emptyset$, or $M^\star_j \subsetneq P_{M^\star_{i+1}}$. In the first two cases, $M^\star_j$ remains a module of $G_{i+1}$, since $\sigma_{M^\star_{i+1}}$ contains only edits between vertices within $P_{M^\star_{i+1}}$, and thus, the $\mathrm{out}_{M^\star_j}$-neighborhood is not affected. Therefore, assume that $M^\star_j \subsetneq P_{M^\star_{i+1}}$. The module $M^\star_j$ cannot overlap $M^\star_{i+1}$, since both are strong in $H$. Due to the partial ordering of the elements in $\mathcal{M}$, the case $M^\star_{i+1} \subsetneq M^\star_j$ cannot occur. Hence there are two cases, either (A) $M^\star_j \subseteq M^\star_{i+1}$, or (B) $M^\star_{i+1} \cap M^\star_j = \emptyset$.

Case (A): Since $\sigma_{M^\star_{i+1}}$ affects only the $\mathrm{out}_{M^\star_{i+1}}$-neighborhood, there is no edit between vertices in $M^\star_j$ and $M^\star_{i+1} \setminus M^\star_j$. By analogous arguments as in Case (1), we can conclude that $M^\star_j$ remains a module of $G_{i+1}[M^\star_{i+1}]$. Lemma 3.1 implies that $M^\star_j$ is also a module of $G_{i+1}$.

Case (B): Assume for contradiction that $M^\star_j$ is no module of $G_{i+1}$. Thus, there must be an edge $xy \in E(G_{i+1})$, $x \in M^\star_j$, $y \in V \setminus M^\star_j$ such that for some other vertex $x' \in M^\star_j$ we have $x'y \notin E(G_{i+1})$. Since $M^\star_j$ is a module of $G_i$ it must hold that $\{x, y\} \in \sigma_{M^\star_{i+1}}$ or $\{x', y\} \in \sigma_{M^\star_{i+1}}$. Now, we can argue analogously as in Case (2) and conclude that $xy \in E(H)$ and $x'y \notin E(H)$, which implies that $M^\star_j$ is no module of $H$; a contradiction.

Therefore, each module $M^\star_j$, $j \le i + 1$ is a module of $G_{i+1}$.

The latter two Lemmata show that there exists an explicit order, in which all new modules $M^\star_i$ of $H$ can be constructed such that whenever a module $M^\star$
$_i$ is produced in step $i$, the induced subgraph $G_{i-1}[M^\star_i]$ is already a cograph and, moreover, is not edited any further in subsequent steps.

5.1 Pairwise module-merge

Regarding Lemma 5.1, each module $M^\star_i$ is created by applying the remaining edits $\sigma_{M^\star_i} \subseteq F_{M^\star_i}$ of the module merge $\sqcup\!\!\!\!+_{M' \in \mathcal{C}(M^\star_i)} M' \to M^\star_i$ to the previous intermediate graph $G_{i-1}$. Now, there might be linearly many modules in $\mathcal{C}(M^\star_i)$ which have to be merged at once to create $M^\star_i$. However, from an algorithmic point of view the module $M^\star_i$ is not known in advance. Hence, in each step, for a given prime module $M$ of $G$ an editing algorithm has to choose one of the exponentially many sets from the power set $\mathcal{P}(\mathbb{P}_{\max}(G[M]))$ to determine which new module $M^\star_i$ has to be created. For an algorithmic approach, however, it would be more convenient to only merge modules in a pairwise manner, since then only quadratically many combinations of choosing two elements of $\mathbb{P}_{\max}(G[M])$ have to be considered in each step.

The aim of this section is to show that for each of the $n$ steps of creating one of the new strong modules $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ of $H$ it is possible to replace the merge operation $\sqcup\!\!\!\!+_{M' \in \mathcal{C}(M^\star_i)} M' \to M^\star_i$ with a series of pairwise merge operations.

Before we can state this result we have to define the following partition of strong modules of a resulting cograph $H$ that are no modules of a given graph $G$.

Definition 5.3. Let $G = (V, E)$ be an arbitrary graph, $F$ a module-preserving cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Moreover, let $M^\star \in \mathcal{M}$ be a strong module of $H$ that is no module of $G$ and consider the partitions $\mathbb{P}_{\max}(H[M^\star]) = \{\tilde M_1, \dots, \tilde M_k\}$ and $\mathcal{C}(M^\star) = \{\hat M_1, \dots, \hat M_l\}$. We define with $\mathcal{X}(M^\star) = \{M_0, \dots, M_n\}$ the set of modules that contains the maximal (w.r.t. inclusion) modules of $\mathbb{P}_{\max}(H[M^\star]) \cup \mathcal{C}(M^\star)$ as follows

$\mathcal{X}(M^\star) := \{\tilde M_i \in \mathbb{P}_{\max}(H[M^\star]) \mid \exists\, \hat M_j \in \mathcal{C}(M^\star)$ s.t. $\hat M_j \subseteq \tilde M_i\} \;\cup\; \{\hat M_j \in \mathcal{C}(M^\star) \mid \exists\, \tilde M_i \in \mathbb{P}_{\max}(H[M^\star])$ s.t.
$\tilde M_i \subseteq \hat M_j\}$. Note that for technical reasons the index of the elements in $\mathcal{X}$ starts with 0.

Furthermore, assume that $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ is a partially ordered (w.r.t. inclusion) set of all strong modules of $H$ that are no modules of $G$. For each $M^\star_i \in \mathcal{M}$ let $\mathcal{X}(M^\star_i) = \{M_{i,0}, \dots, M_{i,l_i}\}$ and set $M^\star_i(j) = \bigcup_{k=0}^{j} M_{i,k}$ for all $1 \le i \le n$ and $1 \le j \le l_i$. Then, we denote with $\mathcal{N}(\mathcal{M}) = \{N^\star_1 = M^\star_1(1), \dots, N^\star_m = M^\star_n(l_n)\}$ the set of all such $M^\star_i(j)$. In particular, we assume that $\mathcal{N}(\mathcal{M})$ is ordered as follows: if $N^\star_k = M^\star_i(j)$ and $N^\star_l = M^\star_{i'}(j')$, then $k < l$ if and only if either $i < i'$, or $i = i'$ and $j < j'$, i.e., within $\mathcal{N}(\mathcal{M})$ the elements $M^\star_i(j)$ are ordered first w.r.t. $i$, and second w.r.t. $j$.

Although we have already shown by Theorem 4.5 that any new strong module $M^\star \in \mathcal{M}$ of $H$ can be obtained by merging the modules from $\mathcal{C}(M^\star)$, we will see in the following that $M^\star$ can also be obtained by merging the modules from $\mathcal{X}(M^\star)$. In particular, we will see that if all elements in $\mathcal{X}(M^\star)$ are already modules of the intermediate graph $G^\star$, then we can use any order of the elements within $\mathcal{X}(M^\star)$ and successively merge them in a pairwise manner to construct $M^\star$. As a consequence of doing pairwise module merges we obtain in each step an intermediate module $N^\star \in \mathcal{N}(\mathcal{M})$.

To see the intention to use the partition $\mathcal{X}(M^\star)$ instead of $\mathcal{C}(M^\star)$ observe the following. Due to the order of the elements in $\mathcal{M}$, the modules $M^\star_1, \dots, M^\star_n$ are constructed from bottom to top, i.e., when module $M^\star$ is processed then all child modules from $\mathbb{P}_{\max}(H[M^\star])$ are already constructed. So, instead of obtaining $M^\star$ by merging $\mathcal{C}(M^\star)$ we can indeed obtain $M^\star$ also by merging $\mathbb{P}_{\max}(H[M^\star])$. However, it might be the case that a non-trivial union satisfies $\bigcup_{i \in I} \tilde M_i = \hat M_j$ for some $j$, e.g., if $\hat M_j$ is a (strong) prime module of $G$ but not a strong module of $H$. But also in this case, we have to assure that $\hat M_j$ remains a module of $H$.
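In particular, we do not want to destroy $\hat M_j$ by merging the elements from $\mathbb{P}_{\max}(H[M^\star])$ in the incorrect order. Thus, we choose $\hat M_j \in \mathcal{X}(M^\star)$ and do not include the individual $\tilde M_i$, $i \in I$ into $\mathcal{X}(M^\star)$.

Before we can continue, we have to show that $\mathcal{X}(M^\star)$ as given in Definition 5.3 is indeed a partition of $M^\star$.

Proposition 5.4. Let $G = (V, E)$ be an arbitrary graph, $F$ a module-preserving cograph edit set, and $H = (V, E \mathbin{\triangle} F)$ the resulting cograph. Moreover, let $M^\star$ be a strong module of $H$ that is no module of $G$ and consider the partitions $\mathbb{P}_{\max}(H[M^\star]) = \{\tilde M_1, \dots, \tilde M_k\}$ and $\mathcal{C}(M^\star) = \{\hat M_1, \dots, \hat M_l\}$.
Then $\mathcal{X}(M^\star)$ is a partition of $M^\star$. As a consequence, for each $M \in \mathcal{X}(M^\star)$ there are index sets $I \subseteq \{1, \dots, k\}$ and $J \subseteq \{1, \dots, l\}$ such that $M = \bigcup_{i \in I} \tilde M_i$ and $M = \bigcup_{j \in J} \hat M_j$.

Proof. First note that all $\tilde M_i \in \mathbb{P}_{\max}(H[M^\star])$ are strong modules of $H$. Moreover, all $\hat M_j \in \mathcal{C}(M^\star)$ are strong modules of $G$. Since $F$ is module-preserving it follows that none of the elements $\tilde M_i \in \mathbb{P}_{\max}(H[M^\star])$ overlap any $\hat M_j \in \mathcal{C}(M^\star)$, and vice versa. Hence, for each $\tilde M_i \in \mathbb{P}_{\max}(H[M^\star])$ there are three distinct cases: Either $\tilde M_i \subseteq \hat M_j$, or $\hat M_j \subsetneq \tilde M_i$, or $\tilde M_i \cap \hat M_j = \emptyset$ for all $\hat M_j \in \mathcal{C}(M^\star)$. Now, since $\mathbb{P}_{\max}(H[M^\star])$ and $\mathcal{C}(M^\star)$ are partitions of $M^\star$ it follows for each $x \in M^\star$ that $x$ is contained in exactly one $\tilde M_i \in \mathbb{P}_{\max}(H[M^\star])$ and exactly one $\hat M_j \in \mathcal{C}(M^\star)$ and either $\tilde M_i \subseteq \hat M_j$ or $\hat M_j \subsetneq \tilde M_i$. By construction of $\mathcal{X}(M^\star)$ then either $\tilde M_i = \hat M_j \in \mathcal{X}(M^\star)$; or $\tilde M_i \in \mathcal{X}(M^\star)$ and $\hat M_j \notin \mathcal{X}(M^\star)$; or $\tilde M_i \notin \mathcal{X}(M^\star)$ and $\hat M_j \in \mathcal{X}(M^\star)$. Thus, $\mathcal{X}(M^\star)$ is a partition of $M^\star$.

Using the partitions $\mathcal{X}(M^\star)$, $M^\star \in \mathcal{M}$ we now show that there is a sequence of pairwise module merge operations that construct the intermediate modules $N^\star_j \in \mathcal{N}(\mathcal{M})$ while keeping all modules from $G$ as well as all previous modules $N^\star_i$, $i < j$.

Lemma 5.5. Let $G = (V, E)$ be an arbitrary graph, $F$ an optimal module-preserving cograph edit set, $H = (V, E \mathbin{\triangle} F)$ the resulting cograph and $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$ be the partially ordered (w.r.t.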
inclusion) set of all strong modules of $H$ that are no modules of $G$. For each $M^\star_i \in \mathcal{M}$ let $\mathcal{X}(M^\star_i) = \{M_{i,0}, \dots, M_{i,l_i}\}$ and assume that $\mathcal{N} := \mathcal{N}(\mathcal{M}) = \{N^\star_1, \dots, N^\star_m\}$. Note that each $N^\star_l$ coincides with some $M^\star_i(j) = \bigcup_{k=0}^{j} M_{i,k}$. We define $F_{M^\star_i(j)} \subseteq F$ as the set
\[ F_{M^\star_i(j)} := \{\{x, v\} \in F \mid x \in M^\star_i(j),\ v \in P_{M^\star_i} \setminus M^\star_i(j)\}. \]
Furthermore, set $G'_0 = G$ and for each $1 \le l \le m$ define $G'_l = G'_{l-1} \triangle \theta_l$ with
\[ \theta_l = \begin{cases} \emptyset, & \text{if } N^\star_l \text{ is a module of } G'_{l-1}, \\ F_{N^\star_l} \setminus \bigcup_{k=1}^{l-1} \theta_k, & \text{otherwise.} \end{cases} \]
If $N^\star_l$ is no module of $G'_{l-1}$, then $\theta_l$ contains exactly those edits that affect the out-neighborhood of $N^\star_l = M^\star_i(j)$ within $G[P_{M^\star_i}]$ that have not been used so far. The following statements are true for the intermediate graphs $G'_l$, $1 \le l \le m$:
1. Any set $N^\star_k$ is a module of $G'_l$ for all $k \le l$.
2. Any module $M'$ of $G$ is a module of $G'_l$, i.e., $\bigcup_{k=1}^{l} \theta_k$ is module-preserving.
3. Either $G'_{l-1} \simeq G'_l$, or there are two modules $M_1, M_2$ of $G'_{l-1}$ such that $M_1 \mathbin{\sqcup\!\!\!\!+} M_2 \to N^\star_l$ is a pairwise module merge w.r.t. $G'_l$.

Proof. Before we start to prove the statements, we will first show

Claim 1. For each $1 \le l \le m$ it holds that $N^\star_l$ is a module of $H$.

Proof of Claim 1. By construction $N^\star_l = M^\star_i(j) = \bigcup_{k=0}^{j} M_{i,k}$ for some $1 \le i \le n$ and $1 \le j \le l_i$ with $M_{i,k} \in \mathcal{X}(M^\star_i)$. Moreover, for each $M_{i,k}$ it holds either that $M_{i,k} \in \mathbb{P}_{\max}(H[M^\star_i])$ or that $M_{i,k}$ is a union of elements in $\mathbb{P}_{\max}(H[M^\star_i])$. Therefore, $N^\star_l$ is a union of elements in $\mathbb{P}_{\max}(H[M^\star_i])$. Since $M^\star_i$ is a strong non-prime module of $H$, Theorem 3.3(T3) implies that each union of elements in $\mathbb{P}_{\max}(H[M^\star_i])$ is a module of $H$ and therefore, $N^\star_l$ is a module of $H$, which proves Claim 1. $\triangleleft$

We proceed to prove Statements 1 and 2 for each intermediate graph $G'_l$ by induction over $l$. Since $G'_0 = G$, Statements 1 and 2 are satisfied for $G'_0$. We continue to show that Statements 1 and 2 are satisfied for $G'_{l+1}$ under the assumption that they are satisfied for $G'_l$. We start by proving Statement 1.
First assume that $N^\star_{l+1}$ is already a module of $G'_l$. Then, by construction, it holds that $\theta_{l+1} = \emptyset$ and therefore, $G'_l = G'_{l+1}$. Now, by the induction assumption, all modules of $G$ and all modules $N^\star_k \in \mathcal{N}$, $k \le l$, are modules of $G'_l = G'_{l+1}$. Hence, all modules $N^\star_k \in \mathcal{N}$, $k \le l+1$, are modules of $G'_{l+1}$. Thus, if $N^\star_{l+1}$ is already a module of $G'_l$, then Statement 1 is satisfied for $G'_{l+1}$.

Now assume that $N^\star_{l+1}$ is not a module of $G'_l$. For the proof of Statement 1, we show first

Claim 2. $N^\star_{l+1}$ is a module of $G'_{l+1}$.

Proof of Claim 2. By construction it holds that $N^\star_{l+1} = M^\star_i(j)$ for some $1 \le i \le n$ and $1 \le j \le l_i$. Note that $P_{M^\star_i}$ is a module of $G$ and therefore, by the induction assumption, it is a module of $G'_l$. Since $\theta_{l+1} \subseteq F_{M^\star_i(j)}$ only affects the $\mathrm{out}_{M^\star_i(j)}$-neighborhood within the prime module $P_{M^\star_i}$ of $G$, it follows that $P_{M^\star_i}$ is a module of $G'_{l+1}$. Moreover, it holds that $F_{M^\star_i(j)} \subseteq \bigcup_{k=1}^{l+1} \theta_k$. Note that $F_{M^\star_i(j)}$ contains all those edits that affect the $\mathrm{out}_{M^\star_i(j)}$-neighborhood within the prime module $P_{M^\star_i}$ of $G$. Hence, for all $x \in M^\star_i(j)$ and all $y \in P_{M^\star_i} \setminus M^\star_i(j)$ it holds that $xy \in E(H)$ if and only if $xy \in E(G'_{l+1})$. The latter arguments then imply that $M^\star_i(j)$ is a module of $G'_{l+1}$ and therefore, $N^\star_{l+1}$ is a module of $G'_{l+1}$. This proves Claim 2. $\triangleleft$

Now, we proceed with showing

Claim 3. $N^\star_k$, $k \le l$, is a module of $G'_{l+1}$.

Proof of Claim 3. Let $N^\star_k = M^\star_{i'}(j')$ and $N^\star_{l+1} = M^\star_i(j)$. By the induction assumption, $N^\star_k$ is a module of $G'_l$. By the ordering of elements in $\mathcal{N}$ it holds that $i' \le i$, and by the ordering of elements in $\mathcal{M}$ it then follows that $P_{M^\star_{i'}} \subseteq P_{M^\star_i}$ or $P_{M^\star_{i'}} \cap P_{M^\star_i} = \emptyset$. If $P_{M^\star_{i'}} \cap P_{M^\star_i} = \emptyset$, then $N^\star_k$ is not affected by the edits in $\theta_{l+1}$ since they are all within $P_{M^\star_i}$ and thus, $N^\star_k$ remains a module of $G'_{l+1}$. Now consider the case $P_{M^\star_{i'}} \subseteq P_{M^\star_i}$. For later reference, we show

Claim 3'. $N^\star_k \subseteq N^\star_{l+1}$ or $N^\star_k \cap N^\star_{l+1} = \emptyset$.

Proof of Claim 3'.
If i′ = i, then j′ < j and by construction, M?i′(j ′) ⊆ M?i (j) which implies that N?k ⊆ N?l+1. Assume now that i′ < i and thus, N?k = M?i′(j′) ⊆ M?i′ . Since M?i and M ? i′ are strong modules of H they cannot overlap. Therefore, and due to the ordering of the elements in M it follows that either M?i′ ⊂ M?i or M?i′ ∩M?i = ∅. If M?i′ ∩ M?i = ∅, then N?k ∩ N?l+1 = ∅. If M?i′ ⊂ M?i , then there is a module M ′ ∈ Pmax(H[M?i ]) such that M?i′ ∈ M ′, since M?i and M?i′ are strong modules of H . Furthermore, the set M?i (j) is a union of elements in X (M?i ) and for each Mi,h ∈ X (M?i ) it holds that either Mi,h ∈ Pmax(H[M?i ]) or Mi,h is the union of elements in Pmax(H[M?i ]). Hence, it follows that either M ′ ⊆ M?i (j) or M ′ ∩ M?i (j) = ∅. If M ′ ∩M?i (j) = ∅, then M?i′(j′) ∩M?i (j) = ∅ and hence, N?k ∩N?l+1 = ∅. If, on the other hand, M ′ ⊆ M?i (j), then M?i′(j′) ⊆ M?i (j) and thus, N?k ⊆ N?l+1. Therefore, in all cases we have either N?k ⊆ N?l+1 or N?k ∩N?l+1 = ∅, which proves Claim 3’.  By Claim 3’, we are left with the following two cases. Case N?k ⊆ N?l+1. Since θl+1 did not effect edges within N?l+1 it holds that G′l[N?l+1] ' G′l+1[N ? l+1]. By induction assumption, N ? k is a module of G ′ l and hence, of G′l[N ? l+1] = G ′ l[M ? i (j)]. Thus, N ? k is a module of G ′ l+1[M ? i (j)]. Now, since N ? l+1 is a module of G′l+1 and by Lemma 3.1 it follows that N ? k is a module of G ′ l+1. Case N?k ∩N?l+1 = ∅. Recall that N?k = M?i′(j′) and N?l+1 = M?i (j) by the fact that i′ ≤ i. Moreover, as shown in the proof of Claim 2, we have FM?i (j) ⊆ ⋃l+1 k=1 θk. Therefore, for all x ∈ M?i (j) and all y ∈ M?i′(j′) it holds that xy ∈ E(H) if and only if xy ∈ E(G′l+1). Now let y, y′ ∈ M?i′(j′) and x 6∈ \M?i′(j′). Since M?i′(j′) is a module of H , xy as well as xy′ are either both edges H or both are non-edges in H . If x ∈M?i (j), then there are no further edits F \FM?i (j) that may affect any of these edges, sinceFM?i (j) ⊆ ⋃l+1 k=1 θk. 
Thus, $xy \in E(G'_{l+1})$ if and only if $xy' \in E(G'_{l+1})$. If $x \notin M^\star_i(j)$, then $xy$ as well as $xy'$ are not affected by $\theta_{l+1}$. Hence, $xy' \in E(G'_{l+1})$ if and only if $xy' \in E(G'_l)$. By the induction assumption, $M^\star_{i'}(j')$ is a module of $G'_l$ and hence, $xy \in E(G'_l)$ if and only if $xy' \in E(G'_l)$ and therefore, $xy \in E(G'_{l+1})$ if and only if $xy' \in E(G'_{l+1})$. Hence, $N^\star_k = M^\star_{i'}(j')$ is a module of $G'_{l+1}$, which proves Claim 3. $\triangleleft$

By Claims 1, 2 and 3, Statement 1 is satisfied for $G'_{l+1}$. We continue to prove Statement 2 and assume that $M'$ is a module of $G$ and, by the induction assumption, a module of $G'_l$. Again, let $N^\star_{l+1} = M^\star_i(j)$ and consider the module $P_{M^\star_i}$ of $G$. Since $P_{M^\star_i}$ is strong in $G$, it cannot overlap $M'$. Thus, either $M' \cap P_{M^\star_i} = \emptyset$, or $P_{M^\star_i} \subseteq M'$, or $M' \subset P_{M^\star_i}$. If $M' \cap P_{M^\star_i} = \emptyset$ or $P_{M^\star_i} \subseteq M'$, then $M'$ is not affected by the edits in $\theta_{l+1}$ since they are all within $P_{M^\star_i}$ and thus, $M'$ remains a module of $G'_{l+1}$. Hence, we only have to consider the case $M' \subset P_{M^\star_i}$. We show

Claim 4. Either $M' \subseteq N^\star_{l+1}$ or $M' \cap N^\star_{l+1} = \emptyset$.

Proof of Claim 4. Note again that the set $M^\star_i(j)$ is a union of elements in $\mathcal{X}(M^\star_i)$ and for each $M_{i,h} \in \mathcal{X}(M^\star_i)$ it holds that either $M_{i,h} \in \mathbb{P}_{\max}(G[P_{M^\star_i}])$ or $M_{i,h}$ is the union of elements in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$. Hence, $M^\star_i(j)$ is a union of elements in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$. Theorem 3.3(T2) implies that no union of elements in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$ of the prime module $P_{M^\star_i}$ is a module of $G$ and thus, $M^\star_i(j)$ cannot be a proper subset of $M'$. Therefore, either $M' \subseteq M^\star_i(j)$, or $M' \cap M^\star_i(j) = \emptyset$, or $M'$ and $M^\star_i(j)$ overlap. However, the latter case cannot occur, since then $M'$ would either overlap one of the strong modules in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$ or be a union of elements in $\mathbb{P}_{\max}(G[P_{M^\star_i}])$. Thus, in all cases either $M' \subseteq N^\star_{l+1}$ or $M' \cap N^\star_{l+1} = \emptyset$, which proves Claim 4. $\triangleleft$

Now the same argumentation that was used to show Statement 1 can be used to show Statement 2. Thus, Statement 2 is satisfied for $G'_{l+1}$.
Finally, we prove Statement 3. To this end, assume that G′l 6' G′l+1 and that N?l+1 is no module of G′l. We show that there are modules M1,M2 ∈ G′l with M1 t+ M2 → N?l+1 being a pairwise module merge w.r.t. G′l+1. Clearly, Items (ii) and (iii) of Definition 4.1 are satisfied, since N?l+1 is a module of G ′ l+1 but no module of G ′ l. It remains to show that there are two modules M1,M2 ∈ G′l with M1 ∪M2 = N?l+1 and M1,M2 ∈ G′l+1, i.e., Item (i) of Definition 4.1 is satisfied. Note, N?l+1 = M ? i (j) for some i and j ≥ 1. Assume first that j = 1. Then, M?i (1) = Mi,0 ∪Mi,1 with Mi,0,Mi,1 ∈ X (M?i ). For each Mi,h it holds that Mi,h ∈ Pmax(H[PM?i ]) or Mi,h ∈ Pmax(G[PM?i ]). If Mi,h ∈ Pmax(G[PM?i ]) then Mi,h is a module of G and by Statement 2, a module of G′l and G ′ l+1. If Mi,h is no module of G, then Mi,h ∈ Pmax(H[PM?i ]) is a new strong module of H . Therefore, there exists a k < i such that Mi,h = M?k . Since M ? k = M ? k (lk) and by the ordering of elements in N it holds that M?k (lk) = N?k′ for some k′ ≤ l. Thus, by Statement 1, all Mi,h and therefore, Mi,0 and Mi,1 are modules of G′l and G ′ l+1. Now, assume that N?l+1 = M ? i (j) with j > 1. Then, M ? i (j) = M ? i (j − 1) ∪Mi,j . By the same argumentation as before, it holds thatMi,j is a module ofG′l andG ′ l+1. Moreover, by Statement 1, M?i (j − 1) = N?l is a module of G′l and G′l+1. Thus, there are modules M1,M2 of G′l and G ′ l+1 with M1 ∪M2 = N?l+1. Moreover, since for all {x, y} ∈ θl+1 it holds that either x ∈ N?l+1 and y ∈ PM?i \ N ? l+1, or vice versa, it follows that there are no additional edits contained in θl+1 besides the edits of the module merge M1 t+ M2 → N?l+1 that transforms G′l into G′l+1. We are now in the position to derive the main result of this section that shows that optimal pairwise module-merge is always possible. Theorem 5.6 (Pairwise Module-Merge). 
For an arbitrary graph $G = (V,E)$ and an optimal module-preserving cograph edit set $F$ with $H = (V, E \triangle F)$ being the resulting cograph, there exists a sequence of pairwise module merge operations that transforms $G$ into $H$.

Proof. Set $\mathcal{M} = \{M^\star_1, \dots, M^\star_n\}$, $\mathcal{N} = \{N^\star_1, \dots, N^\star_m\}$, $\mathcal{X}(M^\star_i) = \{M_{i,0}, \dots, M_{i,l_i}\}$, as well as $\theta_k$ and $G'_k$ for all $1 \le k \le m$ as in Lemma 5.5. Again, we set $G'_0 := G$ and $H' := G'_m$. By Lemma 5.5, for each $1 \le k \le m$ either $G'_{k-1} \simeq G'_k$ or there is a pairwise module merge $M_1 \mathbin{\sqcup\!\!\!\!+} M_2 \to N^\star_k$ that transforms $G'_{k-1}$ into $G'_k$. Thus, there exists a sequence of module merge operations that transforms $G$ into some graph $H'$. In what follows, we will show that $\dot{\bigcup}_{k=1}^{m} \theta_k = F$ and therefore $H' \simeq H$, from which we can conclude the statement. For simplicity, we put $F' := \bigcup_{k=1}^{m} \theta_k$. We start with showing

Claim 1. $F' \subseteq F$.

Proof of Claim 1. Note first that by construction it holds that $\theta_k \cap \theta_l = \emptyset$ for all $k \ne l$ and therefore, $F' = \bigcup_{k=1}^{m} \theta_k = \dot{\bigcup}_{k=1}^{m} \theta_k$. By construction of $\theta$ it holds that $\theta_k \subseteq F$ for all $1 \le k \le m$. Hence, $F' \subseteq F$. $\triangleleft$

Before we show that $F = F'$, we will prove

Claim 2. All strong modules of $H$ are modules of $H'$.

Proof of Claim 2. Lemma 5.5(2) implies that all modules $M'$ of $G$ are modules of $H'$. Moreover, Lemma 5.5(1) implies that all $N^\star_k \in \mathcal{N}$ are modules of $H'$. Since for all $M^\star_i \in \mathcal{M}$ it holds that $M^\star_i = M^\star_i(l_i) = N^\star_k$ for some $1 \le k \le m$, the set $M^\star_i$ is a module of $H'$. Since each strong module of $H$ is either a module of $G$ or a new module $M^\star_i \in \mathcal{M}$, all strong modules of $H$ are modules of $H'$. $\triangleleft$

We continue to show

Claim 3. $F' \subsetneq F$ is not possible.

Proof of Claim 3. By Claim 1, $F' \subseteq F$. Thus assume for contradiction that $F' \ne F$. Since $F$ is an optimal edit set and $F' \subsetneq F$, it follows that $H'$ is not a cograph. Thus, there exists a prime module $M$ in $H'$ that contains no other prime module. We will now show that $M$ is a module of $H$ and that all $M_i \in \mathbb{P}_{\max}(H[M])$ are modules of $H'$.
To this end, consider the strong module $P_M$ of $H$ that entirely contains $M$ and that is minimal w.r.t. inclusion. Since $P_M$ is strong in $H$ it is, by Claim 2, also a module of $H'$. Moreover, each module $M_i \in \mathbb{P}_{\max}(H[P_M])$ is strong in $H$ and, again by Claim 2, a module of $H'$ as well. If $P_M = M$, then $M$ is a module of $H$ and we are done. Assume now that $M \subsetneq P_M$. Note that since $M$ and all $M_i \in \mathbb{P}_{\max}(H[P_M])$ are modules of $H'$ and $M$ is strong in $H'$, it holds that $M$ does not overlap any $M_i \in \mathbb{P}_{\max}(H[P_M])$. Moreover, $M \not\subseteq M_i$ since otherwise $M_i$ would have been chosen instead of $P_M$. Thus, $M = \bigcup_{i \in I} M_i$ is the union of some elements $M_i$ in $\mathbb{P}_{\max}(H[P_M])$. Since $P_M$ is a non-prime module of $H$, it follows by Theorem 3.3(T3) that $M$ is a module of $H$.

Since $H$ is a cograph, the children $M_i \in \mathbb{P}_{\max}(H[P_M])$ of the non-prime module $P_M$ are the connected components of either $H[P_M]$ (if $P_M$ is parallel) or its complement $\overline{H[P_M]}$ (if $P_M$ is series). Since $M = \bigcup_{i \in I} M_i$ is the union of some elements in $\mathbb{P}_{\max}(H[P_M])$ and $H[M] \subseteq H[P_M]$, we can conclude that $H[M]$, resp., its complement $\overline{H[M]}$, has as its connected components $M_i$, $i \in I$. Thus, $\mathbb{P}_{\max}(H[M]) \subset \mathbb{P}_{\max}(H[P_M])$. Hence, all $M_i$, $i \in I$, are strong modules in $H$ and, by the discussion above, all $M_i$ are modules of $H'$.

Since all $M_i \in \mathbb{P}_{\max}(H[M])$ are modules of $H'$ and all $M'_j \in \mathbb{P}_{\max}(H'[M])$ are strong in $H'$, it holds that no $M_i \in \mathbb{P}_{\max}(H[M])$ can overlap any $M'_j \in \mathbb{P}_{\max}(H'[M])$. Therefore, if $M_i \cap M'_j \ne \emptyset$ then either $M'_j \subsetneq M_i$ or $M_i \subseteq M'_j$ for any $i$ and $j$. If $M'_j \subsetneq M_i$, then $M_i$ must be the union of some elements in $\mathbb{P}_{\max}(H'[M])$. However, since $M$ is prime in $H'$, no union of elements in $\mathbb{P}_{\max}(H'[M])$, besides $M$ itself, is a module of $H'$ (cf. Theorem 3.3(T2)). Thus, $M_i$ cannot be a module of $H'$; a contradiction. Hence, $M_i \subseteq M'_j$ and therefore, each $M'_j$ is the union of some elements in $\mathbb{P}_{\max}(H[M])$.
Note that this holds for any $M'_j \in \mathbb{P}_{\max}(H'[M])$, i.e., there are distinct sets $I_1, \dots, I_{|\mathbb{P}_{\max}(H'[M])|}$ with $I_j \subsetneq \{1, \dots, |\mathbb{P}_{\max}(H[M])|\}$ such that $M'_j = \bigcup_{i \in I_j} M_i$. Hence, all $M'_j$ are modules of $H$. Since $M$ is prime in $H'$ and $M$ does not contain any other prime module, it holds that all $H'[M'_j]$ are cographs. Moreover, since all $M'_j$ are modules in $H$ and $M$ is prime in $H'$, there are at least two distinct $M'_k, M'_l \in \mathbb{P}_{\max}(H'[M])$ such that, for all $x \in M'_k$ and $y \in M'_l$, $xy \in E(H')$ if and only if $xy \notin E(H)$. Thus, $F'' = \{\{x, y\} \mid x \in M'_k, y \in M'_l\} \subseteq F$. Now, since all $H'[M'_j]$ are cographs, it holds that $H'[M'_k \cup M'_l]$ is a cograph. Now, consider the graph $H'' = G \triangle (F \setminus F'')$, and in particular the subgraph $H''[M] = G[M] \triangle (F[M] \setminus F'')$. Again, since all $H'[M'_j]$ with $M'_j \in \mathbb{P}_{\max}(H'[M])$ are cographs, it holds that $H[M'_j] \simeq H'[M'_j] \simeq H''[M'_j]$. By construction of $F''$ for the previously chosen $M'_k$ and $M'_l$ it holds that $H'[M'_k \cup M'_l] \simeq H''[M'_k \cup M'_l]$ and that $H[M \setminus (M'_k \cup M'_l)] \simeq H''[M \setminus (M'_k \cup M'_l)]$, each of which is a cograph. Moreover, since for all $x \in M'_k \cup M'_l$ and all $y \in M \setminus (M'_k \cup M'_l)$ we have $xy \in E(H)$ if and only if $xy \in E(H'')$, it holds that $H''[M]$ is a cograph as well. Note that $F'' \subseteq F[M]$ and $F'' \ne \emptyset$ and therefore, $|F[M] \setminus F''| < |F[M]|$. But then, since $F[M] \setminus F''$ is an edit set for $G[M]$, Lemma 3.9 implies that the set $F$ is not optimal; a contradiction. Thus, $F'$ cannot be a proper subset of $F$, which proves Claim 3. $\triangleleft$

Claims 1 and 3 immediately imply that $F = F'$. In particular, we have
\[ F' = \mathop{\dot{\bigcup}}_{i=1}^{n} \mathop{\dot{\bigcup}}_{j=1}^{l_i} \theta'_{M^\star_i(j)} = \mathop{\dot{\bigcup}}_{i=1}^{n} \mathop{\dot{\bigcup}}_{j=1}^{l_i} \theta_{M^\star_i(j)} = F. \]

It can easily be seen from the latter results that each of the modules in $\mathcal{N}(\mathcal{M}) = \{N^\star_1, \dots, N^\star_m\}$ that is created by a pairwise module merge is either already a module of $G$, or a union of elements from $\mathbb{P}_{\max}(G[M])$ of some prime module $M$ of $G$.
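To make the pairwise merge steps of this section concrete, the following Python sketch checks the module property before and after two successive edits on a six-vertex graph. The edge list is an assumption: it is one graph that is consistent with the edit set $F = \{\{0,1\},\{3,4\}\}$ and the merges $\{3\} \mathbin{\sqcup\!\!\!\!+} \{5\}$ and $\{1\} \mathbin{\sqcup\!\!\!\!+} \{2\}$ discussed for Figure 3, and the helper names (`neighbors`, `is_module`, `is_cograph`) are purely illustrative and do not appear in the paper.

```python
from itertools import combinations


def neighbors(edges, v):
    """Vertices adjacent to v; every edge is a frozenset of two vertices."""
    return {u for e in edges if v in e for u in e if u != v}


def is_module(vertices, edges, M):
    """M is a module iff each vertex outside M sees all of M or none of it."""
    M = set(M)
    return all(neighbors(edges, v) & M in (set(), M) for v in vertices - M)


def is_cograph(vertices, edges):
    """Brute-force P4 test, intended for tiny graphs only."""
    for quad in combinations(sorted(vertices), 4):
        sub = [e for e in edges if e <= set(quad)]
        degrees = sorted(sum(v in e for e in sub) for v in quad)
        if degrees == [1, 1, 2, 2]:  # three edges forming an induced path
            return False
    return True


# Assumed example graph (not taken verbatim from the paper).
V = set(range(6))
G = {frozenset(e) for e in
     [(0, 1), (0, 4), (1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 4)]}

theta_1 = {frozenset((3, 4))}   # edits used to merge {3} and {5}
theta_2 = {frozenset((0, 1))}   # edits used to merge {1} and {2}

G1 = G ^ theta_1                # symmetric difference, i.e. G "triangle" theta_1
H = G1 ^ theta_2

print(is_module(V, G, {3, 5}), is_module(V, G1, {3, 5}))
print(is_module(V, G1, {1, 2}), is_module(V, H, {1, 2}))
print(is_module(V, H, {1, 2, 3, 5}), is_cograph(V, G), is_cograph(V, H))
```

In this assumed example each edit step turns the union of two existing modules into a new module while earlier modules are kept, which is the behaviour that Lemma 5.5 guarantees in general.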
5.2 A modular-decomposition-based heuristic for cograph editing

Although the (decision version of the) optimal cograph-editing problem is NP-complete [38, 39], it is fixed-parameter tractable (FPT) [6, 39, 49]. However, the best-known runtime for an FPT algorithm is $O(4.612^k + |V|^{4.5})$, where the parameter $k$ denotes the number of edits. These results are of little use for practical applications, because the parameter $k$ can become quite large. An exact algorithm that runs in $O(3^{|V|}|V|)$ time is introduced in [53]. Moreover, approximation algorithms are described in [16, 46].

In the following we provide an alternative exact algorithm for the cograph-editing problem based on pairwise module-merge. The virtue of this algorithm is that it can be adapted very easily to design a cograph-editing heuristic.

Algorithm 1 contains two points at which the choice of a particular module or a particular pair of modules affects performance and efficiency. First, the function get-module-pair() returns two modules of $\mathcal{P}_p$ in the correct order of the sequence of pairwise module merge operations that transforms $G$ into $H$ (cf. Theorem 5.6). Second, the subroutine get-module-pair-edit() is used to compute the edits needed to merge the modules $M_i$ and $M_j$ into a new module such that these edits affect only the vertices within $P_p$ (cf. Lemma 5.5).

[Figure 3 (image not reproduced): three graphs on the vertices 0, ..., 5, among them $G$ and $H$, together with their modular decomposition trees.]

Figure 3: Illustration of Lemmas 5.1 – 5.5, Theorem 5.6 and the exact algorithm. Consider the non-cograph $G$, the cograph $H = G \triangle F$ and the optimal module-preserving edit set $F = \{\{0,1\},\{3,4\}\}$. The modular decomposition trees are depicted below the respective graphs. Let $\mathcal{M} = \{M^\star_1, M^\star_2, M^\star_3\}$ be the inclusion-ordered set of strong modules of $H$ that are no modules of $G$. For all modules $M^\star_i \in \mathcal{M}$ the inclusion-minimal module $P_{M^\star_i}$ is the prime module $M_1$ in $G$.
In compliance with Lemma 5.2 we start with constructing the module $M^\star_1$. By definition, $F_{M^\star_1} = \{\{3,4\}\} = \sigma_{M^\star_1}$ and we obtain $G_1 = G \triangle \sigma_{M^\star_1}$. Thus, $\{3\} \mathbin{\sqcup\!\!\!\!+} \{5\} \to M^\star_1$ w.r.t. $G_1$. Next, we continue with $M^\star_2$. By construction, $F_{M^\star_2} = \{\{0,1\},\{3,4\}\}$ and $\sigma_{M^\star_2} = F_{M^\star_2} \setminus F_{M^\star_1} = \{\{0,1\}\}$. We then obtain $G_2 = G_1 \triangle \sigma_{M^\star_2} = H$. Thus, $\mathop{\sqcup\!\!\!\!+}_{M_i \in \mathcal{C}(M^\star_2)} M_i \to M^\star_2$ w.r.t. $G_2 = H$. The module $M^\star_3$ is now obtained for free, since $F_{M^\star_3} = \{\{0,1\},\{3,4\}\}$ and $\sigma_{M^\star_3} = F_{M^\star_3} \setminus (F_{M^\star_1} \cup F_{M^\star_2}) = \emptyset$.
In compliance with Lemma 5.5, i.e., when considering pairwise module merge only, we start with constructing the module $M^\star_1(1)$. Here, $\mathcal{X}(M^\star_1) = \{M_0 = \{3\}, M_1 = \{5\}\}$ and $M^\star_1(1) = \{3,5\} = M^\star_1$. By definition, $F_{M^\star_1(1)} = \{\{3,4\}\} = \theta_{M^\star_1(1)}$ and we obtain $G_{1,1} = G_1 = G \triangle \theta_{M^\star_1(1)}$. Thus, $\{3\} \mathbin{\sqcup\!\!\!\!+} \{5\} \to M^\star_1$ w.r.t. $G_{1,1} = G_1$. Next, we continue with $M^\star_2(1)$ and $M^\star_2(2)$. Here, $\mathcal{X}(M^\star_2) = \{M_0 = \{1\}, M_1 = \{2\}, M_2 = M^\star_1\}$, $M^\star_2(1) = \{1\} \cup \{2\}$ and $M^\star_2(2) = \{1,2,3,5\} = M^\star_2$. By definition, $\theta_{M^\star_2(1)} = F_{M^\star_2(1)} \setminus F_{M^\star_1(1)} = \{\{0,1\}\}$ comprises the edits to obtain the new module $\{1,2\}$. Thus, $\{1\} \mathbin{\sqcup\!\!\!\!+} \{2\} \to M^\star_2(1)$ w.r.t. $G_{2,1}$. Then, since $F_{M^\star_2(2)} = F_{M^\star_2} = \{\{0,1\},\{3,4\}\}$, we obtain $\theta_{M^\star_2(2)} = F_{M^\star_2(2)} \setminus (F_{M^\star_1} \cup \theta_{M^\star_2(1)}) = \emptyset$. Thus, there are no edits left to apply in order to arrive at $H$, since $G_{2,1} = G_{2,2} = G_2 = H$. Again, the module $M^\star_3$ is now obtained for free. In all steps, we obtained the new modules by merging pairs of existing modules.

Algorithm 1 Pairwise Module Merge
 1: INPUT: A graph $G = (V,E)$.
 2: $G^\star \leftarrow G$
 3: $F^\star \leftarrow \emptyset$
 4: MDs($G$) $\leftarrow$ compute-modular-decomposition($G$)
 5: Let $P_1, \dots, P_m$ be the prime modules of $G$, partially ordered w.r.t. inclusion, i.e., $P_i \subseteq P_j$ implies $i \le j$.
 6: for $p = 1, \dots, m$ do
 7:    $\mathcal{P}_p \leftarrow \mathbb{P}_{\max}(G[P_p])$
 8:    while $G^\star[P_p]$ is not a cograph do
 9:       $M_i, M_j \leftarrow$ get-module-pair($\mathcal{P}_p$)   {according to Theorem 5.6}
10:       if $M_i \cup M_j$ is no module of $G^\star$ then
11:          $\theta \leftarrow$ get-module-pair-edit($M_i \mathbin{\sqcup\!\!\!\!+} M_j \to N$ w.r.t. $G[P_p]$)   {according to $\theta_l$ in Lemma 5.5}
12:          $G^\star \leftarrow G^\star \triangle \theta$
13:       end if
14:       $\mathcal{P}_p \leftarrow \mathcal{P}_p \setminus \{M_i, M_j\} \cup \{N\}$
15:    end while
16: end for
17: OUTPUT: $H = G^\star$

Lemma 5.7. Let $\mathcal{P}(G)$ be the set of all strong prime modules of $G$ and suppose that Algorithm 1 is applied to the graph $G$ with $n = |V(G)|$. If get-module-pair() is an "oracle" that always returns the correct pair $M_i$ and $M_j$ and get-module-pair-edit() returns the correct edit set $\theta$, then Algorithm 1 computes an optimally edited cograph $H$ in $O(m \Lambda h(G)) \le O(n^2 h(G))$ time, where $m$ denotes the number of strong prime modules in $G$, $\Lambda = \max_{P \in \mathcal{P}(G)} |\mathbb{P}_{\max}(G[P])|$ is the size of the largest maximal strong partition among all prime modules $P \in \mathcal{P}(G)$, and $h(G)$ is the maximal cost for evaluating get-module-pair() and get-module-pair-edit().

Proof. The correctness of Algorithm 1 follows directly from Lemma 5.5 and Theorem 5.6. The modular decomposition tree of a graph $G = (V,E)$ can be computed in linear time, i.e., $O(|V| + |E|) \le O(n^2)$ with $n = |V(G)|$, see [9, 13, 40, 41, 52]. It yields the partial order $P_1, \dots, P_m$ of the prime modules of $G$ (line 5) in time $O(n)$ by depth-first search. Then, we have to resolve each of the $m$ prime modules, and in the worst case all modules have to be merged step by step, resulting in an effort of $O(|\mathbb{P}_{\max}(G[P_p])|)$ merging steps in each iteration. Since $m \le n$ and $\Lambda \le n$ we obtain $O(n^2 h(G))$ as an upper bound.

In practice, the exact computation of the optimal editing requires exponential effort. To be more precise, we now derive the complexity $h(G)$ as in Lemma 5.7 using a naive brute-force method. Given a prime module $P$ with $\lambda = |\mathbb{P}_{\max}(G[P])|$ child modules, there are $\binom{\lambda}{2}$ possibilities for selecting the first module pair that has to be merged. After merging those two modules there are at most $\lambda - 1$ modules left from which possibly two more have to be merged.
In general, in the $i$-th merging step ($i \ge 1$) there are at most $\binom{\lambda-i+1}{2}$ possible merge pairs left. This process has to be repeated at most $\lambda - 4$ times, since any module with fewer than four child modules cannot be prime. In the worst case this adds up to
\[ \prod_{i=4}^{\lambda} \binom{i}{2} = \prod_{i=4}^{\lambda} \frac{i!}{2!\,(i-2)!} = \prod_{i=4}^{\lambda} \frac{i\,(i-1)}{2} \]
merge sequences per prime module of $G$, which gives $O((\lambda!)^2)$ executions of get-module-pair() per prime module in $G$. Finding the optimal edit set for one merge operation of two modules $M_1, M_2 \in \mathbb{P}_{\max}(G[P])$ requires checking the $2^{\lambda-2}$ combinations to add or remove edges to adjust the $\mathrm{out}_{M_1}$- and $\mathrm{out}_{M_2}$-neighbors w.r.t. the remaining $\lambda - 2$ modules, since, after the merge, for each of the remaining modules $M \in \mathbb{P}_{\max}(G[P]) \setminus \{M_1, M_2\}$ there are either only edges or only non-edges between the vertices from $M$ and $M_1 \cup M_2$. In summary, for a given prime module $P$ the graph $G[P]$ can be optimally edited to a cograph in $O((\lambda!)^2 2^{\lambda})$ time. Therefore, with $\Lambda = \max_P |\mathbb{P}_{\max}(G[P])|$ being the size of the largest maximal strong partition among all prime modules $P$ of $G$, it follows that $h(G) \in O((\Lambda!)^2 2^{\Lambda})$.

We note in passing that $\Lambda$ is always less than or equal to the maximum degree in the modular decomposition tree, which is also known as modular-width [1, 18]. Hence, the latter findings together with Lemma 5.7 imply the following

Observation 5.8. The optimal cograph editing problem parameterized by the modular-width $k$ can be solved in $O((k!)^2 2^k |V|^2)$ time and thus, it is in FPT.

Practical heuristics for get-module-pair() and get-module-pair-edit() can be implemented to run in polynomial time. In particular, as a main result, we can observe that it is always possible to find an optimal edit set by successively merging only pairs of modules. Based on this, we provide in the following several strategies to improve the runtime of these heuristics.
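As a rough numerical illustration of the brute-force count derived above, the following Python sketch evaluates the product $\prod_{i=4}^{\lambda} \binom{i}{2}$ and compares it with the bounds $(\lambda!)^2$ and $(\lambda!)^2 2^{\lambda}$. The function names `merge_sequences` and `brute_force_bound` are illustrative choices, not notation from the paper.

```python
from math import comb, factorial, prod


def merge_sequences(lam):
    """Product of C(i, 2) for i = 4..lam (empty product = 1 if lam < 4)."""
    return prod(comb(i, 2) for i in range(4, lam + 1))


def brute_force_bound(lam):
    """The bound (lam!)^2 * 2^lam used for h(G) per prime module."""
    return factorial(lam) ** 2 * 2 ** lam


for lam in range(4, 11):
    count = merge_sequences(lam)
    # The count of merge sequences never exceeds (lam!)^2.
    assert count <= factorial(lam) ** 2
    print(lam, count, brute_force_bound(lam))
```

Even for moderate values of $\lambda$ the numbers grow very quickly, which is why the exact procedure is only of interest when the modular-width is small.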
A simple greedy strategy yields a heuristic with O(|V |3) time complexity as follows: In each call of get-module-pair() select the pair (Mi,Mj) in P where the edit set that adjusts the outMi - and outMj -neighbors so that the outMi∪Mj -neighborhood becomes identical in G?[Pp] has minimum cardinality. This minimum edit set can be obtained from get-module-pair-edit() by adjusting only the out-neighbors of the smaller module to be identical to the out-neighbors of the larger module. The pseudocode for this heuristic is given in Algorithm 2 which is, in fact, a natural extension of the exact Algorithm 1. A detailed numerical evaluation will be discussed elsewhere. Lemma 5.9. Algorithm 2 outputs a cograph and has a time complexity of O(|V |3). Proof. First we show that Algorithm 2 constructs a cograph. To this end we show that in each iteration of the main for-loop (Lines 16 to 41) the corresponding prime module Pp is edited such that the resulting subgraph G?[Pp] is a cograph and Pp is still a module of G?. Due to the processing order of the prime modules P1, . . . , Pm constructed in Line 4, we may assume that, upon processing a prime module Pp, the induced subgraphsG?[M ],M ∈ Pmax(G[Pp]) are already cographs and all M are modules of G?. This holds in particular for the prime modules that do not contain any other prime module in the input graph G and which, therefore, are processed first. Hence, it suffices to show that if all G?[M ], M ∈ Pmax(G[Pp]), are already cographs and all M are modules in G?, then executing the p − th iteration of the for-loop results in an updated intermediate graph G′ with G′[Pp] being a cograph and Pp as well as all modules M ∈ Pmax(G[Pp]) remain modules of G′. In Line 17, we define P = Pmax(G[Pp]) and therefore, by assumption, all G?[M ], M ∈ P are cographs and all M are modules of G?. In particular, the two sets Mi and Mj that are chosen first (in Line 20) are already cographs. Moreover, since Mi and Mj are modules of G? 
it follows that $G^\star[M_i \cup M_j]$ is either the disjoint union $G^\star[M_i] \mathbin{\dot\cup} G^\star[M_j]$ or the join $G^\star[M_i] \oplus G^\star[M_j]$ of $G^\star[M_i]$ and $G^\star[M_j]$. Thus, $G^\star[M_i \cup M_j]$ is already a cograph and none of the edges within $M_i \cup M_j$ is edited further.

Algorithm 2 Pairwise Module Merge Heuristic
 1: INPUT: A graph $G = (V,E)$.
 2: $G^\star \leftarrow G$
 3: MDs($G$) $\leftarrow$ compute-modular-decomposition($G$)
 4: Let $P_1, \dots, P_m$ be the prime modules of $G$, partially ordered w.r.t. inclusion, i.e., $P_i \subseteq P_j$ implies $i \le j$.
 5: $A \leftarrow$ zero-initialized |MDs($G$)| × |MDs($G$)| matrix
 6: $B \leftarrow$ zero-initialized |MDs($G$)| × |MDs($G$)| × |MDs($G$)| matrix
 7: ▷ Lines 8 to 15: Initialize $A$, where the entries $A_{ij}$ store the number $|V \setminus \{M_i \cup M_j\}|$ of vertices that need to be adjusted to merge the modules $M_i$ and $M_j$. Initialize $B$ s.t. $B_{ijk} = 1$ iff $M_i$ and $M_j$ have different out-neighborhoods w.r.t. $M_k$.
 8: for each $\{M_i, M_j, M_k\} \in \binom{\text{MDs}(G)}{3}$ with $M_i, M_j, M_k$ being children of one and the same prime module $P$ do
 9:    if $\mathrm{out}_{M_i} \cap M_k \ne \mathrm{out}_{M_j} \cap M_k$ then $B_{ijk}, B_{jik} \leftarrow 1$ end if
10:    if $\mathrm{out}_{M_i} \cap M_j \ne \mathrm{out}_{M_k} \cap M_j$ then $B_{ikj}, B_{kij} \leftarrow 1$ end if
11:    if $\mathrm{out}_{M_j} \cap M_i \ne \mathrm{out}_{M_k} \cap M_i$ then $B_{jki}, B_{kji} \leftarrow 1$ end if
12:    $A_{ij}, A_{ji} \leftarrow A_{ij} + |M_k| \cdot B_{ijk}$
13:    $A_{ik}, A_{ki} \leftarrow A_{ik} + |M_j| \cdot B_{ikj}$
14:    $A_{jk}, A_{kj} \leftarrow A_{jk} + |M_i| \cdot B_{jki}$
15: end for
16: for $p = 1, \dots, m$ do
17:    $\mathcal{P} \leftarrow \mathbb{P}_{\max}(G[P_p])$
18:    while $|\mathcal{P}| > 1$ do
19:       $\theta \leftarrow \emptyset$   {$\theta$ denotes the set of (non)edges that will be edited}
20:       select two distinct modules $M_i$ and $M_j$ from $\mathcal{P}$ with $|M_i| \ge |M_j|$ that have a minimum value of $A_{ij} \cdot |M_j|$
21:       ▷ Lines 22 to 26: Compute the edits for adjusting the $\mathrm{out}_{M_i \cup M_j}$-neighborhood s.t. $M_j$ has the same out-neighborhood as $M_i$ within $G[P_p]$. Note, since $P_p$ is a module of $G$, $M_j$ and $M_i$ have the same out-neighbors in $G$ after editing.
22:       if $A_{ij} \ne 0$, i.e., $M_i \cup M_j$ is no module of $G^\star$ then
23:          for each $M_k \in \mathcal{P} \setminus \{M_i, M_j\}$ do
24:             if $B_{ijk} = 1$ then $\theta \leftarrow \theta \cup \{xy \mid x \in M_j, y \in M_k\}$ end if
25:          end for
26:       end if
27:       ▷ Lines 28 to 30: Adjust in $A$ the number of edits needed for merging the new module $M_i \cup M_j$ with some $M_k$.
28:       for each $M_k \in \mathcal{P} \setminus \{M_i, M_j\}$ do
29:          $A_{ik}, A_{ki} \leftarrow A_{ik} - |M_j| \cdot B_{ikj}$
30:       end for
31:       ▷ Lines 32 to 34: Adjust in $A$ the number of edits needed for merging two modules $M_k$ and $M_l$.
32:       for each $\{M_k, M_l\} \in \binom{\mathcal{P} \setminus \{M_i, M_j\}}{2}$ do
33:          $A_{kl}, A_{lk} \leftarrow A_{kl} + |M_j| \cdot B_{kli} - |M_j| \cdot B_{klj}$
34:       end for
35:       remove the $j$-th row and column of $A$
36:       remove the $j$-th layer in all 3 dimensions of $B$
37:       in $\mathcal{P}$ replace $M_i$ with $M_i \cup M_j$
38:       $\mathcal{P} \leftarrow \mathcal{P} \setminus \{M_j\}$
39:       $G^\star \leftarrow G^\star \triangle \theta$
40:    end while
41: end for
42: OUTPUT: $H = G^\star$

It remains to show that applying the edits constructed in Line 24 results in the (new) merged module $M_i \cup M_j$ of $G^\star \triangle \theta$. Note, if $M_i \cup M_j$ is already a module of $G^\star$ then Lines 22 to 26 are not executed and therefore, $\theta = \emptyset$, which implies that $M_i \cup M_j$ remains a module of $G^\star \triangle \theta$. On the other hand, if $M_i \cup M_j$ is no module of $G^\star$ then the for-loop within Lines 22 to 26 iterates over all modules $M_k$ in $\mathcal{P} \setminus \{M_i, M_j\}$ and adjusts the edges between $M_j$ and $M_k$ to be in accordance with the edges between $M_i$ and $M_k$. Note that all those edits are within $P_p$. In particular, the $\mathrm{out}_{M_i \cup M_j}$-neighborhood was adjusted only between vertices from $M_j$ and vertices from $P_p \setminus (M_i \cup M_j)$. After applying these edits, $M_i \cup M_j$ is therefore a module in $G^\star[P_p] \triangle \theta$. In particular, the $\mathrm{out}_{P_p}$-neighborhood has not changed and $P_p$ is therefore a module of $G^\star$ as well as of $G^\star \triangle \theta$. Then, it follows by Lemma 3.1 that $M_i \cup M_j$ is a module in $G^\star \triangle \theta$.

To see that also all $M_k \in \mathcal{P} \setminus \{M_i, M_j\}$ remain modules in $G^\star \triangle \theta$, note first that $\mathcal{P}$ is a partition of $P_p$ and second, that only edges between $M_j$ and $M_k$ are edited for some $M_k \in \mathcal{P} \setminus \{M_i, M_j\}$. Moreover, if a (non)edge between $M_j$ and $M_k$ is edited, then all (non)edges $\{xy \mid x \in M_j, y \in M_k\}$ between $M_j$ and $M_k$ are edited. Thus all $M_k \in \mathcal{P} \setminus \{M_i, M_j\}$ remain modules of $G^\star[P_p] \triangle \theta$ and therefore modules of $G^\star \triangle \theta$.
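The edit step analysed in this part of the proof (Lines 22 to 26 of Algorithm 2) can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: it does not maintain the matrices $A$ and $B$ but compares adjacencies directly through one representative vertex per module, the function names `merge_edit` and `is_module` are hypothetical, and the six-vertex example graph is an assumed instance that is consistent with the edit set $\{\{0,1\},\{3,4\}\}$ used in Figure 3.

```python
def merge_edit(edges, parts, i, j):
    """Edits that give parts[j] the same out-neighbourhood as parts[i].

    `parts` is a partition of a prime module into modules, so a single
    representative per part decides adjacency between two parts.
    """
    theta = set()
    rep_i = next(iter(parts[i]))
    rep_j = next(iter(parts[j]))
    for k, part in enumerate(parts):
        if k in (i, j):
            continue
        rep_k = next(iter(part))
        sees_i = frozenset((rep_i, rep_k)) in edges
        sees_j = frozenset((rep_j, rep_k)) in edges
        if sees_i != sees_j:
            # Toggle every (non)edge between parts[j] and parts[k].
            theta |= {frozenset((x, y)) for x in parts[j] for y in part}
    return theta


def is_module(vertices, edges, M):
    """M is a module iff each outside vertex sees all of M or none of it."""
    M = set(M)
    for v in vertices - M:
        seen = {u for e in edges if v in e for u in e if u != v} & M
        if seen and seen != M:
            return False
    return True


# Assumed example graph (not taken verbatim from the paper).
V = set(range(6))
G = {frozenset(e) for e in
     [(0, 1), (0, 4), (1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 4)]}

parts = [{0}, {1}, {2}, {3}, {4}, {5}]
theta_a = merge_edit(G, parts, i=5, j=3)     # make {3} look like {5}
G1 = G ^ theta_a

parts = [{0}, {1}, {2}, {3, 5}, {4}]
theta_b = merge_edit(G1, parts, i=2, j=1)    # make {1} look like {2}
G2 = G1 ^ theta_b

print(sorted(map(sorted, theta_a)), sorted(map(sorted, theta_b)))
```

Which of the two modules is adjusted is a design choice: Algorithm 2 adjusts the smaller module $M_j$ to the larger module $M_i$, whereas for two modules of equal size the sketch simply follows the order of the arguments.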
Now consider the prime module Pp+1 that is processed in the next iteration of the main for-loop. It can be easily seen that for Pp+1 we also have: G?[M ],M ∈ Pmax(G[Pp+1]) is a cograph and all M are modules of G?, since all prime modules of G that are subsets of Pp+1 are already processed, and therefore, are all those M are non-prime modules of G? and form cographs G?[M ]. Hence, by the same argumentation as before, G?[Pp+1] is edited to a cograph by the next execution of the main for-loop. Thus, after processing all prime modules of G the final graph H is a cograph. Next, we show that Algorithm 2 has a time complexity of O(|V |3). Creating the mod- ular decomposition in Line 3 can be done in linear time by the algorithms presented in, e.g., [13, 41, 52]. Note that “linear” in this context means linear in the number of edges, i.e., O(|V | + |E|) ∈ O(|V |2). Initializing the matrices A and B (Lines 8 to 15) requires time O(|V |3) since the corresponding for-loop iterates over every ordered set of 3 strong modules of G and there are at most O(|V |) such modules. Moreover, checking if the out-neighborhoods of two modules Mi and Mj w.r.t. a third module Mk are identical (the if -statements in Lines 9 to 11) can be done in constant time by checking the adjacencies between three arbitrary vertices, exactly one from each of the three modules. For the re- maining Lines 16 to 41 we can consider how often the inner while-loop (Lines 18 to 40) is executed. Therefore, note that within each execution always two modules are merged and there are O(n) of those merge operations at most. This can most easily be seen by consid- ering the matrix A which has MDs(G) rows and columns at first with |MDs(G)| < |V |. Each row, respectively each column, of A represents a module that is possibly selected for merging. Moreover, within each iteration of the while-loop, the matrix A is reduced by one row, respectively one column. This leads to no more than |V | many executions of the while-loop. 
Selecting the two modules $M_i$ and $M_j$ in Line 20 requires $O(|V|^2)$ time. Although the for-loop in Lines 23 to 25 is executed $O(|V|)$ times and each partial edit set that is computed in Line 24 might contain more than $O(|V|)$ many edits, the whole edit set $\theta$ (constructed within Lines 23 to 25) contains no more than $O(|V|^2)$ edits. Thus, executing Lines 22 to 26 requires $O(|V|^2)$ time at most. Adjusting the matrix $A$ is done in two steps. Lines 28 to 30 iterate over $O(|V|)$ many modules $M_k$ and Lines 32 to 34 iterate over $O(|V|^2)$ many pairs of modules $(M_k, M_l)$. Shrinking the matrices $A$ and $B$ in Lines 35 and 36 can technically be done in time $O(|V|)$ if we use a labeling function $l \colon \mathbb{N} \to \mathbb{N}$ to index the values within the matrices, i.e., instead of reading $A_{ij}$ we read $A_{l(i),l(j)}$. Then we just have to relabel those indices, i.e., $l(x) \leftarrow l(x) + 1$ for all $x > j$. In that way we do not have to remove anything from $A$ or $B$. Lines 37 and 38 can also be done in $O(|V|)$ time and applying the edits in Line 39 requires at most $O(|V|^2)$ time. In summary, executing a single iteration of the while-loop requires $O(|V|^2)$ time, which, together with the at most $|V|$ executions of the while-loop, yields a total time complexity of $O(|V|^3)$.

The heuristic as given in Algorithm 2 is deterministic and therefore lacks a randomization component, which would be helpful in order to sample solutions and construct a consensus cograph. However, randomization can be introduced easily by selecting a pair of modules $M_i$ and $M_j$ in Line 20 with a probability inversely correlated with the value of $A_{ij} \cdot |M_j|$. Moreover, with probability $p = |M_i| / (|M_i| + |M_j|)$ the edits $\{xy \mid x \in M_j, y \in M_k\}$ can be selected in Line 24, and otherwise, with probability $1 - p$, the edits $\{xy \mid x \in M_i, y \in M_k\}$.

An even simpler (but probably less accurate) heuristic with time complexity $O(|V|^2)$ can be obtained by randomly selecting the next pair of modules $M_i$ and $M_j$ that have to be merged.
Such a procedure would not require the computation of the matrices A and B at all. Nevertheless, this O(|V|²)-time heuristic requires that the edit set θ can be computed in O(|V|) time. This is possible if we only track the O(|V|) edits on the corresponding quotient graph G*[P_p]/Pmax(G[P_p]) and recover the O(|V|²) individual edits from them only once, in a single post-processing step at the end.

Cograph editing heuristics based on the destruction of P4s require O(|V|⁴) time merely for enumerating all P4s. Thus, using module merges as editing operations may lead to significantly faster cograph editing heuristics.

ORCID iDs
Adrian Fritz https://orcid.org/0000-0002-9853-5577
Marc Hellmuth https://orcid.org/0000-0002-1620-5508
Peter F. Stadler https://orcid.org/0000-0002-5016-5191
Nicolas Wieseke https://orcid.org/0000-0002-9538-7564

References
[1] F. N. Abu-Khzam, S. Li, C. Markarian, F. Meyer auf der Heide and P. Podlipyan, Modular-width: An auxiliary parameter for parameterized parallel complexity, in: M. Xiao and F. A. Rosamond (eds.), Frontiers in Algorithmics, Springer International Publishing, Cham, volume 10336 of Lecture Notes in Computer Science, 2017 pp. 139–150, doi:10.1007/978-3-319-59605-1_13.
[2] A. Blass, Graphs with unique maximal clumpings, J. Graph Theory 2 (1978), 19–24, doi:10.1002/jgt.3190020104.
[3] S. Böcker and A. W. M. Dress, Recovering symbolically dated, rooted trees from symbolic ultrametrics, Adv. Math. 138 (1998), 105–125, doi:10.1006/aima.1998.1743.
[4] A. Brandstädt, V. B. Le and J. P. Spinrad, Graph Classes: A Survey, SIAM Monographs on Discrete Mathematics and Applications, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1999, doi:10.1137/1.9780898719796.
[5] A. Bretscher, D. Corneil, M. Habib and C. Paul, A simple linear time LexBFS cograph recognition algorithm, SIAM J. Discrete Math. 22 (2008), 1277–1296, doi:10.1137/060664690.
[6] L. Cai, Fixed-parameter tractability of graph modification problems for hereditary properties, Inform. Process. Lett. 58 (1996), 171–176, doi:10.1016/0020-0190(96)00050-6.
[7] D. G. Corneil, H. Lerchs and L. Steward Burlingham, Complement reducible graphs, Discrete Appl. Math. 3 (1981), 163–174, doi:10.1016/0166-218x(81)90013-5.
[8] D. G. Corneil, Y. Perl and L. K. Stewart, A linear recognition algorithm for cographs, SIAM J. Comput. 14 (1985), 926–934, doi:10.1137/0214065.
[9] A. Cournier and M. Habib, A new linear algorithm for modular decomposition, in: S. Tison (ed.), Trees in Algebra and Programming – CAAP ’94, Springer, Berlin, volume 787 of Lecture Notes in Computer Science, pp. 68–84, 1994, doi:10.1007/bfb0017474, proceedings of the Nineteenth International Colloquium held in Edinburgh, April 11 – 13, 1994.
[10] D. D. Cowan, L. O. James and R. G. Stanton, Graph decomposition for undirected graphs, in: F. Hoffman, R. B. Levow and R. S. D. Thomas (eds.), Proceedings of the Third Southeastern Conference on Combinatorics, Graph Theory and Computing, Utilitas Mathematica, Winnipeg, 1972 pp. 281–290, held at Florida Atlantic University, Boca Raton, Fla., February 28 – March 2, 1972.
[11] C. Crespelle, Linear-time minimal cograph editing, 2019, preprint.
[12] E. Dahlhaus, J. Gustedt and R. M. McConnell, Efficient and practical modular decomposition, in: SODA ’97: Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997 pp. 26–35, held in New Orleans, LA, January 5 – 7, 1997.
[13] E. Dahlhaus, J. Gustedt and R. M. McConnell, Efficient and practical algorithms for sequential modular decomposition, J. Algorithms 41 (2001), 360–387, doi:10.1006/jagm.2001.1185.
[14] R. Dondi, N. El-Mabrouk and M. Lafond, Correction of weighted orthology and paralogy relations – Complexity and algorithmic results, in: M. Frith and C.
Storm Pedersen (eds.), Algorithms in Bioinformatics, Springer, Cham, volume 9838 of Lecture Notes in Computer Science, 2016 pp. 121–136, doi:10.1007/978-3-319-43681-4_10, proceedings of the 16th International Workshop (WABI 2016) held at Aarhus University, Aarhus, August 22 – 24, 2016.
[15] R. Dondi, M. Lafond and N. El-Mabrouk, Approximating the correction of weighted and unweighted orthology and paralogy relations, Alg. Mol. Biol. 12 (2017), Article no. 4 (15 pages), doi:10.1186/s13015-017-0096-x.
[16] R. Dondi, G. Mauri and I. Zoppis, Orthology correction for gene tree reconstruction: Theoretical and experimental results, Procedia Comput. Sci. 108 (2017), 1115–1124, doi:10.1016/j.procs.2017.05.047.
[17] A. Ehrenfeucht, H. N. Gabow, R. M. McConnell and S. J. Sullivan, An O(n²) divide-and-conquer algorithm for the prime tree decomposition of two-structures and modular decomposition of graphs, J. Algorithms 16 (1994), 283–294, doi:10.1006/jagm.1994.1013.
[18] J. Gajarský, M. Lampis and S. Ordyniak, Parameterized algorithms for modular-width, in: G. Gutin and S. Szeider (eds.), Parameterized and Exact Computation, Springer International Publishing, Cham, volume 8246 of Lecture Notes in Computer Science, 2013 pp. 163–176, doi:10.1007/978-3-319-03898-8_15, revised selected papers from the 8th International Symposium (IPEC 2013) held in Sophia Antipolis, September 4 – 6, 2013.
[19] T. Gallai, Transitiv orientierbare Graphen, Acta Math. Acad. Sci. Hungar. 18 (1967), 25–66, doi:10.1007/bf02020961.
[20] Y. Gao, D. R. Hare and J. Nastos, The cluster deletion problem for cographs, Discrete Math. 313 (2013), 2763–2771, doi:10.1016/j.disc.2013.08.017.
[21] M. Geiß, J. Anders, P. F. Stadler, N. Wieseke and M. Hellmuth, Reconstructing gene trees from Fitch’s xenology relation, J. Math. Biol. 77 (2018), 1459–1491, doi:10.1007/s00285-018-1260-8.
[22] M. Geiß, E. Chávez, M.
González Laffitte, A. López Sánchez, B. M. R. Stadler, D. I. Valdivia, M. Hellmuth, M. Hernández Rosales and P. F. Stadler, Best match graphs, J. Math. Biol. 78 (2019), 2015–2057, doi:10.1007/s00285-019-01332-9.
[23] M. Geiß, M. Hellmuth and P. F. Stadler, Reciprocal best match graphs, J. Math. Biol. 80 (2020), 865–953, doi:10.1007/s00285-019-01444-2.
[24] S. Guillemot, F. Havet, C. Paul and A. Perez, On the (non-)existence of polynomial kernels for P_l-free edge modification problems, Algorithmica 65 (2013), 900–926, doi:10.1007/s00453-012-9619-5.
[25] S. Guillemot, C. Paul and A. Perez, On the (non-)existence of polynomial kernels for P_l-free edge modification problems, in: V. Raman and S. Saurabh (eds.), Parameterized and Exact Computation, Springer, Berlin, volume 6478 of Lecture Notes in Computer Science, pp. 147–157, 2010, doi:10.1007/978-3-642-17493-3_15, proceedings of the 5th International Symposium (IPEC 2010) held in Chennai, December 13 – 15, 2010.
[26] M. Habib, F. De Montgolfier and C. Paul, A simple linear-time modular decomposition algorithm for graphs, using order extension, in: T. Hagerup and J. Katajainen (eds.), Algorithm Theory – SWAT 2004, Springer, Berlin, volume 3111 of Lecture Notes in Computer Science, pp. 187–198, 2004, doi:10.1007/978-3-540-27810-8_17.
[27] M. Habib and M. C. Maurer, On the X-join decomposition for undirected graphs, Discrete Appl. Math. 1 (1979), 201–207, doi:10.1016/0166-218x(79)90043-x.
[28] M. Habib and C. Paul, A survey of the algorithmic aspects of modular decomposition, Comput. Sci. Rev. 4 (2010), 41–59, doi:10.1016/j.cosrev.2010.01.001.
[29] M. Habib, C. Paul and L. Viennot, Partition refinement techniques: An interesting algorithmic tool kit, Internat. J. Found. Comput. Sci. 10 (1999), 147–170, doi:10.1142/s0129054199000125.
[30] M. Hellmuth, M. Hernandez-Rosales, K. T. Huber, V. Moulton, P. F. Stadler and N. Wieseke, Orthology relations, symbolic ultrametrics, and cographs, J. Math. Biol.
66 (2013), 399–420, doi:10.1007/s00285-012-0525-x.
[31] M. Hellmuth, P. F. Stadler and N. Wieseke, The mathematics of xenology: Di-cographs, symbolic ultrametrics, 2-structures and tree-representable systems of binary relations, J. Math. Biol. 75 (2017), 199–237, doi:10.1007/s00285-016-1084-3.
[32] M. Hellmuth and N. Wieseke, On symbolic ultrametrics, cotree representations, and cograph edge decompositions and partitions, in: D. Xu, D. Du and D. Du (eds.), Computing and Combinatorics, Springer International Publishing, Cham, volume 9198 of Lecture Notes in Computer Science, pp. 609–623, 2015, doi:10.1007/978-3-319-21398-9_48, proceedings of the 21st International Conference (COCOON 2015) held in Beijing, August 4 – 6, 2015.
[33] M. Hellmuth and N. Wieseke, From sequence data including orthologs, paralogs, and xenologs to gene and species trees, in: P. Pontarotti (ed.), Evolutionary Biology: Convergent Evolution, Evolution of Complex Traits, Concepts and Methods, Springer, Cham, chapter 21, pp. 373–392, 2016, doi:10.1007/978-3-319-41324-2_21.
[34] M. Hellmuth and N. Wieseke, On tree representations of relations and graphs: Symbolic ultrametrics and cograph edge decompositions, J. Comb. Optim. 36 (2018), 591–616, doi:10.1007/s10878-017-0111-7.
[35] M. Hellmuth, N. Wieseke, M. Lechner, H.-P. Lenhof, M. Middendorf and P. F. Stadler, Phylogenomics with paralogs, Proc. Natl. Acad. Sci. 112 (2015), 2058–2063, doi:10.1073/pnas.1412770112.
[36] M. Lafond, R. Dondi and N. El-Mabrouk, The link between orthology relations and gene trees: a correction perspective, Algorithms Mol. Biol. 11 (2016), Article no. 4 (13 pages), doi:10.1186/s13015-016-0067-7.
[37] M. Lafond and N. El-Mabrouk, Orthology relation and gene tree correction: complexity results, in: M. Pop and H. Touzet (eds.), Algorithms in Bioinformatics, Springer, Heidelberg, volume 9289 of Lecture Notes in Computer Science, 2015 pp.
66–79, doi:10.1007/978-3-662-48221-6_5, proceedings of the 15th International Workshop (WABI 2015) held at Georgia Technological Institute, Atlanta, GA, September 10 – 12, 2015.
[38] Y. Liu, J. Wang, J. Guo and J. Chen, Cograph editing: Complexity and parametrized algorithms, in: B. Fu and D.-Z. Du (eds.), Computing and Combinatorics, Springer-Verlag, Berlin, Heidelberg, volume 6842 of Lecture Notes in Computer Science, 2011 pp. 110–121, doi:10.1007/978-3-642-22685-4_10.
[39] Y. Liu, J. Wang, J. Guo and J. Chen, Complexity and parameterized algorithms for cograph editing, Theoret. Comput. Sci. 461 (2012), 45–54, doi:10.1016/j.tcs.2011.11.040.
[40] R. M. McConnell and J. P. Spinrad, Linear-time modular decomposition and efficient transitive orientation of comparability graphs, in: SODA ’94: Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1994 pp. 536–545, held in Arlington, Virginia, January 23 – 25, 1994.
[41] R. M. McConnell and J. P. Spinrad, Modular decomposition and transitive orientation, Discrete Math. 201 (1999), 189–241, doi:10.1016/s0012-365x(98)00319-7.
[42] R. M. McConnell and J. P. Spinrad, Ordered vertex partitioning, Discrete Math. Theor. Comput. Sci. 4 (2000), 45–60, https://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/113.1.html.
[43] R. H. Möhring, Algorithmic aspects of the substitution decomposition in optimization over relations, set systems and Boolean functions, Ann. Oper. Res. 4 (1985), 195–225, doi:10.1007/bf02022041.
[44] R. H. Möhring and F. J. Radermacher, Substitution decomposition for discrete structures and connections with combinatorial optimization, in: R. E. Burkard, R. A. Cuninghame-Green and U. Zimmermann (eds.), Algebraic and Combinatorial Methods in Operations Research, North-Holland, Amsterdam, volume 95 of North-Holland Mathematics Studies, pp.
257–355, 1984, doi:10.1016/s0304-0208(08)72966-9, proceedings of the Workshop on Algebraic Structures in Operations Research.
[45] J. H. Müller and J. Spinrad, Incremental modular decomposition, J. Assoc. Comput. Mach. 36 (1989), 1–19, doi:10.1145/58562.59300.
[46] A. Natanzon, R. Shamir and R. Sharan, Complexity classification of some edge modification problems, Discrete Appl. Math. 113 (2001), 109–128, doi:10.1016/s0166-218x(00)00391-7, selected papers from the 12th Workshop on Graph-Theoretic Concepts in Computer Science (WG ’99).
[47] N. Nøjgaard, N. El-Mabrouk, D. Merkle, N. Wieseke and M. Hellmuth, Partial homology relations – Satisfiability in terms of di-cographs, in: L. Wang and D. Zhu (eds.), Computing and Combinatorics, Springer International Publishing, Cham, volume 10976 of Lecture Notes in Computer Science, 2018 pp. 403–415, doi:10.1007/978-3-319-94776-1_34, proceedings of the 24th International Conference (COCOON 2018) held in Qing Dao, China, July 2 – 4, 2018.
[48] T. Ohtsuki, H. Mori, T. Kashiwabara and T. Fujisawa, On minimal augmentation of a graph to obtain an interval graph, J. Comput. System Sci. 22 (1981), 60–97, doi:10.1016/0022-0000(81)90022-2.
[49] F. Protti, M. Dantas da Silva and J. L. Szwarcfiter, Applying modular decomposition to parameterized cluster editing problems, Theory Comput. Syst. 44 (2009), 91–104, doi:10.1007/s00224-007-9032-7.
[50] D. Seinsche, On a property of the class of n-colorable graphs, J. Comb. Theory Ser. B 16 (1974), 191–193, doi:10.1016/0095-8956(74)90063-x.
[51] C. Semple and M. Steel, Phylogenetics, volume 24 of Oxford Lecture Series in Mathematics and Its Applications, Oxford University Press, Oxford, UK, 2003.
[52] M. Tedder, D. Corneil, M. Habib and C. Paul, Simpler linear-time modular decomposition via recursive factorizing permutations, in: L. Aceto, I. Damgård, L. A. Goldberg, M. M. Halldórsson, A.
Ingólfsdóttir and I. Walukiewicz (eds.), Automata, Languages and Programming, Part I, Springer, Berlin Heidelberg, volume 5125 of Lecture Notes in Computer Science, pp. 634–645, 2008, doi:10.1007/978-3-540-70575-8_52, proceedings of the 35th International Colloquium (ICALP 2008) held in Reykjavik, July 7 – 11, 2008.
[53] W. Timothy J. White, M. Ludwig and S. Böcker, Exact and heuristic algorithms for Cograph Editing, 2018, preprint, arXiv:1711.05839v3 [cs.DS].

ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.02
https://doi.org/10.26493/2590-9770.1250.763
(Also available at http://adam-journal.eu)

Cayley graphs of order kp are hamiltonian for k < 48

Dave Witte Morris, Kirsten Wilk
Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge, Alberta, T1K 3M4, Canada

Received 2 May 2018, accepted 8 July 2018, published online 4 May 2020

Abstract
We provide a computer-assisted proof that if G is any finite group of order kp, where 1 ≤ k < 48 and p is prime, then every connected Cayley graph on G is hamiltonian (unless kp = 2). As part of the proof, it is verified that every connected Cayley graph of order less than 48 is either hamiltonian connected or hamiltonian laceable (or has valence ≤ 2).

Keywords: Cayley graph, hamiltonian cycle, hamiltonian connected, hamiltonian laceable.
Math. Subj. Class. (2020): 05C25, 05C45

1 Introduction
In a series of papers [7, 11, 12, 16], it was shown that if 1 ≤ k < 32 (with k ≠ 24) and p is any prime number, then every connected Cayley graph on every group of order kp has a hamiltonian cycle (unless kp = 2). This note extends that work, by treating the previously excluded case k = 24, and by increasing the upper bound on k:

Theorem 1.1. If 1 ≤ k < 48, and p is any prime number, then every connected Cayley graph on every group of order kp has a hamiltonian cycle (unless kp = 2).

All of the results in the previous papers [7, 11, 12, 16] were verified by hand.
However, some of the proofs are quite lengthy, so many details were probably never checked by anyone other than the authors and the referees. The present paper takes the opposite approach: many of the results have not been verified by hand, but all of the source code is available at https://doi.org/10.26493/2590-9770.1250.763 (and also in https://arxiv.org/src/1805.00149v1/anc/), so the results can easily be reproduced by anyone with a standard installation of the computer algebra system GAP [10] (including the SmallGrp package [5]) and G. Helsgaun’s implementation LKH [14] of the Lin-Kernighan heuristic for the traveling salesperson problem. An effort was made to keep the algorithms in this paper simple, so they would be easy to verify, even though this precluded many optimizations.

E-mail addresses: dave.morris@uleth.ca (Dave Witte Morris), kirsten.wilk@uleth.ca (Kirsten Wilk)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

In addition to extending the above-mentioned results for k < 32, the present work also provides an independent verification of those results, because the proofs are essentially self-contained (other than relying heavily on the correctness of extensive GAP computations).

We also establish the following two results of independent interest:

Corollary 1.2. If |G| < 144 (and |G| > 2), then every connected Cayley graph on G is hamiltonian.

Proposition 1.3. If |G| < 48, then every connected Cayley graph on G is either hamiltonian connected or hamiltonian laceable (or has valence ≤ 2).

Remarks 1.4.
1. The definition of the terms “hamiltonian connected” and “hamiltonian laceable” can be found in Definition 2.5.
2. Almost all of this paper is devoted to the proof of Theorem 1.1. Corollary 1.2 and Proposition 1.3 are proved in Section 2C.
3.
It is explained in Section 5 that the paper’s calculations for the proof of Theorem 1.1 could be substantially shortened by accepting all of the results in the literature, rather than reproving some of them. For example, instead of treating all values of k from 1 to 47, it would suffice to consider only k ∈ {24, 32, 36, 40, 42, 45} (see Lemma 5.3(1)).
4. It is natural to ask whether the conclusion of Proposition 1.3 holds for all Cayley graphs, without any restriction on the order (cf. [8, Questions 4.1 and 4.3, pp. 121–122]). This is known to be true when G is abelian [6] and for a few other (very restricted) classes of Cayley graphs [1, 2, 3, 4], but Proposition 1.3 seems to be the first exhaustive examination of this topic for Cayley graphs of small order. Further calculations reported that the conclusion of Proposition 1.3 holds for all orders less than 108, but the additional computations took several weeks and were marred by crashes and other issues, so they are not definitive.

Method of attack 1.5. For each fixed k and prime number p, there are only finitely many groups G of order kp (up to isomorphism), and each of these groups has only finitely many Cayley graphs. Assuming that kp is not too large, LKH can find a hamiltonian cycle in all of them. This means that (given sufficient time) a computer can deal with any finite number of primes.

Therefore, large primes are the main concern. For these, we have the helpful observation that if G is a group of order kp, where p is prime and p > k, then G has a unique Sylow p-subgroup (so the Sylow p-subgroup is normal), and the Sylow p-subgroup is (isomorphic to) Z_p. This means that, after some computer calculations to eliminate the small cases (see Section 2D), we may assume Z_p ◁ G, and p ∤ k. For convenience, let Ḡ = G/Z_p, so |Ḡ| = k.

Since Z_p is cyclic, we are in position to apply the Factor Group Lemma (Lemma 2.12): it suffices to find a hamiltonian cycle in Cay(Ḡ; S̄) whose voltage generates Z_p.

There are infinitely many primes p, so a given group Ḡ of order k is the quotient of infinitely many different groups G. In order to deal simultaneously with all primes, first note that the Schur-Zassenhaus Theorem [19] tells us that G is a semidirect product: G = Z_p ⋊_τ Ḡ (see Lemma 2.13(10)). We construct a single “universal” (infinite) semidirect product G̃ = Z ⋊_τ̃ Ḡ that has every Z_p ⋊_τ Ḡ as a quotient. (For example, if all values of the twist homomorphism τ are ±1, then G̃ = Z ⋊_τ Ḡ.) In almost all cases, a computer search yields a hamiltonian cycle H in Cay(Ḡ; S̄), such that its voltage ṽ in Z is nonzero. Then H has nontrivial voltage in Z_p unless p is one of the finitely many prime divisors of ṽ. LKH can verify that all of the (finitely many) Cayley graphs corresponding to these primes are hamiltonian.

Fortunately, theoretical arguments can handle the few situations where the computer search was unable to find any hamiltonian cycles with nonzero voltage (see Lemma 3.1).

Contents
1 Introduction
2 Preliminaries
2A GAP
2B Finding hamiltonian cycles with LKH and exhaustive search
2C Some Cayley graphs that are hamiltonian connected/laceable
2D Cases where the Sylow p-subgroup is not Z_p or is not normal
2E Notation and assumptions
3 Irredundant generating sets of the quotient
4 Redundant generating sets of the quotient
5 Known results that can reduce the number of cases

2 Preliminaries
Notation 2.1.
1. G is always a group of order kp, where p is prime,
2. S is a generating set of G, and
3.
Cay(G;S) is the Cayley graph on G with respect to the generators S. The vertices of this graph are the elements of G, and there is an edge joining g and sg whenever g ∈ G and s ∈ S.

Remark 2.2. Unlike most authors, we do not require S to be symmetric (i.e., closed under inverses). Instead, in our notation, Cay(G;S) = Cay(G;S ∪ S⁻¹).

Hamiltonian cycles in a subgraph are also hamiltonian cycles in the ambient graph, so, in order to prove Theorem 1.1, there is no harm in making the following assumption:

Assumption 2.3. The generating set S of G is irredundant, in the sense that no proper subset of S generates G.

As mentioned in the introduction, the paper relies heavily on the computer algebra system GAP [10] and G. Helsgaun’s implementation LKH of the Lin-Kernighan heuristic [14].

2A GAP
The Small Groups library in GAP contains all of the groups of order less than 1024, and many others [5]. The number of groups of order k is given by the function NumberSmallGroups(k), and each group of order k has a unique id number (from 1 to NumberSmallGroups(k)). The GAP function SmallGroup(k, id) constructs the group of order k with the given id number.

The GAP package grape provides tools for working with graphs. In particular, it defines the function CayleyGraph(G, S) that constructs the Cayley graph of the group G with respect to the generating set S.

To prove Theorem 1.1, we wish to show, for certain groups G, that all of the Cayley graphs Cay(G;S) are hamiltonian. With Assumption 2.3 in mind, we would like to have a list of all of the irredundant generating sets of G. However, there is no need to distinguish between Cayley graphs that are isomorphic, so we consider two generating sets to be equivalent if one can be obtained from the other by applying an automorphism of G.
Furthermore, since Cay(G;S) = Cay(G;S ∪ S⁻¹), we also consider two generating sets to be equivalent if one can be obtained from the other by replacing some elements by their inverses. The function
IrredUndirGenSetsUpToAut(G)
constructs a list of all of the irredundant generating sets of G, up to equivalence. It is defined in the file UndirectedGeneratingSets.gap and is adapted from the AllMinimalGeneratingSets algorithm in the master’s thesis of B. Fuller [9, pp. 31–34]. (Fuller’s program does not allow generators to be replaced by their inverses.) Combining IrredUndirGenSetsUpToAut(G) with CayleyGraph(G, S) provides a list of all of the irredundant Cayley graphs on any group G.

2B Finding hamiltonian cycles with LKH and exhaustive search
G. Helsgaun’s [14] implementation LKH of the Lin-Kernighan heuristic is a very powerful tool for finding hamiltonian cycles, and the function
LKH(X, AdditionalEdges, RequiredEdges)
interfaces GAP with this program. (It is defined in the file LKH.gap.) Given a graph X (in grape format), and two lists of edges, the function constructs a graph X+ by adding the edges in AdditionalEdges to X, and asks LKH to find a hamiltonian cycle in X+ that contains all of the edges in RequiredEdges. If X = CayleyGraph(G, S), then the hamiltonian cycle is returned as a list of elements of G, in the order that they are visited by the cycle.

For example, the function
IsAllHamiltonianOfTheseOrders(OrdersToCheck)
uses LKH (together with IrredUndirGenSetsUpToAut(G) and CayleyGraph(G, S)) to verify that every Cayley graph of order k is hamiltonian, for every k in the list OrdersToCheck. (It is defined in the file IsAllHamiltonianOfTheseOrders.gap.)

LKH returns a single hamiltonian cycle, but we sometimes want several hamiltonian cycles, in order to find one whose voltage is nonzero.
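A cycle returned by such a search can be checked independently of the solver. The following Python sketch (not the paper's GAP code) illustrates the kind of test involved; the name `is_hamiltonian_cycle`, the adjacency-dictionary input, and the simplification that every edge is treated as undirected are assumptions of this example.

```python
def is_hamiltonian_cycle(adj, cycle, additional_edges=(), required_edges=()):
    """Check a claimed hamiltonian cycle.

    adj: dict vertex -> set of neighbours (undirected graph X).
    cycle: list of vertices in visiting order, start vertex not repeated.
    additional_edges: extra undirected edges added to X before checking.
    required_edges: undirected edges that the cycle must use.
    """
    allowed = {frozenset((u, v)) for u in adj for v in adj[u]}
    allowed |= {frozenset(edge) for edge in additional_edges}

    # Every vertex must be visited exactly once.
    if len(cycle) != len(adj) or set(cycle) != set(adj):
        return False

    # Consecutive vertices (cyclically) must be joined by an allowed edge.
    n = len(cycle)
    used = {frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)}
    if not used <= allowed:
        return False

    # All required edges must appear on the cycle.
    return all(frozenset(edge) in used for edge in required_edges)


# A 4-cycle 0-1-2-3-0 as a small test graph.
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_hamiltonian_cycle(square, [0, 1, 2, 3]))
print(is_hamiltonian_cycle(square, [0, 2, 1, 3]))
print(is_hamiltonian_cycle(square, [0, 2, 1, 3],
                           additional_edges=[(0, 2), (1, 3)]))
```

Because the check only reads the candidate cycle, it does not matter whether the cycle came from a heuristic or from an exhaustive search.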
The function
HamiltonianCycles(X, RequiredEdges)
finds all of the hamiltonian cycles in X that contain all of the edges in the list RequiredEdges. (It is defined in the file HamiltonianCycles.gap.) However, the list of all hamiltonian cycles may be unreasonably long (and may take too long to compute), so we instead rely on two functions that provide a fairly short list of hamiltonian cycles that suffice for the task at hand:
SeveralHamCycsInCay(GBar, SBar)
SeveralHamCycsInRedundantCay(GBar, S0Bar, a)
(Both of these functions are defined in the file SeveralHamCycsInCay.gap.) The first provides a list of hamiltonian cycles in Cay(Ḡ; S̄), whereas the second provides hamiltonian cycles in Cay(Ḡ; S̄0 ∪ {ā}).

Remark 2.4. In order to verify the correctness of the results in this paper, it is not necessary to verify the correctness of the source code of any of the four functions that provide hamiltonian cycles. This is because the output of these functions is always checked for validity before it is used; the function
IsHamiltonianCycle(X, H, AdditionalEdges, RequiredEdges)
was written for this purpose. It verifies that H is a hamiltonian cycle in the graph X+ that is obtained from X by adding the edges in the list AdditionalEdges, and also that H contains all of the edges in the list RequiredEdges. Our convention is that each edge [u, v] in AdditionalEdges and RequiredEdges is considered to be directed, unless [v, u] is also in the list, in which case the edge is undirected.

2C Some Cayley graphs that are hamiltonian connected/laceable
Definition 2.5 ([3, Definition 1.3]). Let X be a graph.
1. X is hamiltonian connected if X has a hamiltonian path from v to w, for all vertices v and w, such that v ≠ w.
2. X is hamiltonian laceable if X is bipartite, and it has a hamiltonian path from v to w, for all vertices v and w, such that v and w are not in the same bipartition set.

Justification of Proposition 1.3.
It is easy to write a GAP program that
• loops through all groups G of order < 48,
• loops through all irredundant generating sets S0 of G, and
• uses LKH to verify that Cay(G;S0) is hamiltonian connected/laceable if the valence is ≥ 3.
Cayley graphs are vertex transitive, so, for the last step, it suffices to find a hamiltonian path from the identity element e to all other elements a of G (for hamiltonian connectivity) or to all elements a of the other bipartition set (for hamiltonian laceability). To find this hamiltonian path, one can ask LKH to find a hamiltonian cycle in the graph X ∪ {ea}, such that the hamiltonian cycle contains the edge ea; removing the edge ea from the cycle leaves the desired path. (Note that, by symmetry, there is no need to find hamiltonian paths to both of a and a⁻¹.)

However, this is not sufficient to establish Proposition 1.3. Any generating set S of G contains an irredundant generating set S0, and it is obvious that:
• If Cay(G;S0) is hamiltonian connected, then Cay(G;S) is hamiltonian connected.
• If Cay(G;S0) is hamiltonian laceable, and Cay(G;S) is bipartite, then Cay(G;S) is hamiltonian laceable.
But it may be the case that Cay(G;S0) is bipartite and Cay(G;S) is not bipartite. In this situation, the hamiltonian laceability of Cay(G;S0) does not imply the required hamiltonian connectivity of Cay(G;S). Therefore, in cases where Cay(G;S0) is bipartite, the program also needs to verify hamiltonian connectivity for generating sets of the form S = S0 ∪ {g}, such that Cay(G;S) is not bipartite. (Such a set S can be called a nonbipartite extension of S0.) Note: we may assume that no proper subset of S generates G and gives a nonbipartite Cayley graph. (The hamiltonian connectivity of the Cayley graph of such a subset would imply the hamiltonian connectivity of Cay(G;S).)
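The reduction from hamiltonian paths to hamiltonian cycles through a required edge can be tried out on a small example. The Python sketch below replaces LKH by a plain backtracking search and uses the symmetric group S3 with a transposition and a 3-cycle as generators; the helper names, the permutation encoding and the choice of group are assumptions of this illustration, not the paper's GAP program.

```python
from itertools import permutations


def compose(p, q):
    """Product p*q of permutations stored as tuples (apply q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def cayley_graph(elements, gens):
    """Undirected Cayley graph: g is joined to s*g for s in gens or inverses."""
    connection = set(gens) | {inverse(s) for s in gens}
    return {g: {compose(s, g) for s in connection} for g in elements}


def ham_cycle_through(adj, u, v):
    """Look for a hamiltonian cycle of adj + {uv} that uses the edge uv.

    Returns the vertices in order from u to v (a hamiltonian path from u
    to v; adding the edge vu closes the cycle), or None if there is none.
    Assumes u != v.
    """
    n = len(adj)
    path, seen = [u], {u}

    def extend():
        if len(path) == n:
            return path[-1] == v
        for w in adj[path[-1]]:
            # v may only be used as the final vertex of the path
            if w in seen or (w == v and len(path) != n - 1):
                continue
            path.append(w)
            seen.add(w)
            if extend():
                return True
            path.pop()
            seen.discard(w)
        return False

    return list(path) if extend() else None


elements = list(permutations(range(3)))
e = (0, 1, 2)
gens = [(1, 0, 2), (1, 2, 0)]  # a transposition and a 3-cycle
adj = cayley_graph(elements, gens)

# Vertex transitivity: it is enough to start every path at the identity.
connected = all(ham_cycle_through(adj, e, a) is not None
                for a in elements if a != e)
print("hamiltonian connected:", connected)
```

For groups of realistic size the backtracking search would be far too slow, which is where a heuristic solver such as LKH comes in.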
Since Cay(G;S0) is hamiltonian laceable, we already know there are paths from e to any vertex in the other bipartition set, so only endpoints a in the bipartition set of e need to be considered.

Furthermore, if Cay(G;S0) has valence two, then it is (usually) not hamiltonian laceable. Therefore, in this case, the program should verify that Cay(G;S0 ∪ {g}) is hamiltonian connected/laceable for all g ∉ {e} ∪ S ∪ S⁻¹ (except that we need not consider both g and g⁻¹).

The GAP program in 1-3-HamConnOrLaceable.gap does all of this.

When dealing with the case k = 32, our proof of Theorem 1.1 also applies the following known result:

Lemma 2.6 ([20]). Every connected Cayley graph of order 64 is hamiltonian.

Justification. This is a special case of the fact that all Cayley graphs of prime-power order are hamiltonian (see Theorem 5.1(6)). However, to avoid relying on the literature, one can use the function call IsAllHamiltonianOfTheseOrders([64]) to verify this via a few days of computation. (There are over 14,000 Cayley graphs to consider, since most of the 267 groups of order 64 have many irredundant generating sets.)

Proof of Corollary 1.2. Assume |G| < 144. It is known that every connected Cayley graph on any nontrivial 2-group is hamiltonian (see Theorem 5.1(6)), so we may assume that |G| is divisible by some prime p ≥ 3. Then |G| = kp, where k = |G|/p < 144/3 = 48, so Theorem 1.1 applies.

It might be possible to avoid appealing to Theorem 5.1(6), by using LKH to find hamiltonian cycles in all of the Cayley graphs of order 128, but this would be a massive computation, and we did not carry it out.

2D Cases where the Sylow p-subgroup is not Z_p or is not normal
In all later sections of this paper, we will assume that the Sylow p-subgroup of G is isomorphic to Z_p, and is normal in G. The following proposition deals with the finitely many groups that do not satisfy this hypothesis.
(See Lemma 2.13(4) for a justification of the assumption that p is the largest prime divisor of kp.)

Proposition 2.7. Let P be a Sylow p-subgroup of G, and assume |G| = kp, where p is the largest prime divisor of kp, and k < 48.
1. If P ≇ Z_p, then every connected Cayley graph on G is hamiltonian.
2. If P ≅ Z_p and P ⋪ G, then every connected Cayley graph on G is hamiltonian.

Justification. (1) Since the Sylow p-subgroup of G is not isomorphic to Z_p, we know that p² is a divisor of |G| = kp, so p | k. In fact, p must be the largest prime divisor of k (since it is the largest prime divisor of kp). So p is uniquely determined by k. It is a simple matter to write a GAP program that
• loops through the values of k in {1, . . . , 47},
• loops through all the nonabelian groups G of order kp, where p is the largest prime divisor of k,
• loops through all the irredundant generating sets S of G (up to automorphisms of G), and
• uses LKH to verify that Cay(G;S) is hamiltonian.
(See the file 2-7(1)-SylowSubgroupNotZp.gap.) The calculations take several hours to complete. About half of the time is spent finding hamiltonian cycles in the Cayley graphs of order 32 × 2 = 64, since there are so many of them, so we separated out that part of the calculation (see Lemma 2.6).

One important modification to the algorithm deals with the problem that the original version of the program ran out of memory when trying to find the generating sets of SmallGroup(1058, 4). (This group arises for k = 46, since 1058 = 46 × 23.) Since 1058 = 2 × 23² is of the form 2p², Theorem 5.1(4) tells us that every Cayley graph on this group is hamiltonian. (In fact, this group is of “dihedral type”, so it is very easy either to find all of the irredundant generating sets by hand, or to prove that every connected Cayley graph is hamiltonian.) Therefore, the program skips this group (and prints the comment that it “is dihedral type of order 2p^2”).

(2) Let d be the number of Sylow p-subgroups of G.
We know from Sylow's Theorem that $d$ is a divisor of $k$, and that $d \equiv 1 \pmod{p}$. Also note that $d > 1$, since the Sylow subgroup $\mathbb{Z}_p$ is not normal, and therefore has conjugates. This implies $p < k$ (indeed, $p < d$ since $d \equiv 1 \pmod{p}$, and $d \le k$, since $d$ is a divisor of $k$). Therefore, for each $k$, there are only finitely many possibilities for $p$. It is a simple matter to write a GAP program that
• loops through the values of $k$ in $\{1, \ldots, 47\}$,
• loops through the primes $p$ that are:
  ◦ greater than the largest prime divisor of $k$,
  ◦ less than or equal to $k$, and
  ◦ such that there is a divisor $d$ of $k$, with $d > 1$ and $d \equiv 1 \pmod{p}$,
• loops through all the groups $G$ of order $kp$, such that a Sylow $p$-subgroup is not normal,
• loops through all the irredundant generating sets $S$ of $G$ (up to automorphisms of $G$), and
• uses LKH to verify that $\mathrm{Cay}(G;S)$ is hamiltonian.
(See the file 2-7(2)-SylowSubgroupNotNormal.gap.)

2E Notation and assumptions

Notation 2.8. In the remainder of this paper:
1. $G$ is always a group of order $kp$, where $1 \le k < 48$, and $p$ is a prime number.
2. $S$ is a generating set of $G$.
3. $\overline{\phantom{g}} \colon G \to G/\mathbb{Z}_p$ is the natural homomorphism $g \mapsto \overline{g}$, if it is the case that $\mathbb{Z}_p$ is the unique Sylow $p$-subgroup of $G$.

Convention 2.9. To avoid treating $k = 2$ as a special case, we will consider the graph $K_2$ to be hamiltonian, because it has a closed walk that visits all the vertices exactly once before returning to the starting point.

Notation 2.10. For $s_1, \ldots, s_n \in S \cup S^{-1}$, we use $(s_1, \ldots, s_n)$ to denote the walk in $\mathrm{Cay}(G;S)$ that visits (in order) the vertices $e, s_1, s_1 s_2, s_1 s_2 s_3, \ldots, s_1 s_2 \cdots s_n$.

Definition 2.11 (cf. [13, §2.1.3, p. 61]). For any hamiltonian cycle $H = (\overline{s_1}, \overline{s_2}, \ldots, \overline{s_n})$ in the Cayley graph $\mathrm{Cay}(\overline{G};\overline{S})$, we let $\mathrm{volt}_{G,S}(H) = \prod_{i=1}^{n} s_i$ be the voltage of $H$. This is an element of $\mathbb{Z}_p$.

We wish to show that $\mathrm{Cay}(G;S)$ has a hamiltonian cycle.
Our main tool is the following elementary observation:

Lemma 2.12 ("Factor Group Lemma" [21, §2.2]). Suppose
• $H = (\overline{s_1}, \overline{s_2}, \ldots, \overline{s_k})$ is a hamiltonian cycle in $\mathrm{Cay}(\overline{G};\overline{S})$, and
• $\mathrm{volt}_{G,S}(H)$ generates $\mathbb{Z}_p$.
Then $(s_1, s_2, \ldots, s_k)^p$ is a hamiltonian cycle in $\mathrm{Cay}(G;S)$.

Lemma 2.13. To prove Theorem 1.1, we may assume:
(1) $G$ is not abelian.
(2) $k > 1$.
(3) If $G'$ is any group of order $k'p'$, where $1 \le k' < k$, and $p'$ is any prime number, then every connected Cayley graph on $G'$ is hamiltonian.
(4) $p$ is strictly greater than the largest prime factor of $k$.
(5) $\mathbb{Z}_p$ is a Sylow $p$-subgroup of $G$, and $\mathbb{Z}_p \trianglelefteq G$.
(6) There does not exist $s \in S$, such that $\langle s \rangle \trianglelefteq G$, and such that either
  (a) $s \in Z(G)$, or
  (b) $Z(G) \cap \langle s \rangle = \{e\}$, or
  (c) $|s|$ is prime.
(7) $S \cap \mathbb{Z}_p = \emptyset$.
(8) $\overline{s} \neq \overline{t}$, for all $s, t \in S \cup S^{-1}$ with $s \neq t$.
(9) If $s \in S$ with $|\overline{s}| = 2$, then $|s| = 2$.
(10) $G = \mathbb{Z}_p \rtimes_\tau \overline{G}$, where $\tau$ is a homomorphism from $\overline{G}$ to $\mathbb{Z}_p^\times$.

Proof. (1) Showing that all connected Cayley graphs on abelian groups are hamiltonian is an easy exercise. (The Chen-Quimpo Theorem (Theorem 5.5) is a much stronger result.)

(2) If $k = 1$, then $|G| = p$, so $G$ is abelian, contrary to (1).

(3) We may assume this by induction on $k$.

(4) Let $p'$ be the largest prime factor of $k$, and write $|G| = k'p'$. If $p = p'$, then $|G|$ is divisible by $p^2$, so Proposition 2.7(1) applies. If $p < p'$, then $k > k'$, so (3) applies.

(5) If either $P \not\cong \mathbb{Z}_p$ or $P \not\trianglelefteq G$, then Proposition 2.7 applies.

(6) For any $s \in S$, we know, from (1), that $\langle s \rangle \neq G$. In addition, we see from (3) that $\mathrm{Cay}\bigl(G/\langle s \rangle; S\bigr)$ is hamiltonian. Therefore, it is well known (and easy to prove) that if $s$ satisfies any of the given conditions, then $\mathrm{Cay}(G;S)$ is hamiltonian [16, Lemma 2.27].

(7) This is a special case of (6c).

(8) From Proposition 1.3, we see that every edge of $\mathrm{Cay}(\overline{G};\overline{S})$ is in a hamiltonian cycle.
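Therefore, if $\overline{s} = \overline{t}$ with $s \neq t$, then the existence of a hamiltonian cycle in $\mathrm{Cay}(G;S)$ is a well-known (and easy) consequence of the Factor Group Lemma (Lemma 2.12) (cf. [16, Corollary 2.11]).

(9) Since $\overline{s} = \overline{s}^{-1}$, this follows from (8) with $t = s^{-1}$.

(10) From (4), we know that $\gcd(p, k) = 1$. Therefore, the desired conclusion is a consequence of the Schur-Zassenhaus Theorem [19].

Remark 2.14. It is immediate from (7) and (8) of Lemma 2.13 that the Cayley graphs $\mathrm{Cay}(G;S)$ and $\mathrm{Cay}(\overline{G};\overline{S})$ have the same valence (and have no loops).

3 Irredundant generating sets of the quotient

In this section, we assume that the generating set $\overline{S}$ of $\overline{G}$ is irredundant. The assumptions stated in Notations 2.1 and 2.8 and Lemma 2.13 are also assumed to hold. In most cases, we will find a hamiltonian cycle in $\mathrm{Cay}(\overline{G};\overline{S})$ with nonzero voltage, so that the Factor Group Lemma (Lemma 2.12) applies. The following lemma deals with the exceptional cases in which this approach does not work.

Lemma 3.1. Assume the generating set $\overline{S}$ of $\overline{G}$ is irredundant. Then $\mathrm{Cay}(G;S)$ has a hamiltonian cycle in each of the following situations:
1. $\overline{S} = \{\overline{a}, \overline{b}\}$, with $|\overline{a}| = 2$, $|\overline{b}| = 3$, and $\tau(\overline{b}) = 1$.
2. $\overline{G} \cong A_4$, $\overline{S} = \{\overline{a}, \overline{b}\}$, where $|\overline{a}| = |\overline{b}| = 3$, and $\overline{G}$ centralizes $\mathbb{Z}_p$.
3. $\overline{G} = (\mathbb{Z}_2 \times \mathbb{Z}_2) \rtimes_\tau \mathbb{Z}_m$, $\overline{S} = \{\overline{a}, \overline{b}\}$, where $\overline{a} \in \mathbb{Z}_2 \times \mathbb{Z}_2$, $|\overline{b}| = m$, $\overline{G}$ is not abelian, and $\overline{G}$ centralizes $\mathbb{Z}_p$.
4. $\overline{S}$ contains an element $\overline{a}$, such that $\overline{a} \in Z(\overline{G})$, $|\overline{a}| = 2$, and $\tau(\overline{a}) = 1$.
5. $\overline{S}$ contains an element $\overline{a}$, such that $\overline{a}^2$ has prime order, $\langle \overline{a}^2 \rangle \trianglelefteq \overline{G}$, and $\tau(\overline{a}) = -1$.
6. $\overline{G}$ is a dihedral group of order $k$ (with $k > 4$), $\overline{S} = \{\overline{a}, \overline{b}\}$ with $|\overline{a}| = 2$ and $|\overline{b}| = k/2$, $\overline{a}$ inverts $\mathbb{Z}_p$ and $\overline{b}$ centralizes $\mathbb{Z}_p$.
7. $|\overline{G}| = 4q$ and $|[\overline{G},\overline{G}]| = q$, where $q$ is prime, and $\overline{S} = \{\overline{a}, \overline{b}\}$, where $|\overline{a}| = 4$ and $|\overline{b}| = 2$. Furthermore, $\overline{G}$ centralizes $\mathbb{Z}_p$, but $\overline{b}$ does not centralize $[\overline{G},\overline{G}]$.

Proof. (1) We may assume $a$ projects trivially to $\mathbb{Z}_p$. (If $a$ centralizes $\mathbb{Z}_p$, this follows from Lemma 2.13(9). If $a$ does not centralize $\mathbb{Z}_p$, then it is true after conjugation by some element of $\mathbb{Z}_p$.)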
So $b$ must project nontrivially. Since $b$ centralizes $\mathbb{Z}_p$, this implies $|b| = 3p$. Since $|\overline{a}| = 2$ and $|\overline{b}| = 3$, it is easy to see that every hamiltonian cycle in $\mathrm{Cay}(A_4; \overline{a}, \overline{b})$ is of the form $(\overline{a}, \overline{b}^{\pm 2}, \overline{a}, \overline{b}^{\pm 2}, \ldots, \overline{a}, \overline{b}^{\pm 2})$ [17, p. 238]. Hence, each right coset of $\langle \overline{b} \rangle$ appears as consecutive vertices in this cycle, so it is not difficult to see that
$$(a, b^{\pm(3p-1)}, a, b^{\pm(3p-1)}, \ldots, a, b^{\pm(3p-1)})$$
passes through all of the vertices in each right coset of $\langle b \rangle$, and is therefore a hamiltonian cycle in $\mathrm{Cay}(G; a, b)$. See Subcase 1.1 of [23, §3] for a detailed verification of a very similar example.

(2) [16, Subcase 2.2 of Proposition 7.2]: Assume, without loss of generality, that $a$ projects nontrivially to $\mathbb{Z}_p$, so $|a| = 3p$. Therefore $4|a| = |G|$. Since $\overline{G}$ centralizes $\mathbb{Z}_p$, we have $G \cong \mathbb{Z}_p \times A_4$. Therefore, $[G,G] \cong [A_4, A_4] \cong \mathbb{Z}_2 \times \mathbb{Z}_2$, so $|[a,b]| = 2$. It is now not difficult to verify that $\bigl(a^{|a|-1}, b^{-1}, a^{-(|a|-1)}, b\bigr)^2$ is a hamiltonian cycle. (This is a special case of a lemma of D. Jungreis and E. Friedman that can be found in [16, 2.14].)

(3) Lemma 2.13(9) tells us that the projection of $a$ to $\mathbb{Z}_p$ is trivial. So the projection of $b$ to $\mathbb{Z}_p$ is nontrivial. Since $b$ centralizes $\mathbb{Z}_p$, this implies $|b| = mp = |G|/4$. Also note that $\overline{b}$ does not centralize $\overline{a}$ (since $\overline{G}$ is not abelian), so $\{e, b^{-1}ab, b^{-1}ab\,a, a\} = \mathbb{Z}_2 \times \mathbb{Z}_2$. Therefore, it is easy to see that $(b^{mp-1}, a, b^{-(mp-1)}, a)^2$ is a hamiltonian cycle in $\mathrm{Cay}(G;S)$. (This is an easy special case of the same lemma of D. Jungreis and E. Friedman that was used in (2).)

(4) Let $s \in S$ with $\overline{s} = \overline{a}$. Lemma 2.13(8) (with $t = a^{-1}$) implies $|s| = 2$. Since $\tau(\overline{a}) = 1$, we know that $s$ centralizes $\mathbb{Z}_p$, so Lemma 2.13(9) implies that $s$ has trivial projection to $\mathbb{Z}_p$ (since $p > 2$). Therefore, we have $s \in Z(G)$, which contradicts Lemma 2.13(6a).

(5) Let $s \in S$ with $\overline{s} = \overline{a}$. Since $\tau(\overline{a}) \neq 1$, we know that $s$ does not centralize $\mathbb{Z}_p$, so we may assume (after conjugating by an appropriate element of $\mathbb{Z}_p$) that the projection of $s$ to $\mathbb{Z}_p$ is trivial. This means $s = \overline{a}$.
Then, since $\tau(\overline{a}^2) = \bigl(\tau(\overline{a})\bigr)^2 = (-1)^2 = 1$, we see that $s^2$ generates a subgroup of prime order that is normal in $G$. (Indeed, we know, by assumption, that $\langle \overline{a}^2 \rangle$ is normalized by $\overline{G}$, and it centralizes $\mathbb{Z}_p$ since $\tau(s^2) = 1$.) This contradicts the conclusion of Lemma 2.13(8) (with $t = s^{-1}$, and with $\langle s^2 \rangle$ in the role of $\mathbb{Z}_p$).

(6) Since $\overline{a}$ does not centralize $\mathbb{Z}_p$, we may assume that $a$ projects trivially to $\mathbb{Z}_p$ (after replacing $S$ with a conjugate). Since $S$ generates $G$, this implies that $b$ projects nontrivially to $\mathbb{Z}_p$. Since $\overline{b}$ centralizes $\mathbb{Z}_p$, we conclude that $|b| = p\,|\overline{b}|$. Also, since $a = \overline{a}$ inverts both $\overline{b}$ and $\mathbb{Z}_p$, we know that $a$ inverts $b$. So $G$ is the dihedral group of order $kp$, and $\{a, b\}$ is the obvious generating set consisting of a reflection $a$ and a rotation $b$. Therefore, if we let $m = \tfrac{1}{2}|G| - 1$, then we have the hamiltonian cycle $(a, b^m)^2$.

(7) This is a known result. Namely, since $|G| = 4pq$, this is a special case of Theorem 5.1(2). (Alternatively, we may apply Theorem 5.2(1), since $|[G,G]| = |[\overline{G},\overline{G}]| = q$.) For completeness, we record a proof that is adapted from [15, Case 5.3].

We know that $|\overline{G}| = 4q$, $|[\overline{G},\overline{G}]| = q$, and $|\overline{a}| = 4$, so we may write $\overline{G} = \mathbb{Z}_q \rtimes \mathbb{Z}_4$, with $\mathbb{Z}_q = [\overline{G},\overline{G}]$ and $\mathbb{Z}_4 = \langle \overline{a} \rangle$. Since $\overline{b}$ has order 2 and centralizes $\mathbb{Z}_p$, we see from Lemma 2.13(9) that $b$ projects trivially to $\mathbb{Z}_p$, so $a$ must project nontrivially. Therefore $a$ generates $G/\mathbb{Z}_q$, which has order $4p$, so we have $b \in a^i \mathbb{Z}_q$, for some (even) $i$ with $0 \le i < 4p$. (Also, we know $i \neq 0$, because $|b| = 2$ is not a divisor of $q$.) Then $(b, a^{-(i-1)}, b, a^{4p-i-1})$ is a hamiltonian cycle in $\mathrm{Cay}(G/\mathbb{Z}_q; a, b)$. If we write $b = \gamma a^i$, with $\gamma \in \mathbb{Z}_q$, then the voltage of this hamiltonian cycle is
$$b\,a^{-(i-1)}\,b\,a^{4p-i-1} = (\gamma a^i)\,a^{-(i-1)}\,(\gamma a^i)\,a^{4p-i-1} = \gamma a \gamma a^{-1}.$$
Since $b$ does not centralize $\mathbb{Z}_q$ (and $b \in a^i \mathbb{Z}_q$ with $i$ even), we know that $a$ does not invert $\mathbb{Z}_q$. Therefore the voltage $\gamma a \gamma a^{-1}$ is nontrivial, so the Factor Group Lemma (Lemma 2.12) applies.
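As an illustration of how the Factor Group Lemma is applied (this example is not taken from the paper), the following Python sketch checks the lemma on one small case chosen here for convenience: $G = \mathbb{Z}_7 \rtimes \mathbb{Z}_3$, where the generator of $\mathbb{Z}_3$ is assumed to act on $\mathbb{Z}_7$ as multiplication by 2, with the hypothetical generating set $S = \{a, b\}$, $a = (0,1)$, $b = (1,1)$. The helper names `mul`, `walk`, and `is_hamiltonian_cycle` are introduced only for this sketch.

```python
# Sketch under stated assumptions: G = Z_7 ⋊ Z_3 with the generator of Z_3
# acting on Z_7 as multiplication by 2 (2^3 = 8 ≡ 1 mod 7, so this is an action).
P, K, R = 7, 3, 2


def mul(x, y):
    """Product in Z_P ⋊ Z_K: (z1, i1)(z2, i2) = (z1 + R^i1 * z2, i1 + i2)."""
    (z1, i1), (z2, i2) = x, y
    return ((z1 + pow(R, i1, P) * z2) % P, (i1 + i2) % K)


def walk(gens, start=(0, 0)):
    """Vertices e, s1, s1 s2, ... visited by the walk (s1, ..., sn)."""
    verts = [start]
    for s in gens:
        verts.append(mul(verts[-1], s))
    return verts


def is_hamiltonian_cycle(gens, n_vertices):
    """True if the walk is closed and visits each of n_vertices vertices once."""
    verts = walk(gens)
    return (len(gens) == n_vertices
            and verts[-1] == verts[0]
            and len(set(verts[:-1])) == n_vertices)


a, b = (0, 1), (1, 1)

# (a, a, b) projects to (1, 1, 1), a hamiltonian cycle in Cay(Z_3; {1}).
quotient_cycle = [a, a, b]
voltage = walk(quotient_cycle)[-1]   # the product a * a * b, an element of Z_7 x {0}

# The voltage is nonzero in Z_7, so it generates Z_7, and the lemma predicts
# that repeating the cycle p = 7 times gives a hamiltonian cycle in Cay(G; S).
lifted_cycle = quotient_cycle * P

# For contrast, (a, a, a) has trivial voltage and its 7-fold repetition fails.
trivial_cycle = [a, a, a]
```

In this sketch the voltage of $(a, a, b)$ is the element $4 \in \mathbb{Z}_7$, and the lifted walk $(a, a, b)^7$ visits all 21 elements of $G$ exactly once.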
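We wish to show, for each lift of $\overline{S}$ to a generating set $S$ of $G$, that some hamiltonian cycle in $\mathrm{Cay}(\overline{G};\overline{S})$ has nonzero voltage.

Definition 3.2 ([18]). Recall that the norm of an algebraic number is the product of all of its Galois conjugates in $\mathbb{C}$.

Lemma 3.3 (cf. [23, Lemma 2.11]). Assume
• $G = \mathbb{Z}_p \rtimes_\tau \overline{G}$, where $\tau$ is a homomorphism from $\overline{G}$ to $\mathbb{Z}_p^\times$,
• $\zeta = \varphi \circ \tau$, where $\varphi$ is an isomorphism from $\mathbb{Z}_p^\times$ onto the group $\mu_{p-1}$ of $(p-1)$th roots of unity in $\mathbb{C}$, so $\zeta$ is an abelian character of $\overline{G}$ (more precisely, $\zeta$ is a homomorphism from $\overline{G}$ to $\mu_{p-1}$),
• $\mathcal{Z}$ is the subring of $\mathbb{C}$ that is generated by the $(p-1)$th roots of unity,
• $\overline{S} = \{\overline{a_1}, \overline{a_2}, \ldots, \overline{a_m}, \overline{b_1}, \ldots, \overline{b_n}\} \cup \overline{B_0}$ is a generating set of $\overline{G}$, such that
  ◦ each $\overline{a_i}$ has order 2, and centralizes $\mathbb{Z}_p$,
  ◦ either $\overline{B_0}$ is empty, or $\overline{B_0}$ consists of a single element $\overline{b_0}$ that does not centralize $\mathbb{Z}_p$,
• $H_i$ is a hamiltonian cycle in $\mathrm{Cay}(\overline{G};\overline{S})$, for $i = 1, 2, \ldots, n$, and
• for $j = 1, 2, \ldots, n$, $S_j$ is the generating set of $G$, such that $\overline{S_j} = \overline{S}$, and $s \in \overline{G}$ for all $s \in S_j$, except that $(1, \overline{b_j}) \in S_j$.
If $\mathrm{Norm}\bigl(\det\bigl[\mathrm{volt}_{\mathcal{Z} \rtimes_\zeta \overline{G},\,S_j}(H_i)\bigr]\bigr)$ is not divisible by $p$, then $\mathrm{Cay}(G;S)$ is hamiltonian.

Proof. From Lemma 2.13(9), we know that $a_i = (0, \overline{a_i})$ for each $i$. Also, if $B_0$ has an element $b$, then we may assume $b = (0, \overline{b})$, after conjugating by an element of $\mathbb{Z}_p$. So $b_1, \ldots, b_n$ are the only elements of $S$ that contribute to $\mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S}(H)$. Therefore, if we write $b_j = (z_j, \overline{b_j})$, then, from the definition of $S_1, \ldots, S_n$, we have
$$\mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S}(H) = \sum_{j=1}^{n} z_j \, \mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S_j}(H).$$
Note that $z_1, \ldots, z_n$ cannot all be 0, since $\langle S \rangle = G$. Therefore, if $\mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S}(H_i) = 0$ for all $i$, then elementary linear algebra tells us that $\Delta_p = 0$, where
$$\Delta_p = \det\bigl[\mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S_j}(H_i)\bigr]. \tag{$\ast$}$$
We will show that this leads to a contradiction. (So there must be a hamiltonian cycle with nonzero voltage, so the Factor Group Lemma (Lemma 2.12) applies.)

The isomorphism $\varphi^{-1} \colon \mu_{p-1} \to \mathbb{Z}_p^\times$ extends to a unique ring homomorphism $\Phi \colon \mathcal{Z} \to \mathbb{Z}_p$.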
Since $\Phi \circ \zeta = \tau$ (and $\Phi$ is a ring homomorphism), it is easy to see that pairing $\Phi$ with the identity map on $\overline{G}$ yields a group homomorphism $\widehat{\Phi} \colon \mathcal{Z} \rtimes_\zeta \overline{G} \to \mathbb{Z}_p \rtimes_\tau \overline{G}$. Therefore
$$\Phi\bigl(\mathrm{volt}_{\mathcal{Z} \rtimes_\zeta \overline{G},\,S_j}(H)\bigr) = \mathrm{volt}_{\mathbb{Z}_p \rtimes_\tau \overline{G},\,S_j}(H)$$
for every hamiltonian cycle $H$ in $\mathrm{Cay}(\overline{G};\overline{S})$. Since $\Phi$ is a ring homomorphism (and determinants are calculated simply by adding and multiplying), this implies $\Phi(\Delta) = \Delta_p$, where
$$\Delta = \det\bigl[\mathrm{volt}_{\mathcal{Z} \rtimes_\zeta \overline{G},\,S_j}(H_i)\bigr].$$
The assumption that $\mathrm{Norm}(\Delta)$ is not divisible by $p$ tells us that $\Phi\bigl(\mathrm{Norm}(\Delta)\bigr) \neq 0$. Since, by definition, $\mathrm{Norm}(\Delta)$ is the product of $\Delta$ with its other conjugates, and the ring homomorphism $\Phi$ respects multiplication, we conclude that $\Phi(\Delta) \neq 0$. In other words, $\Delta_p \neq 0$. This contradiction to ($\ast$) completes the proof.

Proposition 3.4. If the generating set $\overline{S}$ of $\overline{G}$ is irredundant, then $\mathrm{Cay}(G;S)$ is hamiltonian.

Justification. For each group $\overline{G}$ of order less than 48, and each irredundant generating set $\overline{S}$ of $\overline{G}$, the GAP program in the file 3-4-IrredundantSBar.gap constructs a list SeveralHamCycsInCG of some hamiltonian cycles in $\mathrm{Cay}(\overline{G};\overline{S})$ (by calling the function SeveralHamCycsInCay).

Now, the program considers each abelian character $\zeta$ of $\overline{G}$. If Lemma 3.1 (or some other lemma) provides a hamiltonian cycle in $\mathrm{Cay}\bigl(\mathbb{Z}_p \rtimes_\tau \overline{G}; S\bigr)$, then nothing more needs to be done.

Otherwise, the program constructs the list $S_1, \ldots, S_n$ of generating sets described in Lemma 3.3, and calculates the voltage $\mathrm{volt}_{\mathcal{Z} \rtimes_\zeta \overline{G},\,S_j}(H_i)$ for each $H_i$ in SeveralHamCycsInCG.

Now, the program calls the function FindNonzeroDet, which returns a list $i_1, \ldots, i_n$ of indices. The program then verifies that if we use $H_{i_1}, \ldots, H_{i_n}$ as the hamiltonian cycles in Lemma 3.3, then the norm of the determinant of the matrix of voltages is nonzero. Hence, Lemma 3.3 provides a hamiltonian cycle for $G = \mathbb{Z}_p \rtimes_\tau \overline{G}$ for all but the finitely many primes $p$ that are a divisor of this norm.
To deal with these remaining primes, the program calls the function CallLKHOnLiftsOfSBar, which constructs every possible lift of $\overline{S}$ to a generating set $S$ of $G$, and uses LKH to verify that $\mathrm{Cay}(G;S)$ is hamiltonian.

Remark 3.5. It is not necessary to verify the source code of SeveralHamCycsInCay or FindNonzeroDet, because the output of both of these programs is validated before it is used.

4 Redundant generating sets of the quotient

We now assume that the generating set $\overline{S}$ of $\overline{G}$ is redundant (but $S$ is irredundant, and the other assumptions stated in Notations 2.1 and 2.8 and Lemma 2.13 are also assumed to hold). The following well-known observation tells us that (up to an automorphism of $G$) every $S$ of this type can be constructed by choosing an irredundant generating set $\overline{S_0}$ of $\overline{G}$ and an element $\overline{a}$ of $\overline{G}$, and letting $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$.

Lemma 4.1. Assume the generating set $\overline{S}$ of $\overline{G}$ is redundant. Then, perhaps after conjugating by an element of $\mathbb{Z}_p$, there is an element $a$ of $S$, such that if we let $S_0 = S \setminus \{a\}$, then
1. $\overline{S_0}$ is an irredundant generating set of $\overline{G}$, and
2. $S_0 \subseteq \{0\} \rtimes \overline{G}$.

Proof. By assumption, there is a proper subset $S_0$ of $S$, such that $\langle \overline{S_0} \rangle = \overline{G}$. By choosing $S_0$ to be of minimal cardinality, we may assume that $\overline{S_0}$ is irredundant. Since $|\langle S_0 \rangle|$ is divisible by $|\langle \overline{S_0} \rangle| = |\overline{G}| = |G|/p$, and is a proper divisor of $|G|$, we must have $|\langle S_0 \rangle| = |G|/p$. So $\langle S_0 \rangle$ is a maximal subgroup of $G$. Therefore, we have $\langle S_0, a \rangle = G$ for any element $a$ of $S$ that is not in $S_0$. Since $S$ is irredundant, we conclude that $S = S_0 \cup \{a\}$.

Since $|\langle S_0 \rangle| = |G|/p$, we see from Lemma 2.13(4) that $\langle S_0 \rangle$ is a Hall subgroup of $G$. Then, since $\mathbb{Z}_p$ is a solvable normal complement, the Schur-Zassenhaus Theorem [19] tells us that, after passing to a conjugate, we have $\langle S_0 \rangle = \{0\} \rtimes \overline{G}$.

Lemma 4.2.
Assume
• $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$,
• $\overline{S_0}$ is an irredundant generating set of $\overline{G}$,
• either $\mathrm{Cay}(\overline{G};\overline{S_0})$ is not bipartite, or $\mathrm{Cay}(\overline{G};\overline{S})$ is bipartite, and
• $|\overline{S_0} \cup \overline{S_0}^{-1}| \ge 3$.
Then $\mathrm{Cay}(G;S)$ is hamiltonian.

Proof. We know from Lemma 2.13(7) that $\overline{a} \neq e$. Therefore, Proposition 1.3 tells us there is a hamiltonian path $(\overline{s_i})_{i=1}^{n-1}$ from $e$ to $\overline{a}^{-1}$ in $\mathrm{Cay}(\overline{G};\overline{S_0})$. So $H = \bigl(\overline{a}, (\overline{s_i})_{i=1}^{n-1}\bigr)$ is a hamiltonian cycle in $\mathrm{Cay}(\overline{G};\overline{S})$. Write $a = (z, \overline{a})$, with $z \in \mathbb{Z}_p \setminus \{0\}$. Since $S_0 \subseteq \{0\} \rtimes \overline{G}$, we must have $z \neq 0$, and the voltage $a s_1 s_2 \cdots s_{n-1}$ of $H$ is $z$. Hence, the Factor Group Lemma (Lemma 2.12) provides a hamiltonian cycle in $\mathrm{Cay}(G;S)$.

To complete the proof of Theorem 1.1, the following two results consider the special cases that are not covered by Lemma 4.2.

Proposition 4.3. Assume
• $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$,
• $\overline{S_0}$ is an irredundant generating set of $\overline{G}$,
• $\mathrm{Cay}(\overline{G};\overline{S_0})$ is bipartite, and
• $\mathrm{Cay}(\overline{G};\overline{S})$ is not bipartite.
Then $\mathrm{Cay}(G;S)$ is hamiltonian.

Justification. The GAP program in 4-3-RedundantSBar.gap:
• loops through all groups $\overline{G}$ of order less than 48,
• loops through all irredundant generating sets $\overline{S_0}$ of $\overline{G}$, such that $\mathrm{Cay}(\overline{G};\overline{S_0})$ is bipartite,
• loops through all nonidentity elements $\overline{a}$ of $\overline{G}$, such that $\mathrm{Cay}(\overline{G};\overline{S})$ is not bipartite, where $\overline{S} = \overline{S_0} \cup \{\overline{a}\}$,
• constructs the set $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$,
• makes a list of a few hamiltonian cycles in $\mathrm{Cay}(\overline{G};\overline{S})$ (by calling the function SeveralHamCycsInRedundantCay),
• loops through all abelian characters $\zeta$ of $\overline{G}$,
• ignores this character if the condition in Lemma 2.13(9) is not violated,
• ignores this character if $S$ is not a minimal generating set of $G$,
• calculates the GCD of the norms of the voltages of the hamiltonian cycles in the list, and
• uses LKH to find a hamiltonian cycle in $\mathrm{Cay}(\mathbb{Z}_p \rtimes_\tau \overline{G}; S)$ for each prime $p$ that divides the GCD, by calling CallLKHOnLiftsOfSBar.
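The role of the GCD in the last two steps can be sketched as follows. This is a minimal Python illustration, not the authors' GAP code: if a prime $p$ fails to divide the norm of the voltage of some cycle in the list, then (by the argument in the proof of Lemma 3.3) that voltage is nonzero modulo $p$, so the Factor Group Lemma applies; only primes dividing every norm, that is, dividing the GCD, are left for a direct search. The function names and the sample norms below are hypothetical.

```python
from functools import reduce
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of a nonzero integer n, by trial division."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def primes_needing_direct_check(norms):
    """Primes dividing the norm of every voltage in the list.

    `norms` is a list of integers (the norms of the voltages of the
    hamiltonian cycles found so far). A prime outside the returned list
    leaves at least one voltage nonzero mod p.
    """
    g = reduce(gcd, norms, 0)
    if g == 0:
        # Every voltage has norm 0, so no prime is excluded this way.
        raise ValueError("all norms are zero; more cycles are needed")
    return prime_divisors(g)


# Hypothetical norms, chosen only to illustrate the computation.
example_norms = [12, 90, 42]
remaining_primes = primes_needing_direct_check(example_norms)
```

With the made-up norms above, the GCD is 6, so only $p = 2$ and $p = 3$ would remain for a direct check.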
(The use of CallLKHOnLiftsOfSBar in the last step is overkill, because we are interested only in the one particular lift $S$ of $\overline{S}$, but we are calling a function that checks all possible lifts. It does not seem worthwhile to write and verify another GAP program, just to eliminate this slight waste.)

Remark 4.4. It is not necessary to verify the source code of the function SeveralHamCycsInRedundantCay, because the output of this program is validated before it is used.

Lemma 4.5. Assume
• $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$,
• $\overline{S_0}$ is an irredundant generating set of $\overline{G}$, and
• $|\overline{S_0} \cup \overline{S_0}^{-1}| \le 2$.
Then $\mathrm{Cay}(G;S)$ is hamiltonian.

Justification. Since $\overline{S_0}$ is a generating set of $\overline{G}$, and $\overline{a} \notin \{e\} \cup \overline{S_0} \cup \overline{S_0}^{-1}$ (by (7) and (8) of Lemma 2.13), it is easy to see that we must have $k \ge 4$. Also note that the only groups with a 2-valent, connected Cayley graph are cyclic groups and dihedral groups, and that the 2-valent generating set of such a group is unique, up to an automorphism of the group. Applying the same method that was used for Proposition 4.3, the GAP program in 4-5-Valence2.gap:
• loops through all values of $k$ from 4 to 47,
• loops through the groups $\overline{G}$ of order $k$ that have a 2-valent, connected Cayley graph, and defines $\overline{S_0}$ to be the 2-valent generating set of $\overline{G}$,
• loops through all nonidentity elements $\overline{a}$ of $\overline{G}$, such that $\overline{a} \notin \{e\} \cup \overline{S_0} \cup \overline{S_0}^{-1}$ (except that we do not need to consider both $\overline{a}$ and $\overline{a}^{-1}$),
• constructs the generating set $S = \bigl(\{0\} \times \overline{S_0}\bigr) \cup \{(1, \overline{a})\}$ of $G$,
• makes a list of 20 hamiltonian cycles in $\mathrm{Cay}(\overline{G};\overline{S})$,
• loops through all abelian characters $\zeta$ of $\overline{G}$,
• ignores this character if the condition in Lemma 2.13(9) is not violated,
• ignores this character if $S$ is not a minimal generating set of $G$,
• calculates the GCD of the norms of the voltages of the hamiltonian cycles in the list, and
• uses LKH to find a hamiltonian cycle in $\mathrm{Cay}(\mathbb{Z}_p \rtimes_\tau \overline{G}; S)$ for each prime $p$ that divides the GCD, by calling CallLKHOnLiftsOfSBar.
(As in Proposition 4.3, the use of CallLKHOnLiftsOfSBar in the last step is overkill.)

5 Known results that can reduce the number of cases

There are several results in the literature that can be used to substantially reduce the number of Cayley graphs considered in the proof of Theorem 1.1 (but then the proof is not self-contained). The following theorem of Kutnar et al. is the main example.

Theorem 5.1 ([16, Theorem 1.2], [20]). Every connected Cayley graph on $G$ is hamiltonian if $|G|$ has any of the following forms (where $p$, $q$, and $r$ are distinct primes):
1. $kp$, where $1 \le k < 32$, with $k \neq 24$,
2. $kpq$, where $1 \le k \le 5$,
3. $pqr$,
4. $kp^2$, where $1 \le k \le 4$,
5. $kp^3$, where $1 \le k \le 2$,
6. $p^k$.

The following result is also useful.

Theorem 5.2 ([15, 22, 24]). Every connected Cayley graph on $G$ has a hamiltonian cycle if either
1. $[G,G]$ is cyclic of prime-power order, or
2. $|[G,G]| = pq$, where $p$ and $q$ are distinct primes, and $|G|$ is odd, or
3. $|[G,G]| = 2p$, where $p$ is an odd prime.

Lemma 5.3. To prove Theorem 1.1, one may assume:
1. $k \in \{24, 32, 36, 40, 42, 45\}$.
2. $|[\overline{G},\overline{G}]| \ge 3$.
3. Either $|[\overline{G},\overline{G}]| \ge 4$, or the twist function $\tau$ is nontrivial.

Proof. (1) If $k < 32$ and $k \neq 24$, then Theorem 5.1(1) applies.
Therefore, either $k$ is in the specified set, or $k \in \{33, 34, 35, 37, 38, 39, 41, 43, 44, 46, 47\}$, in which case some part of Theorem 5.1 applies:

  $k$    form of $|G| = kp$
  33   $9p$ (if $p = 3$), $3p^2$ (if $p = 11$), or $pqr$
  34   $2p^2$ (if $p = 17$) or $2pq$
  35   $25p$ (if $p = 5$), $5p^2$ (if $p = 7$), or $pqr$
  37   $p^2$ (if $p = 37$) or $pq$
  38   $2p^2$ (if $p = 19$) or $2pq$
  39   $9p$ (if $p = 3$), $3p^2$ (if $p = 13$), or $pqr$
  41   $p^2$ (if $p = 41$) or $pq$
  43   $p^2$ (if $p = 43$) or $pq$
  44   $4p^2$ (if $p = 11$) or $4pq$
  46   $2p^2$ (if $p = 23$) or $2pq$
  47   $p^2$ (if $p = 47$) or $pq$

(2) The commutator subgroup of $G$ is a subgroup of $\mathbb{Z}_p \rtimes_\tau \overline{G}'$, so its order is a divisor of $p\,|\overline{G}'|$. Therefore, if $|[\overline{G},\overline{G}]| \le 2$, then $|[G,G]|$ is either 1, 2, $p$, or $2p$. (Furthermore, if $p = 2$, then $\tau$ must be trivial, so $G = \mathbb{Z}_2 \times \overline{G}$, which implies that $[G,G] = [\overline{G},\overline{G}]$.) So Theorem 5.2 establishes that every connected Cayley graph on $G$ has a hamiltonian cycle.

(3) As in (2), if $\tau$ is trivial, then $G = \mathbb{Z}_p \times \overline{G}$, so $[G,G] = [\overline{G},\overline{G}]$. Therefore, Theorem 5.2(1) provides a hamiltonian cycle in every Cayley graph on $G$ if $|[\overline{G},\overline{G}]|$ is prime. (In particular, if $|[\overline{G},\overline{G}]| < 4$.)

Remark 5.4. If we apply Lemma 5.3(1), then the proof of Theorem 1.1 requires hamiltonian connectivity/laceability only for Cayley graphs of the orders listed in Lemma 5.3(1), not the full strength of Proposition 1.3.

The computations to justify Proposition 1.3 could be shortened a bit by applying the following interesting result:

Theorem 5.5 (Chen-Quimpo [6]). Assume $\mathrm{Cay}(G;S)$ is a connected Cayley graph. If $G$ is abelian, and the valence of $\mathrm{Cay}(G;S)$ is at least three, then $\mathrm{Cay}(G;S)$ is either hamiltonian connected or hamiltonian laceable.

References

[1] B. Alspach, Hamilton paths in Cayley graphs on Coxeter groups: I, Ars Math. Contemp. 8 (2015), 35–53, doi:10.26493/1855-3974.509.d9d.
[2] B. Alspach, C. C. Chen and K. McAvaney, On a class of Hamiltonian laceable 3-regular graphs, Discrete Math. 151 (1996), 19–38, doi:10.1016/0012-365x(94)00077-v.
[3] B. Alspach and Y.
Qin, Hamilton-connected Cayley graphs on Hamiltonian groups, European J. Combin. 22 (2001), 777–787, doi:10.1006/eujc.2001.0456.
[4] T. Araki, Hyper Hamiltonian laceability of Cayley graphs generated by transpositions, Networks 48 (2006), 121–124, doi:10.1002/net.20126.
[5] H. U. Besche, B. Eick and E. O'Brien, SmallGrp – a GAP package, Version 1.3, 9 April 2018, https://gap-packages.github.io/smallgrp/.
[6] C. C. Chen and N. F. Quimpo, On strongly Hamiltonian abelian group graphs, in: K. L. McAvaney (ed.), Combinatorial Mathematics VIII, Springer, New York, volume 884 of Lecture Notes in Mathematics, pp. 23–34, 1981, proceedings of the Eighth Australian Conference held at Deakin University, Geelong, August 25 – 29, 1980.
[7] S. J. Curran, D. Witte Morris and J. Morris, Cayley graphs of order 16p are hamiltonian, Ars Math. Contemp. 5 (2012), 185–211, doi:10.26493/1855-3974.207.8e0.
[8] M. Dupuis and S. Wagon, Laceable knights, Ars Math. Contemp. 9 (2015), 115–124, doi:10.26493/1855-3974.420.3c5.
[9] B. Fuller, Finding CCA groups and graphs algorithmically, Master's thesis, University of Lethbridge, 2017, http://hdl.handle.net/10133/4996.
[10] The GAP Group, GAP – Groups, Algorithms, and Programming, Version 4.8.7 (2017) and 4.8.10 (2018), http://www.gap-system.org.
[11] E. Ghaderpour and D. Witte Morris, Cayley graphs of order 27p are hamiltonian, Int. J. Comb. 2011 (2011), Article ID 206930, doi:10.1155/2011/206930.
[12] E. Ghaderpour and D. Witte Morris, Cayley graphs of order 30p are Hamiltonian, Discrete Math. 312 (2012), 3614–3625, doi:10.1016/j.disc.2012.08.017.
[13] J. L. Gross and T. W. Tucker, Topological Graph Theory, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, New York, 1987.
[14] K.
Helsgaun, LKH – an effective implementation of the Lin-Kernighan heuristic, Version 2.0.7, 2012, http://www.akira.ruc.dk/~keld/research/LKH/.
[15] K. Keating and D. Witte, On Hamilton cycles in Cayley graphs in groups with cyclic commutator subgroup, in: B. R. Alspach and C. D. Godsil (eds.), Cycles in Graphs, North-Holland, Amsterdam, volume 115 of North-Holland Mathematics Studies, pp. 89–102, 1985, doi:10.1016/s0304-0208(08)72999-2, papers from the workshop held at Simon Fraser University, Burnaby, B.C., July 5 – August 20, 1982.
[16] K. Kutnar, D. Marušič, D. Witte Morris, J. Morris and P. Šparl, Hamiltonian cycles in Cayley graphs whose order has few prime factors, Ars Math. Contemp. 5 (2012), 27–71, doi:10.26493/1855-3974.177.341.
[17] P. E. Schupp, On the structure of Hamiltonian cycles in Cayley graphs of finite quotients of the modular group, Theoret. Comput. Sci. 204 (1998), 233–248, doi:10.1016/s0304-3975(98)00041-3.
[18] Wikipedia contributors, Field norm – Wikipedia, The Free Encyclopedia, 2018, https://en.wikipedia.org/wiki/Field_norm.
[19] Wikipedia contributors, Schur–Zassenhaus theorem – Wikipedia, The Free Encyclopedia, 2018, https://en.wikipedia.org/wiki/Schur-Zassenhaus_theorem.
[20] D. Witte, Cayley digraphs of prime-power order are hamiltonian, J. Comb. Theory Ser. B 40 (1986), 107–112, doi:10.1016/0095-8956(86)90068-7.
[21] D. Witte and J. A. Gallian, A survey: hamiltonian cycles in Cayley graphs, Discrete Math. 51 (1984), 293–304, doi:10.1016/0012-365x(84)90010-4.
[22] D. Witte Morris, Odd-order Cayley graphs with commutator subgroup of order pq are hamiltonian, Ars Math. Contemp. 8 (2015), 1–28, doi:10.26493/1855-3974.330.0e6.
[23] D. Witte Morris, Infinitely many nonsolvable groups whose Cayley graphs are hamiltonian, J. Algebra Comb. Discrete Struct. Appl. 3 (2016), 13–30, doi:10.13069/jacodesmath.66457.
[24] D. Witte Morris, Cayley graphs on groups with commutator subgroup of order 2p are hamiltonian, Art Discrete Appl. Math.
1 (2018), #P1.04, doi:10.26493/2590-9770.1240.60e.

ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.03
https://doi.org/10.26493/2590-9770.1241.7f8
(Also available at http://adam-journal.eu)

Some remarks on Balaban and sum-Balaban index∗

Martin Knor
Slovak University of Technology in Bratislava, Faculty of Civil Engineering, Department of Mathematics, Radlinského 11, 813 68, Bratislava, Slovakia

Jozef Komorník
Faculty of Management, Comenius University, Odbojárov 10, 831 04 Bratislava

Riste Škrekovski
FMF, University of Ljubljana, Jadranska ulica 19, 1000 Ljubljana, Slovenia, and Faculty of Information Studies, Ljubljanska cesta 31a, 8000 Novo mesto, Slovenia, and FAMNIT, University of Primorska, Glagoljaška, 6000 Koper, Slovenia

Aleksandra Tepeh †
Faculty of Information Studies, Ljubljanska cesta 31a, 8000 Novo mesto, Slovenia, and Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška cesta 46, 2000 Maribor, Slovenia

Received 3 February 2018, accepted 14 March 2019, published online 4 May 2020

Abstract

In this paper we study the maximal values of the Balaban and sum-Balaban indices, and correct some results in the literature that are only partially correct. As a consequence, we were able to settle a conjecture of M. Aouchiche, G. Caporossi and P. Hansen regarding the comparison of the Balaban and Randić indices. In addition, we show that for every $k$ and large enough $n$, the first $k$ graphs of order $n$ with the largest value of the Balaban index are trees. We conclude the paper with a result about the accumulation points of the sum-Balaban index.

Keywords: Topological index, Balaban index, sum-Balaban index, Randić index.
Math. Subj. Class. (2020): 05C12, 05C90

∗The authors acknowledge partial support by Slovak research grants VEGA 1/0142/17, VEGA 1/0238/19, APVV-15-0220 and APVV-17-0428, Slovenian research agency ARRS, program no. P1–0383, and National Scholarship Programme of the Slovak Republic SAIA.
†Corresponding author.
E-mail addresses: knor@math.sk (Martin Knor), jozef.komornik@fm.uniba.sk (Jozef Komorník), skrekovski@gmail.com (Riste Škrekovski), aleksandra.tepeh@gmail.com (Aleksandra Tepeh)

This work is licensed under https://creativecommons.org/licenses/by/4.0/

1 Introduction

In this paper we consider simple and connected graphs. For a graph $G$, by $V(G)$ and $E(G)$ we denote the vertex and edge sets of $G$, respectively. Let $n = |V(G)|$ and $m = |E(G)|$. For vertices $u, v \in V(G)$, by $\mathrm{dist}_G(u, v)$ (or shortly just $\mathrm{dist}(u, v)$) we denote the distance from $u$ to $v$ in $G$, and by $w(u)$ we denote the transmission (or the status) of $u$, defined as
$$w(u) = \sum_{x \in V(G)} \mathrm{dist}_G(u, x).$$

Balaban index and sum-Balaban index are two of many distance-based topological indices, which are widely used in QSAR/QSPR modeling. Balaban index $J(G)$ of a connected graph $G$, defined as
$$J(G) = \frac{m}{m - n + 2} \sum_{e = uv} \frac{1}{\sqrt{w(u) \cdot w(v)}},$$
was introduced in the early eighties by Balaban [2, 3]. Later Balaban et al. [4] (and independently also Deng [9]) proposed a derived measure, namely the sum-Balaban index $SJ(G)$ for a connected graph $G$:
$$SJ(G) = \frac{m}{m - n + 2} \sum_{uv \in E(G)} \frac{1}{\sqrt{w(u) + w(v)}}.$$

Although sum-Balaban index was introduced just a few years ago, several interesting results have already been published. Regarding extremal values, it was shown by Deng [9] and Xing et al. [23] that for a tree $T$ on $n$ vertices, $n \ge 2$,
$$SJ(P_n) \le SJ(T) \le SJ(S_n) \tag{1.1}$$
with left (right, resp.) equality if and only if $T = P_n$ ($T = S_n$, resp.), where $P_n$ is the path on $n$ vertices and $S_n$ is the star on $n$ vertices. In [23] also trees with the second-largest and third-largest (as well as the second-smallest and third-smallest) sum-Balaban index among the $n$-vertex trees for $n \ge 6$ were determined. In [15] an alternative proof of the above results and a further ranking up to the seventh maximum sum-Balaban index was presented. In [26] the authors investigated the maximum value of sum-Balaban index for trees with a given diameter.
The extremal graphs which attain the maximum sum-Balaban index among trees with a given number of vertices and maximum degree are determined in [25]. Unicyclic graphs on $n$ vertices with the maximum value of sum-Balaban index were considered in [24], and $n$-vertex bicyclic graphs were studied in [6, 11].

For various upper and lower bounds on general graphs in terms of some other parameters (such as the maximum degree, number of edges, etc.) see [9] and [23], and for recent results on $r$-regular graphs, see [20].

Balaban index is somewhat better explored. We refer an interested reader to [13, 14, 16, 18] for recent papers, and to [19] for a survey. Despite the fact that Balaban index was introduced much earlier, some of its basic properties, such as the smallest possible value among all $n$-vertex graphs, are still unknown.

Balaban index was originally named the "average distance-sum connectivity index". It is based on a Randić type formula, today called the Randić index [21], and known also as the connectivity index $R(G)$ of a graph $G$, defined by
$$R(G) = \sum_{uv \in E(G)} \frac{1}{\sqrt{\deg(u) \cdot \deg(v)}},$$
where $\deg(u)$ ($\deg(v)$, resp.) denotes the degree of $u$ ($v$, resp.) in $G$. Note that in the definition of Balaban index, vertex degrees are replaced by transmissions.

With this paper we would like to contribute to a better understanding of the maximal values of both indices, and correct erroneous statements that appeared in the literature regarding some of these values (see Section 2). In addition, having correct results, we were able to show in Section 3 that a conjecture from [1] regarding the comparison of Balaban and Randić index holds. We conclude the paper with a result about the accumulation points of sum-Balaban index. The result is based on a proof of an upper bound for the minimum value of $SJ(G)$.
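The three definitions above translate directly into a short computation. The following Python sketch is only an illustration (it is not part of the paper): it assumes the graph is given as a dictionary mapping integer vertex labels to sets of neighbours, computes transmissions by breadth-first search, and evaluates $J$, $SJ$ and $R$. The helper names `star`, `complete` and `path` are introduced here for convenience.

```python
from collections import deque
from math import sqrt


def transmissions(adj):
    """w(u) = sum of distances from u to all vertices, via BFS from each u."""
    w = {}
    for u in adj:
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        w[u] = sum(dist.values())
    return w


def edges(adj):
    """Each undirected edge once, as a pair (u, v) with u < v."""
    return [(u, v) for u in adj for v in adj[u] if u < v]


def balaban(adj):
    n, e, w = len(adj), edges(adj), transmissions(adj)
    m = len(e)
    return m / (m - n + 2) * sum(1 / sqrt(w[u] * w[v]) for u, v in e)


def sum_balaban(adj):
    n, e, w = len(adj), edges(adj), transmissions(adj)
    m = len(e)
    return m / (m - n + 2) * sum(1 / sqrt(w[u] + w[v]) for u, v in e)


def randic(adj):
    return sum(1 / sqrt(len(adj[u]) * len(adj[v])) for u, v in edges(adj))


def star(n):
    """Star S_n: vertex 0 joined to vertices 1, ..., n-1."""
    return {0: set(range(1, n)), **{i: {0} for i in range(1, n)}}


def complete(n):
    """Complete graph K_n."""
    return {i: set(range(n)) - {i} for i in range(n)}


def path(n):
    """Path P_n on vertices 0, ..., n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
```

For instance, `sum_balaban(path(6)) < sum_balaban(star(6))`, in agreement with (1.1).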
2 Maximum values

The cyclomatic number µ of G, which is the minimum number of edges that must be removed from G in order to transform it to an acyclic graph, equals m − n + 1. Note that the denominator m − n + 2 in the definition of Balaban and sum-Balaban index can be expressed as µ + 1. In this section we determine maximal values for both indices for graphs which contain at least one cycle.

Theorem 2.1. Let G be a connected graph on n vertices with µ ≥ 1. Then:
(1) J(G) is maximum if and only if G is the complete graph Kn;
(2) SJ(G) is maximum if and only if G is the complete graph Kn.

Proof. Since G is a connected graph with µ ≥ 1, we have n ≥ 3 and $m \in [n, \binom{n}{2}]$. For every u ∈ V(G), we have w(u) ≥ n − 1, which implies

$$J(G) \le \frac{m^2}{(m-n+2)(n-1)}, \tag{2.1}$$

with equality if and only if G = Kn. Let G be a graph on n vertices which is not complete. In order to prove that $J(G) < J(K_n) = \frac{n^2(n-1)}{2(n^2-3n+4)}$, one needs to check that for every n ≥ 3 and $m \in [n, \binom{n}{2}]$ we have

$$\frac{m^2}{(m-n+2)(n-1)} \le \frac{n^2(n-1)}{2(n^2-3n+4)}, \tag{2.2}$$

or equivalently

$$-2m^2(n^2-3n+4) + n^2(n-1)^2(m-n+2) \ge 0. \tag{2.3}$$

Let f(m) be the left-hand side of (2.3), i.e., $f(m) = -2m^2(n^2-3n+4) + n^2(n-1)^2(m-n+2)$. Then f is quadratic in m with a negative leading coefficient. Hence, f is concave, and on an interval it is bounded below by the smaller of its two endpoint values. Since $f(n) = n^2(2n-6) \ge 0$ and $f\left(\frac{n(n-1)}{2}\right) = 0$, we conclude that f(m) ≥ 0 for every $m \in [n, \binom{n}{2}]$. Hence (2.3) is true, which completes the proof for Balaban index.

To prove the statement for sum-Balaban index, observe that since w(u) ≥ n − 1, we have

$$SJ(G) \le \frac{m^2}{(m-n+2)\sqrt{2n-2}}, \tag{2.4}$$

with equality if and only if G = Kn. Let G be a graph on n vertices which is not complete. In order to prove that $SJ(G) < SJ(K_n) = \frac{n^2(n-1)^2}{2(n^2-3n+4)\sqrt{2n-2}}$, one needs to check that for every n ≥ 3 and $m \in [n, \binom{n}{2}]$ we have

$$\frac{m^2}{(m-n+2)\sqrt{2n-2}} \le \frac{n^2(n-1)^2}{2(n^2-3n+4)\sqrt{2n-2}}.$$

Since the above inequality is equivalent to (2.2), the proof is complete.
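The key step in the proof above, namely that the left-hand side f(m) of (2.3) is nonnegative on the whole range of m, can be spot-checked with exact integer arithmetic. This is a minimal sketch for small n only and does not replace the concavity argument; the function names `f` and `min_f_on_range` are introduced here for illustration.

```python
def f(n, m):
    """Left-hand side of (2.3) for given n and m, computed with exact integers."""
    return -2 * m**2 * (n**2 - 3 * n + 4) + n**2 * (n - 1) ** 2 * (m - n + 2)


def min_f_on_range(n):
    """Minimum of f(n, m) over all integers m with n <= m <= n(n-1)/2."""
    top = n * (n - 1) // 2
    return min(f(n, m) for m in range(n, top + 1))


# The endpoint values used in the proof: f(n) = n^2 (2n - 6) and f(n(n-1)/2) = 0.
endpoints_ok = all(
    f(n, n) == n**2 * (2 * n - 6) and f(n, n * (n - 1) // 2) == 0
    for n in range(3, 41)
)
```

Because f is concave in m, its minimum over the interval is attained at an endpoint, so `min_f_on_range(n)` should equal 0 for every n ≥ 3 in the tested range.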
Sun [22], and Dong and Guo [10], independently studied Balaban index of trees with a given number of vertices. Their results hold; however, Deng [8] corrected mistakes in their proofs of the statement that for a tree T on n ≥ 2 vertices it holds

$$J(P_n) \le J(T) \le J(S_n) \tag{2.5}$$

with left (right, resp.) equality if and only if T = Pn (T = Sn, resp.). In [10] the authors also state that for a connected graph G with n vertices

$$J(G) \le J(S_n) = \sqrt{\frac{(n-1)^3}{2n-3}},$$

with equality if and only if G = Sn. It was brought to our attention that two years later (seemingly unaware of the paper by Dong and Guo), Aouchiche et al. [1] posed the conjecture, which we can state here as a theorem.

Theorem 2.2. For any connected graph G on n ≥ 2 vertices, we have

$$J(G) \le \begin{cases} J(K_n), & \text{if } n \le 7, \\ J(S_n), & \text{if } n \ge 8. \end{cases}$$

In their proof, Dong and Guo use the assumption that n ≥ 9 and neglect smaller cases. By Theorem 2.1 and (2.5), to complete the proof of Theorem 2.2 it suffices to compare

$$J(S_n) = \sqrt{\frac{(n-1)^3}{2n-3}} \quad \text{and} \quad J(K_n) = \frac{n^2(n-1)}{2(n^2-3n+4)}$$

for n ∈ [3, 8]. It turns out that J(Sn) < J(Kn) if n ∈ [3, 7], while J(S8) > J(K8). For sum-Balaban index we have an analogous result.

Theorem 2.3. For any connected graph G on n ≥ 2 vertices, we have

$$SJ(G) \le \begin{cases} SJ(K_n), & \text{if } n \le 5, \\ SJ(S_n), & \text{if } n \ge 6. \end{cases}$$

Proof. By Theorem 2.1 and (1.1), it suffices to compare

$$SJ(S_n) = \frac{(n-1)^2}{\sqrt{3n-4}} \quad \text{and} \quad SJ(K_n) = \frac{n^2(n-1)^2}{2(n^2-3n+4)\sqrt{2n-2}}.$$

Using a computer one can check that

$$f(x) = \frac{1}{\sqrt{3x-4}} - \frac{x^2}{2(x^2-3x+4)\sqrt{2x-2}}$$

has only two roots on [2, ∞), namely 2 (in which case S2 = K2) and approximately 5.5543. Since SJ(S5) − SJ(K5) < 0 and SJ(S6) − SJ(K6) > 0, we conclude the result.

In [10] the authors state a problem of characterizing graphs with the maximum (the minimum) Balaban index among k-connected (k-edge-connected) graphs on n vertices. Although the case of the minimum Balaban index may be hard to solve, Theorems 2.1 and 2.2 yield the following corollary.

Corollary 2.4.
Let G be a graph with the maximum value of Balaban index in the class of k-connected (k-edge-connected) graphs of order n. Then we have:
(1) if k = 1 and either n = 2 or n ≥ 8, then G is the star Sn;
(2) if k = 1 and n ≤ 7, or k ≥ 2, then G is the complete graph Kn.

Analogously, by Theorems 2.1 and 2.3 we have:

Corollary 2.5. Let G be a graph with the maximum value of sum-Balaban index in the class of k-connected (k-edge-connected) graphs of order n. Then we have:
(1) if k = 1 and either n = 2 or n ≥ 6, then G is the star Sn;
(2) if k = 1 and n ≤ 5, or k ≥ 2, then G is the complete graph Kn.

By the proof of Theorem 2.1, $J(K_n) \sim \frac{n}{2}$, while for every tree T we have $J(T) \sim \frac{n^2}{\bar{w}}$, where $\bar{w}$ is the harmonic mean of $\{\sqrt{w(u) \cdot w(v)};\ uv \in E(G)\}$, i.e.

$$\bar{w} = \frac{m}{\sum_{uv \in E(G)} \frac{1}{\sqrt{w(u) \cdot w(v)}}}$$

(note that $J(G) = \frac{m^2}{(m-n+2)\bar{w}}$). This means that, roughly speaking, if $\bar{w} < 2n$, then J(T) > J(Kn).

Denote by $D^*_{a,b}$ a tree on a + b vertices, one of which has degree a, another has degree b, and all the other vertices have degree 1. Then $D^*_{a,b}$ is the double star. Observe that if a tree has diameter 2, then it is a star, while if it has diameter 3, it is a double star.

Theorem 2.6. Let a and b be positive integers such that a, b ≥ 2, a + b = n and n ≥ 9. Then $J(D^*_{a,b}) > J(K_n)$.

Proof. Consider the double star $D^*_{a,b}$. Let u2 and u3 be the vertices of degree a and b, respectively. Moreover, let u1 (u4, resp.) be a pendant vertex adjacent to u2 (u3, resp.). Since b = n − a, we have

w(u1) = 1 + 2(a − 1) + 3(b − 1) = 3n − a − 4,
w(u2) = a + 2(b − 1) = 2n − a − 2,
w(u3) = b + 2(a − 1) = n + a − 2,
w(u4) = 1 + 2(b − 1) + 3(a − 1) = 2n + a − 4.

Hence,

$$f(a) = \sum_{uv \in E(G)} \frac{1}{\sqrt{w(u) \cdot w(v)}} = \frac{a-1}{\sqrt{(3n-a-4)(2n-a-2)}} + \frac{1}{\sqrt{(2n-a-2)(n+a-2)}} + \frac{n-a-1}{\sqrt{(n+a-2)(2n+a-4)}},$$

and $J(D^*_{a,b}) = (n-1)f(a)$. In [8], see the text before Theorem 4, it is proved that f''(x) > 0, which means that f(x) is a convex function.
Since f(a) = f(n − a), 2 ≤ a ≤ n − 2, we have

$$J(D^*_{2,n-2}) > J(D^*_{3,n-3}) > \cdots > J(D^*_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}) \ge (n-1)f(n/2).$$

If n ≥ 70, then

$$(n-1)f(n/2) > \frac{n^2(n-1)^2}{2n(n-3)+8} \cdot \frac{1}{n-1} = J(K_n),$$

which implies that $J(D^*_{a,b}) > J(K_n)$ in this case. The cases with n < 70 were checked by computer.

Theorem 2.6 implies the following.

Corollary 2.7. For every k there exists n0 such that for every n ≥ n0 the first k graphs of order n with the biggest value of Balaban index are trees.

An analogous result can be proved for sum-Balaban index:

Theorem 2.8. Let a and b be positive integers such that a, b ≥ 2, a + b = n and n ≥ 8. Then $SJ(D^*_{a,b}) > SJ(K_n)$.

Proof. Using the values w from the proof of Theorem 2.6 we get

$$f(a) = \sum_{uv \in E(G)} \frac{1}{\sqrt{w(u) + w(v)}} = \frac{a-1}{\sqrt{5n-2a-6}} + \frac{1}{\sqrt{3n-4}} + \frac{n-a-1}{\sqrt{3n+2a-6}},$$

and $SJ(D^*_{a,b}) = (n-1)f(a)$. In [23, Lemma 3.2] it is proved that

$$SJ(D^*_{2,n-2}) > SJ(D^*_{3,n-3}) > \cdots > SJ(D^*_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}) \ge (n-1)f(n/2).$$

Since $(n-1)f(n/2) > \frac{n^2(n-1)^2}{2(n^2-3n+4)\sqrt{2n-2}} = SJ(K_n)$ if n ≥ 8, we have $SJ(D^*_{a,b}) > SJ(K_n)$.

Corollary 2.9. For every k there exists n0 such that for every n ≥ n0 the first k graphs of order n with the biggest value of sum-Balaban index are trees.

3 Comparison with Randić index

In the class of trees, the star Sn maximizes the Balaban index [8, 10, 22] and minimizes the Randić index [5]. Hence, for every tree T we have

$$\frac{J(T)}{R(T)} \le \frac{n-1}{\sqrt{2n-3}},$$

with equality if and only if T is the star Sn. This observation was pointed out by Aouchiche et al. [1], who proposed to study an extension of this bound to the class of all connected graphs. Based on their computer experiments for n ≥ 5 they proposed the conjecture, which turns out to be true (see Theorem 3.1). Namely, by Theorem 2.2, for n ≥ 8, the star Sn is the graph that maximizes the Balaban index over the class of n-vertex connected graphs, and over this class of graphs Sn also minimizes the Randić index [5, 27].
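The crossover between Kn and Sn in Theorems 2.2 and 2.3 can be reproduced from the closed forms stated in Section 2. The snippet below is a minimal floating-point sketch; the function names are introduced here for illustration only and the check covers only the small values of n listed.

```python
from math import sqrt


def J_star(n):
    """Closed form of J(S_n) from Section 2."""
    return sqrt((n - 1) ** 3 / (2 * n - 3))


def J_complete(n):
    """Closed form of J(K_n) from Section 2."""
    return n**2 * (n - 1) / (2 * (n**2 - 3 * n + 4))


def SJ_star(n):
    """Closed form of SJ(S_n) from Section 2."""
    return (n - 1) ** 2 / sqrt(3 * n - 4)


def SJ_complete(n):
    """Closed form of SJ(K_n) from Section 2."""
    return n**2 * (n - 1) ** 2 / (2 * (n**2 - 3 * n + 4) * sqrt(2 * n - 2))


# Orders n (3 <= n <= 12) at which the complete graph beats the star.
J_complete_wins = [n for n in range(3, 13) if J_complete(n) > J_star(n)]
SJ_complete_wins = [n for n in range(3, 13) if SJ_complete(n) > SJ_star(n)]
```

Under these formulas the first list contains exactly the orders 3 to 7 and the second exactly the orders 3 to 5, in line with the thresholds n ≤ 7 and n ≤ 5 in the two theorems.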
Using a computer program we have checked that the result also holds for n ∈ {5, 6, 7}; however, for n ∈ {3, 4}, the quotient J(G)/R(G) attains its maximal value for the complete graph Kn. Thus we can state the following.

Theorem 3.1. For any connected graph G on n ≥ 2 vertices, we have

$$\frac{J(G)}{R(G)} \le \begin{cases} \dfrac{n^2-n}{n^2-3n+4}, & \text{if } n \le 4, \\[2mm] \dfrac{n-1}{\sqrt{2n-3}}, & \text{if } n \ge 5, \end{cases}$$

with equality if and only if G = Kn for n ≤ 4, and for n ≥ 5 equality holds precisely for G = Sn.

Note that a similar observation can be made for the class of n-vertex connected unicyclic graphs. For this class Gao and Lu [12] proved that $S_n^+$ (i.e., the graph obtained from the star Sn by adding an edge between two nonadjacent vertices) has the minimum Randić index, but on the other hand it has the maximum Balaban index [7, 24]. In other words,

$$J(G) \le J(S_n^+) = \frac{n}{2}\left(\frac{1}{2n-4} + \frac{2}{\sqrt{(2n-4)(n-1)}} + \frac{n-3}{\sqrt{(2n-3)(n-1)}}\right),$$

and

$$R(G) \ge R(S_n^+) = \frac{n-3}{\sqrt{n-1}} + \frac{2}{\sqrt{2(n-1)}} + \frac{1}{2},$$

for any connected unicyclic graph on at least 4 vertices. Thus we obtain the following result.

Theorem 3.2. For any connected unicyclic graph G on n ≥ 4 vertices, we have

$$\frac{J(G)}{R(G)} \le \frac{J(S_n^+)}{R(S_n^+)}$$

with equality if and only if $G = S_n^+$.

4 Accumulation points of sum-Balaban index

In [17] it is shown that for every nonnegative real number r there exists a sequence of graphs $\{G_{r,i}\}_{i=1}^{\infty}$ such that the number of vertices of $G_{r,i}$ tends to infinity as i → ∞ and $\lim_{i\to\infty} J(G_{r,i}) = r$. Here we prove an analogous result for sum-Balaban index.

Let $K_a$ and $K'_a$ be two disjoint complete graphs on a vertices and let $P_b$ be a path on b vertices. The balanced dumbbell graph $D_{a,b}$ is obtained from $K_a \cup P_b \cup K'_a$ by joining all vertices of $K_a$ to one end-vertex of $P_b$ and all vertices of $K'_a$ to the other end-vertex of $P_b$. Thus, $D_{a,b}$ has 2a + b vertices. See Figure 1 for $D_{5,5}$.

Figure 1: The graph $D_{5,5}$.

Denote $Q = \sqrt{2}\ln(1+\sqrt{2})$. Observe that Q ≈ 1.24645 and $1 + Q + 2\sqrt{Q} \approx 4.47934$. We have the following statement.
Theorem 4.1. Let $r \ge 1 + Q + 2\sqrt{Q}$. Further, let $\{D_{a_i,b_i}\}_{i=1}^{\infty}$ be a sequence of balanced dumbbell graphs on $n_i = 2a_i + b_i$ vertices such that $n_i \to \infty$ and

$$\lim_{i\to\infty} \frac{a_i}{\sqrt{n_i}} = \frac{1}{\sqrt{2}}\sqrt{r - 1 - Q + \sqrt{(r-1-Q)^2 - 4Q}}.$$

Then $\lim_{i\to\infty} SJ(D_{a_i,b_i}) = r$.

Proof. First observe that if $r \ge 1 + Q + 2\sqrt{Q}$ then $(r-1-Q)^2 - 4Q \ge 0$, and so $\frac{1}{\sqrt{2}}\sqrt{r - 1 - Q + \sqrt{(r-1-Q)^2 - 4Q}}$ is a real number. In [14, Equation (9)] it is proved that if $a \sim c\sqrt{n}$ for a (real) constant c, then for a balanced dumbbell graph $D_{a,b}$ on n vertices it holds

$$SJ(D_{a,b}) \sim c^2 + 1 + Q + \frac{Q}{c^2}.$$

Hence, for $c = \frac{1}{\sqrt{2}}\sqrt{r - 1 - Q + \sqrt{(r-1-Q)^2 - 4Q}}$ we get

$$\begin{aligned} SJ(D_{a,b}) &\sim \frac{1}{2}\left(r - 1 - Q + \sqrt{(r-1-Q)^2 - 4Q}\right) + 1 + Q + \frac{2Q}{r - 1 - Q + \sqrt{(r-1-Q)^2 - 4Q}} \\ &= \frac{1}{2}\left(r + \sqrt{(r-1-Q)^2 - 4Q} + 1 + Q + \frac{4Q}{r + \sqrt{(r-1-Q)^2 - 4Q} - 1 - Q}\right) \\ &= \frac{1}{2} \cdot \frac{2r^2 + 2r\sqrt{(r-1-Q)^2 - 4Q} - 2r - 2rQ}{r + \sqrt{(r-1-Q)^2 - 4Q} - 1 - Q} = r. \end{aligned}$$

Although we have a conjecture that for graphs G on a large number of vertices $SJ(G) \ge 1 + Q + 2\sqrt{Q}$ (see Corollary 8 and Conjecture 9 in [14]), it is proved only that SJ(G) ≥ 4 + o(1) (see Theorem 2 in [14]). Hence, if our conjecture is false, then the problem of accumulation points of sum-Balaban index for values in the interval [4, 4.47934) remains open.

ORCID iDs
Riste Škrekovski https://orcid.org/0000-0001-6851-3214
Aleksandra Tepeh https://orcid.org/0000-0002-2321-6766

References
[1] M. Aouchiche, G. Caporossi and P. Hansen, Refutations, results and conjectures about the Balaban index, Internat. J. Chem. Model. 5 (2013), 189–202.
[2] A. T. Balaban, Highly discriminating distance-based topological index, Chem. Phys. Lett. 89 (1982), 399–404, doi:10.1016/0009-2614(82)80009-2.
[3] A. T. Balaban, Topological indices based on topological distances in molecular graphs, Pure Appl. Chem. 55 (1983), 199–206, doi:10.1351/pac198855020199.
[4] A. T. Balaban, P. V. Khadikar and S.
Aziz, Comparison of topological indices based on iterated ‘sum’ versus ‘product’ operations, Iranian J. Math. Chem. 1 (2010), 43–67, doi:10.22052/ijmc. 2010.5134. [5] B. Bollobás and P. Erdős, Graphs of extremal weights, Ars Combin. 50 (1998), 225–233. [6] Z. Chen, M. Dehmer, Y. Shi and H. Yang, Sharp upper bounds for the Balaban in- dex of bicyclic graphs, MATCH Commun. Math. Comput. Chem. 75 (2016), 105–128, http://match.pmf.kg.ac.rs/electronic_versions/Match75/n1/ match75n1_105-128.pdf. [7] B. Deng and A. Chang, Maximal Balaban index of graphs, MATCH Commun. Math. Comput. Chem. 70 (2013), 259–286, http://match.pmf.kg.ac.rs/electronic_ versions/Match70/n1/match70n1_259-286.pdf. [8] H. Deng, On the Balaban index of trees, MATCH Commun. Math. Comput. Chem. 66 (2011), 253–260, http://match.pmf.kg.ac.rs/electronic_versions/ Match66/n1/match66n1_253-260.pdf. [9] H. Deng, On the sum-Balaban index, MATCH Commun. Math. Comput. Chem. 66 (2011), 273–284, http://match.pmf.kg.ac.rs/electronic_versions/ Match66/n1/match66n1_273-284.pdf. [10] H. Dong and X. Guo, Character of trees with extreme Balaban index, MATCH Commun. Math. Comput. Chem. 66 (2011), 261–272, http://match.pmf.kg.ac.rs/electronic_ versions/Match66/n1/match66n1_261-272.pdf. [11] W. Fang, Y. Gao, Y. Shao, W. Gao, G. Jing and Z. Li, Maximum Balaban index and sum-Balaban index of bicyclic graphs, MATCH Commun. Math. Comput. Chem. 75 (2016), 129–156, http://match.pmf.kg.ac.rs/electronic_versions/ Match75/n1/match75n1_129-156.pdf. [12] J. Gao and M. Lu, On the Randić index of unicyclic graphs, MATCH Commun. Math. Comput. Chem. 53 (2005), 377–384, http://match.pmf.kg.ac.rs/electronic_ versions/Match53/n2/match53n2_377-384.pdf. [13] M. Knor, J. Kranjc, R. Škrekovski and A. Tepeh, A search for the minimum value of Balaban index, Appl. Math. Comput. 286 (2016), 301–310, doi:10.1016/j.amc.2016.04.023. [14] M. Knor, J. Kranjc, R. Škrekovski and A. Tepeh, On the minimum value of sum-Balaban index, Appl. Math. Comput. 
303 (2017), 203–210, doi:10.1016/j.amc.2017.01.041. [15] M. Knor, R. Škrekovski and A. Tepeh, Trees with large sum-Balaban index, submitted. [16] M. Knor, R. Škrekovski and A. Tepeh, Balaban index of cubic graphs, MATCH Commun. Math. Comput. Chem. 73 (2015), 519–528, http://match.pmf.kg.ac.rs/electronic_ versions/Match73/n2/match73n2_519-528.pdf. [17] M. Knor, R. Škrekovski and A. Tepeh, A note on accumulation points of Balaban index, MATCH Commun. Math. Comput. Chem. 78 (2017), 163–168, http://match.pmf.kg. ac.rs/electronic_versions/Match78/n1/match78n1_163-168.pdf. [18] M. Knor, R. Škrekovski and A. Tepeh, Convexity result and trees with large Balaban index, Appl. Math. Nonlinear Sci. 3 (2018), 433–445, doi:10.21042/amns.2018.2.00034. 10 Art Discrete Appl. Math. 3 (2020) #P2.03 [19] M. Knor, R. Škrekovski and A. Tepeh, Mathematical aspects of Balaban index, MATCH Commun. Math. Comput. Chem. 79 (2018), 685–716, http://match.pmf.kg.ac.rs/ electronic_versions/Match79/n3/match79n3_685-716.pdf. [20] H. Lei and H. Yang, Bounds for the sum-Balaban index and (revised) Szeged index of regular graphs, Appl. Math. Comput. 268 (2015), 1259–1266, doi:10.1016/j.amc.2015.07.021. [21] M. Randić, On characterization of molecular branching, J. Am. Chem. Soc. 97 (1975), 6609– 6615, doi:10.1021/ja00856a001. [22] L. Sun, Bounds on the Balaban index of trees, MATCH Commun. Math. Comput. Chem. 63 (2010), 813–818, http://match.pmf.kg.ac.rs/electronic_versions/ Match63/n3/match63n3_813-818.pdf. [23] R. Xing, B. Zhou and A. Graovac, On sum-Balaban index, Ars Combin. 104 (2012), 211–223. [24] L. You and X. Dong, The maximum Balaban index (sum-Balaban index) of unicyclic graphs, J. Math. Res. Appl. 34 (2014), 392–402, doi:10.3770/j.issn:2095-2651.2014.04.002. [25] L. You and H. Han, The maximum sum-Balaban index of tree graph with given vertices and maximum degree, Adv. Appl. Math. (Chinese) 2 (2013), 147–151, doi:10.12677/aam.2013. 24019. [26] L. You and H. 
Han, The maximum sum-Balaban index of trees with given diameter, Ars Com- bin. 112 (2013), 115–128. [27] P. Yu, An upper bound on the Randić of trees, J. Math. Study (Chinese) 31 (1998), 225–230, http://en.cnki.com.cn/Article_en/CJFDTotal-SSYJ199802021.htm. ISSN 2590-9770 The Art of Discrete and Applied Mathematics 3 (2020) #P2.04 https://doi.org/10.26493/2590-9770.1271.e54 (Also available at http://adam-journal.eu) On the Terwilliger algebra of a certain family of bipartite distance-regular graphs with ∆2 = 0 Štefko Miklavič* , Safet Penjić† University of Primorska, Andrej Marušič Institute, Muzejski trg 2, 6000 Koper, Slovenia Received 27 September 2018, accepted 4 January 2019, published online 10 August 2020 Abstract Let Γ denote a bipartite distance-regular graph with diameterD ≥ 4 and valency k ≥ 3. Let X denote the vertex set of Γ, and let Ai (0 ≤ i ≤ D) denote the distance matrices of Γ. We abbreviate A := A1. For x ∈ X and for 0 ≤ i ≤ D, let Γi(x) denote the set of vertices in X that are distance i from vertex x. Fix x ∈ X and let T = T (x) denote the subalgebra of MatX(C) generated by A,E∗0 , E ∗ 1 , . . . , E ∗ D, where for 0 ≤ i ≤ D, E∗i represents the projection onto the ith subconstituent of Γ with respect to x. We refer to T as the Terwilliger algebra of Γ with respect to x. By the endpoint of an irreducible T -module W we mean min{i | E∗iW 6= 0}. In this paper we assume Γ has the property that for 2 ≤ i ≤ D− 1, there exist complex scalars αi, βi such that for all y, z ∈ X with ∂(x, y) = 2, ∂(x, z) = i, ∂(y, z) = i, we have αi + βi|Γ1(x) ∩ Γ1(y) ∩ Γi−1(z)| = |Γi−1(x) ∩ Γi−1(y) ∩ Γ1(z)|. We study the structure of irreducible T -modules of endpoint 2. Let W denote an irre- ducible T -module with endpoint 2, and let v denote a nonzero vector in E∗2W . We show that W = span ( {E∗i Ai−2E∗2v | 2 ≤ i ≤ D} ∪ {E∗i Ai+2E∗2v | 2 ≤ i ≤ D − 2} ) . 
It turns out that, except for a particular family of bipartite distance-regular graphs with D = 5, this result is already known in the literature. Assume now that Γ is a member of this particular family of graphs. We show that if Γ is not almost 2-homogeneous, then up to isomorphism there exists exactly one irreducible T -module with endpoint 2 and it is not thin. We give a basis for this T -module. Keywords: Distance-regular graphs, Terwilliger algebra, irreducible modules. Math. Subj. Class. (2020): 05E30, 05C50 *The author acknowledge the financial support from the Slovenian Research Agency (research core funding No. P1-0285 and research projects N1-0032, N1-0038, N1-0062, J1-5433, J1-6720, J1-7051, J1-9108, J1-9110). †The author acknowledges the financial support from the Slovenian Research Agency (research core funding No. P1-0285 and Young Researchers Grant). E-mail addresses: stefko.miklavic@upr.si (Štefko Miklavič), safet.penjic@iam.upr.si (Safet Penjić) cb This work is licensed under https://creativecommons.org/licenses/by/4.0/ 2 Art Discrete Appl. Math. 3 (2020) #P2.04 1 Introduction Throughout this introduction let Γ denote a bipartite distance-regular graph with diameter D ≥ 4, valency k ≥ 3 and path-length function ∂. Let X denote the vertex set of Γ. For x ∈ X and 0 ≤ i ≤ D, let Γi(x) denote the set of vertices in X that are distance i from vertex x, and let T = T (x) denote the Terwilliger algebra of Γ with respect to x (see Section 2 for formal definitions). It is known that there exists a unique irreducible T -module with endpoint 0, and this module is thin [8, Proposition 8.4]. Moreover, Curtin showed that up to isomorphism Γ has exactly one irreducible T -module with endpoint 1, and this module is thin [4, Corol- lary 7.7]. We now discuss the irreducible T -modules of endpoint 2. 
It turns out that the structure of these modules is particularly nice if we assume that Γ has the following combinatorial property: for 2 ≤ i ≤ D − 1, there exist complex scalars αi, βi such that for all y, z ∈ X with ∂(x, y) = 2, ∂(x, z) = i, ∂(y, z) = i, we have αi + βi|Γ1(x) ∩ Γ1(y) ∩ Γi−1(z)| = |Γi−1(x) ∩ Γi−1(y) ∩ Γ1(z)|. Irreducible modules of endpoint 2 of these graphs were studied extensively, see [10, 11, 12, 13, 15]. We are motivated by the fact that the above equation holds if Γ is Q-polynomial. Assume that Γ has the above mentioned combinatorial property. We show that if W is an irreducible T -module with endpoint 2 and v is a nonzero vector in E∗2W , then W = span ( {E∗i Ai−2E∗2v | 2 ≤ i ≤ D} ∪ {E∗i Ai+2E∗2v | 2 ≤ i ≤ D − 2} ) . Except for a particular family of bipartite distance-regular graphs with D = 5, this result is already known in the literature. To define this particular family we introduce a certain parameter ∆2 in terms of the intersection numbers of Γ by ∆2 = (k− 2)(c3− 1)− (c2 − 1)p222. It turns out that ∆2 ≥ 0 and that ∆2 = 0 implies c2 ∈ {1, 2} or D ≤ 5. The above mentioned family of bipartite distance-regular graphs with D = 5 is exactly the family of such graphs with ∆2 = 0. Assume now that Γ is such a graph. We show that if Γ is not almost 2-homogeneous, then up to isomorphism there exists exactly one irreducible T -module with endpoint 2, and this module is not thin. We give a basis for this T -module. If Γ is almost 2-homogeneous, then the structure of irreducible T -modules with endpoint 2 is described in [7]. 2 Preliminaries In this section we review some definitions and basic results concerning distance-regular graphs. See the book of A. E. Brouwer, A. M. Cohen and A. Neumaier [2] for more background information. Let C denote the complex number field and let X denote a nonempty finite set. Let MatX(C) denote the C-algebra consisting of all matrices whose rows and columns are indexed by X and whose entries are in C. 
Let $V = \mathbb{C}^X$ denote the vector space over C consisting of column vectors whose coordinates are indexed by X and whose entries are in C. We observe MatX(C) acts on V by left multiplication. We call V the standard module. We endow V with the Hermitean inner product 〈 , 〉 that satisfies $\langle u, v \rangle = u^t \overline{v}$ for u, v ∈ V, where t denotes transpose and the overline denotes complex conjugation. Recall that

$$\langle u, Bv \rangle = \langle \overline{B}^t u, v \rangle \tag{2.1}$$

for u, v ∈ V and B ∈ MatX(C). For y ∈ X let ŷ denote the element of V with a 1 in the y coordinate and 0 in all other coordinates. Note that {ŷ | y ∈ X} is an orthonormal basis for V.

Š. Miklavič and S. Penjić: On the Terwilliger algebra of BDRG with ∆2 = 0

Let Γ = (X, R) denote a finite, undirected, connected graph, without loops or multiple edges, with vertex set X and edge set R. Let ∂ denote the path-length distance function for Γ, and set D := max{∂(x, y) | x, y ∈ X}. We call D the diameter of Γ. For a vertex x ∈ X and an integer i let $\Gamma_i(x)$ denote the set of vertices at distance i from x. For an integer k ≥ 0 we say Γ is regular with valency k whenever $|\Gamma_1(x)| = k$ for all x ∈ X. We say Γ is distance-regular whenever for all integers h, i, j (0 ≤ h, i, j ≤ D) and for all vertices x, y ∈ X with ∂(x, y) = h, the number

$$p^h_{ij} = |\Gamma_i(x) \cap \Gamma_j(y)|$$

is independent of x and y. The $p^h_{ij}$ are called the intersection numbers of Γ.

For the rest of this paper we assume Γ is distance-regular with diameter D ≥ 4. Note that $p^h_{ij} = p^h_{ji}$ for 0 ≤ h, i, j ≤ D. For convenience set $c_i := p^i_{1,i-1}$ (1 ≤ i ≤ D), $a_i := p^i_{1i}$ (0 ≤ i ≤ D), $b_i := p^i_{1,i+1}$ (0 ≤ i ≤ D − 1), $k_i := p^0_{ii}$ (0 ≤ i ≤ D), and $c_0 = b_D = 0$. By the triangle inequality the following hold for 0 ≤ h, i, j ≤ D:
(i) $p^h_{ij} = 0$ if one of h, i, j is greater than the sum of the other two;
(ii) $p^h_{ij} \ne 0$ if one of h, i, j equals the sum of the other two.
In particular $c_i \ne 0$ for 1 ≤ i ≤ D and $b_i \ne 0$ for 0 ≤ i ≤ D − 1. We observe that Γ is regular with valency $k = k_1 = b_0$ and that $c_i + a_i + b_i = k$ (0 ≤ i ≤ D).
(2.2)

Note that $k_i = |\Gamma_i(x)|$ for x ∈ X and 0 ≤ i ≤ D. By [2, p. 127],

$$k_i = \frac{b_0 b_1 \cdots b_{i-1}}{c_1 c_2 \cdots c_i} \quad (1 \le i \le D). \tag{2.3}$$

Recall Γ is bipartite whenever $a_i = 0$ for 0 ≤ i ≤ D. Setting $a_i = 0$ in (2.2) we find

$$b_i + c_i = k \quad (0 \le i \le D). \tag{2.4}$$

The following formulae for the bipartite case will be useful.

Lemma 2.1 ([2, Lemma 4.1.7]). Let Γ denote a bipartite distance-regular graph with diameter D ≥ 4 and valency k ≥ 3. Then

$$p^i_{2i} = \frac{c_i(b_{i-1} - 1) + b_i(c_{i+1} - 1)}{c_2} \quad (1 \le i \le D-1), \qquad p^D_{2D} = \frac{k(b_{D-1} - 1)}{c_2}.$$

We recall the Bose-Mesner algebra of Γ. For 0 ≤ i ≤ D let $A_i$ denote the matrix in MatX(C) with (x, y)-entry

$$(A_i)_{xy} = \begin{cases} 1 & \text{if } \partial(x, y) = i, \\ 0 & \text{if } \partial(x, y) \ne i \end{cases} \quad (x, y \in X). \tag{2.5}$$

For notational convenience, we define $A_i$ to be the zero matrix for all integers i < 0 or i > D. We call $A_i$ the ith distance matrix of Γ. We abbreviate $A := A_1$ and call this the adjacency matrix of Γ. We observe
(i) $A_0 = I$;
(ii) $\sum_{i=0}^{D} A_i = J$;
(iii) $\overline{A_i} = A_i$ (0 ≤ i ≤ D);
(iv) $A_i^t = A_i$ (0 ≤ i ≤ D);
(v) $A_i A_j = \sum_{h=0}^{D} p^h_{ij} A_h$ (0 ≤ i, j ≤ D),
where I (resp. J) denotes the identity matrix (resp. all 1’s matrix) in MatX(C). Using these facts we find $A_0, A_1, \ldots, A_D$ is a basis for a commutative subalgebra M of MatX(C). We call M the Bose-Mesner algebra of Γ. It turns out that A generates M [1, p. 190].

3 Terwilliger algebra

Let Γ denote a distance-regular graph with diameter D ≥ 4 and valency k ≥ 3. We first recall the dual idempotents of Γ. To do this fix a vertex x ∈ X. We view x as a “base vertex”. For 0 ≤ i ≤ D let $E^*_i = E^*_i(x)$ denote the diagonal matrix in MatX(C) with (y, y)-entry

$$(E^*_i)_{yy} = \begin{cases} 1 & \text{if } \partial(x, y) = i, \\ 0 & \text{if } \partial(x, y) \ne i \end{cases} \quad (y \in X).$$

We call $E^*_i$ the ith dual idempotent of Γ with respect to x [16, p. 378]. We observe
(ei) $\sum_{i=0}^{D} E^*_i = I$;
(eii) $\overline{E^*_i} = E^*_i$ (0 ≤ i ≤ D);
(eiii) $E^{*t}_i = E^*_i$ (0 ≤ i ≤ D);
(eiv) $E^*_i E^*_j = \delta_{ij} E^*_i$ (0 ≤ i, j ≤ D).
By these facts $E^*_0, E^*_1, \ldots, E^*_D$ form a basis for a commutative subalgebra $M^* = M^*(x)$ of MatX(C).
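As a concrete illustration of the distance matrices and dual idempotents, the following sketch builds them for the 4-dimensional hypercube, which is used here only as a convenient example of a bipartite distance-regular graph with D = 4 and k = 4; it is not a graph singled out in the text. The helper names (`dist`, `distance_matrix`, `dual_idempotent`, `c`, `b`, and so on) are assumptions made for this example, and the matrices are plain nested lists.

```python
from math import prod

D = 4                      # diameter of the 4-cube (example graph, an assumption)
X = list(range(2 ** D))    # vertices are 4-bit integers
BASE = 0                   # the fixed base vertex x


def dist(x, y):
    """Path-length distance in the hypercube: Hamming distance of the labels."""
    return bin(x ^ y).count("1")


def distance_matrix(i):
    """A_i: entry 1 exactly when the two vertices are at distance i."""
    return [[1 if dist(x, y) == i else 0 for y in X] for x in X]


def dual_idempotent(i):
    """E*_i: diagonal matrix selecting the vertices at distance i from BASE."""
    return [[1 if y == z and dist(BASE, y) == i else 0 for z in X] for y in X]


def matmul(P, Q):
    return [[sum(P[r][t] * Q[t][s] for t in range(len(Q))) for s in range(len(Q[0]))]
            for r in range(len(P))]


def matsum(mats):
    return [[sum(M[r][s] for M in mats) for s in range(len(X))] for r in range(len(X))]


def c(i):
    """c_i = |Γ_{i-1}(x) ∩ Γ_1(y)| for some y at distance i from x = BASE."""
    y = next(v for v in X if dist(BASE, v) == i)
    return sum(1 for z in X if dist(BASE, z) == i - 1 and dist(y, z) == 1)


def b(i):
    """b_i = |Γ_{i+1}(x) ∩ Γ_1(y)| for some y at distance i from x = BASE."""
    y = next(v for v in X if dist(BASE, v) == i)
    return sum(1 for z in X if dist(BASE, z) == i + 1 and dist(y, z) == 1)


A = [distance_matrix(i) for i in range(D + 1)]
E = [dual_idempotent(i) for i in range(D + 1)]
IDENTITY = [[1 if r == s else 0 for s in X] for r in X]
ZERO = [[0] * len(X) for _ in X]

# k_i read off as the trace of E*_i, and via formula (2.3).
k_trace = [sum(E[i][y][y] for y in X) for i in range(D + 1)]
k_formula = [prod(b(j) for j in range(i)) // prod(c(j) for j in range(1, i + 1))
             for i in range(D + 1)]
```

With these objects one can check properties (ii), (ei) and (eiv) entry by entry, and compare the subconstituent sizes with formula (2.3).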
We call M∗ the dual Bose-Mesner algebra of Γ with respect to x [16, p. 378]. For 0 ≤ i ≤ D we have E∗i V = span{ŷ | y ∈ X, ∂(x, y) = i}, so dimE∗i V = ki. We call E ∗ i V the ith subconstituent of Γ with respect to x. Note that V = E∗0V + E ∗ 1V + · · ·+ E∗DV (orthogonal direct sum). (3.1) Moreover E∗i is the projection from V onto E ∗ i V for 0 ≤ i ≤ D. We now recall the Terwilliger algebra of Γ. Let T = T (x) denote the subalgebra of MatX(C) generated by M , M∗. We call T the Terwilliger algebra of Γ with respect to x [16, Definition 3.3]. Recall M is generated by A, so T is generated by A and the dual idempotents. We observe T has finite dimension. By construction T is closed under the conjugate-transpose map so T is semisimple [16, Lemma 3.4(i)]. By a T -module we mean a subspace W of V such that BW ⊆ W for all B ∈ T . Let W denote a T -module. Then W is said to be irreducible whenever W is nonzero and W contains no T -modules other than 0 and W . By [9, Corollary 6.2] any T -module is an orthogonal direct sum of irreducible T - modules. In particular the standard module V is an orthogonal direct sum of irreducible T -modules. Let W , W ′ denote T -modules. By an isomorphism of T -modules from W to W ′ we mean an isomorphism of vector spaces σ : W →W ′ such that (σB −Bσ)W = 0 for all B ∈ T . The T -modules W , W ′ are said to be isomorphic whenever there exists an isomorphism of T -modules from W to W ′. By [4, Lemma 3.3] any two nonisomor- phic irreducible T -modules are orthogonal. Let W denote an irreducible T -module. By [16, Lemma 3.4(iii)] W is an orthogonal direct sum of the nonvanishing spaces among E∗0W,E ∗ 1W, . . . , E ∗ DW . By the endpoint ofW we mean min{i | 0 ≤ i ≤ D, E∗iW 6= 0}. By the diameter of W we mean |{i | 0 ≤ i ≤ D, E∗iW 6= 0}| − 1. We say W is thin whenever the dimension of E∗iW is at most 1 for 0 ≤ i ≤ D. The following matrices of MatX(C) will be useful later in the paper. Definition 3.1. 
Let Γ denote a distance-regular with diameter D ≥ 4 and valency k ≥ 3. Fix x ∈ X and let E∗i = E∗i (x) (0 ≤ i ≤ D) and T = T (x). We define matrices L = L(x), R = R(x) by L = D∑ h=1 E∗h−1AE ∗ h, R = D−1∑ h=0 E∗h+1AE ∗ h. Š. Miklavič and S. Penjić: On the Terwilliger algebra of BDRG with ∆2 = 0 5 Note thatA = L+R [4, Lemma 4.4] andLt = R. We callL andR the lowering matrix and the raising matrix of Γ with respect to x, respectively. Observe that L and R are contained in T . Definition 3.2 ([7, Definition 3.2]). Let Γ denote a distance-regular with diameter D ≥ 4 and valency k ≥ 3. Fix x ∈ X . For 1 ≤ i ≤ D we define matrices Λi = Λi(x) in MatX(C) by (Λi)zy = { |Γ1(x) ∩ Γ1(y) ∩ Γi−1(z)|, if ∂(x, y) = 2, ∂(x, z) = ∂(y, z) = i, 0, otherwise for z, y ∈ X . 4 The scalars ∆i and γi Let Γ denote a distance-regular graph with diameter D ≥ 4 and valency k ≥ 3. From now on we assume that Γ is bipartite. In this section we introduce certain scalars ∆i and γi (2 ≤ i ≤ D − 1) which we find useful. Definition 4.1. Let Γ denote a distance-regular with diameter D ≥ 4 and valency k ≥ 3. Then for 2 ≤ i ≤ D − 1 we define ∆i = (bi−1 − 1)(ci+1 − 1)− (c2 − 1)pi2i and γi = ci(bi−1 − 1) pi2i (observe that pi2i > 0 by [3, Lemma 11]). By [3, Theorem 12] we have ∆i ≥ 0 for 2 ≤ i ≤ D − 1. Moreover, the scalars ∆i and γi are related as follows. Lemma 4.2 ([3, Theorem 13]). Let Γ denote a distance-regular with diameter D ≥ 4 and valency k ≥ 3 and fix an integer 2 ≤ i ≤ D − 1. Then the following (i),(ii) are equivalent. (i) ∆i = 0. (ii) For all x, y, z ∈ X with ∂(x, y) = 2, ∂(x, z) = i, ∂(y, z) = i, |Γ1(x) ∩ Γ1(y) ∩ Γi−1(z)| = γi. If ∆i = 0 for 2 ≤ i ≤ D − 2, then Γ is called almost 2-homogeneous, see [7]. In this case the structure of irreducible T -modules is well understood, so we will assume that Γ is not almost 2-homogeneous. In the rest of the paper we therefore consider the following situation. Notation 4.3. 
Let Γ = (X,R) denote a bipartite distance-regular graph with diameter D ≥ 4, valency k ≥ 3 and intersection numbers bi, ci, which is not almost 2-homogeneous. Let Ai (0 ≤ i ≤ D) be the distance matrices of Γ, and let V denote the standard module for Γ. We fix x ∈ X and let E∗i = E∗i (x) (0 ≤ i ≤ D) and T = T (x) denote the dual idempotents and the Terwilliger algebra of Γ with respect to x, respectively. We assume 6 Art Discrete Appl. Math. 3 (2020) #P2.04 that for 2 ≤ i ≤ D − 1, there exist complex scalars αi, βi such that for all y, z ∈ X with ∂(x, y) = 2, ∂(x, z) = i, ∂(y, z) = i, we have αi + βi|Γ1(x) ∩ Γ1(y) ∩ Γi−1(z)| = |Γi−1(x) ∩ Γi−1(y) ∩ Γ1(z)|. Let matrices L = L(x), R = R(x) and Λi = Λi(x) (1 ≤ i ≤ D) be as in Definitions 3.1 and 3.2. Let scalars ∆i, γi (2 ≤ i ≤ D − 1) be as in Definition 4.1. With reference to Notation 4.3, pick 2 ≤ i ≤ D − 1 and assume that ∆i 6= 0. By [12, Theorem 5.4] scalars αi and βi are uniquely determined and given by αi = ci(ci − 1)(bi−1 − c2)− cici−1(bi − 1)(c2 − 1) c2∆i , βi = ci(ci+1 − ci)(bi−1 − 1)− bi(ci+1 − 1)(ci − ci−1) c2∆i . (4.1) If ∆i = 0, then scalars αi and βi are not uniquely determined. For example, if ∆2 = 0, then one of the possible values for α2 and β2 is α2 = 0, β2 = 1. Note however that by Lemma 4.2 this is not the only possible solution. 5 Some products in T With reference to Notation 4.3, in this section we compute some products of matrices of T . We start by recalling the following results. Lemma 5.1 ([14, Lemma 6.1]). With reference to Notation 4.3, for 0 ≤ h, i, j ≤ D and y, z ∈ X the (y, z)-entry of E∗hAiE∗j is 1 if ∂(x, y) = h, ∂(y, z) = i, ∂(x, z) = j, and 0 otherwise. Lemma 5.2 ([14, Lemma 6.5]). With reference to Notation 4.3, for 0 ≤ h, i, j, r, s ≤ D and y, z ∈ X the (y, z)-entry of E∗hArE∗i AsE∗j is |Γi(x) ∩ Γr(y) ∩ Γs(z)| if ∂(x, y) = h, ∂(x, z) = j, and 0 otherwise. Lemma 5.3 ([7, Lemma 3.3]). 
With reference to Notation 4.3, we have Λ1 = E ∗ 1AE ∗ 2 , Λi = E ∗ i Ai−1E ∗ 1AE ∗ 2 − c2E∗i Ai−2E∗2 (2 ≤ i ≤ D). In particular, Λi ∈ T (1 ≤ i ≤ D). Theorem 5.4. With reference to Notation 4.3 the following holds for 3 ≤ i ≤ D: LE∗i Ai−2E ∗ 2 = bi−1E ∗ i−1Ai−3E ∗ 2 + (ci−1 − αi−1)E∗i−1Ai−1E∗2 − βi−1Λi−1. (5.1) Proof. Pick z, y ∈ X and an integer 3 ≤ i ≤ D. We show that (z, y)-entries of both sides of (5.1) agree. Note that by the property (eiv) of Section 3 and Lemma 5.2, (LE∗i Ai−2E ∗ 2 )zy = { |Γi(x) ∩ Γi−2(y) ∩ Γ1(z)| if ∂(x, y) = 2, ∂(x, z) = i− 1, 0 otherwise. (5.2) It follows from (5.2), Lemma 5.1 and Definition 3.2 that the (z, y)-entries of both sides of (5.1) are 0 if ∂(x, y) 6= 2 or ∂(x, z) 6= i−1. Assume now ∂(x, y) = 2 and ∂(x, z) = i−1. Š. Miklavič and S. Penjić: On the Terwilliger algebra of BDRG with ∆2 = 0 7 Observe that by the triangle inequality we have that ∂(z, y) ∈ {i − 3, i − 1, i + 1}. We consider each of these three cases separately. Case 1: ∂(x, y) = 2, ∂(x, z) = i−1 and ∂(z, y) = i−3. Note that in this case we have (LE∗i Ai−2E ∗ 2 )zy = bi−1 by (5.2). By Lemma 5.1 and Definition 3.2 the (z, y)-entries of both sides of (5.1) agree. Case 2: ∂(x, y) = 2, ∂(x, z) = i − 1 and ∂(z, y) = i − 1. Observe that by (5.2) we have (LE∗i Ai−2E ∗ 2 )zy = ci−1 − |Γ1(z) ∩ Γi−2(x) ∩ Γi−2(y)| = ci−1 − (αi−1 + βi−1|Γi−2(z) ∩ Γ1(x) ∩ Γ1(y)|). By Lemma 5.1 and Definition 3.2 the (z, y)-entries of both sides of (5.1) agree. Case 3: ∂(x, y) = 2, ∂(x, z) = i − 1 and ∂(z, y) = i + 1. By (5.2), Lemma 5.1 and Definition 3.2 the (z, y)-entries of both sides of (5.1) are 0. 6 Irreducible T -modules with endpoint 2 With reference to Notation 4.3, let W denote an irreducible T -module with endpoint 2. In this section we find a spanning set for W . Definition 6.1. With reference to Notation 4.3, letW denote an irreducible T -module with endpoint 2 and let v denote a nonzero vector in E∗2W . For 0 ≤ i ≤ D, define v+i = E ∗ i Ai−2E ∗ 2v, v − i = E ∗ i Ai+2E ∗ 2v. 
Note that $v^+_2 = v$, $v^+_i = 0$ if $i < 2$, and $v^-_i = 0$ if $i < 2$ or $i > D-2$.

Lemma 6.2 ([5, Corollary 9.3(i), Theorem 9.4]). With reference to Definition 6.1, the following (i)–(iv) hold.
(i) $E^*_i A_i E^*_2 v = -(v^+_i + v^-_i)$ $(2 \le i \le D)$.
(ii) $R v^+_i = c_{i-1} v^+_{i+1}$ $(2 \le i \le D-1)$ and $R v^+_D = 0$.
(iii) $L v^-_i = b_{i+1} v^-_{i-1}$ $(2 \le i \le D-2)$.
(iv) $L v^+_{i+1} - R v^-_{i-1} = b_i v^+_i - c_i v^-_i$ $(1 \le i \le D-1)$.

Lemma 6.3. With reference to Definition 6.1, the following (i)–(iii) hold.
(i) $\Lambda_i v = -c_2 v^+_i$ $(2 \le i \le D)$.
(ii) $L v^+_2 = 0$ and $L v^+_i = (b_{i-1} - c_{i-1} + \alpha_{i-1} + c_2 \beta_{i-1}) v^+_{i-1} - (c_{i-1} - \alpha_{i-1}) v^-_{i-1}$ for $3 \le i \le D$.
(iii) $R v^-_i = (c_2 \beta_{i+1} - c_{i+1} + \alpha_{i+1}) v^+_{i+1} + \alpha_{i+1} v^-_{i+1}$ for $2 \le i \le D-2$.

Proof. (i) Immediate from Lemma 5.3 and Definition 6.1.
(ii) Note that $L v^+_2 = 0$ as the endpoint of $W$ is 2. To obtain the result for $L v^+_i$ $(3 \le i \le D)$, apply (5.1) to $v$ and use Definition 6.1, Lemma 6.2(i) and (i) above.
(iii) Immediate from (ii) above and Lemma 6.2(iv).

Theorem 6.4. With reference to Definition 6.1,
\[ W = \operatorname{span}\{v^+_2, v^+_3, \ldots, v^+_D, v^-_2, v^-_3, \ldots, v^-_{D-2}\}. \]

Proof. Denote $W' = \operatorname{span}\{v^+_2, v^+_3, \ldots, v^+_D, v^-_2, v^-_3, \ldots, v^-_{D-2}\}$ and note that $W' \subseteq W$. We now show that $W = W'$. Note that $E^*_i v^+_j = \delta_{ij} v^+_j$ for $2 \le j \le D$ and $E^*_i v^-_j = \delta_{ij} v^-_j$ for $2 \le j \le D-2$. Therefore, $W'$ is invariant under the action of $E^*_i$ for $0 \le i \le D$. Observe also that $W'$ is invariant under the action of $L$ by Lemma 6.2(iii) and Lemma 6.3(ii), and also invariant under the action of $R$ by Lemma 6.2(ii) and Lemma 6.3(iii). As $A = R + L$, $W'$ is invariant under the action of $A$. As $T$ is generated by $A$ and $E^*_i$ $(0 \le i \le D)$, this implies that $W'$ is a $T$-module. Recall that $W$ is irreducible and that $W'$ contains a nonzero vector $v$. It follows that $W = W'$.

Corollary 6.5. With reference to Definition 6.1, we have
\[ \dim(E^*_{D-1} W) \le 1, \qquad \dim(E^*_D W) \le 1. \]

Proof. Immediate from Theorem 6.4.
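The step from Theorem 6.4 to Corollary 6.5 can be spelled out as follows. This is only a sketch of the "immediate" argument; it uses the relations $E^*_i v^\pm_j = \delta_{ij} v^\pm_j$ from the proof of Theorem 6.4 and the vanishing of $v^-_i$ for $i > D-2$ noted after Definition 6.1.

```latex
\begin{align*}
E^*_i W &= \operatorname{span}\{E^*_i v^+_j,\ E^*_i v^-_j\}
         = \operatorname{span}\{v^+_i,\ v^-_i\} && (2 \le i \le D),\\
v^-_{D-1} &= v^-_{D} = 0 && (\text{since } v^-_i = 0 \text{ for } i > D-2),\\
\dim(E^*_{D-1} W) &\le 1, \qquad \dim(E^*_{D} W) \le 1, \qquad
\dim(E^*_i W) \le 2 && (2 \le i \le D-2).
\end{align*}
```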
As already mentioned, the result of Theorem 6.4 is known in the literature, except for the case $D = 5$ and $\Delta_2 = 0$; see [11, 12, 15]. In the rest of the paper we study this case in detail. If $D = 5$ and $\Delta_2 = \Delta_3 = 0$, then $\Gamma$ is almost 2-homogeneous, contradicting our assumption in Notation 4.3. Therefore, we have $\Delta_3 \ne 0$.

7 Case $\Delta_2 = 0$ and $\Delta_3 \ne 0$

With reference to Notation 4.3, in this section we study graphs with $\Delta_2 = 0$ and $\Delta_3 \ne 0$. We first have the following observation.

Lemma 7.1. With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then the following (i), (ii) hold.
(i) $c_3 = \dfrac{(c_2^2 - c_2 + 1)k - c_2(c_2 + 1)}{k + c_2^2 - 3c_2}$.
(ii) $\alpha_3 = 0$, $\beta_3 = \dfrac{c_2(k - 2)}{k + c_2^2 - 3c_2}$.

Proof. (i) Solve $\Delta_2 = 0$ for $c_3$. Note that $k + c_2^2 - 3c_2 = (c_2 - 1)(c_2 - 2) + k - 2 > 0$ as $k \ge 3$.
(ii) Use Definition 4.1, (4.1) and (i) above.

Lemma 7.2. With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ E^*_2 A_2 E^*_2 v = -\frac{c_2(k - 2)}{k + c_2^2 - 3c_2}\, v. \]

Proof. Let $\Gamma^2_2 = \Gamma^2_2(x)$ denote the graph with vertex set $\tilde{X} = \Gamma_2(x)$ and edge set $\tilde{R} = \{yz \mid y, z \in \tilde{X},\ \partial(y, z) = 2\}$. The graph $\Gamma^2_2$ has exactly $k_2$ vertices and it is regular with valency $p^2_{22}$ ([6, Lemma 3.2]). Let $\tilde{A}$ denote the adjacency matrix of $\Gamma^2_2$. The matrix $\tilde{A}$ is symmetric with real entries. Therefore $\tilde{A}$ is diagonalizable with all eigenvalues real. Note that the eigenvalues of $E^*_2 A_2 E^*_2$ and $\tilde{A}$ are the same. Since $\Delta_2 = 0$, we know that $E^*_2 A_2 E^*_2$ has exactly one distinct eigenvalue $\eta$ on $E^*_2 W$ by [6, Theorem 4.11, Corollary 4.13, Lemma 5.3]. Thus, every nonzero vector in $E^*_2 W$ is an eigenvector for $E^*_2 A_2 E^*_2$ with eigenvalue $\eta$. By [6, Lemmas 5.4, 5.5] we find $\eta = -c_2/\gamma_2$. The result now follows from Definition 4.1 and Lemma 7.1(i).

Corollary 7.3. With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ v^-_2 = \frac{b_2(c_2 - 1)}{k + c_2^2 - 3c_2}\, v^+_2. \]

Proof.
By Lemma 6.2(i) and Lemma 7.2 we have
\[ -v^+_2 - v^-_2 = E^*_2 A_2 E^*_2 v = -\frac{c_2(k - 2)}{k + c_2^2 - 3c_2}\, v^+_2. \]
The result follows.

Corollary 7.4. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ W = \operatorname{span}\{v^+_2, v^+_3, v^+_4, v^+_5, v^-_3\}. \tag{7.1} \]

Proof. Immediate from Theorem 6.4 and Corollary 7.3.

Observe that by (3.1) the vectors $v^+_2, v^+_3, v^+_4, v^+_5$ are linearly independent, provided they are nonzero.

8 Some scalar products

With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Our goal for the rest of this paper is to find a basis for $W$. In this section we compute the norms of the vectors $v^+_3, v^+_4, v^+_5, v^-_3$ in terms of the intersection numbers of $\Gamma$ and $\|v\|$. Note that by [10, Lemma 6.4] we have $\Delta_4 \ne 0$ as well. The assumptions of [10, Lemma 6.4] are somewhat different from the assumptions of Notation 4.3. However, the proof of [10, Lemma 6.4] also works under the assumptions of Notation 4.3.

Lemma 8.1. With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ \|v^+_3\|^2 = \frac{b_2(b_2 - c_2)}{k + c_2^2 - 3c_2}\, \|v\|^2. \]
In particular, if $D \ge 5$ then $v^+_3 \ne 0$.

Proof. By Lemma 6.2(ii), (2.1) and Definition 3.1 we have
\[ \|v^+_3\|^2 = \langle v^+_3, v^+_3 \rangle = \langle R v^+_2, v^+_3 \rangle = \langle v^+_2, L v^+_3 \rangle. \]
The result now follows from Lemma 6.3(ii) and Corollary 7.3, using $\alpha_2 = 0$, $\beta_2 = 1$. Now assume that $v^+_3 = 0$. Observe that this implies $b_2 = c_2$. If $D \ge 5$ then by [2, Proposition 4.1.6](i),(ii) we have $c_2 \le c_3 \le b_2$, and so $c_2 = c_3$. But then $c_2 = 1$ by Lemma 7.1(i), and so $k = b_2 + c_2 = 2$, a contradiction.

Lemma 8.2. With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ \langle v^+_3, v^-_3 \rangle = \frac{b_2 b_4 (c_2 - 1)}{k + c_2^2 - 3c_2}\, \|v\|^2. \]

Proof. By Lemma 6.2(ii), (2.1) and Definition 3.1 we have
\[ \langle v^+_3, v^-_3 \rangle = \langle R v^+_2, v^-_3 \rangle = \langle v^+_2, L v^-_3 \rangle. \]
The result now follows from Lemma 6.2(iii) and Corollary 7.3.

Lemma 8.3. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$.
Then
\[ \|v^+_4\|^2 = \frac{b_2\bigl((b_3 - 1)b_2 - c_3(c_2 - 1)b_4\bigr)}{c_2(k + c_2^2 - 3c_2)}\, \|v\|^2. \]
In particular, $v^+_4 = 0$ if and only if $c_2 \ne 1$ and $b_4 = b_2(b_3 - 1)/(c_3(c_2 - 1))$.

Proof. By Lemma 6.2(ii), (2.1) and Definition 3.1 we have
\[ \langle v^+_4, v^+_4 \rangle = \frac{1}{c_2} \langle R v^+_3, v^+_4 \rangle = \frac{1}{c_2} \langle v^+_3, L v^+_4 \rangle. \]
The formula for $\|v^+_4\|^2$ now follows from Lemma 6.3(ii), Lemma 7.1, Lemma 8.1 and Lemma 8.2. It is clear that $v^+_4 = 0$ if $c_2 \ne 1$ and $b_4 = b_2(b_3 - 1)/(c_3(c_2 - 1))$. Therefore assume now that $v^+_4 = 0$. It follows that $(b_3 - 1)b_2 = c_3(c_2 - 1)b_4$. If $c_2 = 1$, then also $b_3 = 1$ and $c_3 = 1$ by Lemma 7.1(i). But then $k = c_3 + b_3 = 2$, a contradiction. Therefore $c_2 \ne 1$ and the result follows.

Lemma 8.4. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ \|v^-_3\|^2 = \left( \frac{(c_2 - 1)(c_4 - 1)b_2}{k + c_2^2 - 3c_2} + \frac{(k - 1)\Delta_3}{b_2 - 1} \right) \frac{b_2 b_4 \|v\|^2}{c_2(kc_2 - k - c_2) + b_2}. \]

Proof. By Lemma 6.2(iv), (2.1) and Definition 3.1 we have
\[ c_3 \langle v^-_3, v^-_3 \rangle = b_3 \langle v^+_3, v^-_3 \rangle + \langle R v^-_2, v^-_3 \rangle - \langle v^+_4, R v^-_3 \rangle. \]
The result now follows from Lemmas 6.3(iii), 7.1, 8.2 and 8.3, Corollary 7.3 and (4.1).

Corollary 8.5. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then the following (i), (ii) hold.
(i) $v^-_3 \ne 0$.
(ii) $v^+_3$, $v^-_3$ are linearly independent.

Proof. (i) Note that $(c_2 - 1)(c_4 - 1)b_2/(k + c_2^2 - 3c_2) \ge 0$ and that $(k - 1)\Delta_3/(b_2 - 1) > 0$ by [3, Theorem 12]. Moreover, it is easy to see that $c_2(kc_2 - k - c_2) + b_2 > 0$. The result follows.
(ii) Assume on the contrary that $v^+_3$, $v^-_3$ are linearly dependent. Let
\[ B = \begin{pmatrix} \langle v^+_3, v^+_3 \rangle & \langle v^+_3, v^-_3 \rangle \\ \langle v^-_3, v^+_3 \rangle & \langle v^-_3, v^-_3 \rangle \end{pmatrix} \]
and note that $\det(B) = 0$. Using Lemmas 8.1, 8.2 and 8.4 one easily sees that the only factor of $\det(B)$ which could be zero is
\[ c_4 k - c_2^3 k + 2c_2^2 k - 2c_2 k + c_2^3 c_4 - 2c_2^2 c_4 - c_2 c_4 + 2c_2^2. \]
Solving this for $c_4$ and then computing $\Delta_3$ using Definition 4.1, we obtain $\Delta_3 = 0$, a contradiction. This shows that $v^+_3$, $v^-_3$ are linearly independent.

Lemma 8.6.
With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then
\[ \|v^+_5\|^2 = \frac{b_4 - c_4 + \alpha_4 + c_2 \beta_4}{c_3}\, \|v^+_4\|^2. \]
In particular, $v^+_5 = 0$ if and only if $v^+_4 = 0$ or $b_4 - c_4 + \alpha_4 + c_2 \beta_4 = 0$.

Proof. By Lemma 6.2(ii), (2.1) and Definition 3.1 we have
\[ \langle v^+_5, v^+_5 \rangle = \frac{1}{c_3} \langle R v^+_4, v^+_5 \rangle = \frac{1}{c_3} \langle v^+_4, L v^+_5 \rangle. \]
The result now follows from Lemma 6.3(ii).

9 A basis

With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. In this section we display a basis for $W$. We will also show that, up to isomorphism, $\Gamma$ has a unique irreducible $T$-module with endpoint 2.

Theorem 9.1. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then the following (i)–(iii) hold.
(i) If $v^+_5 \ne 0$, then the following is a basis for $W$:
\[ v^+_i \ (2 \le i \le 5), \quad v^-_3. \tag{9.1} \]
(ii) If $v^+_4 \ne 0$ and $v^+_5 = 0$, then the following is a basis for $W$:
\[ v^+_i \ (2 \le i \le 4), \quad v^-_3. \tag{9.2} \]
(iii) If $v^+_4 = 0$, then the following is a basis for $W$:
\[ v^+_i \ (2 \le i \le 3), \quad v^-_3. \tag{9.3} \]
In particular, $W$ is not thin.

Proof. Note that by (7.1), $W$ is spanned by the vectors $v^+_i$ $(2 \le i \le 5)$ and $v^-_3$. The vector $v^+_2 = v$ is nonzero by definition. The vectors $v^+_3$ and $v^-_3$ are nonzero by Lemma 8.1 and Corollary 8.5(i), respectively. We prove part (i) of the theorem. Proofs of parts (ii) and (iii) are similar. If $v^+_5 \ne 0$, then $v^+_4 \ne 0$ by Lemma 8.6. The vectors $v^+_i$ $(2 \le i \le 5)$ and $v^-_3$ are linearly independent by (3.1) and Corollary 8.5(ii). This shows that (9.1) is a basis for $W$. As $\dim(E^*_3 W) = 2$ (the vectors $v^+_3$, $v^-_3$ both lie in $E^*_3 W$ and are linearly independent), $W$ is not thin. The result follows.

Theorem 9.2. With reference to Definition 6.1, assume that $D = 5$, $\Delta_2 = 0$ and $\Delta_3 \ne 0$. Then $\Gamma$ has, up to isomorphism, exactly one irreducible $T$-module with endpoint 2.

Proof. Let $U$ denote an irreducible $T$-module with endpoint 2, different from $W$. Fix a nonzero $u \in E^*_2 U$, and for $2 \le i \le 5$ define $u^+_i = E^*_i A_{i-2} E^*_2 u$ and let $u^-_3 = E^*_3 A_5 E^*_2 u$.
It follows from the results of Section 8 and Theorem 9.1 that $u^+_2, u^+_3, u^-_3$ are nonzero and that the nonzero vectors in the set $\{u^+_i \mid 2 \le i \le 5\} \cup \{u^-_3\}$ form a basis for $U$. Furthermore, it follows from Lemma 8.3 and Lemma 8.6 that $u^+_4$ ($u^+_5$, respectively) is nonzero if and only if $v^+_4$ ($v^+_5$, respectively) is nonzero. Let $\sigma : W \to U$ be defined by $\sigma(v^+_i) = u^+_i$ $(2 \le i \le 5)$ and $\sigma(v^-_3) = u^-_3$. It follows from the comments above that $\sigma$ is a vector space isomorphism from $W$ to $U$. We show that $\sigma$ is a $T$-module isomorphism. Since $A$ generates $M$ and $E^*_0, E^*_1, \ldots, E^*_5$ is a basis for $M^*$, it suffices to show that $\sigma$ commutes with each of $A, E^*_0, E^*_1, \ldots, E^*_5$. Using the fact that $E^*_i E^*_j = \delta_{ij} E^*_i$ and the definition of $\sigma$ we immediately find that $\sigma$ commutes with each of $E^*_0, E^*_1, \ldots, E^*_5$. Recall that $A = R + L$. It follows from Lemma 6.2, Lemma 6.3 and Corollary 7.3 that $\sigma$ commutes with $A$. The result follows.

We would like to emphasize that together with the results in [10, 12, 15], Theorems 9.1 and 9.2 imply the following characterization.

Theorem 9.3. Let $\Gamma = (X, R)$ denote a bipartite distance-regular graph with diameter $D \ge 4$ and valency $k \ge 3$. Assume $\Gamma$ is not almost 2-homogeneous. We fix $x \in X$ and let $E^*_i = E^*_i(x)$ $(0 \le i \le D)$ and $T = T(x)$ denote the dual idempotents and the Terwilliger algebra of $\Gamma$ with respect to $x$, respectively. Then the following (i), (ii) are equivalent.
(i) $\Gamma$ has, up to isomorphism, exactly one irreducible $T$-module $W$ with endpoint 2, and $W$ is non-thin with $\dim(E^*_2 W) = 1$, $\dim(E^*_{D-1} W) \le 1$ and $\dim(E^*_i W) \le 2$ for $3 \le i \le D$.
(ii) $\Delta_2 = 0$, and there exist complex scalars $\alpha_i$, $\beta_i$ $(2 \le i \le D-1)$ such that
\[ |\Gamma_{i-1}(x) \cap \Gamma_{i-1}(y) \cap \Gamma_1(z)| = \alpha_i + \beta_i |\Gamma_1(x) \cap \Gamma_1(y) \cap \Gamma_{i-1}(z)| \tag{9.4} \]
for all $y \in \Gamma_2(x)$ and $z \in \Gamma_i(x) \cap \Gamma_i(y)$.

With reference to Definition 6.1, assume that $\Delta_2 = 0$ and $\Delta_3 \ne 0$. It is known that this implies $c_2 \in \{1, 2\}$ or $D \le 5$, see [12, Theorem 4.4].
If $c_2 \in \{1, 2\}$, then the structure of irreducible $T$-modules with endpoint 2 was studied in detail in [12, 15]. Therefore, we are mainly interested in the case $c_2 \ge 3$. We have to mention, however, that we are not aware of any such graph. Using a computer program we found the intersection arrays $\{b_0, b_1, b_2, b_3, b_4; c_1, c_2, c_3, c_4, c_5\}$ up to valency $k = 20000$ which satisfy the following conditions: $c_2 \ge 3$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 > 0$, $\gamma_2 \in \mathbb{N}$, $p^2_{22} \in \mathbb{N}$. None of them passed the feasibility condition $p^1_{ij} \in \mathbb{N} \cup \{0\}$, see the table below.

intersection array                                         feasibility condition
(58, 57, 49, 21, 1; 1, 9, 37, 57, 58)                      $p^1_{23} = 1102/3 \notin \mathbb{N}$
(112, 111, 100, 45, 4; 1, 12, 67, 108, 112)                $p^1_{34} = 103600/67 \notin \mathbb{N}$
(186, 185, 161, 35, 1; 1, 25, 151, 185, 186)               $p^1_{23} = 6882/5 \notin \mathbb{N}$
(274, 273, 256, 120, 10; 1, 18, 154, 264, 274)             $p^1_{23} = 12467/3 \notin \mathbb{N}$
(274, 273, 256, 120, 1; 1, 18, 154, 273, 274)              $p^1_{23} = 12467/3 \notin \mathbb{N}$
(1192, 1191, 1156, 561, 28; 1, 36, 631, 1164, 1192)        $p^1_{23} = 118306/3 \notin \mathbb{N}$
(3236, 3235, 3136, 760, 1; 1, 100, 2476, 3235, 3236)       $p^1_{23} = 523423/5 \notin \mathbb{N}$

ORCID iDs
Štefko Miklavič https://orcid.org/0000-0002-2878-0745
Safet Penjić https://orcid.org/0000-0001-6664-4130

References
[1] E. Bannai and T. Ito, Algebraic combinatorics. I, The Benjamin/Cummings Publishing Co., Inc., Menlo Park, CA, 1984, association schemes.
[2] A. E. Brouwer, A. M. Cohen and A. Neumaier, Distance-regular graphs, volume 18 of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], Springer-Verlag, Berlin, 1989, doi:10.1007/978-3-642-74341-2.
[3] B. Curtin, 2-homogeneous bipartite distance-regular graphs, Discrete Math. 187 (1998), 39–70, doi:10.1016/S0012-365X(97)00226-4.
[4] B. Curtin, Bipartite distance-regular graphs. I, Graphs Combin. 15 (1999), 143–158, doi:10.1007/s003730050049.
[5] B. Curtin, Bipartite distance-regular graphs. II, Graphs Combin. 15 (1999), 377–391, doi:10.1007/s003730050072.
[6] B. Curtin, The local structure of a bipartite distance-regular graph, European J. Combin. 20 (1999), 739–758, doi:10.1006/eujc.1999.0307.
[7] B. Curtin, Almost 2-homogeneous bipartite distance-regular graphs, European J. Combin. 21 (2000), 865–876, doi:10.1006/eujc.2000.0399.
[8] E. S. Egge, A generalization of the Terwilliger algebra, J. Algebra 233 (2000), 213–252, doi:10.1006/jabr.2000.8420.
[9] J. T. Go, The Terwilliger algebra of the hypercube, European J. Combin. 23 (2002), 399–429, doi:10.1006/eujc.2000.0514.
[10] M. S. MacLean and Š. Miklavič, On bipartite distance-regular graphs with exactly one non-thin T-module with endpoint two, European J. Combin. 64 (2017), 125–137, doi:10.1016/j.ejc.2017.04.004.
[11] M. S. MacLean and Š. Miklavič, On bipartite distance-regular graphs with exactly two irreducible T-modules with endpoint two, Linear Algebra Appl. 515 (2017), 275–297, doi:10.1016/j.laa.2016.11.021.
[12] M. S. MacLean, Š. Miklavič and S. Penjić, On the Terwilliger algebra of bipartite distance-regular graphs with ∆2 = 0 and c2 = 1, Linear Algebra Appl. 496 (2016), 307–330, doi:10.1016/j.laa.2016.01.040.
[13] M. S. MacLean, Š. Miklavič and S. Penjić, An A-invariant subspace for bipartite distance-regular graphs with exactly two irreducible T-modules with endpoint 2, both thin, J. Algebraic Combin. 48 (2018), 511–548, doi:10.1007/s10801-017-0798-7.
[14] Š. Miklavič, The Terwilliger algebra of a distance-regular graph of negative type, Linear Algebra Appl. 430 (2009), 251–270, doi:10.1016/j.laa.2008.07.013.
[15] S. Penjić, On the Terwilliger algebra of bipartite distance-regular graphs with ∆2 = 0 and c2 = 2, Discrete Math. 340 (2017), 452–466, doi:10.1016/j.disc.2016.09.001.
[16] P. Terwilliger, The subconstituent algebra of an association scheme. I, J. Algebraic Combin. 1 (1992), 363–388, doi:10.1023/A:1022494701663.
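As a small consistency check on the table of intersection arrays in Section 9, the following sketch recomputes $c_3$ from $k$ and $c_2$ via Lemma 7.1(i) and compares it with the tabulated value. It is an illustration only, not the computer program mentioned by the authors; the function names are hypothetical, and only the formulas of Lemma 7.1 and the bipartite identity $b_i + c_i = k$ are used.

```python
from fractions import Fraction

# Intersection arrays (b0, ..., b4; c1, ..., c5) copied from the table in Section 9.
ARRAYS = [
    ((58, 57, 49, 21, 1), (1, 9, 37, 57, 58)),
    ((112, 111, 100, 45, 4), (1, 12, 67, 108, 112)),
    ((186, 185, 161, 35, 1), (1, 25, 151, 185, 186)),
    ((274, 273, 256, 120, 10), (1, 18, 154, 264, 274)),
    ((274, 273, 256, 120, 1), (1, 18, 154, 273, 274)),
    ((1192, 1191, 1156, 561, 28), (1, 36, 631, 1164, 1192)),
    ((3236, 3235, 3136, 760, 1), (1, 100, 2476, 3235, 3236)),
]


def c3_from_lemma_7_1(k, c2):
    """Value of c3 forced by Delta_2 = 0, as in Lemma 7.1(i)."""
    return Fraction((c2**2 - c2 + 1) * k - c2 * (c2 + 1), k + c2**2 - 3 * c2)


def beta3_from_lemma_7_1(k, c2):
    """Value of beta_3 given in Lemma 7.1(ii)."""
    return Fraction(c2 * (k - 2), k + c2**2 - 3 * c2)


def consistent_with_lemma_7_1(b, c):
    """True if the array is bipartite (b_i + c_i = k, c_5 = k) and c3 matches Lemma 7.1(i)."""
    k, c2, c3 = b[0], c[1], c[2]
    # zip pairs (b1, c1), ..., (b4, c4); c5 is checked separately.
    bipartite = c[4] == k and all(bi + ci == k for bi, ci in zip(b[1:], c))
    return bipartite and c3 == c3_from_lemma_7_1(k, c2)


if __name__ == "__main__":
    for b, c in ARRAYS:
        print(b, c, consistent_with_lemma_7_1(b, c), beta3_from_lemma_7_1(b[0], c[1]))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that a non-integral value of $c_3$ would show up as a mismatch rather than being hidden by rounding.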
ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.05
https://doi.org/10.26493/2590-9770.1363.b06
(Also available at http://adam-journal.eu)

Smart elements in combinatorial group testing problems with more defectives*

Dániel Gerbner†, Máté Vizer‡
MTA Rényi Institute, Hungary H-1053, Budapest, Reáltanoda utca 13-15

Received 23 May 2017, accepted 18 February 2019, published online 14 August 2020

Abstract
In combinatorial group testing problems Questioner needs to find a defective element $x \in [n]$ by testing subsets of $[n]$. In [18] the authors introduced a new model, where each element knows the answer for those queries that contain it and each element should be able to identify the defective one. In this article we continue to investigate models of this kind with more defective elements. We also consider related models inspired by secret sharing models, where the elements should share information among them to find out the defectives. Finally the adaptive versions of the different models are also investigated.

Keywords: Combinatorial group testing, defectives, cancellative.
Math. Subj. Class. (2020): 94A50

*We would also like to thank all participants of the Combinatorial Search Seminar at the Alfréd Rényi Institute of Mathematics for fruitful discussions.
†Research supported by the János Bolyai Research Fellowship of the Hungarian Academy of Sciences. Research supported by the National Research, Development and Innovation Office – NKFIH, grant K116769.
‡Research supported by the National Research, Development and Innovation Office – NKFIH, grant SNN 116095.
E-mail addresses: gerbner@renyi.hu (Dániel Gerbner), vizermate@gmail.com (Máté Vizer)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

1 Introduction

In the most basic model of combinatorial group testing Questioner needs to find a special element $x$ of $\{1, 2, \ldots, n\}$ $(=: [n])$ by asking a minimal number of queries (or group tests or pools) of type "is $x \in F \subset [n]$?". Special elements are usually called defective (or positive).

For every combinatorial group testing problem there are at least two main approaches: adaptive (or sequential) and non-adaptive (or oblivious). In the adaptive scenario Questioner asks queries depending on the answers to the previously asked queries, whereas in the non-adaptive version Questioner needs to pose all the queries at the beginning. The complexity of a specific combinatorial group testing problem is the number of queries Questioner needs to ask in the worst case when following an optimal strategy.

Combinatorial group testing problems were first considered during World War II by Dorfman [10] in the context of mass blood testing. Since then group testing techniques have had many different applications, for example in fault diagnosis in optical networks [20], in quality control in product testing [26] or failure detection in wireless sensor networks [23].

In this article we will mainly discuss non-adaptive models. The interested reader can find many variants and generalizations of the basic non-adaptive model and also many applications in the book [11].

1.1 Description of the new model

In [18] the authors introduced new combinatorial group testing models, inspired by the results of Tapolcai et al. [30, 29].

The main novel ingredient of these combinatorial group testing models is that the elements are smart and they distrust the Questioner, thus they want to control the tests they are involved in. So the following extra condition was introduced: each element knows the answer for those queries that contain it. The goal is that each element should be able to identify the defective one.

Motivated by secret sharing schemes (see e.g. [2]), the following variant was also considered: the elements can work together and share their knowledge.
In this case we require certain sets of elements to be able to identify the defective, while we require other sets to be unable to identify the defective element. We emphasize that the way the data is transmitted does not play a role here. Information cannot be distributed between different groups.

We mention here some other motivation to introduce these models: it is often mentioned in the group testing literature that an advantage of testing pools together is that it increases privacy. However, systematic research on this property has only started recently, see e.g. [1, 5, 16]. These papers focus on cryptographic versions of the problem. Here we deal with a simple combinatorial version, where privacy only means that an unauthorized participant cannot completely detect the defective element(s).

In [18] the authors considered models with one defective element. The main aim of this article is to continue these investigations with more defectives.

1.2 Simple combinatorial models with d defectives

A well-studied generalization of the basic model is the following. There are exactly $d$ defective elements, a query corresponds to a set $F$, and the answer shows whether there is at least one defective element in $F$ or not.

Regarding Questioner's strategy we remark that, as he should find all the defectives, the asked queries should form a $d$-separating family (see the next section for a definition) in the non-adaptive case. For the minimum number of tests the known lower bound is $\Omega\bigl(\frac{d^2}{\log d} \log n\bigr)$, while the best upper bound construction yields $O(d^2 \log n)$ (see e.g. [14, 25]). It is one of the major open problems in the theory of combinatorial group testing models to close the gap between these upper and lower bounds.

In the adaptive case there is a multiplicative constant factor between the information theoretic lower bound and the best existing algorithm. The best known lower bound is $d \log \frac{n}{d}$, while the upper bound is $O(d \log n)$.

1.3 Structure of the paper

We organize the paper as follows: in Section 2 we introduce some properties and related results about families of sets that we will need later. In Section 3 we introduce the non-adaptive models that we investigate, while in Section 4 we prove the main results. In Section 5 we look at the adaptive scenario, and we finish the article with remarks and open questions in Section 6.

We also mention that in this article we use standard asymptotic notation.

2 Finite set theory background

Our topic is connected to several areas of finite set theory. In this section we introduce some notions on families of subsets and known results about them that we will use during the proofs.

In this article we use the notation $2^{[n]}$ for the power set of $[n]$, and for any $\mathcal{F} \subset 2^{[n]}$, $a \in [n]$ we use $\mathcal{F}_a := \{F \in \mathcal{F} : a \in F\}$. The complement of a family $\mathcal{F} \subset 2^{[n]}$ is $\overline{\mathcal{F}} := \{[n] \setminus F : F \in \mathcal{F}\}$, while the dual of a family $\mathcal{F} \subset 2^{[n]}$ is $\mathcal{F}' := \{\mathcal{F}_a : a \in [n]\}$. It is defined on the underlying set $\mathcal{F}$ and has cardinality at most $n$. For a family $\mathcal{F} \subset 2^{[n]}$ and $d \ge 1$ let $\mathcal{F}^d := \{\cup_{i=1}^d F_i : F_i \in \mathcal{F},\ F_i \ne F_j \text{ for } i \ne j\}$.

Now we introduce some notions about families of subsets of $[n]$.

Definition 2.1. We say that $\mathcal{F} \subset 2^{[n]}$ is:
(1) intersection closed if $F, G \in \mathcal{F}$ implies $F \cap G \in \mathcal{F}$.
(2) Sperner if there are no two different $F_1, F_2 \in \mathcal{F}$ with $F_1 \subset F_2$.
(3) cancellative if for any three $F_1, F_2, F_3 \in \mathcal{F}$ we have $F_1 \cup F_2 = F_1 \cup F_3 \Rightarrow F_2 = F_3$.
(4) intersection cancellative if for any three $F_1, F_2, F_3 \in \mathcal{F}$ we have $F_1 \cap F_2 = F_1 \cap F_3 \Rightarrow F_2 = F_3$.
(5) $d$-separating for some integer $1 \le d \le n-1$, if for any two different $X_1, X_2 \subset [n]$ with $|X_1| = |X_2| = d$ there is $F \in \mathcal{F}$ with: $F \cap X_1 \ne \emptyset$ and $F \cap X_2 = \emptyset$, or $F \cap X_2 \ne \emptyset$ and $F \cap X_1 = \emptyset$.
(6) $d$-union-free for some $d \ge 1$ if for different $F_1, \ldots, F_d \in \mathcal{F}$ and different $G_1, \ldots, G_d \in \mathcal{F}$, $\bigcup_{i=1}^d F_i = \bigcup_{i=1}^d G_i$ implies $\{F_1, \ldots, F_d\} = \{G_1, \ldots, G_d\}$.
(7) $d$-cover-free for some integer $d \ge 1$ if there are no $(d+1)$ different $F_1, F_2, \ldots, F_{d+1} \in \mathcal{F}$ with $F_{d+1} \subset \bigcup_{i=1}^d F_i$.
(8) $(r, d)$-cover-free for some integers $r, d \ge 1$ if there are no $(d+r)$ different $F_1, F_2, \ldots, F_{d+r} \in \mathcal{F}$ with $\bigcap_{i=d+1}^{d+r} F_i \subset \bigcup_{i=1}^d F_i$.

Before defining the last notion, we need some introduction. We will generalize a graph property, so it is more convenient to use the word hypergraph instead of family of subsets of $[n]$ (where $\mathcal{F}$ is the set of the hyperedges and $[n]$ is the set of vertices of the hypergraph). There are several ways to define cycles in hypergraphs. Here we use one due to Berge [4]. A Berge-cycle of length $k$ in a hypergraph (a Berge-$C_k$) consists of $k$ different hyperedges $E_1, \ldots, E_k$ and $k$ different vertices $x_1, \ldots, x_k$ such that $E_i$ contains $x_i$ and $x_{i+1}$ for $1 \le i \le k$ (modulo $k$, so $E_k$ contains $x_k$ and $x_1$). Note that for a 2-uniform hypergraph (that is, a graph) this notion is the same as the 'usual' cycle in a graph. The Berge-girth (that we call just girth in this article) of a hypergraph $H$ is the smallest length of a cycle in $H$ (that is $\infty$ if there is no cycle in $H$).

A hypergraph is $d$-regular if every vertex is contained in exactly $d$ hyperedges, $r$-uniform if every hyperedge has size $r$, and linear if any two hyperedges intersect in at most one vertex.

2.1 Some known results about these notions that we will use later

• The notion cancellative was first introduced by Frankl and Füredi in [17].

Fact 2.2. $\mathcal{F} \subset 2^{[n]}$ is intersection cancellative if and only if $\overline{\mathcal{F}}$ is cancellative.

• The notion of separating family in the context of combinatorial search theory was introduced and first studied by Rényi in [24]. The following fact is rather trivial, so we omit its proof.

Fact 2.3. Suppose $\mathcal{F}_n \subset 2^{[n]}$ is a minimal separating family. Then we have: $|\mathcal{F}_n| \le \lceil \log_2 n \rceil$.

Fact 2.4. $\mathcal{F} \subset 2^{[n]}$ finds $d$ defectives if and only if $\mathcal{F}$ is $d$-separating. The dual of a $d$-separating family is $d$-union-free.
• The notion of $d$-union-free families was introduced by Hwang and T. Sós in [21] under the name of $d$-Sidon families. They proved the following:

Theorem 2.5 (Hwang, T. Sós, [21, Theorem 3]). There exists a $d$-union-free family $\mathcal{F}_n \subset 2^{[n]}$ with:
\[ \frac{1}{2} \left(1 + \frac{1}{(4d)^2}\right)^n \le |\mathcal{F}_n|. \]

• The notion of $d$-cover-free families was introduced by Kautz and Singleton in [22]. Note that a $d$-cover-free family is also $d$-union-free. They proved the following lower bound.

Theorem 2.6 (Kautz, Singleton, [22]). There exists a $d$-cover-free family $\mathcal{F}_n \subset 2^{[n]}$ with:
\[ \Omega\left(\frac{1}{d^2}\right) = \frac{\log_2 |\mathcal{F}_n|}{n}. \]

D'yachkov and Rykov proved the following upper bound on the size of $d$-cover-free families:

Theorem 2.7 (D'yachkov, Rykov, [14]). Suppose that $\mathcal{F}_n \subset 2^{[n]}$ is a $d$-cover-free family. Then we have:
\[ \frac{\log_2 |\mathcal{F}_n|}{n} \le \frac{2 \log_2 d}{d^2} (1 + o(1)). \]

• The notion of $(r, d)$-cover-free families was introduced by D'yachkov, Macula, Torney and Vilenkin in [12]. They showed that a result by Stinson, Wei and Zhu [28] implies the following:

Theorem 2.8. If the dual of $\mathcal{F}_n \subset 2^{[n]}$ is $(r, d)$-cover-free, then we have: $|\mathcal{F}_n| = \Omega_{d,r}(\log_2 n)$.

• Ellis and Linial [15] studied regular uniform linear hypergraphs with large girth. They mention that a result of Cooper, Frieze, Molloy and Reed [6] implies that for any $d \ge 2$, $r, g \ge 3$ and sufficiently large $n$, if $r$ divides $n$, then there is an $r$-uniform, $d$-regular, $n$-vertex linear hypergraph with girth at least $g$. Moreover the argument can be adapted to show the same statement in the case $r$ divides $dn$.

Theorem 2.9. Let $d \ge 2$, $r, g \ge 3$, $n$ large enough and let $r$ divide $dn$. Then there exists a linear, $d$-regular, $r$-uniform hypergraph with girth at least $g$ on $n$ vertices.

3 Models

In this section we start our investigations and give a systematic study of models with the extra property that each element knows the answers for those queries that contain it.
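The notions of $d$-separating family, dual family and $d$-union-free family from Section 2 can be checked by brute force on small instances. The following is a minimal sketch under stated assumptions: the ground set is taken to be $\{0, \ldots, n-1\}$ rather than $[n]$, a family is a Python list of `frozenset`s, and all function names are hypothetical. It illustrates Fact 2.4 on two toy families and is exponential in $d$, so it is only meant for experimentation.

```python
from itertools import combinations


def answers(family, defectives):
    """Answer vector: a query F answers YES iff it contains a defective element."""
    return tuple(bool(F & defectives) for F in family)


def is_d_separating(family, n, d):
    """Definition 2.1(5): different d-subsets of the ground set give different answer vectors."""
    seen = set()
    for X in combinations(range(n), d):
        vec = answers(family, frozenset(X))
        if vec in seen:
            return False
        seen.add(vec)
    return True


def dual(family, n):
    """Dual family: for each element a, the set of indices of the queries containing a."""
    return [frozenset(i for i, F in enumerate(family) if a in F) for a in range(n)]


def is_d_union_free(sets, d):
    """Definition 2.1(6): different d-subsets of the family have different unions."""
    members = list(set(sets))  # a family is a set, so repeated members are merged
    seen = set()
    for combo in combinations(members, d):
        union = frozenset().union(*combo)
        if union in seen:
            return False
        seen.add(union)
    return True


if __name__ == "__main__":
    singletons = [frozenset({a}) for a in range(4)]    # one query per element
    bit_queries = [frozenset({1, 3}), frozenset({2, 3})]  # binary-digit queries on 4 elements
    print(is_d_separating(singletons, 4, 2), is_d_union_free(dual(singletons, 4), 2))
    print(is_d_separating(bit_queries, 4, 1), is_d_separating(bit_queries, 4, 2))
```

The two binary-digit queries identify a single defective among four elements but cannot separate the pairs $\{0, 3\}$ and $\{1, 2\}$, which both receive two YES answers.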
In all the models in this section an input set $[n]$ is given, and $d$ of its elements are defective $(d \le n)$. We are dealing with non-adaptive models, so Questioner needs to construct a family $\mathcal{F} \subset 2^{[n]}$. A set $F$ corresponds to a query of the following type: 'is there any defective element in $F \subset [n]$?'.

In each model we assume that knowing all the answers is enough information for Questioner to find the defective elements, i.e. $\mathcal{F}$ is $d$-separating. Note that this immediately implies a lower bound of $\Omega_d(\log_2 n)$ on the size of the query family in each model. We mention whenever the query family satisfies another property that could improve the factor depending only on $d$, but calculating the factors is outside the scope of this paper.

The main difference between the following models is what we want the elements to find out. Using only the information available to them, i.e. the answers to the queries containing them, we can require that they find out something about the defective elements, or conversely, that they cannot find out something.

When we say that an element $x$ knows the defective elements, we mean that the query family satisfies the following property: no matter what the defective elements are, after the answers $x$ can find out the defective ones, i.e. the subfamily $\mathcal{F}_x$ is $d$-separating. Conversely, when we say that $x$ does not know any of the defective elements, we mean that the query family satisfies the following property: no matter what the defective elements are, after the answers $x$ cannot identify any defective element. Equivalently, for any $D \subset [n]$, $y \in D$ with $|D| = d$ there is a $D' \subset [n]$ with $|D'| = d$, $y \notin D'$, such that the same members of $\mathcal{F}_x$ intersect $D$ and $D'$.

Another variant of this problem is when elements can share information among them.
It is possible that in some model a single element cannot find out the defectives, but if we pick two elements and they share their information, then they can find the defective elements. We consider models of this kind.

We also assume that in each model the elements know the setup of the problem, i.e. that $n$ elements are given and exactly $d$ of them are defective. We use the expression that a family solves a model if it satisfies the property that describes the model.

In each of the following models we first give a property describing what the elements should know, and then we examine whether there is a query set that solves that specific model or state results about the cardinality of such query sets. Then we consider models where we require some information to remain hidden from the elements. Finally we mix these types of properties.

In this section we assume that there are exactly $d \ge 2$ defective elements (and every element knows that). We consider models analogous to the ones introduced in [18].

3.1 Model $1_d$

Probably the most natural model is the following:

Property. All elements find out if they are defective.

We note that some cryptographic problems concerning this model were investigated in [1], where the authors observed that the dual of a $d$-cover-free family solves this model. Here we show that only such families solve this model.

Theorem 3.1. $\mathcal{F}$ solves Model $1_d$ if and only if its dual is $d$-cover-free.

We prove this theorem in Section 4. By Theorem 3.1, Theorem 2.6 and Theorem 2.7 we have:

Corollary 3.2. If $\mathcal{F}_n \subset 2^{[n]}$ solves Model $1_d$ and has minimum cardinality, then we have:
\[ \Omega\left(\frac{d^2}{\log_2 d} \log_2 n\right) = |\mathcal{F}_n| = O(d^2 \log_2 n). \]

3.2 Model $2_d$

Another natural model is when the elements should find out everything.

Property. Every element finds all the defectives.

It is obvious that no $\mathcal{F}$ can solve Model $2_d$ if $1 < d < n$: a defective element cannot gather any information about the other elements, as it gets only YES answers.
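To make Model $1_d$ and Theorem 3.1 concrete on small instances, here is a brute-force sketch. It rests on several assumptions that are not in the paper: elements are labelled $0, \ldots, n-1$, the "view" of an element is modelled as the tuple of answers to the queries containing it, the cover-free test is run per element (so two elements with identical traces count as a violation), and all function names are hypothetical. It is an experiment on toy families, not a proof of the theorem.

```python
from itertools import combinations


def view(family, x, defectives):
    """Answers seen by element x: only the queries that contain x."""
    return tuple(bool(F & defectives) for F in family if x in F)


def solves_model_1d(family, n, d):
    """Each element can decide from its own view whether it is defective."""
    candidates = [frozenset(D) for D in combinations(range(n), d)]
    for x in range(n):
        status_by_view = {}
        for D in candidates:
            status = x in D
            # The same view must never occur with x defective and with x non-defective.
            if status_by_view.setdefault(view(family, x, D), status) != status:
                return False
    return True


def dual_is_d_cover_free(family, n, d):
    """No element's trace is covered by the union of the traces of d other elements."""
    traces = [frozenset(i for i, F in enumerate(family) if a in F) for a in range(n)]
    for a in range(n):
        others = [traces[b] for b in range(n) if b != a]
        for combo in combinations(others, d):
            if traces[a] <= frozenset().union(*combo):
                return False
    return True


if __name__ == "__main__":
    all_singletons = [frozenset({a}) for a in range(4)]
    three_singletons = [frozenset({a}) for a in range(3)]  # element 3 is in no query
    for fam in (all_singletons, three_singletons):
        print(solves_model_1d(fam, 4, 2), dual_is_d_cover_free(fam, 4, 2))
```

In the second family, element 3 is contained in no query, so its trace is empty and it cannot decide whether it is defective; both checks fail together, in line with Theorem 3.1. The remark on Model $2_d$ is also visible in this encoding: `view(family, x, D)` consists only of `True` entries whenever `x` is in `D`.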
3.3 Model $2'_d$

As defective elements cannot gather any information about the other elements, in the next model we only require non-defective elements to find the defective ones.

Property. Every non-defective element finds all the defectives.

Theorem 3.3. Suppose $\mathcal{F}_n$ solves Model $2'_d$ and has minimum cardinality. Then we have $|\mathcal{F}_n| = \Theta_d(\log_2 n)$.

Proof. We claim that every solution is the dual of a $d$-cover-free family, and that the dual of a $(2, d)$-cover-free family is always a solution. This together with Theorem 2.7 and Theorem 2.8 implies the statement.

Suppose that the dual is not $d$-cover-free. Then there are $F_1, F_2, \ldots, F_{d+1} \in \mathcal{F}'$ with $F_{d+1} \subset \cup_{i=1}^d F_i$. For the corresponding elements in the primal version we have $x_{d+1}$ such that any set $F \in \mathcal{F}$ containing $x_{d+1}$ contains one of the other elements $x_1, \ldots, x_d$. Thus if $x_1, \ldots, x_d$ are the defectives, $x_{d+1}$ receives only YES answers, and so cannot distinguish this case from the case when $x_{d+1}$ and any $d-1$ other elements are the defectives.

On the other hand let us assume that $\mathcal{F}$ is the dual of a $(2, d)$-cover-free family. Then for any two non-defective elements $x, y$ there is a set $F \in \mathcal{F}$ such that $F$ contains both $x$ and $y$, but none of the defective elements. Thus $x$ finds out that $y$ is not defective, and so $\mathcal{F}$ solves Model $2'_d$. Indeed, by the $(2, d)$-cover-free property there is an element in the intersection of the duals of $x$ and $y$ that is not contained in the duals of the defective elements. That element is the dual of a set $F \in \mathcal{F}$ that has the desired properties.

3.4 Model $2''_d$

The fact that defective elements cannot gather any information about the other elements shows that even $d-1$ elements together cannot always find the defectives. However, if $d$ elements share information, then either they are all the defectives and they do not need to gather information about the other elements, or at least one of them is not a defective, and then there is a solution by Model $2'_d$.

Property.
d elements together know who the defective elements are.

Theorem 3.4. F_n solves Model 2′′d if and only if its dual G is d-union-free and Gd is Sperner and intersection cancellative.

We prove this theorem in Section 4. Note that we know the maximum possible size of a Sperner and intersection cancellative family (by results of Frankl and Füredi [17] and Tolhuizen [31]), but we do not know if that construction can be written as Gd for a d-union-free family G.

Theorem 3.5. Suppose F_n solves Model 2′′d and has minimum cardinality. Then we have |F_n| = Θ_d(log_2 n).

Proof. It is easy to see that if a family solves both Model 1d and Model 2′d, then it also solves Model 2′′d. As we have seen in the proof of Theorem 3.3, a solution for Model 2′d is the dual of a d-cover-free family, thus it also solves Model 1d by Theorem 3.1. This implies the upper bound.

3.5 Model 3d

Let us now examine the case when we require that elements do not find the defectives. Note that, as always, we assume that knowing all the answers is enough to find the defective elements.

Property. No element knows any of the defective ones.

Note that for d = 1 there is a solution for Model 2d and there is no solution for Model 3d [18]. For d ≥ 2 the situation is just the opposite: we will show that there is a solution for Model 3d for n large enough. We will use arguments similar to the ones used in [3].

Theorem 3.6. If d ≥ 2, r ≥ 3 and n ≥ dr + 2, then an r-uniform, d-regular linear hypergraph with girth at least 5 solves Model 3d.

Proof. Let us consider an r-uniform, d-regular linear hypergraph F of girth at least 5. For an arbitrary element x its neighborhood consists of d disjoint sets of size r − 1. Also, there are more than d elements not in its neighborhood. It is easy to see that, since r ≥ 3, x cannot identify any defective elements. On the other hand, if we know all the answers, the YES answers form stars with the defective elements in the centers.
The elements that get only YES answers are the candidates for being defective. Every candidate that is not defective has to be connected to all the defectives. Two such candidates would form a Berge-C4 with any two of the defective elements, thus there is only one additional candidate. But then it is the only one among the d + 1 candidates that is connected to the other candidates, otherwise we could find a Berge-C3.

Corollary 3.7. If n is large enough compared to d > 1, then there is a solution for Model 3d.

Proof. If d ≥ 3, let us choose r = d; then Theorem 2.9 shows that we can find such a family. If d = 2, then Theorem 2.9 with r = 4 shows we can find such a family for n even. If n is odd, we find such a family for n + 1, and delete an element. The resulting family is not 4-uniform, but that property is not actually needed (in fact, we used only that every set in F has size at least 3).

3.6 Model 4d

Now we start to investigate models where elements can share information among themselves. Let i and j be integers with 1 ≤ i < j ≤ n. When we say that a set of j elements together know the defective elements, we mean that knowing the answers to all the queries containing at least one element from the set is enough to find all the defectives. Similarly, when we say that a set of i elements do not know any of the defectives, we mean that knowing the answers to all the queries intersecting the set is not enough to identify any of the defective elements.

Property. Any j elements together know the defectives, but i elements together do not know any of the defectives, for some i and j with 1 ≤ i < j ≤ n.

Note that Corollary 3.7 shows that there is a solution if d > 1, i = 1 and j = n, where n is large enough compared to d. In fact any n − r + 1 elements together know the answer to all the queries, thus it is enough to assume j ≥ n − r + 1.
A more precise version of Theorem 2.9 (see [15], Theorem 5) shows that about n^{1/6}/d can be chosen as r, which shows that j can be as small as n − n^{1/6}/d.

Proposition 3.8. If i ≥ d or j < d, then there is no solution.

Proof. Let us assume first that j < d and we are given j elements. If all of them are defective, they only get YES answers, and do not gather any information about the other elements.

If i ≥ d, for a set X let F_X := ∪_{x∈X} F_x. Among the d-element sets, let us consider a set X for which F_X is maximal. We claim that if the elements of X are the defectives, they can find it out by sharing information. Indeed, they get only YES answers. If they cannot be sure that they are the defective ones, then there is another d-set Y that could be the set of defectives. It means all the answers to the queries in F_X would still be YES if Y was the set of defectives, i.e. F_X ⊆ F_Y. By the assumption on X we have F_X = F_Y, but then the family is not d-separating.

Proposition 3.9. If j = d, then there is no solution.

Proof. If x receives only YES answers, he cannot find out he is defective, thus there is a set D = {y_1, ..., y_d} not containing x that intersects every member of F_x. On the other hand, if x and y_1, ..., y_{d−1} are the defectives, they together can figure that out. In particular, they know that the set of defectives is not D, thus there is a set intersecting {x, y_1, ..., y_{d−1}} but not D. Such a set would be a member of F_x that does not intersect D, a contradiction.

4 Proofs

4.1 Proof of Theorem 3.1

The dual of the d-cover-free property is that for any elements x_1, ..., x_{d+1} we cannot have that the sets that contain x_{d+1} all contain at least one of the other x_i's. Let H_x := {F \ {x} : F ∈ F_x}, and let τ(H_x) be the size of the smallest set that intersects every member of H_x. With this notation the following lemma finishes the proof of Theorem 3.1.

Lemma 4.1.
An element x always finds out if he is defective if and only if τ(H_x) > d.

Proof. If x gets a NO answer, he learns he is not defective, thus we can assume he only gets YES answers. If H_x cannot be covered by at most d elements different from x, then the only way to get a YES answer to every element of F_x is if x is defective (as defective elements cover the sets that get YES answers). On the other hand, if H_x can be covered by at most d elements different from x, then x cannot exclude the possibility that those are the defective elements, together with arbitrary additional elements to reach d defectives.

4.2 Proof of Theorem 3.4

Lemma 4.2. F ⊂ 2^{[n]} solves Model 2′′d if and only if the following two properties hold:

(1) for any two different d-element sets X, Y ⊂ [n] there is F ∈ F with F ∩ X ≠ ∅ and F ∩ Y = ∅, and

(2) for any three different d-element sets X, Y, Z ⊂ [n] there is F ∈ F with (F ∩ X ≠ ∅ and F ∩ Y ≠ ∅ and F ∩ Z = ∅) or (F ∩ X ≠ ∅ and F ∩ Z ≠ ∅ and F ∩ Y = ∅).

Proof.

1. Note that the property that Questioner can find out the answer is: for any two different d-element sets X, Y ⊂ [n] there is F ∈ F with (F ∩ X ≠ ∅ and F ∩ Y = ∅) or (F ∩ Y ≠ ∅ and F ∩ X = ∅). This property is contained in (1). Let us assume now X is a set of size d.

2. If X is the set of defectives, they have to find this out. It means that for a different d-element set Y, there should be an F ∈ F with X ∩ F ≠ ∅ and Y ∩ F = ∅.

3. If X is not the set of defectives, then another set Y is, and they have to identify Y. Thus for a third d-element set Z, there should be a set that intersects X (so they know the answer for it), and distinguishes Y and Z, i.e. it intersects exactly one of them.

Lemma 4.3. F ⊂ 2^{[n]} satisfies properties (1) and (2) if and only if its dual G is d-union-free and Gd is Sperner and intersection cancellative.

Proof. The dual of (1) is the following statement:

(3) for two different subfamilies each consisting of d sets {F_1, ..., F_d}, {G_1, ..., G_d} ⊂ F there is f ∈ [n] with f ∈ ∪_{i=1}^d F_i \ ∪_{i=1}^d G_i.

The dual of (2) is the following statement:

(4) for three different subfamilies each consisting of d sets {F_1, ..., F_d}, {G_1, ..., G_d}, {H_1, ..., H_d} ⊂ F there is f ∈ [n] with either f ∈ (∪_{i=1}^d F_i ∩ ∪_{i=1}^d G_i) \ ∪_{i=1}^d H_i, or f ∈ (∪_{i=1}^d F_i ∩ ∪_{i=1}^d H_i) \ ∪_{i=1}^d G_i.

It is easy to see that (3) is equivalent to the statement that G is d-union-free and Gd is Sperner. Now we claim that (4) means that Gd is intersection cancellative. Let us use the following notation: F := ∪_{i=1}^d F_i, G := ∪_{i=1}^d G_i, H := ∪_{i=1}^d H_i. Using these, the existence of f means either F ∩ G ⊄ H or F ∩ H ⊄ G. Let us define three properties.

(i) F ∩ G ⊄ H.
(ii) F ∩ H ⊄ G.
(iii) H ∩ G ⊄ F.

Property (2) (for these three sets in this order) means that at least one of (i) and (ii) holds. Considering the same three sets in different orders we get that also at least one of (i) and (iii), and at least one of (iii) and (ii), holds. This is true if and only if at least two of these three properties hold. To finish the proof of Lemma 4.3 we prove the following:

Claim 4.4. A family H ⊂ 2^{[n]} is intersection cancellative if and only if at least two out of (i), (ii) and (iii) hold for any three members of it.

Proof. Let us assume H is intersection cancellative and let F, G, H ∈ H. Let us assume at most one of the three properties holds, say (iii); thus (i) and (ii) do not hold. The first one implies F ∩ G ⊂ H, and obviously F ∩ G ⊂ F. Thus we have F ∩ G ⊂ F ∩ H. Similarly the second one implies F ∩ H ⊂ F ∩ G, hence they together imply F ∩ H = F ∩ G, which contradicts the intersection cancellative property and our assumption that F, G, H are three different sets.

Let us assume now that H is not intersection cancellative, thus we have F ∩ G = F ∩ H. This implies both F ∩ G ⊂ H and F ∩ H ⊂ G, thus at most one of (i), (ii) and (iii) can hold.

We are done with the proof of Theorem 3.4.
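Claim 4.4 lends itself to an exhaustive check on a small ground set. The sketch below is an illustration only: the function names are hypothetical, and it assumes (as the proof above suggests) that "intersection cancellative" means F ∩ G ≠ F ∩ H for any three distinct members F, G, H, with ⊂ read as non-strict inclusion.

```python
from itertools import combinations, permutations


def is_intersection_cancellative(family):
    """Assumed definition: F ∩ G != F ∩ H for all distinct members F, G, H."""
    return all(F & G != F & H for F, G, H in permutations(family, 3))


def two_of_three_hold(family):
    """At least two of (i), (ii), (iii) hold for every three distinct members."""
    for F, G, H in combinations(family, 3):
        held = [
            not (F & G) <= H,  # (i)   F ∩ G is not contained in H
            not (F & H) <= G,  # (ii)  F ∩ H is not contained in G
            not (H & G) <= F,  # (iii) H ∩ G is not contained in F
        ]
        if sum(held) < 2:
            return False
    return True


# Compare the two conditions on every family of subsets of {0, 1, 2}.
subsets = [frozenset(c) for k in range(4) for c in combinations(range(3), k)]
agree = all(
    is_intersection_cancellative(fam) == two_of_three_hold(fam)
    for size in range(len(subsets) + 1)
    for fam in combinations(subsets, size)
)
print(agree)  # True
```

The three properties are symmetric under permuting F, G, H, which is why the second function only needs unordered triples while the first one ranges over ordered triples.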
5 Adaptive scenario

A natural idea is to consider the adaptive versions of these problems. Here we assume the Questioner knows all the earlier answers, and then he can choose the next query. He can find the defectives, and then use further queries to share some information with the elements. However, there are two versions of this problem. The elements might know the algorithm, and use the order of the queries to gain information, or they only receive the answers to the queries at the end in no particular order. For example, in Model 2d in the second version we require that for every element x the family F_x with the answers is enough to find all the defectives, i.e. for two distinct sets D, D′ of size d there is a query that contains x and intersects only one of D and D′. It is still obviously not solvable, as every defective element only gets YES answers and no information about the others. However, in the first version Questioner may start with a d-separating family, then ask the set of defectives and then the set of non-defectives. This way every element has to look only at the last query that contains it. If the answer to that is YES, then it is the set of defectives; if the answer is NO, it is the set of non-defectives. In both cases the defectives are identified.

From now on we consider only the second version, i.e. the elements receive the answers to the queries containing them at the end of the algorithm in no particular order, and they only know the underlying set and the number of defectives. It is still possible for the Questioner to find the defectives, and then share some information using further queries. Let ta(d, n) denote the number of queries in the fastest adaptive algorithm that finds the d defectives (we mentioned some inequalities on ta(d, n) in the introduction); then ta(d, n) is a lower bound in every model.
On the other hand ta(d, n) + d + 1 queries are enough in Model 1d, ta(d, n) + 1 queries are enough in Model 2′d and ta(d, n) + d + 1 queries are enough in Model 2′′d: first Questioner finds the d defectives, then asks them as singletons, and/or asks the set of non-defectives.

Let us consider now Model 3d. By Corollary 3.7 there is a solution for n large enough, but that solution is linear in n. On the other hand it can be seen easily that for n = d + 1 there is no solution even adaptively. Here we give a faster algorithm.

Theorem 5.1. There is an adaptive algorithm that solves Model 3d and uses at most 2d log_2 n + 5d queries if n is large enough.

Proof. Questioner starts with asking a query Q of size ⌊n/2⌋ and its complement. Then in the next round, for every query Q that was answered YES, he asks two complementary subsets of Q whose sizes differ by at most one (say Q_1 and Q_2 with Q_1 ∪ Q_2 = Q). He repeats this in every round, except that if the subset has size at most 5, he stops and does not ask that subset as a query. Since he asks disjoint sets in every round, he gets at most d YES answers, thus there are at most 2d queries in the next round. There are obviously at most log_2 n rounds. After that we have a family D of at most d sets of size at least 3 and at most 5, each containing at least one defective element. Let A := {a_1, ..., a_l} be their union (l ≤ 5d); we also know that every defective is in A. Let D_i ∈ D be the set that contains a_i.

As n is large enough, we can assume that there were two disjoint queries B and C that were answered NO and have size at least 5d. Let b_1, ..., b_{5d} be distinct elements of B and c_1, ..., c_{5d} be distinct elements of C. Then Questioner also asks the queries {a_i, b_i, c_i} for i ≤ l. As we know b_i and c_i are not defective, Questioner finds out if a_i is defective for every i.
On the other hand, if a_i is defective, every query Q that contains a_i also contains either other elements of D_i or contains b_i, thus a_i cannot be sure he is defective. If a_i is not defective, then all he knows is that another element of D_i is defective, but there is more than one such element. Any other element x appears in a query Q_1 that got answer NO. At the point when this NO answer arrives, x might know that the answer was YES to a larger set Q that contains x. Q \ Q_1 has size at least 3, thus x does not know at this point which one is defective. If x ∉ B ∪ C, then he does not appear in any queries later, thus cannot find any defectives. If x = b_i or x = c_i for some i, he can get additional information about only one element of Q \ Q_1, thus there are two candidates remaining; again, x cannot find any defectives. Finally, if the answer to {a_i, b_i, c_i} is YES, then b_i does not know if a_i or c_i is defective.

It is easy to see that Model 4d still cannot be solved if i ≥ d or j < d. Indeed, the defectives still get only YES answers, thus fewer than d of them cannot have any idea about the remaining defectives. On the other hand we will show that there are possible answers such that the defectives together will find out they are the defectives, showing i ≥ d is impossible. Let us assume that every answer is YES, unless it is impossible. If Questioner finds out that D is the set of defectives, it means that for every other set D′ of size d there was a query at some point that intersected exactly one of D and D′. At that point YES was a possible answer, thus the answer to that query was YES. Hence it intersected D and was disjoint from D′. Then an element of D knows D′ is not the set of defectives, and this holds for every set D′ ≠ D of size d.

6 Remarks

We finish this article with some possible directions that can be investigated:

• In some of the above models we proved that there is a family that solves the model, but did not say anything about its possible size.
• In case of Model 4d our results can only be considered as the starting point of the investigations. In particular, it would be interesting to see if i can go above 1. It is tempting to try to extend the proof of Theorem 3.6 to this case, and use a linear hypergraph of large girth. However, it does not work even for i = 2. The property that the defectives can be identified forces the elements to be contained in many hyperedges, while the property that no 2 elements can identify any of the defectives forces the opposite.
  If the query hypergraph is linear and d-separating, it is easy to see that for two elements contained by the same query there must be at least d other sets containing the two elements. This implies almost every element has to be contained in more than (d + 1)/2 queries.
  On the other hand let us consider two elements x, y that are not contained in the same hyperedge. The large girth of the query hypergraph implies that there is at most one other element z contained in a hyperedge Q together with x and in another hyperedge together with y (if there is no such z, then let Q be an arbitrary query containing x). Let us assume the answer to Q is NO, and the answer to every other query containing x or y is YES. Then x and y together know that x is not defective. If they cannot identify y as a defective, there cannot be more than d hyperedges containing x or y besides Q. This implies almost every element has to be contained in at most (d + 1)/2 queries.
• In [18] the authors considered the abstract version of the model introduced by Tapolcai et al. [29, 30]. Here we extended our models to the case of more defectives. It would be interesting to see if their model can be extended similarly.
• It is a phenomenon in combinatorial group testing that in most of the models the adaptive version actually means the two-round version of the problem (see e.g.
[8]). Recently there was some interest in the r-round (or multi-stage) versions of combinatorial group testing problems, where this phenomenon does not hold (see e.g. [7, 19]). It would be interesting to investigate these models in this context.
• One can consider a variant of these models, where instead of requiring that the elements find all (or none) of the defective elements, we require that they identify at least i and/or at most j of them.

ORCID iDs

Dániel Gerbner https://orcid.org/0000-0001-7080-2883
Máté Vizer https://orcid.org/0000-0002-2360-3918

References

[1] M. J. Atallah, K. B. Frikken, M. Blanton and Y. Cho, Private combinatorial group testing, in: Proceedings of the 2008 ACM symposium on Information, computer and communications security, 2008 pp. 312–320, doi:10.1145/1368310.1368355.
[2] A. Beimel, Secret-sharing schemes: a survey, in: Coding and cryptology, Springer, Heidelberg, volume 6639 of Lecture Notes in Comput. Sci., pp. 11–46, 2011, doi:10.1007/978-3-642-20901-7_2.
[3] F. S. Benevides, D. Gerbner, C. T. Palmer and D. K. Vu, Identifying defective sets using queries of small size, Discrete Math. 341 (2018), 143–150, doi:10.1016/j.disc.2017.08.023.
[4] C. Berge, Hypergraphs, volume 45 of North-Holland Mathematical Library, North-Holland Publishing Co., Amsterdam, 1989, combinatorics of finite sets, Translated from the French, https://books.google.com/books?id=jEyfse-EKf8C.
[5] A. Cohen, A. Cohen and O. Gurewitz, Secure group testing, in: 2016 IEEE International Symposium on Information Theory (ISIT), IEEE, 2016 pp. 1391–1395, https://arxiv.org/abs/1607.04849.
[6] C. Cooper, A. Frieze, M. Molloy and B. Reed, Perfect matchings in random r-regular, s-uniform hypergraphs, Combin. Probab. Comput. 5 (1996), 1–14, doi:10.1017/s0963548300001796.
[7] P. Damaschke, A. S. Muhammad and E.
Triesch, Two new perspectives on multi-stage group testing, Algorithmica 67 (2013), 324–354, doi:10.1007/s00453-013-9781-4.
[8] A. De Bonis, L. Ga̧sieniec and U. Vaccaro, Optimal two-stage algorithms for group testing problems, SIAM J. Comput. 34 (2005), 1253–1270, doi:10.1137/s0097539703428002.
[9] T. J. Dickson, On a problem concerning separating systems of a finite set, J. Combinatorial Theory 7 (1969), 191–196, doi:10.1016/s0021-9800(69)80011-6.
[10] R. Dorfman, The detection of defective members of large populations, The Annals of Mathematical Statistics 14 (1943), 436–440, doi:10.1214/aoms/1177731363.
[11] D.-Z. Du and F. K. Hwang, Pooling designs and nonadaptive group testing, volume 18 of Series on Applied Mathematics, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2006, doi:10.1142/9789812773463, important tools for DNA sequencing.
[12] A. D’yachkov, P. Vilenkin, A. Macula and D. Torney, Families of finite sets in which no intersection of l sets is covered by the union of s others, J. Combin. Theory Ser. A 99 (2002), 195–218, doi:10.1006/jcta.2002.3257.
[13] A. D’yachkov, P. A. Vilenkin and S. Yekhanin, Upper bounds on the rate of superimposed (s, l)-codes based on Engel’s inequality, in: Proceedings of the International Conf. on Algebraic and Combinatorial Coding Theory (ACCT), Citeseer, 2002 pp. 95–99.
[14] A. G. D’yachkov and V. V. Rykov, Bounds on the length of disjunctive codes, Problemy Peredachi Informatsii 18 (1982), 7–13, http://mi.mathnet.ru/eng/ppi1232.
[15] D. Ellis and N. Linial, On regular hypergraphs of high girth, Electron. J. Combin. 21 (2014), Paper 1.54, 17, doi:10.37236/3851.
[16] D. Eppstein, M. T. Goodrich and D. S. Hirschberg, Combinatorial pair testing: distinguishing workers from slackers, in: Algorithms and data structures, Springer, Heidelberg, volume 8037 of Lecture Notes in Comput. Sci., pp. 316–327, 2013, doi:10.1007/978-3-642-40104-6_28.
[17] P. Frankl and Z.
Füredi, Erratum: “Union-free hypergraphs and probability theory” [European J. Combin. 5 (1984), no. 2, 127–131; MR0753001 (85g:05110)], European J. Combin. 5 (1984), 395, doi:10.1016/s0195-6698(84)80025-6.
[18] D. Gerbner and M. Vizer, Smart elements in combinatorial group testing problems, J. Comb. Optim. 35 (2018), 1042–1060, doi:10.1007/s10878-018-0248-z.
[19] D. Gerbner and M. Vizer, Rounds in a combinatorial search problem, Discrete Appl. Math. 276 (2020), 60–68, doi:10.1016/j.dam.2019.11.016.
[20] N. J. Harvey, M. Patrascu, Y. Wen, S. Yekhanin and V. W. Chan, Non-adaptive fault diagnosis for all-optical networks via combinatorial group testing on graphs, in: IEEE INFOCOM 2007-26th IEEE International Conference on Computer Communications, IEEE, 2007 pp. 697–705, doi:10.1109/infcom.2007.87.
[21] F. Hwang and V. Sós, Non-adaptive hypergeometric group testing, Studia Sci. Math. Hungar 22 (1987), 257–263, http://real.mtak.hu/id/eprint/107775.
[22] W. Kautz and R. Singleton, Nonrandom binary superimposed codes, IEEE Transactions on Information Theory 10 (1964), 363–377, doi:10.1109/tit.1964.1053689.
[23] C. Lo, M. Liu, J. P. Lynch and A. C. Gilbert, Efficient sensor fault detection using combinatorial group testing, in: 2013 IEEE international conference on distributed computing in sensor systems, IEEE, 2013 pp. 199–206, doi:10.1109/dcoss.2013.57.
[24] A. Rényi, On random generating elements of a finite Boolean algebra, Acta Sci. Math. (Szeged) 22 (1961), 75–81, http://acta.bibl.u-szeged.hu/id/eprint/13920.
[25] M. Ruszinkó, On the upper bound of the size of the r-cover-free families, J. Combin. Theory Ser. A 66 (1994), 302–310, doi:10.1016/0097-3165(94)90067-1.
[26] M. Sobel and P. A. Groll, Group testing to eliminate efficiently all defectives in a binomial sample, Bell System Tech. J. 38 (1959), 1179–1252, doi:10.1002/j.1538-7305.1959.tb03914.x.
[27] J.
Spencer, Minimal completely separating systems, J. Combinatorial Theory 8 (1970), 446–447, doi:10.1016/s0021-9800(70)80038-2.
[28] D. R. Stinson, R. Wei and L. Zhu, Some new bounds for cover-free families, J. Combin. Theory Ser. A 90 (2000), 224–234, doi:10.1006/jcta.1999.3036.
[29] J. Tapolcai, L. Rónyai, É. Hosszu, L. Gyimóthi, P.-H. Ho and S. Subramaniam, Signaling free localization of node failures in all-optical networks, IEEE Transactions on Communications 64 (2016), 2527–2538, doi:10.1109/tcomm.2016.2545653.
[30] J. Tapolcai, L. Rónyai, É. Hosszu, P.-H. Ho and S. Subramaniam, Signaling free localization of node failures in all-optical networks, in: IEEE INFOCOM 2014-IEEE Conference on Computer Communications, IEEE, 2014 pp. 1860–1868, doi:10.1109/infocom.2014.6848125.
[31] L. M. Tolhuizen, New rate pairs in the zero-error capacity region of the binary multiplying channel without feedback, IEEE Transactions on Information Theory 46 (2000), 1043–1046, doi:10.1109/18.841182.

ISSN 2590-9770 The Art of Discrete and Applied Mathematics 3 (2020) #P2.06
https://doi.org/10.26493/2590-9770.1302.f4e (Also available at http://adam-journal.eu)

On 2-skeleta of hypercubes

Paul C. Kainen
Georgetown University, Department of Mathematics and Statistics, Washington, DC 20057, USA

Received 5 November 2018, accepted 30 April 2019, published online 10 August 2020

Abstract

It is shown that the 2-skeleton of the odd-d-dimensional hypercube can be decomposed into s_d spheres and τ_d tori, where s_d = (d − 1)2^{d−4} and τ_d is asymptotically in the range (64/9)2^{d−7} to (d − 1)(d − 3)2^{d−7}.

Keywords: Cube decomposition, even-degree 2-complex, generalized book.
Math. Subj. Class. (2020): 57M20, 57M15, 05C45

1 Introduction

A decomposition of a graph is an edge-disjoint family of subgraphs such that each edge of the graph is in exactly one of the subgraphs.
In recent decades, research on decomposition of graphs into cycles of varying lengths has been carried out for various graphs, including hypercubes. The symbol “×” denotes Cartesian product of topological spaces.

It is natural to try to extend decomposition (and other frameworks) from graphs to 2-complexes. We do that for the 2-skeleton of the d-dimensional hypercube: the 2-complex Q_d^2 obtained from the d-dimensional hypercube graph Q_d by attaching a topological 2-cell [0, 1] × [0, 1] to each Q_2-subgraph of Q_d in the natural way, and the decompositions are into spheres and tori.

A necessary condition to decompose a 2-complex into surfaces is that the complex be even: each edge belongs to a positive even number of 2-cells. But the condition isn’t sufficient; e.g., a surface can intersect itself like the Klein bottle in 3-space. Note Q_d^2 is even iff d ≥ 3 is odd.

The next section contains definitions, a precise statement of the results, and the proofs. The paper concludes with a brief discussion.

E-mail address: kainen@georgetown.edu (Paul C. Kainen)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

2 Definitions, theorems, and proofs

In this section, we define complexes in a more general sense and give a product K □ L of 2-complexes (analogous to the Cartesian product of graphs).

A 2-cell is any space homeomorphic to the standard unit disk in the plane. A 2-complex is a graph together with a non-empty family of closed 2-cells which are attached by homeomorphisms from their boundaries to some of the cycles in the graph. The degree of an edge is the number of 2-cells which contain it; a complex is even iff all its edges have positive even degree. If K is a complex, we write K(r) for the set of r-cells, 0 ≤ r ≤ 2, where the vertices and edges, resp., are the 0- and 1-cells.
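The degree and evenness definitions above can be illustrated on the hypercube 2-skeleton by direct enumeration. This is only a sketch under stated assumptions: the function names are hypothetical, vertices of Q_d are encoded as the integers 0, ..., 2^d − 1 (bit strings), and each 2-cell is identified with the 4-cycle obtained by varying two coordinates.

```python
from itertools import combinations


def edge_degrees(d):
    """Map each edge of Q_d to the number of square 2-cells containing it."""
    degrees = {}
    for i, j in combinations(range(d), 2):  # the two free coordinates of a square
        bi, bj = 1 << i, 1 << j
        for v in range(1 << d):
            if v & (bi | bj):
                continue  # take the corner whose i-th and j-th bits are both 0
            corners = [v, v | bi, v | bi | bj, v | bj]  # the 4-cycle, in order
            for a, b in zip(corners, corners[1:] + corners[:1]):
                edge = frozenset((a, b))
                degrees[edge] = degrees.get(edge, 0) + 1
    return degrees


def is_even(d):
    """Every edge of Q_d lies in a positive even number of 2-cells."""
    degrees = edge_degrees(d)
    all_edges_covered = len(degrees) == d * 2 ** (d - 1)  # Q_d has d * 2^(d-1) edges
    return all_edges_covered and all(k % 2 == 0 for k in degrees.values())


for d in range(2, 8):
    print(d, is_even(d))
```

In this encoding each edge lies in exactly d − 1 squares (one for each other coordinate direction), which is the reason evenness holds exactly for odd d ≥ 3.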
The box-product of two 2-complexes K and L is the 2-complex M := K □ L, where for k = 0, 1, 2

Y ∈ M(k) ⇐⇒ Y = A × B, A ∈ K(i), B ∈ L(j), i + j = k; (2.1)

we call Y of type (i, j) in this case. It is easy to check that for all d ≥ 2,

Q_d^2 = Q_{d−2}^2 □ Q_2^2. (2.2)

E.g., the 2-cells of Q_4^2 = Q_2^2 □ Q_2^2 consist of four of type (0, 2), four of type (2, 0), and 16 of type (1, 1). The box product of even complexes is even.

A decomposition of a 2-complex K is a set of 2-complexes whose union is K such that every 2-cell in K is in exactly one of the components.

An r-factor of a graph is a spanning r-regular subgraph and a factorization of a graph G is an edge-disjoint family of factors whose union is G. The following result is due to El-Zanati and Vanden Eynden [3, Theorem 7].

Theorem A. Let d ≥ 3 be odd and suppose 2 ≤ r ≤ d. Then there is a 1-factor F of Q_d such that Q_d − F has a factorization into s-cycles with s = 2^r.

A complex is a sphere or torus if it is homeomorphic to a sphere or torus. If a complex is isomorphic to K, we call it a K-complex.

Theorem 2.1. For d odd ≥ 5, Q_d^2 has a decomposition into s_d spheres and t_d tori, where the spheres are Q_3^2, each torus is C_4 × C_ℓ for some ℓ = 2^r, r odd, 3 ≤ r ≤ d − 2, and

s_d = (d − 1)2^{d−4} and t_d = (2^{d−1} − (3/2)(d − 3) − 4)/9. (2.3)

Theorem 2.2. For d odd ≥ 5, Q_d^2 has a decomposition into s_d spheres and T_d tori, where each sphere equals ∂Q_3, each torus is C_4 × C_4, and

T_d = (d − 1)(d − 3)2^{d−7}. (2.4)

For d = 5, 7, 9, s_d = 8, 48, 256, t_d = 1, 6, 27, and T_d = 2, 24, 192, respectively.

Proof of Theorem 2.1. By Theorem A, with r = d − 2, Q_{d−2} can be factored into Hamiltonian cycles and a 1-factor F. We proceed by induction. For the basis case d = 5, by equation (2.2), Q_5^2 = Q_3^2 □ Q_2^2. As Q_3^2 is a sphere, the union of all 2-cells of type (2, 0) in Q_5^2 is a set of four disjoint spheres. If F is the 1-factor in Q_3, then F □ ∂(Q_2^2) is the union of four disjoint cylinders formed by 16 2-cells of type (1, 1), while there are eight 2-cells of type (0, 2) which constitute the tops and bottoms of the cylinders, giving a total of 8 spheres in the decomposition of Q_5^2. Finally, if H is the Hamiltonian cycle in Q_3 − F, then the 2-cells in H □ ∂(Q_2^2), each of type (1, 1), determine a torus of the form C_4 × C_8. Thus, s_5 = 8 and t_5 = 1.

Noting that s_3 = 1, for the induction step, we again use equation (2.2) and the above argument to see that for d ≥ 5,

s_d = 4s_{d−2} + 2^{d−3}

and it is straightforward to check that s_d = (d − 1)2^{d−4} satisfies the recursion. Indeed, for d ≥ 5

4(d − 3)2^{d−6} + 2 · 2^{d−4} = (d − 1)2^{d−4}.

Similarly, as (d − 3)/2 is the number of Hamiltonian cycles in the factorization of Q_{d−2} − F, we find that

t_d = 4t_{d−2} + (d − 3)/2,

and for d odd ≥ 5, one easily checks that

4(2^{d−3} − (3/2)(d − 5) − 4)/9 + (d − 3)/2 = (2^{d−1} − (3/2)(d − 3) − 4)/9,

which proves the theorem as the recursively added tori are of the form C_4 × C_ℓ, for ℓ the number of vertices in odd hypercubes of dimensions < d.

For instance, writing T_k for C_k × C_4, the 6 tori for Q_7^2 are 4 copies of T_8 and 2 copies of T_32. For Q_9^2, there are 16 copies of T_8, 8 of T_32, and 3 of T_128.

Using Theorem A with r = 2, one proves Theorem 2.2.

3 Conclusion

The decomposition of the odd-dimensional hypercube 2-complex into spheres and tori is an example of decomposing an even complex into surfaces, as proposed in [4]. We believe that similar decompositions are possible for even 2-complexes related to complete graphs (i.e., the simplex).

Decomposition into surfaces may allow improved display for graphs and 2-complexes embeddable in hypercubes. For instance, embedding the graph Q_d in a surface requires genus 1 + (d − 4)2^{d−3} (e.g., [5, p. 119]) and such an embedding does not include all of the 2-complex. In contrast, a set of spheres and tori with 1-dimensional intersections suffice for the complex.

The problem of finding such representations has been considered by L.
De Floriani and colleagues in a series of papers, e.g., [1, 2]. Two types of singularities, 0-dimensional (“pinch points”) and 1-dimensional (where several disks share a common line), are shown in Figures 3 and 1, respectively, of [1]. Their work, however, concentrates on simplicial complexes, rather than the cubical complexes considered here, and they don’t consider the issue of topological complexity.

Our hypercube decompositions, which are face-disjoint unions of spheres and tori, are examples of generalized books in the sense of Overbay [6, 7].

If decompositions include surfaces with boundary, then every 2-complex has a decomposition. Indeed, if K is a 2-complex, then take a genus embedding of the underlying graph, and put each 2-cell, not corresponding to a region of the embedding, onto a separate disk.

That Q_d^2 (d ≥ 5 odd) is decomposable into closed surfaces follows from Euler’s theorem using induction as above. Indeed, removing any 1-factor from Q_{d−2} leaves a (d − 3)-regular graph, which must be decomposable into cycles. Using [3] instead gives the least and greatest numbers of tori.

ORCID iDs

Paul C. Kainen https://orcid.org/0000-0001-8035-0745

References

[1] L. De Floriani and A. Hui, Representing non-manifold shapes in arbitrary dimensions, in: The 6th Israel-Korea Bi-National Conference on New Technologies and Visualization Methods for Product Development on Design and Reverse Engineering, held in Haifa, Israel, November 8 – 9, 2005.
[2] L. De Floriani, M. M. Mesmoudi, F. Morando and E. Puppo, Non-manifold decomposition in arbitrary dimensions, in: A. Braquelaire, J.-O. Lachaud and A. Vialard (eds.), Discrete Geometry for Computer Imagery, Springer, Berlin, volume 2301 of Lecture Notes in Computer Science, pp. 69–80, 2002, doi:10.1007/3-540-45986-3_6, proceedings of the 10th International Conference (DGCI 2002) held in Bordeaux, April 3 – 5, 2002.
[3] S. El-Zanati and C.
Vanden Eynden, Cycle factorizations of cycle products, Discrete Math. 189 (1998), 267–275, doi:10.1016/s0012-365x(98)00053-3.
[4] R. H. Hammack and P. C. Kainen, Graph bases and diagram commutativity, Graphs Combin. 34 (2018), 523–534, doi:10.1007/s00373-018-1891-y.
[5] F. Harary, Graph Theory, volume 2787 of Addison-Wesley Series in Mathematics, Addison-Wesley, Reading, Massachusetts, 1969.
[6] S. Overbay, Embedding graphs in cylinder and torus books, J. Combin. Math. Combin. Comput. 91 (2014), 299–313.
[7] S. B. Overbay, Generalized Book Embeddings, Ph.D. thesis, Colorado State University, Fort Collins, Colorado, 1998, https://search.proquest.com/docview/304444285.

ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.07
https://doi.org/10.26493/2590-9770.1281.0ad
(Also available at http://adam-journal.eu)

Hereditary polyhedra with planar regular faces∗

Egon Schulte†
Department of Mathematics, Northeastern University, Boston, MA 02115, USA

Asia Ivić Weiss‡
Department of Mathematics and Statistics, York University, Toronto, Ontario M3J 1P3, Canada

In memory of Norman Johnson, our friend and colleague.

Received 18 December 2018, accepted 6 July 2019, published online 10 August 2020

Abstract

A skeletal polyhedron in Euclidean 3-space is called hereditary if the symmetries of each face extend to symmetries of the entire polyhedron. In this paper we describe the finite hereditary skeletal polyhedra which have regular convex polygons or regular star-polygons as faces.

Keywords: Symmetries of polyhedra, geometric polyhedral, uniform polyhedra.
Math. Subj. Class. (2020): 51M20, 52B05, 52B22

1 Introduction

In the design of polyhedral structures with high symmetry it is quite natural to proceed from a highly symmetric structure of lower rank (or dimension) and ask for the symmetries of the lower rank structure to be preserved for the entire structure. The entire structure then inherits the symmetries of the lower rank structure.
For example, the Platonic solids and the Kepler-Poinsot polyhedra shown in Figure 1 have the property that each symmetry of each face extends to the entire figure (see [1]).

Figure 1: The finite regular polyhedra with planar faces (the five Platonic solids and the four Kepler-Poinsot polyhedra).

∗Special thanks go to Peter McMullen for pointing out the omission of a known uniform polyhedron from the list in an earlier version of the manuscript. We did know about the polyhedron but a bit of absentmindedness had caused us to forget to include it. His forthcoming paper [13] will describe an alternative approach to the enumeration presented here, and will also deal with the case of skew faces. We would also like to thank Tomaž Pisanski and the anonymous referee for helpful comments which have improved the paper.
†Supported by the Simons Foundation Award No. 420718.
‡Corresponding author. Supported by NSERC grant.
E-mail addresses: e.schulte@northeastern.edu (Egon Schulte), weiss@yorku.ca (Asia Ivić Weiss)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

In this paper we study finite hereditary geometric polyhedra in $\mathbb{E}^3$ with planar regular faces. Here a polyhedron is viewed as a finite geometric graph with a distinguished class of polygonal cycles, called faces, such that two faces meet at each edge. A polyhedron is hereditary if each symmetry of each face extends to a symmetry of the polyhedron. For instance, all of the eighteen finite regular polyhedra in $\mathbb{E}^3$ are hereditary. Recall that these polyhedra consist of the nine classical regular polyhedra, that is, the Platonic solids and the Kepler-Poinsot polyhedra, and their Petrie duals (see [4, 5, 6, 15] or [16, Ch. 7E]). Hereditary polyhedra with regular faces are highly-symmetric polyhedra and have maximal local symmetry (with respect to faces).
For hereditary polyhedra, the regularity assumption on the faces has strong implications for the geometry and enables us to say a great deal about them. Our main result is the following theorem.

Theorem 1.1. The finite hereditary polyhedra with planar regular faces in $\mathbb{E}^3$ are
(a) the nine classical regular polyhedra (Platonic solids and Kepler-Poinsot polyhedra),
(b) the medials of the eighteen finite regular polyhedra,
(c) the great ditrigonal icosidodecahedron $(5 \cdot 3)^3$,
(d) the small ditrigonal icosidodecahedron $(\frac{5}{2} \cdot 3)^3$, and
(e) the ditrigonal dodecadodecahedron $(5 \cdot \frac{5}{2})^3$.

Theorem 1.1 might give the false impression that there are 9 + 18 + 3 = 30 finite hereditary polyhedra with planar regular faces in $\mathbb{E}^3$. However, some polyhedra are counted more than once in the theorem, since pairs of dual finite regular polyhedra have the same medials, and the regular octahedron also occurs as the medial of the regular tetrahedron. The exact number of polyhedra turns out to be 25, not 30.

Figure 2: The great ditrigonal icosidodecahedron $(5 \cdot 3)^3$ (left), small ditrigonal icosidodecahedron $(\frac{5}{2} \cdot 3)^3$ (middle), and ditrigonal dodecadodecahedron $(5 \cdot \frac{5}{2})^3$ (right).

Theorem 1.2. Up to similarity, there are precisely 25 finite hereditary polyhedra with planar regular faces in $\mathbb{E}^3$.

We list these polyhedra and some of their properties in Table 1 at the end of the paper. Figure 2 shows the three exceptional polyhedra listed in parts (c), (d), and (e) of Theorem 1.1. The Petrie duals of the classical regular polyhedra are hereditary (in fact, regular) polyhedra but have skew faces and therefore do not occur in the list of Theorem 1.1.

Historically, polyhedra with regular faces have attracted a lot of attention (for example, see [12]). Usually these figures were convex polyhedra or star-polyhedra.
This paper is dedicated to the late Norman Johnson who has greatly contributed to our understanding of the geometry, combinatorics, and algebra of polyhedra and more general polyhedral structures (see Johnson [11, 10]).

2 Basic notions and facts

A (finite) polygon, or more specifically a p-gon (with $p \geq 3$), consists of a sequence $v_1, v_2, \ldots, v_p$ of p distinct points in $\mathbb{E}^3$, as well as of the line segments $[v_i, v_{i+1}]$ for $i = 1, \ldots, p$ (with indices considered mod p). The points are the vertices and the line segments are the edges of the polygon. A polygon is planar if its vertices (and edges) lie in a plane; otherwise the polygon is skew, or non-planar. An incident vertex-edge pair of a polygon F is called a flag (or sometimes an arc) of F. A polygon F is said to be regular if its symmetry group G(F) is transitive on the flags of F. Recall that the (geometric) symmetry group of a figure is the group of all isometries of the ambient space that leave the figure invariant; its elements are the symmetries of the figure. Thus a planar polygon has a planar symmetry group.

A planar regular polygon with p vertices is necessarily (the graph consisting of the vertices and edges of) a regular convex p-gon, denoted $\{p\}$, or a regular star polygon, denoted $\{\frac{p}{d}\}$, with $(p, d) = 1$ (see [1]). Recall that the vertices of $\{\frac{p}{d}\}$ are the same as those of $\{p\}$, and that its edges successively connect vertices d steps apart on $\{p\}$, beginning at the first vertex (say) of $\{p\}$. The symmetry group of a planar regular p-gon, the (planar) dihedral group $D_p$ of order 2p, by definition consists of 2-dimensional isometries, and is generated by two reflections (in lines). When the p-gon is viewed as lying in a plane of $\mathbb{E}^3$, these reflections extend in an obvious way to plane reflections generating a reflection group in $\mathbb{E}^3$ isomorphic to $D_p$.
We call this group the trivial extension of $D_p$ to $\mathbb{E}^3$, and by abuse of terminology and notation we also call this extension the dihedral group $D_p$. A skew regular p-gon must have an even number of vertices p. Its symmetry group is generated by a plane reflection (interchanging the edges at a vertex) and a half-turn (about the midpoint of an edge containing that vertex), and again is isomorphic to $D_p$; note that in this case the product of the two generators is a rotatory reflection (the composition of a rotation, and a reflection in a plane perpendicular to the rotation axis).

A (finite skeletal) polyhedron P in $\mathbb{E}^3$ consists of a finite set of distinct points, called vertices, a set of line segments connecting vertices, called edges, and a set of polygons made up of edges, called faces, with the following three properties.
• The graph formed by the vertices and edges of P, called the edge graph (or 1-skeleton) of P, is connected.
• The vertex-figure at each vertex of P is connected. By the vertex-figure of P at a vertex v, denoted P/v, we mean the graph whose vertices are the neighbors of v in the edge graph of P and whose edges are the line segments (u, w), where (u, v) and (v, w) are edges of a common face of P.
• Each edge of P belongs to exactly two faces of P.

Note that a polyhedron is a geometric realization in $\mathbb{E}^3$, in the sense of [16, Ch. 5], of a finite abstract polyhedron and its respective map on a closed surface. A skeletal polyhedron P is called planar-faced or skew-faced respectively, if all faces of P are planar or some faces of P are skew. We call a polyhedron P regular-faced if each face of P is a regular polygon (and thus is a polygon with maximum possible symmetry).

The symmetry group of a (finite) polyhedron P is a finite group of isometries of $\mathbb{E}^3$ and thus fixes the centroid of the vertex set of P, which we call the center of P. Throughout we assume that the center of P lies at o, the origin of $\mathbb{E}^3$.
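The three defining properties of a skeletal polyhedron can be tested combinatorially. The following is a minimal sketch under stated assumptions: the function name `is_skeletal_polyhedron`, the input format (each face given as a cycle of vertex labels), and the example data `CUBE` are hypothetical and not taken from the paper, and the geometric requirement that the vertices be distinct points is not checked.

```python
from collections import defaultdict


def is_skeletal_polyhedron(faces):
    """Check the three defining conditions combinatorially.

    `faces` is a list of vertex cycles (a hypothetical input format).
    """
    adjacency = defaultdict(set)                    # edge graph of P
    faces_on_edge = defaultdict(int)                # edge -> number of faces
    figure = defaultdict(lambda: defaultdict(set))  # figure[v] = graph P/v

    for face in faces:
        n = len(face)
        for i in range(n):
            u, v, w = face[i - 1], face[i], face[(i + 1) % n]
            faces_on_edge[frozenset((v, w))] += 1
            adjacency[v].add(w)
            adjacency[w].add(v)
            # (u, v) and (v, w) are edges of a common face, so u and w
            # are joined in the vertex-figure P/v.
            figure[v][u].add(w)
            figure[v][w].add(u)

    def connected(nodes, neighbours):
        # Depth-first search from an arbitrary node.
        nodes = set(nodes)
        if not nodes:
            return False
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in neighbours(x):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen == nodes

    edge_graph_ok = connected(adjacency, lambda x: adjacency[x])
    figures_ok = all(
        connected(adjacency[v], lambda x, v=v: figure[v][x])
        for v in list(adjacency)
    )
    two_faces_ok = all(count == 2 for count in faces_on_edge.values())
    return edge_graph_ok and figures_ok and two_faces_ok


# The cube, with vertex 4x + 2y + z standing for the point (x, y, z).
CUBE = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
        [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
```

With this input the cube passes all three checks, whereas two tetrahedra glued at a single vertex fail the second condition, since the vertex-figure at the shared vertex splits into two triangles.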
We call a face of P central if the center of P is the centroid of the vertex-set of the face. If a central face is planar then its ambient plane passes through the center of P. A non-central face of P is a face of P which is not central.

A polyhedron P is said to be (geometrically) hereditary if the symmetry group G(F) of each face F of P can be viewed as a subgroup of the symmetry group G(P) of P, or more informally, if each symmetry of each face F of P extends to a symmetry of P. (Note that the abstract polyhedron underlying a geometrically hereditary polyhedron P with regular faces is also combinatorially hereditary, in the sense that the combinatorial automorphism group of each face extends to a subgroup of the automorphism group of P. Combinatorially hereditary abstract polyhedra were shown in [17] to be regular or 2-orbit of type $2_{01}$; see also [8] and [9].)

For a face F of a hereditary polyhedron P we let $G_P(F)$ denote the subgroup of G(P) consisting of the symmetries of P which extend symmetries of F. If a face F of a hereditary polyhedron P is skew, then each symmetry of F already is 3-dimensional and thus the extended symmetry is the symmetry itself; that is, $G_P(F) = G(F)$. Note that a regular skew face of a hereditary polyhedron must necessarily be central, since o and the center of the face must be invariant under the symmetries of the face.

However, this is different for planar faces. If a planar face F of P is non-central, then the extensions of the symmetries of F to symmetries of P still are unique, since o must be invariant; in this case, if F is a regular p-gon and $G(F) = D_p$ is identified with its trivial extension to $\mathbb{E}^3$, then $G_P(F) = G(F) = D_p$. If a planar face F of P is central, however, then the symmetries of F may occur in P in one of two ways; in fact, if F admits a reflection symmetry in a line l (through o), then this planar symmetry of F may occur either as a reflection in a plane through l perpendicular to the plane of F or as a half-turn about l. We later see that, for a regular-faced hereditary polyhedron with planar faces, each symmetry of a central face F has a unique extension to P (so reflective symmetries of F cannot extend to both a plane reflection and a half-turn). The geometry of the group $G_P(F)$ then depends on the nature of these extensions but is still isomorphic to G(F).

An incident vertex-edge-face triple of a polyhedron P is called a flag of P. A polyhedron P is (geometrically) regular if its symmetry group G(P) is transitive on the flags of P. Regular polyhedra are hereditary regular-faced polyhedra. An incident vertex-edge pair of a polyhedron P is called an arc of P. We say that P is vertex-, edge-, or arc-transitive if G(P) acts transitively on the vertices, edges, or arcs of P, respectively. Clearly, if P is arc-transitive, then P is vertex-transitive and edge-transitive. For a vertex v of P, let $G_v(P)$ denote the stabilizer of v in G(P).

Proposition 2.1. Let P be a (finite) hereditary polyhedron with regular faces in $\mathbb{E}^3$. Then P is arc-transitive. In particular, the vertex-figures are mutually equivalent under G(P), and the stabilizer $G_v(P)$ of a vertex v of P in G(P) acts transitively on the vertices of the vertex-figure P/v at v. Moreover, the vertex-figures are planar.

Proof. The first statement follows from [17, Prop. 1]. Any two arcs of P are related via a finite sequence of arcs such that successive arcs in the sequence are arcs of a common face of P.
Since the faces are regular and thus arc-transitive under their own symmetry group, and since every symmetry of a face of P extends to a symmetry of P, it follows that P is arc-transitive. Thus P is vertex-transitive and the stabilizer $G_v(P)$ of a vertex v in G(P) acts transitively on the vertices of the vertex-figure P/v at v.

Moreover, the vertex-figures must be planar since P is finite. In fact, by the vertex-transitivity, the vertices of P must all lie on a sphere centered at o; and since $G_v(P)$ acts vertex-transitively on P/v, the vertices of P/v must all lie on a sphere centered at v. Thus the vertices of P/v all lie on the intersection of the two spheres, which is a circle. This shows that P/v is planar.

It follows that every hereditary polyhedron with regular faces is a uniform polyhedron in $\mathbb{E}^3$. Recall that a uniform polyhedron is a vertex-transitive polyhedron with regular faces (see [1]).

The finite uniform polyhedra with planar faces were classified by Coxeter, Longuet-Higgins and Miller [2] in 1954. It is customary to describe these polyhedra by a vertex-symbol $(n_1 \cdot n_2 \cdot \ldots \cdot n_q)$ with integral or rational entries. Here q is the valency of a vertex, and the entries $n_1, \ldots, n_q$ represent the faces that surround a vertex, in cyclic order, such that the face corresponding to $n_i$ has Schläfli symbol $\{n_i\}$ for $i = 1, \ldots, q$ (thus the face is a convex regular $n_i$-gon if $n_i$ is an integer, or a regular star-polygon $\{n_i\}$ if $n_i$ is a fraction). For example, the small ditrigonal icosidodecahedron occurring in Theorem 1.1 and shown in Figure 2 has vertex-symbol $(\frac{5}{2} \cdot 3 \cdot \frac{5}{2} \cdot 3 \cdot \frac{5}{2} \cdot 3)$, indicating that at each vertex three pentagrams $\{\frac{5}{2}\}$ and three triangles $\{3\}$ alternate; the symbol is abbreviated to $(\frac{5}{2} \cdot 3)^3$. For our purposes, we further refine the vertex-symbol to indicate the presence of central faces.
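The two-sphere argument in the proof of Proposition 2.1 can be checked numerically on a familiar example. The sketch below is only an illustration under assumptions: it uses the regular icosahedron with the standard coordinates (0, ±1, ±φ) and their cyclic permutations, and the helper names are hypothetical. A point u with |u| = r and |u − v| = ρ satisfies u·v = r² − ρ²/2 when |v| = r, so all neighbours of v share the same height along the axis through o and v and hence lie in one plane.

```python
from itertools import product
from math import isclose, sqrt

PHI = (1 + sqrt(5)) / 2  # golden ratio


def icosahedron_vertices():
    """The 12 points (0, ±1, ±φ) together with their cyclic permutations."""
    verts = []
    for a, b in product((1, -1), repeat=2):
        base = (0.0, a * 1.0, b * PHI)
        for shift in range(3):
            verts.append(tuple(base[(i - shift) % 3] for i in range(3)))
    return verts


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def dist2(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def neighbour_heights(verts, v):
    """Values u·v for the nearest neighbours u of v.

    Equal values mean that the neighbours lie in a common plane
    perpendicular to the line through the origin and v.
    """
    d_min = min(dist2(u, v) for u in verts if u != v)
    nbrs = [u for u in verts if u != v and isclose(dist2(u, v), d_min)]
    return [dot(u, v) for u in nbrs]


VERTS = icosahedron_vertices()
HEIGHTS = neighbour_heights(VERTS, VERTS[0])
```

For the vertex (0, 1, φ) the five neighbours all have u·v = φ, so the pentagonal vertex-figure is planar, in line with the proposition.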
If a polyhedron has a central face, then the superscript “∗” in its refined vertex-symbol indicates that the corresponding face type represents a central face of the polyhedron. For example, the symbol $(\frac{5}{2} \cdot 6^* \cdot \frac{5}{2} \cdot 6^*)$ would represent a polyhedron in which two pentagrams $\{\frac{5}{2}\}$ and two central regular hexagons $\{6\}$ alternate at a vertex. Thus, a polyhedron has a central face if and only if a “∗” occurs in its refined vertex-symbol.

No classification of the finite uniform polyhedra with skew faces is known to date, but new uniform polyhedra with skew faces have recently been found in [21, 23, 24]. See also Grünbaum [7].

Note that, for a regular-faced hereditary polyhedron with vertices of valency q, the vertex stabilizers $G_v(P)$ may not be isomorphic to $D_q$, even though $G_v(P)$ acts vertex-transitively on the q-gonal vertex-figure P/v at v. However, the following proposition holds.

Proposition 2.2. Let P be a (finite) regular-faced hereditary polyhedron without central faces and with vertices of valency q. Then, for every vertex v of P, the vertex stabilizer $G_v(P)$ of v in G(P) is a dihedral subgroup $D_q$, if q is odd, or contains a dihedral group $D_{q/2}$ which acts transitively on the q vertices of the vertex-figure P/v, if q is even. Moreover, if q is odd then P is a regular polyhedron.

Proof. By the vertex-transitivity of P it suffices to consider the vertex stabilizer subgroup for a single vertex. So let v be a vertex of P. Since the faces are non-central, each face F of P at v contributes to G(P) a unique plane reflection which leaves both F and v invariant and interchanges the two edges of F meeting at v. This holds regardless of whether F is planar or skew. These reflections for the q faces at v generate a dihedral group $D_q$ if q is odd, or $D_{q/2}$ if q is even. Note that this subgroup of $G_v(P)$ acts vertex-transitively on P/v. If q is odd, then the dihedral subgroup $D_q$ of $G_v(P)$ must necessarily coincide with $G_v(P)$.
Hence $G_v(P)$ must contain symmetries that swap adjacent faces of P meeting at v. Thus $G_v(P)$ must act flag-transitively on the vertex-figure P/v at v, and since P is vertex-transitive, G(P) itself must act flag-transitively on P. Thus P is a regular polyhedron.

Proposition 2.1 tells us that the vertex-figures of hereditary regular-faced polyhedra must be congruent. The faces, however, need not be congruent (even though all are regular). On the other hand, by the edge-transitivity of P there can be at most two face orbits under G(P). If indeed there are two face orbits, then the two faces of P meeting at an edge of P must lie in different face orbits under G(P), and hence q must be even.

If a hereditary regular-faced polyhedron P has a central planar face, then each face adjacent to any such face must either be a non-central planar face or a skew face, as we explain in a moment. As a consequence, by the edge-transitivity of P, each edge of P must lie in a central planar face as well as in a non-central planar face or a skew face. In particular, G(P) must have two face orbits, one consisting of the central planar faces and the other of the non-central planar faces or the skew faces. Further, q must be even.

Note that P cannot have a pair of adjacent central planar faces. In fact, any such pair of faces would necessarily have to lie in the same plane and share o as the center. The edge-transitivity then would force the entire polyhedron P to lie in this plane, with all faces sharing the same symmetry group. However, this is impossible since then all faces would have to coincide; in fact, since the faces are regular, the symmetry group of a face is entirely determined by the angle subtended at o by one of its edges.

We noted earlier that, for non-central planar faces or skew faces of a hereditary regular-faced polyhedron P, there is just one way in which a planar symmetry of a face can extend to a symmetry of P.
This also remains true for the central planar faces of P, for the following reason. Suppose F is a central planar face and l is a reflection line for F in the plane that contains F. There are only two isometries of $\mathbb{E}^3$ which extend the 2-dimensional reflection in l, namely the half-turn about l and the reflection in the plane perpendicular at l to the plane of F. Now if both isometries are symmetries of P, then so is their product, which is the reflection in the plane containing F. However, this reflection cannot be a symmetry, since the image of an adjacent (non-central planar, or skew, respectively) face G of F under this reflection would yield another (non-central planar, or skew) face G′ of F meeting F at the same edge as G. This is impossible. Thus each planar reflection symmetry of F extends in just one way to a symmetry of P. This forces the same to be true for the rotational symmetries of F.

We also require the following two well-known concepts for polyhedra (see [1, 16, 19]).

A Petrie polygon of a regular polyhedron P in $\mathbb{E}^3$ is a path along edges of P such that every two, but no three, consecutive edges belong to a face of P (see [1, 3, 19]). Every regular polyhedron P gives rise to a new structure, denoted $P^\pi$ and called the Petrie dual, or Petrial, of P, which in most cases is again a polyhedron (see [16, Lemma 7B3]). For example, $\{4, 3\}^\pi$, the Petrial of a cube, is a polyhedron with four hexagonal skew faces.

Given a regular polyhedron P in $\mathbb{E}^3$ the medial Me(P) is a new structure, usually a polyhedron, with faces of two kinds: the polygons with vertices at the midpoints of consecutive edges in a face of P, and the polygons with vertices at the midpoints of consecutive edges meeting at a vertex of P. The medial of a regular polyhedron may not always be a polyhedron.
For example, in the blended polyhedron $\{3, 6\}\#\{\,\}$ edges can cross at midpoints and hence the edge midpoints occupy the same point in $\mathbb{E}^3$ (see [16, Ch. 7E]). Thus its medial is not a polyhedron.

3 Planar-faced polyhedra with no central faces

In the next two sections, we describe and characterize the finite regular-faced hereditary polyhedra P in $\mathbb{E}^3$ all of whose faces are planar. Their vertex-figures are also planar, by Proposition 2.1. Our analysis of these polyhedra greatly depends on whether or not they have central faces. In this section, we deal with the finite planar-faced hereditary polyhedra P with no central faces. Polyhedra with central faces are discussed in the next section.

So let P be a finite hereditary polyhedron with regular faces all of which are planar and non-central, and with vertices of valency q. Recall our standing assumption that the center of P lies at o. Then each symmetry of each face F of P is extended to P in the trivial way. Thus the subgroup $G_P(F)$ of G(P) is a dihedral group, namely the trivial extension of the dihedral symmetry group G(F) of F. In particular, G(P) must contain many plane reflections. It follows that G(P) must be the full symmetry group of a Platonic solid R (say), and that the face centers of P, being centers of rotation of a regular face, must lie on axes of rotation of R. In particular, each face F of P must have 3, 4 or 5 vertices.

As P has no central faces, we know from Proposition 2.2 that the vertex stabilizer $G_v(P)$ of a vertex v is a dihedral subgroup $D_q$ if q is odd, or contains a dihedral subgroup $D_{q/2}$ if q is even. In particular, each vertex v of P is a center of rotational symmetry of P about an axis passing through v and o. Thus v must lie on a rotation axis of the underlying Platonic solid R and therefore coincide with a vertex, the midpoint of an edge, or the center of a face of R, up to rescaling of R.
Clearly, by replacing R by its dual (if need be), we may assume that the vertices of P lie either at vertices or edge midpoints of R. Then, since G(P) = G(R) and P is vertex-transitive, the vertex set of P coincides with either the full vertex set of R or the full set of edge midpoints of R.

If the vertex valency q is odd, then Proposition 2.2 tells us that P is a regular polyhedron and that $G_v(P) = D_q$ for every vertex v of P. Thus the vertex-figures are congruent regular polygons, and by Proposition 2.1 are planar. Inspection of the list of finite regular polyhedra in $\mathbb{E}^3$ then establishes the following proposition (see [16]).

Proposition 3.1. Let P be a (finite) hereditary polyhedron with planar regular faces, all non-central, and with vertices of odd valency. Then P is either a Platonic solid or a Kepler-Poinsot polyhedron.

When the vertex valency q of P is even, the (q/2)-fold rotation about a vertex of P has order 2, 3, 4, or 5, and so q = 4, 6, 8, or 10. This case is more involved. The remainder of this section deals with the proof of the following proposition.

Proposition 3.2. Let P be a (finite) hereditary polyhedron with planar regular faces, all non-central, and with vertices of even valency. Then P is the medial of a Platonic solid, the medial of a Kepler-Poinsot polyhedron, a small ditrigonal icosidodecahedron $(\frac{5}{2} \cdot 3)^3$, a great ditrigonal icosidodecahedron $(5 \cdot 3)^3$, or a ditrigonal dodecadodecahedron $(5 \cdot \frac{5}{2})^3$.

Proof. The proof of Proposition 3.2 investigates the two possible placements of the vertices of P on R, with R as above, namely either at the vertices of R (Case 1) or at the edge midpoints of R (Case 2). So let the vertex valency q of P be even.

Case 1: The vertices of P lie at the vertices of R.

We first rule out the possibility that R is a tetrahedron {3, 3}, a cube {4, 3}, or an icosahedron {3, 5}. Clearly, since $q \geq 4$, R cannot be {3, 3}.
To see that R = {3, 5} is impossible, we note that since q is even and P in this case has 5-fold rotational symmetries about its vertices, q must be 10 and the vertex-figures of P must be planar decagons; however, no ten vertices of {3, 5} lie in a common plane. Similarly, R = {4, 3} is impossible since no six vertices of {4, 3} lie in a common plane.

If R = {3, 4}, then clearly q = 4 and the neighbors of a vertex v in P are just those in R. Since a 4-fold rotation about v must cyclically permute the faces of P at v, and since the faces are planar, the faces of P at v must necessarily be the faces of R at v. Hence P = {3, 4}, which is the medial of the tetrahedron.

The case when R = {5, 3} is more complicated. By arguments as above we find that q = 6, and that the vertex-figure at a vertex v of P is a planar hexagon with $D_3$-symmetry and with vertices among those of R. It is easy to see that only two configurations for the convex hull of the vertex-figure at v are possible, as indicated by the yellow and blue polygons in Figure 3. In the first (yellow) configuration for the convex hull, the vertex-figure of P at v has as its vertices the vertices of the three pentagons of R at v that lie on edges opposite to v on these pentagons. In the second (blue) configuration for the convex hull, the vertices are the antipodes of the vertices of the hexagon in the first configuration. We next consider these two configurations in turn to show that the first leads to three hereditary polyhedra with non-central planar faces, and that the second cannot occur.

For the first configuration of the convex hull three scenarios are possible and each contributes one polyhedron.

First suppose that the vertex-figure of P at v is a convex hexagon and thus coincides with its convex hull. Then the edges opposite to v on the pentagon faces of R at v are among the edges of the vertex-figure of P at v.
In P, these edges appear as the vertex-figures of pentagram faces $\{\frac{5}{2}\}$ at v inscribed in the pentagon faces of R at v. The other faces of P at v are equilateral triangles formed by the three vertices that are adjacent in R to a neighbour of v in R. Thus three pentagrams and three triangles alternate around v in P, as illustrated in Figure 4. The resulting polyhedron P is the uniform polyhedron $(\frac{5}{2} \cdot 3)^3$ called the small ditrigonal icosidodecahedron (see [2]).

Figure 3: Configurations for the convex hull of the vertex-figure at v when P has the same vertex-set as R = {5, 3}.

Figure 4: Faces of P at v, for the yellow (first) configuration of Figure 3 when the vertex-figure at v is a convex hexagon.

Next suppose that (still in the first configuration for the convex hull) the vertex-figure of P at v is not a convex hexagon. In this case the vertex-figure is a non-convex hexagon of one of two kinds.

The first kind of non-convex hexagonal vertex-figure is indicated with dashed red lines in Figure 5. Here the edges opposite to v on the pentagon faces of R at v are not among the edges of the vertex-figure of P at v. The faces of P at v again are of two kinds alternating around v. There are three regular convex pentagons “cutting across” R (shown in heavy red lines in Figure 5), and there are three equilateral triangles of the same kind as before, each formed by the three vertices that are adjacent in R to a neighbour of v in R. Now P is the uniform polyhedron $(5 \cdot 3)^3$ called the great ditrigonal icosidodecahedron (see [2]).

Figure 5: Faces of P at v, for the yellow (first) configuration of Figure 3 when the vertex-figure at v is a non-convex hexagon not sharing any edges with R = {5, 3}.

The second kind of non-convex hexagonal vertex-figure is indicated with dashed red lines in Figure 6.
Now the edges opposite to v on the pentagon faces of R at v are edges of the vertex-figure of P at v. Again two kinds of faces of P alternate at v. There are three regular convex pentagons “cutting across” R (shown in heavy red lines in Figure 6), and there are three pentagrams inscribed in the pentagon faces of R at v. Thus P is the uniform polyhedron $(5 \cdot \frac{5}{2})^3$ called the ditrigonal dodecadodecahedron (see [2]).

The second (blue) configuration of Figure 3 for the convex hull of the vertex-figure of P at v can be ruled out as follows. As for the first configuration of Figure 3, the vertex-figure of P at v must either be a convex hexagon identical with the convex hull, or a non-convex hexagon sharing three edges with the convex hull. In either case, each edge of the vertex-figure of P at v which is an edge of the convex hull is necessarily the vertex-figure of a face of P at v, and therefore must span, together with v, the plane of this face. As this plane contains only three vertices of R, the face itself could only be a (non-regular) triangle, so P could not be regular-faced. Thus the second configuration of Figure 3 cannot occur.

This completes the enumeration of the polyhedra P for Case 1.

We next investigate the second possibility for the placement of vertices of P relative to R. Recall that q is even.

Case 2: The vertices of P lie at the edge midpoints of R.

In this case necessarily q = 4 since now the vertices of P have only $D_2$-symmetry. Thus the vertex-figures are congruent planar 4-gons with $D_2$-symmetry. As pairs of dual Platonic solids yield the same set of edge midpoints up to similarity, it suffices to consider only the regular polyhedra R = {3, 3}, {3, 4}, and {5, 3}.

The first possibility can be ruled out immediately. If R = {3, 3}, then the (planar) faces of P must be regular triangles since R has only rotations of order 2 or 3.
The vertices of P are just those of a regular octahedron, and the vertex-figures are given by equatorial squares of this octahedron. Hence P must coincide with this octahedron. But then $G(P) \neq G(R)$, which contradicts our choice of R. (Recall that the octahedron occurred as the medial of {3, 3} in Case 1.) Thus this choice R does not contribute a polyhedron.

Figure 6: Faces of P at v, for the yellow (first) configuration of Figure 3 when the vertex-figure at v is a non-convex hexagon sharing edges with pentagonal faces of R = {5, 3} at v.

If R = {3, 4}, then the faces of P must be regular triangles or squares since R has only rotations of order 2, 3 or 4. Now the vertices of P are just those of a cuboctahedron $\genfrac{\{}{\}}{0pt}{}{3}{4}$. The vertex-figure at a vertex v of P must necessarily be convex. In fact, opposite vertices of the convex hull of the vertex-figure at v cannot be joined (in a bowtie fashion) by an edge of the vertex-figure, since otherwise the planar face at v determined by this edge would need to be central, in violation of our standing assumption in this section that P has no central faces. There are only two possible configurations (shown in yellow and light blue in Figure 7) for the four neighbours of v in P. In the first (yellow) configuration, the neighbors of v in P are the same as those of v in the cuboctahedron. In this case P must coincide with the cuboctahedron and thus be the medial of {3, 4}. In fact, the triangular faces of P at v must be just those of the cuboctahedron, and then this must also hold for the square faces. In the second (light blue) configuration, the neighbors of v in P are the antipodal points of those in the first configuration. But this choice can be ruled out since the triangular faces at v could not be regular.
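The vertex placement used in Case 2 for R = {3, 4} can be reproduced with a few lines of code. This is a minimal sketch under assumptions: the octahedron is given by the standard coordinates ±e_i, edges of R are taken to be the vertex pairs at minimal distance (valid for a convex regular polyhedron), and the helper names are hypothetical. It computes the edge midpoints and confirms that each has exactly four nearest neighbours, which is the first (yellow) configuration, that of the cuboctahedron.

```python
from itertools import combinations
from math import isclose


def dist2(u, v):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def edge_midpoints(verts):
    """Midpoints of the edges, where an edge is a closest pair of vertices."""
    d_min = min(dist2(u, v) for u, v in combinations(verts, 2))
    return [tuple((x + y) / 2 for x, y in zip(u, v))
            for u, v in combinations(verts, 2)
            if isclose(dist2(u, v), d_min)]


def nearest_neighbours(points, v):
    """Points of the set that are closest to v (excluding v itself)."""
    d_min = min(dist2(u, v) for u in points if u != v)
    return [u for u in points if u != v and isclose(dist2(u, v), d_min)]


OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1)]

MIDPOINTS = edge_midpoints(OCTAHEDRON)
VALENCIES = [len(nearest_neighbours(MIDPOINTS, m)) for m in MIDPOINTS]
```

Here `MIDPOINTS` contains the twelve points obtained from (±1/2, ±1/2, 0) by permuting coordinates, all at the same distance from o, and every entry of `VALENCIES` equals 4, matching q = 4. The same helpers could in principle be applied to the icosahedron to obtain the vertex set considered for R = {3, 5}.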
If R = {3, 5}, then the faces of P must be regular triangles, convex pentagons, or pentagrams, since R has only rotations of order 2, 3 or 5. Now the vertices of P are just those of an icosidodecahedron $\genfrac{\{}{\}}{0pt}{}{3}{5}$. Any triangular face of P must either be inscribed in a triangle face of R as shown in Figure 8(a), or have as its vertices the midpoints of the edges of R which emanate from the vertices of a triangle face of R but do not belong to the adjacent triangle faces (see Figure 8(b)). Clearly, not all faces of P can be triangles. Similarly, by the $D_5$-symmetry of the pentagonal faces of P, there can only be two possible configurations for the vertex sets of pentagonal faces of P. The convex hulls of these vertex sets are shown in Figures 8(c,d).

Figure 7: Possible convex hulls of the vertex-figures of P at v when the vertices of P lie at the edge midpoints of an octahedron.

Figure 8: Possible convex hulls of the faces of P when the vertices of P lie at the edge midpoints of {3, 5}.

First suppose that P indeed has a triangular face. If the triangular faces are positioned as in Figure 8(a), then the pentagonal faces must be convex and P itself must be an icosidodecahedron, the medial of the icosahedron. On the other hand, if the triangular faces are as in Figure 8(b), then the pentagonal faces must be pentagrams with vertex sets located as in Figure 8(d), and P itself must be $\genfrac{\{}{\}}{0pt}{}{3}{5/2}$, the medial of the Kepler-Poinsot polyhedron $\{3, \tfrac{5}{2}\}$ (or its dual $\{\tfrac{5}{2}, 3\}$).

If P has no triangular faces, then all its faces must be convex pentagons or pentagrams. If a pentagonal face with vertices located as in Figure 8(c) occurs, then this must be a pentagram face whose adjacent faces are convex pentagon faces with vertex sets located as in Figure 8(d). Then P must be $\genfrac{\{}{\}}{0pt}{}{5}{5/2}$, the medial of $\{5, \tfrac{5}{2}\}$ (or its dual $\{\tfrac{5}{2}, 5\}$).
Lastly, we can rule out the possibility that all pentagonal faces of P have vertex sets as in Figure 8(d). In fact, otherwise all faces of P must be convex pentagons, or all faces of P must be pentagrams, in both cases with four faces meeting at a vertex. But the vertex configurations of Figure 8(d) arising from two different vertices of the icosahedron R (each corresponding to a point like the central point in the figure) can never intersect in more than one point, so adjacent faces of P cannot both be convex pentagons or pentagrams.

This settles the enumeration of the polyhedra P for Case 2, and completes the proof of Proposition 3.2.

4 Planar-faced polyhedra with a central face

In this section, we treat the finite hereditary polyhedra P with planar regular faces some of which are central. So let P be a polyhedron of this kind. Recall from Section 2 that then each edge of P must lie in a central planar face and a non-central planar face, and that G(P) must have two face orbits, given by the central faces and the non-central faces of P, respectively. Moreover, the vertex valency q is even and the vertex-figures are planar.

First observe that the reflections in the perpendicular bisectors of edges of P are symmetries of P. In fact, the planar symmetry group of the non-central face at a given edge, trivially extended to a subgroup of G(P), is generated by plane reflections and in particular contains the reflection in the perpendicular bisector of this edge.

Next we need to analyze the way in which the planar symmetries of central faces F that interchange the two edges of F at a vertex of F appear in G(P). It turns out that there can only be two possible scenarios: either all such symmetries appear as plane reflections, or all appear as half-turns. In particular we will see that the first scenario will not occur.

The goal of this section is to prove the following proposition.

Proposition 4.1.
Let P be a (finite) hereditary polyhedron with regular planar faces, including some central faces. Then P is the medial of the Petrie dual of either a Platonic solid or a Kepler-Poinsot polyhedron.

Proof. The proof investigates the two possible ways (Cases 1 and 2 below, respectively) in which the planar reflective symmetries of a central face are extended to symmetries of P, namely either all as plane reflections or one half as plane reflections (interchanging the two vertices of an edge) and the other half as half-turns (interchanging the two edges at a vertex). The behavior is uniform across all central faces, since any two central faces are equivalent under G(P). If the reflective symmetries of all central faces are extended to P by plane reflections, then the reflective symmetries of all faces of P are extended to P by plane reflections, since we know this to be true for the non-central faces.

Now suppose P is a finite hereditary polyhedron with regular planar faces, including some central faces.

Case 1: The planar reflective symmetries of central faces are extended to P by plane reflections.

We show that this case does not occur; in other words, there are no polyhedra with central faces in which all reflective symmetries of all faces are extended by plane reflections.

Suppose that P is a polyhedron with central faces such that all planar reflective symmetries of all faces are extended to P by plane reflections. Then all subgroups of G(P) extending planar symmetry groups of faces of P are generated by plane reflections. Again, as in Section 3, the vertices of P must lie at the vertices or the edge midpoints of a Platonic solid R with G(R) = G(P).

The case R = {3, 3} can be eliminated as follows. In this case the vertices of P could only be the six edge midpoints of R, since otherwise P could not have a central face.
Then the central faces of P could only be given by the three equatorial squares of the octahedron formed by these six vertices, and the non-central faces by four alternate triangle faces of this octahedron. Thus P would have to be $\mathrm{Me}(\{3,3\}^{\pi})$, the medial of the Petrial of {3, 3}. However, the planar reflective symmetries of the central faces of $\mathrm{Me}(\{3,3\}^{\pi})$ that interchange the edges at a vertex are not extended to $\mathrm{Me}(\{3,3\}^{\pi})$ by plane reflections, so in the present context this polyhedron must be rejected by our case assumption. Note, however, that $\mathrm{Me}(\{3,3\}^{\pi})$ will occur as a legitimate polyhedron in Case 2.

Now let R = {3, 4}. If the vertices of P are just those of R, then again P must be the medial of the Petrial of a tetrahedron and can be eliminated as before (or here, alternatively, because G(P) ≠ G(R)). The case when the vertices of P are the edge midpoints of R (that is, the vertices of a cuboctahedron) can be ruled out as follows. Since a central face would need to have full dihedral symmetry, it could only be a triangle or square. There are no central triangles with $D_3$-symmetry spanned by vertices of a cuboctahedron, so the central faces could only be squares. However, the squares inscribed as central squares in the vertex-set of a cuboctahedron are such that each vertex of the cuboctahedron can only lie in one such square. Thus this possibility is excluded as well.

The cube R = {4, 3} also does not contribute a polyhedron. The case of vertex placements for P at the edge-midpoints of R is the same as for {3, 4} and can again be ruled out. The vertices of P also cannot lie at the vertices of R, since there are no central regular polygons spanned by vertices of the cube.

The two cases R = {3, 5} and R = {5, 3} similarly do not give a polyhedron. In fact, the central regular faces of P would have to be triangles, pentagons or pentagrams. But no such faces can be placed with full dihedral symmetry.
This applies to both kinds of vertex placements for P on R. In summary, Case 1 does not lead to a hereditary polyhedron of the desired kind.

Case 2: Some planar reflective symmetries of central faces are not extended to P by plane reflections.

We know from our previous discussion that the reflective symmetries of faces which are not extended to P by plane reflections are just the reflective symmetries of central faces which interchange the two edges at a vertex, and that these are extended to P by half-turns. Thus the subgroups of G(P) extending symmetry groups of central faces are generated by a plane reflection and a half-turn. In particular, the central faces must have an even number of vertices. The subgroups of G(P) extending symmetry groups of non-central faces still are generated by two plane reflections.

We show that q = 4 and that the vertices of P must lie on axes of 2-fold rotation. Suppose v is a vertex, F a central face at v, and G a non-central face at v adjacent to F. Let $r_F$ and $r_G$ respectively denote the extended symmetries of the faces F and G that interchange the edges at v. Then it is clear that the product $r_F r_G$ has order q/2. (Recall that q is even.) On the other hand, $r_F$ is the half-turn about the line through o and v, and $r_G$ is a reflection in a plane through o and v perpendicular to the plane of G. Hence, since the rotation axis of $r_F$ lies in the reflection plane of $r_G$, the product $r_F r_G$ must be a reflection in the plane which is perpendicular to the reflection plane of $r_G$ and meets this plane in the rotation axis of $r_F$. Thus q = 4, and there are just two central faces and two non-central faces meeting in alternating fashion at v.
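The composition used in this step can be checked numerically. The following Python sketch assumes, purely for illustration, coordinates in which the line through o and v is the z-axis and the reflection plane is the plane y = 0; these coordinates and all names in the code are assumptions and do not come from the paper.

```python
# Illustrative check (the coordinates are an assumption): compose the
# half-turn about the z-axis with a reflection in a plane containing it.

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Half-turn about the z-axis, playing the role of the symmetry of F.
half_turn = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]

# Reflection in the plane y = 0 (which contains the z-axis),
# playing the role of the symmetry of G.
reflection = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]

product = matmul(half_turn, reflection)

# The product is the reflection in the plane x = 0, which is perpendicular
# to the plane y = 0 and meets it in the z-axis.
assert product == [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]

# A reflection has order 2, so q/2 = 2 and hence q = 4.
assert matmul(product, product) == IDENTITY
```

In these assumed coordinates the two factors commute, so the order of composition does not affect the conclusion.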
If $F'$ and $G'$ respectively are the central and non-central faces of P at v distinct from F and G, and $r_{F'}$ and $r_{G'}$ are the extended symmetries of $F'$ and $G'$ defined in the same way as $r_F$ and $r_G$ for F and G, then necessarily $r_{F'} = r_F$ and $r_{G'} = r_G$, and $r_F r_G$ interchanges F and $F'$, and G and $G'$.

It follows as before that the vertices of P must lie at the vertices or edge midpoints of a Platonic solid R with G(R) = G(P). Clearly, by what we just said, the vertices of P could only lie at vertices of R if R = {3, 4} or {4, 3} (but below these possibilities will be ruled out as well).

If R = {3, 3} then vertex placements for P at the edge midpoints of R are possible precisely for the reason that they were ruled out under Case 1. In fact, the resulting polyhedron is $\mathrm{Me}(\{3,3\}^{\pi})$, the medial of the Petrie dual of {3, 3}, also known as the tetrahemihexahedron (see [2]). In $\mathrm{Me}(\{3,3\}^{\pi})$, the planar symmetries of the central faces that interchange the edges at a vertex indeed are extended by half-turns, not plane reflections. Thus R = {3, 3} contributes $\mathrm{Me}(\{3,3\}^{\pi})$.

Now let R = {3, 4}. In this case the vertex placements for P at the vertices of R can be ruled out, since the only possible candidate for a polyhedron, $\mathrm{Me}(\{3,3\}^{\pi})$, has a smaller symmetry group than R. This polyhedron occurred in the previous case for R. On the other hand, the vertex placements for P at the edge midpoints of R lead to two possible polyhedra, as we can see as follows. First note that, under the assumption of Case 2, the only possible central faces are the equatorial hexagons of the cuboctahedron determined by the edge midpoints of R, or triangles with vertices among those of an equatorial hexagon. The latter are excluded since the central faces must have an even number of vertices. Thus the central faces are the equatorial hexagons of the cuboctahedron. The non-central faces must necessarily be triangles or squares, as only these have dihedral symmetry.
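As a hedged numerical illustration of the equatorial hexagons, the following Python sketch assumes the standard coordinate model in which the cuboctahedron has the 12 vertices with one coordinate 0 and the other two equal to ±1. The coordinate model, the choice of normal vectors, and all names are assumptions made for this example. The sketch only counts the vertices on each central plane perpendicular to a 3-fold axis; it does not test regularity.

```python
from itertools import product

# Assumed coordinates: cuboctahedron vertices are the points with one
# coordinate 0 and the other two equal to +1 or -1 (12 points in total).
vertices = [p for p in product((-1, 0, 1), repeat=3)
            if sorted(abs(c) for c in p) == [0, 1, 1]]

# One normal vector for each of the four 3-fold rotation axes.
normals = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (-1, 1, 1)]

def dot(a, b):
    """Standard inner product of two integer triples."""
    return sum(x * y for x, y in zip(a, b))

# Vertices lying in the central plane perpendicular to each 3-fold axis.
hexagons = [[v for v in vertices if dot(v, n) == 0] for n in normals]

# Number of these central planes passing through each vertex.
hexagons_per_vertex = {v: sum(v in h for h in hexagons) for v in vertices}

assert len(vertices) == 12
assert all(len(h) == 6 for h in hexagons)
assert set(hexagons_per_vertex.values()) == {2}
```

Each of the four central planes contains six vertices, and every vertex lies on exactly two of them, which is consistent with two central faces meeting at each vertex when q = 4.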
In either case the non-central faces must be faces of the cuboctahedron. If the non-central faces are triangles, then P is the medial of the Petrie dual of a cube, $\mathrm{Me}(\{4,3\}^{\pi})$, also called the octahemioctahedron [2]. If the non-central faces are squares, then P is the medial of the Petrie dual of the octahedron, $\mathrm{Me}(\{3,4\}^{\pi})$, also called the cubohemioctahedron [2].

For R = {4, 3}, the polyhedron P cannot have its vertices at the vertices of R, since a central face could not be regular. On the other hand, by duality, the vertex placements for P at the edge-midpoints of R result in the same two polyhedra as in the previous case.

Now let R = {3, 5}. Suppose the vertices of P lie at the edge midpoints of R. The central faces all must have 2-fold, 3-fold, or 5-fold rotational symmetry, as well as an even number of vertices, and thus must be squares, hexagons, or decagons. Squares can be ruled out immediately. In fact, although a square can be placed as a central square with its vertices at edge midpoints of R, this cannot be done in such a way that all symmetries of the square extend to symmetries of P (or equivalently, R), so P could not be hereditary.

Figure 9: Central hexagonal faces with vertices at edge midpoints of an icosahedron.

On the other hand, hexagonal central faces indeed can occur. Figure 9 shows how the vertices of a central regular hexagon can be placed at the edge midpoints of R, in such a way that the half-turns about the edges of R that contain a vertex of this hexagon map the hexagon to itself. Each pair of antipodal vertices of this hexagon also lies in another central hexagon of the same kind. These two hexagons are interchanged by the reflection in the plane spanned by the pair of antipodal edges of R determining the common vertices of the hexagons.
Note that the six edges of R whose midpoints are the vertices of any such hexagon form a regular skew hexagon centered at o; this is a 2-zigzag of R (see [16, p. 196]). At each vertex v of P, two central hexagonal faces and two non-central faces meet in an alternating fashion. The angle at v between an edge of a central hexagon at v, and an edge of the other central hexagon at v, is either 2π/5 or 3π/5 (see again Figure 9). Thus, in between the two central hexagons meeting at v can fit only two regular convex pentagons or two regular pentagrams. If the pentagonal faces are convex, then P is $\mathrm{Me}(\{\tfrac{5}{2},5\}^{\pi})$, the medial of the Petrie dual of the regular star polyhedron $\{\tfrac{5}{2}, 5\}$, with vertex-symbol $(5 \cdot 6^{*})^2$, also called the great dodecahemi-icosahedron [2]. If the pentagonal faces are pentagrams, then P is $\mathrm{Me}(\{5,\tfrac{5}{2}\}^{\pi})$, the medial of the Petrie dual of the regular star polyhedron $\{5, \tfrac{5}{2}\}$, with vertex-symbol $(\tfrac{5}{2} \cdot 6^{*})^2$, called the small dodecahemi-icosahedron [2].

There are also four hereditary polyhedra P where the central faces are regular decagons (and the vertices still are at the edge midpoints of R). Their central faces are regular convex decagons {10} or regular star decagons $\{\tfrac{10}{3}\}$, with each central face lying in a plane perpendicular to a 5-fold rotation axis of R.

If the central faces of P are convex decagons, then each pair of antipodal vertices of a central decagon also lies in another central decagon of the same kind, as shown in Figure 10. This only leaves room at a vertex for non-central faces which are regular triangles or regular convex pentagons. If the non-central faces are triangles, then P is $\mathrm{Me}(\{5,3\}^{\pi})$, the medial of the Petrie dual of a dodecahedron, with vertex-symbol $(3 \cdot 10^{*})^2$, known as the small icosihemidodecahedron [2]. If the non-central faces are convex pentagons, then P is $\mathrm{Me}(\{3,5\}^{\pi})$, the medial of the Petrie dual of an icosahedron, with vertex-symbol $(5 \cdot 10^{*})^2$, called the small dodecahemidodecahedron [2].

Figure 10: Central decagonal faces with vertices at edge midpoints of an icosahedron.

On the other hand, if the central faces of P are star decagons $\{\tfrac{10}{3}\}$, which we may picture as inscribed in a regular decagon of the kind shown in Figure 10, then the non-central faces must either be regular triangles or regular pentagrams $\{\tfrac{5}{2}\}$. If the non-central faces are triangles, then P is a great icosihemidodecahedron [2], $\mathrm{Me}(\{\tfrac{5}{2},3\}^{\pi})$, the medial of the Petrie dual of $\{\tfrac{5}{2}, 3\}$, with vertex-symbol $(3 \cdot (\tfrac{10}{3})^{*})^2$. If the non-central faces are pentagrams, then P is a great dodecahemidodecahedron [2], $\mathrm{Me}(\{3,\tfrac{5}{2}\}^{\pi})$, the medial of the Petrie dual of $\{3, \tfrac{5}{2}\}$, with vertex-symbol $(\tfrac{5}{2} \cdot (\tfrac{10}{3})^{*})^2$.

Finally, appealing to duality, for R = {5, 3} the vertex placements for P at the edge midpoints of R produce the same four polyhedra as in the previous case.

This settles the enumeration of the polyhedra in Case 2. Now the proof of Proposition 4.1 is complete.

The final step of the proof of Theorem 1.1 consists of drawing together Propositions 3.1 and 4.1. Proposition 3.1 describes the finite polyhedra with no central planar face, while Proposition 4.1 deals with the polyhedra that have central planar faces. This leads to the desired result.

5 The enumeration

As pointed out earlier, several polyhedra listed in Theorem 1.1 are counted more than once in the theorem. For example, each pair of dual finite regular polyhedra gives the same medial. The Platonic solids and Kepler-Poinsot polyhedra each have a geometric dual which is also regular, but their Petrie duals do not. The Petrie duals of course have combinatorial duals, but these are not realizable as regular geometric polyhedra in $E^3$.
Leaving aside the octahedron, which is already counted in the list of Platonic solids but also occurs as the medial of the tetrahedron, we therefore can obtain at most 4 + 9 = 13 different medials (other than the octahedron) from regular polyhedra. This then leaves at most 25 possible polyhedra. Inspection of the 25 polyhedra shows that these are indeed different, that is, mutually geometrically non-similar. The arguments are based on a comparison of the vertex-symbols as well as on the existence and nature of central faces (if any).

In Table 1, we list the 25 polyhedra along with the refined vertex-symbols, symmetry groups, and relevant internal references. Recall that the superscript π denotes the Petrie dual, and that the superscript “∗” in a vertex-symbol means that the corresponding face type represents a central face of the polyhedron. For example, the medial of the Petrie dual of the Kepler-Poinsot polyhedron $\{\tfrac{5}{2}, 3\}$, denoted $\mathrm{Me}(\{\tfrac{5}{2},3\}^{\pi})$, has vertex-symbol $(3 \cdot (\tfrac{10}{3})^{*})^2$, indicating that two regular triangles {3} and two central regular star-decagons $\{\tfrac{10}{3}\}$ alternate at a vertex. A vertex-symbol only contains a superscript “∗” if the polyhedron has a central face. The next to last column lists the symmetry groups, with [p, q] denoting the symmetry group of the Platonic solid {p, q}. The last column of the table gives the internal reference where the corresponding polyhedron is described or derived; for example, 4.1/C2 means “Proposition 4.1, Case 2 of its proof”.

ORCID iDs

Egon Schulte https://orcid.org/0000-0001-9725-3589
Asia Ivić Weiss https://orcid.org/0000-0003-4937-2246

References

[1] H. S. M. Coxeter, Regular Polytopes, Dover Publications, New York, 3rd edition, 1973.
[2] H. S. M. Coxeter, M. S. Longuet-Higgins and J. C. P. Miller, Uniform polyhedra, Philos. Trans. Roy. Soc. London. Ser. A. 246 (1954), 401–450, doi:10.1098/rsta.1954.0003.
[3] H. S. M. Coxeter and W. O. J. Moser, Generators and Relations for Discrete Groups, volume 14 of Ergebnisse der Mathematik und ihrer Grenzgebiete, Springer-Verlag, Berlin, 4th edition, 1980.
[4] A. W. M. Dress, A combinatorial theory of Grünbaum’s new regular polyhedra, Part I: Grünbaum’s new regular polyhedra and their automorphism group, Aequationes Math. 23 (1981), 252–265, doi:10.1007/BF02188039.
[5] A. W. M. Dress, A combinatorial theory of Grünbaum’s new regular polyhedra, Part II: Complete enumeration, Aequationes Math. 29 (1985), 222–243, doi:10.1007/BF02189831.
[6] B. Grünbaum, Regular polyhedra—old and new, Aequationes Math. 16 (1977), 1–20, doi:10.1007/BF01836414.
[7] B. Grünbaum, “New” uniform polyhedra, in: A. Bezdek (ed.), Discrete Geometry, Dekker, New York, volume 253 of Monographs and Textbooks in Pure and Applied Mathematics, pp. 331–350, 2003, doi:10.1201/9780203911211.ch23, in honor of W. Kuperberg’s 60th birthday.
[8] I. Hubard, M. del Río Francos, A. Orbanić and T. Pisanski, Medial symmetry type graphs, Electron. J. Combin. 20 (2013), #P29 (28 pages), https://www.combinatorics.org/ojs/index.php/eljc/article/view/v20i3p29.
[9] I. Hubard, A. Orbanić and A. Ivić Weiss, Monodromy groups and self-invariance, Canad. J. Math. 61 (2009), 1300–1324, doi:10.4153/CJM-2009-061-5.
[10] N. W. Johnson, Uniform Polytopes, unpublished book manuscript.
[11] N. W. Johnson, Convex polyhedra with regular faces, Canad. J. Math. 18 (1966), 169–200, doi:10.4153/CJM-1966-021-8.
[12] H. Martini, A hierarchical classification of Euclidean polytopes with regularity properties, in: T. Bisztriczky, P. McMullen, R. Schneider and A. I. Weiss (eds.), Polytopes: Abstract, Convex and Computational, Kluwer Academic Publishers, Dordrecht, volume 440 of NATO Advanced Science Institutes Series C: Mathematical and Physical Sciences, pp. 71–96, 1994, doi:10.1007/978-94-011-0924-6_4, proceedings of the NATO Advanced Study Institute held in Scarborough, Ontario, August 20 – September 3, 1993.
[13] P. McMullen, Quasi-regular polytopes of full rank, in preparation.
[14] P. McMullen, Geometric Regular Polytopes, 2020, in press.
[15] P. McMullen and E. Schulte, Regular polytopes in ordinary space, Discrete Comput. Geom. 17 (1997), 449–478, doi:10.1007/PL00009304.
[16] P. McMullen and E. Schulte, Abstract Regular Polytopes, volume 92 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 2002, doi:10.1017/CBO9780511546686.
[17] M. Mixer, E. Schulte and A. I. Weiss, Hereditary polytopes, in: R. Connelly, A. Ivić Weiss and W. Whiteley (eds.), Rigidity and Symmetry, Springer, New York, volume 70 of Fields Institute Communications, pp. 279–302, 2014, doi:10.1007/978-1-4939-0781-6_14.
[18] D. Pellicer and E. Schulte, Regular polygonal complexes in space, I, Trans. Amer. Math. Soc. 362 (2010), 6679–6714, doi:10.1090/S0002-9947-2010-05128-1.
[19] T. Pisanski and B. Servatius, Configurations from a graphical viewpoint, Birkhäuser Advanced Texts: Basler Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks], Birkhäuser/Springer, New York, 2013, doi:10.1007/978-0-8176-8364-1.
[20] E. Schulte and A. I. Weiss, Skeletal geometric complexes and their symmetries, Math. Intelligencer 39 (2017), 5–16, doi:10.1007/s00283-016-9685-7.
[21] E. Schulte and A. Williams, Wythoffian skeletal polyhedra in ordinary space, I, Discrete Comput. Geom. 56 (2016), 657–692, doi:10.1007/s00454-016-9814-2.
[22] Wikipedia contributors, Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/.
[23] A. Williams, Wythoffian skeletal polyhedra in ordinary space, II, in preparation.
[24] A. Williams, Wythoffian Skeletal Polyhedra, Ph.D. thesis, Northeastern University, Boston, Massachusetts, 2015, https://search.proquest.com/docview/1680014921.
[25] Wolfram Research, Inc., Mathematica, Champaign, Illinois.

| Polyhedra | Description | Vertex-symbol | Group | Proposition/Case |
|---|---|---|---|---|
| Platonic | $\{3,3\}$ | $(3)^3$ | [3, 3] | 3.1 |
| | $\{3,4\} = \mathrm{Me}(\{3,3\})$ | $(3)^4$ | [3, 4] | 3.2/C1 |
| | $\{4,3\}$ | $(4)^3$ | [3, 4] | 3.1 |
| | $\{3,5\}$ | $(3)^5$ | [3, 5] | 3.1 |
| | $\{5,3\}$ | $(5)^3$ | [3, 5] | 3.1 |
| Kepler-Poinsot | $\{3,\tfrac{5}{2}\}$ | $(3)^5$ | [3, 5] | 3.1 |
| | $\{\tfrac{5}{2},3\}$ | $(\tfrac{5}{2})^3$ | [3, 5] | 3.1 |
| | $\{5,\tfrac{5}{2}\}$ | $(5)^5$ | [3, 5] | 3.1 |
| | $\{\tfrac{5}{2},5\}$ | $(\tfrac{5}{2})^5$ | [3, 5] | 3.1 |
| Medials | $\mathrm{Me}(\{3,4\}) = \mathrm{Me}(\{4,3\})$ | $(3 \cdot 4)^2$ | [3, 4] | 3.2/C2 |
| | $\mathrm{Me}(\{3,5\}) = \mathrm{Me}(\{5,3\})$ | $(3 \cdot 5)^2$ | [3, 5] | 3.2/C2 |
| | $\mathrm{Me}(\{3,\tfrac{5}{2}\}) = \mathrm{Me}(\{\tfrac{5}{2},3\})$ | $(3 \cdot \tfrac{5}{2})^2$ | [3, 5] | 3.2/C2 |
| | $\mathrm{Me}(\{5,\tfrac{5}{2}\}) = \mathrm{Me}(\{\tfrac{5}{2},5\})$ | $(5 \cdot \tfrac{5}{2})^2$ | [3, 5] | 3.2/C2 |
| | $\mathrm{Me}(\{3,3\}^{\pi})$ | $(3 \cdot 4^{*})^2$ | [3, 3] | 4.1/C2 |
| | $\mathrm{Me}(\{3,4\}^{\pi})$ | $(4 \cdot 6^{*})^2$ | [3, 4] | 4.1/C2 |
| | $\mathrm{Me}(\{4,3\}^{\pi})$ | $(3 \cdot 6^{*})^2$ | [3, 4] | 4.1/C2 |
| | $\mathrm{Me}(\{3,5\}^{\pi})$ | $(5 \cdot 10^{*})^2$ | [3, 5] | 4.1/C2 |
| | $\mathrm{Me}(\{5,3\}^{\pi})$ | $(3 \cdot 10^{*})^2$ | [3, 5] | 4.1/C2 |
| | $\mathrm{Me}(\{3,\tfrac{5}{2}\}^{\pi})$ | $(\tfrac{5}{2} \cdot (\tfrac{10}{3})^{*})^2$ | [3, 5] | 4.1/C2 |
| | $\mathrm{Me}(\{\tfrac{5}{2},3\}^{\pi})$ | $(3 \cdot (\tfrac{10}{3})^{*})^2$ | [3, 5] | 4.1/C2 |
| | $\mathrm{Me}(\{5,\tfrac{5}{2}\}^{\pi})$ | $(\tfrac{5}{2} \cdot 6^{*})^2$ | [3, 5] | 4.1/C2 |
| | $\mathrm{Me}(\{\tfrac{5}{2},5\}^{\pi})$ | $(5 \cdot 6^{*})^2$ | [3, 5] | 4.1/C2 |
| Exceptional | | $(3 \cdot 5)^3$ | [3, 5] | 3.2/C1 |
| | | $(3 \cdot \tfrac{5}{2})^3$ | [3, 5] | 3.2/C1 |
| | | $(5 \cdot \tfrac{5}{2})^3$ | [3, 5] | 3.2/C1 |

Table 1: The 25 finite hereditary polyhedra with planar regular faces in $E^3$.
ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.08
https://doi.org/10.26493/2590-9770.1326.9fd
(Also available at http://adam-journal.eu)

On median and quartile sets of ordered random variables*

Iztok Banič
Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška 160, SI-2000 Maribor, Slovenia, and Institute of Mathematics, Physics and Mechanics, Jadranska 19, SI-1000 Ljubljana, Slovenia, and Andrej Marušič Institute, University of Primorska, Muzejski trg 2, SI-6000 Koper, Slovenia

Janez Žerovnik
Faculty of Mechanical Engineering, University of Ljubljana, Aškerčeva 6, SI-1000 Ljubljana, Slovenia, and Institute of Mathematics, Physics and Mechanics, Jadranska 19, SI-1000 Ljubljana, Slovenia

Received 30 November 2018, accepted 13 August 2019, published online 21 August 2020

Abstract

We give new results about the set of all medians, the set of all first quartiles and the set of all third quartiles of a finite dataset. We also give new and interesting results about relationships between these sets. We also use these results to provide an elementary correctness proof of Langford’s doubling method.

Keywords: Statistics, probability, median, first quartile, third quartile, median set, first quartile set, third quartile set.

Math. Subj. Class. (2020): 62-07, 60E05, 60-08, 60A05, 62A01

*This work was supported in part by the Slovenian Research Agency (grants J1-8155, N1-0071, P2-0248, and J1-1693).
E-mail addresses: iztok.banic@um.si (Iztok Banič), janez.zerovnik@fs.uni-lj.si (Janez Žerovnik)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

1 Introduction

Quantiles play a fundamental role in statistics: they are the critical values used in hypothesis testing and interval estimation. Often they are the characteristics of distributions we usually wish to estimate. The use of quantiles as a primary measure of performance has gained prominence, particularly in microeconomic, financial and environmental analysis and others. Quartiles (i.e., the 0.25, 0.50, and 0.75 quantiles) are introduced very early in elementary statistics, e.g., for drawing box-and-whisker plots. Whereas there is no dispute that the median of an ordered dataset is either the middle element or the arithmetic mean of the two middle elements (when the number of elements is even), the situation is seemingly much more complicated when quartiles are considered. There are many well-known formulas and algorithms that give certain values, claiming these values to be medians (or quartiles) of given statistical data (for examples see [5]). However, the trouble begins when one realizes that different formulas (or algorithms) may give different values. Many authors or users of such formulas or algorithms go even further by taking the value obtained by such a formula or algorithm to be the definition of the median or the first quartile or the third quartile of the given data. As a result, going through the literature, one may find it very difficult to find and then choose an appropriate definition (formula, algorithm) of a median or a quartile for the statistical analysis of given data.

References to, and a comparison of, several methods for computing the quartiles of a finite data set that appear in the literature and in software are provided in [2, 5]. While it is well known that these methods do not always give the same results, Langford writes that the “situation is far worse than most realize” [5]. Although the differences tend to be small, Langford further answered the question “Why worry? The differences are small so who cares?” with the words of [1]:

“Before we go into any details, let us point out that the numerical differences between answers produced by the different methods are not necessarily large; indeed, they may be very small.
Yet if quartiles are used, say to establish criteria for making decisions, the method of their calculation becomes of critical concern. For instance, if sales quotas are established from historical data, and salespersons in the highest quarter of the quota are to receive bonuses, while those in the lowest quarter are to be fired, establishing these boundaries is of interest to both employer and employee. In addition, computer-software users are sometimes unaware of the fact that different methods can provide different answers to their problems, and they may not know which method of calculating quartiles is actually provided by their software.”

Langford [5] also proposes a method that is consistent with the CDF (cumulative distribution function). The method is slightly more complicated than some other methods used; however, it is not too involved, and there are equivalent methods that can be used in the classroom [10, 9]. Indeed, the discussion about quartiles in teaching elementary statistics is considerable, cf. [10, 1, 4, 5, 9]. In short, some of the elementary methods are based on the idea that a quartile is a median of the lower or the upper half of the dataset. The question arises what the half of a dataset is when it has an odd number of elements. Langford naturally answers with the idea of doubling the dataset, thus ensuring an even number of elements, while the quantile values remain the same.

On the positive side, it seems that all methods have one thing in common: they all expect the following to hold:

1. the median to be such a value $m \in \mathbb{R}$, for which at least half of the data is less than or equal to $m$ and at least half of the data is greater than or equal to $m$,
2. the first quartile to be such a value $q_1 \in \mathbb{R}$, for which at least a quarter of the data is less than or equal to $q_1$ and at least three quarters of the data is greater than or equal to $q_1$,
3. the third quartile to be such a value $q_3 \in \mathbb{R}$, for which at least three quarters of the data is less than or equal to $q_3$ and at least a quarter of the data is greater than or equal to $q_3$.

We will use this fact as a motivation to define the median set, the first quartile set, and the third quartile set of given data.

The main contribution of this paper is the idea to redefine the median, the quartiles, and, more generally, the quantiles as sets (intervals) instead of the usual consideration of these notions as real numbers. We indicate that in this way we may avoid the dispute caused by various methods, algorithms, and even definitions of quartiles. We also show that some methods for computing the quartiles do not extend to quartile sets, and provide an elementary method that can be used to compute the quartile sets.

The rest of the paper is organized as follows. The set of all medians $M(X)$ of $X$ is defined in Section 3, and in Section 4, the set of all first quartiles $Q_1(X)$ of $X$ and the set of all third quartiles $Q_3(X)$ of $X$ are defined. Main results about relationships among these sets are provided in Section 5. In Section 6, we recall some well-known methods for computing quartiles and show that one of them, Langford’s doubling method, can be used to compute the quartile sets.

2 Preliminaries

Here we introduce some basic notions that we use in the paper. Suppose that we have a finite ordered m-tuple $(y_1, y_2, y_3, \ldots, y_m) \in \mathbb{R}^m$ of some data such that $y_1 < y_2 < y_3 < \ldots < y_m$, together with the m-tuple of their frequencies $(k_1, k_2, k_3, \ldots, k_m) \in \mathbb{N}^m$. This means that the datum $y_i$ occurs $k_i$ times for each $i \in \{1, 2, 3, \ldots, m\}$. Let $k_1 + k_2 + k_3 + \ldots + k_m = n$. Then the random variable $Y$ defined by

$$Y \sim \begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_m \\ \frac{k_1}{n} & \frac{k_2}{n} & \frac{k_3}{n} & \cdots & \frac{k_m}{n} \end{pmatrix},$$

where $\frac{k_i}{n}$ is the probability $P(Y = y_i)$ for each $i \in \{1, 2, 3, \ldots, m\}$, represents these data.
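The frequency representation above can be illustrated with a short Python sketch. The function names `frequency_table` and `expand` and the sample data are hypothetical and chosen only for this example; the sketch also assumes that every frequency is a positive integer. It returns the probabilities k_i/n as exact fractions, together with the sorted list in which each datum y_i is written out k_i times.

```python
from fractions import Fraction

def frequency_table(values, counts):
    """Return the pairs (y_i, k_i / n) for strictly increasing values y_i."""
    if len(values) != len(counts):
        raise ValueError("values and counts must have the same length")
    if any(a >= b for a, b in zip(values, values[1:])):
        raise ValueError("values must be strictly increasing")
    if any(k < 1 for k in counts):
        raise ValueError("each frequency must be a positive integer")
    n = sum(counts)
    return [(y, Fraction(k, n)) for y, k in zip(values, counts)]

def expand(values, counts):
    """Write each datum y_i out k_i times, giving a sorted list of length n."""
    return [y for y, k in zip(values, counts) for _ in range(k)]

# Hypothetical data: 2 occurs once, 5 occurs three times, 7 occurs twice.
table = frequency_table([2, 5, 7], [1, 3, 2])
sample = expand([2, 5, 7], [1, 3, 2])

assert table == [(2, Fraction(1, 6)), (5, Fraction(1, 2)), (7, Fraction(1, 3))]
assert sample == [2, 5, 5, 5, 7, 7]
```

Exact fractions are used instead of floating-point numbers so that comparisons with thresholds such as 1/2 or 1/4 are not affected by rounding.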
One may represent the above data equivalently, using the random variable $X$ in the following way

$$X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix},$$

where $x_1 \le x_2 \le x_3 \le \ldots \le x_n$ and

$$\begin{aligned}
x_1 = x_2 = x_3 = \ldots = x_{k_1} &= y_1, \\
x_{k_1+1} = x_{k_1+2} = x_{k_1+3} = \ldots = x_{k_1+k_2} &= y_2, \\
x_{k_1+k_2+1} = x_{k_1+k_2+2} = x_{k_1+k_2+3} = \ldots = x_{k_1+k_2+k_3} &= y_3, \\
&\;\;\vdots \\
x_{k_1+k_2+\ldots+k_{m-1}+1} = x_{k_1+k_2+\ldots+k_{m-1}+2} = x_{k_1+k_2+\ldots+k_{m-1}+3} = \ldots = x_n &= y_m.
\end{aligned}$$

In this article, we will present data using such a random variable $X$. We will call such a random variable $X$ an ordered random variable. Using this notation, we define the set of all medians $M(X)$ of $X$, the set of all first quartiles $Q_1(X)$ of $X$, and the set of all third quartiles $Q_3(X)$ of $X$ in the following sections.

3 The median set of a random variable

We begin the section by giving the definition of a median and the median set of an ordered random variable.

Definition 3.1. Let $X$ be an ordered random variable, given by

$$X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix},$$

and let $x$ be any real number. We say that $x$ is a median of $X$, if

$$P(X \le x) \ge \frac{1}{2} \quad\text{and}\quad P(X \ge x) \ge \frac{1}{2}.$$

We call the set

$$M(X) = \{x \in \mathbb{R} \mid x \text{ is a median of } X\}$$

the median set of the random variable $X$.

In the following proposition we give an explicit description of the median set $M(X)$ for any ordered random variable $X$.

Proposition 3.2. Let $X$ be an ordered random variable, given by

$$X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}.$$

Then

$$M(X) = \begin{cases} \{x_k\} & \text{if } n = 2k-1 \text{ for some positive integer } k, \\ [x_k, x_{k+1}] & \text{if } n = 2k \text{ for some positive integer } k. \end{cases}$$

Proof. We consider the following two possible cases.

CASE 1: $n = 2k - 1$ for some positive integer $k$. Since

$$P(X \le x_k) = k \cdot \frac{1}{n} = \frac{n+1}{2} \cdot \frac{1}{n} = \frac{1}{2} + \frac{1}{2n} \ge \frac{1}{2}$$

and

$$P(X \ge x_k) = k \cdot \frac{1}{n} = \frac{n+1}{2} \cdot \frac{1}{n} = \frac{1}{2} + \frac{1}{2n} \ge \frac{1}{2},$$

it follows that $x_k \in M(X)$. Next, let $x < x_k$. Since

$$P(X \le x) \le P(X \le x_{k-1}) = (k-1) \cdot \frac{1}{n} = \frac{n-1}{2} \cdot \frac{1}{n} = \frac{1}{2} - \frac{1}{2n} < \frac{1}{2},$$

it follows that $x \notin M(X)$. Finally, let $x > x_k$.
Since
\[ P(X \ge x) \le P(X \ge x_{k+1}) = (k-1) \cdot \frac{1}{n} = \frac{n-1}{2} \cdot \frac{1}{n} = \frac{1}{2} - \frac{1}{2n} < \frac{1}{2}, \]
it follows that $x \notin M(X)$.

CASE 2: $n = 2k$ for some positive integer $k$. Let $x \in [x_k, x_{k+1}]$. Since
\[ P(X \le x) \ge P(X \le x_k) = k \cdot \frac{1}{n} = \frac{n}{2} \cdot \frac{1}{n} = \frac{1}{2} \]
and
\[ P(X \ge x) \ge P(X \ge x_{k+1}) = k \cdot \frac{1}{n} = \frac{n}{2} \cdot \frac{1}{n} = \frac{1}{2}, \]
it follows that $x \in M(X)$ for any $x \in [x_k, x_{k+1}]$. Next, let $x < x_k$. Since
\[ P(X \le x) \le P(X \le x_{k-1}) = (k-1) \cdot \frac{1}{n} = \frac{n-2}{2} \cdot \frac{1}{n} = \frac{1}{2} - \frac{1}{n} < \frac{1}{2}, \]
it follows that $x \notin M(X)$. Finally, let $x > x_{k+1}$. Since
\[ P(X \ge x) \le P(X \ge x_{k+2}) = (n-k-1) \cdot \frac{1}{n} = \frac{n-2}{2} \cdot \frac{1}{n} = \frac{1}{2} - \frac{1}{n} < \frac{1}{2}, \]
it follows that $x \notin M(X)$.

Note that for any ordered random variable $X$,
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}, \]
the following holds:

1. the median set $M(X)$ is nonempty,
2. the median set $M(X)$ is bounded and closed in $\mathbb{R}$,
3. $\max(M(X)) = \begin{cases} x_k & \text{if } n = 2k - 1 \text{ for some positive integer } k, \\ x_{k+1} & \text{if } n = 2k \text{ for some positive integer } k, \end{cases}$
4. $\min(M(X)) = \begin{cases} x_k & \text{if } n = 2k - 1 \text{ for some positive integer } k, \\ x_k & \text{if } n = 2k \text{ for some positive integer } k, \end{cases}$
5. $M(X) \cap \{x_1, x_2, x_3, \ldots, x_n\} = \begin{cases} \{x_k\} & \text{if } n = 2k - 1 \text{ for some positive integer } k, \\ \{x_k, x_{k+1}\} & \text{if } n = 2k \text{ for some positive integer } k. \end{cases}$

Clearly, statements (1) and (2) above, together with Proposition 3.2, imply

Fact 3.3. The median set $M(X)$ is either a singleton (one real number) or a closed interval.

We call the maximum $\max(M(X))$ of $M(X)$ the upper median of $X$ and we will always denote it by $m^1$; we call the minimum $\min(M(X))$ of $M(X)$ the lower median of $X$ and we will always denote it by $m^0$. The median
\[ m^{\frac{1}{2}} = \frac{\min(M(X)) + \max(M(X))}{2} = \begin{cases} x_k & \text{if } n = 2k - 1 \text{ for some positive integer } k, \\ \frac{x_k + x_{k+1}}{2} & \text{if } n = 2k \text{ for some positive integer } k \end{cases} \]
will be called the middle median of $X$ or the canonical value of the median of $X$.
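The description of the median set in Proposition 3.2 can be sketched in Python as follows. This is a minimal illustration, not code from the paper: the names `median_set`, `is_median` and `middle_median` are hypothetical, the median set is returned as a pair of endpoints (equal endpoints for a singleton), and `is_median` tests Definition 3.1 directly so that the closed form can be compared with the definition.

```python
def median_set(sample):
    """Endpoints (lower median, upper median) of M(X) for a nonempty sample."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be nonempty")
    if n % 2 == 1:            # n = 2k - 1, so M(X) = {x_k}
        k = (n + 1) // 2
        return xs[k - 1], xs[k - 1]
    k = n // 2                # n = 2k, so M(X) = [x_k, x_{k+1}]
    return xs[k - 1], xs[k]

def is_median(sample, x):
    """Definition 3.1: P(X <= x) >= 1/2 and P(X >= x) >= 1/2."""
    n = len(sample)
    at_most = sum(1 for v in sample if v <= x)
    at_least = sum(1 for v in sample if v >= x)
    return 2 * at_most >= n and 2 * at_least >= n

def middle_median(sample):
    """Midpoint of the median set (the canonical value of the median)."""
    lower, upper = median_set(sample)
    return (lower + upper) / 2
```

For the sample 1, 2, 3, 4 the median set is the interval [2, 3]; every point of that interval passes `is_median`, while 1.9 and 3.1 do not.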
4 The first and the third quartile sets of a random variable

We begin this section by giving the definition of a first and a third quartile as well as the first quartile set and the third quartile set of an ordered random variable.

Definition 4.1. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}, \]
and let $x$ be any real number. We say that $x$ is

1. a first quartile of $X$ if $P(X \le x) \ge \frac{1}{4}$ and $P(X \ge x) \ge \frac{3}{4}$,
2. a third quartile of $X$ if $P(X \le x) \ge \frac{3}{4}$ and $P(X \ge x) \ge \frac{1}{4}$.

We call the set $Q_1(X) = \{x \in \mathbb{R} \mid x \text{ is a first quartile of } X\}$ the first quartile set of the random variable $X$ and the set $Q_3(X) = \{x \in \mathbb{R} \mid x \text{ is a third quartile of } X\}$ the third quartile set of the random variable $X$.

In the following proposition we give an explicit description of the sets $Q_1(X)$ and $Q_3(X)$ for any ordered random variable $X$.

Proposition 4.2. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then
\[ Q_1(X) = \begin{cases} [x_k, x_{k+1}] & \text{if } n = 4k \text{ for some positive integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k \end{cases} \]
and
\[ Q_3(X) = \begin{cases} [x_{3k}, x_{3k+1}] & \text{if } n = 4k \text{ for some positive integer } k, \\ \{x_{3k+1}\} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ \{x_{3k+2}\} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ \{x_{3k+3}\} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k. \end{cases} \]

Proof. We consider the following four possible cases.

CASE 1: $n = 4k$ for some positive integer $k$. First we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_1(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_1(X)$. Then

• $\frac{\ell}{4k} \ge \frac{1}{4}$ holds, and $\frac{\ell}{4k} \ge \frac{1}{4} \iff \ell \ge k$,
• $\frac{4k - \ell + 1}{4k} \ge \frac{3}{4}$ holds, and $\frac{4k - \ell + 1}{4k} \ge \frac{3}{4} \iff \ell \le k + 1$.

Therefore, $x_\ell \in Q_1(X) \iff \ell \in \{k, k+1\}$. Therefore, it can easily be seen that $Q_1(X) = [x_k, x_{k+1}]$.
Next we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_3(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_3(X)$. Then

• $\frac{\ell}{4k} \ge \frac{3}{4}$ holds, and $\frac{\ell}{4k} \ge \frac{3}{4} \iff \ell \ge 3k$,
• $\frac{4k - \ell + 1}{4k} \ge \frac{1}{4}$ holds, and $\frac{4k - \ell + 1}{4k} \ge \frac{1}{4} \iff \ell \le 3k + 1$.

Therefore, $x_\ell \in Q_3(X) \iff \ell \in \{3k, 3k+1\}$. Therefore, it can easily be seen that $Q_3(X) = [x_{3k}, x_{3k+1}]$.

CASE 2: $n = 4k + 1$ for some non-negative integer $k$. First we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_1(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_1(X)$. Then

• $\frac{\ell}{4k+1} \ge \frac{1}{4}$ holds, and $\frac{\ell}{4k+1} \ge \frac{1}{4} \iff \ell \ge k + \frac{1}{4}$,
• $\frac{4k + 1 - \ell + 1}{4k+1} \ge \frac{3}{4}$ holds, and $\frac{4k - \ell + 2}{4k+1} \ge \frac{3}{4} \iff \ell \le k + \frac{5}{4}$.

Therefore, $x_\ell \in Q_1(X) \iff \ell = k + 1$. Therefore, it can easily be seen that $Q_1(X) = \{x_{k+1}\}$.

Next we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_3(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_3(X)$. Then

• $\frac{\ell}{4k+1} \ge \frac{3}{4}$ holds, and $\frac{\ell}{4k+1} \ge \frac{3}{4} \iff \ell \ge 3k + \frac{3}{4}$,
• $\frac{4k + 1 - \ell + 1}{4k+1} \ge \frac{1}{4}$ holds, and $\frac{4k - \ell + 2}{4k+1} \ge \frac{1}{4} \iff \ell \le 3k + \frac{7}{4}$.

Therefore, $x_\ell \in Q_3(X) \iff \ell = 3k + 1$. Therefore, it can easily be seen that $Q_3(X) = \{x_{3k+1}\}$.

CASE 3: $n = 4k + 2$ for some non-negative integer $k$. First we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_1(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_1(X)$. Then

• $\frac{\ell}{4k+2} \ge \frac{1}{4}$ holds, and $\frac{\ell}{4k+2} \ge \frac{1}{4} \iff \ell \ge k + \frac{1}{2}$,
• $\frac{4k + 2 - \ell + 1}{4k+2} \ge \frac{3}{4}$ holds, and $\frac{4k - \ell + 3}{4k+2} \ge \frac{3}{4} \iff \ell \le k + \frac{3}{2}$.

Therefore, $x_\ell \in Q_1(X) \iff \ell = k + 1$. Therefore, it can easily be seen that $Q_1(X) = \{x_{k+1}\}$.

Next we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_3(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_3(X)$. Then

• $\frac{\ell}{4k+2} \ge \frac{3}{4}$ holds, and $\frac{\ell}{4k+2} \ge \frac{3}{4} \iff \ell \ge 3k + \frac{3}{2}$,
• $\frac{4k + 2 - \ell + 1}{4k+2} \ge \frac{1}{4}$ holds, and $\frac{4k - \ell + 3}{4k+2} \ge \frac{1}{4} \iff \ell \le 3k + \frac{5}{2}$.
Therefore, $x_\ell \in Q_3(X) \iff \ell = 3k + 2$. Therefore, it can easily be seen that $Q_3(X) = \{x_{3k+2}\}$.

CASE 4: $n = 4k + 3$ for some non-negative integer $k$. First we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_1(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_1(X)$. Then

• $\frac{\ell}{4k+3} \ge \frac{1}{4}$ holds, and $\frac{\ell}{4k+3} \ge \frac{1}{4} \iff \ell \ge k + \frac{3}{4}$,
• $\frac{4k + 3 - \ell + 1}{4k+3} \ge \frac{3}{4}$ holds, and $\frac{4k - \ell + 4}{4k+3} \ge \frac{3}{4} \iff \ell \le k + \frac{7}{4}$.

Therefore, $x_\ell \in Q_1(X) \iff \ell = k + 1$. Therefore, it can easily be seen that $Q_1(X) = \{x_{k+1}\}$.

Finally, we find all $\ell \in \{1, 2, 3, \ldots, n\}$ such that $x_\ell \in Q_3(X)$. Suppose that $\ell \in \{1, 2, 3, \ldots, n\}$ is an integer such that $x_\ell \in Q_3(X)$. Then

• $\frac{\ell}{4k+3} \ge \frac{3}{4}$ holds, and $\frac{\ell}{4k+3} \ge \frac{3}{4} \iff \ell \ge 3k + \frac{9}{4}$,
• $\frac{4k + 3 - \ell + 1}{4k+3} \ge \frac{1}{4}$ holds, and $\frac{4k - \ell + 4}{4k+3} \ge \frac{1}{4} \iff \ell \le 3k + \frac{13}{4}$.

Therefore, $x_\ell \in Q_3(X) \iff \ell = 3k + 3$. Therefore, it can easily be seen that $Q_3(X) = \{x_{3k+3}\}$.

Note that for any ordered random variable $X$,
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}, \]
the following holds:

1. the sets $Q_1(X)$ and $Q_3(X)$ are both nonempty,
2. the sets $Q_1(X)$ and $Q_3(X)$ are both bounded and closed in $\mathbb{R}$,
3. $\max(Q_1(X)) = \begin{cases} x_{k+1} & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k, \end{cases}$
4. $\min(Q_1(X)) = \begin{cases} x_k & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k, \end{cases}$
5. $\max(Q_3(X)) = \begin{cases} x_{3k+1} & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{3k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{3k+2} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{3k+3} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k, \end{cases}$
6. $\min(Q_3(X)) = \begin{cases} x_{3k} & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{3k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{3k+2} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{3k+3} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k, \end{cases}$
7. $Q_1(X) \cap \{x_1, x_2, x_3, \ldots, x_n\} = \begin{cases} \{x_k, x_{k+1}\} & \text{if } n = 4k \text{ for some positive integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ \{x_{k+1}\} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k, \end{cases}$
8. $Q_3(X) \cap \{x_1, x_2, x_3, \ldots, x_n\} = \begin{cases} \{x_{3k}, x_{3k+1}\} & \text{if } n = 4k \text{ for some positive integer } k, \\ \{x_{3k+1}\} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ \{x_{3k+2}\} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ \{x_{3k+3}\} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k. \end{cases}$

Similarly as for the median, we observe that

Fact 4.3. The quartile sets $Q_1(X)$ and $Q_3(X)$ are either singletons (one real number) or closed intervals.

We call the maximum $\max(Q_1(X))$ and the minimum $\min(Q_1(X))$ of $Q_1(X)$ the upper first quartile and the lower first quartile of $X$ respectively, and we will denote them by $q_1^1$ and $q_1^0$ respectively. The first quartile
\[ q_1^{\frac{1}{2}} = \frac{\min(Q_1(X)) + \max(Q_1(X))}{2} = \begin{cases} \frac{x_k + x_{k+1}}{2} & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{k+1} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k \end{cases} \]
will be called the middle first quartile of $X$ (or the canonical value of the first quartile). We call the maximum $\max(Q_3(X))$ and the minimum $\min(Q_3(X))$ of $Q_3(X)$ the upper third quartile and the lower third quartile of $X$ respectively, and we will always denote them by $q_3^1$ and $q_3^0$ respectively.
The third quartile
\[ q_3^{\frac{1}{2}} = \frac{\min(Q_3(X)) + \max(Q_3(X))}{2} = \begin{cases} \frac{x_{3k} + x_{3k+1}}{2} & \text{if } n = 4k \text{ for some positive integer } k, \\ x_{3k+1} & \text{if } n = 4k + 1 \text{ for some non-negative integer } k, \\ x_{3k+2} & \text{if } n = 4k + 2 \text{ for some non-negative integer } k, \\ x_{3k+3} & \text{if } n = 4k + 3 \text{ for some non-negative integer } k \end{cases} \]
will be called the middle third quartile of $X$ (or the canonical value of the third quartile).

5 Main results

In the present section we formulate and prove our main theorems. We start with the following definition.

Definition 5.1. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then $2X$ is the ordered random variable defined by
\[ 2X \sim \begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_{2n} \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} \end{pmatrix}, \]
where $y_{2i-1} = y_{2i} = x_i$ for each $i \in \{1, 2, 3, \ldots, n\}$.

The following theorem says that the set of all medians of $X$ may be obtained by calculating the set of all medians of $2X$.

Theorem 5.2. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then $M(X) = M(2X)$.

Proof. Let
\[ 2X \sim \begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_{2n} \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} \end{pmatrix}. \]
We look at the following two possible cases.

CASE 1: $n = 2k - 1$ for some positive integer $k$. By Proposition 3.2 and by the definition of $2X$, the following holds:
\[ M(2X) = [y_{2k-1}, y_{2k}] = [x_k, x_k] = \{x_k\} = M(X). \]

CASE 2: $n = 2k$ for some positive integer $k$. By Proposition 3.2 and by the definition of $2X$, the following holds:
\[ M(2X) = [y_{2k}, y_{2k+1}] = [x_k, x_{k+1}] = M(X). \]

In the following theorem, the ordered random variable $4X$ is defined to be the ordered random variable $2(2X)$. The theorem says that the set of all first (third) quartiles of $X$ may be obtained by calculating the set of all first (third) quartiles of $4X$.

Theorem 5.3. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then $Q_1(X) = Q_1(4X)$ and $Q_3(X) = Q_3(4X)$.

Proof.
Let
\[ 4X \sim \begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_{4n} \\ \frac{1}{4n} & \frac{1}{4n} & \frac{1}{4n} & \cdots & \frac{1}{4n} \end{pmatrix}. \]
We look at the following four possible cases.

CASE 1: $n = 4k$ for some positive integer $k$. By Proposition 4.2 and by the definition of $4X$, the following holds:
\[ Q_1(4X) = [y_n, y_{n+1}] = [x_k, x_{k+1}] = Q_1(X) \quad \text{and} \quad Q_3(4X) = [y_{3n}, y_{3n+1}] = [x_{3k}, x_{3k+1}] = Q_3(X). \]

CASE 2: $n = 4k + 1$ for some non-negative integer $k$. By Proposition 4.2 and by the definition of $4X$, the following holds:
\[ Q_1(4X) = [y_n, y_{n+1}] = [x_{k+1}, x_{k+1}] = \{x_{k+1}\} = Q_1(X) \quad \text{and} \quad Q_3(4X) = [y_{3n}, y_{3n+1}] = [x_{3k+1}, x_{3k+1}] = \{x_{3k+1}\} = Q_3(X). \]

CASE 3: $n = 4k + 2$ for some non-negative integer $k$. By Proposition 4.2 and by the definition of $4X$, the following holds:
\[ Q_1(4X) = [y_n, y_{n+1}] = [x_{k+1}, x_{k+1}] = \{x_{k+1}\} = Q_1(X) \quad \text{and} \quad Q_3(4X) = [y_{3n}, y_{3n+1}] = [x_{3k+2}, x_{3k+2}] = \{x_{3k+2}\} = Q_3(X). \]

CASE 4: $n = 4k + 3$ for some non-negative integer $k$. By Proposition 4.2 and by the definition of $4X$, the following holds:
\[ Q_1(4X) = [y_n, y_{n+1}] = [x_{k+1}, x_{k+1}] = \{x_{k+1}\} = Q_1(X) \quad \text{and} \quad Q_3(4X) = [y_{3n}, y_{3n+1}] = [x_{3k+3}, x_{3k+3}] = \{x_{3k+3}\} = Q_3(X). \]

In the definitions and the results that follow we try to mimic statistical methods that suggest the following well-known strategy. To find a first or a third quartile, split the data into two halves and find the medians of these halves.

Definition 5.4. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_{2n} \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} \end{pmatrix}. \]
Then $\frac{1}{2}X^-$ is the ordered random variable given by
\[ \tfrac{1}{2}X^- \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix} \]
and $\frac{1}{2}X^+$ is the ordered random variable given by
\[ \tfrac{1}{2}X^+ \sim \begin{pmatrix} x_{n+1} & x_{n+2} & x_{n+3} & \cdots & x_{2n} \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]

We continue with the following theorem, which gives a relationship between $M(\frac{1}{2}X^-)$ and $Q_1(X)$, and between $M(\frac{1}{2}X^+)$ and $Q_3(X)$.

Theorem 5.5. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_{2n} \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} \end{pmatrix}. \]
Then $M(\frac{1}{2}X^-) = Q_1(X)$ and $M(\frac{1}{2}X^+) = Q_3(X)$.

Proof.
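We look at the following two possible cases.

CASE 1: $n = 2k - 1$ for some positive integer $k$. By Propositions 3.2 and 4.2, and by the definition of $\frac{1}{2}X^-$ and $\frac{1}{2}X^+$, the following holds:
\[ M(\tfrac{1}{2}X^-) = \{x_k\} = Q_1(X) \quad \text{and} \quad M(\tfrac{1}{2}X^+) = \{x_{n+k}\} = \{x_{3k-1}\} = Q_3(X). \]

CASE 2: $n = 2k$ for some positive integer $k$. By Propositions 3.2 and 4.2, and by the definition of $\frac{1}{2}X^-$ and $\frac{1}{2}X^+$, the following holds:
\[ M(\tfrac{1}{2}X^-) = [x_k, x_{k+1}] = Q_1(X) \quad \text{and} \quad M(\tfrac{1}{2}X^+) = [x_{n+k}, x_{n+k+1}] = [x_{3k}, x_{3k+1}] = Q_3(X). \]

Note that $\frac{1}{2}X^-$ and $\frac{1}{2}X^+$ can only be obtained if the number of elements of $X$ is even. The following definition generalizes the notion of $\frac{1}{2}X^-$ and $\frac{1}{2}X^+$ to define the lower and upper parts of $X$ in any proportion for arbitrary $n$.

Definition 5.6. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}, \]
and let $x \in [x_1, x_n]$ be any real number. Then we define the ordered random variables $L^c_x$, $L^o_x$, $U^c_x$, and $U^o_x$ by
\[ L^c_x \sim \begin{cases} \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_k \\ \frac{1}{k} & \frac{1}{k} & \frac{1}{k} & \cdots & \frac{1}{k} \end{pmatrix} & \text{if } x = x_k \text{ for some } k, \\[2ex] \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_k \\ \frac{1}{k} & \frac{1}{k} & \frac{1}{k} & \cdots & \frac{1}{k} \end{pmatrix} & \text{if } x_k < x < x_{k+1} \text{ for some } k, \end{cases} \]
\[ L^o_x \sim \begin{cases} \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_{k-1} \\ \frac{1}{k-1} & \frac{1}{k-1} & \frac{1}{k-1} & \cdots & \frac{1}{k-1} \end{pmatrix} & \text{if } x = x_k \text{ for some } k, \\[2ex] \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_k \\ \frac{1}{k} & \frac{1}{k} & \frac{1}{k} & \cdots & \frac{1}{k} \end{pmatrix} & \text{if } x_k < x < x_{k+1} \text{ for some } k, \end{cases} \]
\[ U^c_x \sim \begin{cases} \begin{pmatrix} x_k & x_{k+1} & x_{k+2} & \cdots & x_n \\ \frac{1}{n-k+1} & \frac{1}{n-k+1} & \frac{1}{n-k+1} & \cdots & \frac{1}{n-k+1} \end{pmatrix} & \text{if } x = x_k \text{ for some } k, \\[2ex] \begin{pmatrix} x_{k+1} & x_{k+2} & \cdots & x_n \\ \frac{1}{n-k} & \frac{1}{n-k} & \cdots & \frac{1}{n-k} \end{pmatrix} & \text{if } x_k < x < x_{k+1} \text{ for some } k, \end{cases} \]
\[ U^o_x \sim \begin{cases} \begin{pmatrix} x_{k+1} & x_{k+2} & \cdots & x_n \\ \frac{1}{n-k} & \frac{1}{n-k} & \cdots & \frac{1}{n-k} \end{pmatrix} & \text{if } x = x_k \text{ for some } k, \\[2ex] \begin{pmatrix} x_{k+1} & x_{k+2} & \cdots & x_n \\ \frac{1}{n-k} & \frac{1}{n-k} & \cdots & \frac{1}{n-k} \end{pmatrix} & \text{if } x_k < x < x_{k+1} \text{ for some } k. \end{cases} \]

The ordered random variables $L^o_x$, $L^c_x$, $U^o_x$, and $U^c_x$ can be called, respectively, the open and closed lower parts and the open and closed upper parts of $X$ relative to $x$. From the definitions it directly follows:

Proposition 5.7.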
Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then

1. $L^c_x \supseteq L^o_x$ and $U^c_x \supseteq U^o_x$ for any $x \in [x_1, x_n]$,
2. $L^o_x \cap U^o_x = \emptyset$ for any $x \in [x_1, x_n]$,
3. if $x = x_k$ for some $k$, then $L^c_x \cap U^c_x = \{x\}$,
4. if $x \ne x_k$ for every $k$, then $L^o_x \cup U^o_x = X$,
5. $L^c_x \cup U^c_x = X$ for any $x \in [x_1, x_n]$.

Furthermore, the following theorem holds.

Theorem 5.8. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
If $n = 2k$ for some $k$, then for any median $m \in M(X)$ with $m \ne m^0$ and $m \ne m^1$, we have

(1) $L^c_m = L^o_m = \frac{1}{2}X^-$ and $U^c_m = U^o_m = \frac{1}{2}X^+$,
(2) $M(L^c_m) = M(L^o_m) = Q_1(X)$ and $M(U^c_m) = M(U^o_m) = Q_3(X)$.

Proof. Statement (1) follows directly from the definitions. Statement (2) follows from (1) and Theorem 5.5.

The situation is a bit more complicated for odd $n$. Recall that for an odd number of elements $n = 2\ell + 1$, the median $m = x_{\ell+1}$ is an element of $X$.

Theorem 5.9. Let $n$ be an odd integer and $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then

(1) if $n = 4k + 1$, then for the unique median $m = x_{2k+1}$ we have $M(L^c_m) = Q_1(X) = \{x_{k+1}\} \subseteq M(L^o_m) = [x_k, x_{k+1}]$ and $M(U^c_m) = Q_3(X) = \{x_{3k+1}\} \subseteq M(U^o_m) = [x_{3k+1}, x_{3k+2}]$;
(2) if $n = 4k + 3$, then for the unique median $m = x_{2k+2}$ we have $M(L^o_m) = Q_1(X) = \{x_{k+1}\} \subseteq M(L^c_m) = [x_{k+1}, x_{k+2}]$ and $M(U^o_m) = Q_3(X) = \{x_{3k+3}\} \subseteq M(U^c_m) = [x_{3k+2}, x_{3k+3}]$.

Proof. The proof is straightforward and is left to the reader.

Thus from Theorem 5.8 we have learned that for $X$ with an even number of elements, taking any value strictly between the lower and the upper median to divide $X$ into the lower and the upper half, and computing the median sets of these halves, provides exactly the first and the third quartile sets. However, by Theorem 5.9, the situation is slightly more complicated for odd $n$. Two cases have to be distinguished, because the quartile sets are the median sets of the closed halves when $n = 4k + 1$ and the median sets of the open halves when $n = 4k + 3$.
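The distinction between open and closed halves can be explored with a short Python sketch. It is only an illustration under stated assumptions: the helper names `level_set`, `parts` and `check_theorem_5_9` are hypothetical, the sample values are assumed to be pairwise distinct (so that comparing values reproduces the index-based Definition 5.6), and the quartile and median sets are obtained by testing the defining inequalities at the sample points, which suffices because the endpoints of these sets are sample points.

```python
def level_set(xs, num, den):
    """Endpoints (min, max) of {x : P(X <= x) >= num/den and
    P(X >= x) >= 1 - num/den}, found by testing the sample points only."""
    n = len(xs)
    hits = [v for v in xs
            if den * sum(u <= v for u in xs) >= num * n
            and den * sum(u >= v for u in xs) >= (den - num) * n]
    return min(hits), max(hits)

def parts(xs, x):
    """Closed and open lower and upper parts relative to x (distinct values assumed)."""
    closed_lower = [v for v in xs if v <= x]
    open_lower = [v for v in xs if v < x]
    closed_upper = [v for v in xs if v >= x]
    open_upper = [v for v in xs if v > x]
    return closed_lower, open_lower, closed_upper, open_upper

def check_theorem_5_9(sample):
    """For odd n, compare the quartile sets with the median sets of the halves."""
    xs = sorted(sample)
    n = len(xs)
    if n % 2 == 0:
        raise ValueError("n must be odd")
    m = xs[n // 2]                       # the unique median
    lc, lo, uc, uo = parts(xs, m)
    q1, q3 = level_set(xs, 1, 4), level_set(xs, 3, 4)
    if n % 4 == 1:                       # closed halves reproduce the quartile sets
        return level_set(lc, 1, 2) == q1 and level_set(uc, 1, 2) == q3
    return level_set(lo, 1, 2) == q1 and level_set(uo, 1, 2) == q3   # n = 4k + 3
```

For the five values 10, 20, 30, 40, 50 the closed lower half 10, 20, 30 has median set {20}, which equals the first quartile set, while the open lower half 10, 20 has the strictly larger median set [10, 20].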
We conclude the section by stating and proving another interesting result, which does not depend on whether $n$ is even or odd. It gives an algorithm for obtaining the first and the third quartile sets of any data by doubling the data first, and then obtaining the median sets of the first and the second halves of the doubled data. The advantage of this method is that it works in both cases, for any even and for any odd $n$.

Theorem 5.10. Let $X$ be an ordered random variable, given by
\[ X \sim \begin{pmatrix} x_1 & x_2 & x_3 & \cdots & x_n \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \end{pmatrix}. \]
Then $M(\frac{1}{2}(2X)^-) = Q_1(X)$ and $M(\frac{1}{2}(2X)^+) = Q_3(X)$.

Proof. We distinguish the following four possible cases.

CASE 1: $n = 4k$ for some positive integer $k$. By Proposition 4.2, $Q_1(X) = [x_k, x_{k+1}]$ and $Q_3(X) = [x_{3k}, x_{3k+1}]$. In this case
\[ 2X \sim \begin{pmatrix} x_1 & x_1 & \cdots & x_{2k} & x_{2k} & x_{2k+1} & x_{2k+1} & \cdots & x_{n-1} & x_n & x_n \\ \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} \end{pmatrix}. \]
By Proposition 3.2, one can easily get that $M(\frac{1}{2}(2X)^-) = [x_k, x_{k+1}] = Q_1(X)$ and $M(\frac{1}{2}(2X)^+) = [x_{3k}, x_{3k+1}] = Q_3(X)$.

CASE 2: $n = 4k + 1$ for some non-negative integer $k$. By Proposition 4.2, $Q_1(X) = \{x_{k+1}\}$ and $Q_3(X) = \{x_{3k+1}\}$, and by Proposition 3.2, $M(X) = \{x_{2k+1}\}$. In this case
\[ 2X \sim \begin{pmatrix} x_1 & x_1 & x_2 & \cdots & x_{2k} & x_{2k+1} & x_{2k+1} & x_{2k+2} & \cdots & x_n & x_n \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} \end{pmatrix}. \]
By Proposition 3.2, $M(\frac{1}{2}(2X)^-) = \{x_{k+1}\} = Q_1(X)$ and $M(\frac{1}{2}(2X)^+) = \{x_{3k+1}\} = Q_3(X)$.

CASE 3: $n = 4k + 2$ for some non-negative integer $k$. By Proposition 4.2, $Q_1(X) = \{x_{k+1}\}$ and $Q_3(X) = \{x_{3k+2}\}$. In this case
\[ 2X \sim \begin{pmatrix} x_1 & x_1 & x_2 & \cdots & x_{2k+1} & x_{2k+2} & \cdots & x_{n-1} & x_n & x_n \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} \end{pmatrix}. \]
By Proposition 3.2, $M(\frac{1}{2}(2X)^-) = \{x_{k+1}\} = Q_1(X)$ and $M(\frac{1}{2}(2X)^+) = \{x_{3k+2}\} = Q_3(X)$.

CASE 4: $n = 4k + 3$ for some non-negative integer $k$. By Proposition 4.2, $Q_1(X) = \{x_{k+1}\}$ and $Q_3(X) = \{x_{3k+3}\}$.
In this case
\[ 2X \sim \begin{pmatrix} x_1 & x_1 & x_2 & x_2 & \cdots & x_{2k+2} & x_{2k+2} & \cdots & x_n & x_n \\ \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} & \cdots & \frac{1}{2n} & \frac{1}{2n} \end{pmatrix}. \]
By Proposition 3.2, $M(\frac{1}{2}(2X)^-) = \{x_{k+1}\} = Q_1(X)$ and $M(\frac{1}{2}(2X)^+) = \{x_{3k+3}\} = Q_3(X)$.

6 On some elementary methods for computing the quartiles

The usual methods for the computation of quartiles are based on the idea of splitting the dataset into two halves and obtaining the quartiles as the medians of the halves. The obvious question arises: "how to define the halves if the number of elements is odd?" As we know, it is answered differently, yielding different methods and, unfortunately, different results [5]. Three methods are among the most popular, the first two being often used in elementary textbooks. The third was proposed in [5] and argued to be accessible at an elementary level in [10]. All the methods below first compute the median of $X$ and then divide $X$ into two halves to obtain the quartiles as medians of the halves. However, when $n$ is odd, the methods differ as follows:

• Method M1. Include the median in both halves.
• Method M2. Exclude the median from both halves.
• Method L. If $n = 4k + 1$ then include the median. If $n = 4k + 3$ then exclude the median.

Method L was suggested by Langford [5], who shows that both M1 and M2 fail to provide correct answers in some cases. We say that a method or an algorithm for computing a first quartile of given data is correct if it gives a value $q$ with $q \in Q_1(X)$. We say that a method or an algorithm for computing a third quartile of given data is correct if it gives a value $q$ with $q \in Q_3(X)$.

Considering Theorem 5.9 immediately confirms that M1 and M2 are not correct. For example, for $n = 4k + 3$, method M1 gives $q_1$ as the median of the lowest $2k + 2$ elements, i.e. $\frac{1}{2}(x_{k+1} + x_{k+2})$, whereas $Q_1(X) = \{x_{k+1}\}$. Similarly, for $n = 4k + 1$, method M2 gives $q_1$ as the median of the lowest $2k$ elements, i.e. $\frac{1}{2}(x_k + x_{k+1})$, whereas $Q_1(X) = \{x_{k+1}\}$. Method L, however, naturally extends to the general case.

Theorem 6.1.
The L method is a correct algorithm for computing the quartile sets.

Proof. Let $n$ be even, say $n = 2k$. Then by method L, the first quartile is the median of the set $\{x_1, x_2, \ldots, x_k\}$, and the third quartile is the median of the set $\{x_{k+1}, x_{k+2}, \ldots, x_{2k}\}$, which is correct by Theorem 5.5.

Let $n$ be odd. If $n = 4k + 1$ then by method L, the first quartile is the median of the set $\{x_1, x_2, \ldots, x_{2k+1}\}$, and the third quartile is the median of the set $\{x_{2k+1}, x_{2k+2}, \ldots, x_{4k+1}\}$ (median included in both sets), which is correct by Theorems 5.8 and 5.9. If $n = 4k + 3$ then by method L, the first quartile is the median of the set $\{x_1, x_2, \ldots, x_{2k+1}\}$, and the third quartile is the median of $\{x_{2k+3}, x_{2k+4}, \ldots, x_{4k+3}\}$ (median excluded from both sets), which is correct by Theorems 5.8 and 5.9.

Another natural idea [5], equivalent to method L, can be extended to a method for computing the quartile sets. Instead of asking and answering the question whether to include or exclude the median when splitting the dataset into two halves, one can decide to give "half of the median" to each part. This can be realized by doubling the dataset and putting one copy of the median into each half. We call this Langford's doubling method. Recall that Theorem 5.10 implies that this method works correctly for the generalized definition of quartiles.

Theorem 6.2. The doubling method is a correct algorithm for computing the quartile sets.

In conclusion, one may ask how some other methods for computing quartiles are related to the generalized notion of median and quartiles. For example, assuming $n = 4k$, one could ask whether a method of interest gives quartile values that are within the quartile set. This may be good evidence that the method is sound. Finally, we wish to note that interval sets can be naturally associated with any quantiles, and an analogous theory may be developed.
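The three splitting rules and the doubling method can be compared in a few lines of Python. The sketch below is an illustration only: the function names `median_set`, `quartile_sets`, `halves_method` and `doubling_method` are hypothetical, the input is assumed to be an already sorted, nonempty Python list, and sets are represented by their pair of endpoints. The reference values come from the closed forms of Proposition 4.2.

```python
def median_set(xs):
    """Endpoints of M(X) for a sorted nonempty list (Proposition 3.2)."""
    n = len(xs)
    return xs[(n - 1) // 2], xs[n // 2]

def quartile_sets(xs):
    """Endpoints of Q1(X) and Q3(X) from the closed forms of Proposition 4.2."""
    n = len(xs)
    k, r = divmod(n, 4)
    if r == 0:
        return (xs[k - 1], xs[k]), (xs[3 * k - 1], xs[3 * k])
    return (xs[k], xs[k]), (xs[3 * k + r - 1], xs[3 * k + r - 1])

def halves_method(xs, rule):
    """Median sets of the two halves; rule is 'M1', 'M2' or 'L'."""
    n, h = len(xs), len(xs) // 2
    include_median = rule == "M1" or (rule == "L" and n % 4 == 1)
    if n % 2 == 0:
        lower, upper = xs[:h], xs[h:]
    elif include_median:
        lower, upper = xs[:h + 1], xs[h:]
    else:
        lower, upper = xs[:h], xs[h + 1:]
    return median_set(lower), median_set(upper)

def doubling_method(xs):
    """Double the data, split into two halves of n elements, take median sets."""
    doubled = sorted(xs + xs)
    n = len(xs)
    return median_set(doubled[:n]), median_set(doubled[n:])
```

With the seven values 1, ..., 7, rule M1 yields the interval [2, 3] for the lower half although the first quartile set is {2}; with the five values 1, ..., 5, rule M2 yields [1, 2] although the first quartile set is {2}. Rule L and the doubling method agree with Proposition 4.2 on both samples.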
ORCID iDs
Iztok Banič https://orcid.org/0000-0002-5097-2903
Janez Žerovnik https://orcid.org/0000-0002-6041-1106

References
[1] J. E. Freund and B. M. Perles, A new look at quartiles of ungrouped data, The American Statistician 41 (1987), 200–203, doi:10.1080/00031305.1987.10475479.
[2] R. Hyndman and Y. Fan, Sample quantiles in statistical packages, The American Statistician 50 (1996), 361–365, doi:10.1080/00031305.1996.10473566.
[3] C. Jentsch and A. Leucht, Bootstrapping sample quantiles of discrete data, 2014, https://madoc.bib.uni-mannheim.de/36588/.
[4] A. H. Joarder and M. Firozzaman, Quartiles for discrete data, Teaching Statistics 23, 86–89, doi:10.1111/1467-9639.00063.
[5] E. Langford, Quartiles in elementary statistics, Journal of Statistics Education 14 (2006), doi:10.1080/10691898.2006.11910589.
[6] Y. Ma, M. G. Genton and E. Parzen, Asymptotic properties of sample quantiles of discrete distributions, Ann. Inst. Statist. Math. 63 (2011), 227–243, doi:10.1007/s10463-008-0215-z.
[7] Y. Miao, Y.-X. Chen and S.-F. Xu, Asymptotic properties of the deviation between order statistics and p-quantile, Comm. Statist. Theory Methods 40 (2011), 8–14, doi:10.1080/03610920903350523.
[8] J. W. Tukey, Exploratory data analysis, volume 2, Reading, MA, 1977.
[9] J. Žerovnik, Računanje kvartilov v elementarni statistiki, Obzornik matematiko in fiziko 64 (2017), 20–31, http://www.dlib.si/?URN=URN:NBN:SI:DOC-9RRUOHKH.
[10] J. Žerovnik and D. Rupnik Poklukar, Elementary methods for computation of quartiles, Teaching Statistics 39, 88–91, doi:10.1111/test.12133.

ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.09
https://doi.org/10.26493/2590-9770.1285.fd8
(Also available at http://adam-journal.eu)

Sphere decompositions of hypercubes*

Richard H. Hammack†
Virginia Commonwealth University, Dept. of Mathematics, Richmond, VA 23284, USA

Paul C. Kainen
Georgetown University, Dept.
of Mathematics and Statistics, Washington, DC 20057, USA

Received 28 December 2018, accepted 28 August 2019, published online 23 August 2020

Abstract
For $d \equiv 1$ or $3 \pmod 6$, the 2-skeleton of the $d$-dimensional hypercube is decomposed into the union of pairwise face-disjoint isomorphic 2-complexes, each a topological sphere. If $d = 5^n$, then such a decomposition can be achieved, but with non-isomorphic spheres.

Keywords: Face-disjoint union of spheres, combinatorial design, 2-skeleton of a cube.
Math. Subj. Class. (2020): 57M20, 57M15, 05C45

By Euler's theorem [9, Prop. 1.2.27], any graph (1-complex) with all vertices of even degree is an edge-disjoint union of cycles. We say a 2-complex is even if every edge lies in a positive even number of (2-dimensional) faces. Is every even 2-complex a face-disjoint union of "2-dimensional cycles"? (A 2-complex $X$ is a face-disjoint union of 2-complexes $X_1, \ldots, X_n$ if $X = \bigcup_{i=1}^{n} X_i$ and each face of $X$ is a face of exactly one $X_i$.)

There are (at least) two natural choices for a 2-dimensional interpretation of cycle: sphere or manifold. As even complexes include surfaces like the torus, one cannot always decompose them into face-disjoint spheres. But we show below that sphere decompositions do exist in more than two-thirds of the odd-dimensional hypercubes. For $d \equiv 1$ or $3 \pmod 6$, we can decompose the 2-skeleton $Q_d^2$ of the $d$-dimensional hypercube $Q_d$ into face-disjoint copies of $\partial Q_3$, the boundary of a 3-cube. That is, $Q_d^2$ is factored by $\partial Q_3$.

In [6], when $d$ is odd (so the 2-skeleton is even), $Q_d^2$ is decomposed into a face-disjoint union of tori and 3-cube boundaries. In [4] we showed that the 2-skeleton of any $d$-dimensional Platonic polytope is a face-disjoint union of surfaces if the 2-skeleton is even. Except for the hypercubes, all such decompositions were decompositions into spheres. (A polytope is Platonic if it is maximally symmetric.
In dimension greater than four, the Platonic polytopes are just the cubes, simplexes, and hyperoctahedra.) For which odd $d$ is the 2-skeleton of the $d$-cube decomposable into spheres? For which $d$ can the decomposition be a factorization? We address these questions below.

*We thank the referees for feedback that has improved the paper.
†Supported by Simons Foundation Collaboration Grant for Mathematicians 523748.
E-mail addresses: rhammack@vcu.edu (Richard H. Hammack), kainen@georgetown.edu (Paul C. Kainen)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

Throughout this paper $I$ denotes the interval $[0, 1]$ and $O$ its boundary $O = \{0, 1\}$. (We use the non-standard notation $O$ for $\partial I$ because it will be convenient to think of an interval as being "active" ($I$) or "inactive" ($O$) in the manner indicated below.) We regard the $d$-cube as $Q_d = I^d \subseteq \mathbb{R}^d$. Thus the $2^d$ vertices of $Q_d$ are the elements of $O^d$, which we identify with the binary strings of length $d$. An edge of $Q_d$ is a line segment joining two vertices that differ in exactly one position (i.e., coordinate). Selecting a coordinate $i$ from 1 to $d$, there are $2^{d-1}$ edges among the connected components of $O \times O \times \cdots \times I \times \cdots \times O$, where the sole ("active") factor $I$ occurs in the $i$th position. Thus $Q_d$ has $d\,2^{d-1}$ edges. The faces of $Q_d$ are the squares that are the connected components of
\[ O \times \cdots \times I \times \cdots \times I \times \cdots \times O, \]
where exactly two of the factors are $I$'s and the rest are $O$'s. Thus $Q_d$ has $\binom{d}{2} 2^{d-2}$ faces, and the boundary of each face consists of four edges. Likewise $Q_d$ has $\binom{d}{3} 2^{d-3}$ 3-facets
\[ O \times \cdots \times I \times \cdots \times I \times \cdots \times I \times \cdots \times O, \]
formed by selecting three positions for the $I$'s. Each 3-facet is a 3-cube whose boundary consists of six faces. Similarly, $Q_d$ has $\binom{d}{k} 2^{d-k}$ $k$-facets for each $0 \le k \le d$, and each $k$-facet is a $k$-cube. The 2-skeleton, $Q_d^2$, of $Q_d$ is the union of all of its faces.
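The counts above can be checked by brute force. The following Python sketch encodes a $k$-facet of $Q_d$ as a string with the letter I in the active positions and a bit 0 or 1 in each inactive position; this string encoding and the names `k_facets` and `contains` are assumptions made for illustration, not notation from the paper.

```python
from itertools import combinations, product
from math import comb

def k_facets(d, k):
    """List the k-facets of Q_d as strings over 'I', '0', '1'."""
    facets = []
    for active in combinations(range(d), k):
        for bits in product("01", repeat=d - k):
            fixed = iter(bits)
            facets.append("".join("I" if i in active else next(fixed)
                                  for i in range(d)))
    return facets

def contains(big, small):
    """True if the facet `small` lies inside the facet `big`."""
    return all(b == "I" or b == s for b, s in zip(big, small))

# Number of faces of Q_7, and the number of faces through each edge of Q_5.
faces_q7 = len(k_facets(7, 2))
edges_q5, faces_q5 = k_facets(5, 1), k_facets(5, 2)
faces_per_edge = {sum(contains(f, e) for f in faces_q5) for e in edges_q5}
```

Here `faces_q7` reproduces $\binom{7}{2} 2^5$, and `faces_per_edge` collects how many faces contain each edge of $Q_5$, so a single value in that set means that every edge lies in the same number of faces.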
Notice that each edge of $Q_d$ belongs to $d - 1$ faces, so the 2-skeleton is even if and only if $d$ is odd. Hence $Q_d^2$ has no sphere decomposition if $d$ is even.

1 Sphere decompositions in dimensions 1 and 3 (mod 6)

Here we show that if $d = 3, 7, 9, 13, 15, 19, 21, \ldots$, that is, if $d \equiv 1$ or $3 \pmod 6$, then the 2-skeleton of $Q_d$ can be decomposed into a face-disjoint union of boundaries of 3-cubes.

We use combinatorial designs [1], [8, pp. 96–100]. Let $[d] := \{1, \ldots, d\}$. A $k$-design $S(k, d)$ on $[d]$ is a family of $k$-subsets of $[d]$ (called blocks) such that each 2-subset of $[d]$ is contained in a unique block. Though 3-designs are called Steiner triple systems, it was Kirkman [7] who proved that they exist if and only if $d \equiv 1$ or $3 \pmod 6$. Conditions that are algebraically necessary turned out to be combinatorially sufficient.

Before describing our general construction we illustrate it for $Q_7$. We will decompose the 2-skeleton of $Q_7$ into 112 pairwise face-disjoint 3-cube boundaries. The first step is to realize a Steiner triple system $S(3, 7)$. Label the vertices of a 7-gon with the integers 1 through 7, as in Figure 1. The shaded triangle on the left has vertices 1, 2 and 4, and any two of them are a distance of 1, 2 or 3 apart along the 7-gon. Rotating the triangle in multiples of $2\pi/7$ yields seven triangles, whose respective vertex sets are tallied below them. These are the blocks of $S(3, 7)$ because any two vertices on the 7-gon are at distance 1, 2, or 3, and therefore they are vertices of exactly one of the triangles.

Figure 1: Construction of a Steiner triple system S(3, 7). The seven triangles have vertex sets 124, 235, 346, 457, 561, 672, 713.

Each block of $S(3, 7)$ corresponds to one of seven classes of 3-cubes in $Q_7$ indicated in Table 1, where an integer $i$ belongs to the block if and only if the product is active in the $i$th factor.
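The rotation construction of $S(3, 7)$ is easy to verify mechanically. The following Python sketch rotates a base block around the $d$-gon and checks that every pair of points lies in exactly one block; the function names `cyclic_blocks` and `is_steiner_triple_system` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def cyclic_blocks(base=(1, 2, 4), d=7):
    """Rotate the base block around the d-gon whose vertices are labeled 1..d."""
    return [frozenset((v - 1 + shift) % d + 1 for v in base)
            for shift in range(d)]

def is_steiner_triple_system(blocks, d):
    """True if every 2-subset of {1, ..., d} lies in exactly one 3-element block."""
    counts = {pair: 0 for pair in combinations(range(1, d + 1), 2)}
    for block in blocks:
        if len(block) != 3:
            return False
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return all(c == 1 for c in counts.values())

blocks = cyclic_blocks()
```

The base block 124 works because its three pairs realize the distances 1, 2 and 3 exactly once each; a base block such as 123 repeats the distance 1 and fails the same check.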
Notice that permuting the factors in a class cyclically yields the subsequent class.

Table 1: The seven classes of 3-cubes in Q7.
Block   Class
124     I × I × O × I × O × O × O
235     O × I × I × O × I × O × O
346     O × O × I × I × O × I × O
457     O × O × O × I × I × O × I
561     I × O × O × O × I × I × O
672     O × I × O × O × O × I × I
713     I × O × I × O × O × O × I

As $O = \{0, 1\}$, each of the seven classes contains 16 disjoint 3-cubes, for a total of 112 3-cubes. Notice that any two cubes from the same class have empty intersection. Further, two 3-cubes from different classes are either disjoint or they intersect at an edge because by construction they have exactly one $I$ as a common factor. We have accounted for $6 \cdot 112 = 672$ faces of $Q_7$, which indeed has $\binom{7}{2} 2^5 = 672$ faces. We therefore have a decomposition of its 2-skeleton into pairwise face-disjoint boundaries of 3-cubes.

To visualize this, let $P : \mathbb{R}^7 \to \mathbb{R}^2$ be the projection sending the standard basis elements $e_1, e_2, \ldots, e_7$ to the vertices of a regular 7-gon, cyclically, as in Figure 1. Figure 2 (left) shows the projection $P$ of the 16 disjoint 3-cubes in the class $I \times I \times O \times I \times O \times O \times O$ (shown bold in the figure, with other edges of $Q_7$ gray). There is much overlap in this figure. The right of Figure 2 shows the same projection, but with the vectors $P(e_1)$, $P(e_2)$ and $P(e_4)$ scaled by a factor of about 0.2 in order to separate the 3-cubes. Observe that rotating Figure 2 (left) by $2\pi/7$ brings the cubes $I \times I \times O \times I \times O \times O \times O$ to the cubes $O \times I \times I \times O \times I \times O \times O$, etc.

Figure 2: Two views of the sixteen 3-cubes I × I × O × I × O × O × O (bold lines) in Q7.

Now that we have illustrated our construction, we can prove the general result.

Theorem 1.1. The 2-skeleton of $Q_d$ can be decomposed into a pairwise face-disjoint union of 3-cube boundaries if and only if $d \equiv 1$ or $3 \pmod 6$.

Proof. Let $d \equiv 1$ or $3 \pmod 6$ and let $S(3, d)$ be a 3-design.
As [d] has $\binom{d}{2}$ pairs and each block of S(3, d) contains $\binom{3}{2} = 3$ pairs, the number of blocks is $\frac{1}{3}\binom{d}{2} = \frac{d(d-1)}{6}$. For each block {i, j, k} of S(3, d), construct a class of 3-cubes O × · · · × I × · · · × I × · · · × I × · · · × O, where there is an I precisely in the ith, jth and kth factors. Such a class consists of $2^{d-3}$ disjoint 3-cubes. By construction, the intersection of any two 3-cubes from different classes corresponding to blocks {i, j, k} and {i′, j′, k′} is either empty, or a vertex, or an edge. Indeed, the intersection cannot be a face in any O × · · · × I × · · · × I × · · · × O because this would mean that some pair belongs to both {i, j, k} and {i′, j′, k′}. Thus these 3-cubes are pairwise face-disjoint. The cubes in the $\frac{d(d-1)}{6}$ classes thus account for $6 \cdot \frac{d(d-1)}{6} \cdot 2^{d-3} = \binom{d}{2} 2^{d-2}$ faces of $Q_d$, which is all of the faces of $Q_d$. We have thus decomposed the 2-skeleton of $Q_d$ into a pairwise face-disjoint union of boundaries of 3-cubes.

Conversely suppose that d ≢ 1 or 3 (mod 6). If d is even, then $Q_d^2$ is not even, so it does not have a sphere decomposition. Thus assume d is odd, in which case d ≡ 5 (mod 6). An easy computation shows that, in this case, the number of faces in $Q_d^2$ is not a multiple of 6. Hence $Q_d^2$ cannot be decomposed as a pairwise face-disjoint union of boundaries of 3-cubes.

Theorem 1.1 does not cover the cases d = 5, 11, 17, 23, . . ., where d ≡ 5 (mod 6). We do not know if all such $Q_d^2$ have sphere decompositions. In the next section we find sphere decompositions when $d = 5^n$. However, these decompositions are not factorizations as they involve non-isomorphic complexes.

2 A sphere decomposition of the 5-cube

We now show that there is a sphere decomposition for $Q_5^2$, which is the smallest case not covered by a Steiner triple system. In fact, we will get somewhat more. Theorem 2.1 below guarantees that sphere decompositions of $Q_d^2$ exist for arbitrarily large d ≡ 5 (mod 6).
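Both directions of the proof of Theorem 1.1 lend themselves to a quick computational check. The sketch below is illustrative only: coordinates are 0-based, the design for d = 7 is the cyclic one generated by {0, 1, 3} (the block 124 shifted to 0-based labels), and the helper names are not from the paper. It counts how often each 2-face of the 7-cube is used by the 3-cubes of all classes, and then evaluates the number of faces modulo 6 for a few d ≡ 5 (mod 6).

```python
from collections import Counter
from itertools import product
from math import comb


def cube_faces(block, fixed, d):
    """The six 2-faces of one 3-cube of Q_d.

    `block` holds the three free coordinates, `fixed` maps every other
    coordinate to 0 or 1. A face is recorded as (free pair, remaining bits).
    """
    faces = []
    for third in block:
        pair = tuple(sorted(set(block) - {third}))
        for bit in (0, 1):
            point = dict(fixed)
            point[third] = bit
            faces.append((pair, tuple(point[c] for c in range(d) if c not in pair)))
    return faces


def face_coverage(d, blocks):
    """Count how often each 2-face of Q_d occurs among the 3-cubes of all classes."""
    cover = Counter()
    for block in blocks:
        others = [c for c in range(d) if c not in block]
        for bits in product((0, 1), repeat=len(others)):
            cover.update(cube_faces(block, dict(zip(others, bits)), d))
    return cover


d = 7
blocks = [tuple(sorted((x + s) % d for x in (0, 1, 3))) for s in range(d)]
cover = face_coverage(d, blocks)

# Every one of the C(7,2) * 2^5 faces should be used exactly once.
print(len(cover) == comb(d, 2) * 2 ** (d - 2), set(cover.values()) == {1})

# For d ≡ 5 (mod 6) the number of faces is not divisible by 6.
print([comb(m, 2) * 2 ** (m - 2) % 6 for m in (5, 11, 17, 23)])
```

Passing a different Steiner triple system as `blocks` (with the matching d) repeats the first check for other admissible dimensions.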
Figure 3: The 2-skeleton of the 4-cube, minus the vertices 0000 and 1111, is a sphere S.

Figure 4: The rhombic dodecahedron obtained by deleting opposite vertices of $Q_4^2$. The watercolor (right) by David W. Brisson (1977) is a hyperstereogram [2] showing two views differing in two degrees of parallax. Used with permission of Harriet and Erik Brisson.

Theorem 2.1. If $d = 5^n$, then the 2-skeleton of $Q_d$ is a face-disjoint union of spheres.

Proof. We first treat the case d = 5. The case $d = 5^n$ will follow from design theory.

Our plan is to realize the 2-skeleton of $Q_4$ as a face-disjoint union of a sphere S and six disks D1, . . . , D6 with edge-disjoint boundaries, then show that the 2-skeleton of $Q_5$ is the face-disjoint union of the eight spheres S × {0}, S × {1}, ∂(D1 × [0, 1]), . . . , ∂(D6 × [0, 1]).

Let $S = Q_4^2 - \{0000, 1111\}$ be $Q_4^2$ with the antipodal vertices 0000 and 1111 removed (and with them all the edges and faces incident with them). We thus have removed two vertices, eight edges and 12 faces. What remains is a sphere S with 12 square faces. It is shown in Figure 3 embedded in the punctured sphere (plane). We note in passing that the sphere S is a rhombic dodecahedron, which can be embedded in $\mathbb{R}^3$ with 12 congruent rhombic faces. (See Figure 4.)

The sphere S accounts for 12 of the 4-cube's 24 faces. The 12 missing squares are all incident with one or the other of the removed vertices 0000 and 1111. Figure 5 shows eight of these missing squares. Four of them form a disk D1 centered at 0000 and the other four make a disk D2 centered at 1111. These disks are pairwise face-disjoint, and their boundaries are pairwise edge-disjoint. And none of their faces are faces of S, because each face of D1 and D2 contains either the vertex 0000 or 1111, and neither of these vertices is in S.
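The counts for S are easy to confirm by enumeration. The following sketch (vertex labels are 4-bit integers; helper names are illustrative) removes the faces through 0000 and 1111 and checks two necessary conditions for a sphere: every edge lies in exactly two faces, and the Euler characteristic is 2.

```python
from itertools import combinations


def faces_q4():
    """All 24 square faces of the 4-cube; vertices are the integers 0..15."""
    faces = []
    for i, j in combinations(range(4), 2):
        bi, bj = 1 << i, 1 << j
        for base in range(16):
            if not base & (bi | bj):
                faces.append(frozenset({base, base | bi, base | bj, base | bi | bj}))
    return faces


def edges_of(face):
    """The four sides of a square: vertex pairs that differ in exactly one bit."""
    return [frozenset((a, b)) for a, b in combinations(sorted(face), 2)
            if bin(a ^ b).count("1") == 1]


# Drop every face that contains 0000 or 1111.
S = [f for f in faces_q4() if 0b0000 not in f and 0b1111 not in f]

vertices = set().union(*S)
edge_use = {}
for face in S:
    for edge in edges_of(face):
        edge_use[edge] = edge_use.get(edge, 0) + 1

euler = len(vertices) - len(edge_use) + len(S)
print(len(vertices), len(edge_use), len(S), euler)   # 14 24 12 2
print(all(count == 2 for count in edge_use.values()))
```

These are the vertex, edge and face counts of the rhombic dodecahedron.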
Figure 5: The disks D1 and D2 centered at 0000 and 1111, respectively. (Vertex labels in the figure: D1 has 0000, 1000, 0100, 0010, 0001, 1100, 0110, 0011, 1001; D2 has 1111, 0111, 1011, 1101, 1110, 1100, 0110, 0011, 1001.)

So far we have accounted for 20 squares of $Q_4^2$, 12 of them in S, four in D1, and four in D2. There are just four squares in $Q_4^2$ that are unaccounted for. They are not hard to find, because 0000 and 1111 are each contained in six squares of $Q_4^2$ and Figure 5 shows only four squares at each of 0000 and 1111. Thus the four missing squares are incident with 0000 or 1111. They are shown in Figure 6, superimposed on the drawings from Figure 5. Call these four squares disks D3, D4, D5 and D6.

Figure 6: The disks D3, D4, D5 and D6. (Vertex labels in the figure: D3 and D4 lie at 0000 and have far corners 1010 and 0101; D5 and D6 lie at 1111 and have far corners 0101 and 1010.)

Note that the sphere S and disks D1, D2, . . . , D6 are pairwise face-disjoint and account for all squares of $Q_4^2$. Further, the boundaries of the disks are pairwise edge-disjoint.

We now have eight spheres in $Q_5^2$: S × {0}, S × {1}, ∂(D1 × [0, 1]), . . . , ∂(D6 × [0, 1]). By construction they are face-disjoint. (See Figure 7.) Moreover the total number of squares used is 12 + 12 + 16 + 16 + 6 + 6 + 6 + 6 = 80, so we have used all the squares in $Q_5^2$. We have now decomposed the 2-skeleton of $Q_5$ into a pairwise face-disjoint union of spheres, two of which are rhombic dodecahedra, two of which have the structure shown in Figure 7 (left), and four of which are the boundaries of a 3-cube, as in Figure 7 (right).

Figure 7: The spheres ∂(D1 × I) (left) and ∂(D3 × I) (right) intersect at the hexagon 00000–00010–00011–00001–01001–01000–00000. Our decomposition of $Q_5$ uses two spheres of the type on the left, four of the type on the right, and two rhombic dodecahedra.
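The whole decomposition of the 5-cube can be verified by brute force. The sketch below is hedged in two ways. First, the disks are an assumed reading of the vertex labels of Figures 5 and 6: D1 and D2 consist of the four squares at 0000 (respectively 1111) spanned by cyclically adjacent coordinate directions, and D3, . . . , D6 are the remaining squares at those vertices. Second, `looks_like_sphere` tests only necessary conditions (each edge in two faces, Euler characteristic 2). All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def faces_of_cube(d):
    """All 2-faces of Q_d as frozensets of four vertices (d-bit integers)."""
    out = set()
    for i, j in combinations(range(d), 2):
        bi, bj = 1 << i, 1 << j
        for base in range(1 << d):
            if not base & (bi | bj):
                out.add(frozenset({base, base | bi, base | bj, base | bi | bj}))
    return out


def edges_of(face):
    """Sides of a square: vertex pairs differing in exactly one bit."""
    return [frozenset((a, b)) for a, b in combinations(sorted(face), 2)
            if bin(a ^ b).count("1") == 1]


def square(corner, i, j):
    """Face of Q_4 through `corner` spanned by flipping bits i and j."""
    bi, bj = 1 << i, 1 << j
    return frozenset({corner, corner ^ bi, corner ^ bj, corner ^ bi ^ bj})


def lift(cell, t):
    """Copy of a face or edge of Q_4 in the layer Q_4 x {t} of Q_5 (t is bit 0)."""
    return frozenset((v << 1) | t for v in cell)


def boundary_of_prism(disk):
    """Faces of the boundary of disk x [0,1]: two copies of the disk plus one wall per rim edge."""
    use = Counter(e for f in disk for e in edges_of(f))
    rim = [e for e, count in use.items() if count == 1]
    walls = {lift(e, 0) | lift(e, 1) for e in rim}
    return {lift(f, 0) for f in disk} | {lift(f, 1) for f in disk} | walls


def looks_like_sphere(faces):
    """Necessary conditions only: each edge lies in two faces and V - E + F = 2."""
    use = Counter(e for f in faces for e in edges_of(f))
    v = len(set().union(*faces))
    return set(use.values()) == {2} and v - len(use) + len(faces) == 2


cyclic = [(0, 1), (1, 2), (2, 3), (0, 3)]       # cyclically adjacent directions
S = {f for f in faces_of_cube(4) if 0 not in f and 15 not in f}
D1 = {square(0, i, j) for i, j in cyclic}
D2 = {square(15, i, j) for i, j in cyclic}
D3, D4 = {square(0, 0, 2)}, {square(0, 1, 3)}
D5, D6 = {square(15, 0, 2)}, {square(15, 1, 3)}

spheres = [{lift(f, 0) for f in S}, {lift(f, 1) for f in S}]
spheres += [boundary_of_prism(D) for D in (D1, D2, D3, D4, D5, D6)]

sizes = [len(s) for s in spheres]
print(sizes, sum(sizes))
print(set().union(*spheres) == faces_of_cube(5))
print(all(looks_like_sphere(s) for s in spheres))
```

Since the sizes sum to the number of faces of the 5-cube and the union is the full face set, the eight complexes are pairwise face-disjoint.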
Having obtained a sphere decomposition of $Q_5^2$, we get a generalization. Consider the finite field $F_5$ consisting of the integers modulo 5. The vector space $F_5^n$ then consists of $5^n$ elements, or points, and each 1-dimensional subspace $V = \{\lambda v \mid \lambda \in F_5\}$ consists of five points. A line L is a translate $L = \{w + \lambda v \mid \lambda \in F_5\}$ of a 1-dimensional subspace. We can realize $S(5, 5^n)$ by letting the blocks be the lines in $F_5^n$. (Each line consists of 5 of the $5^n$ points in $F_5^n$, and any two points in $F_5^n$ lie on a unique line.) From each block we can extract 10 pairs of points, so the total number of blocks is $\frac{1}{10}\binom{5^n}{2} = \frac{5^{n-1}(5^n-1)}{4}$.

Using the development from Section 1, it follows that the 2-skeleton of $Q_{5^n}$ is the face-disjoint union of the 2-skeleta of $\frac{5^{n-1}(5^n-1)}{4}\, 2^{5^n-5}$ 5-cubes, each of which is decomposable into a pairwise face-disjoint union of spheres. We can thus decompose the 2-skeleton of $Q_{5^n}$ into a pairwise face-disjoint union of spheres. Indeed, the total number of faces used in this decomposition is
$$80 \cdot \frac{5^{n-1}(5^n-1)}{4} \cdot 2^{5^n-5} = \frac{5^n(5^n-1)}{2} \cdot 2^{5^n-2} = \binom{5^n}{2} 2^{5^n-2},$$
the number of faces of $Q_{5^n}^2$.

Notice that $5^n \equiv 5 \pmod 6$ if and only if n is odd, so Theorem 2.1 yields a new class of hypercubes with sphere decompositions that is not covered by Theorem 1.1.

3 Discussion

Design theory applies to additional cases where d ≡ 5 (mod 6) by using the technique of the previous section. Suppose one has a sphere decomposition of some $Q_k^2$ and there is a k-design on [d]. Then there is a sphere decomposition for $Q_d^2$. We illustrate this for k = 5. In [5, Thm. 2], Hanani showed that a 5-design exists if and only if d ≡ 1 or 5 (mod 20). So for d = 41, 65, etc., any S(5, d) and any sphere decomposition of $Q_5^2$ can be combined to construct a sphere decomposition of $Q_d^2$ for some $d \neq 5^n$.

We conjecture that sphere decompositions exist for $Q_d^2$ for all odd d, but that spherical factorizations exist if and only if d ≡ 1 or 3 (mod 6).
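The line design on the vector space over the field with five elements, used in the proof of Theorem 2.1, can be checked for small n. The sketch below is an illustration under simple assumptions (points are tuples of residues mod 5, and `affine_lines` is a hypothetical helper name). It builds all lines for n = 2, confirms that every pair of points lies on exactly one line, compares the number of lines with the block-count formula, and checks the face-count identity for n = 1, 2, 3.

```python
from collections import Counter
from itertools import combinations, product
from math import comb


def affine_lines(n, p=5):
    """All lines {w + t*v : t in F_p} in F_p^n, each as a frozenset of p points."""
    points = list(product(range(p), repeat=n))
    found = set()
    for w in points:
        for v in points:
            if any(v):  # the direction vector must be nonzero
                found.add(frozenset(
                    tuple((wi + t * vi) % p for wi, vi in zip(w, v))
                    for t in range(p)))
    return found


n = 2
blocks = affine_lines(n)
pairs = Counter(frozenset(pq) for line in blocks
                for pq in combinations(sorted(line), 2))

print(len(blocks), 5 ** (n - 1) * (5 ** n - 1) // 4)   # 30 30
print(len(pairs) == comb(5 ** n, 2) and set(pairs.values()) == {1})

# Face count: 80 faces per 5-cube, times the number of 5-cubes, equals all faces.
for m in (1, 2, 3):
    d = 5 ** m
    classes = 5 ** (m - 1) * (d - 1) // 4
    print(m, 80 * classes * 2 ** (d - 5) == comb(d, 2) * 2 ** (d - 2))
```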
Note that cyclical configurations of points and lines were constructed by Grünbaum through a similar use of Steiner triple systems. See [3, pp. 253 and 325].

ORCID iDs

Richard H. Hammack https://orcid.org/0000-0002-6384-9330
Paul C. Kainen https://orcid.org/0000-0001-8035-0745

References

[1] E. F. Assmus, Jr. and J. D. Key, Designs and their codes, volume 103 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge, 1992, doi:10.1017/cbo9781316529836.
[2] D. Brisson, Hypergraphics: Visualizing Complex Relationships in Art, Science and Technology, American association for the advancement of science. Selected symposia Series 24, Westview, 1978, https://books.google.com/books?id=J6yOzQEACAAJ.
[3] B. Grünbaum, Configurations of points and lines, volume 103 of Graduate Studies in Mathematics, American Mathematical Society, Providence, RI, 2009, doi:10.1090/gsm/103.
[4] R. Hammack and P. C. Kainen, On 2-skeleta of Platonic polytopes, Bull. Hellenic Math. Soc. 62 (2018), 94–102, http://bulletin.math.uoc.gr/bulletin/work/autolist.php?decade=2010.
[5] H. Hanani, On balanced incomplete block designs with blocks having five elements, J. Combinatorial Theory Ser. A 12 (1972), 184–201, doi:10.1016/0097-3165(72)90035-0.
[6] P. C. Kainen, On 2-skeleta of hypercubes, Art Disc. Appl. Math. 3 (2020), #P2.06, 4p., doi:10.26493/2590-9770.1302.f4e.
[7] T. P. Kirkman, On a problem in combinations, Cambridge and Dublin Mathematical Journal 2 (1847), 191–204, http://resolver.sub.uni-goettingen.de/purl?PPN600493962_0002.
[8] H. J. Ryser, Combinatorial mathematics, The Carus Mathematical Monographs, No. 14, Published by The Mathematical Association of America; distributed by John Wiley and Sons, Inc., New York, 1963, doi:10.5948/upo9781614440147.
[9] D. B. West, Introduction to Graph Theory, Prentice Hall, 2001, https://books.google.com/books?id=TuvuAAAAMAAJ.
ISSN 2590-9770
The Art of Discrete and Applied Mathematics 3 (2020) #P2.10
https://doi.org/10.26493/2590-9770.1295.25b
(Also available at http://adam-journal.eu)

Strongly regular graphs with parameters (37, 18, 8, 9) having nontrivial automorphisms*

Dean Crnković, Marija Maksimović
Department of Mathematics, University of Rijeka, Radmile Matejčić 2, 51000 Rijeka, Croatia

Received 26 March 2019, accepted 20 September 2019, published online 21 August 2020

Abstract

All strongly regular graphs having at most 36 vertices have been enumerated. Hence, the first open case is the enumeration of the SRGs with parameters (37, 18, 8, 9). In this paper we show that there are exactly forty SRGs with parameters (37, 18, 8, 9) having nontrivial automorphisms. Comparing the constructed graphs with previously known SRGs with these parameters we conclude that six of the SRGs with parameters (37, 18, 8, 9) constructed in this paper are new, and that up to isomorphism there are at least 6766 strongly regular graphs with parameters (37, 18, 8, 9).

Keywords: Strongly regular graph, automorphism group, orbit matrix.
Math. Subj. Class. (2020): 05E30, 20D45

1 Introduction

One of the main problems in the theory of strongly regular graphs (SRGs) is constructing and classifying SRGs with given parameters. A frequently used method of constructing combinatorial structures is a construction with a prescribed automorphism group using orbit matrices. While orbit matrices of block designs have been used for such a construction of designs since the 1980s, orbit matrices of strongly regular graphs were not introduced until 2011 (see [2]). Using orbit matrices we construct all strongly regular graphs with parameters (37, 18, 8, 9) having nontrivial automorphisms. In that way we have constructed forty SRGs with parameters (37, 18, 8, 9), and six of them are new.
Thereby we proved that there are exactly forty SRGs with parameters (37, 18, 8, 9) having nontrivial automorphisms, and at least 6766 strongly regular graphs with these parameters.

*This work has been fully supported by Croatian Science Foundation under the project 6732.
E-mail addresses: deanc@math.uniri.hr (Dean Crnković), mmaksimovic@math.uniri.hr (Marija Maksimović)
This work is licensed under https://creativecommons.org/licenses/by/4.0/

The paper is organized as follows: after a brief description of the terminology and some background results in Section 2, in Section 3 we describe the concept of orbit matrices. In Section 4 we apply the method of constructing SRGs using orbit matrices to construct all strongly regular graphs with parameters (37, 18, 8, 9) having nontrivial automorphisms.

2 Background and terminology

We assume that the reader is familiar with basic notions from the theory of finite groups. For basic definitions and properties of strongly regular graphs we refer the reader to [3, 9, 14].

A graph is regular if all its vertices have the same valency. A simple regular graph Γ = (V, E) is strongly regular with parameters (v, k, λ, µ) if it has |V| = v vertices, valency k, and if any two adjacent vertices have exactly λ common neighbours, while any two nonadjacent vertices have exactly µ common neighbours. A strongly regular graph with parameters (v, k, λ, µ) is usually denoted by SRG(v, k, λ, µ). An automorphism of a strongly regular graph Γ is a permutation of vertices of Γ, such that every two vertices are adjacent if and only if their images are adjacent.

Let $Γ_1 = (V, E_1)$ and $Γ_2 = (V, E_2)$ be strongly regular graphs and $G \le \mathrm{Aut}(Γ_1) \cap \mathrm{Aut}(Γ_2)$. An isomorphism $α : Γ_1 \to Γ_2$ is called a G-isomorphism if there exists an automorphism τ : G → G such that for each x, y ∈ V and each g ∈ G the following holds: (τg).(αx) = αy ⇔ g.x = y.
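The definition of a strongly regular graph can be tested directly on a small example. The sketch below is an illustration rather than part of the paper's method: it builds the Paley graph on the integers modulo 37 (two residues are adjacent when their difference is a nonzero square) and recovers its parameters by counting common neighbours. The function names are illustrative.

```python
def paley_adjacency(q=37):
    """Adjacency sets of the Paley graph on Z_q, for a prime q ≡ 1 (mod 4)."""
    squares = {(x * x) % q for x in range(1, q)}
    return {x: {y for y in range(q) if (x - y) % q in squares} for x in range(q)}


def srg_parameters(adj):
    """Return (v, k, lambda, mu) if the graph is strongly regular, else None."""
    verts = list(adj)
    degrees = {len(adj[x]) for x in verts}
    # Common-neighbour counts over adjacent and over nonadjacent pairs.
    lam = {len(adj[x] & adj[y]) for x in verts for y in adj[x]}
    mu = {len(adj[x] & adj[y]) for x in verts for y in verts
          if y != x and y not in adj[x]}
    if len(degrees) == len(lam) == len(mu) == 1:
        return len(verts), degrees.pop(), lam.pop(), mu.pop()
    return None


print(srg_parameters(paley_adjacency(37)))   # (37, 18, 8, 9)
```

The same `srg_parameters` check applies to any graph given as a dictionary of adjacency sets.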
Strongly regular graphs having at most 36 vertices have been enumerated, so SRGs with parameters (37, 18, 8, 9) are the first open case that still has to be classified (see [4]). It is known that there exist at least 6760 SRGs(37, 18, 8, 9), which are obtained as the descendants of the 191 regular two-graphs on 38 vertices constructed in [11]. The adjacency matrices of these 6760 SRGs(37, 18, 8, 9) can be found at [12]. In this paper we classify SRGs(37, 18, 8, 9) having nontrivial automorphisms, showing that there are at least 6766 strongly regular graphs with parameters (37, 18, 8, 9).

3 Orbit matrices of strongly regular graphs

Orbit matrices of block designs have been frequently used for construction of block designs, see e.g. [6, 7, 8, 10]. In this section we describe the concept of orbit matrices of SRGs, which was introduced in 2011 by Behbahani and Lam (see [2]).

Let Γ be a SRG(v, k, λ, µ) and A be its adjacency matrix. Suppose an automorphism group G of Γ partitions the set of vertices V into b orbits $O_1, \dots, O_b$, with sizes $n_1, \dots, n_b$, respectively. The orbits divide A into submatrices $[A_{ij}]$, where $A_{ij}$ is the adjacency matrix of vertices in $O_i$ versus those in $O_j$. We define matrices $C = [c_{ij}]$ and $R = [r_{ij}]$, 1 ≤ i, j ≤ b, such that

$c_{ij}$ = column sum of $A_{ij}$,
$r_{ij}$ = row sum of $A_{ij}$.

The matrix R is related to C by
$$r_{ij} n_i = c_{ij} n_j. \tag{3.1}$$
Since the adjacency matrix is symmetric, it follows that
$$R = C^T. \tag{3.2}$$
The matrix R is the row orbit matrix of the graph Γ with respect to G, and the matrix C is the column orbit matrix of the graph Γ with respect to G.

Let us assume that a group G acts as an automorphism group of a SRG(v, k, λ, µ). Behbahani and Lam showed that orbit matrices $R = [r_{ij}]$ and $R^T = C = [c_{ij}]$ satisfy the condition
$$\sum_{s=1}^{b} c_{is} r_{sj} n_s = \delta_{ij}(k-\mu)n_j + \mu n_i n_j + (\lambda-\mu)c_{ij}n_j.$$
Since $R = C^T$, it follows that
$$\sum_{s=1}^{b} \frac{n_s}{n_j} c_{is} c_{js} = \delta_{ij}(k-\mu) + \mu n_i + (\lambda-\mu)c_{ij} \tag{3.3}$$
and
$$\sum_{s=1}^{b} \frac{n_s}{n_j} r_{si} r_{sj} = \delta_{ij}(k-\mu) + \mu n_i + (\lambda-\mu)r_{ji}.$$

In order to enable a construction of SRGs with a presumed automorphism group G, each matrix with the properties of an orbit matrix will be called an orbit matrix for parameters (v, k, λ, µ) and a group G (see [1]). Therefore, we introduce the following definition of orbit matrices of strongly regular graphs (see [5]).

Definition 3.1. A (b × b)-matrix $R = [r_{ij}]$ with entries satisfying conditions:
$$\sum_{j=1}^{b} r_{ij} = \sum_{i=1}^{b} \frac{n_i}{n_j} r_{ij} = k \tag{3.4}$$
$$\sum_{s=1}^{b} \frac{n_s}{n_j} r_{si} r_{sj} = \delta_{ij}(k-\mu) + \mu n_i + (\lambda-\mu)r_{ji} \tag{3.5}$$
where $0 \le r_{ij} \le n_j$, $0 \le r_{ii} \le n_i - 1$ and $\sum_{i=1}^{b} n_i = v$, is called a row orbit matrix for a strongly regular graph with parameters (v, k, λ, µ) and the orbit lengths distribution $(n_1, \dots, n_b)$.

Definition 3.2. A (b × b)-matrix $C = [c_{ij}]$ with entries satisfying conditions:
$$\sum_{i=1}^{b} c_{ij} = \sum_{j=1}^{b} \frac{n_j}{n_i} c_{ij} = k \tag{3.6}$$
$$\sum_{s=1}^{b} \frac{n_s}{n_j} c_{is} c_{js} = \delta_{ij}(k-\mu) + \mu n_i + (\lambda-\mu)c_{ij} \tag{3.7}$$
where $0 \le c_{ij} \le n_i$, $0 \le c_{ii} \le n_i - 1$ and $\sum_{i=1}^{b} n_i = v$, is called a column orbit matrix for a strongly regular graph with parameters (v, k, λ, µ) and the orbit lengths distribution $(n_1, \dots, n_b)$.

Not every orbit matrix gives rise to strongly regular graphs while, on the other hand, a single orbit matrix may produce several nonisomorphic strongly regular graphs. For the elimination of orbit matrices that produce G-isomorphic strongly regular graphs we use the same method as for the elimination of orbit matrices of G-isomorphic designs (see for example [7]). We could use row or column orbit matrices, but since we construct matrices row by row, it is more convenient for us to use column orbit matrices.

3.1 Orbit lengths distribution

Suppose an automorphism group G of the graph Γ partitions the set of vertices V into b orbits $O_1, \dots, O_b$, with sizes $n_1, \dots, n_b$, respectively.
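To make the orbits and the column orbit matrix concrete, the following sketch works through one example. It is an illustration under stated assumptions, not the authors' GAP programs: the graph is the Paley graph on the integers modulo 37, and the group is generated by the map x → 10x, which has order 3 because 10³ = 1000 ≡ 1 (mod 37) and is an automorphism because 10 ≡ 11² is a square. The code computes the orbits, the entries c_ij as numbers of neighbours in orbit i of a vertex in orbit j, and then checks conditions (3.6) and (3.7) in integer arithmetic (both sides multiplied by the orbit size in the denominator).

```python
q, k, lam, mu = 37, 18, 8, 9
squares = {x * x % q for x in range(1, q)}
adj = {x: {y for y in range(q) if (x - y) % q in squares} for x in range(q)}

a = 10  # multiplier of order 3 modulo 37; it is a square, so x -> a*x preserves adjacency

# Orbits of the group generated by x -> a*x.
orbits, seen = [], set()
for x in range(q):
    if x not in seen:
        orbit = sorted({x, a * x % q, a * a * x % q})
        orbits.append(orbit)
        seen.update(orbit)
n = [len(orbit) for orbit in orbits]
b = len(orbits)

# c_ij = column sum of A_ij = number of neighbours in O_i of a vertex in O_j.
C = [[len(adj[orbits[j][0]] & set(orbits[i])) for j in range(b)] for i in range(b)]


def satisfies_3_6(C):
    """Column sums equal k, and sum_j n_j c_ij = k n_i for every i."""
    cols = all(sum(C[i][j] for i in range(b)) == k for j in range(b))
    rows = all(sum(n[j] * C[i][j] for j in range(b)) == k * n[i] for i in range(b))
    return cols and rows


def satisfies_3_7(C):
    """Condition (3.7), multiplied through by n_j to stay in the integers."""
    for i in range(b):
        for j in range(b):
            lhs = sum(n[s] * C[i][s] * C[j][s] for s in range(b))
            rhs = n[j] * ((k - mu) * (i == j) + mu * n[i] + (lam - mu) * C[i][j])
            if lhs != rhs:
                return False
    return True


print(sorted(n))
print(satisfies_3_6(C), satisfies_3_7(C))
```

The orbit sizes printed are one fixed point and twelve orbits of length 3, i.e. the distribution written (1 × 1, 12 × 3) in the notation of this section.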
It is well known that $n_i$ divides |G|, for i = 1, . . . , b. Further,
$$\sum_{i=1}^{b} n_i = v.$$
In this paper we will be interested in groups that act in orbits having at most two lengths, since we will consider automorphism groups of prime order. If the group G acts with $d_1$ orbits of length 1 and $d_h$ orbits of length h, we will denote this distribution with $(d_1 \times 1, d_h \times h)$.

When determining the orbit lengths distributions we use the following result that can be found in [1].

Theorem 3.3. Let s < r < k be the eigenvalues of a SRG(v, k, λ, µ). Then
$$\varphi \le \frac{\max(\lambda, \mu)}{k - r}\, v,$$
where φ is the number of fixed points for a nontrivial automorphism.

In the case of SRGs with parameters (37, 18, 8, 9) we obtain that φ ≤ 20, so to find all feasible orbit length distributions $(d_1 \times 1, d_h \times h)$ we need to solve the system
$$d_1 + h \cdot d_h = 37, \qquad d_1 \le 20.$$

4 Classification of SRGs with parameters (37, 18, 8, 9) having nontrivial automorphisms

It is known that there exist at least 6760 SRGs with parameters (37, 18, 8, 9) (see [11]). Spence [12] listed adjacency matrices of all of them. In Table 1 we give information on orders of the full automorphism groups of these 6760 SRGs(37, 18, 8, 9). The graph having the full automorphism group of order 666 is the Paley graph obtained from the field GF(37), having the full automorphism group isomorphic to $Z_{37} : Z_{18}$ (see [14]).

Table 1: Orders of the full automorphism groups of the known SRGs(37, 18, 8, 9)

|Aut(Γi)|   #SRGs
1           6726
2           3
3           24
9           4
18          2
666         1

In this section we give the classification of strongly regular graphs with parameters (37, 18, 8, 9) having nontrivial automorphisms. We show that there are exactly 6 strongly regular graphs with parameters (37, 18, 8, 9) having an automorphism group of order two, all of them isomorphic to the graphs given at [12].
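The feasible orbit length distributions of Section 3.1 for automorphisms of prime order can be listed with a few lines of code. This is a minimal sketch and not the Mathematica computation used by the authors; the bound of 20 fixed points is taken as given from the text, and the function name is illustrative.

```python
def prime_orbit_distributions(v=37, max_fixed=20):
    """For each prime p < v, list the pairs (d1, dp) with d1 + p*dp = v, dp >= 1, d1 <= max_fixed."""
    primes = [p for p in range(2, v)
              if all(p % r for r in range(2, int(p ** 0.5) + 1))]
    return {p: [(v - p * dp, dp) for dp in range(v // p, 0, -1)
                if v - p * dp <= max_fixed]
            for p in primes}


distributions = prime_orbit_distributions()
for p, pairs in distributions.items():
    print(p, ["({}×1, {}×{})".format(d1, dp, p) for d1, dp in pairs])
```

For each prime the output lists the candidate distributions that then have to be examined for orbit matrices.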
Further, we show that there are exactly 37 strongly regular graphs with parameters (37, 18, 8, 9) having an automorphism group of order three, 6 of them nonisomorphic to any of the graphs listed at [12]. Finally we show that there is no SRG(37, 18, 8, 9) having an automorphism group $Z_p$, where p is prime and 3 < p < 37, and that there is exactly one SRG(37, 18, 8, 9) having an automorphism of order 37 (the Paley graph with 37 vertices). Comparing the constructed SRGs with the SRGs given at [12], we establish that six of the strongly regular graphs having a nontrivial automorphism group of prime order constructed in this paper have not been previously known.

In order to construct orbit matrices of SRGs with parameters (37, 18, 8, 9) that have an automorphism of prime order p, we first find all permissible distributions $(d_1 \times 1, d_p \times p)$. Then for each distribution we find all prototypes (see [1]). Using prototypes we construct orbit matrices row by row and we eliminate mutually G-isomorphic orbit matrices during this process. In the next step we construct adjacency matrices of SRGs(37, 18, 8, 9).

Table 2: Number of orbit matrices and SRGs(37, 18, 8, 9) for the automorphism group $Z_2$

distribution     #OM   #SRGs     distribution      #OM   #SRGs
(1×1, 18×2)      24    6         (11×1, 13×2)      0     0
(3×1, 17×2)      0     0         (13×1, 12×2)      0     0
(5×1, 16×2)      6     0         (15×1, 11×2)      0     0
(7×1, 15×2)      0     0         (17×1, 10×2)      0     0
(9×1, 14×2)      0     0         (19×1, 9×2)       0     0

4.1 SRGs with parameters (37, 18, 8, 9) having an automorphism group of order two

Using the program Mathematica we get all the possible orbit lengths distributions that satisfy Theorem 3.3, and using our own programs written in GAP [13] we construct all orbit matrices for the given orbit lengths distributions. In Table 2 we present the number of mutually nonisomorphic orbit matrices for $Z_2$ for each orbit lengths distribution. In the next step we obtain the adjacency matrices of strongly regular graphs with parameters (37, 18, 8, 9). Finally, we check isomorphisms of strongly regular graphs using GAP. Thereby we prove Theorem 4.1. The number of the constructed nonisomorphic SRGs with parameters (37, 18, 8, 9) is presented in Table 2. Orders of the full automorphism groups of these SRGs, also determined by using GAP, are shown in Table 3.

Table 3: SRGs with parameters (37, 18, 8, 9) that have automorphisms of order 2

|Aut(Γi)|   #SRGs
2           3
18          2
666         1

Theorem 4.1. Up to isomorphism there exist exactly 6 strongly regular graphs with parameters (37, 18, 8, 9) having an automorphism group of order 2.

4.2 SRGs with parameters (37, 18, 8, 9) having an automorphism group of order three

Using the program Mathematica we get all the possible orbit lengths distributions that satisfy Theorem 3.3, and using our own programs written in GAP [13] we construct all orbit matrices for the given orbit lengths distributions. In Table 4 we present the number of mutually nonisomorphic orbit matrices for $Z_3$ for each orbit lengths distribution. In the next step we obtain the adjacency matrices of strongly regular graphs with parameters (37, 18, 8, 9). Finally, we check isomorphisms of strongly regular graphs using GAP. Thereby we prove Theorem 4.2. The number of the constructed nonisomorphic SRGs with parameters (37, 18, 8, 9) is presented in Table 4. Orders of the full automorphism groups of these SRGs are presented in Table 5.

Table 4: Number of orbit matrices and SRGs(37, 18, 8, 9) for the automorphism group $Z_3$

distribution     #OM   #SRGs     distribution      #OM   #SRGs
(1×1, 12×3)      18    37        (13×1, 8×3)       0     0
(4×1, 11×3)      0     0         (16×1, 7×3)       0     0
(7×1, 10×3)      0     0         (19×1, 6×3)       0     0
(10×1, 9×3)      0     0

Theorem 4.2. Up to isomorphism there exist exactly 37 strongly regular graphs with parameters (37, 18, 8, 9) having an automorphism group of order 3.

Table 5: SRGs with parameters (37, 18, 8, 9) that have automorphisms of order 3

|Aut(Γi)|   #SRGs
3           30
9           4
18          2
666         1

4.3 SRGs (37, 18, 8, 9) for $Z_p$, where p is a prime and 3 < p ≤ 37

We show that there is no orbit matrix for $Z_p$, where p is a prime and 3 < p < 37. The results are presented in Table 6. Hence, there is no SRG(37, 18, 8, 9) having an automorphism group isomorphic to $Z_p$, where p is a prime and 3 < p < 37. Further, there is exactly one SRG(37, 18, 8, 9) admitting an automorphism group isomorphic to $Z_{37}$, namely the Paley graph with 37 vertices having the full automorphism group isomorphic to $Z_{37} : Z_{18}$.

Table 6: Possible distributions for $Z_p$, p a prime and 3 < p < 37

distribution     #OM     distribution      #OM
(2×1, 7×5)       0       (15×1, 2×11)      0
(7×1, 6×5)       0       (11×1, 2×13)      0
(12×1, 5×5)      0       (3×1, 2×17)       0
(17×1, 4×5)      0       (20×1, 1×17)      0
(2×1, 5×7)       0       (18×1, 1×19)      0
(9×1, 4×7)       0       (14×1, 1×23)      0
(16×1, 3×7)      0       (8×1, 1×29)       0
(4×1, 3×11)      0       (6×1, 1×31)       0

We summarize the presented information in Theorem 4.3.

Theorem 4.3. Up to isomorphism there exist at least 6766 strongly regular graphs with parameters (37, 18, 8, 9). There are exactly forty SRGs(37, 18, 8, 9) having nontrivial automorphisms, and at least 6726 SRGs(37, 18, 8, 9) having the full automorphism group of order one.

The adjacency matrices of the six newly constructed SRGs can be found at the link: http://www.math.uniri.hr/~mmaksimovic/srg37.txt.

ORCID iDs

Dean Crnković https://orcid.org/0000-0002-3299-7859
Marija Maksimović https://orcid.org/0000-0002-8094-3724

References

[1] M. Behbahani, On strongly regular graphs, Ph.D. thesis, Concordia University, 2009, https://spectrum.library.concordia.ca/976720.
[2] M. Behbahani and C. Lam, Strongly regular graphs with non-trivial automorphisms, Discrete Math. 311 (2011), 132–144, doi:10.1016/j.disc.2010.10.005.
[3] T. Beth, D.
Jungnickel and H. Lenz, Design theory. Vol. I, volume 69 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 2nd edition, 1999, doi:10.1017/cbo9780511549533.
[4] A. E. Brouwer, Parameters of strongly regular graphs, https://www.win.tue.nl/~aeb/graphs/srg/srgtab51-100.html.
[5] D. Crnković, M. Maksimović, B. G. Rodrigues and S. Rukavina, Self-orthogonal codes from the strongly regular graphs on up to 40 vertices, Adv. Math. Commun. 10 (2016), 555–582, doi:10.3934/amc.2016026.
[6] D. Crnković and M.-O. Pavčević, Some new symmetric designs with parameters (64, 28, 12), Discrete Math. 237 (2001), 109–118, doi:10.1016/s0012-365x(00)00364-2.
[7] D. Crnković and S. Rukavina, Construction of block designs admitting an abelian automorphism group, Metrika 62 (2005), 175–183, doi:10.1007/s00184-005-0407-y.
[8] D. Crnković, S. Rukavina and M. Schmidt, A classification of all symmetric block designs of order nine with an automorphism of order six, J. Combin. Des. 14 (2006), 301–312, doi:10.1002/jcd.20090.
[9] C. Godsil and G. Royle, Algebraic graph theory, volume 207 of Graduate Texts in Mathematics, Springer-Verlag, New York, 2001, doi:10.1007/978-1-4613-0163-9.
[10] Z. Janko, Coset enumeration in groups and constructions of symmetric designs, in: Combinatorics '90 (Gaeta, 1990), North-Holland, Amsterdam, volume 52 of Ann. Discrete Math., pp. 275–277, 1992, doi:10.1016/s0167-5060(08)70919-1.
[11] B. D. McKay and E. Spence, Classification of regular two-graphs on 36 and 38 vertices, Australas. J. Combin. 24 (2001), 293–300.
[12] E. Spence, Strongly regular graphs on at most 64 vertices, http://www.maths.gla.ac.uk/~es/srgraphs.php.
[13] The GAP Group, Gap, 2018, https://www.gap-system.org.
[14] V. D.
Tonchev, Combinatorial configurations: designs, codes, graphs, volume 40 of Pitman Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific & Technical, Harlow; John Wiley & Sons, Inc., New York, 1988, translated from the Bulgarian by Robert A. Melter, https://books.google.com/books?id=J_XuAAAAMAAJ.

Call for papers for the Marston Conder 65 issue of ADAM

On September 9, 2020 Marston Conder turned 65. For this occasion we are opening a special issue of ADAM to be edited by Asia Ivić Weiss, Gabriel Verret and Primož Šparl. You are cordially invited to submit a paper that is related to Marston's work. We will accept submissions until September 1, 2021. All accepted papers will be published online as soon as possible. The volume will be completed in 2022.

When submitting a paper, please choose option The Marston Conder Issue of ADAM so that it is directed to the correct editors.

Klavdija Kutnar, Dragan Marušič, Tomaž Pisanski
Editors-In-Chief