Also available at http://amc.imfm.si ISSN 1855-3966 (printed edn.), ISSN 1855-3974 (electronic edn.) ARS MATHEMATICA CONTEMPORANEA 2 (2009) 121–136 Deciding the deterministic property for soliton graphs Miklós Bartha ∗ Memorial University of Newfoundland, St. John’s, Canada Miklós Krész † University of Szeged, Szeged, Hungary Received 1 October 2007, accepted 29 March 2009, published online 20 July 2009 Abstract Soliton automata are a graph theoretic model for electronic switching at the molecu- lar level. In the design of soliton circuits, the deterministic property of the corresponding automata is of primary importance. The underlying graphs of such automata, called deter- ministic soliton graphs, are characterized in terms of graphs not having even-length cycles and graphs having a unique perfect matching. On the basis of this characterization, a mod- ification of the currently most efficient unique perfect matching algorithm is worked out to decide in O(m log4 n) time if a graph with n vertices and m edges defines a deterministic soliton automaton. A yet more efficient O(m) algorithm is given for the special case of chestnut and elementary soliton graphs. All of these algorithms are capable of constructing a state for the corresponding soliton automaton, and the general algorithm can also be used to simplify the automaton to an isomorphic elementary one. Keywords: Graphs, matchings, soliton graphs, soliton automata. Math. Subj. Class.: 05C70 ∗Work partially supported by Natural Science and Engineering Research Council of Canada, Discovery Grant #170493-03. †Work partially supported by the Hungarian-Slovenian Intergovernmental Programme for 2008-2009 (project number BI-HU/08-09-012) and by the MÖB-DAAD Hungarian-German Researcher Exchange Program (project No. 3). E-mail addresses: bartha@cs.mun.ca (Miklós Bartha), kresz@jgypk.u-szeged.hu (Miklós Krész) Copyright c© 2009 DMFA 122 Ars Math. Contemp. 2 (2009) 121–136 1 Introduction Soliton switching as a physical phenomenon has first been described in [7] through several examples of a soliton wave traveling along chains of alternating single and double bonds in hydrocarbon molecules, which typically contain a system of such chemical bonds. The corresponding mathematical model was introduced in [8] by the name soliton automaton. Since the early paper [8], theoretical research on soliton automata has taken two sep- arate routes. One approach tries to characterize the transition monoids of deterministic soliton automata directly as appropriate groups. Some of the most important results in this direction can be found in [8], [9]and [10]. The other approach, highlighted by the works [1], [2], [3], [4],[15], [17]and [18] analyzes the internal structure of the underlying graphs of soliton automata on the grounds of matching theory. The two approaches have the same final goal: determine the computational power of soliton automata. Other more practical aspects of soliton switching have been studied in [13]. The present paper follows the second route outlined above. Based on the character- ization of deterministic soliton graphs obtained in [4], we provide an efficient algorithm to decide if an arbitrary graph G defines a deterministic soliton automaton. On a graph with n vertices and m edges, the algorithm runs in O(m log4 n) time. We also present a linear-time decision algorithm for two special classes of deterministic soliton graphs: chestnuts and elementary graphs. Our algorithms can be used in other applications as well. One of these applications aims at finding a flexible perfect matching for a graph equipped with a so-called grace set of vertices [5]. See also [11] and [12] for the same problem in graphs without the grace feature. Another application in computational biology concerns the description of an automated system that successfully predicts RNA structure [20]. 2 Soliton graphs and automata By a graph G = (V (G), E(G)), throughout the paper, we mean a finite undirected graph with multiple edges and loops allowed. Edges in E(G) are denoted by pairs (u, v) ∈ V (G) × V (G) with no direction between u and v implied. Our notation and terminology will follow [19]. For a vertex v ∈ V (G), d(v) denotes the degree of v. Vertex v is called external if d(v) = 1 or d(v) = 0, and internal if d(v) ≥ 2. External edges are those of E(G) that are incident with at least one external vertex. All other edges are called internal. Graph G is called open if it has at least one external vertex and closed if it has none. For a set of verticesX ⊆ V (G),G[X] denotes the subgraph ofG induced byX , that is, the graph which has as vertices X , and as edges those of E(G) that connect two vertices in X . If S is a subgraph of G, then G[S] = G[V (S)]. In this way, [ ] becomes an idempotent operation on subgraphs of G. Since in our discussion external vertices play a distinguished role, we do not want to allow that subgraphs of G have external vertices other than those of G. Therefore, in every subgraph of G, we shall automatically add a “protective” loop around each vertex with degree one or zero that is not external in G. A walk in graph G is an alternating sequence of vertices and edges, which starts and ends with a vertex, and in which each edge is incident with the vertex immediately pre- ceding it and the vertex immediately following it. The length of a walk is the number of occurrences of edges in it. A path is a walk in which all vertices are distinct, and a cycle is a walk which can be decomposed into a path and an edge connecting the two endpoints of the path. A matching M of graph G is a subset of E(G) such that no vertex of G occurs more M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 123 than once as an endpoint of some edge in M . It is understood by this definition that loops do not participate in any matching. The endpoints of the edges contained in M are said to be covered by M . A perfect matching is a matching that covers all vertices in V (G), while a perfect internal matching is one that covers all of the internal vertices. An open graph having a perfect internal matching is called a soliton graph. To keep our discussion coher- ent, we shall deal with perfect matchings as perfect internal matchings of closed graphs. To this end it is sufficient to introduce a loop around each unwanted external vertex, thus turning these vertices internal. It is clear that no generality is lost by this simple maneuver. Let G be a graph and M be a matching of G. An edge e ∈ E(G) is said to be M - positive (M -negative) if e ∈ M (respectively, e 6∈ M ). An M -alternating path (cycle) in G is a path (respectively, even-length cycle) stepping on M -positive and M -negative edges in an alternating way. Let us agree that, if the matching M is understood or irrelevant in a particular context, then it will not be explicitly indicated in these terms. An alternating path is positive (negative) if it is such at its internal endpoint(s), meaning that the edges incident with those endpoints are positive (respectively, negative). An external alternating path is one that has an external endpoint. If both endpoints of the path are external, then it is called a crossing. An alternating unit is either a crossing or an alternating cycle. Let M be a perfect internal matching of a soliton graph G. Switching on an M - alternating unit amounts to changing the sign of each edge along the unit with respect to M . It is easy to see that the operation of switching on an M -alternating unit α creates a new perfect internal matching for G, denoted S(M,α). Now we generalize the M -alternating property for walks α starting from an external vertex. In parallel we also generalize the concept of switching on such an external alter- nating walk α, which process creates a set of edges S(M,α). The definition is as follows: (i) The walk α = v0ev1, where e = (v0, v1) for an external v0 ∈ V (G), is an external M -alternating walk and S(M,α) = M∆{e}, where ∆ denotes symmetric difference of sets. (ii) If α = v0e1 . . . envn is an external M -alternating walk ending at an internal ver- tex vn, and en+1 = (vn, vn+1) is such that en+1 ∈ S(M,α) iff en ∈ S(M,α), then α′ = αen+1vn+1 is an external M -alternating walk and S(M,α′) = S(M,α)∆{en+1}. It is required, however, that en+1 6= en, unless en ∈ S(M,α) is a loop. It follows from the above definition that S(M,α) is a perfect internal matching iff the end- point vn of α is external, too. In this case we say that α is a soliton walk. By traversing a soliton walk α we mean switching on α, that is, creating the perfect internal matching S(M,α) in an interactive way. Intuitively, a walk starting and ending at an external vertex can be traversed as a soliton walk if, dynamically, its edges follow an alternating pattern with respect to the given perfect internal matching M . Example 2.1. Consider the graph G of Figure 1, and let M = {e, h1, h2}. A possible soli- ton walk from u to v with respect toM is α = uewgz1h1z2l2z3h2z4l1z1gwfv. Traversing α then results in S(M,α) = {f, l1, l2}. Every soliton graph G gives rise to a soliton automaton AG, the states of which are the perfect internal matchings of G. The input alphabet for AG is the set of all (ordered) pairs of external vertices in G, and the state transition function δ is defined by δ(M, (v, w)) = {S(M,α)|α is a soliton walk from v to w} if there exists a soliton walk from v to w, and 124 Ars Math. Contemp. 2 (2009) 121–136 b b b b b b bPPPP h @ @ @ @ v e f g z1 z4 l1 z2 z3 h1 l2 h2 w u Figure 1: Example soliton graph G δ(M, (v, w)) = {M} if there is none. As a new feature compared to previous definitions of soliton automata, we extendAG with a binary output defined by the function ζ(M, (v, w)) = 1 iff there exists a soliton walk from v to w. The rationale for this extension is the following. We do not want the automaton AG to crash upon receiving an input (v, w) for which there exists no soliton walk from v to w in a given state M . This is reflected by the above definition of δ. On the other hand, if v = w, then there might as well exist a soliton walk α from v to v, the traversal of which does not change the current state M . If all soliton walks from v to v are of this nature, then the only way to distinguish between having or not having a soliton walk is to introduce a binary output. The practical importance of this observation is clear: we can actually build a variety of interesting flip-flops using simple soliton automata. Example 2.2. (Binary flip-flop) Consider the top graph G in Figure 2a with two external vertices. The automaton AG has two states, which are shown below G in Figure 2a. (Dou- ble lines indicate positive edges in the figure.) Without the output feature AG would be isomorphic to AG′ , where G′ is the graph appearing on the top of Figure 2b. In AG, the inputs (1, 1) and (2, 2) can be used to test the current state through the output provided by the automaton. The same test is not possible in AG′ . 1 2 1 2 2 1 2 1 2 1 2 1 (b)(a) Figure 2: A binary flip-flop Example 2.3. (Ternary flip-flop) The underlying graphG, and the four states of the soliton automatonAG are shown in Figure 3. This automaton comes with a distinguished “initial” state, in which all three external vertices are covered by the corresponding matching. The initial state is recognized by getting a negative output on each of the inputs (1, 1), (2, 2), and (3, 3). The transition function and monoid of AG is described in [8]. Following the pattern of Examples 2.2 and 2.3 one can easily design flip-flops with an arbitrary number of states as soliton automata. In order for this idea to work in practice, one needs an interface sophisticated enough to pick up the output signals provided by soliton automata. M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 125 21 3 1 3 1 2 2 1 2 3 3 21 3 Figure 3: A ternary flip-flop Notice that both the binary and ternary flip-flop graphs are 3-graphs in the sense that they do not have loops or multiple edges, and the degree of each vertex is at most 3. Soliton graphs and automata have been introduced originally for such graphs in [8]. Even though our generalization from 3-graphs to all undirected graphs regarding the underlying structure seems substantial, it turns out that this generalization is merely technical. Every soliton automaton is isomorphic to one having a 3-graph as its underlying graph [17]. A soliton graph G is called deterministic if AG is such in the usual sense, that is, for every state M and input (v, w), |δ(M, (v, w))| = 1. The binary and ternary flip-flop graphs are both deterministic. The soliton graph of Figure 1 is, however, non-deterministic, as α′ = uewfv is another soliton walk from u to v with respect to the stateM of Example 2.1 such that S(M,α) 6= S(M,α′). In the sequel, for brevity, all perfect internal matchings of a soliton graph G will simply be called states. Recall from [8] that an edge e ∈ E(G) (vertex v ∈ V (G)) of a soliton graph G is impervious if there is no soliton walk passing through e (respectively, v) in any state of G. Edge e (vertex v) is viable if it is not impervious. Graph G is called viable if all of its vertices are such. See edge h in Figure 1 for an example of an impervious edge. The importance of viability in soliton graphs is self-explanatory: only the viable vertices and edges affect the behavior of the corresponding soliton automata. All impervious vertices and edges can be deleted without any impact on these automata. Following the terminology introduced in [3], an internal vertex v ∈ V (G) is called accessible in state M if there exists a positive external M -alternating path leading to v. It is easy to see, cf.[3], that v is accessible in some state M iff v is accessible in all states of G. As a consequence, the following simple characterization of impervious edges and vertices is obtained. Proposition 2.4. [4] For every edge e ∈ E(G), e is viable iff at least one endpoint of e is accessible. Vertex v ∈ V (G) is viable iff there exists an external alternating path (positive or negative) leading to v in any state of G, or, equivalently, if v is the endpoint of at least one viable edge in E(G). 3 Elementary decomposition of soliton graphs In this section we review the main results obtained in [3] on the structure of soliton graphs. We begin with a short summary of the matching theoretic concepts involved. The reader can obtain a good understanding of these concepts by following the definitions to come on Figure 4 below. Let us fix a graph G (open or closed) having a perfect internal matching for the entire section. An edge e ∈ E(G) is called allowed (mandatory) if e is contained in some (re- spectively, all) state(s) of G. Forbidden edges are those that are not allowed. We shall also use the term constant edge to identify an edge that is either forbidden or mandatory. One 126 Ars Math. Contemp. 2 (2009) 121–136 of the earliest results on soliton automata [1] is the simple fact that, for every state M of G, every other state M ′ can be obtained from M by switching on a number of pairwise disjoint M -alternating units. By this statement it is obvious that an edge e ∈ E(G) is not constant iff there exists an alternating unit passing through e in every state of G. We shall use this observation in the proof of Proposition 5.4 below. By the standard definition [19], a closed graph G is elementary if its allowed edges form a connected subgraph. This definition is extended to open graphs by requiring that the connected subgraph spanned by the allowed edges contain all the external vertices, orG itself be an isolated external vertex. Observe that if G is elementary, then it cannot contain a mandatory edge, unless G is a mandatory edge by itself with a number of loops incident with one of its endpoints. The concept of canonical equivalence[19] in elementary graphs is also well-known. To define this equivalence we choose the simplest wording here that is also meaningful for open graphs with perfect internal matchings. Consider the relation ∼ on the set of internal vertices of an elementary graph G for which v1 ∼ v2 iff the pair e = (v1, v2) becomes a forbidden edge in G+ e. It is known, cf. [19], [3], that ∼ is an equivalence relation, which determines the so called canonical partition of (the internal vertices of) G. The reader is referred to [19] for more information on canonical equivalence. In general, the subgraph of G determined by its allowed edges has several connected components, which are called the elementary components of G. Notice that an elementary component can be as small as a single external vertex of G. Such elementary compo- nents are called degenerate. Elementary components are classified as external or internal, depending on whether or not they contain an external vertex. A mandatory elementary component is a single mandatory edge e ∈ E(G), which, as a subgraph of G, will have a loop around one or both of its endpoints. An elementary component C is viable if all vertices in C are such. By Proposition 2.4, a non-viable elementary component can only contain impervious vertices, and is therefore called impervious itself. In Figure 4, every elementary component, with the exception of C7, is viable. Recall from [19] that an ear of an arbitrary graph G relative to a subgraph G′ is a path in G having both endpoints – but no interior vertices – in G′, or a cycle in G having exactly one vertex in G′. In the former case the ear is called open, while in the latter it is closed. All ears considered in this paper will be of odd lenght. If M is a state of G, the restriction of which to E(G′) defines a state ofG′, too, then anM -alternatingG′-ear α is an ear ofG relative to G′ which alternates on positive and negative edges with respect to M . Clearly, the edges on α incident with G′ must then be negative. In the same vein, an M -alternating negative G′-fork is a pair of edge-disjoint negative external M -alternating paths having their two endpoints – but no other vertices – in G′. An ear decomposition of G starting from a subgraph G′ is a representation of G in the formG = G′+α1+· · ·+αk, where α1 is an ear ofG′+α1 relative toG′ and αi is is an ear of G′+α1 + · · ·+αi relative to G′+α1 + · · ·+αi−1 for 2 ≤ i ≤ k. This decomposition is M -alternating if all ears αi are such with respect to some fixed state M of G. The connected components of the decomposition are those of the subgraph α1 + · · ·+ αk. For each non-degenerate elementary component C of G, let Ch denote the elementary graph obtained from C by adding the following edges. A pair (u, v) is added to E(C) if there exists anM -alternatingC-ear or negativeC-fork αwith respect to some stateM such that the internal endpoints of α are u and v. The newly added edges are called hidden, and graphG, enriched with all of its hidden edges, is denoted byGh. See Figure 4 for a number M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 127 C C C C C C 1 2 3 5 6 7 viable family viable families impervious edges viable family impervious family C 4 impervious edge hidden edges Figure 4: The structure of elementary components in a soliton graph of hidden edges. Observe that each hidden edge must connect two vertices belonging to the same canonical class of its elementary component, even when this component has already been endowed with all hidden edges. As a consequence [3], the elementary decomposition of Gh, with components Ch, is the same as that of G. Moreover, the soliton automata AG and AGh are isomorphic [17]. A viable internal elementary component C of G is called one-way if all external al- ternating paths (with respect to any state M ) enter C at vertices belonging to the same canonical class of Ch. This unique class, as well as the vertices in this class, are called principal. Also as part of this definition, every external elementary component is consid- ered a priori one-way (with no principal canonical class, of course). A viable elementary component is two-way if it is not one-way. In Figure 4, elementary components C1, C4, and C6 are one-way internal, with their principal vertices encircled. Let C and C ′ be two distinct elementary components of G. We say that C ′ is two-way accessible from C with respect to any (or all) state(s) M , in notation CρC ′, if there exists an M -alternating C-ear passing through C ′. It is required, though, that if C is one-way and internal, then the endpoints of this ear are not in the principal canonical class of Ch. As it was shown in [3], the two-way accessible relationship ρ is antisymmetric and state invariant. In Figure 4, C2 is two-way accessible from C1, C3 from C2, and C5 from C4. (But C3 is not two-way accessible from C1, and C2, C3, C4, C5 are not two-way accessible 128 Ars Math. Contemp. 2 (2009) 121–136 fromC6, even though there exists a negative closed ear originating from the principal vertex of C6 that covers all four of these components.) A family of elementary components in G is a block of the partition induced by the smallest equivalence relation containing ρ. A family F is called external if it contains an external elementary component, otherwise F is internal. Family F is viable if every ele- mentary component inF is such, and impervious if this is not the case. A degenerate family consists of a single isolated external vertex (elementary component), andG is degenerate if its single family is such. The graph of Figure 4 has five families, four of which are viable. The only external family is a stand-alone degenerate external elementary component. The first group of results obtained in [3] on the structure of elementary components of G can now be stated as follows. Theorem 3.1. Each viable family F of G contains a unique one-way elementary compo- nent, called the root of the family. Each vertex in every member of the family F , except for the principal vertices of the root, is accessible. The principal vertices themselves are in- accessible, but all other vertices can only be reached through them by external alternating paths. Corollary 3.2. Let e ∈ E(G) be incident with at least one viable vertex. Then e is imper- vious iff the viable endpoints of e are principal. Along the lines of Theorem 3.1, a mandatory family is a viable family having a manda- tory edge as its root. For two distinct viable families F1 and F2 of G, F2 is said to follow F1, in notation F1 7→ F2, if there exists an edge in G connecting a non-principal vertex in F1 with a principal vertex of the root of F2. The reflexive and transitive closure of 7→ is denoted by ∗7→. With a slight ambiguity, family F will be identified with the subgraph of G induced by the vertices belonging to ∪F . The second group of results found in [3] can now be summarized as follows. Theorem 3.3. The following four statements hold for the families of G. 1. Every viable family F has an M -alternating ear decomposition relative to its root R with respect to any state M . The connected components of this decomposition are state invariant, and for each such component P , the vertices common to P and R belong to the same non-principal canonical class of Rh. 2. For every edge e connecting a viable family F1 to any other family F2 (viable or not), at least one endpoint of e is principal in F1 or F2. If the endpoint of e in F1 is not principal, then F2 is viable and it follows F1. 3. The relation ∗7→ is a partial order between viable families, by which the external families are minimal elements. See again Figure 4, and identify the partial order ∗7→ of families. For convenience, the inverse of the partial order ∗7→ between G’s families will be denoted by ≤G, or simply ≤ if G is understood. Referring to the Hasse diagram of ≤, we say that family G is below family F if G ≤ F , that is, G follows F . 4 Deterministic soliton automata Deterministic soliton automata play a distinguished role in the design of soliton circuits. It is therefore crucial that we have a simple characterization of such automata, which also M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 129 allows efficient algorithms to decide the deterministic property. Partial results towards this goal have been obtained in [4] and [18]. In this section we present a complete character- ization of soliton automata, and provide a linear-time decision algorithm for two kinds of deterministic soliton automata: chestnuts and elementary graphs. Definition 4.1. [8] A connected graph G is a chestnut if it has a representation in the form G = γ + α1 + · · · + αk with k ≥ 1, where γ is a cycle of even length and each αi (1 ≤ i ≤ k) is a tree subject to the following conditions: (i) V (αi) ∩ V (αj) = ∅ for 1 ≤ i 6= j ≤ k; (ii) V (αi) ∩ V (γ) consists of a unique vertex – denoted by vi – for each 1 ≤ i ≤ k; (iii) vi and vj are at even distance on γ for any distinct 1 ≤ i, j ≤ k; (iv) any vertex wi ∈ V (αi) with d(wi) > 2 is at even distance from vi in αi for each 1 ≤ i ≤ k. See Figure 5 for an example chestnut. Figure 5: A chestnut Clearly, every chestnut is a bipartite graph. A baby chestnut [15] is a special instance of a chestnut graph by which γ is a pair of parallel edges, and each αi is a single edge or 2-length path incident with the same (principal) vertex of γ. Theorem 4.2. ([4], see also [8]) Let G be a connected deterministic soliton graph having no impervious edges. If G has a non-mandatory internal elementary component, then G is a chestnut. A redex r in an arbitrary graph G consists of two adjacent edges e = (u, z) and f = (z, v) such that u 6= v are both internal and d(z) = 2. The vertex z is called the center of r, while u and v are the two focal vertices of r. A secondary loop in G is one, the removal of which does not introduce a vertex with degree 1. Let r be a redex in G. Shrinking r means creating a new graph Gr from G by deleting the center of r and merging the two focal vertices of r into one “sink” vertex s. As it was shown in [4], the automata AG and AGr are isomorphic for every soliton graph G. ReducingG entails shrinking all of its redexes in a recursive manner, and, at the same time, deleting all secondary loops emerging in this process. Reduction yields a graph r(G), which is unique up to graph isomorphism. For an example, every chestnut graph reduces to a baby chestnut with no chance of deleting a secondary loop in this process, since chestnuts are bipartite. In general, ifG is a soliton graph, then the automataAr(G) andAG need not be isomor- phic. The reason is that the deletion of a secondary loop may also eliminate some existing soliton walks in G. See again Figure 2, where the automata AG and AG′ are not isomor- phic. It is still true, however, that Ar(G) is deterministic iff AG is such, unless r(G) is a baby chestnut, in which case G need not be deterministic. See [15] for details. Reduction is well-defined for all graphsG, and, as proved in [6], ifG has n vertices andm edges, then 130 Ars Math. Contemp. 2 (2009) 121–136 the graph r(G) can be constructed in O(m) time. See also [15] for a simpler O(n2) algo- rithm. The following characterization of viable deterministic soliton graphs was obtained in [4]. Using the terminology introduced there, a generalized tree is a connected graph not containing even-length cycles. Theorem 4.3. A connected viable soliton graph G is deterministic iff it satisfies one of the following two conditions. 1. G is a chestnut augmented by a number of impervious edges. 2. Each external component of G reduces to a generalized tree, and the subgraph of G induced by the union of its internal components has a unique perfect matching. Corollary 4.4. It is decidable inO(m) time if an arbitrary graph G with n vertices and m edges is a chestnut or an elementary deterministic soliton graph. Proof. Chestnuts allow a simple O(m) decision algorithm [15], even when they are aug- mented by any number of impervious edges connecting their principal vertices. As to elementary graphs, reduce G first, and see if r(G) is a generalized tree. By the depth-first algorithm given in [6], this can be done in O(m) time. Then reverse the shrinking proce- dure, and check if it preserves the elementary property. For more details of this procedure, see [6] or [15]. According to Theorems 3.3 and 4.3, every non-chestnut deterministic soliton automaton is isomorphic to an elementary one. The internal elementary components of the underlying graph G of such an automaton can be cut without affecting the behavior of the automaton. To preserve isomorphism, however, all hidden edges must be introduced in every external elementary component of G. Moreover, at each vertex v of every external elementary component such that v is adjacent to an internal family F , a loop must be introduced around v in order not to lose soliton walks that go down to F and return, leaving no change behind. Let G′ denote the resulting graph, which is the union of the augmented external ele- mentary components ofG. To effectively constructG′, one must first isolate and detach the internal elementary components of G, check their union for the unique perfect matching property, and produce that matching. At the same time, all hidden edges must be located and added to the remaining graph, together with the loops specified above. The graph obtained in this way is G′, which still must be tested for the deterministic elementary prop- erty using reduction. If the test is successful, then it will also produce a state for the graph G′, so that the whole graph G will have a state. Graph G′ may then be re-reduced with- out eliminating any loops in it, which will preserve isomorphism, yet simplify G′ to some extent. 5 The general algorithm In this section we present an algorithm to decide if an arbitrary graph G is a viable deter- ministic soliton graph. We start out by generalizing an old theorem of Kotzig [16], which will be the fundamental instrument in our algorithm. Through this generalization we also provide a new and elegant proof of the old result. Recall that a bridge (or cut edge) of a connected graph G is an edge e ∈ E(G), the deletion of which cuts G into two connected components G1 and G2. A bridge e is called odd if G1 and G2 are both closed, having an odd number of vertices, semi-odd if G1 is M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 131 odd closed and G2 is open, and open if both G1 and G2 are open. A bridge of an arbitrary graph G is one of a connected component of G. Theorem 5.1. Every non-chestnut viable deterministic soliton graphG having at least one internal family of elementary components contains a semi-odd bridge which belongs to every state M of G. Proof. Consider the partial order≤ ofG’s families by Theorem 3.3, and letF be a minimal element in this partial order. It is evident that the root of F , as a mandatory edge in G, is a semi-odd bridge. Intuitively, for every internal family F of G, only the mandatory root r of F can be a bridge. (See statement 1 in Theorem 3.3 regarding the ear decomposition of F .) This may happen only if every family below F is only reachable through F from above F in the Hasse diagram of ≤. Even this fact may not guarantee that r is a bridge, because G might have impervious edges coming from above F to below F . But if F is at the bottom of the diagram, then such impervious edges do not exist, so that F becomes isolated when removing its root. Corollary 5.2. (Kotzig [16], see also [19]) If a closed graph G has a unique perfect matching M , then G contains an odd bridge belonging to M . Proof. Add a single external edge e to any vertex ofG to obtain an open graphGe. Clearly, Ge is a deterministic soliton graph, which is not a chestnut. Moreover, M is a state of Ge by which e 6∈ M . Separate the viable part of Ge, and apply Theorem 5.1 to obtain a semi- odd bridge f . The edge f belongs to M , so that e 6= f . After deleting e, f becomes a bridge of G, which must be odd since G has an even number of vertices. Gabow et al. [11] gave the following algorithm to decide if a closed graph G has a unique perfect matching (UPM, for short), and specify that matchingM in case of a positive answer. Using the terminology of [11], a 2-edge component of graph G is a connected component of the graph remaining from G after the deletion of all bridges. Algorithm A Initialize M = ∅ and R to be the set of all bridges of G. While R 6= ∅ repeat the following steps: 1. Delete an edge (x, y) from R. 2. If (x, y) is an odd bridge, delete (x, y) from G, add (x, y) to M , and repeat the following steps for each edge (v, w) incident with x or y: a) Delete (v, w) from G, and from R if it is in R. b) If v and w are still connected but are in different 2-edge components (in the current graph), then: Find a path p(v, w) connecting v andw, and add every bridge on p(v, w) to R. c) Delete vertices x and y. Graph G has a unique perfect matching M iff the final graph becomes empty after running the above procedure. As it was shown in [11], Algorithm A runs in O(m log4 n) time. The correctness of Algorithm A is based on Kotzig’s theorem. The algorithm will re- move all odd bridges from G in a recursive manner. Every time such a bridge is deleted in Step 2, it must be the case that the resulting two connected components still have a UPM. 132 Ars Math. Contemp. 2 (2009) 121–136 The bridges in these components are located in Step 2b, and the efficiency of the whole algorithm hinges upon the speed of implementing this particular step. The O(m log4 n) complexity is achieved by using the dynamic 2-edge connectivity algorithm given in [14]. From our point of view this is only a technical issue, because the minor changes that we are going to apply have little impact on how the algorithm works under the modified circum- stances. It is an important point, however, that the order in which odd bridges are picked for deletion in Step 2 is irrelevant. The changes to Algorithm A are summarized as follows. (i) In the initialization, R is set to the collection of all non-open bridges. (Note that a non-open bridge need not be closed, e.g. it can be semi-odd.) (ii) In Step 1, bridge (x, y) is considered only if it is odd or semi-odd, and in Step 2b only non-open bridges are added to R. In the light of Theorem 5.1, if G is a non-chestnut viable deterministic soliton graph, then the modified algorithm will remove every internal family of G after checking each of these families for the UPM property. Conversely, if, after running the modified algorithm, the remainder of G consists of a number of open components with open bridges only (if any), then the closed subgraph induced by the deleted vertices has a UPM. We should not forget, however, to come back at the end and check for the viability of the deleted part. In Step 2 we can also easily mark those vertices in the remainder graph that are adjacent to the semi-odd bridges deleted last. These are potentially the vertices where a loop must be inserted at the end to preserve isomorphism of automata. Observe that the remainder graph at this point is still a collection of external families, not components. There is an easy way to implement the changes (i) and (ii) above without actually modifying Algorithm A. The trick is to augment G by the following vertices and edges. – Introduce a new vertex c, and connect c to each external vertex in G. – If |V (G)| is even, then add another vertex c′ and connect it to c. In this way we have added a new star graph S with center c to G. The edges of S are called auxiliary. Let G∗ denote the resulting graph, which is turned closed by adding a loop around each vertex with degree 1. The role of the star S is to keep the external components ofG together, while killing all open bridges. Running Algorithm A onG∗ will preserve this status quo, provided that the auxiliary edges are not considered for deletion in Step 1. Notice that an originally open bridge e might become semi-odd during the modified algorithm, but in this case e will become odd when Algorithm A is run on G∗. See Figure 6 for a trivial example. In general, a non-auxiliary edge (x, y) is deleted in Step 1 of Algorithm A run on G∗ iff (x, y) is deleted in the same step of the modified algorithm when run on G. At the same time, of course, all edges incident with x or y are also removed by both versions of the algorithm, regardless of these edges being auxiliary or not. e c Figure 6: An open bridge e becoming close M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 133 After Algorithm A has successfully removed all potential external families of G, it will pause for a while before it resumes on a modified version of the remainder graph G′. The pause will last O(m) time, during which a simplification of G′ takes place. We now de- scribe the actions to be taken during the pause. Remember that we still have entire external families to cope with, which families might contain internal elementary components that we must get rid of after checking them for being mandatory. Recall from [19] that a factor-critical graph is a closed graph G for which G− v has a perfect matching for each v ∈ V (G). Such a matching is called a near-perfect matching. Let G be a connected non-degenerate viable deterministic soliton graph different from a chestnut. By pruning G we mean deleting all of its external vertices and edges. Let pr(G) denote the resulting closed graph. Furthermore, for each external e ∈ E(G), let pre(G) be the open graph obtained from pr(G) by putting back the edge e only. Adding a loop around the external endpoint of e then results in a closed graph, denoted cpre(G). Theorem 5.3. If pr(G) does not contain bridges, then it is factor-critical. Moreover, for every external edge e, cpre(G) has a unique perfect matching. Proof. It is sufficient to prove the first statement only. Indeed, if pr(G) is factor-critical, then cpre(G) does have a perfect matching, but the existence of an alternating cycle in cpre(G) would contradict the assumption that G does not contain such a cycle. By Theorems 3.3 and 5.1, G must be a single external family of elementary compo- nents. If this family is mandatory, then the statement follows from Theorem 3.3/1. Indeed, by Theorem 3.1 every internal vertex v of G is accessible by a positive alternating path p starting from the single external vertex of G. Switching along p then gives rise to a near- perfect matching of pr(G) that misses v. If G is not a mandatory family, then introduce all hidden edges in the external elementary component C of G to obtain an elementary deterministic soliton graph Ch. By construction, Ch does not contain bridges. Using The- orem 4.3, reduce Ch to a generalized tree T . Since reduction does not introduce bridges, pr(T ) is a collection of pairwise edge-disjoint odd-length cycles. It follows immediately that pr(T ) is factor-critical. Consequently, for every external e ∈ E(G), G has a state Me for which e is the unique external edge present in Me. (Remember that reduction preserves, while inverse reduction creates perfect internal matchings without adding any external edges to them.) Thus, again by Theorems 3.3 and 5.1, pre(G) is a determinis- tic soliton graph consisting of a single mandatory family. The result now follows from Theorem 3.3/1 as above. Now let G be an arbitrary soliton graph consisting of a single external family, and as- sume thatG does contain internal bridges. By Theorem 3.3, all these bridges must be open. Let c ∈ E(G) be an internal bridge. By cutting c we mean deleting c from G first, then putting it back separately in both of the resulting two connected components as an external edge. Let G1 and G2 denote the two connected graphs obtained. Clearly, G1 and G2 are still soliton graphs consisting of a single external family, and G is deterministic iff both G1 and G2 are such. Cutting all internal bridges of G will then result in a number of soliton graphs G1, . . . , Gk, each of which is a single external family, so that G is deterministic iff all of G1, . . . , Gk are such. Combining the above argument with Theorem 5.3, we decide if a connected open graph G containing open internal bridges only is a deterministic soliton graph as follows. Algorithm B 134 Ars Math. Contemp. 2 (2009) 121–136 Step 1. Cut all open internal bridges in G to obtain the open graphs G1, . . . , Gk as de- scribed above. Step 2. In each Gi still containing internal edges, keep one external edge ei arbitrarily and delete the rest to obtain a graph Ḡi. Step 3. For each 1 ≤ i ≤ k, check if Ḡi has a unique perfect matching, and if so, find that matching Mi by applying Algorithm A. Step 4. Taking Mi as a state of Gi, find those vertices Xi ⊆ V (Gi) that lie on some cross- ing with respect to Mi. At the same time, locate all open M -alternating ears attached to Gi[Xi] and augment this graph by edges connecting the two endpoints of such ears. Step 5. Check if Gi[Xi], augmented by the hidden edges, reduces to a generalized tree. Graph G is a deterministic soliton graph iff a positive answer is obtained for each i. Proposition 5.4. Algorithm B correctly decides in O(m log4 n) time if a connected open graph G containing open internal bridges only is a deterministic soliton graph. Proof. IfG is deterministic, then by Theorem 5.1 it must be a single external family, so that the argument after Theorem 5.3 applies. Conversely, assume that Mi is a UPM of each Ḡi, and Gi[Xi] reduces to a generalized tree. As we have seen in the proof of Theorem 5.3, Ḡi is a single external family. But then Gi, too, is a single external family, since every internal family of Gi would be one of Ḡi as well. (See again Theorems 3.1 and 3.3.) Let Ci denote the external elementary component ofGi, and observe that each allowed edge f in Ci must lie on some crossing of Ci with respect to to the restriction M ′i of Mi to Ci. Indeed, since f is not constant, there exists an M ′i -alternating unit passing through f . This unit cannot be an alternating cycle, because Mi is a UPM of Ḡi. Consequently, V (Ci) = Xi, so that Gi is deterministic by Theorem 4.3. Thus, G is deterministic. Regarding the complexity of Algorithm B, it is clear that Steps 1 and 2 require O(m) time. We also know that the cost of Step 3 isO(m log4 n). In the presence of the matchings Mi Step 4, too, can be implemented in O(m) time using an alternating depth-first tree rooted at the distinguished external vertex of Gi. Finally, Step 5 also requires O(m) time only, according to Corollary 4.4. Thus, the overall time complexity of the algorithm is O(m log4 n). We are now ready to combine Algorithms A and B, and state our main result. Theorem 5.5. It is decidable in O(m log4 n) time if an arbitrary graph G with n vertices and m edges is a viable deterministic soliton graph. Proof. Without loss of generality we can assume that G is open, connected, non-dege- nerate, and is not a chestnut. We first apply Algorithm A on G∗ to recursively identify all odd and semi-odd bridges f , and decide if f , together with either of the odd internal components incident with it, is indeed a graph having a UPM with f belonging to that matching. When this algorithm stops, we remove the star graph S (or whatever is left of it) from the remainder graph and pause. During the pause we apply Steps 1 and 2 of Algorithm B to each connected component of the current graph. We also “freeze” (i.e., memorize) the graphs Gi obtained in Step 1. The pause lasts O(m) time, during which the graph has simplified. Then we add the edges ei specified in Step 2 to the set R of bridges, and continue running the main loop of Algorithm A all the way until the graph becomes empty or an M. Bartha and M. Krész: Deciding the deterministic property for soliton graphs 135 error occurs. Since the size of the graph after the pause is not greater than that of the original G, the overall complexity of Algorithm A is still O(m log4 n). At the end, with the matchings Mi at our disposal, we perform the required checks described in Steps 4 and 5 of Algorithm B on the graphs Gi. At this point we have succesfully decided if G is a deterministic soliton graph with all of its internal elementary components being mandatory. In case of a positive answer, we have also found a state M for G. Moreover, we have isolated the external elementary components, added all of the hidden edges, and marked the vertices in these components where a loop must be inserted to preserve isomorphism of automata. In total, we have used O(m log4 n) time only, but we still need to check if G is viable. Knowing state M , this check can easily be performed in O(m) time by the algorithm given in [2] to find the accessible vertices and isolate the families of an arbitrary soliton graph. The proof of Theorem 5.5 is now complete. 6 Conclusions We have presented anO(m log4 n)-time algorithm to decide if a graph with n vertices and m edges is a viable soliton graph defining a deterministic automaton. The basis of our algorithm was the characterization of deterministic soliton graphs and automata given in Theorem 4.3. This characterization shows that deterministic soliton graphs different from a chestnut are essentially the open counterparts of graphs having a unique perfect matching. We have considered the currently fastest UPM algorithm, and modified it to accommodate deterministic soliton graphs. The resulting algorithm is capable of constructing a state for the automaton found, and can also be used to simplify the automaton to an isomorphic elementary one. We have also given yet more efficientO(m)-time algorithms for chestnuts and elementary soliton graphs. References [1] M. Bartha and E. Gombás, On graphs with perfect internal matchings, Acta Cybernetica 12 (1995), 111–124. [2] M. Bartha and M. Krész, Isolating the families of soliton graphs, Pure Mathematics and Appli- cations 13 (2002), 49–62. [3] M. Bartha and M. Krész, Structuring the elementary components of graphs having a perfect internal matching, Theoretical Computer Science 299 (2003), 179–210. [4] M. Bartha and M. Krész, Deterministic soliton graphs, Informatica 30 (2006), 281–288. [5] M. Bartha and M. Krész, Flexible matchings, Lecture Notes in Computer Science 4271 (2006), 313–324. [6] M. Bartha and M. Krész, A depth-first algorithm to reduce graphs in linear time, submitted for publication. [7] F. L. Carter, Comformational switching at the molecular level, in: F. L. Carter (ed.), Molecular Electronic Devices, Marcel Dekker, Inc., New York, 1982, 51–72. [8] J. Dassow and H. Jürgensen, Soliton automata, J. Comput. System Sci. 40 (1990), 154–181. [9] J. Dassow and H. Jürgensen, Soliton automata with a single exterior node, Theoretical Com- puter Science 84 (1991), 281–292. [10] J. Dassow and H. Jürgensen, Soliton automata with at most one cycle, Journal of Computer and System Sciences 46 (1993), 155–197. 136 Ars Math. Contemp. 2 (2009) 121–136 [11] H. N. Gabow, H. Kaplan and R. E. Tarjan, Unique maximum matching algorithms, Journal of Algorithms 40 (2001), 159–183. [12] M. C. Golumbic, T. Hirst and M. Lewenstein, Uniquely restricted matchings, Algorithmica 31 (2001), 139–154. [13] M. P. Groves, C. F. Carvalho, and R. H. Prager, Switching the polyacetylene soliton, Materials Science and Engineering C3 (1995), 181–185. [14] J. Holm, K. de Lichtemberg, and M. Thorup, Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity, in: Proc. 30th Annual ACM Symposium on Theory of Computing, 1998, 79–89. [15] M. M. Hossain, Some Graph Algorithms of Molecular Switching Devices. Master’s thesis, Memorial University of Newfoundland, Canada, 2007. [16] A. Kotzig, On the theory of finite graphs with a linear factor I, Mat.-Fyz. Casopis Slovensk. Akad. Vied 9 (1959), 73–91. [17] M. Krész, Soliton Automata: A Computational Model on the Principle of Graph Matchings, PhD thesis, University of Szeged, Hungary, 2004. [18] M. Krész, Simulation of soliton circuits, Lecture Notes in Computer Science 3845 (2005), 347– 348. [19] L. Lovász and M. D. Plummer, Matching Theory, North Holland, Amsterdam, 1986. [20] J. E. Tabaska, R. B. Cary, H. N. Gabow, and G. D. Stormo, An RNA folding method capable of identifying pseudoknots and base triples, Bioinformatics 14 (1998), 691–699.