Introduction to Persistent Homology
Žiga Virk
2022
Univerza v Ljubljani, Fakulteta za računalništvo in informatiko

Cataloguing-in-publication (CIP) record prepared by the National and University Library in Ljubljana (Narodna in univerzitetna knjižnica)
COBISS.SI-ID=99057667
ISBN 978-961-7059-10-6 (PDF)
This digital publication is freely available: http://zalozba.fri.uni-lj.si/virk2022.pdf
DOI: 10.51939/0002
Reviewers: prof. dr. Nežka Mramor Kosta, doc. dr. Boštjan Gabrovšek
Publisher: Založba UL FRI, Ljubljana
Issued by: UL Fakulteta za računalništvo in informatiko, Ljubljana
Editor: prof. dr. Franc Solina
Copyright © 2022 Založba UL FRI. All rights reserved.

Preface

In the past few decades persistent homology has emerged as one of the focal points of modern topology. While its beginnings were motivated by practical demands in computational geometry, the scope of persistence soon expanded. Results focused on topological and algebraic fundamentals have been complemented by algorithmic and later statistical developments, leading to successful applications in the sciences. Today, persistent homology is a wide-ranging topic engaging a diverse community of mathematicians, computer scientists, data analysts and scientists in general.

This textbook does not aim to encompass the endless variety of topics related to persistence. The main goal is to provide students with a manageable, geometrically intuitive and self-contained introduction to persistent homology. The text arose from the lecture notes for a course that has been taught at the University of Ljubljana for more than a decade. It is intended for a mixed audience of mathematics and computer science students at the master's level. However, any motivated student with a scientific or technical inclination should (hopefully) find it accessible. As a prerequisite, only basic linear algebra (including Gaussian elimination) and basic Euclidean geometry are assumed.
The textbook is structured so that it gradually introduces fundamental structures. Simplicial complexes and the Euler characteristic are first defined in the plane, before generalizing them to the abstract setting. Orientation is first introduced on surfaces, before utilizing it for homology computations. The required algebra is summarized in Chapter 6 to ensure a familiar algebraic footing. Side notes throughout the text are intended to clarify certain ideas, provide some explanation, or convey a geometric idea without interrupting the flow of the main text. At the end of each chapter, the reader will find a short comment on the background, along with a few references, keywords, and appendices with further topics or additional material.

Contents

1 Metric spaces
  1.1 Definition of metric spaces and basic examples
  1.2 Maps and equivalence types
    Connectedness
2 Planar triangulations
  2.1 Definition of planar triangulations
    Modifications of triangulations
  2.2 Recap on convexity
  2.3 Euler characteristic
  2.4 Constructing planar triangulations with line sweep
  2.5 Voronoi diagram and Delaunay triangulation
    Local Delaunay condition
    Construction of D(S)
  2.6 Concluding remarks
    Appendix: Proof of Proposition 2.5.6
3 Simplicial complexes
  3.1 Affine independence
  3.2 Geometric simplicial complex
  3.3 Abstract simplicial complex
    Two invariants
  3.4 Simplicial maps
    Elementary collapses
  3.5 Concluding remarks
    Appendix: Proof of Theorem 3.3.4
4 Surfaces
  4.1 Surfaces as manifolds
    Combinatorial manifolds
  4.2 Orientability
  4.3 Connected sum of surfaces
  4.4 Classification of surfaces
    General surfaces
  4.5 Concluding remarks
    Appendix: imagining $S^3$
5 Constructions of simplicial complexes
  5.1 Rips complexes
  5.2 Čech complexes
  5.3 Nerve complexes
    Alpha complexes
    Mapper
  5.4 Interleaving properties
    Rips-Čech correlation
  5.5 Concluding remarks
    Appendix: the MiniBall algorithm
    Appendix: a sketch of a proof of the nerve theorem 5.3.2
    Appendix: Dowker duality
6 Fields and vector spaces
  6.1 Fields
    The fields of remainders $\mathbb{Z}_p$
  6.2 Vector spaces
  6.3 Concluding remarks
    Appendix: A very short introduction to Abelian groups
7 Homology: definition and computation
  7.1 Definition
    Chains
    Boundary
    Homology
    Zero-dimensional homology
    Homology of a graph
  7.2 Computing homology
    Echelon forms
    Smith normal form and representatives
    Incremental expansion and elementary collapse
  7.3 Examples of homology
    Disjoint unions
    Euler characteristic
    Spheres
    Surfaces
    Impact of coefficients: the Klein bottle
    Alexander duality
  7.4 Concluding remarks
    Appendix: Homology with coefficients in Abelian groups
    Appendix: cubical homology
8 Homology: impact and computation by parts
  8.1 Impact
    Functoriality of homology
    Brouwer fixed point
    Hairy ball
    Invariance of domain
  8.2 Homology by parts
    Exact sequences
    Mayer-Vietoris exact sequence
  8.3 Concluding remarks
    Appendix: zig-zag lemma
    Appendix: Relative homology
9 Persistent homology: definition and computation
  9.1 Definition
    Formal definition
  9.2 Visualization
    Barcodes
    Persistence diagrams
    The fundamental lemma of persistent homology
  9.3 Computation
    Matrix reduction
    Extracting persistence
    Representatives
    Example
    Computational tricks
  9.4 Concluding remarks
    Appendix: zig-zag persistence and multi-parameter persistence
10 Persistent homology: stability theorem
  10.1 Continuous filtrations
    Interleaving distance for filtrations
  10.2 Persistence modules
    Persistence modules
    Decomposition
    Interleaving distance for persistence modules
  10.3 Bottleneck distance and stability theorem
    Bottleneck distance
    Stability theorem
  10.4 Interpretations and examples
    1-dimensional persistence of geodesic spaces
    Stability demonstrated
    Spheres
    De-noising a function
  10.5 Concluding remarks
    Appendix: From the interleaving distance to the bottleneck distance
11 Discrete Morse theory
  11.1 Motivation
  11.2 Discrete Morse functions and discrete vector fields
    Gradient vector fields
  11.3 Morse homology
    Morse chain complex
    Morse homology
    Generating DMFs and gradient vector fields
  11.4 Concluding remarks
    A proof of Theorem 11.2.6
Index
Bibliography

1 Metric spaces

Topology and geometry study the shapes of spaces. In this book we will look at the modelling, computation, and representation of shapes and their properties. Our starting point will be metric spaces. These are sets with a meaningful notion of a distance (metric). In this chapter, we will focus on an intuitive understanding of three equivalence types of metric spaces: the isometry type, the homeomorphism type, and the homotopy type of spaces. These types will play a crucial role in later sections.

1.1 Definition of metric spaces and basic examples

Definition 1.1.1. A metric space $(X, d)$ is a pair consisting of a set $X$ and a function $d : X \times X \to [0, \infty)$, such that for any $x, y, z \in X$ the following hold:
• $d(x, y) = 0$ iff $x = y$,
• symmetry: $d(x, y) = d(y, x)$, and
• triangle inequality: $d(x, z) \le d(x, y) + d(y, z)$.

Side note: The word "iff" stands for "if and only if".

The function $d$ is referred to as a distance or a metric. If $X$ is introduced as a metric space, we implicitly assume $d$ or $d_X$ is the metric on $X$, unless stated otherwise.

Example 1.1.2. The following are some of the metrics $d_*$ on $\mathbb{R}^n$. For $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ we define:
• $d_1(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
• $d_2(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
• $d_p(x, y) = \sqrt[p]{\sum_{i=1}^{n} |x_i - y_i|^p}$ for $p > 1$
• $d_\infty(x, y) = \max_{i \in \{1, 2, \ldots, n\}} |x_i - y_i|$

From now on, $\mathbb{R}^n$ is always considered to be equipped with the Euclidean metric $d_2$, unless stated otherwise.

Example 1.1.3.
The underlying space can be different from $\mathbb{R}^n$. Here are some examples:
• Suppose $X$ is a finite graph with a length associated with each edge. The geodesic distance $d_g$ between two vertices in $X$ is the length of the shortest path between these vertices in $X$.
• Suppose $X$ is a surface. We can think of it as a sphere or the surface of the Earth. As above, the geodesic distance $d_g$ between two points on $X$ is the length of the shortest path between these points on $X$. For example, consider the distance between London and Sydney (see Figure 1.2). The distance usually thought of in this case is the geodesic distance on Earth, that is, the length of the shortest path between the two cities. The actual Euclidean distance in space between the two cities is shorter, but usually not of interest, since the path that realizes it passes fairly close to the center of the Earth.
• Suppose $A$ is a finite set which we call an alphabet. Let $X$ denote the set of finite sequences (words) consisting of the elements of $A$ (letters). The Levenshtein distance between two words in $X$ is defined as the minimum number of edits required to transform one word into another, where the allowed edits are:
  – an insertion of a letter at any position;
  – a deletion of a letter anywhere;
  – a substitution of a letter in any place by another letter.
  See Figure 1.3 for an example.
• Let $X$ be a finite set and let $2^X$ be the collection of all subsets of $X$. The Jaccard distance on $2^X$ is defined as
  $d_J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}$.

Figure 1.1: A few distances for $x = (1, 2)$ and $y = (3, 5)$: $d_1(x, y) = 5$, $d_2(x, y) = \sqrt{13}$, $d_\infty(x, y) = 3$.

Figure 1.2: Geodesic vs $d_2$ distance between London and Sydney.

Figure 1.3: The Levenshtein distance between DOG and WOLF is 3 by the following argument. The sequence DOG → DOL → DOLF → WOLF demonstrates that the distance is at most 3. As WOLF has three letters that do not appear in DOG, the distance is at least 3.

For a metric space $(X, d)$, $x \in X$ and $r > 0$ we define the closed[1] $r$-ball around $x$ as
$B_d(x, r) = \{ y \in X \mid d(x, y) \le r \}$.
When the metric is apparent from the context we omit it and use $B(x, r)$.

[1] As we will only consider closed balls, the phrase will be simplified to just "balls".

Figure 1.4: Balls in the $d_1$, $d_2$ and $d_\infty$ metrics in the plane.

1.2 Maps and equivalence types

When transforming or mapping spaces, we will always be using continuous maps.

Definition 1.2.1. A map $f : X \to Y$ between metric spaces is continuous if for each $x \in X$ and for each $\varepsilon > 0$ there exists $\delta > 0$ so that the following holds for all $y \in X$:
$d_X(x, y) < \delta \implies d_Y(f(x), f(y)) < \varepsilon$.

The notion of continuity between metric spaces includes the classical continuity from calculus, i.e., all continuous elementary functions $\mathbb{R} \to \mathbb{R}$ are continuous in the sense of Definition 1.2.1 on $(\mathbb{R}, d_1)$.

An equivalent definition of continuity can be stated in terms of convergent sequences. A sequence of points $\{z_i\}_{i \in \mathbb{N}}$ in $Z$ converges to $w \in Z$ (notation $\lim_{i \to \infty} z_i = w$) iff $d_Z(z_i, w)$ converges to zero. It turns out that a map $f : X \to Y$ between metric spaces is continuous if and only if the following implication holds: if $\{x_i\}_{i \in \mathbb{N}}$ is any sequence in $X$ with $\lim_{i \to \infty} x_i = u \in X$, then $\lim_{i \to \infty} f(x_i) = f(u) \in Y$.

Figure 1.5: A graph (above) and a ball around $x$ in the geodesic metric on the graph (below).

A practical interpretation of continuity is the following: if we improve our measurements $x_i$ in the sense that we get a better approximation of the desired state $w$, then the values $f(x_i)$ of a continuous map also converge to the value $f(w)$. For example, suppose we want to estimate the area of Madagascar from a .bmp image representing a map of the island. We expect that as the resolution increases, we should get a better estimate of the total area.

A continuous map $\gamma : [0, 1] \to X$ is called a path from $\gamma(0)$ to $\gamma(1)$.
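The distances of Examples 1.1.2 and 1.1.3 are easy to evaluate directly. The following is a minimal Python sketch, not part of the original text: the function names are illustrative, the Levenshtein distance uses the standard dynamic programming recurrence, and returning 0 for the Jaccard distance of two empty sets is a convention chosen here, since the formula is undefined in that case.

```python
import math


def d1(x, y):
    """Metric d_1: sum of coordinate-wise absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def d2(x, y):
    """Euclidean metric d_2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def d_inf(x, y):
    """Metric d_infinity: largest coordinate-wise absolute difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def levenshtein(u, v):
    """Minimum number of insertions, deletions and substitutions turning u into v."""
    # prev[j] holds the distance between the current prefix of u and v[:j].
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete the letter a
                curr[j - 1] + 1,         # insert the letter b
                prev[j - 1] + (a != b),  # substitute a by b (free if equal)
            ))
        prev = curr
    return prev[-1]


def jaccard(a, b):
    """Jaccard distance between two finite sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty sets are at distance 0
    return (len(union) - len(a & b)) / len(union)
```

With $x = (1, 2)$ and $y = (3, 5)$ these functions reproduce the values of Figure 1.1, and `levenshtein("DOG", "WOLF")` agrees with the argument of Figure 1.3.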
Next, we give three different equivalence relations on the class of metric spaces, each of which preserves a different level of geometric information. We start with the strictest equivalence, which preserves the most structure.

Definition 1.2.2. A map $f : X \to Y$ between metric spaces is an isometry, if it is bijective and preserves distances, i.e., for every $x_1, x_2 \in X$,
$d_X(x_1, x_2) = d_Y(f(x_1), f(x_2))$.
Two metric spaces are isometric, if there exists an isometry between them.

Isometries of the plane are combinations of translations, rotations and reflections. In $\mathbb{R}^n$, isometries are combinations of a translation and a linear isometry. Linear isometries in $\mathbb{R}^n$ are represented by orthogonal matrices.

Figure 1.6: Four isometric planar sets.

It turns out that no patch of a sphere (equipped with the geodesic metric) is[2] isometric to a subset of a plane. A practical consequence of this fact is that all topographic maps are distorted.

[2] This is a consequence of Gauss' Theorema Egregium.

Although isometries are convenient in many situations, they are essentially a geometric notion that is too rigid for topological treatment. We next introduce a topological counterpart.

Definition 1.2.3. A map $f : X \to Y$ between metric spaces is a homeomorphism, if it is bijective, continuous, and $f^{-1}$ is continuous. Two metric spaces are homeomorphic (or of the same topological type; notation: $X \cong Y$), if there exists a homeomorphism between them.

It is not hard to see that every isometry is a homeomorphism. While homeomorphisms are much more flexible and preserve a number of invariants of a space (later we will mention dimension, number of components and holes, etc.), they do not preserve some of the geometric properties, e.g. the diameter (the supremum of pairwise distances in a space), the radii of the smallest enclosing balls, etc.

Figure 1.7: Four homeomorphic sets in the plane.

We will often be referring to the following two spaces:
• For $n \in \mathbb{N}$ an $n$-sphere $S^n$ is any space homeomorphic to the $n$-dimensional sphere
  $\{ x \in \mathbb{R}^{n+1} \mid d_2(x, 0) = 1 \} = \{ (x_1, \ldots, x_{n+1}) \in \mathbb{R}^{n+1} \mid \sum_{i=1}^{n+1} x_i^2 = 1 \}$,
  where $0 = (0, 0, \ldots, 0) \in \mathbb{R}^{n+1}$. Observe that $S^0$ consists of two points, $S^1$ is homeomorphic to a circle[3], $S^2$ to the usual sphere, etc.
• For $n \in \mathbb{N}$ an $n$-disc $D^n$ is any space homeomorphic to the ball
  $B(0, 1) = \{ x \in \mathbb{R}^n \mid d_2(x, 0) \le 1 \}$,
  where $0$ is the $n$-tuple of zeros[4]. Observe that $D^1$ is a closed interval, whose endpoints are $S^0$. Similarly, $D^2$ can be thought of as the unit disc in the plane. Note that its boundary in the plane is $S^1$.

Figure 1.8: Four homeomorphic sets in the plane.

[3] A circle is a 1-dimensional subset of $\mathbb{R}^2$ defined by $(x - a)^2 + (y - b)^2 = r^2$, i.e., it is "empty inside".

[4] A clarification on terminology: A ball (a metric concept) in a metric space is a specific subspace of that metric space. An $n$-disc (a topological concept) is any space homeomorphic to the standard unit ball in $\mathbb{R}^n$, and thus defined up to homeomorphism. A square in the plane is a 2-disc, but is not a ball in the Euclidean metric. Any ball of radius at least 1 on a circle of circumference 1 is the entire circle and so is not a 1-disc.

Example 1.2.4. Here we provide some examples of homeomorphisms.
• Two finite metric spaces are homeomorphic iff they consist of the same number of points. Each map between finite metric spaces is continuous.
• Any two closed intervals are homeomorphic. In particular, a homeomorphism $f : [0, 1] \to [a, b]$ for $a < b$ is given by $f(t) = a + t(b - a)$.
• A square $[-1, 1]^2$ in the plane is homeomorphic to the ball $B((0, 0), 1)$ in the plane. One of the homeomorphisms is given by the radial map $B((0, 0), 1) \to [-1, 1]^2$ mapping $(0, 0) \mapsto (0, 0)$ and $(x, y) \mapsto r(x, y) \cdot (x, y)$ for
  $r(x, y) = \frac{\sqrt{x^2 + y^2}}{\max\{|x|, |y|\}} = \frac{d_2((0, 0), (x, y))}{d_\infty((0, 0), (x, y))}$.
• All three balls in Figure 1.4 are homeomorphic via radial maps.
• For $n, m \in \mathbb{N}$, $S^n \cong S^m$ iff $n = m$.
  We will prove this result using homology in a later chapter.
• For $n, m \in \mathbb{N}$, $D^n \cong D^m$ iff $n = m$. We will prove this result in Theorem 8.1.7.
• No $n$-disc is homeomorphic to any $k$-sphere. Each $n$-sphere can be obtained as a union of two $n$-discs acting as hemispheres.

Figure 1.9: Two homeomorphic surfaces.

Figure 1.10: Two non-homeomorphic surfaces.

Figure 1.11: The surface of a cube with a puncture in each of the six sides is homeomorphic to a planar set with five holes.

While homeomorphism is the focal equivalence in the field of topology, it turns out that many computable invariants are in fact invariant with respect to a continuous deformation of spaces. These deformations are formalized by the concept of homotopy.

Definition 1.2.5. Continuous maps $f, g : X \to Y$ between metric spaces $X$ and $Y$ are homotopic [$f \simeq g$] if there exists a continuous deformation of $f$ into $g$, i.e., if there exists a continuous map $H : X \times [0, 1] \to Y$, such that $H|_{X \times \{0\}} = f$ and $H|_{X \times \{1\}} = g$. The map $H$ is called a homotopy.

Another way to think about a homotopy between $f$ and $g$ is as a continuous collection of paths from $f(x)$ to $g(x)$ in $Y$. Homotopies induce an equivalence relation on continuous maps between $X$ and $Y$. Two maps belong to the same homotopy class iff they are homotopic.

Example 1.2.6. Some examples concerning homotopies:
1. For each metric space $X$, any two maps $f, g : X \to \mathbb{R}^n$ are homotopic. A homotopy consists of line segments between $f(x)$ and $g(x)$. In particular,
   $H(x, t) = (1 - t) f(x) + t g(x)$.
2. Let $w \in S^1$. Then the identity map $id : S^1 \to S^1$ is not homotopic to the constant map $c_w : S^1 \to S^1$, which maps each point to $w$. Later we will be able to prove this fact using homology. Note that by the previous example both maps are homotopic in $\mathbb{R}^2$, hence the relation of being homotopic depends on the target space of the maps.

Figure 1.12: Two maps $f, g : S^1 \to \mathbb{R}^2$ in the plane are homotopic.

3. Consider the two spaces in Figure 1.13.
   Space $X$ is a single point, space $Y$ consists of a point, an empty triangle ($S^1$), a square ($D^2$) and a disc with a tail. Observe that there are four homotopy classes of maps from $X$ to $Y$, one for each component of $Y$.

Figure 1.13: There are four homotopy classes of maps from a single point space $X$ to $Y$.

We are now ready to introduce homotopy equivalence.

Definition 1.2.7. Metric spaces $X$ and $Y$ are homotopy equivalent [$X \simeq Y$] if there exist maps $f : X \to Y$ and $g : Y \to X$, such that $f \circ g \simeq id_Y$ and $g \circ f \simeq id_X$. Maps $f$ and $g$ are called homotopy equivalences.

Homeomorphic spaces are homotopy equivalent. A metric space $X$ is contractible, if it is homotopy equivalent to the one-point space.

Figure 1.14: Four contractible spaces.

Figure 1.15: Four spaces homotopy equivalent to $S^1$: the Möbius band (top left), the usual band $S^1 \times [0, 1]$ (top right) and two planar sets below. Only two of them are homeomorphic.

Example 1.2.8. Some examples concerning homotopy equivalences:
• Let $X = [0, 1]$ and $Y = \{0\}$. Then $X \simeq Y$, i.e., $[0, 1]$ is contractible. Map $f : X \to Y$ is the constant map and map $g : Y \to X$ can be chosen to be any map, say $g(0) = 0$. The composition $f \circ g$ is the identity. It remains to show that $h = g \circ f : [0, 1] \to [0, 1]$, which is the constant map at 0, is homotopic to the identity. Such a homotopy is, for example, the linear homotopy from 1. of Example 1.2.6. In the same way we can prove that $D^n$ is contractible for each $n \in \mathbb{N}$.
• Convex sets and trees are contractible.
• It turns out that no $S^n$ is contractible. The case $n = 1$ follows from 2. of Example 1.2.6.
• $\mathbb{R}^n \setminus \{(0, 0, \ldots, 0)\} \simeq S^{n-1}$. Thinking of $S^{n-1}$ as the standard unit sphere, this equivalence can be proved using the inclusion map $S^{n-1} \hookrightarrow \mathbb{R}^n \setminus \{(0, 0, \ldots, 0)\}$, the radial map (see Figure 1.18) $\mathbb{R}^n \setminus \{(0, 0, \ldots, 0)\} \to S^{n-1}$ defined by $x \mapsto x / \|x\|$, and the linear homotopy.
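The last item of Example 1.2.8 can be traced numerically. The sketch below is an illustration under stated assumptions rather than a proof: points are represented as tuples of floats, and the helper names are hypothetical. It implements the radial map $x \mapsto x / \|x\|$ and the linear homotopy of Example 1.2.6 between the identity and the radial map. Since $H(x, t) = \big((1 - t) + t / \|x\|\big) x$ is a positive multiple of $x$, the deformation never passes through the origin.

```python
import math


def norm(x):
    """Euclidean norm of a point given as a tuple of floats."""
    return math.sqrt(sum(c * c for c in x))


def radial(x):
    """Radial map from R^n minus the origin to the unit sphere."""
    r = norm(x)
    if r == 0:
        raise ValueError("the radial map is undefined at the origin")
    return tuple(c / r for c in x)


def identity(x):
    """Identity map on R^n minus the origin."""
    return tuple(x)


def linear_homotopy(f, g, x, t):
    """H(x, t) = (1 - t) f(x) + t g(x) for t in [0, 1]."""
    return tuple((1 - t) * a + t * b for a, b in zip(f(x), g(x)))


def deformation_track(x, steps=4):
    """Positions of x along the homotopy from the identity to the radial map."""
    return [linear_homotopy(identity, radial, x, k / steps) for k in range(steps + 1)]
```

For instance, `deformation_track((3.0, 4.0))` starts at the point itself and ends at the point of the unit circle on the same ray from the origin.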
Homotopy equivalence does not preserve all topological properties (for example, dimension), but it does preserve many of those that we can compute: the number of components, holes, etc.

Figure 1.16: Two more homotopy equivalent spaces.

Figure 1.17: A sequence of steps deforming O to P. While the figure demonstrates a continuous deformation (homotopy equivalence), the spaces presented in this case are actually homeomorphic.

Connectedness

The first homotopy invariant we will mention is connectedness. There are a few versions of it in topology. We will focus on the one generated by paths.

Definition 1.2.9. Space $X$ is path connected, if for each $x, y \in X$ there exists a path from $x$ to $y$ in $X$. Subset $A \subseteq Y$ of a metric space $Y$ is a path component, if it is a maximal path connected subset.

A space is path connected iff it is itself a path component. As was mentioned above, path connectedness is a homotopy invariant: if $X$ is path connected and $Y \simeq X$, then $Y$ is also path connected. Similarly, the number of path components of a metric space is a homotopy invariant. Space $Y$ in Figure 1.13 has four components.

Side note: From now on we will be dropping the adjective "path" and only refer to "connectedness" and "components".

Figure 1.18: Radial map of a punctured ball $B_2((0, 0), 1) \setminus \{(0, 0)\} \subset \mathbb{R}^2$ to the standard unit sphere (circle) $S^1$ in the plane. Using the argument used in the last part of Example 1.2.8, the induced homotopy equivalence demonstrates $B_2((0, 0), 1) \setminus \{(0, 0)\} \simeq S^1$.

1.2 Concluding remarks

Recap (highlights) of this chapter
• Metric spaces;
• Isometry;
• Homeomorphism;
• Homotopy equivalence;
• Connectedness.

Background and applications

Mathematics is the language of science and scientific concepts are modelled by mathematical objects.
These objects can range from simple to sophisticated: a simple Boolean value (0 or 1, i.e., TRUE or FALSE), a numeric value (integer, real, complex, etc.), a collection of numeric values (e.g., a point in $\mathbb{R}^n$), a collection of points in $\mathbb{R}^n$, a function, a vector space, a probability distribution, a graph, a matrix, a metric space, etc. For most of these objects there is a useful metric that turns the collection of possible outputs into a metric space and thus brings it into the realm of geometry and topology, some of which we have explored here. The notions introduced in this chapter are covered in standard books on topology[5].

[5] James R. Munkres. Topology. 2nd edition, Prentice Hall, 2000.

2 Planar triangulations

In the previous chapter we learned about metric spaces along with the homeomorphism and homotopy type. However, the descriptions we used are not of a combinatorial nature, and one would have difficulties using them for computations. In this chapter we will introduce one of the simplest combinatorial descriptions of planar spaces: triangulations in the plane. Essentially, we would like to describe a planar region as a "nice" union of triangles. Triangles are used primarily because they are easy to describe: we only have to provide three points. In later sections we will use these triangulations to compute various invariants of the space: components, homology, etc.

Figure 2.1: Triangle $xyz$, with the edge $xy$ and the vertex $z$ labelled.

It turns out that not every planar subset can be triangulated. However, finite triangulations (i.e., triangulations with finitely many triangles) can be obtained for most planar subsets of interest to us.

2.1 Definition of planar triangulations

A triangle in the plane has three edges and three vertices.

Definition 2.1.1. A triangulation of a closed region $D \subset \mathbb{R}^2$ is a decomposition of $D$ into triangles, so that:
1. no triangle is degenerate (i.e., a point or just a line segment),
2. interiors of triangles are disjoint, and
3.
intersection of any pair of triangles is either a common edge, a common vertex, or empty.

Figure 2.3 illustrates the conditions of Definition 2.1.1.

Figure 2.2: A planar triangulation.

Side note: A planar region admitting a triangulation is called a polygonal region.

The idea of a triangulation may be generalized in various ways. One could use different shapes of pieces to decompose a planar region or the entire plane. Such decompositions are called tessellations. Generalizing by dimension, one could use "higher dimensional triangles", such as tetrahedra, to decompose a higher dimensional space. This idea will be formalized as a simplicial complex in the next chapter.

Figure 2.3: Conditions of Definition 2.1.1 (admissible and inadmissible configurations).

Modifications of triangulations

Occasionally we will want to modify a triangulation. Here are some of the most used modifications:
• add a triangle;
• remove a triangle;
• flip a common edge;
• refine using a subdivision.

An example of a subdivision is the barycentric subdivision: for each edge and each triangle consider its geometric center (centroid) as a new vertex in our triangulation, and then decompose each triangle as demonstrated by Figure 2.4. This subdivision is convenient when we want to refine a triangulation, i.e., systematically decompose the triangles into smaller triangles.

Figure 2.4: Modifications of triangulations.

We will often focus on triangulations of convex polyhedra, i.e., convex hulls of finitely many points in the plane, as defined below. Given a finite $S \subset \mathbb{R}^2$ we say a triangulation on $S$ is any triangulation of the convex hull of $S$, whose vertex set is $S$.

2.2 Recap on convexity

Given points $x, y \in \mathbb{R}^n$, the line segment between them is parameterized as
$\gamma(t) = t x + (1 - t) y, \quad t \in [0, 1]$.
Note that $\gamma(0) = y$, $\gamma(1) = x$, and $\gamma(1/2)$ corresponds to the midpoint of the line segment.

Figure 2.5: Line segment with $x = \gamma(1)$, $y = \gamma(0)$ and the midpoint $\gamma(1/2)$.

Definition 2.2.1.
A subset $A \subset \mathbb{R}^n$ is convex, if for each $a, b \in A$ the entire line segment between $a$ and $b$ lies in $A$, i.e., if $\forall t \in [0, 1]$ we have $t a + (1 - t) b \in A$. See Figure 2.6. Given a subset $B \subset \mathbb{R}^n$, its convex hull $\mathrm{Conv}(B)$ is the smallest convex set containing $B$.

Figure 2.6: A convex (left) and a non-convex (right) subset of the plane.

The closed region in Figure 2.2 is not convex, while the ones in Figure 2.4 are convex.

A triangle is the convex hull of the set of its vertices, which provides a convenient description of a triangle: the triangle with affinely independent vertices $x, y, z \in \mathbb{R}^2$ can be parameterized by all possible convex combinations of these vertices:
$\{ \alpha_1 x + \alpha_2 y + \alpha_3 z \mid \forall i \in \{1, 2, 3\} : \alpha_i \in [0, 1], \ \sum_{i=1}^{3} \alpha_i = 1 \}$.
The term "convex combination" (as opposed to "linear combination") refers to the fact that the coefficients $\alpha_i$ are from $[0, 1]$ and add up to 1. These coefficients are called barycentric coordinates in a triangle. The point with $\alpha_1 = \alpha_2 = \alpha_3 = 1/3$ is the centroid of the triangle, while points with two barycentric coordinates[1] equal to $1/2$ are the midpoints of the corresponding edges; all these points are vertices in the barycentric subdivision shown in Figure 2.4.

[1] ...and the third coordinate equal to 0.

The convex hull can be constructed by iteratively adding all feasible line segments. It is important to note that for $B \subset \mathbb{R}^n$ the set obtained by adding all line segments once,
$\{ \alpha_1 x + \alpha_2 y \mid x, y \in B, \ \forall i : \alpha_i \in [0, 1], \ \sum_{i=1}^{2} \alpha_i = 1 \}$,
is typically not the convex hull. For example, starting with three vertices and adding the line segments between all three pairs we would obtain the set consisting of the edges but not the interior of the triangle (which constitutes the convex hull of three points). Instead, we have to keep adding line segments iteratively, or alternatively, add all convex combinations in one step:
$\mathrm{Conv}\, B = \bigcup_{m \in \mathbb{N}} \{ \sum_{i=1}^{m} \alpha_i x_i \mid \forall i : x_i \in B, \ \alpha_i \in [0, 1], \ \sum_{i=1}^{m} \alpha_i = 1 \}$.

Figure 2.7: A collection of points and its convex hull.

By the Carathéodory theorem (see the original reference[2] or any modern book treating convexity) we can bound the number of summands[3] by the dimension of the ambient space plus one:
$\mathrm{Conv}\, B = \{ \sum_{i=1}^{n+1} \alpha_i x_i \mid \forall i : x_i \in B, \ \alpha_i \in [0, 1], \ \sum_{i=1}^{n+1} \alpha_i = 1 \}$.
In particular: for a finite subset $F \subset \mathbb{R}^2$, each point of $\mathrm{Conv}(F)$ is contained in the convex hull of some triple of points from $F$.

[2] Constantin Carathéodory. Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen. Math. Ann. 64, no. 1, 95–115, 1907. doi: 10.1007/BF01449883

[3] ...and also the number of iterative steps in the procedure above...

2.3 Euler characteristic

Along with the number of components of a space, the Euler characteristic is one of the first real topological invariants we come across. In particular, while a finite subset $S \subset \mathbb{R}^2$ admits many triangulations of $\mathrm{Conv}(S)$, the Euler characteristic is the same for all of them.

For a given triangulation let:
• $V$ be the number of its vertices,
• $E$ be the number of its edges,
• $F$ be the number of its triangles.

Definition 2.3.1. The Euler characteristic $\chi$ of a triangulation is defined as $\chi = F - E + V$.

Figure 2.8: A few planar triangulations along with their Euler characteristics ($\chi = 1$, $\chi = 0$ and $\chi = 3$).

Theorem 2.3.2 (A simple version of the Euler-Poincaré formula). Assume $S \subset \mathbb{R}^2$ is finite. For each triangulation of $\mathrm{Conv}(S)$ we have $\chi = 1$.

Proof. Let us assume our triangulation has no vertical edge: if necessary this can be achieved by a small rotation. Assign to each triangle the value $+1$ and to each edge the value $-1$. Position each of these values towards the unique rightmost vertex of the corresponding triangle/edge as suggested by Figure 2.9. Assign to each vertex the value $+1$. The total sum of all assigned values is $\chi$. For each single vertex add: the value at the vertex and all the values of the triangles and edges that gathered at that vertex.
We can see that for each vertex the total sum is zero (arising from a sequence edge-triangle-edge-...-edge-triangle-edge on the left of the vertex plus the vertex itself: such a sequence with $k$ triangles has $k + 1$ edges and contributes $k - (k + 1) = -1$, which cancels the $+1$ of the vertex) except for the leftmost vertex, where the value equals one. Hence $\chi = 1$.

Remark 2.3.3. It turns out that $\chi = 1$ for each triangulation of a contractible set[4] in $\mathbb{R}^2$. In fact, for a triangulation of $D \subset \mathbb{R}^2$, $\chi$ equals the number of components minus the number of holes of $D$. More technical details on this fact will be provided in later sections. Let us just mention that the number of holes of a bounded set $D \subset \mathbb{R}^2$ equals the number of components of $\mathbb{R}^2 \setminus D$ minus one (see Figure 2.10).

[4] Even more: $\chi$ is a homotopy invariant.

Figure 2.9: The assignment of values on triangles (red $+$), edges (blue $-$) and vertices (green $+$) from the proof of Theorem 2.3.2. Vertices also hold an additional $+1$ value. The triangles are present but not shaded.

Figure 2.10: The top set has 2 holes. Equivalently, its complement on the bottom has $2 + 1$ components.

2.4 Constructing planar triangulations with line sweep

Let $S \subset \mathbb{R}^2$ be finite. Perhaps the simplest way to construct a triangulation on $S$ is using a line sweep, which we now describe. Assume no two points of $S$ have the same horizontal coordinate (this can be achieved by a small rotation if necessary). Now imagine a vertical line sweeping $\mathrm{Conv}(S)$ from left to right. Each time the line reaches a point of $S$ (a vertex in our triangulation), add all possible edges towards the left without creating intersections. Furthermore, for each new bounded region add the corresponding triangle. As the line sweeps $S$ we thus obtain a triangulation on $S$.

Figure 2.11: A line sweep using the vertical dashed line. Each time the vertical line reaches a point, we add all possible edges from that point to a point with smaller horizontal coordinate.

The condition that no two points have the same horizontal coordinate was added for reasons of simplicity only.
If several points, say $a_1, a_2, \ldots, a_k$, have the same horizontal coordinate then, instead of adding all edges for all points $a_i$ at once, proceed point by point: add all possible edges for $a_1$, then for $a_2$, etc. Depending on the order of the points $a_i$ we typically get different triangulations.

The line sweep also does not need to proceed from left to right: it can proceed along any direction by sweeping a line perpendicular to that direction.

While the line sweep is conceptually simple, it does tend to construct triangulations with very thin triangles, which may be undesirable in applications. The triangulation that avoids thin triangles as much as possible is the Delaunay triangulation.

Figure 2.12: A line sweep triangulation resulting in thin triangles.

2.5 Voronoi diagram and Delaunay triangulation

Throughout this section let $S \subset \mathbb{R}^2$ be a finite subset satisfying a general position property: no four points of $S$ lie on the same circle.

Figure 2.13: A circle on the left, and a ball on the right. The boundary of the ball is the circle.

We will first present the Voronoi diagram of $S$, which is a decomposition of the plane into specific regions. For each $s \in S$ define the Voronoi region of $s$:
$V_s = \{ x \in \mathbb{R}^2 \mid \forall u \in S \setminus \{s\} : d(x, s) \le d(x, u) \}$.

Figure 2.14: An example of a Voronoi decomposition, with a point $u$ and its region $V_u$ labelled.

If a pair of Voronoi regions $V_{s_1}, V_{s_2}$ has a non-empty intersection, then (due to the general position condition above) this intersection is a bounded or unbounded line segment called a Voronoi edge and lies on the bisector between $s_1$ and $s_2$.

If a triple of Voronoi regions $V_{s_1}, V_{s_2}, V_{s_3}$ has a non-empty intersection, then this intersection is a point called a Voronoi vertex. As this point lies on all three pairwise bisectors, it is the center of the circle containing $s_1$, $s_2$ and $s_3$.

Figure 2.15: A Voronoi vertex (marked ∗) is the center of the circle containing the corresponding points (marked •) of S.
Voronoi edges lie on the bisector lines between the corresponding points of $S$.

Definition 2.5.1. The Voronoi diagram (or decomposition) of $S$ is the collection of Voronoi regions, edges and vertices.

Side note: Voronoi regions and Voronoi diagrams can be defined for subsets of $\mathbb{R}^n$ in the same way. In this case the general position property is: no collection of $n + 2$ points from $S$ lies on the same sphere.

A Voronoi region $V_s$ consists of points whose closest point of $S$ is $s$. If for some point $w$ there are two such closest points in $S$, then $w$ is on the corresponding edge. If for some point $w$ there are three such closest points in $S$, then $w$ is a Voronoi vertex. The general position criterion above states that for each point in the plane, there can be no four closest points in $S$.

The Voronoi diagram can be thought of as the result of a uniform expansion from the points of $S$. Suppose that in the initial stage we start with a finite set of locations $S$. Then, as time goes by, each point of $S$ is being expanded into a region by growing at the same speed in all directions. At the beginning all these regions are balls centered at the points of $S$. As the growing regions collide, the growth towards the regions (edges) of contact stops, but continues along all other directions. The Voronoi decomposition is the final result of such growth, with each Voronoi region $V_s$ containing the points that were reached from $s$ first.

Figure 2.16: Voronoi diagram arising from expansion around points.

Definition 2.5.2. The Delaunay triangulation on $S$, denoted by $D(S)$, is the triangulation on $S$, such that:
• its vertices are all points of $S$,
• $xy$ is an edge iff $V_x \cap V_y \neq \emptyset$, and
• $xyz$ is a triangle iff $V_x \cap V_y \cap V_z \neq \emptyset$.

It turns out that Definition 2.5.2 indeed defines a planar triangulation on $S$. As a curiosity we mention that an edge $xy$ of a Delaunay triangulation may partially lie outside of the union $V_x \cup V_y$.
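On small examples the triangles of Definition 2.5.2 can be found by brute force. The Python sketch below is meant as an illustration only, not as a practical algorithm. It relies on the observation that a point of $V_x \cap V_y \cap V_z$ is equidistant from $x$, $y$ and $z$, hence it can only be the center of the circle containing these three points. The triple then forms a triangle iff no point of $S$ is closer to that center. The function names are our own, and the general position assumption of this section is taken for granted. Exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations


def circumcenter(a, b, c):
    """Center of the circle through a, b, c, or None if the points are collinear."""
    (ax, ay), (bx, by), (cx, cy) = [(Fraction(p[0]), Fraction(p[1])) for p in (a, b, c)]
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def delaunay_triangles(points):
    """Index triples (i, j, k) whose three Voronoi regions share a point."""
    result = []
    for i, j, k in combinations(range(len(points)), 3):
        q = circumcenter(points[i], points[j], points[k])
        if q is None:
            continue
        radius2 = squared_distance(q, points[i])
        # q lies in all three Voronoi regions iff no point of S is closer to q.
        if all(squared_distance(q, u) >= radius2 for u in points):
            result.append((i, j, k))
    return result


S = [(0, 0), (4, 0), (0, 4), (3, 3)]
print(delaunay_triangles(S))
```

For the four sample points the two triangles share the diagonal between $(0,0)$ and $(3,3)$. The number of candidate triples grows cubically with the size of $S$, so this check is only suitable for experiments.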
Note that the edge $xy$ of a Delaunay triangulation is a boundary edge (meaning it is contained in precisely one triangle) iff $V_x \cap V_y$ is unbounded. Similarly, $x$ is a boundary vertex of a Delaunay triangulation (meaning it is an endpoint of some boundary edge) iff $V_x$ is unbounded.

Figure 2.17: An example of a Delaunay triangulation with its underlying Voronoi decomposition.

Local Delaunay condition

For a triple of non-collinear points $x, y, z \in \mathbb{R}^2$ in the plane define $C(x, y, z)$ to be the circle containing $x, y, z$, and let $B(x, y, z)$ be the ball whose boundary is $C(x, y, z)$.

Figure 2.18: $C(x, y, z)$ on the left and $B(x, y, z)$ on the right.

Definition 2.5.3. Suppose an edge $xy$ is shared by two different triangles $xyz$ and $xyw$ of a triangulation. The edge $xy$ is locally Delaunay [abbreviation: LD], if $w \notin B(x, y, z)$.

Proposition 2.5.4. Suppose edge $xy$ is shared by two different triangles $xyz$ and $xyw$ from a triangulation.
1. Definition 2.5.3 is symmetric, i.e., $w \notin B(x, y, z)$ iff $z \notin B(x, y, w)$.
2. Each edge in a Delaunay triangulation is LD.

Proof. Part (1) is apparent from Figure 2.19.
(2): Since $xyz$ is a triangle in $D(S)$, there exists the corresponding Voronoi vertex $q = V_x \cap V_y \cap V_z$, which is the center of $C(x, y, z)$. As $q \notin V_w$ (recall that no four Voronoi regions have a non-empty intersection by the general position property), $d(q, x) = d(q, y) = d(q, z) < d(q, w)$ by the definition of Voronoi regions, hence $w \notin B(x, y, z)$.

Figure 2.19: Proof of Proposition 2.5.4 (1).

The property of being LD is a local property, shared by all edges of a Delaunay triangulation. It turns out that the converse of Proposition 2.5.4(2) is also true.

Theorem 2.5.5. Suppose $K$ is a triangulation on $S$. Then $K$ is the Delaunay triangulation iff each edge is locally Delaunay.

A proof can be found in a textbook [5].

[5] Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, second edition, 2000. doi: 10.1007/978-3-540-77974-2

Construction of D(S)

Theorem 2.5.5 motivates the edge-flipping construction of Delaunay triangulations: starting with any triangulation on $S$ (say, one obtained by a line sweep), keep flipping the non-LD edges. In order to algorithmically implement this construction we have to clarify two issues:
1. How do we verify the LD condition?
2. Does the procedure stop?

Figure 2.20: Edge flip.

We address 1. first. It turns out it is not hard to verify the LD condition using the incircle test.

Proposition 2.5.6. [Incircle test] Suppose $x = (x_1, x_2)$, $y = (y_1, y_2)$, $z = (z_1, z_2)$ and $w = (w_1, w_2)$ are four points in $\mathbb{R}^2$. Assume $x, y, z$ are not collinear and form a positively oriented triple, i.e.:

$$\begin{vmatrix} 1 & x_1 & x_2 \\ 1 & y_1 & y_2 \\ 1 & z_1 & z_2 \end{vmatrix} > 0.$$

Then $w \notin B(x, y, z)$ iff

$$\begin{vmatrix} 1 & x_1 & x_2 & x_1^2 + x_2^2 \\ 1 & y_1 & y_2 & y_1^2 + y_2^2 \\ 1 & z_1 & z_2 & z_1^2 + z_2^2 \\ 1 & w_1 & w_2 & w_1^2 + w_2^2 \end{vmatrix} > 0.$$

Figure 2.21: Positively oriented triple $(x, y, z)$.

A proof and technical details of Proposition 2.5.6 are provided in the Appendix. While Proposition 2.5.6 provides a convenient way to verify the LD property (and answers 1.), it does not suggest whether the edge flip algorithm actually stops (2.). In order to address this question we provide a couple more conditions equivalent to LD.

Suppose edge $xy$ is shared by two different triangles $xyz$ and $xyw$ from a triangulation $K$ on $S$. We say that edge $xy$ is a MaxMin edge, if the minimal angle appearing in triangles $xyz$ and $xyw$ is larger than the minimal angle appearing in triangles $xzw$ and $yzw$.

Proposition 2.5.7. Suppose edge $xy$ is shared by two different triangles $xyz$ and $xyw$ from a triangulation $K$ on $S$. Then the following conditions are equivalent:
(i) $xy$ is LD.
(ii) $xy$ is a MaxMin edge.
(iii) $\angle xzy + \angle xwy < \pi$.

Proof. Let us prove the equivalence (i) $\Leftrightarrow$ (iii) first using the inscribed angle theorem (see Figure 2.22).
$$xy \text{ is LD} \overset{\text{definition}}{\iff} w \notin B(x, y, z) \overset{\text{inscribed angle}}{\iff} \pi - \angle xzy > \angle xwy \iff \angle xzy + \angle xwy < \pi.$$

Figure 2.22: Inscribed angle theorem. Suppose $u, v \in C(a, b, c)$, as the figure demonstrates. Then $\angle acb = \angle aub = \pi - \angle avb$. This obviously implies $\angle acb > \angle apb$.

We now turn our attention to (i) $\Leftrightarrow$ (ii). Let $\alpha$ be the minimal angle appearing in triangles $xyz$, $xyw$, $xzw$ and $yzw$. It is easy to see that $\alpha$ has to lie either along $xy$ or along $zw$, as all the other angles get dissected (and hence decreased) by either $xy$ or $zw$ in one of the configurations.

Assume $xy$ is LD. According to the inscribed angle theorem, $\angle xyz > \angle xwz$, hence $\angle xyz$ does not equal $\alpha$. In the same way we can prove that no angle along $xy$ equals $\alpha$, hence $xy$ is a MaxMin edge.

Assume now that $xy$ is not LD. Using the identical argument as in the previous paragraph we can prove that each angle along $zw$ is larger than the corresponding angle along $xy$. Hence $\alpha$ has to lie along $xy$ and therefore $xy$ is not a MaxMin edge.

Figure 2.23: Proof of Proposition 2.5.7, (i) $\Leftrightarrow$ (ii).

Proposition 2.5.7 implies that each edge flip, which makes an edge in a triangulation LD, increases the minimal angles in the triangulation. Let us explain this statement in more detail. For each triangle $T_i$ in a triangulation $K$ on $S$ let $t_i$ denote the size of its minimal angle. Construct a lexicographically ordered list of these minimal angles, i.e., $t_{i_0} \leq t_{i_1} \leq \ldots \leq t_{i_m}$. Proposition 2.5.7 implies that every time we execute an edge flip making an edge in a triangulation LD, the new ordered list of the minimal angles $t'_{i_0} \leq t'_{i_1} \leq \ldots \leq t'_{i_m}$ is lexicographically larger than the previous list, i.e., at the first index $j$ at which the two lists differ we have $t_{i_j} < t'_{i_j}$. Hence by making the required edge flips that keep turning edges into LD edges we can't return to the initial or any already visited triangulation.
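The interplay between the incircle test of Proposition 2.5.6 and the ordered lists of minimal angles can be checked on a single quadrilateral. The following Python sketch is an illustration under our own naming (`det`, `outside_ball`, `is_locally_delaunay`, `angle_list`), with sample points chosen by us. It evaluates the determinants by cofactor expansion, swaps two points if the triple is not positively oriented, and compares the angle lists of the two possible triangulations of the quadrilateral. Python compares lists lexicographically, which is exactly the order used above.

```python
import math


def det(m):
    """Determinant of a small square matrix by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def orientation(x, y, z):
    """Positive iff (x, y, z) is a positively oriented triple."""
    return det([[1, x[0], x[1]], [1, y[0], y[1]], [1, z[0], z[1]]])


def outside_ball(x, y, z, w):
    """Incircle test: True iff w lies outside the ball B(x, y, z)."""
    if orientation(x, y, z) < 0:
        y, z = z, y                      # make the triple positively oriented
    rows = [[1, p[0], p[1], p[0] ** 2 + p[1] ** 2] for p in (x, y, z, w)]
    return det(rows) > 0


def is_locally_delaunay(x, y, z, w):
    """Edge xy shared by triangles xyz and xyw (Definition 2.5.3)."""
    return outside_ball(x, y, z, w)


def min_angle(triangle):
    """Smallest interior angle of a triangle, in radians."""
    a, b, c = triangle
    angles = []
    for p, q, r in ((a, b, c), (b, c, a), (c, a, b)):
        v = (q[0] - p[0], q[1] - p[1])
        u = (r[0] - p[0], r[1] - p[1])
        angles.append(math.atan2(abs(v[0] * u[1] - v[1] * u[0]), v[0] * u[0] + v[1] * u[1]))
    return min(angles)


def angle_list(triangles):
    """Ordered list of the minimal angles of the given triangles."""
    return sorted(min_angle(t) for t in triangles)


a, b, c, d = (0, 0), (4, 0), (3, 3), (0, 4)
before = [(a, b, d), (b, c, d)]   # diagonal bd, which is not LD
after = [(a, b, c), (a, c, d)]    # diagonal ac, obtained by flipping bd
print(is_locally_delaunay(b, d, a, c), is_locally_delaunay(a, c, b, d))
print(angle_list(before) < angle_list(after))
```

In this example the flip from the diagonal $bd$ to the diagonal $ac$ turns a non-LD edge into an LD edge, and the ordered list of minimal angles becomes lexicographically larger.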
Since there are only finitely many triangulations on $S$, and therefore finitely many possible ordered lists of minimal angles, the edge flipping algorithm terminates, answering 2. above affirmatively. Conclusion: the edge flipping algorithm terminates with $D(S)$.

A triangulation for which the ordered list of the minimal angles is maximal in the lexicographical order is called a MaxMin triangulation.

Theorem 2.5.8. A MaxMin triangulation on $S$ coincides with $D(S)$. In particular, there exists only one MaxMin triangulation on $S$.

2.6 Concluding remarks

Recap (highlights) of this chapter
• Planar triangulations;
• Convexity;
• Euler characteristic;
• Line sweep;
• Voronoi diagram and Delaunay triangulation;
• Constructing the Delaunay triangulation using the locally Delaunay condition and the incircle test.

Background and applications

The Euler characteristic was introduced by Leonhard Euler in the 18th century. The line sweep algorithm, Voronoi diagram and Delaunay triangulation are basic notions studied especially in computational geometry [6]. Applications of the Euler characteristic include image analysis [7], target enumeration [8], etc. The edge flip algorithm we mentioned requires $O(n^2)$ edge flips, where $n$ is the number of vertices of $S$. There are known algorithms that construct the Delaunay triangulation in $O(n \log n)$ time.

The above mentioned properties of the Delaunay triangulation make it one of the favorite choices for a triangulation on a finite planar set $S$. For example, assume you are given a collection of points modelling a geographic profile in a small region. The points consist of coordinates and elevations at these coordinates. The task is to model the surface approximating the geographic profile. A standard solution would be to construct the Delaunay triangulation on the set of coordinate points, and then lift these points and triangles according to the given elevations. Triangles lifted this way provide a good approximation of the geographic profile on the sampled region.

[6] Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, second edition, 2000. doi: 10.1007/978-3-540-77974-2

[7] A. Roy, R. A. I. Haque, A. J. Mitra, M. Dutta Choudhury, S. Tarafdar, and T. Dutta. Understanding flow features in drying droplets via Euler characteristic surfaces - a topological tool. Physics of Fluids, 32(12):123310, 2020. doi: 10.1063/5.0026807

[8] Yuliy Baryshnikov and Robert Ghrist. Target enumeration via Euler characteristic integrals. SIAM Journal on Applied Mathematics, 70(3):825–844, 2009. doi: 10.1137/070687293

Appendix: Proof of Proposition 2.5.6

Proof. Let us explain the positively oriented criterion first, see Figure 2.24. Points $x, y, z$ form a positively oriented triple iff vectors $\vec{xy}, \vec{yz}$ are positively oriented, meaning that the third component of the cross product $(y_1 - x_1, y_2 - x_2, 0) \times (z_1 - x_1, z_2 - x_2, 0)$ is positive (replacing $\vec{yz}$ by $\vec{xz} = \vec{xy} + \vec{yz}$ does not change the cross product). This third component equals the $3 \times 3$ determinant

$$\begin{vmatrix} 1 & x_1 & x_2 \\ 1 & y_1 & y_2 \\ 1 & z_1 & z_2 \end{vmatrix}.$$

Figure 2.24: Positively oriented triple $x, y, z$.

We now turn our attention to the proof of the proposition. Surprisingly enough, we need to use three-dimensional geometry, see Figure 2.25 throughout the proof. Embed $\mathbb{R}^2$ (and points $x, y, z, w$) into $\mathbb{R}^3$ by assigning the third coordinate to be 0, i.e., $x = (x_1, x_2, 0)$, etc. Consider the graph of the function $f(x, y) = x^2 + y^2$. Lift points $x, y, z, w$ to the graph of $f$ and let $x', y', z', w'$ denote the lifted points, i.e., $x' = (x_1, x_2, x_1^2 + x_2^2)$, etc. Let $\Pi$ denote the plane containing $x', y', z'$.

Figure 2.25: Proof of Proposition 2.5.6.

Let $C$ be the intersection of the graph of $f$ and $\Pi$.
Note that the vertical projection of $C$ onto $\mathbb{R}^2 \times \{0\}$ is a circle: substituting $z$ in $z = x^2 + y^2$ by an equation of a plane $z = ax + by + c$ we obtain an equation of a circle in the plane of the form $x^2 + y^2 - ax - by - c = 0$. As this circle contains $x, y, z$, it coincides with $C(x, y, z)$. It is geometrically apparent that $w \notin B(x, y, z)$ iff $w'$ lies above $\Pi$ (the region where the graph of $f$ is below $\Pi$ is $B(x, y, z)$). Since $x, y, z$ are positively oriented, a normal of $\Pi$ with a positive third component is $\vec{n} = \overrightarrow{x'y'} \times \overrightarrow{x'z'}$. Point $w'$ lies above $\Pi$ iff $\vec{n} \cdot \overrightarrow{x'w'}$ is positive. It is elementary to verify that $\vec{n} \cdot \overrightarrow{x'w'}$ equals

$$\begin{vmatrix} 1 & x_1 & x_2 & x_1^2 + x_2^2 \\ 0 & y_1 - x_1 & y_2 - x_2 & y_1^2 + y_2^2 - (x_1^2 + x_2^2) \\ 0 & z_1 - x_1 & z_2 - x_2 & z_1^2 + z_2^2 - (x_1^2 + x_2^2) \\ 0 & w_1 - x_1 & w_2 - x_2 & w_1^2 + w_2^2 - (x_1^2 + x_2^2) \end{vmatrix} = \begin{vmatrix} 1 & x_1 & x_2 & x_1^2 + x_2^2 \\ 1 & y_1 & y_2 & y_1^2 + y_2^2 \\ 1 & z_1 & z_2 & z_1^2 + z_2^2 \\ 1 & w_1 & w_2 & w_1^2 + w_2^2 \end{vmatrix}.$$

3 Simplicial complexes

Topological and computational treatment of metric spaces relies on their convenient description. Given a metric space, we would like to have a finite combinatorial description that can be used for computations. In the previous chapters we introduced planar triangulations as an example of such a description for planar subsets. In this chapter we will introduce simplicial complexes, which will form the basic structure upon which all our later computations will depend. Simplicial complexes are higher-dimensional analogues of planar triangulations. While the latter are collections of triangles that fit together nicely, simplicial complexes are collections of higher dimensional simplices (generalizations of triangles) that fit together nicely. Essentially we will be building spaces from simple building blocks (simplices) given a rule describing how these blocks fit together... just like building a castle from LEGO cubes.

3.1 Affine independence

Figure 3.1: Some geometric simplices: a point, a line segment, a triangle, a tetrahedron.

A point, a line segment, a triangle, a tetrahedron, etc. These are some of the geometric simplices. They are basic building blocks of geometric simplicial complexes. A geometric simplex is a convex hull of a finite collection of points. Before we state their formal definition we need to clarify a general position property required of a set of points spanning such a simplex. Under this property we want a pair of points to span a line segment, a triple of points to span a triangle (and not just a line segment), etc.

Choose $d, k \in \mathbb{N}$ and let $V = \{v_0, v_1, \ldots, v_k\} \subset \mathbb{R}^d$ be a collection of points. Their affine combination is any sum of the form

$$\sum_{i=0}^{k} \alpha_i v_i, \quad \text{with} \quad \sum_{i=0}^{k} \alpha_i = 1.$$

The affine hull of $V$ is the collection of all affine combinations of elements of $V$. An affine hull is always an affine linear subspace in $\mathbb{R}^d$, meaning it is obtained from a linear subspace of $\mathbb{R}^d$ by a translation. Points $\{v_0, v_1, \ldots, v_k\}$ are affinely independent if no $v_i$ can be expressed as an affine combination of points $V \setminus \{v_i\}$. Proposition 3.1.1 explains how to test points for affine independence using linear independence, and why each affine hull is a translated linear subspace.

Figure 3.2: The affine hull of the two points on the left is a line. The affine hull of the three collinear points on the right is also a line, implying these three points are not affinely independent.

Proposition 3.1.1. Points of $V = \{v_0, v_1, \ldots, v_k\} \subset \mathbb{R}^d$ are affinely independent iff $\{v_1 - v_0, v_2 - v_0, \ldots, v_k - v_0\}$ are linearly independent.

Proof. Assume points of $V$ are not affinely independent. Then, without loss of generality, $v_0 = \sum_{i=1}^{k} \alpha_i v_i$ and $\sum_{i=1}^{k} \alpha_i = 1$, which implies the equality $\sum_{i=1}^{k} \alpha_i (v_i - v_0) = 0$, where not all $\alpha_i$ are zero. We conclude that the vectors $v_i - v_0$ are not linearly independent.

On the other hand assume $\sum_{i=1}^{k} \beta_i (v_i - v_0) = 0$ with not all $\beta_i$ being zero. We define $\beta_0 = -\sum_{i=1}^{k} \beta_i$ and observe that

$$\sum_{i=1}^{k} \beta_i v_i + \beta_0 v_0 = 0 \quad \text{and} \quad \sum_{i=0}^{k} \beta_i = 0.$$
Figure 3.3: The convex hull of three affinely independent points is a triangle. The affine hull is the supporting plane of the triangle.

Choose $K \in \{0, 1, \ldots, k\}$ so that $\beta_K \neq 0$. Then

$$v_K = \sum_{i=0,\, i \neq K}^{k} \frac{-\beta_i}{\beta_K} v_i \quad \text{and} \quad \sum_{i=0,\, i \neq K}^{k} \frac{-\beta_i}{\beta_K} = 1.$$

Hence points of $V$ are not affinely independent.

Side note: A linearly independent collection of vectors in $\mathbb{R}^d$ can have at most $d$ elements. An affinely independent collection of points in $\mathbb{R}^d$ can have at most $d + 1$ elements.

Proposition 3.1.2. Suppose points of $V = \{v_0, v_1, \ldots, v_k\} \subset \mathbb{R}^d$ are affinely independent. Then for each point $x \in \mathrm{Conv}(V)$ there exist unique coefficients $\alpha_i \in [0, 1]$, $i \in \{0, 1, \ldots, k\}$, such that

$$x = \sum_{i=0}^{k} \alpha_i v_i \quad \text{and} \quad \sum_{i=0}^{k} \alpha_i = 1.$$

Coefficients $\alpha_i$ are called barycentric coordinates of point $x$ in $\mathrm{Conv}(V)$.

Proof. The existence of such coefficients $\alpha_i$ follows from $x \in \mathrm{Conv}(V)$. In order to prove the coefficients are unique assume the statement holds for two different sets of coefficients $\alpha_i$ and $\alpha'_i$, i.e.,

$$x = \sum_{i=0}^{k} \alpha_i v_i = \sum_{i=0}^{k} \alpha'_i v_i \quad \text{and} \quad \sum_{i=0}^{k} \alpha_i = \sum_{i=0}^{k} \alpha'_i = 1.$$

At some index $i$ the coefficients $\alpha_i$ and $\alpha'_i$ differ. Without loss of generality we can assume that index is zero, i.e., $\alpha_0 - \alpha'_0 \neq 0$. Then

$$(\alpha_0 - \alpha'_0) v_0 = \sum_{i=1}^{k} (\alpha'_i - \alpha_i) v_i \quad \text{and} \quad v_0 = \sum_{i=1}^{k} \frac{\alpha'_i - \alpha_i}{\alpha_0 - \alpha'_0} v_i.$$

The coefficients on the right sum to 1, because $\sum_{i=1}^{k} (\alpha'_i - \alpha_i) = (1 - \alpha'_0) - (1 - \alpha_0) = \alpha_0 - \alpha'_0$. Hence $v_0$ is an affine combination of the other points, which contradicts the assumption that the points of $V$ are affinely independent.

3.2 Geometric simplicial complex

We are now ready to define our basic building blocks.

Definition 3.2.1. Let $k, d \in \{0, 1, \ldots\}$ with $k \leq d$. A geometric $k$-simplex $\sigma$ in $\mathbb{R}^d$ is the convex hull of an affinely independent family $V = \{v_0, v_1, \ldots, v_k\} \subset \mathbb{R}^d$, i.e., $\sigma = \mathrm{Conv}(V)$.

Figure 3.4: A two-dimensional simplex has six simplices as faces (three edges and three vertices), three of which are facets (edges).

The following is some terminology related to a geometric simplex $\sigma = \mathrm{Conv}(\{v_0, v_1, \ldots, v_k\})$:
• Dimension: $\dim(\sigma) = k$. We sometimes express it by writing it as a superscript: $\sigma = \sigma^k$.
• Vertices of $\sigma$: $v_0, v_1, \ldots, v_k$.
• Edges of $\sigma$: convex hulls of pairs of vertices.
• We say that $\sigma$ is spanned by the set of its vertices.
• If simplex $\tau$ is spanned by a subset of the vertices of $\sigma$, we say that:
  – $\tau$ is a face of $\sigma$.
  – $\sigma$ is a coface of $\tau$.
  – $\tau$ is a facet of $\sigma$ if $\dim(\tau) = \dim(\sigma) - 1$.

Note that $\sigma^k \cong D^k$. By Proposition 3.1.2 each point of $\sigma$ is uniquely described by its barycentric coordinates using the vertices of $\sigma$.

We can now use these building blocks to assemble more complicated spaces.

Side note: All our simplicial complexes will be finite. For that reason we will be dropping the word “finite”. There also exist simplicial complexes with infinitely many simplices. However, a proper definition of infinite simplicial complexes brings along additional technicalities which we want to avoid in our context.

Definition 3.2.2. Let $d \in \{0, 1, \ldots\}$. A (finite) geometric simplicial complex $K \subset \mathbb{R}^d$ is a (finite) collection of geometric simplices, such that:
a: If $\sigma \in K$ and $\tau$ is a face of $\sigma$, then $\tau \in K$.
b: If $\sigma, \tau \in K$, then $\sigma \cap \tau$ is either empty or a common face of both.

Each planar triangulation has a corresponding simplicial complex consisting of all triangles, edges and vertices of the triangulation.

Figure 3.5: The smallest two-dimensional simplicial complex consists of a triangle and all its faces: three edges and three vertices.

Let $K$ be a simplicial complex. We define:
• Dimension $\dim(K) = \max_{\sigma \in K} \dim(\sigma)$. A one-dimensional simplicial complex is a graph.
• Vertices of $K$ as the collection of all vertices of all simplices of $K$.
• Edges of $K$ as the collection of all edges of all simplices of $K$.
• A geometric simplicial complex $L$ is a subcomplex of $K$ [notation: $L \leq K$], if $L \subseteq K$.
• For $n \in \{0, 1, \ldots\}$ the $n$-skeleton of $K$ [notation: $K^{(n)}$] is the simplicial subcomplex of $K$ consisting of all simplices of $K$ of dimension at most $n$. For example, $K^{(0)}$ is the set of vertices of $K$.
• The body of $K$ [notation: $|K|$] is the union of all simplices of $K$.

Figure 3.6: The simplicial complex from Figure 3.5 (left) and its 1-skeleton (right) consisting of three edges and three vertices.

Formally speaking, a geometric simplicial complex in $\mathbb{R}^d$ is a collection of simplices and its body is a subset of $\mathbb{R}^d$. In practice, however, we will often identify the two objects in geometric discussions. From now on we will be visualizing simplicial complexes by drawing their body and assuming the underlying structure of a simplicial complex.

Figure 3.7: On the left there is a geometric simplicial complex presented by drawing its body. Each edge of a sketched triangle and each vertex of a sketched edge is assumed to be in the complex. On the right is the 1-skeleton of the simplicial complex on the left.

We are now ready to describe a connection between a metric subspace of a Euclidean space and its combinatorial description.

Definition 3.2.3. Let $d \in \{0, 1, \ldots\}$. A triangulation of a subspace $X \subset \mathbb{R}^d$ is a simplicial complex $K$ in $\mathbb{R}^d$, such that $|K| \cong X$.

Not every subspace of $\mathbb{R}^d$ admits a triangulation. However, all the subsets that will arise in our context will admit it. Triangulations of $B_{\mathbb{R}^2}((0, 0), 1)$ include examples in Figure 3.8 and Delaunay triangulations. A geometric simplicial complex is a triangulation of its body.

Figure 3.8: A few triangulations of $D^2$.

Occasionally we will want to refine the triangulation of a space, meaning we will want to decrease the size of simplices in order to improve visualisation, level of detail, etc. Such refinements are called subdivisions. Given a geometric simplicial complex $K$, a geometric simplicial complex $L$ is its subdivision, if each simplex of $K$ is the union of a collection of simplices from $L$. As an example we already mentioned the barycentric subdivision of planar triangulations, which also exists for simplicial complexes. At this point we will refrain from introducing its formal definition.
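Since geometric simplices are described through affinely independent vertices and barycentric coordinates, it may help to see how Propositions 3.1.1 and 3.1.2 translate into a computation. The Python sketch below is one possible way to do it, with function names of our own choosing. Affine independence is tested through the rank of the vectors $v_i - v_0$, and barycentric coordinates are obtained by solving the linear system $\sum \alpha_i v_i = x$, $\sum \alpha_i = 1$ with Gaussian elimination over the rationals.

```python
from fractions import Fraction


def rref(rows):
    """Gaussian elimination over the rationals: returns (reduced matrix, pivot columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0]) if m else 0):
        if r == len(m):
            break
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def affinely_independent(points):
    """Proposition 3.1.1: test linear independence of v_i - v_0."""
    differences = [[a - b for a, b in zip(v, points[0])] for v in points[1:]]
    if not differences:
        return True
    return len(rref(differences)[1]) == len(differences)


def barycentric(points, x):
    """Coefficients alpha_i with sum(alpha_i v_i) = x and sum(alpha_i) = 1.

    Returns None if x is not in the affine hull of the points. The point x lies
    in the convex hull iff all returned coefficients are in [0, 1].
    """
    if not affinely_independent(points):
        raise ValueError("points must be affinely independent")
    k = len(points) - 1
    system = [[v[j] for v in points] + [x[j]] for j in range(len(x))]
    system.append([1] * (k + 1) + [1])
    m, pivots = rref(system)
    if k + 1 in pivots:          # a pivot in the last column: no solution
        return None
    return [m[i][-1] for i in range(k + 1)]


triangle = [(0, 0), (4, 0), (0, 4)]
print(barycentric(triangle, (1, 1)))
```

For the sample triangle the point $(1, 1)$ has barycentric coordinates $(1/2, 1/4, 1/4)$.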
The lower right part of Figure 3.8 depicts a subdivision of a single 2-simplex, see also Figure 3.10.

Figure 3.9: A few triangulations of $S^1$.

3.3 Abstract simplicial complex

When buying a commercial object to be assembled, be it a piece of furniture, a toy or a model made of cubes, or a picture made of puzzles, the package usually arrives in a big box. On the box is a picture of the object, which in our context represents the body of a geometric simplicial complex. On the picture we can often determine pieces, which in our setting would be geometric simplices. Pieces on that picture have specific locations and, just like geometric simplices, could be described by specific coordinates. However, the assembly instructions contain no coordinates. There is a good reason for that [1]. In order to assemble the object, the instructions only provide a list of pieces and instructions about how to put them together. That information is sufficient to reconstruct the object. Abstract simplicial complexes play the role of such instructions.

[1] Besides the fact that nobody would purchase such an item.

Figure 3.10: A geometric simplicial complex and its subdivision.

Assume we want to describe a geometric simplicial complex. That means we have to provide a list of all simplices. A simplex could be provided by a list of coordinates of its vertices, but then we also have to make sure the simplices intersect appropriately. It would be much easier to just list the simplices and describe how they fit together in a coordinate free way. Here is a way to do it.

Definition 3.3.1. Let $V$ be a finite set. An abstract simplicial complex $L$ on $V$ is a family of non-empty subsets of $V$, such that if $\sigma \in L$ and $\tau \subseteq \sigma$ is non-empty, then $\tau \in L$.

Side note: In some sources the non-empty condition in the definition of an abstract simplicial complex is omitted and the empty set is always included as an abstract simplex of dimension $-1$.
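Definition 3.3.1 is purely combinatorial, so it can be verified mechanically. The following Python sketch shows one way to check whether a given family of sets is an abstract simplicial complex in the sense of the definition (with the empty set excluded). The representation by `frozenset` and the function name are our own choices.

```python
from itertools import combinations


def is_abstract_simplicial_complex(family):
    """Check that the family is closed under taking non-empty subsets."""
    simplices = {frozenset(s) for s in family}
    if frozenset() in simplices:
        return False            # Definition 3.3.1 only allows non-empty subsets
    for s in simplices:
        for size in range(1, len(s)):
            for face in combinations(s, size):
                if frozenset(face) not in simplices:
                    return False
    return True


print(is_abstract_simplicial_complex([{"a", "b"}, {"a"}, {"b"}]))
print(is_abstract_simplicial_complex([{"a", "b"}, {"a"}]))
```

The second family fails the test because the vertex $\{b\}$, a non-empty subset of $\{a, b\}$, is missing.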
A few more accompanying definitions using the notation of Definition 3.3.1:
• An abstract simplex $\sigma$ is an element of $L$. Its dimension is $\dim(\sigma) = |\sigma| - 1$.
• If $\tau \subseteq \sigma \in L$, then:
  – $\tau$ is a face of $\sigma$.
  – $\sigma$ is a coface of $\tau$.
  – $\tau$ is a facet of $\sigma$ if $\dim(\tau) + 1 = \dim(\sigma)$.
• Dimension $\dim(L)$ of $L$ is the maximal dimension of a simplex in $L$.
• The (closed) star of a vertex $v \in K$ is $\mathrm{St}_K(v) = \mathrm{St}(v) = \{\sigma \in K \mid \sigma \cup \{v\} \in K\} \leq K$.
• The link of a vertex $v \in K$ is $\mathrm{Lk}_K(v) = \mathrm{Lk}(v) = \{\tau \in \mathrm{St}(v) \mid v \notin \tau\} \leq \mathrm{St}(v)$.

Figure 3.11: A star (left) and a link (right) of a vertex.

A geometric simplex is a subset of a Euclidean space, given as the convex hull of the collection of its vertices. Each vertex is given as a point in space, usually in terms of coordinates. A geometric simplicial complex is a set of such simplices, contains all faces and has to satisfy the intersection properties of Definition 3.2.2.

An abstract simplex is just a collection of vertices. No coordinates are needed. An abstract simplicial complex is a set of such collections which contains all faces (all subsets of its elements). There are no intersections to be checked. It is a complete and convenient combinatorial description.

Example 3.3.2. Let $K$ be a geometric simplicial complex provided by Figure 3.12. As a geometric simplicial complex, $K$ contains specific geometric simplices described by the coordinates of their vertices. We can construct a corresponding abstract simplicial complex $L$. Label the vertices as demonstrated by the figure. Then

$$L = \{\{a, c, d\}, \{a, b\}, \{b, c\}, \{c, d\}, \{d, a\}, \{a, c\}, \{a\}, \{b\}, \{c\}, \{d\}\}.$$

No coordinates are involved. We could also only list the inclusion-maximal simplices, which completely determine the simplicial complex: $\{\{a, c, d\}, \{a, b\}, \{b, c\}\}$.

A simpler structure of an abstract simplicial complex will suffice for most of our topological analysis of spaces and the corresponding computations. Indeed, it will simplify them.
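The passage from inclusion-maximal simplices to the full complex, as well as the star and the link of a vertex, can be sketched as follows. The Python code is a minimal illustration using the complex of Example 3.3.2. The function names `closure`, `dimension`, `star` and `link` are our own.

```python
from itertools import combinations


def closure(maximal_simplices):
    """All non-empty subsets of the given simplices, as a set of frozensets."""
    complex_ = set()
    for simplex in maximal_simplices:
        vertices = tuple(simplex)
        for size in range(1, len(vertices) + 1):
            complex_.update(frozenset(face) for face in combinations(vertices, size))
    return complex_


def dimension(K):
    """Maximal dimension of a simplex, where dim(sigma) = |sigma| - 1."""
    return max(len(s) for s in K) - 1


def star(K, v):
    """Closed star: simplices sigma for which sigma together with v is a simplex."""
    return {s for s in K if s | {v} in K}


def link(K, v):
    """Simplices of the star that do not contain v."""
    return {s for s in star(K, v) if v not in s}


# The complex L of Example 3.3.2, given by its inclusion-maximal simplices.
L = closure(["acd", "ab", "bc"])
print(len(L), dimension(L))
print(sorted("".join(sorted(s)) for s in link(L, "a")))
```

The closure has ten simplices, matching the list in Example 3.3.2. The star of $a$ contains every simplex except $\{b, c\}$, and the link of $a$ consists of $\{c, d\}$, $\{b\}$, $\{c\}$ and $\{d\}$.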
A geometric simplicial complex, however, is still useful when we want to visualise a complex. For example, outputs of various scans come in the form of geometric simplicial complexes modelling the scanned shape. While geometric simplicial complexes describe geometric information about the space (various sizes, lengths, etc.), abstract simplicial complexes contain only topological information (homeomorphism type).

Figure 3.12: A picture accompanying Example 3.3.2.

It is easy to turn a geometric simplicial complex into an abstract simplicial complex: replace each vertex given by coordinates by a unique label. The opposite is a bit harder. Turning an abstract simplicial complex into a geometric simplicial complex requires us to choose coordinates of vertices in line with the requirements for a geometric simplicial complex. If it can be done, such a geometric simplicial complex is called a geometric realization (or just realization) of the original abstract simplicial complex. It turns out that geometric realizations always exist, although obtaining them in a low-dimensional space is typically hard. The following are two special cases of such realizations.

Theorem 3.3.3. Every abstract simplicial complex $K$ with $n$ vertices admits a geometric realization in $\mathbb{R}^{n-1}$.

Proof. Simplicial complex $K$ is a subcomplex of the full simplicial complex $L$ on $n$ vertices, i.e., the simplicial complex whose simplices are all non-empty subsets of vertices of $K$. As $L$ admits a realization in $\mathbb{R}^{n-1}$ as an $(n - 1)$-simplex (i.e., the convex hull of a collection of $n$ affinely independent points), so does $K$ as its subcomplex.

Theorem 3.3.4. Every abstract simplicial complex of dimension $d$ admits a geometric realization in $\mathbb{R}^{2d+1}$.

A proof of Theorem 3.3.4 is provided in the Appendix.

Figure 3.13: A sketch of a one-dimensional abstract simplicial complex (graph) with no realization in $\mathbb{R}^2$. The complex consists of all edges between $x_i$ and $y_j$.

As an example consider graphs, i.e., one-dimensional simplicial complexes. It is well known that some graphs are planar, which means they admit a geometric realization in the plane. However, there are graphs that are not planar. By Theorem 3.3.4 such graphs can still be realized in $\mathbb{R}^3 = \mathbb{R}^{2 \cdot 1 + 1}$ (and, of course, in $\mathbb{R}^m$ for $m > 3$). See Figure 3.13.

One of the goals of this course is the following: given an abstract simplicial complex, extract topological properties of its geometric realization. We will study and analyze spaces by working with their triangulations.

Remark 3.3.5. It is important to understand the differences between a metric space, its triangulation and a corresponding abstract simplicial complex. In practice, however, we will frequently be vague in our expression for the sake of simplicity, often referring to just a simplicial complex. Given a picture of a space like the one in Figure 3.12, we will keep in mind the three possible interpretations and use the one that fits the context at the moment.

Figure 3.14: A metric space (left) and the corresponding geometric simplicial complex (right). The corresponding abstract simplicial complex is $\{\{a, b, c\}, \{a, b\}, \{b, c\}, \{a, c\}, \{a\}, \{b\}, \{c\}\}$.

Also note that the terminology is essentially the same for the abstract and geometric simplicial complexes. We declare this to be the case for all further [2] definitions as well. For example, an abstract simplicial complex $L$ is a triangulation of a metric space $X$ if the corresponding geometric simplicial complex is, i.e., if the body of a geometric realization of $L$ is homeomorphic to $X$.

[2] Including the concepts of link and star of a complex at a point.

Example 3.3.6. One of our standard examples of a metric space will be the torus $T$. It is a two-dimensional metric space, actually a surface, depicted in Figure 3.15. A triangulation of $T$ in terms of an abstract simplicial complex is provided by Figure 3.17.
Topologically speaking, the torus can be obtained from a square by identifying the opposite sides along the same direction. This construction is depicted in Figure 3.16.

Figure 3.15: Torus.

Figure 3.16: The torus arising from a square. Starting with a square (top left) we identify its pairs of opposite sides along the same direction as the labels and arrows suggest. Identifying the sides $a$ we obtain a cylinder (top right). Identifying the other pair of sides, which represent loops $b$ in the cylinder, we obtain the torus (bottom right).

As a result we can obtain a structure of an abstract simplicial complex by triangulating a square and respecting the mentioned identifications. This provides a convenient topological visualisation of a torus. Observe that a triangulation of $T$ in terms of a geometric simplicial complex would be more complicated and not presentable in the plane.

The mentioned abstract simplicial complex is provided by Figure 3.17. We divide the square into 18 triangles and keep in mind the identifications suggested by the arrows. The two sides of the square along the single arrows get identified along the direction of the arrows, and the same holds for the two sides along the double arrows. For the sake of clarity we also labeled the outer vertices of the triangles, as each label appears at least twice due to the identifications.

Figure 3.17: A triangulation of a torus in terms of an abstract simplicial complex.

Example 3.3.7. Choose $n \in \{1, 2, \ldots\}$. In this example we provide the simplest triangulations of discs and spheres. Let $\sigma^n$ be an $n$-simplex and define $K$ to be the simplicial complex whose only maximal simplex is $\sigma^n$, i.e., $K$ contains $\sigma^n$ and all of its faces. Simplicial complex $K$ is a triangulation of $D^n$. To obtain a triangulation of $S^{n-1}$ remove from $K$ the maximal simplex, i.e., $K' = K \setminus \{\sigma^n\}$.
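The complexes $K$ and $K'$ just described are easy to generate explicitly as abstract simplicial complexes. The Python sketch below does so for a chosen $n$, with vertices labeled $0, 1, \ldots, n$. The labeling and the function names are our own assumptions.

```python
from itertools import combinations


def full_simplex_complex(n):
    """The n-simplex on vertices 0, ..., n together with all of its faces."""
    vertices = range(n + 1)
    return {frozenset(face)
            for size in range(1, n + 2)
            for face in combinations(vertices, size)}


def boundary_complex(n):
    """All faces of the n-simplex except the n-simplex itself."""
    return full_simplex_complex(n) - {frozenset(range(n + 1))}


def count_by_dimension(K):
    """Map each dimension i to the number of i-simplices of K."""
    counts = {}
    for simplex in K:
        counts[len(simplex) - 1] = counts.get(len(simplex) - 1, 0) + 1
    return counts


K = full_simplex_complex(3)          # a solid tetrahedron, triangulating D^3
K_prime = boundary_complex(3)        # its boundary, triangulating S^2
print(count_by_dimension(K_prime))
```

For $n = 3$ the complex $K'$ has 4 vertices, 6 edges and 4 triangles, i.e., it is the boundary of a tetrahedron.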
Simplicial complex $K'$ consists of all proper faces of $\sigma^n$, i.e., it does not contain $\sigma^n$ itself. Simplicial complex $K'$ is a triangulation of $S^{n-1}$.

Figure 3.18: A triangle as a triangulation of $D^2$ and its boundary as a triangulation of $S^1$. In a similar way a solid tetrahedron is a triangulation of $D^3$ while its boundary is a triangulation of $S^2$.

Two invariants

Here we provide two invariants (of a space) that can be extracted from a triangulation. Both are homotopy invariants (and hence also topological invariants), meaning they coincide for homotopically equivalent spaces.

A space typically has infinitely many possible triangulations. Imagine all possible Delaunay triangulations in $\mathbb{R}^2$: they are all triangulations of $D^2$. We conclude that the numbers of vertices, edges, or higher dimensional simplices in a triangulation cannot be topological invariants.

The first invariant is the number of components. Given a triangulation $K$, it is easy to extract that number from $K^{(1)}$ (which is a graph) using standard approaches of graph theory. Later we will explain in detail how to obtain this number in terms of homology. For example, the simplicial complex in Figure 3.19 has two components.

Figure 3.19: A complex with two components.

The second invariant is the Euler characteristic, which can be defined for simplicial complexes.

Definition 3.3.8. Suppose $K$ is a simplicial complex and let $n_i$ denote the number of $i$-simplices in $K$. The Euler characteristic $\chi(K) \in \mathbb{Z}$ is defined as
$$\chi(K) = n_0 - n_1 + n_2 - n_3 + \ldots.$$
The Euler characteristic of a metric space is the Euler characteristic of any of its triangulations.

As was mentioned above, the Euler characteristic is a homotopy invariant. Using this fact we can compute the following cases:

• Let $X$ be a one-point space. Then $\chi(X) = 1$. Since each Delaunay triangulation $K$ in $\mathbb{R}^2$ is homotopic to a point (meaning that $|K| \simeq X$), we also conclude $\chi(K) = 1$, a statement which we have already proved directly. In fact, the homotopy invariance implies that each triangulation of a contractible space is of Euler characteristic 1.

• The Euler characteristic of a torus is 0. It can be computed directly from the triangulation presented by Figure 3.17, which has 18 triangles, 27 edges and 9 vertices (keep in mind the identifications): $9 - 27 + 18 = 0$.

• Let $n \in \{0, 1, 2, \ldots\}$. Then $\chi(S^n) = 1 + (-1)^n$.

  Proof. Let $\sigma^{n+1}$ be an $(n+1)$-simplex and define $K$ to be the simplicial complex whose only maximal simplex is $\sigma^{n+1}$. As we know, $K$ is a triangulation of $D^{n+1}$. As $D^{n+1}$ is contractible, $\chi(K) = 1$. We also mentioned that $K' = K \setminus \{\sigma^{n+1}\}$ is a triangulation of $S^n$. As $K'$ is obtained from $K$ by removing an $(n+1)$-simplex, $\chi(K')$ is obtained from $\chi(K)$ by removing the contribution of that simplex, which is $(-1)^{n+1}$. Hence $\chi(S^n) = 1 - (-1)^{n+1} = 1 + (-1)^n$.

  Side note: $\chi(S^n)$ could also be computed from the triangulation $K'$ directly using the binomial formula.

Figure 3.20: A series of homotopic simplicial complexes, each differing from the previous one by a small modification. Note that the modifications preserve the Euler characteristic. In the first step we add a vertex and an edge; in the second step we add a vertex, two edges and a triangle; and so on. Each new addition contributes a total of 0 to the Euler characteristic.

3.4 Simplicial maps

Just as simplicial complexes provide a convenient combinatorial description of metric spaces, simplicial maps provide a combinatorial description of continuous maps. We first define them in the abstract setting.

Definition 3.4.1. Suppose $K$ and $L$ are abstract simplicial complexes. A simplicial map between $K$ and $L$ is an assignment $f \colon K^{(0)} \to L^{(0)}$ on vertices, such that for each abstract simplex $\{v_0, v_1, \ldots, v_k\} \in K$ its image $\{f(v_0), f(v_1), \ldots, f(v_k)\}$ is an abstract simplex in $L$.

Figure 3.21: An embedding of a simplicial subcomplex $L \le K$ into $K$ is identity on the vertices and always a simplicial map.

Remark 3.4.2. A simplicial map will usually be denoted by $f \colon K \to L$. However, since $K$ and $L$ are collections of sets, such a notation would formally include maps that map, say, a vertex to an edge, a highly unfavourable occurrence. When talking about simplicial maps $K \to L$ we thus always consider only maps in the sense of Definition 3.4.1, i.e., maps that map a vertex to a vertex, while images of simplices are always determined by the values on the vertices:

• For each vertex $v \in K$ we define a corresponding vertex $f(v)$.

• For each abstract simplex $\sigma = \{v_0, v_1, \ldots, v_k\} \in K$ its image $f(\sigma) = \{f(v_0), f(v_1), \ldots, f(v_k)\}$ is determined by the values on its vertices.

• Note that $f(\sigma)$ is a set, meaning there are no repetitions of elements. In particular this means that each vertex appears at most once in $f(\sigma)$, even if it appears multiple times as $f(v_i)$. As a result the image of an $n$-dimensional simplex can be of dimension less than or equal to $n$, but never more than $n$.

Example 3.4.3. Let $K$ be the simplicial complex in Figure 3.22. Assignment $a \mapsto a$; $b \mapsto c$; $c \mapsto c$; $d \mapsto d$; $e \mapsto b$ can be verified to induce a simplicial map $K \to K$. Note that triangle $\{a, b, c\}$ gets mapped to edge $\{a, c\}$.

Figure 3.22: Simplicial complex $K$ of Example 3.4.3.

We are now ready to define simplicial maps in the geometric setting.

Definition 3.4.4. Suppose $K$ and $L$ are geometric simplicial complexes. A map $f \colon K \to L$ is a simplicial map, if:

1. For each vertex $v$ of $K$ its image $f(v)$ is a vertex of $L$.

2. The corresponding map between the corresponding abstract simplicial complexes is simplicial, i.e., if $\{v_0, v_1, \ldots, v_k\}$ span a geometric simplex in $K$ then $\{f(v_0), f(v_1), \ldots, f(v_k)\}$ span a geometric simplex in $L$.

3. Map $f$ is linear on simplices (in terms of barycentric coordinates), i.e.,
$$\forall t_i \in [0, 1],\ \sum_{i=0}^{k} t_i = 1,\ \forall v_i \in K^{(0)} : \quad f\Big(\sum_{i=0}^{k} t_i v_i\Big) = \sum_{i=0}^{k} t_i f(v_i).$$
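The condition of Definition 3.4.1 can be checked mechanically: apply the vertex assignment to every simplex and test whether the resulting set is again a simplex. The following is a minimal sketch under stated assumptions. The complex `K` below is a hypothetical triangle with a tail (it is not the complex of Figure 3.22), and the names `closure` and `is_simplicial` are chosen here for illustration only.

```python
from itertools import combinations

def closure(maximal_simplices):
    """All nonempty faces of the given maximal simplices, as frozensets."""
    faces = set()
    for simplex in maximal_simplices:
        for size in range(1, len(simplex) + 1):
            faces.update(frozenset(c) for c in combinations(simplex, size))
    return faces

def is_simplicial(vertex_map, K, L):
    """True if the image of every simplex of K is a simplex of L."""
    return all(frozenset(vertex_map[v] for v in s) in L for s in K)

# Hypothetical complex: triangle {a, b, c} with an extra edge {c, d}.
K = closure([("a", "b", "c"), ("c", "d")])

# f collapses the triangle {a, b, c} onto the edge {a, c}.
f = {"a": "a", "b": "c", "c": "c", "d": "d"}
# h sends the edge {a, b} to {d, b}, which is not a simplex of K.
h = {"a": "d", "b": "b", "c": "c", "d": "d"}

print(is_simplicial(f, K, K))  # True
print(is_simplicial(h, K, K))  # False
```

Because images are built as sets, repeated values collapse automatically, which mirrors the observation of Remark 3.4.2 that the image of a simplex may have lower dimension.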
Given a simplicial map between geometric simplicial complexes, the induced map (i.e., the restriction to vertices) between abstract simplicial complexes is simplicial. Conversely, each simplicial map between abstract simplicial complexes corresponds to a unique simplicial map between the corresponding geometric simplicial complexes: the extension from vertices to geometric simplices is defined using the formula of item 3 of Definition 3.4.4. In accordance with our declarations, the term simplicial map will be used for maps in either the geometric or the abstract setting: in case of a preferred interpretation it will be stated explicitly or should be obvious from the context.

Simplicial maps between geometric simplicial complexes are continuous maps as they are linear (hence continuous) on each simplex. Surprisingly enough, each continuous map can be (up to homotopy) represented by a simplicial map, which means that as long as we are interested in homotopical properties, we can restrict ourselves to simplicial maps.

Theorem 3.4.5. Suppose $f \colon K \to L$ is a continuous map between geometric simplicial complexes. Then there exist sufficiently fine subdivisions $K'$ of $K$ and $L'$ of $L$, and a simplicial map $f' \colon K' \to L'$, such that $f \simeq f'$. We call $f'$ a simplicial approximation of $f$.

The subdivisions above can be taken to be sufficiently fine barycentric subdivisions. A continuous map between simplicial complexes is formally a map between the bodies of the simplicial complexes. In this sense both $f$ and $f'$ map $|K| = |K'|$ to $|L| = |L'|$, hence $f \simeq f'$ makes sense.

Figure 3.23: The upper part of the figure presents a simplicial complex $K$. The middle part consists of a smooth curve (bold) given by a continuous map $f \colon S^1 \to K$, superimposed on the simplicial complex. The bottom part consists of a simplicial approximation (bold red) of the curve superimposed on the simplicial complex and the curve. The indicated simplicial approximation requires a triangulation of the domain $S^1$ with at least 12 edges.

Elementary collapses

Elementary collapses are minor local modifications of simplicial complexes, which preserve their homotopy type. Conveniently enough, they can be described in purely combinatorial terms. Their importance stems from the following lemma.

Lemma 3.4.6. Let $K$ be a geometric simplicial complex containing simplex $\sigma = \{v_0, v_1, \ldots, v_k\}$ and let $\tau = \{v_1, \ldots, v_k\}$ be its facet. If $\sigma$ is the only coface³ of $\tau$, then the inclusion $i \colon K \setminus \{\tau, \sigma\} \hookrightarrow K$ is a homotopy equivalence.

3 I.e., a simplex which contains $\tau$.

Proof. The proof is sketched in Figure 3.24. We first need to subdivide $\sigma$ and $\tau$. Choose a point $a$ in the middle of $\tau$ and connect it to all vertices of $\sigma$. This induces a subdivision of $\sigma$ and $\tau$ and in fact of $K$, as no other simplex contains⁴ $\sigma$ or $\tau$.

In order to obtain a continuous deformation⁵ from $K$ to $K \setminus \{\tau, \sigma\}$, slide $a$ towards $v_0$. This sliding is a linear homotopy and can be easily described in the barycentric coordinates of the new subdivision of $K$.

4 If there was another simplex containing $\sigma$, then that simplex would have been a coface of $\tau$, which contradicts our assumptions. In particular, if $\sigma$ is the only coface, then $\tau$ is a facet of $\sigma$. This fact also refers to the conditions of Definition 3.4.7.

5 Homotopy equivalence.

Figure 3.24: Elementary collapse from Lemma 3.4.6 with $\sigma$ being the triangle and $\tau$ its left side.

Definition 3.4.7. Let $K$ be a simplicial complex, $\tau^{(k-1)} \subset \sigma^{(k)} \in K$, and assume $\sigma$ is the only coface of $\tau$. A removal $K \to K \setminus \{\tau, \sigma\}$ is called an elementary collapse.

By Lemma 3.4.6 each elementary collapse preserves the homotopy type of a complex. Note that the collapsing map $K \to K \setminus \{\tau, \sigma\}$ is not⁶ a simplicial map on $K$.
It is, however, a simplicial map on the subdivision of $K$ employed in Lemma 3.4.6, defined by mapping $a \mapsto v_0$ and keeping all the other vertices intact. Its homotopy inverse is the inclusion $K \setminus \{\tau, \sigma\} \hookrightarrow K$, which is a simplicial map.

6 Except if $\tau$ is a vertex.

Figure 3.25: The elementary collapse of Figure 3.24 is usually indicated by an arrow from $\tau$ into $\sigma$.

Elementary collapses are convenient because they provide us with a simple combinatorial condition that can be used to induce homotopy equivalence on abstract simplicial complexes. This idea will be further expanded later within the context of discrete Morse theory.

Figure 3.26: An example of a simplification of (the homotopy type of) a simplicial complex using elementary collapses.

3.5 Concluding remarks

Recap (highlights) of this chapter

• Geometric simplex;
• Geometric simplicial complex;
• Abstract simplicial complex;
• Geometric realization;
• Simplicial map;
• Simplicial approximation and elementary collapse.

Background and applications

Simplicial complexes model spaces in a wide spectrum of theory and applications. Great portions of topology and geometry are based on them due to their simple structure and amenability to combinatorial treatment. On the applied side simplicial complexes are typically used to model shapes. The notions introduced in this chapter are covered in standard books on topology⁷.

7 James Munkres. Elements of Algebraic Topology. Perseus Books, 1984. doi: 10.1201/9780429493911

Appendix: Proof of Theorem 3.3.4

Before we begin with the proof we clarify a fact that will be used. A generic⁸ (random) collection of $n + 1$ points in $\mathbb{R}^n$ is affinely independent. Geometrically this is easy to believe:

• Generic two points in $\mathbb{R}$ will be different;
• Generic three points in $\mathbb{R}^2$ will not be collinear;
• Generic four points in $\mathbb{R}^3$ will not be coplanar.

8 The notion "generic" as used in this appendix is usually referred to as "general linear position" in the literature.
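Genericity of a concrete finite point set can be tested with the Gaussian elimination assumed as a prerequisite of this text. The sketch below relies on the standard fact that points $x_0, \ldots, x_k$ are affinely independent exactly when the differences $x_1 - x_0, \ldots, x_k - x_0$ are linearly independent. The function names are hypothetical, and exact rational arithmetic is used so that rounding does not blur the test.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """True if the differences to the first point are linearly independent."""
    base, rest = points[0], points[1:]
    if not rest:
        return True
    diffs = [[a - b for a, b in zip(p, base)] for p in rest]
    return rank(diffs) == len(rest)

def is_generic(points, n):
    """True if every n + 1 of the given points in R^n are affinely independent."""
    return all(affinely_independent(list(c)) for c in combinations(points, n + 1))

print(is_generic([(0, 0), (1, 0), (0, 1), (1, 2)], 2))  # True: no three collinear
print(is_generic([(0, 0), (1, 1), (2, 2)], 2))          # False: three collinear points
```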
Similarly, even for $k > n + 1$ a generic set $V$ of $k$ points in $\mathbb{R}^n$ has the same property: each collection of $n + 1$ points in $V$ is affinely independent. For example, in a generic collection of points in the plane no triple of points will be collinear. We will only use the fact that generic collections exist. This fact can be proved using linear algebra.

Figure 3.27: A generic collection of 5 points in the plane, meaning that no three are collinear. Only by adding a point on any grey line does the generic condition break. If we add any other point to this collection, the obtained collection of 6 points is still generic.

Proof of Theorem 3.3.4. Let $K$ be an abstract simplicial complex of dimension $d$, whose vertices are $v_0, v_1, \ldots, v_k$. Choose a generic collection of points $V = \{x_0, x_1, \ldots, x_k\} \subset \mathbb{R}^{2d+1}$, meaning that each collection of $2d + 2$ points from $V$ is affinely independent. We will prove that the correspondence $v_i \leftrightarrow x_i$ for all $i \in \{0, 1, \ldots, k\}$ provides a geometric realization of $K$.

For each abstract simplex $\sigma \in K$ spanned by $v_{j_0}, v_{j_1}, \ldots, v_{j_m}$, the corresponding geometric simplex $\sigma'$ is spanned by the affinely independent collection $x_{j_0}, x_{j_1}, \ldots, x_{j_m}$. It remains to prove that if $\sigma, \tau \in K$, then $\sigma' \cap \tau' = (\sigma \cap \tau)'$. As $(\sigma \cap \tau)' \subseteq \sigma', \tau'$ by the definition, we have $\sigma' \cap \tau' \supseteq (\sigma \cap \tau)'$. To prove the other inclusion we will make use of the dimension assumption. Let $z \in \sigma' \cap \tau'$, which means that $z$ can be expressed as a convex combination of vertices in $\sigma'$ and also as a convex combination of vertices in $\tau'$. As the total number of vertices in $\tau'$ and $\sigma'$ is at most $2d + 2$ (by the dimension assumption), the generic condition implies these are affinely independent, and thus the convex (affine) combinations above coincide as they have to be unique by Proposition 3.1.2.
In particular, this means that only the barycentric coordinates corresponding to the vertices that lie in both simplices (i.e., in $\sigma \cap \tau$) can be non-zero, which implies $z \in (\sigma \cap \tau)'$ and hence $\sigma' \cap \tau' \subseteq (\sigma \cap \tau)'$.

4 Surfaces

Surfaces are some of the simplest topological spaces appearing frequently in science and data analysis. While a surface appears as a part of the plane from each local perspective, its global shape may take many forms. Think of the surface of the earth: because it appears to be "planar" at each point, it had long been believed that Earth is actually a part of a plane or maybe a disc instead of a sphere. Surprisingly enough, most surfaces of interest can be recognized up to homeomorphism fairly easily. In this chapter we explain this recognition process and the accompanying theory, both of which will come in handy in the chapters to come.

4.1 Surfaces as manifolds

Ever since ancient times people have wondered about the shape of the world. They agreed that from the perspective of a human being, the world looked like a plane, a part of a surface. What was much harder to figure out was the global picture. The first and most obvious idea was that the world was a flat disc. Later came indications, such as deviations in the angle of the shadow depending on the latitude, that the world might be curved. Magellan's first circumnavigation of Earth does not constitute a rigorous proof that the world is a sphere by modern mathematical standards, but at the time it was a momentous achievement which confirmed that the Earth is indeed round.

Figure 4.1: Some of the surfaces we mentioned before: a planar set (top left), a space homeomorphic to $S^2$ (top right), Moebius band (center left), cylinder (center right), and torus (bottom).

While we will not be sailing around the world in this course, we will be interested in the moral of this story: things that locally look like a plane may globally not be a plane.
We will want to determine the global structure from local information. Spaces that locally look like a plane are called surfaces and their generalizations to other dimensions are called manifolds. Here is a formal definition.

Definition 4.1.1. Let $n \in \{0, 1, 2, \ldots\}$. A metric space $X$ is an $n$-manifold, if for each $x \in X$ there exists $r > 0$, such that $B_X(x, r)$ is homeomorphic to the $n$-dimensional disc $D^n$. 2-manifolds are called surfaces.

Point $x$ on an $n$-manifold $X$ is:

• a boundary point if a homeomorphism $B_X(x, r) \to D^n$ from Definition 4.1.1 maps $x$ to a point on the boundary of $D^n$.

• an interior point if a homeomorphism $B_X(x, r) \to D^n$ from Definition 4.1.1 maps $x$ to a point in the interior of $D^n$.

Figure 4.2: A neighborhood of a point on $S^2$ homeomorphic to $B^2$.

These two notions are independent of the choice of homeomorphism $B_X(x, r) \to D^n$. Each point of $X$ is either a boundary point or an interior point. The boundary of $X$ consists of all the boundary points, and the interior of $X$ consists of all the interior points.

We say that a manifold $X$ is without boundary, if it has no boundary points. For an $n$-manifold $Y$ its boundary is an $(n-1)$-manifold without boundary, as can be seen from the examples below. For our purposes a closed manifold (closed surface) will be a manifold (surface) without boundary admitting a (finite) triangulation.

Figure 4.3: Klein bottle obtained by identifying the edges of a square: two along the same direction and two along the opposite direction. The resulting space (bottom right) is not realizable in $\mathbb{R}^3$ due to the self-intersection. However, the Klein bottle can be embedded into $\mathbb{R}^4$ without self-intersections.

Example 4.1.2. We provide some examples of connected $n$-manifolds listed by dimension $n$.

• $n = 0$: This one is fairly unimpressive: a single point.

• $n = 1$: Circle and intervals $(0, 1)$, $[0, 1]$, $(0, 1]$. Each connected 1-manifold is homeomorphic to one of these. A circle and an open interval have no boundary, while the boundary of $[0, 1]$ consists¹ of 0 and 1.

• $n = 2$: We will provide a list of all surfaces by the end of this chapter. Here we list some of the more prominent ones. The already mentioned ones are recapped in Figure 4.1: note that the boundary of the band consists of two copies of $S^1$, while the Moebius band has a single boundary component. A closed disc $D^2$ is also a surface, whose boundary is $S^1$.

  Closely related to the torus are the Klein bottle (see Figure 4.3) and the projective plane (see Figure 4.4). Neither of them has a boundary and neither can be obtained as a subset of $\mathbb{R}^3$. However, they can be obtained as subsets of $\mathbb{R}^4$. While these two spaces are challenging to imagine geometrically, it is fairly easy to provide their (abstract) triangulations (see Figure 4.5) and compute some of their topological invariants, such as the Euler characteristic. The torus and the Klein bottle have Euler characteristic 0, while the Euler characteristic of the projective plane is 1.

  The projective plane is homeomorphic to the space of all 1-dimensional subspaces in $\mathbb{R}^3$.

• General $n$: $D^n$ and $S^n$ are both $n$-manifolds. The boundary of $D^n$ is $S^{n-1}$, while $S^n$ has no boundary. There are many other $n$-manifolds.

1 Also, the boundary of $[0, 1)$ is 0.

Figure 4.4: Projective plane obtained by identifying the edges of a square: both pairs along the opposite direction. The resulting space is not realizable in $\mathbb{R}^3$. However, it can be embedded into $\mathbb{R}^4$ without self-intersections.

Combinatorial manifolds

We will mostly be working with triangulated manifolds. A natural question that arises in this context is how to recognize whether a given simplicial complex is a triangulation of a manifold. To tackle this task we first introduce a convenient combinatorial description of manifolds.

Definition 4.1.3. Suppose $K$ is a simplicial complex and $n \in \mathbb{N}$. We say that $K$ is a combinatorial $n$-manifold, if for each vertex $v \in K$ its link $\mathrm{Lk}(v)$ is homeomorphic either to $S^{n-1}$ or $D^{n-1}$.

Properties and notation:

• Each combinatorial $n$-manifold is a triangulation of an $n$-manifold.

• For $n < 4$, each $n$-manifold admits a triangulation as a combinatorial $n$-manifold².

• Vertices of a combinatorial manifold $K$ satisfying $\mathrm{Lk}(v) \cong D^{n-1}$ are called boundary vertices.

• Vertices of a combinatorial manifold $K$ satisfying $\mathrm{Lk}(v) \cong S^{n-1}$ are called interior vertices.

• Edges of a combinatorial surface $K$ that are contained in only one triangle are called boundary edges. The union of the boundary edges corresponds to the boundary of the manifold.

• Edges of a combinatorial surface $K$ that are contained in two triangles are called interior edges. No edge in a combinatorial surface is contained in more than two triangles.

2 Surprisingly enough, this does not hold for $n \geq 4$.

Figure 4.5: Triangulations of the Klein bottle (top) and the projective plane (bottom).

Using these properties it is fairly easy to recognize whether a given simplicial complex $K$ is a combinatorial surface and thus a triangulation of a surface: for each vertex $v \in K$ we verify whether $\mathrm{Lk}(v)$ is homeomorphic to $S^1$ or $D^1$. It is easy to see that a connected 1-dimensional simplicial complex is homeomorphic to:

• $S^1$ iff each of its vertices is contained in two edges.

• $D^1$, i.e., the line segment, iff two of its vertices are contained in one edge, and all other vertices are contained in two edges.

We will leave the elementary proofs of these two facts to the reader.

Figure 4.6: Triangulation of the Klein bottle (top) and of the Moebius band (bottom). In the top triangulation each vertex is an interior vertex as each link (bold) is homeomorphic to $S^1$. In the bottom case each vertex is a boundary vertex as each link (bold) is homeomorphic to $B^1$.

4.2 Orientability

Orientability is about defining "up and down".
It is quite easy to agree on the two directions on the surface of the earth. However, in general that may not be the case for all surfaces. Consider the cylinder from Figure 4.1. It is orientable because we can define two different sides of it. To put it into a more colorful language, we can color one side of the band in red and the other side in blue, without the colors ever touching each other. The story is different on the Moebius band: it has only one side. We could start coloring it in red at some spot and keep expanding the color along the surface (but not across the boundary): eventually we will color the whole band, i.e., there is no "other" side.

Orientability is an important property of surfaces. It will be required for our classification result. In order to fully understand it we have to define orientation for simplices first. Besides its application in this section, orientation on simplices will feature prominently later within the context of homology computation.

Figure 4.7: The edge $\{x, y\}$ (left) and the oriented edges $\langle x, y\rangle$ (center) and $\langle y, x\rangle = -\langle x, y\rangle$ (right).

Up to now a simplex was given by a set of its vertices. An oriented simplex is a simplex with a choice of orientation. For an edge that means direction, for a triangle that means "a normal" (see Figures 4.7 and 4.8). This direction/orientation will be described by a choice of an order on vertices.

Figure 4.8: The triangle $\{x, y, z\}$ (left) and the oriented triangles $\langle x, y, z\rangle = \langle y, z, x\rangle = \langle z, x, y\rangle$ (center) and $\langle y, x, z\rangle = \langle x, z, y\rangle = \langle z, y, x\rangle = -\langle x, y, z\rangle$ (right).

Definition 4.2.1. An oriented simplex on vertices $v_0, v_1, \ldots, v_k$ is an ordered $(k+1)$-tuple $\sigma = \langle v_0, v_1, \ldots, v_k\rangle$. For a permutation $\pi$ on $\{0, 1, \ldots, k\}$ we identify
$$\sigma = (-1)^{\operatorname{sgn}(\pi)} \langle v_{\pi(0)}, v_{\pi(1)}, \ldots, v_{\pi(k)}\rangle,$$
where $\operatorname{sgn}(\pi)$ is the signature of permutation $\pi$, i.e., value 0 if $\pi$ is even and value 1 if $\pi$ is odd.
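The identification of Definition 4.2.1 can be made concrete by rewriting every oriented simplex with its vertices in sorted order and recording the resulting sign. The sketch below is one possible way to do this, under the assumption that vertex labels are distinct and comparable. It uses the standard fact that the parity of a permutation equals the parity of its number of inversions. The names `canonical` and `same_orientation` are hypothetical.

```python
def canonical(vertices):
    """Rewrite <v_0, ..., v_k> as (sign, sorted vertices) with sign in {+1, -1}."""
    k = len(vertices)
    inversions = sum(
        1 for i in range(k) for j in range(i + 1, k) if vertices[i] > vertices[j]
    )
    sign = -1 if inversions % 2 else 1
    return sign, tuple(sorted(vertices))

def same_orientation(s, t):
    """True if the two vertex orders describe the same oriented simplex."""
    return canonical(s) == canonical(t)

print(same_orientation(("x", "y", "z"), ("y", "z", "x")))  # True: cyclic shift
print(same_orientation(("x", "y", "z"), ("y", "x", "z")))  # False: one exchange
print(canonical(("y", "x")))                               # (-1, ('x', 'y'))
```

The three printed lines agree with the identifications listed in the captions of Figures 4.7 and 4.8.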
A 0-dimensional simplex with vertex $v$ can also be oriented in two ways: as $\langle v\rangle$ and as $-\langle v\rangle$. Figures 4.7 and 4.8 provide examples of descriptions of oriented edges and triangles, and their geometric interpretations.

Here are some properties that follow from Definition 4.2.1:

• Each simplex on vertices $v_0, v_1, \ldots, v_k$ can be oriented in two different ways: $\sigma = \langle v_0, v_1, \ldots, v_k\rangle$ and $-\sigma$.

• An oriented simplex has a sign $+$ (usually omitted) or $-$ prepended.

• Exchanging two vertices in an oriented simplex $\tau$ changes the orientation of $\tau$ by changing the prefixed sign.

An important property of an oriented simplex is that it induces an orientation on each of its facets.

Definition 4.2.2. Suppose $\sigma = \langle v_0, v_1, \ldots, v_k\rangle$ is an oriented simplex and $p \in \{0, 1, \ldots, k\}$. Then the induced orientation of the facet of $\sigma$ obtained by dropping $v_p$ is
$$(-1)^p \langle v_0, v_1, \ldots, v_{p-1}, v_{p+1}, \ldots, v_k\rangle.$$

Oriented edge $\langle x, y\rangle$ induces orientations $\langle y\rangle$ and $-\langle x\rangle$ on its facets (vertices). Oriented triangle $\langle x, y, z\rangle$ induces orientations $\langle y, z\rangle$, $-\langle x, z\rangle$ and $\langle x, y\rangle$ on its facets (edges), see Figure 4.9.

Figure 4.9: Oriented triangle $\langle x, y, z\rangle$ (left) and induced orientation on the edges (right): $\langle x, y\rangle$, $\langle y, z\rangle$, and $\langle z, x\rangle$. Note that the edges are oriented along the direction of the circular arrow indicating the orientation of the triangle.

Now that we established a way to orient a single simplex, we turn our attention to orienting the whole surface.

Definition 4.2.3. Suppose oriented 2-simplices $\sigma$ and $\sigma'$ share a common edge. Simplices $\sigma$ and $\sigma'$ are oriented consistently, if they induce the opposite orientation on the common edge (see Figure 4.10).

Figure 4.10: Consistent orientation: note that the orientations of the triangles agree (both directed circular arrows point counter-clockwise). This implies that the induced orientations on the common edge are opposite to each other.

Definition 4.2.4. Let $K$ be a triangulation of a surface $|K|$. We say that $|K|$ is oriented, if all triangles of $K$ are oriented (as simplices) so that the following holds: each pair of oriented triangles with a common edge is oriented consistently.

A surface is orientable if it can be oriented. Orientability of a surface does not depend on a triangulation but on the topological type of the surface only. The following are two basic examples that demonstrate the underlying geometric idea.

Example 4.2.5. The cylinder $S^1 \times [0, 1]$ is orientable as Figure 4.11 demonstrates. To the contrary, the Moebius band is not orientable as Figure 4.12 demonstrates. Since the Klein bottle and the projective plane both contain a copy of the Moebius band (any of the three horizontal strips of triangulations in Figure 4.5), neither of them is orientable.

Figure 4.11: Orientable triangulation of a usual band. The oriented simplices induce the opposite orientation on all the edges, including the edge $\{x, y\}$, along which the glueing occurs.

As Example 4.2.5 and Figure 4.12 suggest it is fairly easy to check whether a connected triangulated surface is orientable. This can be done directly by orienting one triangle and then inductively orienting all neighboring triangles with shared edges, while checking that each newly oriented triangle is oriented consistently with respect to the already oriented triangles.

Figure 4.12: A proof that the Moebius band is not orientable. Assume we want to orient a triangulation of the Moebius band on the left. We first choose an orientation of one triangle (far left) and then inductively induce consistent orientation on the neighboring triangles. In the end (far right) we obtain conflicting requirements on the orientation on the last (bold) triangle, which means there is no consistent way to orient all the triangles in this triangulation.
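The inductive orientation check described above can be sketched as a breadth-first traversal over triangles sharing an edge. This is an illustrative sketch rather than a reference implementation: it assumes the input is a connected triangulated surface in which every edge lies in at most two triangles, and the two test triangulations (a 6-vertex cylinder and a 5-vertex Moebius band) are hypothetical examples, not the ones drawn in Figures 4.11 and 4.12.

```python
from collections import deque

def induced_edges(triangle):
    """Directed edges induced by the oriented triangle <a, b, c>."""
    a, b, c = triangle
    return [(a, b), (b, c), (c, a)]

def orient(triangles):
    """Return consistently oriented triangles, or None if none exist.

    Assumes a connected complex with every edge in at most two triangles.
    """
    triangles = [tuple(t) for t in triangles]
    by_edge = {}  # unordered edge -> indices of triangles containing it
    for index, t in enumerate(triangles):
        for edge in induced_edges(t):
            by_edge.setdefault(frozenset(edge), []).append(index)

    oriented = {0: triangles[0]}  # start from an arbitrary orientation
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for u, v in induced_edges(oriented[i]):
            for j in by_edge[frozenset((u, v))]:
                if j == i:
                    continue
                if j not in oriented:
                    a, b, c = triangles[j]
                    # A consistent neighbor induces the opposite edge (v, u).
                    fits = (v, u) in induced_edges((a, b, c))
                    oriented[j] = (a, b, c) if fits else (a, c, b)
                    queue.append(j)
                elif (v, u) not in induced_edges(oriented[j]):
                    return None  # conflicting requirements on triangle j
    if len(oriented) != len(triangles):
        raise ValueError("the complex is not connected")
    return [oriented[i] for i in range(len(triangles))]

# Hypothetical test triangulations.
cylinder = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4), (2, 0, 5), (0, 3, 5)]
moebius = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 0), (4, 0, 1)]

print(orient(cylinder) is not None)  # True: the cylinder is orientable
print(orient(moebius) is None)       # True: the Moebius band is not
```

Directed edges are used here in place of signed edges: by Definition 4.2.2 the triangle $\langle a, b, c\rangle$ induces $\langle b, c\rangle$, $-\langle a, c\rangle = \langle c, a\rangle$ and $\langle a, b\rangle$, so two triangles are oriented consistently exactly when they traverse the common edge in opposite directions.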
4.3 Connected sum of surfaces

One of the ways to make new surfaces (and actually manifolds in general) out of known ones is the connected sum.

Definition 4.3.1. Suppose $X$ and $Y$ are connected surfaces. Choose topological 2-discs $D_X \subset X$ and $D_Y \subset Y$, neither of which contains any boundary point of the surfaces. The corresponding boundaries of these discs are topological 1-spheres (circles) $S_X \subset X$ and $S_Y \subset Y$ respectively. The connected sum $X \# Y$ is obtained by removing the interiors of discs $D_X$ and $D_Y$ from $X$ and $Y$, and gluing the resulting spaces by identifying $S_X$ with $S_Y$.

See Figure 4.13 for a sketch of this construction. A few technical remarks about connected sums as defined above:

• It turns out that the topological type of $X \# Y$ does not depend on the choice of discs $D_X$, $D_Y$.

• A connected sum is a surface, whose boundary components correspond to the union of the boundary components of $X$ and the boundary components of $Y$.

• Surfaces $X$ and $Y$ are both orientable iff $X \# Y$ is orientable.

• For each surface $X$, the following holds: $X \# S^2 \cong X$.

Figure 4.13: Two tori (top) and their connected sum (bottom) obtained by identifying the boundaries of the two removed discs (center).

If abstract simplicial complexes $K$ and $L$ are triangulations of surfaces $X$ and $Y$ respectively and $X \cap Y = \emptyset$, we can obtain a triangulation $M$ of $X \# Y$ in the following way:

1. Choose triangles $D_X$ and $D_Y$ in $K$ and $L$ respectively, so that no point of these two triangles lies on the boundary of $X$ or $Y$.

2. Define $M = \big((K \setminus \{D_X\}) \cup (L \setminus \{D_Y\})\big) / \sim$, where $\sim$ stands for the identification of each of the boundary edges of $D_X$ with an appropriate boundary edge of $D_Y$.

In short, $M$ is obtained by removing $D_X$ and $D_Y$ from the union of $K$ and $L$, and then identifying the boundaries of the removed triangles. This procedure is a discrete version of the one in Definition 4.3.1.

Proposition 4.3.2. $\chi(X \# Y) = \chi(X) + \chi(Y) - 2$.

Proof.
Assume abstract simplicial complexes $K$ and $L$ are triangulations of surfaces $X$ and $Y$ respectively and $K \cap L = \emptyset$. It is obvious that $\chi(K \cup L) = \chi(K) + \chi(L)$. In order to obtain a triangulation of $X \# Y$ from $K \cup L$, we:

• Remove two triangles (change $-2$ to the Euler characteristic);

• Identify three pairs of vertices, meaning we have three vertices less (change $-3$ to the Euler characteristic);

• Identify three pairs of edges, meaning we have three edges less (change $+3$ to the Euler characteristic).

The total change to the Euler characteristic after these steps is $-2 - 3 + 3 = -2$.

4.4 Classification of surfaces

We can now describe the classification of surfaces. Let $T$ denote the torus and let $P$ denote the projective plane.

Figure 4.14: The three closed connected surfaces (the sphere $S^2$, the torus $T$ and the projective plane $P$), that are used to construct any other closed connected surface using the connected sum operation.

Theorem 4.4.1. [Classification Theorem for closed connected surfaces] Suppose $X$ is a closed connected surface. Then $X$ is homeomorphic to one of the following:

1. $S^2$.

2. $n$-torus $nT = \underbrace{T \# T \# \ldots \# T}_{n}$ for some $n \in \mathbb{N}$.

3. $nP = \underbrace{P \# P \# \ldots \# P}_{n}$ for some $n \in \mathbb{N}$.

It turns out that the surfaces appearing in Theorem 4.4.1 can be distinguished using orientability and the Euler characteristic. From the properties of the connected sum recall that (for each $n \in \mathbb{N}$) $S^2$ and $nT$ are orientable while $nP$ are not. Furthermore, using Proposition 4.3.2 we can inductively deduce³:

• $\chi(nT) = 2 - 2n$ as $\chi(T) = 0$.

• $\chi(nP) = 2 - n$ as $\chi(P) = 1$.

3 Using Proposition 4.3.2 we can deduce $\chi(2T) = \chi(T \# T) = \chi(T) + \chi(T) - 2 = 0 + 0 - 2 = -2$, $\chi(3T) = \chi(T) + \chi(T \# T) - 2 = 0 - 2 - 2 = -4$, and proceed inductively. The proof for $nP$ is analogous.

Consequently we obtain the following table of connected closed surfaces.

Table 4.1: A list of closed connected surfaces along with their Euler characteristic and orientability.

Surface    χ         Orientability
S^2        2         orientable
T          0         orientable
nT         2 - 2n    orientable
P          1         not orientable
nP         2 - n     not orientable

Theorem 4.4.1 motivates the following classification algorithm for a closed connected surface given as an abstract simplicial complex⁴ $K$:

1. Check for orientability of $K$.

2. Compute the Euler characteristic.

3. Consult Table 4.1.

4 I.e., we assume the triangulation $K$ is a connected combinatorial 2-manifold and has no boundary components.

Side note: A shortcut to computing $\chi$: while the Euler characteristic is formally defined on a triangulation, it turns out it can also be obtained from the representation of a surface in terms of a polygon with identified sides. For example, the representations of the torus in Figure 4.14 and the Klein bottle of Figure 4.3 have one 2-dimensional square, two 1-dimensional edges, and one vertex, yielding $\chi = 1 - 2 + 1 = 0$. The representation of the projective plane in Figure 4.4 has one 2-dimensional square, two 1-dimensional edges, and two vertices, yielding $\chi = 2 - 2 + 1 = 1$. This trick could assist with Figure 4.15. A justification will be provided in the chapter on discrete Morse theory.

Example 4.4.2. Which of the surfaces in Theorem 4.4.1 is the Klein bottle? We have already discovered that it is not orientable and that its Euler characteristic is 0. By the Classification Theorem the Klein bottle is homeomorphic to $P \# P$.

General surfaces

Theorem 4.4.1 can also be used to classify general surfaces admitting a finite triangulation. Suppose $X$ is a surface:

1. If $X$ is not connected, it is a disjoint union of connected surfaces and it obviously suffices to recognize each of its components.

2. If $X$ is connected and has a boundary $Y$, then $Y$ is a 1-manifold without boundary, meaning $Y$ is a disjoint union of $k$ copies of $S^1$ for some $k \in \mathbb{N}$. By glueing a disc along each component of $Y$ we obtain a closed connected surface $X'$, which we can recognize⁵. We conclude that $X$ is homeomorphic to $X'$ with $k$ discs removed⁶.

5 Such a gluing of discs does not change the orientability, i.e., $X$ is orientable iff $X'$ is. However, an addition of each disc increases the Euler characteristic by 1.

6 It turns out that the homeomorphic type does not depend on the discs we remove from $X'$, only on their number.

Example 4.4.3. It is easy to see that the cylinder $S^1 \times [0, 1]$ is obtained from $S^2$ by removing two discs. It is a bit harder to see how to get the Moebius band $M$ this way. It is easy to see that $M$ has one boundary component and has Euler characteristic⁷ 0. Gluing a disc along the boundary component we obtain a closed connected non-orientable surface of Euler characteristic 1, which is $P$. Hence the Moebius band is obtained by removing a disc from the projective plane.

7 We could count the simplices in Figure 4.12.

We are now ready to state a classification algorithm for a surface given as an abstract simplicial complex $K$:

1. Partition $K$ into its connected components and classify each of them.

2. For each component $K'$:

(a) Count the number $n(K')$ of boundary components of $K'$.

(b) Check for orientability of $K'$.

(c) Compute the Euler characteristic of $K'$.

(d) Let $Y$ be the surface matching the orientability of $K'$ and of Euler characteristic⁸ $\chi(K') + n(K')$ by Table 4.1.

(e) Surface $K'$ is homeomorphic to $Y$ with $n(K')$ many discs removed.

8 $\chi(K') + n(K')$ is the Euler characteristic of a surface obtained from $K'$ by gluing $n(K')$ discs along the boundary components of $K'$.

Figure 4.15: Which surfaces are these?

With this classification algorithm we can always determine whether two surfaces are homeomorphic or not.

4.5 Concluding remarks

Recap (highlights) of this chapter

• Surfaces, combinatorial surfaces;
• Orientation and orientability;
• Connected sum of surfaces;
• Classification of surfaces.

Background and applications

For most practical purposes, we live in a three-dimensional space. Objects in our everyday life are often modelled by surfaces enclosing the objects.
Outputs of many 3-D scans are given in terms of triangulated surfaces (for example, as .stl files). Surfaces and other higher-dimensional manifolds are also often assumed to be the underlying spaces in specific settings. A randomly generated bitmap image will seldom represent something reasonable, and yet there is a huge number of images that convey information to the human eye. The space of "recognizable" images is a huge subspace (perhaps a manifold) in the space of all bitmap images. Manifold recognition approaches attempt to detect the underlying manifolds from sample data.

Figure 4.16: Spheres $S^0$ (two points), $S^1$ (a circle), and $S^2$ (a sphere).

The third source of surfaces and manifolds are spaces described with two or more degrees of freedom: configuration spaces of molecules, robotic arms, etc. For example, the configuration space of a robotic arm (i.e., the space of all possible positions of the arm) with two independent joints, each of which allows a full rotational motion, is the torus $S^1 \times S^1$. On a similar note, given two annotated[9] points on $S^1$, the configuration space of all possible positions of the two points is again a torus $S^1 \times S^1$, since the degree of freedom of each point is $S^1$. It is interesting to observe the difference that appears if the points are not annotated[10]: in such a case all possible configurations actually form the Moebius band.

Figure 4.17: Discs $D^1$ (a line segment), $D^2$, and $D^3$.

[9] The annotation refers to the fact that each point has a name in the sense that if the two points lie at different positions, then exchanging them changes the configuration. The situation is sometimes also described using "ordered pairs" of points $(x, y)$ for which $x, y \in S^1$.
[10] In particular, if two non-annotated points lie at different positions, then exchanging them does not change the configuration as the pair represents the same collection of points on $S^1$. The situation is sometimes also described using "unordered pairs" of points $(x, y)$ by additionally identifying $(x, y) \sim (x', y')$ if $x = y'$ and $y = x'$.

On a more theoretical note, the question of whether each manifold admits a triangulation had been one of the focal points of topology in the previous century. It turns out that every manifold in dimension 3 or less admits a triangulation. Surprisingly enough, there are manifolds in higher dimensions that do not admit any triangulation. A detailed proof of the classification theorem of surfaces can be found in many textbooks[11],[12].

[11] L. Christine Kinsey. Topology of Surfaces. Springer New York, 1993. doi: 10.1007/978-1-4612-0899-0
[12] Jean Gallier and Dianna Xu. A Guide to the Classification Theorem for Compact Surfaces. Springer Berlin Heidelberg, 2013. doi: 10.1007/978-3-642-34364-3

Appendix: imagining $S^3$

In this appendix we will try to explain two ways of thinking about the three-dimensional sphere $S^3$ and spheres in general.

1. The first observation has to do with the relationship between discs and spheres. We have already mentioned that $S^{n-1}$ appears as the boundary of $D^n$. It should also be apparent (see Figure 4.18) that gluing two copies of a disc $D^n$ along their boundaries (copies of $S^{n-1}$) results in $S^n$. In particular, we obtain $S^3$ by taking two 3-discs (solid balls) and gluing them along the boundary.
2. As for the second observation we will refer to Figure 4.19. It turns out that $S^n$ can be obtained in the following way: pick two opposite points (the north pole and the south pole) and span an interval's worth of copies of the sphere $S^{n-1}$ between them, so that the spheres are shrinking as they are approaching the poles. Note that if we only take one point and have the spheres shrinking only as they approach that point (and have them increase otherwise) we obtain $D^n$ (see Figure 4.20 for some low-dimensional examples).

It may come as a surprise that these points of view can be observed in Dante's Divine Comedy, written about seven centuries ago.
Dante's description of the universe coincides with the topology of $S^3$: on one extreme are the depths of Hell (part of Inferno), from which Dante is guided (first by Virgil, later by Beatrice) through the spheres of Inferno, Purgatory, and Paradise, until he reaches Empyrean, the place which contains the essence of God. This description coincides with 2. above in terms of spheres. Equivalently, we may consider Inferno and Purgatory together as one 3-disc with center at Hell, and Paradise as another 3-disc with center at Empyrean: in this setup the Universe consists of both 3-discs that intersect along the surface of the Earth (which coincides with the boundary $S^2$).

Figure 4.18: Gluing two copies of a disc together results in a sphere.

Figure 4.19: Obtaining $S^n$ as a collection of spheres $S^{n-1}$ between two points.

Figure 4.20: Obtaining $\mathbb{R}^2$ (left) and $\mathbb{R}^3$ (right) as a collection of concentric spheres of all positive radii. In the same way we can decompose $\mathbb{R}^k$ for any $k \in \{1, 2, \ldots\}$.

5 Constructions of simplicial complexes

Topological methods typically take a simplicial complex as input. However, objects of interest are often not provided in this form. The first step in a topological treatment is thus frequently the construction of a simplicial complex. In this chapter we will present various constructions of complexes from a point cloud, i.e., from a finite collection of points in a metric space. These points may represent a sample of our shape, a collection of numerical data (a subset of $\mathbb{R}^n$), etc. Our discussion will include two properties we expect from such a construction. The first one is a relationship with the underlying shape, which is often guaranteed by the nerve theorem. The second one describes stability to perturbations and a way to measure distance between different constructions, as formalized by the interleaving property.

5.1 Rips complexes

Rips complexes represent perhaps the simplest construction of a complex from a finite collection of points.
Side note: Rips complexes are a special case of clique complexes. Suppose $G$ is a graph with vertices $V$ and edges $E$. The clique complex of $G$ is the abstract simplicial complex with the vertex set $V$, whose simplices satisfy the following condition: a subset $\sigma \subseteq V$ is a simplex iff each pair of vertices of $\sigma$ is an edge in $G$. A Rips complex is the clique complex of its 1-skeleton.

Definition 5.1.1. Let $X$ be a metric space and let a sample $S \subset X$ be a finite subset. Choose a scale $r \geq 0$. The Rips complex $\mathrm{Rips}(S, r)$ is an abstract simplicial complex defined by the following rules:

1. The vertex set is $S$.
2. A subset $\sigma \subseteq S$ is a simplex iff $\mathrm{Diam}(\sigma) \leq r$.

Side note: The diameter of a finite subset $A \subset X$ of a metric space $X$ is defined as $\mathrm{Diam}(A) = \max_{x, y \in A} d(x, y)$. The diameter of $X$ is defined as $\mathrm{Diam}(X) = \sup_{x, y \in X} d(x, y)$.

Remark 5.1.2. A few comments:

• Rips complexes are sometimes also called Vietoris-Rips complexes.
• $\mathrm{Rips}(S, r)$ represents a combinatorial snapshot of $S$ at scale $r$.
• $\mathrm{Diam}(\sigma)$ is the diameter of $\sigma$. Condition $\mathrm{Diam}(\sigma) \leq r$ means that the distance between any two vertices of $\sigma$ is at most $r$.
• It is easy to verify that Rips complexes are indeed abstract simplicial complexes: if $\sigma$ is a simplex then so is each of its subsets.

Figure 5.1: Five points in the plane and three corresponding Rips complexes $\mathrm{Rips}(S, r)$. Visualisation is assisted by circles of radius $r/2$ around each point. For much larger scales the Rips complex is not planar and eventually becomes 4-dimensional.

Remark 5.1.3. Some properties of the Rips complexes:

1. Rips complexes are often the preferred construction in TDA due to their simplicity in terms of the definition and computation.
2. $\mathrm{Rips}(S, r)$ is an abstract simplicial complex, typically not embeddable in $X$.
3. For $r$ smaller than the smallest pairwise distance between the points in $S$, $\mathrm{Rips}(S, r)$ is a discrete set, i.e., a complex with no edges or higher-dimensional simplices.
4. For $r$ at least as large as $\mathrm{Diam}(S)$, i.e., the largest pairwise distance between the points in $S$, $\mathrm{Rips}(S, r)$ is the $(|S| - 1)$-simplex, i.e., the simplicial complex on $S$ containing all subsets of $S$.
5. If $r_1 \leq r_2$, then $\mathrm{Rips}(S, r_1) \subseteq \mathrm{Rips}(S, r_2)$.

Figure 5.2: Suppose we are given three points in the plane. These three points span a triangle in the Rips complex for $r$ greater than or equal to the maximal pairwise distance between these points. Equivalently, the three balls of radius $r/2$ should pairwise intersect.

Definition 5.1.4. Let $X$ be a metric space and let a sample $S \subset X$ be a finite subset. The Rips filtration on $S$ is the collection of abstract simplicial complexes $\{\mathrm{Rips}(S, r)\}_{r \geq 0}$ along with inclusions $i_{r_1, r_2} \colon \mathrm{Rips}(S, r_1) \hookrightarrow \mathrm{Rips}(S, r_2)$ for all $r_1 \leq r_2$.

Side note: More generally, a filtration of a simplicial complex $K$ is a family of subcomplexes $\{K_r\}_{r \geq 0}$ indexed by a parameter $r$, such that $K_r \leq K_{r'}$ for all $r < r'$. The Rips filtration on a finite set $S$ is a filtration of the $(|S| - 1)$-simplex.

A Rips filtration provides the collection of all Rips complexes on $S$. While a single Rips complex depends on the choice of the scale, the filtration does not. Filtrations will play a fundamental role later in the definition of persistent homology.

5.2 Čech complexes

Definition 5.2.1. Let $X$ be a metric space and let a sample $S \subset X$ be a finite subset. Choose a scale $r \geq 0$. The Čech complex $\mathrm{Cech}(S, r)$ is an abstract simplicial complex defined by the following rules:

1. The vertex set is $S$.
2. A subset $\sigma \subseteq S$ is a simplex iff $\bigcap_{x \in \sigma} B(x, r) \neq \emptyset$.

Side note: Recall $B(x, r) = \{y \in X \mid d(x, y) \leq r\}$ is a closed ball.

Side note: Čech complexes are a special case of nerve complexes, a connection that will be explained below.

Figure 5.3: Five points in the plane and three corresponding Čech complexes $\mathrm{Cech}(S, r)$. Visualisation is assisted by circles of radius $r$ around each point. For much larger scales the Čech complex is not planar and eventually becomes 4-dimensional.

Remark 5.2.2. A few comments about the definition:

• Čech complexes are a classical topological construction used in many contexts throughout topology.
• $\mathrm{Cech}(S, r)$ represents a combinatorial snapshot of $S$ at scale $r$.
• It is easy to verify that Čech complexes are indeed abstract simplicial complexes.

Remark 5.2.3. Some properties of the Čech complexes:

1. While harder to compute[1], Čech complexes are attractive due to a well understood geometric interpretation, which will be explained in the next section within the context of nerve complexes.
2. $\mathrm{Cech}(S, r)$ is an abstract simplicial complex, typically not embeddable in $X$, although it is often homotopy equivalent[2] to a subset of $X$.
3. For $r$ smaller than one half of the smallest pairwise distance between the points in $S$, $\mathrm{Cech}(S, r)$ is a discrete set.
4. For $r$ at least as large as twice the largest pairwise distance between points in $S$, $\mathrm{Cech}(S, r)$ is the $(|S| - 1)$-simplex.
5. If $r_1 \leq r_2$, then $\mathrm{Cech}(S, r_1) \subseteq \mathrm{Cech}(S, r_2)$.
6. It is easy to verify[3] that $\mathrm{Cech}(S, r) \subseteq \mathrm{Rips}(S, 2r)$.
7. It is also easy to see that $\mathrm{Rips}(S, r) \subseteq \mathrm{Cech}(S, r)$. A non-trivial inclusion $\mathrm{Rips}(S, r\sqrt{2}) \subseteq \mathrm{Cech}(S, r)$ holds in Euclidean spaces by Jung's theorem[4].

[1] See the MiniBall algorithm in the Appendix.
[2] See the nerve theorem below for details.
[3] If balls of radius $r$ intersect then the pairwise distances between the centers are at most $2r$.
[4] Theorem 5.2.4 (Jung's theorem). If $D$ is the diameter of a finite subset $F \subset \mathbb{R}^n$, then $F$ is contained in a ball of radius at most $D \sqrt{\frac{n}{2(n+1)}}$. For $X = (\mathbb{R}^n, d_2)$ we actually obtain $\mathrm{Rips}\left(S, r\sqrt{\frac{2(n+1)}{n}}\right) \subseteq \mathrm{Cech}(S, r)$. The factor $\sqrt{2}$ in item 7 of Remark 5.2.3 is thus the largest factor that works for all $n$ simultaneously.

Definition 5.2.5. Let $X$ be a metric space and let a sample $S \subset X$ be a finite subset. The Čech filtration of $S$ is the collection of abstract simplicial complexes $\{\mathrm{Cech}(S, r)\}_{r \geq 0}$ along with inclusions $i_{r_1, r_2} \colon \mathrm{Cech}(S, r_1) \hookrightarrow \mathrm{Cech}(S, r_2)$ for all $r_1 \leq r_2$.

As was the case with Rips filtrations, Čech filtrations also provide a scale-free approximation of the underlying set by simplicial complexes. The smallest example demonstrating a difference between the two constructions is given in Figure 5.4.

Figure 5.4: Suppose we are given three points in the plane. These three points span a triangle in the Čech complex iff the three balls of radius $r$ intersect. The left complex consisting of three edges and no triangle does not appear as a Rips complex of any triple of points.

5.3 Nerve complexes

Čech complexes are a special case of a classical topological construction called the nerve.

Definition 5.3.1. For $k \in \mathbb{N}$ let $\mathcal{U} = \{U_1, U_2, \ldots, U_k\}$ be a collection of subsets of $X$. The nerve of $\mathcal{U}$ is the abstract simplicial complex $\mathcal{N}(\mathcal{U})$ defined by the following rules:

1. The vertex set is $\mathcal{U} = \{U_1, U_2, \ldots, U_k\}$, consisting of $k$ elements.
2. A subset $\sigma \subseteq \mathcal{U}$ is a simplex iff $\bigcap_{i \in \sigma} U_i \neq \emptyset$.

A Čech complex is the nerve of the corresponding collection of $r$-balls, i.e., $\mathrm{Cech}(S, r) = \mathcal{N}(\{B(s, r)\}_{s \in S})$. Another example is the Delaunay triangulation, which is the nerve of the Voronoi diagram.

Figure 5.5: An example of a nerve: a collection $\mathcal{U}$ and its nerve $\mathcal{N}(\mathcal{U})$.

Figure 5.6: Two examples of Čech complexes: balls and the corresponding complex superimposed (left), complex only (center) and the union of balls homotopy equivalent to the complex by the nerve theorem (right).

One of the main advantages of nerve complexes is that their homotopy type in some cases represents the union of the elements of $\mathcal{U}$. This is formalized within the context of the nerve theorem, of which we now state a special case.

Side note: The nerve theorem actually holds much more generally. For example, assume each finite intersection of sets of $\mathcal{U}$ (including each member $U \in \mathcal{U}$, since it appears as the intersection of $\{U\} \subseteq \mathcal{U}$) is either empty or contractible. If $\mathcal{U}$ is a finite collection of closed subsets in $\mathbb{R}^n$, or an arbitrary collection of open sets in a metric space, then $\bigcup_{U \in \mathcal{U}} U \simeq \mathcal{N}(\mathcal{U})$. This is a stronger statement than Theorem 5.3.2 by Lemma 5.3.3.

Theorem 5.3.2. [Nerve theorem] Let $n \in \mathbb{N}$ and assume a collection $\mathcal{U} = \{U_1, U_2, \ldots, U_k\}$ consists of closed convex subsets of $\mathbb{R}^n$. Then $U_1 \cup U_2 \cup \ldots \cup U_k \simeq \mathcal{N}(\mathcal{U})$.

An idea of a proof is given in the Appendix. The nerve theorem does not hold for an arbitrary collection of subsets as Figure 5.5 demonstrates.

Lemma 5.3.3. Let $n \in \mathbb{N}$. Each convex subset of $\mathbb{R}^n$ is contractible.

Proof. Assume $A \subset \mathbb{R}^n$ is convex and fix $x_0 \in A$. We can slide each $a \in A$ into $x_0$ along the line segment from $a$ to $x_0$. This results in a homotopy $H(a, t) = (1 - t)a + t x_0$ between the identity map on $A$ and the constant map at $x_0$, hence $A$ is contractible.

Figure 5.7: A sketch of Lemma 5.3.3.

For Delaunay triangulations the nerve theorem provides no additional information. As the Voronoi cells are convex, the nerve theorem implies that the Delaunay triangulation is contractible, a fact we already know as it triangulates a convex (hence contractible by Lemma 5.3.3) planar region.

On the other hand, the nerve theorem provides a homotopical description of the Čech complex. As Euclidean balls are convex, we obtain
$$\mathrm{Cech}(S, r) = \mathcal{N}(\{B(s, r)\}_{s \in S}) \simeq \bigcup_{s \in S} B(s, r),$$
i.e., the Čech complex $\mathrm{Cech}(S, r)$ has the homotopy type of the $r$-neighborhood of $S$. This fact is the foremost reason for the use of Čech complexes: while they are harder to compute than Rips complexes, we know that the obtained homotopy type represents the $r$-neighborhood of $S$. In this spirit we can interpret Figure 5.6.
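
Returning to Definition 5.1.1, the Rips complex of a small planar sample can be computed by brute force. The following is a minimal sketch, not an efficient implementation: the function name `rips_complex`, the cut-off `max_dim` and the sample points are assumptions made for illustration. Simplices are represented as tuples of point indices, and the three sample points have pairwise distances 3, 4 and 5.

```python
from itertools import combinations
import math

def rips_complex(points, r, max_dim=2):
    """Simplices of Rips(points, r) up to dimension max_dim (hypothetical helper).

    A set of indices is a simplex iff all pairwise distances are at most r,
    which is the condition Diam(sigma) <= r of Definition 5.1.1.
    """
    n = len(points)
    simplices = [(i,) for i in range(n)]            # the vertex set is S
    for k in range(2, max_dim + 2):                 # k = number of vertices
        for sigma in combinations(range(n), k):
            if all(math.dist(points[i], points[j]) <= r
                   for i, j in combinations(sigma, 2)):
                simplices.append(sigma)
    return simplices

# Three points with pairwise distances 3, 4 and 5.
S = [(0, 0), (3, 0), (0, 4)]
for r in (2.9, 3, 4, 5):
    print(r, rips_complex(S, r))
```

On this sample the complex is discrete below the smallest pairwise distance, gains edges as the scale grows, and becomes the full 2-simplex at scale $\mathrm{Diam}(S) = 5$, in line with items 3, 4 and 5 of Remark 5.1.3.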

Furthermore, this observation can be used to prove reconstruction results: given a closed connected surface $X$ in a Euclidean space, for each sufficiently small scale parameter $r > 0$ and for each sufficiently dense finite subset $S \subset X$ we have $X \simeq \mathrm{Cech}(S, r)$, i.e., the homotopy type of a space $X$ can be reconstructed using Čech complexes (in Euclidean or geodesic metric)[5]. In the Euclidean metric this holds as[6] $X \simeq N(X, r)$ for small $r$, where $N(X, r)$ denotes the $r$-neighborhood of $X$.

[5] The same result holds for more general spaces and under appropriate conditions also for Rips complexes. However, almost all of the proofs are based on the application of the nerve theorem to Čech complexes.
[6] Imagine a circle in the plane, a knot in $\mathbb{R}^3$ or a surface in $\mathbb{R}^3$: its small thickening is homotopy equivalent to the space itself.

Alpha complexes

Alpha complexes are a fusion between planar Čech complexes and Delaunay triangulations.

Definition 5.3.4. Let $r \geq 0$ and assume $S \subset \mathbb{R}^n$ is a finite collection of points satisfying a general position property: no $n + 2$ points of $S$ lie on the same $(n - 1)$-sphere. For each $s \in S$ let $V_s$ denote the corresponding Voronoi cell. The alpha complex of $S$ at scale $r$ is the following nerve: $\mathcal{N}\left(\{V_s \cap B(s, r)\}_{s \in S}\right)$.

Assume $S$ is as in Definition 5.3.4. While $\mathrm{Cech}(S, r)$ may be of arbitrarily high dimension[7], the nerve theorem guarantees it is homotopy equivalent to a subset of $\mathbb{R}^n$. The alpha complex of $S$ at scale $r$ is a complex[8] that is homotopy equivalent to $\mathrm{Cech}(S, r)$. To see this note that by the nerve theorem[9] both are homotopy equivalent to
$$\bigcup_{s \in S} B(s, r) = \bigcup_{s \in S} \left(V_s \cap B(s, r)\right).$$

[7] This implies it may be computationally inefficient.
[8] Note that it is a subcomplex of the Delaunay triangulation on $S$. It is planar if $S$ is.
[9] Sets $V_s \cap B(s, r)$ are intersections of closed convex sets, thus closed and convex themselves.
Thus alpha complexes may be seen as an efficient way of obtaining the homotopy type of a Čech complex in $\mathbb{R}^n$.

Side note: To see $\bigcup_{s \in S} B(s, r) = \bigcup_{s \in S} \left(V_s \cap B(s, r)\right)$, take any $x \in \bigcup_{s \in S} B(s, r)$ and note that if $s \in S$ is a closest point to $x$ in $S$, then $x \in V_s \cap B(s, r)$.

Another way of thinking of alpha complexes is as a model for molecules. Each atom in a molecule has a radius[10] and touches (rather than intersects) other atoms within the range.

[10] Assume all the radii are the same. For different radii there is a well studied concept of a weighted alpha complex.

Figure 5.8: Alpha complexes corresponding to the situation in Figure 5.6. Note that the alpha complexes are smaller (or equal) yet still homotopy equivalent to the corresponding Čech complexes. For larger $r$ the Čech complexes become higher-dimensional while the alpha complexes of planar subsets maintain the dimensionality bound 2. A decomposition into regions of the form $V_s \cap B(s, r)$ (on the left) mimics the decomposition of molecules into atoms.

Mapper

Another example of a construction based on the idea of the nerve is Mapper. In contrast to the constructions above it is typically[11] a one-dimensional simplicial complex, i.e., a graph. Mapper can be thought of as a one-dimensional sketch of a space $X$ as detected through the lens of a single map on $X$.

[11] By the definition that will be provided, a Mapper is a simplicial complex of arbitrary dimension. However, our discussion and examples will focus on the one-dimensional case, as do the practical applications in which Mapper is used.

We first describe a theoretical setup. Assume:

• $X$ is a metric space;
• $f \colon X \to [0, 1]$ is a (continuous) map[12];
• $\mathcal{U}$ is a collection[13] of subsets of $[0, 1]$, whose union is $[0, 1]$.

[12] We restrict ourselves to the cases when the target space is $[0, 1]$. However, there is no theoretical reason for doing so and the construction is well defined even if we replace $[0, 1]$ by some more complicated space.
[13] Typically we restrict to cases when no three subsets of $\mathcal{U}$ intersect. In such cases Mapper is a one-dimensional simplicial complex.

Definition 5.3.5. For each $U \in \mathcal{U}$ let $\mathcal{V}_U$ denote the collection of all components of $f^{-1}(U)$ and define $\mathcal{V} = \bigcup_{U \in \mathcal{U}} \mathcal{V}_U$ as the collection of all subsets of $X$ that appear as a component of a preimage $f^{-1}(U)$ for some $U \in \mathcal{U}$. Mapper is defined as $M(X, f, \mathcal{U}) = \mathcal{N}(\mathcal{V})$.

An example is provided by Figure 5.9.

Figure 5.9: The construction of Mapper: a space (the torus on the left), a continuous map (projection $f$ onto the vertical axis), cover $\mathcal{U} = \{U_1, U_2, U_3\}$ of the interval $[0, 1]$, a decomposition of the preimages $f^{-1}(U_1)$, $f^{-1}(U_2)$, $f^{-1}(U_3)$ into the four components and the resulting graph (right).

In practice data is often given as a finite set of points along with certain measurements. For example, we may have a collection of patients along with their heart rate and blood pressure, or a collection of basketball players along with their statistics, etc. In this case a modified practical setup comes into play. Assume:

• $X$ is a finite set;
• $f \colon X \to [0, 1]$ is a map (measurement);
• $\mathcal{U}$ is a chosen cover of $[0, 1]$ by overlapping intervals, typically of fixed length $\varepsilon > 0$;
• $P$ is a chosen clustering scheme[14] on $X$.

[14] This step possibly includes additional choices of clustering algorithms.

Definition 5.3.6. For each $U \in \mathcal{U}$ let $\mathcal{V}_U$ denote the collection of all clusters of $f^{-1}(U)$ with respect to $P$ and define $\mathcal{V} = \bigcup_{U \in \mathcal{U}} \mathcal{V}_U$ as the collection of all subsets of $X$ that appear as a cluster of a preimage $f^{-1}(U)$ for some $U \in \mathcal{U}$. Mapper is defined as $M(X, f, \mathcal{U}) = \mathcal{N}(\mathcal{V})$.

Figure 5.10: The construction of a Mapper when $X$ is a point cloud.

While a point cloud and a number of measurements on it are often given, one has to construct a single function $f$ and a cover $\mathcal{U}$, and choose other parameters very carefully to extract the desired information.
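
The practical setup of Definition 5.3.6 can be sketched on a toy point cloud. Everything specific below is an assumption chosen for illustration: the sample (40 evenly spaced points on the unit circle), the measurement (height rescaled to $[0, 1]$), the cover by three overlapping intervals, and the clustering scheme (single linkage with a hypothetical threshold `delta`). The helper names `clusters` and `mapper` are likewise hypothetical.

```python
import math

def clusters(points, idx, delta):
    """Single-linkage clusters of the points indexed by idx: two points share
    a cluster when joined by a chain of steps of length at most delta."""
    remaining, out = set(idx), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= delta}
            remaining -= near
            comp |= near
            stack.extend(near)
        out.append(frozenset(comp))
    return out

def mapper(points, f, cover, delta):
    """1-skeleton of the nerve of all clusters of all preimages."""
    vertices = []
    for lo, hi in cover:
        idx = [i for i, p in enumerate(points) if lo <= f(p) <= hi]
        vertices += clusters(points, idx, delta)
    # Two clusters span an edge iff they share at least one point.
    edges = [(a, b)
             for a in range(len(vertices))
             for b in range(a + 1, len(vertices))
             if vertices[a] & vertices[b]]
    return vertices, edges

# Toy data: points on the unit circle, measured by rescaled height.
X = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
     for k in range(40)]
height = lambda p: (p[1] + 1) / 2
cover = [(0.0, 0.4), (0.3, 0.7), (0.6, 1.0)]
V, E = mapper(X, height, cover, delta=0.3)
print(len(V), "vertices,", len(E), "edges")
```

With these choices the bottom and top intervals each contribute one cluster and the middle interval contributes two, so the resulting graph is a cycle on four vertices, a one-dimensional sketch of the circle. Different choices of cover or threshold may produce a different graph.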

Mapper is usually not analyzed further with topological tools but rather visualized, which is why the one-dimensionality is preferable.

5.4 Interleaving properties

Given a finite subset of a metric space we have described how to associate various complexes with that set. If we think for a moment about finite abstract complexes we note that these objects are discrete: it would be hard to define an obvious distance between abstract simplicial complexes. On the other hand, we have a continuous selection of inputs and input parameters: scale $r$ is typically positive and there are reasonable notions of a distance between finite subsets of a metric space. As a result any assignment of a single complex is bound to have discontinuities[15] (instabilities) of some sort, see Example 5.4.1 for a demonstration.

Example 5.4.1. Let $X = \{0, 1\} \subset \mathbb{R}$. Note that $\mathrm{Rips}(X, r)$ changes discontinuously at $r = 1$: while $\mathrm{Rips}(X, 1)$ is a single edge (along with the two boundary points), for each $r < 1$ the complex $\mathrm{Rips}(X, r)$ consists of only two vertices.

[15] Unless it assigns a constant complex, of course.

However, it turns out we can define a distance on filtrations, for which the assignment of a filtration[16] becomes a continuous function of the input set and the scale parameter.[17]

[16] Of course, this eliminates the dependency on the scale parameter $r$.
[17] For the sake of simplicity we will restrict ourselves to the mentioned Rips and Čech filtrations although the concept can be defined more generally.

Definition 5.4.2. Choose $\varepsilon > 0$. Filtrations $\{A_r\}_{r \geq 0}$ and $\{B_r\}_{r \geq 0}$ (obtained by the Rips or the Čech construction) are $\varepsilon$-interleaved if there exist simplicial maps $\varphi_r \colon A_r \to B_{r + \varepsilon}$ and $\psi_r \colon B_r \to A_{r + \varepsilon}$ such that $\varphi_{r + \varepsilon} \circ \psi_r \colon B_r \to B_{r + 2\varepsilon}$ and $\psi_{r + \varepsilon} \circ \varphi_r \colon A_r \to A_{r + 2\varepsilon}$ are equal to the corresponding inclusions.

Maps of Definition 5.4.2 can be visualised by drawing the following commutative[18] "ladder" diagram.
[18] The adjective "commutative" refers to the fact that all maps commute, i.e., going from one complex to another through any viable sequence of maps gives the same result.

$$\cdots \to A_r \to A_{r + \varepsilon} \to A_{r + 2\varepsilon} \to \cdots$$
$$\cdots \to B_r \to B_{r + \varepsilon} \to B_{r + 2\varepsilon} \to \cdots$$

The two rows are connected by the diagonal maps $\varphi_r \colon A_r \to B_{r + \varepsilon}$ and $\psi_r \colon B_r \to A_{r + \varepsilon}$.

Definition 5.4.3. Given two filtrations their interleaving distance is defined as the infimum of all values $\varepsilon > 0$, for which the filtrations are $\varepsilon$-interleaved.

It turns out that in our context (Rips and Čech filtrations on finite collections of points) the interleaving distance is a metric on the set of filtrations. In contrast, recall that there seems to be no geometrically meaningful metric on the set of single finite simplicial complexes.

Example 5.4.4. Let $X = \{0, 1\} \subset \mathbb{R}$ and $Y = \{0.1, 1.2\} \subset \mathbb{R}$. $\mathrm{Rips}(X, r)$ consists of:

• two points for $r < 1$;
• one edge for $r \geq 1$.

$\mathrm{Rips}(Y, r)$ consists of:

• two points for $r < 1.1$;
• one edge for $r \geq 1.1$.

The filtrations are $0.1$-interleaved.

The concept of interleaving will play an important role later in the context of the stability of persistent homology. At this point we can use it to phrase two proximity results.

Theorem 5.4.5 (Stability with respect to spaces). Choose $\varepsilon > 0$ and assume $X = \{x_1, x_2, \ldots, x_k\}$ and $Y = \{y_1, y_2, \ldots, y_k\}$ with $d(x_i, y_i) \leq \varepsilon$ for all $i$, i.e., $X$ and $Y$ each consist of $k$ points, such that the corresponding distances are at most $\varepsilon$. Then:

• The Rips filtrations of $X$ and $Y$ are $2\varepsilon$-interleaved.
• The Čech filtrations of $X$ and $Y$ are $\varepsilon$-interleaved.

Proof. It follows directly from the triangle inequality (see Figure 5.11) that if a subset $\sigma \subset X$ is of diameter $r$, then the corresponding[19] subset $\tau \subset Y$ is of diameter at most $r + 2\varepsilon$. Hence if $\sigma$ is a simplex in $\mathrm{Rips}(X, r)$, then $\tau$ is a simplex in $\mathrm{Rips}(Y, r + 2\varepsilon)$. Consequently we may deduce that:

• maps $\mathrm{Rips}(X, r) \to \mathrm{Rips}(Y, r + 2\varepsilon)$ defined by $x_i \mapsto y_i$ are simplicial;
• maps $\mathrm{Rips}(Y, r) \to \mathrm{Rips}(X, r + 2\varepsilon)$ defined by $y_i \mapsto x_i$ are simplicial;
• as the above two maps obviously commute with the inclusions we conclude that the Rips filtrations of $X$ and $Y$ are $2\varepsilon$-interleaved;
• in a similar fashion we may conclude that the Čech filtrations of $X$ and $Y$ are $\varepsilon$-interleaved.

[19] Subset $\tau$ is formed by taking the points of $Y$ with the same indices as appear in the points of $\sigma$.

Figure 5.11: If $d(x_1, x_2) \leq r$ and $d(x_i, y_i) \leq \varepsilon$ then it is apparent that $d(y_1, y_2) \leq r + 2\varepsilon$.

These conclusions tell us that if we perturb our point set slightly, the resulting filtration does not change much in terms of the interleaving distance, i.e., the construction of a filtration is stable. In a similar fashion we can express the relationship between Rips and Čech filtrations.

Rips-Čech correlation

Recall that $\mathrm{Cech}(S, r) \subseteq \mathrm{Rips}(S, 2r)$ and $\mathrm{Rips}(S, r) \subseteq \mathrm{Cech}(S, r) \subseteq \mathrm{Cech}(S, 2r)$. This implies that the Rips and Čech filtrations, when constructed with logarithmic scales[20], are $(\log 2)$-interleaved, i.e., $\{\mathrm{Rips}(S, e^r)\}_{r \geq 0}$ and $\{\mathrm{Cech}(S, e^r)\}_{r \geq 0}$ are $(\log 2)$-interleaved.

[20] Note that $\mathrm{Cech}(S, e^r) \subseteq \mathrm{Rips}(S, 2e^r) = \mathrm{Rips}(S, e^{\log 2 + r})$ and similarly $\mathrm{Rips}(S, e^r) \subseteq \mathrm{Cech}(S, e^r) \subseteq \mathrm{Cech}(S, e^{\log 2 + r})$. These inclusions are the interleaving maps.

5.5 Concluding remarks

Recap (highlights) of this chapter

• Complexes: Rips, Čech, nerve, alpha, Mapper;
• Nerve theorem;
• Interleaving.

Background and applications

Constructions of simplicial complexes greatly depend on the intended use. Rips and Čech complexes along with the nerve construction are relatively well understood and were originally introduced for theoretical purposes in the first half of the twentieth century. Čech complexes and particularly nerves were instrumental in the development of Čech homology and cohomology theories, which later led to shape theory. Rips complexes have seen various independent introductions, including in geometric group theory. Their use has recently been extended to the applied setting. They are the complexes most exposed to the curse of dimensionality. Alpha complexes arose decades later within the realm of computational geometry and are intended for computationally intense applications. For more details and references on the historical background of these complexes see a textbook[21]. There is also a modern perspective[22] on the nerve theorem, Dowker duality, and connections to Rips complexes. Mapper is a more recent construction[23]. It is often thought of as a low-dimensional projection method and has turned out to be a commercial success.

Side note: The curse of dimensionality: an inconvenient fact that the number of simplices typically grows fast with the dimension of a simplicial complex. This presents challenges for their computational applications, which are partially addressed by alternative constructions of complexes.

At about the same time the interleaving distance emerged[24] as a measure of stability of filtrations and persistent homology, although equivalent concepts have been known in pure topology for a long time.

[21] Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. Applied Mathematics. American Mathematical Society, 2010. doi: 10.1090/mbk/069
[22] Žiga Virk. Rips complexes as nerves and a functorial Dowker-nerve diagram. Mediterranean Journal of Mathematics, 18(2):58, 2021. doi: 10.1007/s00009-021-01699-4
[23] Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson. Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. In M. Botsch, R. Pajarola, B. Chen, and M. Zwicker, editors, Eurographics Symposium on Point-Based Graphics. The Eurographics Association, 2007. doi: 10.2312/SPBG/SPBG07/091-100
[24] Steve Y. Oudot. Persistence Theory: From Quiver Representations to Data Analysis. Number 209 in Mathematical Surveys and Monographs. American Mathematical Society, 2015. doi: 10.1090/surv/209

Appendix: the MiniBall algorithm

Given a finite subset $\sigma \subset X \subset \mathbb{R}^n$ the MiniBall algorithm[25] is a recursive algorithm that returns the miniball of $\sigma$, i.e., the minimal[26] ball in $\mathbb{R}^n$ containing $\sigma$. As such, the algorithm provides a computational verification of the containment of $\sigma$ in a Čech complex: $\sigma \in \mathrm{Cech}(X, r)$ iff[27] the radius of the miniball is at most $r$. As the radius of the ball is also provided, the algorithm actually provides the exact lower bound for the scales $r$ at which $\sigma$ is a simplex in $\mathrm{Cech}(X, r)$, hence a single execution of the algorithm suffices for the entire filtration.

[25] Emo Welzl. Smallest enclosing disks (balls and ellipsoids). In Hermann Maurer, editor, New Results and New Trends in Computer Science, pages 359–370, Berlin, Heidelberg, 1991. Springer Berlin Heidelberg
[26] Minimality is considered with respect to the radius. Such a ball is unique.
[27] $z \in \bigcap_{x \in \sigma} B(x, r) \iff \sigma \subset B(z, r)$.

• Input: disjoint finite sets $\tau, \nu \subset \mathbb{R}^n$.
• Output: the minimal ball with:
  – $\tau$ in the ball;
  – $\nu$ on the boundary of the ball.

Side note (warning): Given random finite $\tau, \nu \subset \mathbb{R}^n$, there typically exists no ball containing $\tau$ and having $\nu$ on the boundary. The algorithm is designed so that it is only called on pairs $(\tau, \nu)$ for which this condition is satisfied.

Algorithm 1: Miniball($\tau$, $\nu$).
  if $\tau = \emptyset$ then
    compute miniball $B$ directly;
  else
    choose $u \in \tau$;
    $B$ = Miniball($\tau \setminus \{u\}$, $\nu$);
    if $u \notin B$ then
      $B$ = Miniball($\tau \setminus \{u\}$, $\nu \cup \{u\}$);
  return $B$

The algorithm is initiated by calling Miniball($\sigma$, $\emptyset$)[28] and terminates with the miniball $B$ when $\tau = \emptyset$. It inductively scans through the points of $\tau$. At each step it either removes a point (if removing it from the set does not change the miniball of the set) or puts a point into $\nu$ (if removing the point decreases the miniball).

[28] I.e., $\tau = \sigma$, $\nu = \emptyset$.
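
The recursion of Algorithm 1 can be sketched in a few lines for points in the plane. This is an illustrative sketch under stated assumptions rather than a robust implementation: it is restricted to $n = 2$, the names `miniball` and `ball_from_boundary` and the tolerance `EPS` are hypothetical, the empty ball is encoded by radius $-1$, and the recursion also stops as soon as three boundary points are collected (as in Welzl's original formulation), which guards against rounding errors.

```python
import math

EPS = 1e-9  # tolerance for the containment test (an assumption)

def ball_from_boundary(nu):
    """Smallest circle having all points of nu (at most 3) on its boundary."""
    if not nu:
        return (0.0, 0.0), -1.0          # "empty ball": contains no point
    if len(nu) == 1:
        return nu[0], 0.0
    if len(nu) == 2:
        (ax, ay), (bx, by) = nu
        return ((ax + bx) / 2, (ay + by) / 2), math.dist(nu[0], nu[1]) / 2
    # Circumcircle of three non-collinear points (determinant formula).
    (ax, ay), (bx, by), (cx, cy) = nu
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.dist((ux, uy), nu[0])

def miniball(tau, nu=()):
    """Minimal ball containing tau with nu on its boundary, as (center, radius)."""
    tau, nu = list(tau), list(nu)
    if not tau or len(nu) == 3:
        return ball_from_boundary(nu)
    u = tau[-1]                                   # choose u in tau
    center, radius = miniball(tau[:-1], nu)
    if math.dist(center, u) > radius + EPS:       # u lies outside B
        center, radius = miniball(tau[:-1], nu + [u])
    return center, radius

# Vertices of an equilateral triangle with side 1.
sigma = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
center, radius = miniball(sigma)
print(center, radius)
```

For the equilateral triangle the returned radius is the circumradius $1/\sqrt{3} \approx 0.577$. The triangle is therefore a simplex of $\mathrm{Cech}(\sigma, r)$ only for $r \geq 1/\sqrt{3}$, although all three edges are already present at $r = 1/2$, which matches the situation of Figure 5.4.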

When $\tau = \emptyset$ the set $\nu$ consists of at most $n + 1$ points that lie on the boundary of the miniball of $\sigma$ and determine it. In this case we can use the standard circumsphere and circumradius formulas in terms of determinants to get the miniball.

Appendix: a sketch of a proof of the nerve theorem 5.3.2

A special case of the proof is illustrated by Figures 5.12, 5.13, and 5.14.

Proof. For the sake of simplicity[29] let us assume the nerve is of dimension 1, i.e., all triple intersections of sets of $\mathcal{U}$ are empty. Define $X = U_1 \cup U_2 \cup \ldots \cup U_k$ and $Z \subset X \times \mathcal{N}(\mathcal{U})$ as
$$Z = \bigcup_{\sigma \in \mathcal{N}(\mathcal{U})} \Big( \bigcap_{s \in \sigma} U_s \Big) \times \sigma.$$
We will prove that $Z \simeq X$ and $Z \simeq \mathcal{N}(\mathcal{U})$.

In order to prove $Z \simeq X$ note that for each $x \in X$ the section $(\{x\} \times \mathcal{N}(\mathcal{U})) \cap Z$ is a simplex[30] in the nerve spanned by all indices $s$ for which $x \in U_s$. Contracting each such simplex to a point in a synchronized manner for each $x \in X$ we obtain a deformation of $Z$ to $X$, hence $Z \simeq X$.

In order to prove $Z \simeq \mathcal{N}(\mathcal{U})$ note that for each $y \in \mathcal{N}(\mathcal{U})$ the section $(X \times \{y\}) \cap Z$ is a contractible set by assumptions. Contract first the sections of this form for all non-vertices $y$, and then conclude by contracting all the sections for vertices. We obtain a deformation of $Z$ to $\mathcal{N}(\mathcal{U})$, hence $Z \simeq \mathcal{N}(\mathcal{U})$.

[29] A complete general proof is much more technical but broadly follows the same steps as are presented here.
[30] In our case, either an edge or a vertex.

Figure 5.12: A collection $\mathcal{U}$ of subsets of a circle $X = S^1$ (left), the corresponding nerve $\mathcal{N}(\mathcal{U})$ (center) and space $Z$ constructed in the proof (right). The sets of $\mathcal{U}$ are illustrated as subsets of the plane for greater clarity, while formally $\mathcal{U}$ consists of their intersections with $X$.

Figure 5.13: Proving $Z \simeq S^1$ we contract the sections above points $x \in X$ in $Z$ corresponding to edges in the nerve complex (contract along the indicated arrows on the left) to obtain $S^1$.
Figure 5.14: Proving $Z \simeq \mathcal{N}(\mathcal{U})$ we first contract the sections above non-vertices $y \in \mathcal{N}(\mathcal{U})$ in $Z$ (contract along the indicated arrows on the left) to obtain the space in the center. Conclude by contracting the sections above vertices $y \in \mathcal{N}(\mathcal{U})$ in $Z$ (contract along the indicated dashed arrows in the center) to obtain $\mathcal{N}(\mathcal{U})$ on the right.

Appendix: Dowker duality

Nerve complexes are natural complexes arising from a collection of subsets. There is another similar construction, called the Vietoris complex, that is in a way dual to the nerve construction.

Definition 5.5.1. For $k \in \mathbb{N}$ let $\mathcal{U} = \{U_1, U_2, \ldots, U_k\}$ be a collection of subsets of a finite space $X$, whose union is $X$. The Vietoris complex of $\mathcal{U}$ is the abstract simplicial complex $\mathcal{V}(\mathcal{U})$ defined by the following rules:
1. The vertex set is $X$.
2. A set $\sigma \subseteq X$ is a simplex iff there exists $U \in \mathcal{U}$ containing $\sigma$.

We see that maximal simplices of $\mathcal{V}(\mathcal{U})$ are determined by (inclusion-wise) maximal sets of $\mathcal{U}$. There is a surprising connection [31] between the nerve complexes and Vietoris complexes.

[31] Clifford H. Dowker. Homology groups of relations. Annals of Mathematics, 56(1):84–95, 1952. doi: 10.2307/1969768

Side note: Dowker duality actually holds for an arbitrary collection of subsets $\mathcal{U}$ of an arbitrary set $X$; no additional structure is necessary. Even in such generality it can be proved with ease using a general form of the nerve theorem.

Theorem 5.5.2 (Dowker duality). For $k \in \mathbb{N}$ let $\mathcal{U} = \{U_1, U_2, \ldots, U_k\}$ be a collection of subsets of a finite space $X$, whose union is $X$. Then $\mathcal{N}(\mathcal{U}) \simeq \mathcal{V}(\mathcal{U})$.

Proof. Consider $\mathcal{V}(\mathcal{U})$ as a subspace of a Euclidean space. Each $U_i$ determines a simplex $\Delta_{U_i}$ spanned by all points of $U_i$. Note that $\{\Delta_{U_1}, \Delta_{U_2}, \ldots, \Delta_{U_k}\}$ are closed convex sets whose union is $\mathcal{V}(\mathcal{U})$. By the nerve theorem 5.3.2, $\mathcal{V}(\mathcal{U})$ is homotopy equivalent to the nerve of $\{\Delta_{U_1}, \Delta_{U_2}, \ldots, \Delta_{U_k}\}$. This nerve, on the other hand, is actually $\mathcal{N}(\mathcal{U})$ via the correspondence $\Delta_{U_i} \mapsto U_i$, see Figure 5.15 for a visual sketch of the proof.

Figure 5.15: A cover $\mathcal{U}$ of six points by four colored sets (left), its Vietoris complex $\mathcal{V}(\mathcal{U})$ (center) and nerve $\mathcal{N}(\mathcal{U})$ (right). Colored simplices on the central picture provide a collection of subsets satisfying the conditions of the nerve theorem. On the other hand, their nerve is actually $\mathcal{N}(\mathcal{U})$ on the right. By the nerve theorem we conclude the Dowker duality.

While Čech complexes are nerves associated to a collection of balls of radius $r$, Rips complexes are Vietoris complexes associated to a collection of sets of diameter at most $r$.

6 Fields and vector spaces

The material presented up to this point mostly falls into the realm of geometric topology and combinatorics: we introduced metric spaces and their combinatorial descriptions, simplicial complexes. Our eventual goal however is to compute meaningful topological invariants from these combinatorial descriptions. Within mathematics the field dealing with operations is called algebra, and our milestone on the path to computational implementation is an algebraic formulation based on simplicial complexes. With that intention in mind we first review and introduce some algebraic concepts.

In this lecture we will present fields and vector spaces. Specific cases of these two notions are probably familiar to the reader: real numbers and vectors in Euclidean space. We will introduce a few more fields and vector space constructions, which will provide us with enough structure to introduce homology in the next chapter.

6.1 Fields

Within the context of algebra, a field is a set with two operations satisfying a number of properties. For our purposes we will forgo a formal introduction and rather introduce the specific fields that will be of interest to us. We will think of a field as our number system. We will want to be able to add, subtract, multiply and divide (except by zero) in our field.
The fields a reader is most familiar [1] with are probably $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$. However, there is also a family of finite fields (consisting of finitely many numbers) which often provides convenient examples: fields of remainders.

[1] $\mathbb{N}$ is not a field as it does not contain all results of subtractions, for example, $3 - 5 \notin \mathbb{N}$. $\mathbb{Z}$ is not a field as it does not contain all quotients by non-zero numbers, for example $3/5 \notin \mathbb{Z}$.

The fields of remainders $\mathbb{Z}_p$

Definition 6.1.1. Let $p \in \{2, 3, 5, \ldots\}$ be a prime number. Define:
(a) $p\mathbb{Z} = \{p \cdot n \mid n \in \mathbb{Z}\} = \{\ldots, -2p, -p, 0, p, 2p, 3p, \ldots\}$;
(b) $\mathbb{Z}_p = \mathbb{Z}/(p\mathbb{Z})$ as the quotient consisting of remainders when dividing by $p$.

Let us discuss (b) [2] in detail. The quotient $\mathbb{Z}/(p\mathbb{Z})$ consists of classes, each of which can be represented by a number from the "numerator" $\mathbb{Z}$. If $a \in \mathbb{Z}$ then the corresponding class is represented by $[a]$. Two such numbers represent the same class in the quotient iff their difference is [3] in the "denominator" $p\mathbb{Z}$. To phrase it differently [4],

$$[a] = [b] \iff b - a \in p\mathbb{Z}.$$

[2] I.e., the fields of remainders and the quotient construction that defines it.
[3] I.e., iff their difference is a multiple of $p$. In particular this means $[a] = [b]$ if the remainder after dividing by $p$ is the same for both $a$ and $b$.
[4] What we just described is a general construction of an algebraic quotient structure. We will come across it again in the context of vector spaces.

Example 6.1.2. Let $p = 5$. In $\mathbb{Z}_5$ two numbers represent the same class iff their difference is divisible by 5. Classes $[0], [1], [2], [3], [4]$ are all distinct but [5]:
• $[5] = [0]$ as $5 - 0 = 1 \cdot 5$.
• $[6] = [1]$ as $6 - 1 = 1 \cdot 5$.
• $[-1] = [4]$ as $-1 - 4 = (-1) \cdot 5$.
• $[126] = [1]$ as $126 - 1 = 25 \cdot 5$.
In particular, two positive numbers represent the same class iff their remainder when dividing by 5 is the same.

[5] A few more examples in $\mathbb{Z}_5$: $[-4] = [1]$, $[8] = [3]$, $[17] = [2]$, $[1346134523451] = [1]$, $[3457] = [2]$, $[-23513252] = [3]$.

Side note: A few examples in $\mathbb{Z}_7$: $[8] = [1] = [15]$, $[-5] = [2] = [72]$.

We draw another conclusion from Example 6.1.2: the most convenient representation [6] of $\mathbb{Z}_p$ is given by the $p$ classes $[0], [1], \ldots, [p - 1]$. These classes are all distinct [7] and together form $\mathbb{Z}_p$.

[6] We will actually be using this representation almost exclusively from now on.
[7] They form all possible remainders after division by $p$.

Example 6.1.3. The structure of $\mathbb{Z}_2$ encodes parity: for $a \in \mathbb{Z}$ we observe that $[a] = [0]$ iff $a$ is even, and $[a] = [1]$ iff $a$ is odd.

Defining addition, subtraction and multiplication in $\mathbb{Z}_p$

These three operations are defined in the obvious way:

$$[a] + [b] = [a + b], \quad [a] - [b] = [a - b] \quad \text{and} \quad [a] \cdot [b] = [a \cdot b].$$

It turns out that the operations are well defined [8] in the following sense:

$$[a] = [a'],\ [b] = [b'] \implies [a + b] = [a' + b']$$

and the same holds for subtraction and multiplication.

[8] A proof that addition is well defined: $[a] = [a'],\ [b] = [b'] \implies \exists k_a, k_b \in \mathbb{Z} : a' = a + k_a p,\ b' = b + k_b p$. Thus $[a' + b'] = [a + k_a p + b + k_b p] = [(a + b) + (k_a + k_b)p] = [a + b]$.

Example 6.1.4.
Addition:
In $\mathbb{Z}_5$: $[3] + [4] = [2]$, $[3] - [4] = [4]$, $[1] + [2] = [3]$.
In $\mathbb{Z}_7$: $[3] + [4] = [0]$, $[3] - [4] = [6]$, $[1] + [2] = [3]$.
Multiplication:
In $\mathbb{Z}_5$: $[3] \cdot [4] = [2]$, $[2] \cdot [4] = [3]$, $[2] \cdot [3] = [1]$.
In $\mathbb{Z}_7$: $[3] \cdot [4] = [5]$, $[2] \cdot [4] = [1]$, $[2] \cdot [3] = [6]$.

Example 6.1.5. Note that in $\mathbb{Z}_2 = \{[0], [1]\}$ we have $[a] = [-a]$, hence addition is the same as subtraction. In fact, if we identify $[0]$ and $[1]$ with their Boolean values, addition and multiplication encode [9] the logical operations "Exclusive or" (XOR) and "Conjunction" (AND):

$$[a] + [b] = [a \text{ XOR } b], \qquad [a] \cdot [b] = [a \text{ AND } b].$$

[9] Which means, amongst others, that these operations in $\mathbb{Z}_2$ are fairly natural in computer implementations, exact, and fast. In fact, computations in topological data analysis are often performed using $\mathbb{Z}_2$.

Defining division in $\mathbb{Z}_p$

Up to this point the described structure of $\mathbb{Z}_p$ did not require $p$ to be prime.
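The three operations defined so far can be tried out with a few lines of Python. This is a minimal sketch, assuming that a class $[a]$ is stored as its representative in $\{0, 1, \ldots, n-1\}$; the function names `add`, `sub`, `mul` and the modulus parameter `n` are illustrative choices. Nothing in the code uses primality of the modulus.

```python
def add(a, b, n):
    """[a] + [b] in Z_n, returned as the representative in {0, ..., n-1}."""
    return (a + b) % n

def sub(a, b, n):
    """[a] - [b] in Z_n (Python's % yields a non-negative result for n > 0)."""
    return (a - b) % n

def mul(a, b, n):
    """[a] * [b] in Z_n."""
    return (a * b) % n

# Example 6.1.4 in Z_5 and Z_7.
assert (add(3, 4, 5), sub(3, 4, 5), mul(3, 4, 5)) == (2, 4, 2)
assert (add(3, 4, 7), sub(3, 4, 7), mul(3, 4, 7)) == (0, 6, 5)

# Well-definedness: changing representatives by multiples of n changes nothing.
assert add(3 + 2 * 5, 4 - 7 * 5, 5) == add(3, 4, 5)
assert mul(3 + 2 * 5, 4 - 7 * 5, 5) == mul(3, 4, 5)

# Example 6.1.5: in Z_2 addition is XOR and multiplication is AND.
for a in (0, 1):
    for b in (0, 1):
        assert add(a, b, 2) == a ^ b and mul(a, b, 2) == a & b

# The same code runs for a composite modulus such as 4,
# although a product of non-zero classes may then vanish.
assert mul(2, 2, 4) == 0
```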
This assumption, however, is required [10] if we want to define division. From number theory we know that if $p$ is a prime, then for each number $a \in \mathbb{Z}$ with $[a] \neq [0]$, the classes $[a], [2a], \ldots, [pa] = [0]$ represent the entire $\mathbb{Z}_p$. In particular, we can choose a coefficient $k$ with $[1] = [ka]$ and define the inverse of $[a]$ by $[a]^{-1} = [k]$. We can consequently define the division by

$$[a]/[b] = [a] \cdot [b]^{-1},$$

which turns out to be well defined if $p$ is prime and $[b] \neq [0]$.

[10] If $q$ is not a prime then $\mathbb{Z}_q$ contains divisors of zero, i.e., non-zero classes whose product is the zero class. For example, $[2] \in \mathbb{Z}_4$ is a non-zero class, but $[2] \cdot [2] = [4] = [0] \in \mathbb{Z}_4$ is the zero class. If we wanted to find an inverse of $[2]$ in $\mathbb{Z}_4$ we would need to find an integer $k \in \mathbb{Z}$ so that $[2k] = [1] \in \mathbb{Z}_4$, an unattainable feat as $2k$ is always even. A divisor of zero has no inverse.

Example 6.1.6. In $\mathbb{Z}_5$ we have $[1 \cdot 3] = [3]$, $[2 \cdot 3] = [1]$, $[3 \cdot 3] = [4]$, $[4 \cdot 3] = [2]$, $[5 \cdot 3] = [0 \cdot 3] = [0]$. The products $[k \cdot 3]$ exhaust the entire $\mathbb{Z}_5$ and $[3]^{-1} = [2]$. Similarly, $[2]^{-1} = [3]$.

Side note: In $\mathbb{Z}_7$ we have $[2]^{-1} = [4]$, $[3]^{-1} = [5]$, ...

We are now able to add, subtract, multiply and divide (except by zero) in $\mathbb{Z}_p$, which makes $\mathbb{Z}_p$ a field.

Remark 6.1.7. Counting and computing in $\mathbb{Z}_p$ is surprisingly common in everyday life. It appears whenever we have a periodic behaviour.
• $\mathbb{Z}_2$ is a model for true/false in logic, odd/even numbers, and binary numbers.
• We use $\mathbb{Z}_4$ when thinking about seasons of the year.
• We use $\mathbb{Z}_7$ when thinking about days of the week (if today is the $a$-th day of the week then $b$ days from today it will be the $[a + b]$-th day of the week).
• We use $\mathbb{Z}_{10}$ whenever we are computing in decimal numbers. Given $a, b \in \mathbb{N}$, the last (units) digit of $a + b$ equals $[a + b]$ in $\mathbb{Z}_{10}$ and the same goes for multiplication.
• We use $\mathbb{Z}_{10}$ whenever we are converting units in the metric system [11].
• We use $\mathbb{Z}_{24}$ when thinking about hours in a day. When thinking about hours coupled with the am/pm designations we actually do a combination of $\mathbb{Z}_2$ and $\mathbb{Z}_{12}$.
• We use $\mathbb{Z}_{60}$ when thinking about minutes and seconds.

[11] As the reader might imagine, there is no reasonable algebraic explanation for the imperial system.

Figure 6.1: Quotient $\mathbb{Z}_p$ models rotations by $2\pi/p$. Adding $p$ such rotations we arrive at the original situation $0 \in \mathbb{Z}_p$. The figure represents $\mathbb{Z}_4$. Given any situation, the addition of 1 is represented by a rotation by $\pi/2$ in the positive direction.

As a summary let us recall all the fields we mentioned: $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_p$ for any prime number $p$. These are the only fields we will be considering.

6.2 Vector spaces

Let $F$ be a field. A prototype of a vector space over the field $\mathbb{R}$ that a reader is familiar with is $\mathbb{R}^n$ for any $n \in \mathbb{N}$. It consists of $n$-tuples (vectors) of real numbers, which we can add, subtract, and multiply by any element of our field $\mathbb{R}$. In a similar way $F^n$ is a vector space over $F$: it consists of $n$-tuples (vectors) of numbers from $F$, which we can add, subtract, and multiply by any element of our field $F$. While all our vector spaces will essentially [12] be of the form $F^n$, some of our constructions will require us to use a more formal definition.

[12] I.e., up to isomorphism, which will be defined later.

Definition 6.2.1. Let $F$ be a field. A vector space $V$ over field $F$ is a set of elements (vectors) equipped with two operations,
1. addition $+ \colon V \times V \to V$ and
2. scalar multiplication $\cdot \colon F \times V \to V$
satisfying the following properties:
• addition is associative, commutative, contains the identity (zero) vector $0$, and $V$ contains the opposite element of each vector;
• scalar multiplication is compatible, distributive, and normalized.

Side note: Glossary of algebraic properties mentioned in Definition 6.2.1:
• associativity: $(u + v) + w = u + (v + w)$, $\forall u, v, w \in V$
• commutativity: $u + v = v + u$, $\forall u, v \in V$
• zero vector: $0 + v = v$, $\forall v \in V$ [it should always be clear from the context whether $0$ denotes a number in $F$ or the zero vector in $V$]
• the opposite element of $v \in V$ is denoted by $-v \in V$ and satisfies $v - v = 0$
• compatibility: $(ab)v = a(bv)$, $\forall a, b \in F$, $\forall v \in V$
• distributivity: $(a + b)v = av + bv$ and $a(v + w) = av + aw$, $\forall a, b \in F$, $\forall v, w \in V$
• normalization: $1 \cdot v = v$, $\forall v \in V$.

Roughly speaking, if we have a set of vectors we can reasonably add, subtract, and multiply by elements of some field, then this set forms a vector space.

Example 6.2.2. Let $n \in \mathbb{N}$. Given symbols $v_1, \ldots, v_n$ and a field $F$, all formal [13] sums $\sum_{i=1}^{n} a_i v_i$ where $a_i \in F$ form a vector space. Operations are defined in the obvious way:

$$\sum_{i=1}^{n} a_i v_i + \sum_{i=1}^{n} a'_i v_i = \sum_{i=1}^{n} (a_i + a'_i) v_i, \quad \text{and} \quad b \sum_{i=1}^{n} a_i v_i = \sum_{i=1}^{n} (b a_i) v_i.$$

[13] A "formal sum" in this setting means that $v_i + v_j$ is not defined as a single element $v_k$ (as a result of a summation) in a vector space, but is rather thought of as an abstract element in itself. For example, if we want to shop for an apple and a pear, our shopping list should be apple + pear, which does not equal any other single fruit.

When $F = \mathbb{Z}_2$ the corresponding vector space models the power set of $\{v_1, \ldots, v_n\}$. A subset $\{v_{i_1}, \ldots, v_{i_k}\}$ corresponds to $v_{i_1} + \ldots + v_{i_k}$. The sum of two formal sums in this setting models the symmetric difference [14] between the corresponding sets.

[14] The symmetric difference of sets $A, B$ equals $(A \cup B) \setminus (A \cap B)$.

For a prime number $p$ and $n \in \mathbb{N}$ the vector space $(\mathbb{Z}_p)^n = \mathbb{Z}_p^n$ is a finite vector space consisting of $p^n$ elements. While this vector space appears different from $\mathbb{R}^n$, the formal theory, concepts, and proofs are the same in both cases. We next recast the familiar notions from $\mathbb{R}^n$ in the setting of vector spaces over $F$.

Side note: Let $X$ be a metric space and $m, n \in \mathbb{N}$. The following are vector spaces over $F$: the set of all $m \times n$ matrices with entries in $F$, the set of all functions $X \to F$, the set of all continuous functions $X \to F$, the set of all differentiable functions $X \to F$ if $F \in \{\mathbb{Q}, \mathbb{R}, \mathbb{C}\}$, ... Operations on functions in these examples are defined pointwise.

Let $V, W$ be vector spaces over field $F$.

1. A linear combination of vectors in $V$ is any expression of the form
   $$\sum_{i=1}^{k} a_i v_i, \quad a_i \in F,\ v_i \in V.$$
2. A set of vectors $\{v_1, v_2, \ldots, v_k\} \subset V$ is linearly independent [15] if the only coefficients $a_i \in F$ satisfying $\sum_{i=1}^{k} a_i v_i = 0 \in V$ are the zero coefficients, i.e., $a_i = 0$, $\forall i$.
3. A basis of $V$ is a maximal [16] linearly independent set in $V$. A vector space typically has many different bases. However, if $V$ is finite dimensional [17], then the cardinality of each basis is the same. This number is called the dimension of $V$.
4. A subset $U \subseteq V$ is a vector subspace [notation: $U \leq V$] of $V$ if it is itself a vector space over $F$.
5. A map $f \colon V \to W$ is linear if it is additive [18] and multiplicative [19]. A linear map is completely determined [20] by the images of the vectors of a basis of $V$.
6. A bijective linear map is called an isomorphism [notation: $\cong$]. Every vector space over $F$ of dimension $d \in \mathbb{N}$ is isomorphic to $F^d$.
7. Let $f \colon V \to W$ be a linear map.
   (a) The kernel of $f$ is defined as $\ker(f) = \{v \in V;\ f(v) = 0\} \leq V$.
   (b) The image of $f$ is defined as $\mathrm{Im}(f) = \{f(v);\ v \in V\} \leq W$.

[15] For example, vectors $(1, 3)$ and $(2, 1)$ are linearly independent in $\mathbb{R}^2$, $\mathbb{Q}^2$, $\mathbb{Z}_{13}^2$, but not in $\mathbb{Z}_5^2$.
[16] In particular, each element of $V$ can be expressed uniquely as a linear combination of the basis vectors.
[17] I.e., if it admits a finite basis.
[18] $f(v + w) = f(v) + f(w)$, $\forall v, w \in V$.
[19] $f(av) = a f(v)$, $\forall a \in F$, $v \in V$.
[20] Consequently, a linear map can be represented by a matrix $M$ with coefficients in $F$ if we choose bases of $V$ and $W$, with the matrix-vector product $M \cdot v$ representing $f(v)$.
   The dimension of $\mathrm{Im}(f)$ is called the rank of $f$.
   (c) Given bases $\{v_1, v_2, \ldots, v_k\}$ of $V$ and $\{w_1, w_2, \ldots, w_l\}$ of $W$, the map $f$ may be represented by an $l \times k$ matrix with entries in $F$. If $f(v_i) = \sum_{j=1}^{l} a_{i,j} w_j$, then the entry at $(j, i)$ equals $a_{i,j}$.
8. Given a matrix with coefficients in $F$, we can still perform Gauss reduction to, for example, compute the rank of a linear map, solve systems of linear equations, ... The procedure is the same as in $\mathbb{R}^n$.
9. Given $U \leq V$, the quotient $V/U$ is defined as the vector space over $F$ consisting of classes $[v]$ for $v \in V$ under the following identification [21]:
   $$[u] = [v] \iff u - v \in U.$$
   In particular, $[v] = [0]$ iff $v \in U$.

[21] The operations of addition $[u] + [v] = [u + v]$ and multiplication by a scalar $a[u] = [au]$ for $a \in F$, $u, v \in V$ are well defined by the same argument that was provided in the previous section for the fields.

Side note: Take the following system in $\mathbb{Z}_5$:
(1) $2x + 3y = 2$
(2) $3x - y = 1$.
Multiply (1) by $2^{-1} = 3$ to obtain
(3) $x + 4y = 1$.
To match the leading coefficient of (2) multiply (3) by 3 to obtain
(4) $3x + 2y = 3$.
Now subtract (4) − (2) to obtain $3y = 2$ and thus $y = 2 \cdot 3^{-1} = 4$. Substituting into (3) gives $x = 1 - 4 \cdot 4 = -15 = 0$ in $\mathbb{Z}_5$.

Our forthcoming descriptions of holes in simplicial complexes will be expressed in terms of dimensions and bases of vector spaces, for which the following proposition will turn out to be very handy.

Proposition 6.2.3. Assume $U, V, W$ are vector spaces over a field $F$.
1. Let $f \colon V \to W$ be a linear map. Then $\mathrm{Im}(f) \cong V/\ker(f)$.
2. Let $U \leq V$ be a subspace. Then $\dim(V/U) = \dim(V) - \dim(U)$.

Proof.
1. Consider the map $g \colon V/\ker(f) \to \mathrm{Im}(f)$ defined by $[u] \mapsto f(u)$. The map is:
   • well defined because $[u] = [v] \implies u - v \in \ker(f) \implies f(u - v) = 0 \implies f(u) = f(v) \implies g([u]) = g([v])$;
   • surjective by the definitions of $\mathrm{Im}(f)$ and $g$;
   • injective as $g([u]) = f(u) = 0$ implies $u \in \ker(f)$ and thus $[u] = [0]$.
   We conclude that $g$ is an isomorphism.
2. Let $\{w_1, \ldots, w_k\}$ be a basis [22] of $U$.
   Complete it by a set $B_1 = \{v_1, \ldots, v_l\}$ to a basis [23] of $V$. Observe that $B_2 = \{[v_1], \ldots, [v_l]\}$ is a basis [24] of $V/U$:
   • $B_2$ is linearly independent: if a linear combination of $B_2$ were the zero vector in $V/U$, then the corresponding combination of the elements of $B_1$ would be in $U$. By the choice of $B_1$ this can only happen if the latter combination equals $0$, and thus all the coefficients equal $0$ by the linear independence of $B_1$.
   • $B_2$ spans the whole $V/U$ because [25] $B_1$ and $U$ span the whole $V$.

[22] This implies $\dim(U) = k$.
[23] This implies $\dim(V) = k + l$.
[24] This implies $\dim(V/U) = l$ and thus proves our claim. Furthermore, it demonstrates a way to obtain a basis of $V/U$.
[25] Take any $v \in V$ and express it as $v = v' + v''$, where $v' \in U$ and $v''$ is a linear combination of $B_1$. Then $[v] = [v'']$.

6.3 Concluding remarks

Recap (highlights) of this chapter
• fields, vector spaces
• quotients and dimension

Background and applications

Fields and abstract vector spaces have a long presence in mathematics going back centuries. Finite fields are attractive for computational implementation due to their simplicity. Computations in them are typically faster than in real numbers. Furthermore they are resistant to some numerical issues present in reals and floating point computations. On the other hand, there is a potential issue of overflow with computer-stored numbers, say integers. Algebraically it stems from the fact that counting with integers in a computer is typically performed in $\mathbb{Z}_q$, where $\log_2 q$ is the number of bits assigned to a variable. Some of the related issues are described in a non-technical book [26], including an interesting rule in Swiss train regulations [27].

[26] Matt Parker. Humble Pi: a comedy of maths errors. Allen Lane, 2019.
[27] Apparently trains in Switzerland are not allowed to have an effective total number of axles equal to 256. Note that $256 = 2^8$.

The algebraic predecessors of fields and vector spaces are (algebraic) groups, another classical subject of algebra, which is now present in virtually every corner of mathematics. For further algebraic background see a textbook [28].

[28] David S. Dummit and Richard M. Foote. Abstract algebra. Wiley, 3rd edition, 2004.

Homology of the forthcoming section is typically introduced through groups in the theoretical setting, while in practice fields are used almost exclusively. A short recap of groups is given in the appendix. Persistent homology, on the other hand, is almost exclusively introduced through coefficients in a field and the resulting persistence modules due to the accessible description in terms of a persistence diagram.

Appendix: A very short introduction to Abelian groups

Side note: The term "Abelian" refers to commutativity. If the commutativity condition is not satisfied, the structure is called a (non-Abelian) group. These include the groups of permutations (with the operation being the composition) on $n$ elements, the groups of isometries of a metric space (with the operation being the composition), the group of invertible matrices (with the operation being the product), etc.

Definition 6.3.1. An Abelian group $(G, +)$ is a set $G$ with an associative commutative operation $+ \colon G \times G \to G$, such that:
1. there exists the zero element $0 \in G$ satisfying $0 + g = g + 0 = g$, $\forall g \in G$;
2. for each $g \in G$ there exists its opposite element $-g \in G$ satisfying $g + (-g) = 0$.

Side note: We will typically shorten $a + (-b)$ to $a - b$.

Examples of Abelian groups include $(\mathbb{R}, +)$, $(\mathbb{C}, +)$, $(\mathbb{Q}, +)$, $(\mathbb{Z}, +)$, $(\mathbb{Z}_q, +)$ for any $q$, $(\mathbb{R} \setminus \{0\}, \cdot)$, $(\mathbb{C} \setminus \{0\}, \cdot)$, $(\mathbb{Q} \setminus \{0\}, \cdot)$, $(\mathbb{Z}_p \setminus \{0\}, \cdot)$ for any prime $p$, etc.

Many definitions concerning groups are the same as those of fields and vector spaces.

Definition 6.3.2. Suppose $G, H$ are Abelian groups.
A map $f \colon G \to H$ is a homomorphism if $f(a + b) = f(a) + f(b)$, $\forall a, b \in G$. A bijective homomorphism is called an isomorphism [notation: $\cong$].

Suppose $G, H$ are Abelian groups and the map $f \colon G \to H$ is a homomorphism.
1. A subset $G' \subseteq G$ is a subgroup [notation: $G' \leq G$] of $G$ if it is itself a group for the same operation.
2. The kernel of $f$ is defined as $\ker(f) = \{a \in G;\ f(a) = 0\} \leq G$.
3. The image of $f$ is defined as $\mathrm{Im}(f) = \{f(a);\ a \in G\} \leq H$.
4. A set of elements $a_1, a_2, \ldots, a_k \in G$ is called a generating set [29] of $G$ if each element of $G$ can be expressed [30] as a sum [31] $\sum_{i=1}^{k} n_i a_i$ for some $n_i \in \mathbb{Z}$.
5. Group $G$ is finitely generated if there exists a finite generating set.
6. If $G' \leq G$, the quotient $G/G'$ is defined as the group consisting of classes $[a]$ for $a \in G$ under the following identification:
   $$[a] = [b] \iff a - b \in G'.$$
7. The direct sum of groups $G$ and $H$ is the group denoted by $G \oplus H$ and defined as $G \oplus H = \{(a, b);\ a \in G,\ b \in H\}$, with the operation being defined coordinate-wise.

[29] Or just "generators".
[30] As opposed to vector spaces, such expressions in groups are often not unique, which is why the expression "generating set" is used instead of "basis".
[31] For $n \in \mathbb{N}$ and $a \in G$ we define $n \cdot a = \underbrace{a + a + \ldots + a}_{n \text{ times}}$ and $(-n) \cdot a = -(n \cdot a)$.

A remarkable fact about finitely generated Abelian groups is that they can be classified in a wonderful way.

Theorem 6.3.3. [Classification theorem for finitely generated Abelian groups] Let $G$ be a finitely generated Abelian group. Then there exist:
• $k, r \in \{0, 1, \ldots\}$,
• $q_1, q_2, \ldots, q_k \in \mathbb{N}$, and
• prime numbers $p_1, p_2, \ldots, p_k \in \mathbb{N}$,
such that $G$ is isomorphic to

$$\underbrace{\mathbb{Z}^r}_{\text{free part of } G} \oplus \underbrace{\mathbb{Z}_{p_1^{q_1}} \oplus \mathbb{Z}_{p_2^{q_2}} \oplus \ldots \oplus \mathbb{Z}_{p_k^{q_k}}}_{\text{torsion of } G}.$$

Number $r = \mathrm{rank}(G)$ is called the rank of $G$.

Example 6.3.4. $\mathbb{Z}_{12} \cong \mathbb{Z}_3 \oplus \mathbb{Z}_4$, while $\mathbb{Z}_4 \not\cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$: for each element $a \in \mathbb{Z}_2 \oplus \mathbb{Z}_2$ we have $a + a = 0$, while the same does not hold in $\mathbb{Z}_4$.

Proposition 6.3.5. Suppose $G, H$ are Abelian groups, a map $f \colon G \to H$ is a homomorphism, and $G' \leq G$. Then:
1. $\mathrm{Im}(f) \cong G/\ker(f)$.
2. $\mathrm{rank}(G/G') = \mathrm{rank}(G) - \mathrm{rank}(G')$.

7 Homology: definition and computation

Now that we have presented the combinatorial and algebraic prerequisites, we are ready to define homology. The notion of homology arose from the need to detect the holes in a simplicial complex or a more general space. Its definition is not as straightforward as one might hope, but nonetheless results in a notion amenable to practical computations and consistent with the geometric intuition we presented in the first chapter.

In this chapter we will journey through a geometric introduction and definition of homology, and study the basic methods of computation. We will provide examples of homologies, which should build up our understanding and detection of holes of all dimensions not only for subspaces of Euclidean spaces, but also within the combinatorial context of abstract simplicial complexes.

7.1 Definition

Homology measures holes in a simplicial complex. As the latter is provided by a collection of simplices, we need to devise a computational framework based on the simplices that will yield a meaningful result. The formal treatment of this section will be provided in parallel to a simple example, the complex $L$ of Figure 7.1.

Figure 7.1: Abstract simplicial complex $L$.

Let $K$ be an abstract simplicial complex of dimension $n$ and choose a field of coefficients $F$.

Chains

Chains are formal sums of simplices along with coefficients from $F$. They are an algebraic model of collections of simplices as demonstrated in Figure 7.3.

Figure 7.2: Two 1-chains in $L$: the red chain on the left $\langle c, a \rangle + \langle a, b \rangle + \langle c, d \rangle + \langle d, b \rangle + \langle c, b \rangle$ coincides with the blue chain on the right $\langle c, a \rangle + \langle a, b \rangle + \langle c, d \rangle + \langle d, b \rangle - 2\langle c, b \rangle = \langle c, a \rangle + \langle a, b \rangle + \langle c, d \rangle + \langle d, b \rangle + 2\langle b, c \rangle$ iff the coefficients are from $\mathbb{Z}_3$.

For each $p \in \{0, 1, \ldots, n\}$ let $n_p$ denote the number of simplices of dimension $p$ in $K$.
Definition 7.1.1. A $p$-chain is a formal sum $\sum_{i=1}^{n_p} \lambda_i \sigma_i^p$ with $\lambda_i \in F$ and $\sigma_i^p$ being an oriented simplex of dimension $p$ in $K$ for each $i$.

This formalism incorporates orientation: if $\sigma$ is an oriented simplex then $(-1) \cdot \sigma = -\sigma$ is the simplex $\sigma$ with the opposite orientation. We assume that $\{\sigma_1^p, \sigma_2^p, \ldots, \sigma_{n_p}^p\}$ is the collection of all $p$-simplices of $K$. The $p$-simplices that are "absent" in a $p$-chain have coefficient 0. $p$-chains can be added, subtracted and multiplied by any scalar:

$$\sum_{i=1}^{n_p} \lambda_i \sigma_i^p + \sum_{i=1}^{n_p} \lambda'_i \sigma_i^p = \sum_{i=1}^{n_p} (\lambda_i + \lambda'_i) \sigma_i^p, \quad \forall \lambda_i, \lambda'_i \in F,$$

$$k \sum_{i=1}^{n_p} \lambda_i \sigma_i^p = \sum_{i=1}^{n_p} (k \lambda_i) \sigma_i^p, \quad \forall k, \lambda_i \in F.$$

Example 7.1.2. Consider the simplicial complex $L$ from Figure 7.1. Two examples of 1-chains and their additions are presented in Figure 7.3.
• Working in $\mathbb{Z}_2$ (top of Figure 7.3) the 1-chains are merely subsets of the collection of edges as the orientation does not matter ($+1 = -1$ in $\mathbb{Z}_2$). Adding the red chain $\{a, c\} + \{b, c\}$ and the blue chain $\{b, c\} + \{b, d\}$ results in the purple chain $\{a, c\} + \{b, d\}$.
• Computing in any other field (bottom of Figure 7.3) the orientation does matter. Adding the red chain $\langle b, a \rangle + \langle a, c \rangle + \langle c, d \rangle$ and the blue chain $\langle b, a \rangle + \langle d, c \rangle$ results in the purple chain $\langle a, c \rangle + 2\langle b, a \rangle$.

Figure 7.3: Top row: addition of chains in $\mathbb{Z}_2$. Bottom row: addition of chains in any other field.

As a result the collection of all chains forms a vector space [1].

Definition 7.1.3. The chain group $C_p(K; F)$ is the vector space of all $p$-chains.

[1] For historical and practical reasons we will match the established terminology in the literature and call this vector space a chain group, the reason being that if the coefficients are in a group (as is standard in classical theoretical approaches, see also Appendix), the resulting chains form only a group. In our case the chains still form a group for addition, but the overall structure along with multiplication by a scalar is that of a vector space.

Thinking of $p$-simplices of $K$ as an abstract collection of linearly independent vectors, the resulting linear space (with coefficients in $F$) spanned by them is the chain group. If $n_p$ is the number of $p$-simplices of $K$ then $C_p(K; F) \cong F^{n_p}$.

Boundary

With the definition of chain groups in place, we can now express the boundary relation as a linear map. The boundary map encodes the assembly instruction for a simplicial complex.

Figure 7.4: Oriented triangle $\langle x, y, z \rangle$ and its boundary $\partial_2(\langle x, y, z \rangle) = \langle x, y \rangle + \langle y, z \rangle + \langle z, x \rangle$.

Definition 7.1.4. Let $p \in \mathbb{N}$. The boundary map $\partial_p \colon C_p(K; F) \to C_{p-1}(K; F)$ is the linear map defined by the following rule on the basis of $C_p(K; F)$: for each oriented $p$-simplex $\sigma = \langle v_0, v_1, \ldots, v_p \rangle$ the image $\partial_p \sigma$ is the sum of facets of $\sigma$ equipped with the induced orientation from $\sigma$, i.e.:

$$\partial_p \sigma = \sum_{i=0}^{p} (-1)^i \langle v_0, v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_p \rangle.$$

For technical reasons we additionally define $\partial_0 \colon C_0(K; F) \to 0$ to be the trivial map (actually, the only map) into the trivial vector space (the space only containing the 0 vector).

Side note: The 0 vector is also called the trivial vector. The trivial map between vector spaces is the map whose image is the 0 vector.

Side note (warning): We will typically be dropping the index of the boundary map $\partial$ whenever it is evident that the statement involving $\partial$ refers either to all indices $p$ or to a specific $p$. For example, when talking about $\partial \sigma^p$, it is apparent that the map in question is $\partial_p$. On the other hand, the notation $\partial^2 = 0$ means that for each $p \in \mathbb{N}$, the composition $\partial_{p-1} \circ \partial_p$ is the trivial map whose image is the zero vector.

A crucial fact for the algebraic formulation of homology is that the composition of two boundary maps is the trivial map. In particular, this implies that the image of a boundary map is contained in the kernel of the subsequent boundary map.
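The boundary formula of Definition 7.1.4 can be sketched in Python as follows. This is an illustrative sketch only: the representation of a chain as a dictionary from oriented simplices to coefficients, the function name `boundary`, and the optional modulus `p` are assumptions made here. Simplices are tuples whose vertices are listed in one fixed global order, so that every face produced by dropping a vertex is again in that order; coefficients are Python integers, reduced modulo `p` when `p` is given.

```python
from collections import defaultdict

def boundary(chain, p=None):
    """Boundary of a chain {oriented simplex (tuple of vertices): coefficient}.

    Coefficients are integers; if a prime p is given they are reduced
    modulo p, which models coefficients in Z_p.
    """
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue                     # boundary of a vertex is trivial
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # drop the i-th vertex
            result[face] += (-1) ** i * coeff
    if p is not None:
        result = {s: c % p for s, c in result.items()}
    return {s: c for s, c in result.items() if c != 0}

triangle = {("x", "y", "z"): 1}
edges = boundary(triangle)
# <y,z> - <x,z> + <x,y>, i.e. <x,y> + <y,z> + <z,x> as in Figure 7.4.
assert edges == {("y", "z"): 1, ("x", "z"): -1, ("x", "y"): 1}

# Applying the boundary twice gives the zero chain on these examples.
assert boundary(edges) == {}
assert boundary(boundary({(0, 1, 2, 3): 1})) == {}
assert boundary(boundary({(0, 1, 2, 3): 1}, p=2), p=2) == {}
```

The empty dictionary plays the role of the zero chain, since simplices with coefficient 0 are treated as absent.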
See the side note above concerning the notation in the following statement.

Theorem 7.1.5. $\partial^2 = 0$.

Proof. It suffices to prove that $\partial^2 \sigma = 0$ for an oriented $p$-simplex $\sigma = \langle v_0, v_1, \ldots, v_p \rangle$. Note that $\partial^2 \sigma$ is a formal sum of faces of $\sigma$ of dimension $p - 2$. Choose indices $i < j$ from $\{0, 1, \ldots, p\}$ and consider how the face [2]

$$\sigma' = \langle v_0, v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_{j-1}, v_{j+1}, \ldots, v_p \rangle$$

appears in $\partial^2 \sigma$. Such a face appears from two terms:
• By first removing vertex $v_j$ from $\sigma$ in the expression of $\partial_p$ and then removing vertex $v_i$ from the resulting simplex in the expression of $\partial_{p-1}$. The indices of removed vertices are $j$ and $i$, hence the sign in front of $\sigma'$ is $(-1)^i (-1)^j$.
• By first removing vertex $v_i$ from $\sigma$ in the expression of $\partial_p$ and then removing vertex $v_j$ from the resulting simplex in the expression of $\partial_{p-1}$. The indices of removed vertices are $i$ and [3] $(j - 1)$, hence the sign in front of $\sigma'$ is $(-1)^i (-1)^{j-1}$.
As the signs are opposite, the total sum equals zero.

[2] The following face is obtained from $\sigma$ by dropping vertices $v_i$ and $v_j$.
[3] As vertex $v_i$ has already been removed and $i < j$, the vertex $v_j$ is now on position $j - 1$.

Figure 7.5: An example of Theorem 7.1.5: oriented triangle $\langle x, y, z \rangle$ on the left, its boundary $\partial(\langle x, y, z \rangle) = \langle x, y \rangle + \langle y, z \rangle + \langle z, x \rangle$, and $\partial^2(\langle x, y, z \rangle) = \langle x \rangle - \langle x \rangle + \langle y \rangle - \langle y \rangle + \langle z \rangle - \langle z \rangle = 0$ as indicated by the signs at the vertices on the right.

Corollary 7.1.6. $\mathrm{Im}(\partial) \subset \ker(\partial)$.

Definition 7.1.7. The collection of chain groups bound together by the boundary maps is called the chain complex:

$$\cdots \xrightarrow{\partial} C_n(K; F) \xrightarrow{\partial} C_{n-1}(K; F) \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_1(K; F) \xrightarrow{\partial} C_0(K; F) \xrightarrow{\partial} 0$$

For computational purposes the boundary maps are typically represented as matrices with entries in $F$. For each $p \in \mathbb{N}$ a matrix $M_p$ corresponding to $\partial_p$ is obtained as follows:
• Columns are enumerated by oriented $p$-simplices of $K$.
• Rows are enumerated by oriented $(p - 1)$-simplices of $K$.
• Entry at position $(i, j)$ equals $+1$ or $-1$ if the simplex of the $i$-th row appears with orientation $+1$ or $-1$ correspondingly in the boundary of the simplex of the $j$-th column. All other entries are zero.

In particular, the boundary $\partial \alpha$ of a chain $\alpha$ is obtained by multiplying the boundary matrix with the natural representation of $\alpha$ in the chosen [4] basis.

[4] The same basis of $p$-simplices that is used to enumerate the columns of $M_p$.

Example 7.1.8. Labeled boundary matrices [5] for complex $L$ of Figure 7.6:

$$M_2 = \begin{array}{c|c} & \langle a, b, c \rangle \\ \hline \langle a, b \rangle & 1 \\ \langle b, c \rangle & 1 \\ \langle a, c \rangle & -1 \\ \langle b, d \rangle & 0 \\ \langle c, d \rangle & 0 \end{array}
\qquad
M_1 = \begin{array}{c|ccccc} & \langle a, b \rangle & \langle b, c \rangle & \langle a, c \rangle & \langle b, d \rangle & \langle c, d \rangle \\ \hline \langle a \rangle & -1 & & -1 & & \\ \langle b \rangle & 1 & -1 & & -1 & \\ \langle c \rangle & & 1 & 1 & & -1 \\ \langle d \rangle & & & & 1 & 1 \\ \langle e \rangle & & & & & \end{array}$$

[5] Only non-zero entries are provided. Matrix $M_0$ has no formal rows as it represents the zero-map into the one-element vector space 0.

Figure 7.6: Abstract simplicial complex $L$.

Homology

We are now finally ready to define homology as a measure of holes. Let us first build an intuition on the simplicial complex $L$ from Figure 7.6. This will be followed up by a formal introduction in Definition 7.1.9.

Our task is to compute that $L$ has one hole. In the figure the hole seems to be enclosed by edges $cd$, $db$ and $bc$. Following this observation we decide that holes will be represented by a special kind of chains called cycles, see Figure 7.7. These are the chains that model closed simplicial loops in our simplicial complex, just as the one describing the hole in $L$ above. Formally, we define cycles to be those chains whose boundary is zero. These are our candidates for the representatives of holes.

Figure 7.7: Top row: two cycles. Bottom row: a chain that is not a cycle (left) and the cycle that is the sum of the cycles of the top row (right).

However, not all cycles represent holes. For example, the top right cycle in Figure 7.7 is the boundary of a triangle and thus does not enclose any hole. Such cycles thus do not represent a hole and should be treated as trivial.
Similarly, if a cycle is obtained as the boundary of a 2-chain, then it should be treated as trivial. Such cycles are called boundaries [6] and the structure formalizing the triviality of boundaries is the quotient space.

Summing up the idea, the holes are represented by the quotient group cycles/boundaries. Recall that for each $p \in \{0, 1, \ldots\}$ we have $\operatorname{Im} \partial_{p+1} \leq \ker \partial_p$.

[6] At this point, the term "boundary" can refer to a geometric boundary of a simplex, a boundary map, or a chain that is the image of a boundary map.

Definition 7.1.9. Let $K$ be an abstract simplicial complex. Choose a field $F$ and $q \in \{0, 1, \ldots\}$. We define

• the group of $q$-cycles as $Z_q(K; F) = \ker \partial_q \leq C_q(K; F)$.
• the group of $q$-boundaries as $B_q(K; F) = \operatorname{Im} \partial_{q+1} \leq Z_q(K; F) \leq C_q(K; F)$.
• the $q$-homology group as the quotient $H_q(K; F) = Z_q(K; F)/B_q(K; F)$.

The dimension of $H_q(K; F)$ is called the $q$-Betti number (of $K$ with coefficients in $F$) and is denoted by $b_q = b_q(K; F)$.

In particular, each element of a homology group is an equivalence class [7] of cycles. The homology group of example $L$ from Figure 7.6 will depend on $F$. Defining $\alpha = \langle b, c \rangle + \langle c, d \rangle + \langle d, b \rangle$ as the top left cycle in Figure 7.7, we see that $H_1(L; F)$ is $\{k[\alpha] \mid k \in F\}$. Even though we have only one hole, the homology group typically has more elements. However, the entire $H_1(L; F)$ is spanned by $[\alpha]$ and thus the number of holes should be interpreted [8] as the dimension of the homology group, in this case 1.

Figure 7.8: Top left: a simplicial complex with two holes. Its first homology group $H_1$ with coefficients in $\mathbb{Z}_2$ has three non-trivial elements, depicted as the blue, the red, and the purple chain. However, that does not mean that the number of holes equals 3. Along with the trivial homology class, the homology group consists of 4 elements. This means that its dimension over $\mathbb{Z}_2$ equals 2, which is the number of holes. We also observe that any two of the three non-trivial chains above could form the basis of $H_1$.
In fact, each of the three non-trivial chains is the sum of the other two.

More generally, each homology group with coefficients in a field $F$ is a vector space and thus isomorphic to $F^r$ for some dimension $r$. The main goal of our computations is thus to compute $r = b_q$, which represents the number of $q$-dimensional holes:

• $b_0$ is the same for all fields $F$ and equals the number of components of $K$ (0-dimensional holes).
• $b_1$ is the number of holes in the usual geometric sense (1-dimensional holes), although various fields detect different [9] holes in this setting. For planar graphs however, $b_1$ is always the number of the holes.
• $b_2$ is the number of caves/enclosures.

[7] Given a cycle $\beta$, the corresponding class in homology will be denoted by $[\beta]$.
[8] At this point we observe that it is crucial to preserve the algebraic structure (of a vector space) of the homology group in order to compute the dimension as the number of holes.
[9] See the example of the Klein bottle later in this section.

These interpretations will be explored, demonstrated and partially proved throughout the rest of this chapter. Before we do that let us mention that homology groups are homotopy invariants even though cycles and boundaries are not.

Theorem 7.1.10. Assume $K$ and $K'$ are simplicial complexes. Then a homotopy equivalence $K \simeq K'$ implies $H_p(K; F) \cong H_p(K'; F)$, for each field $F$ and for each $p \in \{0, 1, \ldots\}$.

Zero-dimensional homology

In this subsection we prove that $b_0$ is the number of components [10] of the underlying simplicial complex.

[10] While there are alternative ways to obtain the number of components employing a smaller amount of algebra, there are no alternatives to homological constructions when it comes to 1- and higher-dimensional holes.

Let $K$ be a simplicial complex and $F$ any field. The homology group $H_0(K; F)$ is computed from the following piece of information:

$$C_1(K; F) \xrightarrow{\partial_1} C_0(K; F) \xrightarrow{\partial_0} 0.$$
In order to compute $H_0(K; F)$ we need to determine $\ker \partial_0$ and $\operatorname{Im} \partial_1$. Since $\partial_0$ is trivial we have [11] $\ker \partial_0 = C_0(K; F)$. In order to determine $\operatorname{Im} \partial_1$ we prove the following proposition.

[11] Dimension 0 is the only case where a single simplex forms a cycle.

Proposition 7.1.11. Let $K$ be a simplicial complex, $F$ any field and assume $x, y \in K^{(0)}$ are vertices. Then $\langle y \rangle - \langle x \rangle \in \operatorname{Im} \partial_1$ iff $x$ and $y$ lie in the same component of $K$.

Proof. Assume $x$ and $y$ lie in the same component of $K$. Then there exists [12] a path from $x$ to $y$ tracing edges. Let $x = x_0, x_1, \ldots, x_k = y$ denote the sequence of vertices traced by one such path. Then the chain $\langle y \rangle - \langle x \rangle$ is the boundary of the 1-chain $\sum_{i=0}^{k-1} \langle x_i, x_{i+1} \rangle$. See Figure 7.9 for an example.

In order to prove the other direction assume $\langle y \rangle - \langle x \rangle = \partial\alpha$ for some 1-chain $\alpha$. Let $K' \leq K$ be the component of $K$ containing vertex $x$ and define $\alpha'$ to be the part of $\alpha$ contained in $K'$, i.e., $\alpha'$ contains all those terms of $\alpha$ whose edge is in $K'$. No vertex of $K'^{(0)} \setminus \{x, y\}$ appears in $\partial\alpha'$ as none appears in $\partial\alpha$ either and the terms containing edges with such a vertex as an endpoint are the same in both $\alpha$ and $\alpha'$. Hence $\partial\alpha'$ is either $\langle y \rangle - \langle x \rangle$ in case $y \in K'$ or $-\langle x \rangle$ otherwise. Since the coefficients in front of vertices of any boundary add [13] up to zero, only the first of these two options is possible.

[12] ...by the simplicial approximation theorem.
[13] As $\partial(k\langle z, w \rangle) = k\langle w \rangle - k\langle z \rangle$ this property holds for boundaries of single terms. By linearity of $\partial$ the same also holds for chains.

Figure 7.9: The boundary of the depicted chain is $\langle d \rangle - \langle b \rangle$, which is also the boundary of $\langle b, d \rangle$. As a consequence, the column corresponding to $\langle b, d \rangle$ in the matrix of $\partial_1$ is the sum of the columns corresponding to the edges of the chain.

Assume $K_1, K_2, \ldots, K_n$ are the components of $K$ with $x_i \in K_i$, $\forall i$. We now combine the following information that allows us to describe $H_0(K; F)$:

1. Equality $\ker \partial_0 = C_0(K; F)$ means $\ker \partial_0 = Z_0(K; F)$ has a basis $\{\langle v \rangle\}_{v \in K^{(0)}}$.
2. For each edge $\langle x, y \rangle \in K$ we have $\partial\langle x, y \rangle = \langle y \rangle - \langle x \rangle$, meaning that $\langle x \rangle$ and $\langle y \rangle$ get identified in the homology group, i.e., $[\langle x \rangle] = [\langle y \rangle]$.
3. By Proposition 7.1.11 the equivalence classes of two vertices are identified in homology iff the vertices lie in the same component.
4. By 1. $\{[\langle v \rangle]\}_{v \in K^{(0)}}$ span $H_0(K; F)$ and by 2. and 3. so do $\{[\langle x_i \rangle]\}_{i=1}^n$.
5. The collection $\{[\langle x_i \rangle]\}_{i=1}^n$ is linearly independent, the proof of this claim being similar to the second part of the proof of Proposition 7.1.11.

As a result $\{[\langle x_i \rangle]\}_{i=1}^n$ is a basis of $H_0(K; F)$ and thus the dimension of $H_0(K; F)$ equals the number of components of $K$, i.e., $b_0 = n$. For example see Figure 7.10.

Figure 7.10: Abstract simplicial complex $L$. $H_0(L; F)$ is of dimension two (representing two components) with a basis $[\langle a \rangle] = [\langle b \rangle] = [\langle c \rangle] = [\langle d \rangle]$ and $[\langle e \rangle]$. $H_1(L; F)$ is of dimension one representing one hole, with a basis $[\langle c, d \rangle + \langle d, b \rangle + \langle b, c \rangle]$.

Homology of a graph

Let $K$ be a simplicial complex which is a connected planar graph, and let $F$ be any field. In this subsection we prove that $b_1$ is the number of holes $K$ generates in the plane.

The homology group $H_1(K; F)$ is computed from the following piece of information:

$$C_2(K; F) \xrightarrow{\partial_2} C_1(K; F) \xrightarrow{\partial_1} C_0(K; F).$$

As $C_2(K; F) = 0$ we have $H_1(K; F) = \ker \partial_1$ so it suffices to determine the kernel of $\partial_1$.

1. Let $K' \leq K$ be a maximal tree with edges $e_1, e_2, \ldots, e_n$.
2. The collection $\partial e_1, \partial e_2, \ldots, \partial e_n$ is linearly independent by the following argument [14]. As $K'$ is contractible, its first homology is trivial by Theorem 7.1.10, and as $K'$ contains no triangles, $H_1(K'; F) = \ker \partial_1|_{C_1(K'; F)}$. In particular, $\partial_1|_{C_1(K'; F)}$ is injective. Its matrix contains $\partial e_1, \partial e_2, \ldots, \partial e_n$ as columns and injectivity implies the columns are linearly independent.
3. Let $W$ denote the span of $\partial e_1, \partial e_2, \ldots, \partial e_n$.
4. Let $e_{n+1}, e_{n+2}, \ldots, e_m$ be the edges of $K$ that are not contained in $K'$, with each $e_j$ being the edge from vertex $x_j$ to vertex $y_j$.
5. Adding edges $e_{n+1}, e_{n+2}, \ldots, e_m$ to $K'$ inductively, each addition of an edge increases the number of holes generated by the resulting graph by one.
6. In a parallel fashion, each addition of an edge increases the dimension of the kernel of the first boundary map by 1, as $\partial e_j \in W$, $\forall j \in \{n+1, n+2, \ldots, m\}$ by Proposition 7.1.11.
7. At the end of this process of adding edges we have generated $m - n$ holes and the dimension of $\ker \partial_1$ (and $b_1$) turns out to be $m - n$.
8. For each $j \in \{n+1, n+2, \ldots, m\}$ let $c_j$ denote the (simplicial) path in $K'$ from $x_j$ to $y_j$ represented as a 1-chain. The following form a basis of $H_1(K; F)$: $[e_j - c_j]$ for $j \in \{n+1, n+2, \ldots, m\}$. An example is displayed in Figure 7.12.

[14] For an alternative geometric argument see Figure 7.11.

Figure 7.11: In this figure we demonstrate a geometric reason why the collection of the boundaries of all edges of a tree is linearly independent. Given a tree (on the left side of the figure) assume a linear combination of the boundaries of its edges is the zero vector. Since vertex $a$ only appears in edge $\langle a, d \rangle$, the coefficient in front of that edge in the mentioned linear combination equals 0. The same argument holds for $b$ and $c$ and thus the mentioned linear combination only contains edges from the subtree on the right. Repeating the argument above, now for vertices $d$ and $e$, we conclude that the mentioned linear combination is trivial and thus the claim holds. The same argument works for any tree.

Figure 7.12: From left to right, the pictures represent a planar graph, a maximal tree, edges not contained in the chosen maximal tree and two cycles representing a basis of the first homology. Note that the graph induces two holes and thus $b_1 = 2$.
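The counting argument above (a maximal tree with $n$ edges inside a graph with $m$ edges gives $b_1 = m - n$) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `graph_betti_numbers` and the union-find bookkeeping are assumptions made for this example and do not come from the text. For a graph that is not connected the sketch grows a maximal forest, one tree per component, which also yields $b_0$.

```python
def graph_betti_numbers(vertices, edges):
    """Return (b0, b1) of a graph viewed as a 1-dimensional simplicial complex.

    Hypothetical helper: edges are added one by one; an edge joining two
    different components enters the maximal forest, any other edge closes
    a new independent cycle.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Representative of the component of v (with path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest_edges = 0
    for x, y in edges:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_x] = root_y   # the edge merges two components
            forest_edges += 1

    b0 = len(vertices) - forest_edges   # each forest edge removes one component
    b1 = len(edges) - forest_edges      # m - n: edges outside the maximal forest
    return b0, b1


# The 1-skeleton of the complex L (triangle <a,b,c> not filled in).
skeleton_vertices = ["a", "b", "c", "d", "e"]
skeleton_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("b", "d"), ("c", "d")]
print(graph_betti_numbers(skeleton_vertices, skeleton_edges))
```

Note that the example uses only the edges of $L$. Without the triangle $\langle a, b, c \rangle$ the graph has two holes, so the value of $b_1$ differs from that of the full complex $L$.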
7.2 Computing homology

A systematic way to compute homology groups is through matrix reduction which allows us to obtain the rank [15] of a linear map. Before we provide the details on the rank computation, let us explain how to use it in order to compute the Betti numbers.

[15] Given a linear map of vector spaces, its rank is the dimension of its image.

Let $K$ be an abstract simplicial complex of dimension $n$ and choose a field of coefficients $F$. For each $p \in \{0, 1, \ldots, n\}$ let $n_p$ denote the number of simplices of dimension $p$ in $K$.

Proposition 7.2.1. Let $f \colon A \to B$ be a linear map of vector spaces. Then:
1. $\dim A = \dim(\ker f) + \operatorname{rank} f$
2. $\dim(B / \operatorname{Im} f) = \dim B - \operatorname{rank} f$

Part 1. of Proposition 7.2.1 is a standard statement of linear algebra. Part 2. was proved in the previous chapter.

Proposition 7.2.2.
1. $\dim \ker \partial_p = n_p - \operatorname{rank} \partial_p$
2. $b_p = n_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$

Proof. Part 1. follows from 1. of Proposition 7.2.1 as $n_p = \dim C_p(K; F)$. Part 2. follows from 2. of Proposition 7.2.1 and 1.

Thus the Betti numbers can be expressed only using the number of simplices of a given dimension and the ranks of the corresponding boundary maps. Now let us turn our attention to rank computations. Given a boundary matrix, its rank [16] is easily obtained [17] from the row or column echelon form.

[16] And thus also the rank of the boundary map. Equivalent definitions of the rank of a matrix include: the maximal number of linearly independent columns; the maximal number of linearly independent rows.
[17] When using coefficients in $\mathbb{R}$ or $\mathbb{Q}$ the numerical procedure to obtain rank might in some cases result in certain instabilities. When using coefficients in $\mathbb{Z}_p$ however such issues do not arise, at least not for reasonably small $p$.

Echelon forms

In order to obtain a row echelon form of a matrix we can use the following operations [18]:

[18] Operations are considered in $F$.

R1: exchange two rows;
R2: multiply a row by a non-zero element of $F$;
R3: add a multiple of one row to a different row.
C1: exchange two columns.

In the end we are aiming for the following transformation [19], in which the first $r$ rows carry a leading 1 on the diagonal:

$$
\begin{pmatrix}
a_{1,1} & \cdots & a_{1,n} \\
\vdots & & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
1 & \ast & \cdots & \ast & \ast & \cdots & \ast \\
0 & 1 & \cdots & \ast & \ast & \cdots & \ast \\
\vdots & & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & \ast & \cdots & \ast \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0
\end{pmatrix}
$$

[19] Symbols $\ast$ denote arbitrary elements of $F$. The first $r$ elements of the diagonal are declared to be 1 in our case. This is one version of the row echelon form and can always be achieved. However, there is a variant of row echelon form in which these diagonal entries are non-zero but not necessarily 1, with the other $\ast$ entries still being arbitrary (possibly zero) elements of $F$. Using this variant the rank is still obtained in the same way and as a benefit, the number of row operations required to reach it is typically smaller.

The number of the non-trivial diagonal entries, $r$, equals the rank of the matrix. In practice we will sometimes refrain from using C1 and only reduce to the classical row echelon form that is typically obtained through Gaussian elimination.

In a similar way we can also compute the column echelon form using the corresponding column operations C1, C2, C3 and (possibly) R1.

Example 7.2.3. Let us compute the homology of simplicial complex $L$ from Figure 7.13. The boundary matrices are

$$
M_2 =
\begin{array}{c|c}
 & \langle a, b, c \rangle \\ \hline
\langle a, b \rangle & 1 \\
\langle b, c \rangle & 1 \\
\langle a, c \rangle & -1 \\
\langle b, d \rangle & 0 \\
\langle c, d \rangle & 0
\end{array}
\qquad
M_1 =
\begin{array}{c|ccccc}
 & \langle a, b \rangle & \langle b, c \rangle & \langle a, c \rangle & \langle b, d \rangle & \langle c, d \rangle \\ \hline
\langle a \rangle & -1 & & -1 & & \\
\langle b \rangle & 1 & -1 & & -1 & \\
\langle c \rangle & & 1 & 1 & & -1 \\
\langle d \rangle & & & & 1 & 1 \\
\langle e \rangle & & & & &
\end{array}
$$

Figure 7.13: Simplicial complex $L$.

Performing only row operations we obtain [20]

$$
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}
1 & & 1 & & \\
 & 1 & 1 & 1 & \\
 & & & 1 & 1 \\
 & & & & \\
 & & & &
\end{pmatrix}
$$

These are the classical row echelon forms typically obtained through Gaussian reduction [21] and the rank of such a matrix is the number of pivots [22]. The corresponding ranks of the matrices are 1 and 3. We thus have $\operatorname{rank} \partial_2 = 1$, $\operatorname{rank} \partial_1 = 3$, $n_2 = 1$, $n_1 = 5$, $n_0 = 5$ and conclude:

• $b_2 = n_2 - \operatorname{rank} \partial_2 = 0$, the complex encloses no "void".
• $b_1 = n_1 - \operatorname{rank} \partial_1 - \operatorname{rank} \partial_2 = 1$, which is the number of holes.
• $b_0 = n_0 - \operatorname{rank} \partial_1 = 2$, which is the number of components.

[20] The reduced form in this case coincides for all fields $F$. Later we will see, for example with the Klein bottle, that the reduced forms and ranks in general depend on $F$.
[21] In order to obtain pivots only on the diagonal, as the row echelon form as we defined it requires, we would need to exchange columns 3 and 4.
[22] Equivalently, the number of non-zero rows.

Side note: Row and column operations amount to changes in the bases of the domain and target vector spaces. These changes can be encoded in transformation matrices and in fact, most special forms or reductions of matrices are often expressed in terms of matrix factorizations. For our illustrative purposes though we will stick with the annotations.

Smith normal form and representatives

While the computation of the echelon forms suffices to compute the Betti numbers, we are often interested in the representing cycles [23] of homology groups as well. To that end we employ a different canonical form of a matrix: the Smith normal form. It is obtained from the row echelon form by eliminating the $\ast$ entries to zero using the row and column operations R1, R2, R3, C1, C2, C3. The result has $r$ diagonal entries equal to 1 and zeros elsewhere:

$$
\begin{pmatrix}
a_{1,1} & \cdots & a_{1,n} \\
\vdots & & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
1 & & & 0 & \cdots & 0 \\
 & \ddots & & \vdots & & \vdots \\
 & & 1 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 0
\end{pmatrix}
$$

[23] There are also other ways to compute the representing cycles although, at the end of the day, most of them use a similar amount of linear algebra. A high-level approach would be the following. First compute the basis of $\operatorname{Im} \partial_{p+1}$, which is the column space of the corresponding boundary matrix. Then complete it to the basis of $\ker \partial_p$. The vectors forming the completion represent the basis of $p$-homology. As mentioned, there are many ways to practically formalize these steps, including the presented one through the Smith normal form.

In order to obtain representing cycles though, we need to use annotated rows and columns:

• The annotations of columns from index $r + 1$ on form the basis of the kernel.
• The boundaries of annotations of columns of index up to $r$ form the basis of the image.

Example 7.2.4. Let us compute the representatives of the homology groups of simplicial complex $L$ from Figure 7.14. The annotated boundary matrices are

$$
M_2 =
\begin{array}{c|c}
 & \langle a, b, c \rangle \\ \hline
\langle a, b \rangle & 1 \\
\langle b, c \rangle & 1 \\
\langle a, c \rangle & -1 \\
\langle b, d \rangle & 0 \\
\langle c, d \rangle & 0
\end{array}
\qquad
M_1 =
\begin{array}{c|ccccc}
 & \langle a, b \rangle & \langle b, c \rangle & \langle a, c \rangle & \langle b, d \rangle & \langle c, d \rangle \\ \hline
\langle a \rangle & -1 & & -1 & & \\
\langle b \rangle & 1 & -1 & & -1 & \\
\langle c \rangle & & 1 & 1 & & -1 \\
\langle d \rangle & & & & 1 & 1 \\
\langle e \rangle & & & & &
\end{array}
$$

Figure 7.14: Abstract simplicial complex $L$.

with the annotated row echelon forms being [24]

$$
\begin{array}{c}
\langle a, b, c \rangle \\ \hline
1 \\ 0 \\ 0 \\ 0 \\ 0
\end{array}
\quad \text{and} \quad
\begin{array}{ccccc}
\langle a, b \rangle & \langle b, c \rangle & \langle b, d \rangle & \langle a, c \rangle & \langle c, d \rangle \\ \hline
1 & & & 1 & \\
 & 1 & 1 & 1 & \\
 & & 1 & & 1 \\
 & & & & \\
 & & & &
\end{array}
$$

[24] Only the column annotations will be displayed as the row annotations are not required.

The first of these two matrices is already in the Smith normal form. The Smith normal form of the second matrix is:

$$
\begin{array}{ccccc}
\langle a, b \rangle & \langle b, c \rangle & \langle b, d \rangle - \langle b, c \rangle & \langle a, c \rangle - \langle b, c \rangle - \langle a, b \rangle & \langle c, d \rangle - \langle b, d \rangle + \langle b, c \rangle \\ \hline
1 & & & & \\
 & 1 & & & \\
 & & 1 & & \\
 & & & & \\
 & & & &
\end{array}
$$

We now construct the homology representatives by dimension:

Dimension 0:

1. $\ker \partial_0$ has a basis $\langle a \rangle, \langle b \rangle, \langle c \rangle, \langle d \rangle, \langle e \rangle$.
2. $\operatorname{Im} \partial_1$ has a basis formed by the images of the first three annotated columns of the Smith normal form, i.e., of $\langle a, b \rangle$, $\langle b, c \rangle$, and $\langle b, d \rangle - \langle b, c \rangle$. The basis obtained in this way is $\langle b \rangle - \langle a \rangle$, $\langle c \rangle - \langle b \rangle$, and $(\langle d \rangle - \langle b \rangle) - (\langle c \rangle - \langle b \rangle) = \langle d \rangle - \langle c \rangle$.
3. We may complete the basis from 2. to the basis of $\ker \partial_0$ by, for example, adding $\langle a \rangle$ and $\langle e \rangle$ and thus $\langle a \rangle$ and $\langle e \rangle$ represent the two 0-holes [25] spanning $H_0(L; F)$.

[25] I.e., components.

Dimension 1:

1. $\ker \partial_1$ has a basis $\langle a, c \rangle - \langle b, c \rangle - \langle a, b \rangle$ and $\langle c, d \rangle - \langle b, d \rangle + \langle b, c \rangle$.
2. $\operatorname{Im} \partial_2$ has a basis formed by the images (boundaries) of the first annotated column of the Smith normal form, i.e., of $\langle a, b, c \rangle$. The basis obtained in this way is $\langle a, b \rangle + \langle b, c \rangle - \langle a, c \rangle$.
3. We may complete [26] the basis from 2. to the basis of $\ker \partial_1$ by, for example, adding $\langle c, d \rangle - \langle b, d \rangle + \langle b, c \rangle$ and thus $\langle c, d \rangle - \langle b, d \rangle + \langle b, c \rangle$ represents a 1-hole spanning $H_1(L; F)$.

[26] The fact that the basis element from 2. is (up to sign) a member of the basis of 1. helps us to see this completion immediately. However, such a situation is an exception and a completion of basis typically involves some work with linear algebra.

Figure 7.15: Obtained representatives of bases of the homology groups of $L$. Representatives $\langle a \rangle$ and $\langle e \rangle$ in red spanning $H_0(L; F)$, and representative $\langle c, d \rangle - \langle b, d \rangle + \langle b, c \rangle$ in blue spanning $H_1(L; F)$.

Incremental expansion and elementary collapse

We conclude the section by analysing how a minimal change to a simplicial complex, an addition of one or two simplices, affects the homology.

We first discuss the incremental expansion, or how an addition [27] of a simplex to a simplicial complex changes the homology. Let $K$ be a simplicial complex and let $\sigma^{(n)} \notin K$ be an $n$-simplex on vertices of $K$ such that $K \cup \{\sigma\}$ is [28] also a simplicial complex. The addition of $\sigma$ to $K$ has the following effect on the homology computation scheme:

1. The number of $n$-simplices increases [29] by 1.
2. If the chain $\partial\sigma$ is already contained in $\operatorname{Im} \partial_n$, then the addition of $\sigma$ to the boundary matrix of $\partial_n$ adds a column which is linearly dependent on the other columns and in effect, the dimension of the kernel is increased by 1.
3. If the chain $\partial\sigma$ is not in $\operatorname{Im} \partial_n$, then the addition of $\sigma$ to the boundary matrix of $\partial_n$ adds a column which is linearly independent of the other columns and in effect, the rank of the matrix is increased by 1.

[27] Or a removal, which can be analyzed in a similar fashion.
[28] In particular, all faces of $\sigma$ should be present in $K$.
[29] This means that either the dimension of the kernel of $\partial_n$ or the rank of $\partial_n$ increases by 1.

As a result (see Figure 7.16), an incremental expansion either increases $b_n$ by 1 (case 2.), or decreases $b_{n-1}$ by 1 (case 3.).

Figure 7.16: A demonstration of incremental expansion. Adding an edge to a simplicial complex may either reduce $b_0$ (the number of components) by 1 (blue case) or increase $b_1$ (the number of holes) by 1 (red case).

We next discuss an elementary collapse. We have already mentioned it in the chapter on simplicial complexes. Let $K$ be a simplicial complex, $\tau^{(k-1)} \subset \sigma^{(k)} \in K$, and assume $\sigma$ is the only coface of $\tau$. A removal $K \to K \setminus \{\tau, \sigma\}$ is called an elementary collapse. It is a modification that does not change the homotopy type, and hence the homology is preserved. Let us see how an elementary collapse affects the computation of homology.

Figure 7.17: An elementary collapse.

• The boundary of $\sigma$ is not a linear combination of boundaries of other $k$-simplices as $\sigma$ is the only [30] coface of $\tau$. Hence removing $\sigma$ decreases $\operatorname{rank} \partial_k$ by 1.
• The boundary of $\tau$ is a linear combination of boundaries of other $(k-1)$-simplices by the following argument. Simplex $\tau$ is contained [31] in the chain $\partial\sigma$. Since the boundary of this chain equals zero [32], we can express $\partial\tau$ as a sum of boundaries of other facets of $\sigma$ with the appropriate coefficients $\pm 1$. Hence $\partial\tau$ is a linear combination of boundaries of other $(k-1)$-simplices and thus removing $\tau$ decreases $\dim \ker \partial_{k-1}$ by 1.

[30] Meaning that $\partial\sigma$ is the only boundary of a $k$-simplex containing a term with $\tau$.
[31] ...with coefficient $+1$ or $-1$.
[32] $\partial^2 \sigma = 0$

In total, the dimensions of the homology groups do not [33] change.

[33] Recall that the only homology group that may potentially change is $H_{k-1}$.
It is defined as $\ker \partial_{k-1} / \operatorname{Im} \partial_k$ and since the dimensions of both $\ker \partial_{k-1}$ and $\operatorname{Im} \partial_k$ decrease by one, the dimension of the quotient is preserved.

7.3 Examples of homology

In this section we present some further aspects of homology that should aid our understanding of the concept.

Disjoint unions

Two abstract simplicial complexes are said to be disjoint if their collections are disjoint [34]. Two geometric simplicial complexes are said to be disjoint if their bodies are disjoint. The union of disjoint simplicial complexes $K, L$ is called a disjoint union and is denoted by $K \sqcup L$.

[34] I.e., if there is no intersection between the sets of vertices. Formally speaking, if such an intersection existed it would mean that we are treating both collections of vertices as subsets of some larger set.

Given two disjoint simplicial complexes $K, L$, the homology of their disjoint union is the cartesian product [35] of the individual homologies:

$$H_i(K \sqcup L; G) \cong H_i(K; G) \times H_i(L; G).$$

[35] In our setting, the term "direct sum" could also be used.

Computationally we can see this by observing that the boundary map $\partial$ has block-diagonal matrices: boundaries of chains from $K$ lie in $K$ and the same holds for $L$.

Since each simplicial complex is the disjoint union of its components, the technical computations and treatments of homology are typically restricted to connected simplicial complexes.

Example 7.3.1. Given a planar graph $K$ and any field $F$:

• $b_0$ is the number of components of $K$.
• $b_1$ is the number of holes $K$ induces in the plane. This is consistent with our observation for disjoint unions in case $K$ is not connected (as in Figure 7.18).

Figure 7.18: A planar graph with four components: $b_0 = 4$, $b_1 = 7$, $\chi = -3$.

Euler characteristic

Suppose $K$ is a simplicial complex and let $n_i$ denote the number of $i$-simplices in $K$. Recall that the Euler characteristic $\chi(K) \in \mathbb{Z}$ is defined as

$$\chi(K) = n_0 - n_1 + n_2 - n_3 + \ldots.$$
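Both the simplex counts $n_p$ entering the Euler characteristic and the rank formula of Proposition 7.2.2 are easy to evaluate by machine. The following Python sketch does so for the complex $L$ used in the examples of Section 7.2. It is an illustration under stated assumptions rather than the procedure of the text: the helper names (`boundary_matrix`, `rank_mod_p`, `betti_numbers`) are hypothetical, simplices are given as sorted tuples of vertices, the input is assumed to be closed under taking faces, and the coefficients are restricted to $\mathbb{Z}_p$ for a prime $p$ so that all arithmetic is exact.

```python
def boundary_matrix(simplices_q, simplices_q_minus_1, p):
    """Matrix of the boundary map over Z_p.

    Rows are indexed by (q-1)-simplices, columns by q-simplices.
    """
    row_of = {s: i for i, s in enumerate(simplices_q_minus_1)}
    matrix = [[0] * len(simplices_q) for _ in simplices_q_minus_1]
    for j, simplex in enumerate(simplices_q):
        for k in range(len(simplex)):
            face = simplex[:k] + simplex[k + 1:]        # drop the k-th vertex
            matrix[row_of[face]][j] = (-1) ** k % p     # sign (-1)^k in Z_p
    return matrix


def rank_mod_p(matrix, p):
    """Rank over Z_p (p prime) by Gaussian elimination on rows."""
    rows = [row[:] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]          # R1
        inverse = pow(rows[rank][col], -1, p)                      # Python 3.8+
        rows[rank] = [(x * inverse) % p for x in rows[rank]]       # R2
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % p                    # R3
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def betti_numbers(simplices, p=2):
    """Betti numbers b_q = n_q - rank(d_q) - rank(d_{q+1}) over Z_p."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(s))
    top = max(by_dim)
    ranks = [0] * (top + 2)          # boundary maps d_0 and d_{top+1} are zero
    for q in range(1, top + 1):
        ranks[q] = rank_mod_p(boundary_matrix(by_dim[q], by_dim[q - 1], p), p)
    return [len(by_dim[q]) - ranks[q] - ranks[q + 1] for q in range(top + 1)]


# Complex L: vertices a..e, five edges and the triangle <a, b, c>.
L = [("a",), ("b",), ("c",), ("d",), ("e",),
     ("a", "b"), ("b", "c"), ("a", "c"), ("b", "d"), ("c", "d"),
     ("a", "b", "c")]
counts = [sum(1 for s in L if len(s) == q + 1) for q in range(3)]
euler_characteristic = sum((-1) ** q * n for q, n in enumerate(counts))
print(betti_numbers(L, p=2), euler_characteristic)
```

For $L$ the simplex counts are $n_0 = 5$, $n_1 = 5$, $n_2 = 1$, so the Euler characteristic equals 1, and the Betti numbers agree with those obtained by hand in Example 7.2.3.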
The Euler characteristic has an interesting interpretation in terms of homology.

Proposition 7.3.2. $\chi(K) = b_0 - b_1 + b_2 - b_3 + \ldots$.

Proof. By 2. of Proposition 7.2.2 we have $b_p = n_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$. Substituting these equalities into $b_0 - b_1 + b_2 - b_3 + \ldots$, the ranks cancel in pairs and we obtain $\chi$.

Example 7.3.3. Given a planar graph $K$ and any field $F$, $\chi(K)$ equals the number of components minus the number of holes $K$ generates in the plane.

Figure 7.19: $S^0$ demonstrates non-trivial $H_0$, $S^1$ represents a one-dimensional hole, and $S^2$ encloses a two-dimensional hole.

Spheres

Holes as measured by homology are represented by cycles and the fundamental examples of holes are provided [36] by spheres. In this subsection we prove that given a triangulation of an $n$-sphere for $n \geq 1$, the consistently oriented collection of $n$-simplices represents an $n$-hole. In fact, this is the only hole a sphere has. A convenient triangulation of $S^n$ we will be using will be the one [37] consisting of all faces of an $(n+1)$-simplex.

[36] The homology of a metric space is, for our purposes, the homology of any triangulation of that space.
[37] To be precise: take an $(n+1)$-simplex, add all of its faces to obtain a simplicial complex called the full simplex on $n+2$ points (sometimes also called the full $(n+1)$-simplex), and then remove the $(n+1)$-simplex to obtain a triangulation of $S^n$.

Proposition 7.3.4. For each $F$ and $n \in \{1, 2, \ldots\}$ we have:

• $H_0(S^n; F) \cong H_n(S^n; F) \cong F$;
• $H_i(S^n; F) = 0$, $\forall i \notin \{0, n\}$.

Proof. The full simplicial complex on $n+2$ points is contractible hence all its homology groups are trivial except for $H_0$, which is of rank 1. Removing the only $(n+1)$-simplex reduces $\operatorname{rank} \partial_{n+1}$ by one and hence increases $b_n$ by 1, as was explained in the context of incremental expansions and removals.

Side note: An observation: The full simplex on $n$ points is contractible hence its Euler characteristic equals 1. On the other hand, computing its Euler characteristic by definition we get $n$ points $- \binom{n}{2}$ edges $+ \binom{n}{3}$ triangles $- \ldots - (-1)^n \cdot 1$ simplex of dimension $n-1$. Summing it up we get:

$$\binom{n}{1} - \binom{n}{2} + \binom{n}{3} - \ldots - (-1)^n \binom{n}{n} = 1,$$

by the binomial formula.

Since $S^0$ is a collection of two points it is easy to see [38] that the only non-trivial homology group of $S^0$ is $H_0(S^0; F) \cong F^2$.

[38] ...by a direct computation or by the argument of Proposition 7.3.4.

Surfaces

A beautiful demonstration of the two-dimensional homology is provided by surfaces.

Proposition 7.3.5. Let $K$ be a triangulation of a closed (i.e., without boundary) connected orientable surface. For each field $F$ we have $H_2(K; F) \cong F$.

Figure 7.20: Examples of closed connected surfaces: they all enclose one 2-dimensional hole in the form of a "cave", which is manifested in the fact that $b_2 = 1$.

Proof. Recall that $K$ being orientable means there exists a consistent choice of orientations on all triangles of $K$. Let us fix such an orientation on them.

1. The structure of a surface implies that each edge of $K$ is a face of at most two triangles.
2. The structure of a closed surface implies that each edge of $K$ is a face of precisely two triangles.
3. Consistency of orientations on triangles implies that whenever two triangles intersect in an edge, the induced orientations on the edge are the opposite.

Warning: The statement of Proposition 7.3.5 does not hold for connected surfaces with boundary. If there was a non-trivial 2-cycle in such a case, the same argument as in the proof of the proposition would imply that the cycle would be the oriented sum of all triangles (possibly multiplied by a single non-trivial factor $\lambda \in F$). Since a presence of a boundary of a manifold implies the existence of an edge which is a face of precisely one triangle, such an edge (multiplied by $\lambda$) would thus appear in the boundary of the cycle, a contradiction. The second homology of a connected manifold with a boundary is thus always trivial.
Let us define chain $\alpha$ as the sum of all oriented triangles. By 1.-3. above each edge appears in $\partial\alpha$ twice, once with each orientation (see Figure 7.21), and thus $\partial\alpha = 0$, meaning that $\alpha$ is a cycle. As the image of $\partial_3$ is trivial, $\alpha$ represents a non-trivial homology class.

On the other hand, whenever a 2-cycle $\beta$ contains a term [39] $+\sigma$, where $\sigma$ is an oriented triangle, observations 2. and 3. imply that all oriented simplices sharing an edge with $\sigma$ also appear in $\beta$ with coefficient 1. Inductively expanding this conclusion to further neighbors we reach all triangles as $K$ is connected and thus deduce that $\beta = \alpha$. The proposition is thus proved.

[39] If $\sigma$ appears in the term $\lambda\sigma$ for some non-zero $\lambda \in F$, we repeat the same argument for the chain divided by $\lambda$.

Homology class $[\alpha]$ generating $H_2(K; F)$ as defined [40] in the proof is called the fundamental class of the surface. In the same way we can prove that if $K$ is a closed connected orientable manifold of dimension $n$, then $H_n(K; F) \cong F$ with the generator, which is again called the fundamental class, being the sum of all consistently oriented $n$-simplices of $K$.

[40] Formally speaking, there are two fundamental classes, one for each orientation of triangles...except when $F = \mathbb{Z}_2$. If $F = \mathbb{Z}_2$ there is only one non-trivial homology class which is its own inverse.

The case of non-orientable surfaces is the first presented situation in which the choice of coefficients matters.

Proposition 7.3.6. Let $K$ be a triangulation of a closed connected non-orientable surface. Then $H_2(K; \mathbb{Z}_2) \cong \mathbb{Z}_2$ and $H_2(K; F) \cong 0$ for each $F \not\cong \mathbb{Z}_2$.

Proof. As in the proof of Proposition 7.3.5, the fact that $K$ is a surface means that if a 2-cycle $\alpha$ contains a term $+\sigma$ for some oriented triangle $\sigma$, it also contains a term $+\sigma'$ for each oriented triangle $\sigma'$ sharing [41] an edge with $\sigma$. Again, as $K$ is connected, this means that $\alpha$ is the sum of all oriented triangles.
However, as $K$ is non-orientable, there is no consistent orientation on triangles and thus [42] some edges appear with coefficient 2 in the boundary, see Figure 7.21. Thus if [43] $0 \neq 2$ the boundary is non-trivial and the assumed 2-cycle does not have a trivial boundary, a contradiction. Hence the only 2-cycle is the trivial cycle.

However, if $F = \mathbb{Z}_2$, the obtained boundary equals zero and thus $\alpha$ is the only non-trivial cycle. As a result, $H_2(K; \mathbb{Z}_2) \cong \mathbb{Z}_2$.

[41] Sharing in the sense of consistent orientation, meaning that the induced orientations on the shared edge are the opposite.
[42] As each edge appears twice in the boundary of such a chain and not all such appearances may cancel each other out by the non-orientability.
[43] Equivalently, if $F \neq \mathbb{Z}_2$.

Figure 7.21: Top: The boundary of a chain consisting of all consistently oriented triangles of a surface without boundary is zero, as the induced orientations on edges cancel out. Bottom: The boundary of a chain consisting of a not-consistently oriented collection of all triangles of a surface without boundary contains each edge between two non-consistently oriented triangles twice.

We may summarize these two propositions and the corresponding comments as follows:

• A connected surface $K$ is closed iff $H_2(K; \mathbb{Z}_2) \neq 0$.
• Given any field $F \neq \mathbb{Z}_2$, a closed connected surface is orientable if $H_2(K; F) \cong F$.

Side note: Proposition 7.3.6 also generalizes to the $n$-dimensional homology of closed connected non-orientable $n$-manifolds.

Impact of coefficients: the Klein bottle

An example where the choice of coefficients makes a difference in 1-dimensional homology computations is the Klein bottle, which will be denoted by $K$ in this subsection. It is depicted in Figure 7.22. Its triangulation is given by the black portion in Figure 7.24. We already know that $b_0 = 1$ as $K$ is connected.
However, the second Betti number of this closed surface depends on the coefficients due to the non-orientability:

• $b_2(K; \mathbb{Z}_2) = 1$.
• For $F \not\cong \mathbb{Z}_2$, $b_2(K; F) = 0$.

From this information, the expression of the Euler characteristic as the alternating sum of Betti numbers, and the fact that the Euler characteristic of the Klein bottle equals 0, we conclude:

• $b_1(K; \mathbb{Z}_2) = 2$, i.e., $H_1(K; \mathbb{Z}_2) \cong \mathbb{Z}_2^2$.
• For $F \not\cong \mathbb{Z}_2$, $b_1(K; F) = 1$.

These Betti numbers can also be computed through the matrix reduction. Instead of computing them, we will rather demonstrate the geometric reason for the difference in $b_1$ depending on the coefficients.

Figure 7.22: The Klein bottle.

The explanation will be based on Figure 7.24. On each of the five parts of the figure a triangulation of $K$ is provided by the black/grey portion. The black arrows indicate the direction in which the identifications are performed. A single red horizontal directed line represents a cycle $\alpha$ generating the extra dimension of $H_1(K; \mathbb{Z}_2)$. It is also depicted in Figure 7.22. It turns out that $[\alpha]$ is homologically non-zero iff coefficients are $\mathbb{Z}_2$. In order to prove this statement we first present a claim.

We claim that $2[\alpha] = 0$ in 1-dimensional homology. In order to prove the claim the leftmost part of Figure 7.24 has two copies of $\alpha$ drawn slightly apart from each other. The corresponding homology class does not change if we move [44] each of the copies of $\alpha$ separately.

[44] "Moving" in this setting can be thought of as a homotopic change. Formally speaking, a moved chain represents the same homology class if the difference between the original and the new chain is in the boundary group, see Figure 7.23.

So let's move them as in the figure:

• move the upper copy slightly higher;
• move the lower copy to the bottom of the side. Due to the reversed orientation, the chain then appears on the top of the square with (in the plane seemingly) reversed orientation.
Moving this representative lower to the first copy of $\alpha$ we see that the copies cancel each other out: they consist of the same edges with opposite orientations.

As a result, the claim holds, i.e., $2[\alpha]$ is the trivial homology class. Depending on the coefficients of our computation this has the following ramifications:

• If $F \not\cong \mathbb{Z}_2$ then we can divide the equation $2[\alpha] = 0$ by $2$ and obtain $[\alpha] = 0 \in H_1(K; F)$.
• If $F \cong \mathbb{Z}_2$ then we can't divide the equation $2[\alpha] = 0$ by $2$, as $2 = 0$. It turns out that $[\alpha] \neq 0 \in H_1(K; \mathbb{Z}_2)$ and thus $\alpha$ provides an extra dimension to $H_1(K; \mathbb{Z}_2)$.

For an alternative argument proving the claim see Figure 7.25.

Figure 7.23: Excerpt from the transformation in Figure 7.24. The blue and the red chain represent the same homology class because their difference (blue $-$ red) is the boundary of the 2-chain consisting of the strip of depicted oriented triangles.

Figure 7.24: The Klein bottle.

Figure 7.25: Another proof of the fact that $2[\alpha]$ is homologically trivial within the Klein bottle. The chain $2\alpha$ is depicted in red and is the boundary of the 2-chain consisting of all depicted oriented triangles.

Alexander duality

Homology is defined for any abstract simplicial complex. However, if there is an underlying geometric simplicial complex $K$ embedded in a sphere or a Euclidean space, there is a connection between the homology of $K$ and that of its complement. The relationship is formally known as Alexander duality.

Before we state the duality we should explain a few technical details of the complement construction. Let $K \subset \mathbb{R}^2$ be a geometric simplicial complex. In particular, $K$ consists [45] of finitely many simplices. The complement of $K$, denoted by $K^C = \mathbb{R}^2 \setminus K$, is unfortunately not homeomorphic to a (finite [46]) simplicial complex. As a proof of this claim observe that $K$ is a closed [47] subset of the plane, while $K^C$ is usually [48] not. However, $K^C$ is homotopy equivalent to a finite simplicial complex. For an example see Figure 7.26. At this point we refrain from specifying the details of a triangulation of $K^C$ or its homotopy type and rather conclude with the declaration: $K^C$ is homotopy equivalent to a finite simplicial complex $K'$, and so whenever we talk about the homology of $K^C$, we will formally be thinking of the homology of $K'$. The same discussion applies if $K$ is a geometric simplicial complex in any Euclidean space or a sphere.

[45] Formally, the body of $K$ is the union of finitely many simplices.
[46] Recall that all simplicial complexes considered here are finite. Within the context of infinite simplicial complexes though, the complement can be triangulated and the treatment of complements presented here is immaterial.
[47] In particular, this means that the limit of each converging sequence in $K$ lies in $K$.
[48] Except if $K$ is empty.

Alexander duality provides a connection between the homologies of $K$ and its complement.

Theorem 7.3.7 (Alexander duality). Let $n \in \mathbb{N}$ and suppose $K \subset S^n$ is a geometric simplicial complex. Then for any coefficients $F$ we have:

1. $b_0(K; F) - 1 = b_{n-1}(K^C; F)$.
2. $b_{n-1}(K; F) = b_0(K^C; F) - 1$.
3. $b_q(K; F) = b_{n-q-1}(K^C; F)$ for all $q \in \{1, 2, \ldots, n-2\}$.

Figure 7.26: Simplicial complex $K$ in the plane in black, and a simplicial complex $K'$ homotopy equivalent to its complement in red. Note that the number of holes of $K$ is one less than the number of components of $K'$, i.e., $b_1(K) = b_0(K') - 1$. Also, the number of holes of $K'$ equals the number of components of $K$, i.e., $b_1(K') = b_0(K)$.

For a proof see a textbook [49]. From the Alexander duality we may draw a similar conclusion for complexes in Euclidean spaces by taking into account that removing a point [50] from $S^n$ results in a space homeomorphic to $\mathbb{R}^n$.

[49] James Munkres. Elements of Algebraic Topology. Perseus Books, 1984. doi: 10.1201/9780429493911

Corollary 7.3.8. Let $n \in \mathbb{N}$ and suppose $K \subset \mathbb{R}^n$ is a geometric simplicial complex. Then for any coefficients $F$ we have:

1.
$b_0(K; F) = b_{n-1}(K^C; F)$.
2. $b_{n-1}(K; F) = b_0(K^C; F) - 1$.
3. $b_q(K; F) = b_{n-q-1}(K^C; F)$ for all $q \in \{1, 2, \ldots, n-2\}$.

[50] A removal of a point from $K^C \subset S^n$ increases $b_{n-1}$ by one.

Alexander duality is handy when computing homology groups of simplicial complexes in Euclidean spaces or spheres. For example, instead of computing the one-dimensional homology of a planar simplicial complex, we can [51] compute the number of components of its complement, which is typically much faster.

[51] Provided there is an easy description of a complement. Such examples would include bitmap images.

Figure 7.27: A demonstration of Alexander duality: given a bounded subset $X$ of the plane, each component of $X$ corresponds to a hole in $X^C$, and each hole in $X$ corresponds to a bounded component of $X^C$.

7.4 Concluding remarks

Recap (highlights) of this chapter

• Cycles, boundaries, homology
• Detecting components and holes with homology
• Computing homology through matrix reduction
• Euler characteristic
• Alexander duality

Background and applications

Homology is one of the focal invariants in topology and geometry. Homological conditions and constructions can be found throughout mathematics. We will present one of them in the appendix (Cubical homology). The version presented here is usually called "simplicial homology" as it arises from the structure of a simplicial complex. For non-triangulated spaces a version called "singular homology" can be defined. In general though, any reasonable boundary map $\partial$ satisfying $\partial^2 = 0$ induces its own homology structure. Examples [52] include cubical homology (see appendix) and cohomology. For a general reference we mention a few textbooks [53].

[52] Another example is the De Rham cohomology and exterior derivative. While the theory itself is quite involved, a snapshot of the fact that $\partial^2 = 0$ can be observed in low dimensions via specific derivatives: gradient, curl, and divergence are specific boundary maps, as the composition of a consecutive pair amongst them equals zero.

Amenability to algorithmic computations through matrix reductions and, as we will see later, Discrete Morse Theory makes homology an obvious tool with which we could determine topological properties of data. In practice, though, the usual homology is often superseded by persistent homology, which is a richer, parameterized version of homology described in later chapters.

[53] James Munkres. Elements of Algebraic Topology. Perseus Books, 1984. doi: 10.1201/9780429493911; Allen Hatcher. Algebraic Topology. Cambridge Univ. Press, Cambridge, 2000; and Raoul Bott and Loring W. Tu. Differential Forms in Algebraic Topology. Springer New York, New York, NY, 1982. doi: 10.1007/978-1-4757-3951-0

Appendix: Homology with coefficients in Abelian groups

Classical introductions of homology typically consider coefficients from an Abelian group rather than a field. By far the most popular choice among non-fields is the group of integers $\mathbb{Z}$. In this subsection we review the construction and properties of homology using coefficients [54] in the group $\mathbb{Z}$.

[54] The presented treatment would be practically identical for any Abelian group as the coefficient group.

Let $K$ be an abstract simplicial complex of dimension $n$. For each $q \in \{0, 1, \ldots, n\}$ let $n_q$ denote the number of simplices of dimension $q$ in $K$. The definition of homology in this case remains the same, with the only difference being that the structure of the resulting algebraic invariants is that of Abelian groups, and the boundary operator $\partial$ is a homomorphism:

1. A $q$-chain is a formal sum $\sum_{i=1}^{n_q} a_i \sigma_i^q$ where $a_i \in \mathbb{Z}$ and $\sigma_i^q$ is an oriented simplex of dimension $q$ in $K$.
2. The chain group $C_q(K; \mathbb{Z}) \cong \mathbb{Z}^{n_q}$ is the group of all $q$-chains. Its generators are oriented $q$-simplices of $K$.
3.
For each $p \in \mathbb{N}$ the boundary map $\partial_p : C_p(K; \mathbb{Z}) \to C_{p-1}(K; \mathbb{Z})$ is the homomorphism defined by
$$\partial_p \langle v_0, v_1, \ldots, v_p \rangle = \sum_{i=0}^{p} (-1)^i \langle v_0, v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_p \rangle.$$
As before, $\partial^2 = 0$. Additionally define $\partial_0 = 0$.
4. The collection of chain groups bound together by the boundary homomorphisms is called the chain complex:
$$\cdots \xrightarrow{\partial} C_n(K; \mathbb{Z}) \xrightarrow{\partial} C_{n-1}(K; \mathbb{Z}) \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_1(K; \mathbb{Z}) \xrightarrow{\partial} C_0(K; \mathbb{Z}) \xrightarrow{\partial} 0$$
5. For each $q \in \{0, 1, \ldots\}$ we define groups:
• $q$-cycles as $Z_q(K; \mathbb{Z}) = \ker \partial_q \leq C_q(K; \mathbb{Z})$.
• $q$-boundaries as $B_q(K; \mathbb{Z}) = \operatorname{Im} \partial_{q+1} \leq Z_q(K; \mathbb{Z}) \leq C_q(K; \mathbb{Z})$.
• $q$-homology group as the quotient group $H_q(K; \mathbb{Z}) = Z_q(K; \mathbb{Z}) / B_q(K; \mathbb{Z})$.

Proposition 7.4.1. Suppose $G, H$ are Abelian groups, a map $f : G \to H$ is a homomorphism, and $G' \leq G$. Then:
1. $\ker(f) \leq G$.
2. $\operatorname{Im}(f) \leq H$.
3. $\operatorname{Im}(f) \cong G / \ker(f)$.
4. $\operatorname{rank}(G / G') = \operatorname{rank}(G) - \operatorname{rank}(G')$.

Up to this point the introduction has been analogous to the one where coefficients form a field. However, as $H_q(K; \mathbb{Z})$ is an Abelian group, its rank does not completely determine it. In particular,
$$H_q(K; \mathbb{Z}) \cong \underbrace{\mathbb{Z}^r}_{\text{free part}} \oplus \underbrace{\mathbb{Z}_{p_1^{q_1}} \oplus \mathbb{Z}_{p_2^{q_2}} \oplus \cdots \oplus \mathbb{Z}_{p_k^{q_k}}}_{\text{torsion}},$$
where the rank of the group $r = b_q(K; \mathbb{Z})$, referred to as the $q$-Betti number, only determines [55] the free part of the group.

[55] Two cases when the homology group has no torsion:
• For any simplicial complex $K$ we have $H_0(K; \mathbb{Z}) \cong \mathbb{Z}^{b_0}$, where $b_0$ is the number of components of $K$.
• If $K$ is a planar graph then $H_1(K; \mathbb{Z}) \cong \mathbb{Z}^{b_1}$, where $b_1$ is the number of holes $K$ generates in the plane.

Let $\operatorname{rank} \partial_q$ be the rank of the image of $\partial_q$. By Proposition 7.4.1, the numbers $b_q$ can be deduced [56] from the ranks of $\partial_q$, $\partial_{q+1}$ and $n_q$. However, in order to compute torsion we need to delve deeper into the structure of the boundary maps.

[56] I.e., $b_p = n_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$.

For example, suppose that in the diagram
$$\mathbb{Z} \xrightarrow{\varphi} \mathbb{Z} \xrightarrow{\psi} \mathbb{Z}$$
the map $\varphi$ has rank $1$ and $\psi$ is the zero map, so that $\operatorname{Im} \varphi \subseteq \ker \psi = \mathbb{Z}$. Defining $H = \ker \psi / \operatorname{Im} \varphi$, we know that $\operatorname{rank} H = 1 - 1 = 0$. However, depending on the map $\varphi$, the group $H$ could be any group of the form $\mathbb{Z}_m$. For example, if $\varphi(n) = s \cdot n$ for some $s \in \mathbb{N}$, then $H \cong \mathbb{Z}_s$.

Note: It turns out that amongst all possible choices of coefficients, homology with coefficients in $\mathbb{Z}$ contains the most information. Details of this statement are formalized in the universal coefficient theorem, which explains the connection between coefficients $\mathbb{Z}$ and all other coefficients.

In order to compute homology with coefficients in $\mathbb{Z}$ we may reduce each boundary matrix to its Smith normal form. Given a matrix with entries in $\mathbb{Z}$, its Smith normal form is:
$$D = \begin{pmatrix}
a_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & a_2 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & a_r & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0
\end{pmatrix},$$
where each diagonal entry $a_i$ divides [57] the next one. The diagonal entries $a_i$ are called elementary divisors and $r$ is the rank [58] of the matrix.

[57] I.e., $a_i \mid a_{i+1}$, $\forall i \in \{1, 2, \ldots, r-1\}$.
[58] The rank of the matrix corresponding to a boundary map coincides with the rank of the boundary map.

Each matrix with entries in $\mathbb{Z}$ has [59] a Smith normal form. Some of its properties are:

1. The form is obtained through a combination [60] of row and column operations and the Euclidean algorithm for computing greatest common divisors.
2. The form is unique up to the signs of the elementary divisors.

[59] Formally, every matrix $A$ with entries in $\mathbb{Z}$ can be factored as $A = UDV$, where $D$ is its Smith normal form, and $U$ and $V$ are matrices with entries in $\mathbb{Z}$ with determinant $\pm 1$. In particular, the last condition means that $U$ and $V$ are invertible, and that their inverses have entries in $\mathbb{Z}$.
[60] At this point the difference of the structure of a group as compared to that of a field becomes prominent. When coefficients were in a field, we could always divide a row by a non-zero entry. When working with coefficients in $\mathbb{Z}$ that is not allowed (except for $\pm 1$, which doesn't really help). As a result, obtaining the desired form of a matrix requires us to involve greatest common divisors, and even then not all non-trivial diagonal entries can be transformed to $1$.

Elementary divisors generate the torsion part of homology. We now describe how to obtain homology groups using the Smith normal form.

• Choose $q \in \{0, 1, \ldots\}$.
• Assume matrix $D$ above is the Smith normal form of $\partial_{q+1}$ with all diagonal entries positive.
• Compute the rank of $\partial_q$, possibly also through its Smith normal form.
• Then:
$$H_q(K; \mathbb{Z}) \cong \mathbb{Z}^{\,n_q - \operatorname{rank} \partial_q - \operatorname{rank} \partial_{q+1}} \oplus \bigoplus_{i=1}^{r} \mathbb{Z}_{a_i}.$$

Note that this form may potentially be simplified [61] further.

[61] If $s_1, s_2$ are relatively prime, then $\mathbb{Z}_{s_1 \cdot s_2} \cong \mathbb{Z}_{s_1} \oplus \mathbb{Z}_{s_2}$. Also, if some $a_i = 1$, then $\mathbb{Z}_{a_i}$ is the trivial group, i.e., it can be omitted from the expression.

We conclude by providing analogues of the examples of homology with field coefficients:

• The formula for disjoint union holds as before: $H_i(K \sqcup L; \mathbb{Z}) \cong H_i(K; \mathbb{Z}) \times H_i(L; \mathbb{Z})$.
• The expression for the Euler characteristic with integer Betti numbers is the same: $\chi(K) = b_0 - b_1 + b_2 - b_3 + \ldots$.
• For each $n \in \{1, 2, \ldots\}$ we have:
  – $H_0(S^n; \mathbb{Z}) \cong H_n(S^n; \mathbb{Z}) \cong \mathbb{Z}$;
  – $H_i(S^n; \mathbb{Z}) = 0$, $\forall i \notin \{0, n\}$.
• For each connected manifold $K$ of positive dimension $n$ we have:
  – $H_n(K; \mathbb{Z}) = 0$ if $K$ has boundary.
  – $H_n(K; \mathbb{Z}) \cong \mathbb{Z}$ if $K$ is closed orientable.
  – $H_n(K; \mathbb{Z}) = 0$ if $K$ is closed non-orientable (whereas $H_n(K; \mathbb{Z}_2) \cong \mathbb{Z}_2$).
• If $K$ is the Klein bottle, then $H_1(K; \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z}_2$.

For an extended treatment of these examples see a book [62].

[62] Allen Hatcher. Algebraic Topology. Cambridge Univ. Press, Cambridge, 2000

Appendix: cubical homology

The homology construction we described above is called simplicial homology as it is based on the structure of a simplicial complex: a space assembled using simplices. However, there are settings in which alternative shapes of basic building blocks appear to be more suitable. One such setting is image analysis, where we work with an image or a video consisting of pixels.
In this setting it would be natural to consider pixels as the building blocks. This leads to a new construction [63] of complexes and homology: cubical complexes and cubical homology. We will restrict ourselves to the setting of two-dimensional images, meaning the pixels are chosen from a fixed grid. The construction could easily be generalized to three-dimensional (movies of 2-D images or a 3-D image), four-dimensional (movies of 3-D images) or higher-dimensional images with different shapes of grids and cubes, or even without a fixed grid.

Figure 7.28: A $4 \times 4$ image consisting of grey pixels.

[63] Actually, we could build complexes and the corresponding theory from many different shapes of basic building blocks.

Let $n \in \mathbb{N}$ and consider a square grid of size $n \times n$, where $n$ refers to the number of squares along each side, see Figure 7.28. Our image is given by a collection of pixels (grey squares). The first task is to define the building blocks:

• 0-dimensional cubes are the vertices appearing on the grid. There are $(n + 1)^2$ vertices.
• 1-dimensional cubes are the vertical and horizontal edges between vertices appearing on the grid. There are $2n(n + 1)$ edges.
• 2-dimensional cubes are the squares of the grid. There are $n^2$ squares.

A cubical complex $K$ on an $n \times n$ grid is a collection of cubes such that if $\sigma \in K$ and $\tau \subseteq \sigma$, then $\tau \in K$.

Figure 7.29: The collection of all potential cubes.

Our next task is to determine a convenient systematic labelling for the squares, edges and vertices. In the context of simplicial complexes the labels were just the oriented collections of vertices. While the same approach could [64] be used here, there is a more elegant enumeration of the cubes.

[64] Although, the approach would be cumbersome. We would need 4 vertices to describe a square.

Instead of thinking about coordinates in terms of the $n \times n$ grid, we systematically imagine all potential cubes of a complex drawn in a table-like pattern as Figure 7.29 demonstrates. Each cube can be assigned coordinates $(x, y)$ where $x, y \in \{0, 1, 2, \ldots, 2n\}$ according to this pattern. Drawing the corresponding coordinate axes superimposed over the original $n \times n$ grid (Figure 7.30) we see that a pair of coordinates $(x, y)$ represents the cube [65] whose center is $(x, y)$. We additionally define the orientations:

• Each square is oriented with the ordering of its vertices in the positive-rotational order.
• Vertical edges are oriented upwards, horizontal to the right.

Figure 7.30: The assignment scheme.

[65] A square, an edge, or a vertex.

The resulting assignment of coordinates/labels has the following properties (see Figure 7.31):

• If $x, y$ are both odd, then $(x, y)$ is a square.
• If exactly one of $x, y$ is odd, then $(x, y)$ is an edge.
• If $x, y$ are both even, then $(x, y)$ is a vertex.

Figure 7.31: The assignment scheme, with sample labels: square $(3, 5)$, edge $(7, 4)$, vertex $(4, 2)$, edge $(0, 1)$.

We are now in a position to define cubical homology. The structure of the definition is the same as for simplicial homology, with the only essential difference in the boundary map.

Let $K$ be a cubical complex and choose [66] a field of coefficients $F$. For each $q \in \{0, 1, 2\}$ let $n_q$ denote the number of $q$-cubes in $K$.

[66] We could also choose the coefficients from an Abelian group; the construction would be analogous.

1. A $q$-chain is a formal sum $\sum_{i=1}^{n_q} a_i \sigma_i^q$ where $a_i \in F$ and $\sigma_i^q$ is an oriented cube of dimension $q$ in $K$.
2. The chain group $C_q(K; F) \cong F^{n_q}$ is the vector space of all $q$-chains. Its generators are oriented $q$-cubes of $K$.
3. For each $p \in \mathbb{N}$ the boundary map $\partial_p : C_p(K; F) \to C_{p-1}(K; F)$ is the linear map defined [67] by the following rules:
   If $x$ and $y$ are both even (vertex): $\partial_p(x, y) = 0$.
   If $x$ is odd and $y$ is even (horizontal edge): $\partial_p(x, y) = (x + 1, y) - (x - 1, y)$.
   If $x$ is even and $y$ is odd (vertical edge): $\partial_p(x, y) = (x, y + 1) - (x, y - 1)$.
   If $x$ and $y$ are both odd (square): $\partial_p(x, y) = (x + 1, y) - (x, y + 1) - (x - 1, y) + (x, y - 1)$.
   As before, $\partial^2 = 0$.

[67] The map encodes the geometric boundary.

Note: The operations between coordinates in the boundary map are formal summations and subtractions in the chain group and should not be considered as operations on pairs. The coordinates $(x, y)$ are only labels and shouldn't be added to or subtracted from each other. For example, $(0, 0) - (2, 0)$ is a formal chain consisting of two vertices with coefficients $1$ and $-1$, while the label $(-2, 0)$ is undefined.

4. The collection of chain groups bound together by the boundary homomorphisms is the chain complex:
$$\cdots \xrightarrow{\partial} C_n(K; F) \xrightarrow{\partial} C_{n-1}(K; F) \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_1(K; F) \xrightarrow{\partial} C_0(K; F) \xrightarrow{\partial} 0$$
5. For each $q \in \{0, 1, \ldots\}$ we define groups:
• $q$-cycles as $Z_q(K; F) = \ker \partial_q \leq C_q(K; F)$.
• $q$-boundaries as $B_q(K; F) = \operatorname{Im} \partial_{q+1} \leq Z_q(K; F) \leq C_q(K; F)$.
• cubical $q$-homology group as the quotient $H_q(K; F) = Z_q(K; F) / B_q(K; F)$.

It turns out that the cubical homology of a cubical complex $K$ is isomorphic to the homology of the simplicial complex obtained by subdividing the cubes into simplices. In particular, the homology detects components, holes, and (in the case of higher dimensional cubical complexes) higher-dimensional holes as simplicial homology would.

Figure 7.32: The cubical homology of the above image is given by $H_0 \cong F$ (one component) and $H_1 \cong F$ (one hole).

8 Homology: impact and computation by parts

Homology as defined in the previous chapter is an invariant assigned to a simplicial complex. While its homotopy invariance and computational amenability make homology a suitable tool for computational purposes, the structural depth of the underlying theory goes far beyond the presented material. In this chapter we present some further properties of homology.
The first one is functoriality and its impact on significant topological results from the beginning of the twentieth century: the Brouwer fixed point theorem, the hairy ball theorem, and invariance of domain. The second property is the ability to combine homology computations of two parts of a space in order to deduce the homology of the whole space.

8.1 Impact

One of the fundamental tasks of mathematics is the construction of new objects (invariants) assigned to known objects. For example, given a closed surface we can assign to it a triangulation. In turn, we can assign homology groups to the obtained triangulation.

It turns out to be very beneficial if such an assignment can be extended in a consistent manner to maps between the objects as well. When this is the case, we say the assignment is functorial [1]. It turns out that homology is functorial, as Proposition 8.1.2 demonstrates.

[1] Functoriality and its formal consequences are studied within category theory.

Functoriality of homology

Definition 8.1.1. Suppose $f : K \to L$ is a simplicial map between simplicial complexes, $q \in \{0, 1, \ldots\}$, and $F$ is a field. The induced maps $f_\#$ and $f_*$ are defined as follows:

• $f_\# : C_q(K; F) \to C_q(L; F)$ is the linear map of chain groups defined as
$$f_\#\Big(\sum_i a_i \sigma_i^q\Big) = \sum_{\{i \,\mid\, \dim(f(\sigma_i)) = q\}} a_i f(\sigma_i), \qquad a_i \in F,\ \sigma_i \in K.$$
• $f_* : H_q(K; F) \to H_q(L; F)$ is the linear map defined as $f_*([\alpha]) = [f_\#(\alpha)]$.

Note: We refrain from specifying $q$ and $F$ in the notation $f_*$ in order not to overload it with indices. As such, $f_*$ represents the induced map on homology in any dimension and with any coefficients. The relevant choice of the dimension(s) and coefficients should always be apparent from the context.

Note: The induced maps in the case of coefficients in a group are homomorphisms and are still well defined.

Note: Identity maps between spaces induce identity maps on homology.

Comments on Definition 8.1.1 using the notation established in it:
1. Given a simplex $\sigma \in K$ of dimension $q$, its image $f(\sigma)$ is a simplex of dimension $q$ or less. The condition on dimension in the definition of $f_\#$ means that only the images of those simplices $\sigma_i$ which are of full dimension $q$ are taken into account. In particular,
$$f_\#(\sigma) = \begin{cases} f(\sigma), & \dim(f(\sigma)) = q, \\ 0, & \text{else.} \end{cases}$$
2. The induced map $f_*$ turns out to be well defined, i.e., if $[\alpha] = [\beta]$ then $f_*([\alpha]) = f_*([\beta])$.
3. Homotopic maps induce the same maps on homology.
4. Suppose $X$ and $Y$ are metric spaces with triangulations $K$ and $L$. By the simplicial approximation theorem there exists for each continuous map $f_1 : X \to Y$ a simplicial map $f_2$ between some subdivisions of $K$ and $L$, such that $f_1 \simeq f_2$. Whenever we mention the homology of $X$, we formally think of the homology of $K$. In a similar manner, whenever we talk about the maps on homology induced by a continuous map $f_1$, we formally think [2] of maps induced by the simplicial map $f_2$.

Note: Constant maps between spaces induce trivial (i.e., zero) maps on homology in positive dimensions.

Figure 8.1: An embedding $f : K \hookrightarrow L$. While the first homology groups of $K$ and $L$ are of dimension 2, the image $f_*(H_1(K; F))$ is of dimension 1, demonstrating that the embedding preserves only one hole. This interpretation will be significantly expanded within the context of persistent homology.

[2] With this explanation, the notion of a map on homology induced by a continuous map between spaces $X$ and $Y$ is well defined.

The induced maps are consistent with respect to compositions [3], as the following proposition explains.

[3] Formally speaking we express this property by saying that homology is functorial.

Proposition 8.1.2. [Functoriality of the induced maps] Suppose maps $f : K \to L$ and $g : L \to M$ between simplicial complexes are simplicial. Then for each $q \in \{0, 1, \ldots\}$ and for each $F$ we have
$$(g \circ f)_\# = g_\# \circ f_\#, \quad \text{and} \quad (g \circ f)_* = g_* \circ f_*.$$
The proof follows straight from the definition.
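The chain-level identity in Proposition 8.1.2 can be checked by hand on a small example. The following is a minimal sketch, not part of the original text: it assumes that an oriented simplex is stored as a sorted tuple of integer vertices, that a chain is a dictionary from such tuples to integer coefficients, and that a simplicial map is a dictionary on vertices. The helper names `perm_sign`, `induced_chain_map` and `compose` are hypothetical.

```python
def perm_sign(seq):
    """Sign of the permutation that sorts seq (entries assumed distinct)."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign


def induced_chain_map(f, chain):
    """Apply f_# to a chain given as {sorted vertex tuple: coefficient}.

    Simplices whose image has lower dimension contribute 0, as in
    Definition 8.1.1. The sign accounts for the orientation of the image.
    """
    result = {}
    for simplex, coeff in chain.items():
        image = tuple(f[v] for v in simplex)
        if len(set(image)) < len(image):
            continue  # image is degenerate (lower-dimensional)
        key = tuple(sorted(image))
        result[key] = result.get(key, 0) + perm_sign(image) * coeff
    return {s: c for s, c in result.items() if c != 0}


def compose(g, f):
    """Vertex map of the composition g after f."""
    return {v: g[f[v]] for v in f}


# Assumed example: K = L = M = hollow triangle on the vertices 0, 1, 2.
edges = [(0, 1), (1, 2), (0, 2)]
f = {0: 1, 1: 2, 2: 0}   # rotation of the triangle
g = {0: 0, 1: 0, 2: 2}   # collapses the edge <0, 1> to the vertex 0
alpha = {(0, 1): 1, (1, 2): 1, (0, 2): -1}  # the 1-cycle around the triangle

functorial = all(
    induced_chain_map(compose(g, f), {e: 1})
    == induced_chain_map(g, induced_chain_map(f, {e: 1}))
    for e in edges
)
print("(g o f)_# == g_# o f_# on all edges:", functorial)
print("f_#(alpha) =", induced_chain_map(f, alpha))
```

Since both sides are linear, agreement on the basis edges gives agreement on all 1-chains. The rotation `f` maps the cycle `alpha` to itself, while `g` sends it to the zero chain, so only one of the two maps preserves the hole.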
One of the most natural demonstrations of the power of functoriality concerns the existence of retractions. Given a space $X$ and its subspace [4] $A \subset X$, a retraction of $X$ to $A$ is any continuous map $f : X \to A$ such that $f(a) = a$, $\forall a \in A$. For example, the radial map from a two-dimensional Euclidean ball with the center removed to its boundary sphere is a retraction, see Figure 8.7.

Figure 8.2: Geometric intuition dictates that if we want to retract $B^2$ onto $S^1 = \partial B^2$, the resulting map would need to have a discontinuity, i.e., at least one point where we "tear" the disc. A fairly simple proof of this fact is given using homology.

[4] A required condition for the existence of a retraction is for $A$ to be closed in $X$.

Example 8.1.3. For each $n \in \mathbb{N}$ the standard $(n-1)$-sphere $S^{n-1}$ is the boundary of the standard $n$-ball $B^n$. We claim there is no retraction $B^n \to S^{n-1}$. As a special case, there is no retraction of the unit interval onto its endpoints.

Proof. For $n = 1$ the claim holds because the continuous image of the connected interval is connected, while $S^0$ consists of two points. Let $n \geq 2$ and assume such a retraction $f : B^n \to S^{n-1} = \partial B^n$ exists. Pre-compose it with the inclusion $g : S^{n-1} \hookrightarrow B^n$, see Figure 8.3. Let $[\alpha] \neq 0$ be a basis (generator) of $H_{n-1}(S^{n-1}; F)$. We combine two observations:

• As $f \circ g : S^{n-1} \to S^{n-1}$ is the identity, $(f \circ g)_*([\alpha]) = [\alpha] \neq 0$.
• As $H_{n-1}(B^n; F) = 0$, $g_*([\alpha]) = 0$ and thus $f_*(g_*([\alpha])) = 0$.

By Proposition 8.1.2, $(f \circ g)_*([\alpha]) = f_*(g_*([\alpha]))$, a contradiction. Hence a retraction $f$ does not exist.

Figure 8.3: The proof of Example 8.1.3. The composition of maps $S^1 \xrightarrow{g} B^2 \xrightarrow{f} S^1$ is the identity on $S^1$, while the composition of induced maps $H_1 \cong F \xrightarrow{g_*} H_1 = 0 \xrightarrow{f_*} H_1 \cong F$ can't be the identity as it factors through $0$.

Brouwer fixed point

The Brouwer fixed point theorem is probably one of the most famous early results of topology. It has a surprisingly short proof using the functoriality of homology.

Theorem 8.1.4. Every continuous map $f : B^2 \to B^2$ has a fixed point, i.e., a point $x_0 \in B^2$ such that $f(x_0) = x_0$.

Proof. Assume map $f$ has no fixed point. Define map $g : B^2 \to S^1$ by declaring that for each $x \in B^2$, point $g(x) \in S^1 = \partial B^2$ is the intersection of $S^1$ with the ray starting at $f(x)$ containing $x$, see Figure 8.4. As $f$ has no fixed point, such a ray always exists. Map $g$ is a continuous retraction, a contradiction according to Example 8.1.3. Hence a fixed point exists.

Figure 8.4: Map $g$ from the proof of the Brouwer fixed point theorem.

Hairy ball

Another prominent theorem that can be conveniently proved using homology is the hairy ball theorem. The name comes from a popular adaptation of the result: one can't comb the hair on a hairy ball without creating a hair whorl.

Before we state the theorem we need to clarify a few technical details. Throughout this subsection let $S^2$ denote the unit two-dimensional sphere in $\mathbb{R}^3$.

Figure 8.5: A tangent vector field on a sphere induces a flow presented by the streamlines on this figure. Theorem 8.1.5 states that the vector field must have a zero, which can be demonstrated on our example by the source of streamlines. To the contrary, there are non-vanishing tangent vector fields in the plane and on the torus.

[5] I.e., the triangulation $K$ has the following property: for each simplex $\tau \in K$ its reflection through the point $(0, 0, 0)$ is also a simplex.

1. A tangent vector field on the sphere $S^2$ is a continuous map $f : S^2 \to \mathbb{R}^3$ such that for each $x \in S^2$ we have $x \perp f(x)$. A vector field $f$ is non-vanishing if it is non-zero at each point.
2. Given a centrally symmetric [5] triangulation $K$ of $S^2$, let $\alpha = \sum_{\sigma^{(2)} \in K} \sigma$ be the cycle defined as the sum of consistently oriented triangles of $K$. Without loss of generality we may assume the triangles are oriented so that their "upwards" direction is pointing away from the point $(0, 0, 0)$. Recall that $[\alpha]$ is the fundamental class spanning $H_2(K; \mathbb{R}) \cong \mathbb{R}$.
3. For each triangle $\sigma \in K$ the reflection of $\sigma$ through $(0, 0, 0)$ is again a simplex $\sigma'$ of $K$.
However, if $\sigma$ has the chosen orientation from the previous point [6], then the reflected triangle has the opposite orientation [7] from the originally chosen orientation on $\sigma'$, see the left portion of Figure 8.6. In particular, the cycle $\alpha' = \sum_{\sigma^{(2)} \in K} \sigma'$ represents the non-trivial homology class $[\alpha'] = -[\alpha]$.
4. Let $r : K \to K$ be the reflection map and let $g : K \to K$ be the identity map. By 2. and 3. maps $g$ and $r$ are not homotopic as $g_*([\alpha]) = [\alpha] \neq r_*([\alpha]) = [\alpha']$.

[6] I.e., such that the chosen normal is pointing away from the point $(0, 0, 0)$.
[7] I.e., such that the chosen normal is pointing towards the point $(0, 0, 0)$.

Figure 8.6: Elements of the proof of Theorem 8.1.5. On the left side is the simplex $\langle a, b, c \rangle$ and its (oriented) reflection through $(0, 0, 0)$: $\langle -a, -b, -c \rangle$. Observe that in both cases the normal to the simplex is in the direction $(1, 1, 1)$. The same argument and picture work for spheres of any even dimension (i.e., in Euclidean spaces of odd dimension), which leads to Theorem 8.1.6. On the right side is the construction of the homotopy from the proof of Theorem 8.1.5. Point $x$ is connected to $-x$ by the geodesic passing through $f(x)$ and vice versa.

Theorem 8.1.5. Every tangent vector field on $S^2$ has a zero, i.e., there is no non-vanishing tangent vector field on $S^2$.

Proof. Suppose $f$ is a non-vanishing vector field on $S^2$. Without loss of generality [8] we can assume $\|f(x)\| = 1$, $\forall x \in S^2$. Using the notation leading to this theorem, we will prove that $g \simeq r$, which is a contradiction by 4. above.

We will construct an explicit homotopy between $g$ and $r$. Such a homotopy can be thought of as a continuous collection [9] of paths from $x$ to $-x$ for all [10] $x \in S^2$.

[8] I.e., by normalizing each vector in the image of $f$.
[9] A homotopy in question is of the form $H : S^2 \times [0, 1] \to S^2$. For each $y \in S^2$ the restriction $H|_{\{y\} \times [0,1]}$ is thus a path from $y$ to $-y$.
The fact that $H$ is continuous means that the collection of such paths is continuous.
[10] While the homology setup above is performed in the simplicial setting of $K$, the homotopy here will be constructed on a "smooth" sphere $S^2$.

The simplest way to connect two diametrically opposite points on a sphere (for the sake of simplicity let us assume we are connecting the north pole $N$ to the south pole $S$) is by drawing a meridian between them. Such a meridian is completely determined by the point at which it intersects the equator. We define this intersection point to be [11] $f(N)$.

[11] Recall that $\|f(N)\| = 1$ and $f(N) \perp N$, so $f(N)$ lies on the equator.

In general, connect $x$ to $-x$ by a geodesic [12] on the sphere passing through $f(x)$. Explicitly, one may take $H(x, t) = \cos(\pi t)\, x + \sin(\pi t)\, f(x)$, which has values in $S^2$ because $x \perp f(x)$ and $\|x\| = \|f(x)\| = 1$. This is a continuous assignment of paths and constitutes the homotopy between $g$ and $r$, which completes the proof.

Note: A geodesic on $S^2$ is the shortest path between two points on $S^2$. Geodesics between $N$ and $S$ are meridians.

[12] This geodesic traces the trail of $x$ as translated by the resulting homotopy. On the other hand, the trail of $-x$ as translated by the resulting homotopy is given by the geodesic from $-x$ to $x$ passing through $f(-x)$. See the right portion of Figure 8.6 for a sketch.

The argument of Theorem 8.1.5 works for any even dimension, which leads to a more general result.

Theorem 8.1.6. The sphere $S^n$ admits a non-vanishing tangent vector field iff $n$ is odd.

When $n = 2m - 1$ is odd there is an easy construction of a non-vanishing tangent field on $S^n \subset \mathbb{R}^{2m}$:
$$(x_1, y_1, x_2, y_2, \ldots, x_m, y_m) \mapsto (y_1, -x_1, y_2, -x_2, \ldots, y_m, -x_m).$$

Invariance of domain

The last classical result we mention explains why Euclidean balls of different dimensions are fundamentally different in the sense that they can't be homeomorphic [13].

[13] While homology itself is a homotopy invariant, the trick we will use will allow us to use it to differentiate homeomorphism types of spaces.

Note: As a consequence of Theorem 8.1.7, $D^n \not\cong D^m$ if $m \neq n$.

Theorem 8.1.7. For any pair of natural numbers $m \neq n$ the closed balls $B_1 = B_{\mathbb{R}^n}(0, 1)$ and $B_2 = B_{\mathbb{R}^m}(0, 1)$ are not homeomorphic.
Proof. Assume there exists a homeomorphism $f : B_1 \to B_2$. Without loss of generality we may assume $n > m$ (otherwise exchange the roles of the two balls by considering $f^{-1}$). Then $f|_{B_1 \setminus \{0\}} : B_1 \setminus \{0\} \to B_2 \setminus \{f(0)\}$ is also a homeomorphism. Recall that $B_1 \setminus \{0\} \simeq S^{n-1}$ via the radial projection (see Figure 8.7), which means $H_{n-1}(B_1 \setminus \{0\}; F)$ is non-trivial for any $F$. On the other hand, $B_2 \setminus \{f(0)\}$ is either:

• homotopy equivalent to $S^{m-1}$ if $f(0) \notin \partial B_2$, or
• contractible if $f(0) \in \partial B_2$.

In both cases $H_{n-1}(B_2 \setminus \{f(0)\}; F) = 0$, a contradiction.

Note: The same argument gives $\mathbb{R}^n \not\cong \mathbb{R}^m$ if $m \neq n$.

Figure 8.7: Radial projection of a disc with the center removed to the boundary of the disc. The induced homotopy equivalence demonstrates $B_1 \setminus \{0\} \simeq S^{n-1}$ in the proof of Theorem 8.1.7.

8.2 Homology by parts

Given a decomposition of a simplicial complex $X = A \cup B$ as the union of subcomplexes $A$ and $B$, can [14] we compute the homology of $X$ from the homology of $A$ and $B$?

[14] A similar question: Given a finite set $Y = C \cup D$, can we determine the cardinality $|Y|$ from $|C|$ and $|D|$? The answer $|Y| = |C| + |D| - |C \cap D|$, which also includes the intersection, is not unlike the answer to our question about homology, especially since, for discrete sets, the cardinality represents the zero-dimensional homology.

The answer to this question is unfortunately negative, for example:

• As the sidenote on a similar question demonstrates, the cumulative zero-dimensional homology of $A$ and $B$ may be too large and should possibly be decreased by the zero-dimensional homology of the intersection.
• On the other hand, a circle is the union of two 1-discs, i.e., a space with a one-dimensional hole is the union of two subspaces without holes.

These two examples show that $A$ and $B$ can have cumulatively "too much" or "too little" homology to deduce the homology of the union $X$, and that one should probably take into account the homology of the intersection as well.
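The circle example can be made concrete with a short computation. The sketch below is an illustration under stated assumptions rather than the book's algorithm: the circle $X$ is modelled as the boundary of a square on the vertices 0, 1, 2, 3, the parts $A$ and $B$ are two arcs meeting in the vertices 0 and 2, and Betti numbers over the rationals are obtained from $b_p = n_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$. The helper names `rank` and `betti_graph` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def betti_graph(vertices, edges):
    """Return (b0, b1) of a 1-dimensional complex with rational coefficients."""
    index = {v: i for i, v in enumerate(vertices)}
    # One row per oriented edge <u, v>, encoding its boundary v - u.
    # This is the transpose of the boundary matrix; the rank is the same.
    d1 = []
    for u, v in edges:
        row = [0] * len(vertices)
        row[index[u]] = -1
        row[index[v]] = 1
        d1.append(row)
    r = rank(d1)
    return len(vertices) - r, len(edges) - r


circle = betti_graph([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (0, 3)])  # X
arc_a = betti_graph([0, 1, 2], [(0, 1), (1, 2)])                      # A
arc_b = betti_graph([0, 2, 3], [(2, 3), (0, 3)])                      # B
overlap = betti_graph([0, 2], [])                                     # A ∩ B

print("X:", circle, " A:", arc_a, " B:", arc_b, " A∩B:", overlap)
```

In this example neither arc has a hole although the union does, and the intersection has two components. The individual Betti numbers of the parts do not add up to those of $X$, but the alternating sums do: $\chi(X) = \chi(A) + \chi(B) - \chi(A \cap B)$, in line with the cardinality formula from the sidenote.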
The algebraic structure through which the connection between the homologies of $X$, $A$, and $B$ is expressed is that of exact sequences.

Figure 8.8: Two contractible complexes $A$ and $B$, whose union $X$ is not contractible.

Exact sequences

Definition 8.2.1. A sequence of vector spaces $V_0, V_1, \ldots$ and linear maps $\varphi_n \colon V_n \to V_{n-1}$ is exact, if for each $n$ we have $\operatorname{Im} \varphi_{n+1} = \ker \varphi_n$.

[Margin note: Recall that homology is defined from a sequence of chain groups called the chain complex; it is defined as the quotient $\ker \partial / \operatorname{Im} \partial$. In particular, the homology of a chain complex is zero at all dimensions iff the chain complex forms an exact sequence. Or, to put it locally, $H_q = 0$ iff the chain complex is "exact at $C_q$" in the sense that $\operatorname{Im} \partial_{q+1} = \ker \partial_q$. Homology thus measures the extent to which a chain complex is not exact.]

It turns out that in an exact sequence, the dimension of each vector space (except for the last one) can be deduced from the ranks of the neighboring maps.

Proposition 8.2.2. Suppose the following sequence is exact:
$$\cdots \to V_{n+1} \xrightarrow{\varphi_{n+1}} V_n \xrightarrow{\varphi_n} V_{n-1} \to \cdots \to V_2 \xrightarrow{\varphi_2} V_1 \xrightarrow{\varphi_1} V_0.$$
Then for each $n > 0$, $\dim V_n = \operatorname{rank} \varphi_{n+1} + \operatorname{rank} \varphi_n$.

Proof. We know that $\dim V_n = \dim \ker \varphi_n + \operatorname{rank} \varphi_n$. Now use exactness: $\operatorname{Im} \varphi_{n+1} = \ker \varphi_n$, hence $\dim \ker \varphi_n = \operatorname{rank} \varphi_{n+1}$.

Mayer-Vietoris exact sequence

We are now able to express [15] the connection between the homology of $X$ and the homology of its two parts $A$ and $B$, a connection that also includes the homology of the intersection $A \cap B$.

[Sidenote 15: Standard proofs use the zig-zag lemma given in an appendix.]

Theorem 8.2.3. Suppose $A, B \leq X$ are subcomplexes of a simplicial complex $X$ such that $A \cup B = X$. Then for each choice of coefficients the following sequence of homology groups is exact:
$$\cdots \to H_{n+1}(X) \xrightarrow{\delta_{n+1}} H_n(A \cap B) \xrightarrow{(i_n, j_n)} H_n(A) \oplus H_n(B) \xrightarrow{\mu_n} H_n(X) \to \cdots$$
$$\cdots \to H_0(A \cap B) \xrightarrow{(i_0, j_0)} H_0(A) \oplus H_0(B) \xrightarrow{\mu_0} H_0(X) \to 0,$$
with the involved maps defined as follows:

• $i_*, j_*$ are inclusion induced maps, i.e., $i_*[\alpha] = [\alpha]$ and $j_*[\alpha] = [\alpha]$.
• $\mu$ is the subtraction map, i.e., $\mu_*([\alpha], [\beta]) = [\alpha - \beta]$.
• $\delta$ is a variant of a boundary map defined as follows. Given an $n$-cycle $\alpha$ in $X$, decompose it as $\alpha = \alpha_A + \alpha_B$ where $\alpha_A$ is an $n$-chain in $A$ and $\alpha_B$ is an $n$-chain in $B$. Define $\delta[\alpha] = [\partial \alpha_A]$ as the homology class corresponding to the boundary of the chain $\alpha_A$.

Figure 8.9: A decomposition of $S^1$ into two 1-discs $A$ and $B$ with intersection $A \cap B$.

Example 8.2.4. We will compute the homology of $S^1$ with coefficients in a field $F$. Express $X = S^1$ as the union of two 1-discs $A$ and $B$ as Figure 8.9 suggests. The only non-trivial part of the corresponding Mayer-Vietoris sequence is the following:
$$H_1(A) \oplus H_1(B) \to H_1(X) \to H_0(A \cap B) \to H_0(A) \oplus H_0(B) \to H_0(X) \to 0,$$
which is of the form
$$0 \to H_1(X) \xrightarrow{\delta_1} F^2 \xrightarrow{(i_0, j_0)} F^2 \xrightarrow{\mu_0} H_0(X) \xrightarrow{\delta_0} 0,$$
since $A \cap B$ has two components. We proceed by the following sequence of deductions:

1. $\operatorname{rank} \delta_0 = 0$ as it is the trivial map.
2. Map $\mu_0$ is of rank [16] 1.
3. By Proposition 8.2.2 we get $\dim H_0(S^1) = \operatorname{rank} \mu_0 + \operatorname{rank} \delta_0 = 1$.
4. By exactness and observation 2 we have $\dim \operatorname{Im}(i_0, j_0) = \dim \ker \mu_0 = 1$, hence $\operatorname{rank}(i_0, j_0) = 1$.
5. By exactness and the previous item we have $\dim \operatorname{Im} \delta_1 = \dim \ker(i_0, j_0) = 1$, hence $\operatorname{rank} \delta_1 = 1$.
6. By Proposition 8.2.2 we get $\dim H_1(S^1) = 0 + \operatorname{rank} \delta_1 = 1$.
7. All higher homology groups (for $n > 1$) are trivial as they appear as $\cdots 0 \to H_n(X) \to 0 \cdots$ in the exact sequence which, by Proposition 8.2.2, means they are trivial.

[Sidenote 16: Recall that $\mu(u, v) = u - v$. Its rank is either 0, 1, or 2. It can't be 0 as the map is nontrivial. It can't be 2, as it has a non-trivial kernel generated by $(u, u)$ since $A$ and $B$ are in the same component of $X$.]

Remark 8.2.5. In the same manner we could compute the homology groups of $S^m$ for each $m$ by observing that it can be decomposed as the union of two hemispheres ($m$-discs) whose intersection is homotopy equivalent to $S^{m-1}$, see Figure 8.10.

Figure 8.10: A decomposition of $S^2$ into two discs, whose intersection is $S^1$.

Example 8.2.6. In a similar manner we can compute the homology of the torus $X$ with coefficients in any field $F$. We will only mention how to compute its first homology as the homology groups of other dimensions are already known. We will use the decomposition of Figure 8.11. The relevant part of the Mayer-Vietoris sequence is
$$H_1(A) \oplus H_1(B) \to H_1(X) \to H_0(A \cap B) \to H_0(A) \oplus H_0(B) \to H_0(X) \to 0,$$
which is of the form
$$F^2 \xrightarrow{\mu_1} H_1(X) \xrightarrow{\delta_1} F^2 \xrightarrow{(i_0, j_0)} F^2 \to F \to 0.$$

Figure 8.11: A decomposition of the torus into two parts $A$ and $B$, whose intersection $A \cap B$ is the disjoint union of two copies of $S^1$.

We proceed by the following sequence of deductions:

1. By the same argument as in Example 8.2.4 we have $\operatorname{rank} \delta_1 = 1$.
2. The generators of $H_1(A)$ and $H_1(B)$ are cycles/loops $\alpha$ and $\beta$ respectively. Note that $\alpha \simeq \beta$ in $X$ thus $[\alpha] = [\beta] \in H_1(X)$. Furthermore, as [17] $0 \neq [\alpha] \in H_1(X)$, we have [18] $\operatorname{rank} \mu_1 = 1$.
3. By Proposition 8.2.2 we get $\dim H_1(X) = \operatorname{rank} \mu_1 + \operatorname{rank} \delta_1 = 2$.

[Sidenote 17: An algebraic way to see that $0 \neq [\alpha] \in H_1(X)$ is through the Mayer-Vietoris sequence. Proof. If $[\alpha]$ was trivial in $H_1(X)$ then $([\alpha], 0)$ would be in $\ker \mu_1$. By exactness, this would mean that $([\alpha], 0) \in \operatorname{Im}(i_1, j_1)$. However, $\operatorname{Im}(i_1, j_1)$ is generated by the images of the two obvious cycles in $A \cap B$, each of which maps into $(\pm[\alpha], \mp[\beta])$. Space $\operatorname{Im}(i_1, j_1)$ is thus one-dimensional and generated by $(\pm[\alpha], \mp[\beta])$, hence $([\alpha], 0) \notin \operatorname{Im}(i_1, j_1)$ as $[\alpha] \neq 0$ in $H_1(A)$ and $[\beta] \neq 0$ in $H_1(B)$.]
[Sidenote 18: The map $\mu_1$ is defined as $\mu_1(u, v) = u - v$ in the basis $([\alpha], 0), (0, [\beta])$ of $H_1(A) \oplus H_1(B)$. Its rank is either 0, 1, or 2. It can't be 0 as the map is nontrivial, since $0 \neq [\alpha] \in H_1(X)$. It can't be 2, as it has a non-trivial kernel generated by $(u, u)$ since $[\alpha] = [\beta] \in H_1(X)$.]

8.3 Concluding remarks

Recap (highlights) of this chapter

• Induced maps on homology and functoriality;
• Brouwer fixed point theorem and hairy ball theorem;
• Exact sequences;
• Mayer-Vietoris exact sequence.
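Background and applications

Invariants of homological nature appear throughout topology, geometry and other fields of mathematics. The examples of theoretical applications presented here barely scratch the surface. Some of the settings in which such constructions contributed to significant development include knot theory (Khovanov homology), differential geometry (De Rham cohomology, Floer homology), etc. For further background on theoretical foundations see Hatcher's book [19].

[Sidenote 19: Allen Hatcher. Algebraic topology. Cambridge Univ. Press, Cambridge, 2000.]

The Mayer-Vietoris sequence arises from a decomposition of a space into two pieces. A natural question about a similar result in the context of decompositions into more pieces is treated within the context of spectral sequences, an algebraic formalism far above the reach of our presentation. These theoretical developments allow for a certain level of distributed computation of homology.

Appendix: zig-zag lemma

Lemma 8.3.1. [Zig-zag lemma] Let $F$ be a field of coefficients. Assume the following diagram of vector spaces over $F$ and linear maps [20] is commutative [21]:
$$
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & A_{q+1} & \xrightarrow{\ \alpha\ } & B_{q+1} & \xrightarrow{\ \beta\ } & C_{q+1} & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & A_{q} & \xrightarrow{\ \alpha\ } & B_{q} & \xrightarrow{\ \beta\ } & C_{q} & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & A_{q-1} & \xrightarrow{\ \alpha\ } & B_{q-1} & \xrightarrow{\ \beta\ } & C_{q-1} & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array}
$$
If each row is a short exact sequence, and each column is a chain complex [22], then there exists a long exact sequence of homology groups [23]
$$\cdots \xrightarrow{\alpha_*} H_{q+1}(B) \xrightarrow{\beta_*} H_{q+1}(C) \xrightarrow{\delta} H_q(A) \xrightarrow{\alpha_*} H_q(B) \xrightarrow{\beta_*} H_q(C) \xrightarrow{\delta} H_{q-1}(A) \xrightarrow{\alpha_*} H_{q-1}(B) \xrightarrow{\beta_*} \cdots$$

[Sidenote 20: For the sake of simplicity the indices of maps will be omitted. For example, maps $\alpha_q \colon A_q \to B_q$ are all denoted by $\alpha$ even though they depend on $q$. For the same reason we will refrain from mentioning $F$ again.]
[Sidenote 21: I.e., $\partial \alpha = \alpha \partial$ and $\partial \beta = \beta \partial$.]
[Sidenote 22: I.e., $\partial^2 = 0$.]
[Sidenote 23: I.e., the homology groups arising from the vertical chain complexes. In particular, $H_q(A)$ is the quotient $\ker(A_q \to A_{q-1}) / \operatorname{Im}(A_{q+1} \to A_q)$.]
[Margin note: An exact sequence of the form $0 \to A \to B \to C \to 0$ is called a short exact sequence. In such a situation, map $A \to B$ is injective as its kernel is the trivial image of the map $0 \to A$. On a similar note, $B \to C$ is surjective as its image is the kernel of the map $C \to 0$, which is $C$. As $\operatorname{Im}(B \to C) \cong B / \ker(B \to C)$ we conclude $C \cong B/A$ since equality $\ker(B \to C) = \operatorname{Im}(A \to B)$ holds by exactness.]

The idea of a proof. The proof is performed using the "diagram chasing" technique. We will only prove the existence of the $\delta$ map.

In order to define $\delta$ let us choose a non-trivial cycle $c \in C_{q+1}$. Charted by the margin diagrams 1 to 4, the chase after $\delta([c])$ begins:

Diagram 1: $\partial(c) = 0$ as $c$ is a cycle.

Diagram 2: By the exactness of the row, map $\beta$ is surjective, thus there exists $b_1 \in \beta^{-1}(c)$. Define $b_2 = \partial(b_1)$. By the commutativity, $\beta(b_2) = 0$.

Diagram 3: By the exactness of the row there exists $a_1 \in \alpha^{-1}(b_2)$. Define $\delta([c]) = [a_1]$.

The rest of the proof goes along the same lines. For example, in order to prove $a_1$ is a cycle we use diagram 4:

• Define $a_2 = \partial(a_1)$ and observe $\partial(b_2) = 0$ as $\partial^2 = 0$.
• By the commutativity, $\alpha(a_2) = 0$.
• By the exactness of the row (injectivity of $\alpha$), $a_2 = 0$, hence $a_1$ is a cycle.

[Margin diagrams 1 to 4 chart the chase: (1) $c \mapsto 0$ under $\partial$; (2) $b_1 \mapsto c$ under $\beta$ and $b_1 \mapsto b_2 \mapsto 0$ under $\partial$ and $\beta$; (3) $a_1 \mapsto b_2$ under $\alpha$; (4) $a_1 \mapsto a_2 \mapsto 0$ under $\partial$ and $\alpha$.]

Remark 8.3.2. The construction and proof of the Mayer-Vietoris sequence follows from the zig-zag lemma using the following commutative diagram (using notation of Theorem 8.2.3)
$$
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q+1}(A \cap B) & \xrightarrow{\ \alpha\ } & C_{q+1}(A) \oplus C_{q+1}(B) & \xrightarrow{\ \beta\ } & C_{q+1}(X) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q}(A \cap B) & \xrightarrow{\ \alpha\ } & C_{q}(A) \oplus C_{q}(B) & \xrightarrow{\ \beta\ } & C_{q}(X) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q-1}(A \cap B) & \xrightarrow{\ \alpha\ } & C_{q-1}(A) \oplus C_{q-1}(B) & \xrightarrow{\ \beta\ } & C_{q-1}(X) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array}
$$
with maps $\alpha$ being induced by inclusion, and maps $\beta$ being the subtraction maps [24]. Observe that the rows are short exact sequences.

[Sidenote 24: I.e., $\beta([\gamma_1], [\gamma_2]) = [\gamma_1 - \gamma_2]$.]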
The zig-zag lemma provides a useful template for constructions of exact sequences. Another setting in which it applies is that of relative homology.

Appendix: Relative homology

Let us fix a field $F$, a simplicial complex $K$, and a subcomplex $L \leq K$. Homology construction on $K$ is based on cycles: chains whose boundaries are trivial. The concept of relative homology expands this construction in the following way: given $L \leq K$, the relative homology construction is based on relative cycles, i.e., chains whose boundaries are contained in $L$.

Algebraic specifics of the definition. From the chain complexes of $K$ and $L$ we can construct the quotient chain complex:
$$\cdots \xrightarrow{\partial} C_q(K)/C_q(L) \xrightarrow{\partial} C_{q-1}(K)/C_{q-1}(L) \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_0(K)/C_0(L) \xrightarrow{\partial} 0$$
The relative homology groups $H_q(K, L)$ are the homology groups arising from this chain complex. In particular:
$$H_q(K, L; F) = \frac{\ker\big( C_q(K)/C_q(L) \xrightarrow{\partial} C_{q-1}(K)/C_{q-1}(L) \big)}{\operatorname{Im}\big( C_{q+1}(K)/C_{q+1}(L) \xrightarrow{\partial} C_q(K)/C_q(L) \big)}.$$

[Margin note: Observe that $H_q(K, \emptyset) = H_q(K)$.]

Combining Lemma 8.3.1 and the commutative diagram
$$
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q+1}(L) & \longrightarrow & C_{q+1}(K) & \longrightarrow & C_{q+1}(K)/C_{q+1}(L) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q}(L) & \longrightarrow & C_{q}(K) & \longrightarrow & C_{q}(K)/C_{q}(L) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
0 & \longrightarrow & C_{q-1}(L) & \longrightarrow & C_{q-1}(K) & \longrightarrow & C_{q-1}(K)/C_{q-1}(L) & \longrightarrow & 0 \\
 & & \downarrow \partial & & \downarrow \partial & & \downarrow \partial & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array}
$$
we conclude that relative homology groups fit into the following exact sequence:
$$\cdots \to H_{q+1}(K, L) \to H_q(L) \to H_q(K) \to H_q(K, L) \to \cdots \to H_0(L) \to H_0(K) \to H_0(K, L) \to 0.$$

Relative homology has a geometric meaning, which expands that of the usual homology. Table 8.1 summarizes the relative homology of the pair $(K, L)$ of simplicial complexes from Figure 8.12.

Figure 8.12: Simplicial complex $K$ with vertices $a, b, c, d, e, f, g, h, i, j, k$. Its subcomplex $L \leq K$ contains vertices $a, c, b, d, e, j, k$ and all edges between these vertices. It is depicted by bold red edges.

Table 8.1: The comparison of the homology of $K$ and the relative homology of the pair $(K, L)$ from Figure 8.12.

    q    dim H_q(K)    dim H_q(K, L)
    0    3             1
    1    2             2

Let us geometrically interpret Table 8.1:

Dimension 0: $K$ has three components. However, the relative homology detects only component $[f]$. Homology class $[e]$ is contained in $L$ and is thus trivial by the definition. Homology class $[h]$ is homologous to $[e]$ and thus trivial as well.

Dimension 1: A convenient basis for $H_1(K)$ would consist of $[\langle a, b \rangle + \langle b, e \rangle + \langle e, a \rangle]$ and $[\langle e, i \rangle + \langle i, h \rangle + \langle h, e \rangle]$. A basis for $H_1(K, L)$ however would consist of $[\langle e, i \rangle + \langle i, h \rangle + \langle h, e \rangle]$ and $[\langle j, c \rangle]$. Note that:

• $[\langle a, b \rangle + \langle b, e \rangle + \langle e, a \rangle]$ is trivial in $H_1(K, L)$ as it is contained in $L$.
• $\langle j, c \rangle$ is a cycle in the relative homology chain complex as its boundary is contained in $L$ and thus trivial.

Geometrically we can think of the relative homology $H_*(K, L)$ as the homology of the space obtained from $K$ when the subcomplex $L$ is contracted to a point, see Figure 8.13 for an example. The only exception to this rule is $H_0(K, L)$, whose dimension is one less [25] than the number of the components of the resulting space [26].

Figure 8.13: The space obtained from simplicial complex $K$ from Figure 8.12 by contracting the subcomplex $L$ to a point. The space has two holes but is not a simplicial complex in general.

[Sidenote 25: In the literature this exception is usually encoded in the phrase "reduced homology".]
[Sidenote 26: Note that the resulting space does not inherit the structure of a simplicial complex from $K$. However, it can be triangulated.]

9 Persistent homology: definition and computation

The concept of persistent homology along with its variations is at the forefront of topological data analysis. Mathematically speaking, persistent homology is an obvious extension of homology: the functoriality of homology is applied to a sequence of inclusions. The resulting structure is, somewhat surprisingly, not harder to compute than ordinary homology.
When coupled with the standard constructions of complexes, persistent homology contains not only topological but also geometric information. We will start this chapter by explaining geometric intuition on persistent homology. We will continue by presenting formal and convenient visualisation techniques. We will conclude with a fairly simple algorithm for computation, which one could call a "labelled matrix reduction".

9.1 Definition

We first describe the geometric intuition of persistent homology. Given a "growing" simplicial complex, persistent homology describes the evolution of its holes. As an illustrative example we consider four simplicial complexes $K_1 \leq K_2 \leq K_3 \leq K_4$ of Figures 9.1 and 9.2. Here is how we interpret the corresponding zero-dimensional barcode [1] described by Figure 9.1:

$K_1$: There are two components of $K_1$. This is visualised by the two bars (blue and red) starting at that time. The corresponding homology generators [2] (points) are colored accordingly.

$K_2$: There are three components of $K_2$: this is visualised by the three bars (blue, purple and green) passing from that time on. The corresponding homology generators of the new components are colored accordingly. However, the two components of $K_1$ merge, which we interpret as one of the components of $K_1$ disappearing. We declare [3] that the component disappearing is the red component, which is visualised by the fact that the red bar terminates just before $K_2$. The edge making the connection between the two components is colored in red.

$K_3$: The purple component terminates by connecting to the blue component via two edges, one of which is indicated by the purple color.

$K_4$: There is no change in components as compared to $K_3$, both bars are passing through to infinity.

[Sidenote 1: I.e., the evolution of the components.]
[Sidenote 2: Note that the generator of the red component is unique. On the other hand, we could have chosen any vertex of the other component as a generator and colored it blue.]
[Sidenote 3: As the components appeared at the same time, we might as well have chosen to have the blue bar terminated and keep the red bar going. The uncolored barcode would have remained the same. However, whenever there is a merger of components with different birth times we act according to the elder rule: the older component survives. This will be apparent at $K_3$. The reader may rest assured this is not a product of discrimination but rather a rule that is consistent with the mathematical structure of persistence (especially the interleaving and stability) that will be described later.]

Figure 9.1: Nested simplicial complexes $K_1 \leq K_2 \leq K_3 \leq K_4$ are divided by vertical lines. The horizontal arrows below are called "bars" and form a barcode. They indicate the persistence of zero-dimensional homology classes: components. The left endpoint of each bar corresponds to the birth complex of a component. The right endpoint of each bar corresponds to the terminal complex of a component. The color of each bar also appears on one vertex (the representative of the component) and potentially on one edge (the edge that terminates the component).

Figure 9.2: Nested simplicial complexes $K_1 \leq K_2 \leq K_3 \leq K_4$ and the corresponding one-dimensional homology barcode.

In a similar fashion we interpret the corresponding one-dimensional barcode [4] described by Figure 9.2:

$K_1$: There are no holes and hence no bars passing on.

$K_2$: A blue hole appears inducing a blue bar.

$K_3$: The blue hole becomes trivial by the blue triangle and hence the blue bar terminates. However, two new holes appear, the red one and the green one. Consequently, there are two bars passing from $K_3$ on.

$K_4$: The red hole becomes trivial while the green hole lives on forever, just as the corresponding bar.

[Sidenote 4: I.e., the indicated evolution of the holes.]
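The zero-dimensional reading of a barcode, including the elder rule, can be mimicked in a few lines of code. The following is a minimal sketch and not the algorithm of this chapter: it uses a union-find structure, a standard technique swapped in here because it handles components only. The function name `zero_dim_barcode`, the input format, and the small filtration are hypothetical; the filtration is merely chosen to produce the same pattern of bars as described for Figure 9.1, since the actual complexes of the figure are not reproduced here.

```python
def zero_dim_barcode(vertices, edges):
    """Zero-dimensional bars of a filtration (hypothetical helper).

    vertices: dict mapping a vertex name to the stage at which it appears.
    edges:    list of (stage, u, v); an edge appears no earlier than its endpoints.
    Returns a sorted list of bars (birth, death), death = inf for surviving bars.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of the component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    bars = []
    for stage, u, v in sorted(edges, key=lambda e: e[0]):
        elder, younger = find(u), find(v)
        if elder == younger:
            continue  # the edge closes a loop, no component terminates
        # Elder rule: the component born earlier survives the merger.
        if vertices[elder] > vertices[younger]:
            elder, younger = younger, elder
        if vertices[younger] < stage:  # ignore bars of length zero
            bars.append((vertices[younger], stage))
        parent[younger] = elder

    # Components that never merge into an older one persist forever.
    survivors = {find(v) for v in vertices}
    bars.extend((vertices[r], float("inf")) for r in survivors)
    return sorted(bars)


# Hypothetical filtration: a, b appear at stage 1; c, d at stage 2.
vertices = {"a": 1, "b": 1, "c": 2, "d": 2}
edges = [(2, "a", "b"), (3, "a", "c"), (3, "b", "c")]
print(zero_dim_barcode(vertices, edges))
# [(1, 2), (1, inf), (2, 3), (2, inf)]
```

In this toy input the bar (1, 2) plays the role of the red bar, (2, 3) of the purple bar, and the two infinite bars of the blue and green ones. When two components of equal age merge, the sketch keeps the first one it encounters, which reflects the remark that the choice is arbitrary in that case.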
The goal of this chapter is to present the theoretical background formalizing the presented geometric idea of persistent homology, and to introduce the computational procedure to obtain the barcodes.

Formal definition

We first formally introduce a filtration: a nested sequence of ever larger simplicial complexes modelling a growing simplicial complex.

Definition 9.1.1. Let $K$ be a simplicial complex. A (discrete) filtration of $K$ is a sequence of subcomplexes $K_1 \leq K_2 \leq \ldots \leq K_m = K$.

An example of a filtration is given in Figures 9.1 and 9.2. Persistent homology measures how homology elements [5] persist [6] through steps of a filtration. A filtration of a simplicial complex $K$ in Definition 9.1.1 can be expressed as a sequence of natural inclusion maps denoted by [7] $i_{\cdot,\cdot}$:
$$K_1 \xhookrightarrow{i_{1,2}} K_2 \xhookrightarrow{i_{2,3}} \cdots \xhookrightarrow{i_{m-1,m}} K_m = K.$$
Given a field $F$ and $q \in \{0, 1, 2, \ldots\}$ we can apply homology $H_q(\,\cdot\,; F)$ to obtain a sequence [8] of homology groups connected by linear maps:
$$H_q(K_1; F) \xrightarrow{(i_{1,2})_*} H_q(K_2; F) \xrightarrow{(i_{2,3})_*} \cdots \xrightarrow{(i_{m-1,m})_*} H_q(K_m; F) = H_q(K; F).$$

[Sidenote 5: I.e., components, holes, etc.]
[Sidenote 6: I.e., remain non-trivial.]
[Sidenote 7: For example, $i_{s,t} \colon K_s \hookrightarrow K_t$.]
[Sidenote 8: By the functoriality of the homology we have $(i_{u,t})_* \circ (i_{s,u})_* = (i_{s,t})_*$.]
[Margin note: In each step of a filtration we add simplices. The addition of a single $d$-dimensional simplex in one step may either "terminate" a non-trivial homological element of dimension $d - 1$, create a non-trivial homological element of dimension $d$, or have no effect on homology.]

Definition 9.1.2. Assume $K$ is a simplicial complex, $F$ is a field, and $q \in \{0, 1, 2, \ldots\}$. Given a filtration
$$K_1 \leq K_2 \leq \ldots \leq K_m = K$$
of $K$, the corresponding $q$-dimensional persistent homology groups with coefficients in $F$ are images of the maps
$$(i_{s,t})_* \colon H_q(K_s; F) \to H_q(K_t; F)$$
for all $0 \leq s \leq t \leq m$. The corresponding ranks $\beta^q_{s,t} = \operatorname{rank}(i_{s,t})_*$ are called persistent Betti numbers.

[Margin note: Note that $\beta^q_{s,t}$ is a non-increasing function in $t$ and a non-decreasing function in $s$.]

As is the case with ordinary homology with coefficients in a field, each persistent homology group is determined [9] up to isomorphism by its Betti number. A single filtration results in a table of persistent Betti numbers.

[Sidenote 9: While the rank of $(i_{s,t})_*$ determines the image of the map $(i_{s,t})_*$ up to isomorphism, it does not determine a specific $\beta_{s,t}$-dimensional subspace of $H_q(K_t; F)$. In this aspect persistent homology as a specific subgroup of $H_q(K_t; F)$ contains more information than persistent Betti numbers, i.e., its basis consists of homology representatives spanning the persistent homology group.]

Example 9.1.3. Given any field $F$ the following are the tables of the zero-dimensional and one-dimensional persistent Betti numbers of the filtration of Figure 9.3:

    beta^0_{s,t}   t=1  t=2  t=3  t=4        beta^1_{s,t}   t=1  t=2  t=3  t=4
    s=1             2    1    1    1         s=1             0    0    0    0
    s=2             /    3    2    2         s=2             /    1    0    0
    s=3             /    /    2    2         s=3             /    /    2    1
    s=4             /    /    /    2         s=4             /    /    /    1

Table 9.1: The table of persistent Betti numbers corresponding to the filtration of Figure 9.3. The diagonal entries coincide with the Betti numbers of the corresponding stages of the filtration. The sub-diagonal entries (marked by /) are undefined.

Let us demonstrate how to interpret these numbers geometrically:

• $\beta^0_{2,3} = 2$ means that two of the different components of $K_2$ are still disconnected from each other in $K_3$.
• $\beta^1_{3,4} = 1$ roughly means that only one homologically non-trivial loop of $K_3$ is still [10] homologically non-trivial in $K_4$.
• $\beta^1_{2,3} = 0$ means all one-dimensional homology elements in $H_1(K_2; F)$ are homologically trivial in $K_3$.

[Sidenote 10: A mathematically correct statement would be: the space of one-dimensional homology elements in $H_1(K_4; F)$ which have representatives in $C_1(K_3; F)$ is of dimension one.]

Figure 9.3: A filtration $K_1 \leq K_2 \leq K_3 \leq K_4$ along with the corresponding Betti numbers of each of the stages and the zero-dimensional barcode.
[Figure 9.3 stage labels: $K_1$: $\beta_0 = 2$, $\beta_1 = 0$; $K_2$: $\beta_0 = 3$, $\beta_1 = 1$; $K_3$: $\beta_0 = 2$, $\beta_1 = 2$; $K_4$: $\beta_0 = 2$, $\beta_1 = 1$.]

While the tables of persistent Betti numbers are useful, there are other ways to visualize the evolution of homology groups through a filtration. One such visualization we have already presented is the barcode.

9.2 Visualization

Throughout this section we fix a field $F$, $q \in \{0, 1, \ldots\}$, a filtration $K_1 \leq K_2 \leq \ldots \leq K_m = K$, and $1 \leq s < t \leq m$.

Barcodes

Barcodes have been geometrically introduced above. In this subsection we will provide their formal definition.

Persistent Betti number $\beta^q_{s,t}$ represents the dimension of the subspace of homology elements in $K_t$ that have a representative in $K_s$. Putting it differently, $\beta^q_{s,t}$ indicates the dimension of the collection [11] of homology elements in $K_s$ that are still non-trivial in $K_t$ in the sense that $\beta^q_{s,t} = \dim H_q(K_s; F) / \ker (i_{s,t})_*$. Barcodes as indicated above, however, have more specific information: a bar $[s, t)$ represents a homology element that is born precisely at $s$ and terminates precisely at $t$. Let us phrase this formally:

1. The number of bars containing $s$ and passing through $t$ equals [12] $\beta^q_{s,t}$.
2. Homology born at $s$ is defined as [13] $H_q(K_s; F) / \operatorname{Im}(i_{s-1,s})_*$. Its dimension is $\beta_{s,s} - \beta_{s-1,s}$ and represents the number of bars starting at $s$.
3. Homology terminating at $t$ is defined as $\ker (i_{t-1,t})_*$. Its dimension [14] is $\beta_{t-1,t-1} - \beta_{t-1,t}$ and represents the number of bars terminating at $t$.
4. The quantity $\beta_{s,t} - \beta_{s-1,t}$ represents [15] the dimension of homology born at $s$ which is still alive at $t$. It represents the number of bars starting at $s$ which are passing through $t$.
5. Quantity $n_{s,t} = \beta_{s,t-1} - \beta_{s-1,t-1} - (\beta_{s,t} - \beta_{s-1,t})$ represents [16] the dimension of homology born at $s$ which terminates at $t$. It represents the number of bars starting at $s$ and terminating at $t$.
6. We additionally define $n_{s,\infty} = \beta_{s,m} - \beta_{s-1,m}$, which represents the dimension of homology born at $s$ which is still alive at the end of the filtration.

[Sidenote 11: This collection is formally not a linear subspace. In a formal setting we represent it as the quotient linear subspace appearing at the end of the sentence.]
[Sidenote 12: Through the rest of the section we will drop the superscript $q$ indicating the fixed dimension.]
[Sidenote 13: For formal reasons we define $(i_{0,t})_*$ to be the trivial map.]
[Sidenote 14: Using the fact that $\ker (i_{t-1,t})_* \cong H_q(K_{t-1}; F) / \operatorname{Im}(i_{t-1,t})_*$.]
[Sidenote 15: Compare to the interpretation of persistent Betti numbers above. Also note that $\beta_{s,t} - \beta_{s-1,t} = \dim\big(\operatorname{Im}(i_{s,t})_* / \operatorname{Im}(i_{s-1,t})_*\big)$, i.e., the dimension of the homology elements in $H_q(K_t; F)$ that have a representative in $K_s$ modulo the ones that have a representative in $K_{s-1}$.]
[Sidenote 16: Observation 4 interprets this formula as [the dimension of homology born at $s$ which is alive at $t - 1$] - [the dimension of homology born at $s$ which is still alive at $t$].]

The $q$-dimensional barcode [17] consists of intervals [18] of the form

i. $[s, t)$ for $1 \leq s < t \leq m$, and
ii. $[s, \infty)$ for $1 \leq s \leq m$.

A barcode can have multiple [19] copies [20] of each interval. Fixing $1 \leq s < t \leq m$:

• The number of the intervals $[s, t)$ is denoted by $n_{s,t}$.
• The number of the intervals $[s, \infty)$ is denoted by $n_{s,\infty}$.

[Sidenote 17: ...of the chosen filtration with coefficients in $F$...]
[Sidenote 18: In the setting of a barcode these intervals will be called bars.]
[Sidenote 19: ...or none...]
[Sidenote 20: Alternatively, we could think of the barcode as the collection of all possible intervals of the forms (i) and (ii), each with an assigned multiplicity from $\{0, 1, 2, \ldots\}$.]

Example 9.2.1. We again turn our attention to the familiar filtration in Figure 9.3. From the margin table of $\beta^0_{s,t}$ we can deduce that
$$n_{2,3} = \beta_{2,2} - \beta_{1,2} - (\beta_{2,3} - \beta_{1,3}) = 3 - 1 - (2 - 1) = 1$$
and as a result there is 1 bar of the form $[2, 3)$, as displayed in the figure. In a similar fashion we compute $n_{1,2} = n_{1,\infty} = n_{2,\infty} = 1$ and $n_{1,3} = n_{1,4} = n_{2,4} = n_{3,4} = n_{3,\infty} = n_{4,\infty} = 0$.

    beta^0_{s,t}   t=1  t=2  t=3  t=4
    s=1             2    1-   1+   1
    s=2             /    3+   2-   2
    s=3             /    /    2    2
    s=4             /    /    /    2
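The formulas of items 5 and 6 are easy to evaluate mechanically. The following minimal sketch applies them to the zero-dimensional table of Example 9.1.3 and recovers the multiplicities listed in Example 9.2.1. The function name `multiplicities` and the dictionary layout are hypothetical choices made for this illustration; the convention $\beta_{0,t} = 0$ encodes that $(i_{0,t})_*$ is defined to be the trivial map.

```python
INF = float("inf")


def multiplicities(beta, m):
    """Multiplicities n_{s,t} and n_{s,inf} from persistent Betti numbers.

    beta: dict mapping (s, t) with 1 <= s <= t <= m to beta_{s,t}.
    Returns a dict mapping (s, t) or (s, INF) to the number of such bars.
    """
    def b(s, t):
        # beta_{0,t} = 0, since (i_{0,t})_* is the trivial map by convention
        return 0 if s == 0 else beta[(s, t)]

    n = {}
    for s in range(1, m + 1):
        for t in range(s + 1, m + 1):
            # born at s and alive at t-1, minus born at s and still alive at t
            n[(s, t)] = b(s, t - 1) - b(s - 1, t - 1) - (b(s, t) - b(s - 1, t))
        n[(s, INF)] = b(s, m) - b(s - 1, m)
    return n


# Zero-dimensional table of Example 9.1.3: row s lists beta_{s,s}, ..., beta_{s,4}.
rows = {1: [2, 1, 1, 1], 2: [3, 2, 2], 3: [2, 2], 4: [2]}
beta0 = {(s, s + k): v for s, row in rows.items() for k, v in enumerate(row)}

n = multiplicities(beta0, m=4)
bars = sorted(key for key, count in n.items() if count > 0)
print(bars)
# [(1, 2), (1, inf), (2, 3), (2, inf)]
```

All four non-zero multiplicities equal 1, and every other entry of `n` is 0, in agreement with Example 9.2.1.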
[Margin table of Example 9.2.1, computing $n_{2,3}$ of Figure 9.3: the coloring corresponds to the defining formula and the subscripts to the signs within.]

A barcode represents the persistences of homology elements. The longer a bar, the longer the corresponding homology element persists. In most settings the longer persistence of a homology element also means higher importance [21]. However, there are also settings in which information is contained in shorter bars, especially when the bars are numerous and specifically distributed.

[Sidenote 21: I.e., a more prominent topological feature.]

Persistence diagrams

Another established method of visualisation of persistent homology is the persistence diagram, defined as follows. Given a barcode as defined above we can think of an interval $[s, t)$ as a pair of numbers and visualize [22] it as a point $(s, t) \in \mathbb{R}^2$. A point of the form $(s, \infty)$ obviously can't be drawn in a plane so we choose a $y$-coordinate above $m$, perhaps most conveniently $m + 1$, to act as a representative of $\infty$, i.e., a bar $[s, \infty)$ corresponds to a point $(s, m + 1)$. Each point $(s, t)$ of a persistence diagram has an assigned multiplicity $n_{s,t}$, which represents the number of bars of the form $[s, t)$. In the case of $(s, m + 1)$, the multiplicity is $n_{s,\infty}$.

[Sidenote 22: Just as there can be more bars with the same endpoints in a barcode, there can be more copies of the same point visualized at the same location in a persistence diagram. While multiple such intervals can be visualized in a vertical stack, the same can not be done with points. For this reason we always consider a point $(s, t)$ in a persistence diagram as a weighted point with weight (multiplicity) $n_{s,t}$.]

The result is a collection of weighted points in the plane called a persistence diagram. An example is provided in Figure 9.4. A barcode encodes precisely the same information as a persistence diagram.
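The passage from a barcode to a persistence diagram is a simple bookkeeping step, sketched below under stated assumptions. The function name `persistence_diagram` and the parameter `last_stage` (the index of the final stage of the filtration) are hypothetical; infinite bars are drawn one unit above the last stage, as suggested in the text. The input bars are those of Example 9.2.1, with one bar repeated purely to illustrate how multiplicities arise.

```python
from collections import Counter


def persistence_diagram(bars, last_stage):
    """Weighted points of a persistence diagram (hypothetical helper).

    bars: iterable of (s, t), with t = float('inf') for bars of the form [s, inf).
    Returns a Counter mapping each point to its multiplicity.
    """
    points = Counter()
    for s, t in bars:
        # An infinite bar is drawn at height last_stage + 1.
        y = last_stage + 1 if t == float("inf") else t
        points[(s, y)] += 1
    return points


inf = float("inf")
bars = [(1, 2), (1, inf), (2, 3), (2, inf)]
diagram = persistence_diagram(bars, last_stage=4)

# A repeated bar (hypothetical) becomes a single point of weight 2.
weighted = persistence_diagram(bars + [(2, 3)], last_stage=4)
```

The conversion loses nothing: each weighted point can be expanded back into the corresponding number of bars, which is the sense in which the two visualizations carry the same information.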
While the persistence of a bar is measured by its length, the persistence of a point on a persistence diagram is measured by its distance from the diagonal $\Delta = \{(x, x) \mid x \in \mathbb{R}\}$. All points of a persistence diagram lie above $\Delta$.

[Margin note: Theoretically speaking, if there existed bars $[s, s)$ of length zero, then these would have been the shortest bars. They would have corresponded to diagonal points $(s, s)$. This point of view will come in handy in the next chapter in the context of stability.]

Persistence diagrams are often the chosen method of visualization when it comes to representation of persistent homology. Especially when the number of points and bars is large, their distribution seems to be well represented by persistence diagrams. On the other hand, when the number of points and bars is low, a barcode is often more descriptive.

Figure 9.4: A filtration along with the corresponding zero-dimensional barcode and persistence diagram. The colors of bars match the colors of the corresponding points in the persistence diagram.

The fundamental lemma of persistent homology

Numbers $n_{s,t}$ are defined using persistent Betti numbers $\beta_{s,t}$. It turns out that the reverse expression also exists.

Lemma 9.2.2. [Fundamental lemma of persistent homology]
$$\beta_{s,t} = \sum_{s' \leq s,\ t' > t} n_{s',t'}$$

[Margin note: In this setting, condition $t' > t$ implies that $t'$ as an index in $n$ could also take the value $\infty$.]
[Margin note: Lemma 9.2.2 has a geometric interpretation in the context of persistence diagrams, see Figure 9.5. It essentially states that $\beta_{s,t}$ is the sum of all multiplicities of points of a persistence diagram, which lie in the upper-left quadrant $[0, s] \times (t, \infty]$ with the apex at $(s, t)$. In the context of this interpretation, the formula for multiplicity $n_{s,t} = \beta_{s,t-1} - \beta_{s-1,t-1} - \beta_{s,t} + \beta_{s-1,t}$ is the expression of the square $(s - 1, s] \times (t - 1, t]$ in terms of such quadrants.]

The formula in the lemma can be verified explicitly. However, the statement is apparent from the definitions, as

• $\beta_{s,t}$ represents the homology born at $s$ or before and terminating after $t$;
• $n_{s,t}$ represents the homology born precisely at $s$ and terminating precisely at $t$.

Lemma 9.2.2 implies that the information encoded in a barcode or in a persistence diagram is precisely the same as the information encoded by persistent Betti numbers.

Figure 9.5: The sum of multiplicities of points in the blue quadrant with apex $(2, 3)$ is $\beta_{2,3}$ by Lemma 9.2.2. It equals 2 as the point $(2, 3)$ is not contained in it, see also the table in Example 9.1.3.

9.3 Computation

While the multiplicities $n_{s,t}$ of points of persistence diagrams are formally expressed by persistent Betti numbers, there is an algorithm to obtain them directly without referring to the Betti numbers and the corresponding $m(m + 1)/2$ ranks of maps. In this section we will present perhaps the simplest [23] version of the algorithm, which is also the most illustrative. We will proceed in two steps:

• compute the matrix reduction, and
• extract the persistent homology.

We will conclude the section with an example.

[Sidenote 23: There exist many improvements of this algorithm which may significantly improve the computing time.]

Throughout this section we fix a field $F$ and filtration $K_1 \leq K_2 \leq \ldots \leq K_m = K$. Parameter $q \in \{0, 1, \ldots\}$ will denote the dimension of a considered object.

Matrix reduction

This part could be called an annotated matrix reduction using only column operations from the left.

1. Order simplices consistently with the filtration. For each $q$ order all $q$-simplices in an order in which they appear in the filtration. If more simplices appear at the same time their internal order is immaterial [24].
[Sidenote 24: It would eventually affect the obtained critical simplices and representatives, but not the persistent homology.]

2. For each $q$ construct the boundary matrix $M_q$ using the order chosen in 1 to label columns and rows.

3. For each $q$ reduce $M_q$ from the left using a single type of column operation: the addition of an $F$-multiple of any of the previous [25] columns to a treated column. Specifically, starting with the leftmost column go through all the columns by passing to the right and for each column:

(a) Determine the pivot [26].
(b) If any of the previous columns on the left has a pivot in the same row, subtract the appropriate multiple of that column so that the pivot of the current column either disappears or its location is moved up.
(c) Repeat as long as there are matching pivots on the left.

[Sidenote 25: In the chosen order from 1.]
[Sidenote 26: The lowest non-trivial entry in the column.]

For each $q$ the resulting matrix is denoted by $M'_q$. Each of its columns is either trivial or has a pivot, whose row is unique amongst all pivots.

Extracting persistence

At this point we have sufficient information to extract homology of $K_m$ from the number of pivots [27]. However, we can also use the locations of pivots to extract numbers $n_{s,t}$ required to construct the barcode and persistence diagram. In order to explain the extraction process we first recall the incremental expansion. Given a simplicial complex, an addition of a single $q$-simplex can change the homology in two ways:

• If its boundary is a linear combination of boundaries of other $q$-simplices [28], then the simplex gives birth to a non-trivial $q$-dimensional homology element. In this case we call the simplex a birth simplex.
• If its boundary is not a linear combination of boundaries of other $q$-simplices [29], then the simplex terminates a non-trivial $(q - 1)$-dimensional homology element. In this case we call the simplex a terminal simplex.

[Sidenote 27: Note that the rank of a matrix is the number of its pivots in a reduced form, and the ranks themselves suffice to compute the Betti numbers.]
[Sidenote 28: I.e., if, after adding the simplex to the boundary matrix as the rightmost column, its column gets reduced to the trivial column by the above reduction.]
[Sidenote 29: I.e., if, after adding the simplex to the boundary matrix as the rightmost column, its column does not reduce to the trivial column.]

A filtration can be considered to be a sequence [30] of incremental expansions. At each stage of the filtration we may assume we first add all vertices according to the ordering in 1., then all edges, etc. Combining such an ordering through all stages we get a sequence of incremental expansions inducing boundary matrices $M_q$ and their reduced forms $M'_q$. Based on such an ordering each simplex of $K$ is either a terminal simplex or a birth simplex.

[Sidenote 30: A specific sequence should respect the ordering given by a filtration as in 1. above, and also the structure of a simplicial complex, i.e., a simplex cannot be added before all of its faces are present.]

We are now in a position to extract the persistence diagram and barcode:

• For each terminal $q$-simplex $\tau$ there exists a paired birth $(q - 1)$-simplex $\sigma$, which is the label of the pivot in the column $\tau$. Such a pair induces a bar $[s, t)$ in the corresponding barcode or, equivalently, a point $(s, t)$ in the corresponding persistence diagram, where $s, t$ are the stages of the filtration at which $\sigma$ and $\tau$ appear [31].

[Margin note: As each row contains at most one pivot, each birth simplex is paired to at most one terminal simplex. A terminal simplex cannot appear as the label of a pivot, i.e., as the label of a pivot row.]
31 Note that if s = t we obtain an empty interval in the barcode and a point on the diagonal in the • Each birth simplex which is not paired to a terminal simplex in-persistence diagram, both of which duces a bar [s, •) in the corresponding barcode or, equivalently, a we ignore in the visualization as they point (s, m + 1) in the corresponding persistence diagram, where s is represent elements of persistence zero. This is consistent with our the stage of the filtration at which s appears. interpretation of persistent homology, which measures only holes that As a result we obtain a barcode and a persistence diagram as persist through at least one stage of demonstrated in the example in the last subsection. the filtration. T It turns out that the presented definition and computation of the Representatives barcode respects the elder rule mentioned at the beginning of the Occasionally we are also interested in homology representatives of chapter. the bars and points of persistence diagrams. These can be extracted from the reduction process. In this subsection we present the most direct way of generating representatives. Given a bar with the birth simplex s and the terminal simplex t we define: • The birth representative of s as the chain formulated32 by the re-32 For example, in the next sub- section we provide an example in duction of the column corresponding to s to the zero column in the which the column hc, di is reduced column reduction scheme. In particular, if the linear combination to the zero column by subtract-ing the column turning column s into the zero column in our column reduction hb, di and adding the column hb, ci. This means scheme is encoded in terms of columns as ∂s Âi l i ∂s i = 0, then ∂ hc, di ∂ hb, di + ∂ hb, ci = 0 and hence hc, di hb, di + hb, ci is the chain that is our birth representative. 130 introduction to persistent homology the birth representative is a = s Âi l i s i. 
The birth representative gives a homology class $[\alpha]$ that is born[33] by the addition of $\sigma$.

• The terminal representative is encoded by the column corresponding[34] to $\tau$ in the reduced matrix.

[33] Homology class $[\alpha]$ is not the only homology class born by the addition of $\sigma$. If $[\beta]$ is another homology class of the same dimension that has existed before the addition of $\sigma$, then $[\alpha + \beta]$ is also a homology class born by adding $\sigma$.

[34] For example, in the next subsection we provide an example in which the column corresponding to $\langle b, c, d\rangle$ in $M'_2$ encodes the terminal representative $\langle b, c\rangle + \langle d, b\rangle + \langle c, d\rangle$.

The birth representative and the terminal representative typically do not represent the same homology class. The birth representative may not even be a good representative of the corresponding bar in the sense that it may remain homologically non-trivial beyond[35] the appearance of the corresponding terminal simplex. On the other hand, infinite intervals do not have a terminal representative. As a result we define the representative of a bar as follows:

1. If the bar is a finite interval, the representative of the bar is the terminal representative.

2. If the bar is an infinite interval, the representative of the bar is the birth representative.

[35] See the discussion on the representatives of 0-dimensional bars below for an example.

Side note: There is no guarantee that these representatives are geometrically the most convenient. There are more involved ways of obtaining representatives that optimize a given criterion function. For example, we may want to obtain the shortest 1-dimensional representatives, etc.

This choice of representatives is algebraically sound in the sense that the representatives form a basis of the elementary intervals of the decomposition described in the structure theorem for persistent homology, a result we discuss in detail in the next chapter. This statement includes the fact that the lifespan of each representative matches the lifespan of the corresponding bar, and that the representatives are linearly independent[36] at all times.

[36] ...or trivial beyond their lifespans.

Side note: Let us prove that the lifespan of the homology class of the terminal representative $\beta$ corresponds to the lifespan of the corresponding bar:

• $[\beta]$ appears by the time $\sigma$ appears by construction;

• if a representative $\beta'$ of $[\beta]$ appeared before $\sigma$, then the column corresponding to $\tau$ could have been reduced further to $\beta'$ thus eliminating the pivot labeled as $\sigma$, a contradiction;

• $[\beta]$ becomes trivial by the time $\tau$ emerges by definition;

• if $[\beta]$ became trivial sooner, its expression as a boundary could be used to reduce the $\tau$ column to the zero column, a contradiction.

In practice we sometimes deviate from the algebraically orthodox choice when declaring the representatives of 0-dimensional bars: we choose the birth representative as a bar representative even if the bar is bounded. Let us explain this geometrically motivated exception on the example of the next subsection, where pair $(\langle b\rangle, \langle a, b\rangle)$ induces a 0-dimensional bar. Sometimes we would geometrically like to think of this bar as a representation of the component containing $b$ merging with a larger component, hence the choice of the birth representative $\langle b\rangle$ which fits into this geometric intuition. However, we should be aware that the homological element $[\langle b\rangle]$ does not become trivial[37] after adding $\langle a, b\rangle$. In terms of homology the appearance of $\langle a, b\rangle$ identifies[38] $[\langle a\rangle] = [\langle b\rangle]$ rather than sets $[\langle b\rangle] = 0$.

[37] The terminal representative $\langle b\rangle - \langle a\rangle$ does become trivial. In fact, $[\langle b\rangle]$ never becomes trivial.

[38] In this sense, the terminal representative tells us which two components merge.

Example

Let us compute persistent homology of our standard example, see Figure 9.4. The annotation of simplices we will be using is provided in Figure 9.6.
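The chosen order is apparent from the following boundary matrices, in which vertical and horizontal lines divide simplices from different stages of the filtration.

Figure 9.6: The annotation of simplices of $K$.

$$
M_1 = \begin{array}{r|rr|rr|rrr}
 & \langle b,c\rangle & \langle b,d\rangle & \langle a,b\rangle & \langle c,d\rangle & \langle a,c\rangle & \langle a,e\rangle & \langle b,e\rangle \\ \hline
\langle a\rangle & 0 & 0 & -1 & 0 & -1 & -1 & 0 \\
\langle b\rangle & -1 & -1 & 1 & 0 & 0 & 0 & -1 \\
\langle c\rangle & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\
\langle d\rangle & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\ \hline
\langle e\rangle & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
\langle f\rangle & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$

$$
M_2 = M'_2 = \begin{array}{r|r|r}
 & \langle b,c,d\rangle & \langle a,b,e\rangle \\ \hline
\langle b,c\rangle & 1 & 0 \\
\langle b,d\rangle & -1 & 0 \\ \hline
\langle a,b\rangle & 0 & 1 \\
\langle c,d\rangle & 1 & 0 \\ \hline
\langle a,c\rangle & 0 & 0 \\
\langle a,e\rangle & 0 & -1 \\
\langle b,e\rangle & 0 & 1
\end{array}
$$

Side note: The pivots are the lowest non-trivial entries of the columns: row $\langle c,d\rangle$ for column $\langle b,c,d\rangle$ and row $\langle b,e\rangle$ for column $\langle a,b,e\rangle$.

We now perform the labelled matrix reduction of $M_1$ as described above. In each step one column is modified using columns on its left.

Column $\langle c,d\rangle$ is modified using columns $\langle b,d\rangle$ and $\langle b,c\rangle$, as $\partial\langle c,d\rangle = \partial\langle b,d\rangle - \partial\langle b,c\rangle$:

$$
\begin{array}{r|rr|rr|rrr}
 & \langle b,c\rangle & \langle b,d\rangle & \langle a,b\rangle & \langle c,d\rangle & \langle a,c\rangle & \langle a,e\rangle & \langle b,e\rangle \\ \hline
\langle a\rangle & 0 & 0 & -1 & 0 & -1 & -1 & 0 \\
\langle b\rangle & -1 & -1 & 1 & 0 & 0 & 0 & -1 \\
\langle c\rangle & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
\langle d\rangle & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ \hline
\langle e\rangle & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
\langle f\rangle & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$

Column $\langle a,c\rangle$ is modified using columns $\langle a,b\rangle$ and $\langle b,c\rangle$, as $\partial\langle a,c\rangle = \partial\langle a,b\rangle + \partial\langle b,c\rangle$:

$$
\begin{array}{r|rr|rr|rrr}
 & \langle b,c\rangle & \langle b,d\rangle & \langle a,b\rangle & \langle c,d\rangle & \langle a,c\rangle & \langle a,e\rangle & \langle b,e\rangle \\ \hline
\langle a\rangle & 0 & 0 & -1 & 0 & 0 & -1 & 0 \\
\langle b\rangle & -1 & -1 & 1 & 0 & 0 & 0 & -1 \\
\langle c\rangle & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
\langle d\rangle & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ \hline
\langle e\rangle & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
\langle f\rangle & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$

Column $\langle b,e\rangle$ is modified using columns $\langle a,e\rangle$ and $\langle a,b\rangle$, as $\partial\langle b,e\rangle = \partial\langle a,e\rangle - \partial\langle a,b\rangle$:

$$
M'_1 = \begin{array}{r|rr|rr|rrr}
 & \langle b,c\rangle & \langle b,d\rangle & \langle a,b\rangle & \langle c,d\rangle & \langle a,c\rangle & \langle a,e\rangle & \langle b,e\rangle \\ \hline
\langle a\rangle & 0 & 0 & -1 & 0 & 0 & -1 & 0 \\
\langle b\rangle & -1 & -1 & 1 & 0 & 0 & 0 & 0 \\
\langle c\rangle & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
\langle d\rangle & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ \hline
\langle e\rangle & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
\langle f\rangle & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$

We can now extract the barcode from the birth-terminal pairs and unpaired birth simplices. We start by extracting the zero-dimensional barcode from the pivots of $M'_1$ and unpaired vertices.

• Pairs $(\langle c\rangle, \langle b, c\rangle)$ and $(\langle d\rangle, \langle b, d\rangle)$ provide no contribution[39].

• Pair $(\langle b\rangle, \langle a, b\rangle)$ induces a 0-dimensional (component) bar[40] $[1, 2)$ represented by $[\langle b\rangle]$.

• Pair $(\langle e\rangle, \langle a, e\rangle)$ induces a 0-dimensional (component) bar $[2, 3)$ represented by $[\langle e\rangle]$.

• Vertices $a$ and $f$ are unpaired and thus induce 0-dimensional bars $[1, \infty)$ (generated by $[\langle a\rangle]$) and $[2, \infty)$ (generated by $[\langle f\rangle]$).

[39] Formally, they contribute the empty interval $[1, 1)$ as all involved simplices appear at $K_1$.

[40] Recall that $\langle b\rangle$ appears at $K_1$ while $\langle a, b\rangle$ appears at $K_2$, hence the values of the endpoints.

We next extract the one-dimensional barcode from the pivots of $M'_2$ and unpaired edges.

• Pair $(\langle c, d\rangle, \langle b, c, d\rangle)$ induces a 1-dimensional bar $[2, 3)$ represented[41] by $[\langle c, d\rangle - \langle b, d\rangle + \langle b, c\rangle]$.

• Pair $(\langle b, e\rangle, \langle a, b, e\rangle)$ induces a 1-dimensional bar $[3, 4)$ represented by $[\langle b, e\rangle - \langle a, e\rangle + \langle a, b\rangle]$.

• Edge $\langle a, c\rangle$ is unpaired and thus induces the 1-dimensional bar $[3, \infty)$ generated by $[\langle a, c\rangle - \langle a, b\rangle - \langle b, c\rangle]$.

[41] Recall that $\langle c, d\rangle$ appears at $K_2$ while $\langle b, c, d\rangle$ appears at $K_3$, hence the values of the endpoints. The linear combination that made the column corresponding to $\langle c, d\rangle$ trivial in $M'_1$ was $\partial\langle c, d\rangle - \partial\langle b, d\rangle + \partial\langle b, c\rangle$ and hence the representative.

Side note: In this example the 1-dimensional birth and terminal representatives coincide. This is not generally the case.

Computational tricks

We conclude by mentioning a trick that speeds up the computation of persistent homology. It is based on an observation that boundary matrices $M_i$ that are being reduced in the reduction process are not completely independent of each other. If the reduction process of $M_q$ reduces the column corresponding to $q$-simplex $\tau$ to a non-trivial column, we can extract the following information:

1. $\tau$ is a terminal simplex and hence the row corresponding to $\tau$ in $M_{q+1}$ will have been reduced to the zero-row, which means we can set it to zero immediately.

2. The pivot location reveals the corresponding birth simplex $\sigma$. As a result the column corresponding to $\sigma$ in $M_{q-1}$ will have been reduced to the zero-column and can hence be set to zero immediately.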
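The column reduction of this section, together with observation 2. above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: coefficients are taken in $F = \mathbb{Z}/2$ (so a column is simply a set of row indices and adding columns is a symmetric difference), all dimensions share one global simplex order, and the names `barcode_with_clearing`, `example`, `bars` and `skipped` are hypothetical. The input encodes the filtration of the example above.

```python
from itertools import combinations
from math import inf


def barcode_with_clearing(filtration):
    """Barcode over Z/2 (an assumption) of a list of (simplex, stage) pairs.

    A simplex is a tuple of vertex labels, stage is the first i with the
    simplex in K_i. Returns (bars, skipped): bars is a sorted list of
    (dimension, birth, death); skipped counts columns that were never reduced.
    """
    # Step 1: order by stage, then by dimension, so faces precede cofaces.
    order = sorted(filtration, key=lambda item: (item[1], len(item[0])))
    index = {frozenset(s): i for i, (s, _) in enumerate(order)}
    top = max(len(s) for s, _ in order)

    pivot_owner = {}  # pivot row (birth simplex) -> column (terminal simplex)
    reduced = {}      # column index -> reduced non-zero column (set of rows)
    skipped = 0

    # Reduce the boundary matrices in order of decreasing dimension.
    for size in range(top, 1, -1):
        for j, (simplex, _) in enumerate(order):
            if len(simplex) != size:
                continue
            if j in pivot_owner:
                # Observation 2: j is a paired birth simplex, its column
                # would reduce to zero, so it is not reduced at all.
                skipped += 1
                continue
            # Column of the boundary matrix mod 2: the codimension-1 faces.
            column = {index[frozenset(face)]
                      for face in combinations(simplex, size - 1)}
            # Steps 3(a)-(c): cancel matching pivots with columns on the left.
            while column and max(column) in pivot_owner:
                column ^= reduced[pivot_owner[max(column)]]
            if column:
                pivot_owner[max(column)] = j  # max = lowest non-trivial entry
                reduced[j] = column

    terminal = set(pivot_owner.values())
    bars = []
    for i, (simplex, stage) in enumerate(order):
        dim = len(simplex) - 1
        if i in pivot_owner:                  # paired birth simplex
            death = order[pivot_owner[i]][1]
            if stage < death:                 # ignore empty intervals [s, s)
                bars.append((dim, stage, death))
        elif i not in terminal:               # unpaired birth simplex
            bars.append((dim, stage, inf))
    return sorted(bars), skipped


example = [
    (("a",), 1), (("b",), 1), (("c",), 1), (("d",), 1),
    (("b", "c"), 1), (("b", "d"), 1),
    (("e",), 2), (("f",), 2), (("a", "b"), 2), (("c", "d"), 2),
    (("a", "c"), 3), (("a", "e"), 3), (("b", "e"), 3), (("b", "c", "d"), 3),
    (("a", "b", "e"), 4),
]
bars, skipped = barcode_with_clearing(example)
print(bars)
print(skipped)
```

On this input the sketch returns the bars listed in the example, and the columns of $\langle c, d\rangle$ and $\langle b, e\rangle$ are skipped because they were already identified as birth simplices while reducing $M_2$. Working over $\mathbb{Z}/2$ loses the signs of the representatives; a general field would require storing coefficients alongside row indices.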
Side note: Overview: reducing a $\tau$-column to a non-zero column reveals:

• $\tau$ is a terminal simplex;

• the $\tau$-row is trivial;

• the pivot label $\sigma$ is a birth simplex;

• the $\sigma$-column is trivial.

Hence a single reduction of a column in $M_q$ corresponding to a terminal simplex also reveals a zero-row in $M_{q+1}$ and a zero-column in $M_{q-1}$. Of course, this information can't be of much help if it has already been extracted from previous reductions. For this reason the matrices $M_q$ can be reduced in the order of decreasing dimension: this way no column in $M_{q-1}$ has been reduced by the time $M_q$ has been reduced and as a result we avoid reducing almost half[42] the columns, resulting in a significant speedup.

[42] This estimate depends on the filtration but seems to hold for most of the practical cases.

9.4 Concluding remarks

Recap (highlights) of this chapter

• Persistent homology;

• Barcode;

• Persistence diagram;

• Computing persistent homology.

Background and applications

Persistent homology[43][44][45] is perhaps the most popular and fruitful construction of topological data analysis. For the past two decades it has been an inspiration to extensive theoretical and practical treatments, spanning from purely mathematical theoretical foundations[46] to computable aspects and applications in numerous fields of science and engineering. When coupled with standard constructions of complexes, persistent homology contains information about geometry of data. As such, the method is applied whenever the geometric shape of data is thought to contain significant information.

Applications include de-noising schemes, dimension reduction schemes, feature extraction methods, and specific data analysis of materials, molecular structures, medical images, weather patterns, etc.

The combinatorial treatment of this chapter will be followed by further properties in the following chapter. While the definition of persistent homology could have been expressed using coefficients in an Abelian group, the visualizations[47] and efficient implementations[48] crucially depend on the structure of a field.

[43] Herbert Edelsbrunner, David Letscher, and Afra Zomorodian. Topological persistence and simplification. Discrete & Computational Geometry, 28(4):511–533, 2002. doi: 10.1007/s00454-002-2885-2

[44] Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. Applied Mathematics. American Mathematical Society, 2010. doi: 10.1090/mbk/069

[45] Tamal K. Dey and Yusu Wang. Computational Topology for Data Analysis. Cambridge: Cambridge University Press, 2022. doi: 10.1017/9781009099950

[46] We will mention two ideas of generalizations of the standard persistent homology in the appendix.

[47] I.e., barcodes and persistence diagrams.

[48] I.e., matrix reductions.

Appendix: zig-zag persistence and multi-parameter persistence

In this appendix we will sketch the ideas of two generalizations of the standard persistent homology as presented throughout the chapter. In both cases the generalization refers to the type of filtration used.

The first generalization is based on zig-zag filtration. While a standard filtration models a growing simplicial complex, a zig-zag filtration models a changing simplicial complex, in which the simplices may be appearing or disappearing.

Definition 9.4.1. Let $K$ be a simplicial complex. A zig-zag filtration of $K$ is a sequence of subcomplexes $K_1, K_2, \ldots, K_m$ of $K$, such that for each $i \in \{1, 2, \ldots, m-1\}$ either $K_i \subseteq K_{i+1}$ or $K_{i+1} \subseteq K_i$.

Side note: In case $L_1, L_2, \ldots, L_m$ are subcomplexes of a simplicial complex $L$ not satisfying the condition of a zig-zag filtration, and we still want to compute a meaningful zig-zag homology, a standard way to construct a corresponding zig-zag filtration is to connect them either by unions or intersections:

$$L_1 \hookrightarrow L_1 \cup L_2 \hookleftarrow L_2 \hookrightarrow \ldots \hookleftarrow L_m,$$

$$L_1 \hookleftarrow L_1 \cap L_2 \hookrightarrow L_2 \hookleftarrow \ldots \hookrightarrow L_m.$$

Interestingly enough, while the two options induce generally different barcodes, they encode precisely the same information.

For example, a zig-zag filtration may be of the following sort:

$$K_1 \hookrightarrow K_2 \hookleftarrow K_3 \hookleftarrow K_4 \hookrightarrow K_5 \hookleftarrow K_6 \hookrightarrow K_7.$$

It turns out that even in this setting there exists an algorithm based on matrix reductions which will produce a well defined barcode[49] describing what is called zig-zag persistence. An example is displayed in Figure 9.7.

[49] Or, equivalently, a persistence diagram.

Figure 9.7: A zig-zag filtration $K_1 \hookleftarrow K_2 \hookrightarrow K_3 \hookleftarrow K_4$ and the corresponding zero-dimensional barcode, visualized as a table for technical reasons (i.e., the absence of a designated direction of all arrows as endpoints of bars). In the same way we could have presented the barcodes of ordinary persistent homology as well.

The second generalization is based on a multi-parameter filtration. While a standard filtration models a one-parameter[50] growth of a simplicial complex, a multi-parameter filtration models growth with more degrees of freedom. For our demonstrative purposes it suffices to formally introduce only a 2-parameter filtration.

[50] I.e., a sequential growth.

Definition 9.4.2. Let $K$ be a simplicial complex. A 2-parameter filtration of $K$ is a collection of subcomplexes $K_{j,k} \subseteq K$ parameterized with $j, k \in \{1, 2, \ldots, m\}$, such that for each $j \in \{1, 2, \ldots, m-1\}$ and for each $k$ the following containments hold (see Figure 9.8):

• $K_{j,k} \subseteq K_{j+1,k}$, and

• $K_{k,j} \subseteq K_{k,j+1}$.

$$
\begin{array}{ccccccccc}
K_{1,m} & \hookrightarrow & K_{2,m} & \hookrightarrow & K_{3,m} & \hookrightarrow & \cdots & \hookrightarrow & K_{m,m} \\
\uparrow & & \uparrow & & \uparrow & & & & \uparrow \\
\vdots & & \vdots & & \vdots & & & & \vdots \\
\uparrow & & \uparrow & & \uparrow & & & & \uparrow \\
K_{1,2} & \hookrightarrow & K_{2,2} & \hookrightarrow & K_{3,2} & \hookrightarrow & \cdots & \hookrightarrow & K_{m,2} \\
\uparrow & & \uparrow & & \uparrow & & & & \uparrow \\
K_{1,1} & \hookrightarrow & K_{2,1} & \hookrightarrow & K_{3,1} & \hookrightarrow & \cdots & \hookrightarrow & K_{m,1}
\end{array}
$$

Figure 9.8: A scheme of a 2-parameter filtration.

There are theoretical and practical settings in which multi-parameter filtrations arise naturally. A multi-parameter persistent homology is the object obtained by applying the homology to spaces and maps of such a filtration.
Unfortunately, there exists no convenient[51] visualization[52] in this setting. As a result, theoretical treatments of multi-parameter persistent homology typically deal with a multi-dimensional grid of interconnected homology groups, while practical applications of the same object use incomplete information about it such as multi-parameter tables of Betti numbers, restrictions to 1-parameter filtrations yielding a standard barcode, etc. For more details see a book[53].

[51] While a 1-parameter persistent homology "decomposes" into simple pieces called bars (we will explain this statement in detail in the next chapter), the pieces of a multi-parameter persistent homology can be quite complicated and not easily visualized or encoded.

[52] ...such as multi-dimensional barcode or persistence diagram.

[53] Tamal K. Dey and Yusu Wang. Computational Topology for Data Analysis. Cambridge: Cambridge University Press, 2022. doi: 10.1017/9781009099950

10 Persistent homology: stability theorem

In the previous chapter we introduced persistent homology and its basic method of computation in the discrete setting. However, it turns out that the concept of persistent homology can be treated on a much deeper theoretical level, through which many of its advantages become apparent.

In this chapter we will delve further into the theoretical machinery of persistent homology. We will introduce continuous filtrations and the underlying algebraic structure of persistence modules. These structures will be crucial in the formulation of the stability theorem, which states that, unlike homology, persistent homology behaves continuously with respect to the underlying filtration. We conclude by mentioning a series of interpretations and examples of our expanded scope of persistence.

10.1 Continuous filtrations

Recall that a discrete filtration of a simplicial complex $K$ is a sequence of subcomplexes $K_1 \subseteq K_2 \subseteq \ldots \subseteq K_m = K$. An example of a filtration is given in Figure 10.1.
Figure 10.1: A discrete filtration $K_1, K_2, K_3, K_4$.

Discrete filtrations[1] formalize finite nested sequences of complexes. While this approach is geometrically intuitive, there is an alternative shorter description of a filtration.

[1] I.e., filtrations given by finitely many nested simplicial complexes.

Rather than storing a sequence of separate subcomplexes, we annotate each simplex $\sigma \in K$ by the index $f(\sigma)$ at which $\sigma$ first appears. Given such an annotation it is easy to reconstruct $K_i = \{\sigma \in K \mid f(\sigma) \le i\}$. See Figure 10.2 for an example. This observation motivates us to expand the scope of filtrations in two ways: by considering continuous filtrations with infinitely many subcomplexes; and by defining filtrations from an appropriate annotation function.

Figure 10.2: The annotation of simplices encodes the filtration of Figure 10.1.

1. A continuous filtration of a finite simplicial complex $K$ is a collection of subcomplexes[2] $\{K_r\}_{r \ge 0}$ of $K$ such that

$$\forall r < q : K_r \subseteq K_q \subseteq K.$$

2. Given a simplicial complex $K$, let $f$ be a filtration function, i.e., an annotation of each of the simplices of $K$ by a non-negative number such that $\sigma \subseteq \tau \implies f(\sigma) \le f(\tau)$. The sublevel filtration associated to $f$ is a continuous filtration consisting of sublevel complexes[3] $K_r = \{\sigma \in K \mid f(\sigma) \le r\} \subseteq K$ for $r \ge 0$.

[2] Throughout the book we will additionally require that for each simplex $\sigma \in K$ the minimum $\operatorname{argmin}_r \{\sigma \in K_r\}$ exists, i.e., there exists the smallest scale $r$ at which $\sigma$ appears. An equivalent condition is the following: for each $r$ there exists $r' > r$ such that $K_r = K_{r'}$, i.e., if a simplex is absent at scale $r$, it is also absent at slightly larger scales. Under this condition each continuous filtration is the sublevel filtration of its associated annotation function. In particular, each sublevel filtration is a continuous filtration and vice versa. The Rips and Čech filtrations as defined in Chapter 5 are continuous filtrations of this sort.

[3] In this setting parameter $r$ is often referred to as the scale, a notion arising from Rips and Čech filtrations, or the level, a notion arising from the filtration function.

There are two motivating reasons for the introduction of continuous filtrations:

• Most of the standard constructions of filtrations actually yield continuous filtrations: Čech filtration[4], Rips filtration[5], filtration by alpha complexes, sublevel filtration, etc.

• The interleaving structure and the resulting stability theorem depend[6] on the continuous choice of parameter.

[4] The corresponding filtration function on simplices being the radius of the smallest enclosing ball of the vertices.

[5] The corresponding filtration function on simplices being the diameter of the set of vertices.

[6] To be discussed in detail later. In a nutshell, the continuous choice of the parameter eventually results in the continuity of persistent homology.

The definition of persistent homology groups for continuous filtrations is the same, with the only difference being the continuous range of the indices $0 \le s \le t$, which results in nominally infinitely many persistent homology groups. The matrix-reduction based computation of persistent homology is unhindered by the expansion to continuous filtrations as the computations depend only on the filtration function defined on (finitely many) simplices, and do not require all (infinitely many) complexes of the filtration separately. In a nutshell, we can compute the same barcodes using the same procedure for continuous or discrete filtrations.

We next discuss the interplay between discrete and continuous filtrations:

1. Given a discrete filtration, there is an obvious extension of it as the sublevel filtration of the annotation function.
Figure 10.3: An excerpt from the Rips filtration on the five points on the left.

2. Given a continuous sublevel filtration $\{K_r\}_{r \ge 0}$ associated to a filtration function $f$ there are two ways of generating a discrete filtration:

(a) By restriction to $K_1 \subseteq K_2 \subseteq \ldots \subseteq K_{\lceil \max f \rceil}$. While mathematically convenient, this approach has many drawbacks[7] and is mostly avoided.

(b) A more beneficial way of thinking about the index $i$ of a discrete filtration is not as the scale parameter[8] but rather as the index of the critical scale[9] of the continuous filtration. Formally speaking we define critical scales $r_1 < r_2 < \ldots < r_k$ as the enumeration[10] of the image of $f$ and define

$$K_i = \{\sigma \in K \mid f(\sigma) \le r_i\}.$$

The corresponding finite filtration contains information about all changes in the original continuous filtration.

[7] The corresponding continuous filtration as defined by 1. may be significantly different from $\{K_r\}_{r \ge 0}$. The information about the sequence of changes between each pair of integer scales is lost.

[8] An interpretation prevalent in the context of continuous filtrations.

[9] A scale $r$ of a continuous filtration is critical, if at least one simplex appears at $r$.

[10] $\{r_1, r_2, \ldots, r_k\} = \operatorname{Im} f$.

Side note: From this point on, whenever we mention an unspecified filtration, or consider a transition from a finite to continuous filtration or vice versa, the underlying interplays we have in mind are 1. and 2. (b). For an example see Figure 10.4.

Continuous filtrations conveniently model the geometric setup of the standard filtrations. On the other hand, discrete filtrations are a convenient finite description on which we may develop algorithmic approaches.

Figure 10.4: The Čech filtration on three vertices forming an equilateral triangle of side length 1 nominally consists of infinitely many simplicial complexes. However, only at scales $0$, $1/2$ and $1/\sqrt{3}$ do the changes occur and hence the corresponding discrete filtration (according to 2. (b) above) consists of the complexes at those scales, depicted by the first three complexes in the figure ($r = 0$, $r = 1/2$, $r = 1/\sqrt{3}$). The annotation function is provided on the right, its image consists of the mentioned scales.

Example 10.1.1. [Topology of offsets] Given a finite collection of points $S \subset \mathbb{R}^n$ we have already mentioned that the nerve theorem implies that for each $r > 0$ the Čech complex $\mathrm{Cech}(S, r)$ is homotopy equivalent to the $r$-neighborhood[11] $N(S, r)$ of $S$:

$$\mathrm{Cech}(S, r) = \mathcal{N}\big(\{B(s, r)\}_{s \in S}\big) \simeq \bigcup_{s \in S} B(s, r) = N(S, r).$$

[11] Also called the $r$-offset of $S$.

It turns out that the conclusion of the nerve theorem behaves consistently[12] with the maps, which results in the following fact: if for $r_1 < r_2$ the inclusion $N(S, r_1) \hookrightarrow N(S, r_2)$ is a homotopy equivalence, then[13] so is the inclusion $\mathrm{Cech}(S, r_1) \hookrightarrow \mathrm{Cech}(S, r_2)$.

The Čech filtration thus models the homotopy type of growing offsets: if on some interval the growth of $r$ results in homotopy equivalent growth of offsets, then it also results in a homotopy equivalent growth of the Čech complexes.

[12] The formal term corresponding to this consistency is "functoriality" and the relevant extension of the nerve theorem is referred to as "persistent nerve theorem" or "functorial nerve theorem".

[13] At this point we crucially use the fact that Euclidean balls always form a good cover as required by the nerve theorem.

Interleaving distance for filtrations

We conclude this section by recalling the interleaving distance between filtrations. The concept has already been defined in Chapter 5 for Rips and Čech filtrations. With the established general notions we can use the same definition for filtrations in general.

Definition 10.1.2. Choose $\varepsilon > 0$. Continuous filtrations $\{K_r\}_{r \ge 0}$ and $\{L_r\}_{r \ge 0}$ are $\varepsilon$-interleaved if there exist simplicial maps $\varphi_r : K_r \to L_{r+\varepsilon}$ and $\psi_r : L_r \to K_{r+\varepsilon}$ such that $\varphi_{r+\varepsilon} \circ \psi_r : L_r \to L_{r+2\varepsilon}$ and $\psi_{r+\varepsilon} \circ \varphi_r : K_r \to K_{r+2\varepsilon}$ are equal to the corresponding inclusions.

Diagram: the rows $\cdots \to K_r \to K_{r+\varepsilon} \to K_{r+2\varepsilon} \to \cdots$ and $\cdots \to L_r \to L_{r+\varepsilon} \to L_{r+2\varepsilon} \to \cdots$ are connected by the diagonal maps $\varphi_r : K_r \to L_{r+\varepsilon}$ and $\psi_r : L_r \to K_{r+\varepsilon}$.

Side note: Suppose filtrations $\{K_r\}_{r \ge 0}$ and $\{L_r\}_{r \ge 0}$ consist of subcomplexes of a simplicial complex $K$ and let $\varepsilon \ge 0$. It is easy to see that if for each $r$ we have $K_r \subseteq L_{r+\varepsilon}$ and $L_r \subseteq K_{r+\varepsilon}$, then the filtrations are $\varepsilon$-interleaved. An argument of this sort is used in the proof of Proposition 10.1.3.

Given two filtrations their interleaving distance is defined as the minimum[14] of all values $\varepsilon > 0$, for which the filtrations are $\varepsilon$-interleaved. It turns out that the interleaving distance is a metric[15].

[14] It is not hard to prove that the minimum exists due to the additional requirement imposed on our filtrations.

[15] In order to maintain this view we declare two filtrations to be isomorphic if they are 0-interleaved. The interleaving distance is a metric on the isomorphism classes of filtrations.

In Chapter 5 we proved that Rips and Čech filtrations equipped with the interleaving distance are continuous (stable) with respect to perturbations of the underlying points. Generalizing this result we now prove that sublevel filtrations are continuous with respect to perturbations of the filtration function in the max metric[16].

[16] Given two functions $f, g : K \to \mathbb{R}$ defined on all simplices of a finite simplicial complex $K$, the max distance between them is $\|f - g\|_\infty = \max_{\sigma \in K} |f(\sigma) - g(\sigma)|$.

Proposition 10.1.3. Let $K$ be a simplicial complex. Assume $f, g : K \to [0, \infty)$ are filtration functions. Then the sublevel filtrations of $K$ corresponding to $f$ and $g$ are $\|f - g\|_\infty$-interleaved.

Proof. In order to align our notation with the diagram above for $\varepsilon = \|f - g\|_\infty$ define $K_r = \{\sigma \in K \mid f(\sigma) \le r\} \subseteq K$ and $L_r = \{\sigma \in K \mid g(\sigma) \le r\} \subseteq K$. The interleaving maps $\varphi, \psi$ are defined to be identities on vertices.
The maps are well defined by the following argument:

• For each vertex $v \in K$: if $v \in K_r$ then $v \in L_{r + \|f - g\|_\infty}$ by the definition of the max distance. In a similar fashion, if $v \in L_r$ then $v \in K_{r + \|f - g\|_\infty}$. Hence maps $\varphi, \psi$ are well defined on vertices.

• The same argument for simplices[17] implies maps $\varphi, \psi$ are simplicial.

[17] For example, if a simplex $\sigma \in K$ is contained in $K_r$, it is also contained in $L_{r + \|f - g\|_\infty}$.

10.2 Persistence modules

Persistent homology is obtained by applying homology to a filtration. In this section we present the properties of the resulting algebraic objects (persistence modules) which model persistent homology. Just as filtrations model the growth of simplicial complexes, persistence modules model the evolution[18] of vector spaces[19].

[18] I.e., not only growth.

[19] Which we interpret as holes in the context of persistent homology.

For the rest of this chapter we fix a field $F$, which will provide coefficients to all mentioned vector spaces, including homology groups.

Persistence modules

Definition 10.2.1. A persistence module is a collection of (finite dimensional) vector spaces $\{V_r\}_{r \ge 0}$ along with linear maps

$$h_{r,q} : V_r \to V_q, \quad \forall r \le q,$$

satisfying $h_{r,q} = h_{s,q} \circ h_{r,s}$ and $h_{r,r} = \mathrm{id}_{V_r}$ for all $r \le s \le q$.

Scale $r \ge 0$ is said to be regular if there exists $\varepsilon > 0$ such that maps $h_{p,q}$ are isomorphisms for all $p \le q$ in $(r - \varepsilon, r + \varepsilon)$ or (in the case $r = 0$) for all $p \le q$ in $[0, \varepsilon)$, i.e., the maps $h$ are isomorphisms close to $r$. Scale $r$ is critical if it is not regular.

Side note: In general literature persistence modules may consist of infinite dimensional vector spaces.

Side note: Critical scales of a continuous filtration are a superset of critical scales of its persistent homology as any change in homology requires a change of the underlying complex, but not vice versa.

Our interest in persistence modules stems from the fact that they are the underlying algebraic structure of persistent homology of continuous filtrations.
In order to simplify our treatment we thus restrict to persistence modules that appear as persistent homology of continuous filtrations as defined above. In particular, each persistence module treated here will be assumed to have the following properties: T Properties 2. and 3. imply [0, •) can be decomposed into finitely many intervals of the form [ 1. There exists R > 0 such that for each R ⇤1, ⇤2) on  r < q maps hr,q are which all maps h are isomorphisms. isomorphisms20, i.e., eventually all maps h are isomorphisms. 20 An analogous property holds for continuous filtrations as they filter a 2. For each r > 0 there exists r0 > r such that for all q 2 [r, r0) the finite simplicial complex, i.e., given maps h a filtration function r,q are isomorphisms21. f , all sublevel complexes Kr for r > max | f | coincide. 21 3. There exist finitely22 many critical scales. This corresponds to the analogous property assumed for our continuous filtrations. 22 This property corresponds to the fact that continuous filtrations filter a finite simplicial complex. 142 introduction to persistent homology Definition 10.2.2. Persistence modules {Vr}r 0 and {Wr}r 0 are isomorphic if for each r 0 there exist isomorphisms Vr ! Wr such that for each 0  r1 < r2 < . . . the following diagram commutes · · · / Vr / V V j rj · · · +1 / rj+2 / ⇠ = ⇠ = ⇠ = ✏ ✏ ✏ · · · / Wr / W j rj+1 / Wrj+2 / · · · Decomposition It is often advantageous to decompose23 mathematical objects 23 Functions are decomposed into monomials (Taylor series) or trigono- into simple pieces and thus obtain a canonical form. In the previous metric functions (Fourier series). chapter we decomposed persistent homology into pieces represented Closed connected surfaces other than the sphere can be decomposed by bars. In this subsection we will formalize such a decomposition for as a direct sum of tori or projec-persistence modules. tive planes. Every n-dimensional We first explain what we mean by a “decomposition”. 
vector space is of the form $F^n$, and in one of the previous appendices we mentioned how finitely generated Abelian groups can be decomposed into smallest indecomposable groups: groups of the form $\mathbb{Z}_p$ and $\mathbb{Z}$.

Definition 10.2.3. The direct sum of persistence modules $\{V_r\}_{r \ge 0}$ and $\{V'_r\}_{r \ge 0}$, along with respective linear maps $h_{r,q}$ and $h'_{r,q}$, is a persistence module consisting of:

• spaces $W_r = V_r \oplus V'_r$ and
• maps $\tilde h_{r,q} = (h_{r,q}, h'_{r,q})$ for all $0 \le r < q$.

We next present interval modules, which are the pieces represented by bars.

Definition 10.2.4. Let $0 \le p < q$. An interval module $F_{p,q}$ corresponding to the pair $(p, q)$ is a persistence module $\{V_r\}_{r \ge 0}$ defined as follows:

• $V_r = F$ for $r \in [p, q)$ and $V_r = 0$ else.
• Maps $h_{s,s'}$ are isomorphisms whenever possible.

[Side note] Rewriting the condition on maps in Definition 10.2.4:
• for $p \le s \le s' < q$ the map $h_{s,s'}$ is the identity on $F$;
• else $h_{s,s'}$ is the zero map.

[Side note] It is easy to verify that interval modules $F_{p,q}$ and $F_{p',q'}$ are isomorphic iff $p = p'$ and $q = q'$.

Theorem 10.2.5. [Structure theorem for persistent homology] Each persistence module is isomorphic to a direct sum of interval modules. The decomposition is unique up to the permutation of the intervals.

Barcodes and bars introduced in the previous chapter correspond to this decomposition and interval modules. Theorem 10.2.5 is an algebraic expression of the existence of barcodes. It states that the persistence module can be decomposed into the intervals and is completely determined [24] by the interval modules of its decomposition.

[24] And as a result, barcodes and persistence diagrams are complete descriptions of persistence modules.

Interleaving distance for persistence modules

The interleaving distance has already been defined for filtrations. Conceptually the same definition applies to persistence modules.

Definition 10.2.6. Choose $\varepsilon > 0$.
Persistence modules $\{V_r\}_{r \ge 0}$ and $\{W_r\}_{r \ge 0}$ along with their respective linear maps $h_{r,q}$ and $h'_{r,q}$ are $\varepsilon$-interleaved if there exist linear maps $\varphi_r \colon V_r \to W_{r+\varepsilon}$ and $\psi_r \colon W_r \to V_{r+\varepsilon}$ such that $\varphi_{r+\varepsilon} \circ \psi_r \colon W_r \to W_{r+2\varepsilon}$ and $\psi_{r+\varepsilon} \circ \varphi_r \colon V_r \to V_{r+2\varepsilon}$ are equal to $h'_{r,r+2\varepsilon}$ and $h_{r,r+2\varepsilon}$ correspondingly. Given two persistence modules, their interleaving distance $d_I$ is defined as the minimum of all values $\varepsilon > 0$ for which the persistence modules are $\varepsilon$-interleaved.
$$
\begin{array}{ccccccc}
\cdots \longrightarrow & V_r & \longrightarrow & V_{r+\varepsilon} & \longrightarrow & V_{r+2\varepsilon} & \longrightarrow \cdots \\
\cdots \longrightarrow & W_r & \longrightarrow & W_{r+\varepsilon} & \longrightarrow & W_{r+2\varepsilon} & \longrightarrow \cdots
\end{array}
$$
(The two rows are connected by the diagonal maps $\varphi_r \colon V_r \to W_{r+\varepsilon}$ and $\psi_r \colon W_r \to V_{r+\varepsilon}$.)

It is not hard to prove that the minimum in the definition of the interleaving distance exists due to the additional requirements imposed on persistence modules. It is easy to verify that the interleaving distance is a metric on the isomorphism classes of persistence modules. As such the interleaving distance is the metric [25] of choice on persistent homologies.

[25] At this point it should be clear that continuity and small perturbations of persistent homology depend on the ability to perform continuous and small steps in the index set. An interleaving distance defined on persistent homology of discrete filtrations or a single complex would have been, in the best of cases, restricted to integer values, which do not accommodate the idea of continuity.

The functoriality of homology implies that $\varepsilon$-interleaved filtrations [26] induce $\varepsilon$-interleaved persistence modules. Another setting in which $\varepsilon$-interleaved persistence modules (but not necessarily $\varepsilon$-interleaved filtrations) are obtained is that of spaces which are “close” to each other. Let us first define closeness.

[26] We have already discussed how these appear by perturbing points when using Rips or Čech complexes, and by perturbing the filtration function when using the sublevel filtration.

Definition 10.2.7. Let $(X, d)$ be a metric space and assume $A, B \subset X$ are finite subsets. The Hausdorff distance $d_H(A, B)$ is defined as
$$d_H(A, B) = \max\Big\{ \max_{b \in B} \min_{a \in A} d(a, b), \; \max_{a \in A} \min_{b \in B} d(a, b) \Big\}.$$

The Hausdorff distance is a metric on all finite subspaces of a metric space $X$. It has a natural geometric meaning. Given the setting of Definition 10.2.7 find:

• The minimal $r_A$ such that $N(A, r_A) \supseteq B$, i.e., the $r_A$-neighborhood of $A$ contains $B$.
• The minimal $r_B$ such that $N(B, r_B) \supseteq A$.

We conclude that $d_H(A, B) = \max\{r_A, r_B\}$. Note that for each $a \in A$ there exists $b \in B$ such that $d(a, b) \le d_H(A, B)$, and vice versa. An example is given in Figure 10.5, where a black set $A$ and a red set $B$ are displayed on top, while their respective neighborhoods are displayed in the middle and on the bottom.

Figure 10.5: $d_H(A, B) = r_B > r_A$.

The Hausdorff distance measures the distance between finite subspaces of a metric space and heavily depends on the way in which these subspaces are embedded. For example, different isometric subspaces will be at a positive Hausdorff distance. Similar to the Hausdorff distance is the Gromov-Hausdorff distance.

Definition 10.2.8. Suppose $A, B$ are finite metric spaces. The Gromov-Hausdorff distance $d_{GH}(A, B)$ is defined as
$$d_{GH}(A, B) = \inf_{\mu, \nu} \{ d_H(\mu(A), \nu(B)) \},$$
where the infimum is over all isometric embeddings $\mu \colon A \to X$ and $\nu \colon B \to X$ into a metric space $X$.

[Side note] Observe that $d_{GH}(A, B) \le d_H(A, B)$ for finite subspaces of a metric space $X$. As a result Proposition 10.2.9 also holds for $d_H$. However, the Gromov-Hausdorff distance is typically harder to compute and thus it is occasionally more convenient to use $d_H$ as the easily computable parameter of interleaving.

It turns out that the infimum in Definition 10.2.8 is always attained and that $d_{GH}$ is a metric on the isometry classes [27] of finite metric spaces.

[27] In particular, $d_{GH}(A, B) = 0$ iff the spaces are isometric.

Proposition 10.2.9. Let $A, B$ be finite metric spaces with $\varepsilon = d_{GH}(A, B)$. Then for each $q \in \{0, 1, \ldots\}$:

1. $\{H_q(\mathrm{Rips}(A, r))\}_{r \ge 0}$ and $\{H_q(\mathrm{Rips}(B, r))\}_{r \ge 0}$ are $2\varepsilon$-interleaved.
2. $\{H_q(\mathrm{Cech}(A, r))\}_{r \ge 0}$ and $\{H_q(\mathrm{Cech}(B, r))\}_{r \ge 0}$ are $\varepsilon$-interleaved.

Proof. We will only sketch the proof for $q = 1$ and Rips filtrations. The proof of the other cases follows the same idea but requires some technical diligence. Without loss of generality we may assume $A$ and $B$ are subspaces of a metric space $X$ and $\varepsilon = d_H(A, B)$. We aim to define maps $\varphi$ and $\psi$ that constitute a commutative diagram:
$$
\begin{array}{ccccccc}
\cdots \longrightarrow & H_1(\mathrm{Rips}(A, r)) & \longrightarrow & H_1(\mathrm{Rips}(A, r + 2\varepsilon)) & \longrightarrow & H_1(\mathrm{Rips}(A, r + 4\varepsilon)) & \longrightarrow \cdots \\
\cdots \longrightarrow & H_1(\mathrm{Rips}(B, r)) & \longrightarrow & H_1(\mathrm{Rips}(B, r + 2\varepsilon)) & \longrightarrow & H_1(\mathrm{Rips}(B, r + 4\varepsilon)) & \longrightarrow \cdots
\end{array}
$$
(The two rows are connected by the diagonal maps $\varphi_r$, from the top row to the bottom row, and $\psi_r$, from the bottom row to the top row, each increasing the scale by $2\varepsilon$.)

We first define maps on the vertices of the Rips complexes:

• For each $a \in A$ choose $b_a \in B$ such that $d(a, b_a) \le \varepsilon$ and define $\varphi_r(a) = b_a$, $\forall r$.
• For each $b \in B$ choose $a_b \in A$ such that $d(b, a_b) \le \varepsilon$ and define [28] $\psi_r(b) = a_b$, $\forall r$.

[28] As defined, the maps $\varphi_r$ and $\psi_r$ do not define an interleaving between the Rips filtrations as in general $a_{b_a} \ne a$.

Given a 1-cycle $\alpha = \sum_i \langle a_i, a_{i+1} \rangle$ in $\mathrm{Rips}(A, r)$ define $\varphi_r([\alpha]) = [\sum_i \langle b_{a_i}, b_{a_{i+1}} \rangle]$. This gives well defined maps $\varphi_r$ (and also $\psi_r$) by the following arguments:

• $\sum_i \langle b_{a_i}, b_{a_{i+1}} \rangle$ is a cycle in $\mathrm{Rips}(B, r + 2\varepsilon)$ as $d(a_i, a_{i+1}) \le r$ implies $d(b_{a_i}, b_{a_{i+1}}) \le r + 2\varepsilon$.
• If $[\sum_i \langle a_i, a_{i+1} \rangle] = [\sum_i \langle a'_i, a'_{i+1} \rangle]$ holds [29], then $\varphi([\sum_i \langle a_i, a_{i+1} \rangle]) = \varphi([\sum_i \langle a'_i, a'_{i+1} \rangle])$ as well [30].

[29] This means $\sum_i \langle a_i, a_{i+1} \rangle - \sum_i \langle a'_i, a'_{i+1} \rangle = \partial \sum_j \langle x_j, y_j, z_j \rangle$.
[30] This holds as $\sum_i \langle b_{a_i}, b_{a_{i+1}} \rangle - \sum_i \langle b_{a'_i}, b_{a'_{i+1}} \rangle = \partial \sum_j \langle b_{x_j}, b_{y_j}, b_{z_j} \rangle$ and $\langle b_{x_j}, b_{y_j}, b_{z_j} \rangle$ are triangles in $\mathrm{Rips}(B, r + 2\varepsilon)$.

At last we need to show that
$$\Big[\sum_i \langle a_i, a_{i+1} \rangle\Big] = \Big[\sum_i \langle a''_i, a''_{i+1} \rangle\Big]$$
in $H_1(\mathrm{Rips}(A, r + 4\varepsilon))$, where $a''_i = a_{b_{a_i}}$. First note that $d(a_i, a''_i) \le 2\varepsilon$, $\forall i$. The difference $\sum_i \langle a_i, a_{i+1} \rangle - \sum_i \langle a''_i, a''_{i+1} \rangle$ is a boundary as demonstrated by the blue 2-chain in Figure 10.6.

10.3 Bottleneck distance and stability theorem

Figure 10.6: An excerpt from the proof of Proposition 10.2.9. Each edge connects points at distance at
Each edge connects points at distance at The many versions of the stability theorem for persistent homology most r + 2 #. state that persistent homology is continuous with respect to continuous change of the input parameters31. We have already seen examples 31 With various versions discussing various forms of input. of this sort: through Propositions 10.2.9 and 10.1.3 we can conclude that persistent homology behaves “continuously” in the interleaving distance. One of the main advantages of persistent homology is its visualization and so the final step towards a geometrically convenient form of the stability theorem is to interpret32 the interleaving dis-32 A brief idea about a transition from the interleaving distance to the tance in geometric terms as a distance on persistence diagrams33. The bottleneck distance is provided in resulting distance on persistence diagrams is called the bottleneck appendix. 33 distance. For this setting, the visualization with persistence diagrams is much preferred to the visualization with the barcodes. Bottleneck distance We start by explaining notions and setting needed to define the bottleneck distance. Suppose A = (a1, a2, . . . , am) and B = (b1, b2, . . . , bn) are persistence diagrams, i.e.: 146 introduction to persistent homology • each ai and bi is a point above the diagonal in the first quadrant in the plane, and • each point may appear multiple times in any of the diagrams. For a point v = (x, y) 2 R2 let34 ¯v = ((x + y)/2, (x + y)/2) 2 R2. 34 ¯v represents the point on the diagonal D = {(z, z) | z 2 R} which is A partial matching between A and B is a bijective map j : A0 ! B0 the closest to v in d• (and also in d2) where35 A0 ✓ A and B0 ✓ B. The matching distance of such a j is metric. 35 Again, a point can appear in defined as A0 or B0 multiple times but not more times n o than in A or B respectively. dM( j) = max max{d•(v, j(v))}, max {d•(v, ¯v)}, max {d•(v, ¯v)} . 
T Recall that the max distance v2A0 v2A\A0 v2B\B0 d•((x1, y1), (x2, y2)) between points in the plane is defined as Let µ(A, B) denote the collection of all partial matchings between A max and {|x1 x2|, |y1 y2|}. B. Definition 10.3.1. The bottleneck distance between persistence diagrams A and B is the minimal matching distance between them, i.e., dB(A, B) = min dM( j). j 2 µ(A,B) Figure 10.7: Examples of partial matchings between the red and the Examples of partial matchings are given in Figure 10.7. In order blue persistence diagrams with points unmatched by j being matched to to demonstrate the additional pairs used in the definition of the bot-the closest diagonal point. tleneck distance, the unmatched points are connected to the closest point on the diagonal. The matching with the smallest matching distance is the second from the left, a fact that can be verified in Figure T At this point it should become 10.8, which illustrates the matching distances for matchings of three apparent why it is geometrically convenient to consider points on diagrams of Figure 10.7. The d•(a, b) distance between points a and the diagonal represent the trivial b can be thought to represent one half of the side-length of the square persistence module. A side e↵ect of this approach is that any two points centered at a which has b on its boundary. The maximal length of on the diagonal represent the same such sides is the smallest in the second case and the resulting quantity trivial persistence module. In a way, is the bottleneck distance d the entire diagonal should thus be B. treated as a single point. persistent homology: stability theorem 147 Figure 10.8: The distances between the matched pairs are demonstrated by the squares arising as the balls of the d• metric. The diagram with the smallest maximal square dB amongst all matchings (even the ones not displayed here) is the middle one. Hence the resulting bottleneck distance dB arises from the middle diagram. 
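As a hedged illustration of Definition 10.3.1, the following Python sketch computes the bottleneck distance of two very small diagrams by brute force. It is not an algorithm taken from the text: the function names `d_inf`, `to_diagonal` and `bottleneck_brute_force` are hypothetical, and the encoding of unmatched points through extra "diagonal slots" is an assumption of this sketch. Every partial matching corresponds to a permutation of the enlarged index set, so minimising over permutations minimises the matching distance. The running time is factorial in the total number of points, so this is only meant for toy examples.

```python
from itertools import permutations


def d_inf(u, v):
    """Max distance between two points of the plane."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))


def to_diagonal(v):
    """d_inf distance from v = (x, y) to ((x + y) / 2, (x + y) / 2)."""
    return abs(v[1] - v[0]) / 2


def bottleneck_brute_force(A, B):
    """Bottleneck distance of two small diagrams (lists of (birth, death))."""
    m, n = len(A), len(B)
    size = m + n
    if size == 0:
        return 0.0

    # Rows 0..m-1 are points of A, rows m..m+n-1 are diagonal slots.
    # Columns 0..n-1 are points of B, columns n..n+m-1 are diagonal slots.
    def cost(i, j):
        if i < m and j < n:
            return d_inf(A[i], B[j])      # matched pair
        if i < m:
            return to_diagonal(A[i])      # A[i] is left unmatched
        if j < n:
            return to_diagonal(B[j])      # B[j] is left unmatched
        return 0.0                        # two unused diagonal slots

    best = float("inf")
    for perm in permutations(range(size)):
        worst = max(cost(i, perm[i]) for i in range(size))
        best = min(best, worst)
    return best


A = [(0, 4), (1, 2)]
B = [(0.5, 4.5)]
print(bottleneck_brute_force(A, B))  # 0.5
```

In this example the optimal partial matching pairs $(0, 4)$ with $(0.5, 4.5)$ and sends $(1, 2)$ to the diagonal; both terms equal $0.5$.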
Theorem 10.3.2 (Isometry theorem). The interleaving distance between persistence modules equals the bottleneck distance between the corresponding persistence diagrams.

Example 10.3.3. Let $A$ be the persistence diagram presented by the four blue points in Figure 10.9. If a persistence diagram $B$ satisfies $d_B(A, B) \le \varepsilon$, then $B$ consists of the following:

• For each blue point there exists one designated [36] red point within the grey square (i.e., the $\varepsilon$-ball in $d_\infty$) around it.
• Arbitrarily many points within the grey $\varepsilon$-band [37] along the diagonal.

Figure 10.9: A schematic representation of the $\varepsilon$-neighborhood of the diagram consisting of blue points as discussed in Example 10.3.3.

[36] If some of the squares had non-empty intersection, then within that intersection there might be more points of $B$, so a single square might contain more red points. However, one of them, potentially a diagonal point, has to be the designated one, i.e., the point to which the blue point in question is matched in an optimal matching.
[37] This band is actually the $\varepsilon$-neighborhood of the trivial (empty) persistence diagram. Again, this implies that the squares intersecting the band may contain more points.

Stability theorem

Theorem 10.3.4. [Stability theorem] Assume persistence diagrams $A$ and $B$ represent persistent homologies of filtrations $V$ and $W$ obtained by one of the following procedures:

1. As the sublevel filtrations of filtration functions $f$ and $g$ satisfying condition $\|f - g\|_\infty \le \varepsilon$, see Proposition 10.1.3.
2. As the Rips filtrations of metric spaces $X$ and $Y$ satisfying condition $d_{GH}(X, Y) \le \varepsilon/2$, see 1. of Proposition 10.2.9.
3. As the Čech filtrations of metric spaces $X$ and $Y$ satisfying condition $d_{GH}(X, Y) \le \varepsilon$, see 2. of Proposition 10.2.9.

Then $d_B(A, B) \le \varepsilon$.

[Side note] The stated version combines several separate versions of the stability theorem found throughout the literature by stating several different initial assumptions.
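Since $d_{GH}(X, Y) \le d_H(X, Y)$ for finite subsets of a common metric space, the Hausdorff distance of Definition 10.2.7 gives an easily computable value for which the hypotheses 2. and 3. above hold. The following Python sketch computes it for point clouds; it is only an illustration under the assumption that the ambient metric is Euclidean, and the function names `directed_radius` and `hausdorff` are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def directed_radius(A, B):
    """Smallest r such that the closed r-neighborhood of A contains B."""
    return max(min(dist(a, b) for a in A) for b in B)


def hausdorff(A, B):
    """Hausdorff distance of two finite non-empty point clouds."""
    r_A = directed_radius(A, B)  # N(A, r_A) contains B
    r_B = directed_radius(B, A)  # N(B, r_B) contains A
    return max(r_A, r_B)


A = [(0, 0), (1, 0)]
B = [(0, 0), (1, 0), (1, 2)]
print(hausdorff(A, B))  # 2.0
```

Here $r_A = 2$ (the point $(1, 2)$ is at distance $2$ from $A$) and $r_B = 0$, so $d_H(A, B) = 2$. Under the theorem above this would bound the bottleneck distance of the Rips diagrams by $4$ and that of the Čech diagrams by $2$.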
[Side note] While the presented results explain stability in terms of the bottleneck distance, there is another family of distances on persistence diagrams called the Wasserstein distances. For example, the 1-Wasserstein distance is obtained by defining the matching distance as the sum (rather than max) of individual terms. Under appropriate assumptions the persistence diagrams are also stable when using Wasserstein distances.

Figure 10.10 is a schematic representation of the discussion leading to the stability theorem as presented here.

The moral of the theorem is that small perturbations of the input lead to small changes in persistence diagrams. On the other hand, critical simplices and homology representatives may be unstable.

Figure 10.10: The diagram summarizing the stability theorem and the strategy of its proof that have been discussed. [Schematic: filtration functions with $\|f - g\|_\infty \le \varepsilon$ lead to an $\varepsilon$-interleaving of filtrations $V$ and $W$, which by functoriality yields an $\varepsilon$-interleaving of persistence modules; Rips filtrations with $d_{GH}(X, Y) \le \varepsilon/2$ and Čech filtrations with $d_{GH}(X, Y) \le \varepsilon$ also lead to an $\varepsilon$-interleaving of persistence modules; the isometry theorem then gives $d_B(A, B) \le \varepsilon$.]

10.4 Interpretations and examples

With the stability theorem, persistent homology may be thought of as a stable description of a geometric shape. Stability itself justifies the following observations:

• Given a geometric shape, ever better approximating point-clouds induce persistence diagrams ever closer [38] to (converging to) the persistence diagram of the shape.
• The points of higher persistence (i.e., the longer bars in the barcode) represent more stable [39] features and are thus typically deemed to be of higher importance, leading to simplification schemes on data, such as denoising.

[38] Given a closed manifold $X$, a sufficiently small scale $r$, and a point-cloud $S$ sufficiently close to $X$ in $d_{GH}$, the corresponding Rips and Čech complexes are homotopy equivalent to $X$. As a result, the persistent homology around scale $r$ of a good approximation reveals the homology of $X$.
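The simplification idea in the last observation can be sketched for the simplest case: 0-dimensional sublevel persistence of a function sampled at equally spaced points and connected into a path graph. The Python sketch below is an illustration under stated assumptions, not the procedure used to produce the figures of this chapter: the names `sublevel_persistence_0` and `prominent` are hypothetical, a vertex is assumed to enter at its function value and an edge as soon as both endpoints are present, and components are merged with a union-find structure where the younger component dies.

```python
def sublevel_persistence_0(values):
    """0-dimensional persistence pairs of the sublevel filtration of a path graph."""
    n = len(values)
    if n == 0:
        return []
    parent = [None] * n  # None marks a vertex that has not entered yet
    birth = {}           # root of a component -> birth value of the component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):  # the edge {i, j} enters once both ends exist
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The component born later is terminated by this edge.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((birth[find(0)], float("inf")))  # the component that never dies
    return pairs


def prominent(pairs, eps):
    """Keep the points farther than eps (in d_inf) from the diagonal."""
    return [(b, d) for (b, d) in pairs if (d - b) / 2 > eps]


samples = [1.0, 3.0, 0.0, 2.0, 1.9, 2.1, 4.0]
pairs = sublevel_persistence_0(samples)
print(pairs)                  # [(1.9, 2.0), (1.0, 3.0), (0.0, inf)]
print(prominent(pairs, 0.5))  # [(1.0, 3.0), (0.0, inf)]
```

The small dip at value $1.9$ produces a point close to the diagonal and is discarded by the threshold, while the two deeper minima survive.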
[39] I.e., they remain non-trivial under larger perturbations.

In this section we will present several examples [40] of persistent homology arising via Rips complexes and comment on their structure.

[40] Generated by Ripserer.jl, with coefficients in $\mathbb{Z}_2$.

1-dimensional persistence of geodesic spaces

Let $X$ be a closed [41] geodesic manifold or, more generally, the body of a finite simplicial complex equipped with a geodesic metric. Assume $S_n$ is a sequence [42] of finite metric spaces converging towards $X$ in the Gromov-Hausdorff metric. Let $A_n$ denote the 1-dimensional persistence diagram obtained from $S_n$ via Rips filtration and coefficients in $F$. It turns out that the limiting diagram [43] $A = \lim_{n \to \infty} A_n$ encodes a shortest base of $H_1(X; F)$: for each member [44] $\alpha$ of a shortest homology base of $X$ we obtain a bar $[0, |\alpha|/3)$, where $|\alpha|$ is the length of $\alpha$, see Figure 10.11. Without going through all the details let us demonstrate the situation through examples.

[41] This implies it admits a finite triangulation.
[42] Such a sequence may be, roughly speaking, obtained by constructing ever finer finite approximations of $X$ and inducing an approximation of a geodesic metric on them.
[43] $A$ can be obtained as the persistence diagram of the Rips filtration of $X$, a construction which involves infinite simplicial complexes and is formally beyond the scope of this book.
[44] Members are formally cycles whose length in this case is the length of the corresponding loop in $X$. One can choose a triangulation for which these simplicial loops are the shortest possible.

Figures 10.12 and 10.13 represent three surfaces in $\mathbb{R}^3$ approximated by a finite collection of points. An approximation of a geodesic metric is induced on the points and used to compute 1-dimensional persistent homology. The right part of the figures represents the longest one or two bars obtained from each of the computations.
Starting with a discrete set of points, a multitude of short bars is also generated, but these are an artefact of a finite approximation rather than topologically significant features. By the stability theorem their lifespans decrease (although their numbers increase) as we improve the approximation density.

Let us interpret the results:

1. Our chosen samples are dense enough for the longest bars to detect the shortest 1-dimensional homology bases, which in this case coincide for all coefficients.
2. The bars would ideally be born at 0 and run until [45] one-third of the lengths of the corresponding homology generators. With increased density the resulting barcode would approach this scenario.
3. The visualizations of approximating points also contain a loop or two: these are obtained by connecting the vertices of the critical triangles by the shortest paths through our points. For bars corresponding to the basis, this gives an approximation of the shortest homology basis. In the spirit of the stability theorem, the finer the approximation by points, the closer the approximation of the loops we obtain.
4. Going beyond the basis, we see that the next bar in Figure 10.13 detected a hole in our approximating points. The lifespan of this bar would decrease towards zero with ever better approximations.

[45] Žiga Virk. 1-dimensional intrinsic persistence of geodesic spaces. Journal of Topology and Analysis, 12(01):169–207, 2020. doi: 10.1142/S1793525319500444

Figure 10.11: The 1-dimensional persistent homology of a torus detects its shortest homology basis. [Labels in the figure: $\alpha$, $\beta$, $|\alpha|/3$, $|\beta|/3$.]

Figure 10.12: The longest bar of 1-dimensional persistent homology.

Figure 10.13: The longest two bars of 1-dimensional persistent homology.

Stability demonstrated

Figure 10.14 contains four approximations of a circle by discrete sets and the corresponding persistence diagrams arising from the Rips filtration.
The stability theorem states that the induced persistence diagrams should be close to each other, and the figure demonstrates this is indeed the case. A few comments on the diagrams:

1. There is a point $(0, 3)$ representing $(0, \infty)$, indicating the one persistent component.
2. The main dominating feature is the very persistent point terminating at around 1.5. It represents the 1-dimensional hole, i.e., the homology of $S^1$. Its precise coordinates tell us more about the geometry of the sample.
   • The birth is between 0 and 0.5. The precise birth depends on the edges of the Rips complex going “around the circle” [46]. The maximal gap that has to be bridged in order to go around the circle is the birth time. We can see that such a gap is smallest in the upper-right case, resulting in an early birth. On the other hand, the gap is largest in the upper-left case [47] and results in a later birth.
   • The terminal scale of this feature is the minimal diameter of an “almost equilateral triangle” reaching “around the circle”.
3. The other points on persistence diagrams are of low persistence and appear [48] as an artefact of discretization.

[46] ...and thus generating the 1-cycle.
[47] The gap of this sample appears in the upper-right part.
[48] For example, as the Rips complex on $n$ points is a discrete collection of $n$ points at small scales, each such diagram will have $n$ many points indicating persistent 0-dimensional homology.

Figure 10.14: Four approximations of a circle and the corresponding persistence diagrams.

Spheres

We next present examples approximating spheres. In Figure 10.15 we present persistence diagrams via Rips complexes of a sample of 100 points on unit spheres: $S^2$ (on left) and $S^3$ (on right); using Euclidean (top) or geodesic distance (bottom).

Figure 10.15: Persistence diagrams via Rips complexes of samples of one hundred points sampled from unit $S^2$ (on left) and $S^3$ (on right) using Euclidean or intrinsic (geodesic) distance.
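To make the difference between the two metrics concrete, the following Python sketch samples points from a unit sphere and evaluates both distances. It is a hedged illustration and not the procedure used to produce Figure 10.15: the function names are hypothetical, and the sampling method (normalising Gaussian vectors) and the seed are assumptions of this sketch. For unit vectors at angle $\theta$ the geodesic distance is $\theta$ and the Euclidean distance is the chord length $2 \sin(\theta/2)$.

```python
import math
import random


def sample_unit_sphere(n, dim, seed=0):
    """n points on the unit sphere S^dim, i.e. unit vectors in R^(dim + 1)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim + 1)]
        norm = math.sqrt(sum(x * x for x in v))
        points.append([x / norm for x in v])
    return points


def euclidean(u, v):
    """Chord length between two points of the sphere."""
    return math.dist(u, v)


def geodesic(u, v):
    """Angle between two unit vectors, i.e. the intrinsic distance."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding


points = sample_unit_sphere(100, 2)
u, v = points[0], points[1]
print(euclidean(u, v), 2 * math.sin(geodesic(u, v) / 2))  # the two values agree
```

Either function can be used to fill a distance matrix that is then passed to a Rips persistence computation.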
[Side note] In the persistence diagrams in Figure 10.15 there are a lot of short bars. Some of these are artefacts of discretization, others indicate a more complex structure of persistent homology reaching beyond the interpretation of the size of homology representatives of the underlying space. Interpreting such bars is a very active research topic.

As the only non-trivial homology (except for dimension 0) of $S^2$ is $H_2(S^2; F) \cong F$, we expect a long persistent line in that dimension, which is indeed the case. In fact, the long 2-dimensional bar clearly indicates that in both cases the most prominent homology is of rank 1 in dimension 2. The same holds for $S^3$ although a denser sample would have made the same observation easier in the geodesic case.

To demonstrate the improvement induced by larger density we present in Figure 10.17 a sequence of diagrams with increasing density. The underlying space is a cut-off unit sphere, i.e., a 2-dimensional sphere with a cap above the parallel of circumference approximately 1.5 removed, see Figure 10.16. We take a sample of 100, 200, and 400 points, generate a geodesic distance, and generate persistence diagrams via Rips filtrations.

Figure 10.16: A cut-off sphere as an underlying space leading to diagrams in Figure 10.17.

[49] See Example 10.1.1.
[50] ...and hence the persistence diagrams are not too different.

Here is what we would expect from the resulting persistence diagrams:

1. The cut-off sphere is contractible for small scales and thus the initial clutter of 1-dimensional bars should be decreasing in size as we increase density.
2. At a certain scale an offset of the cut-off sphere will fill in the top and create a void and hence [49] a 2-dimensional homology in the Čech complex. As Rips complexes are interleaved with Čech complexes [50], we might hope that the same 2-dimensional bar might appear in our case.
That is indeed the case and the mentioned long bar grows with increasing density of the sample.
3. As is frequently the case, there appear multitudes of short bars we choose to ignore at this point.

Figure 10.17: Four persistence diagrams via Rips complexes of cut-off sphere, based on samples of 100, 200, and 400 points.

De-noising a function

Suppose we are given an approximation of a function $f$ in the form of a discrete set of equally spaced measurements. We can connect the resulting points by edges and obtain a graph $G$ representing our measurements, see the left side of Figure 10.18. Suppose we want to extract the global behaviour of $f$ as in the center of Figure 10.18 by removing the local oscillations we consider to be noise. A way to achieve it would be to construct the sublevel [51] filtration of the simplicial complex $G$ and choose the threshold $\varepsilon$ for the noise level. We would then draw the corresponding 0-dimensional persistence diagram and ignore the points in the shaded $\varepsilon$-neighborhood [52] of the diagonal, see the right side of Figure 10.18. As a result we obtain two prominent points •, ∗ in the persistence diagram. The de-noised function presented in the center of Figure 10.18 can now be obtained by connecting the critical simplices corresponding to these points:

1. Blue birth simplices get connected to the higher endpoint of the red terminal simplex.
2. The only exception is ∗, which is not a terminal simplex, but gets added as the highest point in the graph in order to finalize [53] our approximation by connecting it to ∗.

[51] A vertex of $G$ appears at the function value it represents. An edge of $G$ appears as soon as both of its vertices appear.
[52] Each local minimum in our approximation except for • and ∗ generates a point in this neighborhood.
[53] The component represented by ∗ cannot get terminated as a homology element.
[Side note] By adjusting the threshold $\varepsilon$ we can adjust the level of details we want to preserve.

Figure 10.18: A noisy function, its reconstruction and the corresponding persistence diagram. The shaded region contains a multitude of points we choose to ignore in our reconstruction.

10.5 Concluding remarks

Recap (highlights) of this chapter

• continuous filtrations;
• persistence modules;
• interleaving;
• stability theorem.

Background and applications

There are three different proofs of the stability theorem in the literature. The initial proof was combinatorial [54] and did not include the isometry theorem or interleavings but rather just the continuity of persistence diagrams. It motivated a more algebraically oriented proof using interleavings [55]. The third and most direct proof uses explicit matching [56].

[54] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete & Computational Geometry, 37(1):103–120, 2007. doi: 10.1007/s00454-006-1276-5
[55] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J. Guibas, and Steve Y. Oudot. Proximity of persistence modules and their diagrams. In Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry, SCG ’09, pages 237–246, New York, NY, USA, 2009. ACM. doi: 10.1145/1542362.1542407
[56] Ulrich Bauer and Michael Lesnick. Induced matchings of barcodes and the algebraic stability of persistence. In Proceedings of the Thirtieth Annual Symposium on Computational Geometry, SOCG’14, pages 355–364, New York, NY, USA, 2014. ACM. doi: 10.1145/2582112.2582168

The existence of a decomposition of a persistence module into indecomposable parts is a particular case of a standard approach referred to as the Krull Remak Schmidt principle. The fact that the indecomposables are precisely the interval modules is a special case of Gabriel’s theorem. The fact that the indecomposable parts of multi-parameter persistent homology are not as simple as the interval modules is the major obstacle to exhaustive applications of multi-parameter persistence.

For a longer treatment of stability see a book [57]. There is also a recent treatment simplifying some of these ideas [58].

[57] Frédéric Chazal, Vin de Silva, Marc Glisse, and Steve Y. Oudot. The Structure and Stability of Persistence Modules. Springer Briefs in Mathematics. Springer, 2016. doi: 10.1007/978-3-319-42545-0
[58] Primož Škraba and Katharine Turner. Notes on an elementary proof for the stability of persistence diagrams, arXiv: 2103.10723, 2021

The material presented up to this point represents the core ideas and properties of persistent homology. From this point on the topics diverge, with some of the major motivations being:

• Theoretical treatment: persistent homology represents a parameterized version of homology and as such there are many ways to explore the structure further, either by generalizing the framework [59], determining what geometrical properties it encodes [60] and expanding the ideas into other theoretical contexts.

[59] For example, with zig-zag persistence, multi-parameter persistence, introduction of new constructions of complexes (Witness, selective Rips complex), etc.
[60] For example it encodes, at least to some degree, shortest homology bases, intrinsic volumes, geometric shapes, curvatures, dimension, etc.

• Practical treatment, mostly associated with data analysis: in this context persistent homology is often viewed as a stable shape descriptor. As a result considerable effort is being invested to incorporate persistent homology into the flow of data analysis, either by
adjusting it to specific data types [61], establishing meaningful probabilistic [62] and statistical [63] analysis, or extracting relevant features.

[61] Besides point-clouds, these include time series, high-dimensional data, dynamical systems, sensor networks, etc.
[62] It turns out there are significant phase transitions in persistent homology of random processes, etc.
[63] This is typically done by mapping persistence diagrams into a Hilbert space via any of the multitude of maps available, for example persistence landscapes, persistence silhouettes, persistence images, etc.

• Computational treatment: the aim of this context is to optimize the computational resources required to obtain persistence (or at least a part of it) by developing faster algorithms often incorporating various shortcuts (for example, the twist [64]) or additional structure [65].

[64] Chao Chen and Michael Kerber. Persistent homology computation with a twist. In Proceedings 27th European Workshop on Computational Geometry, volume 11, pages 197–200, 2011
[65] For example, when computing one-dimensional persistence of geodesic spaces.

Currently available software for computing persistent homology includes (but is not restricted to) Ripser and related Ripserer.jl, Ripser.py, and Cubical Ripser, Dionysus, PHAT, GUDHI, javaPlex, Perseus, Eirene, etc.

This list of topics and software is by no means exhaustive.

Appendix: From the interleaving distance to the bottleneck distance

In this appendix we will provide an explanation that leads to the bottleneck distance by determining the interleaving distance between pairs of interval modules.

Case 1: distance between an interval module and the zero persistence module. The situation is presented in Figures 10.19 and 10.20. For $0 \le p < q$ let us discuss the interleaving of the interval module $F_{p,q}$ (the bold portion of the figures) and the trivial persistence module (the grey portion below).

Figure 10.19: interval module $F_{p,q}$
is not #-interleaved with the trivial interval if # < (q p)/2. p q Figure 10.20: interval module Fp,q is #-interleaved with the trivial interval if # (q p)/2. persistent homology: stability theorem 155 • If the interleaving parameter was # < (q p)/2 as in Figure 10.19, the composition of the red diagonal maps: – is the trivial map as it factors through the trivial vector space p q below; – should have been identity on F by the interleaving condition, as its domain and target are in [p, q). " Figure 10.21: #-interleaving implies the orange part is non-trivial as the These two observations contradict each other hence the interleaving interleaving maps in the blue region parameter is at least (q p)/2. have to be non-trivial. • For # = (p q)/2 though, the composition of the diagonal maps p q increases the scale parameter by p q, and any such structure map of Fp,q is trivial, hence the #-interleaving consisting of trivial maps p0 q0 exists, see Figure 10.20. p q We conclude that the interleaving distance is # = (p q)/2. p0 q0 Case 2: general case. From Case 1 we conclude that for 0  Figure 10.22: Condition of Figure p0 < q0 the following holds: If Fp,q is #-interleaved with Fp0,q0 for 10.21 induces two shapes. The # < (p q)/2, then [p0, q0) [p + #, q #) see Figure 10.21. By interleaving distance is the larger # parameter these shapes induce. symmetry the opposite also holds: [p, q) [p0 + #, q0 #). It is easy to see these conditions are also sufficient. For minimal # for which p q these two conditions hold we obtain the #-interleaving by mapping the designated generator of Fp,q to the designated generator of Fp0,q0 whenever possible, with other maps being trivial, see Figure 10.23. p0 q0 " Figure 10.23: The interleaving for It should be apparent from Figure 10.22 that the # in question is # = max{|p q|, |p0 q0|}. The max{|p q|, |p0 q0|}. non-trivial maps are in the shaded region. We conclude that Fp,q and Fp0,q0 are max{|p q|, |p0 q0|} interleaved. 
However, Case 1 demonstrates that Fp,q and Fp0,q0 are also max{(p q)/2, (p0 q0)/2}-interleaved by the trivial maps, the interleaving distance between Fp,q and Fp0,q0 is min{max{|p q|, |p0 q0|}, max{(p q)/2, (p0 q0)/2}}. We now interpret the obtained distanced in the context of persistence diagrams representing interval modules. First note that ✓ ✓ p ◆◆ Figure 10.24: Matching a point to D. + q p + q (p q)/2 = d• (p, q), , 2 2 is the d• distance between (pq) and the diagonal D. Case 1. The interleaving distance between Fp,q and the zero persistence module is realized by matching point (p, q) to the closest point on the diagonal and computing the resulting d• distance, see Figure 10.24. Case 2. The distance between Fp,q and Fp0,q0 is the smaller of the Figure 10.25: Matching two points. following two: 156 introduction to persistent homology 1. Either max{|p q|, |p0 q0|} = d•((p, q), (p0, q0)), which is the d• distance between the points, see Figure 10.25. 2. Or max{(p q)/2, (p0 q0)/2} which can be interpreted as follows: match each of the points to the closest point on the diagonal D and take the maximal d• distance, see Figure 10.26. We have thus interpreted the interleaving distance between interval modules in the context of persistence modules and obtained the bottleneck distance for diagrams containing at most one point. The crucial ingredients of the interpretation are the matching and d•. Theorem 10.3.2 essentially states that an optimal interleaving between any pair of persistence modules essentially consists of such matchings: match some pairs of interval modules from both persistence modules, and then match the remaining interval modules to D. The interleaving distance (and thus the bottleneck distance) corresponds to the matching whose d•-distance of its maximal matching is minimal. Figure 10.26: Matching each of the two points to D. 
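The two cases of this appendix can be condensed into a few lines of code. The following is a minimal sketch, not part of any persistence library: the function names are hypothetical, an interval module $F_{p,q}$ is assumed to be given as a pair `(p, q)` with `p < q`, and passing no second interval stands for the zero persistence module.

```python
def d_inf(a, b):
    """d_infinity distance between two points of the plane."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def distance_to_diagonal(point):
    """d_infinity distance from (p, q) to the diagonal, i.e. (q - p) / 2."""
    p, q = point
    return (q - p) / 2


def interval_interleaving_distance(first, second=None):
    """Interleaving distance between two interval modules (hypothetical helper).

    first, second: pairs (p, q) with p < q; second=None means the zero module.
    """
    if second is None:
        # Case 1: match the single point to the diagonal.
        return distance_to_diagonal(first)
    # Case 2: either match the two points to each other,
    # or match each of them to the diagonal; take the cheaper option.
    match_points = d_inf(first, second)
    match_diagonal = max(distance_to_diagonal(first), distance_to_diagonal(second))
    return min(match_points, match_diagonal)


print(interval_interleaving_distance((0, 4)))            # Case 1
print(interval_interleaving_distance((0, 4), (1, 5)))    # points are close
print(interval_interleaving_distance((0, 1), (10, 11)))  # short, far apart bars
```

For two short bars that are far apart, matching both points to the diagonal is cheaper than matching them to each other, which is exactly why the minimum of the two options appears in the formula.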
11 Discrete Morse theory

Homology and persistent homology detect holes in spaces through the use of algebraic constructions: a simplicial complex generates a chain complex and the resulting homology construction detects holes. However, functions and vector fields also contain information about the topology of the domain. In the smooth setting this information is contained in critical points of functions and zeros of vector fields, a situation which is beautifully described by Morse theory.

In this chapter we will describe discrete Morse theory. As the name suggests we will delve into the world of discrete functions and discrete vector fields defined on simplicial complexes. Our main goal will be to describe how these encode homology, often leading to simplified representations and faster computations than the standard methods.

11.1 Motivation

We first recall the definition of elementary collapses.

Definition 11.1.1. A simplex in a simplicial complex is a free face if it is a proper face of precisely one simplex. This implies that the coface in question is a maximal simplex.
Let $K$ be a simplicial complex, $\sigma^{k-1} \subset \tau^k \in K$, and assume $\sigma$ is a free face in $K$. A removal $K \to K \setminus \{\sigma, \tau\}$ is called an elementary collapse. Complex $K$ is collapsible to a subcomplex $L \le K$ if there is a collapse (i.e., a sequence of elementary collapses) resulting in the subcomplex $L$. Complex $K$ is collapsible if it is collapsible to a point.

Side note: If $\sigma$ is a free face in a simplicial complex $K$ then its only coface is a maximal simplex in $K$.

Figure 11.1: An elementary collapse indicated by an arrow from $\sigma$ into $\tau$.

Remark 11.1.2. We have already proved in Lemma 3.4.6 that an elementary collapse results in a homotopy equivalent space. As a result, if a simplicial complex $K$ is collapsible to a subcomplex $L \le K$ then $L \simeq K$. In particular, each collapsible simplicial complex is contractible.
The converse does not hold as there exist contractible simplicial complexes without a free face, for example the Dunce hat (Figure 11.2) and Bing's house.

Figure 11.2: The Dunce hat is obtained by glueing the boundary of a disc along a circle: twice along one direction and once along the other direction. The obtained space can be triangulated but contains no free face, meaning it is not collapsible. However, it turns out to be contractible.

Given a simplicial complex $K$ it would be of interest to simplify (i.e. collapse) it as much as possible. This would, for example, simplify the computation of homology groups. One would go about such simplification by repeating the following sequence as long as possible: find a free face and perform the corresponding collapse. An example is given in Figure 11.3, where the collapses of the first three steps are indicated by the arrows. One can encode such a collapse by:

• drawing all the arrows [1] indicating collapses, or
• annotating simplices by numbers [2] so that the countdown-sequence encodes [3] the collapsing sequence.

Both of these encodings are demonstrated on the right side of Figure 11.3.

[1] An example of a discrete vector field.
[2] A rough example of a discrete Morse function.
[3] 10 collapses to 9; 8 collapses to 7, etc.

Figure 11.3: A simplification of a simplicial complex using elementary collapses and the encoding of the resulting collapse by arrows (discrete vector field) and annotations (discrete Morse function).

Eventually a collapsing sequence ends when there are no more free faces. At this point we can resort to another trick that will on one hand change the structure of a complex, yet still simplify its description in a way. Choose any simplex, declare it to be a critical simplex, remove it from the complex, and continue with collapsing. In the end we will form a "complex" consisting of critical simplices. The details of the construction will be described throughout this section.
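The simplification loop just described (collapse along free faces while possible, otherwise declare a simplex critical and continue) can be sketched in code. This is only an illustrative sketch under stated assumptions: simplices are represented as frozensets of mutually comparable vertex labels, the function names are hypothetical, and the simplex declared critical is always chosen to be top-dimensional so that what remains is still a simplicial complex. The text itself does not prescribe these choices.

```python
from itertools import combinations


def closure(maximal_simplices):
    """All non-empty faces of the given simplices, as frozensets of vertices."""
    K = set()
    for simplex in maximal_simplices:
        vertices = tuple(simplex)
        for r in range(1, len(vertices) + 1):
            K.update(frozenset(face) for face in combinations(vertices, r))
    return K


def order_key(simplex):
    """Deterministic ordering: by dimension, then by sorted vertex labels."""
    return (len(simplex), sorted(simplex))


def find_free_pair(K):
    """Return (sigma, tau) where sigma is a free face with unique coface tau, or None."""
    for sigma in sorted(K, key=order_key):
        cofaces = [tau for tau in K if sigma < tau]  # proper cofaces of sigma
        if len(cofaces) == 1:
            return sigma, cofaces[0]
    return None


def simplify(maximal_simplices):
    """Greedy simplification: returns (arrows, critical simplices)."""
    K = closure(maximal_simplices)
    arrows, critical = [], []
    while K:
        pair = find_free_pair(K)
        if pair is not None:
            sigma, tau = pair
            K -= {sigma, tau}           # elementary collapse
            arrows.append((sigma, tau))
        else:
            # No free face left: declare a top-dimensional simplex critical.
            top = max(K, key=order_key)
            K.remove(top)
            critical.append(top)
    return arrows, critical


# A full triangle collapses to a single critical vertex.
arrows, critical = simplify([("a", "b", "c")])
print(len(arrows), [sorted(s) for s in critical])

# The boundary of a triangle (a circle) needs one critical edge and one critical vertex.
arrows, critical = simplify([("a", "b"), ("b", "c"), ("a", "c")])
print(len(arrows), [sorted(s) for s in critical])
```

In both runs the alternating sum of the numbers of critical simplices by dimension equals the Euler characteristic of the input complex, since each recorded arrow removes two simplices of adjacent dimensions.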
At this point we only illustrate a geometric interpretation of this idea in terms of "stretching" simplices.

For our motivational purposes let us continue in Figure 11.4 with the example from Figure 11.3. We are left with a triangle. We choose one of its edges to be a critical edge and continue with collapsing. We can imagine that each collapse stretches the critical edge until, at the end, we are left with two critical simplices: an edge and a point jointly forming a circle. The resulting space is homotopy equivalent to our original simplicial complex of Figure 11.3 and has a simple representation, but is not a simplicial complex. However, its homology can be computed in the same way as simplicial homology, so in effect we have transformed the 1-dimensional boundary matrix from 6 columns to 1 column. For another example see Figure 11.5.

Figure 11.4: Declaring the red edge to be critical, we can collapse the other two edges and obtain a representation of a circle using only two critical "simplices".

Figure 11.5: Stretching critical simplices of a standard triangulation of the torus (on the left) along the indicated collapses results in a standard representation of a torus as a square with identified sides (on the right).

Side note: Critical simplices can be thought of as zeros of the resulting discrete vector field. We have already seen in the hairy ball theorem that there is a connection between zeros of smooth vector fields and topology of the domain.

11.2 Discrete Morse functions and discrete vector fields

We start by defining functions that encode the collapsing sequences and deformations.

Definition 11.2.1. Let $K$ be an abstract simplicial complex. A function $f \colon K \to \mathbb{R}$ is a discrete Morse function [DMF] if $\forall \sigma^k \in K$:

1. $e_1 = |\{\tau^{k-1} \in K \mid \tau \subset \sigma,\ f(\tau) \ge f(\sigma)\}| \le 1$ and
2. $e_2 = |\{\tau^{k+1} \in K \mid \tau \supset \sigma,\ f(\tau) \le f(\sigma)\}| \le 1$.

Side note: An abstract simplicial complex is a collection of simplices, hence a real function defined on it maps each simplex into a real number.

A function $g \colon K \to \mathbb{R}$ respects dimension [4] if for each $\sigma^{k-1} \subset \tau^k \in K$ we have $g(\sigma) < g(\tau)$. Such a function is a DMF. On the other hand, each DMF almost respects dimension in the sense [5] that for each simplex $\tau^k$ at most one exceptional facet and at most one exceptional coface of dimension $k+1$ are allowed. The following proposition demonstrates that the two exceptions cannot occur simultaneously.

[4] As an example think of $g(\sigma) = \dim(\sigma)$.
[5] Putting it differently, for each simplex the values of the function strictly decrease by passing to its faces with at most one exception, and the values of the function strictly increase by passing to its cofaces with at most one exception.

Proposition 11.2.2. Given the notation of Definition 11.2.1, either $e_1 = 0$ or $e_2 = 0$.

Proof. Aiming for a contradiction, assume that for $\sigma \in K$ and for vertices $v_1, v_2 \in K^{(0)}$ we have

$$f(\sigma) \ge f(\sigma \cup \{v_1\}) \ge f(\sigma \cup \{v_1, v_2\}), \qquad (11.1)$$

i.e., the simplex $\sigma \cup \{v_1\}$ has both an exceptional facet and an exceptional coface. But then $\sigma \cup \{v_2\} \in K$ and we have:

1. $f(\sigma \cup \{v_2\}) > f(\sigma)$ as the exceptional coface of $\sigma$ is $\sigma \cup \{v_1\}$.
2. $f(\sigma \cup \{v_2\}) < f(\sigma \cup \{v_1, v_2\})$ as the exceptional face of $\sigma \cup \{v_1, v_2\}$ is $\sigma \cup \{v_1\}$.

These two conclusions combine into $f(\sigma) < f(\sigma \cup \{v_1, v_2\})$, which contradicts equation (11.1).

Proposition 11.2.2 implies that simplices with exceptions form disjoint pairs. We will refer to such pairs as regular pairs. A regular pair consists of a simplex $\tau$ and its facet $\sigma$. It encodes an "arrow" $\sigma \to \tau$ in the sense of the motivational section and is thus presented as such, see Figure 11.6 for an example. A simplex without any exception [6] is called [7] a critical simplex. Given a DMF on a simplicial complex, each simplex is either critical or contained in a unique regular pair.

[6] I.e., for which $e_1 = e_2 = 0$.
[7] A critical simplex is not contained in any regular pair.

Figure 11.6: An example of a DMF and the resulting discrete vector field in blue.

Definition 11.2.3. Let $K$ be an abstract simplicial complex. A discrete vector field is a disjoint collection of pairs $(\sigma_i, \tau_i)$ of simplices from $K$ such that for each $i$ simplex $\sigma_i$ is a facet of $\tau_i$. Each pair of a discrete vector field is referred to as an arrow.

Side note: Proposition 11.2.4. Let $f$ be a DMF on a simplicial complex $K$. For each $i$ let $n_i$ denote the number of critical simplices of dimension $i$. Then $\chi = n_0 - n_1 + n_2 - \dots$.
Proof. Removing a regular pair of simplices does not change $\chi$ because the simplices are of adjacent dimensions.

The disjointness condition means that each simplex can be the member of at most one pair of a discrete vector field. The collection of regular pairs of a DMF forms [8] a discrete vector field, see Figure 11.6. A discrete vector field is called a gradient vector field [9] if it is induced by some DMF in this manner. The arrows constituting a discrete vector field will sometimes be referred [10] to as regular pairs.

[8] The converse is not true in general as we will explain in the next subsection.
[9] We will be omitting the adjective "discrete" when mentioning gradient vector fields.
[10] The reason is twofold: to emphasize that the pair is a part of the structure of a discrete vector field, and to stress the analogy with regular pairs of a DMF.

Gradient vector fields

Definition 11.2.5. Let $K$ be a simplicial complex and $p \in \mathbb{N}$. Given a discrete vector field on $K$ consisting of pairs $\{(\sigma_i, \tau_i)\}_{i \in J}$, a $p$-path is a sequence

$$\sigma^{p-1}_{i_1} \to \tau^{p}_{i_1} \leftarrow \sigma^{p-1}_{i_2} \to \tau^{p}_{i_2} \leftarrow \sigma^{p-1}_{i_3} \to \cdots \to \tau^{p}_{i_k} \leftarrow \sigma^{p-1}_{i_{k+1}}$$

such that:

• $(\sigma^{p-1}_{i_j}, \tau^{p}_{i_j})$ is an arrow in the discrete vector field for each $j \le k$, and
• $\sigma^{p-1}_{i_j}$ is a facet of $\tau^{p}_{i_{j-1}}$, different from $\sigma^{p-1}_{i_{j-1}}$, for each $j \ge 2$.

Such a $p$-path is a cycle if $\sigma^{p-1}_{i_1} = \sigma^{p-1}_{i_{k+1}}$ and $k \ge 1$. A discrete vector field is acyclic if it admits no cycle.

A few observations concerning Definition 11.2.5:

1. A critical simplex can only appear as the last simplex of a $p$-path in a discrete vector field.
2. Given a DMF $f$, function values decrease along any $p$-path in the induced discrete vector field, i.e.:
$$f(\sigma_{i_j}) \ge f(\tau_{i_j}) > f(\sigma_{i_{j+1}}), \quad \forall j.$$
In particular, $f(\sigma_{i_1}) > f(\sigma_{i_m})$ for all $m > 1$.
3. Observation 2 implies that each gradient vector field is acyclic. The following theorem proves the converse.

Figure 11.7: A 2-path in blue ending in a critical edge, a 2-path in red ending in a non-critical edge, and a 1-path in orange ending in a critical vertex.

Theorem 11.2.6. Each acyclic discrete vector field on a simplicial complex $K$ is a gradient vector field, i.e. it is induced by some DMF.

A proof is given in the appendix. As a result we obtain the following theorem.

Theorem 11.2.7. A discrete vector field is a gradient vector field iff it is acyclic.

Side note: Different DMFs on a simplicial complex $K$ may induce the same discrete vector field. For example, if $f$ is a DMF, then so are $e^f$ and $3f - 5$, and all of them induce the same discrete vector field. Our primary interest in discrete vector fields lies in their encodings of collapses and deformations (simplifications) of a simplicial complex. A DMF represents a convenient but not unique way of encoding a discrete vector field.

We conclude this section by demonstrating how acyclic discrete vector fields encode collapses.

Proposition 11.2.8. Suppose the critical simplices of an acyclic discrete vector field on $K$ form a subcomplex $L \le K$. Then there exists a collapse $K \to L$ and thus $K \simeq L$.

Side note: Corollary 11.2.9. If an acyclic discrete vector field on $K$ has a single critical simplex, then that simplex is a vertex and $K$ is collapsible.
Proof. The statement follows directly from Proposition 11.2.8.

Proof of Proposition 11.2.8. We claim there exists a regular pair $(\sigma, \tau)$ such that $\sigma$ is a free face. Assuming for a moment this claim is true, we can remove the pair $(\sigma, \tau)$ by performing an elementary collapse and proceed by using the claim on the resulting complex. Thus the inductive argument and the claim suffice to prove the proposition.
We now turn our attention to proving the claim. Let $n$ denote the maximal dimension of a simplex in $K \setminus L$. There exists an $n$-path in the discrete vector field. Take a maximal [11] such path and let $\sigma \to \tau$ denote the first regular pair in it. Simplex $\sigma$ is a free face by the following argument:

• $\sigma \le \tau$ as the simplices form a regular pair.
• If $\sigma$ was a facet of another simplex $\tau'$ in $K \setminus L$, then the $n$-simplex $\tau'$ would be contained in another regular pair [12] $(\sigma', \tau')$, which could be used to prolong our $n$-path. This contradicts the maximality of the chosen $n$-path.
• If $\sigma$ was a facet of another simplex $\tau'$ in $L$, then $\sigma \in L$ as $L$ is a subcomplex, a contradiction.

[11] Such a path contains a first regular pair because the discrete vector field is acyclic.
[12] There are no $(n+1)$-simplices in $K \setminus L$.

11.3 Morse homology

In this section we will explain the procedure that leads to the computation of homology from a gradient vector field. The geometric idea behind the theory was presented at the beginning of this chapter: collapsing regular pairs to stretch critical simplices, with the resulting space having the same homotopy type as the original simplicial complex but fewer "simplices". In our treatment we will refrain [13] from formally defining the resulting space and instead construct the resulting chain complex directly. However, it might still be helpful to keep the geometric idea in mind to help navigate the algebraic construction.

[13] A formal definition of the resulting spaces would require a significant amount of additional material from algebraic topology. This would include a formal treatment of CW complexes, i.e., spaces obtained by inductively glueing discs. We have actually mentioned several presentations of such constructions when presenting the torus, Klein bottle and projective plane by drawing a square with identifications along the edges, when defining the dunce hat, and in one of the previous appendices in the context of relative homology.

Morse chain complex

For the rest of this section we fix a simplicial complex $K$, a gradient vector field on $K$, and an (algebraic) field $F$ to provide efficient algebraic constructions. For each $i$ let $n_i$ denote the number of critical $i$-simplices.

Definition 11.3.1. Let $p \in \{0, 1, \dots\}$. A Morse $p$-chain is a formal sum $\sum_{i=1}^{n_p} \lambda_i \sigma_i^p$ with $\lambda_i \in F$ and $\sigma_i^p$ being an oriented critical simplex of dimension $p$ in $K$ for each $i$.
The $p$-dimensional Morse chain group $C_p$ is the vector space consisting of all Morse $p$-chains.

Side note: As with the usual homology, multiplying an oriented simplex by $-1$ changes its orientation.

Observe that $C_p \cong F^{n_p}$. In order to obtain a chain complex we also need to define boundary maps. These are based on oriented [14] paths in discrete vector fields.

[14] Paths in a discrete vector field are directed by definition. The adjective "oriented" refers to the fact that the simplices forming the path are oriented in a certain way.

Definition 11.3.2. Let $p \in \{0, 1, \dots\}$. An oriented $p$-path from an oriented simplex $\sigma_1^{p-1}$ to an oriented simplex $\sigma_{k+1}^{p-1}$ is a $p$-path

$$\sigma_1^{p-1} \to \tau_1^{p} \leftarrow \sigma_2^{p-1} \to \tau_2^{p} \leftarrow \sigma_3^{p-1} \to \cdots \to \tau_k^{p} \leftarrow \sigma_{k+1}^{p-1}$$

consisting of oriented simplices, such that for each $j$ the orientation induced by $\tau_j$ on its facets:

1. matches $\sigma_j$, and
2. does not match $\sigma_{j+1}$.

Side note: One could say that the simplices $\tau_i$ in an oriented $p$-path are oriented consistently along the path.

Given an oriented critical $p$-simplex $\tau$, let $\partial(\tau)$ denote the collection of all of its facets with the induced orientation arising from $\tau$.
For each [15] oriented critical $(p-1)$-simplex $\sigma$ define

$$\alpha_{\tau,\sigma} = \sum_{\sigma' \in \partial(\tau)} |\{\text{oriented paths from } \sigma' \text{ to } \sigma\}|$$

as the number of oriented $p$-paths from elements of $\partial(\tau)$ to $\sigma$.

[15] Given an oriented critical $(p-1)$-simplex $\sigma$ observe that $\alpha_{\tau,\sigma}$ counts different paths than $\alpha_{\tau,-\sigma}$.

Definition 11.3.3. The boundary map $\partial$ of the Morse chain complex is defined as follows: for each oriented critical $p$-simplex $\tau$ define

$$\partial_p \tau = \sum_{i=1}^{n_{p-1}} (\alpha_{\tau,\sigma_i} - \alpha_{\tau,-\sigma_i})\, \sigma_i,$$

where $\sigma_1, \dots, \sigma_{n_{p-1}}$ are the critical $(p-1)$-simplices with a fixed orientation.

Side note: The oriented paths constituting the boundary map model how arrows stretch the boundary of a critical $p$-simplex towards critical $(p-1)$-simplices.

Examples will be provided below when demonstrating the computation of Morse homology, see also Figures 11.8 and 11.9. It turns out that $\partial^2 = 0$.

Definition 11.3.4. The Morse chain complex is the chain complex defined as

$$\cdots \xrightarrow{\partial} C_n \xrightarrow{\partial} C_{n-1} \xrightarrow{\partial} \cdots \xrightarrow{\partial} C_1 \xrightarrow{\partial} C_0 \xrightarrow{\partial} 0.$$

Morse homology

We may now define Morse homology as the homology arising from the Morse chain complex.

Definition 11.3.5. Let $p \in \{0, 1, \dots\}$. The Morse homology of a gradient vector field on $K$ is defined as

$$H_p^{\mathrm{Morse}}(K; F) = \ker \partial_p / \operatorname{Im} \partial_{p+1}.$$

Theorem 11.3.6. Morse homology is isomorphic to the standard (simplicial) homology:

$$H_p^{\mathrm{Morse}}(K; F) \cong H_p(K; F).$$

Corollary 11.3.7 (Weak Morse inequalities). For each $p$ the number of critical $p$-simplices is greater than or equal to the corresponding Betti number: $n_p \ge \beta_p$.

Example 11.3.8. Given the situation of Figure 11.8 there is one critical edge $\langle b, e \rangle$ and one critical vertex $\langle a \rangle$. Thus $C_1 \cong C_0 \cong F$ with the other Morse chain groups being trivial. Let us determine $\partial \langle b, e \rangle$:

1. $\partial(\langle b, e \rangle) = \{\langle e \rangle, -\langle b \rangle\}$.
2. There is one oriented 1-path from $\langle e \rangle$ to $\langle a \rangle$: $\langle e \rangle \to \langle e, a \rangle \leftarrow \langle a \rangle$.
3. In a similar fashion there is one oriented 1-path from $-\langle b \rangle$ to $-\langle a \rangle$.
4. Observations 2 and 3 imply $\alpha_{\langle b,e \rangle, \langle a \rangle} = 1$ and $\alpha_{\langle b,e \rangle, -\langle a \rangle} = 1$.
5. $\partial \langle b, e \rangle = (\alpha_{\langle b,e \rangle, \langle a \rangle} - \alpha_{\langle b,e \rangle, -\langle a \rangle}) \cdot \langle a \rangle = 0$.

Figure 11.8: A gradient vector field on a simplicial complex. Critical edges are colored in red. There is one oriented 1-path from $\langle e \rangle$ to $\langle a \rangle$ and one oriented 1-path from $-\langle b \rangle$ to $-\langle a \rangle$.

The resulting Morse chain complex is of the form

$$\cdots \to 0 \to F \xrightarrow{0} F \to 0.$$

The resulting homology is trivial in dimensions two and above, and nontrivial below: $H_0(K) \cong H_1(K) \cong F$.

Example 11.3.9. Let us compute the Morse homology of a torus.

Figure 11.9: A triangulation of a torus, a gradient vector field, and paths generating the Morse boundary from Example 11.3.9.

Given the triangulation and the gradient vector field on a torus presented on the leftmost part of Figure 11.9 we determine the following critical simplices:

• critical vertex $x$ in purple;
• critical edges $a$ (red) and $b$ (blue);
• critical triangle $\tau$.

We orient the critical simplices according to the visualizations in the other parts of Figure 11.9. The resulting Morse chain complex is of the form

$$\cdots \to 0 \to F \to F^2 \to F \to 0.$$

We next determine the Morse boundary of $\tau$. Only two simplices of $\partial(\tau)$ are the starting simplices of oriented 2-paths ending in a critical edge:

• From the diagonal edge of $\partial(\tau)$ there are two oriented 2-paths [16] to critical edges [17] $-\langle a \rangle$ and $-\langle b \rangle$.
• From the top edge of $\partial(\tau)$ there are two oriented 2-paths [18] to critical edges [19] $\langle a \rangle$ and $\langle b \rangle$.
• There are oriented 2-paths starting in the vertical edge of $\partial(\tau)$ but none of them ends in a critical edge.

[16] Drawn in black on the center-left part of Figure 11.9.
[17] The center-right part of Figure 11.9 contains opaque green arrows indicating the orientations of the edges contained in the oriented 2-paths. The terminal edges of the oriented 2-paths are $-\langle a \rangle$ and $-\langle b \rangle$.
[18] Drawn in black on the rightmost part of Figure 11.9.
[19] The two oriented 2-paths differ only in the last simplex.

Combining these three cases we conclude

$$\partial(\tau) = -\langle a \rangle - \langle b \rangle + \langle a \rangle + \langle b \rangle = 0.$$

In a similar way we conclude that $\partial a = \partial b = 0$. The resulting Morse chain complex is of the form

$$\cdots \to 0 \to F \xrightarrow{0} F^2 \xrightarrow{0} F \to 0.$$

The resulting homology is trivial in dimensions three and above, and nontrivial below: $H_0(K) \cong H_2(K) \cong F$ and $H_1(K) \cong F^2$.

Figure 11.10: In this figure we provide a geometric justification for the way the orientation carries forward through oriented 2-paths. The bottom-right part is a snapshot from the center-right part of Figure 11.9 indicating how the orientation of the diagonal edge carries on through the arrow to the other two edges of the triangle. The first three parts of this figure indicate how such an orientation on the two edges is obtained by deforming the oriented diagonal edge along the arrow of a discrete vector field.

Generating DMFs and gradient vector fields

Using discrete Morse theory depends on the ability to generate DMFs and gradient vector fields with as few critical simplices as possible. The weak Morse inequalities show that the lower bounds for the numbers of critical simplices are the Betti numbers. A DMF on a simplicial complex is perfect if for each $p$ the number of critical $p$-simplices coincides with the $p$th Betti number. In terms of the numbers of critical simplices, perfect DMFs are optimal DMFs. Not every simplicial complex admits a perfect DMF: an example is the Dunce hat.

There is a simple algorithm to generate a perfect DMF on a graph. For each component generate a gradient vector field as follows:

• Find a spanning tree.
• Choose a critical vertex.
• Define a gradient vector field pointing towards the critical vertex along the spanning tree, see Figure 11.11.

The mentioned construction can be generalized to higher dimensional simplicial complexes: keep adding arrows while making sure that the acyclicity condition is preserved.
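The spanning-tree construction for graphs described above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is simple and given as a list of vertices and a list of edges (pairs of vertices, each edge listed once), the spanning tree is found by breadth-first search, the first vertex encountered in each component is taken as the critical vertex, and the function name is hypothetical.

```python
from collections import deque


def graph_gradient_field(vertices, edges):
    """Gradient vector field on a graph via spanning trees (illustrative sketch).

    Returns (arrows, critical_vertices, critical_edges), where each arrow is a
    pair (vertex, edge) pointing along the tree towards the critical vertex.
    """
    adjacency = {v: [] for v in vertices}
    for edge in edges:
        u, w = edge
        adjacency[u].append((w, edge))
        adjacency[w].append((u, edge))

    arrows, critical_vertices = [], []
    tree_edges, seen = set(), set()
    for root in vertices:
        if root in seen:
            continue
        # New component: its first vertex becomes the critical vertex.
        critical_vertices.append(root)
        seen.add(root)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w, edge in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    tree_edges.add(edge)
                    arrows.append((w, edge))  # w is paired with the edge towards its parent
                    queue.append(w)

    # Edges outside the spanning forest remain unpaired, i.e. critical.
    critical_edges = [edge for edge in edges if edge not in tree_edges]
    return arrows, critical_vertices, critical_edges


# A square with one diagonal: one component and two independent cycles.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
arrows, critical_vertices, critical_edges = graph_gradient_field(vertices, edges)
print(critical_vertices, critical_edges)
```

For a graph with $V$ vertices, $E$ edges and $c$ components this yields $c$ critical vertices and $E - V + c$ critical edges, matching the Betti numbers $\beta_0$ and $\beta_1$, so the resulting gradient vector field is perfect.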
For higher-dimensional complexes, however, better results than those of such greedy arrow-adding are typically obtained through more elaborate designs.

Figure 11.11: A graph (left) and a blue spanning tree (right) with a gradient vector field pointing to the chosen critical vertex (red). The edges not contained in the tree (red) are the critical edges.

11.4 Concluding remarks

Recap (highlights) of this chapter

• Discrete Morse functions
• Gradient vector fields
• Morse homology

Background and applications

Smooth Morse theory was developed [20] by Marston Morse in the first part of the twentieth century. Amongst its results it relates critical points and gradient flows of a generic function on a manifold to the homology of the manifold. Its discrete version [21] was introduced at the turn of the millennium. The past two decades saw a considerable development of discrete Morse theory from various directions, including computational aspects, developing analogies between discrete and smooth results, etc. For a textbook presenting the smooth and discrete points of view see a book [22], for recent applications see a book [23]. An algorithm for generating a discrete Morse function is given in a paper [24].

An echo of the smooth Morse theory is the hairy ball theorem: the topology of a domain is connected to zeros of vector fields and thus to extrema of functions. In a similar way, an echo of the discrete Morse theory is our proof of the Euler-Poincaré formula, where we essentially only counted the maxima of the x-coordinate function. Theorem 11.2.6 is a discrete variant of the assigning of a potential function to a vector field. Generalized discrete Morse theories can be used to prove [25] that Čech complexes collapse onto alpha complexes in Euclidean spaces.

Several computer programs use discrete Morse theory to a different degree to assist [26] with computations of homology. The theory can also be used as a preprocessing tool or a framework within which to analyze discrete functions.

[20] Marston Morse. The Calculus of Variations in the Large. American Mathematical Society Colloquium Publication. Vol. 18. New York, 1934
[21] Robin Forman. A user's guide to discrete Morse theory. Sém. Lothar. Combin., 48, 12 2001
[22] Kevin P. Knudson. Morse Theory: Smooth and Discrete. World Scientific, 2015. doi: 10.1142/9360
[23] Tamal K. Dey and Yusu Wang. Computational Topology for Data Analysis. Cambridge: Cambridge University Press, 2022. doi: 10.1017/9781009099950
[24] Henry King, Kevin Knudson, and Neža Mramor. Generating Discrete Morse Functions from Point Data. Experiment. Math. 14 (4) 435–444, 2005
[25] Ulrich Bauer and Herbert Edelsbrunner. The Morse theory of Čech and Delaunay complexes. Transactions of the American Mathematical Society, 369:1, 06 2016. doi: 10.1090/tran/6991
[26] For example, using simplification using emergent pairs in Ripser. On the other hand Perseus is actually based on a discrete Morse theory.

A proof of Theorem 11.2.6

We first introduce some preliminary notions. Given a simplicial complex $K$ the Hasse diagram of $K$ is a directed graph defined as follows:

1. The nodes are the simplices of $K$;
2. Directed edges correspond to pairs (simplex, a facet). In particular, each $n$-simplex is the source of $n + 1$ directed edges.

An example is provided in Figure 11.12. The directed edges in the graph represent the containment of a facet.

Figure 11.12: A simplicial complex and its Hasse diagram. Hasse diagrams are typically drawn in levels corresponding to the dimensions of simplices.

Given an empty discrete vector field on a simplicial complex $K$, the directed edges also represent the direction of descent of the corresponding DMF. In this trivial case, all the simplices are critical, there are no exceptions and the DMF respects dimension. An obvious choice of a DMF in this case is the dimension function of a simplex. For illustrative purposes let us discuss how we could obtain such a function from the Hasse diagram in an inductive manner:

• Assign the smallest value, say 0, to all minimal nodes of a directed graph;
• Assign the second smallest value, say 1, to all nodes whose lower set [27] has already been enumerated;
• Proceed by induction: in step number $n$ assign the $n$th smallest value, say $n - 1$, to all nodes whose lower set has already been enumerated.

[27] The lower set of a node $\sigma$ is the collection of all nodes which appear as the target of a directed edge starting at $\sigma$.

This inductive construction of a function works for any acyclic directed graph and will be used in our eventual proof.

Given a non-trivial discrete vector field on a simplicial complex $K$ we define a modified Hasse diagram of $K$ by reverting the direction of the directed edges corresponding to the regular pairs, see Figure 11.13 for an example. A modified Hasse diagram encodes the sufficient conditions on a DMF to generate the initial discrete vector field. The above inductive procedure on such a diagram will produce a required DMF iff the diagram itself is acyclic as a directed graph.

Figure 11.13: The modified Hasse diagram, the reverted edges are red.

Lemma 11.4.1. The modified Hasse diagram of an acyclic discrete vector field is acyclic.

Proof. Since each simplex can be a member of at most one regular pair in a discrete vector field, a cycle in the modified Hasse diagram $H$ cannot contain consecutive directed edges corresponding to regular pairs. As directed edges of $H$ either end in a simplex of dimension 1 higher (in the case of regular pairs) or lower (in the case of edges encoding the facet relation) than the dimension of the initial simplex, any cycle in $H$ has to be an alternating concatenation of these two types.
As a result, a cycle in H corresponds to a p-cycle in the initial discrete vector field, which is non-existent by the main assumption.

A proof of Theorem 11.2.6. Given an acyclic discrete vector field, the corresponding modified Hasse diagram is acyclic by Lemma 11.4.1. Thus the inductive procedure above results in a suitable DMF, see Figure 11.14.

[Figure 11.14: The modified Hasse diagram and the DMF (in red) constructed by the inductive procedure. Values on the nodes: ⟨b, d, e⟩: 4; ⟨b, d⟩: 3, ⟨b, e⟩: 3, ⟨d, e⟩: 5, ⟨c, d⟩: 5, ⟨a, e⟩: 1, ⟨a, b⟩: 1; ⟨b⟩: 2, ⟨d⟩: 4, ⟨e⟩: 2, ⟨c⟩: 6, ⟨a⟩: 0.]

Index

Abelian group, 81
abstract simplex, 37
abstract simplicial complex, 37
affine combination, 33
affine hull, 33
affine independence, 34
Alexander duality, 102
alpha complex, 65
annotation, 138

ball, 12
barcode, 121, 125
barycentric coordinates, 21, 34
barycentric subdivision, 20
basis, 79
Betti number, 89
birth simplex, 129
body, 36
bottleneck distance, 146
boundary, 89
boundary map, 87
boundary of a manifold, 48
boundary point, 48
Brouwer fixed point, 111
Brouwer fixed point theorem, 111

chain, 86
chain complex, 88
chain group, 86
classification algorithm for surfaces, 55
closed manifold, 48
closed surface, 48
coface, 35
collapse, 157
column echelon form, 93
combinatorial manifold, 49
component, 17
connected space, 17
connected sum, 52
consistent orientation, 51
continuous filtration, 138
continuous map, 13
contractible space, 16
convex combination, 21
convex hull, 21
convex set, 21
critical scale, 139, 141
critical simplex, 160
cubical complex, 106
cycle, 89
Čech complex, 61
Čech filtration, 62

Dante, 57
Delaunay triangulation, 25
diameter, 59
dimension of a simplex, 35
dimension of a simplicial complex, 36
direct sum, 142
disc, 14
discrete Morse function, 159
discrete vector field, 160
distance, 11
Dowker duality, 73

elder rule, 122
elementary collapse, 44, 96, 157
elementary divisors, 104
Euler characteristic, 22, 41, 97
Euler-Poincaré formula, 22
exact sequence, 114

face, 35
facet, 35
field, 75
fields of remainders, 76
filtration, 123
filtration function, 138
free face, 157
full simplex, 98
functoriality, 111
fundamental class, 99

geodesic metric, 12
geometric realization, 38
geometric simplex, 35
geometric simplicial complex, 35
gradient vector field, 160
Gromov-Hausdorff distance, 144
group, 81

hairy ball theorem, 112
Hausdorff distance, 143
homeomorphism, 14
homology group, 89
homology representative, 94
homomorphism, 82
homotopic maps, 15
homotopy, 15
homotopy equivalence, 16
homotopy equivalent spaces, 16

image, 79
incremental expansion, 96
induced maps, 110
induced orientation, 51
Inscribed angle theorem, 28
interior of a manifold, 48
interior point, 48
interleaving, 68
interleaving distance, 140, 143
interval module, 142
isometry, 13
isomorphism, 79, 82, 142

Jaccard distance, 12
Jung's theorem, 62

kernel, 79
Klein bottle, 48, 100

line sweep, 23
link, 37
locally Delaunay, 26

manifold, 48
mapper, 66
matching distance, 146
MaxMin edge, 27
Mayer-Vietoris exact sequence, 115
metric, 11
metric space, 11
MiniBall algorithm, 70
Moebius band, 16
Morse homology, 163
multiplicity, 126

nerve, 62
nerve theorem, 64

offsets, 139
orientable surface, 51
oriented simplex, 50
oriented triangulation, 51

p-path, 160
partial matching, 146
path, 13
path component, 17
path connected space, 17
perfect DMF, 165
persistence diagram, 126
persistence module, 141
persistent Betti numbers, 123
persistent homology, 123
pivot, 128
projective plane, 48

quotient, 76

rank, 83
regular pairs, 160
relative homology, 119
representing cycles, 94
retraction, 111
Rips complex, 59
Rips filtration, 60
row echelon form, 92

simplicial approximation, 43
simplicial map, 42
skeleton, 36
Smith normal form, 94, 104
sphere, 14
star, 37
subcomplex, 36
subdivision, 36
sublevel filtration, 138
surface, 48

tangent vector field, 112
terminal simplex, 129
torus, 39
triangulation, 19, 36

vector space, 78
Vietoris complex, 73
Voronoi diagram, 25
Voronoi region, 24

Wasserstein distances, 147

zig-zag lemma, 117
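Supplementary sketch. The inductive enumeration used in the proof of Theorem 11.2.6 (assign 0 to the minimal nodes of the modified Hasse diagram, then in each step assign the next value to every node whose lower set has already been enumerated) can be tried out in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's implementation: the function names `modified_hasse` and `layered_values`, the encoding of simplices as sorted tuples of vertex labels, and both example vector fields are hypothetical choices made for illustration.

```python
from itertools import combinations


def modified_hasse(simplices, regular_pairs):
    """Return {node: set of targets} for the modified Hasse diagram.

    Ordinary edges point from a simplex to each of its facets. For every
    regular pair (facet, coface) the edge is reversed: facet -> coface.
    Simplices are sorted tuples of vertex labels (an assumed encoding).
    """
    nodes = set(simplices)
    reversed_edges = set(regular_pairs)
    targets = {s: set() for s in nodes}
    for s in nodes:
        if len(s) < 2:
            continue  # vertices have no facets
        for facet in combinations(s, len(s) - 1):
            if (facet, s) in reversed_edges:
                targets[facet].add(s)
            else:
                targets[s].add(facet)
    return targets


def layered_values(targets):
    """Run the inductive enumeration on a directed graph.

    In step n (counting from 0) every not yet enumerated node whose lower
    set is already enumerated receives the value n. Returns None if the
    procedure gets stuck, which happens exactly when the graph has a cycle.
    """
    values = {}
    step = 0
    while len(values) < len(targets):
        # Collect the whole layer first, then assign, so that nodes of the
        # same step do not unlock each other.
        ready = [v for v in targets
                 if v not in values and all(t in values for t in targets[v])]
        if not ready:
            return None  # a directed cycle blocks the enumeration
        for v in ready:
            values[v] = step
        step += 1
    return values


# Hypothetical acyclic example: a triangle <b, d, e> with three extra edges.
simplices = [("a",), ("b",), ("c",), ("d",), ("e",),
             ("a", "b"), ("a", "e"), ("b", "d"), ("b", "e"),
             ("c", "d"), ("d", "e"), ("b", "d", "e")]
regular_pairs = [(("b",), ("a", "b")), (("e",), ("a", "e")),
                 (("d",), ("b", "d")), (("c",), ("c", "d")),
                 (("d", "e"), ("b", "d", "e"))]
graph = modified_hasse(simplices, regular_pairs)
values = layered_values(graph)

# Hypothetical cyclic example: a vector field circulating around the
# boundary of a triangle, so the enumeration cannot start.
cycle_simplices = [("a",), ("b",), ("c",), ("a", "b"), ("a", "c"), ("b", "c")]
cycle_pairs = [(("a",), ("a", "b")), (("b",), ("b", "c")), (("c",), ("a", "c"))]
cyclic_values = layered_values(modified_hasse(cycle_simplices, cycle_pairs))
```

In the first example the values strictly decrease along every directed edge of the modified diagram, which is the property the proof relies on; in the second example `layered_values` returns `None`, in line with the statement that the procedure succeeds iff the diagram is acyclic.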