A Global Reduction Based Algorithm for Computing Homology of Chain Complexes

In this paper, we propose a new algorithm to compute the homology of a finitely generated chain complex. Our method is based on grouping several reductions into structures that can be encoded as directed acyclic graphs. The organized reduction pairs lead to sequences of projection maps that reduce the number of generators while preserving the homology groups of the original chain complex. This sequencing of reduction pairs allows updating the boundary information in a single step for a whole set of reductions, which shows impressive gains in computational performance compared to existing methods. In addition, our method gives the homology generators for a small additional cost.


Introduction
Homology has been used recently in a wide variety of applications in domains such as dynamical systems, and image processing and recognition.In dynamics, typical problems are translated into problems in topology where invariants such as the Conley index are computed using homology algorithms.In digital image analysis, topological invariants are useful in shape description, indexation, and classification.Among shape descriptors based on homology theory, there are the Morse shape descriptor [1] [2], the Morse Connection Graph [3], and the persistence barcodes for shape [4].The necessity of improved algorithms appears evident as new applications of the homology computation arise in research for very large data sets.Although several algorithms and software packages have been developed for this purpose, there is still a lot of room for improvement as processing very large data sets is often very time-and memory-consuming.The classical approach to compute homology of a chain complex with integer coefficients reduces to the calculation of the Smith Normal Form (SNF) of the boundary matrices which are in general sparse [5] [6].
Unfortunately, this approach has a very poor running-time and its direct implementation yields exponential bounds.Polynomial time algorithms for computing the SNF were given by Kannan & Bachem [7] and later improved by Chou & Collins [8] and Illiopoulos [9].The best currently known Smith Normal Form algorithms have super cubical complexity [10].Another generic approach is the method of reduction proposed in [5] [11].In this method the original complex is simplified through a sequence of reductions of cells that preserve the homology groups at each step.The idea is to replace the chain complex (or even the object under study) by a smaller one with the same homology.A reduction consists of cancelling a cell B through an element of its boundary a (reduction pair ( ) , a B ).The two cells are suppressed from the structure of the complex and the boundary homomorphisms are updated.Direct implementations of the reduction method use unordered lists to encode the boundary homomorphisms and have cubical complexity.
Several algorithms based on the idea of reduction and that improve the time complexity for particular types of data sets have been designed.For cubical sets embedded in 3   , Mrozek et al. proposed a method [12] based on the computation of acyclic subspaces by using lookup tables.Experimentally, this method performed in linear time.Another method running in quasi linear time for finite simplicial complexes embedded in 3   was pro- posed by Delfinado and Edelsbrunner in [13].For problems dealing with higher dimensions, Mrozek and Batko developed an algorithm [14] based on the concept of coreduction which showed interesting performance results on cubical sets.
Despite the existence of efficient homology algorithms for cubical sets and lower dimension spaces, there remain many problems where the data is higher dimensional and not cubical.Generally, for these problems it is often difficult to adapt the data because the cost is too high or too complex.In this case, one has to use one of the classical methods: SNF or reduction.However, the classical methods spend most of their time in updating the boundary homomorphisms after each reduction step.This requires repetitive manipulations of the data structures that store the chain complex which incurs a high running time to accomplish the reduction.
An idea is to identify admissible reduction pairs and organize them in a structure that allows efficiently updating the boundary information in a single step for a whole set of reductions.This reduces to the minimum manipulations of the data structures that store the boundary information.Our method works as follows.Starting at an arbitrary cell A with a face a in its boundary, we identify successive adjacent admissible pairs of reduction and build the longest possible sequence originating at A. Several sequences can originate at the same cell, and each sequence can be seen as a path in a directed acyclic graph, called a reduction DAG, where nodes are reduction pairs and oriented edges denote an adjacency relation between the reduction pairs.Given a reduction DAG across a chain complex, we achieve a global simplification of the complex by performing all the reductions in the DAG at once.We will establish direct algebraic formulas that allow updating the boundary of remaining cells.The net advantage of our approach is that the boundaries are not explicitly updated after each reduction step.Instead, this information is always preserved implicitly within the data structures and processed globally for each reduction DAG allowing reducing considerably the computational time.The communication [15] contains a summary of the method developed systematically here and the preliminary experimentations that are now developed in Section 5.

Chain Complexes and Homology
Chain complexes are an algebraic tool for defining and computing homology.Although they are usually defined from geometrical structures such as simplicial or cellular complexes, they can also be considered in an abstract manner as pure algebraic entities.A finitely generated free chain complex ( ) with coefficients in a ring  is a sequence of finitely generated free abelian groups { } q q C ∈ together with a sequence of homomorphisms, called boundary maps or operators, { } 1 : for all q ∈  .Typically, 0 q C = for 0 q < .For each p, the elements of p C are called p-chains and the kernel of . The quotient groups : are the homology groups of the chain complex  .

Computation of Homology by Reduction of Chain Complexes
The reduction of a chain complex is a procedure that consists of removing successively pairs of generators from the bases of its chain groups while preserving the homology of the original complex.This method is originally motivated by a simple geometric idea known as the collapsing.At the algebraic level, each removal of a pair of generators that form a reduction pair is equivalent to a collection of projection maps { } : π → that send each generator of the removed pair into 0.Moreover, it projects the other cells into the remaining generators taking into account the modifications of their boundaries caused by the removal of the pair.Let ( ) , that is, the homologies of C′ and C coincide.To calculate the homology of the original chain complex C, the idea is to define a sequence of projections associated to the removal of pairs in all dimensions and then compute the homology of the resulting chain complex that is the image of the successive projections.A sequence of projections is complete if no other projection can be added to the sequence.Such a sequence always exits when the chain complex is finitely generated.Let ( ) , C ∂ be the original chain complex and ( ) C ∂ is the final chain complex obtained after a complete sequence of projections.We already know that the two complexes have the same homology, moreover it is easily observed that if 0 , that is, the Betti number d β is given by the number of d-cells remaining in the complex f C .Formally, a reduction pair and its associated collection of projection maps are defined as follows.
and , ⋅ ⋅ is the canonical bilinear form on chains.A way to understand the global reduction of a complex is to reformulate it as an algorithm and follow an example of its execution.The following implementation is basic and simply aims at understanding the process.For instance, if the chain complex has torsion or if the dimensions are reduced in arbitrary order, then we should refer to the more general version of the reduction method described in [5] [11].Also, we use the data structures as defined in section 4.0.
Algorithm 2. Reduction (Complex K) reduces K into K ′ using the one-step reduction formula.Consecutive calls of this function will simplify K until it remains unchanged.Finally, if all the boundaries are null, then the homology generators are given by the remaining cells.If not, then we continue the computation of homology by using a classical method such as Smith Normal Form.

1) (Find a reduction pair) ( )
, a B is a reduction pair if a is a face of B with an incidence coefficient of 1 ± .If no reduction pair is found, return K unmodified.Otherwise continue.It is assumed that reduction pairs are chosen by decreasing order of dimension.First the highest dimension is reduced, then the lower dimensions. 2 Clearly, 0 π is the identity map since there is no 0-cell removed from the complex and 1 π α α = for all α except for 1 0 b π = .The altered boundaries in the resulting complex K ′ are ( ) To calculate the homology of the complex we continue the simplifications until 0 ∂ = for all dimensions.

Computation of Homology by Grouping Reductions
The reduction algorithm chooses reduction pairs in arbitrary order and spends most of its time updating the boundaries at each reduction step.Instead of performing the reductions in arbitrary order, we organize them into reduction sequences that make up reduction directed acyclic graphs (RDAGs).We will show that this is very advantageous algorithmically.

Forming RDAGs
Starting from an arbitrary cell, we form the maximal possible directed acyclic graph (DAG) whose nodes are reduction pairs and a directed edge between two pairs ( ) , a B and ( ) definition 4).A directed path in the DAG corresponds to a reduction sequence.That type of reduc- tion sequence affects only adjacent cells to the path and leaves other cells unchanged.Information about each reduction is carried out by the reduced cells.After performing the whole set of reductions, we obtain the reduced complex by extracting the boundary information from the data structures associated to the cells visited by the reduction DAGs.
We introduce the main concepts through an example.The example details how the projections of cells are calculated given a reduction sequence.This will show what information is exactly needed to be kept about the original complex in order to be able to rebuild the reduced complex.π π π π =   .The 2-cell A is the only cell adjacent to the sequence while the 2-cell E is not adjacent.Their respective projections with respect to the sequence are , , 0 .λ λ and d λ and express concretely the projection of A: Once a reduction DAG is built, we are interested in building the reduced complex.The cells of the new complex are the projected cells of the original complex.The projection can be used effectively to update or build the boundary operator for each projected cell.In the previous example, the boundary of A′ can be reconstructed from the formula of its projection .

A A B C D f g e a i h
The incidence number of a given cell, say g, with respect to the boundary of the projection of A is computed by adding all the contribution coefficients of the cofaces of g present in the projection of A, each multiplied by the incidence number of g with respect to that coface.A particular coface of g can be present several times in the projection of A if it belongs to different sequences that originate at A.
For instance , A g ′ ∂ is given by since D is the only coface of g in the original complex and it is present in the projection of A. Note that the value of , D g ∂ is known from the original complex.Thus to calculate the reduced complex, it is enough to keep track of the coefficients of contribution of every reduced cell in any of the projected cells for which it contributes.In general, after a sequence of reductions is achieved, a projected cell adjacent to reduction DAGs will look as depicted in Figure 3.

Projection Formulas for Grouped Reductions
We define how to organize the reductions and the complex using reduction DAGs.

Definition 4. A cell 1
A is adjacent to the pair ( ) , a A if the cells 1 A and 2 A are of the same dimension m and 2 a is a cell with dimension 1 m − in the boundaries of both 1 A and 2 , a A .Definition 5. A reduction DAG is a directed acyclic graph whose nodes are reduction pairs and a directed edge between two pairs ( ) , a A means that ( ) , a A .Definition 6.A path P from ( ) , , , , n n a A a A  whose elements are nodes in G and ( ) , A cell A is said to be adjacent to a reduction sequence if it is adjacent to some pair of the sequence. , , a A .Definition 8.A cell not appearing in any reduction DAG is called a projected cell.The cells that appear in a reduction DAG are called reduced cells.
All individual paths originating at a given cell contribute to the projection of the cell.In the following theorem, we give a formula to calculate the projection of a cell by considering the contribution of a single path and ignoring the input from other paths and sub-paths.This formula is called "path projection".
Theorem 9. Let ( ) ( ) : , , , , be a path originating at A 0 .The path projection of A 0 by the path P is given by ( ) Proof: We proceed by induction on the length of the path.For 0 n = or 1 n = , the path projection formula is easily checked as shown in Example 3. Suppose that the path projection of 0 A by the path of length n, ( ) ( ) : , , , , , to the path n P so that 1 n a + is a face of n A leads to the path of length ( )


. The projection of 0 A by 1 n P + is equivalent to the pro- jection of ( ) , consisting of a single reduction pair, that is: The last equality is due to the fact that none of the cells 1 2

, , , n A A
A −  contains 1 n a + as a face.Thus, ( ) ( ) ( ) ( ) ( ) Note that when the path n P is extended by one reduction pair ( ) , , then the formula for the projection of 0 A by 1 n P + is given by: : , , , , : , , , , , then the formula generalizes to ( ) ( ) where ( ) A which are of di- mension ( ) 0 1 dim A − .Since the k paths are disjoint and non overlapping, each path has to start from a different face of 0 A .Without loss of generality, it is sufficient to prove the corollary for the case where 2 k = and with paths consisting of single reduction pairs as shown in Figure 4.If we apply first the reduction by ( ) , , where , Applying the second reduction results in: ) , , Since 1 P and 2 P are disjoint non overlapping, it follows that 2 a is not a face of 1 A , thus ( ) Let T be a reduction tree originating at 0 A , then the projection of 0 A by T is equal to the sum of 0 A and the projection chain of 0 A by T. Proof: Typically, the trees originating at 0 A can occur as pure paths, in which case the associated projections are given previously.Otherwise, we can find paths that share a common ancestral branch that originates at 0 A .This is seen as a bifurcation as shown in Figure 5.In this case, the ancestral branch is a path : , , , , : , , , , : , , , , . The combination of the results in Theorem 9 and Corollary 1 can be used to show that the total projection of A 0 by The term ( ) ( ) is called the projection chain of the tree.
In general, the tree can have many bifurcations at a given node and the bifurcations can occur at different nodes for which cases, the formula given above can be easily generalized.
 One may wonder what happens when two paths are merging into a single path as illustrated in Figure 6.For this purpose, let's consider the simple situation where two distinct reduction paths originating at A 0 overlap at a certain node, which is they are extended both by the same path.
In that situation, there are two independent pure paths CB and DB and the total projection of A 0 is given as follows ( )  We see from the formula that it comes down to considering each path (which is also a tree) including the shared sibling branch as pure paths contributing to the projection of A 0 .This comes down to considering each tree independently (see Figure 7).Theorem 10.Let A 0 be a cell adjacent to a RDAG in a chain complex C. The projection of A 0 by the RDAG is equal to A 0 to which we add the projection chains of A 0 by all the trees in the RDAG that originate at A 0 , that is ( ) ( )  is the collection of all the reduction trees in the RDAG originating at A 0 .Proof: Since we deal with finite complexes the number of all possible trees starting at a given cell (here A 0 ) has to be finite.Thus, the results proved for paths in corollaries 1 and 2 and the subsequent remark regarding merging paths can be extended for the case of trees (splitting and merging) to find the formula for the projection of A 0 .
 We review the projection formulas of paths and trees with the example of Figure 8.There are two trees B T and C T originating at A 0 .The projection of A 0 with respect to the tree B T can be written as . The projection chain of A 0 with respect to the tree C T is given by . It follows that the total projection of A 0 is given as the sum of both plus A 0 .

Why Acyclic Graphs
Adjacency alone does not guarantee that cycles are not created in a reduction DAG.To ensure that every reduction sequence is associated with a well defined projection, we enforce acyclicity in the formed directed graphs.
More generally, assuming that a reduction sequence ( ) ( ) , a A and ( ) , a A (see Figure 9), we consider extending the sequence to ( ) ( )  by setting ( ) ( ) . However, the pair ( ) 0 0 , a A is not necessarily ad- missible since after performing the sequence of reductions ( ) ( ) , the cell A 0 has been modified into 0 A′ ( ) and 0 1 n a a + = occurs in the boundary of 0 A′ as a face of both 0 A and n A .Its incidence number is therefore calculated as follows ( ) , which is not necessarily invertible.In Figure 9, , a A is not an admissible reduction pair.In our implementation, we avoid testing if   reduction pairs are admissible or not by avoiding cycles as we build the reduction DAGs.This results in a more efficient algorithm.

Using Projection Formulas
We recapitulate the formulas with a second example, see Figure 10.The first reduction pair is ( ) the cell E is adjacent to the pair ( ) , a A , it is marked as a projected cell.In a list associated with the reduced cell A, we save the pair ( ) to the list associated with the cell B. Continuing from the faces of B, the )

Building the Simplified Complex
Using reduction DAGs to compute the homology of a chain complex is a recursive process.At each recursion level, the algorithm simplifies the complex by constructing reduction DAGs on the complex and saving the associated projections into appropriate data structures.This is performed simultaneously for each dimension.This process eventually stops when it is impossible to add another reduction pair to any reduction DAG.In that case, the algorithm will build the associated simplified complex and continue the reduction process on the simplified complex.
The simplified complex is rebuilt from the projected cells only (reduced cells are not considered).The boundaries of the cells are updated using global projection formulas that allow to calculate the incidence numbers between cells of contiguous dimensions.Note that the reduced cells are not completely removed from the structures since they may be needed to recover homology generators expressed in terms of cells of the original complex as we explain in subsection 3.8.Contrary to the classical case where the boundary updating is done at each reduction step and may concern cells that can be reduced at a later step, a major benefit of this new approach is that the boundary updating is done only among the projected cells which often constitute a small fraction of the number of cells in the original complex.
Continuing with the example used in Figure 10, we proceed to build its simplified complex.At this point from the projection formulas we have the projection of E, which is E E A B ′ = + + and the projection of F which is F F D ′ = + .To build the simplified complex, we create two 2-dimensional cells, E′ and F ′ .We also create the projected lower dimensional cells (which is not illustrated in the example).Then, knowing that F F D ′ ∂ = ∂ + ∂ , we add the projected 1-cells in the boundary of D to the boundary of F′ using the correct coefficients.For example, if we suppose that 1 d is a projected cell of dimension one that was in the boundary of D in the original complex.From the structures, we know that D appears in the projection of F with an incidence coefficient of 1 ( F F D ′ = + ).So, we add 1 d in the boundary of F′ with a coefficient of 1.This is done for all projected cells in all dimensions.

Performance Benefits of Using Reduction DAGs
We use the complex illustrated in Figure 9 as a case study.Our objective is to point out how grouping reductions differs from classical methods and leads to substantial performance benefits.Performance is measured by the number of lists updates and compared to the classical reduction algorithm (see 2).Usually, the boundary and coboundary of cells are maintained by lists data structures.Assuming ordered lists, the cost for merging two lists 1 L and 2 L is bounded by ( ) , where L refers to the number of elements in L. Using unordered lists is bounded by ( ) , a A .According to step 2.a) of algorithm 2, we update cofaces of 1 a .That is, for each 2-cell β in the coboundary of 1 a , we add the set of cells in the boundary of 1 A to the boundary of β using the right coefficient.Therefore, this operation costs an order of A .For each 1-cell α in the boundary of 1 A , we add the set of cells in the coboundary of 1 a to the coboundary of α using the right coefficient.This time, the cost is in the order of ( ) Taken all together, it costs an order of 8 11 19 + = lists updates for the first reduction.A is in the order of ( ) Taken together, the cost for the second reduction is in the order of 10 10 20 + = lists updates.
(c) Reduction ( ) For step 2.b), updating faces of A 3 amounts to a cost in the order of ( ) The total cost for the third reduction is in the order of 12 10 22 + =.Taken all together, the three reductions cost an order of 19 20 22 97 + + = lists updates.Generally, as the re- ductions go on, the costs increase because the boundary and coboundary lists tend to grow in size.
2) Reduction DAGs Method (a) Reduction ( ) , a A .The pair ( ) , a A .The pair ( ) The cost is one list update.
(c) Reduction ( ) , a A .The pair ( ) (d) Reconstruction of the simplified complex.For each non reduced 1-cell a, we scan through its cofaces of the original complex and link a with the list of projected cells previously saved by using the right coefficients.The total cost is in the order of the number of remaining 1-cells plus the size of the lists of adjacent projected cells.Note that usually, the simplification process continues on the lower dimensions.So at the end there will remain only a fraction of the lower dimensional cells (here 1-cells).The Reduction DAGs method is more efficient as long as the cost for reconstructing the simplified complex is low which our experimental results corroborate.

Minimizing Lists Updates
Different reductions lead to different computational costs (measured by the number of lists updates).We attempt to achieve a good overall performance by choosing reduction pairs that should minimize the number of updates.To select among many candidate reduction pairs, we use a heuristic that estimates their respective costs.Given a reduction pair ( ) , a B , this cost is approximated by the number of non reduced cofaces of a (other than B) plus the size of the projection lists of each reduced coface of a.
We illustrate this on an example in Figure 11.Suppose that the 2-cell A is already reduced and has ( ) , A λ in its list of projected cells.We consider the cost of performing the reduction ( ) , c C .The heuris- tic approximates this cost to three lists updates, because there is one non reduced coface of c (that is B), and the projection list of the reduced coface A has two elements ( ) , A λ .Indeed, performing the reduction ( ) , c C amounts to inserting the pairs ( ) ( ) into the projection list of C. We give more details about the heuristic in section 4.

Calculating Generators
An interesting aspect of using the reduction DAGs method to compute homology is that the data structures allow to extract the homology generators at a low additional cost.We explain how to proceed.Let us consider the complex of a plane quotiented by its boundary as illustrated in Figure 12.This complex is homeomorphic to a 2-sphere and the 2-cell A represents the 2-generator . The generator associated to a projected cell corresponds to the projection of the cell.As we illustrated in Figure 12, after each reduction, the projection coefficients i λ are saved into the projection lists maintained in each reduced cell.Generally, once the complex has been simplified to the level where all boundaries are trivial, one has to examine all projected (non-reduced) cells and find their corresponding coefficients in the projection lists of the reduced cells.Thus, to get the 2-generator associated to the 2-cell A, one has to scan through each reduced 2-cell and extract the projection coefficients associated to A.
However, computing homology by reduction DAGs is a recursive process (see Figure and this needs to be considered when calculating the generators.Projections correspond to generators but they can be expressed with cells of the simplified complex at any level of the recursion.However, in order for the generators to carry geometric meaning, it makes sense only to express the generators with cells from the original complex.Due to the recursive simplifications, a projected cell at a previous level of the recursion may become a reduced cell at a later level of the recursion.We illustrate this in Figure 14.In this example, there are two levels of recursion.The final simplified complex is obtained after reducing the vertical edge and the 2-cell B shown in Figure 14(b).At this step, the cell A represents a 2-generator that is expressed as (considering that after the first reduction we rename the original cells A and B as A′ and B′ ).Note that in fact, the algorithm always works on the original cells and never copy nor create any new cell during the whole computation.Returning from the last recursion, we know that The cell A represents a projected cell at both recursion levels (14(b) and 14(c)) and requires no special consideration.On the other hand, the cell B is a projected cell at the first level of recursion but becomes a reduced cell at the second recursion level.Consequently, returning from the second recursion level, we scan through each 2-cell and whenever B is encountered in one of the projection lists, it is replaced by 5 A λ .This is shown in Figure 14(d).Finally, the generator is expressed by In Figure 15 we show different 1generators that we extracted from 3D models.

What If Boundaries Are Not Trivial?
In previous subsections, we stressed that all boundaries have to be trivial in order to obtain homology and its generators at the end of the reduction process.But in practice, when all reductions are done we can be in a situation where some of the boundaries remain non trivial.For example, this may be due to the presence of torsion.
In Definition 1, the pair ( ) ∈ .The easiest and most efficient way to guarantee this is to choose B and a such that , 1 B a ∂ = ± .Thus, for performance considerations, we designed the reduction DAGs algorithm to consider only reduction pairs with 1 ± coefficients.But the drawback is that once all reductions are done, it is not guaranteed that all boundaries will be trivial.There may remain reduction pairs such that , 1 In that case, one can modify the present algorithm to test for divisibility and continue the computations with the modified version.Another possibility is to have recourse to a classical algorithm such as the Smith Normal Form.Note that at this level most of the cells generating the complex have already been reduced to a small number.

Data Structures and Algorithms
The first data structure represents a chain complex.
ChainComplex: E: two dimensional array of cells organized by their respective dimension, that is E In our implementation, the pointer to a cell is used to identify the cell.These identifiers are saved in E. The main benefit to using pointers is that when we build the simplified complex, no new cell is created.Only the pointers of the projected cells are copied into the simplified complex.
Cell: boundary/coboundary: list of faces/cofaces.state: a flag taking one of the values in {NORMAL, REDUCED, PROJECTED, VISITED}.projCells: list to save the PROJECTED cells and their associated coefficients that project onto the given cell when it is REDUCED.
nbUpdates: approximates the number of updates to a projCells list when the given cell is reduced by one of its cofaces.
Initially, all cells are set to the NORMAL state.REDUCED is assigned to the cells in a reduction sequence and PROJECTED is assigned to the cells adjacent to a reduction sequence.The VISITED flag is assigned to the d-cells that don't have any ( ) coface that can form an admissible reduction pair.A reduction pair ( ) To build a sequence of reduction pairs, nbUpdates is used to select reduction pairs that should minimize the amount of updates to the projCells list.The number of updates is approximated by the number of non reduced cofaces plus the number of entries in the projCells lists of REDUCED cofaces.
We now give the principal steps of the algorithm. HOM_RDAG(ChainComplex K) returns the Betti numbers of a chain complex K by using reduction DAGs.
1) (Initialization) For all cells in K, set its state to NORMAL, empty projCells, set nbUpdates to the number of cofaces.
2) (Reduce cells) Proceed by decreasing order of dimension.Order the d-cells by increasing value of nb-Updates.Following this order, start a reduction DAG (call BuildRDAG()) from each cell whose state is NOR-MAL.
For dimension 0, change remaining NORMAL 0-cells to PROJECTED.
3) (Build the simplified complex) Call BuildSimplifiedComplex(K). This method extracts the boundary information from the data structures to build a simplified complex K ′ .Continue to recursively simplify K ′ by calling HOM_RDAG( K ′ ) until there are no reduction pair left.
Test if all boundaries are trivial.If not, then continue the reduction process with another method such as SNF.2) (Update the nbUpdates variable) For all faces a of B different than c, add |B.projCells|-1 to c.nbUpdates.For all faces of c, remove one to nbUpdates. BuildSimplifiedComplex(Complex K) extracts the boundary and coboundary information from the data structures to build a simplified complex K ′ .Note: In the preceding methods, the coboundary list of a cell is never modified.Thus, when referring to cofaces, it is about the cofaces in the complex K at a given recursion level before any reduction is performed.
1) (Update boundary) Proceed by increasing order of dimension.Let c be a PROJECTED d-cell of K.For all cofaces C of c, iterate through C. projCells.Let B be a ( )

Experimental Results
Because of its heuristic and recursive design, we had major difficulties to analyze the time complexity of our algorithm.Instead, we approached the problem by evaluating its performance on a wide range of experimentations.We compared its performance with the reduction algorithm 2 on different data series: d-balls, d-spheres, tori, Bing's houses, 3D medical scans and randomly generated d-complexes.
The series of d-balls and d-spheres were created by juxtaposing d dimensional cubes along each dimension.A 3-ball of size 10 is the cube obtained by placing side by side 10 × 10 × 10 3-cubes and making the corresponding identifications of lower dimensional cells.A d-sphere is a d -ball quotiented by its boundary.We also constructed the tori and Bing's houses from 3-cubes, although they are both 2 dimensional complexes.This construction does not affect the homology.We created two data series from 3D medical scans: brain and heart.Each of them is created from 3D images by selecting the pixels whose values are higher than a threshold.Lastly, we randomly generated d-complexes by choosing ( ) random numbers referring to the number of generators.
We created a d-cell for each d-generator and randomly linked the d-cells to the ( ) By subdivision, we obtained complexes of various data sizes.
In the experimentation, we use the following terminology.A data series refers to one of d-balls, d-spheres, tori, Bing's houses, 3D medical scans or randomly generated d-complexes.Each data series is constituted of many datasets.For example, d-balls contain the 2-ball, 3-ball, ..., 6-ball.Each dataset contains many files of complexes of various data sizes.For example, there are 15 files in the 2-ball data set for which the data size vary from 2000 up to 10,000,000.The size of a complex is measured as the number of cells plus the number of links (entries in boundary lists).A test is an execution of the algorithm on a file of a specified data set.
Initially, we observed that a test could be misleading, especially when dealing with cubical data sets.The reason is because we create the complexes in an orderly fashion.Thus, during its execution, the algorithm is most probably benefiting from a favorable choice of reduction pairs due to the order inherent in the data.Consequently, to avoid misreporting on the algorithm's performance, we took care to randomize the boundary and coboundary lists of all cells in the complex before each test.
We performed 30 tests for each file and computed the average, maximum and standard deviation of the time used for calculating the homology, the generators and sorting.We measured the sorting time because it has a time complexity of ( ) and we wanted to verify that sorting would not monopolize the computation time.Also, we saved statistics on the number of recursions to see its influence on the global performance.Here we grouped and summarized the results per dataset.
We begin with our most interesting results (see Table 1) which compare the performance of the reduction DAGs algorithm with the classical reduction method on different datasets.We can observe a improvement between the two methods.Knowing that the time complexity of the reduction method is quadratic (using ordered lists), those results suggest a subquadratic time complexity for the reduction DAGs algorithm which is what we measured in Table 2 In Table 2 we report the approximative time that our algorithm uses to calculate homology on various datasets.The second column (N) gives the number of files within the dataset.The third and fourth columns show the parameters α and β that best-fit the equation " time data_size α β ≈ ⋅ ", where time is expressed in seconds.We obtained the best-fits from the times measured on the individual files of each dataset.In the last four columns, we present the times in milliseconds that we get from the best-fit equation for different data sizes.Those times approximate the real times that we measured in real experiments.For example, on 30 tests, we measured an average time of 4992.24ms on a torus of size 10944304.The best-fit equation approximates a time of 4573.64 ms for a torus of size 10000000.
From Table 2, we observe that 1.16 Except for few values, α is relatively constant while β gradually decreases as the dimension in- creases.We explain this by the fact that the ratio of exterior face reductions versus interior face reductions increases with the dimension.An exterior face reduction denotes a reduction pair ( ) , a B for which a has only the coface B in its coboundary list.On the other hand, we use the term interior face reduction when a is shared by more than one coface.Algorithmically speaking, an exterior face reduction is more advantageous because it does not cost any update to the projection lists, while an interior face reduction costs at least an update.
We observe that the approximation times and the parameters α and β for the heart and brain datasets, remain in the same order as those of d-balls and d-spheres despite the high numbers of generators as shown in Table 3.For example, the file heart96 had 30 connected components, 107 holes and 19 voids.For these two datasets, the number of generators was not a relevant factor for the performance of the algorithm.We think it is because those two datasets contain a high ratio of exterior face reductions, which cost near to nothing.If we compare with the random d-complexes then the topology is an important factor.Indeed, as the data size increased we observed bigger differences in the time performance.
In Figure 16, we also reproduced the algorithm's performance on the files of each dataset.We used a logarithmic scale on both axes to better distinguish the trends.In Figure 16(d), each file in a d-complex has a distinct topology.It explains why we see only a weak trend in opposition to the other datasets where the trends are strong.
Another interesting aspect of the algorithm to consider is the number of recursions which is expressed as the number of reduction calls in Table 4. Interestingly, one reduction call was enough to calculate the homology of all d-balls, d-spheres and Bing's houses.It means that the algorithm did not reconstruct an intermediate simplified complex to obtain the homology.For other datasets, the algorithm had to reconstruct simplified complexes to carry out the computation.As an example, a value of n means that, on average, the algorithm constructed ( ) simplified complexes.The Max column shows the average of the maximum number of calls and the last column refers to the average standard deviation.Regarding the computation of generators, our algorithm allows to obtain them for an additional cost of about 10% or less on average.From Table 5 we see that most average times fluctuate around 5% and the maximums fall into the 15% -30% range.Moreover, the algorithm performed as strongly on datasets with a large amount of generators than on those with a small amount.

Conclusions
We developed paper a novel and efficient algorithm for the computation of homology groups and homology generators that works at the level of chain complexes, and hence allows handling a variety of geometric complexes.The main idea is based on organizing sequences of cell reductions into a structure of directed acyclic graphs, which makes it possible to derive global projection formulas and perform large collections of reductions at once.In this method, the boundary operators of the complex are not explicitly updated after each reduction step.Instead, this information is always preserved implicitly within the data structures and processed globally for each reduction graph allowing reducing considerably the computational time.
Our experimentations show that this algorithm performs significantly faster than the classical reduction method.In addition, for all the datasets that we tested, which cover a wide range of types of data, their global performance indicated a subquadratic time complexity.Moreover, it allows calculating the homology generators at a small additional time cost.Interestingly enough, in all the datasets we dealt with, the algorithm needed only to construct few intermediate complexes to obtain the homology of the original complex.Our feeling is that if one encounters a specially designed complex in which the number of recursions is substantially high then the algorithm may underperform.However, it is not yet clear how to construct such a complex.
(a) (Update cofaces of a) For all cofaces C of a, add B γ λ − ∂ to C. boundary, where γ is , C a ∂ , the coefficient of a in C. boundary.(b) (Update faces of B) For all faces c of B, add cob a c in B. boundary.(c) (Project a and B) Remove cells a and B from K. 3) (Return the result) Return the simplified K.An example of using the reduction algorithm is illustrated in Figure 1.With the prescribed orientation of the

Figure 1 .
Figure 1.Result of the projection of a complex after one reduction.given complex, the boundaries of the cells in K are A a b c d ∂ = + − − B b e f g ∂ =− + + − .C b h i j ∂ = + − −

Example 3 .
Consider the complex given inFigure 2. The boundaries of the 2-cell are A a b h ∂ = − + −

Figure 2 .
Figure 2. Example of collapsing by using RDAGs.B b c f ∂ =− + + C c d i ∂ = − + D d e g ∂ = − + + .E i j k ∂ = − + The depicted sequence of reduction pairs originating at cell A is ( ) ( ) , , , b B c C and ( ) , d D .The projection maps at the level of the 2-cell give the following ( ) 2 , , : , b b b B B B b α π α α ∂ = − ∂ in the projection is called the coefficient of contribution of the corresponding cell to the projection of A. For instance, of contribution of D in the projection of A. We can compute the coefficients , b c

Figure 3 .
Figure 3.A cell extended by reduction DAGs.

Ψ
the projection chain of the cell 0 A by the path n P .

Figure 4 .
Figure 4. Reduction by two disjoint non overlapping paths.

Figure 5 .
Figure 5.A path B bifurcates into two paths C and D.

Figure 6 .
Figure 6.Two paths C and D merge into a single path B.

Figure 7 .
Figure 7. Decomposition of a reduction DAG into two independent trees 1 T and 2T .

Figure 8 .
Figure 8. Example of reduction paths and the associated reduction DAG.

Figure 9 .
Figure 9. Cycles need to be avoided as reduction pairs are added in the tree.
is continued from the faces of A. The second reduction pair is ( ) , b B .We carry the information previously saved on the visited cofaces of b, so we add ( )

Figure 10 .
Figure 10.A more complicated reduction sequence.third reduction pair to be added is ( ) , c C .The cell c is adjacent to both the reduced cell B and the projected cell E. From this point, we continue a path ( ) ( ) ( ) ( ) , , , , , , a A b B c C  and begin a sub-path ( ) ( ) , , c C  .The information saved with C considers the input of both paths.So we save ( ) ( ) , cE cB b a E λ λ λ λ − + where , 1 , In step 2.b), we update the coboundaries of faces of 1 b), updating faces of 2 e i h a a ∂ =− + + + + − − − , and updating cofaces of 3

saved in a list maintained by 3 a
. The cost is one list update.

Figure 11 .
Figure 11.Cost of performing a reduction in terms of list updates.

Figure 13 .Figure 14 .
Figure 13.Computing homology by reduction DAGs is a recursive process.

Figure 15 .
Figure 15.(a)-(c) The 3D models of a chain, two kissing children and a Buddha.(d)-(f) The calculated holes (one dimensional generators).
and adding the pair to the reduction DAG does not create a cycle.The flag helps to avoid testing more than once if the given d-cells have an admissible reduction pair.The projCells list is used by the REDUCED cells to keep track of the coefficients of contribution of every PRO-JECTED cell for which it contributes.For example, if we use the sequence of reduction pairs illustrated in Figure2, then B.projCells and C.projCells will respectively contain ( )

4 )(
Return the Betti numbers) Assign the number of PROJECTED d-cells to d β . BuildRDAG(Cell c) builds a reduction DAG from cell c. 1) (Initialization) Save c.nbUpdates into nbUpdatesLimit for later use.2) (Find a reduction pair) Find a coface B of c such that B is NORMAL or VISITED.If a coface B is found, then call PairCells(c, B), otherwise set c to VISITED and exit BuildRDAG().3) (Expand the reduction DAG) Expand the DAG (repeat step 2) from all NORMAL faces of B that have nbUpdates ≤ nbUpdatesLimit.Proceed in a breadth first search approach. PairCells(Cell c, Cell B) adds the reduction pair ( ) , c B to the sequence and updates the data structures ac- cordingly.1) (Update projCells) For all cofaces C of c different than B, if C is REDUCED, then add λ c * C.projCells to B.projCells.Otherwise, set C to PROJECTED and add λ c * C to B.projCells, where , , boundary and update c.coboundary accordingly.Finally, remove all REDUCED cells remaining in the boundary and coboundary of PROJECTED cells.Copy the pointers of all PROJECTED cells into K'.E.
for d-balls and d-spheres without regard of the dimension.
It is important to mention that k cannot exceed the number of boundary faces of 0

Table 1 .
. Comparison of the reduction DAGs algorithm versus the classical reduction method.

Table 2 .
Approximated performance with respect to dataset, data size and dimension.

Table 3 .
Number of generators of the files in the heart and brain datasets.