I have a strong interest in applications of algebraic topology, and in particular the applications that come through the techniques of persistent homology. Even so, I have a pure mathematics background, with more pure algebra than most of the active researchers in the computational and applied algebraic topology field. As such, I am uniquely poised to work with, in a sense, knowledge transfer.
On the one hand, there is much to be gained applying algebraic insights and techniques on computational topology and even in computational geometry. Several of my currently ongoing projects deal with this, most notably the parallelization work with Primoz Skraba and David Lipsky, and a spin-off from that focused on the concrete use of computational algebra to handle persistent homology of chain complexes with torsion.
On the other hand, a rich source of research that I intend to investigate is the reflow of insights from the applications and computations field back into pure mathematics. A preprint by Simon King and Graham Ellis http://arXiv.org/abs/1006.2237 investigates and evaluates persistent homology of the central subgroup series of finite groups as a group invariant. Similarly, the use of barcodes as a tool in homological algebra seems a fruitful area to mine. For another reflow approach, I am involved in a recently started project jointly with Jon Hauenstein (Texas A&M), Chris Peterson (Colorado State), David Eklund and Martina Scolamiero (KTH), and Primoz Skraba dedicated to using the techniques of persistent homology to compute topological invariants of algebraic varieties.
Concrete approaches: A number of immediate approach plans are already evident to me. For the flow from pure to applied, I am deeply interested in approaching the foundational aspects of persistent homology. There are ways to interpret persistent homology as being homology computed with coefficients in the ring k[t], for a field k. This interpretation can guide both algorithms and methods in far deeper ways than it currently does. Furthermore, there is an interpretation of persistent homology as being the internal simplicial homology theory of a topos of sheaves over a particular site — and this approach would be deeply interesting to investigate for further results.
In the other direction, some immediate approaches are to pick up on the work by King and Ellis, and investigate filtered objects in homological algebra and the usefulness of the persistence algorithms and methods on these; as well as taking areas of algebraic geometry and topological combinatorics where the objects can reasonably be sampled, and using persistent homology to estimate topological features for further use within the respective fields.
Grant plans: To fund my research and that of students and guests, I plan to apply for a number of funding sources. I have funding already through the project Toposys, funded under the 7th Framework Programme, jointly with Primoz Skraba (Slovenia), Herbert Edelsbrunner (Austria), Danica Kragic (KTH), Robert Adler (Israel), and Marion Mrozek (Poland). Furthermore, I expect to put together and submit a proposal to ERC for a starting researcher grant — I submitted one proposal in 2010. I have been submitting project grant proposals yearly since 2010.
Dissemination plans: The research intended to apply pure insights on applied themes is most appropritely published in the fields where they will most easily reach their intended audiences. This means, concretely, that the chief publication venues will be ACM Symposium on Computational Geometry, Discrete and Computational Geometry, as well as machine learning venues and even the high profile science journals such as Science, or Proceedings of the National Academy of Science.
As for the application of applied insights on pure themes, these will be most appropriately oriented after the area the insights are applied in. I expect Journal of Pure and Applied Algebra, Geometry and Topology and similar journals to be good starting points.
In addition to the pure publication aspects, there will be a number of highly interesting conferences to disseminate through. The biennial Algebraic Topology: Methods, Computation and Science meetings are a mainstay of the field, as are the theoretical computer science conferences ACM Symposium on Computational Geometry, IEEE Visualization, ACM Symposium on Discrete Algorithms and IEEE Symposium on Foundations of Computer Science.
Work packages and timelines
Over the next few years, there are several already articulated projects I will work on. With external funding in place, I would build a group of myself, a postdoctoral researcher and 1—2 PhD students, and likely accelerate some of the time estimates below as a result.
Statistics in TDA In a project with a CUNY undergraduate student, Alisa Leshchenko, we have explored statistical methods to study the quality of Mapper models.
From these studies I have also started a research project, joint with Prof. Sayan Mukherjee into the statistics of multiple hypothesis testing in TDA.
Topological process diagnostics Jointly with Pär Jonsson and our joint students Mattia de Colle (MSc 2016) and Leo Carlsson (MSc 2015, PhD ongoing), we are studying applications of Mapper to process diagnostics: by including measurement error in the parameters of a Mapper analysis, we produce geometric models for failure modes of industrial predictive algorithmics.
Point processes and topological methods My collaboration with Jonathan Peters calls for spatial point process statistics, with data sizes and data structures that fall outside the main track of contemporary spatial statistics. We plan to study potential extensions of classical meth- ods and explore potential applications of persistent homology, alpha-shapes and Mapper to the construction of new statistical approaches and invariants. This builds on recent results by Biscio and Møller.
Mathematical Software I have been working extensively at creating and maintaining mathemat- ical software. My software projects include:
- An implementation of Dotsenko and Khoroshkhin’s definition of Operadic Gröbner Bases in Haskell,
- the jPlex and JavaPlex Java-based libraries for persistent homology and topological data anal- ysis,
- modules for the Computer Algebra System Magma that implement A∞ algebra computations and basic persistent homology algorithms,
- modules for the Computer Algebra System GAP that implement persistent homology algorithms.
JavaPlex is overdue for an update - I am currently pursuing funding for an overhaul of the code base.
Statistics In a more classical statistics consulting role, I work with Prof. Fumio Someki on statis- tical validation of translated rating scales and diagnostic instruments for ADHD. We have one paper published from our collaboration so far, another paper under review and two more currently being written.
Commutative algebra in persistent homology The interpretation of persistent homology as homology with coordinates in k[t] induces new algorithms and new constructions of known results in the field. was accepted for publication in 2013, and we see a number of followup directions of research.
Topos-theory in persistent homology Persistent homology as internal to a particular topos of sheaves is an idea that requires some work before its implications can be clearly seen. At my current rate of approach, this project will be publishable by 2015.
Parallelization of persistent homology The recovery of a parallelization scheme from spectral sequences turned out to be more difficult than expected. Some first results were on the arXiv already in 2011, but a report solving the problem completely jointly with David Lipsky, Dmitriy Morozov, and Primoz Skraba will likely not appear earlier than 2015. Tuning and investigating the algorithms and reaping all follow-up results from this will likely generate work for years to come.
Persistent homology of algebraic varieties This project, while already started, generates new questions and the need for deeper research — both on the algebraic geometry side, where a dense sampling paradigm for algebraic varieties has turned out to be a crucial aspect, and on the topology side, where the need for faster and stronger algorithms has been made clear by our initial experiments. The project might yield output during 2015—2016.
Circle-valued coordinates in signals Jointly with Primoz Skraba and Vin de Silva, we are investigating applications of circle-valued coordinates and persistent cohomology to periodic signals. First results were presented at a NIPS workshop in 2012, and we are currently finalizing a more complete report.
This direction has also spawned work jointly with Florian Pokorny, Primoz Skraba, and Danica Kragic on analyzing bipedal locomotion with circular coordinates. A journal paper is currently undergoing peer review.
Opportunistic signals Jointly with Robert Ghrist, Michael Robinson and Justin Curry (all at the University of Pennsylvania), we investigate topological coordinatization methods using opportunistic signals — geo-location with wireless network signal intensities has in initial experiments given very promising results. We hope to have a detailed technical paper during 2015.
Topological Machine Learning I have been increasingly interested in applications of techniques from topological data analysis to machine learning. I have been working out potential applications of Mapper to automatically propose model spaces in semantics together with Jussi Karlgren and Hedvig Kjellström at KTH. Furthermore, jointly with Jesse Berwald (IMA, University of Minnesota) and Marian Gidea (Yeshiva University), I have been investigating the uses of betti barcodes as features for analyzing spatial time series and dynamical systems with machine learning techniques.
Below, I will expand on some of these projects.
In my doctoral dissertation I studied the A-infinity-algebra structure induced on the group cohomology ring H(G,k) from the presentation of H(G) = ExtkG(k,k) = H(homkG(pk,pk)) for a projective resolution pk of the trivial G-module k over the group ring kG for a field k. This presentation demonstrates that H(G,k) is the homology of a dg-algebra, and thus by a theorem proven by Kadeishvili (also proven and extended by several other authors) has an induced A-infinity-algebra structure. From the proof by Kadeishvili an algorithm for the computation of an A-infinity-algebra structure extends.
I proved (Blackbox computation of A-infinity algebras) that the resulting algorithm can be used to compute A-infinity-structures of a certain class of dg-algebras in finite time, using extra structure on the dg-algebra used, and that detection of failure for this approach can be detected during computation. Furthermore, I provide a condition on whether the entire A-infinity-algebra structure has already been computed from an initial fragment of the structure computed.
As an example of the technique, I provide an independent proof that the structure for a canonical A-infinity-structure on H(Cn,k) as described by Dag Madsen in his doctoral thesis is the correct A-infinity-structure.
I proved that the A-infinity-structure induced by using the Saneblidze—Umble diagonal construction on the known A-infinity-algebra structures for H(Cn,k) and H(Cm,k) to provide an A-infinity-algebra structure on H(Cn x Cm,k) has non-trivial operations in two arities further than previously documented. (A partial A-infinity structure on the cohomology of CnxCm)
Persistent cohomology and circular coordinates
In collaboration with Vin de Silva and Dmitriy Morozov, I studied an extension of the persistence algorithm to compute persistent cohomology. In Persistent Cohomology and Circular coordinates, we use this, together with the identification of S1 as the Eilenberg-Mac Lane-space K(ℤ; 1) to convert representative cocycles for degree 1 persistent cohomology of point clouds into circle-valued continuous functions on the point clouds. Key in this process is to smooth the resulting circle-valued function after the conversion in order to get a (persistently) homotopic coordinate function. We demonstrate that a smoothing may be done by projection from the space of real cocycles to the space of Hodge cocycles, and that this projection corresponds to an L2-optimization problem.
As an extension of the work done on computing persistent cohomology, we discovered a collection of duality functors connecting persistent absolute and relative homology and cohomology, as well as an algorithm for computing persistent cohomology faster than existing methods. Dualities in Persistent (Co)Homology
Together with Vin de Silva and Primoz Skraba, I currently study applications of the circular coordinate approaches to the study of periodic signals and systems. A delay embedding of periodic data will, using the right parameters, provide an embedding of the data stream as an image of a circle into the embedding space. We are able to, using circular coordinates of sample data, analyse periodic data with techniques different and potentially more versatile than existing methods for these analyses. We have presented preliminary results at a NIPS Workshop in 2012, and are currently writing up the details.
Computing cohomology for a dataset from a curve, relative to everything but a small neighbourhood, gives us a linear coordinatization of small patches of the curve at a time, allowing for an approach to curve reconstruction with cohomological methods. Together with Bei Wang, we're trying to find applications of these ideas in questions appearing in Wang's research in visualization, with datasets from gait analysis and other application fields. This research was presented at the selective computational workshop Vis 2011 (Branching and circular features in high dimensional data).
Parallelization in persistence
Persistent homology has proven a versatile and powerful tool — in topological data analysis as well as in applications in the sciences. The limitations of the use of persistent homology lies in memory and processing power limitation — and none of the available software packages to compute persistent homology are able to easily use more than one processor at a time. I would like to approach the adaptation of persistent homology to parallel, distributed and co-processor based techniques — to use several cores, several computational nodes and the GPU to compute persistence.
In particular, in work jointly with Primož Škraba (Jožef Stefan institute, Slovenia) and David Lipsky (University of Pennsylvania, USA), we are able to use a classical construction of a spectral sequence generalizing the Mayer-Vietoris long exact sequence to provide a parallelization scheme for persistent homology. We have a preprint joint with Dmitriy Morozov http://arxiv.org/abs/1112.1245.
At the core of data analysis lies the production of coordinate functions for data sets. As an extension of classical linear techniques, and of the circle-valued techniques discussed above, we would want to produce coordinate functions with arbitrary coordinate spaces.
One approach, pioneered by Yi Ding, uses the topological fact
Hi(hom(X,Y)) = ⊕phom(Hi(X), Hi+p(Y))
to associate H0(hom(X,Y)) with ⊕phom(Hi(X),Hi(Y)). Using this, we can take any explicit identification between the representative classes of a persistence barcode for a point cloud and the classes of a suggested model space and associate to it an equivalence class of actual maps between the corresponding chain complexes.
To actually generate good coordinatizations out of this leaves a difficult optimization problem in its wake: while we can pick out a representative cycle in hom(X,Y), it remains to find a homologous cycle that is well suited to interpretation as a coordinate function. On this problem, our research is still ongoing.
Topological data analysis in politics
Data analysis has plentiful applications in political science. One of the approaches used is to analyze the shape of the collection of vote vectors given as one vector per member of a particular parliament and session, with entries numeric encodings of that member's votes in the rollcalls of the parliamentary sessions.
In a project joint with Gunnar Carlsson, Anders Sandberg, Emil Sköldberg and Primoz Skraba, we try to approach this type of data set, as well as other data sets characterizing parliaments, with topological tools. Primarily, we try to use the Mapper algorithm to provide simplicial models of the data that can be analyzed with high dimension reduction.
The work on using Mapper on politics datasets has been presented at a NIPS Workshop in 2012 and was published in Scientific Reports (Extracting insights from the shape of complex data using topology).
Parallelized Gröbner bases
In a project joint with Emil Sköldberg and Jason Dusek, we use combinatorial structures in homogenous ideals in multigraded polynomial rings to parallelize the Buchberger algorithm. The combinatorial structure we use allows us tighter control over dependencies between Gröbner basis elements, and allows us to pick out subsets of potential basis elements that can be examined independently of each other.
Once extended to Gröbner bases for homogenous multigraded modules, the approach we are taking will automatically work to compute the rank invariant and other invariants in multi-dimensional persistent homology, opening up topological data analysis to a wider spread of problems.
Operadic Gröbner bases
Building on the work by Dotsenko and Khoroshkhin, I collaborated with Vladimir Dotsenko to produce a reference implementation of the Buchberger algorithm for shuffle operads. The work already done is published in ICMS and in Séminaire et Congrès (Implementing Gröbner bases for operads, Operadic Gröbner bases: an implementation). From this point, further work and some research is needed to go from a first implementation to an efficient and user-friendly implementation.
Persistence in homological algebra
Ellis and King approach group homology with techniques developed in persistent homology. Specifically, they use persistence on the filtration given by the lower central series of a group to seek more homological information for finite groups. Their work indicates that the techniques in persistent homology and topological data analysis may well be transported back into pure mathematics. I would be interested in seeking further applications of the persistence paradigm in algebraic topology, homological algebra and algebraic geometry.
Carlsson and Zomorodian identify persistent homology with coefficients in a field k as the homology of differential k[t]-modules. A more fine-grained formalism could be constructed by picking an appropriate Heyting algebra L of life times and re-deriving the relevant parts of linear algebra and algebraic topology over the topos of sheaves over L.
Through this approach, the time-dependency or filtration dependency of the simplices and homology classes in persistent homology would be anchored all the way down to set theory, and the hope is that from such a more subtle formalism may provide more interesting approaches both to the computation of persistent homology, and to other, derived invariants.
I outlined this approach in a seminar talk at the IMA in 2013.
Topological Machine Learning
Jointly with Jussi Karlgren and Hedvig Kjellström at KTH I am planning to investigate using persistent homology on Mapper output spaces to automatically pick out features in the output spaces. A persistence barcode can serve as a catalogue of a collection of feature identifications together with a measure of their saliency for further learning algorithm handling. The space and its homology can be used either as a feature for further algorithms or as a model of its own. There are many details to be investigated in this direction, and the idea is one of my newest and least explored directions of research.
With Jesse Berwald (IMA, University of Minnesota) and Marian Gidea (Yeshiva University) I have started looking at tagging topologically distinct behaviour in time series from spatial dynamical systems. We have a preprint on the subject.