CG Week 2020

Accepted Papers — Computational Geometry: Young Researchers Forum

Shengtan Mao and Jack Snoeyink. Robust Boolean operations on polygons

We detail an implementation of regularized Boolean operations (union, intersection, difference) on two polygonal regions. It is based on an optimal red/blue segment intersection algorithm that requires only double the input precision. We carefully define predicates so that we do not need to define shared endpoints or overlapping segments as degeneracies requiring special handling but can fold them into the general case. We do not address the geometric rounding of intersection points (which may require more than double precision) back to input precision since such rounding must be application-specific.

Marco Michele Mosca and Vitaliy Kurlin. Voronoi-based similarity distances between arbitrary crystal lattices

This document develops new distance metrics based on Voronoi cell of a crystal lattice to compare crystal structures. Crystal structure prediction methods generate many simulated crystals used for materials discovery or drug design of which many of them may be similar to each other. Therefore, we may need to filter datasets by removing similar structures that can be time consuming for different computations. The proposed distances are invariant under rigid motions and can be used for clustering and visualizing datasets of crystals.

Sima Hajiaghaei. Hardness of Approximation for Some Covering Problems

In Geometric Set Cover the goal is to find a minimum sized cover for a given set of points in the plane by a family of given objects. This is a fundamental problem which has been studied for over 30 years. It has long been known to be NP-hard and was shown to be APX-hard by Chan and Grant in 2014. We study this and related problems here.
First, we consider the problem of Boxes Class Cover: points are given in two colors, red and blue, and the goal is to find a minimum number of axis-aligned rectangles that cover all the blue points but no reds. This problem is introduced for the first time in a paper in 2012 by Bereg et al., who showed the problem is NP-hard. No hardness of approximation result has been shown for this problem. Here for the first time we show that the problem is APX-hard.
We also study Red-Blue Geometric Set Cover, where the points are given in two colors, red and blue, and the goal is to find a family of given axis-aligned rectangles that cover all the blue points with the minimum number of reds. Chan in 2014 showed that the version of this problem where objects are axis-aligned rectangles is also NP-hard. Here for the first time we show that the problem is APX-hard. Moreover, we show that Red-Blue Geometric Set Cover is NP-hard to approximate within o(log n) factor of optimum, where the covering objects are convex objects and n is the number of blue points.

Oleg Kachan. Simplicial Oversampling: Accounting for the Higher-order Relations in Data Improves the Solution of the Imbalanced Learning Problem

SMOTE is the established geometric approach to random oversampling to balance classes in the imbalanced classes learning problem, followed by many extensions. Its main idea is to introduce synthetic data points in the minor class with each new point being the convex combination of an existing data point and one of its k-nearest neighbors. This could be viewed as a sampling from the edges of a geometric neighborhood graph.
We propose a generalization of the sampling approach, thus sampling from the maximal simplices (with respect to an ambient space) of the clique complex of a neighborhood graph. That is, a position of a new point is defined by the barycentric coordinates with respect to a simplex spanned by an arbitrary number of data points being sufficiently close. We evaluate the generalized technique and conclude that it outperforms the original SMOTE.

Adélie Garin, Teresa Heiss, Kelly Maggs, Bea Bleibe and Vanessa Robins. Duality in Persistent Homology of Images

We derive the relationship between the persistent homology barcodes of two dual filtered CW complexes. Applied to greyscale digital images, we obtain an algorithm to convert barcodes between the two different (dual) topological models of pixel connectivity.

Matt Bright and Vitaliy Kurlin. An Enumeration and Isotopy Classification of Oriented Textile Structures

We have developed a first method for encoding isotopy classes of links embedded in a thickened torus by 1-dimensional strings of symbols. We have designed a linear time algorithm to determine which codes give rise to realizable textiles in the plane. The new encoding and realizability algorithm have allowed us to enumerate for the first time all oriented single component textiles up to three crossings.

Maike Buchin and Leonie Selbach. Decomposition and Partition Algorithms for Tissue Dissection

We consider the decomposition of simple polygons and the partition of graphs under specific constraints. Our research is motivated by an application in the field of protein diagnostics, that is processing tissue samples for laser capture mircodissection. For the dissection to be successful the region of interest has to satisfy certain constraints in size and morphology. We consider different algorithmic approaches for the segmentation of these regions.

Matthijs Ebbens, Hugo Parlier and Gert Vegter. Minimal Delaunay triangulations of hyperbolic surfaces

Motivated by recent work on Delaunay triangulations of hyperbolic surfaces, we consider the minimal number of vertices of such triangulations. In particular, we will show that every hyperbolic surface has a Delaunay triangulation where edges are given by distance paths and where the number of vertices grows linearly as function of the genus g. We will show that the order of this bound is attained for some families of surfaces. Finally, we will give an example showing that the Ω(√g) lower bound in the more general case of topological surfaces is tight for hyperbolic surfaces as well.

Erik Amezquita, Michelle Quigley, Tim Ophelders, Elizabeth Munch and Daniel H. Chitwood. Quantifying barley morphology using Euler characteristic curves

Shape is foundational to biology. Observing and documenting shape has fueled biological understanding, and from this perspective, it is also a type of data.
The vision of topological data analysis, that data is shape and shape is data, will be relevant as biology transitions into a data-driven era where meaningful interpretation of large data sets is a limiting factor. We focus on quantifying the morphology of barley spikes and seeds using topological descriptors based on the Euler characteristic and relate the output back to genetic information.

Gill Barequet, Gil Ben-Shachar and Martha Carolina Osegueda. On the Power of Concatenation Arguments

We present several concatenation arguments, and show their applications to setting bounds on the growth constants of polyominoes and polycubes, and to families (e.g., trees) of polyominoes and polycubes whose enumerating sequences are pseudo sub- or super-multiplicative.

Andrei Asinowski, Gill Barequet, Gil Ben-Shachar, Martha Carolina Osegueda and Günter Rote. On The Number of Compositions of Two Polycubes

We provide tight bounds on the minimum and maximum possible numbers of compositions of two polycubes, either when each is of size n, or when their total size is 2n, in two and higher dimensions.

Apostolos Chalkis, Ioannis Z. Emiris and Vissarion Fisikopoulos. Practical volume estimation of polytopes by billiard trajectories and a new annealing schedule

We study the problem of estimating the volume of convex polytopes. Our algorithm is based on Multiphase Monte Carlo (MMC) methods and our main contributions include:

a new uniform sampler employing Billiard Walk (BW) for the first time in volume computation,
a new simulated annealing schedule, generalizing existing MMC by making use of adaptive convex bodies which fit to the input, thus drastically reducing the required number of iterations.

Extensive experiments show that our algorithm requires fewer oracle calls and arithmetic operations in total, when compared to existing practical algorithms for volume approximation. We offer an open-source, optimized C++ implementation, and analyze its performance. Our code tackles problems intractable so far, offering the first implementation to scale up to dimension d=100 for V-polytopes and zonotopes, and for d in the thousands for H-polytopes in less than one hour.

Sepideh Aghamolaei and Mohammad Ghodsi. Samples Using Geometric Counting Lower Bounds

Using Szemerédi–Trotter theorem for bounding the number of incidences between n points and m lines, we compute the sample size required for achieving that bound. We prove a similar result for the size of the dominating set of a graph and use it to find a sample for k-center.

Raphaël Tinarrage. Recovering the homology of immersed manifolds

Given a sample of an abstract manifold immersed in some Euclidean space, we describe a way to recover the singular homology of the original manifold. It consists in estimating its tangent bundle—seen as subset of another Euclidean space—in a measure theoretic point of view, and in applying measure-based filtrations for persistent homology.
The construction we propose is consistent and stable, and does not involve the knowledge of the dimension of the manifold. In order to obtain quantitative results, we introduce the normal reach, which is a notion of reach suitable for an immersed manifold.

Theo Lacombe and Vincent Divol. Understanding the Topology and the Geometry of the Persistence Diagram Space via Optimal Partial Transport

Despite the obvious similarities between the metrics used in topological data analysis and those of optimal transport, an explicit optimal transport based formalism to study persistence diagrams and similar topological descriptors has yet to come. By considering the space of persistence diagrams as a measure space, and by observing that its metrics can be expressed as optimal partial transport problems, we introduce a generalization of persistence diagrams, namely Radon measures supported on the upper half plane. Such measures naturally appear in topological data analysis when considering continuous representations of persistence diagrams (e.g. persistence surfaces) but also as limits for laws of large numbers on persistence diagrams or as expectations of probability distributions on the persistence diagrams space. This formalism allows us to prove topological properties of this new space, which will also hold for the closed subspace of persistence diagrams.
This bridge between optimal transport and persistence diagram metrics allows us to address various problems that naturally arise when dealing with persistence diagram, both in theory and in applications.
First, we provide a characterization of convergence in the space of persistence diagram (with respect to the diagram metrics) in terms of vague convergence of measures. This result provides a powerful tool to study convergence and continuity properties in the space of persistence diagrams; in particular it allows us to give an exhaustive characterization of continuous linear representations of persistence diagrams, a common tool used when incorporating persistence diagrams in machine learning pipeline.
Second, this formalism allows us to prove new results regarding persistence diagrams in a random setting, as it enables to manipulate some continuous such as expected persistence diagrams (that are not standard persistence diagrams) and to prove convergence rates and stability results in this context.
Finally, an optimal transport-based formalism has huge strengths in applications. Indeed, modern tools routinely used in computational optimal transport can be straightforwardly adapted to persistence diagrams metrics, allowing to estimate efficiently distances, barycenters, or quantizations of persistence diagrams to name a few.
This presentation is based on the preprint [Divol & Lacombe, 2019].

Jasna Urbančič and Primož Škraba. Optimizing Embeddings using Persistence

This work looks to optimize Takens-type embeddings of time series using persistent (co)homology. The motivation is to assume that an input time series exhibits periodic, quasi-periodic, or recurrent behaviour, then use continuous optimization to find good embeddings. Here we provide a practical approach to finding good embeddings with indications as to possible theoretical questions and directions which arise.

Mitchell Black and Amir Nayyeri. Finding Surfaces in 2-Dimnesional Simplicial Complexes with Bounded Treewidth 1-Skeletons

We present an algorithm for finding a subcomplex K' of a simplicial complex K such that K' is a surface with specified boundary, genus, and orientability. This algorithm is parameterized by the treewidth of the 1-skeleton of the simplicial complex K and the genus of the surface K'. Our algorithm is linear in the number of vertices of K if the treewidth of the 1-skeleton of K and the genus of K' are constant. Otherwise, our algorithm is polynomial if the treewidth of the 1-skeleton of K is constant.

Erin Chambers, Ellen Gasparovic, Tao Ju, David Letscher, Hannah Schreiber and Dan Zeng. Topology Aware Morphological Operations

This work in progress aims to give a solution of the homological simplification problem in ℝ² which is also geometrically nice.

Josh Vekhter and Etienne Vouga. Which Integrable Projection is Which?

Given a piecewise constant vector field v on a manifold m, we begin with the question of when it can be expressed as the gradient of some scalar field s, in which case we say that this field is discretely integrable. When v is not integrable, one can then begin considering projections from v to the space of integrable fields. In this document we note that there are a number of related notions of integrable projection defined in the literature. We present a framework in which we generalize three of these notions to simplicial complexes in arbitrary dimension, and which shows promise as a computational tool for solving otherwise combinatorially hard problems in computational topology.

Erfan Hosseini Sereshgi and Carola Wenk. Computing relevant subtrajectory bundles faster

The ubiquitous use of smartphones and other GPS-enabled devices leads to vast amounts of tracking data being collected on a regular basis. But generally, processing large GPS trajectory data requires bundling portions of the trajectories together; this is referred to as sub-trajectory clustering. Since this is a multi-criteria optimization problem, there are many potential bundles of interest and it is unclear which ones to choose. Recently an approach was proposed for identifying so-called relevant bundles of sub-trajectories. We provide a faster algorithm for computing such relevant bundles, by employing a combination of exponential and binary search. This yields a runtime improvement of a linear factor over the previous exhaustive search algorithm.