Riddler: Anti-isoceles sets

Problem

This feels like a NP-complete problem, where we are faced with too many possibilities to recursively exhaust for an exact answer for anything more than a small grid width \(N\). Not quite sure yet which of the Karp problems most resembles this one, though. My approach here was simply to use the brute-force recursive approach, which allowed me to find solutions up to \(N = 7\).

While getting exact answers is possible for a limited range of grid-widths, the broader relationship seems hard to get a fix on.

Limited solutions

The maximum size for the anti-isoceles set at each given grid-width is given here, with example plots below.

## # A tibble: 7 x 2
##   grid_size anti_isoceles_set_size
##       <dbl>                  <dbl>
## 1         1                      1
## 2         2                      2
## 3         3                      4
## 4         4                      6
## 5         5                      7
## 6         6                      9
## 7         7                     10

Unique distances in the grid

Is the maximum size of the anti-isoceles set a function of the number of unique distances in the grid? The number of unique distances in the grid is a sequence with initial terms 1, 3, 6, 10, 15, 20, 27, and discussed as OEIS A047800.

The OEIS entry includes this conjecture from Vaclav Kotesovec (after Landau and Erdős), which shows a mostly quadratic term from the triangular sequence muted somewhat.

\(a(n) \sim c \cdot \frac{n^2}{\sqrt{\log(n)}}, c = 0.79\)

Unique distances and relationship to largest anti-isoceles set

We might pose a possible relationship between the root of the number of unique distances and the size of the largest anti-isoceles set for a given grid width \(N\). Basically, that the relationship is roughly linear with \(N\), but takes into account that muting term above. Here, we plot these values, fitting a square root curve to the data.

Distribution of anti-isoceles set sizes

Looking beyond just the largest anti-isoceles set, it seems like that perhaps we can model the distribution at play here, using a Conway-Maxwell-Poisson (CMP) distribution. Does that suggest the a range of likely maximum values for each grid size?



Code

import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist


def unique_distances(grid_size: int):
    """Get number of unique distances in coordinate grid"""

    coords = [[x, y] for x in list(range(grid_size))
              for y in list(range(grid_size))]
    distances = cdist(coords, coords, metric="sqeuclidean")
    return len(np.unique(distances))


def validate(current):
    """Validate whether current set of points contains no isoceles triangles"""

    distances = cdist(current, current, metric="sqeuclidean")
    valid = np.all(
        np.apply_along_axis(
            lambda x: np.all(np.unique(x, return_counts=True)[1] <= 1),
            axis=1, arr=distances
        ))
    return valid


def construct(current, candidates):
    """Construct set of non-isoceles points"""

    if not candidates:
        return []

    with_new_item = current + [candidates[0]]
    if validate(with_new_item):
        return (
            construct(current, candidates[1:]) +
            construct(with_new_item, candidates[1:]) +
            [with_new_item]
        )
    return construct(current, candidates[1:])


def process_grid(grid_size: int):
    """Process grid with side length grid_size"""

    coords = [[x, y] for x in list(range(grid_size))
              for y in list(range(grid_size))]
    results = construct([], coords)
    return np.max([len(result) for result in results])


def main_control(max_size: int):
    """Loop through grids and process each"""

    max_sizes = [process_grid(idx) for idx in in list(range(max_size))]
    print(max_sizes)

if __name__ == "__main__":
    main_control(max_size=10)