Skip to content

Using efficient string algorithms for match site enumeration #16

@apirogov

Description

@apirogov

We have a method to check a pair of intervals and determine the maximal match interval.

But doing brute force on match pairs to find the maximal matching regions seems inefficient.

Idea:

  • Use enchanced suffix array / suffix tree methods to compute all maximal match candidates in linear time
  • for each candidate match, ensure the non-self-intersection property
  • return the possible maximal match sites in a form that is easily iterable and reusable

This can be used to construct an automatic tiling brute force algorithm with rat packs (i.e. starting from a set of tiles, try producing increasingly large patches

Possibly this algorithm could be better than using some SAT approach. If a larger tiling exists, this algorithm must find it after $n$ rounds if we keep all previous rounds, i.e. in each round take the results of the last level and produce all valid larger tiles.

This can be seen as a form of dynamic programming, instead of brute force SAT solving or tiling one-by-one inefficiently. That way in each step we produce exponentially large patches! Each round is polynomial in the length of the input of that round.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions