OSTRICH: Regular Constraint Propagation

🪪 Matthew Hague
🏫 Royal Holloway, University of London

The OSTRICH String Solver 🦒

An incomplete list of contributors:

🇬🇧 Taolue Chen (University of Surrey)
🇸🇪 Riccardo De Masellis (Uppsala University)
🇬🇧 Alejandro Flores-Lamas, Matthew Hague (Royal Holloway, University of London)
🇨🇳 Zhilei Han (Tsinghua University)
🇨🇳 Denghang Hu (University of Chinese Academy of Sciences)
🇩🇪 Anthony W. Lin, Oliver Markgraf (TU Kaiserslautern)
🇩🇪 Philipp Rümmer (University of Regensburg)
🇨🇳 Zhilin Wu (State Key Laboratory of Computer Science)

Publications: POPL 2016, POPL 2018, POPL 2019, ATVA 2020, POPL 2022

Position

🏅 SMT-COMP 2022: Best complementary solver for UNSAT instances
🏅 SMT-COMP 2023: Best overall in QF_S category

  • Automata-based
  • Support for complex string functions
    • Variants of replaceAll
    • Capture groups and references
  • Proving UNSAT

Approaches 🗺️

[Diagram: Core Strategy (Regular Constraint Propagation), shown alongside DPLL, Monadic Decomposition, Nielsen Transform, and Parikh Images]

Core Approach: Regular Constraint Propagation 🌱

For a simple word equation

✏️ Rewrite into "solved form"

Now solve the solved form

👉 Propagate to :

👉 Propagate to via :

👉 Propagate to :

❌ Contradiction

Proof Rules 🧑‍🏫

OSTRICH uses proof rules to search for a solution or contradiction.

We write rules upwards

From we can add .

"Standard" rules. E.g.

Notice that this rule branches: both branches need to be closed.

Propagation 🎋

The more interesting rules propagate regular constraints

🧐 Is this enough to solve

🤔 No: what do we do with ?

String Functions 🧶

is really a string function

Or more precisely

For concat, we can have a rule

😀 is still regular

Forward Propagation ⏩

Given

  • assigns the result of a function call
  • a regular constraint on the function argument

We can push forwards through to get

🧐 On the condition that is still regular

Works for multiple arguments
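
A minimal sketch of forward propagation through concat, assuming regular constraints are written as Python `re` patterns without backreferences. This is an illustration only: OSTRICH works on automata, and the helper name `forward_concat` is hypothetical.

```python
import re

def forward_concat(pattern_x: str, pattern_y: str) -> str:
    """Post-image of concat: if x matches pattern_x and y matches pattern_y,
    then x + y matches the returned pattern."""
    # Non-capturing groups keep alternations inside each argument intact.
    return f"(?:{pattern_x})(?:{pattern_y})"

# x in a*, y in b+  ==>  z = x . y in a*b+
z_pattern = forward_concat("a*", "b+")
assert re.fullmatch(z_pattern, "aab")
assert re.fullmatch(z_pattern, "b")
assert not re.fullmatch(z_pattern, "ba")
```

Both arguments contribute a constraint here, which is the multi-argument case mentioned above.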

Forward-able Functions ⏩

We can support several string functions with forward propagation

  • for fixed
  • Rational Transductions

🤔 What about and ?

ReplaceAll function ♻️

(🐭 😟)

Replace each instance of "cat" 🐈 in with "leopard" 🐆

For fixed strings is still regular.

What if the leopard is a variable?

❌ Non-regular
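
The contrast can be sketched in Python, using `str.replace` as a stand-in for replaceAll with a literal pattern (an assumption for illustration; the helper name `replace_all` is hypothetical). With a variable replacement, every occurrence is replaced by the same value, so the outputs contain repeated copies of an arbitrary string, which is the usual reason such an image is not regular.

```python
def replace_all(x: str, pattern: str, replacement: str) -> str:
    """Stand-in for replaceAll with a literal pattern
    (leftmost, non-overlapping occurrences)."""
    return x.replace(pattern, replacement)

# Fixed strings: each "cat" becomes "leopard".
assert replace_all("cat and cat", "cat", "leopard") == "leopard and leopard"

# Variable replacement: both occurrences receive the SAME value y,
# so every output for x = "cat#cat#" has the shape y#y#.
for y in ["a", "ab", "abba"]:
    out = replace_all("cat#cat#", "cat", y)
    first, second, rest = out.split("#")
    assert first == second == y and rest == ""
```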

Backward Propagation ⏪

➡️ We can't combine these to get a regular constraint on

because is not regular.

⬅️ We can combine

to get a regular constraint on .

That is, is regular.

Backward Concat

What about

We have the following constraints on and .

We don't get separate constraints on and :

Allows e.g. and .
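
A small sketch of why concat yields a union of pairs rather than one constraint per argument. Assuming the constraint on the result is given by a total DFA, splitting the DFA at each state q gives one pair: the first argument must drive the DFA from the initial state to q, and the second from q to an accepting state. The DFA (words over {a, b} containing "ab") and all names below are hypothetical examples.

```python
from itertools import product

# Total DFA for "contains ab": delta maps (state, char) -> state.
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}
INITIAL, FINALS, STATES = 0, {2}, {0, 1, 2}

def run(start, word):
    """State reached from `start` after reading `word`."""
    state = start
    for ch in word:
        state = DELTA[(state, ch)]
    return state

def in_result_language(word):
    return run(INITIAL, word) in FINALS

def backward_concat():
    """One (test on x, test on y) pair per split state q."""
    return [
        (lambda x, q=q: run(INITIAL, x) == q,
         lambda y, q=q: run(q, y) in FINALS)
        for q in sorted(STATES)
    ]

def in_preimage(x, y):
    return any(px(x) and py(y) for px, py in backward_concat())

# The union of pairs is exactly the pre-image of the result language.
words = ["".join(w) for n in range(4) for w in product("ab", repeat=n)]
assert all(in_preimage(x, y) == in_result_language(x + y)
           for x in words for y in words)

# "a" and "b" are jointly allowed, although neither contains "ab" alone.
assert in_preimage("a", "b")
assert not in_preimage("b", "a")
```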

Recognisable Constraints 👓

A recognisable constraint is a finite union of regular products.

In general

Backwards Propagation Rule 📏

In general, for we require

We then get the backwards rule

Backward-able Functions ⏪

Several functions can run backwards, even if they are not forward-able.

  • , .
  • and with
    • , string literals,
    • a regular expression and a variable [POPL 2018],
    • a regex with capture groups, a pattern with references [POPL 2022]
  • Rational Transductions
  • Streaming string transductions
  • Poly-regular functions

Capture groups are handled using prioritised streaming string transducers.

Complexity ranges from polynomial to a tower of exponentials.

Forward propagation tends to be cheaper, backwards richer.

Closing a Proof Search 📪

We can conclude UNSAT if we find contradictory constraints
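
One way to detect contradictory regular constraints on the same variable is an emptiness check on the product automaton. The sketch below assumes each constraint is a total DFA; the function name and the example automata are hypothetical, and this is not a description of OSTRICH's internals.

```python
from collections import deque

def intersection_is_empty(dfa1, dfa2, alphabet):
    """Each DFA is (delta, initial, finals) with a total delta:
    (state, char) -> state. Explores the product automaton and
    returns True iff no word is accepted by both DFAs."""
    (d1, i1, f1), (d2, i2, f2) = dfa1, dfa2
    seen = {(i1, i2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if p in f1 and q in f2:
            return False  # a common word exists
        for ch in alphabet:
            nxt = (d1[(p, ch)], d2[(q, ch)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Parity of the number of "a"s: state 0 = even, state 1 = odd.
PARITY = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
EVEN_A = (PARITY, 0, {0})
ODD_A = (PARITY, 0, {1})

assert intersection_is_empty(EVEN_A, ODD_A, "ab")       # contradiction
assert not intersection_is_empty(EVEN_A, EVEN_A, "ab")  # consistent
```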

When can we conclude SAT?

Completeness Guarantees: Straight-Line Programs 📏

A straight-line string constraint is of the form

where is only used after its "definition" .

Can be generalised to "chain-free".
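
The "used only after its definition" condition can be checked syntactically. The sketch below assumes a constraint is represented as a list of assignments in program order, each a pair of the defined variable and its argument variables; this encoding and the name `is_straight_line` are hypothetical.

```python
def is_straight_line(assignments):
    """assignments: list of (defined_var, [argument_vars]) in program order.
    Returns True iff every variable is defined at most once and is never
    used as an argument at or before its own definition."""
    defined = [lhs for lhs, _ in assignments]
    if len(set(defined)) != len(defined):
        return False  # some variable is defined twice
    not_yet_defined = set(defined)
    for lhs, args in assignments:
        if any(a in not_yet_defined for a in args):
            return False  # argument is used before (or in) its definition
        not_yet_defined.discard(lhs)
    return True

# y := f(x); z := g(y, x) is straight-line (x is an input variable).
assert is_straight_line([("y", ["x"]), ("z", ["y", "x"])])
# y := f(z); z := g(x) uses z before its definition.
assert not is_straight_line([("y", ["z"]), ("z", ["x"])])
```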

Complete Algorithm for Straight-Line ⌛

  • Apply backwards propagation from down to .
  • Branch on all alternatives .
  • If a leaf is such that
    • There is an assignment to , and
    • All are satisfied
  • Then we are SAT.
  • If all branches close UNSAT, we are UNSAT.
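
The steps above can be sketched for a deliberately restricted setting: the only function is concat, and every regular constraint is a total DFA given as `(delta, start, finals)`. All names are hypothetical and this is an illustration of the idea, not OSTRICH's implementation.

```python
from itertools import product

def nonempty(constraints, alphabet):
    """True iff some word satisfies all constraints (product search)."""
    start = tuple(s for _, s, _ in constraints)
    seen, stack = {start}, [start]
    while stack:
        states = stack.pop()
        if all(q in f for q, (_, _, f) in zip(states, constraints)):
            return True
        for ch in alphabet:
            nxt = tuple(d[(q, ch)] for q, (d, _, _) in zip(states, constraints))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def states_of(delta):
    return sorted({s for s, _ in delta} | set(delta.values()))

def solve(assignments, constraints, alphabet):
    """assignments: list of (z, x, y) meaning z := x . y, in program order.
    constraints: dict mapping a variable to its list of DFA constraints.
    Returns True for SAT, False for UNSAT."""
    if not assignments:
        # Leaf: only input variables remain; check each one separately.
        return all(nonempty(cs, alphabet) for cs in constraints.values())
    *rest, (z, x, y) = assignments  # propagate backwards from the last one
    on_z = constraints.get(z, [])
    # Branch on one split state per constraint on z.
    for choice in product(*(states_of(d) for d, _, _ in on_z)):
        branch = {v: list(cs) for v, cs in constraints.items() if v != z}
        for q, (d, s, f) in zip(choice, on_z):
            branch.setdefault(x, []).append((d, s, {q}))  # x goes s -> q
            branch.setdefault(y, []).append((d, q, f))    # y goes q -> f
        if solve(rest, branch, alphabet):
            return True
    return False  # every branch closed

CONTAINS_AB = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1,
                (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}, 0, {2})
ONLY_A = ({(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}, 0, {0})
ONLY_B = ({(0, "b"): 0, (0, "a"): 1, (1, "a"): 1, (1, "b"): 1}, 0, {0})

prog = [("z", "x", "y")]
# x, y in a* and z contains "ab": UNSAT.
assert not solve(prog, {"z": [CONTAINS_AB], "x": [ONLY_A], "y": [ONLY_A]}, "ab")
# x in a*, y in b* and z contains "ab": SAT (e.g. x = "a", y = "b").
assert solve(prog, {"z": [CONTAINS_AB], "x": [ONLY_A], "y": [ONLY_B]}, "ab")
```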

(For non-straight-line constraints, we also have other ways of concluding SAT, which assign values using a rule and eliminate function applications to leave only regular constraints.)

str.len? str.indexof? str.charat? 📏

OSTRICH currently has limited support for integer functions.

However, OSTRICH-CEA uses cost-enriched finite automata to model string/integer functions.

  • Automata model can increment counters.
  • Emptiness uses some Parikh/Presburger constraints
  • Implemented in OSTRICH-CEA tool.
  • ATVA 2020
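
A simplified illustration of an automaton whose transitions increment counters. The encoding below is an assumption made for this sketch and not the formal model from the ATVA 2020 paper: each transition carries a tuple of increments, and a run returns both acceptance and the final counter values.

```python
def run_counting_automaton(delta, initial, finals, word, num_counters):
    """delta maps (state, char) -> (next_state, increments), where
    increments is a tuple added component-wise to the counters.
    Returns (accepted, counters)."""
    state, counters = initial, [0] * num_counters
    for ch in word:
        if (state, ch) not in delta:
            return False, tuple(counters)  # no transition: reject
        state, incs = delta[(state, ch)]
        counters = [c + i for c, i in zip(counters, incs)]
    return state in finals, tuple(counters)

# One state; counter 0 tracks the length, counter 1 counts occurrences of "a".
LEN_AND_AS = {(0, "a"): (0, (1, 1)), (0, "b"): (0, (1, 0))}

accepted, (length, num_as) = run_counting_automaton(LEN_AND_AS, 0, {0}, "abba", 2)
assert accepted and length == 4 and num_as == 2
```

An integer constraint such as a bound on the length then becomes a constraint on the counter values reached by accepting runs.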

Conclusion ⌛

  • OSTRICH string constraint solver using regular constraint propagation.
    • String functions with regular post-image / recognisable pre-image
  • Good performance on UNSAT instances
  • Starting to show some strength in SMT-COMP
  • Completeness guarantees for decidable fragments
  • Rich constraint language including capture/pattern replacement

Future work

  • Improved performance on general instances
  • Substring functions
  • String-to-integer functions
  • Improved handling of integers in OSTRICH (lessons from OSTRICH-CEA)