Validating Mondher's correctness

Mondher's correctness rests on three layers, each catching a different class of bug:

Layer 1 — Internal invariants

Every algorithm in Mondher carries its mathematical contract as a test. The Galois connection laws (extensivity, antitonicity, idempotence) are verified on hundreds of randomly generated contexts per CI run. The canonical base is verified to be both sound (every implication holds) and complete (every valid implication follows from it). The covering relation is verified to be a true Hasse diagram (no redundant edges, no missing edges).

These property tests catch bugs in our own algorithms without reference to any external tool.

Layer 2 — Canonical corpus

A hand-picked set of small contexts (tests/corpus/) gives us hand-verified concept counts and base sizes. Failures here usually mean a structural change in an algorithm — and the small size makes diagnosis fast.

Layer 3 — External cross-validation

A frozen reference corpus (tests/reference/) records the outputs Mondher must produce for ~10 canonical FCA contexts. Each reference value is independently verified against at least one of:

Published literature: e.g., the Living Beings context's concept count of 19 (Ganter and Wille 1999, p. 21).
ConExp: the de facto FCA reference tool from 2003 onward.
fcaR: the R package, actively maintained as of 2026.
conexp-clj: the Clojure rewrite, also actively maintained.

When two of these tools agree on a value, we lock it as the reference. When they disagree, we treat the discrepancy as a research question and document it — disagreement between mature tools is rare and almost always traceable to a known historical bug.

This layer answers the question "is Mondher saying the same thing as the rest of the FCA world?" — not just internal consistency, but objective agreement.

Adding a new reference context

See crates/mondher-engine/tests/reference/README.md for the mechanical procedure. The short version: drop a .cxt file, run it through two reference tools, record the values in corpus.json, run the cross-validation test. The commit message should cite the tools that confirmed the values.

Provisionally-locked entries

Entries whose notes field contains "pending fcaR cross-check" or "pending conexp-clj cross-check" are locked from Mondher's own output because installing the external tool was infeasible at curation time. These entries function as regression checks — they catch changes to Mondher's behavior — but they are not yet true cross-validations. They are upgraded to verified status when the external check is done.

What this gives us

Researchers using Mondher get three guarantees:

Every operation produces output that is internally consistent with FCA theory.
Every operation's output matches the FCA literature on canonical examples.
Mondher disagrees with no other mature FCA tool on any context in the reference corpus.

Format coverage

Mondher reads five context formats interchangeably:

Burmeister .cxt — the de facto FCA standard.
CSV / TSV — what spreadsheets export.
FIMI .dat — the standard transactional format used in itemset-mining benchmarks (Mushroom, Adult, Chess, BMS-*).
FCA-XML — the legacy ConExp-era XML format.
Native JSON — Mondher's own diff-friendly interchange format.

The binary auto-detects format from file extension or content; no --format flag is needed. Cross-format consistency is verified by the test suite: a context written in any one of these formats and read back produces the same incidence matrix.

This matters for adoption: researchers can throw Mondher at whatever files they already have — no manual conversion step — and get the same answers as their existing toolchain.

The cost of building this is real — adding a reference context takes 30–60 minutes of cross-tool comparison. The payoff is the credibility needed for Mondher to be cited in FCA research.