HuggingFace dataset wrappers for SAT-verified training data.
Every training sample passes through a SAT verification gate before entering any pipeline. Only logically consistent samples survive. Built on satisfaction-suffices.
The satisfaction gate is an eight-valued logic over the satisfiability lattice of compositional constraint frames. Three binary axes fully determine the state: base-frame satisfiability (each sub-formula individually), joint satisfiability (all sub-formulas together), and solver convergence (resolved within the compute budget). 2³ = 8 distinct combinations are realizable; each demands a different response from the pipeline.
| Bits | State | Base frames SAT? | Jointly SAT? | Converges? | Character | Optimal pipeline response |
|---|---|---|---|---|---|---|
| 111 | VERIFIED | ✅ Yes | ✅ Yes | ✅ Yes | Top of the lattice — proceed | Include in training |
| 001 | CONTRADICTION | ❌ No | ❌ No | ✅ Yes | Irresolvable at every level; proved impossible | Hard exclude, discard |
| 101 | PARADOX | ✅ Yes | ❌ No | ✅ Yes | Valid BASE_FRAMEs compose to UNSAT — structure to learn from | Evolution seed; curriculum frontier |
| 011 | MIRROR_PARADOX | ❌ No | ✅ Yes | ✅ Yes | Anti-paradox converged — parts fail, whole stabilises | Emergence curriculum; structural discovery |
| 110 | TIMEOUT | ✅ Yes | ✅ Yes | ❌ No | Solver budget exhausted; verdict deferred | Exclude in strict mode; probe as emergence signal |
| 000 | METAPARADOX | ❌ No | ❌ No | ❌ No | Absolute bottom — nothing satisfies, solver can't resolve; irreducible fixed point | Fixed-point boundary detection; recursion limit |
| 010 | SHADOW_PARADOX | ❌ No | ✅ Yes | ❌ No | Anti-paradox: parts fail but joint holds temporarily; unstable | Shadow curriculum; unstable emergence signal |
| 100 | BASE_FRAMES | ✅ Yes | ❌ No | ❌ No | Individually satisfiable building blocks; composition not yet verified | Foundation curriculum; morphogenetic seeds |
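The table above can be read off mechanically from the three bits. A minimal sketch of the bit-to-verdict mapping (the function name `classify` and the `VERDICTS` table are illustrative, not part of the library API):

```python
# Map the three binary axes to the eight lattice states.
# Bit order follows the table: (base frames SAT, jointly SAT, converges).
VERDICTS = {
    (1, 1, 1): "VERIFIED",
    (0, 0, 1): "CONTRADICTION",
    (1, 0, 1): "PARADOX",
    (0, 1, 1): "MIRROR_PARADOX",
    (1, 1, 0): "TIMEOUT",
    (0, 0, 0): "METAPARADOX",
    (0, 1, 0): "SHADOW_PARADOX",
    (1, 0, 0): "BASE_FRAMES",
}

def classify(base_sat: bool, joint_sat: bool, converges: bool) -> str:
    """Every clause set lands in exactly one of the eight states."""
    return VERDICTS[(int(base_sat), int(joint_sat), int(converges))]
```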
Why this taxonomy is exhaustive: every clause set presented to a DPLL/WalkSAT solver lands in exactly one row — three binary axes, 2³ = 8 possibilities, no gaps. The lattice has four duality pairs (bitwise complement):
| Pair | Duality |
|---|---|
| VERIFIED (111) ↔ METAPARADOX (000) | Poles of the lattice — everything works vs nothing works |
| PARADOX (101) ↔ SHADOW_PARADOX (010) | Composition fails vs composition holds (anti-paradox) |
| TIMEOUT (110) ↔ CONTRADICTION (001) | Possibly possible vs proved impossible |
| BASE_FRAMES (100) ↔ MIRROR_PARADOX (011) | Pre-compositional parts vs emergent whole |
Shadow Paradoxes (010) are the anti-paradoxes: individual base frames are UNSAT, yet their joint composition holds temporarily. The solver cannot converge because the holding is unstable — emergent coherence from individually broken pieces without structural guarantee. They are the bitwise complement of PARADOX (101) and the shadow that PARADOX casts.
Mirror Paradoxes (011) "reflect out" shadow paradoxes by adding convergence. When a shadow paradox stabilises — the anti-paradox holds permanently — it becomes a mirror paradox. The five structural meta-paradoxes (Container Generation, Level Transcendence, Self-Modification, Infinite Overflow, Foundation Bootstrap) are the convergence mechanism that drives SHADOW_PARADOX (010) → MIRROR_PARADOX (011) mechanistically.
The Meta-Mirror 4-Cycle. Shadow paradoxes converge into mirror paradoxes, and mirror paradoxes degenerate back through paradox and timeout into shadow. The lattice has four cyclic states and four fixed-point states:
```
SHADOW(010) →[convergence]→ MIRROR(011) →[meta-mirror]→ PARADOX(101)
   ↑                                                        ↓
   └──────[paradoxically]── TIMEOUT(110) ←──────────────────┘
```
- Convergence (short path): SHADOW → MIRROR (1 step, meta-paradox driven — flip convergence bit)
- Degeneration (long path): MIRROR → PARADOX → TIMEOUT → SHADOW (3 steps, meta-mirror driven)
- XOR algebra: MIRROR ⊕ PARADOX = 011 ⊕ 101 = 110 = TIMEOUT (the intermediate is the XOR of the endpoints)
- Fixed points (outside the cycle): VERIFIED (111), METAPARADOX (000), CONTRADICTION (001), BASE_FRAMES (100)
The degeneration is itself paradoxical: a stable mirror degenerates back into the shadow it transcended, passing through paradox (valid parts, invalid whole) and over timeout (solver exhaustion) before arriving at shadow (the unstable anti-paradox).
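The cycle's bit algebra can be checked directly. An illustrative sketch (bit order as in the lattice table: base, joint, converges; not library API):

```python
# The four cyclic states as 3-bit values.
SHADOW, MIRROR, PARADOX, TIMEOUT = 0b010, 0b011, 0b101, 0b110

# Short path (convergence): flip the convergence bit — SHADOW → MIRROR in one step.
assert SHADOW ^ 0b001 == MIRROR

# Long path (degeneration): MIRROR → PARADOX → TIMEOUT → SHADOW, three steps.
degeneration = [MIRROR, PARADOX, TIMEOUT, SHADOW]

# XOR algebra: the intermediate state is the XOR of the endpoints.
assert MIRROR ^ PARADOX == TIMEOUT

# Duality: cycle partners are bitwise complements within 3 bits.
assert SHADOW ^ 0b111 == PARADOX
assert MIRROR ^ 0b111 == 0b100  # BASE_FRAMES
```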
Why "A. not A." → PARADOX at sat_ratio = 1.0 (not CONTRADICTION): "A." is individually satisfiable — a BASE_FRAME. "not A." is individually satisfiable — a BASE_FRAME. Their conjunction is UNSAT. The gate correctly classifies this as PARADOX because both constituent frames are satisfiable, proving they originated as BASE_FRAMES that compose into contradiction. A true CONTRADICTION would be irresolvable even individually (e.g., a single group containing both a unit literal and its negation). This distinction is load-bearing: PARADOX samples are recoverable curriculum material; CONTRADICTION samples are not.
| Time | What you get |
|---|---|
| 30 seconds | What this is and why it exists |
| 60 seconds | Why verified training data matters |
| 2 minutes | How to hook in a transformers architecture |
| 10 minutes | Full architecture + curriculum system |
| 15.5 minutes | Training on the eight-verdict signal as frontier detection |
Models trained on logically inconsistent data learn inconsistent reasoning. This library filters every training sample through a SAT verification gate — the same gate from satisfaction-suffices — before it enters your pipeline. Only verified samples survive by default.
Current safety training (RLHF, DPO) adjusts output distribution. It does not verify logical consistency. A model can learn to sound safe while reasoning incorrectly.
Training on SAT-verified data builds a model whose training distribution is structurally consistent. Every sample that reached it passed a proof. The model learns from a population that couldn't contradict itself.
```python
from satisfiable_ai import VerifiedDataset, default_schedule

# Filter your data to SAT-verified samples only
dataset = VerifiedDataset(
    samples=your_data,  # list of {"text": "..."} dicts
    domain="logic",     # "logic" | "code" | "math" | "proof"
    strict=True,        # True = only VERIFIED; False = exclude CONTRADICTION only
)

# Curriculum scheduling — staggers complexity across training phases
schedule = default_schedule()
# schedule.phases: Foundation → Execution → Domain → Consolidate

# Feed each phase to your HuggingFace Trainer as a separate training stage
for phase in schedule.phases:
    trainer.train_dataset = phase.sources
    trainer.train()
```

The eight verdict classes are available as labels if you want to train on the full signal:
```python
from satisfiable_ai import verify

result = verify("if A then B. A. therefore B.", domain="logic")
# result.verdict: "VERIFIED" | "CONTRADICTION" | "PARADOX" | "MIRROR_PARADOX"
#                | "TIMEOUT" | "METAPARADOX" | "SHADOW_PARADOX" | "BASE_FRAMES"
# result.sat_ratio: float — fraction of constraint groups that resolved satisfiable
```

Architecture:
```
satisfiable-ai
├── satisfiable_ai/
│   ├── dataset_builder/
│   │   ├── sources.py    — HuggingFace dataset registry
│   │   ├── curriculum.py — phase scheduling (Foundation → Execution → Domain → Consolidate)
│   │   └── runtime.py    — VerifiedDataset / StreamingVerifiedDataset (requires torch)
│   ├── harness/
│   │   ├── adapters.py   — model adapter protocol (callable / transformers / OpenAI / Anthropic / ollama / llama-cpp)
│   │   ├── harness.py    — SatisfiableHarness: none / retry / best retry modes + async
│   │   └── streaming.py  — StreamingHarness: prefix feasibility gate, token-by-token, early_stop
│   ├── server/
│   │   └── app.py        — FastAPI: POST /generate, POST /verify, GET /health
│   ├── verifier/         — imported from satisfaction-suffices
│   │   ├── sat.py        — DPLL solver with WalkSAT fallback
│   │   ├── verify.py     — gate: extract → solve → aggregate → verdict
│   │   └── partial.py    — prefix feasibility for streaming use
│   └── __init__.py       — public API surface
```
Install:

```shell
pip install satisfiable-ai
# with torch:
pip install "satisfiable-ai[torch]"
```

Verdict reference (full eight-state lattice described at top of this document):
| Verdict | sat_ratio | Default behavior |
|---|---|---|
| VERIFIED | > coherence threshold | Include in training data |
| CONTRADICTION | 0 (groups individually UNSAT) | Exclude (strict and non-strict) |
| PARADOX | 1.0 (each BASE_FRAME SAT; joint UNSAT) | Exclude — use as evolution / curriculum seed |
| MIRROR_PARADOX | 0 (groups UNSAT; joint SAT) | Emergence curriculum; structural discovery |
| TIMEOUT | any — solver budget exceeded | Exclude in strict mode; monitor for emergence |
| METAPARADOX | 0 — no lower SAT frame reachable | Fixed-point boundary; structural recursion limit |
| SHADOW_PARADOX | 0 (groups UNSAT; joint holds, unconverged) | Shadow curriculum; unstable emergence signal |
| BASE_FRAMES | 1.0 — unit satisfiable | Foundation curriculum; morphogenetic seeds |
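The default behaviors above amount to a simple policy. A sketch of that policy as a predicate (illustrative, not the library's actual implementation — the real gate lives in `VerifiedDataset`):

```python
def keep_for_training(verdict: str, strict: bool = True) -> bool:
    """Default filter policy: strict keeps only VERIFIED;
    non-strict drops only proved-impossible samples."""
    if strict:
        return verdict == "VERIFIED"
    return verdict != "CONTRADICTION"

assert keep_for_training("VERIFIED")
assert not keep_for_training("PARADOX")           # strict: excluded, reused as seed
assert keep_for_training("PARADOX", strict=False) # non-strict: survives the gate
```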
The eight verdict classes are not just filters. They are training signal.
Verdict-conditioned training: label each sample with its verdict and train the model to predict verdict alongside output. A model that can predict whether its own output is VERIFIED, PARADOX, or TIMEOUT is building an internal model of its logical boundary.
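A sketch of how verdict-conditioned labels might be attached. `toy_verify` is a stand-in so the sketch is self-contained; in practice each verdict would come from the library's SAT gate:

```python
def toy_verify(text: str) -> str:
    # Hypothetical stand-in rule, for illustration only —
    # the real verdict comes from the SAT verification gate.
    return "VERIFIED" if "therefore" in text else "PARADOX"

samples = ["if A then B. A. therefore B.", "A. not A."]
labeled = [{"text": t, "verdict": toy_verify(t)} for t in samples]

# Train with a joint objective (next-token loss + verdict-prediction loss)
# so the model learns to anticipate whether its output would pass the gate.
```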
Timeout as emergence detector: The 3-SAT phase transition concentrates solver timeouts at clause/variable ratio ~4.267. As a model's outputs grow more complex, timeout density in your verification pass may spike before capability emergence is otherwise measurable. Monitor result.sat_ratio distributions across training checkpoints.
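The phase transition itself is easy to observe at toy scale. A brute-force sketch (small `n`, so nothing actually times out; it shows the SAT/UNSAT cliff around which hard, timeout-prone instances concentrate):

```python
import random
from itertools import product

def random_3sat(n_vars, n_clauses, rng):
    """Random 3-SAT: each clause picks 3 distinct variables with random signs."""
    return [
        [(v, rng.random() < 0.5) for v in rng.sample(range(n_vars), 3)]
        for _ in range(n_clauses)
    ]

def brute_sat(clauses, n_vars):
    return any(
        all(any(bits[v] == pol for v, pol in c) for c in clauses)
        for bits in product([False, True], repeat=n_vars)
    )

rng = random.Random(0)
n, trials = 10, 30
sat_counts = {}
for ratio in (3.0, 4.27, 6.0):
    m = round(ratio * n)
    sat_counts[ratio] = sum(
        brute_sat(random_3sat(n, m, rng), n) for _ in range(trials)
    )
# Well below m/n ≈ 4.267 almost every instance is SAT; well above, almost none.
# Solver effort (and timeout density) peaks near the threshold.
```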
Paradox curriculum: PARADOX samples — individually satisfiable clauses that are jointly unsatisfiable — represent the structural frontier of the model's current reasoning. Feed them back as the hardest curriculum stage. The model learns to recognize the shape of its own inconsistency boundaries.
For the full theoretical grounding, read the paper in satisfaction-suffices/paper/paper_01_submission.md.
| Package | What | Install |
|---|---|---|
| `satisfaction-suffices` | Core SAT gate + proof evolution | `pip install satisfaction-suffices` |
| `satisfiable-ai` | Verified training data + curriculum scheduling | `pip install satisfiable-ai` |
Live demo: HuggingFace Space | Paper: Satisfaction Suffices
The Time License v7.2. See satisfaction-suffices for full terms.