Co-folding models produce candidate complexes and scalar rankings. SIGMA decomposes each candidate into the structural failure modes that rankings hide: backbone quality, interface sufficiency, interface geometry, and consensus stability. A high rank is not a receipt. A pretty complex is not proof. A scalar score is not structural verification.
Co-Fold is the protein-complex application of the Invariant verification stack.
Generated is not verified. Same engine. Different evidence. Same receipt.
Systems such as OpenFold3, AlphaFold3, Boltz-2, and Chai-1 can generate protein-ligand complex predictions and associated confidence or ranking signals. That ranking reflects the model's internal belief, not independent structural verification. A candidate can score high on confidence while carrying a structural defect or an under-engaged binding interface.
SIGMA adds a missing layer: deterministic post-prediction verification that decomposes each candidate into backbone quality, interface sufficiency (ligand engagement), interface geometry, and consensus stability. The output is a signed receipt with a structural diagnosis, not another learned score.
The experimental 5FDR crystal structure is used only after verification, for post hoc comparison.
Drop a PDB or mmCIF from OpenFold3, AlphaFold3, Boltz-2, Chai-1, or any co-folding model. SIGMA decomposes the structure into backbone quality, interface geometry, interface sufficiency, and consensus stability. No reference structure needed. Self-consistency verification.
We ran SIGMA on all five OpenFold3 MCL-1 seed_42 samples and compared against the experimental 5FDR crystal structure. SIGMA used no reference structure during verification. The experimental comparison is post hoc only.
| Sample | OF3 Rank | SIGMA Rank | CoFold | IfaceGeo | IfaceSuf | ChainRMSD* (Å) | Contact.J | Diagnosis |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 2 | 0.9643 | 0.9516 | 1.0000 | 1.051 | 0.506 | Best experimental contact match |
| 2 | 2 | 3 | 0.9489 | 0.9423 | 0.9556 | 1.140 | 0.330 | Mid-rank; interface sufficiency below 1.0 |
| 3 | 5 | 4 | 0.9439 | 0.9245 | 0.9567 | 1.052 | 0.387 | Lower interface geometry |
| 4 | 3 | 1 | 0.9786 | 0.9615 | 0.9743 | 1.074 | 0.257 | Highest interface geometry and co-fold quality |
| 5 | 1 | 5 | 0.9195 | 0.7374 | 1.0000 | 1.133 | 0.389 | OF3 top-ranked; interface geometry failure |
| 5FDR | EXP | REF | 0.9272 | 1.0000 | 1.0000 | 0.000 | 1.000 | Experimental ground truth |
ChainRMSD and Contact.J are post hoc experimental comparison metrics. They are not used by SIGMA during verification.
*Mean per-chain CA-RMSD after independent chain alignment. Whole-assembly RMSD is not used as the primary metric because 5FDR contains four MCL-1 copies in the asymmetric unit and quaternary packing differs across structures.
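The per-chain metric in the footnote above can be sketched as follows. This is a minimal illustration, not SIGMA's implementation: it assumes NumPy is available, that the CA atoms of each predicted chain have already been matched residue-by-residue to a reference chain, and that the superposition is a standard Kabsch alignment. The function names `kabsch_rmsd` and `mean_chain_rmsd` are hypothetical.

```python
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) CA coordinate sets after optimal superposition."""
    # Remove translation by centering both sets on their centroids.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    U, _, Vt = np.linalg.svd(P0.T @ Q0)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ P0.T).T - Q0
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def mean_chain_rmsd(pred: dict, ref: dict) -> float:
    """Align each shared chain independently, then average the per-chain RMSDs."""
    shared = sorted(set(pred) & set(ref))
    return float(np.mean([kabsch_rmsd(pred[c], ref[c]) for c in shared]))
```

Because each chain is superposed on its own, differences in quaternary packing between structures do not inflate the result, which is the motivation the footnote gives for avoiding whole-assembly RMSD.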
5FDR co-fold quality (0.9272) sits below four of the five predictions. A deposited crystal structure carries real thermal and refinement deviation that idealized predictions do not, so predicted models can score above the experimental structure on pure geometry. This is expected, and is why SIGMA reports a structural decomposition rather than a single score.
For this OpenFold3 MCL-1 seed_42 output set, Sample 5 had the highest sample_ranking_score (0.5296).
SIGMA ranked Sample 5 lowest as a co-fold candidate because of its weak interface geometry (0.7374), the worst in the set.
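The disagreement between the two rankings can be quantified directly from the table. The sketch below copies the OF3 Rank and SIGMA Rank columns and computes a Spearman rank correlation using the standard no-ties formula. The statistic is our own illustration and is not part of a SIGMA receipt.

```python
# Ranks copied from the table above, keyed by sample number.
of3_rank = {1: 4, 2: 2, 3: 5, 4: 3, 5: 1}
sigma_rank = {1: 2, 2: 3, 3: 4, 4: 1, 5: 5}

def spearman_rho(a: dict, b: dict) -> float:
    """Spearman rank correlation for two rankings without ties."""
    n = len(a)
    d2 = sum((a[k] - b[k]) ** 2 for k in a)
    return 1 - 6 * d2 / (n * (n * n - 1))

rho = spearman_rho(of3_rank, sigma_rank)

# Sample with the largest rank displacement between the two orderings.
most_displaced = max(of3_rank, key=lambda k: abs(of3_rank[k] - sigma_rank[k]))

print(f"Spearman rho = {rho:.2f}")               # -0.30
print(f"Largest disagreement: Sample {most_displaced}")  # Sample 5
```

A correlation of about -0.3 over five samples is a small-sample observation, but it shows the two orderings are not interchangeable for this output set.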
SIGMA ranked Sample 4 highest. It had the strongest interface geometry and the highest co-fold quality of the five. By direct structural comparison to the experimental 5FDR coordinates, Sample 1 was the closest contact match (Contact.J = 0.506); SIGMA ranked it second.
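The page does not define Contact.J. One plausible reading, given the name and the 0 to 1 range, is a Jaccard index over the sets of pocket residues that contact the ligand in the prediction and in the experimental structure. The sketch below works under that assumption; the residue labels and the `jaccard` helper are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| for two contact sets.

    Returns 0.0 when both sets are empty (a convention chosen here,
    since there are no contacts to compare).
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy contact sets, not taken from 5FDR or any SIGMA output.
predicted_contacts = {"A:10", "A:11", "A:12"}
experimental_contacts = {"A:11", "A:12", "A:13"}

print(jaccard(predicted_contacts, experimental_contacts))  # 0.5
```

Under this reading, a value of 1.000 for the 5FDR row is simply the reference compared against itself.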
Across five stochastic OpenFold3 samples, certain residue regions are flagged as obstructions in multiple independently generated candidates. These are not random; they represent regions where the model is systematically uncertain or deviating from ideal geometry. The C-terminal region (residues 152-156) shows 3-12 Angstrom displacement from experimental 5FDR across all five samples.
| Residue | Frequency | Type |
|---|---|---|
| AGLY23 | 4/5 (80%) | backbone bond / angle |
| AGLY34 | 3/5 (60%) | backbone angle |
| AVAL152 - AASP154 | 2/5 (40%) | backbone angle |
| AASP2 - AASP3 | 2/5 (40%) | backbone bond |
| AASN113 | 2/5 (40%) | backbone angle |
| AGLY157 | 2/5 (40%) | backbone angle |
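A frequency table like the one above can be built by counting how many samples flag each residue. The sketch below assumes each sample's flagged residues are available as a set of labels. The input is toy data with placeholder labels, not SIGMA output, and `recurrent_obstructions` is a hypothetical name.

```python
from collections import Counter

def recurrent_obstructions(flags_per_sample: list, min_samples: int = 2) -> dict:
    """Map residue -> (count, fraction) for residues flagged in >= min_samples samples."""
    n = len(flags_per_sample)
    # set(flags) ensures a residue counts at most once per sample.
    counts = Counter(res for flags in flags_per_sample for res in set(flags))
    return {res: (c, c / n) for res, c in counts.most_common() if c >= min_samples}

# Toy input: five samples, placeholder residue labels.
toy_flags = [{"X1", "X2"}, {"X1"}, {"X1", "X3"}, {"X2"}, {"X1"}]
recurring = recurrent_obstructions(toy_flags)
print(recurring)  # {'X1': (4, 0.8), 'X2': (2, 0.4)}
```

Residues flagged in only one sample (here `X3`) drop out, which is one simple way to separate recurring sites from single-sample noise.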
Backbone quality: Bond lengths, bond angles, planarity, Ramachandran compliance. Per-chain geometry verified independently.
Interface sufficiency: Does the ligand make enough contacts with the protein pocket? Contact density per ligand heavy atom. Distinct residue coverage. Penalizes underbound and unsupported poses.
Interface geometry: Are those contacts clash-free and chemically plausible? Steric violations, hydrogen-bond geometry, contact distance distribution.
Consensus stability: Recurring obstructions across multiple stochastic samples. Separates model-systematic uncertainty from random sample noise.
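The two interface-sufficiency quantities named above, contact density per ligand heavy atom and distinct residue coverage, can be sketched with a plain distance cutoff. The 4.0 Å cutoff, the input layout, and the `interface_contacts` name are assumptions for illustration. How SIGMA turns these counts into a score is not described here and is not reproduced.

```python
from math import dist

def interface_contacts(protein_atoms, ligand_heavy_atoms, cutoff=4.0):
    """Return (contacts per ligand heavy atom, number of distinct contacting residues).

    protein_atoms: list of (residue_id, (x, y, z)) tuples.
    ligand_heavy_atoms: list of (x, y, z) tuples.
    cutoff: contact distance in Angstrom (an assumed value).
    """
    pairs = 0
    residues = set()
    for res_id, xyz in protein_atoms:
        for lig_xyz in ligand_heavy_atoms:
            if dist(xyz, lig_xyz) <= cutoff:
                pairs += 1
                residues.add(res_id)
    n_lig = len(ligand_heavy_atoms)
    density = pairs / n_lig if n_lig else 0.0
    return density, len(residues)

# Toy geometry: two atoms of residue A:10 sit near the first ligand atom.
protein = [("A:10", (0, 0, 0)), ("A:10", (1, 0, 0)), ("A:11", (10, 0, 0))]
ligand = [(0, 0, 3), (0, 0, 20)]
print(interface_contacts(protein, ligand))  # (1.0, 1)
```

Low density combined with low residue coverage would correspond to the underbound or unsupported poses that the description says are penalized.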
SIGMA adds deterministic post-prediction verification and failure-mode decomposition to co-folding workflows. It tells you why a candidate is structurally risky, not whether it will work as a drug.
Every verification produces an Ed25519-signed, timestamped receipt with a unique certificate ID, proof sketches, severity-ranked obstructions, and a zero-parameter verification declaration. The same receipt format is used across all SIGMA verticals: compliance, legal, protein, and co-fold.
Certificate ID (SIGMA-YYYYMMDD-HASH). Filing safety status. Quality decomposition across four structural axes. Proof sketches with per-obstruction severity, residue localization, and evidence. Model confidence comparison (pLDDT vs SIGMA backbone quality). Verification parameters (0 ML models, 0 GPU, 0 training data, deterministic replay = true). Ed25519 signature. Verify URL.
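Deterministic replay implies that the same verification result always serializes to the same bytes. The sketch below shows one way a SIGMA-YYYYMMDD-HASH style identifier could be derived from a canonical JSON payload. The field names, the choice of SHA-256, and the 12-character truncation are all assumptions, not the actual SIGMA scheme. Ed25519 signing is omitted because it needs a third-party library; in a design like this, the signature would be computed over the same canonical bytes.

```python
import hashlib
import json
from datetime import date

def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so key order cannot change the hash."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")

def certificate_id(payload: dict, issued: date) -> str:
    """Build an ID shaped like SIGMA-YYYYMMDD-HASH (hash derivation is hypothetical)."""
    digest = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    return f"SIGMA-{issued:%Y%m%d}-{digest[:12].upper()}"

# Hypothetical payload mirroring the verification parameters listed above.
payload = {
    "ml_models": 0,
    "gpu": 0,
    "training_data": 0,
    "deterministic_replay": True,
}
cert = certificate_id(payload, date(2025, 1, 1))
```

Replaying the same payload on the same issue date yields the same identifier, regardless of the order in which the fields were assembled.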
SIGMA has verified predictions from both OpenFold3 and Biohub's ESMFold2 on the MCL-1 target using the same deterministic pipeline. The comparison script auto-detected that ESMFold2 produced a monomer (no ligand) while OpenFold3 produced a protein-ligand complex, suppressed interface metrics for the monomer, and compared only the shared structural axes.
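The monomer-versus-complex handling described above can be sketched as two small steps: detect whether a structure carries a ligand, then restrict the comparison to the axes both structures support. This is an illustration only, not the comparison script itself. It handles PDB text (not mmCIF), treats any non-water HETATM record as a ligand (which would also count ions), and uses hypothetical axis names.

```python
WATER_NAMES = {"HOH", "WAT", "DOD"}
INTERFACE_AXES = {"interface_geometry", "interface_sufficiency"}

def has_ligand(pdb_text: str) -> bool:
    """True if any HETATM record has a non-water residue name (PDB columns 18-20)."""
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20].strip() not in WATER_NAMES:
            return True
    return False

def comparable_axes(scores_a: dict, ligand_a: bool, scores_b: dict, ligand_b: bool) -> list:
    """Axes present in both score sets, minus interface axes unless both have a ligand."""
    shared = set(scores_a) & set(scores_b)
    if not (ligand_a and ligand_b):
        shared -= INTERFACE_AXES
    return sorted(shared)

# Toy records: one protein atom plus a ligand atom, and one protein atom plus water.
complex_pdb = "ATOM      1  CA  GLY A   1\nHETATM  101  C1  LIG B   1"
monomer_pdb = "ATOM      1  CA  GLY A   1\nHETATM  102  O   HOH A 201"
```

With a complex on one side and a monomer on the other, `comparable_axes` drops the interface axes so that only the shared structural axes are compared.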
Across five OpenFold3 samples, SIGMA flagged 33 distinct obstruction residues; the single ESMFold2 prediction had 13. The two sets have zero overlap. This is an early result (five complex samples against one monomer prediction), but disjoint failure sites are a signal that appears in neither model's confidence output. Surfacing it requires an independent verification layer.
Next: SIGMA is being extended to verify outputs from broader protein world models, including Biohub's new ESM-based protein biology models, using the same deterministic receipt layer.
Upload any co-folding output (PDB or mmCIF from OpenFold3, AlphaFold3, Boltz-2, Chai-1, or any structure prediction model) and get a deterministic structural verification receipt. Nothing leaves your network.