Protocol · Attribution

Verifiable attribution, one TEE attestation at a time.

Foundry's value depends on a single hard claim: when you contribute data to a Forge, the shares you receive in the resulting Ingot are provably proportional to how much your contribution improved the model. This page is the load-bearing math under that claim.

The problem

Shapley values are the theoretically correct answer — but Shapley scales like 2ⁿ. For a Forge with 40 data contributions you'd need ~10¹² training runs. Not viable.

Approximate-Shapley schemes (TMC, KNN-Shapley, Data Banzhaf) trade provability for tractability. Foundry's v1 attribution model takes a third path: a strict leave-one-out (LOO) score, run in a TEE, with a cryptographic attestation that this exact procedure was followed on this exact data.

Leave-one-out, exactly

Given n contributions c₁…cₙ, the LOO score for contribution cᵢ is:

LOO score
score(cᵢ) = eval(train(C))  −  eval(train(C \ cᵢ))

where:
  C        = the full contribution set
  C \ cᵢ   = contribution set with cᵢ removed
  train()  = the Forge's pinned training recipe
  eval()   = the Forge's pinned eval (held-out test set + metric)

In English: the score for your data is the difference in eval performance between a model trained with it and a model trained without it. Positive scores → you helped. Negative scores → you hurt. Zero → indistinguishable from absence.
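To make the procedure concrete, here is a minimal sketch of the scoring loop in TypeScript. The train and evalModel parameters are hypothetical stand-ins for the Forge's pinned training recipe and pinned eval; the real coordinator is not shown here.

LOO scoring loop (sketch)
type Contribution = { contributor: string; dataRef: string };

// `train` and `evalModel` are hypothetical stand-ins for the pinned recipe and eval.
async function looScores(
  contributions: Contribution[],
  train: (set: Contribution[]) => Promise<unknown>,
  evalModel: (model: unknown) => Promise<number>,
): Promise<Map<string, number>> {
  // One run on the full set C…
  const baseline = await evalModel(await train(contributions));

  const scores = new Map<string, number>();
  for (const c of contributions) {
    // …then one run per leave-one-out subset C \ cᵢ.
    const without = contributions.filter((x) => x !== c);
    const metric = await evalModel(await train(without));
    // score(cᵢ) = eval(train(C)) − eval(train(C \ cᵢ))
    scores.set(c.contributor, baseline - metric);
  }
  return scores;
}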

What about negative scores?

A contribution with a negative score still appears on-chain in the ContributionRegistry, but receives zero shares. Foundry never punishes contributors — bad data just doesn't earn. The contributor can resubmit a cleaned-up version to a future Forge.

The TEE wraps it

The eval coordinator runs inside a Trusted Execution Environment (TEE) — currently AMD SEV-SNP via 0G Compute's verifiable compute layer. The TEE produces a signed attestation containing:

Field             What it pins
code_hash         Hash of the training + eval container image. Forces deterministic recipe.
data_root         Merkle root over the contribution set (from ContributionRegistry).
baseline_metric   Eval metric for the baseline model (trained on C).
scores            Vector of (contributor → score) pairs.
nonce             Forge-issued nonce to prevent replay.
sig               TEE signature over the above, verifiable by anyone.
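For concreteness, the attestation payload can be pictured as the following TypeScript type. The field names mirror the table above; the exact encodings (hex strings, metric layout) are assumptions, not the coordinator's actual wire format.

Attestation payload (assumed shape)
// Assumed encoding; the eval coordinator's actual wire format may differ.
interface EvalAttestation {
  code_hash: string;                         // hash of the training + eval container image
  data_root: string;                         // Merkle root over the contribution set
  baseline_metric: Record<string, number>;   // e.g. { "bleu": 24.7, "comet": 0.71 }
  scores: { contributor: string; score: number }[];
  nonce: string;                             // Forge-issued, prevents replay
  sig: string;                               // TEE signature over the fields above
}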

The Forge.submitEvalResult() function verifies the signature against the registered TEE provider's public key and confirms the data_root matches what the ContributionRegistry committed. If both checks pass, the Forge transitions to ATTESTED and shares get minted in the next transaction.
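Sketched off-chain in TypeScript rather than Solidity, the two checks reduce to the following; recoverSigner, teeProviderKey and registryDataRoot are placeholders for the on-chain signature recovery and ContributionRegistry lookup, not the contracts' real interface.

submitEvalResult checks (sketch)
// Placeholder helpers; the real checks live in the Forge contract.
function verifyAttestation(
  att: EvalAttestation,
  teeProviderKey: string,    // registered TEE provider's public key
  registryDataRoot: string,  // root committed in the ContributionRegistry
  recoverSigner: (payload: object, sig: string) => string,
): boolean {
  const { sig, ...payload } = att;
  const signatureOk = recoverSigner(payload, sig) === teeProviderKey;
  const rootOk = att.data_root === registryDataRoot;
  // Only if both pass does the Forge transition to ATTESTED and mint shares.
  return signatureOk && rootOk;
}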

Why a TEE and not zk?

A zk proof of LOO is the right end-state — but proving generic training inside a SNARK costs ~6 orders of magnitude more than running it. We're betting on the TEE → zk transition path, not on zk being viable for training in 2026.

Score vector shape

The score vector is the heart of the attestation. Here's a real example from the Konkani v1 Forge (4 contributors):

scores from Konkani v1
{
  "baseline_metric": { "bleu": 24.7, "comet": 0.71 },
  "scores": [
    { "contributor": "0x4a7c…f12c", "score": 0.184, "shares": 4100 },
    { "contributor": "0x6f12…3b9e", "score": 0.142, "shares": 3160 },
    { "contributor": "0x8e2a…d4a1", "score": 0.071, "shares": 1580 },
    { "contributor": "0x1c34…7f08", "score": 0.052, "shares": 1160 }
  ]
}

Shares are integer values in basis points (10,000 = 100%) computed as round(scoreᵢ / Σscores · 10000). Rounding remainders go to the largest holder.
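A sketch of that rule, including the zero-shares treatment of negative scores described above; the names are illustrative and the remainder handling simply follows the text.

Share computation (sketch)
function computeShares(scores: { contributor: string; score: number }[]) {
  // Negative scores earn zero shares (see "What about negative scores?").
  const clamped = scores.map((s) => ({ ...s, score: Math.max(0, s.score) }));
  const total = clamped.reduce((sum, s) => sum + s.score, 0);

  const shares = clamped.map((s) => ({
    contributor: s.contributor,
    shares: total > 0 ? Math.round((s.score / total) * 10_000) : 0,
  }));

  // Hand any rounding remainder to the largest holder so shares sum to 10,000.
  const minted = shares.reduce((sum, s) => sum + s.shares, 0);
  if (total > 0 && minted !== 10_000) {
    const largest = shares.reduce((a, b) => (b.shares > a.shares ? b : a));
    largest.shares += 10_000 - minted;
  }
  return shares;
}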

Non-TEE fallback

Hackathon-honest disclosure

0G Compute's TEE integration is still maturing. For Forges where the TEE attestation can't be produced, the eval coordinator runs the same procedure outside a TEE and submits a plain signed attestation from a known coordinator address. The Ingot is minted with a tee:false flag, visible on-chain and surfaced in the UI with an amber warning pill.

The non-TEE path exists for two reasons: (1) hackathon judges can verify the full loop without waiting on us to ship TEE integration; (2) the procedure is still correct, just less adversarially resistant. Production deployments require TEE.

Known limits (v1)

  • LOO requires n training runs. For Forges with > 200 contributions, we batch via a stratified LOO approximation. The bound on accuracy loss is documented in the eval coordinator README.
  • Eval metric is pinned at Forge open. Changing the metric mid-Forge would change all scores. The metric is committed to evalSpec at ForgeFactory.createForge().
  • Compute contributions get a flat-rate share. LOO scores data, not GPU-hours. Compute Smiths receive a fixed fraction (configured per-Forge, typically 15–25%) split pro rata to compute units provided (see the sketch below).
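A sketch of how that carve-out could compose with the LOO-based data shares; the 20% pool, the unit accounting and the function name are assumptions, not the contracts' actual interface.

Compute carve-out (sketch)
// Illustrative only: split a fixed compute pool pro rata by units,
// and scale the LOO-based data shares into the remaining pool.
function splitWithComputePool(
  dataShares: { contributor: string; shares: number }[],  // out of 10,000 bps
  computeUnits: { smith: string; units: number }[],
  computePoolBps = 2_000,  // per-Forge config, typically 1,500–2,500
) {
  const dataPoolBps = 10_000 - computePoolBps;
  const totalUnits = computeUnits.reduce((sum, c) => sum + c.units, 0);

  const compute = computeUnits.map((c) => ({
    holder: c.smith,
    shares: Math.round((c.units / totalUnits) * computePoolBps),
  }));
  const data = dataShares.map((d) => ({
    holder: d.contributor,
    shares: Math.round((d.shares / 10_000) * dataPoolBps),
  }));
  // Rounding reconciliation omitted for brevity.
  return [...data, ...compute];
}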