SPECTR Engine Docs

Use the hosted API or install the local Python wheel. Hosted API base URL:

https://api.spectrengine.com

Start Here

UFM takes bytes, decomposes them into structural primitives, and gives you repeatable results you can use in software. You can call it through the hosted HTTPS API, or install the same engine locally with the Python wheel.

Hosted API

Best when you want a service endpoint, account-managed ledgers, dashboard history, and no local engine install.

import base64, os, requests

BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]

payload = {
    "data_b64": base64.b64encode(b"Hello World").decode()
}
resp = requests.post(
    f"{BASE}/v1/engine",
    json=payload,
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
result = resp.json()
print(result["core"]["seed"])
print(result["core"]["replay_valid"])

Local Python wheel

Best when you want raw bytes in your own process, local ledger files, and engine calls without an HTTP round trip after activation.

# Shell: install the wheel and activate your licence once.
pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>

# Python: call the engine in-process.
import ufm

eng = ufm.InvariantIdentityEngine(storage_path='ledger.bin')
seed, status = eng.process(b'Hello World')
print(seed, status)
print(eng.reconstruct(b'Hello World'))
eng.save()

First success checkpoint: for the API, you should see a JSON response with core.seed and core.replay_valid. Locally, you should see a seed, a status such as NOVELTY, and True from reconstruct.

Which Endpoint Should I Use?

The SPECTR Engine has two main endpoints that cover most use cases. Pick the one that matches what you're doing.

Recommended default

POST /v1/engine

Analyse one piece of data. Returns its structural identity, quality metrics, temporal patterns, and frequency distribution, all in a single call.

Best for:

  • Understanding the structural profile of a file or payload
  • Monitoring data quality over time
  • Getting a complete analysis without multiple API calls
Default for pairs

POST /v1/pipeline

Compare two pieces of data. Returns their structural overlap and detects semantic noise (encoding changes, BOM insertions, line-ending shifts) that may disguise functionally identical data as different.

Best for:

  • Detecting what changed between two file versions
  • Filtering out noise (BOM, CRLF, Base64) from real structural changes
  • Data integrity verification across systems

What each endpoint runs

Analysis layer                                    /v1/engine          /v1/pipeline
Core ingestion (seed, signature, replay check)    Yes                 Yes
Universal pipeline (7-stage quality gates)        Yes                 -
Timeline analysis (autocorrelation, segments)     Yes                 -
Frequency analysis (primitive distribution)       Yes                 -
Discovery analysis (novelty sequence)             Yes                 -
Structural comparison (Jaccard, shared/unique)    With compare_b64    With target_b64
Semantic noise detection (BOM, CRLF, Base64)      -                   With target_b64
Anti-drift governance gates                       -                   With target_b64

Not sure? Start with /v1/engine. It gives you the most detail about a single input. Switch to /v1/pipeline when you need to compare two inputs or detect noise between them.

Other endpoints

The granular endpoints give you fine-grained control when you only need one specific layer.

Plain-English Terms

Term               What it means when you are coding
data_b64           Base64 text wrapping your bytes for HTTP JSON. Local wheel calls take raw bytes instead.
seed               A deterministic numeric identity returned by the engine for the structural profile it computed.
ledger             The stored primitive/timeline state. Hosted API ledgers are account-scoped; local ledgers are files you choose.
replay             Rebuilding previously ingested bytes from ledger state. replay_valid tells you whether the check passed.
NOVELTY / REPLAY   NOVELTY means the call selected new primitives; REPLAY means the structure was already in that ledger.
request-scoped     The endpoint computes a result for that request without storing it for later replay.
semantic noise     A byte-level representation change, such as BOM or line endings, that the semantic layer can classify separately from structural difference.
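Since data_b64 appears in every hosted call, here is a minimal round trip showing how raw bytes map to the JSON wrapper, using only the standard-library base64 module:

```python
import base64

raw = b"Hello World"

# Wrap bytes as base64 text for an HTTP JSON body.
encoded = base64.b64encode(raw).decode()
print(encoded)  # SGVsbG8gV29ybGQ=

# Decode a data_b64 / chunks_b64 value back into bytes.
assert base64.b64decode(encoded) == raw
```

The same SGVsbG8gV29ybGQ= string appears in the cURL examples later in this page.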

Quick Start

This path gets you from no code to a working API call. Use Python first because it handles base64 safely on Windows, macOS, and Linux.

1. Get your API key

Go to API Keys in the sidebar and click Create Key. Copy the full key immediately; it is shown only once. Format: ufm_live_a1b2c3d4.xxxxxxxxx

2. Install the HTTP helper library

The examples below use requests and read your key from an environment variable so you do not paste secrets into source files.

pip install requests

# PowerShell
$env:UFM_API_KEY = "ufm_live_<your_key>"

# macOS/Linux shell
export UFM_API_KEY="ufm_live_<your_key>"

3. Analyse your first input

Use /v1/engine for a complete structural analysis of one input:

import base64
import os
import requests

BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]

payload = {
    "data_b64": base64.b64encode(b"Hello World").decode(),
}
resp = requests.post(
    f"{BASE}/v1/engine",
    json=payload,
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
result = resp.json()

print("seed:", result["core"]["seed"])
print("status:", result["core"]["status"])
print("replay valid:", result["core"]["replay_valid"])
print("quality:", result["universal"]["quality"])

If this works, you have authenticated successfully and the engine has analysed your bytes. The most useful first fields are core.seed, core.status, core.replay_valid, and universal.quality.

4. Persist and replay

/v1/engine is request-scoped (no data persisted between calls). To store data for later replay, use /v1/process:

payload = {"data_b64": base64.b64encode(b"Hello World").decode()}

created = requests.post(
    f"{BASE}/v1/process",
    json=payload,
    headers={"X-Api-Key": API_KEY},
).json()

seed = created["seed"]
replayed = requests.get(
    f"{BASE}/v1/replay/{seed}",
    headers={"X-Api-Key": API_KEY},
).json()

original = b"".join(base64.b64decode(c) for c in replayed["chunks_b64"])
print(original)

5. Compare two inputs

Use /v1/pipeline with a target_b64 to compare two inputs and detect noise:

source = b"Hello World"
target = b"\xef\xbb\xbfHello World"

resp = requests.post(
    f"{BASE}/v1/pipeline",
    json={
        "data_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
)
pair = resp.json()
print(pair["compare"]["jaccard"])
print(pair["semantic"]["converges"])
print(pair["semantic"]["noise_units"])

This compares "Hello World" with a BOM-prefixed version. The response shows structural overlap (Jaccard similarity) and identifies the BOM as a noise artefact.

cURL smoke test

curl -X POST https://api.spectrengine.com/v1/engine \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "SGVsbG8gV29ybGQ="}'

Local Python Wheel

The hosted API and the local wheel are two surfaces over the same native UFM engine. Use the wheel when you want the engine inside your own process, with local ledgers, no HTTP base64 wrapper, and no per-request network call after activation.

Licence required • Local bytes API • Bob and Ben included
Install once, activate once, then import ufm. Activation caches a signed licence token on the machine. Normal engine calls use that cached token; python -m ufm status checks it without touching the network.

Install and activate

Use the wheel that matches your Python and OS. The Windows CPython 3.13 wheel is:

python -m venv .venv
.\.venv\Scripts\activate

pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>
python -m ufm status
python -m ufm version

# Done: import ufm works, and features granted by your licence are available.

For unattended jobs, you can also set UFM_LICENCE_KEY before the first engine call. Set UFM_LICENCE_CACHE_DIR when you want the licence cache to live in a controlled directory.
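A sketch of that unattended setup using the two environment variables named above (the cache directory path is only an example, not a required location):

```shell
# Set the licence key before the first engine call (no interactive activate step).
export UFM_LICENCE_KEY="ufm_live_<your_key>"

# Optional: keep the licence cache in a directory you control (example path).
export UFM_LICENCE_CACHE_DIR="$HOME/.cache/ufm-licence"
```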

import ufm

status = ufm.licence_status()
print(status["active"])
print(status["tier"])
print(status["days_remaining"])

Mental model

  • HTTP requests use base64 strings. Local calls take raw bytes.
  • API persistence is your account ledger. Local persistence is the storage_path file you choose.
  • Request-scoped API endpoints map to temporary local ledgers or stateless helpers.
  • Local Ben history lives in a BenSession; persist it yourself if you need durable chat history.
  • There are no hosted rate limits locally, but the 100 MB input limit and licence scopes still apply.

Core ingest, persist, and replay

This is the local equivalent of POST /v1/process, GET /v1/replay/{seed}, and POST /v1/reconstruct.

import ufm

data = b"Hello World"

with ufm.InvariantIdentityEngine(storage_path="customer-ledger.bin") as eng:
    seed, status = eng.process(data)
    print(seed, status)                  # NOVELTY on first ingest

    seed2, status2 = eng.process(data)
    print(seed2 == seed, status2)        # True, REPLAY

    assert eng.reconstruct(data) is True

    replayed = [bytes(seq) for seq in eng.replay(seed)]
    print(replayed[0])                   # b"Hello World"

    print(eng.ledger_summary())

# save() is called automatically when the context exits.

Full engine analysis

Use this pattern when you want the same layers as POST /v1/engine: core identity, universal pipeline quality, timeline, frequency, discovery, and optional structural comparison.

import tempfile
import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

def mode_from_strategy(strategy: dict | None, fallback: str = "auto_curve") -> str:
    raw = str((strategy or {}).get("symbol_length_mode") or fallback).lower()
    if raw in ("autocurve", "auto_curve"):
        return "auto_curve"
    if raw == "entropy":
        return "entropy"
    if raw.startswith("fixed(") and raw.endswith(")"):
        return f"fixed{raw[6:-1]}"
    return raw if raw.startswith("fixed") else fallback

def compare_bytes(a: bytes, b: bytes, mode: str = "auto_curve") -> dict:
    la = ufm.ingest_raw(to_bits(a), symbol_length_mode=mode)
    lb = ufm.ingest_raw(to_bits(b), symbol_length_mode=mode)
    return ufm.ledger_compare(la, lb)

def run_local_engine(
    data: bytes,
    *,
    compare_with: bytes | None = None,
    verify: bool = True,
    max_lag: int = 50,
    top_n: int = 20,
) -> dict:
    with tempfile.TemporaryDirectory(prefix="ufm-engine-") as tmp:
        up = ufm.UniversalPipeline(
            storage_path=f"{tmp}/engine-request-ledger.bin",
            zero_point=True,
            verify=verify,
        )
        universal = up.run(data)
        mode = mode_from_strategy(universal.get("strategy"))
        sig = ufm.ufm_signature(data, symbol_length_mode=mode)
        ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode=mode)

        result = {
            "core": {
                "seed": universal.get("seed", sig["seed"]),
                "status": (universal.get("execute") or {}).get("status"),
                "discovery_rate": sig["discovery_rate"],
                "symbol_length": sig["symbol_length"],
                "primitive_count": sig["primitive_count"],
                "timeline_length": sig["timeline_length"],
                "reuse_ratio": sig["reuse_ratio"],
                "replay_valid": sig["replay_valid"],
                "signature": sig["signature"],
            },
            "universal": universal,
            "timeline": {
                "acf": ledger.acf(max_lag),
                "segments": ledger.segments(100),
                "transitions": ledger.transitions(100, 0.1),
            },
            "frequency": {
                "histogram": ledger.frequency_histogram(),
                "top_n": ledger.top_n_primitives(top_n),
            },
            "discovery": {
                "discovery_sequence": ledger.discovery_sequence(),
                "discovery_rate": ledger.discovery_rate,
                "primitive_count": ledger.primitive_count,
                "timeline_length": ledger.timeline_length,
                "symbol_length": ledger.symbol_length,
            },
            "effective_symbol_length_mode": mode,
            "engine_version": ufm.VERSION,
        }
        if compare_with is not None:
            result["comparison"] = compare_bytes(data, compare_with, mode)
        return result

analysis = run_local_engine(b"Hello World", compare_with=b"\xef\xbb\xbfHello World")
print(analysis["core"]["seed"])
print(analysis["universal"]["quality"])
print(analysis["comparison"]["jaccard"])

Pair comparison and semantic noise

This is the local equivalent of POST /v1/pipeline, POST /v1/compare, POST /v1/noise/detect, POST /v1/noise/delta, and POST /v1/semantic/analyze.

import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

source = b"line one\nline two\n"
target = b"line one\r\nline two\r\n"

la = ufm.ingest_raw(to_bits(source))
lb = ufm.ingest_raw(to_bits(target))
structural = ufm.ledger_compare(la, lb)

semantic = ufm.SemanticDecisionPipeline("semantic-ledger.jsonl")
noise = semantic.run_with_policy(
    source,
    target,
    enabled_noise_classes=["line_ending_crlf"],
    strict_allowlist=True,
)

print(structural["jaccard"])
print(noise["converges"])       # True for classified CRLF/LF noise
print(noise["noise_units"])
print(noise["validation_checks"])
print(noise["decision_hash"])

Analytics helpers

The granular analysis endpoints are direct ledger operations locally.

import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

data = b"abcabcabc"
ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode="auto_curve")

timeline = {
    "acf": ledger.acf(50),
    "segments": ledger.segments(100),
    "transitions": ledger.transitions(100, 0.1),
}
frequency = {
    "histogram": ledger.frequency_histogram(),
    "top_n": ledger.top_n_primitives(20),
}
discovery = {
    "discovery_sequence": ledger.discovery_sequence(),
    "discovery_rate": ledger.discovery_rate,
}
symbol_length, selector_meta = ufm.find_optimal_symbol_length(to_bits(data))
profile = ufm.structural_profile(data, symbol_width=16)

print(timeline)
print(frequency)
print(discovery)
print(symbol_length, selector_meta)
print(profile)

Batch and corpus workflows

Use process_batch for persisted corpus ingestion and ufm_signature_batch for independent stateless signatures. Local corpus import/export is just controlled movement of your ledger.bin file.

import shutil
import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

docs = [b"file one", b"file two", b"file three"]

with ufm.InvariantIdentityEngine(storage_path="corpus-ledger.bin") as eng:
    results = eng.process_batch(docs)
    seeds = [seed for seed, _status in results]
    summary = eng.ledger_summary()

ledgers = [ufm.ingest_raw(to_bits(doc)) for doc in docs]
jaccard_matrix = [
    [ufm.ledger_jaccard(a, b) for b in ledgers]
    for a in ledgers
]

print(seeds)
print(summary)
print(jaccard_matrix)

# Export/import the local ledger file.
shutil.copyfile("corpus-ledger.bin", "customer-export.ufmr")
shutil.copyfile("customer-export.ufmr", "restored-ledger.bin")

Universal and decision pipelines

UniversalPipeline is the governed data processing path. DecisionPipeline is the text decision/audit path with anti-drift gates and a persistent audit ledger.

import ufm

up = ufm.UniversalPipeline(
    storage_path="universal-ledger.bin",
    bit_depth=21,
    verify=True,
    zero_point=False,
)
run = up.run(b"payload for the governed pipeline")
print(run["success"], run["replay_valid"], run["quality"])
print(run["stages_completed"])

dp = ufm.DecisionPipeline("decision-ledger.jsonl")
decision = dp.run("What is structural identity in UFM?")
print(decision["status"])
print(decision["decision_hash"])
print(decision["gates"])

Call Ben locally

Ben is included in the wheel. It loads sealed prompts after the ben.ask scope check, then uses the LLM backend you provide.

import os
import ufm

# Local Ollama. Make sure Ollama is running and the model is pulled.
backend = ufm.backend_from_config(backend="ollama", model="gemma4")
session = ufm.BenSession(backend=backend)

first = session.ask("What is UFM actually doing?")
print(first.text)
print(first.session_id, first.turn_count)

second = session.ask("How does replay relate to structural identity?")
print(second.text)
print(session.history())

# One-shot convenience wrapper with a provider API key.
answer = ufm.ask_ben(
    "Explain the replay invariant.",
    backend="openai",
    model="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
)
print(answer["text"])

CLI form:

python -m ufm ask-ben "What is UFM?" --backend ollama --model gemma4
python -m ufm ask-ben "Explain replay" --backend openai --model gpt-4o --api-key YOUR_OPENAI_API_KEY

Call Bob locally

Bob is also included in the wheel. The sealed corpus, claim-gate snapshot, OOV thresholds, and prompts ship with the package. Bob can run corpus retrieval and gate/OOV/audit output without an LLM backend; pass a backend when you want generated response wording.

import os
import ufm

# Corpus retrieval, gate status, OOV metrics, and audit output.
bob = ufm.BobPipeline()
result = bob.query("What is the replay invariant?", mode="advisory", max_anchors=5)
print(result.response)
print(result.gate_status)       # PASS, WARN, or BLOCK
print(result.evidence)
print(result.oov)
print(result.audit)
print(result.boundary_flags)

# Optional generated answer using an LLM backend.
backend = ufm.backend_from_config(
    backend="anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)
bob_with_llm = ufm.BobPipeline(backend=backend)
print(bob_with_llm.query("Explain C-CORE-002.").response)

# One-shot convenience wrapper.
one_shot = ufm.ask_bob(
    "Is replay identity verified?",
    backend="ollama",
    model="gemma4",
)
print(one_shot["response"])
print(one_shot["gate_status"])

CLI form:

python -m ufm ask-bob "Is the replay invariant verified?" --backend ollama --model gemma4
python -m ufm ask-bob "Explain C-CORE-002" --backend anthropic --api-key YOUR_ANTHROPIC_API_KEY

Local equivalents at a glance

  • /v1/engine → Compose UniversalPipeline, ufm_signature, ingest_raw, and ledger analytics.
  • /v1/process, /v1/replay, /v1/reconstruct → InvariantIdentityEngine.process, replay, and reconstruct.
  • /v1/pipeline, /v1/compare → ledger_compare, ledger_jaccard, plus SemanticDecisionPipeline for pair noise.
  • /v1/noise/*, /v1/semantic/analyze → SemanticDecisionPipeline.run, run_with_policy, and capabilities.
  • /v1/analyze/*, /v1/structural_profile → Ledger methods: acf, segments, transitions, frequency_histogram, discovery_sequence, plus structural_profile.
  • /v1/batch/*, /v1/corpus/* → process_batch, ufm_signature_batch, local loops, local manifests, and copying/importing ledger.bin.
  • /v1/ingest/async → Run process or UniversalPipeline.run inside your own background worker or queue.
  • /v1/ben/ask, /v1/bob/query → BenSession, ask_ben, BobPipeline, ask_bob, or the CLI commands.
  • /v1/me/llm-credentials → Pass OllamaBackend, APIBackend, or backend_from_config directly. Store provider keys in your own secret manager.

Operational notes

  • Keep one ledger file per project or tenant. Primitive reuse is scoped to that file.
  • Do not run multiple writers against the same ledger path without your own lock.
  • Ledger paths must stay inside the current working directory; paths containing .. are rejected.
  • Call ufm.set_num_threads(n) before the first ingestion if you need to bound CPU use.
  • Provider-backed Ben/Bob calls require the provider SDK, for example pip install openai or pip install anthropic.
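A small illustrative helper (the function name and validation rule are our own, not part of the wheel) that follows the one-ledger-per-tenant and no-`..`-paths notes above:

```python
def tenant_ledger_path(tenant: str) -> str:
    """Return a per-tenant ledger filename that stays inside the working directory."""
    # Keep names flat and alphanumeric so no '..' or separator can escape
    # the working directory; the engine rejects such paths anyway.
    if not tenant.isalnum():
        raise ValueError("tenant must be alphanumeric")
    return f"{tenant}-ledger.bin"

print(tenant_ledger_path("acme"))  # acme-ledger.bin
```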

Authentication

All API endpoints (except /v1/health) require an authenticated context. Programmatic clients should pass an API key in the X-Api-Key header. Browser dashboard sessions can use an access_token cookie.

Getting your API key

Go to API Keys in the sidebar, click Create Key, and copy the full key immediately. It is only shown once.

Using your API key

curl -H "X-Api-Key: ufm_live_a1b2c3d4.your_secret_here" \
  -H "Content-Type: application/json" \
  -X POST https://api.spectrengine.com/v1/process \
  -d '{"data_b64": "..."}'

Key format

Keys follow the format ufm_live_{public_id}.{secret}. The prefix identifies the key; the secret authenticates it. Both parts are required.
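For illustration, splitting the two parts of a key (using the placeholder key from the cURL example above, not a live secret) looks like:

```python
key = "ufm_live_a1b2c3d4.your_secret_here"  # placeholder, not a real key

# The prefix before the dot identifies the key; the secret after it authenticates.
prefix, _, secret = key.partition(".")
public_id = prefix.removeprefix("ufm_live_")

print(public_id)  # a1b2c3d4
assert prefix.startswith("ufm_live_") and secret  # both parts are required
```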

Security: Never share your API key or commit it to source control. If compromised, revoke it immediately from the API Keys page and create a new one.

POST /v1/engine

Start here. This is the recommended endpoint for analysing a single input. One call runs every analysis layer and returns a complete structural profile. Use the granular endpoints below only when you need a specific layer in isolation.

Send any binary data (base64-encoded) and the engine decomposes it into structural primitives, assigns a deterministic seed (its identity), and runs five analysis layers: core ingestion, 7-stage quality pipeline, timeline autocorrelation, frequency distribution, and discovery sequencing. Optionally provide a second input for structural comparison.

The response is grouped by layer so you can read exactly the depth you need. Most integrations only use core (identity and signature) and universal.quality (quality metrics).

Auth required • 10 req/min • Scope: write

Request body

  • data_b64 (string) Base64-encoded binary data to process
  • symbol_length_mode (string, optional) Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64
  • verify (boolean, optional) Run replay verification in universal pipeline (default: true)
  • compare_b64 (string, optional) Second item for structural comparison (base64-encoded)
  • max_lag (integer, optional) Maximum ACF lag for timeline analysis (1-500, default: 50)
  • top_n (integer, optional) Top-N primitives for frequency analysis (1-1000, default: 20)
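Putting the optional parameters together, a fully specified request body might look like this (the specific values are chosen for illustration, within the documented ranges):

```python
import base64

payload = {
    "data_b64": base64.b64encode(b"Hello World").decode(),
    "symbol_length_mode": "fixed16",  # optional, default "auto_curve"
    "verify": True,                   # optional, default True
    "compare_b64": base64.b64encode(b"Hello World!").decode(),  # optional second input
    "max_lag": 100,                   # optional, 1-500, default 50
    "top_n": 50,                      # optional, 1-1000, default 20
}
```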

Response - nested by layer

Trust boundary: the core, universal, and analysis sections are computed in one request-scoped context. Use the response trust_boundary field to interpret replay and cross-section comparisons correctly.

core: Core ingestion result

  • core.seed (integer) Deterministic seed from geometric signature
  • core.status (string) NOVELTY or REPLAY
  • core.discovery_rate (float) Fraction of novel primitives (0.0-1.0)
  • core.symbol_length (integer) Symbol width used
  • core.primitive_count (integer) Unique primitives found
  • core.timeline_length (integer) Total timeline entries
  • core.reuse_ratio (float) Fraction of reused primitives
  • core.replay_valid (boolean) Whether replay(ingest(data)) == data
  • core.signature (string) Multi-scale geometric signature

universal: 7-stage governed pipeline with quality gates

  • universal.success (boolean) Whether pipeline completed successfully
  • universal.quality (object) Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
  • universal.replay_valid (boolean) Whether replay invariant holds
  • universal.stages_completed (string[]) List of completed stage names
  • universal.metrics (object) Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
  • universal.strategy (object) Selected processing strategy: symbol_length_mode, zero_point
  • universal.execute (object) Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
  • universal.field_reach (object) Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
  • universal.violations (string[]) Any warnings or errors

timeline: Temporal structure analysis

  • timeline.acf (float[]) Autocorrelation function values
  • timeline.segments (array) Discovery segments with start, end, rate
  • timeline.transitions (integer[]) Positions of structural transitions

frequency: Primitive frequency distribution

  • frequency.histogram (array) Full primitive frequency distribution
  • frequency.top_n (array) Top-N most frequent primitives
  • frequency.total_primitives (integer) Unique primitive count

discovery: Novelty emergence sequence

  • discovery.discovery_sequence (integer[]) Primitive IDs in order of first encounter
  • discovery.discovery_rate (float) Final discovery rate
  • discovery.primitive_count (integer) Total unique primitives

comparison: Structural overlap (only if compare_b64 provided)

  • comparison.jaccard (float) Jaccard similarity (shared / union)
  • comparison.shared_primitives (integer) Primitives in both inputs
  • comparison.only_a (integer) Primitives exclusive to first input
  • comparison.only_b (integer) Primitives exclusive to second input
  • comparison.overlap_coefficient (float) Overlap coefficient (shared / min), 0-1
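As a worked check of the two formulas (the primitive counts here are chosen for illustration):

```python
shared, only_a, only_b = 38, 4, 6

# Jaccard: shared primitives over the union of both primitive sets.
jaccard = shared / (shared + only_a + only_b)

# Overlap coefficient: shared primitives over the smaller of the two sets.
overlap = shared / min(shared + only_a, shared + only_b)

print(round(jaccard, 3))  # 0.792
print(round(overlap, 3))  # 0.905
```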

Contract metadata

  • effective_symbol_length_mode (string) Method-layer selected mode applied across core/analysis/comparison sections
  • trust_boundary (object) Ledger-context notes for core/universal/analytics/comparison/replay compatibility
  • verification_note (string) Clarifies success vs replay_valid semantics, especially when verify=false

cURL example

curl -X POST https://api.spectrengine.com/v1/engine \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "SGVsbG8gV29ybGQ="}'

Python example

import base64, requests

API_KEY = "ufm_live_a1b2c3d4.your_secret_here"
BASE    = "https://api.spectrengine.com"

resp = requests.post(
    f"{BASE}/v1/engine",
    json={"data_b64": base64.b64encode(b"Hello World").decode()},
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()

# Identity
print(result["core"]["seed"])           # deterministic structural identity
print(result["core"]["status"])         # "NOVELTY" or "REPLAY"
print(result["core"]["replay_valid"])   # True when replay(ingest(x)) == x

# Quality gates
print(result["universal"]["quality"])   # {replay_valid, discovery_rate, reuse_ratio, deterministic}

# Structure
print(len(result["timeline"]["segments"]))   # temporal segments found
print(result["frequency"]["total_primitives"])  # unique primitives
print(result["discovery"]["discovery_rate"])    # fraction of novel primitives

Example response (abbreviated)

{
  "core": {
    "seed": 8472619,
    "status": "NOVELTY",
    "discovery_rate": 1.0,
    "symbol_length": 14,
    "primitive_count": 42,
    "timeline_length": 42,
    "reuse_ratio": 0.0,
    "replay_valid": true,
    "signature": "G4:0.707:0.500:0.500:6:1.000:0.120|G8:..."
  },
  "universal": {
    "success": true,
    "quality": {
      "replay_valid": true, "discovery_rate": 1.0,
      "reuse_ratio": 0.0, "deterministic": true
    },
    "stages_completed": ["VALIDATE","METRICS","STRATEGY","EXECUTE","VERIFY","ADAPT","OUTPUT"],
    "metrics": {"byte_entropy": 3.459, "unique_byte_ratio": 0.727, "size_class": "small"},
    "strategy": {"symbol_length_mode": "auto_curve", "zero_point": true},
    "violations": []
  },
  "timeline": { "acf": [1.0, 0.02, -0.04, ...], "segments": [...], "transitions": [] },
  "frequency": { "histogram": [...], "top_n": [...], "total_primitives": 42 },
  "discovery": { "discovery_sequence": [0, 1, 2, ...], "discovery_rate": 1.0 },
  "effective_symbol_length_mode": "auto_curve",
  "trust_boundary": { ... },
  "verification_note": "..."
}

API tiers

Tier 1: this endpoint. One call, all structural analysis layers. The default for single-input analysis. Returns identity, quality, timeline, frequency, and discovery data.

Tier 2: /v1/pipeline. The default when comparing two inputs. Runs core ingestion plus structural comparison and semantic noise detection (identifies BOM, CRLF, and encoding artefacts).

Tier 3: granular endpoints. Individual layer endpoints for fine-grained control: /v1/process for lightweight ingestion, /v1/compare for structural comparison without semantic analysis, or any /v1/analyze/* endpoint individually.

POST /v1/pipeline

Default for comparing two inputs. This is the recommended endpoint when you have a source and target to compare. It runs core ingestion plus structural overlap analysis and semantic noise detection, identifying encoding artefacts (BOM, line endings, Base64) that make data look different while being functionally identical.

With one input: returns core analysis only (same as /v1/process). With two inputs (source + target): returns core analysis, structural comparison (shared/unique primitives, Jaccard similarity), and semantic noise analysis (noise classes, convergence check, anti-drift governance gates).

Unlike /v1/engine, this endpoint does not run timeline, frequency, or discovery analysis. It focuses on answering: “Are these two inputs structurally the same, and what noise accounts for any differences?”

Auth required • 10 req/min • Scope: write

Request body

  • data_b64 (string) Primary input, base64-encoded
  • target_b64 (string, optional) Second input for comparison/noise analysis
  • symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

  • core (object) Core analysis: seed, status, discovery_rate, symbol_length, primitive_count, timeline_length, reuse_ratio, replay_valid, signature
  • compare (object | null) Structural comparison (present only when target given)
  • semantic (object | null) Semantic noise analysis (present only when target given)
  • effective_symbol_length_mode (string) Method-layer selected mode applied across core/compare sections
  • trust_boundary (object) Ledger-context notes for core/compare/semantic/replay compatibility
  • engine_version (string) Engine version

Python example: pair mode

import base64, requests

resp = requests.post(
    f"{BASE}/v1/pipeline",
    json={
        "data_b64": base64.b64encode(b"original file").decode(),
        "target_b64": base64.b64encode(b"modified file").decode(),
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Core seed: {result['core']['seed']}")
print(f"Jaccard: {result['compare']['jaccard']}")
print(f"Converges: {result['semantic']['converges']}")
print(f"Noise: {result['semantic']['noise_units']}")

Example response (pair mode, abbreviated)

{
  "core": {
    "seed": 8472619, "status": "NOVELTY", "discovery_rate": 0.85,
    "replay_valid": true, "signature": "G4:0.707:..."
  },
  "compare": {
    "shared_primitives": 38, "only_a": 4, "only_b": 6,
    "jaccard": 0.792, "overlap_coefficient": 0.905
  },
  "semantic": {
    "status": "ok",
    "converges": true,
    "noise_units": [
      {
        "noise_type": "LineEndingCrlf", "layer": "semantic",
        "function": "normalize", "operation": "strip_cr", "confidence": 1.0
      }
    ],
    "decision_hash": "a1b2c3d4...",
    "gates": {
      "coherence": true, "consistency": true,
      "causality": true, "persistence": true
    }
  },
  "effective_symbol_length_mode": "auto_curve",
  "trust_boundary": { ... },
  "engine_version": "3.0-rust"
}
converges: true means that after stripping the detected noise, the source and target are structurally identical. The noise_units array lists each artefact that was filtered. If converges is false, the differences are structural, not just noise.
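One way to act on these fields, sketched against the documented pair-mode response shape (the three labels and the decision order are our own, not part of the API):

```python
def classify_pair(pair: dict) -> str:
    """Interpret a /v1/pipeline pair-mode response (illustrative labels)."""
    semantic = pair.get("semantic") or {}
    compare = pair.get("compare") or {}
    if compare.get("jaccard") == 1.0 and not semantic.get("noise_units"):
        return "identical"
    if semantic.get("converges"):
        return "noise-only"        # differences fully explained by classified noise
    return "structural-change"     # real structural difference remains

example = {
    "compare": {"jaccard": 0.792},
    "semantic": {"converges": True, "noise_units": [{"noise_type": "LineEndingCrlf"}]},
}
print(classify_pair(example))  # noise-only
```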

Licences

Licences are signed 30-day tokens that unlock tier-scoped features such as ben.ask and bob.query. The local wheel caches the signed token during python -m ufm activate, then checks it locally on normal engine calls.

Auth required

POST /v1/licences/verify

Mint or refresh a signed licence token for the caller's API key. Authenticated with X-Api-Key (not JWT). Rate-limited to 60 requests per hour.

Response:

  • token.payload.public_id (string) Public id from the API key used for activation
  • token.payload.scopes (string[]) Granted local feature scopes, for example ben.ask and bob.query
  • token.payload.tier (string) Customer tier
  • token.payload.expires_at (string (ISO 8601)) Token expiry timestamp
  • token.signature (string) Base64 signature over the payload

GET /v1/licences/me

Return the caller's current licence status without minting a new token.

  • active (boolean) True iff a non-revoked, non-expired licence exists
  • tier (string) Subscription tier (e.g. standard)
  • expires_at (string (ISO 8601) or null) Expiry timestamp of the active licence, or null if none exists
  • revoked (boolean) True if the licence has been administratively revoked

POST /v1/licences/revoke/{licence_id}

Admin-only. Revoke a specific licence by its UUID. Returns { revoked: true, licence_id }.

Local activation: customers normally do not call these endpoints by hand. The wheel calls /verify during python -m ufm activate ufm_live_... and then ufm.licence_status() reads the cached status locally.

Python example

import os
import requests

# X-Api-Key authentication, not JWT
headers = {"X-Api-Key": os.environ["SPECTR_API_KEY"]}
resp = requests.post(f"{BASE}/v1/licences/verify", headers=headers)
token = resp.json()["token"]

print(token["payload"]["tier"])
print(token["payload"]["scopes"])

status = requests.get(f"{BASE}/v1/licences/me", headers=headers).json()
print(status["active"], status["expires_at"])

LLM credentials

Store your Anthropic, OpenAI, or Google Gemini API key for Ben and Bob. The server encrypts keys at rest. You can also manage these from the dashboard Settings page.

Auth required

GET /v1/me/llm-credentials

Returns whether a key is configured and the chosen provider and model name. Never returns the secret.

  • configured (boolean) True when this account has stored LLM credentials
  • provider (string | null) Stored provider: "anthropic", "openai", or "gemini"
  • model (string | null) Configured model name, or the provider default

PUT /v1/me/llm-credentials

  • provider (string) One of: "anthropic", "openai", "gemini"
  • api_key (string) Provider API key (required on every update)
  • model (string, optional) Model id; omit for the provider default
If the server is missing LLM_CREDENTIALS_ENCRYPTION_KEY, PUT returns 503 and keys cannot be stored.

DELETE /v1/me/llm-credentials

Removes stored credentials for the current user.

Python example

import os
import requests

headers = {"X-Api-Key": API_KEY}

before = requests.get(
    f"{BASE}/v1/me/llm-credentials",
    headers=headers,
).json()
print(before["configured"])

r = requests.put(
    f"{BASE}/v1/me/llm-credentials",
    json={
        "provider": "openai",
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "gpt-4o",
    },
    headers=headers,
)
print(r.json())

removed = requests.delete(
    f"{BASE}/v1/me/llm-credentials",
    headers=headers,
).json()
print(removed["configured"])

POST /v1/ben/ask

Send a message to Ben, the UFM research assistant. Conversations are persisted per-user, so pass the returned conversation_id when you want a follow-up turn in the same conversation.

Auth requiredLLM key in Settings
Save a provider API key under Settings or call PUT /v1/me/llm-credentials before this endpoint. Locally, use BenSession or ask_ben with an LLM backend.

Request body

  • message (string) Your question or message for Ben (1-20,000 chars)
  • conversation_id (string, optional) UUID of an existing conversation to continue

Response

  • conversation_id (string) UUID of the conversation (new or existing)
  • conversation_title (string) Auto-generated title from the first message
  • response (string) Ben's assistant response text
  • token_estimate (integer) Conservative token estimate for the exchange
  • token_warning (boolean) True when token estimate exceeds 6,000
  • user_message (object) Persisted user message with id, content, created_at
  • assistant_message (object) Persisted assistant message with id, content, created_at

Python example

import requests

# Start a new conversation
resp = requests.post(
    f"{BASE}/v1/ben/ask",
    json={"message": "What is UFM actually doing?"},
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["response"])
conversation_id = body["conversation_id"]

# Continue the conversation
resp = requests.post(
    f"{BASE}/v1/ben/ask",
    json={
        "message": "How does replay work?",
        "conversation_id": conversation_id,
    },
    headers={"X-Api-Key": API_KEY},
)
print(resp.json()["response"])

GET /v1/ben/ledger/status

Returns Ben's projection ledger health, skill index, and last-updated timestamp. Use it as a product health check before starting a user session.

  • projections (object) Projection file counts and SHA-256 values
  • skill_index (object) Capability count and capability names
  • health (object) Open contradiction and failed pattern recurrence counts
  • last_updated_at (string | null) Timestamp of the loaded Ben data

Python example

status = requests.get(
    f"{BASE}/v1/ben/ledger/status",
    headers={"X-Api-Key": API_KEY},
).json()
print(status["skill_index"]["capability_count"])

POST /v1/bob/query

Run Bob's grounded query pipeline: tokenise the question, apply claim gates, retrieve replay-anchored evidence, monitor out-of-vocabulary terms, and return audit output.

Auth requiredLLM key in SettingsAdvisory mode only
Hosted Bob requires saved LLM credentials. Local Bob can run the corpus, gate, OOV, and audit pipeline without a backend; pass a backend locally when you want generated response wording.

Request body

  • question (string) User question for Bob (1-20,000 chars)
  • candidate_response (string, optional) Draft response to monitor and gate
  • mode (string, optional) Only "advisory" is accepted; default is "advisory"
  • governed (boolean, optional) Governed-path metadata flag; gate outcomes still annotate the response
  • max_anchors (integer, optional) Maximum replay anchors returned (1-20, default: 5)

Response highlights

  • response (string) Final response after gate and OOV policy
  • tokenise (object) Query seed, status, discovery_rate, and query_oov_ratio
  • gate.status (string) Claim gate result: "PASS", "WARN", or "BLOCK"
  • evidence (object[]) Replay-anchored evidence rows with source_id, source_path, seed, claim_ids, phase_numbers, and snippet
  • oov_classification (string) OOV policy result: "PASS", "WARN", or "BLOCK"
  • oov_metrics (object) Out-of-vocabulary metrics used by the policy
  • audit (object) DecisionPipeline audit status, decision_hash, stage, reason, and gates
  • boundary_flags (string[]) User-visible boundary indicators

Python example

import requests

resp = requests.post(
    f"{BASE}/v1/bob/query",
    json={
        "question": "Explain C-CORE-002 replay invariant.",
        "governed": True,
        "max_anchors": 5,
    },
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["gate"]["status"], body["oov_classification"])
print(body["boundary_flags"])
print(body["response"])

POST /v1/bob/ask and POST /v1/bob/chat

These routes run Bob as a persisted chat conversation. /v1/bob/ask is the primary chat route; /v1/bob/chat is a backward-compatible alias with the same behaviour.

  • message (string) Message for Bob (1-20,000 chars)
  • conversation_id (string, optional) Existing conversation id to continue
  • mode (string, optional) Only "advisory" is accepted
  • governed (boolean, optional) Governed-path metadata flag

Python example

chat = requests.post(
    f"{BASE}/v1/bob/ask",
    json={"message": "What does the replay invariant mean?"},
    headers={"X-Api-Key": API_KEY},
).json()
print(chat["response"])
print(chat["query"]["gate"]["status"])

GET /v1/bob/ledger/status

Returns Bob's corpus status, session decision count, OOV thresholds, whether reinforcement is enabled, and engine version.

  • corpus (object) Source count, claim count, built_at, and corpus_index_sha256
  • session_decisions (object) Total decisions and last decision timestamp
  • oov (object) Warn/block thresholds, symbol length, and calibration status
  • reinforce_enabled (boolean) Whether admin reinforcement is enabled
  • engine_version (string) Engine version string

Python example

status = requests.get(
    f"{BASE}/v1/bob/ledger/status",
    headers={"X-Api-Key": API_KEY},
).json()
print(status["corpus"]["claim_count"])
print(status["oov"]["warn_threshold"])

POST /v1/process

Ingest binary data and persist it to your ledger. Returns structural analysis including a deterministic seed, discovery rate, and replay validity. The seed can be used with /v1/replay/{seed} to retrieve the original data later.

When to use this vs /v1/engine: Use /v1/process when you need persisted ingestion (data stored for later replay). Use /v1/engine when you want a full structural analysis without persistence.
Auth required30 req/minScope: write

Request body

  • data_b64 (string) Base64-encoded binary data to ingest
  • symbol_length_mode (string, optional) Engine mode. Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64

Response fields

  • seed (integer) Deterministic seed derived from geometric signature and primitive count
  • status (string) "NOVELTY" if new structure discovered, "REPLAY" if all primitives already known
  • discovery_rate (float) Fraction of primitives that were novel (0.0 to 1.0)
  • symbol_length (integer) Symbol length used (fixed modes return the requested width)
  • primitive_count (integer) Total unique primitives in the decomposition
  • timeline_length (integer) Number of entries in the ingestion timeline
  • reuse_ratio (float) Fraction of timeline entries that reused existing primitives
  • replay_valid (boolean) Whether replay(ingest(data)) == data
  • signature (string) Multi-scale geometric signature encoding the structural profile (see below)
  • selection_model (object | null) Selection statistics: total_selections, selections_created, selections_reused

Signature format (structural profile)

The signature field encodes the geometric profile at multiple scale windows. Each scale produces a component in the format:

G{W}:{spread}:{sx}:{sy}:{mass}:{centroid}:{topo}

Where W is the scale width and the six values are:

  • spread (float) RMS radius from centroid - overall spatial extent
  • sx (float) X-axis standard deviation
  • sy (float) Y-axis standard deviation
  • mass (integer) Count of set bits at this scale
  • centroid (float) Sum of centroid coordinates (cx + cy)
  • topo (float) Topological score with pre-bits weighting

Scale windows are [4, 8, 16, 32, 64] for inputs over 16 bits. Components are separated by |. Parse these to build a per-scale structural profile for cross-input comparison.
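The component layout above can be unpacked client-side. A minimal parser sketch: field names follow the list above, while the exact float formatting the engine emits is an assumption.

```python
def parse_signature(signature: str) -> dict[int, dict]:
    """Split a multi-scale signature into per-scale-window field dicts."""
    profile = {}
    for component in signature.split("|"):
        scale, spread, sx, sy, mass, centroid, topo = component.split(":")
        width = int(scale.lstrip("G"))  # "G4" -> 4
        profile[width] = {
            "spread": float(spread),
            "sx": float(sx),
            "sy": float(sy),
            "mass": int(mass),
            "centroid": float(centroid),
            "topo": float(topo),
        }
    return profile

profile = parse_signature("G4:0.707:0.5:0.5:3:1.5:0.25|G8:1.2:0.9:0.8:7:3.1:0.4")
print(profile[4]["mass"], profile[8]["spread"])  # → 3 1.2
```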

Python example - fixed-32 mode with signature parsing

import base64
import requests

API_KEY = "ufm_live_a1b2c3d4.your_secret_here"
BASE    = "https://api.spectrengine.com"

data = b"Hello World"
resp = requests.post(
    f"{BASE}/v1/process",
    json={
        "data_b64": base64.b64encode(data).decode(),
        "symbol_length_mode": "fixed32",
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Seed: {result['seed']}")
print(f"Status: {result['status']}")
print(f"Discovery rate: {result['discovery_rate']:.4f}")
print(f"Symbol length: {result['symbol_length']}")
print(f"Signature: {result['signature']}")

File input helper

import base64
import pathlib
import requests

data = pathlib.Path("myfile.bin").read_bytes()
resp = requests.post(
    f"{BASE}/v1/process",
    json={
        "data_b64": base64.b64encode(data).decode(),
        "symbol_length_mode": "auto_curve",
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["seed"], result["status"], result["replay_valid"])

POST /v1/ingest/async

Queue a payload for ingestion by the background worker instead of waiting for the engine to run inline. Returns 202 Accepted with a job_id that you can poll with GET /v1/ingests/{job_id}.

Auth requiredScope: write

Request body

  • data_b64 (string) Raw bytes, base64-encoded (up to 100 MB)
  • symbol_length_mode (string, optional) Default: "auto_curve"
The response is returned as soon as the job row is written, not when the engine finishes. Use /v1/ingests/{job_id} to check status.

Response

  • job_id (string (UUID)) The ingest_jobs row id
  • status (string) Always "queued" on success

Python example

import base64
import requests

queued = requests.post(
    f"{BASE}/v1/ingest/async",
    json={"data_b64": base64.b64encode(b"large payload").decode()},
    headers={"X-Api-Key": API_KEY},
).json()

print(queued["job_id"], queued["status"])

GET /v1/ingests/{job_id}

Poll the status of one of your async ingest jobs. Returns 404 if the job does not exist or belongs to another user.

Auth requiredScope: read

Response fields

  • job_id (string (UUID)) The ingest_jobs row id
  • status (string) "queued" | "running" | "ok" | "error"
  • symbol_length_mode (string) Mode used for this job
  • payload_bytes (integer) Byte count of the decoded payload
  • retries (integer) Retry count to date
  • queued_at (timestamp) When the job was accepted
  • started_at (timestamp | null) Worker pickup time
  • completed_at (timestamp | null) Worker finish time
  • result (object | null) Engine result dict when status is 'ok' (matches /v1/process response)
  • error (string | null) Error message when status is 'error'

Python example

job_id = queued["job_id"]
job = requests.get(
    f"{BASE}/v1/ingests/{job_id}",
    headers={"X-Api-Key": API_KEY},
).json()

print(job["status"])
if job["status"] == "ok":
    print(job["result"]["seed"])
elif job["status"] == "error":
    print(job["error"])
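A polling loop is the usual consumer of these states. A sketch with an injectable fetcher so the wait logic stays testable; the poll interval and attempt cap are client-side choices, not API behaviour:

```python
import time

def wait_for_job(fetch, job_id, poll_seconds=1.0, max_polls=60):
    """Poll fetch(job_id) until the job leaves the queued/running states."""
    for _ in range(max_polls):
        job = fetch(job_id)
        if job["status"] in ("ok", "error"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")

# With the hosted API, fetch would wrap GET /v1/ingests/{job_id}:
# fetch = lambda jid: requests.get(
#     f"{BASE}/v1/ingests/{jid}", headers={"X-Api-Key": API_KEY}
# ).json()
```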

GET /v1/ingests

List your own async ingest jobs in reverse-chronological order.

Auth requiredScope: read

Query parameters

  • limit (integer, optional) Max rows to return (1-100). Default: 100
  • before (timestamp, optional) Cursor: return rows queued before this time

Response fields

  • jobs (object[]) Array of job views; see /v1/ingests/{job_id} for field details

Python example

jobs = requests.get(
    f"{BASE}/v1/ingests",
    params={"limit": 25},
    headers={"X-Api-Key": API_KEY},
).json()["jobs"]

for job in jobs:
    print(job["job_id"], job["status"], job["queued_at"])

POST /v1/reconstruct

Verify that data survives a full ingest-replay round trip. Returns whether the reconstruction is lossless.

Auth required30 req/minScope: write

Request body

  • data_b64 (string) Base64-encoded binary data to verify

Response fields

  • valid (boolean) true if replay(ingest(data)) == data exactly

Python example

import base64, requests

resp = requests.post(
    f"{BASE}/v1/reconstruct",
    json={"data_b64": base64.b64encode(b"test data").decode()},
    headers={"X-Api-Key": API_KEY},
)
print(resp.json())  # {"valid": true}

GET /v1/replay/{seed}

Replay previously ingested data by its seed. Returns the reconstructed byte chunks as base64-encoded strings.

Auth required60 req/min

Path parameters

  • seed (integer) Seed returned from a persisted ingestion path (typically /process or /process/universal)

Response fields

  • seed (integer) Echo of the requested seed
  • chunks_b64 (string[]) Reconstructed data chunks, each base64-encoded

Python example

import base64, requests

seed = 8472619  # from a previous /process response

resp = requests.get(
    f"{BASE}/v1/replay/{seed}",
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
for chunk_b64 in result["chunks_b64"]:
    print(base64.b64decode(chunk_b64))

JavaScript example

const seed = 8472619;
const resp = await fetch(`${BASE}/v1/replay/${seed}`, {
  headers: { "X-Api-Key": API_KEY },
});
const result = await resp.json();
const chunks = result.chunks_b64.map(chunk => {
  const raw = atob(chunk);
  return Uint8Array.from(raw, c => c.charCodeAt(0));
});
console.log(chunks);
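To recover the original payload, decode each chunk and concatenate in order. A minimal helper, assuming chunks_b64 preserves the original byte order:

```python
import base64

def reassemble(chunks_b64):
    """Join base64-encoded replay chunks back into the original bytes."""
    return b"".join(base64.b64decode(chunk) for chunk in chunks_b64)

# e.g. data = reassemble(result["chunks_b64"])
```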

POST /v1/compare

Compare two data items by structural primitive overlap. Returns Jaccard similarity, shared/exclusive primitive counts, and geometric signatures for both inputs.

Auth required30 req/minScope: write

Request body

  • data_a_b64 (string) First input, base64-encoded
  • data_b64 (string) Second input, base64-encoded
  • symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

  • shared_primitives (integer) Primitives present in both inputs
  • only_a (integer) Primitives exclusive to first input
  • only_b (integer) Primitives exclusive to second input
  • jaccard (float) Jaccard similarity (shared / union)
  • overlap_coefficient (float) Overlap coefficient (shared / min)
  • signature_a (string) Geometric signature for first input
  • signature_b (string) Geometric signature for second input
  • discovery_rate_a (float) Discovery rate for first input
  • discovery_rate_b (float) Discovery rate for second input
  • symbol_length_a (integer) Symbol length selected for first input
  • symbol_length_b (integer) Symbol length selected for second input

Python example

import base64
import requests

resp = requests.post(
    f"{BASE}/v1/compare",
    json={
        "data_a_b64": base64.b64encode(b"Hello").decode(),
        "data_b64": base64.b64encode(b"World").decode(),
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["jaccard"])
print(result["shared_primitives"])
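Both similarity measures can be recomputed from the primitive counts, which is handy for sanity-checking results offline. Using the pair-mode figures from the abbreviated example response earlier (38 shared, 4 only in A, 6 only in B):

```python
def similarities(shared, only_a, only_b):
    """Jaccard = shared/union; overlap coefficient = shared/min(|A|, |B|)."""
    union = shared + only_a + only_b
    size_a = shared + only_a
    size_b = shared + only_b
    return shared / union, shared / min(size_a, size_b)

jaccard, overlap = similarities(38, 4, 6)
print(round(jaccard, 3), round(overlap, 3))  # → 0.792 0.905
```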

POST /v1/structural_profile

Compute the canonical 5D structural complexity profile for one input at a fixed symbol width. This is the engine-side reference implementation of the Phase-94 profile used by the C-DISCOV-045 claim test.

Returns alpha (vocabulary growth exponent), s_zipf (Zipf exponent of the frequency distribution), v_size (vocabulary size), reuse (1 − v_size / n_syms), and discovery_integral (count of unique symbols encountered).

Auth requiredScope: write

Request body

  • data_b64 (string) Raw bytes, base64-encoded
  • symbol_width (integer, optional) Symbol width in bits, 1-128. Default: 16
  • sizes_for_alpha (integer[], optional) Prefix sizes in bytes for the growth-exponent fit. Default: [512, 1024, 2048, 4096, 8192, 16384, 32768]

Response fields

  • alpha (float) Vocabulary growth exponent (log-log fit of |V| vs N)
  • s_zipf (float) Zipf exponent of the frequency distribution
  • v_size (integer) Vocabulary size (distinct symbols at this width)
  • reuse (float) 1 - (v_size / n_syms)
  • discovery_integral (integer) Count of unique symbols encountered
  • symbol_width (integer) Symbol width used for this call
  • engine (string) Always "ufm_core_structural_profile" on the native engine path

cURL example

curl -X POST https://api.spectrengine.com/v1/structural_profile \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "QVRDR0FUQ0dBVENHQVRDRw==", "symbol_width": 16}'
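For intuition about v_size and reuse, here is a local sketch at a fixed width. It assumes non-overlapping, byte-aligned symbols as an illustration of the definitions, not the engine's actual segmentation. The input is the decoded payload from the cURL example above:

```python
def vocab_stats(data: bytes, width_bits: int = 16):
    """Count distinct fixed-width symbols and reuse = 1 - v_size / n_syms."""
    step = width_bits // 8  # byte-aligned widths only in this sketch
    symbols = [data[i:i + step] for i in range(0, len(data) - step + 1, step)]
    v_size = len(set(symbols))
    reuse = 1 - v_size / len(symbols)
    return v_size, reuse

v, r = vocab_stats(b"ATCGATCGATCGATCG", width_bits=16)
print(v, r)  # → 2 0.75
```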

Python example: fixed-16 and fixed-32 per window

import base64, requests

def window_meta(window_bytes: bytes) -> dict:
    encoded = base64.b64encode(window_bytes).decode()
    out = {}
    for width in (16, 32):
        resp = requests.post(
            f"{BASE}/v1/structural_profile",
            json={"data_b64": encoded, "symbol_width": width},
            headers={"X-Api-Key": API_KEY},
        )
        out[f"meta_fixed{width}"] = resp.json()
    return out
The engine field reports "ufm_core_structural_profile" when the native engine computed the profile. A different value indicates a non-canonical fallback path; treat those numbers as unverified.

POST /v1/batch/process

Process multiple data items in a single request. Items are processed in parallel for maximum throughput. Maximum 100 items per request.

Auth required10 req/minScope: write

Request body

  • items_b64 (string[]) Array of base64-encoded data items (1-100)
  • symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

  • results (object[]) Array of results (same fields as /v1/process)
  • total_items (integer) Number of items processed
  • processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

items = [b"file_one", b"file_two", b"file_three"]
resp = requests.post(
    f"{BASE}/v1/batch/process",
    json={"items_b64": [base64.b64encode(i).decode() for i in items]},
    headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
    print(f"Seed {r['seed']}: {r['status']} (rate={r['discovery_rate']:.2f})")
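Payload sets larger than the 100-item cap need client-side chunking before submission. A hypothetical helper that splits the work into compliant batches:

```python
def batched(items, batch_size=100):
    """Yield successive slices no larger than the per-request item cap."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# for batch in batched(all_items):
#     requests.post(
#         f"{BASE}/v1/batch/process",
#         json={"items_b64": [base64.b64encode(i).decode() for i in batch]},
#         headers={"X-Api-Key": API_KEY},
#     )
```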

POST /v1/batch/engine

Run the full engine pipeline on up to 50 items in a single request. Returns core, universal, timeline, frequency, discovery, and optional comparison for each item. Per-item error handling: one bad item does not abort the batch.

Auth requiredScope: writeMax 50 items
Use this instead of calling /v1/engine in a loop. A 30-item sweep that previously required 30+ individual calls becomes a single request.

Request body

  • items (object[]) Array of engine items (1-50)
  • items[].data_b64 (string) Input data, base64-encoded
  • items[].compare_b64 (string, optional) Comparison target, base64-encoded
  • items[].symbol_length_mode (string, optional) Default: "auto_curve"
  • items[].verify (boolean, optional) Enable replay verification (default: true)
  • items[].max_lag (integer, optional) ACF max lag (default: 50)
  • items[].top_n (integer, optional) Top-N primitives (default: 20)
  • include (string[], optional) Layers to include. Default: all. Options: "core", "universal", "timeline", "frequency", "discovery", "comparison"

Response fields

  • results (object[]) Full engine result per item
  • results[].index (integer) Position in input array
  • results[].core (object | null) Core ingestion result
  • results[].universal (object | null) Universal pipeline result
  • results[].timeline (object | null) Timeline analysis
  • results[].frequency (object | null) Frequency analysis
  • results[].discovery (object | null) Discovery sequence
  • results[].comparison (object | null) Structural comparison (if compare_b64 given)
  • results[].error (string | null) Error message if this item failed
  • total_items (integer) Number of items submitted
  • succeeded (integer) Items processed successfully
  • failed (integer) Items that failed
  • processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

items = [b"data_one", b"data_two", b"data_three"]
resp = requests.post(
    f"{BASE}/v1/batch/engine",
    json={
        "items": [{"data_b64": base64.b64encode(i).decode()} for i in items],
        "include": ["core", "timeline"],  # skip layers you don't need
    },
    headers={"X-Api-Key": API_KEY},
)
data = resp.json()
print(f"Succeeded: {data['succeeded']}/{data['total_items']}")
for r in data["results"]:
    if r["error"]:
        print(f"  [{r['index']}] ERROR: {r['error']}")
    else:
        print(f"  [{r['index']}] seed={r['core']['seed']} dr={r['core']['discovery_rate']:.2f}")

POST /v1/batch/compare

Compare multiple data pairs structurally in a single request. Maximum 100 pairs. Per-pair error handling: one bad pair does not abort the batch.

Auth requiredScope: writeMax 100 pairs

Request body

  • pairs (object[]) Array of comparison pairs (1-100)
  • pairs[].data_a_b64 (string) First input, base64-encoded
  • pairs[].data_b64 (string) Second input, base64-encoded
  • pairs[].symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

  • results (object[]) Comparison result per pair
  • results[].index (integer) Position in input array
  • results[].shared_primitives (integer | null) Primitives in both inputs
  • results[].only_a (integer | null) Primitives exclusive to first input
  • results[].only_b (integer | null) Primitives exclusive to second input
  • results[].jaccard (float | null) Jaccard similarity
  • results[].overlap_coefficient (float | null) Overlap coefficient
  • results[].error (string | null) Error message if this pair failed
  • total_pairs (integer) Number of pairs submitted
  • succeeded (integer) Pairs compared successfully
  • failed (integer) Pairs that failed
  • processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

pairs = [(b"source_1", b"target_1"), (b"source_2", b"target_2")]
resp = requests.post(
    f"{BASE}/v1/batch/compare",
    json={
        "pairs": [
            {"data_a_b64": base64.b64encode(a).decode(),
             "data_b64": base64.b64encode(b).decode()}
            for a, b in pairs
        ],
    },
    headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
    if r["error"]:
        print(f"  [{r['index']}] ERROR: {r['error']}")
    else:
        print(f"  [{r['index']}] jaccard={r['jaccard']:.3f}")

POST /v1/noise/detect

Detect deterministic noise artefacts between a source and target byte pair. Identifies transformations like BOM insertion, line-ending changes, and Base64 encoding.

Auth required30 req/min

Request body

  • source_b64 (string) Original data, base64-encoded
  • target_b64 (string) Transformed data, base64-encoded

Response fields

  • converges (boolean) Whether source and target converge after normalisation
  • noise_units (object[]) Detected noise artefacts with type, layer, operation
  • decision_hash (string) Method-layer decision hash for audit-chain tracing
  • gates (object) Anti-drift gate results (coherence, consistency, causality, persistence)

Python example

import base64
import requests

source = b"Hello World\n"
target = b"Hello World\r\n"
result = requests.post(
    f"{BASE}/v1/noise/detect",
    json={
        "source_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["converges"])
print(result["noise_units"])

Compares "Hello World\n" with "Hello World\r\n". Detects a LineEndingCrlf noise artefact: the CRLF line ending is noise, not a structural change.
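The strip_cr operation reported in the response can be mimicked locally to see why this pair converges. This is an illustration of the idea, not the engine's normaliser:

```python
def strip_cr(data: bytes) -> bytes:
    """Collapse CRLF line endings to LF so the two forms compare equal."""
    return data.replace(b"\r\n", b"\n")

source = b"Hello World\n"
target = b"Hello World\r\n"
print(strip_cr(target) == source)  # → True
```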

POST /v1/noise/delta

Return the full 12-field noise delta between a source and target, plus a reversibility summary. Unlike /v1/noise/detect, every response includes source_span, target_span, primitive_span, and reversibility_meta.

Auth required30 req/min

Request body

  • source_b64 (string) Source bytes, base64-encoded
  • target_b64 (string) Target bytes, base64-encoded
  • enabled_noise_classes (string[], optional) Restrict detection to these class IDs
  • strict_allowlist (boolean, optional) If true, reject unknown class IDs. Default: false

Response fields

  • converges (boolean) normalize(source) == target
  • noise_units (object[]) Each unit includes all 12 engine fields
  • summary (object) { total_bytes_changed, noise_class_counts, reversibility_breakdown }
  • decision_hash (string) Method-layer decision hash
  • gates (object) Anti-drift gate results

Python example

source = b"Hello"
target = b"\xef\xbb\xbfHello"
delta = requests.post(
    f"{BASE}/v1/noise/delta",
    json={
        "source_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(delta["summary"])
print(delta["noise_units"])

GET /v1/noise/capabilities

List all available noise detection classes supported by the engine.

Auth required

Response fields

  • capabilities (object[]) Array of {id, label} for each noise class
  • engine_version (string) Engine version

Python example

caps = requests.get(
    f"{BASE}/v1/noise/capabilities",
    headers={"X-Api-Key": API_KEY},
).json()
print(caps["engine_version"])
print(caps["capabilities"])

# Example shape:
# {"capabilities": [{"id": "bom_utf8", "label": "BOM UTF-8"}, ...]}

POST /v1/semantic/analyze

Run full semantic analysis between two byte sequences. Determines whether the inputs converge (are structurally identical after noise removal) and classifies each noise artefact found. Supports policy control to restrict which noise classes are detected.

Use this when you need more control than /v1/pipeline provides, for example to restrict detection to specific noise classes or use strict allowlisting. For most pair comparisons, /v1/pipeline is simpler.

Auth required30 req/minScope: any authenticated context

Request body

  • source_b64 (string) Source data, base64-encoded
  • target_b64 (string) Target data, base64-encoded
  • enabled_noise_classes (string[], optional) Filter to specific noise class ids, for example ["bom_utf8", "line_ending_crlf"]
  • strict_allowlist (boolean, optional) If true, only detect listed classes. Default: false

Response fields

  • status (string) "ok" or "rejected" (if anti-drift gates fail)
  • converges (boolean) Whether normalised forms are identical
  • noise_units (object[]) Detected noise artefacts
  • decision_hash (string) Method-layer decision hash
  • gates (object) Anti-drift gate results

The example below compares "Hello" with "Hello" prefixed by a UTF-8 BOM. It returns converges: true with a BomUtf8 noise unit, meaning the data is functionally identical despite the byte-level difference.

Python example: strict allowlist

import base64, requests

resp = requests.post(
    f"{BASE}/v1/semantic/analyze",
    json={
        "source_b64": base64.b64encode(b"Hello").decode(),
        "target_b64": base64.b64encode(b"\xef\xbb\xbfHello").decode(),
        "enabled_noise_classes": ["bom_utf8"],  # Only detect BOM noise
        "strict_allowlist": True,
    },
    headers={"X-Api-Key": API_KEY},
)
print(resp.json()["converges"])     # True
print(resp.json()["noise_units"])   # [{noise_type: "BomUtf8", ...}]

GET /v1/corpus/list

List your recent ingests. This endpoint reads your own ledger only and never exposes another user’s data or Bob’s internal claim corpus.

Auth requiredScope: read

Query parameters

  • limit (integer, optional) Max rows to return (1-500). Default: 100
  • before (string, optional) Return rows created before this ISO-8601 timestamp

Response fields

  • records (object[]) Array of ingest records: request_id, endpoint, created_at, status_code, discovery_rate, symbol_length, primitive_count, timeline_length

Python example

records = requests.get(
    f"{BASE}/v1/corpus/list",
    params={"limit": 50},
    headers={"X-Api-Key": API_KEY},
).json()["records"]

for row in records:
    print(row["created_at"], row["endpoint"], row["status_code"])
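The before cursor supports walking the full history page by page. A sketch with an injectable fetcher so the paging logic stays testable; the page size and stop condition are client-side choices:

```python
def iter_all_records(fetch_page, limit=100):
    """Yield ingest records, paging backwards with the `before` cursor."""
    before = None
    while True:
        records = fetch_page(limit=limit, before=before)
        if not records:
            return
        yield from records
        before = records[-1]["created_at"]

# With the hosted API, fetch_page would wrap GET /v1/corpus/list:
# fetch_page = lambda limit, before: requests.get(
#     f"{BASE}/v1/corpus/list",
#     params={"limit": limit, **({"before": before} if before else {})},
#     headers={"X-Api-Key": API_KEY},
# ).json()["records"]
```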

POST /v1/corpus/build

Ingest multiple documents into your own ledger in a single call. Returns the per-document seeds, a pairwise Jaccard matrix across the documents, and an updated summary of your ledger.

Auth requiredScope: write

Request body

  • documents (object[]) 1-50 items of { data_b64 }
  • symbol_length_mode (string, optional) Default: "auto_curve"
Maximum 50 documents per call and 100 MB total across all documents.

Response fields

  • seeds (integer[]) Structural seed per document, in input order
  • jaccard_matrix (float[][]) Dense symmetric similarity matrix, diagonal = 1.0
  • ledger_summary (object) { primitive_count, timeline_length, discovery_rate, symbol_length, run_count }
  • processing_ms (integer) Total server-side processing time

Python example

import base64

docs = [b"first document", b"second document"]
body = {
    "documents": [
        {"data_b64": base64.b64encode(doc).decode()}
        for doc in docs
    ],
    "symbol_length_mode": "auto_curve",
}

result = requests.post(
    f"{BASE}/v1/corpus/build",
    json=body,
    headers={"X-Api-Key": API_KEY},
).json()

print(result["seeds"])
print(result["jaccard_matrix"])
print(result["ledger_summary"]["run_count"])

GET /v1/corpus/metadata

Return the summary for your own ledger. Returns zero values when you have not yet ingested anything.

Auth requiredScope: read

Response fields

  • primitive_count (integer) Distinct primitives in your ledger
  • timeline_length (integer) Total ledger timeline length
  • discovery_rate (float) New primitives / total selections
  • symbol_length (integer | null) Last symbol length used
  • run_count (integer) Distinct ingest runs recorded against this ledger

Python example

summary = requests.get(
    f"{BASE}/v1/corpus/metadata",
    headers={"X-Api-Key": API_KEY},
).json()
print(summary["primitive_count"], summary["run_count"])

GET /v1/corpus/export

Download your own ledger.bin as a binary attachment. Returns 404 if your ledger does not exist yet.

Auth required · Scope: read

Python example

resp = requests.get(
    f"{BASE}/v1/corpus/export",
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
with open("ledger-export.bin", "wb") as f:
    f.write(resp.content)

POST /v1/corpus/import

Replace your own ledger with an uploaded binary. The previous ledger is atomically renamed to ledger.bin.bak before the replacement becomes visible, so a crash cannot leave a half-written ledger.

Auth required · Scope: write

Upload

multipart/form-data; the file must start with the UFMR magic header and be no larger than 100 MB.

Response fields

  • replaced (boolean) Always true on success
  • backup_path (string | null) Server path of the .bak file, or null if no prior ledger
  • bytes_written (integer) Size of the new ledger
  • new_stats (object) Summary after replacement

Python example

with open("ledger-export.bin", "rb") as f:
    result = requests.post(
        f"{BASE}/v1/corpus/import",
        files={"file": ("ledger-export.bin", f, "application/octet-stream")},
        headers={"X-Api-Key": API_KEY},
    ).json()

print(result["replaced"])
print(result["new_stats"]["primitive_count"])
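Because the import endpoint rejects files that do not start with the UFMR magic header, it can save a round trip to check the file locally before uploading. A sketch, assuming the header is the literal ASCII bytes `UFMR` (the exact on-disk encoding is not specified here):

```python
# Assumed magic header: the literal ASCII bytes "UFMR".
MAGIC = b"UFMR"

def looks_like_ledger(path: str) -> bool:
    """Cheap pre-flight check before POSTing to /v1/corpus/import."""
    with open(path, "rb") as f:
        return f.read(len(MAGIC)) == MAGIC

# Demo with a throwaway file standing in for a real ledger-export.bin.
with open("demo-ledger.bin", "wb") as f:
    f.write(MAGIC + b"\x00" * 16)

print(looks_like_ledger("demo-ledger.bin"))  # True
```
A file exported via `GET /v1/corpus/export` should always pass this check.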

POST /v1/analyze/timeline

Analyse temporal structure of ingested data: autocorrelation, segment boundaries, and structural transitions. Useful for detecting periodicity and regime changes.

Auth required · 30 req/min

Request body

  • data_b64 (string) Base64-encoded data
  • symbol_length_mode (string, optional) Default: "auto_curve"
  • max_lag (integer, optional) Maximum ACF lag (1-500, default: 50)
  • window_size (integer, optional) Segment window size (10-10000, default: 100)
  • transition_threshold (float, optional) Rate-change threshold (0-1, default: 0.1)

Response fields

  • acf (float[]) Autocorrelation coefficients at lags 1..max_lag
  • segments (object[]) Per-segment discovery stats (start, end, rate, primitives)
  • transitions (integer[]) Indices where discovery rate changes significantly
  • primitive_count (integer) Total unique primitives
  • timeline_length (integer) Timeline entries
  • discovery_rate (float) Overall discovery rate

Python example

result = requests.post(
    f"{BASE}/v1/analyze/timeline",
    json={
        "data_b64": base64.b64encode(b"abcabcabc").decode(),
        "max_lag": 20,
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["acf"])
print(result["segments"])
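The `acf` array covers lags 1 through `max_lag`, so the index of the strongest coefficient plus one gives a candidate for the dominant period. A sketch over a made-up response:

```python
# Stand-in for result["acf"] from /v1/analyze/timeline (lags 1..max_lag).
acf = [0.10, -0.05, 0.92, 0.08, -0.02, 0.88]

# Strongest autocorrelation; +1 because the list starts at lag 1.
best_lag = max(range(len(acf)), key=lambda i: acf[i]) + 1
print(best_lag)  # 3 — consistent with a 3-symbol repeating pattern
```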

POST /v1/analyze/frequency

Analyse primitive frequency distribution. Returns a full histogram sorted by frequency and the top-N most common primitives.

Auth required · 30 req/min

Request body

  • data_b64 (string) Base64-encoded data
  • symbol_length_mode (string, optional) Default: "auto_curve"
  • top_n (integer, optional) Number of top primitives to return (1-1000, default: 20)

Response fields

  • histogram (object[]) All primitives with {primitive_id, count}, sorted by count
  • top_n (object[]) Most frequent primitives
  • total_primitives (integer) Total unique primitives
  • timeline_length (integer) Timeline entries

Python example

result = requests.post(
    f"{BASE}/v1/analyze/frequency",
    json={
        "data_b64": base64.b64encode(b"abcabcabc").decode(),
        "top_n": 5,
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["top_n"])
print(result["total_primitives"])
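Because the histogram is sorted by count, a quick concentration check is the share of the timeline covered by the most frequent primitives. A sketch over made-up response values (it assumes the histogram counts sum to `timeline_length`):

```python
# Stand-ins for fields returned by /v1/analyze/frequency.
histogram = [
    {"primitive_id": 7, "count": 5},
    {"primitive_id": 2, "count": 3},
    {"primitive_id": 9, "count": 1},
]
timeline_length = 9

# Fraction of the timeline covered by the two most frequent primitives.
top2 = sum(entry["count"] for entry in histogram[:2])
coverage = top2 / timeline_length
print(round(coverage, 3))  # 0.889 — highly concentrated distribution
```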

POST /v1/analyze/discovery

Retrieve the discovery sequence (order in which primitives were first encountered) and structural metrics. Useful for understanding how novelty evolves across data.

Auth required · 30 req/min

Request body

  • data_b64 (string) Base64-encoded data
  • symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

  • discovery_sequence (integer[]) Primitive IDs in discovery order
  • discovery_rate (float) Novelty fraction (0.0 to 1.0)
  • primitive_count (integer) Total unique primitives
  • timeline_length (integer) Timeline entries
  • symbol_length (integer) Symbol width used

Python example

result = requests.post(
    f"{BASE}/v1/analyze/discovery",
    json={"data_b64": base64.b64encode(b"abcabcabc").decode()},
    headers={"X-Api-Key": API_KEY},
).json()
print(result["discovery_sequence"])
print(result["discovery_rate"])
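If `discovery_rate` is new primitives over total selections (as the corpus metadata fields describe it), the response fields should be mutually consistent, which makes a cheap client-side sanity check. A sketch over made-up values:

```python
# Stand-in for a /v1/analyze/discovery response.
result = {
    "discovery_sequence": [4, 9, 1],
    "primitive_count": 3,
    "timeline_length": 9,
    "discovery_rate": 3 / 9,
}

# Each primitive appears exactly once in the discovery sequence,
# and the rate should equal primitives over timeline entries.
assert len(result["discovery_sequence"]) == result["primitive_count"]
assert abs(result["discovery_rate"]
           - result["primitive_count"] / result["timeline_length"]) < 1e-9
print("consistent")
```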

POST /v1/analyze/symbol_length

Report the engine-selected symbol length for the given bytes and the selector metadata (entropy of the chosen length, the mode used, and the sample size consumed). Useful for diagnosing why a corpus produced the primitive count it did.

Auth required · 30 req/min

Request body

  • data_b64 (string) Raw bytes, base64-encoded

Response fields

  • selected_length (integer) Symbol length (in bits) the engine chose
  • entropy_at_selected (float) Shannon entropy at the selected length
  • mode (string) Selector mode (e.g. "AutoCurve", "Entropy", "Fixed(8)")
  • sample_bits_used (integer) Number of bits sampled during selection

Python example

result = requests.post(
    f"{BASE}/v1/analyze/symbol_length",
    json={"data_b64": base64.b64encode(b"Hello World").decode()},
    headers={"X-Api-Key": API_KEY},
).json()
print(result["selected_length"])
print(result["entropy_at_selected"])
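The selector runs server-side, but you can build intuition for `entropy_at_selected` by computing Shannon entropy at a fixed 8-bit symbol width yourself. This mirrors the textbook definition, not the engine's actual selection logic:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits, treating each byte as one symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(byte_entropy(b"aaaaaaaa"))  # one distinct symbol: 0 bits
print(byte_entropy(b"abcdefgh"))  # eight equiprobable symbols: 3 bits
```
Low entropy at a given width generally means fewer distinct primitives at that width, which is the trade-off the selector is navigating.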

POST /v1/process/universal

Run the 7-stage universal pipeline with quality gates. The pipeline validates, analyses, and verifies your data through seven stages: VALIDATE → METRICS → STRATEGY → EXECUTE → VERIFY → ADAPT → OUTPUT.

Auth required · 30 req/min · Scope: write

Request body

  • data_b64 (string) Input data, base64-encoded
  • verify (boolean, optional) Run replay verification (default: true). Set false for bulk speed.

Response fields

  • success (boolean) Whether ingestion + stage flow completed (replay proof requires replay_valid=true)
  • seed (integer) Deterministic seed from geometric signature
  • quality (object) Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
  • replay_valid (boolean) Whether replay invariant holds
  • stages_completed (string[]) All 7 stage names in order
  • metrics (object) Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
  • strategy (object) Selected processing strategy: symbol_length_mode, zero_point
  • execute (object) Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
  • field_reach (object) Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
  • violations (string[]) Any warnings or errors
  • engine_version (string) Engine version

Python example

import base64, requests

resp = requests.post(
    f"{BASE}/v1/process/universal",
    json={
        "data_b64": base64.b64encode(b"your data here").decode(),
        "verify": True,
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Success: {result['success']}")
print(f"Seed: {result['seed']}")
print(f"Replay valid: {result['replay_valid']}")
print(f"Stages: {result['stages_completed']}")
print(f"Strategy: {result['strategy']}")
print(f"Quality: {result['quality']}")
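When ingesting many items with `verify: false` for bulk speed, it helps to factor the request-body construction out of the loop. A minimal helper sketch (`universal_body` is a hypothetical name; the posting loop with `requests` is omitted):

```python
import base64

def universal_body(data: bytes, verify: bool = True) -> dict:
    """Build a /v1/process/universal request body.

    verify=False skips replay verification, which the request-body
    notes suggest for bulk-throughput runs.
    """
    return {
        "data_b64": base64.b64encode(data).decode(),
        "verify": verify,
    }

batch = [b"item-1", b"item-2", b"item-3"]
bodies = [universal_body(item, verify=False) for item in batch]
print(bodies[0]["verify"])  # False
```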

GET /v1/health

Check API and engine status. No authentication required.

No auth

Response fields

  • status (string) "ok" if database reachable, "degraded" if not
  • engine_version (string) UFM engine version (e.g. 3.0-rust)

Python example

health = requests.get(f"{BASE}/v1/health").json()
print(health["status"])
print(health["engine_version"])

cURL example

curl https://api.spectrengine.com/v1/health

# {"status": "ok", "engine_version": "3.0-rust"}

Error Handling

All errors return a consistent JSON structure:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "request_id": "correlation-uuid"
  }
}

Error codes

| Status | Code | Meaning |
|--------|------|---------|
| 400 | BAD_REQUEST | Invalid base64, malformed JSON, or engine error |
| 401 | UNAUTHORIZED | Missing or invalid authentication credentials |
| 413 | PAYLOAD_TOO_LARGE | Request body exceeds 10 MB limit |
| 422 | VALIDATION_ERROR | Request body failed schema validation |
| 429 | RATE_LIMIT | Too many requests (check Retry-After header) |
| 500 | INTERNAL_ERROR | Server error (includes request_id for support) |

Handling errors in code

import base64, requests

BASE    = "https://api.spectrengine.com"
API_KEY = "ufm_live_a1b2c3d4.your_secret_here"

payload = {"data_b64": base64.b64encode(b"Hello World").decode()}

resp = requests.post(f"{BASE}/v1/process", json=payload,
                      headers={"X-Api-Key": API_KEY})

if resp.status_code == 200:
    result = resp.json()
elif resp.status_code == 429:
    retry_after = resp.headers.get("Retry-After", 60)
    print(f"Rate limited. Retry in {retry_after}s")
else:
    error = resp.json().get("error", {})
    print(f"Error {resp.status_code}: {error.get('message')}")
    print(f"Request ID: {error.get('request_id')}")
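For 429 responses you can go one step further and retry automatically after the Retry-After delay. A transport-agnostic sketch (`post_with_retry` is a hypothetical helper; pass e.g. `functools.partial(requests.post, json=payload, headers=headers)` as `post`):

```python
import time

def post_with_retry(post, url, max_attempts=3, sleep=time.sleep):
    """Call post(url); while it returns HTTP 429, wait for the number of
    seconds in the Retry-After header (defaulting to 60), then retry."""
    resp = post(url)
    attempts = 1
    while resp.status_code == 429 and attempts < max_attempts:
        sleep(int(resp.headers.get("Retry-After", 60)))
        resp = post(url)
        attempts += 1
    return resp
```
The injectable `sleep` parameter keeps the helper easy to test without actually waiting.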

Rate Limits

Launch promotion: All API endpoint rate limits are temporarily removed. Sign up for free and use every endpoint with no request caps. Abuse-prevention limits on authentication endpoints still apply (see below). Tiered usage limits will be introduced in a future update.

The following protective limits remain in place:

| Limit | Purpose |
|-------|---------|
| 5 requests/min on login & register (per IP) | Prevents brute-force and spam account creation |
| 1 request per 5 minutes on register (per email) | Prevents the same address being submitted repeatedly across different IPs |
| 3 requests/min on password reset request | Prevents email flooding |
| 3 requests/hour on resend-verification (per email) | Prevents verification email flooding |
| 3 requests/hour on contact form (per IP) | Prevents inbox spam |
| Cloudflare Turnstile on register, password reset, contact, resend | Bot protection. Adaptive on login: shown after 3 failed attempts within 15 minutes. |
| Email verification required before login | Confirms account ownership; blocks throwaway-email signups. |
| 10 MB max request body | Payload size protection |
| 100 items per batch request | Batch size protection |

If you exceed an abuse-prevention limit the API returns HTTP 429 with a Retry-After header (seconds).