SPECTR Engine Docs

Use the hosted API or install the local Python wheel. Hosted API base URL:

https://api.spectrengine.com

Start Here

UFM takes bytes, decomposes them into structural primitives, and gives you repeatable results you can use in software. You can call it through the hosted HTTPS API, or install the same engine locally with the Python wheel.

Hosted API

Best when you want a service endpoint, account-managed ledgers, dashboard history, and no local engine install.

import base64, os, requests

BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]

payload = {
    "data_b64": base64.b64encode(b"Hello World").decode()
}
resp = requests.post(
    f"{BASE}/v1/engine",
    json=payload,
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
result = resp.json()
print(result["core"]["seed"])
print(result["core"]["replay_valid"])

Local Python wheel

Best when you want raw bytes in your own process, local ledger files, and engine calls without an HTTP round trip after activation.

pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>

import ufm

eng = ufm.InvariantIdentityEngine(storage_path='ledger.bin')
seed, status = eng.process(b'Hello World')
print(seed, status)
print(eng.reconstruct(b'Hello World'))
eng.save()

First success checkpoint: for the API, you should see a JSON response with core.seed and core.replay_valid. Locally, you should see a seed, a status such as NOVELTY, and True from reconstruct.

Which Endpoint Should I Use?

The SPECTR Engine has two main endpoints that cover most use cases. Pick the one that matches what you're doing.

Recommended default

POST /v1/engine

Analyse one piece of data. Returns its structural identity, quality metrics, temporal patterns, and frequency distribution, all in a single call.

Best for:

Understanding the structural profile of a file or payload
Monitoring data quality over time
Getting a complete analysis without multiple API calls

Default for pairs

POST /v1/pipeline

Compare two pieces of data. Returns their structural overlap and detects semantic noise (encoding changes, BOM insertions, line-ending shifts) that may disguise functionally identical data as different.

Best for:

Detecting what changed between two file versions
Filtering out noise (BOM, CRLF, Base64) from real structural changes
Data integrity verification across systems

What each endpoint runs

Analysis layer	`/v1/engine`	`/v1/pipeline`
Core ingestion (seed, signature, replay check)	Yes	Yes
Universal pipeline (7-stage quality gates)	Yes	-
Timeline analysis (autocorrelation, segments)	Yes	-
Frequency analysis (primitive distribution)	Yes	-
Discovery analysis (novelty sequence)	Yes	-
Structural comparison (Jaccard, shared/unique)	With compare_b64	With target_b64
Semantic noise detection (BOM, CRLF, Base64)	-	With target_b64
Anti-drift governance gates	-	With target_b64

Not sure? Start with /v1/engine. It gives you the most detail about a single input. Switch to /v1/pipeline when you need to compare two inputs or detect noise between them.

Other endpoints

The granular endpoints give you fine-grained control when you only need one specific layer:

/v1/process - lightweight ingestion only (get a seed, replay later with /v1/replay/{seed})
/v1/batch/process - process up to 100 items in parallel (lightweight signatures)
/v1/batch/engine - full engine analysis on up to 50 items (replaces per-item loops)
/v1/batch/compare - compare up to 100 pairs in one call
/v1/compare - structural comparison without semantic analysis
/v1/structural_profile - canonical 5D structural complexity profile (alpha, Zipf, vocabulary, reuse, discovery integral)
/v1/semantic/analyze - semantic noise detection with policy control
/v1/analyze/* - individual timeline, frequency, or discovery analysis
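
The batch limits listed above (100 items for /v1/batch/process and /v1/batch/compare, 50 for /v1/batch/engine) mean a larger corpus has to be split on the client before it is sent. A minimal sketch of that split follows. The helper name chunked and the BATCH_LIMITS table are illustrative and not part of the API, and the batch request body is left out because this page does not document its shape.

```python
from typing import Iterator, Sequence

# Per-call item limits taken from the endpoint list above.
BATCH_LIMITS = {
    "/v1/batch/process": 100,
    "/v1/batch/engine": 50,
    "/v1/batch/compare": 100,
}

def chunked(items: Sequence, endpoint: str) -> Iterator[Sequence]:
    """Yield consecutive slices no larger than the endpoint's documented limit."""
    limit = BATCH_LIMITS[endpoint]
    for start in range(0, len(items), limit):
        yield items[start:start + limit]

docs = [f"doc {i}".encode() for i in range(120)]
sizes = [len(batch) for batch in chunked(docs, "/v1/batch/engine")]
print(sizes)  # [50, 50, 20]
```

Each slice can then be sent as one batch call, which keeps every request inside the stated limit.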

Plain-English Terms

Term	What it means when you are coding
`data_b64`	Base64 text wrapping your bytes for HTTP JSON. Local wheel calls take raw `bytes` instead.
seed	A deterministic numeric identity returned by the engine for the structural profile it computed.
ledger	The stored primitive/timeline state. Hosted API ledgers are account-scoped; local ledgers are files you choose.
replay	Rebuilding previously ingested bytes from ledger state. `replay_valid` tells you whether the check passed.
`NOVELTY` / `REPLAY`	`NOVELTY` means the call selected new primitives. `REPLAY` means the structure was already in that ledger.
request-scoped	The endpoint computes a result for that request without storing it for later replay.
semantic noise	A byte-level representation change, such as BOM or line endings, that the semantic layer can classify separately from structural difference.
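
To make the data_b64 row concrete, here is a small sketch of the round trip between raw bytes and the base64 text carried in JSON. The helper names to_data_b64 and from_b64 are hypothetical conveniences, not functions provided by the API or the wheel.

```python
import base64

def to_data_b64(data: bytes) -> str:
    """Encode raw bytes as the base64 text expected in the data_b64 field."""
    return base64.b64encode(data).decode("ascii")

def from_b64(text: str) -> bytes:
    """Decode base64 text back to bytes, rejecting non-base64 characters."""
    return base64.b64decode(text, validate=True)

encoded = to_data_b64(b"Hello World")
print(encoded)            # SGVsbG8gV29ybGQ=
print(from_b64(encoded))  # b'Hello World'
```

The encoded value is the same string used in the cURL examples on this page.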

Quick Start

This path gets you from no code to a working API call. Use Python first because it handles base64 safely on Windows, macOS, and Linux.

1. Get your API key

Go to API Keys in the sidebar and click Create Key. Copy the full key immediately; it is shown only once. Format: ufm_live_a1b2c3d4.xxxxxxxxx

2. Install the HTTP helper library

The examples below use requests and read your key from an environment variable so you do not paste secrets into source files.

pip install requests

# PowerShell
$env:UFM_API_KEY = "ufm_live_<your_key>"

# macOS/Linux shell
export UFM_API_KEY="ufm_live_<your_key>"

3. Analyse your first input

Use /v1/engine for a complete structural analysis of one input:

import base64
import os
import requests

BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]

payload = {
    "data_b64": base64.b64encode(b"Hello World").decode(),
}
resp = requests.post(
    f"{BASE}/v1/engine",
    json=payload,
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
result = resp.json()

print("seed:", result["core"]["seed"])
print("status:", result["core"]["status"])
print("replay valid:", result["core"]["replay_valid"])
print("quality:", result["universal"]["quality"])

If this works, you have authenticated successfully and the engine has analysed your bytes. The most useful first fields are core.seed, core.status, core.replay_valid, and universal.quality.

4. Persist and replay

/v1/engine is request-scoped (no data persisted between calls). To store data for later replay, use /v1/process:

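payload = {"data_b64": base64.b64encode(b"Hello World").decode()}

# Store the bytes in your account ledger.
resp = requests.post(
    f"{BASE}/v1/process",
    json=payload,
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
created = resp.json()

# Replay the stored bytes by seed.
seed = created["seed"]
resp = requests.get(
    f"{BASE}/v1/replay/{seed}",
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
replayed = resp.json()

# Rejoin the base64 chunks into the original bytes.
original = b"".join(base64.b64decode(c) for c in replayed["chunks_b64"])
print(original)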

5. Compare two inputs

Use /v1/pipeline with a target_b64 to compare two inputs and detect noise:

source = b"Hello World"
target = b"\xef\xbb\xbfHello World"

resp = requests.post(
    f"{BASE}/v1/pipeline",
    json={
        "data_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
pair = resp.json()
print(pair["compare"]["jaccard"])
print(pair["semantic"]["converges"])
print(pair["semantic"]["noise_units"])

This compares "Hello World" with a BOM-prefixed version. The response shows structural overlap (Jaccard similarity) and identifies the BOM as a noise artefact.

cURL smoke test

curl -X POST https://api.spectrengine.com/v1/engine \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "SGVsbG8gV29ybGQ="}'

Local Python Wheel

The hosted API and the local wheel are two surfaces over the same native UFM engine. Use the wheel when you want the engine inside your own process, with local ledgers, no HTTP base64 wrapper, and no per-request network call after activation.

Licence required | Local bytes API | Bob and Ben included

Install once, activate once, then import ufm. Activation caches a signed licence token on the machine. Normal engine calls use that cached token; python -m ufm status checks it without touching the network.

Install and activate

Use the wheel that matches your Python and OS. The Windows CPython 3.13 wheel is:

python -m venv .venv
.\.venv\Scripts\activate

pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>
python -m ufm status
python -m ufm version

# Done: import ufm works, and features granted by your licence are available.

For unattended jobs, you can also set UFM_LICENCE_KEY before the first engine call. Set UFM_LICENCE_CACHE_DIR when you want the licence cache to live in a controlled directory.

import ufm

status = ufm.licence_status()
print(status["active"])
print(status["tier"])
print(status["days_remaining"])

Mental model

HTTP requests use base64 strings. Local calls take raw bytes.
API persistence is your account ledger. Local persistence is the storage_path file you choose.
Request-scoped API endpoints map to temporary local ledgers or stateless helpers.
Local Ben history lives in a BenSession; persist it yourself if you need durable chat history.
There are no hosted rate limits locally, but the 100 MB input limit and licence scopes still apply.
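
Because the 100 MB input limit still applies locally, it can help to check size before handing bytes to the engine. The sketch below is only a client-side guard. The page does not say whether MB means 10^6 or 2^20 bytes, so the constant uses the smaller reading as an assumption, and check_input_size is a hypothetical helper rather than part of the wheel.

```python
# Assumption: decimal megabytes, the stricter of the two common readings.
MAX_INPUT_BYTES = 100 * 1000 * 1000

def check_input_size(data: bytes, limit: int = MAX_INPUT_BYTES) -> bytes:
    """Return data unchanged, or raise ValueError if it exceeds the limit."""
    if len(data) > limit:
        raise ValueError(f"input is {len(data)} bytes; limit is {limit}")
    return data

payload = check_input_size(b"Hello World")
print(len(payload))  # 11
```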

Core ingest, persist, and replay

This is the local equivalent of POST /v1/process, GET /v1/replay/{seed}, and POST /v1/reconstruct.

import ufm

data = b"Hello World"

with ufm.InvariantIdentityEngine(storage_path="customer-ledger.bin") as eng:
    seed, status = eng.process(data)
    print(seed, status)                  # NOVELTY on first ingest

    seed2, status2 = eng.process(data)
    print(seed2 == seed, status2)        # True, REPLAY

    assert eng.reconstruct(data) is True

    replayed = [bytes(seq) for seq in eng.replay(seed)]
    print(replayed[0])                   # b"Hello World"

    print(eng.ledger_summary())

# save() is called automatically when the context exits.

Full engine analysis

Use this pattern when you want the same layers as POST /v1/engine: core identity, universal pipeline quality, timeline, frequency, discovery, and optional structural comparison.

import tempfile
import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

def mode_from_strategy(strategy: dict | None, fallback: str = "auto_curve") -> str:
    raw = str((strategy or {}).get("symbol_length_mode") or fallback).lower()
    if raw in ("autocurve", "auto_curve"):
        return "auto_curve"
    if raw == "entropy":
        return "entropy"
    if raw.startswith("fixed(") and raw.endswith(")"):
        return f"fixed{raw[6:-1]}"
    return raw if raw.startswith("fixed") else fallback

def compare_bytes(a: bytes, b: bytes, mode: str = "auto_curve") -> dict:
    la = ufm.ingest_raw(to_bits(a), symbol_length_mode=mode)
    lb = ufm.ingest_raw(to_bits(b), symbol_length_mode=mode)
    return ufm.ledger_compare(la, lb)

def run_local_engine(
    data: bytes,
    *,
    compare_with: bytes | None = None,
    verify: bool = True,
    max_lag: int = 50,
    top_n: int = 20,
) -> dict:
    with tempfile.TemporaryDirectory(prefix="ufm-engine-") as tmp:
        up = ufm.UniversalPipeline(
            storage_path=f"{tmp}/engine-request-ledger.bin",
            zero_point=True,
            verify=verify,
        )
        universal = up.run(data)
        mode = mode_from_strategy(universal.get("strategy"))
        sig = ufm.ufm_signature(data, symbol_length_mode=mode)
        ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode=mode)

        result = {
            "core": {
                "seed": universal.get("seed", sig["seed"]),
                "status": (universal.get("execute") or {}).get("status"),
                "discovery_rate": sig["discovery_rate"],
                "symbol_length": sig["symbol_length"],
                "primitive_count": sig["primitive_count"],
                "timeline_length": sig["timeline_length"],
                "reuse_ratio": sig["reuse_ratio"],
                "replay_valid": sig["replay_valid"],
                "signature": sig["signature"],
            },
            "universal": universal,
            "timeline": {
                "acf": ledger.acf(max_lag),
                "segments": ledger.segments(100),
                "transitions": ledger.transitions(100, 0.1),
            },
            "frequency": {
                "histogram": ledger.frequency_histogram(),
                "top_n": ledger.top_n_primitives(top_n),
            },
            "discovery": {
                "discovery_sequence": ledger.discovery_sequence(),
                "discovery_rate": ledger.discovery_rate,
                "primitive_count": ledger.primitive_count,
                "timeline_length": ledger.timeline_length,
                "symbol_length": ledger.symbol_length,
            },
            "effective_symbol_length_mode": mode,
            "engine_version": ufm.VERSION,
        }
        if compare_with is not None:
            result["comparison"] = compare_bytes(data, compare_with, mode)
        return result

analysis = run_local_engine(b"Hello World", compare_with=b"\xef\xbb\xbfHello World")
print(analysis["core"]["seed"])
print(analysis["universal"]["quality"])
print(analysis["comparison"]["jaccard"])

Pair comparison and semantic noise

This is the local equivalent of POST /v1/pipeline, POST /v1/compare, POST /v1/noise/detect, POST /v1/noise/delta, and POST /v1/semantic/analyze.

import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

source = b"line one\nline two\n"
target = b"line one\r\nline two\r\n"

la = ufm.ingest_raw(to_bits(source))
lb = ufm.ingest_raw(to_bits(target))
structural = ufm.ledger_compare(la, lb)

semantic = ufm.SemanticDecisionPipeline("semantic-ledger.jsonl")
noise = semantic.run_with_policy(
    source,
    target,
    enabled_noise_classes=["line_ending_crlf"],
    strict_allowlist=True,
)

print(structural["jaccard"])
print(noise["converges"])       # True for classified CRLF/LF noise
print(noise["noise_units"])
print(noise["validation_checks"])
print(noise["decision_hash"])

Analytics helpers

The granular analysis endpoints are direct ledger operations locally.

import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

data = b"abcabcabc"
ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode="auto_curve")

timeline = {
    "acf": ledger.acf(50),
    "segments": ledger.segments(100),
    "transitions": ledger.transitions(100, 0.1),
}
frequency = {
    "histogram": ledger.frequency_histogram(),
    "top_n": ledger.top_n_primitives(20),
}
discovery = {
    "discovery_sequence": ledger.discovery_sequence(),
    "discovery_rate": ledger.discovery_rate,
}
symbol_length, selector_meta = ufm.find_optimal_symbol_length(to_bits(data))
profile = ufm.structural_profile(data, symbol_width=16)

print(timeline)
print(frequency)
print(discovery)
print(symbol_length, selector_meta)
print(profile)

Batch and corpus workflows

Use process_batch for persisted corpus ingestion and ufm_signature_batch for independent stateless signatures. Local corpus import/export is just controlled movement of your ledger.bin file.

import shutil
import ufm

def to_bits(data: bytes) -> list[int]:
    return [int(bit) for byte in data for bit in f"{byte:08b}"]

docs = [b"file one", b"file two", b"file three"]

with ufm.InvariantIdentityEngine(storage_path="corpus-ledger.bin") as eng:
    results = eng.process_batch(docs)
    seeds = [seed for seed, _status in results]
    summary = eng.ledger_summary()

ledgers = [ufm.ingest_raw(to_bits(doc)) for doc in docs]
jaccard_matrix = [
    [ufm.ledger_jaccard(a, b) for b in ledgers]
    for a in ledgers
]

print(seeds)
print(summary)
print(jaccard_matrix)

# Export/import the local ledger file.
shutil.copyfile("corpus-ledger.bin", "customer-export.ufmr")
shutil.copyfile("customer-export.ufmr", "restored-ledger.bin")

Universal and decision pipelines

UniversalPipeline is the governed data processing path. DecisionPipeline is the text decision/audit path with anti-drift gates and a persistent audit ledger.

import ufm

up = ufm.UniversalPipeline(
    storage_path="universal-ledger.bin",
    bit_depth=21,
    verify=True,
    zero_point=False,
)
run = up.run(b"payload for the governed pipeline")
print(run["success"], run["replay_valid"], run["quality"])
print(run["stages_completed"])

dp = ufm.DecisionPipeline("decision-ledger.jsonl")
decision = dp.run("What is structural identity in UFM?")
print(decision["status"])
print(decision["decision_hash"])
print(decision["gates"])

Call Ben locally

Ben is included in the wheel. It loads sealed prompts after the ben.ask scope check, then uses the LLM backend you provide.

import os
import ufm

# Local Ollama. Make sure Ollama is running and the model is pulled.
backend = ufm.backend_from_config(backend="ollama", model="gemma4")
session = ufm.BenSession(backend=backend)

first = session.ask("What is UFM actually doing?")
print(first.text)
print(first.session_id, first.turn_count)

second = session.ask("How does replay relate to structural identity?")
print(second.text)
print(session.history())

# One-shot convenience wrapper with a provider API key.
answer = ufm.ask_ben(
    "Explain the replay invariant.",
    backend="openai",
    model="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
)
print(answer["text"])

CLI form:

python -m ufm ask-ben "What is UFM?" --backend ollama --model gemma4
python -m ufm ask-ben "Explain replay" --backend openai --model gpt-4o --api-key YOUR_OPENAI_API_KEY

Call Bob locally

Bob is also included in the wheel. The sealed corpus, claim-gate snapshot, OOV thresholds, and prompts ship with the package. Bob can run corpus retrieval and gate/OOV/audit output without an LLM backend; pass a backend when you want generated response wording.

import os
import ufm

# Corpus retrieval, gate status, OOV metrics, and audit output.
bob = ufm.BobPipeline()
result = bob.query("What is the replay invariant?", mode="advisory", max_anchors=5)
print(result.response)
print(result.gate_status)       # PASS, WARN, or BLOCK
print(result.evidence)
print(result.oov)
print(result.audit)
print(result.boundary_flags)

# Optional generated answer using an LLM backend.
backend = ufm.backend_from_config(
    backend="anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)
bob_with_llm = ufm.BobPipeline(backend=backend)
print(bob_with_llm.query("Explain C-CORE-002.").response)

# One-shot convenience wrapper.
one_shot = ufm.ask_bob(
    "Is replay identity verified?",
    backend="ollama",
    model="gemma4",
)
print(one_shot["response"])
print(one_shot["gate_status"])

CLI form:

python -m ufm ask-bob "Is the replay invariant verified?" --backend ollama --model gemma4
python -m ufm ask-bob "Explain C-CORE-002" --backend anthropic --api-key YOUR_ANTHROPIC_API_KEY

Local equivalents at a glance

API surface	Local wheel surface
`/v1/engine`	Compose `UniversalPipeline`, `ufm_signature`, `ingest_raw`, and ledger analytics.
`/v1/process`, `/v1/replay`, `/v1/reconstruct`	`InvariantIdentityEngine.process`, `replay`, and `reconstruct`.
`/v1/pipeline`, `/v1/compare`	`ledger_compare`, `ledger_jaccard`, plus `SemanticDecisionPipeline` for pair noise.
`/v1/noise/*`, `/v1/semantic/analyze`	`SemanticDecisionPipeline.run`, `run_with_policy`, and `capabilities`.
`/v1/analyze/*`, `/v1/structural_profile`	Ledger methods: `acf`, `segments`, `transitions`, `frequency_histogram`, `discovery_sequence`, plus `structural_profile`.
`/v1/batch/`, `/v1/corpus/`	`process_batch`, `ufm_signature_batch`, local loops, local manifests, and copying/importing `ledger.bin`.
`/v1/ingest/async`	Run `process` or `UniversalPipeline.run` inside your own background worker or queue.
`/v1/ben/ask`, `/v1/bob/query`	`BenSession`, `ask_ben`, `BobPipeline`, `ask_bob`, or the CLI commands.
`/v1/me/llm-credentials`	Pass `OllamaBackend`, `APIBackend`, or `backend_from_config` directly. Store provider keys in your own secret manager.

Operational notes

Keep one ledger file per project or tenant. Primitive reuse is scoped to that file.
Do not run multiple writers against the same ledger path without your own lock.
Ledger paths must stay inside the current working directory; paths containing .. are rejected.
Call ufm.set_num_threads(n) before the first ingestion if you need to bound CPU use.
Provider-backed Ben/Bob calls require the provider SDK, for example pip install openai or pip install anthropic.
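
One of the notes above asks you to bring your own lock when several writers share a ledger path. A minimal sketch of one way to do that uses an exclusive lock file created next to the ledger. The name ledger_lock and the .lock suffix are assumptions for illustration. The wheel does not provide this helper, and a crashed process would leave a stale lock file that has to be removed by hand.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def ledger_lock(ledger_path: str):
    """Hold an exclusive lock file next to the ledger while the block runs."""
    lock_path = ledger_path + ".lock"
    # O_EXCL makes creation fail with FileExistsError if the lock is already held.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield lock_path
    finally:
        os.close(fd)
        os.remove(lock_path)

# Demonstration in a temporary directory so no real ledger is touched.
with tempfile.TemporaryDirectory() as tmp:
    ledger = os.path.join(tmp, "customer-ledger.bin")
    with ledger_lock(ledger) as lock_file:
        print(os.path.exists(lock_file))  # True while the lock is held
    print(os.path.exists(lock_file))      # False after release
```

In real use the engine calls that write to the ledger would go inside the with block, and a second writer would catch FileExistsError and retry later.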

Authentication

All API endpoints (except /v1/health) require an authenticated context. Programmatic clients should pass an API key in the X-Api-Key header. Browser dashboard sessions can use an access_token cookie.

Getting your API key

Go to API Keys in the sidebar, click Create Key, and copy the full key immediately. It is only shown once.

Using your API key

curl -H "X-Api-Key: ufm_live_a1b2c3d4.your_secret_here" \
  -H "Content-Type: application/json" \
  -X POST https://api.spectrengine.com/v1/process \
  -d '{"data_b64": "..."}'

Key format

Keys follow the format ufm_live_{public_id}.{secret}. The prefix identifies the key; the secret authenticates it. Both parts are required.
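
The key format above can be checked on the client before a request is sent, and the secret can be hidden when a key has to appear in logs. The sketch below assumes only the documented ufm_live_{public_id}.{secret} layout. The helper names split_api_key and mask_api_key are hypothetical, and treating the public id as safe to log is an assumption you should confirm against your own policy.

```python
KEY_PREFIX = "ufm_live_"

def split_api_key(key: str) -> tuple:
    """Split a key into (public_id, secret), raising ValueError if malformed."""
    if not key.startswith(KEY_PREFIX):
        raise ValueError("key must start with 'ufm_live_'")
    public_id, sep, secret = key[len(KEY_PREFIX):].partition(".")
    if not (public_id and sep and secret):
        raise ValueError("key must look like ufm_live_{public_id}.{secret}")
    return public_id, secret

def mask_api_key(key: str) -> str:
    """Return a form that keeps the public id and hides the secret."""
    public_id, _secret = split_api_key(key)
    return f"{KEY_PREFIX}{public_id}.***"

print(mask_api_key("ufm_live_a1b2c3d4.your_secret_here"))  # ufm_live_a1b2c3d4.***
```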

Security: Never share your API key or commit it to source control. If compromised, revoke it immediately from the API Keys page and create a new one.

POST /v1/engine

Start here. This is the recommended endpoint for analysing a single input. One call runs every analysis layer and returns a complete structural profile. Use the granular endpoints below only when you need a specific layer in isolation.

Send any binary data (base64-encoded) and the engine decomposes it into structural primitives, assigns a deterministic seed (its identity), and runs five analysis layers: core ingestion, 7-stage quality pipeline, timeline autocorrelation, frequency distribution, and discovery sequencing. Optionally provide a second input for structural comparison.

The response is grouped by layer so you can read exactly the depth you need. Most integrations only use core (identity and signature) and universal.quality (quality metrics).

Auth required | 10 req/min | Scope: write
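
A client that loops over many inputs can stay under the 10 requests per minute limit by pacing itself. The sketch below is a client-side sliding-window throttle. The class name SlidingWindowThrottle is hypothetical, and whether the server counts with a sliding or a fixed window is not stated on this page, so treat this as a conservative approximation.

```python
import time
from collections import deque
from typing import Callable

class SlidingWindowThrottle:
    """Allow at most `limit` calls in any `window` seconds, sleeping when needed."""

    def __init__(
        self,
        limit: int = 10,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._stamps: deque = deque()

    def wait(self) -> None:
        """Call immediately before each request."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.window - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)

throttle = SlidingWindowThrottle(limit=10, window=60.0)
throttle.wait()  # the first call returns without sleeping
```

The clock and sleep parameters exist so the pacing logic can be exercised in tests without real delays.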

Request body

data_b64 (string) Base64-encoded binary data to process
symbol_length_mode (string, optional) Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64
verify (boolean, optional) Run replay verification in universal pipeline (default: true)
compare_b64 (string, optional) Second item for structural comparison (base64-encoded)
max_lag (integer, optional) Maximum ACF lag for timeline analysis (1-500, default: 50)
top_n (integer, optional) Top-N primitives for frequency analysis (1-1000, default: 20)
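
The request fields above have documented ranges, so a client can reject bad values before spending a rate-limited call. The following is a sketch of a payload builder under that assumption. build_engine_payload is a hypothetical helper, and the server remains the authority on validation.

```python
import base64
from typing import Optional

SYMBOL_LENGTH_MODES = {
    "auto", "auto_curve", "entropy",
    "fixed8", "fixed16", "fixed24", "fixed32", "fixed64",
}

def build_engine_payload(
    data: bytes,
    *,
    symbol_length_mode: str = "auto_curve",
    verify: bool = True,
    compare_with: Optional[bytes] = None,
    max_lag: int = 50,
    top_n: int = 20,
) -> dict:
    """Build a /v1/engine request body, checking the documented ranges first."""
    if symbol_length_mode not in SYMBOL_LENGTH_MODES:
        raise ValueError(f"unknown symbol_length_mode: {symbol_length_mode!r}")
    if not 1 <= max_lag <= 500:
        raise ValueError("max_lag must be between 1 and 500")
    if not 1 <= top_n <= 1000:
        raise ValueError("top_n must be between 1 and 1000")
    payload = {
        "data_b64": base64.b64encode(data).decode("ascii"),
        "symbol_length_mode": symbol_length_mode,
        "verify": verify,
        "max_lag": max_lag,
        "top_n": top_n,
    }
    # compare_b64 is optional, so it is only added when a second input is given.
    if compare_with is not None:
        payload["compare_b64"] = base64.b64encode(compare_with).decode("ascii")
    return payload

body = build_engine_payload(b"Hello World", max_lag=100)
print(body["data_b64"], body["max_lag"])  # SGVsbG8gV29ybGQ= 100
```

The returned dict can be passed as the json argument of requests.post.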

Response - nested by layer

Trust boundary: the core, universal, and analysis sections are computed in one request-scoped context. Use the response trust_boundary field to interpret replay and cross-section comparisons correctly.

core: Core ingestion result

core.seed (integer) Deterministic seed from geometric signature
core.status (string) NOVELTY or REPLAY
core.discovery_rate (float) Fraction of novel primitives (0.0-1.0)
core.symbol_length (integer) Symbol width used
core.primitive_count (integer) Unique primitives found
core.timeline_length (integer) Total timeline entries
core.reuse_ratio (float) Fraction of reused primitives
core.replay_valid (boolean) Whether replay(ingest(data)) == data
core.signature (string) Multi-scale geometric signature

universal: 7-stage governed pipeline with quality gates

universal.success (boolean) Whether pipeline completed successfully
universal.quality (object) Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
universal.replay_valid (boolean) Whether replay invariant holds
universal.stages_completed (string[]) List of completed stage names
universal.metrics (object) Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
universal.strategy (object) Selected processing strategy: symbol_length_mode, zero_point
universal.execute (object) Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
universal.field_reach (object) Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
universal.violations (string[]) Any warnings or errors

timeline: Temporal structure analysis

timeline.acf (float[]) Autocorrelation function values
timeline.segments (array) Discovery segments with start, end, rate
timeline.transitions (integer[]) Positions of structural transitions

frequency: Primitive frequency distribution

frequency.histogram (array) Full primitive frequency distribution
frequency.top_n (array) Top-N most frequent primitives
frequency.total_primitives (integer) Unique primitive count

discovery: Novelty emergence sequence

discovery.discovery_sequence (integer[]) Primitive IDs in order of first encounter
discovery.discovery_rate (float) Final discovery rate
discovery.primitive_count (integer) Total unique primitives

comparison: Structural overlap (only if compare_b64 provided)

comparison.jaccard (float) Jaccard similarity (shared / union)
comparison.shared_primitives (integer) Primitives in both inputs
comparison.only_a (integer) Primitives exclusive to first input
comparison.only_b (integer) Primitives exclusive to second input
comparison.overlap_coefficient (float) Overlap coefficient (shared / min), 0-1
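
The two ratios follow directly from the three counts: Jaccard is shared divided by the union, and the overlap coefficient is shared divided by the smaller input's primitive count. A small worked sketch, using the counts from the abbreviated pair-mode example response in the /v1/pipeline section. The function name overlap_metrics and the choice to return 0.0 for empty inputs are assumptions, not engine behaviour.

```python
def overlap_metrics(shared: int, only_a: int, only_b: int) -> dict:
    """Compute set-overlap ratios from shared and exclusive primitive counts."""
    union = shared + only_a + only_b
    smaller = min(shared + only_a, shared + only_b)
    return {
        "jaccard": shared / union if union else 0.0,
        "overlap_coefficient": shared / smaller if smaller else 0.0,
    }

m = overlap_metrics(shared=38, only_a=4, only_b=6)
print(round(m["jaccard"], 3))              # 0.792  (38 / 48)
print(round(m["overlap_coefficient"], 3))  # 0.905  (38 / 42)
```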

Contract metadata

effective_symbol_length_mode (string) Method-layer selected mode applied across core/analysis/comparison sections
trust_boundary (object) Ledger-context notes for core/universal/analytics/comparison/replay compatibility
verification_note (string) Clarifies success vs replay_valid semantics, especially when verify=false

cURL example

curl -X POST https://api.spectrengine.com/v1/engine \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "SGVsbG8gV29ybGQ="}'

Python example

import base64, os, requests

API_KEY = os.environ["UFM_API_KEY"]   # read the key from the environment, not from source
BASE    = "https://api.spectrengine.com"

resp = requests.post(
    f"{BASE}/v1/engine",
    json={"data_b64": base64.b64encode(b"Hello World").decode()},
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
result = resp.json()

# Identity
print(result["core"]["seed"])           # deterministic structural identity
print(result["core"]["status"])         # "NOVELTY" or "REPLAY"
print(result["core"]["replay_valid"])   # True when replay(ingest(x)) == x

# Quality gates
print(result["universal"]["quality"])   # {replay_valid, discovery_rate, reuse_ratio, deterministic}

# Structure
print(len(result["timeline"]["segments"]))   # temporal segments found
print(result["frequency"]["total_primitives"])  # unique primitives
print(result["discovery"]["discovery_rate"])    # fraction of novel primitives

Example response (abbreviated)

{
  "core": {
    "seed": 8472619,
    "status": "NOVELTY",
    "discovery_rate": 1.0,
    "symbol_length": 14,
    "primitive_count": 42,
    "timeline_length": 42,
    "reuse_ratio": 0.0,
    "replay_valid": true,
    "signature": "G4:0.707:0.500:0.500:6:1.000:0.120|G8:..."
  },
  "universal": {
    "success": true,
    "quality": {
      "replay_valid": true, "discovery_rate": 1.0,
      "reuse_ratio": 0.0, "deterministic": true
    },
    "stages_completed": ["VALIDATE","METRICS","STRATEGY","EXECUTE","VERIFY","ADAPT","OUTPUT"],
    "metrics": {"byte_entropy": 3.459, "unique_byte_ratio": 0.727, "size_class": "small"},
    "strategy": {"symbol_length_mode": "auto_curve", "zero_point": true},
    "violations": []
  },
  "timeline": { "acf": [1.0, 0.02, -0.04, ...], "segments": [...], "transitions": [] },
  "frequency": { "histogram": [...], "top_n": [...], "total_primitives": 42 },
  "discovery": { "discovery_sequence": [0, 1, 2, ...], "discovery_rate": 1.0 },
  "effective_symbol_length_mode": "auto_curve",
  "trust_boundary": { ... },
  "verification_note": "..."
}

API tiers

Tier 1: this endpoint. One call, all structural analysis layers. The default for single-input analysis. Returns identity, quality, timeline, frequency, and discovery data.

Tier 2: /v1/pipeline. The default when comparing two inputs. Runs core ingestion plus structural comparison and semantic noise detection (identifies BOM, CRLF, and encoding artefacts).

Tier 3: granular endpoints. Individual layer endpoints for fine-grained control: /v1/process for lightweight ingestion, /v1/compare for structural comparison without semantic analysis, or any /v1/analyze/* endpoint individually.

POST /v1/pipeline

Default for comparing two inputs. This is the recommended endpoint when you have a source and target to compare. It runs core ingestion plus structural overlap analysis and semantic noise detection, identifying encoding artefacts (BOM, line endings, Base64) that make data look different while being functionally identical.

With one input: returns core analysis only (same as /v1/process). With two inputs (source + target): returns core analysis, structural comparison (shared/unique primitives, Jaccard similarity), and semantic noise analysis (noise classes, convergence check, anti-drift governance gates).

Unlike /v1/engine, this endpoint does not run timeline, frequency, or discovery analysis. It focuses on answering: “Are these two inputs structurally the same, and what noise accounts for any differences?”

Auth required | 10 req/min | Scope: write

Request body

data_b64 (string) Primary input, base64-encoded
target_b64 (string, optional) Second input for comparison/noise analysis
symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

core (object) Core analysis: seed, status, discovery_rate, symbol_length, primitive_count, timeline_length, reuse_ratio, replay_valid, signature
compare (object | null) Structural comparison (present only when target given)
semantic (object | null) Semantic noise analysis (present only when target given)
effective_symbol_length_mode (string) Method-layer selected mode applied across core/compare sections
trust_boundary (object) Ledger-context notes for core/compare/semantic/replay compatibility
engine_version (string) Engine version

Python example: pair mode

import base64, os, requests

BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]

resp = requests.post(
    f"{BASE}/v1/pipeline",
    json={
        "data_b64": base64.b64encode(b"original file").decode(),
        "target_b64": base64.b64encode(b"modified file").decode(),
    },
    headers={"X-Api-Key": API_KEY},
    timeout=60,
)
resp.raise_for_status()
result = resp.json()
print(f"Core seed: {result['core']['seed']}")
print(f"Jaccard: {result['compare']['jaccard']}")
print(f"Converges: {result['semantic']['converges']}")
print(f"Noise: {result['semantic']['noise_units']}")

Example response (pair mode, abbreviated)

{
  "core": {
    "seed": 8472619, "status": "NOVELTY", "discovery_rate": 0.85,
    "replay_valid": true, "signature": "G4:0.707:..."
  },
  "compare": {
    "shared_primitives": 38, "only_a": 4, "only_b": 6,
    "jaccard": 0.792, "overlap_coefficient": 0.905
  },
  "semantic": {
    "status": "ok",
    "converges": true,
    "noise_units": [
      {
        "noise_type": "LineEndingCrlf", "layer": "semantic",
        "function": "normalize", "operation": "strip_cr", "confidence": 1.0
      }
    ],
    "decision_hash": "a1b2c3d4...",
    "gates": {
      "coherence": true, "consistency": true,
      "causality": true, "persistence": true
    }
  },
  "effective_symbol_length_mode": "auto_curve",
  "trust_boundary": { ... },
  "engine_version": "3.0-rust"
}

converges: true means that after stripping the detected noise, the source and target are structurally identical. The noise_units array lists each artefact that was filtered. If converges is false, the differences are structural, not just noise.
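
The interpretation above can be turned into a small decision helper. This sketch reads only fields documented for /v1/pipeline (semantic, converges). The function name classify_pair and the three labels are hypothetical, and the sample dict is hand-written for illustration rather than real engine output.

```python
from typing import Optional

def classify_pair(result: dict) -> str:
    """Summarise a /v1/pipeline response as one of three outcomes."""
    semantic: Optional[dict] = result.get("semantic")
    if semantic is None:
        return "single-input"       # no target_b64 was sent
    if semantic.get("converges"):
        return "noise-only"         # differences are explained by noise_units
    return "structural-change"      # differences remain after noise removal

sample = {
    "compare": {"jaccard": 0.792},
    "semantic": {
        "converges": True,
        "noise_units": [{"noise_type": "LineEndingCrlf"}],
    },
}
print(classify_pair(sample))                          # noise-only
print(classify_pair({"core": {}, "semantic": None}))  # single-input
```

A caller might alert only on structural-change and log noise-only results together with their noise_units.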

Licences

Licences are signed 30-day tokens that unlock tier-scoped features such as ben.ask and bob.query. The local wheel caches the signed token during python -m ufm activate, then checks it locally on normal engine calls.

Auth required

POST /v1/licences/verify

Mint or refresh a signed licence token for the caller's API key. Authenticated with X-Api-Key (not JWT). Rate-limited to 60 requests per hour.

Response:

token.payload.public_id (string) Public id from the API key used for activation
token.payload.scopes (string[]) Granted local feature scopes, for example ben.ask and bob.query
token.payload.tier (string) Customer tier
token.payload.expires_at (string (ISO 8601)) Token expiry timestamp
token.signature (string) Base64 signature over the payload

GET /v1/licences/me

Return the caller's current licence status without minting a new token.

active (boolean) True iff a non-revoked, non-expired licence exists
tier (string) Subscription tier (e.g. standard)
expires_at (string (ISO 8601) or null) Expiry timestamp of the active licence, or null if none exists
revoked (boolean) True if the licence has been administratively revoked

POST /v1/licences/revoke/{licence_id}

Admin-only. Revoke a specific licence by its UUID. Returns { revoked: true, licence_id }.

Local activation: customers normally do not call these endpoints by hand. The wheel calls /verify during python -m ufm activate ufm_live_... and then ufm.licence_status() reads the cached status locally.

Python example

import os
import requests

# X-Api-Key authentication, not JWT
headers = {"X-Api-Key": os.environ["UFM_API_KEY"]}
resp = requests.post(f"{BASE}/v1/licences/verify", headers=headers)
token = resp.json()["token"]

print(token["payload"]["tier"])
print(token["payload"]["scopes"])

status = requests.get(f"{BASE}/v1/licences/me", headers=headers).json()
print(status["active"], status["expires_at"])
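
Because tokens last 30 days, a client may want to decide when to call /v1/licences/verify again. The sketch below works on the documented /v1/licences/me fields only. needs_refresh and the three-day margin are illustrative choices, and treating an offset-less timestamp as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_expiry(expires_at: str) -> datetime:
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def needs_refresh(status: dict, now: datetime,
                  margin: timedelta = timedelta(days=3)) -> bool:
    """True when the licence is inactive, revoked, or expires within `margin`."""
    if not status["active"] or status["revoked"] or status["expires_at"] is None:
        return True
    return parse_expiry(status["expires_at"]) - now <= margin


# Sample shaped like a /v1/licences/me response (values are made up).
sample_status = {
    "active": True,
    "tier": "standard",
    "expires_at": "2025-07-01T00:00:00Z",
    "revoked": False,
}
print(needs_refresh(sample_status, datetime(2025, 6, 15, tzinfo=timezone.utc)))
print(needs_refresh(sample_status, datetime(2025, 6, 29, tzinfo=timezone.utc)))
```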

LLM credentials

Store your Anthropic, OpenAI, or Google Gemini API key for Ben and Bob. The server encrypts keys at rest. You can also manage these from the dashboard Settings page.

Auth required

GET /v1/me/llm-credentials

Returns whether a key is configured and the chosen provider and model name. Never returns the secret.

configured (boolean) True when this account has stored LLM credentials
provider (string | null) Stored provider: "anthropic", "openai", or "gemini"
model (string | null) Configured model name, or the provider default

PUT /v1/me/llm-credentials

provider (string) One of: "anthropic", "openai", "gemini"
api_key (string) Provider API key (required on every update)
model (string, optional) Model id; omit for the provider default

If the server is missing LLM_CREDENTIALS_ENCRYPTION_KEY, PUT returns 503 and keys cannot be stored.

DELETE /v1/me/llm-credentials

Removes stored credentials for the current user.

Python example

import os
import requests

headers = {"X-Api-Key": API_KEY}

before = requests.get(
    f"{BASE}/v1/me/llm-credentials",
    headers=headers,
).json()
print(before["configured"])

r = requests.put(
    f"{BASE}/v1/me/llm-credentials",
    json={
        "provider": "openai",
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "gpt-4o",
    },
    headers=headers,
)
print(r.json())

removed = requests.delete(
    f"{BASE}/v1/me/llm-credentials",
    headers=headers,
).json()
print(removed["configured"])

POST /v1/ben/ask

Send a message to Ben, the UFM research assistant. Conversations are persisted per-user, so pass the returned conversation_id when you want a follow-up turn in the same conversation.

Auth required · LLM key in Settings

Save a provider API key under Settings or call PUT /v1/me/llm-credentials before this endpoint. Locally, use BenSession or ask_ben with an LLM backend.

Request body

message (string) Your question or message for Ben (1-20,000 chars)
conversation_id (string, optional) UUID of an existing conversation to continue

Response

conversation_id (string) UUID of the conversation (new or existing)
conversation_title (string) Auto-generated title from the first message
response (string) Ben's assistant response text
token_estimate (integer) Conservative token estimate for the exchange
token_warning (boolean) True when token estimate exceeds 6,000
user_message (object) Persisted user message with id, content, created_at
assistant_message (object) Persisted assistant message with id, content, created_at

Python example

import requests

# Start a new conversation
resp = requests.post(
    f"{BASE}/v1/ben/ask",
    json={"message": "What is UFM actually doing?"},
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["response"])
conversation_id = body["conversation_id"]

# Continue the conversation
resp = requests.post(
    f"{BASE}/v1/ben/ask",
    json={
        "message": "How does replay work?",
        "conversation_id": conversation_id,
    },
    headers={"X-Api-Key": API_KEY},
)
print(resp.json()["response"])
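
The documented request limits (message of 1-20,000 characters, conversation_id as a UUID) can be checked client-side before spending a request. A minimal sketch; build_ben_request is a hypothetical helper, not part of any UFM client library.

```python
import uuid
from typing import Optional

MAX_MESSAGE_CHARS = 20_000  # documented upper bound for `message`


def build_ben_request(message: str, conversation_id: Optional[str] = None) -> dict:
    """Build a /v1/ben/ask JSON body, rejecting inputs the docs rule out."""
    if not 1 <= len(message) <= MAX_MESSAGE_CHARS:
        raise ValueError(
            f"message must be 1-{MAX_MESSAGE_CHARS} characters, got {len(message)}"
        )
    body = {"message": message}
    if conversation_id is not None:
        uuid.UUID(conversation_id)  # raises ValueError if this is not a UUID
        body["conversation_id"] = conversation_id
    return body


print(build_ben_request("What is UFM actually doing?"))
```

Pass the returned dict as json= in the requests.post call shown above.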

GET /v1/ben/ledger/status

Returns Ben's projection ledger health, skill index, and last-updated timestamp. Use it as a product health check before starting a user session.

projections (object) Projection file counts and SHA-256 values
skill_index (object) Capability count and capability names
health (object) Open contradiction and failed pattern recurrence counts
last_updated_at (string | null) Timestamp of the loaded Ben data

status = requests.get(
    f"{BASE}/v1/ben/ledger/status",
    headers={"X-Api-Key": API_KEY},
).json()
print(status["skill_index"]["capability_count"])

POST /v1/bob/query

Run Bob's grounded query pipeline: tokenise the question, apply claim gates, retrieve replay-anchored evidence, monitor out-of-vocabulary terms, and return audit output.

Auth required · LLM key in Settings · Advisory mode only

Hosted Bob requires saved LLM credentials. Local Bob can run the corpus, gate, OOV, and audit pipeline without a backend; pass a backend locally when you want generated response wording.

Request body

question (string) User question for Bob (1-20,000 chars)
candidate_response (string, optional) Draft response to monitor and gate
mode (string, optional) Only "advisory" is accepted; default is "advisory"
governed (boolean, optional) Governed-path metadata flag; gate outcomes still annotate the response
max_anchors (integer, optional) Maximum replay anchors returned (1-20, default: 5)

Response highlights

response (string) Final response after gate and OOV policy
tokenise (object) Query seed, status, discovery_rate, and query_oov_ratio
gate.status (string) Claim gate result: "PASS", "WARN", or "BLOCK"
evidence (object[]) Replay-anchored evidence rows with source_id, source_path, seed, claim_ids, phase_numbers, and snippet
oov_classification (string) OOV policy result: "PASS", "WARN", or "BLOCK"
oov_metrics (object) Out-of-vocabulary metrics used by the policy
audit (object) DecisionPipeline audit status, decision_hash, stage, reason, and gates
boundary_flags (string[]) User-visible boundary indicators

Python example

import requests

resp = requests.post(
    f"{BASE}/v1/bob/query",
    json={
        "question": "Explain C-CORE-002 replay invariant.",
        "governed": True,
        "max_anchors": 5,
    },
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["gate"]["status"], body["oov_classification"])
print(body["boundary_flags"])
print(body["response"])
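
The response carries two independent three-level results, gate.status and oov_classification. If your application needs a single decision, one option is to take the stricter of the two. This is a sketch of that convention only: overall_verdict and the worst-of rule are assumptions, not something the API computes for you.

```python
# Ordering used to compare the documented values "PASS", "WARN", "BLOCK".
SEVERITY = {"PASS": 0, "WARN": 1, "BLOCK": 2}


def overall_verdict(body: dict) -> str:
    """Return the stricter of the claim-gate and OOV policy results."""
    statuses = [body["gate"]["status"], body["oov_classification"]]
    return max(statuses, key=SEVERITY.__getitem__)


# Abbreviated sample shaped like a /v1/bob/query response.
sample = {"gate": {"status": "PASS"}, "oov_classification": "WARN"}
print(overall_verdict(sample))
```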

POST /v1/bob/ask and POST /v1/bob/chat

These routes run Bob as a persisted chat conversation. /v1/bob/ask is the primary chat route; /v1/bob/chat is a backward-compatible alias with the same behavior.

message (string) Message for Bob (1-20,000 chars)
conversation_id (string, optional) Existing conversation id to continue
mode (string, optional) Only "advisory" is accepted
governed (boolean, optional) Governed-path metadata flag

chat = requests.post(
    f"{BASE}/v1/bob/ask",
    json={"message": "What does the replay invariant mean?"},
    headers={"X-Api-Key": API_KEY},
).json()
print(chat["response"])
print(chat["query"]["gate"]["status"])

GET /v1/bob/ledger/status

Returns Bob's corpus status, session decision count, OOV thresholds, whether reinforcement is enabled, and engine version.

corpus (object) Source count, claim count, built_at, and corpus_index_sha256
session_decisions (object) Total decisions and last decision timestamp
oov (object) Warn/block thresholds, symbol length, and calibration status
reinforce_enabled (boolean) Whether admin reinforcement is enabled
engine_version (string) Engine version string

status = requests.get(
    f"{BASE}/v1/bob/ledger/status",
    headers={"X-Api-Key": API_KEY},
).json()
print(status["corpus"]["claim_count"])
print(status["oov"]["warn_threshold"])

POST /v1/process

Ingest binary data and persist it to your ledger. Returns structural analysis including a deterministic seed, discovery rate, and replay validity. The seed can be used with /v1/replay/{seed} to retrieve the original data later.

When to use this vs /v1/engine: Use /v1/process when you need persisted ingestion (data stored for later replay). Use /v1/engine when you want a full structural analysis without persistence.

Auth required · 30 req/min · Scope: write

Request body

data_b64 (string) Base64-encoded binary data to ingest
symbol_length_mode (string, optional) Engine mode. Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64

Response fields

seed (integer) Deterministic seed derived from geometric signature and primitive count
status (string) "NOVELTY" if new structure discovered, "REPLAY" if all primitives already known
discovery_rate (float) Fraction of primitives that were novel (0.0 to 1.0)
symbol_length (integer) Symbol length used (fixed modes return the requested width)
primitive_count (integer) Total unique primitives in the decomposition
timeline_length (integer) Number of entries in the ingestion timeline
reuse_ratio (float) Fraction of timeline entries that reused existing primitives
replay_valid (boolean) Whether replay(ingest(data)) == data
signature (string) Multi-scale geometric signature encoding the structural profile (see below)
selection_model (object | null) Selection statistics: total_selections, selections_created, selections_reused

Signature format (structural profile)

The signature field encodes the geometric profile at multiple scale windows. Each scale produces a component in the format:

G{W}:{spread}:{sx}:{sy}:{mass}:{centroid}:{topo}

Where W is the scale width and the six values are:

spread (float) RMS radius from centroid - overall spatial extent
sx (float) X-axis standard deviation
sy (float) Y-axis standard deviation
mass (integer) Count of set bits at this scale
centroid (float) Sum of centroid coordinates (cx + cy)
topo (float) Topological score with pre-bits weighting

Scale windows are [4, 8, 16, 32, 64] for inputs over 16 bits. Components are separated by |. Parse these to build a per-scale structural profile for cross-input comparison.
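
A parser for this format might look like the sketch below. It follows the G{W}:{spread}:{sx}:{sy}:{mass}:{centroid}:{topo} layout and the | separator described above. ScaleComponent and parse_signature are hypothetical names, and the sample string is made up for illustration rather than real engine output.

```python
from typing import NamedTuple


class ScaleComponent(NamedTuple):
    width: int       # W, the scale width
    spread: float    # RMS radius from centroid
    sx: float        # X-axis standard deviation
    sy: float        # Y-axis standard deviation
    mass: int        # count of set bits at this scale
    centroid: float  # cx + cy
    topo: float      # topological score


def parse_signature(signature: str) -> list[ScaleComponent]:
    """Split a signature into one ScaleComponent per scale window."""
    components = []
    for part in signature.split("|"):
        fields = part.split(":")
        if len(fields) != 7 or not fields[0].startswith("G"):
            raise ValueError(f"unexpected signature component: {part!r}")
        components.append(ScaleComponent(
            width=int(fields[0][1:]),
            spread=float(fields[1]),
            sx=float(fields[2]),
            sy=float(fields[3]),
            mass=int(fields[4]),
            centroid=float(fields[5]),
            topo=float(fields[6]),
        ))
    return components


# Made-up two-scale signature, for illustration only.
profile = parse_signature("G4:0.707:0.5:0.5:3:1.5:0.25|G8:1.2:0.8:0.9:7:3.0:0.5")
for component in profile:
    print(component.width, component.spread, component.mass)
```

Indexing the result by width gives a per-scale profile that can be compared across inputs field by field.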

Python example - fixed-32 mode with signature parsing

import base64
import requests

API_KEY = "ufm_live_a1b2c3d4.your_secret_here"
BASE    = "https://api.spectrengine.com"

data = b"Hello World"
resp = requests.post(
    f"{BASE}/v1/process",
    json={
        "data_b64": base64.b64encode(data).decode(),
        "symbol_length_mode": "fixed32",
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Seed: {result['seed']}")
print(f"Status: {result['status']}")
print(f"Discovery rate: {result['discovery_rate']:.4f}")
print(f"Symbol length: {result['symbol_length']}")
print(f"Signature: {result['signature']}")

File input helper

import base64
import pathlib
import requests

data = pathlib.Path("myfile.bin").read_bytes()
resp = requests.post(
    f"{BASE}/v1/process",
    json={
        "data_b64": base64.b64encode(data).decode(),
        "symbol_length_mode": "auto_curve",
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["seed"], result["status"], result["replay_valid"])

POST /v1/ingest/async

Queue a payload for ingestion by the background worker instead of waiting for the engine to run inline. Returns 202 Accepted with a job_id that you can poll with GET /v1/ingests/{job_id}.

Auth required · Scope: write

Request body

data_b64 (string) Raw bytes, base64-encoded (up to 100 MB)
symbol_length_mode (string, optional) Default: "auto_curve"

The response is returned as soon as the job row is written, not when the engine finishes. Use /v1/ingests/{job_id} to check status.

Response

job_id (string (UUID)) The ingest_jobs row id
status (string) Always "queued" on success

Python example

import base64
import requests

queued = requests.post(
    f"{BASE}/v1/ingest/async",
    json={"data_b64": base64.b64encode(b"large payload").decode()},
    headers={"X-Api-Key": API_KEY},
).json()

print(queued["job_id"], queued["status"])

GET /v1/ingests/{job_id}

Poll the status of one of your async ingest jobs. Returns 404 if the job does not exist or belongs to another user.

Auth required · Scope: read

Response fields

job_id (string (UUID)) The ingest_jobs row id
status (string) "queued" | "running" | "ok" | "error"
symbol_length_mode (string) Mode used for this job
payload_bytes (integer) Byte count of the decoded payload
retries (integer) Retry count to date
queued_at (timestamp) When the job was accepted
started_at (timestamp | null) Worker pickup time
completed_at (timestamp | null) Worker finish time
result (object | null) Engine result dict when status is 'ok' (matches /v1/process response)
error (string | null) Error message when status is 'error'

Python example

job_id = queued["job_id"]
job = requests.get(
    f"{BASE}/v1/ingests/{job_id}",
    headers={"X-Api-Key": API_KEY},
).json()

print(job["status"])
if job["status"] == "ok":
    print(job["result"]["seed"])
elif job["status"] == "error":
    print(job["error"])
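
In practice you poll until the job reaches a terminal status ("ok" or "error"). The sketch below takes the fetch step as a callable so the loop itself can be exercised without a network. wait_for_ingest, the 60-second timeout, and the 2-second interval are illustrative choices, not documented recommendations.

```python
import time
from typing import Callable


def wait_for_ingest(fetch_job: Callable[[], dict],
                    timeout_s: float = 60.0,
                    interval_s: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_job() until status is 'ok' or 'error', or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job["status"] in ("ok", "error"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout_s}s")
        sleep(interval_s)


# Offline demonstration with canned statuses instead of HTTP calls.
canned = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "ok", "result": {"seed": 1}},
])
final = wait_for_ingest(lambda: next(canned), sleep=lambda seconds: None)
print(final["status"])
```

Against the hosted API, fetch_job would wrap the GET request above, for example lambda: requests.get(f"{BASE}/v1/ingests/{job_id}", headers={"X-Api-Key": API_KEY}).json().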

GET /v1/ingests

List your own async ingest jobs in reverse-chronological order.

Auth required · Scope: read

Query parameters

limit (integer, optional) Max rows to return (1-100). Default: 100
before (timestamp, optional) Cursor: return rows queued before this time

Response fields

jobs (object[]) Array of job views; see /v1/ingests/{job_id} for field details

Python example

jobs = requests.get(
    f"{BASE}/v1/ingests",
    params={"limit": 25},
    headers={"X-Api-Key": API_KEY},
).json()["jobs"]

for job in jobs:
    print(job["job_id"], job["status"], job["queued_at"])
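
To walk the full history, feed the oldest queued_at from each page back in as the before cursor. The sketch below assumes before is exclusive and that an empty jobs array marks the end; neither detail is stated above, so verify both before relying on it. iter_jobs and fetch_page are hypothetical names.

```python
from typing import Callable, Iterator, Optional


def iter_jobs(fetch_page: Callable[[Optional[str]], list]) -> Iterator[dict]:
    """Yield jobs newest-first, following the `before` cursor page by page."""
    before = None
    while True:
        jobs = fetch_page(before)
        if not jobs:
            return
        yield from jobs
        next_before = jobs[-1]["queued_at"]
        if next_before == before:
            return  # cursor did not advance; stop rather than loop forever
        before = next_before


# Offline demonstration: canned pages keyed by the cursor value.
PAGES = {
    None: [
        {"job_id": "c", "queued_at": "2025-01-03T00:00:00Z"},
        {"job_id": "b", "queued_at": "2025-01-02T00:00:00Z"},
    ],
    "2025-01-02T00:00:00Z": [
        {"job_id": "a", "queued_at": "2025-01-01T00:00:00Z"},
    ],
    "2025-01-01T00:00:00Z": [],
}
job_ids = [job["job_id"] for job in iter_jobs(lambda before: PAGES[before])]
print(job_ids)
```

A real fetch_page would issue GET /v1/ingests with params={"limit": 100} plus before when it is not None, and return the jobs array.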

POST /v1/reconstruct

Verify that data survives a full ingest-replay round trip. Returns whether the reconstruction is lossless.

Auth required · 30 req/min · Scope: write

Request body

data_b64 (string) Base64-encoded binary data to verify

Response fields

valid (boolean) true if replay(ingest(data)) == data exactly

Python example

import base64, requests

resp = requests.post(
    f"{BASE}/v1/reconstruct",
    json={"data_b64": base64.b64encode(b"test data").decode()},
    headers={"X-Api-Key": API_KEY},
)
print(resp.json())  # {'valid': True}

GET /v1/replay/{seed}

Replay previously ingested data by its seed. Returns the reconstructed byte chunks as base64-encoded strings.

Auth required · 60 req/min

Path parameters

seed (integer) Seed returned from a persisted ingestion path (typically /process or /process/universal)

Response fields

seed (integer) Echo of the requested seed
chunks_b64 (string[]) Reconstructed data chunks, each base64-encoded

Python example

import base64, requests

seed = 8472619  # from a previous /process response

resp = requests.get(
    f"{BASE}/v1/replay/{seed}",
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
for chunk_b64 in result["chunks_b64"]:
    print(base64.b64decode(chunk_b64))

JavaScript example

const seed = 8472619;
const resp = await fetch(`${BASE}/v1/replay/${seed}`, {
  headers: { "X-Api-Key": API_KEY },
});
const result = await resp.json();
const chunks = result.chunks_b64.map(chunk => {
  const raw = atob(chunk);
  return Uint8Array.from(raw, c => c.charCodeAt(0));
});
console.log(chunks);
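
If you need the payload as a single byte string, the decoded chunks can be joined in Python. This sketch assumes chunks_b64 is ordered and that concatenating the decoded chunks yields the original input; the response description above does not spell that out. join_chunks is a hypothetical helper.

```python
import base64


def join_chunks(chunks_b64: list[str]) -> bytes:
    """Decode each chunk and concatenate them in the order returned."""
    return b"".join(base64.b64decode(chunk) for chunk in chunks_b64)


# Sample shaped like a /v1/replay/{seed} response (values are made up).
sample = {
    "seed": 8472619,
    "chunks_b64": [
        base64.b64encode(b"Hello ").decode(),
        base64.b64encode(b"World").decode(),
    ],
}
payload = join_chunks(sample["chunks_b64"])
print(payload)
```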

POST /v1/compare

Compare two data items by structural primitive overlap. Returns Jaccard similarity, shared/exclusive primitive counts, and geometric signatures for both inputs.

Auth required · 30 req/min · Scope: write

Request body

data_a_b64 (string) First input, base64-encoded
data_b64 (string) Second input, base64-encoded
symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

shared_primitives (integer) Primitives present in both inputs
only_a (integer) Primitives exclusive to first input
only_b (integer) Primitives exclusive to second input
jaccard (float) Jaccard similarity (shared / union)
overlap_coefficient (float) Overlap coefficient (shared / min)
signature_a (string) Geometric signature for first input
signature_b (string) Geometric signature for second input
discovery_rate_a (float) Discovery rate for first input
discovery_rate_b (float) Discovery rate for second input
symbol_length_a (integer) Symbol length selected for first input
symbol_length_b (integer) Symbol length selected for second input

Python example

import base64
import requests

resp = requests.post(
    f"{BASE}/v1/compare",
    json={
        "data_a_b64": base64.b64encode(b"Hello").decode(),
        "data_b64": base64.b64encode(b"World").decode(),
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["jaccard"])
print(result["shared_primitives"])
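
Both similarity scores follow from the three counts, which is useful for sanity-checking a response. The sketch below applies the definitions from the field list (shared / union and shared / min) to the counts in the pair-mode example response earlier in these docs. Returning 0.0 for empty inputs is an assumption; the engine's behaviour in that case is not documented here.

```python
def jaccard(shared: int, only_a: int, only_b: int) -> float:
    """shared / |A union B|, where the union is shared + only_a + only_b."""
    union = shared + only_a + only_b
    return shared / union if union else 0.0  # assumption for empty inputs


def overlap_coefficient(shared: int, only_a: int, only_b: int) -> float:
    """shared / min(|A|, |B|), where |A| = shared + only_a."""
    smaller = min(shared + only_a, shared + only_b)
    return shared / smaller if smaller else 0.0  # assumption for empty inputs


# Counts from the pair-mode example: shared=38, only_a=4, only_b=6.
print(round(jaccard(38, 4, 6), 3))              # 0.792
print(round(overlap_coefficient(38, 4, 6), 3))  # 0.905
```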

POST /v1/structural_profile

Compute the canonical 5D structural complexity profile for one input at a fixed symbol width. This is the engine-side reference implementation of the Phase-94 profile used by the C-DISCOV-045 claim test.

Returns alpha (vocabulary growth exponent), s_zipf (Zipf exponent of the frequency distribution), v_size (vocabulary size), reuse (1 − v_size / n_syms), and discovery_integral (count of unique symbols encountered).

Auth required · Scope: write

Request body

data_b64 (string) Raw bytes, base64-encoded
symbol_width (integer, optional) Symbol width in bits, 1-128. Default: 16
sizes_for_alpha (integer[], optional) Prefix sizes in bytes for the growth-exponent fit. Default: [512, 1024, 2048, 4096, 8192, 16384, 32768]

Response fields

alpha (float) Vocabulary growth exponent (log-log fit of |V| vs N)
s_zipf (float) Zipf exponent of the frequency distribution
v_size (integer) Vocabulary size (distinct symbols at this width)
reuse (float) 1 - (v_size / n_syms)
discovery_integral (integer) Count of unique symbols encountered
symbol_width (integer) Symbol width used for this call
engine (string) Always "ufm_core_structural_profile" on the native engine path

cURL example

curl -X POST https://api.spectrengine.com/v1/structural_profile \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"data_b64": "QVRDR0FUQ0dBVENHQVRDRw==", "symbol_width": 16}'

Python example: fixed-16 and fixed-32 per window

import base64, requests

def window_meta(window_bytes: bytes) -> dict:
    encoded = base64.b64encode(window_bytes).decode()
    out = {}
    for width in (16, 32):
        resp = requests.post(
            f"{BASE}/v1/structural_profile",
            json={"data_b64": encoded, "symbol_width": width},
            headers={"X-Api-Key": API_KEY},
        )
        out[f"meta_fixed{width}"] = resp.json()
    return out

The engine field reports "ufm_core_structural_profile" when the native engine computed the profile. A different value indicates a non-canonical fallback path; treat those numbers as unverified.
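
To build intuition for reuse = 1 - (v_size / n_syms), the arithmetic can be reproduced locally on a tiny input. This is not the engine's implementation: it assumes non-overlapping, byte-aligned symbols and drops any trailing partial symbol, so its numbers may differ from what /v1/structural_profile returns. local_reuse is a hypothetical name.

```python
import base64


def local_reuse(data: bytes, symbol_width: int = 16) -> dict:
    """Illustrative only: count distinct fixed-width symbols and derive reuse."""
    if symbol_width % 8:
        raise ValueError("this sketch only handles byte-aligned widths")
    step = symbol_width // 8
    # Assumption: symbols are consecutive, non-overlapping slices.
    symbols = [data[i:i + step] for i in range(0, len(data) - step + 1, step)]
    if not symbols:
        raise ValueError("input is shorter than one symbol")
    v_size = len(set(symbols))
    return {
        "n_syms": len(symbols),
        "v_size": v_size,
        "reuse": 1 - v_size / len(symbols),
    }


# Same payload as the cURL example: base64 of b"ATCGATCGATCGATCG".
data = base64.b64decode("QVRDR0FUQ0dBVENHQVRDRw==")
print(local_reuse(data, symbol_width=16))
```

Under these assumptions the 16-byte input splits into eight 2-byte symbols, of which only "AT" and "CG" are distinct.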

POST /v1/batch/process

Process multiple data items in a single request. Items are processed in parallel for maximum throughput. Maximum 100 items per request.

Auth required · 10 req/min · Scope: write

Request body

items_b64 (string[]) Array of base64-encoded data items (1-100)
symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

results (object[]) Array of results (same fields as /v1/process)
total_items (integer) Number of items processed
processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

items = [b"file_one", b"file_two", b"file_three"]
resp = requests.post(
    f"{BASE}/v1/batch/process",
    json={"items_b64": [base64.b64encode(i).decode() for i in items]},
    headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
    print(f"Seed {r['seed']}: {r['status']} (rate={r['discovery_rate']:.2f})")
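
With more than 100 items, split the list into several requests. The sketch below only builds the request bodies; batch_payloads is a hypothetical helper, and pacing the resulting calls to stay within the rate limit is left to the caller.

```python
import base64
from typing import Iterator, Sequence

MAX_ITEMS = 100  # documented per-request limit for /v1/batch/process


def batch_payloads(items: Sequence[bytes], size: int = MAX_ITEMS) -> Iterator[dict]:
    """Yield one JSON body per group of at most `size` items, in input order."""
    for start in range(0, len(items), size):
        group = items[start:start + size]
        yield {"items_b64": [base64.b64encode(item).decode() for item in group]}


# 250 small items become three request bodies: 100, 100, and 50 items.
items = [f"file_{n}".encode() for n in range(250)]
payloads = list(batch_payloads(items))
print([len(p["items_b64"]) for p in payloads])
```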

POST /v1/batch/engine

Run the full engine pipeline on up to 50 items in a single request. Returns core, universal, timeline, frequency, discovery, and optional comparison for each item. Per-item error handling: one bad item does not abort the batch.

Auth required · Scope: write · Max 50 items

Use this instead of calling /v1/engine in a loop. A 30-item sweep that previously required 30+ individual calls becomes a single request.

Request body

items (object[]) Array of engine items (1-50)
items[].data_b64 (string) Input data, base64-encoded
items[].compare_b64 (string, optional) Comparison target, base64-encoded
items[].symbol_length_mode (string, optional) Default: "auto_curve"
items[].verify (boolean, optional) Enable replay verification (default: true)
items[].max_lag (integer, optional) ACF max lag (default: 50)
items[].top_n (integer, optional) Top-N primitives (default: 20)
include (string[], optional) Layers to include. Default: all. Options: "core", "universal", "timeline", "frequency", "discovery", "comparison"

Response fields

results (object[]) Full engine result per item
results[].index (integer) Position in input array
results[].core (object | null) Core ingestion result
results[].universal (object | null) Universal pipeline result
results[].timeline (object | null) Timeline analysis
results[].frequency (object | null) Frequency analysis
results[].discovery (object | null) Discovery sequence
results[].comparison (object | null) Structural comparison (if compare_b64 given)
results[].error (string | null) Error message if this item failed
total_items (integer) Number of items submitted
succeeded (integer) Items processed successfully
failed (integer) Items that failed
processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

items = [b"data_one", b"data_two", b"data_three"]
resp = requests.post(
    f"{BASE}/v1/batch/engine",
    json={
        "items": [{"data_b64": base64.b64encode(i).decode()} for i in items],
        "include": ["core", "timeline"],  # skip layers you don't need
    },
    headers={"X-Api-Key": API_KEY},
)
data = resp.json()
print(f"Succeeded: {data['succeeded']}/{data['total_items']}")
for r in data["results"]:
    if r["error"]:
        print(f"  [{r['index']}] ERROR: {r['error']}")
    else:
        print(f"  [{r['index']}] seed={r['core']['seed']} dr={r['core']['discovery_rate']:.2f}")
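
Because failures are reported per item, a caller can collect just the failed inputs and resubmit them. A minimal sketch using the documented index and error fields; split_results and retry_items are hypothetical names, and whether a retry is worthwhile depends on the error.

```python
def split_results(response: dict) -> tuple[list, list]:
    """Separate a batch response into successful and failed result rows."""
    ok = [r for r in response["results"] if r.get("error") is None]
    failed = [r for r in response["results"] if r.get("error") is not None]
    return ok, failed


def retry_items(original_items: list, failed: list) -> list:
    """Pick the original inputs for the failed rows, using results[].index."""
    return [original_items[r["index"]] for r in failed]


# Abbreviated sample shaped like a /v1/batch/engine response.
inputs = [b"data_one", b"data_two", b"data_three"]
sample = {
    "results": [
        {"index": 0, "core": {"seed": 1}, "error": None},
        {"index": 1, "core": None, "error": "invalid base64"},
        {"index": 2, "core": {"seed": 3}, "error": None},
    ],
    "total_items": 3, "succeeded": 2, "failed": 1,
}
ok, failed = split_results(sample)
print(len(ok), retry_items(inputs, failed))
```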

POST /v1/batch/compare

Compare multiple data pairs structurally in a single request. Maximum 100 pairs. Per-pair error handling: one bad pair does not abort the batch.

Auth required · Scope: write · Max 100 pairs

Request body

pairs (object[]) Array of comparison pairs (1-100)
pairs[].data_a_b64 (string) First input, base64-encoded
pairs[].data_b64 (string) Second input, base64-encoded
pairs[].symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

results (object[]) Comparison result per pair
results[].index (integer) Position in input array
results[].shared_primitives (integer | null) Primitives in both inputs
results[].only_a (integer | null) Primitives exclusive to first input
results[].only_b (integer | null) Primitives exclusive to second input
results[].jaccard (float | null) Jaccard similarity
results[].overlap_coefficient (float | null) Overlap coefficient
results[].error (string | null) Error message if this pair failed
total_pairs (integer) Number of pairs submitted
succeeded (integer) Pairs compared successfully
failed (integer) Pairs that failed
processing_ms (integer) Total wall-clock processing time

Python example

import base64, requests

pairs = [(b"source_1", b"target_1"), (b"source_2", b"target_2")]
resp = requests.post(
    f"{BASE}/v1/batch/compare",
    json={
        "pairs": [
            {"data_a_b64": base64.b64encode(a).decode(),
             "data_b64": base64.b64encode(b).decode()}
            for a, b in pairs
        ],
    },
    headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
    if r["error"]:
        print(f"  [{r['index']}] ERROR: {r['error']}")
    else:
        print(f"  [{r['index']}] jaccard={r['jaccard']:.3f}")

POST /v1/noise/detect

Detect deterministic noise artefacts between a source and target byte pair. Identifies transformations like BOM insertion, line-ending changes, and Base64 encoding.

Auth required · 30 req/min

Request body

source_b64 (string) Original data, base64-encoded
target_b64 (string) Transformed data, base64-encoded

Response fields

converges (boolean) Whether source and target converge after normalisation
noise_units (object[]) Detected noise artefacts with type, layer, operation
decision_hash (string) Method-layer decision hash for audit-chain tracing
gates (object) Anti-drift gate results (coherence, consistency, causality, persistence)

Python example

import base64
import requests

source = b"Hello World\n"
target = b"Hello World\r\n"
result = requests.post(
    f"{BASE}/v1/noise/detect",
    json={
        "source_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["converges"])
print(result["noise_units"])

Compares "Hello World\n" with "Hello World\r\n". Detects a LineEndingCrlf noise artefact: the CRLF line ending is noise, not a structural change.
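
What "converges after normalisation" means for this pair can be illustrated locally. The function below is a stand-in for one normalisation step only (dropping the CR from CRLF), not the engine's normaliser, and the name strip_cr is borrowed from the operation shown in the pair-mode example response.

```python
def strip_cr(data: bytes) -> bytes:
    """Illustrative stand-in: turn CRLF line endings into LF."""
    return data.replace(b"\r\n", b"\n")


source = b"Hello World\n"
target = b"Hello World\r\n"

print(source == target)            # False: the raw bytes differ
print(strip_cr(target) == source)  # True: they match once the CR is removed
```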

POST /v1/noise/delta

Return the full 12-field noise delta between a source and target, plus a reversibility summary. Unlike /v1/noise/detect, every response includes source_span, target_span, primitive_span, and reversibility_meta.

Auth required · 30 req/min

Request body

source_b64 (string) Source bytes, base64-encoded
target_b64 (string) Target bytes, base64-encoded
enabled_noise_classes (string[], optional) Restrict detection to these class IDs
strict_allowlist (boolean, optional) If true, reject unknown class IDs. Default: false

Response fields

converges (boolean) normalize(source) == target
noise_units (object[]) Each unit includes all 12 engine fields
summary (object) { total_bytes_changed, noise_class_counts, reversibility_breakdown }
decision_hash (string) Method-layer decision hash
gates (object) Anti-drift gate results

Python example

import base64
import requests

source = b"Hello"
target = b"\xef\xbb\xbfHello"
delta = requests.post(
    f"{BASE}/v1/noise/delta",
    json={
        "source_b64": base64.b64encode(source).decode(),
        "target_b64": base64.b64encode(target).decode(),
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(delta["summary"])
print(delta["noise_units"])

GET /v1/noise/capabilities

List all available noise detection classes supported by the engine.

Auth required

Response fields

capabilities (object[]) Array of {id, label} for each noise class
engine_version (string) Engine version

Python example

caps = requests.get(
    f"{BASE}/v1/noise/capabilities",
    headers={"X-Api-Key": API_KEY},
).json()
print(caps["engine_version"])
print(caps["capabilities"])

# Example shape:
# {"capabilities": [{"id": "bom_utf8", "label": "BOM UTF-8"}, ...]}
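
The capability ids are the values accepted by enabled_noise_classes on /v1/noise/delta and /v1/semantic/analyze, so this list can be used to catch typos before a request is sent. A minimal sketch; unknown_classes is a hypothetical helper, and the sample ids are the two that appear elsewhere in these docs.

```python
def unknown_classes(requested: list[str], capabilities: dict) -> list[str]:
    """Return the requested class ids that the engine does not list."""
    known = {cap["id"] for cap in capabilities["capabilities"]}
    return [class_id for class_id in requested if class_id not in known]


# Sample shaped like a /v1/noise/capabilities response (abbreviated).
sample_caps = {
    "capabilities": [
        {"id": "bom_utf8", "label": "BOM UTF-8"},
        {"id": "line_ending_crlf", "label": "Line ending CRLF"},
    ],
    "engine_version": "3.0-rust",
}
print(unknown_classes(["bom_utf8", "bom_utf16"], sample_caps))
```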

POST /v1/semantic/analyze

Run full semantic analysis between two byte sequences. Determines whether the inputs converge (are structurally identical after noise removal) and classifies each noise artefact found. Supports policy control to restrict which noise classes are detected.

Use this when you need more control than /v1/pipeline provides, for example to restrict detection to specific noise classes or use strict allowlisting. For most pair comparisons, /v1/pipeline is simpler.

Auth required · 30 req/min · Scope: any authenticated context

Request body

source_b64 (string) Source data, base64-encoded
target_b64 (string) Target data, base64-encoded
enabled_noise_classes (string[], optional) Filter to specific noise class ids, for example ["bom_utf8", "line_ending_crlf"]
strict_allowlist (boolean, optional) If true, only detect listed classes. Default: false

Response fields

status (string) "ok" or "rejected" (if anti-drift gates fail)
converges (boolean) Whether normalised forms are identical
noise_units (object[]) Detected noise artefacts
decision_hash (string) Method-layer decision hash
gates (object) Anti-drift gate results

For example, comparing "Hello" with "Hello" prefixed by a UTF-8 BOM returns converges: true with a BomUtf8 noise unit: the data is functionally identical despite the byte-level difference.

Python example: strict allowlist

import base64, requests

resp = requests.post(
    f"{BASE}/v1/semantic/analyze",
    json={
        "source_b64": base64.b64encode(b"Hello").decode(),
        "target_b64": base64.b64encode(b"\xef\xbb\xbfHello").decode(),
        "enabled_noise_classes": ["bom_utf8"],  # Only detect BOM noise
        "strict_allowlist": True,
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["converges"])     # True
print(result["noise_units"])   # [{"noise_type": "BomUtf8", ...}]

GET /v1/corpus/list

List your recent ingests. This endpoint reads your own ledger only and never exposes another user’s data or Bob’s internal claim corpus.

Auth required · Scope: read

Query parameters

limit (integer, optional) Max rows to return (1-500). Default: 100
before (string, optional) Return rows created before this ISO-8601 timestamp

Response fields

records (object[]) Array of ingest records: request_id, endpoint, created_at, status_code, discovery_rate, symbol_length, primitive_count, timeline_length

Python example

records = requests.get(
    f"{BASE}/v1/corpus/list",
    params={"limit": 50},
    headers={"X-Api-Key": API_KEY},
).json()["records"]

for row in records:
    print(row["created_at"], row["endpoint"], row["status_code"])

POST /v1/corpus/build

Ingest multiple documents into your own ledger in a single call. Returns the per-document seeds, a pairwise Jaccard matrix across the documents, and an updated summary of your ledger.

Auth required · Scope: write

Request body

documents (object[]) 1-50 items of { data_b64 }
symbol_length_mode (string, optional) Default: "auto_curve"

Maximum 50 documents per call and 100 MB total across all documents.

Response fields

seeds (integer[]) Structural seed per document, in input order
jaccard_matrix (float[][]) Dense symmetric similarity matrix, diagonal = 1.0
ledger_summary (object) { primitive_count, timeline_length, discovery_rate, symbol_length, run_count }
processing_ms (integer) Total server-side processing time

Python example

import base64
import requests

docs = [b"first document", b"second document"]
body = {
    "documents": [
        {"data_b64": base64.b64encode(doc).decode()}
        for doc in docs
    ],
    "symbol_length_mode": "auto_curve",
}

result = requests.post(
    f"{BASE}/v1/corpus/build",
    json=body,
    headers={"X-Api-Key": API_KEY},
).json()

print(result["seeds"])
print(result["jaccard_matrix"])
print(result["ledger_summary"]["run_count"])
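
A common follow-up is to find which two documents are most alike. The sketch below scans the upper triangle of jaccard_matrix, relying on the documented symmetry and skipping the diagonal. most_similar_pair is a hypothetical helper and the matrix values are made up.

```python
from typing import Optional


def most_similar_pair(matrix: list[list[float]]) -> Optional[tuple[int, int, float]]:
    """Return (i, j, score) for the highest off-diagonal entry, or None."""
    best = None
    for i in range(len(matrix)):
        # The matrix is symmetric, so only entries with j > i are needed.
        for j in range(i + 1, len(matrix)):
            if best is None or matrix[i][j] > best[2]:
                best = (i, j, matrix[i][j])
    return best


# Made-up 3x3 matrix in the documented shape (symmetric, diagonal = 1.0).
sample_matrix = [
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
]
print(most_similar_pair(sample_matrix))
```

The indices refer to input order, so they line up with the seeds array and with the documents you submitted.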

GET /v1/corpus/metadata

Return the summary for your own ledger. Returns zero values when you have not yet ingested anything.

Auth required · Scope: read

Response fields

primitive_count (integer) Distinct primitives in your ledger
timeline_length (integer) Total ledger timeline length
discovery_rate (float) New primitives / total selections
symbol_length (integer | null) Last symbol length used
run_count (integer) Distinct ingest runs recorded against this ledger

Python example

summary = requests.get(
    f"{BASE}/v1/corpus/metadata",
    headers={"X-Api-Key": API_KEY},
).json()
print(summary["primitive_count"], summary["run_count"])

GET /v1/corpus/export

Download your own ledger.bin as a binary attachment. Returns 404 if your ledger does not exist yet.

Auth required · Scope: read

Python example

resp = requests.get(
    f"{BASE}/v1/corpus/export",
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
with open("ledger-export.bin", "wb") as f:
    f.write(resp.content)

POST /v1/corpus/import

Replace your own ledger with an uploaded binary. The previous ledger is atomically renamed to ledger.bin.bak before the replacement becomes visible, so a crash cannot leave a half-written ledger.

Auth required · Scope: write

Upload

multipart/form-data; the file must start with the UFMR magic header and be no larger than 100 MB.

Response fields

replaced (boolean) Always true on success
backup_path (string | null) Server path of the .bak file, or null if no prior ledger
bytes_written (integer) Size of the new ledger
new_stats (object) Summary after replacement

Python example

with open("ledger-export.bin", "rb") as f:
    result = requests.post(
        f"{BASE}/v1/corpus/import",
        files={"file": ("ledger-export.bin", f, "application/octet-stream")},
        headers={"X-Api-Key": API_KEY},
    ).json()

print(result["replaced"])
print(result["new_stats"]["primitive_count"])
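
Because an import replaces your whole ledger, it can be worth checking the file locally before uploading. The following is a sketch under two assumptions this page does not confirm: that the magic header is the four ASCII bytes `UFMR` at offset 0, and that "100 MB" means 100 MiB. `ledger_problems` and `check_ledger_file` are hypothetical names, and the server remains the authority on what it accepts.

```python
import os
from typing import List

MAGIC = b"UFMR"                        # assumption: header is these four ASCII bytes
MAX_UPLOAD_BYTES = 100 * 1024 * 1024   # assumption: "100 MB" read as 100 MiB


def ledger_problems(first_bytes: bytes, size: int) -> List[str]:
    """Return reasons the upload would likely be rejected (empty if none)."""
    problems = []
    if not first_bytes.startswith(MAGIC):
        problems.append("missing UFMR magic header")
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_UPLOAD_BYTES}-byte limit")
    return problems


def check_ledger_file(path: str) -> List[str]:
    """Read just the header and size of a file on disk and check them."""
    with open(path, "rb") as f:
        head = f.read(len(MAGIC))
    return ledger_problems(head, os.path.getsize(path))
```

Calling `check_ledger_file("ledger-export.bin")` before the POST and aborting on a non-empty list avoids a round trip for files that cannot succeed.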

POST /v1/analyze/timeline

Analyse temporal structure of ingested data: autocorrelation, segment boundaries, and structural transitions. Useful for detecting periodicity and regime changes.

Auth required · 30 req/min

Request body

data_b64 (string) Base64-encoded data
symbol_length_mode (string, optional) Default: "auto_curve"
max_lag (integer, optional) Maximum ACF lag (1-500, default: 50)
window_size (integer, optional) Segment window size (10-10000, default: 100)
transition_threshold (float, optional) Rate-change threshold (0-1, default: 0.1)

Response fields

acf (float[]) Autocorrelation coefficients at lags 1..max_lag
segments (object[]) Per-segment discovery stats (start, end, rate, primitives)
transitions (integer[]) Indices where discovery rate changes significantly
primitive_count (integer) Total unique primitives
timeline_length (integer) Timeline entries
discovery_rate (float) Overall discovery rate

Python example

result = requests.post(
    f"{BASE}/v1/analyze/timeline",
    json={
        "data_b64": base64.b64encode(b"abcabcabc").decode(),
        "max_lag": 20,
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["acf"])
print(result["segments"])
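
One way to use acf for periodicity detection is to look for the lag with the largest coefficient. The sketch below does that; `dominant_lag` is a hypothetical helper, the 0.5 cutoff is an arbitrary illustrative threshold, and the sample values are hand-written rather than engine output. The one documented detail it relies on is that the array starts at lag 1, so the lag is the list index plus one.

```python
from typing import List, Optional


def dominant_lag(acf: List[float], min_strength: float = 0.5) -> Optional[int]:
    """Return the lag with the largest autocorrelation, or None if all are weak."""
    if not acf:
        return None
    # acf[0] is lag 1, so the lag is the list index plus one.
    best_index = max(range(len(acf)), key=lambda i: acf[i])
    if acf[best_index] < min_strength:
        return None
    return best_index + 1


# Hand-written coefficients that peak every third lag.
sample_acf = [-0.4, -0.4, 0.9, -0.3, -0.3, 0.8]
print(dominant_lag(sample_acf))  # 3
```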

POST /v1/analyze/frequency

Analyse primitive frequency distribution. Returns a full histogram sorted by frequency and the top-N most common primitives.

Auth required · 30 req/min

Request body

data_b64 (string) Base64-encoded data
symbol_length_mode (string, optional) Default: "auto_curve"
top_n (integer, optional) Number of top primitives to return (1-1000, default: 20)

Response fields

histogram (object[]) All primitives with {primitive_id, count}, sorted by count
top_n (object[]) Most frequent primitives
total_primitives (integer) Total unique primitives
timeline_length (integer) Timeline entries

Python example

result = requests.post(
    f"{BASE}/v1/analyze/frequency",
    json={
        "data_b64": base64.b64encode(b"abcabcabc").decode(),
        "top_n": 5,
    },
    headers={"X-Api-Key": API_KEY},
).json()
print(result["top_n"])
print(result["total_primitives"])
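
The histogram holds raw counts. If you want each primitive's share of the total instead, you can normalise it client-side. This is a minimal sketch: `relative_frequencies` is a hypothetical helper, it assumes primitive_id is an integer, and the sample entries are hand-written in the documented {primitive_id, count} shape.

```python
from typing import Dict, List


def relative_frequencies(histogram: List[dict]) -> Dict[int, float]:
    """Map each primitive_id to its share of all counted occurrences."""
    total = sum(entry["count"] for entry in histogram)
    if total == 0:
        return {}
    return {entry["primitive_id"]: entry["count"] / total for entry in histogram}


# Hand-written sample, not real engine output.
sample_histogram = [
    {"primitive_id": 7, "count": 6},
    {"primitive_id": 2, "count": 3},
    {"primitive_id": 9, "count": 1},
]
shares = relative_frequencies(sample_histogram)
print(shares[7])  # 0.6, i.e. 6 of the 10 counted occurrences
```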

POST /v1/analyze/discovery

Retrieve the discovery sequence (order in which primitives were first encountered) and structural metrics. Useful for understanding how novelty evolves across data.

Auth required · 30 req/min

Request body

data_b64 (string) Base64-encoded data
symbol_length_mode (string, optional) Default: "auto_curve"

Response fields

discovery_sequence (integer[]) Primitive IDs in discovery order
discovery_rate (float) Novelty fraction (0.0 to 1.0)
primitive_count (integer) Total unique primitives
timeline_length (integer) Timeline entries
symbol_length (integer) Symbol width used

Python example

result = requests.post(
    f"{BASE}/v1/analyze/discovery",
    json={"data_b64": base64.b64encode(b"abcabcabc").decode()},
    headers={"X-Api-Key": API_KEY},
).json()
print(result["discovery_sequence"])
print(result["discovery_rate"])
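
To build intuition for "discovery order" and "novelty fraction", here is a toy model in plain Python. It is not the engine's algorithm: it treats every byte as one symbol, whereas the engine chooses its own symbol length and primitives, so real responses will differ. `toy_discovery` is a hypothetical name used only for this illustration.

```python
from typing import Iterable, List, Tuple


def toy_discovery(symbols: Iterable[int]) -> Tuple[List[int], float]:
    """Return (first-seen order, distinct / total) for a sequence of symbols."""
    seen = set()
    order: List[int] = []
    total = 0
    for symbol in symbols:
        total += 1
        if symbol not in seen:
            # First time this symbol appears: it joins the discovery sequence.
            seen.add(symbol)
            order.append(symbol)
    rate = len(order) / total if total else 0.0
    return order, rate


order, rate = toy_discovery(b"abcabcabc")  # iterating bytes yields integers
print(order)  # [97, 98, 99]
print(rate)   # 3 distinct symbols out of 9, about 0.33
```

Repetitive input yields a short discovery sequence and a low rate; input where every symbol is new pushes the rate toward 1.0.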

POST /v1/analyze/symbol_length

Report the engine-selected symbol length for the given bytes and the selector metadata (entropy of the chosen length, the mode used, and the sample size consumed). Useful for diagnosing why a corpus produced the primitive count it did.

Auth required · 30 req/min

Request body

data_b64 (string) Raw bytes, base64-encoded

Response fields

selected_length (integer) Symbol length (in bits) the engine chose
entropy_at_selected (float) Shannon entropy at the selected length
mode (string) Selector mode (e.g. "AutoCurve", "Entropy", "Fixed(8)")
sample_bits_used (integer) Number of bits sampled during selection

Python example

result = requests.post(
    f"{BASE}/v1/analyze/symbol_length",
    json={"data_b64": base64.b64encode(b"Hello World").decode()},
    headers={"X-Api-Key": API_KEY},
).json()
print(result["selected_length"])
print(result["entropy_at_selected"])
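
The mode field is a string, and the "Fixed(8)" example suggests a fixed length can be embedded in it. If you need that number, a small parser like the one below may help. It is a sketch based only on the three example values listed above; `parse_selector_mode` is a hypothetical helper and other mode formats may exist.

```python
import re
from typing import Optional, Tuple

# Matches strings such as "Fixed(8)" and captures the number.
_FIXED = re.compile(r"^Fixed\((\d+)\)$")


def parse_selector_mode(mode: str) -> Tuple[str, Optional[int]]:
    """Split a mode string into (name, fixed_length or None)."""
    match = _FIXED.match(mode)
    if match:
        return "Fixed", int(match.group(1))
    # Any other value is passed through unchanged.
    return mode, None


print(parse_selector_mode("Fixed(8)"))   # ('Fixed', 8)
print(parse_selector_mode("AutoCurve"))  # ('AutoCurve', None)
```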

POST /v1/process/universal

Run the 7-stage universal pipeline with quality gates. The pipeline validates, analyses, and verifies your data in this order: VALIDATE → METRICS → STRATEGY → EXECUTE → VERIFY → ADAPT → OUTPUT.

Auth required · 30 req/min · Scope: write

Request body

data_b64 (string) Input data, base64-encoded
verify (boolean, optional) Run replay verification (default: true). Set false for bulk speed.

Response fields

success (boolean) Whether ingestion and the stage flow completed. This is not a replay proof on its own; that also requires replay_valid = true.
seed (integer) Deterministic seed from geometric signature
quality (object) Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
replay_valid (boolean) Whether replay invariant holds
stages_completed (string[]) All 7 stage names in order
metrics (object) Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
strategy (object) Selected processing strategy: symbol_length_mode, zero_point
execute (object) Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
field_reach (object) Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
violations (string[]) Any warnings or errors
engine_version (string) Engine version

Python example

import base64, requests

resp = requests.post(
    f"{BASE}/v1/process/universal",
    json={
        "data_b64": base64.b64encode(b"your data here").decode(),
        "verify": True,
    },
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Success: {result['success']}")
print(f"Seed: {result['seed']}")
print(f"Replay valid: {result['replay_valid']}")
print(f"Stages: {result['stages_completed']}")
print(f"Strategy: {result['strategy']}")
print(f"Quality: {result['quality']}")
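
Since success alone does not prove replay, it can help to check several fields together before trusting a result. The function below is a sketch: `review_universal_result` is a hypothetical helper, and it assumes stages_completed uses the same uppercase names as the stage list in the description above. How replay_valid is reported when verify is false is not stated here, so the helper simply flags any falsy value.

```python
EXPECTED_STAGES = [
    "VALIDATE", "METRICS", "STRATEGY", "EXECUTE", "VERIFY", "ADAPT", "OUTPUT",
]


def review_universal_result(result: dict) -> list:
    """Return human-readable concerns; an empty list means none were found."""
    concerns = []
    if not result.get("success"):
        concerns.append("pipeline did not complete")
    if not result.get("replay_valid"):
        concerns.append("replay invariant not confirmed")
    completed = result.get("stages_completed", [])
    missing = [stage for stage in EXPECTED_STAGES if stage not in completed]
    if missing:
        concerns.append("stages missing: " + ", ".join(missing))
    # Pass through any warnings or errors the server reported.
    concerns.extend(result.get("violations", []))
    return concerns
```

A typical use is `concerns = review_universal_result(resp.json())`, followed by logging or rejecting the result when the list is non-empty.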

GET /v1/health

Check API and engine status. No authentication required.

No auth

Response fields

status (string) "ok" if database reachable, "degraded" if not
engine_version (string) UFM engine version (e.g. 3.0-rust)

Python example

health = requests.get(f"{BASE}/v1/health").json()
print(health["status"])
print(health["engine_version"])

cURL example

curl https://api.spectrengine.com/v1/health

# {"status": "ok", "engine_version": "3.0-rust"}

Error Handling

All errors return a consistent JSON structure:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "request_id": "correlation-uuid"
  }
}
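
Because the structure is the same for every error, you can convert it into a single exception type in client code. This is a minimal sketch, not an official client: `ApiError` and `error_from_body` are hypothetical names, and the "UNKNOWN" fallback is an illustrative default for bodies that lack the error object.

```python
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status: int, code: str, message: str,
                 request_id: Optional[str] = None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id


def error_from_body(status: int, body: dict) -> ApiError:
    """Build an ApiError from a decoded JSON error body."""
    err = body.get("error") or {}
    return ApiError(
        status=status,
        code=err.get("code", "UNKNOWN"),
        message=err.get("message", ""),
        request_id=err.get("request_id"),
    )
```

With requests, this could be used as `raise error_from_body(resp.status_code, resp.json())` whenever the status is not 200, keeping request_id available for support queries.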

Error codes

Status	Code	Meaning
400	BAD_REQUEST	Invalid base64, malformed JSON, or engine error
401	UNAUTHORIZED	Missing or invalid authentication credentials
413	PAYLOAD_TOO_LARGE	Request body exceeds 10 MB limit
422	VALIDATION_ERROR	Request body failed schema validation
429	RATE_LIMIT	Too many requests (check Retry-After header)
500	INTERNAL_ERROR	Server error (includes request_id for support)

Handling errors in code

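import base64
import requests

BASE    = "https://api.spectrengine.com"
API_KEY = "ufm_live_a1b2c3d4.your_secret_here"  # placeholder; use your own key

# Assumes the same data_b64 request body used by the other endpoints.
payload = {"data_b64": base64.b64encode(b"Hello World").decode()}

resp = requests.post(f"{BASE}/v1/process", json=payload,
                      headers={"X-Api-Key": API_KEY})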

if resp.status_code == 200:
    result = resp.json()
elif resp.status_code == 429:
    retry_after = resp.headers.get("Retry-After", 60)
    print(f"Rate limited. Retry in {retry_after}s")
else:
    error = resp.json().get("error", {})
    print(f"Error {resp.status_code}: {error.get('message')}")
    print(f"Request ID: {error.get('request_id')}")
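
For automated clients, the 429 branch above usually turns into a wait-and-retry loop. The helper below sketches how the wait could be chosen: it prefers the Retry-After value in seconds and falls back to capped exponential backoff. `retry_delay`, the 1-second base, and the 60-second cap are illustrative assumptions, not values defined by the API.

```python
from typing import Mapping


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try.

    Prefers the server's Retry-After value; falls back to capped
    exponential backoff when the header is absent or not a number.
    """
    raw = headers.get("Retry-After")
    if raw is not None:
        try:
            return max(0.0, float(raw))
        except ValueError:
            pass  # not a number of seconds; fall through to backoff
    return min(cap, base * (2 ** attempt))


print(retry_delay({"Retry-After": "12"}, attempt=0))  # 12.0
print(retry_delay({}, attempt=3))                     # 8.0
```

In a loop you would call `time.sleep(retry_delay(resp.headers, attempt))` after each 429 and stop after a fixed number of attempts. `resp.headers` from requests is case-insensitive, whereas a plain dict like the ones above is not.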

Rate Limits

Launch promotion: All API endpoint rate limits are temporarily removed. Sign up for free and use every endpoint with no request caps. Abuse-prevention limits on authentication endpoints still apply (see below). Tiered usage limits will be introduced in a future update.

The following protective limits remain in place:

Limit	Purpose
5 requests/min on login & register (per IP)	Prevents brute-force and spam account creation
1 request per 5 minutes on register (per email)	Prevents the same address being submitted repeatedly across different IPs
3 requests/min on password reset request	Prevents email flooding
3 requests/hour on resend-verification (per email)	Prevents verification email flooding
3 requests/hour on contact form (per IP)	Prevents inbox spam
Cloudflare Turnstile on register, password reset, contact, resend	Bot protection. Adaptive on login: shown after 3 failed attempts within 15 minutes.
Email verification required before login	Confirms account ownership; blocks throwaway-email signups.
10 MB max request body	Payload size protection
100 items per batch request	Batch size protection

If you exceed an abuse-prevention limit, the API returns HTTP 429 with a Retry-After header giving the wait in seconds.