SPECTR Engine Docs
Use the hosted API or install the local Python wheel. Hosted API base URL:
https://api.spectrengine.com
Start Here
UFM takes bytes, decomposes them into structural primitives, and gives you repeatable results you can use in software. You can call it through the hosted HTTPS API, or install the same engine locally with the Python wheel.
Hosted API
Best when you want a service endpoint, account-managed ledgers, dashboard history, and no local engine install.
import base64, os, requests
BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]
payload = {
"data_b64": base64.b64encode(b"Hello World").decode()
}
resp = requests.post(
f"{BASE}/v1/engine",
json=payload,
headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
result = resp.json()
print(result["core"]["seed"])
print(result["core"]["replay_valid"])Local Python wheel
Best when you want raw bytes in your own process, local ledger files, and engine calls without an HTTP round trip after activation.
pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>
import ufm
eng = ufm.InvariantIdentityEngine(storage_path='ledger.bin')
seed, status = eng.process(b'Hello World')
print(seed, status)
print(eng.reconstruct(b'Hello World'))
eng.save()
On the hosted API, check core.seed and core.replay_valid. Locally, you should see a seed, a status such as NOVELTY, and True from reconstruct.
Which Endpoint Should I Use?
The SPECTR Engine has two main endpoints that cover most use cases. Pick the one that matches what you're doing.
/v1/engine
Analyse one piece of data. Returns its structural identity, quality metrics, temporal patterns, and frequency distribution, all in a single call.
Best for:
- Understanding the structural profile of a file or payload
- Monitoring data quality over time
- Getting a complete analysis without multiple API calls
/v1/pipeline
Compare two pieces of data. Returns their structural overlap and detects semantic noise (encoding changes, BOM insertions, line-ending shifts) that may disguise functionally identical data as different.
Best for:
- Detecting what changed between two file versions
- Filtering out noise (BOM, CRLF, Base64) from real structural changes
- Data integrity verification across systems
What each endpoint runs
| Analysis layer | /v1/engine | /v1/pipeline |
|---|---|---|
| Core ingestion (seed, signature, replay check) | Yes | Yes |
| Universal pipeline (7-stage quality gates) | Yes | - |
| Timeline analysis (autocorrelation, segments) | Yes | - |
| Frequency analysis (primitive distribution) | Yes | - |
| Discovery analysis (novelty sequence) | Yes | - |
| Structural comparison (Jaccard, shared/unique) | With compare_b64 | With target_b64 |
| Semantic noise detection (BOM, CRLF, Base64) | - | With target_b64 |
| Anti-drift governance gates | - | With target_b64 |
If in doubt, start with /v1/engine. It gives you the most detail about a single input. Switch to /v1/pipeline when you need to compare two inputs or detect noise between them.
Other endpoints
The granular endpoints give you fine-grained control when you only need one specific layer:
- /v1/process - lightweight ingestion only (get a seed, replay later with /v1/replay/{seed})
- /v1/batch/process - process up to 100 items in parallel (lightweight signatures)
- /v1/batch/engine - full engine analysis on up to 50 items (replaces per-item loops)
- /v1/batch/compare - compare up to 100 pairs in one call
- /v1/compare - structural comparison without semantic analysis
- /v1/structural_profile - canonical 5D structural complexity profile (alpha, Zipf, vocabulary, reuse, discovery integral)
- /v1/semantic/analyze - semantic noise detection with policy control
- /v1/analyze/* - individual timeline, frequency, or discovery analysis
Plain-English Terms
| Term | What it means when you are coding |
|---|---|
| data_b64 | Base64 text wrapping your bytes for HTTP JSON. Local wheel calls take raw bytes instead. |
| seed | A deterministic numeric identity returned by the engine for the structural profile it computed. |
| ledger | The stored primitive/timeline state. Hosted API ledgers are account-scoped; local ledgers are files you choose. |
| replay | Rebuilding previously ingested bytes from ledger state. replay_valid tells you whether the check passed. |
| NOVELTY / REPLAY | NOVELTY means the call selected new primitives. REPLAY means the structure was already in that ledger. |
| request-scoped | The endpoint computes a result for that request without storing it for later replay. |
| semantic noise | A byte-level representation change, such as BOM or line endings, that the semantic layer can classify separately from structural difference. |
Quick Start
This path gets you from no code to a working API call. Use Python first because it handles base64 safely on Windows, macOS, and Linux.
1. Get your API key
Go to API Keys in the sidebar and click Create Key. Copy the full key immediately; it is shown only once. Format: ufm_live_a1b2c3d4.xxxxxxxxx
2. Install the HTTP helper library
The examples below use requests and read your key from an environment variable so you do not paste secrets into source files.
pip install requests
# PowerShell
$env:UFM_API_KEY = "ufm_live_<your_key>"
# macOS/Linux shell
export UFM_API_KEY="ufm_live_<your_key>"
3. Analyse your first input
Use /v1/engine for a complete structural analysis of one input:
import base64
import os
import requests
BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]
payload = {
"data_b64": base64.b64encode(b"Hello World").decode(),
}
resp = requests.post(
f"{BASE}/v1/engine",
json=payload,
headers={"X-Api-Key": API_KEY},
timeout=60,
)
resp.raise_for_status()
result = resp.json()
print("seed:", result["core"]["seed"])
print("status:", result["core"]["status"])
print("replay valid:", result["core"]["replay_valid"])
print("quality:", result["universal"]["quality"])If this works, you have authenticated successfully and the engine has analysed your bytes. The most useful first fields are core.seed, core.status, core.replay_valid, and universal.quality.
4. Persist and replay
/v1/engine is request-scoped (no data persisted between calls). To store data for later replay, use /v1/process:
payload = {"data_b64": base64.b64encode(b"Hello World").decode()}
created = requests.post(
f"{BASE}/v1/process",
json=payload,
headers={"X-Api-Key": API_KEY},
).json()
seed = created["seed"]
replayed = requests.get(
f"{BASE}/v1/replay/{seed}",
headers={"X-Api-Key": API_KEY},
).json()
original = b"".join(base64.b64decode(c) for c in replayed["chunks_b64"])
print(original)
5. Compare two inputs
Use /v1/pipeline with a target_b64 to compare two inputs and detect noise:
source = b"Hello World"
target = b"\xef\xbb\xbfHello World"
resp = requests.post(
f"{BASE}/v1/pipeline",
json={
"data_b64": base64.b64encode(source).decode(),
"target_b64": base64.b64encode(target).decode(),
},
headers={"X-Api-Key": API_KEY},
)
pair = resp.json()
print(pair["compare"]["jaccard"])
print(pair["semantic"]["converges"])
print(pair["semantic"]["noise_units"])This compares "Hello World" with a BOM-prefixed version. The response shows structural overlap (Jaccard similarity) and identifies the BOM as a noise artefact.
cURL smoke test
curl -X POST https://api.spectrengine.com/v1/engine \
-H "Content-Type: application/json" \
-H "X-Api-Key: YOUR_API_KEY" \
-d '{"data_b64": "SGVsbG8gV29ybGQ="}'Local Python Wheel
The hosted API and the local wheel are two surfaces over the same native UFM engine. Use the wheel when you want the engine inside your own process, with local ledgers, no HTTP base64 wrapper, and no per-request network call after activation.
Activate once before the first import ufm. Activation caches a signed licence token on the machine. Normal engine calls use that cached token; python -m ufm status checks it without touching the network.
Install and activate
Use the wheel that matches your Python and OS. The Windows CPython 3.13 wheel is:
python -m venv .venv
.\.venv\Scripts\activate
pip install ufm-3.0.0-cp313-cp313-win_amd64.whl
python -m ufm activate ufm_live_<your_key>
python -m ufm status
python -m ufm version
# Done: import ufm works, and features granted by your licence are available.
For unattended jobs, you can also set UFM_LICENCE_KEY before the first engine call. Set UFM_LICENCE_CACHE_DIR when you want the licence cache to live in a controlled directory.
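A minimal sketch for an unattended job (the cache path is illustrative, and it assumes the engine reads these variables at the first call):
import os

os.environ["UFM_LICENCE_KEY"] = "ufm_live_<your_key>"
os.environ["UFM_LICENCE_CACHE_DIR"] = "/var/lib/ufm/licence-cache"  # illustrative path
You can then check the cached licence from Python: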
import ufm
status = ufm.licence_status()
print(status["active"])
print(status["tier"])
print(status["days_remaining"])Mental model
- HTTP requests use base64 strings. Local calls take raw bytes.
- API persistence is your account ledger. Local persistence is the storage_path file you choose.
- Request-scoped API endpoints map to temporary local ledgers or stateless helpers.
- Local Ben history lives in a BenSession; persist it yourself if you need durable chat history.
- There are no hosted rate limits locally, but the 100 MB input limit and licence scopes still apply.
Core ingest, persist, and replay
This is the local equivalent of POST /v1/process, GET /v1/replay/{seed}, and POST /v1/reconstruct.
import ufm
data = b"Hello World"
with ufm.InvariantIdentityEngine(storage_path="customer-ledger.bin") as eng:
seed, status = eng.process(data)
print(seed, status) # NOVELTY on first ingest
seed2, status2 = eng.process(data)
print(seed2 == seed, status2) # True, REPLAY
assert eng.reconstruct(data) is True
replayed = [bytes(seq) for seq in eng.replay(seed)]
print(replayed[0]) # b"Hello World"
print(eng.ledger_summary())
# save() is called automatically when the context exits.
Full engine analysis
Use this pattern when you want the same layers as POST /v1/engine: core identity, universal pipeline quality, timeline, frequency, discovery, and optional structural comparison.
import tempfile
import ufm
def to_bits(data: bytes) -> list[int]:
return [int(bit) for byte in data for bit in f"{byte:08b}"]
def mode_from_strategy(strategy: dict | None, fallback: str = "auto_curve") -> str:
raw = str((strategy or {}).get("symbol_length_mode") or fallback).lower()
if raw in ("autocurve", "auto_curve"):
return "auto_curve"
if raw == "entropy":
return "entropy"
if raw.startswith("fixed(") and raw.endswith(")"):
return f"fixed{raw[6:-1]}"
return raw if raw.startswith("fixed") else fallback
def compare_bytes(a: bytes, b: bytes, mode: str = "auto_curve") -> dict:
la = ufm.ingest_raw(to_bits(a), symbol_length_mode=mode)
lb = ufm.ingest_raw(to_bits(b), symbol_length_mode=mode)
return ufm.ledger_compare(la, lb)
def run_local_engine(
data: bytes,
*,
compare_with: bytes | None = None,
verify: bool = True,
max_lag: int = 50,
top_n: int = 20,
) -> dict:
with tempfile.TemporaryDirectory(prefix="ufm-engine-") as tmp:
up = ufm.UniversalPipeline(
storage_path=f"{tmp}/engine-request-ledger.bin",
zero_point=True,
verify=verify,
)
universal = up.run(data)
mode = mode_from_strategy(universal.get("strategy"))
sig = ufm.ufm_signature(data, symbol_length_mode=mode)
ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode=mode)
result = {
"core": {
"seed": universal.get("seed", sig["seed"]),
"status": (universal.get("execute") or {}).get("status"),
"discovery_rate": sig["discovery_rate"],
"symbol_length": sig["symbol_length"],
"primitive_count": sig["primitive_count"],
"timeline_length": sig["timeline_length"],
"reuse_ratio": sig["reuse_ratio"],
"replay_valid": sig["replay_valid"],
"signature": sig["signature"],
},
"universal": universal,
"timeline": {
"acf": ledger.acf(max_lag),
"segments": ledger.segments(100),
"transitions": ledger.transitions(100, 0.1),
},
"frequency": {
"histogram": ledger.frequency_histogram(),
"top_n": ledger.top_n_primitives(top_n),
},
"discovery": {
"discovery_sequence": ledger.discovery_sequence(),
"discovery_rate": ledger.discovery_rate,
"primitive_count": ledger.primitive_count,
"timeline_length": ledger.timeline_length,
"symbol_length": ledger.symbol_length,
},
"effective_symbol_length_mode": mode,
"engine_version": ufm.VERSION,
}
if compare_with is not None:
result["comparison"] = compare_bytes(data, compare_with, mode)
return result
analysis = run_local_engine(b"Hello World", compare_with=b"\xef\xbb\xbfHello World")
print(analysis["core"]["seed"])
print(analysis["universal"]["quality"])
print(analysis["comparison"]["jaccard"])Pair comparison and semantic noise
This is the local equivalent of POST /v1/pipeline, POST /v1/compare, POST /v1/noise/detect, POST /v1/noise/delta, and POST /v1/semantic/analyze.
import ufm
def to_bits(data: bytes) -> list[int]:
return [int(bit) for byte in data for bit in f"{byte:08b}"]
source = b"line one\nline two\n"
target = b"line one\r\nline two\r\n"
la = ufm.ingest_raw(to_bits(source))
lb = ufm.ingest_raw(to_bits(target))
structural = ufm.ledger_compare(la, lb)
semantic = ufm.SemanticDecisionPipeline("semantic-ledger.jsonl")
noise = semantic.run_with_policy(
source,
target,
enabled_noise_classes=["line_ending_crlf"],
strict_allowlist=True,
)
print(structural["jaccard"])
print(noise["converges"]) # True for classified CRLF/LF noise
print(noise["noise_units"])
print(noise["validation_checks"])
print(noise["decision_hash"])Analytics helpers
The granular analysis endpoints are direct ledger operations locally.
import ufm
def to_bits(data: bytes) -> list[int]:
return [int(bit) for byte in data for bit in f"{byte:08b}"]
data = b"abcabcabc"
ledger = ufm.ingest_raw(to_bits(data), symbol_length_mode="auto_curve")
timeline = {
"acf": ledger.acf(50),
"segments": ledger.segments(100),
"transitions": ledger.transitions(100, 0.1),
}
frequency = {
"histogram": ledger.frequency_histogram(),
"top_n": ledger.top_n_primitives(20),
}
discovery = {
"discovery_sequence": ledger.discovery_sequence(),
"discovery_rate": ledger.discovery_rate,
}
symbol_length, selector_meta = ufm.find_optimal_symbol_length(to_bits(data))
profile = ufm.structural_profile(data, symbol_width=16)
print(timeline)
print(frequency)
print(discovery)
print(symbol_length, selector_meta)
print(profile)
Batch and corpus workflows
Use process_batch for persisted corpus ingestion and ufm_signature_batch for independent stateless signatures. Local corpus import/export is just controlled movement of your ledger.bin file.
import shutil
import ufm
def to_bits(data: bytes) -> list[int]:
return [int(bit) for byte in data for bit in f"{byte:08b}"]
docs = [b"file one", b"file two", b"file three"]
with ufm.InvariantIdentityEngine(storage_path="corpus-ledger.bin") as eng:
results = eng.process_batch(docs)
seeds = [seed for seed, _status in results]
summary = eng.ledger_summary()
ledgers = [ufm.ingest_raw(to_bits(doc)) for doc in docs]
jaccard_matrix = [
[ufm.ledger_jaccard(a, b) for b in ledgers]
for a in ledgers
]
print(seeds)
print(summary)
print(jaccard_matrix)
# Export/import the local ledger file.
shutil.copyfile("corpus-ledger.bin", "customer-export.ufmr")
shutil.copyfile("customer-export.ufmr", "restored-ledger.bin")Universal and decision pipelines
UniversalPipeline is the governed data processing path. DecisionPipeline is the text decision/audit path with anti-drift gates and a persistent audit ledger.
import ufm
up = ufm.UniversalPipeline(
storage_path="universal-ledger.bin",
bit_depth=21,
verify=True,
zero_point=False,
)
run = up.run(b"payload for the governed pipeline")
print(run["success"], run["replay_valid"], run["quality"])
print(run["stages_completed"])
dp = ufm.DecisionPipeline("decision-ledger.jsonl")
decision = dp.run("What is structural identity in UFM?")
print(decision["status"])
print(decision["decision_hash"])
print(decision["gates"])Call Ben locally
Ben is included in the wheel. It loads sealed prompts after the ben.ask scope check, then uses the LLM backend you provide.
import os
import ufm
# Local Ollama. Make sure Ollama is running and the model is pulled.
backend = ufm.backend_from_config(backend="ollama", model="gemma4")
session = ufm.BenSession(backend=backend)
first = session.ask("What is UFM actually doing?")
print(first.text)
print(first.session_id, first.turn_count)
second = session.ask("How does replay relate to structural identity?")
print(second.text)
print(session.history())
# One-shot convenience wrapper with a provider API key.
answer = ufm.ask_ben(
"Explain the replay invariant.",
backend="openai",
model="gpt-4o",
api_key=os.environ["OPENAI_API_KEY"],
)
print(answer["text"])CLI form:
python -m ufm ask-ben "What is UFM?" --backend ollama --model gemma4
python -m ufm ask-ben "Explain replay" --backend openai --model gpt-4o --api-key YOUR_OPENAI_API_KEYCall Bob locally
Bob is also included in the wheel. The sealed corpus, claim-gate snapshot, OOV thresholds, and prompts ship with the package. Bob can run corpus retrieval and gate/OOV/audit output without an LLM backend; pass a backend when you want generated response wording.
import os
import ufm
# Corpus retrieval, gate status, OOV metrics, and audit output.
bob = ufm.BobPipeline()
result = bob.query("What is the replay invariant?", mode="advisory", max_anchors=5)
print(result.response)
print(result.gate_status) # PASS, WARN, or BLOCK
print(result.evidence)
print(result.oov)
print(result.audit)
print(result.boundary_flags)
# Optional generated answer using an LLM backend.
backend = ufm.backend_from_config(
backend="anthropic",
model="claude-sonnet-4-20250514",
api_key=os.environ["ANTHROPIC_API_KEY"],
)
bob_with_llm = ufm.BobPipeline(backend=backend)
print(bob_with_llm.query("Explain C-CORE-002.").response)
# One-shot convenience wrapper.
one_shot = ufm.ask_bob(
"Is replay identity verified?",
backend="ollama",
model="gemma4",
)
print(one_shot["response"])
print(one_shot["gate_status"])CLI form:
python -m ufm ask-bob "Is the replay invariant verified?" --backend ollama --model gemma4
python -m ufm ask-bob "Explain C-CORE-002" --backend anthropic --api-key YOUR_ANTHROPIC_API_KEYLocal equivalents at a glance
| API surface | Local wheel surface |
|---|---|
| /v1/engine | Compose UniversalPipeline, ufm_signature, ingest_raw, and ledger analytics. |
| /v1/process, /v1/replay, /v1/reconstruct | InvariantIdentityEngine.process, replay, and reconstruct. |
| /v1/pipeline, /v1/compare | ledger_compare, ledger_jaccard, plus SemanticDecisionPipeline for pair noise. |
| /v1/noise/*, /v1/semantic/analyze | SemanticDecisionPipeline.run, run_with_policy, and capabilities. |
| /v1/analyze/*, /v1/structural_profile | Ledger methods: acf, segments, transitions, frequency_histogram, discovery_sequence, plus structural_profile. |
| /v1/batch/*, /v1/corpus/* | process_batch, ufm_signature_batch, local loops, local manifests, and copying/importing ledger.bin. |
| /v1/ingest/async | Run process or UniversalPipeline.run inside your own background worker or queue. |
| /v1/ben/ask, /v1/bob/query | BenSession, ask_ben, BobPipeline, ask_bob, or the CLI commands. |
| /v1/me/llm-credentials | Pass OllamaBackend, APIBackend, or backend_from_config directly. Store provider keys in your own secret manager. |
Operational notes
- Keep one ledger file per project or tenant. Primitive reuse is scoped to that file.
- Do not run multiple writers against the same ledger path without your own lock.
- Ledger paths must stay inside the current working directory; paths containing .. are rejected.
- Call ufm.set_num_threads(n) before the first ingestion if you need to bound CPU use (see the sketch below).
- Provider-backed Ben/Bob calls require the provider SDK, for example pip install openai or pip install anthropic.
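A minimal sketch combining these notes (the thread count and ledger filename are illustrative):
import ufm

ufm.set_num_threads(4)  # bound CPU use; call before the first ingestion
# One ledger file per tenant, kept inside the working directory.
with ufm.InvariantIdentityEngine(storage_path="tenant-a-ledger.bin") as eng:
    seed, status = eng.process(b"tenant payload")
    print(seed, status)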
Authentication
All API endpoints (except /v1/health) require an authenticated context. Programmatic clients should pass an API key in the X-Api-Key header. Browser dashboard sessions can use an access_token cookie.
Getting your API key
Go to API Keys in the sidebar, click Create Key, and copy the full key immediately. It is only shown once.
Using your API key
curl -H "X-Api-Key: ufm_live_a1b2c3d4.your_secret_here" \
-H "Content-Type: application/json" \
-X POST https://api.spectrengine.com/v1/process \
-d '{"data_b64": "..."}'Key format
Keys follow the format ufm_live_{public_id}.{secret}. The prefix identifies the key; the secret authenticates it. Both parts are required.
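When logging or displaying which key a client used, keep only the public id and drop the secret; a minimal sketch (the helper name is illustrative):
def key_public_id(key: str) -> str:
    # "ufm_live_{public_id}.{secret}" -> "ufm_live_{public_id}"
    prefix, _, _secret = key.partition(".")
    return prefix

print(key_public_id("ufm_live_a1b2c3d4.your_secret_here"))  # ufm_live_a1b2c3d4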
POST /v1/engine
Send any binary data (base64-encoded) and the engine decomposes it into structural primitives, assigns a deterministic seed (its identity), and runs five analysis layers: core ingestion, 7-stage quality pipeline, timeline autocorrelation, frequency distribution, and discovery sequencing. Optionally provide a second input for structural comparison.
The response is grouped by layer so you can read exactly the depth you need. Most integrations only use core (identity and signature) and universal.quality (quality metrics).
Request body
- data_b64 (string) Base64-encoded binary data to process
- symbol_length_mode (string, optional) Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64
- verify (boolean, optional) Run replay verification in universal pipeline (default: true)
- compare_b64 (string, optional) Second item for structural comparison (base64-encoded)
- max_lag (integer, optional) Maximum ACF lag for timeline analysis (1-500, default: 50)
- top_n (integer, optional) Top-N primitives for frequency analysis (1-1000, default: 20)
Response - nested by layer
The core, universal, and analysis sections are computed in one request-scoped context. Use the response trust_boundary field to interpret replay and cross-section comparisons correctly.
core: Core ingestion result
- core.seed (integer) Deterministic seed from geometric signature
- core.status (string) NOVELTY or REPLAY
- core.discovery_rate (float) Fraction of novel primitives (0.0-1.0)
- core.symbol_length (integer) Symbol width used
- core.primitive_count (integer) Unique primitives found
- core.timeline_length (integer) Total timeline entries
- core.reuse_ratio (float) Fraction of reused primitives
- core.replay_valid (boolean) Whether replay(ingest(data)) == data
- core.signature (string) Multi-scale geometric signature
universal: 7-stage governed pipeline with quality gates
- universal.success (boolean) Whether the pipeline completed successfully
- universal.quality (object) Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
- universal.replay_valid (boolean) Whether the replay invariant holds
- universal.stages_completed (string[]) List of completed stage names
- universal.metrics (object) Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
- universal.strategy (object) Selected processing strategy: symbol_length_mode, zero_point
- universal.execute (object) Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
- universal.field_reach (object) Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
- universal.violations (string[]) Any warnings or errors
timeline: Temporal structure analysis
- timeline.acf (float[]) Autocorrelation function values
- timeline.segments (array) Discovery segments with start, end, rate
- timeline.transitions (integer[]) Positions of structural transitions
frequency: Primitive frequency distribution
- frequency.histogram (array) Full primitive frequency distribution
- frequency.top_n (array) Top-N most frequent primitives
- frequency.total_primitives (integer) Unique primitive count
discovery: Novelty emergence sequence
- discovery.discovery_sequence (integer[]) Primitive IDs in order of first encounter
- discovery.discovery_rate (float) Final discovery rate
- discovery.primitive_count (integer) Total unique primitives
comparison: Structural overlap (only if compare_b64 provided)
- comparison.jaccard (float) Jaccard similarity (shared / union)
- comparison.shared_primitives (integer) Primitives in both inputs
- comparison.only_a (integer) Primitives exclusive to first input
- comparison.only_b (integer) Primitives exclusive to second input
- comparison.overlap_coefficient (float) Overlap coefficient (shared / min), 0-1
Contract metadata
- effective_symbol_length_mode (string) Method-layer selected mode applied across core/analysis/comparison sections
- trust_boundary (object) Ledger-context notes for core/universal/analytics/comparison/replay compatibility
- verification_note (string) Clarifies success vs replay_valid semantics, especially when verify=false
cURL example
curl -X POST https://api.spectrengine.com/v1/engine \
-H "Content-Type: application/json" \
-H "X-Api-Key: YOUR_API_KEY" \
-d '{"data_b64": "SGVsbG8gV29ybGQ="}'Python example
import base64, requests
API_KEY = "ufm_live_a1b2c3d4.your_secret_here"
BASE = "https://api.spectrengine.com"
resp = requests.post(
f"{BASE}/v1/engine",
json={"data_b64": base64.b64encode(b"Hello World").decode()},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
# Identity
print(result["core"]["seed"]) # deterministic structural identity
print(result["core"]["status"]) # "NOVELTY" or "REPLAY"
print(result["core"]["replay_valid"]) # True when replay(ingest(x)) == x
# Quality gates
print(result["universal"]["quality"]) # {replay_valid, discovery_rate, reuse_ratio, deterministic}
# Structure
print(len(result["timeline"]["segments"])) # temporal segments found
print(result["frequency"]["total_primitives"]) # unique primitives
print(result["discovery"]["discovery_rate"]) # fraction of novel primitivesExample response (abbreviated)
{
"core": {
"seed": 8472619,
"status": "NOVELTY",
"discovery_rate": 1.0,
"symbol_length": 14,
"primitive_count": 42,
"timeline_length": 42,
"reuse_ratio": 0.0,
"replay_valid": true,
"signature": "G4:0.707:0.500:0.500:6:1.000:0.120|G8:..."
},
"universal": {
"success": true,
"quality": {
"replay_valid": true, "discovery_rate": 1.0,
"reuse_ratio": 0.0, "deterministic": true
},
"stages_completed": ["VALIDATE","METRICS","STRATEGY","EXECUTE","VERIFY","ADAPT","OUTPUT"],
"metrics": {"byte_entropy": 3.459, "unique_byte_ratio": 0.727, "size_class": "small"},
"strategy": {"symbol_length_mode": "auto_curve", "zero_point": true},
"violations": []
},
"timeline": { "acf": [1.0, 0.02, -0.04, ...], "segments": [...], "transitions": [] },
"frequency": { "histogram": [...], "top_n": [...], "total_primitives": 42 },
"discovery": { "discovery_sequence": [0, 1, 2, ...], "discovery_rate": 1.0 },
"effective_symbol_length_mode": "auto_curve",
"trust_boundary": { ... },
"verification_note": "..."
}
API tiers
Tier 1: this endpoint. One call, all structural analysis layers. The default for single-input analysis. Returns identity, quality, timeline, frequency, and discovery data.
Tier 2: /v1/pipeline. The default when comparing two inputs. Runs core ingestion plus structural comparison and semantic noise detection (identifies BOM, CRLF, and encoding artefacts).
Tier 3: granular endpoints. Individual layer endpoints for fine-grained control: /v1/process for lightweight ingestion, /v1/compare for structural comparison without semantic analysis, or any /v1/analyze/* endpoint individually.
POST /v1/pipeline
With one input: returns core analysis only (same as /v1/process). With two inputs (source + target): returns core analysis, structural comparison (shared/unique primitives, Jaccard similarity), and semantic noise analysis (noise classes, convergence check, anti-drift governance gates).
Unlike /v1/engine, this endpoint does not run timeline, frequency, or discovery analysis. It focuses on answering: “Are these two inputs structurally the same, and what noise accounts for any differences?”
Request body
- data_b64 (string) Primary input, base64-encoded
- target_b64 (string, optional) Second input for comparison/noise analysis
- symbol_length_mode (string, optional) Default: "auto_curve"
Response fields
- core (object) Core analysis: seed, status, discovery_rate, symbol_length, primitive_count, timeline_length, reuse_ratio, replay_valid, signature
- compare (object | null) Structural comparison (present only when target given)
- semantic (object | null) Semantic noise analysis (present only when target given)
- effective_symbol_length_mode (string) Method-layer selected mode applied across core/compare sections
- trust_boundary (object) Ledger-context notes for core/compare/semantic/replay compatibility
- engine_version (string) Engine version
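In single-input mode (no target_b64), only core is populated; compare and semantic come back null. A minimal sketch, reusing BASE and API_KEY from the Quick Start:
import base64, requests

resp = requests.post(
    f"{BASE}/v1/pipeline",
    json={"data_b64": base64.b64encode(b"original file").decode()},
    headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["core"]["seed"])
print(result["compare"], result["semantic"])  # None, None without a target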
Python example: pair mode
import base64, requests
resp = requests.post(
f"{BASE}/v1/pipeline",
json={
"data_b64": base64.b64encode(b"original file").decode(),
"target_b64": base64.b64encode(b"modified file").decode(),
},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Core seed: {result['core']['seed']}")
print(f"Jaccard: {result['compare']['jaccard']}")
print(f"Converges: {result['semantic']['converges']}")
print(f"Noise: {result['semantic']['noise_units']}")Example response (pair mode, abbreviated)
{
"core": {
"seed": 8472619, "status": "NOVELTY", "discovery_rate": 0.85,
"replay_valid": true, "signature": "G4:0.707:..."
},
"compare": {
"shared_primitives": 38, "only_a": 4, "only_b": 6,
"jaccard": 0.792, "overlap_coefficient": 0.905
},
"semantic": {
"status": "ok",
"converges": true,
"noise_units": [
{
"noise_type": "LineEndingCrlf", "layer": "semantic",
"function": "normalize", "operation": "strip_cr", "confidence": 1.0
}
],
"decision_hash": "a1b2c3d4...",
"gates": {
"coherence": true, "consistency": true,
"causality": true, "persistence": true
}
},
"effective_symbol_length_mode": "auto_curve",
"trust_boundary": { ... },
"engine_version": "3.0-rust"
}
The noise_units array lists each artefact that was filtered. If converges is false, the differences are structural, not just noise.
Licences
Licences are signed 30-day tokens that unlock tier-scoped features such as ben.ask and bob.query. The local wheel caches the signed token during python -m ufm activate, then checks it locally on normal engine calls.
POST /v1/licences/verify
Mint or refresh a signed licence token for the caller's API key. Authenticated with X-Api-Key (not JWT). Rate-limited to 60 requests per hour.
Response:
- token.payload.public_id (string) Public id from the API key used for activation
- token.payload.scopes (string[]) Granted local feature scopes, for example ben.ask and bob.query
- token.payload.tier (string) Customer tier
- token.payload.expires_at (string, ISO 8601) Token expiry timestamp
- token.signature (string) Base64 signature over the payload
GET /v1/licences/me
Return the caller's current licence status without minting a new token.
- active (boolean) True iff a non-revoked, non-expired licence exists
- tier (string) Subscription tier (e.g. standard)
- expires_at (string, ISO 8601, or null) Expiry timestamp of the active licence, or null if none exists
- revoked (boolean) True if the licence has been administratively revoked
POST /v1/licences/revoke/{licence_id}
Admin-only. Revoke a specific licence by its UUID. Returns { revoked: true, licence_id }.
The local wheel calls /verify during python -m ufm activate ufm_live_... and then ufm.licence_status() reads the cached status locally.
Python example
import os
import requests
# X-Api-Key authentication, not JWT
headers = {"X-Api-Key": os.environ["SPECTR_API_KEY"]}
resp = requests.post(f"{BASE}/v1/licences/verify", headers=headers)
token = resp.json()["token"]
print(token["payload"]["tier"])
print(token["payload"]["scopes"])
status = requests.get(f"{BASE}/v1/licences/me", headers=headers).json()
print(status["active"], status["expires_at"])LLM credentials
Store your Anthropic, OpenAI, or Google Gemini API key for Ben and Bob. The server encrypts keys at rest. You can also manage these from the dashboard Settings page.
GET /v1/me/llm-credentials
Returns whether a key is configured and the chosen provider and model name. Never returns the secret.
- configured (boolean) True when this account has stored LLM credentials
- provider (string | null) Stored provider: "anthropic", "openai", or "gemini"
- model (string | null) Configured model name, or the provider default
PUT /v1/me/llm-credentials
- provider (string) One of: "anthropic", "openai", "gemini"
- api_key (string) Provider API key (required on every update)
- model (string, optional) Model id; omit for the provider default
If the server is not configured with LLM_CREDENTIALS_ENCRYPTION_KEY, PUT returns 503 and keys cannot be stored.
DELETE /v1/me/llm-credentials
Removes stored credentials for the current user.
Python example
import os
import requests
headers = {"X-Api-Key": API_KEY}
before = requests.get(
f"{BASE}/v1/me/llm-credentials",
headers=headers,
).json()
print(before["configured"])
r = requests.put(
f"{BASE}/v1/me/llm-credentials",
json={
"provider": "openai",
"api_key": os.environ["OPENAI_API_KEY"],
"model": "gpt-4o",
},
headers=headers,
)
print(r.json())
removed = requests.delete(
f"{BASE}/v1/me/llm-credentials",
headers=headers,
).json()
print(removed["configured"])POST /v1/ben/ask
Send a message to Ben, the UFM research assistant. Conversations are persisted per-user, so pass the returned conversation_id when you want a follow-up turn in the same conversation.
Store a provider key with PUT /v1/me/llm-credentials before calling this endpoint. Locally, use BenSession or ask_ben with an LLM backend.
Request body
- message (string) Your question or message for Ben (1-20,000 chars)
- conversation_id (string, optional) UUID of an existing conversation to continue
Response
- conversation_id (string) UUID of the conversation (new or existing)
- conversation_title (string) Auto-generated title from the first message
- response (string) Ben's assistant response text
- token_estimate (integer) Conservative token estimate for the exchange
- token_warning (boolean) True when the token estimate exceeds 6,000
- user_message (object) Persisted user message with id, content, created_at
- assistant_message (object) Persisted assistant message with id, content, created_at
Python example
import requests
# Start a new conversation
resp = requests.post(
f"{BASE}/v1/ben/ask",
json={"message": "What is UFM actually doing?"},
headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["response"])
conversation_id = body["conversation_id"]
# Continue the conversation
resp = requests.post(
f"{BASE}/v1/ben/ask",
json={
"message": "How does replay work?",
"conversation_id": conversation_id,
},
headers={"X-Api-Key": API_KEY},
)
print(resp.json()["response"])GET /v1/ben/ledger/status
Returns Ben's projection ledger health, skill index, and last-updated timestamp. Use it as a product health check before starting a user session.
- projections (object) Projection file counts and SHA-256 values
- skill_index (object) Capability count and capability names
- health (object) Open contradiction and failed pattern recurrence counts
- last_updated_at (string | null) Timestamp of the loaded Ben data
status = requests.get(
f"{BASE}/v1/ben/ledger/status",
headers={"X-Api-Key": API_KEY},
).json()
print(status["skill_index"]["capability_count"])POST /v1/bob/query
Run Bob's grounded query pipeline: tokenise the question, apply claim gates, retrieve replay-anchored evidence, monitor out-of-vocabulary terms, and return audit output.
Request body
- question (string) User question for Bob (1-20,000 chars)
- candidate_response (string, optional) Draft response to monitor and gate
- mode (string, optional) Only "advisory" is accepted; default is "advisory"
- governed (boolean, optional) Governed-path metadata flag; gate outcomes still annotate the response
- max_anchors (integer, optional) Maximum replay anchors returned (1-20, default: 5)
Response highlights
- response (string) Final response after gate and OOV policy
- tokenise (object) Query seed, status, discovery_rate, and query_oov_ratio
- gate.status (string) Claim gate result: "PASS", "WARN", or "BLOCK"
- evidence (object[]) Replay-anchored evidence rows with source_id, source_path, seed, claim_ids, phase_numbers, and snippet
- oov_classification (string) OOV policy result: "PASS", "WARN", or "BLOCK"
- oov_metrics (object) Out-of-vocabulary metrics used by the policy
- audit (object) DecisionPipeline audit status, decision_hash, stage, reason, and gates
- boundary_flags (string[]) User-visible boundary indicators
Python example
import requests
resp = requests.post(
f"{BASE}/v1/bob/query",
json={
"question": "Explain C-CORE-002 replay invariant.",
"governed": True,
"max_anchors": 5,
},
headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
body = resp.json()
print(body["gate"]["status"], body["oov_classification"])
print(body["boundary_flags"])
print(body["response"])POST /v1/bob/ask and POST /v1/bob/chat
These routes run Bob as a persisted chat conversation. /v1/bob/ask is the primary chat route; /v1/bob/chat is a backward-compatible alias with the same behaviour.
- message (string) Message for Bob (1-20,000 chars)
- conversation_id (string, optional) Existing conversation id to continue
- mode (string, optional) Only "advisory" is accepted
- governed (boolean, optional) Governed-path metadata flag
chat = requests.post(
f"{BASE}/v1/bob/ask",
json={"message": "What does the replay invariant mean?"},
headers={"X-Api-Key": API_KEY},
).json()
print(chat["response"])
print(chat["query"]["gate"]["status"])GET /v1/bob/ledger/status
Returns Bob's corpus status, session decision count, OOV thresholds, whether reinforcement is enabled, and engine version.
- corpus (object) Source count, claim count, built_at, and corpus_index_sha256
- session_decisions (object) Total decisions and last decision timestamp
- oov (object) Warn/block thresholds, symbol length, and calibration status
- reinforce_enabled (boolean) Whether admin reinforcement is enabled
- engine_version (string) Engine version string
status = requests.get(
f"{BASE}/v1/bob/ledger/status",
headers={"X-Api-Key": API_KEY},
).json()
print(status["corpus"]["claim_count"])
print(status["oov"]["warn_threshold"])POST /v1/process
Ingest binary data and persist it to your ledger. Returns structural analysis including a deterministic seed, discovery rate, and replay validity. The seed can be used with /v1/replay/{seed} to retrieve the original data later.
Use /v1/process when you need persisted ingestion (data stored for later replay). Use /v1/engine when you want a full structural analysis without persistence.
Request body
- data_b64 (string) Base64-encoded binary data to ingest
- symbol_length_mode (string, optional) Engine mode. Default: "auto_curve". Options: auto, auto_curve, entropy, fixed8, fixed16, fixed24, fixed32, fixed64
Response fields
- seed (integer) Deterministic seed derived from geometric signature and primitive count
- status (string) "NOVELTY" if new structure discovered, "REPLAY" if all primitives already known
- discovery_rate (float) Fraction of primitives that were novel (0.0 to 1.0)
- symbol_length (integer) Symbol length used (fixed modes return the requested width)
- primitive_count (integer) Total unique primitives in the decomposition
- timeline_length (integer) Number of entries in the ingestion timeline
- reuse_ratio (float) Fraction of timeline entries that reused existing primitives
- replay_valid (boolean) Whether replay(ingest(data)) == data
- signature (string) Multi-scale geometric signature encoding the structural profile (see below)
- selection_model (object | null) Selection statistics: total_selections, selections_created, selections_reused
Signature format (structural profile)
The signature field encodes the geometric profile at multiple scale windows. Each scale produces a component in the format:
G{W}:{spread}:{sx}:{sy}:{mass}:{centroid}:{topo}
Where W is the scale width and the six values are:
- spread (float) RMS radius from centroid - overall spatial extent
- sx (float) X-axis standard deviation
- sy (float) Y-axis standard deviation
- mass (integer) Count of set bits at this scale
- centroid (float) Sum of centroid coordinates (cx + cy)
- topo (float) Topological score with pre-bits weighting
Scale windows are [4, 8, 16, 32, 64] for inputs over 16 bits. Components are separated by |. Parse these to build a per-scale structural profile for cross-input comparison.
Python example - fixed-32 mode with signature parsing
import base64
import requests
API_KEY = "ufm_live_a1b2c3d4.your_secret_here"
BASE = "https://api.spectrengine.com"
data = b"Hello World"
resp = requests.post(
f"{BASE}/v1/process",
json={
"data_b64": base64.b64encode(data).decode(),
"symbol_length_mode": "fixed32",
},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Seed: {result['seed']}")
print(f"Status: {result['status']}")
print(f"Discovery rate: {result['discovery_rate']:.4f}")
print(f"Symbol length: {result['symbol_length']}")
print(f"Signature: {result['signature']}")File input helper
import base64
import pathlib
import requests
data = pathlib.Path("myfile.bin").read_bytes()
resp = requests.post(
f"{BASE}/v1/process",
json={
"data_b64": base64.b64encode(data).decode(),
"symbol_length_mode": "auto_curve",
},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["seed"], result["status"], result["replay_valid"])POST /v1/ingest/async
Queue a payload for ingestion by the background worker instead of waiting for the engine to run inline. Returns 202 Accepted with a job_id that you can poll with GET /v1/ingests/{job_id}.
Request body
- data_b64 (string) Raw bytes, base64-encoded (up to 100 MB)
- symbol_length_mode (string, optional) Default: "auto_curve"
Poll GET /v1/ingests/{job_id} to check status.
Response
- job_id (string, UUID) The ingest_jobs row id
- status (string) Always "queued" on success
Python example
import base64
import requests
queued = requests.post(
f"{BASE}/v1/ingest/async",
json={"data_b64": base64.b64encode(b"large payload").decode()},
headers={"X-Api-Key": API_KEY},
).json()
print(queued["job_id"], queued["status"])GET /v1/ingests/{job_id}
Poll the status of one of your async ingest jobs. Returns 404 if the job does not exist or belongs to another user.
Response fields
- job_id (string, UUID) The ingest_jobs row id
- status (string) "queued" | "running" | "ok" | "error"
- symbol_length_mode (string) Mode used for this job
- payload_bytes (integer) Byte count of the decoded payload
- retries (integer) Retry count to date
- queued_at (timestamp) When the job was accepted
- started_at (timestamp | null) Worker pickup time
- completed_at (timestamp | null) Worker finish time
- result (object | null) Engine result dict when status is "ok" (matches /v1/process response)
- error (string | null) Error message when status is "error"
Python example
job_id = queued["job_id"]
job = requests.get(
f"{BASE}/v1/ingests/{job_id}",
headers={"X-Api-Key": API_KEY},
).json()
print(job["status"])
if job["status"] == "ok":
print(job["result"]["seed"])
elif job["status"] == "error":
print(job["error"])GET /v1/ingests
List your own async ingest jobs in reverse-chronological order.
Query parameters
- limit (integer, optional) Max rows to return (1-100). Default: 100
- before (timestamp, optional) Cursor: return rows queued before this time
Response fields
- jobs (object[]) Array of job views; see /v1/ingests/{job_id} for field details
Python example
jobs = requests.get(
f"{BASE}/v1/ingests",
params={"limit": 25},
headers={"X-Api-Key": API_KEY},
).json()["jobs"]
for job in jobs:
print(job["job_id"], job["status"], job["queued_at"])POST /v1/reconstruct
Verify that data survives a full ingest-replay round trip. Returns whether the reconstruction is lossless.
Request body
- data_b64 (string) Base64-encoded binary data to verify
Response fields
- valid (boolean) true if replay(ingest(data)) == data exactly
Python example
import base64, requests
resp = requests.post(
f"{BASE}/v1/reconstruct",
json={"data_b64": base64.b64encode(b"test data").decode()},
headers={"X-Api-Key": API_KEY},
)
print(resp.json()) # {"valid": true}GET /v1/replay/{seed}
Replay previously ingested data by its seed. Returns the reconstructed byte chunks as base64-encoded strings.
Path parameters
- seed (integer) Seed returned from a persisted ingestion path (typically /process or /process/universal)
Response fields
- seed (integer) Echo of the requested seed
- chunks_b64 (string[]) Reconstructed data chunks, each base64-encoded
Python example
import base64, requests
seed = 8472619 # from a previous /process response
resp = requests.get(
f"{BASE}/v1/replay/{seed}",
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
for chunk_b64 in result["chunks_b64"]:
print(base64.b64decode(chunk_b64))
JavaScript example
const seed = 8472619;
const resp = await fetch(`${BASE}/v1/replay/${seed}`, {
headers: { "X-Api-Key": API_KEY },
});
const result = await resp.json();
const chunks = result.chunks_b64.map(chunk => {
const raw = atob(chunk);
return Uint8Array.from(raw, c => c.charCodeAt(0));
});
console.log(chunks);
POST /v1/compare
Compare two data items by structural primitive overlap. Returns Jaccard similarity, shared/exclusive primitive counts, and geometric signatures for both inputs.
Request body
- data_a_b64 (string) First input, base64-encoded
- data_b64 (string) Second input, base64-encoded
- symbol_length_mode (string, optional) Default: "auto_curve"
Response fields
- shared_primitives (integer) Primitives present in both inputs
- only_a (integer) Primitives exclusive to first input
- only_b (integer) Primitives exclusive to second input
- jaccard (float) Jaccard similarity (shared / union)
- overlap_coefficient (float) Overlap coefficient (shared / min)
- signature_a (string) Geometric signature for first input
- signature_b (string) Geometric signature for second input
- discovery_rate_a (float) Discovery rate for first input
- discovery_rate_b (float) Discovery rate for second input
- symbol_length_a (integer) Symbol length selected for first input
- symbol_length_b (integer) Symbol length selected for second input
Python example
import base64
import requests
resp = requests.post(
f"{BASE}/v1/compare",
json={
"data_a_b64": base64.b64encode(b"Hello").decode(),
"data_b64": base64.b64encode(b"World").decode(),
},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(result["jaccard"])
print(result["shared_primitives"])POST /v1/structural_profile
Compute the canonical 5D structural complexity profile for one input at a fixed symbol width. This is the engine-side reference implementation of the Phase-94 profile used by the C-DISCOV-045 claim test.
Returns alpha (vocabulary growth exponent), s_zipf (Zipf exponent of the frequency distribution), v_size (vocabulary size), reuse (1 − v_size / n_syms), and discovery_integral (count of unique symbols encountered).
Request body
- data_b64 (string) Raw bytes, base64-encoded
- symbol_width (integer, optional) Symbol width in bits, 1-128. Default: 16
- sizes_for_alpha (integer[], optional) Prefix sizes in bytes for the growth-exponent fit. Default: [512, 1024, 2048, 4096, 8192, 16384, 32768]
Response fields
- alpha (float) Vocabulary growth exponent (log-log fit of |V| vs N)
- s_zipf (float) Zipf exponent of the frequency distribution
- v_size (integer) Vocabulary size (distinct symbols at this width)
- reuse (float) 1 - (v_size / n_syms)
- discovery_integral (integer) Count of unique symbols encountered
- symbol_width (integer) Symbol width used for this call
- engine (string) Always "ufm_core_structural_profile" on the native engine path
cURL example
curl -X POST https://api.spectrengine.com/v1/structural_profile \
-H "Content-Type: application/json" \
-H "X-Api-Key: YOUR_API_KEY" \
-d '{"data_b64": "QVRDR0FUQ0dBVENHQVRDRw==", "symbol_width": 16}'Python example: fixed-16 and fixed-32 per window
import base64, requests
def window_meta(window_bytes: bytes) -> dict:
encoded = base64.b64encode(window_bytes).decode()
out = {}
for width in (16, 32):
resp = requests.post(
f"{BASE}/v1/structural_profile",
json={"data_b64": encoded, "symbol_width": width},
headers={"X-Api-Key": API_KEY},
)
out[f"meta_fixed{width}"] = resp.json()
return out
The engine field reports "ufm_core_structural_profile" when the native engine computed the profile. A different value indicates a non-canonical fallback path; treat those numbers as unverified.
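A small guard on top of window_meta, if you want to reject fallback results automatically (the helper name and the choice to raise are illustrative):
def canonical_profile(window_bytes: bytes, width: int = 16) -> dict:
    # Illustrative helper: accept only native-engine profiles.
    profile = window_meta(window_bytes)[f"meta_fixed{width}"]
    if profile.get("engine") != "ufm_core_structural_profile":
        raise ValueError("non-canonical structural profile; treat as unverified")
    return profile
POST /v1/batch/process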
Process multiple data items in a single request. Items are processed in parallel for maximum throughput. Maximum 100 items per request.
Request body
- items_b64 (string[]) Array of base64-encoded data items (1-100)
- symbol_length_mode (string, optional) Default: "auto_curve"
Response fields
- results (object[]) Array of results (same fields as /v1/process)
- total_items (integer) Number of items processed
- processing_ms (integer) Total wall-clock processing time
Python example
import base64, requests
items = [b"file_one", b"file_two", b"file_three"]
resp = requests.post(
f"{BASE}/v1/batch/process",
json={"items_b64": [base64.b64encode(i).decode() for i in items]},
headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
print(f"Seed {r['seed']}: {r['status']} (rate={r['discovery_rate']:.2f})")POST /v1/batch/engine
Run the full engine pipeline on up to 50 items in a single request. Returns core, universal, timeline, frequency, discovery, and optional comparison for each item. Per-item error handling: one bad item does not abort the batch.
Request body
- items (object[]) Array of engine items (1-50)
- items[].data_b64 (string) Input data, base64-encoded
- items[].compare_b64 (string, optional) Comparison target, base64-encoded
- items[].symbol_length_mode (string, optional) Default: "auto_curve"
- items[].verify (boolean, optional) Enable replay verification (default: true)
- items[].max_lag (integer, optional) ACF max lag (default: 50)
- items[].top_n (integer, optional) Top-N primitives (default: 20)
- include (string[], optional) Layers to include. Default: all. Options: "core", "universal", "timeline", "frequency", "discovery", "comparison"
Response fields
- results (object[]) Full engine result per item
- results[].index (integer) Position in input array
- results[].core (object | null) Core ingestion result
- results[].universal (object | null) Universal pipeline result
- results[].timeline (object | null) Timeline analysis
- results[].frequency (object | null) Frequency analysis
- results[].discovery (object | null) Discovery sequence
- results[].comparison (object | null) Structural comparison (if compare_b64 given)
- results[].error (string | null) Error message if this item failed
- total_items (integer) Number of items submitted
- succeeded (integer) Items processed successfully
- failed (integer) Items that failed
- processing_ms (integer) Total wall-clock processing time
Python example
import base64, requests
items = [b"data_one", b"data_two", b"data_three"]
resp = requests.post(
f"{BASE}/v1/batch/engine",
json={
"items": [{"data_b64": base64.b64encode(i).decode()} for i in items],
"include": ["core", "timeline"], # skip layers you don't need
},
headers={"X-Api-Key": API_KEY},
)
data = resp.json()
print(f"Succeeded: {data['succeeded']}/{data['total_items']}")
for r in data["results"]:
if r["error"]:
print(f" [{r['index']}] ERROR: {r['error']}")
else:
print(f" [{r['index']}] seed={r['core']['seed']} dr={r['core']['discovery_rate']:.2f}")POST /v1/batch/compare
Compare multiple data pairs structurally in a single request. Maximum 100 pairs. Per-pair error handling: one bad pair does not abort the batch.
Request body
- pairs (object[]) Array of comparison pairs (1-100)
- pairs[].data_a_b64 (string) First input, base64-encoded
- pairs[].data_b64 (string) Second input, base64-encoded
- pairs[].symbol_length_mode (string, optional) Default: "auto_curve"
Response fields
- results (object[]) Comparison result per pair
- results[].index (integer) Position in input array
- results[].shared_primitives (integer | null) Primitives in both inputs
- results[].only_a (integer | null) Primitives exclusive to first input
- results[].only_b (integer | null) Primitives exclusive to second input
- results[].jaccard (float | null) Jaccard similarity
- results[].overlap_coefficient (float | null) Overlap coefficient
- results[].error (string | null) Error message if this pair failed
- total_pairs (integer) Number of pairs submitted
- succeeded (integer) Pairs compared successfully
- failed (integer) Pairs that failed
- processing_ms (integer) Total wall-clock processing time
Python example
import base64, requests
pairs = [(b"source_1", b"target_1"), (b"source_2", b"target_2")]
resp = requests.post(
f"{BASE}/v1/batch/compare",
json={
"pairs": [
{"data_a_b64": base64.b64encode(a).decode(),
"data_b64": base64.b64encode(b).decode()}
for a, b in pairs
],
},
headers={"X-Api-Key": API_KEY},
)
for r in resp.json()["results"]:
if r["error"]:
print(f" [{r['index']}] ERROR: {r['error']}")
else:
print(f" [{r['index']}] jaccard={r['jaccard']:.3f}")POST /v1/noise/detect
Detect deterministic noise artefacts between a source and target byte pair. Identifies transformations like BOM insertion, line-ending changes, and Base64 encoding.
Request body
- source_b64 (string) Original data, base64-encoded
- target_b64 (string) Transformed data, base64-encoded
Response fields
- converges (boolean) Whether source and target converge after normalisation
- noise_units (object[]) Detected noise artefacts with type, layer, operation
- decision_hash (string) Method-layer decision hash for audit-chain tracing
- gates (object) Anti-drift gate results (coherence, consistency, causality, persistence)
Python example
import base64
import requests
source = b"Hello World\n"
target = b"Hello World\r\n"
result = requests.post(
f"{BASE}/v1/noise/detect",
json={
"source_b64": base64.b64encode(source).decode(),
"target_b64": base64.b64encode(target).decode(),
},
headers={"X-Api-Key": API_KEY},
).json()
print(result["converges"])
print(result["noise_units"])Compares "Hello World" with "Hello World\r\n". Detects a LineEndingCrlf noise artefact: the CRLF line ending is noise, not a structural change.
POST /v1/noise/delta
Return the full 12-field noise delta between a source and target, plus a reversibility summary. Unlike /v1/noise/detect, every response includes source_span, target_span, primitive_span, and reversibility_meta.
Request body
- source_b64 (string) Source bytes, base64-encoded
- target_b64 (string) Target bytes, base64-encoded
- enabled_noise_classes (string[], optional) Restrict detection to these class IDs
- strict_allowlist (boolean, optional) If true, reject unknown class IDs. Default: false
Response fields
- converges (boolean) normalize(source) == target
- noise_units (object[]) Each unit includes all 12 engine fields
- summary (object) { total_bytes_changed, noise_class_counts, reversibility_breakdown }
- decision_hash (string) Method-layer decision hash
- gates (object) Anti-drift gate results
Python example
source = b"Hello"
target = b"\xef\xbb\xbfHello"
delta = requests.post(
f"{BASE}/v1/noise/delta",
json={
"source_b64": base64.b64encode(source).decode(),
"target_b64": base64.b64encode(target).decode(),
},
headers={"X-Api-Key": API_KEY},
).json()
print(delta["summary"])
print(delta["noise_units"])GET /v1/noise/capabilities
List all available noise detection classes supported by the engine.
Response fields
- capabilities (object[]) Array of {id, label} for each noise class
- engine_version (string) Engine version
Python example
caps = requests.get(
f"{BASE}/v1/noise/capabilities",
headers={"X-Api-Key": API_KEY},
).json()
print(caps["engine_version"])
print(caps["capabilities"])
# Example shape:
# {"capabilities": [{"id": "bom_utf8", "label": "BOM UTF-8"}, ...]}POST /v1/semantic/analyze
Run full semantic analysis between two byte sequences. Determines whether the inputs converge (are structurally identical after noise removal) and classifies each noise artefact found. Supports policy control to restrict which noise classes are detected.
Use this when you need more control than /v1/pipeline provides, for example to restrict detection to specific noise classes or use strict allowlisting. For most pair comparisons, /v1/pipeline is simpler.
Request body
- source_b64 (string) Source data, base64-encoded
- target_b64 (string) Target data, base64-encoded
- enabled_noise_classes (string[], optional) Filter to specific noise class ids, for example ["bom_utf8", "line_ending_crlf"]
- strict_allowlist (boolean, optional) If true, only detect listed classes. Default: false
Response fields
status(string) "ok" or "rejected" (if anti-drift gates fail)converges(boolean) Whether normalised forms are identicalnoise_units(object[]) Detected noise artefactsdecision_hash(string) Method-layer decision hashgates(object) Anti-drift gate results
Compares "Hello" with "Hello" prefixed by a UTF-8 BOM. Returns converges: true with a BomUtf8 noise unit, meaning the data is functionally identical despite the byte-level difference.
Python example: strict allowlist
import base64, requests
resp = requests.post(
f"{BASE}/v1/semantic/analyze",
json={
"source_b64": base64.b64encode(b"Hello").decode(),
"target_b64": base64.b64encode(b"\xef\xbb\xbfHello").decode(),
"enabled_noise_classes": ["bom_utf8"], # Only detect BOM noise
"strict_allowlist": True,
},
headers={"X-Api-Key": API_KEY},
)
print(resp.json()["converges"]) # True
print(resp.json()["noise_units"]) # [{noise_type: "BomUtf8", ...}]GET /v1/corpus/list
List your recent ingests. This endpoint reads your own ledger only and never exposes another user’s data or Bob’s internal claim corpus.
Query parameters
- limit (integer, optional) Max rows to return (1-500). Default: 100
- before (string, optional) Return rows created before this ISO-8601 timestamp
Response fields
- records (object[]) Array of ingest records: request_id, endpoint, created_at, status_code, discovery_rate, symbol_length, primitive_count, timeline_length
Python example
records = requests.get(
f"{BASE}/v1/corpus/list",
params={"limit": 50},
headers={"X-Api-Key": API_KEY},
).json()["records"]
for row in records:
print(row["created_at"], row["endpoint"], row["status_code"])POST /v1/corpus/build
Ingest multiple documents into your own ledger in a single call. Returns the per-document seeds, a pairwise Jaccard matrix across the documents, and an updated summary of your ledger.
Request body
- documents (object[]): 1-50 items of { data_b64 }
- symbol_length_mode (string, optional): Default: "auto_curve"
Response fields
- seeds (integer[]): Structural seed per document, in input order
- jaccard_matrix (float[][]): Dense symmetric similarity matrix, diagonal = 1.0
- ledger_summary (object): { primitive_count, timeline_length, discovery_rate, symbol_length, run_count }
- processing_ms (integer): Total server-side processing time
Python example
import base64, requests
docs = [b"first document", b"second document"]
body = {
"documents": [
{"data_b64": base64.b64encode(doc).decode()}
for doc in docs
],
"symbol_length_mode": "auto_curve",
}
result = requests.post(
f"{BASE}/v1/corpus/build",
json=body,
headers={"X-Api-Key": API_KEY},
).json()
print(result["seeds"])
print(result["jaccard_matrix"])
print(result["ledger_summary"]["run_count"])GET /v1/corpus/metadata
GET /v1/corpus/metadata

Return the summary of your own ledger. Values are zero until you have ingested something.
Response fields
- primitive_count (integer): Distinct primitives in your ledger
- timeline_length (integer): Total ledger timeline length
- discovery_rate (float): New primitives / total selections
- symbol_length (integer | null): Last symbol length used
- run_count (integer): Distinct ingest runs recorded against this ledger
Python example
summary = requests.get(
f"{BASE}/v1/corpus/metadata",
headers={"X-Api-Key": API_KEY},
).json()
print(summary["primitive_count"], summary["run_count"])GET /v1/corpus/export
Download your own ledger.bin as a binary attachment. Returns 404 if your ledger does not exist yet.
Python example
resp = requests.get(
f"{BASE}/v1/corpus/export",
headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()
with open("ledger-export.bin", "wb") as f:
    f.write(resp.content)

POST /v1/corpus/import
Replace your own ledger with an uploaded binary. The previous ledger is atomically renamed to ledger.bin.bak before the replacement becomes visible, so a crash cannot leave a half-written ledger.
Upload
multipart/form-data; the file must start with the UFMR magic header and be no larger than 100 MB.
Response fields
- replaced (boolean): Always true on success
- backup_path (string | null): Server path of the .bak file, or null if there was no prior ledger
- bytes_written (integer): Size of the new ledger
- new_stats (object): Summary after replacement
Python example
with open("ledger-export.bin", "rb") as f:
result = requests.post(
f"{BASE}/v1/corpus/import",
files={"file": ("ledger-export.bin", f, "application/octet-stream")},
headers={"X-Api-Key": API_KEY},
).json()
print(result["replaced"])
print(result["new_stats"]["primitive_count"])POST /v1/analyze/timeline
POST /v1/analyze/timeline

Analyse the temporal structure of ingested data: autocorrelation, segment boundaries, and structural transitions. Useful for detecting periodicity and regime changes.
Request body
- data_b64 (string): Base64-encoded data
- symbol_length_mode (string, optional): Default: "auto_curve"
- max_lag (integer, optional): Maximum ACF lag (1-500, default: 50)
- window_size (integer, optional): Segment window size (10-10000, default: 100)
- transition_threshold (float, optional): Rate-change threshold (0-1, default: 0.1)
Response fields
- acf (float[]): Autocorrelation coefficients at lags 1..max_lag
- segments (object[]): Per-segment discovery stats (start, end, rate, primitives)
- transitions (integer[]): Indices where the discovery rate changes significantly
- primitive_count (integer): Total unique primitives
- timeline_length (integer): Timeline entries
- discovery_rate (float): Overall discovery rate
Python example
result = requests.post(
f"{BASE}/v1/analyze/timeline",
json={
"data_b64": base64.b64encode(b"abcabcabc").decode(),
"max_lag": 20,
},
headers={"X-Api-Key": API_KEY},
).json()
print(result["acf"])
print(result["segments"])POST /v1/analyze/frequency
POST /v1/analyze/frequency

Analyse the primitive frequency distribution. Returns a full histogram sorted by frequency and the top-N most common primitives.
Request body
- data_b64 (string): Base64-encoded data
- symbol_length_mode (string, optional): Default: "auto_curve"
- top_n (integer, optional): Number of top primitives to return (1-1000, default: 20)
Response fields
- histogram (object[]): All primitives with {primitive_id, count}, sorted by count
- top_n (object[]): Most frequent primitives
- total_primitives (integer): Total unique primitives
- timeline_length (integer): Timeline entries
Python example
result = requests.post(
f"{BASE}/v1/analyze/frequency",
json={
"data_b64": base64.b64encode(b"abcabcabc").decode(),
"top_n": 5,
},
headers={"X-Api-Key": API_KEY},
).json()
print(result["top_n"])
print(result["total_primitives"])POST /v1/analyze/discovery
POST /v1/analyze/discovery

Retrieve the discovery sequence (the order in which primitives were first encountered) and structural metrics. Useful for understanding how novelty evolves across data.
Request body
- data_b64 (string): Base64-encoded data
- symbol_length_mode (string, optional): Default: "auto_curve"
Response fields
- discovery_sequence (integer[]): Primitive IDs in discovery order
- discovery_rate (float): Novelty fraction (0.0 to 1.0)
- primitive_count (integer): Total unique primitives
- timeline_length (integer): Timeline entries
- symbol_length (integer): Symbol width used
Python example
result = requests.post(
f"{BASE}/v1/analyze/discovery",
json={"data_b64": base64.b64encode(b"abcabcabc").decode()},
headers={"X-Api-Key": API_KEY},
).json()
print(result["discovery_sequence"])
print(result["discovery_rate"])POST /v1/analyze/symbol_length
Report the engine-selected symbol length for the given bytes and the selector metadata (entropy of the chosen length, the mode used, and the sample size consumed). Useful for diagnosing why a corpus produced the primitive count it did.
Request body
- data_b64 (string): Raw bytes, base64-encoded
Response fields
- selected_length (integer): Symbol length (in bits) the engine chose
- entropy_at_selected (float): Shannon entropy at the selected length
- mode (string): Selector mode (e.g. "AutoCurve", "Entropy", "Fixed(8)")
- sample_bits_used (integer): Number of bits sampled during selection
Python example
result = requests.post(
f"{BASE}/v1/analyze/symbol_length",
json={"data_b64": base64.b64encode(b"Hello World").decode()},
headers={"X-Api-Key": API_KEY},
).json()
print(result["selected_length"])
print(result["entropy_at_selected"])POST /v1/process/universal
Run the 7-stage universal pipeline with quality gates. The pipeline validates, analyses, and verifies your data through the stages VALIDATE → METRICS → STRATEGY → EXECUTE → VERIFY → ADAPT → OUTPUT.
Request body
- data_b64 (string): Input data, base64-encoded
- verify (boolean, optional): Run replay verification (default: true). Set false for bulk speed.
Response fields
- success (boolean): Whether ingestion and the stage flow completed (replay proof requires replay_valid = true)
- seed (integer): Deterministic seed from the geometric signature
- quality (object): Quality metrics: replay_valid, discovery_rate, reuse_ratio, deterministic
- replay_valid (boolean): Whether the replay invariant holds
- stages_completed (string[]): All 7 stage names in order
- metrics (object): Pre-ingest byte metrics: byte_entropy, unique_byte_ratio, size_class, input_length
- strategy (object): Selected processing strategy: symbol_length_mode, zero_point
- execute (object): Execution results: seed, status, discovery_rate, primitive_count, timeline_length, reuse_ratio
- field_reach (object): Per-seed field connectivity: primitive_count, shared_with_others, unique_to_seed, total_ledger_primitives, reach_ratio, isolation_ratio
- violations (string[]): Any warnings or errors
- engine_version (string): Engine version
Python example
import base64, requests
resp = requests.post(
f"{BASE}/v1/process/universal",
json={
"data_b64": base64.b64encode(b"your data here").decode(),
"verify": True,
},
headers={"X-Api-Key": API_KEY},
)
result = resp.json()
print(f"Success: {result['success']}")
print(f"Seed: {result['seed']}")
print(f"Replay valid: {result['replay_valid']}")
print(f"Stages: {result['stages_completed']}")
print(f"Strategy: {result['strategy']}")
print(f"Quality: {result['quality']}")GET /v1/health
GET /v1/health

Check API and engine status. No authentication required.
Response fields
- status (string): "ok" if the database is reachable, "degraded" if not
- engine_version (string): UFM engine version (e.g. 3.0-rust)
Python example
health = requests.get(f"{BASE}/v1/health").json()
print(health["status"])
print(health["engine_version"])cURL example
curl https://api.spectrengine.com/v1/health
# {"status": "ok", "engine_version": "3.0-rust"}Error Handling
All errors return a consistent JSON structure:
{
"error": {
"code": "ERROR_CODE",
"message": "Human-readable description",
"request_id": "correlation-uuid"
}
}

Error codes
| Status | Code | Meaning |
|---|---|---|
| 400 | BAD_REQUEST | Invalid base64, malformed JSON, or engine error |
| 401 | UNAUTHORIZED | Missing or invalid authentication credentials |
| 413 | PAYLOAD_TOO_LARGE | Request body exceeds 10 MB limit |
| 422 | VALIDATION_ERROR | Request body failed schema validation |
| 429 | RATE_LIMIT | Too many requests (check Retry-After header) |
| 500 | INTERNAL_ERROR | Server error (includes request_id for support) |
Handling errors in code
import base64, os, requests
BASE = "https://api.spectrengine.com"
API_KEY = os.environ["UFM_API_KEY"]
payload = {"data_b64": base64.b64encode(b"Hello World").decode()}
resp = requests.post(f"{BASE}/v1/process", json=payload,
                     headers={"X-Api-Key": API_KEY})
if resp.status_code == 200:
result = resp.json()
elif resp.status_code == 429:
retry_after = resp.headers.get("Retry-After", 60)
print(f"Rate limited. Retry in {retry_after}s")
else:
error = resp.json().get("error", {})
print(f"Error {resp.status_code}: {error.get('message')}")
print(f"Request ID: {error.get('request_id')}")Rate Limits
The following protective limits remain in place:
| Limit | Purpose |
|---|---|
| 5 requests/min on login & register (per IP) | Prevents brute-force and spam account creation |
| 1 request per 5 minutes on register (per email) | Prevents the same address being submitted repeatedly across different IPs |
| 3 requests/min on password reset request | Prevents email flooding |
| 3 requests/hour on resend-verification (per email) | Prevents verification email flooding |
| 3 requests/hour on contact form (per IP) | Prevents inbox spam |
| Cloudflare Turnstile on register, password reset, contact, resend | Bot protection. Adaptive on login: shown after 3 failed attempts within 15 minutes. |
| Email verification required before login | Confirms account ownership; blocks throwaway-email signups. |
| 10 MB max request body | Payload size protection |
| 100 items per batch request | Batch size protection |
If you exceed an abuse-prevention limit, the API returns HTTP 429 with a Retry-After header (seconds).
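Since 429 responses carry Retry-After in seconds, a simple client-side pattern is to sleep and retry. A minimal sketch, assuming API_KEY is defined as in the earlier examples:

import time, requests
def post_with_retry(url, body, max_attempts=3):
    # Retry only on 429, honouring the Retry-After header.
    for _ in range(max_attempts):
        resp = requests.post(url, json=body, headers={"X-Api-Key": API_KEY})
        if resp.status_code != 429:
            return resp
        time.sleep(int(resp.headers.get("Retry-After", 60)))
    return resp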