Local vs cloud

Every operation Midcore performs (running a robotics policy, embedding a document, generating a PDF report, charting a Monte Carlo simulation) is described in a single canonical catalog. The Compute Router consults that catalog, your machine’s live capabilities, and your preferences, then decides where to run the operation.

The five tiers

| Tier | Where it runs | Examples |
| --- | --- | --- |
| LOCAL_ONLY | Always your machine. Never billed. | Physics preview, image thumbnailing, password-derived key generation, PKB capsule export. |
| PREFER_LOCAL | Your machine when it can. Cloud when local capability is missing. | Monte Carlo simulation, document embedding, PDF extraction, OCR. |
| HYBRID | Meaningful local fast-path AND higher-fidelity cloud. User picks per-op. | τ₀-WM policy inference (local if you have an RTX 4080+; cloud otherwise), large LLM inference (local small model vs cloud premium model). |
| PREFER_CLOUD | Cloud first; local exists only as degraded fallback. | Update checks, light external API pass-through. |
| CLOUD_ONLY | Always cloud. No meaningful local path. | Multi-tenant DB writes, Stripe webhooks, external paid LLM APIs (Anthropic, OpenAI), τ₀-WM cloud fine-tune. |

The decision flow

When you invoke an operation, the router resolves a decision in roughly this order:

  1. Is the tier fixed? LOCAL_ONLY → LOCAL. CLOUD_ONLY → CLOUD. Done.
  2. Is the data sensitive? Operations flagged sensitive (passwords, biometric, raw camera feed) go LOCAL regardless of any preference.
  3. Did the user override this op? If your Account → Compute panel has a per-op override set, that wins (subject to the platform tier).
  4. What’s the default mode? If you’re on Cloud-first → CLOUD. Otherwise probe local capability against the op’s requirements.
  5. Can your machine actually run it? The router compares the op’s minimum requirements (CPU cores, free RAM, WebGPU, local model weights cached) against the live capability probe. If yes → LOCAL. If no → CLOUD (with the hold/charge path below).
  6. Hold + dispatch. For CLOUD, the router pre-flights the cost, places a balance hold, and then dispatches the operation. If the hold fails with 402 (insufficient balance) and a local fallback exists, the router demotes to LOCAL automatically.
  7. Finalize. When the operation completes, the router calls finalize: the hold is converted into a debit and a signed receipt is issued. The receipt lands on your project audit ledger.
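
Steps 1 through 6 above can be sketched as a pure decision function plus a small rule for a failed hold. This is a minimal illustration, not Midcore's implementation. Every name in it (`RouteInput`, `decide`, `afterHold`) is hypothetical. The Ask each time mode is left out, an override is returned without re-checking capability, and the `"FAILED"` outcome is an assumption, because the steps don't say what happens when a hold fails and no local fallback exists.

```typescript
// Hypothetical names throughout: none of these come from Midcore's actual API.
type Tier = "LOCAL_ONLY" | "PREFER_LOCAL" | "HYBRID" | "PREFER_CLOUD" | "CLOUD_ONLY";
type Target = "LOCAL" | "CLOUD";
type Mode = "local-first" | "cloud-first";

interface RouteInput {
  tier: Tier;            // from the operation catalog
  sensitive: boolean;    // step 2: passwords, biometrics, raw camera feed
  override?: Target;     // step 3: per-op override from the Compute panel
  mode: Mode;            // step 4: account-wide default
  localCapable: boolean; // step 5: result of the live capability probe
}

function decide(op: RouteInput): Target {
  // Step 1: fixed tiers short-circuit everything else.
  if (op.tier === "LOCAL_ONLY") return "LOCAL";
  if (op.tier === "CLOUD_ONLY") return "CLOUD";
  // Step 2: sensitive data stays local regardless of preference.
  if (op.sensitive) return "LOCAL";
  // Step 3: an explicit per-op override wins (simplified: no tier limits applied).
  if (op.override !== undefined) return op.override;
  // Step 4: Cloud-first skips the capability probe.
  if (op.mode === "cloud-first") return "CLOUD";
  // Step 5: otherwise run locally only if the machine meets the requirements.
  return op.localCapable ? "LOCAL" : "CLOUD";
}

interface HoldResult {
  ok: boolean;
  status?: number; // 402 means insufficient balance
}

// Step 6: a failed hold with status 402 demotes to LOCAL when a fallback exists.
// "FAILED" is an assumed outcome for every other hold failure.
function afterHold(
  target: Target,
  hold: HoldResult,
  hasLocalFallback: boolean,
): Target | "FAILED" {
  if (target === "LOCAL") return "LOCAL"; // local runs need no balance hold
  if (hold.ok) return "CLOUD";
  if (hold.status === 402 && hasLocalFallback) return "LOCAL";
  return "FAILED";
}
```

Keeping `decide` free of side effects makes the ordering of the rules easy to test: for instance, a sensitive HYBRID op with a CLOUD override still resolves to LOCAL, because step 2 runs before step 3.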

What determines local capability

  • CPU cores. Some ops declare a minimum (Monte Carlo: 4; some heavier sims: 8).
  • Free RAM. Same idea: the router won’t start a 4 GB local LLM on a machine with 2 GB free.
  • Free disk. Per-op model weights have to live somewhere. The capability probe measures the userData partition’s free space.
  • WebGPU. Required by a subset of ops (notably image gen, some neural inference). If your renderer doesn’t expose navigator.gpu, those ops route to cloud.
  • Local-model readiness. For ops requiring a downloaded weight (τ₀-WM, embedding models, OCR), the Compute panel reports cached/not-cached per op. Until a weight is cached, the op can’t route locally.
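
The comparison between an op's requirements and the live probe, as listed above, might look like the sketch below. The field names and both functions are hypothetical, chosen only to mirror the five bullets, and the real probe may measure more than this. The one real API referenced is `navigator.gpu`, the WebGPU entry point. The detection helper takes a navigator-like object as a parameter so it can also run outside a browser.

```typescript
// Hypothetical shapes: field names are illustrative, not Midcore's actual schema.
interface Requirements {
  minCpuCores?: number;
  minFreeRamGb?: number;
  minFreeDiskGb?: number;
  needsWebGpu?: boolean;
  needsCachedWeights?: boolean;
}

interface Capability {
  cpuCores: number;
  freeRamGb: number;
  freeDiskGb: number;
  webGpu: boolean;
  weightsCached: boolean;
}

// Returns the names of unmet requirements; an empty array means the op can run locally.
function unmetRequirements(req: Requirements, cap: Capability): string[] {
  const unmet: string[] = [];
  if (req.minCpuCores !== undefined && cap.cpuCores < req.minCpuCores) unmet.push("cpu");
  if (req.minFreeRamGb !== undefined && cap.freeRamGb < req.minFreeRamGb) unmet.push("ram");
  if (req.minFreeDiskGb !== undefined && cap.freeDiskGb < req.minFreeDiskGb) unmet.push("disk");
  if (req.needsWebGpu === true && !cap.webGpu) unmet.push("webgpu");
  if (req.needsCachedWeights === true && !cap.weightsCached) unmet.push("weights");
  return unmet;
}

// True when the navigator-like object exposes a `gpu` property.
// In a renderer this would be called as hasWebGpu(navigator).
function hasWebGpu(nav: object | undefined): boolean {
  return nav !== undefined && "gpu" in nav;
}
```

Returning the list of unmet requirements, instead of a bare boolean, would let a settings panel explain why an op routes to cloud, for example a 4 GB model on a machine with 2 GB of free RAM yields `["ram"]`.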

Hybrid: choosing local vs cloud per op

HYBRID ops have a meaningful local path and a meaningful cloud path. Examples:

  • τ₀-WM policy inference. Local: run the 5.5B model on your RTX 4080+ at ~140–220 ms per chunk. Cloud: run it on our A100s with a flat per-call charge, useful when your machine doesn’t have the GPU.
  • Embedding batch. Local: bge-small-en in ONNX, 30k tokens/sec. Cloud: larger model with slightly higher recall, per-token charge.
  • Image generation. Local: small Stable Diffusion through WebGPU. Cloud: higher-fidelity SDXL via metered call.

For HYBRID ops the router defaults to your preference (Local-first or Cloud-first). You can also pin specific ops in the Per-operation overrides table.

The local-first dial isn’t free of trade-offs

Local compute uses your CPU, RAM, and battery. A heavy Monte Carlo run might pin your fan for fifteen seconds, whereas the cloud path runs in the background and puts no such load on your machine. If you’re on battery or doing latency-sensitive work, Ask each time gives you a per-op confirmation step; Cloud-first is the strict opposite of Local-first.

Next

Now that you know where things run, Pricing breaks down what cloud compute costs, in cents and per unit.