B3IQ starts as polished local inference software and Andromeda-built appliances, then expands into a B3-powered DePIN protocol where gaming PCs, pro workstations, and verified nodes can host models, earn fees, build reputation, and settle work transparently.
B3IQ can start with customers who want useful local AI, then compound into a protocol where consumer GPUs, gaming PCs, and Andromeda-verified machines become paid inference supply. The product has to make hosting easy, make demand predictable, and make trust measurable.
The offering: host software, verified appliances, managed inference, and protocol access to a global GPU network.
The guardrails: no open remote shell, no fake GPU farming, no rewards divorced from real inference demand.
B3 already has consumer distribution, staking, appchain infrastructure, low-fee settlement, and a natural audience of GPU-heavy gamers.
This preserves Andromeda as the trusted hardware provider while making B3IQ the customer-facing intelligence platform.
Andromeda appliance: The physical inference appliance a customer plugs into their network.
B3IQ Node Agent: The deterministic, signed-command service on the node.
B3IQ Control Plane: Fleet, tenants, routing, health, jobs, keys, and billing.
B3IQ Operator: A conversational operator that plans safe actions through the Node Agent.
API gateway: OpenAI-compatible API for cloud, private, and local inference routes.
B3IQ Host: Cross-platform software for gaming PCs and workstations that want to serve or use local models.
B3IQ protocol: Node registration, stake, reputation, receipts, disputes, and settlement on B3.
Model launchpad: Signed model manifests, publisher stake, curation, benchmarks, and royalty splits.
Andromeda operations: Hardware sourcing, validation, burn-in, and factory imaging.
The next level is to generalize static inference hosts into enrolled B3IQ Nodes with identity, model inventory, tunnel state, benchmark records, and safe jobs.
Worker routes chat and image traffic through `/v1/*`, with API keys and usage tracking already in place (a minimal sketch follows this list).
Machine metrics, GPU stats, service status, latency, token rates, and alerting are already modeled.
The collector posts metrics, receives pending actions, restarts services, and reports results.
Qwen, Gemma, FLUX, and inference2 routing prove the gateway can handle heterogeneous capacity.
The current runbooks capture GPU isolation, cold-load windows, watchdog behavior, and failure modes.
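A minimal sketch of that gateway pattern, in the Cloudflare Workers style. The binding names (`API_KEYS`, `USAGE`, `UPSTREAM_URL`) and the usage-counter scheme are illustrative assumptions, not the actual Worker code:

```ts
// Minimal KV interface so the sketch is self-contained.
interface KV {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

interface Env {
  API_KEYS: KV;         // api key -> tenant id
  USAGE: KV;            // per-tenant daily request counters
  UPSTREAM_URL: string; // selected inference backend
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url);
    if (!url.pathname.startsWith("/v1/")) return new Response("not found", { status: 404 });
    if (req.method !== "POST") return new Response("method not allowed", { status: 405 });

    // API key check: "Authorization: Bearer <key>".
    const key = req.headers.get("authorization")?.replace(/^Bearer\s+/i, "") ?? "";
    const tenant = key ? await env.API_KEYS.get(key) : null;
    if (!tenant) return new Response("unauthorized", { status: 401 });

    // Forward the request body unchanged to the inference backend.
    const upstream = await fetch(env.UPSTREAM_URL + url.pathname, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: req.body,
    });

    // Usage tracking: bump a per-tenant counter keyed by day.
    const day = new Date().toISOString().slice(0, 10);
    const counterKey = `${tenant}:${day}`;
    const prev = Number((await env.USAGE.get(counterKey)) ?? "0");
    await env.USAGE.put(counterKey, String(prev + 1));

    return upstream;
  },
};
```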
The product should separate permissioned execution from AI planning. The B3IQ Operator can reason and explain; the B3IQ Node Agent enforces the allowlist.
The same B3IQ software should run on gaming PCs, Macs, Linux workstations, and Andromeda appliances. Hardware quality determines which routes the node can earn.
Local mode: The host stays on the customer LAN. It serves local apps, local users, and local API keys. No remote tunnel or token stake is required.
Managed mode: The host creates an outbound tunnel, enrolls in the B3IQ Control Plane, and receives signed operational jobs.
Network mode: The host stakes B3 and opts into making selected capacity available to the B3IQ network under explicit controls.
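One way to make the three modes concrete is a discriminated config type; the field names below are illustrative, not the actual B3IQ Host schema:

```ts
// Sketch of the three host modes. Local needs no tunnel and no stake;
// Network adds explicit, owner-set limits on what the network may use.
type HostMode =
  | { mode: "local" }                      // LAN-only serving, no enrollment
  | { mode: "managed"; tunnelUrl: string } // outbound tunnel + signed jobs
  | {
      mode: "network";
      tunnelUrl: string;
      stakeB3: bigint;     // collateral for network participation
      maxGpuShare: number; // fraction of capacity offered, 0..1
      schedule?: string;   // e.g. only serve overnight
    };
```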
A non-technical buyer should be able to get to a useful local endpoint without touching a terminal.
The factory image checks whether the node is claimed. If not, it starts setup mode, publishes a local setup page, and blocks external inference routes until ownership is established.
Setup mode should survive bad network credentials and reboot back into a recoverable state.
Example setup-page message: "B3IQ-N7K4 is ready to be claimed. Hardware scan passed."
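A sketch of the boot-time decision, with all names illustrative (the factory image and control-plane API are not specified here):

```ts
// Decide whether to boot into normal operation or recoverable setup mode.
type BootState = "claimed" | "setup";

interface ClaimApi {
  getClaim(serial: string): Promise<{ ownerConfirmed: boolean } | null>;
}

async function decideBootState(serial: string, api: ClaimApi): Promise<BootState> {
  try {
    const claim = await api.getClaim(serial);
    if (claim?.ownerConfirmed) return "claimed"; // external inference routes may open
  } catch {
    // Control plane unreachable or bad network credentials: fall through.
    // Setup mode persists nothing until a connectivity test passes, so a
    // reboot always returns to this same recoverable state.
  }
  return "setup"; // publish the local setup page, keep inference routes blocked
}
```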
The software should run on ordinary gaming PCs, but the protocol should grade them honestly. Consumer hosts create breadth; verified machines create reliability.
The wedge for scale is not only selling machines. It is letting existing GPU owners install B3IQ Host, benchmark their device, and decide whether to use it locally, join managed jobs, or stake into the network.
The sustainable version is demand-funded: users pay for inference, hosts earn for delivered work, model publishers earn royalties, and B3 captures protocol fees. Emissions should bootstrap supply, not become the business model.
Users should buy B3IQ credits with card, stablecoin, or B3. The request path stays fast offchain, while signed usage receipts settle in batches on B3. This avoids turning every token into a blockchain transaction.
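A sketch of the receipt shape and batch commitment, assuming illustrative field names and a plain hash-of-leaves digest rather than a specified settlement format:

```ts
import { createHash } from "node:crypto";

// Offchain usage receipt, signed by the serving node.
interface UsageReceipt {
  nodeId: string;
  tenantId: string;
  model: string;
  tokensIn: number;
  tokensOut: number;
  unixMs: number;
  nodeSignature: string; // over the canonical JSON of the fields above
}

// Commit one digest per batch onchain, so a single B3 transaction can
// settle thousands of requests instead of one transaction per token.
function batchDigest(receipts: UsageReceipt[]): string {
  const leaves = receipts
    .map((r) => createHash("sha256").update(JSON.stringify(r)).digest("hex"))
    .sort(); // order-independent commitment
  return createHash("sha256").update(leaves.join("")).digest("hex");
}
```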
The protocol can be useful before perfect cryptographic verification exists. The key is to label trust tiers honestly and route sensitive workloads only to nodes that meet the required guarantees.
Consumer tier: Windows, macOS, and casual Linux hosts with benchmarked hardware and reputation. Good for low-cost, low-sensitivity tasks.
Earn less, need more sampling, receive fewer premium jobs.
Verified tier: Signed B3IQ Host, measured runtime versions, model hash checks, canary prompts, and recurring benchmark challenges.
Default for managed network inference.
Attested tier: Confidential-compute capable CPU/GPU platforms where available, with remote attestation bound to node identity and stake.
Premium route for enterprise and sensitive workloads.
Redundancy checks: Multiple nodes answer sampled jobs, evaluator nodes compare outputs, and disputes can trigger deeper review.
Use for model evals, high-value jobs, and reputation recovery.
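Routing can then be a pure function of tier and redundancy; the tier ordering and job shape below are assumptions that mirror the tiers above:

```ts
// Sensitive jobs only reach nodes whose tier meets the requirement.
type TrustTier = "consumer" | "verified" | "attested";
const TIER_RANK: Record<TrustTier, number> = { consumer: 0, verified: 1, attested: 2 };

interface NodeInfo { id: string; tier: TrustTier; reputation: number; }
interface Job { minTier: TrustTier; redundancy: number; }

function selectNodes(job: Job, nodes: NodeInfo[]): NodeInfo[] {
  return nodes
    .filter((n) => TIER_RANK[n.tier] >= TIER_RANK[job.minTier])
    .sort((a, b) => b.reputation - a.reputation)
    .slice(0, job.redundancy); // redundancy > 1 when outputs get cross-checked
}
```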
Reputation aggregates uptime, latency, delivered tokens, failed jobs, disputes, model integrity, challenge results, and customer ratings, and the score decays over time.
Moving stake to a new node should not copy trust history.
Slash only when evidence is objective: spoofed hardware, tampered agent, fraudulent receipts, malicious model, or non-delivery after acceptance.
Soft quality failures should hit escrow and reputation first.
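Decay can be as simple as an exponential half-life over scored events; the half-life value here is a placeholder, not a specified parameter:

```ts
// Time-decayed reputation: recent evidence dominates, and history moved
// to a new node identity starts at zero because events are per-node.
interface ReputationEvent { score: number; unixMs: number; } // score in [-1, 1]

function reputation(events: ReputationEvent[], nowMs: number, halfLifeDays = 30): number {
  const halfLifeMs = halfLifeDays * 24 * 60 * 60 * 1000;
  let total = 0;
  for (const e of events) {
    const ageMs = nowMs - e.unixMs;
    total += e.score * Math.pow(0.5, ageMs / halfLifeMs); // older evidence counts less
  }
  return total;
}
```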
OpenHarness can power the assistant experience, but the Node Agent should remain the deterministic execution boundary.
B3IQ Operator turns intent into reviewed jobs. The Node Agent validates the job type, policy, signature, preflight checks, and rollback plan before execution.
Keep the early assistant narrow. It should make support feel intelligent without creating an open-ended remote shell.
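A sketch of that boundary, with an illustrative allowlist and job shape; signature verification is passed in rather than specified:

```ts
// The Operator proposes; the Node Agent accepts a job only when every
// deterministic check passes. No AI judgment sits on this path.
const ALLOWED_JOB_TYPES = new Set(["restart_service", "pull_model", "run_benchmark"]);

interface SignedJob {
  type: string;
  payload: unknown;
  signature: string;    // control-plane signature over type + payload
  rollbackPlan: string; // must exist before anything executes
}

function acceptJob(job: SignedJob, verifySig: (j: SignedJob) => boolean): boolean {
  if (!ALLOWED_JOB_TYPES.has(job.type)) return false; // allowlist, not reasoning
  if (!verifySig(job)) return false;                  // signed by the control plane
  if (!job.rollbackPlan) return false;                // preflight: rollback required
  return true;
}
```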
Users should be able to bring models to the B3IQ network, but every model needs provenance, licensing, safety scanning, benchmark evidence, and economic alignment before it can earn broadly.
Publisher submits a signed model manifest: artifact hashes, license claims, runtime profile, VRAM band, context window, and intended use.
Store artifacts offchain; commit hashes and metadata on B3.
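A sketch of what such a manifest might carry; field names and the runtime enum are assumptions for illustration:

```ts
// Signed model manifest: artifacts stay offchain, hashes commit on B3.
interface ModelManifest {
  name: string;
  artifactSha256: string[];     // content hashes of the weight files
  license: string;              // e.g. "apache-2.0" or a custom grant
  runtime: "vllm" | "ollama";   // illustrative runtime profile
  vramBandGb: [number, number]; // min/max VRAM the model needs
  contextWindow: number;
  intendedUse: string;
  publisherSignature: string;   // over the canonical JSON of the fields above
}
```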
Publisher stakes B3 to list the model. Higher distribution, premium categories, or closed-source weights require more collateral.
Stake discourages malware, false benchmarks, and license fraud.
Verifier nodes run install tests, safety scans, benchmark suites, and output-quality checks before the model gets recommended.
Curators can earn fees for useful evaluation work.
Hosts opt into approved models. B3IQ routes requests to nodes that have the model installed, are healthy, and are within policy.
Popular models pre-cache on Andromeda machines.
Model publishers can receive a royalty from paid inference. For open public models, the same royalty slot can route to hosts, verifiers, or the protocol treasury.
This creates a reason to publish models into B3IQ.
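A basis-point split keeps the accounting exact; the split values themselves are placeholders, not proposed tokenomics:

```ts
// Per-request fee split between host, model publisher, and protocol.
interface FeeSplitBps { host: number; publisher: number; protocol: number; }

function splitFee(feeWei: bigint, s: FeeSplitBps) {
  if (s.host + s.publisher + s.protocol !== 10_000) {
    throw new Error("split must total 10000 basis points");
  }
  const host = (feeWei * BigInt(s.host)) / 10_000n;
  const publisher = (feeWei * BigInt(s.publisher)) / 10_000n;
  // Remainder goes to the protocol so rounding dust is never lost.
  return { host, publisher, protocol: feeWei - host - publisher };
}
```

For an open model, the publisher share can simply be reassigned to hosts, verifiers, or the treasury before the split runs.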
Community and protocol governance can delist malicious models, update category rules, adjust stake requirements, and fund public-good models.
B3's existing governance surface becomes useful infrastructure.
A buyer should understand what each machine can comfortably serve. The SKU story should be conservative enough for the support team to stand behind.
The default first product: a premium local inference appliance that a small team can understand, afford, and support.
The first serious managed-node SKU. It can isolate workloads per GPU or split larger quantized models when the runtime supports it.
The premium enterprise workstation SKU: fewer multi-GPU complications, more VRAM headroom, stronger reliability story.
The value and experimentation track using Andromeda's existing AMD experience. Valuable, but not the first customer-support default.
The network should accept broad consumer supply, but verified B3IQ machines should earn better placement because they reduce fraud, support, and variance.
Burn-in records, serial identity, signed image, driver baseline, and known-good model profiles get committed to the node identity.
Higher matching priority and lower sampling overhead.
Because hardware spoofing and thermal drift are lower risk, Andromeda-verified nodes can qualify for a stake discount or better earning multiplier.
Consumer hosts can earn the same status over time.
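One way to express the verification premium, with placeholder numbers that are not proposed economics:

```ts
// Verified hardware earns a premium; reputation moves earnings around it.
function payoutMultiplier(andromedaVerified: boolean, reputation01: number): number {
  const base = andromedaVerified ? 1.1 : 1.0;         // verification premium
  const rep = Math.min(Math.max(reputation01, 0), 1); // clamp to [0, 1]
  return base * (0.8 + 0.4 * rep);                    // +/-20% by reputation
}
```

A consumer host that earns verified status over time picks up the same premium.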
Managed customers and sensitive jobs should prefer verified machines, especially when the task needs uptime, support, or confidential-compute hardware.
This protects B3IQ's brand while still allowing open supply.
The protocol should launch in layers: useful host software first, measured supply second, B3 settlement third, permissionless staking last.
Phase 1: Make local inference easy on Windows, macOS, Linux, and Andromeda machines.
Phase 2: Ship Andromeda machines with reproducible factory images and support telemetry.
Phase 3: Let hosts join a managed beta with offchain credits and strict eligibility.
Phase 4: Move receipts, stake, reputation commitments, and payout accounting onto B3.
Phase 5: Open permissionless hosting, the model launchpad, verifier jobs, and Operator workflows.
The vision is bigger now, but the same rule applies: do not financialize a network until the work, evidence, and customer demand are real.
Risk: Open GPU networks attract spoofing, farmed rewards, and low-quality hosts if emissions pay before demand exists.
Mitigation: Gate rewards behind completed paid work, challenge jobs, stake, and reputation that decays over time.
Risk: Over-promising yield or slashing for subjective quality can create legal, trust, and community problems.
Mitigation: Use B3 stake as collateral and access, not guaranteed yield. Slash only objective fraud after evidence and appeal.
Risk: Different GPUs, drivers, runtimes, and models can multiply failure modes quickly.
Mitigation: Separate consumer hosts, Linux providers, and Andromeda-verified appliances into different trust and payout tiers.
Risk: A conversational operator can become a security risk if it controls root directly.
Mitigation: Keep the Node Agent deterministic. The Operator proposes; signed jobs execute.
Risk: Network mode can feel like lending private hardware to strangers if the terms are vague.
Mitigation: Ship Local and Managed first. Make Network mode opt-in with accounting, schedules, limits, and visible earnings.
Risk: Consumer GPUs cannot fully prove model execution or data privacy, especially on unmanaged home PCs.
Mitigation: Label trust tiers honestly and reserve sensitive routes for verified or attested machines.
Hardware, runtime, protocol, and token details should be rechecked before purchase orders or tokenomics work, but these sources are enough for product planning.
B3 docs describe a Base-settled L3 consumer ecosystem, appchains, sub-cent fees, staking, governance, and value across B3 apps.
Akash positions itself as a decentralized GPU marketplace; io.net documents per-device staking and slashing for inadequate or malicious supplier behavior.
Ollama supports macOS, Linux, and Windows downloads. vLLM remains the production Linux path for high-throughput serving.
NVIDIA lists RTX 5090 at 32GB GDDR7 and RTX PRO 6000 Blackwell Workstation at 96GB GDDR7. AMD lists Radeon AI PRO R9700 at 32GB.
AMD SEV and NVIDIA attestation docs show a future path for stronger trust tiers, but consumer GPU hosts should not be marketed as fully confidential.
Cloudflare Tunnel supports outbound-only origin connections. NetworkManager's nmcli can create and activate Wi-Fi hotspots on capable devices.