B3IQ starts as polished local inference software and Andromeda-built appliances, then expands into a B3-powered DePIN protocol where gaming PCs, pro workstations, and verified nodes can host models, earn fees, build reputation, and settle work transparently.
B3IQ can start with customers who want useful local AI, then compound into a protocol where consumer GPUs, gaming PCs, and Andromeda-verified machines become paid inference supply. The product has to make hosting easy, make demand predictable, and make trust measurable.
The offering: host software, verified appliances, managed inference, and protocol access to a global GPU network.
The guardrails: no open remote shell, no fake GPU farming, no rewards divorced from real inference demand.
B3 already has consumer distribution, staking, appchain infrastructure, low-fee settlement, and a natural audience of GPU-heavy gamers.
This preserves Andromeda as the trusted hardware provider while making B3IQ the customer-facing intelligence platform.
Andromeda appliance: The physical inference appliance a customer plugs into their network.
B3IQ Node Agent: The deterministic, signed-command service on the node.
B3IQ Control Plane: Fleet, tenants, routing, health, jobs, keys, and billing.
B3IQ Operator: A conversational operator that plans safe actions through the Node Agent.
API gateway: OpenAI-compatible API for cloud, private, and local inference routes.
B3IQ Host: Cross-platform software for gaming PCs and workstations that want to serve or use local models.
B3IQ protocol: Node registration, stake, reputation, receipts, disputes, and settlement on B3.
Model launchpad: Signed model manifests, publisher stake, curation, benchmarks, and royalty splits.
Andromeda operations: Hardware sourcing, validation, burn-in, and factory imaging.
The next level is to generalize static inference hosts into enrolled B3IQ Nodes with identity, model inventory, tunnel state, benchmark records, and safe jobs.
Worker routes chat and image traffic through `/v1/*`, with API keys and usage tracking already in place (a minimal sketch follows this list).
Machine metrics, GPU stats, service status, latency, token rates, and alerting are already modeled.
The collector posts metrics, receives pending actions, restarts services, and reports results.
Qwen, Gemma, FLUX, and inference2 routing prove the gateway can handle heterogeneous capacity.
The current runbooks capture GPU isolation, cold-load windows, watchdog behavior, and failure modes.
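A minimal sketch of that gateway pattern, in the Cloudflare Workers style. The binding names (`API_KEYS`, `USAGE`, `UPSTREAM_URL`) and the usage-counter scheme are illustrative assumptions, not the actual Worker code:

```ts
// Minimal KV interface so the sketch is self-contained.
interface KV {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

interface Env {
  API_KEYS: KV;         // api key -> tenant id
  USAGE: KV;            // per-tenant daily request counters
  UPSTREAM_URL: string; // selected inference backend
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url);
    if (!url.pathname.startsWith("/v1/")) return new Response("not found", { status: 404 });
    if (req.method !== "POST") return new Response("method not allowed", { status: 405 });

    // API key check: "Authorization: Bearer <key>".
    const key = req.headers.get("authorization")?.replace(/^Bearer\s+/i, "") ?? "";
    const tenant = key ? await env.API_KEYS.get(key) : null;
    if (!tenant) return new Response("unauthorized", { status: 401 });

    // Forward the request body unchanged to the inference backend.
    const upstream = await fetch(env.UPSTREAM_URL + url.pathname, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: req.body,
    });

    // Usage tracking: bump a per-tenant counter keyed by day.
    const day = new Date().toISOString().slice(0, 10);
    const counterKey = `${tenant}:${day}`;
    const prev = Number((await env.USAGE.get(counterKey)) ?? "0");
    await env.USAGE.put(counterKey, String(prev + 1));

    return upstream;
  },
};
```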
The product should separate permissioned execution from AI planning. The B3IQ Operator can reason and explain; the B3IQ Node Agent enforces the allowlist.
The same B3IQ software should run on gaming PCs, Macs, Linux workstations, and Andromeda appliances. Hardware quality determines which routes the node can earn.
Local mode: The host stays on the customer LAN. It serves local apps, local users, and local API keys. No remote tunnel or token stake is required.
Managed mode: The host creates an outbound tunnel, enrolls in the B3IQ Control Plane, and receives signed operational jobs.
Network mode: The host stakes B3 and opts into making selected capacity available to the B3IQ network under explicit controls.
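One way to make the three modes concrete is a discriminated config type; the field names below are illustrative, not the actual B3IQ Host schema:

```ts
// Sketch of the three host modes. Local needs no tunnel and no stake;
// Network adds explicit, owner-set limits on what the network may use.
type HostMode =
  | { mode: "local" }                      // LAN-only serving, no enrollment
  | { mode: "managed"; tunnelUrl: string } // outbound tunnel + signed jobs
  | {
      mode: "network";
      tunnelUrl: string;
      stakeB3: bigint;     // collateral for network participation
      maxGpuShare: number; // fraction of capacity offered, 0..1
      schedule?: string;   // e.g. only serve overnight
    };
```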
A non-technical buyer should be able to get to a useful local endpoint without touching a terminal.
The factory image checks whether the node is claimed. If not, it starts setup mode, publishes a local setup page, and blocks external inference routes until ownership is established.
Setup mode should survive bad network credentials and reboot back into a recoverable state.
Example setup-page message: "B3IQ-N7K4 is ready to be claimed. Hardware scan passed."
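A sketch of the boot-time decision, with all names illustrative (the factory image and control-plane API are not specified here):

```ts
// Decide whether to boot into normal operation or recoverable setup mode.
type BootState = "claimed" | "setup";

interface ClaimApi {
  getClaim(serial: string): Promise<{ ownerConfirmed: boolean } | null>;
}

async function decideBootState(serial: string, api: ClaimApi): Promise<BootState> {
  try {
    const claim = await api.getClaim(serial);
    if (claim?.ownerConfirmed) return "claimed"; // external inference routes may open
  } catch {
    // Control plane unreachable or bad network credentials: fall through.
    // Setup mode persists nothing until a connectivity test passes, so a
    // reboot always returns to this same recoverable state.
  }
  return "setup"; // publish the local setup page, keep inference routes blocked
}
```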
The software should run on ordinary gaming PCs, but the protocol should grade them honestly. Consumer hosts create breadth; verified machines create reliability.
The wedge for scale is not only selling machines. It is letting existing GPU owners install B3IQ Host, benchmark their device, and decide whether to use it locally, join managed jobs, or stake into the network.
The sustainable version is demand-funded: users pay for inference, hosts earn for delivered work, model publishers earn royalties, and B3 captures protocol fees. Emissions should bootstrap supply, not become the business model.
Users should buy B3IQ credits with card, stablecoin, or B3. The request path stays fast offchain, while signed usage receipts settle in batches on B3. This avoids turning every token into a blockchain transaction.
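A sketch of the receipt shape and batch commitment, assuming illustrative field names and a plain hash-of-leaves digest rather than a specified settlement format:

```ts
import { createHash } from "node:crypto";

// Offchain usage receipt, signed by the serving node.
interface UsageReceipt {
  nodeId: string;
  tenantId: string;
  model: string;
  tokensIn: number;
  tokensOut: number;
  unixMs: number;
  nodeSignature: string; // over the canonical JSON of the fields above
}

// Commit one digest per batch onchain, so a single B3 transaction can
// settle thousands of requests instead of one transaction per token.
function batchDigest(receipts: UsageReceipt[]): string {
  const leaves = receipts
    .map((r) => createHash("sha256").update(JSON.stringify(r)).digest("hex"))
    .sort(); // order-independent commitment
  return createHash("sha256").update(leaves.join("")).digest("hex");
}
```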
The protocol can be useful before perfect cryptographic verification exists. The key is to label trust tiers honestly and route sensitive workloads only to nodes that meet the required guarantees.
Consumer tier: Windows, macOS, and casual Linux hosts with benchmarked hardware and reputation. Good for low-cost, low-sensitivity tasks.
Earn less, need more sampling, receive fewer premium jobs.
Verified tier: Signed B3IQ Host, measured runtime versions, model hash checks, canary prompts, and recurring benchmark challenges.
Default for managed network inference.
Attested tier: Confidential-compute capable CPU/GPU platforms where available, with remote attestation bound to node identity and stake.
Premium route for enterprise and sensitive workloads.
Redundancy checks: Multiple nodes answer sampled jobs, evaluator nodes compare outputs, and disputes can trigger deeper review.
Use for model evals, high-value jobs, and reputation recovery.
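Routing can then be a pure function of tier and redundancy; the tier ordering and job shape below are assumptions that mirror the tiers above:

```ts
// Sensitive jobs only reach nodes whose tier meets the requirement.
type TrustTier = "consumer" | "verified" | "attested";
const TIER_RANK: Record<TrustTier, number> = { consumer: 0, verified: 1, attested: 2 };

interface NodeInfo { id: string; tier: TrustTier; reputation: number; }
interface Job { minTier: TrustTier; redundancy: number; }

function selectNodes(job: Job, nodes: NodeInfo[]): NodeInfo[] {
  return nodes
    .filter((n) => TIER_RANK[n.tier] >= TIER_RANK[job.minTier])
    .sort((a, b) => b.reputation - a.reputation)
    .slice(0, job.redundancy); // redundancy > 1 when outputs get cross-checked
}
```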
Reputation aggregates uptime, latency, delivered tokens, failed jobs, disputes, model integrity, challenge results, and customer ratings, and the score decays over time.
Moving stake to a new node should not copy trust history.
Slash only when evidence is objective: spoofed hardware, tampered agent, fraudulent receipts, malicious model, or non-delivery after acceptance.
Soft quality failures should hit escrow and reputation first.
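Decay can be as simple as an exponential half-life over scored events; the half-life value here is a placeholder, not a specified parameter:

```ts
// Time-decayed reputation: recent evidence dominates, and history moved
// to a new node identity starts at zero because events are per-node.
interface ReputationEvent { score: number; unixMs: number; } // score in [-1, 1]

function reputation(events: ReputationEvent[], nowMs: number, halfLifeDays = 30): number {
  const halfLifeMs = halfLifeDays * 24 * 60 * 60 * 1000;
  let total = 0;
  for (const e of events) {
    const ageMs = nowMs - e.unixMs;
    total += e.score * Math.pow(0.5, ageMs / halfLifeMs); // older evidence counts less
  }
  return total;
}
```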
OpenHarness can power the assistant experience, but the Node Agent should remain the deterministic execution boundary.
B3IQ Operator turns intent into reviewed jobs. The Node Agent validates the job type, policy, signature, preflight checks, and rollback plan before execution.
Keep the early assistant narrow. It should make support feel intelligent without creating an open-ended remote shell.
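A sketch of that boundary, with an illustrative allowlist and job shape; signature verification is passed in rather than specified:

```ts
// The Operator proposes; the Node Agent accepts a job only when every
// deterministic check passes. No AI judgment sits on this path.
const ALLOWED_JOB_TYPES = new Set(["restart_service", "pull_model", "run_benchmark"]);

interface SignedJob {
  type: string;
  payload: unknown;
  signature: string;    // control-plane signature over type + payload
  rollbackPlan: string; // must exist before anything executes
}

function acceptJob(job: SignedJob, verifySig: (j: SignedJob) => boolean): boolean {
  if (!ALLOWED_JOB_TYPES.has(job.type)) return false; // allowlist, not reasoning
  if (!verifySig(job)) return false;                  // signed by the control plane
  if (!job.rollbackPlan) return false;                // preflight: rollback required
  return true;
}
```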
Users should be able to bring models to the B3IQ network, but every model needs provenance, licensing, safety scanning, benchmark evidence, and economic alignment before it can earn broadly.
Publisher submits a signed model manifest: artifact hashes, license claims, runtime profile, VRAM band, context window, and intended use.
Store artifacts offchain; commit hashes and metadata on B3.
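A sketch of what such a manifest might carry; field names and the runtime enum are assumptions for illustration:

```ts
// Signed model manifest: artifacts stay offchain, hashes commit on B3.
interface ModelManifest {
  name: string;
  artifactSha256: string[];     // content hashes of the weight files
  license: string;              // e.g. "apache-2.0" or a custom grant
  runtime: "vllm" | "ollama";   // illustrative runtime profile
  vramBandGb: [number, number]; // min/max VRAM the model needs
  contextWindow: number;
  intendedUse: string;
  publisherSignature: string;   // over the canonical JSON of the fields above
}
```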
Publisher stakes B3 to list the model. Higher distribution, premium categories, or closed-source weights require more collateral.
Stake discourages malware, false benchmarks, and license fraud.
Verifier nodes run install tests, safety scans, benchmark suites, and output-quality checks before the model gets recommended.
Curators can earn fees for useful evaluation work.
Hosts opt into approved models. B3IQ routes requests to nodes that have the model installed, are healthy, and are within policy.
Popular models pre-cache on Andromeda machines.
Model publishers can receive a royalty from paid inference. For open public models, the same royalty slot can route to hosts, verifiers, or the protocol treasury.
This creates a reason to publish models into B3IQ.
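A basis-point split keeps the accounting exact; the split values themselves are placeholders, not proposed tokenomics:

```ts
// Per-request fee split between host, model publisher, and protocol.
interface FeeSplitBps { host: number; publisher: number; protocol: number; }

function splitFee(feeWei: bigint, s: FeeSplitBps) {
  if (s.host + s.publisher + s.protocol !== 10_000) {
    throw new Error("split must total 10000 basis points");
  }
  const host = (feeWei * BigInt(s.host)) / 10_000n;
  const publisher = (feeWei * BigInt(s.publisher)) / 10_000n;
  // Remainder goes to the protocol so rounding dust is never lost.
  return { host, publisher, protocol: feeWei - host - publisher };
}
```

For an open model, the publisher share can simply be reassigned to hosts, verifiers, or the treasury before the split runs.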
Community and protocol governance can delist malicious models, update category rules, adjust stake requirements, and fund public-good models.
B3's existing governance surface becomes useful infrastructure.
A buyer should understand what each machine can comfortably serve. The SKU story should be conservative enough for the support team to stand behind.
The default first product: a premium local inference appliance that a small team can understand, afford, and support.
The first serious managed-node SKU. It can isolate workloads per GPU or split larger quantized models when the runtime supports it.
The premium enterprise workstation SKU: fewer multi-GPU complications, more VRAM headroom, stronger reliability story.
The value and experimentation track using Andromeda's existing AMD experience. Valuable, but not the first customer-support default.
The network should accept broad consumer supply, but verified B3IQ machines should earn better placement because they reduce fraud, support, and variance.
Burn-in records, serial identity, signed image, driver baseline, and known-good model profiles get committed to the node identity.
Higher matching priority and lower sampling overhead.
Because hardware spoofing and thermal drift are lower risk, Andromeda-verified nodes can qualify for a stake discount or better earning multiplier.
Consumer hosts can earn the same status over time.
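One way to express the verification premium, with placeholder numbers that are not proposed economics:

```ts
// Verified hardware earns a premium; reputation moves earnings around it.
function payoutMultiplier(andromedaVerified: boolean, reputation01: number): number {
  const base = andromedaVerified ? 1.1 : 1.0;         // verification premium
  const rep = Math.min(Math.max(reputation01, 0), 1); // clamp to [0, 1]
  return base * (0.8 + 0.4 * rep);                    // +/-20% by reputation
}
```

A consumer host that earns verified status over time picks up the same premium.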
Managed customers and sensitive jobs should prefer verified machines, especially when the task needs uptime, support, or confidential-compute hardware.
This protects B3IQ's brand while still allowing open supply.
The protocol should launch in layers: useful host software first, measured supply second, B3 settlement third, permissionless staking last.
Phase 1: Make local inference easy on Windows, macOS, Linux, and Andromeda machines.
Phase 2: Ship Andromeda machines with reproducible factory images and support telemetry.
Phase 3: Let hosts join a managed beta with offchain credits and strict eligibility.
Phase 4: Move receipts, stake, reputation commitments, and payout accounting onto B3.
Phase 5: Open permissionless hosting, the model launchpad, verifier jobs, and Operator workflows.
The vision is bigger now, but the same rule applies: do not financialize a network until the work, evidence, and customer demand are real.
Risk: Open GPU networks attract spoofing, farmed rewards, and low-quality hosts if emissions pay before demand exists.
Mitigation: Gate rewards behind completed paid work, challenge jobs, stake, and reputation that decays over time.
Risk: Over-promising yield or slashing for subjective quality can create legal, trust, and community problems.
Mitigation: Use B3 stake as collateral and access, not guaranteed yield. Slash only objective fraud after evidence and appeal.
Risk: Different GPUs, drivers, runtimes, and models can multiply failure modes quickly.
Mitigation: Separate consumer hosts, Linux providers, and Andromeda-verified appliances into different trust and payout tiers.
Risk: A conversational operator can become a security risk if it controls root directly.
Mitigation: Keep the Node Agent deterministic. The Operator proposes; signed jobs execute.
Risk: Network mode can feel like lending private hardware to strangers if the terms are vague.
Mitigation: Ship Local and Managed first. Make Network mode opt-in with accounting, schedules, limits, and visible earnings.
Risk: Consumer GPUs cannot fully prove model execution or data privacy, especially on unmanaged home PCs.
Mitigation: Label trust tiers honestly and reserve sensitive routes for verified or attested machines.
Hardware, runtime, protocol, and token details should be rechecked before purchase orders or tokenomics work, but these sources are enough for product planning.
B3 docs describe a Base-settled L3 consumer ecosystem, appchains, sub-cent fees, staking, governance, and value across B3 apps.
Akash positions itself as a decentralized GPU marketplace; io.net documents per-device staking and slashing for inadequate or malicious supplier behavior.
Ollama supports macOS, Linux, and Windows downloads. vLLM remains the production Linux path for high-throughput serving.
NVIDIA lists RTX 5090 at 32GB GDDR7 and RTX PRO 6000 Blackwell Workstation at 96GB GDDR7. AMD lists Radeon AI PRO R9700 at 32GB.
AMD SEV and NVIDIA attestation docs show a future path for stronger trust tiers, but consumer GPU hosts should not be marketed as fully confidential.
Cloudflare Tunnel supports outbound-only origin connections. NetworkManager's nmcli can create and activate Wi-Fi hotspots on capable devices.