src/dashboard.rs
audience: ai
The operator’s read-only view of the bridge. Binds
to 127.0.0.1:<port> inside the TDX guest. The
operator reaches it by SSH-forwarding the port from
their workstation; no authentication beyond the SSH
login. No external port is ever opened.
What the dashboard shows
Aggregate state only. No per-grant requester
identity, no peer_id, no prompt or image contents,
no market settlement hashes beyond count. The
dashboard enforces the zipnet privacy contract on
the operator just as the provider enforces it on
the submitter: the operator sees how their
infrastructure is used, not who is using it.
Fields exposed per snapshot:
- Provider status — registered/alive/mrtd_ok.
- Coalition name and Compute module version — so the operator can confirm they booted against the right handshake.
- Active subscribers — how many coalition agents have bonded this bridge’s provider card in the Compute module’s Collection<ProviderId, ProviderCard>. Trends up as word spreads; trends down when agents retire or unpin.
- Active grants vs the provider’s declared concurrency cap.
- Lifetime grants served.
- Per-backend capacity and utilisation — for each enabled backend: declared capacity, active grant count, window cpu-core-seconds, window ram-mib-seconds, window network bytes.
- Window cost estimate — sum over backends of core-hours × $/core-hour + GiB-hours × $/GiB-hour, using the operator’s rate table.
- Window revenue settled — sum of coordination-market clearings credited to this bridge.
- Recent events (ring buffer, last 50) — all redacted of requester identity: grant accepted (backend only), grant completed (usage delta only), provider card refreshed, subscriber joined / left.
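The window cost line can be sketched as a pure function over per-backend usage. This is a minimal sketch of the formula above, assuming per-backend rates in $/core-hour and $/GiB-hour (it mirrors the RateTable at the bottom of this file; the function name is illustrative):

```rust
/// Sketch of the window-cost formula:
/// core-hours × $/core-hour + GiB-hours × $/GiB-hour.
fn window_cost_usd(
    cpu_core_seconds: u64,
    ram_mib_seconds: u64,
    core_hour_usd: f64,
    gib_hour_usd: f64,
) -> f64 {
    // Usage counters arrive in seconds (and MiB-seconds);
    // convert to the hourly units the rate table uses.
    let core_h = cpu_core_seconds as f64 / 3600.0;
    let gib_h = (ram_mib_seconds as f64 / 1024.0) / 3600.0;
    core_h * core_hour_usd + gib_h * gib_hour_usd
}

fn main() {
    // 10 core-hours at $0.05/core-h plus 40 GiB-hours at $0.01/GiB-h.
    let cost = window_cost_usd(10 * 3600, 40 * 1024 * 3600, 0.05, 0.01);
    assert!((cost - 0.90).abs() < 1e-9);
    println!("{cost:.2}");
}
```

Missing rate-table entries default to zero, so bare-metal backends contribute no cost unless the operator declares an amortised rate.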
ASCII mock
compute-bridge dashboard
════════════════════════════════════════════════════════════════
Provider: <provider_id_prefix> Status: registered
Coalition: infer.corp Module: Compute v0.6
Uptime: 4d 02h MR_TD: ok
Subscribers (coalition agents bonded to this card): 27
Active grants: 3 / 20
Total grants served (lifetime): 1,842
─────────────────────────────── Backends ────────────────────────
aws [eu-west-1, us-east-1] cap=80c/320G used=30c/128G
gcp [europe-west1] cap=32c/128G used=0
azure (disabled)
baremetal [home-01 tdx-capable] cap=32c/128G used=32c/128G
───────────────────────────── Last 24h ─────────────────────────
Grants served: 412
Usage: cpu=18,420 core-h ram=73,680 GiB-h net=412 GB
Idle: 24%
Estimated cost: $147.23 (aws $94, gcp $0, azure $0, bare $0)
Settled revenue: $228.71 (market: infer-market clearings)
Net: +$81.48 (55% margin)
────────────────────────── Recent events ──────────────────────
[t-2m] Grant accepted on aws:us-east-1
[t-5m] Grant completed; usage 40 c-min, 160 GiB-min
[t-1h] Provider card refreshed
[t-3h] Subscriber joined
...
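The "Recent events" section of the mock is backed by the redacted ring buffer (last 50 entries). A std-only sketch of that buffer, assuming events are pre-redacted to plain strings before they reach it (the type name is illustrative):

```rust
use std::collections::VecDeque;

/// Sketch of the redacted recent-events ring buffer.
/// Entries carry no requester identity; capacity is the
/// 50-entry window described above.
struct EventRing {
    buf: VecDeque<String>,
}

impl EventRing {
    const CAP: usize = 50;

    fn new() -> Self {
        Self { buf: VecDeque::with_capacity(Self::CAP) }
    }

    /// Push a redacted event line, dropping the oldest once full.
    fn push(&mut self, line: String) {
        if self.buf.len() == Self::CAP {
            self.buf.pop_front();
        }
        self.buf.push_back(line);
    }
}

fn main() {
    let mut ring = EventRing::new();
    for i in 0..60 {
        ring.push(format!("event {i}"));
    }
    // Oldest 10 entries have been evicted.
    assert_eq!(ring.buf.len(), 50);
    assert_eq!(ring.buf.front().map(String::as_str), Some("event 10"));
}
```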
Two HTTP routes
- GET /dashboard — HTML view (the ASCII layout above, rendered). No JavaScript; the page refreshes every capacity_refresh_sec.
- GET /snapshot.json — the full DashboardSnapshot as JSON. Operators who want to plumb the bridge into their own observability stack poll this.
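The no-JavaScript refresh can be done with a meta-refresh tag interpolated from the config. A hedged sketch of what the HTML route could render (the function and its parameters are illustrative, not the real handler):

```rust
/// Illustrative sketch: render a static HTML page that
/// self-refreshes every `refresh_sec` seconds with no JS.
fn render_dashboard_html(
    provider_status: &str,
    active_grants: u32,
    cap: u32,
    refresh_sec: u64,
) -> String {
    format!(
        "<!doctype html><html><head>\
         <meta http-equiv=\"refresh\" content=\"{refresh_sec}\">\
         <title>compute-bridge dashboard</title></head>\
         <body><pre>Provider: {provider_status}\n\
         Active grants: {active_grants} / {cap}</pre></body></html>"
    )
}

fn main() {
    let page = render_dashboard_html("registered", 3, 20, 30);
    assert!(page.contains("content=\"30\""));
    assert!(page.contains("Active grants: 3 / 20"));
    println!("rendered {} bytes", page.len());
}
```

The browser re-fetches the whole page on the refresh interval, so the handler stays a pure function of the latest DashboardSnapshot.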
What is deliberately not exposed
- The requester’s ClientId, peer_id, or x25519 public key.
- The image post-hash of any individual workload.
- The prompt / input / output of any workload.
- The coordination market’s per-settlement attribution (which requester paid what for which grant). The dashboard sees only aggregate revenue per window.
These are never in the dashboard’s memory; the bridge never pulls them out of the zipnet envelope or the market’s commit stream into process-local state.
Attaching the dashboard to other observability
Operators who run fleet-wide monitoring (many bridges under one operator) can:
- Scrape /snapshot.json over the SSH tunnel.
- Export a small set of Prometheus counters from the same process (not yet implemented in this prototype; see dashboard.rs for the extension point).
- Forward aggregate metrics to an external TSDB without breaking the privacy contract — the exported metrics match the dashboard’s fields exactly.
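The Prometheus extension point is unimplemented, but the text exposition format is simple enough to sketch with nothing but format strings. Metric names here are hypothetical; only fields already on the dashboard are exported, so the privacy contract holds:

```rust
/// Hypothetical sketch: serialise the dashboard's aggregate
/// counters in Prometheus text exposition format.
fn prometheus_text(subscribers: u32, active_grants: u32, lifetime_grants: u64) -> String {
    let mut out = String::new();
    // Gauges can go up and down; lifetime totals are counters.
    out.push_str("# TYPE bridge_subscribers gauge\n");
    out.push_str(&format!("bridge_subscribers {subscribers}\n"));
    out.push_str("# TYPE bridge_active_grants gauge\n");
    out.push_str(&format!("bridge_active_grants {active_grants}\n"));
    out.push_str("# TYPE bridge_lifetime_grants counter\n");
    out.push_str(&format!("bridge_lifetime_grants {lifetime_grants}\n"));
    out
}

fn main() {
    let text = prometheus_text(27, 3, 1842);
    assert!(text.contains("bridge_active_grants 3\n"));
    println!("{text}");
}
```

A scraper pointed at the SSH tunnel would see exactly the dashboard's aggregates and nothing more.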
//! Local operator dashboard.
//!
//! Boots alongside the provider loop and serves a small
//! read-only view at `http://127.0.0.1:<dashboard.port>`
//! from **inside** the TDX guest. The operator reaches
//! it by SSH-forwarding the port:
//!
//! ```shell
//! ssh -L 8080:localhost:8080 operator@tdx-host
//! open http://localhost:8080/dashboard
//! ```
//!
//! ## What it shows
//!
//! Aggregate state only. No per-grant identity, no
//! `peer_id`, no settlement hashes beyond count. The
//! dashboard enforces the zipnet privacy contract on
//! the *operator* just as the provider enforces it on
//! the submitter: the operator sees how their
//! infrastructure is used, not who is using it.
//!
//! Fields exposed:
//!
//! - Provider status: registered / alive / MR_TD match.
//! - Active grant count + lifetime total.
//! - Active-subscriber count (coalition agents bonded
//! to this bridge's provider card in the Compute
//! module's `Collection<ProviderId, ProviderCard>`).
//! - Per-backend capacity vs utilised.
//! - Last-window usage (cpu-hours, ram GiB-hours,
//! network GB).
//! - Last-window cost estimate (per backend at the
//! operator-supplied rate table).
//! - Last-window settled revenue (sum of market-
//! clearing amounts; no per-requester detail).
//! - Ring-buffer of recent events, each redacted of
//! requester identity.
//!
//! What is **not** shown: who submitted, what prompt
//! ran, which image contents executed. Those never
//! reached the dashboard's memory.
use std::sync::Arc;
use serde::Serialize;
use tokio::sync::RwLock;
use crate::backends::{Capabilities, Fleet};
use crate::config::DashboardBootConfig;
pub struct Dashboard {
cfg: DashboardBootConfig,
state: Arc<RwLock<DashboardState>>,
fleet: Fleet,
}
impl Dashboard {
pub fn new(cfg: &DashboardBootConfig, fleet: Fleet) -> Self {
Self {
cfg: cfg.clone(),
state: Arc::new(RwLock::new(DashboardState::default())),
fleet,
}
}
/// Spawn the dashboard HTTP server. Binds to
/// 127.0.0.1:<port>; the TDX guest's firewall is
/// configured to refuse external traffic to this
/// port, so an operator's SSH port-forward is the
/// only access path.
pub async fn spawn(&self) -> anyhow::Result<()> {
let addr = format!("127.0.0.1:{}", self.cfg.port);
tracing::info!(addr = %addr, "dashboard listening");
// TODO: bring up an axum / hyper server with two
// routes:
// GET /dashboard → HTML view
// GET /snapshot.json → DashboardSnapshot JSON
//
// Every request is unauthenticated because the
// bind is 127.0.0.1 and the only path in is the
// operator's SSH tunnel.
anyhow::bail!(
"Dashboard::spawn is a prototype stub; wire up a \
localhost-only axum server with two routes"
)
}
/// Called by the provider loop every time something
/// observable happens. The dashboard's state is the
/// only in-memory sink; raw events are never
/// persisted.
pub async fn record(&self, ev: DashboardEvent) {
let mut s = self.state.write().await;
match ev {
DashboardEvent::SubscriberJoined => s.subscribers = s.subscribers.saturating_add(1),
DashboardEvent::SubscriberLeft => s.subscribers = s.subscribers.saturating_sub(1),
DashboardEvent::GrantAccepted { backend } => {
s.active_grants = s.active_grants.saturating_add(1);
s.lifetime_grants = s.lifetime_grants.saturating_add(1);
s.bump_backend(&backend, |b| b.active_grants += 1);
}
DashboardEvent::GrantCompleted { backend, usage } => {
s.active_grants = s.active_grants.saturating_sub(1);
s.bump_backend(&backend, |b| {
if b.active_grants > 0 { b.active_grants -= 1; }
b.window_cpu_core_seconds += usage.cpu_core_seconds;
b.window_ram_mib_seconds += usage.ram_mib_seconds;
b.window_net_bytes += usage.net_bytes;
});
}
DashboardEvent::RevenueSettled { usd } => {
s.window_revenue_usd += usd;
}
DashboardEvent::ProviderCardRefreshed => {
s.last_card_refresh = Some(chrono_like_now());
}
}
}
/// Build a snapshot for the HTTP view.
pub async fn snapshot(&self) -> anyhow::Result<DashboardSnapshot> {
let s = self.state.read().await;
let caps = self.fleet.capabilities().await?;
let mut backends = Vec::with_capacity(caps.len());
for (name, c) in caps {
let b = s.backends.iter().find(|b| b.name == name);
backends.push(BackendSnapshot {
name,
capabilities: c,
active_grants: b.map(|b| b.active_grants).unwrap_or(0),
window_cpu_core_seconds: b.map(|b| b.window_cpu_core_seconds).unwrap_or(0),
window_ram_mib_seconds: b.map(|b| b.window_ram_mib_seconds).unwrap_or(0),
window_net_bytes: b.map(|b| b.window_net_bytes).unwrap_or(0),
});
}
Ok(DashboardSnapshot {
provider_status: "registered".into(),
coalition_name: self.cfg.coalition_name.clone(),
mrtd_ok: true,
subscribers: s.subscribers,
active_grants: s.active_grants,
lifetime_grants: s.lifetime_grants,
backends,
window_revenue_usd: s.window_revenue_usd,
window_cost_estimated_usd: s.estimated_cost_usd(&self.cfg.rate_table),
last_card_refresh: s.last_card_refresh,
})
}
}
// -----------------------------------------------------
// State & events
// -----------------------------------------------------
#[derive(Default)]
struct DashboardState {
subscribers: u32,
active_grants: u32,
lifetime_grants: u64,
window_revenue_usd: f64,
last_card_refresh: Option<u64>,
backends: Vec<PerBackend>,
}
impl DashboardState {
fn bump_backend(&mut self, name: &str, f: impl FnOnce(&mut PerBackend)) {
if let Some(b) = self.backends.iter_mut().find(|b| b.name == name) {
f(b);
return;
}
let mut b = PerBackend { name: name.into(), ..Default::default() };
f(&mut b);
self.backends.push(b);
}
fn estimated_cost_usd(&self, rate_table: &RateTable) -> f64 {
self.backends.iter()
.map(|b| rate_table.cost_for(&b.name, b.window_cpu_core_seconds, b.window_ram_mib_seconds))
.sum()
}
}
#[derive(Default)]
struct PerBackend {
name: String,
active_grants: u32,
window_cpu_core_seconds: u64,
window_ram_mib_seconds: u64,
window_net_bytes: u64,
}
pub enum DashboardEvent {
SubscriberJoined,
SubscriberLeft,
GrantAccepted { backend: String },
GrantCompleted { backend: String, usage: UsageDelta },
RevenueSettled { usd: f64 },
ProviderCardRefreshed,
}
pub struct UsageDelta {
pub cpu_core_seconds: u64,
pub ram_mib_seconds: u64,
pub net_bytes: u64,
}
// -----------------------------------------------------
// Snapshot — JSON shape returned by /snapshot.json
// -----------------------------------------------------
#[derive(Serialize)]
pub struct DashboardSnapshot {
pub provider_status: String,
pub coalition_name: String,
pub mrtd_ok: bool,
pub subscribers: u32,
pub active_grants: u32,
pub lifetime_grants: u64,
pub backends: Vec<BackendSnapshot>,
pub window_revenue_usd: f64,
pub window_cost_estimated_usd: f64,
pub last_card_refresh: Option<u64>,
}
#[derive(Serialize)]
pub struct BackendSnapshot {
pub name: String,
pub capabilities: Capabilities,
pub active_grants: u32,
pub window_cpu_core_seconds: u64,
pub window_ram_mib_seconds: u64,
pub window_net_bytes: u64,
}
// -----------------------------------------------------
// Rate table (operator-supplied)
// -----------------------------------------------------
#[derive(Clone, Debug, Default, serde::Deserialize)]
pub struct RateTable {
/// $/core-hour per backend. Missing key defaults to 0
/// (bare-metal = 0 marginal cost unless the operator
/// declares their power/cooling amortisation).
pub core_hour_usd: std::collections::BTreeMap<String, f64>,
/// $/GiB-hour per backend.
pub ram_gib_hour_usd: std::collections::BTreeMap<String, f64>,
}
impl RateTable {
fn cost_for(&self, backend: &str, cpu_s: u64, ram_mib_s: u64) -> f64 {
let core_h = cpu_s as f64 / 3600.0;
let gib_h = (ram_mib_s as f64 / 1024.0) / 3600.0;
let core_rate = *self.core_hour_usd.get(backend).unwrap_or(&0.0);
let gib_rate = *self.ram_gib_hour_usd.get(backend).unwrap_or(&0.0);
core_h * core_rate + gib_h * gib_rate
}
}
// Placeholder for SystemTime::now-ish; the real impl uses
// the Almanac tick if one is available, falling back to
// SystemTime.
fn chrono_like_now() -> u64 {
std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs())
.unwrap_or(0)
}