Architecture Locked · February 28, 2026

Kedios B300 Full Network & Topology Report

32-Node NVIDIA Blackwell Ultra B300 GPU Training Farm — Complete Planning, Architecture & Topology

Organization: Kedios
Date: February 28, 2026
Document status: Final
Server model: ASUS XA NB3I-E12
GPU: NVIDIA B300 (HGX)
Total racks: 34 (32 compute + 2 network)

Executive Summary

The Kedios B300 farm is a 32-node NVIDIA Blackwell Ultra B300 GPU training cluster deployed within a 20 MW data center, consuming approximately 530 kW peak out of a 1 MW allocated budget (53%). The farm is purpose-built for large-scale distributed AI training — specifically AllReduce/AllGather collective operations — backed by a 1:1 non-blocking InfiniBand NDR800 fabric delivering 204.8 Tb/s bisection bandwidth.

| Metric | Value |
| --- | --- |
| Total GPUs | 256 × NVIDIA B300 (HGX Blackwell Ultra) |
| Peak compute (FP8) | 1,152 PFLOPS dense |
| Peak compute (NVFP4) | 7,680 PFLOPS sparse |
| GPU memory | 73.73 TB HBM3e (256 × 288 GB) |
| IB bisection BW | 204.8 Tb/s — 1:1 non-blocking |
| IB switches | 12 × Q3400-RA (8 leaf + 4 spine) |
| Farm peak power | ~530 kW (~53% of 1 MW budget) |
| Power headroom | ~470 kW remaining in 1 MW budget (47%) |
| Burst safety margin | 4.6 kW below 20 kW/rack hard limit |

Data Center Infrastructure & Power

Facility Overview

| Parameter | Value |
| --- | --- |
| Total DC power capacity | 20 MW |
| Kedios allocation | 1 MW (1,000 kW) |
| Per-cabinet hard limit | 20 kW (updated wall-output hard ceiling) |
| Cooling units | CRAC #1 and CRAC #2 + A/C condensers |
| Power distribution | PDU-A and PDU-B (dual-feed per row) |
| UPS infrastructure | UPS Room #1, #2, #3 + UPS 3 & UPS 4 |
| Power control | PCU 1, PCU 2 |

Per-Server Power Audit (Component-Level)

| Component | Spec | Qty | Each | Subtotal |
| --- | --- | --- | --- | --- |
| NVIDIA B300 GPU | HGX tray, TDP 1,100 W | 8 | 1,100 W | 8,800 W |
| Intel Xeon 6776P CPU | 56-core, 350 W TDP | 2 | 350 W | 700 W |
| DDR5-6400 128 GB RDIMM | Samsung M321RAJA0MB2-CCP | 32 | ~7 W | 224 W |
| Samsung PM9D3a NVMe U.2 | Gen5 1.92/3.84 TB | 10 | ~10 W | 100 W |
| ConnectX-8 NIC (on-board) | 800 Gb/s NDR per port | 8 | ~20 W | 160 W |
| BlueField-3 3220 DPU | 400 Gb/s NDR400 | 2 | ~40 W | 80 W |
| Intel X710-AT2 mgmt NIC | Dual 10 GbE | 1 | ~12 W | 12 W |
| System fans (80×80 mm) | High-RPM | 15 | ~18 W | 270 W |
| CPU fans (60×56 mm) | — | 6 | ~7 W | 42 W |
| PCIe switch PEX89144 | — | 1 | ~12 W | 12 W |
| VRMs / motherboard / misc | — | — | — | 175 W |
| Component sum | | | | 10,575 W |
| PSU loss (80 PLUS Titanium ~95.5%) | | | | +494 W |
| Server at wall — sustained | | | | ~14,500 W ≈ 14.5 kW |
| In-rack PDU cabling loss (~1%) | | | | Included in wall draw |
| Total server rack draw — sustained peak | | | | ~14.5 kW |
| GPU instantaneous burst overshoot (~+6%) | | | | ~15.4 kW absolute max |

Rack Power vs 20 kW Hard Limit

| Item | Value |
| --- | --- |
| 20 kW hard limit per rack | 20.0 kW |
| Server sustained peak (at wall, incl. PDU) | ~14.5 kW |
| Absolute instantaneous burst | ~15.4 kW |
| Safety margin (sustained) | ~5.5 kW ✅ |
| Safety margin (burst) | ~4.6 kW ✅ |

✅ Within limit. The server fits the 20 kW hard ceiling in all operational modes; even at absolute burst peak (~15.4 kW) the margin is 4.6 kW, so there is no risk of breaker trips.
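The margin arithmetic above can be sanity-checked in a few lines. Below is a minimal Python sketch using only the report's own figures (~14.5 kW sustained wall draw, ~6% burst factor, 20 kW ceiling); the variable names are illustrative.

```python
# Minimal sketch of the rack-margin arithmetic, using the report's figures.
SUSTAINED_WALL_KW = 14.5   # per-server draw at wall, incl. PDU cabling loss
BURST_FACTOR      = 1.06   # GPU instantaneous overshoot (~ +6%)
RACK_LIMIT_KW     = 20.0   # per-cabinet hard ceiling
N_COMPUTE_RACKS   = 32     # C01..C32, one server each

burst_kw = SUSTAINED_WALL_KW * BURST_FACTOR
print(f"burst peak       : {burst_kw:.1f} kW")                          # ~15.4
print(f"sustained margin : {RACK_LIMIT_KW - SUSTAINED_WALL_KW:.1f} kW") # ~5.5
print(f"burst margin     : {RACK_LIMIT_KW - burst_kw:.1f} kW")          # ~4.6

# Farm-level roll-up for the compute zone (switch racks and cooling/UPS
# overhead are added separately in the zone summary below):
print(f"zone sustained   : {N_COMPUTE_RACKS * SUSTAINED_WALL_KW:.0f} kW")  # 464
print(f"zone burst       : {N_COMPUTE_RACKS * burst_kw:.0f} kW")           # ~492
```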

Farm Zone Power Summary

| Zone | Rack count | Sustained peak | Burst peak | Hard limit |
| --- | --- | --- | --- | --- |
| 32 compute racks (C01–C32) | 32 | ~464 kW | ~492 kW | 640 kW |
| 2 switch/fabric racks (N1–N2) | 2 | ~17.6 kW | ~17.6 kW | 40 kW |
| Cooling / UPS overhead | — | ~20 kW | ~20 kW | — |
| Total farm peak | 34 | ~501 kW | ~530 kW | 1,000 kW |
| Remaining headroom | — | ~498 kW | ~470 kW (47%) | — |

Physical Layout & Zone Structure

| Zone | Capacity | Allocation | Spare |
| --- | --- | --- | --- |
| 32-rack compute zone | 32 racks | 32× ASUS XA NB3I-E12 B300 servers (C01–C32) | 0 |
| 6-rack network zone | 6 racks | N1 (IB fabric) + N2 (Eth/UFM/OOB) | 4 (N3–N6 reserved) |

Compute Zone — 32-Rack Grid (C01–C32)

  • Alternating cold aisle / hot aisle layout — conditioned air from CRAC #1 and CRAC #2
  • Dual-feed from PDU-A and PDU-B across all rows — N+1 power path redundancy
  • Each rack: server at U1–U9; patch panel at U11; cable management U12–U13; upper 29U empty
  • Each rack exits 12 cables: 8× IB AOC + 2× Ethernet AOC + 2× Cat6A → network tower

Network Tower — N1 & N2 Rack Layout

| Rack | U position | Hardware | U used | U spare |
| --- | --- | --- | --- | --- |
| N1 | U1–U48 | 12× Q3400-RA IB switches (4U each = 48U) | 48U | — |
| N2 | U1–U2 | Spectrum-4 #1 (2U) | 6U | 36U |
| | U3–U4 | Spectrum-4 #2 (2U) | | |
| | U5 | UFM Appliance (1U) | | |
| | U6 | OOB 10 GbE Management Switch (1U) | | |
| N3–N6 | — | Empty — reserved for future scale-out | 0U | 4 × 42U |
Design principle: All leaf-to-spine cables (inter-switch, intra-tower) are short passive copper DAC ≤3 m — all switches co-located in N1. Long AOC runs only cross the compute-to-network inter-zone gap (5–10 m).

Server Specification — ASUS XA NB3I-E12

Form factor: 9U, air-cooled (front-to-back), direct airflow  |  Quantity: 32 units (one per rack, C01–C32)

Per-Server Hardware BOM

| Component | Model / Spec | Qty |
| --- | --- | --- |
| GPU | NVIDIA Blackwell Ultra B300 (HGX tray), TDP 1,100 W | 8 |
| GPU memory | 288 GB HBM3e per GPU (12-high stacks) = 2.304 TB / server | — |
| CPU | Intel Xeon Platinum 6776P, 56 cores, 350 W TDP | 2 |
| System RAM | Samsung M321RAJA0MB2-CCP 128 GB DDR5-6400 RDIMM = 4 TB | 32 |
| Boot SSD | Samsung PM9D3a U.2 Gen5 NVMe 1.92 TB | 2 |
| Data SSD | Samsung PM9D3a U.2 Gen5 NVMe 3.84 TB | 8 |
| IB NIC (GPU fabric) | ConnectX-8 (CX8), soldered on HGX tray, 800 Gb/s NDR IB | 8 |
| DPU | NVIDIA BlueField-3 3220, 400 Gb/s NDR400 Ethernet | 2 |
| Management NIC | Intel X710-AT2, dual-port 10 GbE RJ45 | 1 |
| TPM | TPM security module | 1 |
| NVLink interconnect | NVLink Gen 5 (HGX tray) — 1.8 TB/s per GPU, 14.4 TB/s total | — |
| PSU | 80 PLUS Titanium, 5+5 array (N+5 redundancy + dual-feed) | 10 |
| UFM Agent | Software only — installed on server OS | 1 |

Per-Server Performance

| Metric | Value |
| --- | --- |
| GPU compute (FP8 dense, 8-GPU) | ~36 PFLOPS |
| GPU compute (NVFP4 sparse, 8-GPU) | ~240 PFLOPS |
| Intra-server NVLink bandwidth | 14.4 TB/s |
| GPU-to-GPU latency (intra-server) | < 100 ns |
| System memory bandwidth | ~600 GB/s |
| Total NVMe storage | 34.56 TB |
| Operating power | ~14.5 kW |
| Peak sustained wall draw | ~14.5 kW |
| Absolute burst peak | ~15.4 kW |
| Per-rack hard limit | 20 kW |

32-Server Farm Aggregates

| Metric | Value |
| --- | --- |
| Total GPUs | 256 × NVIDIA B300 |
| Total GPU memory | 73.73 TB HBM3e |
| Total system RAM | 128 TB DDR5 |
| Total NVMe storage | ~1,106 TB |
| Peak compute (FP8 dense) | ~1,152 PFLOPS |
| Peak compute (NVFP4 sparse) | ~7,680 PFLOPS |
| NVLink aggregate (farm) | 32 × 14.4 TB/s = 460.8 TB/s |
| Peak power (sustained wall) | ~464 kW |
| Absolute burst peak | ~492 kW |
| Thermal output | ~355 kW ≈ ~1,211,000 BTU/hr |
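Every farm figure in this table is the per-server value scaled by 32. A small Python sketch makes the roll-up explicit; the dictionary keys are illustrative labels.

```python
# Farm totals as straight 32x multiples of the per-server figures.
N_SERVERS = 32

per_server = {
    "gpus":         8,
    "hbm3e_tb":     8 * 0.288,           # 8 GPUs x 288 GB HBM3e
    "ram_tb":       4,
    "nvme_tb":      2 * 1.92 + 8 * 3.84, # boot + data SSDs = 34.56 TB
    "fp8_pflops":   36,
    "nvfp4_pflops": 240,
    "nvlink_tbps":  14.4,
}

farm = {metric: value * N_SERVERS for metric, value in per_server.items()}
# -> 256 GPUs, ~73.73 TB HBM3e, 128 TB RAM, ~1,106 TB NVMe,
#    1,152 PFLOPS FP8 dense, 7,680 PFLOPS NVFP4 sparse, 460.8 TB/s NVLink
```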

Network Architecture — InfiniBand Compute Fabric

Network Ports Per Server

| Port type | Component | Count | Speed | Target |
| --- | --- | --- | --- | --- |
| CX8 IB (GPU fabric) | HGX tray on-board (1 per GPU) | 8 | 800 Gb/s NDR | Q3400-RA leaf switches |
| BF3 Ethernet (storage) | BlueField-3 3220 add-in card | 2 | 400 Gb/s NDR400 | Spectrum-4 (active-active) |
| X710 management | Intel X710-AT2 dual-port | 2 | 10 GbE | OOB management switch |
| Total per server | | 12 | | |

32-Server Farm Port Totals

| Fabric | Per server | × 32 servers | Speed | Target switch |
| --- | --- | --- | --- | --- |
| CX8 IB (compute) | 8 | 256 ports | 800 Gb/s NDR | Q3400-RA leaf (L0–L7) |
| BF3 Ethernet (storage) | 2 | 64 ports | 400 Gb/s NDR400 | Spectrum-4 #1 + #2 (active-active) |
| OOB management | 2 | 64 server + 16 infra = 80 total | 10 GbE | OOB switch (96-port) |

InfiniBand GPU Rail Assignment

Each server's 8 CX8 ports map to 8 independent GPU rails. All GPU-i ports across all 32 servers land on the same leaf switch Li, isolating GPU-to-GPU communication by rail.

Rail 0 → Leaf L0 | Rail 1 → Leaf L1 | Rail 2 → Leaf L2 | Rail 3 → Leaf L3
Rail 4 → Leaf L4 | Rail 5 → Leaf L5 | Rail 6 → Leaf L6 | Rail 7 → Leaf L7
Rail isolation principle: AllReduce within a rail (the dominant training pattern) traverses zero spine hops — leaf-only, sub-100 ns latency. Only cross-rail AllToAll requires the 2-hop leaf→spine→leaf path.
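The rail mapping is mechanical enough to express directly. Here is a minimal Python sketch; port labels such as C01-CX8[3] are illustrative, not a vendor convention.

```python
# Rail-optimized mapping: the CX8 port of GPU i on every server lands on leaf Li.
N_SERVERS, N_RAILS = 32, 8

def leaf_for(server: int, gpu: int) -> str:
    """Leaf switch for a given server's GPU-indexed CX8 port."""
    assert 1 <= server <= N_SERVERS and 0 <= gpu < N_RAILS
    return f"L{gpu}"                     # independent of the server index

def path_type(gpu_a: int, gpu_b: int) -> str:
    """Same-rail traffic stays on one leaf; only cross-rail traffic crosses a spine."""
    return "leaf-only, 0 spine hops" if gpu_a == gpu_b else "leaf -> spine -> leaf"

# Each leaf therefore sees exactly one downlink from every server:
ports = {f"L{r}": [f"C{s:02d}-CX8[{r}]" for s in range(1, N_SERVERS + 1)]
         for r in range(N_RAILS)}
assert all(len(downlinks) == 32 for downlinks in ports.values())
```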

Switch Topology — 2-Tier Rail-Optimized Fat-Tree

         ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐
SPINES   │  Spine S0  │ │  Spine S1  │ │  Spine S2  │ │  Spine S3  │
(×4)     │ Q3400-RA   │ │ Q3400-RA   │ │ Q3400-RA   │ │ Q3400-RA   │
         │ 144p NDR   │ │ 144p NDR   │ │ 144p NDR   │ │ 144p NDR   │
         └──────┬─────┘ └──┬─────────┘ └────────┬───┘ └─────┬──────┘
                │  (32 uplinks per leaf × 8 leaves = 256 spine ports)
   ┌────────────┼──────────┼────────────────────┼───────────┼─────┐
   │            │          │                    │           │     │
 ┌─┴───────┐ ┌──┴──────┐  ...  ┌───────────┐  ... ┌────────┴──┐  │
 │ Leaf L0 │ │ Leaf L1 │       │  Leaf L6  │      │  Leaf L7  │  │
 │ Rail 0  │ │ Rail 1  │       │  Rail 6   │      │  Rail 7   │  │
 │Q3400-RA │ │Q3400-RA │       │  Q3400-RA │      │ Q3400-RA  │  │
 └────┬────┘ └────┬────┘       └─────┬─────┘      └─────┬─────┘  │
      │           │                  │                   │
 (32 downlinks per leaf — one CX8[n] from every server)
      │           │                  │                   │
 C01..C32    C01..C32           C01..C32             C01..C32
 CX8[0]      CX8[1]             CX8[6]               CX8[7]

NVIDIA Quantum-X800 Q3400-RA Specifications

| Parameter | Value |
| --- | --- |
| Form factor | 4U chassis |
| Port count | 144 × 800 Gb/s NDR IB (72 OSFP, bifurcated) |
| Switching capacity | 115.2 Tb/s non-blocking |
| UFM management port | Dedicated OSFP 400 Gb/s (isolated from data ports) |
| SHARP version | Gen 4 — in-network AllReduce acceleration |
| Routing | Adaptive routing + telemetry congestion control |
| SerDes | 200 Gb/s |
| Protocol | InfiniBand NDR, RDMA |

Leaf Switch Design (×8 — L0 through L7)

| Parameter | Value |
| --- | --- |
| Count | 8 leaf switches (one per GPU rail) |
| Server downlinks | 32 (one CX8 port from each server on that rail) |
| Spine uplinks | 32 (8 parallel links to each of 4 spines) |
| Port utilization | 64 of 144 (44%) — room for ~32 more servers (up to ~64 total) |
| Downlink cable type | AOC 5–10 m (server → leaf, cross-zone) |
| Uplink cable type | DAC ≤3 m (leaf → spine, intra-N1) |

Spine Switch Design (×4 — S0 through S3)

| Parameter | Value |
| --- | --- |
| Count | 4 spine switches |
| Leaf-facing ports | 64 (8 parallel uplinks × 8 leaf switches) |
| Server connections | None — pure leaf-to-leaf forwarding |
| Port utilization | 64 of 144 (44%) |
| Cable type | DAC ≤3 m (all intra-N1) |

Fat-Tree Bandwidth & Latency

| Metric | Value |
| --- | --- |
| Server-side aggregate BW | 256 × 800 Gb/s = 204.8 Tb/s |
| Spine bisection BW | 204.8 Tb/s |
| Bisection ratio | 1:1 — fully non-blocking ✅ |
| Max hops server-to-server | 2 (leaf → spine → leaf) |
| IB latency server-to-server | < 200 ns |
| Intra-rail latency (same leaf) | < 100 ns |
| SHARP Gen 4 AllReduce gain | Up to ~50% BW reduction for typical training batch sizes |
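The 1:1 claim reduces to a port-count identity: each leaf must provision exactly as many spine uplinks as server downlinks. A small Python sketch of that check, with all constants taken from the tables above:

```python
# Port accounting for the 2-tier fat-tree; asserts the 1:1 non-blocking claim.
PORT_GBPS, N_LEAVES, N_SPINES, N_SERVERS = 800, 8, 4, 32
LINKS_PER_LEAF_SPINE_PAIR = 8

downlinks_per_leaf = N_SERVERS                              # one CX8 per server
uplinks_per_leaf   = N_SPINES * LINKS_PER_LEAF_SPINE_PAIR   # 4 x 8 = 32

# 1:1 non-blocking holds iff uplink capacity equals downlink capacity per leaf.
assert uplinks_per_leaf == downlinks_per_leaf

server_side = N_LEAVES * downlinks_per_leaf * PORT_GBPS / 1000   # 204.8 Tb/s
bisection   = N_LEAVES * uplinks_per_leaf   * PORT_GBPS / 1000   # 204.8 Tb/s
spine_ports = N_LEAVES * uplinks_per_leaf                        # 256 total
ports_used_per_switch = downlinks_per_leaf + uplinks_per_leaf    # 64 of 144 (44%)
```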

Network Tower — Rack N1 (InfiniBand Fabric)

Rack N1 houses all 12 Q3400-RA switches. Co-locating them lets every inter-switch link use short passive DAC — no active components in the spine-to-leaf forwarding path.

Internal Wiring (Leaf → Spine)

| Link | Cable type | Length | Count |
| --- | --- | --- | --- |
| Each leaf → each spine (8 parallel links per pair) | Passive copper DAC | ≤ 3 m | 256 DAC cables |
| Unique leaf-spine pairs | — | — | 32 pairs (8 leaves × 4 spines) |

SHARP Gen 4 In-Network Computation

  • All 12 Q3400-RA support SHARP Gen 4
  • Partial gradient sums computed inside switches during AllReduce — reduces fabric BW by up to ~50%
  • Particularly impactful for LLM pretraining (GPT/LLaMA scale) with large gradient tensors
  • Orchestrated automatically by UFM Appliance — no application code changes required
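Because SHARP is orchestrated by UFM and the switches, enabling it on the job side is typically a matter of environment configuration rather than code changes. The Python sketch below shows representative NCCL/SHARP knobs; the exact variable set depends on the deployed HPC-X and nccl-rdma-sharp plugin versions, and train.py is a placeholder, so treat this as an assumption to validate against the installed stack rather than a verified configuration.

```python
# Hedged sketch: representative environment knobs for SHARP-offloaded NCCL
# collectives. Validate against the installed HPC-X / plugin documentation.
import os
import subprocess

env = dict(os.environ)
env.update({
    "NCCL_COLLNET_ENABLE": "1",    # let NCCL use CollNet (SHARP) collectives
    "SHARP_COLL_ENABLE_SAT": "1",  # streaming aggregation for large tensors
    "NCCL_IB_HCA": "mlx5",         # keep NCCL traffic on the CX8 IB rails
})

# Placeholder launcher -- train.py stands in for the actual training entrypoint.
subprocess.run(["python", "train.py"], env=env, check=True)
```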

Network Tower — Rack N2 (Ethernet, UFM & OOB)

Physical Layout

| U position | Hardware | Role |
| --- | --- | --- |
| U1–U2 | NVIDIA Spectrum-4 #1 | AI Ethernet — 128× 400 GbE, 51.2 Tb/s |
| U3–U4 | NVIDIA Spectrum-4 #2 | AI Ethernet — active-active pair |
| U5 | NVIDIA UFM Appliance | IB fabric manager (hardware, 1U) |
| U6 | 10 GbE OOB Switch | Out-of-band management (96+ ports) |
| U7–U42 | Empty (36U) | Patch panels, cable trays, future expansion |

Spectrum-4 AI Ethernet Fabric

| Parameter | Value |
| --- | --- |
| ASIC process | TSMC 4N |
| Switching capacity | 51.2 Tb/s per switch |
| Port configuration | 128 × 400 GbE (or 64 × 800 GbE) |
| Protocol | Ethernet + RoCE (Spectrum-X AI Ethernet) |
| BF3 ports per switch | 32 (64 total ÷ 2 switches) |
| Redundancy model | Active-active — each BF3 connects one port to each Spectrum-4 |
| Inter-switch spine link | 2× 400 GbE direct link between Sp4 #1 and #2 |

UFM Appliance

| Parameter | Value |
| --- | --- |
| Form factor | 1U hardware appliance |
| Management connection | Via OOB switch (10 GbE) — not on IB data fabric |
| Manages | 12× Q3400-RA switches + 256× CX8 endpoints |
| UFM hardware capacity | Up to 648 ports — current usage well within range |
| Scale headroom | Supports doubling cluster to 64 servers without new UFM hardware |
| UFM Agent (SW) | 32 instances — per-server OS, no extra hardware |

UFM Appliance functions: topology discovery, routing table computation, adaptive routing, SHARP orchestration, fault detection/recovery, per-port telemetry aggregation.


Cabling Plan

AOC requirement: Passive copper DAC only works reliably ≤3 m at NDR speeds. The inter-zone gap between compute racks and network tower is 5–15 m, requiring Active Optical Cable (AOC) on all server-to-switch links.
| Link | Cable type | Length | Count | Notes |
| --- | --- | --- | --- | --- |
| Server CX8 → Q3400-RA leaf (NDR800) | AOC | 5–10 m | 256 | 8 per server × 32 servers |
| Server BF3 → Spectrum-4 (400 GbE NDR400) | AOC | 5–10 m | 64 | 2 per server × 32 servers |
| Server X710 → OOB switch (OS mgmt + BMC) | Cat6A copper | 5–10 m | 64 | 2 per server × 32 servers |
| Q3400-RA leaf → Q3400-RA spine (NDR800) | Passive DAC | ≤ 3 m | 256 | All intra-N1 |
| Spectrum-4 #1 ↔ #2 (inter-switch spine) | DAC or AOC | ≤ 3 m | 2 | Intra-N2 |
| UFM Appliance → OOB switch | Cat6A | ≤ 1 m | 1 | Intra-N2 |
| Q3400-RA UFM mgmt port → OOB switch | DAC / SFP+ | ≤ 3 m | 12 | Intra-N1/N2 |

Cable Totals

| Type | Count |
| --- | --- |
| AOC (OSFP, NDR800, 800 Gb/s) | 256 |
| AOC (QSFP-DD, NDR400, 400 Gb/s) | 64 |
| Cat6A copper (10 GbE) | 64 |
| Passive DAC (intra-network-tower) | ~270 |
| Total cable assemblies | ~654 |
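The totals above follow directly from the topology counts. A minimal Python sketch reproducing them (the lone UFM → OOB Cat6A run sits outside the per-type rows, which is why the grand total is quoted as approximate):

```python
# Cable BOM derived from the topology counts in the cabling plan.
N_SERVERS, N_LEAVES, N_SPINES, LINKS_PER_PAIR = 32, 8, 4, 8

aoc_ndr800 = N_SERVERS * 8                          # server CX8 -> leaf       = 256
aoc_ndr400 = N_SERVERS * 2                          # server BF3 -> Spectrum-4 = 64
cat6a      = N_SERVERS * 2                          # server X710 -> OOB       = 64
dac        = (N_LEAVES * N_SPINES * LINKS_PER_PAIR  # leaf -> spine            = 256
              + 2                                   # Spectrum-4 inter-link
              + 12)                                 # Q3400-RA mgmt ports
assert dac == 270

total = aoc_ndr800 + aoc_ndr400 + cat6a + dac       # 654 (+1 UFM Cat6A => "~654")
```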

Complete Hardware Bill of Materials

Networking Hardware

| Component | Count | Form factor | Role |
| --- | --- | --- | --- |
| NVIDIA Q3400-RA — Leaf | 8 | 4U | 32 CX8 downlinks + 32 spine uplinks per switch; 1:1 non-blocking |
| NVIDIA Q3400-RA — Spine | 4 | 4U | 64 leaf-facing ports; pure any-to-any forwarding plane |
| Total Q3400-RA | 12 | — | 115.2 Tb/s each, 144 NDR800 ports each |
| NVIDIA Spectrum-4 (Ethernet) | 2 | 2U | 128× 400 GbE active-active, 51.2 Tb/s each |
| 10 GbE OOB Management Switch | 1 | 1U | 96-port; 32 OS mgmt + 32 BMC/IPMI + 12 switch mgmt + 2 Spectrum-4 + 1 UFM = 80 connections |
| NVIDIA UFM Appliance | 1 | 1U | Centralized IB fabric manager |
| UFM Agent (software) | 32 | SW only | Per-server OS agent — no extra hardware |
| Grand total (hardware) | 16 units | — | 12 Q3400-RA + 2 Spectrum-4 + 1 OOB + 1 UFM Appliance |

Rack Infrastructure

| Item | Count |
| --- | --- |
| Compute racks (42U) — C01–C32 | 32 |
| Network/switch racks (42U) — N1–N2 | 2 |
| Spare network positions — N3–N6 | 4 (reserved) |
| Total rack positions occupied | 34 |

Aggregate Bandwidth Summary

| Fabric layer | Calculation | Total bandwidth |
| --- | --- | --- |
| IB compute — server side | 256 CX8 × 800 Gb/s | 204.8 Tb/s |
| IB compute — spine bisection | 8 leaves × 32 uplinks × 800 Gb/s | 204.8 Tb/s (1:1 non-blocking) |
| NVLink (per server) | NVLink Gen 5, 1.8 TB/s × 8 GPUs | 14.4 TB/s |
| NVLink (farm aggregate) | 32 × 14.4 TB/s | 460.8 TB/s |
| BF3 Ethernet fabric (farm) | 64 × 400 Gb/s | 25.6 Tb/s |
| Spectrum-4 switching (combined) | 2 × 51.2 Tb/s | 102.4 Tb/s |
| OOB management | 80 × 10 GbE | 800 Gb/s |

Farm Compute Performance

| Metric | Per server | Farm total (×32) |
| --- | --- | --- |
| FP8 dense (GPU peak) | ~36 PFLOPS | ~1,152 PFLOPS |
| NVFP4 sparse (GPU peak) | ~240 PFLOPS | ~7,680 PFLOPS |
| GPU memory (HBM3e) | 2.304 TB | 73.73 TB |
| HBM3e bandwidth | ~8 TB/s per GPU (~64 TB/s per server) | ~2,048 TB/s |
| System RAM (DDR5) | 4 TB | 128 TB |
| NVMe storage | 34.56 TB | ~1,106 TB |
| IB network BW | 6.4 Tb/s (8 × 800 Gb/s) | 204.8 Tb/s bisection |
| Ethernet BW (BF3) | 800 Gb/s | 25.6 Tb/s |

Topology Diagrams

Three diagrams were produced, available in .drawio (editable) and .png (rendered) in network-topology-diagrams/.

Diagram 1 — IDC Farm Planning & Connectivity
farm-idc-topology.drawio / .png
End-to-end connectivity block flow — shows all major components and their interconnections:
  • External uplink path: Internet → Router → Firewall → Border Leaf
  • Compute fabric: Spectrum-4 active-active → Q3400-RA IB fabric (leaves + spines) → 32× B300 servers
  • Management path: OOB switch → all servers (BMC + OS) + UFM Appliance → IB fabric managers
  • Zone separation clearly indicated: compute zone (right) vs. network tower (center)

Diagram 2 — Farm InfiniBand Topology (Rail View)
farm-ib-topology.drawio / .png
NVIDIA-style full IB topology — 3-row layout matching NVIDIA reference diagrams:
  • Row 1 (spines): 4× Q3400-RA S0–S3 (orange)
  • Row 2 (leaves): 8× Q3400-RA L0–L7, each color-coded by rail (matching the 8 GPU rail colors)
  • Row 3 (servers): 32× compute nodes C01–C32 grouped in 4 sets of 8
  • 256 rail-colored CX8 → leaf connections + 32 orange leaf-to-spine connections (one per leaf-spine pair)
  • Ethernet section: Spectrum-4 #1/#2, UFM, OOB — 64 BF3 → Spectrum-4 connections

Diagram 3 — Single Rack Leaf/Spine Connection Detail
single-rack-topology.drawio / .png
Per-port detail for a single representative server rack (C01); C02–C32 are identical:
  • 8× CX8 NDR800 AOC connections → 8 leaf switches (color-coded by rail, 2.5 px)
  • 8× leaf → 4× spine links (orange, 8 × 4 = 32 total leaf-spine connections)
  • BF3-0 → Spectrum-4 #1 | BF3-1 → Spectrum-4 #2 (purple AOC)
  • X710 Port 0 + Port 1 → OOB switch (yellow dashed = management)
  • UFM Appliance → server (blue dashed = UFM Agent SW) + UFM → Leaf L0 (blue dashed = IB fabric mgmt)
  • Connection color legend at bottom

Key Design Decisions & Rationale

| Decision | Choice | Rationale |
| --- | --- | --- |
| Servers per rack | 1 server / rack (9U, ~14.5 kW) | 20 kW/rack hard limit; 5.5 kW sustained margin, 4.6 kW burst margin |
| IB switch model | Q3400-RA (144p NDR800) | NVIDIA reference for HGX B300 training; 44% port utilization = scale-out headroom |
| IB topology | 2-tier rail-optimized fat-tree | GPU rail isolation: AllReduce = 0 spine hops; 1:1 non-blocking BW guarantee |
| Leaf count | 8 leaf switches | One per GPU rail (GPU-0…GPU-7 across 32 servers) |
| Spine count | 4 spine switches | Sets bisection = server BW (1:1 non-blocking guarantee) |
| Total Q3400-RA | 12 units | 8 leaf + 4 spine; NVIDIA validated configuration for 32-node B300 clusters |
| Max hops | 2 hops | Critical for NCCL; keeps server-to-server latency < 200 ns |
| Dual BF3 / server | Active-active → 2× Spectrum-4 | Full 800 Gb/s Ethernet per server; single-switch failure resilience |
| Spectrum-4 count | 2 (active-active) | Each BF3 connects to both; per-server BW = 800 Gb/s with redundancy |
| UFM architecture | 1 appliance + 32 SW agents | UFM is centralized by design; 648-port capacity covers current + scale-out |
| Network tower isolation | Dedicated racks N1–N2 | Predictable compute rack power envelopes; independent power feeds and cooling for switches |
| Cross-zone cables | AOC 5–10 m | DAC only works ≤3 m at NDR; inter-zone runs are 5–15 m |
| Intra-tower cables | DAC ≤3 m | All switches co-located; cheaper, lower latency than AOC for short runs |
| SHARP Gen 4 | Enabled (via Q3400-RA) | Up to 50% AllReduce BW reduction at 32-node scale for LLM gradient sync |
| Air cooling | Front-to-back (per ASUS XA design) | CRAC #1 & #2 support hot/cold aisle containment; no liquid cooling required |
| Spare network positions | N3–N6 reserved | Future scale-out: additional spines/Spectrum-4 when cluster exceeds 32 servers |

Risk & Mitigation Notes

| Severity | Risk | Mitigation |
| --- | --- | --- |
| HIGH | Leaf switch failure | UFM reroutes within seconds; training checkpointing limits job loss; cluster continues at 7/8 rail capacity |
| HIGH | Thermal runaway / CRAC failure | CRAC #1 + #2 redundancy; BMC thermal throttling at GPU junction limit; B300 has independent per-chip thermal shutdown |
| MED | GPU burst overshoot trips breaker | 4.6 kW burst margin at 20 kW limit; PDU per-outlet current monitoring; BMC IPMI power capping |
| MED | Single AOC cable failure | CX8 port fails over via UFM adaptive routing; per-GPU rail isolation — 1 cable = 1 rail affected, not all GPUs |
| MED | UFM Appliance failure | IB fabric continues with stale routing tables; hot-standby UFM VM on management server recommended for production |
| LOW | Spine switch failure | 4 spines provide redundant paths; ECMP distributes across remaining 3 spines automatically |
| LOW | Spectrum-4 failure | Active-active pair; BF3 bonding fails over to surviving switch; storage BW halved but no outage |
| LOW | PDU-A failure | Dual-feed PDU-A/PDU-B; 5+5 PSU pairs each bank to one feed; single PDU failure does not crash a server |
| PLAN | Scale beyond 32 servers | Q3400-RA at 44% utilization supports ~32 more servers per leaf (up to ~64 total); 4 spare N3–N6 rack positions available |
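For the spine-failure row, the degraded capacity is easy to quantify. A back-of-envelope Python sketch, assuming ideal ECMP rebalancing across the surviving spines:

```python
# Worst-case bisection after spine failures, assuming ideal ECMP rebalancing.
N_SPINES, BISECTION_TBPS = 4, 204.8

for failed in range(3):
    surviving = N_SPINES - failed
    capacity = BISECTION_TBPS * surviving / N_SPINES
    print(f"{failed} spine(s) down -> {capacity:5.1f} Tb/s "
          f"({surviving}/{N_SPINES} of nominal)")
# 0 down -> 204.8 Tb/s; 1 down -> 153.6 Tb/s; 2 down -> 102.4 Tb/s
```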