Farm Planning Summary

Kedios B300 Server Farm

32-Node ASUS XA NB3I-E12 · HGX Blackwell Ultra B300  –  Prepared February 28, 2026

  • 256 NVIDIA B300 GPUs
  • 1,152 PFLOPS FP8
  • 73.7 TB HBM3e memory
  • 204.8 Tb/s IB bisection bandwidth
  • ~530 kW peak farm power
  • 47% of the 1 MW budget remaining as headroom

Data Center Infrastructure & Power

Facility Power

  • Total DC capacity: 20 MW
  • This purchase allocation: 1 MW (1,000 kW)
  • Per-rack wall-output ceiling: 20 kW
  • Structure: Multi-level (L1, L2, L3)

Server Rack Power Result

  • Sustained wall draw: ~14.5 kW
  • Burst wall draw (+6%): ~15.4 kW
  • Margin (sustained): 5.5 kW ✓
  • Margin (burst): 4.6 kW ✓

Bottom-Up Component Power Audit

Component | Per-Unit | Qty | Subtotal
NVIDIA B300 GPU | 1,100 W | ×8 | 8,800 W
Intel Xeon 6776P CPU | 350 W | ×2 | 700 W
DDR5-6400 128 GB RDIMM | ~7 W | ×32 | 224 W
Samsung PM9D3a NVMe U.2 | ~10 W | ×10 | 100 W
ConnectX-8 NIC (HGX on-board) | ~20 W | ×8 | 160 W
BlueField-3 3220 DPU | ~40 W | ×2 | 80 W
Intel X710-AT2 mgmt NIC | ~12 W | ×1 | 12 W
System fans (80 mm) | ~18 W | ×15 | 270 W
CPU fans (60 mm) | ~7 W | ×6 | 42 W
PCIe switch PEX89144 | ~12 W | ×1 | 12 W
VRMs / motherboard / misc | | | 175 W
Component sum | | | 10,575 W
PSU loss (80 PLUS Titanium, ~4.5%) | | | +494 W
At-wall sustained draw | | | ~14,500 W
In-rack PDU cabling (~1%) | | | included in the wall-draw baseline
Total rack draw, sustained peak | | | ~14,500 W ≈ 14.5 kW
GPU burst overshoot (+6%) | | | ~15,370 W ≈ 15.4 kW
Each server fits within the 20 kW wall-output ceiling. Sustained wall draw of ~14.5 kW leaves a 5.5 kW margin, and even at burst wall draw (+6%, ~15.4 kW) the margin remains 4.6 kW, so there is no risk of breaker trips.
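
The margin arithmetic is easy to re-derive; below is a minimal sketch, assuming the audit's ~14.5 kW at-wall baseline and the +6% burst factor (figures taken from the table above, rounded):

```python
# Rack-level power check against the 20 kW wall-output ceiling.
WALL_CEILING_KW = 20.0   # per-rack wall-output ceiling
SUSTAINED_KW = 14.5      # at-wall sustained draw (audit baseline)
BURST_FACTOR = 1.06      # GPU burst overshoot (+6%)

burst_kw = SUSTAINED_KW * BURST_FACTOR              # ~15.4 kW
margin_sustained = WALL_CEILING_KW - SUSTAINED_KW   # 5.5 kW
margin_burst = WALL_CEILING_KW - burst_kw           # ~4.6 kW

assert burst_kw < WALL_CEILING_KW, "burst draw would exceed the rack ceiling"
print(f"burst {burst_kw:.1f} kW, margins {margin_sustained:.1f} / {margin_burst:.1f} kW")
```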

1 MW Budget — Power Allocation

  • 32 server racks (burst peak): 492 kW (49.2% of 1 MW)
  • 2 network tower racks: 18 kW (1.8%)
  • Cooling / UPS overhead: 20 kW (2.0%)
  • Remaining headroom: 470 kW (47.0% unused); see the budget sketch below
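
For reference, the percentages above can be reproduced from the per-rack burst figure; a small sketch using the rounded loads listed above (dictionary keys are labels only):

```python
# 1 MW power-budget allocation, reproducing the list above.
BUDGET_KW = 1000.0
loads_kw = {
    "32 server racks (burst peak)": 32 * 15.37,  # ~492 kW
    "2 network tower racks": 18.0,
    "cooling / UPS overhead": 20.0,
}
for name, kw in loads_kw.items():
    print(f"{name}: {kw:.0f} kW ({kw / BUDGET_KW:.1%})")
headroom = BUDGET_KW - sum(loads_kw.values())
print(f"remaining headroom: {headroom:.0f} kW ({headroom / BUDGET_KW:.1%})")
```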

Physical Zone Allocation

  • Compute zone: 32 racks · ASUS XA NB3I-E12 · 100% occupied, pure compute
  • Inter-zone links: AOC, 5–10 m runs
  • Network zone: 2 used + 4 spare racks · N1: IB fabric · N2: Eth/UFM/OOB

Zone | Capacity | Allocation | Spare
32-rack compute zone | 32 racks | 32× ASUS XA NB3I-E12 B300 servers | 0
6-rack network zone | 6 racks | 2× dedicated switch/fabric racks | 4 (future scale-out)
Placement confirmed: the 32 server racks fill the 32-rack compute zone, the 2 dedicated switch/fabric racks go to the adjacent 6-rack network zone, and 4 spare positions are reserved for future network expansion.

Infrastructure Layout

  • Compute zone: 32× ASUS XA NB3I-E12 B300 — pure compute, no networking gear inside
  • Network zone (Rack N1): 12× Q3400-RA IB spine/leaf switches — InfiniBand fabric
  • Network zone (Rack N2): 2× Spectrum-4 + 1× UFM Appliance + 1× OOB switch
  • Power: PDU-A and PDU-B dual-feed redundancy on all racks
  • UPS: UPS Rooms #1, #2, #3 + UPS 3 & UPS 4 (building level)
  • Cooling: CRAC #1 and CRAC #2, A/C condensers, cold/hot aisle containment

Inter-Zone Cabling (AOC required — adjacent zones)

Link Type | Cable | Rationale
ConnectX-8 → Q3400-RA (XDR 800 Gb/s) | AOC 5–10 m | Cross-zone runs ~5–15 m; DAC only viable ≤3 m
BlueField-3 → Spectrum-4 (400 GbE) | AOC 5–10 m | Same inter-zone distance
X710 mgmt NIC → OOB switch (10 GbE) | Cat6A 5–10 m | Copper viable to 100 m
Q3400-RA ↔ Q3400-RA (intra-network zone) | DAC ≤3 m | All switches co-located in N1; short runs
Note: the AOC vs DAC cost delta is ~$120–250 per cable. Total cross-zone AOC links: 256 IB + 64 Ethernet = 320 cables. No switch-count or architectural changes are required.
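
Since the cable count scales linearly with server count, the bill of materials is one line per link class; a sketch using the per-cable delta from the note (the dollar range is the estimate quoted above, not a vendor quote):

```python
# Cross-zone AOC cable count and cost delta vs DAC.
SERVERS = 32
ib_links = SERVERS * 8    # ConnectX-8 -> leaf, one per GPU rail = 256
eth_links = SERVERS * 2   # BlueField-3 -> Spectrum-4 = 64
total_aoc = ib_links + eth_links          # 320 cables
delta_low, delta_high = 120, 250          # $/cable, AOC vs DAC
print(f"{total_aoc} AOC cables; delta ${total_aoc * delta_low:,}"
      f"-${total_aoc * delta_high:,} vs all-DAC")
```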

Server Specification — ASUS XA NB3I-E12

9U air-cooled rackmount · 32 units · 1 per rack · direct front-to-back airflow

Component | Model / Spec | Qty per Server
GPU | NVIDIA Blackwell Ultra B300 (HGX tray), TDP 1,100 W | 8
GPU memory | 288 GB HBM3e per GPU (12-high stacks) = 2.304 TB per server |
CPU | Intel Xeon Platinum 6776P (56 cores, 350 W) | 2
System RAM | Samsung M321RAJA0MB2-CCP 128 GB DDR5-6400 RDIMM | 32 (= 4 TB)
Boot SSD | Samsung PM9D3a U.2 Gen5 NVMe, 1.92 TB | 2
Data SSD | Samsung PM9D3a U.2 Gen5 NVMe, 3.84 TB | 8
IB NIC | ConnectX-8 (on-board, 1 per GPU), XDR 800 Gb/s | 8
DPU | NVIDIA BlueField-3 3220, 400 Gb/s NDR400 | 2
Mgmt NIC | Intel X710-AT2 dual-port 10 GbE RJ45 | 1
UFM agent | Software only (no hardware), installed on the OS | 1 (SW)

Per-Server Performance

Compute
  • FP8 dense: ~36 PFLOPS
  • NVFP4 sparse: ~240 PFLOPS
  • NVLink BW: 14.4 TB/s (full tray)
  • GPU-to-GPU latency (intra-node): <100 ns
Storage
  • System RAM: 4 TB DDR5-6400
  • GPU memory: 2.304 TB HBM3e
  • NVMe total: 34.56 TB
  • System memory BW: ~600 GB/s

32-Server Farm Aggregate

Metric | Value
Total GPUs | 256× NVIDIA B300
Total GPU memory | 73.73 TB HBM3e (256 × 288 GB)
Total system RAM | 128 TB DDR5
Total NVMe storage | ~1,106 TB
Peak compute (FP8 dense) | ~1,152 PFLOPS
Peak compute (NVFP4 sparse) | ~7,680 PFLOPS
Operating power | ~464 kW
Peak sustained wall draw | ~464 kW
Absolute burst peak | ~492 kW
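
All farm-level figures are linear in server count; a minimal derivation from the per-server specification (constants taken from the tables above):

```python
# Farm aggregates derived from the per-server specification.
SERVERS = 32
gpus = SERVERS * 8                           # 256 GPUs
hbm_tb = gpus * 288 / 1000                   # 73.7 TB HBM3e
ram_tb = SERVERS * 4                         # 128 TB DDR5 (4 TB/server)
nvme_tb = SERVERS * (2 * 1.92 + 8 * 3.84)    # ~1,106 TB NVMe
fp8_pflops = SERVERS * 36                    # ~1,152 PFLOPS FP8 dense
nvfp4_pflops = SERVERS * 240                 # ~7,680 PFLOPS NVFP4 sparse
sustained_kw = SERVERS * 14.5                # ~464 kW sustained wall
burst_kw = round(sustained_kw * 1.06)        # ~492 kW absolute burst peak
```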

Switch Topology — 2-Tier Rail-Optimized Fat-Tree

Decision locked: Q3400-RA at 1:1 non-blocking, so AllReduce / AllGather run at the full 204.8 Tb/s bisection bandwidth for maximum training throughput.

Leaf Switches ×8

  • One per GPU rail (Rail 0–7)
  • 32 downlinks: one CX8 per server
  • 32 uplinks: split equally to 4 spines
  • Port utilization: 44% (64/144)

Spine Switches ×4

  • Full any-to-any inter-leaf path
  • 64 ports from all 8 leaf switches
  • 8 parallel links per leaf-spine pair
  • Port utilization: 44% (64/144); see the port-math check below
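
The utilization figures follow directly from the rail wiring; a quick port-math check, assuming one CX8 per server on each rail and uplinks spread evenly across the spines:

```python
# Port math for the 2-tier rail-optimized fat-tree (Q3400-RA: 144 ports).
PORTS_PER_SWITCH = 144
SERVERS, LEAVES, SPINES = 32, 8, 4

leaf_down = SERVERS                  # one CX8 per server on this rail
leaf_up = leaf_down                  # 1:1 non-blocking: uplinks = downlinks
assert leaf_up % SPINES == 0         # even spread: 8 links per spine
spine_ports = LEAVES * (leaf_up // SPINES)    # 8 x 8 = 64 leaf-facing ports

print(f"leaf: {(leaf_down + leaf_up) / PORTS_PER_SWITCH:.0%} used, "
      f"spine: {spine_ports / PORTS_PER_SWITCH:.0%} used")   # 44% / 44%
```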

Switch Hardware Count

Component | Count | Role
Q3400-RA, leaf | 8 | 32 server downlinks + 32 spine uplinks; 1 per GPU rail
Q3400-RA, spine | 4 | 64 leaf-facing ports; full bisection mesh
Total Q3400-RA | 12 | 4U each · 115.2 Tb/s per unit
Spectrum-4 Ethernet | 2 | BF3 DPU 400 GbE; active-active redundancy
OOB 10 GbE mgmt switch | 1 | BMC/IPMI + OS management
UFM Appliance (hardware) | 1 | Centralized IB fabric manager
UFM agent (software) | 32 | Installed on each server; no hardware
Total hardware units | 16 | 12 Q3400-RA + 2 Spectrum-4 + 1 OOB + 1 UFM

Aggregate Bandwidth Summary

Fabric Layer | Calculation | Total Bandwidth
IB compute, server side | 256 ports × 800 Gb/s | 204.8 Tb/s
IB compute, spine bisection | 8 leaves × 32 uplinks × 800 Gb/s | 204.8 Tb/s (1:1 non-blocking)
BF3 storage / DPU fabric | 64 ports × 400 Gb/s | 25.6 Tb/s
OOB management | 80 connections × 10 GbE | 800 Gb/s
  • 204.8 Tb/s IB bisection, 1:1 non-blocking · max hops: 2 (leaf→spine→leaf) · latency <200 ns
  • 25.6 Tb/s Ethernet / storage fabric (BF3 DPU) · active-active across 2× Spectrum-4
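
The non-blocking claim reduces to a single equality: aggregate leaf-uplink bandwidth matches aggregate server injection bandwidth. A tiny check using the port counts from the table above:

```python
# Aggregate fabric bandwidth (Tb/s) from port counts and line rates.
def tbps(ports: int, gbps: int) -> float:
    return ports * gbps / 1000

ib_injection = tbps(256, 800)      # server side: 204.8 Tb/s
ib_bisection = tbps(8 * 32, 800)   # leaf uplinks: 204.8 Tb/s
bf3_fabric = tbps(64, 400)         # DPU/storage fabric: 25.6 Tb/s
oob_mgmt = tbps(80, 10)            # OOB management: 0.8 Tb/s

# 1:1 non-blocking means bisection matches injection exactly.
assert ib_bisection == ib_injection
```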

Key Design Decisions & Constraints

Decision | Rationale
1 server per rack | 9U chassis; 20 kW/rack wall-output ceiling. Burst wall draw of 15.4 kW leaves a 4.6 kW safety margin.
8 leaf + 4 spine Q3400-RA | Rail-optimized fat-tree: GPU port i → leaf i isolates the 8 rails. 204.8 Tb/s bisection, 1:1 non-blocking.
Switches in separate racks | Keeps compute racks under 20 kW. Switch racks run ~8.8 kW each, an 11.2 kW margin.
Network zone (6 racks) for switches | Adjacent to the compute zone for short AOC runs. 4 spare positions for future scale-out.
Dual BF3 per server | Active-active storage-fabric bonding. Offloads RDMA / encryption from the host CPUs.
1 central UFM Appliance | Manages the entire 32-node IB domain (256 endpoints + 12 switches). Capacity limit: 648 ports.
Air cooling (front-to-back) | The ASUS XA NB3I-E12 is direct air-cooled. CRAC #1 & #2 support hot/cold aisle containment.
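
The single-UFM decision rests on the 648-port capacity figure quoted above; a rough headroom check, counting each CX8 endpoint and each switch as one managed entity (that counting rule is an assumption for illustration):

```python
# UFM fabric-manager headroom check (capacity figure from the table).
UFM_PORT_LIMIT = 648
endpoints = 32 * 8      # 256 ConnectX-8 endpoints
switches = 12           # 8 leaf + 4 spine Q3400-RA
managed = endpoints + switches       # 268 managed entities
assert managed <= UFM_PORT_LIMIT     # headroom left for the 4 spare racks
print(f"{managed}/{UFM_PORT_LIMIT} managed ({managed / UFM_PORT_LIMIT:.0%})")
```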