Kedios Infrastructure Report
B300 32-Server Farm + Standard-Aligned Network Tower — Architecture Specification
Date: March 30, 2026
Scope: 32-node deployed NVIDIA B300 compute farm with 72-node-standard-aligned network, management, and border architecture
Zones: 32-rack compute zone + refreshed network/services placement envelope
Overview
The Kedios package remains a 32-node NVIDIA Blackwell Ultra B300 training farm, but the network-side architecture basis is now refreshed. The compute population stays fixed at 32 servers / 256 GPUs. What changes is the way the report presents the surrounding network, management, border, and storage-network layers: they now follow the accepted 72-node standard B300 baseline rather than the old minimal custom-stack narrative.
This file therefore uses a dual-layer rule:
- Current deployment layer: 32 compute nodes, 256 GPUs, live Q3400-RA compute fabric
- Standard-aligned network layer: retained Spectrum-4 plus SN5610, SN4700, SN2201, and dual UFM
The old wording (N1/N2 occupied, N3–N6 empty, a single UFM, and a generic one-switch OOB layer) should no longer be treated as authoritative source text.
1. Physical Layout and Zone Structure
Locked Design Rules
| Rule | Locked position |
|---|---|
| Compute population | 32 nodes / 256 GPUs |
| Network baseline | 72-node-standard-aligned |
| IB compute fabric | 8 × Q3400-RA leaf + 4 × Q3400-RA spine |
| Ethernet side fabric | 2 × Spectrum-4 retained |
| Added standard layers | SN5610 ×6, SN4700 ×4, SN2201 ×17, UFM ×2 |
| Contracted package | 34 occupied racks |
| Facility allocation | 1.5 MW |
| Rack-constrained ceiling | 680 kW under the retained 20 kW/rack rule |
Compute Zone — 32-Rack Grid
The compute zone remains unchanged:
- 32 rack positions
- 32 ASUS XA NB3I-E12 B300 servers
- no networking hardware inside the compute racks as the base reporting assumption
- unchanged cold-aisle / hot-aisle and dual-feed PDU-A / PDU-B logic
The racks remain numbered C01 through C32 and still land within the working range of the existing inter-zone cable-length assumptions.
Network / Services Placement Envelope
The network-side narrative changes materially. The refreshed topology should no longer assume that only N1 and N2 are populated and that N3–N6 are simply future reserve. Instead, treat the broader network/services area as the placement envelope for:
- Q3400-RA compute fabric core
- retained Spectrum-4 pair
- SN5610 converged / storage-network layer
- SN4700 border and control layer
- SN2201 management layer
- dual UFM nodes
| Zone element | Current rule |
|---|---|
| C01–C32 | Fixed compute racks |
| N1–N6 | Logical placement envelope for the refreshed network stack |
| Final physical placement | To be locked in the draw.io refresh, not inferred from the old two-rack-only story |
2. Compute Zone — Per-Rack Build (×32)
Each of the 32 compute rack positions is built identically. The server is an ASUS XA NB3I-E12 carrying one NVIDIA HGX B300 × 8 tray. Each rack carries exactly one server.
The server occupies the bottom 9U of the rack (U1–U9). A patch panel at U11 terminates the management cables, and a cable management arm at U12–U13 organises the rack's 10 heavy AOC cables. The remaining 29U above U13 is intentionally left empty.
Power is delivered via two vertical in-rack PDUs on the rear posts — one fed from PDU-A, one from PDU-B. The 5+5 PSU array in the server pairs each bank with one PDU, giving N+5 PSU redundancy plus dual-feed facility redundancy simultaneously.
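As a minimal sketch of the redundancy arithmetic described above: the PSU counts and feed pairing are taken from this section, and the helper name is illustrative rather than any vendor tooling.

```python
# Minimal sketch (figures from this section): checks the per-rack power-redundancy claim.
PSUS_PER_SERVER = 10               # 5 + 5 PSU array described above
PSUS_REQUIRED = 5                  # assumed minimum PSU count to carry the server load
FEEDS = {"PDU-A": 5, "PDU-B": 5}   # one PSU bank paired with each facility feed

def redundancy_label(total: int, required: int) -> str:
    """Express spare PSUs in the usual N+x form (illustrative helper)."""
    return f"N+{total - required}"

# PSU-level redundancy: 10 installed vs 5 required -> N+5
print(redundancy_label(PSUS_PER_SERVER, PSUS_REQUIRED))   # N+5

# Feed-level redundancy: losing either facility feed still leaves a full PSU bank
for lost_feed, psus_on_feed in FEEDS.items():
    remaining = PSUS_PER_SERVER - psus_on_feed
    assert remaining >= PSUS_REQUIRED, f"losing {lost_feed} would underpower the server"
print("either feed can be lost while keeping", PSUS_REQUIRED, "PSUs online")
```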
Each rack's external cabling consists of:
- 8× NDR800 AOC cables (5–10 m) carrying the 8 ConnectX-8 IB ports to the 8 leaf switches in Rack N1
- 2× NDR400 AOC cables (5–10 m) carrying the 2 BlueField-3 DPU ports to the 2 Spectrum-4 switches in Rack N2
- 2× management links (5–10 m) carrying the X710 management NIC ports (OS management + BMC/IPMI) into the SN2201-based management layer
Every rack produces the same 12-cable bundle. Across all 32 racks, this creates 256 IB AOC cables, 64 Ethernet AOC cables, and 64 Cat6A management cables landing in the network tower.
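The per-rack bundle and the farm-wide totals quoted above can be cross-checked with a short sketch; all counts come from this section, and the cable-type labels are shorthand, not BOM part names.

```python
# Minimal sketch: reproduces the 12-cable per-rack bundle and the farm-wide totals.
RACKS = 32
PER_RACK = {
    "NDR800 AOC (CX8 -> Q3400 leaf)": 8,
    "NDR400 AOC (BF3 -> Spectrum-4)": 2,
    "Cat6A management (X710 -> SN2201 layer)": 2,
}

assert sum(PER_RACK.values()) == 12        # the 12-cable bundle per rack

farm_totals = {kind: count * RACKS for kind, count in PER_RACK.items()}
for kind, total in farm_totals.items():
    print(f"{kind}: {total}")
# Expected: 256 IB AOC, 64 Ethernet AOC, 64 Cat6A management cables
```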
3. IB Compute Fabric — Live Deployment
The Q3400-RA compute fabric remains the live cross-server training fabric for the current 32-node farm. The refresh does not change the compute-fabric count or rail design.
| Item | Value |
|---|---|
| Topology | 2-tier rail-optimized fat-tree |
| Leaf count | 8 |
| Spine count | 4 |
| Total Q3400-RA | 12 |
| Server-facing IB ports | 256 |
| Bisection bandwidth | 204.8 Tb/s |
| Max hops | 2 |
| Port utilization per Q3400 | 64 / 144 (44%) |
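As a back-of-envelope check on the table above, the sketch below rederives the per-switch port utilization and the quoted bisection figure from the stated port counts and the 800 Gb/s NDR800 rate; it is illustrative arithmetic, not fabric-manager output.

```python
# Minimal sketch: fabric arithmetic using only figures quoted in this report.
SERVERS, PORTS_PER_SERVER = 32, 8
LEAVES, SPINES, PORTS_PER_Q3400 = 8, 4, 144
NDR800_GBPS = 800                                      # per physical port

server_ports = SERVERS * PORTS_PER_SERVER              # 256 server-facing IB ports
downlinks_per_leaf = server_ports // LEAVES            # 32 per leaf
uplinks_per_leaf = downlinks_per_leaf                  # 1:1 non-blocking -> 32 uplinks
leaf_ports_used = downlinks_per_leaf + uplinks_per_leaf        # 64 of 144
spine_ports_used = uplinks_per_leaf * LEAVES // SPINES         # also 64 of 144

print(leaf_ports_used, f"{leaf_ports_used / PORTS_PER_Q3400:.0%}")   # 64 44%
print(server_ports * NDR800_GBPS / 1000, "Tb/s")       # 204.8 Tb/s, the quoted figure
```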
Rail Logic
Each server still maps one CX8 port per GPU rail:
- rail 0 to leaf L0
- rail 1 to leaf L1
- … through rail 7 to leaf L7
This keeps AllReduce traffic rail-local whenever possible and preserves the current 32-node training performance assumptions.
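A minimal sketch of this rail-to-leaf placement, assuming the leaf naming L0–L7 used above; the helper is illustrative only and is not UFM output.

```python
# Minimal sketch: rail-optimised placement, one dedicated leaf per GPU rail.
def leaf_for(rail: int) -> str:
    """Return the leaf switch serving a given GPU rail (illustrative helper)."""
    assert 0 <= rail <= 7
    return f"L{rail}"

# Every server's rail-3 port, for example, terminates on the same leaf, so an
# AllReduce step on rail 3 stays inside L3 and never crosses a spine.
print({f"rail {r}": leaf_for(r) for r in range(8)})
```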
SHARP and Fabric Management
All 12 Q3400-RA switches still support SHARP Gen 4 and the usual UFM-managed functions:
- topology discovery
- routing computation
- adaptive routing
- congestion response
- telemetry aggregation
What changes in this refresh is the management posture around the IB fabric: the base architecture now assumes 2 UFM nodes rather than a single-UFM default.
Inter-Switch Cabling
The locked compute-fabric inter-switch count remains:
- 256 passive DAC links for Q3400 leaf-to-spine connectivity
Those counts remain valid because the live 8-leaf / 4-spine compute fabric itself has not changed.
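The 256-link figure follows directly from the 8-leaf / 4-spine design; the sketch below shows that arithmetic, assuming 32 uplinks per leaf to match the 32 server-facing ports (1:1 non-blocking).

```python
# Minimal sketch: derives the 256-DAC leaf-to-spine count from the stated design.
LEAVES, SPINES = 8, 4
UPLINKS_PER_LEAF = 32                    # matches the 32 server-facing ports per leaf (1:1)

dac_links = LEAVES * UPLINKS_PER_LEAF                    # 256 passive DAC links in total
links_per_leaf_spine_pair = UPLINKS_PER_LEAF // SPINES   # 8 parallel links per leaf-spine pair

print(dac_links, links_per_leaf_spine_pair)              # 256, 8
```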
4. Standard-Aligned Ethernet, Management, Border, and UFM Layers
The refreshed network stack now carries four explicit non-compute layers.
| Component family | Count | Role |
|---|---|---|
| Spectrum-4 | 2 | Retained BF3-facing Ethernet side fabric |
| SN5610 | 6 | Standard-aligned converged / storage-network layer |
| SN4700 border leaf | 2 | Border layer |
| SN4700 C-spine / OOB | 2 | Control / OOB layer |
| SN2201 | 17 | Management layer |
| UFM nodes | 2 | Production + standby / HA pair |
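As a cross-check on the table above, the refreshed non-compute stack can be written as a simple count dictionary; all numbers are taken from this section, and the keys are shorthand labels only.

```python
# Minimal sketch: the refreshed non-compute switch stack as a count dictionary.
REFRESHED_STACK = {
    "Spectrum-4": 2,                 # retained BF3-facing side fabric
    "SN5610": 6,                     # owner-confirmed 2 + (2 + 2) formula
    "SN4700 border leaf": 2,
    "SN4700 C-spine / OOB": 2,
    "SN2201": 17,                    # owner-confirmed 9 + 8 split
    "UFM": 2,                        # production + standby / HA pair
}

assert REFRESHED_STACK["SN4700 border leaf"] + REFRESHED_STACK["SN4700 C-spine / OOB"] == 4
assert REFRESHED_STACK["SN2201"] == 9 + 8
print(sum(REFRESHED_STACK.values()), "non-compute network devices in the refreshed envelope")
```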
Spectrum-4
Spectrum-4 remains part of the accepted topology by explicit owner direction. It is not replaced by the standard-aligned additions. The 64 BF3-facing server links therefore stay valid and continue to represent the live Ethernet side-fabric path for the current 32-node farm.
SN5610
The SN5610 layer is locked at 6 total units, using the owner-confirmed 2 + (2 + 2) formula. In this source revision, the count and the role are both locked, while the final per-unit rack and port labels are deferred to the draw.io refresh.
SN4700
The SN4700 layer is locked at 4 total units:
- 2 border leaf
- 2 C-spine / OOB-related
This is now part of the accepted external-connectivity and control-plane story.
SN2201
The management layer is locked at 17 total units, using the owner-confirmed 9 + 8 split. This replaces the previous generic single-switch OOB narrative as the base design position.
The server-side management-port mapping remains unchanged until the final ASUS integration package says otherwise:
- X710 Port 0 = OS management
- X710 Port 1 = BMC / IPMI
UFM
The topology now assumes 2 UFM nodes as the base architecture. Present them as a production + standby / HA pair unless a stronger implementation detail is later confirmed.
The reason for moving from one to two UFM nodes is operational resilience and standards alignment, not raw port-count exhaustion.
5. Inter-Zone Cable Infrastructure
All cables from the 32 compute racks still run through the overhead tray path into the refreshed network/services placement envelope. The locked server-side cable counts are:
| Cable type | Count | Route | Notes |
|---|---|---|---|
| NDR800 AOC (CX8 → Q3400 leaf) | 256 | 32 racks × 8 cables | Still the dominant cable bundle |
| 400 GbE AOC (BF3 → Spectrum-4) | 64 | 32 racks × 2 cables | Retained Spectrum-4 path |
| Management links (X710 → management layer) | 64 | 32 racks × 2 links | Server-side management quantity unchanged |
| DAC (Q3400 leaf ↔ spine) | 256 | Within network/services envelope | Live compute-fabric interconnect |
Additional cable counts for SN5610, SN4700, SN2201, and the second UFM node must be finalized in the draw.io refresh. Do not reuse the old all-in cable total from the pre-refresh topology as the final number for the refreshed design.
6. Power and Capacity Framing
| Layer | Value | Comment |
|---|---|---|
| 32 compute racks sustained | ~464 kW | Still valid |
| 32 compute racks burst peak | ~492 kW | Still valid |
| 34-rack hard ceiling | 680 kW | 34 × 20 kW |
| Facility allocation | 1.5 MW | Owner-confirmed update |
Important distinction:
- 1.5 MW is the facility-allocation envelope
- 680 kW is the current rack-constrained hard ceiling under the unchanged rack rule
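A minimal sketch keeping the two ceilings as separate numbers, using only figures already quoted in this report (including the ~14.5 kW sustained per compute rack noted in the glossary); it makes no new power claim.

```python
# Minimal sketch: the two power ceilings above, kept distinct.
RACKS_CONTRACTED = 34
RACK_LIMIT_KW = 20.0
FACILITY_ALLOCATION_KW = 1500.0

rack_ceiling_kw = RACKS_CONTRACTED * RACK_LIMIT_KW     # 680 kW rack-constrained hard ceiling
compute_sustained_kw = 32 * 14.5                       # ~464 kW across the 32 compute racks

print(rack_ceiling_kw, "kW rack-constrained ceiling")
print(FACILITY_ALLOCATION_KW - rack_ceiling_kw, "kW of facility allocation above the rack ceiling")
print(rack_ceiling_kw - compute_sustained_kw, "kW headroom under the ceiling at sustained load")
```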
Because the refreshed network baseline introduces SN5610, SN4700, SN2201, and a second UFM node, this source revision does not claim a new all-network watt total until the final vendor power sheets are attached to the refreshed procurement BOM.
7. Split Baseline vs Current Deployment Snapshot
| Component family | 72-node standard baseline | Current 32-node populated deployment |
|---|---|---|
| Compute nodes | 72 | 32 |
| GPUs | 576 | 256 |
| Q3400-RA leaf | 8 | 8 |
| Q3400-RA spine | 4 | 4 |
| Spectrum-4 | 2 | 2 |
| UFM nodes | 2 | 2 |
| SN5610 | 6 | 6 |
| SN4700 | 4 | 4 |
| SN2201 | 17 | 17 |
| UFM Agent (software) | 72 | 32 |
This is now the mandatory reporting rule for downstream summaries, questionnaires, and diagrams:
- compute population differs between the standard baseline and the live deployment
- network-side switch counts do not in the current working design
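A minimal sketch of this reporting rule, assuming the counts from the table above; the snapshot helper is illustrative and simply builds one column of the split table for a given compute population.

```python
# Minimal sketch: compute-side quantities scale with the populated node count,
# network-side switch counts do not. All figures come from the split table above.
BASELINE_NODES, CURRENT_NODES = 72, 32
GPUS_PER_NODE = 8

NETWORK_COUNTS = {            # identical in both columns by design
    "Q3400-RA leaf": 8, "Q3400-RA spine": 4, "Spectrum-4": 2,
    "UFM nodes": 2, "SN5610": 6, "SN4700": 4, "SN2201": 17,
}

def snapshot(nodes: int) -> dict:
    """Build one column of the split table for a given compute population."""
    return {"Compute nodes": nodes, "GPUs": nodes * GPUS_PER_NODE,
            "UFM Agent (software)": nodes, **NETWORK_COUNTS}

baseline, current = snapshot(BASELINE_NODES), snapshot(CURRENT_NODES)
print(current["GPUs"])                                        # 256 for the live 32-node farm
assert all(baseline[k] == current[k] for k in NETWORK_COUNTS) # switch counts match
```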
Glossary
- NDR
- Next Data Rate — InfiniBand generation at 400 Gb/s (NDR400) or 800 Gb/s (NDR800) per physical port.
- NDR400
- InfiniBand NDR at 400 Gb/s per port, used by the BlueField-3 DPU for side-fabric connections.
- NDR800
- InfiniBand NDR at 800 Gb/s per port, used by ConnectX-8 HCAs on the HGX B300 GPU-to-fabric links.
- ConnectX-8
- NVIDIA ConnectX-8 NDR800 InfiniBand HCA integrated on the HGX B300 tray — 8 per server, one per GPU rail.
- BlueField-3
- NVIDIA BF-3220 DPU — 400G NDR400 InfiniBand, provides side-fabric connectivity and in-network compute offload.
- Q3400-RA
- NVIDIA Quantum-X800 Q3400 Rail-Accelerated InfiniBand switch — 144 NDR ports; deployed as 8 leaf + 4 spine.
- Spectrum-4
- NVIDIA Spectrum-4 400 GbE Ethernet switch — 51.2 Tb/s; retained as active-active side-fabric pair.
- SN5610
- NVIDIA Spectrum-SN5610 converged 400G Ethernet switch — 6 units in the storage/converged service plane.
- SN4700
- NVIDIA Spectrum-SN4700 400G Ethernet switch — 4 units for border/WAN handoff and control-plane.
- SN2201
- NVIDIA Spectrum-SN2201 1G/10G management switch — 17 units covering the full OOB management layer.
- UFM
- Unified Fabric Manager — NVIDIA IB fabric management; deployed as 2-node HA pair (production + standby).
- SHARP
- Scalable Hierarchical Aggregation and Reduction Protocol — in-network collective offload on Q3400-RA.
- HGX B300
- NVIDIA HGX Blackwell Ultra B300 — 8-GPU tray with NVLink Gen 5 at 1.8 TB/s per GPU, 14.4 TB/s aggregate across the tray.
- B300 GPU
- NVIDIA Blackwell Ultra B300 — 288 GB HBM3e, 1.1 kW TDP; current report basis uses ~4.5 PFLOPS FP8 dense / ~9 PFLOPS FP8 sparse and ~15 PFLOPS NVFP4 dense / ~30 PFLOPS NVFP4 sparse per GPU.
- NVLink
- NVIDIA direct GPU interconnect — Gen 5 on Blackwell at 1.8 TB/s per GPU, yielding 14.4 TB/s across an 8-GPU HGX B300 tray.
- HBM3e
- High Bandwidth Memory 3e — stacked DRAM in B300 GPUs at 288 GB per GPU, 8 TB/s peak bandwidth.
- Fat-Tree
- Network topology providing non-blocking bisection bandwidth; IB compute fabric is a 2-tier rail-optimised fat-tree.
- Rail-Optimised
- IB fabric layout: each GPU rail maps to a dedicated leaf switch, keeping AllReduce traffic rail-local.
- AOC
- Active Optical Cable — fibre-based cable with integrated E/O conversion, used for all NDR800 IB inter-rack links.
- IPMI / BMC
- Intelligent Platform Management Interface / Baseboard Management Controller — out-of-band server management.
- PDU-A / PDU-B
- Dual-feed power distribution: each PSU bank pairs with one PDU, giving N+5 PSU + dual-feed facility redundancy.
- CRAC / CRAH
- Computer Room Air Conditioner / Air Handler — precision cooling units, N+1 target coverage in the Kedios facility.
- DPU
- Data Processing Unit — BlueField-3 Smart NIC providing network/storage offload and security isolation.
- XA NB3I-E12
- ASUS server SKU: 9U air-cooled, dual Xeon 6776P, 32 × 128 GB DDR5 (4 TB total), 10× NVMe, HGX B300 ×8, CX-8 ×8, BF-3 ×2.
- Xeon 6776P
- Intel Xeon 6 Granite Rapids-SP — 56-core, PCIe 5.0 host CPU in the XA NB3I-E12 server; current server power tables in this repo model ~350 W per socket.
- NVFP4
- NVIDIA FP4 format — current report basis uses ~15 PFLOPS dense / ~30 PFLOPS sparse per B300 GPU, reported in this repo as ~240 PFLOPS sparse per 8-GPU server.
- FP8
- 8-bit float — current report basis uses ~4.5 PFLOPS dense / ~9 PFLOPS sparse per B300 GPU, with the report itself citing ~36 PFLOPS dense per 8-GPU server.
- AllReduce
- Distributed-training collective operation across all GPUs; accelerated by IB fat-tree fabric and SHARP.
- Fat-Tree Bisection BW
- 204.8 Tb/s across the full 32-server farm — 1:1 non-blocking, no fabric oversubscription.
- 20 kW Rack Limit
- Hard power cap per rack in the Kedios facility; servers draw ~14.5 kW sustained, leaving 5.5 kW margin.