Infrastructure & Supporting Systems

B300 Farm — Cooling · Power · Storage

Kedios B300 · 32-Node Blackwell Ultra GPU Farm  ·  March 1, 2026  ·  Companion to: B300 Farm and Network Tower Architecture

Key figures:
  • ~494 kW · Total Heat Load
  • 650+ kW · Cooling Capacity Required
  • 80 · Rack PDUs (dual-feed)
  • ~590 TB · Shared Storage Usable
  • 40 · Total Rack Positions
  • ~436–466 kW · Power Headroom vs 1 MW

Overview

The 32-node B300 farm generates approximately 494 kW of sustained heat load across servers, networking, and storage within its allocated 1 MW facility budget. Three supporting infrastructure pillars — cooling, power delivery, and storage — must be designed and racked in parallel with the compute and network zones already defined. These systems live on a separate physical layer, each with dedicated rack positions, independent power feeds, and independent management.

At a glance:
  • ❄️ Cooling · N+1 · 4× CRAC units · cold-/hot-aisle containment · 650+ kWth
  • ⚡ Power · Dual-feed · 80 rack PDUs · row ATS · 5+5 PSUs per server · row UPS optional
  • 🗄️ Storage · ~590 TB · 2× NVMe-dense nodes · IB-attached · NVMe-oF / Lustre

1. Cooling Infrastructure

1.1 Cluster Heat Load

Source | Count | Per-unit | Total Heat
B300 servers (sustained draw at wall) | 32 | ~14.5 kW | ~464 kW
Network racks (IB switches + Ethernet) | 2 | — | ~17.6 kW
Storage nodes (estimated) | ~4 | ~3 kW | ~12 kW
Total cluster heat load | — | — | ~494 kW
Burst ceiling (+6% GPU transient on compute) | — | — | ~522 kW

⚠️ Cooling capacity must be rated for the burst figure (~522 kW) with N+1 headroom — i.e., a minimum of 650+ kWth installed. N+1 means that if the largest CRAC unit fails, the remaining units must still cover the full burst load. A quick sanity check of these figures follows.
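
As a sanity check on the table above, a minimal Python sketch using the document's own numbers (the +6% transient is applied to compute only) reproduces the sustained and burst figures and tests the N+1 rule:

```python
# Sanity-check the section 1.1 heat-load table and the N+1 CRAC rule.
SERVERS = 32
KW_PER_SERVER = 14.5   # sustained wall draw per B300 server
NETWORK_KW = 17.6      # IB switches + Ethernet
STORAGE_KW = 12.0      # ~4 storage nodes at ~3 kW each

sustained_kw = SERVERS * KW_PER_SERVER + NETWORK_KW + STORAGE_KW
burst_kw = SERVERS * KW_PER_SERVER * 1.06 + NETWORK_KW + STORAGE_KW

def n_plus_1_ok(units: int, unit_kwth: float, load_kw: float) -> bool:
    """If the largest unit fails, the remaining units must carry the load."""
    return (units - 1) * unit_kwth >= load_kw

print(f"sustained ~{sustained_kw:.0f} kW, burst ~{burst_kw:.0f} kW")
# -> sustained ~494 kW, burst ~521 kW (the table rounds up to ~522 kW)
print(n_plus_1_ok(4, 200.0, burst_kw))  # True: 3 x 200 kWth = 600 kWth remain
print(n_plus_1_ok(3, 250.0, burst_kw))  # False: 2 x 250 kWth = 500 kWth remain
```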

1.2 Air Cooling Architecture (Primary)

The ASUS XA NB3I-E12 is air-cooled with 15× 80 mm high-RPM rear fans + 6× 60 mm CPU fans. Standard cold-aisle / hot-aisle containment is the primary method at Phase 1 scale.

Cold-Aisle / Hot-Aisle Containment

  • Cold aisles face rack fronts — conditioned supply air enters inlets
  • Hot aisles face rack rears — exhaust ~55–65°C at full GPU load
  • Hot-aisle containment (HAC) ceiling panels / chimney preferred
  • Prevents hot-air recirculation into adjacent cold aisles

Operating Temperature Thresholds

  • Inlet air (normal operating): 18–27°C
  • Inlet air (maximum ASHRAE A2): 35°C
  • Relative humidity: 20–80% non-condensing
  • Facility chilled water supply: ≤12°C recommended
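
For illustration, the thresholds above could be evaluated per rack roughly as follows. This is a sketch: the function name and severity labels are assumptions rather than any BMS product's API, and in practice the logic would live in the DCIM/BMS layer:

```python
# Illustrative inlet-air check against the section 1.2 thresholds
# (18-27 C normal, 35 C ASHRAE A2 maximum, 20-80% RH non-condensing).
def classify_inlet(temp_c: float, rh_pct: float) -> str:
    if not 20.0 <= rh_pct <= 80.0:
        return "ALARM: humidity outside the 20-80% non-condensing band"
    if temp_c > 35.0:
        return "ALARM: above ASHRAE A2 maximum (35 C)"
    if not 18.0 <= temp_c <= 27.0:
        return "WARN: outside the normal operating band (18-27 C)"
    return "OK"

print(classify_inlet(24.0, 45.0))  # OK
print(classify_inlet(31.0, 45.0))  # WARN: above 27 C but inside the A2 limit
print(classify_inlet(36.5, 45.0))  # ALARM
```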

1.3 CRAC / CRAH Unit Specification

Parameter | Value
Total cooling required (burst ceiling) | ~522 kW
Recommended installed capacity | 650+ kWth (+20% headroom for N+1)
Minimum CRAC units | 4 (3 active + 1 standby = N+1)
Unit sizing (each) | 200–250 kWth chilled-water CRAH
Facility supply water temp | ≤12°C (≤15°C acceptable)
Airflow configuration | Under-floor (raised-floor plenum) or overhead ducting
Temperature monitoring | Per-rack sensors → UFM / facility BMS integration

ℹ️ The existing architecture document references CRAC #1 and CRAC #2. Under the revised load profile, two additional units (CRAC #3 and CRAC #4) are required to maintain N+1 coverage at full burst load. Confirm with the datacenter facility operator.

1.4 Direct Liquid Cooling (Future Option — Phase 2)

At 8,800 W of pure GPU heat per server (8× B300 at 1,100 W TDP), DLC becomes attractive at scale-out beyond 32 nodes.

Option | Description | Applicability
Rear-Door Heat Exchanger (RDHx) | Liquid-cooled door captures hot exhaust from the existing fans | Drop-in retrofit, no server modification
Direct Liquid Cooling (DLC) | Cold plates on the GPUs — eliminates hot-air exhaust | Requires server support — check the ASUS roadmap
In-Row Cooling (IRC) | Dedicated cooling unit between rack rows | Good for zone isolation

At Phase 1 (32 nodes), standard CRAC + containment is sufficient. Evaluate DLC/RDHx at Phase 2, when the cluster approaches 64 nodes and the cooling load nears 750 kW.

2. Power Distribution Infrastructure

2.1 Farm Power Budget

Share of the 1 MW facility budget:
  • Compute servers (~464 kW) · 46.4%
  • Network racks (~17.6 kW) · 1.8%
  • Storage nodes (~12 kW) · 1.2%
  • Cooling infrastructure (~40 kW) · 4.0%

Zone | Racks | Sustained Draw | Facility Allocation
Compute zone | 32 | ~464 kW | 640 kW (32 × 20 kW)
Network zone | 2 occupied | ~17.6 kW | ~40 kW
Storage zone | 1–2 | ~12 kW | ~20 kW
Cooling (CRAC fans, pumps) | — | ~40–70 kW | ~80 kW
Total Phase 1 | — | ~534–564 kW | ~780 kW
Facility allocation (1 MW) | — | — | 1,000 kW
Headroom remaining | — | — | ~436–466 kW
The 1 MW allocation remains sufficient for Phase 1. Approximately 436–466 kW of headroom remains available for Phase 2 scale-out without requesting additional facility capacity from the datacenter.
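
The same reconciliation in sketch form, with the ranges reflecting the 40–70 kW spread of the cooling load:

```python
# Reconcile Phase 1 zone draws against the 1 MW facility allocation.
ZONES_KW = {
    "compute": (464.0, 464.0),
    "network": (17.6, 17.6),
    "storage": (12.0, 12.0),
    "cooling": (40.0, 70.0),  # CRAC fans and pumps vary with load
}
FACILITY_KW = 1000.0

low = sum(lo for lo, _ in ZONES_KW.values())
high = sum(hi for _, hi in ZONES_KW.values())
print(f"Phase 1 draw: ~{low:.0f}-{high:.0f} kW")   # ~534-564 kW
print(f"headroom: ~{FACILITY_KW - high:.0f}-{FACILITY_KW - low:.0f} kW")
# -> headroom: ~436-466 kW
```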

2.2 Per-Rack PDU Specification

Every rack — compute, network, and storage — is fitted with two independent vertical PDUs (PDU-A fed from busbar A, PDU-B from busbar B). A single feed failure never takes down an entire rack.

Parameter | Value
PDU form factor | 0U vertical, rear-post mounted
Feed configuration | Dual-feed: PDU-A (busbar A) + PDU-B (busbar B)
Phase | 3-phase, 230/400 V
Outlets | IEC C19 (server PSUs) + IEC C13 (management/1U devices)
Capacity per PDU | 32 A 3-phase ≈ 22 kVA (~17.7 kW at 0.8 PF)
Metering | Per-outlet metering recommended (enables per-server power trending)
PDUs per rack | 2 (one per feed)

Zone | Racks | PDUs/rack | Total PDUs
Compute | 32 | 2 | 64
Network (occupied) | 2 | 2 | 4
Network (reserved N3–N6) | 4 | 2 | 8 (pre-wire)
Storage | 2 | 2 | 4
Total | 40 | — | 80
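
The capacity-per-PDU row is standard three-phase arithmetic: apparent power is √3 × line-to-line voltage × current, derated by the power factor for real power. A minimal check:

```python
# Three-phase PDU capacity math behind the section 2.2 table.
import math

def pdu_capacity(amps: float = 32.0, v_ll: float = 400.0, pf: float = 0.8):
    """Return (kVA, kW) for a 3-phase feed at the given power factor."""
    kva = math.sqrt(3) * v_ll * amps / 1000.0
    return kva, kva * pf

kva, kw = pdu_capacity()
print(f"{kva:.1f} kVA apparent, {kw:.1f} kW real at 0.8 PF")
# -> 22.2 kVA apparent, 17.7 kW real at 0.8 PF
```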

2.3 Server PSU Redundancy (5+5 Array)

Each ASUS XA NB3I-E12 carries a 5+5 PSU array — Bank A feeds GPU 0–3 and CPUs; Bank B feeds GPU 4–7 and CPUs:

PSU Bank | Fed from | Serves | Redundancy
Bank A (5× PSUs) | PDU-A (busbar A) | GPU 0–3 + CPUs | N+5 within bank
Bank B (5× PSUs) | PDU-B (busbar B) | GPU 4–7 + CPUs | N+5 within bank
ℹ️ On complete PDU-A or PDU-B loss, half the server PSUs go dark — but the surviving bank keeps its GPUs operational via NVLink ring topology, allowing a reduced-scale training run or graceful checkpoint until power is restored.

2.4 UPS and Power Protection

A 10–30 second gap between mains failure and diesel-generator takeover is typical. For 256 NVLink-connected B300 GPUs mid-training, an uncontrolled power loss means restarting from the last checkpoint and losing all progress since. A layered protection approach is recommended:

Level | Scope | Purpose | Capacity target
Facility UPS (provided) | Entire building | Covers the generator transfer gap | Facility-managed
Row ATS (required) | Per 8–16-rack row | Instant A↔B feed switchover in <4 ms — no power gap | Stateless transfer
Row UPS (recommended) | Per 8–16-rack row | 60–120 s bridging for a graceful checkpoint while the generator starts | ~150–200 kW × 2 min ≈ 5 kWh/row

⚠️ Row UPS is optional if the facility guarantees generator transfer in <8 seconds; confirm the SLA with the datacenter operator before deciding. If generator transfer takes 15–30 s, row UPS is required to protect training checkpoints. The decision rule and energy sizing are sketched below.
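
A minimal sketch of that decision rule and of the UPS energy sizing; the 8 s threshold is the SLA figure from the note above, and the function names are illustrative:

```python
# Row-UPS decision and sizing per section 2.4 (names are illustrative).
def row_ups_required(generator_transfer_s: float) -> bool:
    """Optional below a guaranteed 8 s transfer; required at 15-30 s."""
    return generator_transfer_s >= 8.0

def row_ups_energy_kwh(row_load_kw: float, bridge_s: float) -> float:
    """Energy needed to bridge one row for `bridge_s` seconds."""
    return row_load_kw * bridge_s / 3600.0

print(row_ups_required(5.0))             # False: fast generator SLA
print(row_ups_required(20.0))            # True: 15-30 s transfer window
print(row_ups_energy_kwh(150.0, 120.0))  # 5.0 kWh/row, matching the table
```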

Graceful Shutdown Sequence (on power event)

  1. ATS detects mains failure → switches to B-feed (or facility UPS) in <4 ms
  2. BMC IPMI broadcasts power event to all 32 servers via OOB network
  3. DCIM / cluster manager triggers controlled checkpoint-to-NVMe on all nodes (~30 s)
  4. NVSwitch and NVLink drain cleanly before power removal
  5. Row UPS provides the 60–90 s window for this sequence
  6. Generator reaches stable voltage → ATS switches back to mains feed
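
For illustration, the sequence can be read as a small coordinator loop. Everything below is a sketch: the hooks are hypothetical placeholders for the site's IPMI/DCIM tooling, which this document does not prescribe. The point it demonstrates is that the parallel checkpoint (step 3) must complete inside the row-UPS window (step 5):

```python
# Hypothetical coordinator for the power-event sequence above.
import concurrent.futures
import time

NODES = [f"c{i:02d}" for i in range(1, 33)]  # 32 compute servers, C01-C32
UPS_WINDOW_S = 60.0          # lower bound of the row-UPS bridging window
CHECKPOINT_BUDGET_S = 30.0   # controlled checkpoint-to-NVMe target

def checkpoint_node(node: str) -> bool:
    """Placeholder: ask the cluster manager to checkpoint one node."""
    time.sleep(0.01)         # stand-in for the real RPC
    return True

def on_mains_failure() -> None:
    # Steps 1-2 (ATS switchover, BMC broadcast) are assumed already done.
    t0 = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        ok = sum(pool.map(checkpoint_node, NODES))
    elapsed = time.monotonic() - t0
    # Steps 3-5: checkpoint plus NVLink drain must fit inside the UPS window.
    assert elapsed < CHECKPOINT_BUDGET_S <= UPS_WINDOW_S
    print(f"checkpointed {ok}/{len(NODES)} nodes in {elapsed:.1f} s")

on_mains_failure()
```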

3. Storage Infrastructure

3.1 Local NVMe (Per Server — Already Installed)

Item | Value
NVMe drives per server | 10× Samsung PM9D3a U.2 (8 data + 2 OS)
Raw capacity per server | ~32 TB (assuming 3.2 TB per drive)
Usable per server (RAID 6 equiv.) | ~22–25 TB
Total raw across 32 servers | ~1 PB
Total usable across 32 servers | ~700–800 TB

✅ Good for (local NVMe)

  • OS and container images
  • Checkpoint writes (fast local flush during training)
  • Ephemeral scratch / temp data
  • Single-node inference scratch pad

❌ Not sufficient for

  • Shared dataset access across all 32 nodes simultaneously
  • Pre-processed token corpus (LLM datasets routinely 10–50 TB+)
  • Checkpoint aggregation from all 32 nodes to a single location
  • Model weights staging at multi-node scale

3.2 Shared Storage Cluster (IB-Attached)

A shared parallel storage cluster connected to the existing IB fabric is required for multi-node distributed training workloads.

Option | Technology | Throughput | Notes
NVMe-oF over IB (recommended) | RDMA + NVMe-oF | 400+ GB/s aggregate | Lowest latency, native IB RDMA integration
Lustre over IB | Lustre + LNET | 200–400 GB/s | HPC/AI standard, mature tooling
BeeGFS over IB | BeeGFS RDMA | 100–300 GB/s | Simpler management than Lustre
VAST Data | NFS/S3 over RDMA | 100–500 GB/s | All-flash appliance, scale-out

3.3 Phase 1 Storage Configuration (2 Nodes)

Component | Specification | Qty
Storage node chassis | 2U NVMe-dense (e.g. Supermicro SSG-610P-ACR12H or equiv.) | 2
NVMe drives per node | 24× 15.36 TB U.2 enterprise read-intensive | 48 total
Raw capacity per node | ~369 TB | —
Total raw capacity | ~738 TB | —
Usable (erasure coding 8+2) | ~590 TB | —
IB connectivity per node | 2× NDR800 ports → existing leaf switches | 4 IB cables total
Lustre MDS/MGS node | 1U, 2× NDR IB ports (metadata server) | 1
Storage protocol | NVMe-oF / Lustre OSS | —

Metric | Per Node | 2-Node Cluster
Sequential read throughput | ~100 GB/s | ~200 GB/s
Sequential write throughput | ~60 GB/s | ~120 GB/s
IB bandwidth consumed (peak read) | 800 Gb/s = 100 GB/s | ~200 GB/s → 2 leaf ports
200 GB/s aggregate read is sufficient for most LLM pre-training workloads up to 70B parameters at 32-node scale. For 175B+ models, scale to 4 storage nodes in Phase 2.
ℹ️ Storage nodes connect via 2× NDR IB ports each into the existing leaf layer — no additional switches required. Leaf switches have 44% port headroom at Phase 1, comfortably absorbing 4 storage IB ports.
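
The capacity and throughput rows follow from the drive counts and the 8+2 erasure-coding overhead (8 of every 10 raw bytes remain usable). A minimal check:

```python
# Capacity and IB-throughput math behind the section 3.3 tables.
DRIVES_PER_NODE, DRIVE_TB, NODES = 24, 15.36, 2

raw_tb = DRIVES_PER_NODE * DRIVE_TB * NODES  # 737.28 TB, quoted as ~738 TB
usable_tb = raw_tb * 8 / 10                  # 8+2 erasure coding -> ~590 TB

READ_GBPS_PER_NODE = 100                     # GB/s sequential, per node
PORTS_PER_NODE, PORT_GBIT = 2, 800           # 2x NDR800 ports per node
port_gbps = PORT_GBIT / 8                    # 800 Gb/s -> 100 GB/s per port

# Peak read per node fits within its two leaf-attached IB ports.
assert READ_GBPS_PER_NODE <= PORTS_PER_NODE * port_gbps
print(f"~{raw_tb:.0f} TB raw, ~{usable_tb:.0f} TB usable, "
      f"~{READ_GBPS_PER_NODE * NODES} GB/s aggregate read")
```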

4. Full Infrastructure Rack Plan

Zone | Rack IDs | Count | Status | Notes
Compute | C01–C32 | 32 | Occupied | 32× ASUS XA NB3I-E12 B300 servers
Network | N1–N2 | 2 | Occupied | IB switches + Ethernet + UFM + OOB
Network (reserved) | N3–N6 | 4 | Reserved | Phase 2 expansion capacity
Storage | S01 | 1 | Occupied (Phase 1) | 2× storage nodes + MDS + management
Storage (expansion) | S02 | 1 | Reserved | Phase 2 storage scale-out
Total Phase 1 | — | 35 | — | Occupied positions
Total incl. reserves | — | 40 | — | All allocated positions

Physical Zone Adjacency

  • Compute Zone (C01–C32) · 32 racks · ~464 kW sustained · cold-aisle / hot-aisle containment · CRAC #1–#4
  • Network Zone (N1–N6) · 6 racks · N1–N2 occupied, N3–N6 reserved · IB fabric, Ethernet, UFM, OOB
  • Storage Zone (S01–S02) · 2 racks · S01 occupied, S02 reserved · NVMe-oF / Lustre, IB-attached

IB AOC interconnects run compute ↔ network ↔ storage; keep the three zones adjacent to minimise cable runs.

5. Supporting Infrastructure BOM

Cooling

Item | Spec | Qty | Notes
CRAC / CRAH unit | 200–250 kWth chilled-water | 4 | 3 active + 1 standby (N+1)
Hot-aisle containment panels | Floor-to-ceiling, per row | Per layout | Facility-integrated
Inlet temperature sensors | Rack-mount | 40 | One per rack, feeds BMS
BMS integration layer | DCIM or facility BMS tie-in | 1 | Aggregates temperature / power / humidity

Power

Item | Spec | Qty | Notes
Rack PDU (metered, dual-feed) | 0U vertical, 3-phase 32 A, C19+C13 | 80 | 2 per rack × 40 racks
Row ATS | 3-phase auto-transfer, <4 ms | 4 | 1 per row of 8–16 racks
Row UPS module | 150–200 kW, 120 s runtime | 4 | Optional — confirm generator SLA first
Main distribution panel | A+B independent feeds | Facility | Independent circuits for each zone

Storage

Item | Spec | Qty | Notes
Storage node chassis | 2U NVMe-dense, 2× NDR IB | 2 | Phase 1
NVMe drives | 15.36 TB U.2 enterprise read-intensive | 48 | 24 per storage node
Lustre MDS/MGS node | 1U, 2× NDR IB ports | 1 | Metadata server
IB NDR AOC cables | 5–10 m NDR800 | 4 | Storage nodes → existing leaf switches
Storage rack PDUs | Same spec as compute racks | 4 | 2 per rack × 2 racks (S01, S02)

6. Integration with Existing Network Design

Storage — IB & OOB Connections

Connection | From | To | Cable | BW
Storage IB data | S01 nodes (4× NDR ports) | Leaf switches L0–L3 | NDR800 AOC, 5–10 m | 4× 800 Gb/s = 3.2 Tb/s
Storage OS management | Storage node management NICs | OOB 96-port switch (N2) | Cat6A | 3× 1 GbE
Storage BMC | Storage node BMC ports | OOB 96-port switch (N2) | Cat6A | 3× 1 GbE

Updated OOB Switch Port Count

Connection type | Count
OS management — 32 compute servers | 32
BMC / iDRAC — 32 compute servers | 32
Q3400-RA switch management (12 units) | 12
Spectrum-4 switch management (2 units) | 2
UFM management node | 1
Storage node OS management (3 nodes) | 3
Storage node BMC (3 nodes) | 3
Uplink(s) | 1+
Total used | 86
96-port OOB switch capacity | 96
Remaining spare | 10
Adding storage nodes consumes 6 more OOB switch ports (86 total vs 80 without storage). The 96-port OOB switch still has 10 spare ports — no switch upgrade required.
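
A quick port-budget check for the table above, using the document's own counts:

```python
# OOB switch port budget from section 6 (uplinks counted as one port).
OOB_PORTS = {
    "compute OS mgmt": 32, "compute BMC": 32,
    "Q3400-RA mgmt": 12, "Spectrum-4 mgmt": 2, "UFM node": 1,
    "storage OS mgmt": 3, "storage BMC": 3, "uplink": 1,
}
used = sum(OOB_PORTS.values())
print(f"{used} used, {96 - used} spare on the 96-port switch")
# -> 86 used, 10 spare on the 96-port switch
```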

7. Next Steps

Priority | Action
High | Confirm the facility provides N+1 CRAC capacity (650+ kWth) and validate the inlet-temperature SLA with the datacenter operator
High | Confirm the MDP has independent A+B feeds for all three zones (compute, network, storage)
High | Specify and procure storage nodes + NVMe drives; order 4× NDR IB AOC cables for leaf integration
Medium | Confirm the datacenter generator transfer time — it determines whether row UPS modules are required
Medium | Install and configure the Lustre / NVMe-oF stack during server commissioning
Low | Evaluate DLC / RDHx readiness for Phase 2 (64-node expansion, ~750 kW cooling load)