XDP BNG · CGNAT — Hardware Sizing · NIC Selection · Power & Cost-per-Subscriber
Deployment Guide · Hardware Sizing & TCO

Pick the Right Server & NIC: Throughput, Subscribers, Power and Cost per Subscriber

Because the BNGSOFT data plane runs in XDP on commodity servers, your capacity is set by three things: the NIC, the PCIe bus, and CPU/RAM for session state — not by a proprietary chassis. This guide compares Intel X710 / XL710 / E810 and NVIDIA ConnectX-6 Dx across 1U and 2U builds, and gives a simple model to size a node to a target subscriber count and traffic.
A BNG node has two ceilings: how much traffic it can move (NIC + PCIe bus) and how many subscribers it can track (CPU + RAM for the session/NAT tables). The right build balances both — and the densest balanced node wins on watts and dollars per subscriber.
10→200G per server: X710 (20G) · XL710 (40G) · E810 / CX-6 Dx (100–200G)
1U & 2U commodity x86: edge node to high-density multi-100G aggregation
~100+ subs / watt: best at high density; consolidation wins TCO
PCIe, the real ceiling: Gen3 ×8 ≈ 50G · Gen4 ×16 ≈ 200G

1 · The NICs — X710, XL710, E810 and ConnectX-6 Dx

For an XDP BNG the NIC choice drives line rate, latency, queue count and how much you can offload. All four are well-supported by mature Linux drivers (Intel i40e/ice, NVIDIA mlx5) and run XDP natively.

| Adapter | Speed | PCIe | Typ. power* | Driver | Street price* | Best for |
|---|---|---|---|---|---|---|
| Intel X710-DA2 | 2 × 10G | Gen3 ×8 | ~3.3–4.5 W | i40e | ~$200–300 | Edge / small PoP nodes |
| Intel XL710-QDA2 | 2 × 40G (bus-capped) | Gen3 ×8 | ~7–8 W | i40e | ~$300–450 | 20–40G nodes; mind the bus |
| Intel E810-CQDA2 | 2 × 100G | Gen4 ×16 | ~15–21 W | ice | ~$700–900 | Mainstream 100G, many queues / ADQ |
| NVIDIA ConnectX-6 Dx | 2 × 100G / 200G | Gen4 ×16 | ~16–22 W | mlx5 | ~$965–1,400 | Lowest latency + HW conntrack offload (ASAP²) |
Watch the PCIe bus, not just the port speed. An XL710 2×40G sits on PCIe Gen3 ×8 (~63 Gbps raw, ~50 Gbps usable), so it cannot carry both 40G ports at line rate (80G); the bus caps it near ~50G aggregate. 100G-class cards need PCIe Gen4 ×16 to deliver their ports. This is the most common real-world bottleneck, and we have measured it on production nodes.
[Figure: PCIe bus capacity vs NIC line rate (usable, aggregate). PCIe Gen3 ×8: ~50 Gbps usable, so the XL710 2×40G (80G of ports) is bus-capped here. PCIe Gen4 ×16: ~200 Gbps usable, so E810 / CX-6 Dx 2×100G fit. Port totals shown: 20G (X710), 80G (XL710 ports), 200G (dual-100G).]
Port speed only matters if the bus can carry it. Size the slot (PCIe generation × lanes) to the NIC — or the bus becomes your throughput ceiling.
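The bus arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a measurement tool: the function names are hypothetical, and the 0.8 usable fraction is an assumption picked to reproduce the guide's ~50G and ~200G usable figures from the raw Gen3 ×8 and Gen4 ×16 rates.

```python
# Sketch: forwarding ceiling = min(sum of NIC ports, usable PCIe bandwidth).
# USABLE_FRACTION is an assumption chosen to match this guide's ~50G / ~200G
# usable figures; replace it with a value measured on your own hardware.

GT_PER_LANE = {3: 8.0, 4: 16.0}  # transfer rate per lane in GT/s (Gen3, Gen4)
ENCODING = 128 / 130             # 128b/130b line encoding used by Gen3 and Gen4
USABLE_FRACTION = 0.8            # assumed allowance for protocol overhead


def pcie_raw_gbps(gen: int, lanes: int) -> float:
    """Raw per-direction bandwidth of a PCIe slot in Gbps."""
    return GT_PER_LANE[gen] * lanes * ENCODING


def pcie_usable_gbps(gen: int, lanes: int) -> float:
    """Raw bandwidth scaled by the assumed usable fraction."""
    return pcie_raw_gbps(gen, lanes) * USABLE_FRACTION


def forwarding_ceiling_gbps(port_gbps: float, ports: int, gen: int, lanes: int) -> float:
    """Traffic ceiling: the smaller of total port speed and usable bus bandwidth."""
    return min(port_gbps * ports, pcie_usable_gbps(gen, lanes))


if __name__ == "__main__":
    cards = [
        ("X710-DA2", 10, 2, 3, 8),
        ("XL710-QDA2", 40, 2, 3, 8),
        ("2x100G on Gen4 x16", 100, 2, 4, 16),
    ]
    for name, speed, ports, gen, lanes in cards:
        ceiling = forwarding_ceiling_gbps(speed, ports, gen, lanes)
        bound = "NIC-bound" if ceiling == speed * ports else "bus-bound"
        print(f"{name}: ~{ceiling:.0f} Gbps ({bound})")
```

Keeping the usable fraction as a single named constant makes it easy to swap in a measured value once a node has been tested under real traffic.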

2 · 1U vs 2U — what the form factor buys you

1U

Edge & mainstream nodes

PCIe slots (usable): 1–2 low-profile
Sockets: 1 (sometimes 2)
Typical NIC: X710 / XL710 / one E810
Power draw: ~250–450 W
Cooling / noise: tight, high-RPM
Best for distributed PoP/edge: one NIC, up to ~100G, lowest rack cost per node.
2U

High-density aggregation

PCIe slots (usable): 2–4 full-height ×16
Sockets: 1–2 (more cores/RAM)
Typical NIC: 2× E810 or ConnectX-6 Dx
Power draw: ~450–800 W (dual PSU)
Cooling / noise: better thermals, redundancy
Best for consolidation: 200G, more session memory, dual-PSU resilience, best subs/watt.
Rule of thumb. 1U = the cheapest way to put a 100G XDP BNG at the edge. 2U = the cheapest way to serve a lot of subscribers per rack-unit and per watt, with PSU redundancy and room for dual-100G or a SmartNIC.

3 · The sizing model — a node has two ceilings

Size every node against both limits and take the smaller:

| Ceiling | Set by | How to estimate |
|---|---|---|
| ① Traffic it can move | NIC line rate ∧ PCIe bus | Gbps = min(NIC ports, PCIe usable). E.g. XL710→~50G, E810→~100–200G. |
| ② Subscribers it can track | CGNAT/session table (CPU + RAM) and a per-node map ceiling | The session/CGNAT table is sized to hold the throughput-driven count with headroom; a hard per-node map ceiling of ~131,000 subs only bites above ~400G. |
Effective subscribers = min( ① NIC usable Gbps × 1,000 ÷ busy-hour Mbps , 131,000 map ceiling ). At an assumed ~3 Mbps busy-hour average per active subscriber, a 100G node carries ~33k subs and a 2×100G node ~64k (raw 200G ÷ 3 Mbps is ~66.7k; the headline figure is rounded down for encapsulation overhead and headroom). The NIC + PCIe bus is therefore the practical ceiling, and the CGNAT/session table is sized to match it. The data plane forwards in XDP at ~one core per ~55 Gbps, so CPU is not the wall on a right-sized server. Replace the 3 Mbps assumption with your own busy-hour data.
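As a rough sketch of the formula above, the following Python function takes the usable Gbps and a busy-hour rate and returns the smaller of the traffic-driven count and the map ceiling. The function and constant names are illustrative only; the 3 Mbps default and the 131,000 ceiling are the guide's own planning figures. The function returns the raw quotient, so 2×100G comes out near 66.7k rather than the rounded-down ~64k headline.

```python
# Sketch of the two-ceiling subscriber model described in this guide.
# Names are hypothetical; the constants are the guide's planning assumptions.

MAP_CEILING_SUBS = 131_000  # hard per-node map ceiling quoted in this guide


def effective_subscribers(usable_gbps: float, busy_hour_mbps: float = 3.0) -> int:
    """Return min(traffic-driven subscriber count, per-node map ceiling)."""
    if busy_hour_mbps <= 0:
        raise ValueError("busy_hour_mbps must be positive")
    # Convert Gbps to Mbps before dividing by the per-subscriber rate.
    traffic_bound = int(usable_gbps * 1000 / busy_hour_mbps)
    return min(traffic_bound, MAP_CEILING_SUBS)


if __name__ == "__main__":
    for gbps in (20, 100, 200, 400):
        print(f"{gbps}G usable -> {effective_subscribers(gbps):,} subscribers (raw)")
```

Passing a measured busy-hour average, for example `effective_subscribers(100, busy_hour_mbps=4.5)`, shows how quickly the subscriber count falls as per-subscriber traffic grows.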

Size your node — NIC card × CPU → subscribers

Capacity = min( NIC usable Gbps ÷ busy-hour Mbps , CPU data-plane ceiling , ~131k per-node map limit ).
Figures are raw NIC ÷ busy-hour rate; published per-node headline numbers round down slightly for encapsulation overhead and N+1 headroom. The data plane forwards in XDP at ~1 core per ~55 Gbps; the per-subscriber control thread needs ~1 core per ~45k subs.
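The three-way minimum and the two per-core rules of thumb can be combined as below. This is a sketch under stated assumptions: the names are hypothetical, and treating the CPU data-plane ceiling as "data-plane cores × 55 Gbps" is one interpretation of the figures in this guide, not a documented BNGSOFT formula.

```python
import math

# Planning constants taken from this guide; the way they are combined here
# is an assumption made for illustration.
GBPS_PER_DATAPLANE_CORE = 55.0   # ~1 core per ~55 Gbps of XDP forwarding
SUBS_PER_CONTROL_CORE = 45_000   # ~1 core per ~45k subs for the control thread
MAP_CEILING_SUBS = 131_000       # per-node map limit


def cores_needed(gbps: float, subs: int) -> tuple[int, int]:
    """Return (data-plane cores, control cores) for a target load."""
    dataplane = math.ceil(gbps / GBPS_PER_DATAPLANE_CORE)
    control = math.ceil(subs / SUBS_PER_CONTROL_CORE)
    return dataplane, control


def capacity(usable_gbps: float, dataplane_cores: int, busy_hour_mbps: float = 3.0) -> int:
    """min(NIC-bound subs, CPU-bound subs, map ceiling), using raw quotients."""
    nic_bound = usable_gbps * 1000 / busy_hour_mbps
    cpu_bound = dataplane_cores * GBPS_PER_DATAPLANE_CORE * 1000 / busy_hour_mbps
    return min(int(nic_bound), int(cpu_bound), MAP_CEILING_SUBS)


if __name__ == "__main__":
    print(cores_needed(100, 33_000))           # cores for the 1U 100G example
    print(capacity(200, dataplane_cores=4))    # NIC-bound
    print(capacity(200, dataplane_cores=2))    # CPU-bound: too few forwarding cores
```

The second `capacity` call illustrates when CPU does become the wall: a 200G NIC paired with only two forwarding cores is limited to what those cores can move.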

4 · Sample configurations (indicative — validate per deployment)

EDGE · 1U

Small PoP / edge BNG

CPU: 8-core Xeon-E / EPYC
RAM: 32 GB
NIC: Intel X710-DA2 (2×10G)
Moves: ~20 Gbps
Tracks: up to ~6.7k subs
Power: ~250 W
~$2.5k · ~$0.37 / sub · ~27 subs/W
STANDARD · 1U

Mainstream 100G node

CPU: 16-core Xeon-SP / EPYC
RAM: 64 GB
NIC: Intel E810-CQDA2 (2×100G)
Moves: ~100 Gbps
Tracks: up to ~33k subs
Power: ~380 W
~$5k · ~$0.15 / sub · ~87 subs/W
HIGH-DENSITY · 2U

Aggregation / consolidation

CPU: 2× 16–24c (dual socket)
RAM: 128 GB
NIC: 2× E810 (up to 200G)
Moves: ~200 Gbps
Tracks: up to ~64k subs
Power: ~650 W (dual PSU)
~$9k · ~$0.14 / sub · ~98 subs/W
SMARTNIC · 2U

Lowest latency + HW offload

CPU: 24-core (frees cores)
RAM: 128 GB
NIC: ConnectX-6 Dx + ASAP²
Moves: ~200 Gbps
Tracks: up to ~64k subs
Power: ~600 W
~$10k · ~$0.16 / sub · HW conntrack offload

Power efficiency — subscribers per watt (higher is better)

Subscribers tracked ÷ whole-node draw. Density improves efficiency: the two 2U builds carry the most subs per watt.
Edge 1U · X710: ~27 subs/W (~250 W → 6.7k)
Standard 1U · E810: ~87 subs/W (~380 W → 33k)
High-density 2U · 2×E810: ~98 subs/W (~650 W → 64k)
SmartNIC 2U · CX-6 Dx: ~107 subs/W (~600 W → 64k)
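The subs/W and cost-per-subscriber figures for the four sample builds follow directly from their indicative cost, power and subscriber numbers. The sketch below recomputes them; the `NodeBuild` class is a hypothetical helper, and the inputs are the guide's own indicative values rather than measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeBuild:
    """Indicative figures for one sample configuration from this guide."""
    name: str
    cost_usd: float
    power_w: float
    subs: int

    @property
    def subs_per_watt(self) -> float:
        # Efficiency: subscribers tracked divided by whole-node draw.
        return self.subs / self.power_w

    @property
    def usd_per_sub(self) -> float:
        # Hardware cost spread across the subscribers the node tracks.
        return self.cost_usd / self.subs


BUILDS = [
    NodeBuild("Edge 1U, X710", 2_500, 250, 6_700),
    NodeBuild("Standard 1U, E810", 5_000, 380, 33_000),
    NodeBuild("High-density 2U, 2x E810", 9_000, 650, 64_000),
    NodeBuild("SmartNIC 2U, CX-6 Dx", 10_000, 600, 64_000),
]

if __name__ == "__main__":
    for b in BUILDS:
        print(f"{b.name}: ~{b.subs_per_watt:.0f} subs/W, ~${b.usd_per_sub:.2f}/sub")
```

Adding your own quoted server price and measured wall power as another `NodeBuild` entry gives a like-for-like comparison against these four reference points.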

Traffic the node can move (forwarding ceiling, Gbps)

min(NIC ports, PCIe bus). Note the XL710 bus cap vs the 100G-class cards.
X710-DA2 (2×10G): ~20G (NIC-bound)
XL710 (2×40G): ~50G (bus-bound)
E810-CQDA2 (2×100G): ~100G (1U / single)
2× E810 / CX-6 Dx: ~200G (2U / dual)

5 · Latency & the SmartNIC option

All four NICs run the XDP data plane at low microsecond latency. Two levers matter for tail latency under load: queue count / steering (E810 ADQ and ConnectX flow steering isolate latency-sensitive traffic) and hardware offload. The ConnectX-6 Dx can offload stateful connection tracking in hardware via ASAP² (up to ~8M rules) — moving CGNAT/firewall flow state into the NIC, cutting CPU and trimming latency on the hottest flows.

i40e · X710/XL710

Proven & cheap

Rock-solid 10–40G, lowest card cost, mature driver. Great for edge nodes where 100G is overkill.
ice · E810

Mainstream 100G

Many queues, ADQ traffic isolation, PCIe Gen4 — the default choice for new 100G BNG nodes.
mlx5 · ConnectX-6 Dx

Latency + offload

Lowest latency, best AF_XDP, and HW conntrack/NAT offload for the densest, leanest-CPU builds.

6 · 3-Year TCO — a 50,000-subscriber example

Hardware is only the entry fee. Over a typical 3-year refresh, power + cooling, rack space, optics and the operational surface of every extra box dominate. Below is a directional total-cost comparison for 50,000 FTTH subscribers with CGNAT: a consolidated BNGSOFT build versus the common router-appliance design (≈16 NAS routers + a separate CGNAT tier).

| 3-year cost line (50k subs) | BNGSOFT (consolidated) | Appliance fleet (router + CGNAT tier) |
|---|---|---|
| Build | 3 × 1U E810 node (2 active @ ~33k + N+1) | 16 × CCR2216 NAS + ~3-node CGNAT cluster |
| Hardware (CapEx) | ~$15,000 | ~$52,000 |
| Software license | + BNGSOFT license (per quote) | RouterOS included |
| Power + cooling (3 yr)* | ~$5,400 (~1.1 kW) | ~$13,000 (~2.7 kW) |
| Rack space | 3U | ~18U |
| Boxes to operate & spare | 3 | ~19 |
| HW + power subtotal (ex-license) | ~$20,000 (~$0.41 / sub) | ~$65,000 (~$1.30 / sub) |

3-year hardware + power for 50k subscribers (ex-license)

Lower is better. The consolidated build frees ~$45k of headroom — typically more than covers the software license while still winning on TCO, rack and ops.
Appliance fleet: ~$65k · ~18U · 19 boxes · ~$1.30/sub
BNGSOFT consolidated: ~$20k + license · 3U · 3 boxes · ~$0.41/sub
The license is the variable — and consolidation pays for it. Even before software, the consolidated build runs ~$45k less in hardware + power over 3 years for 50k subscribers, in 3U instead of ~18U, with 3 boxes to manage and spare instead of ~19 — and no second hardware tier just for CGNAT. Plug your BNGSOFT quote into the headroom and compare full TCO.
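The power and subtotal lines in the comparison can be reproduced with a short calculation. This is a sketch using the example's stated assumptions (~$0.12/kWh, ~1.5× cooling/PUE multiplier, 36 months); the function names are hypothetical, and the 1.14 kW input is 3 × ~380 W from the standard 1U build, which the table rounds to ~1.1 kW.

```python
import math

HOURS_3Y = 3 * 365 * 24  # 26,280 hours in a 36-month window


def power_cost_usd(kw: float, usd_per_kwh: float = 0.12, pue: float = 1.5) -> float:
    """3-year electricity + cooling cost for a continuous load in kW."""
    return kw * HOURS_3Y * usd_per_kwh * pue


def nodes_with_spare(subs: int, subs_per_node: int, spares: int = 1) -> int:
    """Active nodes needed for the subscriber count, plus N+1 spare(s)."""
    return math.ceil(subs / subs_per_node) + spares


def subtotal_per_sub(capex_usd: float, kw: float, subs: int = 50_000) -> float:
    """Hardware + power subtotal (ex-license) per subscriber over 3 years."""
    return (capex_usd + power_cost_usd(kw)) / subs


if __name__ == "__main__":
    print(f"Cost per continuous kW: ~${power_cost_usd(1.0):,.0f}")
    print(f"Nodes for 50k subs at ~33k each: {nodes_with_spare(50_000, 33_000)}")
    print(f"Consolidated: ~${subtotal_per_sub(15_000, 1.14):.2f}/sub")
    print(f"Appliance fleet: ~${subtotal_per_sub(52_000, 2.7):.2f}/sub")
```

To compare full TCO, add the quoted license cost to `capex_usd` on the consolidated side and rerun the per-subscriber figure.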

7 · Why this beats a fixed appliance

Hardware strategy · operator value
~$0.14–0.16 per subscriber on 100G+ nodes: Mainstream 100G and dense 2U builds land near ~$0.15/sub HW; edge 10G nodes cost more per sub but suit small PoPs. Densify to lower both $/sub and watts/sub.
Buy the NIC the site needs: 10G edge to 200G aggregation on the same software, with no forklift and no proprietary line cards.
Scale by adding commodity nodes: Linear scale-out; each node is a standard server you already know how to buy, rack and spare.
Future-proof to SmartNIC/DPU: Move to ConnectX-6 Dx ASAP² hardware offload when you want lower CPU and latency, on the same platform.
Fewer watts per subscriber: High-density 2U consolidation reaches ~98 subs/W (SmartNIC ~107), which means less power, cooling and rack per customer served.
One model to size everything: min(traffic ceiling, subscriber ceiling) gives a defensible node spec for any PoP in minutes.

The bottom line

Size a BNGSOFT node against two ceilings — traffic (NIC + PCIe) and the per-node map limit (~131k) — and take the smaller. At ~3 Mbps busy-hour, a 1U E810 node moves ~100G and carries ~33k subs at ~$5k and ~87 subs/W; a 2U dual-100G node reaches ~200G, ~64k subs and ~98 subs/W. Add a ConnectX-6 Dx when you want hardware conntrack offload and the lowest latency.

Right card, right form factor, right node — ~$0.15 per subscriber on hardware you already buy.

Sources & honest framing: This is a sizing and TCO guide, not a benchmark report; all subscriber, throughput, power and cost figures are indicative engineering estimates that depend on CPU, RAM, NIC, traffic mix, per-subscriber busy-hour rate and enabled features, and must be validated per deployment.
NIC specifications: Intel X710-DA2: PCIe Gen3 ×8, typ. ~3.3–4.5 W (Intel ARK, ServeTheHome); Intel XL710: 40GbE, PCIe Gen3 ×8 (Intel datasheet); Intel E810-CQDA2: 2×100G, PCIe Gen4 ×16, ~15.4 W idle to ~20.8 W with optics (Intel ARK, ServeTheHome); NVIDIA ConnectX-6 Dx: 2×100G/200G, PCIe Gen4, ASAP² hardware connection-tracking offload up to ~8M rules, street price ~US$965–1,400 (NVIDIA datasheet, FS.com). NIC street prices and server costs are approximate retail (June 2026) and exclude optics, software licensing and operational costs.
PCIe usable bandwidth: Gen3 ×8 ≈ 63 Gbps raw (~50 Gbps usable), Gen4 ×16 ≈ 252 Gbps raw (~200 Gbps usable). The XL710 2×40G bus-cap and the multi-100G ceiling reflect this and match BNGSOFT production measurements.
Subscriber capacity: Per-node subscriber capacity is throughput-driven, that is, NIC usable line rate ÷ the busy-hour per-subscriber rate (e.g. 2×100G ÷ ~3 Mbps ≈ ~66.7k raw, published as ~64k after rounding down for encapsulation overhead and headroom), capped by a hard ~131,000-subscriber per-node map ceiling. The data plane forwards in XDP at roughly one core per ~55 Gbps, so a right-sized server is NIC/PCIe-bound, not CPU-bound, at residential busy-hour rates. The ~3 Mbps busy-hour per-subscriber figure is a planning assumption and should be replaced with your own measured busy-hour average; BNGSOFT XDP forwarding cost (~2.5% CPU at production load) and the ~131k map ceiling are from BNGSOFT deployment data. Power-efficiency (subs/W) and cost-per-subscriber are derived from the indicative whole-node figures above.
3-year TCO example assumptions: 50,000 subscribers over 36 months; electricity ~$0.12/kWh with a ~1.5× cooling/PUE multiplier (~$4,700 per continuous kW over 3 years); BNGSOFT build = 3 × 1U E810 nodes (2 active @ ~33k + N+1); appliance fleet = 16 × MikroTik CCR2216 NAS (max power 80–121 W — MikroTik docs; ~16 × CCR2216 @ 5k users for 50k per the carrier-BNG design referenced in our MikroTik comparison) plus a ~3-node CGNAT cluster; hardware and power are directional and exclude the BNGSOFT software license (shown as a separate line — request a quote) and any optics, cabling, support and rack/colo fees, which apply to both sides. MikroTik®, Intel®, NVIDIA®/Mellanox® and respective product names are trademarks of their owners; BNGSOFT is not affiliated with them. Prepared as a hardware-planning overview for operators.