BNGSOFT Full CGNAT  ·  BNG + Carrier-Grade NAT in One XDP Data Plane

One node. The whole edge.

BNG, carrier-grade NAT, firewall, and QoS run together in a single XDP data plane on one commodity server. The same engine that terminates subscribers also translates their traffic at line rate, before the kernel stack sees a packet. Four appliances collapse into one box.

4-in-1 BNG + CGNAT + Firewall + QoS, one data plane
~3% CPU on a 48-core node at carrier load
~900× Fewer log records vs. per-flow NAT
252k+ Concurrent NAT sessions per node, measured
The Edge Problem

The subscriber edge is a stack of expensive single-purpose boxes.

To deliver IPv4 broadband today, most operators chain together four separate systems: a BNG to terminate subscribers (PPPoE/IPoE, RADIUS, addressing), a CGNAT appliance to share scarce public IPv4, a firewall for edge security, and a QoS / shaping tier to enforce plans. Each is a separate purchase, a separate licence, a separate failure domain — and every subscriber packet is copied across all of them.

That architecture is expensive in every dimension. Dedicated CGNAT appliances cost six figures and lock you to proprietary silicon. Linux/nftables masquerade is free but runs every packet through the full kernel netfilter stack — conntrack grows without bound, logging explodes at carrier volume, and its symmetric NAT breaks gaming, VoIP, and P2P. Bolting a software BNG, a NAT box, and a firewall together multiplies the per-packet cost and the operational surface.

BNGSOFT collapses the entire stack into one XDP data plane. The same program that terminates the subscriber also translates, firewalls, and shapes the traffic — in a single pass at the NIC driver, before the kernel stack is involved. One box replaces four.

"One server terminates the subscriber, shares the public IPv4, enforces the firewall, and shapes the plan — in a single pass, at line rate."

Why the four-box stack fails
1. Four CapEx lines, four lock-ins

Separate BNG, CGNAT, firewall and shaper: each six-figure or licence-metered, each on its own refresh cycle and support contract.

2. Per-packet cost multiplied

Every subscriber packet is parsed, copied and queued through each appliance in turn. Latency and power add up; nothing is shared.

3. Log record explosion

Per-flow NAT logging at CGNAT volume generates millions of records per hour. Storage and SIEM costs scale with the subscriber base.

4. Four failure domains

Four boxes mean four things that can break, four config surfaces and four upgrade windows, and symmetric NAT on the cheap path still breaks apps.

How It Works

The whole edge at the driver — before the kernel stack sees a single packet.

One XDP program terminates the subscriber, translates the address, applies the firewall, and shapes the plan in a single pass at the NIC. The Linux alternative chains a software BNG, an nftables NAT tier, a firewall and a shaper — every packet copied through each in turn. Below: the same packet's journey through both.

Approach A: Linux nftables NAT

NIC Driver (kernel)

Packet arrives, DMA to kernel memory

Netfilter / iptables hooks

PREROUTING → FORWARD → POSTROUTING chain traversal

nf_conntrack lookup

Per-flow entry allocation, spinlock contention under load

masquerade / SNAT

Symmetric NAT — new external port per flow, no determinism

Kernel routing → egress

Full stack traversal before the packet leaves the host

High per-packet CPU  ·  Symmetric NAT  ·  Per-flow logging

Approach B: BNGSOFT Full CGNAT (XDP)

NIC Driver (Intel 40GbE)

Packet arrives via DMA

XDP CGNAT hook

Runs at driver level — kernel stack is never entered

BPF-managed NAT table

Port-block lookup in BPF map — no per-flow conntrack, no spinlocks

Port-block SNAT (full-cone)

Deterministic block assigned per private IP — endpoint-independent mapping

XDP_TX / XDP_REDIRECT

Packet forwarded at line rate — no kernel stack traversal

~3% CPU  ·  Full-cone NAT  ·  ~900× less logging
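
The port-block path in Approach B can be modelled in a few lines. What follows is a minimal Python sketch of the idea, not the XDP implementation: the class names, the 1024–65535 port range, the allocate-on-first-use policy and the documentation address 203.0.113.1 are all assumptions made for illustration. It shows one block per private IP, a mapping keyed only on the private endpoint, and one log entry per block rather than per flow.

```python
from dataclasses import dataclass, field

PORT_MIN, PORT_MAX = 1024, 65535   # assumption: usable external port range
BLOCK_SIZE = 512                   # block_size from the example configuration
BLOCKS_PER_IP = (PORT_MAX - PORT_MIN + 1) // BLOCK_SIZE


@dataclass
class PortBlock:
    public_ip: str
    first_port: int
    used: int = 0   # ports handed out from this block so far

    def take_port(self) -> int:
        if self.used >= BLOCK_SIZE:
            raise RuntimeError("block exhausted; a real engine would add a block")
        port = self.first_port + self.used
        self.used += 1
        return port


@dataclass
class PortBlockNat:
    public_ips: list
    blocks: dict = field(default_factory=dict)     # private IP -> PortBlock
    mappings: dict = field(default_factory=dict)   # (private IP, port) -> (public IP, port)
    block_log: list = field(default_factory=list)  # one entry per block allocation
    allocated: int = 0

    def _allocate_block(self, private_ip: str) -> PortBlock:
        ip_index, slot = divmod(self.allocated, BLOCKS_PER_IP)
        if ip_index >= len(self.public_ips):
            raise RuntimeError("public pool exhausted")
        block = PortBlock(self.public_ips[ip_index], PORT_MIN + slot * BLOCK_SIZE)
        self.allocated += 1
        self.blocks[private_ip] = block
        self.block_log.append(
            (private_ip, block.public_ip, block.first_port, block.first_port + BLOCK_SIZE - 1)
        )
        return block

    def translate(self, private_ip: str, private_port: int) -> tuple:
        key = (private_ip, private_port)   # the destination is not part of the key
        if key not in self.mappings:
            block = self.blocks.get(private_ip) or self._allocate_block(private_ip)
            self.mappings[key] = (block.public_ip, block.take_port())
        return self.mappings[key]


nat = PortBlockNat(["203.0.113.1"])
first = nat.translate("100.64.0.5", 40000)
again = nat.translate("100.64.0.5", 40000)   # same private endpoint, same mapping
other = nat.translate("100.64.0.6", 40000)   # second subscriber gets the next block
```

In this model two subscribers produce exactly two log entries, however many flows they open inside their blocks.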

bng-edge.conf  ·  one data plane, four roles

BNG (subscriber termination)
  access                  PPPoE / IPoE + RADIUS
  bng                     enabled

CGNAT (carrier-grade NAT)
  private_range           100.64.0.0/10
  public_pool             203.232.123.0/27
  block_size              512
  nat_type                full-cone
  hairpin  ·  alg         enabled

Firewall + QoS (same pass)
  firewall                enabled
  qos / shaping           per-subscriber

Result (one XDP program, auto-populated)
  subscribers_capacity    131,072 per node
  session_table_capacity  8.4 million sessions
  cpu_at_carrier_load     ~3% (48 cores)

One configuration, one process, one XDP data plane. Subscriber termination, NAT, firewall and QoS run in a single pass at the driver — no chained appliances, no per-box config surface, no inter-box copies.
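
To see what the example configuration implies for pool size, the arithmetic can be sketched with Python's standard `ipaddress` module. The usable port range of 1024–65535 and the choice to count every address in the /27 are assumptions; the page does not state either.

```python
import ipaddress

private_range = ipaddress.ip_network("100.64.0.0/10")    # from the example config
public_pool = ipaddress.ip_network("203.232.123.0/27")   # from the example config
block_size = 512                                         # from the example config
port_min, port_max = 1024, 65535                         # assumption, not from the page

# Port blocks that fit on a single public address.
blocks_per_public_ip = (port_max - port_min + 1) // block_size

# Total blocks if every address in the pool is used (assumption).
total_blocks = public_pool.num_addresses * blocks_per_public_ip

print(f"private addresses in range: {private_range.num_addresses:,}")
print(f"public addresses in pool:   {public_pool.num_addresses}")
print(f"blocks per public IP:       {blocks_per_public_ip}")
print(f"blocks in pool:             {total_blocks:,}")
```

Under these assumptions each public address carries 126 blocks of 512 ports, so the example /27 yields 4,032 blocks. A larger pool scales the figure linearly.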

Performance Comparison

Five metrics. One clear winner.

XDP CGNAT figures are from a live production node; the ~23% kernel-path CPU is our own measured pre-XDP baseline. The remaining Linux/nftables bars are qualitative — relative ordering from well-understood architectural characteristics, not a controlled benchmark.

Per-packet CPU overhead
Linux nftables NAT: ~23% (kernel path)
XDP CGNAT: ~3% CPU (48-core, carrier load)

Moving NAT into XDP cut CPU from ~23% to ~2.5% on comparable nodes. Per-packet translation overhead is measured in nanoseconds.

Log records generated
Linux nftables (per-flow): 1 record per flow (illustrative)
XDP CGNAT (port-block): ~900× fewer, 1 log per port block (<1% of the per-flow volume)

314 million sessions produced only ~347,000 port-block allocation log records — ~900× reduction in log volume while preserving lawful-intercept compliance.

Max concurrent sessions (capacity)
Linux nftables (conntrack): kernel conntrack limited (illustrative)
XDP CGNAT: 8.4 million sessions

8.4 million session-table capacity per node; 131,072 private-IP capacity. Scale horizontally for larger deployments.

Forwarding latency added
Linux nftables NAT: full kernel stack traversal (illustrative)
XDP CGNAT: line rate, nanoseconds per packet

XDP processes packets before the kernel network stack — no socket buffers, no iptables traversal, no scheduling overhead.

NAT behaviour — application compatibility
Linux nftables: symmetric NAT; breaks full-cone-dependent apps (gaming, VoIP, P2P)
XDP CGNAT: full-cone (endpoint-independent) NAT; gaming, VoIP, and P2P work correctly

nftables masquerade allocates a new external port per flow with no endpoint-independent mapping guarantee. The XDP CGNAT engine provides deterministic port-block assignment and supports endpoint-independent filtering, enabling full-cone behaviour that carrier subscribers expect.
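
The difference between the two behaviours comes down to what the NAT uses as its mapping key. The toy model below is an illustrative sketch only: `ToyNat` and both key functions are hypothetical names, and the sequential port counter stands in for a real allocator. A symmetric NAT keys on the full flow, while an endpoint-independent NAT keys on the private endpoint alone.

```python
def symmetric_key(src_ip, src_port, dst_ip, dst_port):
    # Symmetric behaviour: every distinct destination creates a new mapping.
    return (src_ip, src_port, dst_ip, dst_port)


def endpoint_independent_key(src_ip, src_port, dst_ip, dst_port):
    # Endpoint-independent mapping: the destination is ignored.
    return (src_ip, src_port)


class ToyNat:
    """Hands out external ports sequentially; the key function decides reuse."""

    def __init__(self, key_fn, first_port=1024):
        self.key_fn = key_fn
        self.table = {}
        self.next_port = first_port

    def external_port(self, src_ip, src_port, dst_ip, dst_port):
        key = self.key_fn(src_ip, src_port, dst_ip, dst_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return self.table[key]


# One game client (private 100.64.0.5:5000) talking to two different servers.
sym = ToyNat(symmetric_key)
sym_a = sym.external_port("100.64.0.5", 5000, "198.51.100.1", 3478)
sym_b = sym.external_port("100.64.0.5", 5000, "198.51.100.2", 3478)

eim = ToyNat(endpoint_independent_key)
eim_a = eim.external_port("100.64.0.5", 5000, "198.51.100.1", 3478)
eim_b = eim.external_port("100.64.0.5", 5000, "198.51.100.2", 3478)
```

With the symmetric key the client appears on two different external ports, so a peer that learned the first port cannot reach it on the second. With the endpoint-independent key both flows share one external port, which is what matchmaking and NAT traversal rely on.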

Capability Comparison

Capability matrix: Linux (BNG + nftables NAT) vs. Full CGNAT (XDP).

The realistic Linux comparison is a software BNG plus an nftables NAT tier plus a separate firewall and shaper. BNGSOFT delivers all of it in one XDP data plane. The table below reflects well-established architectural characteristics of each approach.

Capability | Linux BNG + nftables NAT | Full CGNAT (XDP)
Data path | Kernel netfilter: full stack traversal per packet (PREROUTING, FORWARD, POSTROUTING) | XDP hook at driver level; kernel stack bypassed entirely for translated traffic
Per-packet CPU | High; scales with packet rate; netfilter overhead plus conntrack lock contention at carrier volume | ~3% on a 48-core node at carrier load; per-packet overhead in nanoseconds; XDP cut CPU from ~23% to ~2.5% on comparable nodes
Session scale | Kernel conntrack table; contention and memory pressure at millions of flows | 8.4 million session-table capacity; 131,072 private-IP capacity per node; 252,000+ concurrent sessions demonstrated in production
NAT behaviour | Symmetric: masquerade allocates a new external port per flow; no endpoint-independent mapping; breaks gaming, VoIP, P2P | Full-cone: endpoint-independent mapping (RFC 4787 compliant); deterministic port-block per private IP; carrier-grade app compatibility
Logging | One log record per connection/flow; generates millions of records/hour at carrier scale; high storage and SIEM cost | ~900× fewer records: 314 million sessions → ~347,000 port-block allocation logs; block↔subscriber mapping satisfies lawful-intercept/data-retention requirements
Port allocation | Dynamic per-flow; no determinism; difficult subscriber attribution without per-flow logs | Deterministic port-block allocation (~1.1 blocks per private IP; 31,248-block pool at ~8% utilisation); subscriber attributable from block alone
ALG support | Via kernel nf_nat helpers (FTP, SIP, H.323 etc.) | Built-in ALG: FTP, SIP, and related protocols handled in XDP path
Hairpin NAT | Supported via hairpin masquerade rules | Native hairpin: subscribers reach each other via the public IP without leaving the CGNAT node
Edge functions in the box | Separate systems: software BNG (pppd) + nftables NAT + firewall + tc/HTB shaper, each configured and scaled independently | BNG + CGNAT + firewall + QoS in one XDP data plane; subscriber termination, translation, security and shaping in a single pass
Hardware / appliances | One or more Linux servers plus, typically, a dedicated CGNAT appliance for scale; multiple boxes and licences | One commodity x86 server + Intel 40/100GbE NIC; four appliance roles collapse into a single node; no dedicated CGNAT box
Configuration | nftables ruleset; conntrack helper modules; per-rule logging configuration | Private range + public pool + port range + block size + NAT type; tables auto-populated at startup from the configured private range

The Logging Win

314 million sessions. ~347,000 log records.

Port-block allocation logging is not a compromise on compliance — it is a smarter model. Each port block maps deterministically to one private IP address and a precise allocation timestamp, providing everything lawful-intercept and data-retention regulations require at a fraction of the storage cost.

~900×

314 million sessions logged as only ~347,000 port-block allocation records — roughly 900× fewer log entries than per-connection NAT logging, with no loss of subscriber attribution capability. The port block encodes the private IP, the public IP:port-range, and the allocation timestamp in a single compact record.
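
As an illustration of why a block record is enough for attribution, here is a hedged Python sketch of the lookup an operator might run against such records. The record layout follows the three fields named above; the `released_at` field, the function name and the sample addresses are assumptions added for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BlockRecord:
    private_ip: str
    public_ip: str
    first_port: int
    last_port: int
    allocated_at: datetime
    released_at: Optional[datetime] = None   # hypothetical: None means still active


def attribute(records, public_ip, port, when):
    """Return the private IP that held (public_ip, port) at time `when`, if any."""
    for rec in records:
        in_block = rec.public_ip == public_ip and rec.first_port <= port <= rec.last_port
        started = rec.allocated_at <= when
        not_ended = rec.released_at is None or when < rec.released_at
        if in_block and started and not_ended:
            return rec.private_ip
    return None


# Two illustrative records; the addresses and times are made up.
records = [
    BlockRecord("100.64.0.5", "203.0.113.1", 1024, 1535, datetime(2026, 6, 1, 8, 0)),
    BlockRecord("100.64.0.6", "203.0.113.1", 1536, 2047, datetime(2026, 6, 1, 8, 5)),
]

hit = attribute(records, "203.0.113.1", 1600, datetime(2026, 6, 1, 9, 0))
miss = attribute(records, "203.0.113.1", 1600, datetime(2026, 6, 1, 8, 0))
```

The query needs only the public IP, the port and the time of the connection, which is the information an external request for attribution normally carries.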

314M Sessions handled (lifetime production node)
~347k Port-block allocation log records generated
~108 Sessions per private IP; ~1.1 port-blocks per private IP

Per-flow NAT logging at CGNAT volume can produce tens of millions of records per day per node. Port-block allocation logging generates the same subscriber-attribution information in a table that is roughly 900× smaller — dramatically reducing storage, SIEM ingest costs, and log retention infrastructure.

What You Get

Carrier-grade CGNAT — without the carrier price tag.

The full CGNAT data plane packages BNG, NAT, firewall and QoS into one XDP engine on a single node. Every capability is production-proven on live networks.

IPv4 Conservation

100+ private IPs share a single public IP. With ~108 concurrent sessions and ~1.1 port blocks per private IP, the public address pool is used efficiently.

Full-Cone NAT

Endpoint-independent mapping (RFC 4787). Gaming consoles, VoIP SIP clients, and P2P applications work correctly — no broken matchmaking, no one-way audio.

Deterministic Port Blocks

Each private IP receives a deterministically assigned port block. Subscriber attribution for any logged connection requires only the port block record — no per-flow logging needed.

ALG + Hairpin

Application-layer gateway handles FTP, SIP, H.323, and related protocols. Hairpin NAT lets subscribers reach each other via the shared public IP without leaving the CGNAT node.

Drop-in, Any Vendor

Operates in front of or behind any BNG, router, or edge device. No integration required beyond configuring the private range and public pool — completely vendor-agnostic.

Commodity Hardware

Runs on standard x86 servers with an Intel 40GbE NIC — the same hardware already in your network. No proprietary CGNAT appliance, no vendor lock-in, no six-figure CapEx.
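
The hairpin case described under ALG + Hairpin can be illustrated with a small Python model. This is a sketch under assumptions, not the engine's logic: the packet is a plain tuple, the two dictionaries stand in for the NAT tables, and the function name is hypothetical. A packet whose destination falls inside the public pool is translated on both sides and delivered locally.

```python
import ipaddress

PUBLIC_POOL = ipaddress.ip_network("203.232.123.0/27")   # pool from the example config


def hairpin_translate(packet, outbound, inbound):
    """Translate a subscriber-to-subscriber packet without leaving the node.

    packet   -- (src_ip, src_port, dst_ip, dst_port) as sent by the subscriber
    outbound -- {(private_ip, private_port): (public_ip, public_port)}
    inbound  -- the reverse of `outbound`
    Returns the rewritten packet, or None when this is not a hairpin case.
    """
    src_ip, src_port, dst_ip, dst_port = packet
    if ipaddress.ip_address(dst_ip) not in PUBLIC_POOL:
        return None                          # ordinary outbound traffic
    target = inbound.get((dst_ip, dst_port))
    source = outbound.get((src_ip, src_port))
    if target is None or source is None:
        return None                          # no existing mapping to hairpin through
    # Source becomes the sender's public endpoint, destination the peer's private one.
    return (source[0], source[1], target[0], target[1])


outbound = {
    ("100.64.0.5", 5000): ("203.232.123.1", 1024),
    ("100.64.0.6", 6000): ("203.232.123.1", 1536),
}
inbound = {public: private for private, public in outbound.items()}

looped = hairpin_translate(("100.64.0.5", 5000, "203.232.123.1", 1536), outbound, inbound)
normal = hairpin_translate(("100.64.0.5", 5000, "198.51.100.7", 443), outbound, inbound)
```

The receiving subscriber sees the sender's public endpoint, so replies follow the same path back through the node.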

Production Evidence

Production-proven numbers — same engine, same XDP core.

The figures below are from an anonymised production node running the full BNG + CGNAT + firewall + QoS data plane on live carrier infrastructure, June 2026 — one server doing the work of four appliances.

Production BNG + CGNAT node — live, June 2026
Concurrent Sessions 252k+

Active NAT sessions

Concurrent sessions in production at carrier subscriber load.

Session Table Capacity 8.4M

Maximum session capacity

Total session-table capacity per node — current utilisation is ~3% of maximum.

Private IP Capacity 131k

Subscribers per node

131,072 subscriber / private-IP capacity per node — the same table serves BNG sessions and NAT mappings.

Lifetime Sessions 314M

Sessions handled

314 million NAT sessions processed over the engine's lifetime on this production node.

Packets Translated 1.7B

1.7 billion packets

Packets translated by the XDP engine in production — zero kernel-stack traversal.

CPU Utilisation ~3%

CPU at carrier load

~3% CPU on a 48-core node at carrier load; XDP cut CPU from ~23% to ~2.5% vs. kernel-path NAT on comparable nodes.

Port-block pool: 31,248 blocks at ~8% utilisation

~1.1 port blocks per private IP; ~108 concurrent sessions per private IP. The port-block model keeps the translation table compact while supporting high per-subscriber session counts — deterministic, auditable, and compliant.
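
The production figures can be cross-checked against each other with simple arithmetic. The Python sketch below uses only numbers quoted on this page; the variable names are illustrative, and the rounded inputs mean the results are approximate.

```python
# Figures quoted on this page (rounded as published).
concurrent_sessions = 252_000
session_capacity = 8_400_000
pool_blocks = 31_248
pool_utilisation = 0.08
blocks_per_private_ip = 1.1
sessions_per_private_ip = 108
lifetime_sessions = 314_000_000
block_log_records = 347_000

# Share of the session table in use.
table_utilisation = concurrent_sessions / session_capacity

# Blocks in use, the private IPs they imply, and the sessions those IPs would carry.
blocks_in_use = pool_blocks * pool_utilisation
active_private_ips = blocks_in_use / blocks_per_private_ip
implied_sessions = active_private_ips * sessions_per_private_ip

# Sessions per port-block log record.
log_reduction = lifetime_sessions / block_log_records

print(f"session table in use: {table_utilisation:.1%}")
print(f"blocks in use:        {blocks_in_use:,.0f}")
print(f"active private IPs:   {active_private_ips:,.0f}")
print(f"implied sessions:     {implied_sessions:,.0f}")
print(f"log reduction:        {log_reduction:,.0f}x")
```

The implied session count lands within a few percent of the 252k+ headline, and the log ratio comes out just above 900, so the published figures are mutually consistent at the precision given.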

Hardware Sizing

From a regional POP to a national core — the same engine, sized to the link.

Because the data plane runs in XDP, CPU is almost never the limit — the bus and the NIC are. The three reference builds below show how the same BNGSOFT engine scales from a regional edge to a national core. Subscriber and traffic figures are planning estimates derived from the measured production node (one tier-1 build, ~3% CPU at carrier load); your numbers will vary with traffic mix and plan profile.

Tier 1 · Regional
Regional POP / single edge
One node terminates subscribers, NATs, firewalls and shapes a regional access network.
2× Xeon (≈48 cores)
64 GB RAM
Intel XL710 dual-40G (PCIe 3.0 x8)
  • Subscribers: ~5k–9k
  • Throughput: ~25–40 Gbps
  • NAT sessions: 250k+
  • CPU load: ~10–15%

Tier 2 · Metro
Metro aggregation
A single high-density node consolidating multiple POPs onto 100G uplinks.
2× Xeon (≈48–64 cores)
128–256 GB RAM
Intel E810 dual-100G (PCIe 4.0 x16)
  • Subscribers: ~30k–50k
  • Throughput: ~100–200 Gbps
  • NAT sessions: millions
  • CPU load: ~15–25%

Tier 3 · National
National core cluster
Horizontally-scaled nodes behind ECMP/anycast for a nationwide subscriber base.
N × Tier-2 nodes
ECMP / anycast
100G/400G fabric
active-active
  • Subscribers: 250k–1M+
  • Throughput: multi-Tbps
  • NAT sessions: tens of millions
  • Scaling: linear, add nodes

The bus, not the CPU, sets the per-node ceiling. On a dual-40G XL710 both ports share one PCIe 3.0 x8 link (~56 Gbps of usable bus bandwidth); router-on-a-stick designs that hairpin traffic in and back out halve usable subscriber throughput. Moving to E810 + PCIe 4.0 x16 lifts that ceiling roughly fourfold. In every tier the XDP CPU cost stays in the single-to-low-double-digit percentages — you scale the link, not the core count. All figures are illustrative planning estimates.
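
The bus arithmetic behind that paragraph can be reproduced in a few lines of Python. The per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b line encoding are standard PCIe figures; the ~56 Gbps usable value is taken from the text above rather than derived, and the helper name is illustrative.

```python
def pcie_raw_gbps(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Raw link bandwidth per direction, after line encoding, in Gbps."""
    return gt_per_s * lanes * encoding


gen3_x8 = pcie_raw_gbps(8, 8)      # XL710 slot
gen4_x16 = pcie_raw_gbps(16, 16)   # E810 slot
ceiling_ratio = gen4_x16 / gen3_x8

usable_gen3_x8 = 56                # Gbps, usable figure quoted in the text above
router_on_a_stick = usable_gen3_x8 / 2   # traffic crosses the bus in and back out

print(f"PCIe 3.0 x8 raw:    {gen3_x8:.1f} Gbps")
print(f"PCIe 4.0 x16 raw:   {gen4_x16:.1f} Gbps")
print(f"ceiling ratio:      {ceiling_ratio:.1f}x")
print(f"hairpinned design:  ~{router_on_a_stick:.0f} Gbps of subscriber throughput")
```

The raw PCIe 3.0 x8 figure is about 63 Gbps, which protocol overhead brings down toward the ~56 Gbps quoted, and the step to PCIe 4.0 x16 is exactly four times the raw bandwidth.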

Cost-Effectiveness

One commodity server replaces a rack of single-purpose appliances.

The cost story is structural, not a discount. A purpose-built edge is four product lines — BNG, CGNAT, firewall, shaper — each with its own CapEx, licence and support contract. BNGSOFT delivers all four roles on one commodity server. Cost ranges below are illustrative industry estimates for comparison, not quotes.

Dedicated appliances
CGNAT box + vendor BNG
$100k–$1M+
per site, plus annual support
  • Separate CGNAT appliance, BNG, firewall — multiple CapEx lines
  • Proprietary silicon and vendor lock-in
  • Per-throughput / per-subscriber licensing
  • Lead times in months; forklift upgrades to scale
Linux DIY fleet
pppd + nftables + tc
Low CapEx
high OpEx & scaling risk
  • Free software, but per-packet kernel path doesn't scale to carrier CGNAT
  • conntrack & log explosion drive storage/SIEM cost
  • Symmetric NAT breaks apps → support load & churn
  • More boxes per Gbps as load grows
BNGSOFT Full CGNAT (XDP)
One commodity x86 node
Low five figures
per node, hardware you can own
  • BNG + CGNAT + firewall + QoS in one box — four roles, one CapEx line
  • No dedicated CGNAT appliance, no per-throughput licence
  • ~900× smaller log estate → lower storage & SIEM cost
  • ~3% CPU at carrier load → years of headroom before a refresh
4 → 1: Four appliance roles collapse into one commodity server. You stop buying (and stop powering, cooling, licensing and patching) separate BNG, CGNAT, firewall and shaper boxes. The savings compound across every site in the network, and the saved IPv4 purchases (100+ subscribers per public IP) sit on top.

Business Value

Extend IPv4 life. Eliminate appliance CapEx. Shrink your log estate.

One commodity x86 server running the BNGSOFT XDP data plane delivers subscriber termination, carrier-grade address sharing, full-cone NAT, edge firewall, and QoS together — replacing a rack of proprietary appliances and reducing operational costs across the board.

IPv4 Strategy

Defer IPv4 address purchases

Sharing each public IP among 100+ private IPs extends your existing address pool substantially, deferring costly IPv4 address acquisitions and buying time for an IPv6 transition at a sustainable pace.

CapEx Reduction

Replace four appliances with one

BNG, CGNAT, firewall and shaper collapse into a single commodity x86 node — eliminating six-figure appliance costs and vendor lock-in across the whole edge stack. Use hardware you already own or procure at commodity prices.

OpEx Reduction

~900× less log infrastructure

Port-block logging replaces per-flow logging. 314 million sessions produced only ~347,000 log records — dramatically reducing storage, SIEM ingest volumes, and log retention infrastructure costs.

Performance

Carrier scale at line-rate latency

~3% CPU at carrier load on a 48-core node. XDP processes packets in nanoseconds before the kernel stack — no bottleneck as subscriber count grows, no conntrack spinlock contention.

Subscriber Experience

Full-cone NAT — apps just work

Endpoint-independent NAT eliminates symmetric-NAT app breakage. Gaming, VoIP, and P2P applications work correctly, reducing support tickets and subscriber churn from CGNAT-related issues.

Vendor Freedom

Zero vendor lock-in

The data plane runs on commodity x86 with standard Intel NICs — no proprietary silicon, no per-throughput licence, no renegotiated contracts. Scale by adding nodes, not by forklift upgrades.

Get Started

Ready to collapse your edge into one node?

Talk to BNGSOFT about a full BNG + CGNAT + firewall + QoS deployment. We'll walk through your network topology, subscriber scale, address pool, and link budget — and have you running a proof of concept on your own hardware.

Commodity x86 + Intel 40GbE
Drop-in beside any vendor BNG or router
8.4M session capacity per node
Full-cone NAT — RFC 4787
Lawful-intercept compliant logging
BNGSOFT  ·  XDP Carrier-Grade NAT
Production figures from a live BNG + CGNAT node, June 2026. Hardware sizing, cost ranges, and Linux/nftables qualitative bars are illustrative planning estimates. All trademarks are property of their respective owners.