BNG, carrier-grade NAT, firewall, and QoS, running together in a single XDP data plane on one commodity server. The same engine that terminates subscribers also translates their traffic at line rate, before the kernel stack sees a packet. Four appliances collapse into one box.
To deliver IPv4 broadband today, most operators chain together four separate systems: a BNG to terminate subscribers (PPPoE/IPoE, RADIUS, addressing), a CGNAT appliance to share scarce public IPv4, a firewall for edge security, and a QoS / shaping tier to enforce plans. Each is a separate purchase, a separate licence, a separate failure domain — and every subscriber packet is copied across all of them.
That architecture is expensive in every dimension. Dedicated CGNAT appliances cost six figures and lock you to proprietary silicon. Linux/nftables masquerade is free but runs every packet through the full kernel netfilter stack: the conntrack table gains an entry for every flow and has to be sized for millions of them, per-flow logging explodes at carrier volume, and its symmetric NAT behaviour breaks gaming, VoIP, and P2P. Bolting a software BNG, a NAT box, and a firewall together multiplies the per-packet cost and the operational surface.
BNGSOFT collapses the entire stack into one XDP data plane. The same program that terminates the subscriber also translates, firewalls, and shapes the traffic — in a single pass at the NIC driver, before the kernel stack is involved. One box replaces four.
"One server terminates the subscriber, shares the public IPv4, enforces the firewall, and shapes the plan — in a single pass, at line rate."
Separate BNG, CGNAT, firewall and shaper — each six-figure or licence-metered, each on its own refresh cycle and support contract.
Every subscriber packet is parsed, copied and queued through each appliance in turn. Latency and power add up; nothing is shared.
Per-flow NAT logging at CGNAT volume generates millions of records per hour. Storage and SIEM costs scale with the subscriber base.
Four boxes mean four things that can break, four config surfaces, four upgrade windows — and symmetric NAT on the cheap path still breaks apps.
One XDP program terminates the subscriber, translates the address, applies the firewall, and shapes the plan in a single pass at the NIC. The Linux alternative chains a software BNG, an nftables NAT tier, a firewall and a shaper — every packet copied through each in turn. Below: the same packet's journey through both.
Packet arrives, DMA to kernel memory
PREROUTING → FORWARD → POSTROUTING chain traversal
Per-flow entry allocation, spinlock contention under load
Symmetric NAT — new external port per flow, no determinism
Full stack traversal before the packet leaves the host
Packet arrives via DMA
Runs at driver level — kernel stack is never entered
Port-block lookup in BPF map — no per-flow conntrack, no spinlocks
Deterministic block assigned per private IP — endpoint-independent mapping
Packet forwarded at line rate — no kernel stack traversal
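The XDP path above can be modelled in a few lines. This is a minimal Python sketch of the lookup logic only, not the BNGSOFT implementation: the names `PortBlock`, `BLOCK_MAP` and `handle_outbound` are hypothetical, a plain dictionary stands in for the BPF map, and the addresses are documentation ranges. The returned verdict strings borrow the standard XDP action names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PortBlock:
    """One block of external ports assigned to a single private IP (hypothetical model)."""
    public_ip: str
    first_port: int
    size: int
    # private source port -> external port; filled as mappings are created
    in_use: Dict[int, int] = field(default_factory=dict)

    def external_port(self, private_port: int) -> Optional[int]:
        if private_port not in self.in_use:
            if len(self.in_use) >= self.size:
                return None  # block exhausted; a real engine would allocate another block
            self.in_use[private_port] = self.first_port + len(self.in_use)
        return self.in_use[private_port]


# Stand-in for the BPF map keyed by private IP (assumption: one block per subscriber).
BLOCK_MAP: Dict[str, PortBlock] = {
    "100.64.0.10": PortBlock(public_ip="203.0.113.5", first_port=2048, size=128),
}


def handle_outbound(src_ip: str, src_port: int, dst_ip: str, dst_port: int
                    ) -> Tuple[str, Optional[Tuple[str, int, str, int]]]:
    """Return an XDP-style verdict and the rewritten 4-tuple, if any."""
    block = BLOCK_MAP.get(src_ip)
    if block is None:
        return "XDP_PASS", None          # not a CGNAT subscriber: leave it to the kernel
    ext_port = block.external_port(src_port)
    if ext_port is None:
        return "XDP_DROP", None          # no free port in this sketch
    return "XDP_TX", (block.public_ip, ext_port, dst_ip, dst_port)


first = handle_outbound("100.64.0.10", 40000, "198.51.100.1", 443)
second = handle_outbound("100.64.0.10", 40000, "198.51.100.77", 8443)
unknown = handle_outbound("192.0.2.9", 1234, "198.51.100.1", 443)
```

The sketch never releases ports and never handles return traffic; it only shows that the per-packet decision reduces to one map lookup plus a bounded search inside the subscriber's own block.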
One configuration, one process, one XDP data plane. Subscriber termination, NAT, firewall and QoS run in a single pass at the driver — no chained appliances, no per-box config surface, no inter-box copies.
XDP CGNAT figures are from a live production node; the ~23% kernel-path CPU is our own measured pre-XDP baseline. The remaining Linux/nftables bars are qualitative — relative ordering from well-understood architectural characteristics, not a controlled benchmark.
Moving NAT into XDP cut CPU from ~23% to ~2.5% on comparable nodes. Per-packet translation overhead is measured in nanoseconds.
314 million sessions produced only ~347,000 port-block allocation log records, a ~900× reduction in log volume that preserves lawful-intercept compliance.
Session-table capacity of 8.4 million entries and private-IP capacity of 131,072 per node. Scale horizontally for larger deployments.
XDP processes packets before the kernel network stack — no socket buffers, no iptables traversal, no scheduling overhead.
nftables masquerade allocates a new external port per flow with no endpoint-independent mapping guarantee. The XDP CGNAT engine provides deterministic port-block assignment with endpoint-independent mapping and supports endpoint-independent filtering, enabling the full-cone behaviour that carrier subscribers expect.
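The difference between the two mapping behaviours comes down to what the NAT uses as its lookup key. The toy model below is an illustration in RFC 4787 terms, not either product's code: the `Nat` class and the sample endpoints are hypothetical. With endpoint-independent mapping the key is the internal address and port alone; with symmetric (address- and port-dependent) mapping the destination is part of the key.

```python
from itertools import count
from typing import Dict, Tuple

Endpoint = Tuple[str, int]


class Nat:
    """Toy NAT that hands out external ports sequentially (hypothetical model)."""

    def __init__(self, endpoint_independent: bool, first_port: int = 1024) -> None:
        self.endpoint_independent = endpoint_independent
        self._next_port = count(first_port)
        self._table: Dict[object, int] = {}

    def map(self, src: Endpoint, dst: Endpoint) -> int:
        # Endpoint-independent: key on the internal endpoint only.
        # Symmetric: key on internal endpoint AND destination.
        key = src if self.endpoint_independent else (src, dst)
        if key not in self._table:
            self._table[key] = next(self._next_port)
        return self._table[key]


client: Endpoint = ("100.64.0.10", 40000)
stun_server: Endpoint = ("198.51.100.1", 3478)
game_peer: Endpoint = ("198.51.100.77", 50000)

eim = Nat(endpoint_independent=True)
sym = Nat(endpoint_independent=False)

eim_ports = (eim.map(client, stun_server), eim.map(client, game_peer))
sym_ports = (sym.map(client, stun_server), sym.map(client, game_peer))
```

In the endpoint-independent case the port the client learns from the STUN server is the same port the game peer sees, so peer-to-peer setup can succeed. In the symmetric case the two ports differ, which is the failure mode behind broken matchmaking and one-way audio.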
The realistic Linux comparison is a software BNG plus an nftables NAT tier plus a separate firewall and shaper. BNGSOFT delivers all of it in one XDP data plane. The table below reflects well-established architectural characteristics of each approach.
| Capability | Linux BNG + nftables NAT | Full CGNAT (XDP) |
|---|---|---|
| Data path | Kernel netfilter — full stack traversal per packet (PREROUTING, FORWARD, POSTROUTING) | XDP hook at driver level — kernel stack bypassed entirely for translated traffic |
| Per-packet CPU | High — scales with packet rate; netfilter overhead plus conntrack lock contention at carrier volume | ~3% on a 48-core node at carrier load; per-packet overhead in nanoseconds; XDP cut CPU from ~23% to ~2.5% on comparable nodes |
| Session scale | Kernel conntrack table; contention and memory pressure at millions of flows | 8.4 million session-table capacity; 131,072 private-IP capacity per node; 252,000+ concurrent sessions demonstrated in production |
| NAT behaviour | Symmetric masquerade allocates a new external port per flow; no endpoint-independent mapping — breaks gaming, VoIP, P2P | Full-cone endpoint-independent mapping (RFC 4787 compliant); deterministic port-block per private IP — carrier-grade app compatibility |
| Logging | One log record per connection/flow — generates millions of records/hour at carrier scale; high storage and SIEM cost | ~900× fewer records: 314 million sessions → ~347,000 port-block allocation logs; block↔subscriber mapping satisfies lawful-intercept/data-retention requirements |
| Port allocation | Dynamic per-flow; no determinism; difficult subscriber attribution without per-flow logs | Deterministic port-block allocation (~1.1 blocks per private IP; 31,248-block pool at ~8% utilisation); subscriber attributable from block alone |
| ALG support | Via kernel nf_nat helpers (FTP, SIP, H.323 etc.) | Built-in ALG — FTP, SIP, and related protocols handled in XDP path |
| Hairpin NAT | Supported via hairpin masquerade rules | Native hairpin — subscribers reach each other via the public IP without leaving the CGNAT node |
| Edge functions in the box | Separate systems: software BNG (pppd) + nftables NAT + firewall + tc/HTB shaper — each configured and scaled independently | BNG + CGNAT + firewall + QoS in one XDP data plane; subscriber termination, translation, security and shaping in a single pass |
| Hardware / appliances | One or more Linux servers plus, typically, a dedicated CGNAT appliance for scale — multiple boxes and licences | One commodity x86 server + Intel 40/100GbE NIC; four appliance roles collapse into a single node — no dedicated CGNAT box |
| Configuration | nftables ruleset; conntrack helper modules; per-rule logging configuration | Private range + public pool + port range + block size + NAT type — tables auto-populated at startup from the configured private range |
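The five configuration inputs in the last row are enough to derive the size of the port-block pool. The sketch below is an assumption-laden illustration: the `CgnatConfig` class, its field names and every value in it are hypothetical and are not BNGSOFT's configuration format. The values were picked so that the derived totals line up with the 31,248-block pool and 131,072 private-IP capacity quoted on this page; the actual pool and block size behind those figures are not stated.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class CgnatConfig:
    """Hypothetical container for the five inputs named in the table."""
    private_range: str   # subscriber-side addresses
    public_pool: str     # shared public IPv4 addresses
    port_first: int      # first usable external port
    port_last: int       # last usable external port
    block_size: int      # ports per block
    nat_type: str        # e.g. "full-cone"

    def blocks_per_public_ip(self) -> int:
        return (self.port_last - self.port_first + 1) // self.block_size

    def public_ip_count(self) -> int:
        # hosts() skips the network and broadcast addresses of the pool
        return len(list(ipaddress.ip_network(self.public_pool).hosts()))

    def total_blocks(self) -> int:
        return self.blocks_per_public_ip() * self.public_ip_count()

    def private_capacity(self) -> int:
        return ipaddress.ip_network(self.private_range).num_addresses


# Illustrative values only (documentation and shared address space ranges).
cfg = CgnatConfig(
    private_range="100.64.0.0/15",
    public_pool="203.0.113.0/26",
    port_first=1024,
    port_last=65535,
    block_size=128,
    nat_type="full-cone",
)

summary = {
    "blocks_per_public_ip": cfg.blocks_per_public_ip(),
    "public_ips": cfg.public_ip_count(),
    "total_blocks": cfg.total_blocks(),
    "private_capacity": cfg.private_capacity(),
}
```

With these assumed inputs, each public IP carries 504 blocks of 128 ports, and 62 public IPs give 31,248 blocks in total.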
Port-block allocation logging is not a compromise on compliance — it is a smarter model. Each port block maps deterministically to one private IP address and a precise allocation timestamp, providing everything lawful-intercept and data-retention regulations require at a fraction of the storage cost.
314 million sessions were logged as only ~347,000 port-block allocation records, roughly 900× fewer log entries than per-connection NAT logging, with no loss of subscriber attribution capability. Each port-block record holds the private IP, the public IP and port range, and the allocation timestamp in a single compact entry.
Per-flow NAT logging at CGNAT volume can produce tens of millions of records per day per node. Port-block allocation logging generates the same subscriber-attribution information in a table that is roughly 900× smaller — dramatically reducing storage, SIEM ingest costs, and log retention infrastructure.
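Attribution from a port-block log is a range lookup rather than a per-flow search. The sketch below shows the idea under stated assumptions: `BlockRecord` and `attribute` are hypothetical names, the record layout is inferred from the fields listed above, and the `released_at` field is an addition of this sketch so that a block can be reassigned over time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class BlockRecord:
    """One port-block allocation log entry (hypothetical layout)."""
    private_ip: str
    public_ip: str
    port_first: int
    port_last: int
    allocated_at: datetime
    released_at: Optional[datetime] = None  # assumption: not mentioned in the text


def attribute(log: List[BlockRecord], public_ip: str, port: int,
              when: datetime) -> Optional[str]:
    """Return the private IP that held (public_ip, port) at time `when`."""
    for rec in log:
        in_range = rec.public_ip == public_ip and rec.port_first <= port <= rec.port_last
        active = rec.allocated_at <= when and (rec.released_at is None or when < rec.released_at)
        if in_range and active:
            return rec.private_ip
    return None


def utc(hour: int) -> datetime:
    return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)


# The same block is held by two subscribers at different times.
log = [
    BlockRecord("100.64.0.10", "203.0.113.5", 1024, 1151, utc(8), utc(12)),
    BlockRecord("100.64.0.99", "203.0.113.5", 1024, 1151, utc(12)),
]

morning = attribute(log, "203.0.113.5", 1100, utc(9))
afternoon = attribute(log, "203.0.113.5", 1100, utc(15))
outside = attribute(log, "203.0.113.5", 2000, utc(9))
```

A lawful-intercept query supplies the public IP, the port and a timestamp, which is exactly the key this lookup needs. A production system would index the records instead of scanning a list.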
The full CGNAT data plane packages BNG, NAT, firewall and QoS into one XDP engine on a single node. Every capability is production-proven on live networks.
100+ private IPs share a single public IP. At ~108 sessions and ~1.1 port blocks per private IP, the public address pool is used efficiently.
Endpoint-independent mapping (RFC 4787). Gaming consoles, VoIP SIP clients, and P2P applications work correctly — no broken matchmaking, no one-way audio.
Each private IP receives a deterministically assigned port block. Subscriber attribution for any logged connection requires only the port block record — no per-flow logging needed.
Application-layer gateway handles FTP, SIP, H.323, and related protocols. Hairpin NAT lets subscribers reach each other via the shared public IP without leaving the CGNAT node.
Operates in front of or behind any BNG, router, or edge device. No integration required beyond configuring the private range and public pool — completely vendor-agnostic.
Runs on standard x86 servers with an Intel 40GbE NIC — the same hardware already in your network. No proprietary CGNAT appliance, no vendor lock-in, no six-figure CapEx.
The figures below are from an anonymised production node running the full BNG + CGNAT + firewall + QoS data plane on live carrier infrastructure, June 2026 — one server doing the work of four appliances.
Concurrent sessions in production at carrier subscriber load.
Total session-table capacity per node — current utilisation is ~3% of maximum.
131,072 subscriber / private-IP capacity per node — the same table serves BNG sessions and NAT mappings.
314 million NAT sessions processed over the engine's lifetime on this production node.
Packets translated by the XDP engine in production — zero kernel-stack traversal.
~3% CPU on a 48-core node at carrier load; XDP cut CPU from ~23% to ~2.5% vs. kernel-path NAT on comparable nodes.
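The headline figures on this page can be cross-checked against each other with simple arithmetic. The snippet below uses only numbers quoted here (all of them rounded), so the results are approximate; the variable names are illustrative.

```python
# Figures quoted on this page (rounded, so results are approximate).
sessions_total = 314_000_000       # lifetime NAT sessions
block_log_records = 347_000        # port-block allocation log records
concurrent_sessions = 252_000      # concurrent sessions in production
session_capacity = 8_400_000       # session-table capacity per node
block_pool = 31_248                # port blocks in the pool
pool_utilisation = 0.08            # ~8% of blocks in use
blocks_per_private_ip = 1.1
sessions_per_private_ip = 108

# Log reduction: sessions per log record.
log_reduction = sessions_total / block_log_records

# Session-table utilisation.
table_utilisation = concurrent_sessions / session_capacity

# Active private IPs implied by pool utilisation, and the sessions they imply.
active_private_ips = block_pool * pool_utilisation / blocks_per_private_ip
implied_sessions = active_private_ips * sessions_per_private_ip

print(f"log reduction      ~{log_reduction:.0f}x")
print(f"table utilisation  ~{table_utilisation:.1%}")
print(f"active private IPs ~{active_private_ips:.0f}")
print(f"implied sessions   ~{implied_sessions:.0f}")
```

The log reduction works out to about 905×, the table utilisation to 3%, and the session count implied by block utilisation to roughly 245,000, which is within rounding distance of the 252,000+ concurrent sessions reported.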
Because the data plane runs in XDP, CPU is almost never the limit — the bus and the NIC are. The three reference builds below show how the same BNGSOFT engine scales from a regional edge to a national core. Subscriber and traffic figures are planning estimates derived from the measured production node (one tier-1 build, ~3% CPU at carrier load); your numbers will vary with traffic mix and plan profile.
The bus, not the CPU, sets the per-node ceiling. On a dual-40G XL710 both ports share one PCIe 3.0 x8 link (~56 Gbps of usable bus bandwidth); router-on-a-stick designs that hairpin traffic in and back out halve usable subscriber throughput. Moving to E810 + PCIe 4.0 x16 lifts that ceiling roughly fourfold. In every tier the XDP CPU cost stays in the single-to-low-double-digit percentages — you scale the link, not the core count. All figures are illustrative planning estimates.
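The bus ceiling follows from the published PCIe signalling rates. The calculation below is a rough sketch: it accounts only for 128b/130b line encoding, so it gives raw link bandwidth per direction; the ~56 Gbps usable figure above is taken from the text and reflects additional protocol overhead that this sketch does not model. The function name is illustrative.

```python
def pcie_raw_gbps(gt_per_s: float, lanes: int) -> float:
    """Raw per-direction bandwidth for PCIe 3.0/4.0, which use 128b/130b encoding."""
    return gt_per_s * lanes * 128 / 130


gen3_x8 = pcie_raw_gbps(8, 8)      # PCIe 3.0 runs at 8 GT/s per lane
gen4_x16 = pcie_raw_gbps(16, 16)   # PCIe 4.0 runs at 16 GT/s per lane
ratio = gen4_x16 / gen3_x8

usable_gen3_x8 = 56.0              # Gbps, figure quoted in the text
one_arm_throughput = usable_gen3_x8 / 2   # router-on-a-stick halves it, per the text

print(f"PCIe 3.0 x8  raw: {gen3_x8:.1f} Gbps")
print(f"PCIe 4.0 x16 raw: {gen4_x16:.1f} Gbps")
print(f"ratio: {ratio:.1f}x")
print(f"router-on-a-stick on XL710: ~{one_arm_throughput:.0f} Gbps")
```

Doubling both the per-lane rate and the lane count is what produces the roughly fourfold ceiling, independent of how many CPU cores the node has.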
The cost story is structural, not a discount. A purpose-built edge is four product lines — BNG, CGNAT, firewall, shaper — each with its own CapEx, licence and support contract. BNGSOFT delivers all four roles on one commodity server. Cost ranges below are illustrative industry estimates for comparison, not quotes.
One commodity x86 server running the BNGSOFT XDP data plane delivers subscriber termination, carrier-grade address sharing, full-cone NAT, edge firewall, and QoS together — replacing a rack of proprietary appliances and reducing operational costs across the board.
Sharing each public IP among 100+ private IPs extends your existing address pool substantially, deferring costly IPv4 address acquisitions and giving you time to transition to IPv6 at a sustainable pace.
BNG, CGNAT, firewall and shaper collapse into a single commodity x86 node — eliminating six-figure appliance costs and vendor lock-in across the whole edge stack. Use hardware you already own or procure at commodity prices.
Port-block logging replaces per-flow logging. 314 million sessions produced only ~347,000 log records — dramatically reducing storage, SIEM ingest volumes, and log retention infrastructure costs.
~3% CPU at carrier load on a 48-core node. XDP processes packets in nanoseconds before the kernel stack, so CPU does not become the bottleneck as subscriber count grows and there is no conntrack spinlock contention.
Endpoint-independent NAT eliminates symmetric-NAT app breakage. Gaming, VoIP, and P2P applications work correctly, reducing support tickets and subscriber churn from CGNAT-related issues.
The data plane runs on commodity x86 with standard Intel NICs — no proprietary silicon, no per-throughput licence, no renegotiated contracts. Scale by adding nodes, not by forklift upgrades.
Talk to BNGSOFT about a full BNG + CGNAT + firewall + QoS deployment. We'll walk through your network topology, subscriber scale, address pool, and link budget — and have you running a proof of concept on your own hardware.