Three Functions. One Suite. Internet That Feels Fast Under Load.
Your subscribers already get the megabits you sold them. The thing that makes broadband feel premium is what happens when the line is busy. BNGSOFT's low-latency suite — AQM, L4S, and Interactive Flow Protection — works as one integrated stack to keep gaming smooth, calls clear, and web pages snappy even when a household is saturating its line. Validated in production. Zero new hardware. Software-deployed on your existing fleet.
Speed is what you sell. Latency under load is what subscribers feel — and it is the one part of that experience the operator actually controls.
1.15 B+
interactive packets protected on a single live node (Node A)
~0.6 ms
avg queuing latency held under load · Node A · ~950 subs
13.6 M
no-loss ECN signals issued · Node A · zero packet drops
0
new hardware, appliances or per-subscriber licences
Why operators choose the low-latency suite
$0
Runs on your existing fleet — zero CapEx. Software-only on the BNGs you already operate: Intel i40e (40G, native XDP), Intel ixgbe (10G), and VMware vmxnet3 (virtual/SKB mode). No new appliances, no forklift upgrade. Deploy node-by-node, observe first, flip to enforce with one command.
◆
A complete answer to "my internet lags when someone downloads." AQM keeps the queue short. L4S signals ECN-capable senders loss-free before overflow. IFP ensures the gamer's packets and the Zoom audio frame never join that queue at all. Three mechanisms, one coordinated stack — the first two companion brochures cover each individually; this is the combined picture.
↑
A product you can sell — and churn you can cut. Package as a "gaming / low-latency / pro" tier with continuous, measurable per-subscriber protection. Reduce "my game lags / call drops" support tickets. Retain work-from-home subscribers who notice the difference. The measured protection events are auditable by the NOC — you can show it working.
✓
Intelligent restraint — armed insurance, not an aggressive optimizer. On a healthy, uncongested line the suite stays near-silent. The adaptive AQM controller held at its minimum setting throughout peak on every production node. IFP's drop-protection fires when a line saturates and does nothing when it does not. The ACK-prioritization and interactive classification run continuously as a baseline benefit. One-flag disable per node, instantly reversible.
RFC
Standards-based and future-proof. IETF L4S — RFC 9330 / 9331 / 9332 — plus CoDel-style AQM and a conservative per-subscriber interactive-flow classifier. Works on CGNAT and QoS-only deployments, IPv4 and IPv6, ECN-capable and legacy flows, on all supported NIC types.
~2%
~2% CPU. Line-rate XDP. No bottleneck added. The entire suite — AQM, L4S dual-queue, IFP classification — runs inline in XDP at line rate. Measured CPU overhead on production nodes is approximately 2%. The per-node throughput limit is the NIC and CGNAT map capacity, not the suite software.
1 · The problem: bufferbloat and head-of-line blocking
When a subscriber saturates their line — a game patch download, a 4K stream, a cloud backup, or simply several devices active at once — packets arrive faster than the line can transmit them. They pile up in the shaper's queue at the BNG. That queue is where the delay is born.
Diagram 1: The problem — one queue, one lane, everything waits
On a standard BNG with a single FIFO/tail-drop queue, a bulk download fills the subscriber's shaper buffer. Every interactive packet — a game input, a VoIP audio frame, a DNS lookup — joins the back of the same queue and waits. The throughput meter looks fine. The subscriber's game is rubber-banding.
The symptom every ISP knows: "my internet lags whenever someone downloads something." The speed test shows 100 Mbps — the line is full, exactly as sold. But the gamer's input arrives 200 ms late; the video call is freezing; the web page hesitates before loading. The culprit is bufferbloat (queuing latency) and head-of-line blocking (all flows waiting in one queue). Both are solvable at the BNG — without new hardware.
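To put the 200 ms figure in context: queuing delay is the backlog ahead of a packet divided by the line rate. The sketch below is a back-of-envelope calculation, not a measurement, and the function names are illustrative.

```python
def queue_delay_ms(backlog_bytes: float, line_rate_bps: float) -> float:
    """Time a newly arrived packet waits for the backlog ahead of it to drain."""
    return backlog_bytes * 8 / line_rate_bps * 1000.0


def backlog_for_delay(delay_ms: float, line_rate_bps: float) -> float:
    """Bytes of queued data that produce the given wait at the given line rate."""
    return delay_ms / 1000.0 * line_rate_bps / 8


if __name__ == "__main__":
    rate = 100e6  # the 100 Mbps line from the example above
    print(backlog_for_delay(200, rate))     # bytes queued behind a 200 ms wait
    print(queue_delay_ms(2_500_000, rate))  # wait behind 2.5 MB of backlog
```

On a 100 Mbps line, about 2.5 MB of buffered bulk data is enough to hold every following packet for roughly 200 ms, which is why a full shaper buffer hurts even though throughput looks fine.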
2 · The solution: the three-layer low-latency suite
Layer 1 · AQM
Active Queue Management
CoDel-style algorithm watches how long each packet waits in the queue (sojourn time). When delay starts rising, it acts before the buffer overflows — marking or dropping early to signal senders to ease off. Keeps average queuing latency near a configured target (e.g. 5 ms). Adaptive controller self-tunes per node; no manual per-box threshold babysitting. Catches the transient bufferbloat spikes that destroy gaming and calls even when average latency looks fine.
Layer 2 · IFP
Interactive Flow Protection
Classifies every packet at line rate as bulk or interactive (gaming, VoIP, DNS, TCP ACKs, connection setup). Interactive packets take a protected lane — they bypass the policer's induced delay and are exempt from AQM mark/drop decisions. They never wait behind a bulk download. ACK-prioritization specifically fixes asymmetric-link download collapse: upstream TCP ACKs are fast-pathed so the download never stalls. Abuse-gated with a bounded per-subscriber priority allowance (default 10%).
Layer 3 · L4S
Low Latency, Low Loss, Scalable Throughput
RFC 9330/9331/9332. Dual-queue architecture separates ECN-capable flows from legacy flows. ECN-capable senders (modern OS, CDNs, streaming services) receive a Congestion Experienced (CE) mark instead of a packet drop — they ease off without losing a single byte, eliminating retransmission stalls. Legacy flows get CoDel-style early-drop. Both paths together mean every subscriber, every flow type is handled optimally. Validated at scale: 13.6 million no-loss signals on Node A alone.
Diagram 2: The solution — dual queue + protected lane
With the suite active, the same subscriber's mixed traffic is classified and routed into two coordinated paths. Interactive flows (game, VoIP, DNS, ACKs) take the protected lane — they pass without ever competing with the bulk backlog. Bulk flows enter the AQM-controlled queue where L4S issues ECN marks (no drop) to ECN-capable senders and CoDel early-drop to legacy senders, keeping queue depth low. The result: the download still gets its full bandwidth; the game and call are never delayed by it.
The three paths are coordinated: IFP classification fires first, then AQM governs the bulk queues. Interactive packets are never subject to AQM mark/drop — they were not the cause of the congestion.
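The ordering described above can be sketched as a single decision function. This is a simplified illustration, not the product's datapath: the names `Packet`, `Action`, and `decide` are hypothetical, the interactive flag is assumed to come from an earlier classification step, and a real CoDel-style controller signals only some packets above target rather than every one.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PROTECTED_LANE = "forward via protected lane"
    FORWARD = "forward via bulk queue"
    CE_MARK = "forward with ECN CE mark"
    EARLY_DROP = "early drop"


@dataclass
class Packet:
    interactive: bool  # result of IFP classification (assumed computed earlier)
    ecn_capable: bool  # packet carries an ECN-capable codepoint


def decide(pkt: Packet, sojourn_ms: float, target_ms: float = 5.0) -> Action:
    # Step 1: IFP runs first; interactive packets never reach the AQM decision.
    if pkt.interactive:
        return Action.PROTECTED_LANE
    # Step 2: bulk traffic is governed by AQM; below target nothing is signalled.
    if sojourn_ms <= target_ms:
        return Action.FORWARD
    # Step 3: above target, signal the sender. ECN flows get a lossless mark,
    # legacy flows get an early drop.
    return Action.CE_MARK if pkt.ecn_capable else Action.EARLY_DROP
```

For example, `decide(Packet(interactive=False, ecn_capable=True), sojourn_ms=12.0)` yields `Action.CE_MARK`, while the same delay on a non-ECN flow yields `Action.EARLY_DROP`.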
3 · Before vs after — the experience difference (centerpiece)
The following comparison is the core story. A household running a large download (a game patch, an OS update, a 4K stream) normally ruins everyone else's experience on the same line. The suite changes that outcome without touching the download speed.
BEFORE — Standard BNG (single FIFO / tail-drop)
Gaming ping during a download
150–250 ms
Inputs arrive late → rubber-banding, lost fights, disconnects. "My internet is broken."
Video call (Zoom/Teams/Meet) while another device streams
Choppy / frozen
Audio frames wait behind stream burst → call stutters, people talk over each other, "you're frozen."
Page load while background download runs
Stalls visibly
DNS + SYN wait in the shared queue → the browser spins before anything loads.
"My internet lags when someone downloads" is the most common complaint the speed test never catches.
→
with suite
AFTER — XDP BNG with Low-Latency Suite
Gaming ping during a download
~20 ms or less
Game packets take the protected lane (IFP) and are never delayed by the download burst. AQM holds queuing latency to sub-millisecond averages under load.
Video call while another device streams
Clear and smooth
VoIP/RTP audio frames (~160–200 B) hit the small-packet gate in IFP → protected lane → audio stays continuous. Stream still gets its bandwidth.
Page load while background download runs
Snappy
DNS queries and TCP SYN packets are IFP-protected — they pass without joining the bulk queue. Pages feel instant even at line saturation.
ECN congestion response
ECN CE mark — no drop
L4S issues a Congestion Experienced mark. The sender eases off without losing a packet. No retransmit stall. Download stays smooth and fast.
Support call driver
Reduced
The "lags when someone downloads" complaint is precisely what the suite eliminates. Fewer tickets, higher satisfaction, lower churn.
Diagram 3: Before vs after — game ping under load (illustrative of the mechanism)
The "before" figure (150–250 ms game ping during a download) reflects typical bufferbloat on a saturated residential line with a standard tail-drop shaper. The "after" figures are anchored by the real measured production data: queuing latency averaged ~0.6 ms on Node A (~950 subs, i40e, native XDP) under load. The bar labeled "illustrative" is a realistic representation of the mechanism; the queuing-latency measurements are the real production figures. Lower is better.
Standard BNG — idle line
~1 ms
baseline
Standard BNG — busy line (illustrative)
150–250 ms game ping · bufferbloat
broken
Suite — busy line, game ping (illustrative)
~20 ms or less
smooth play
Suite — avg queuing latency · Node A (measured)
~0.6 ms
measured
Suite — avg queuing latency · Node B (measured)
~0.84 ms
measured
Suite — avg queuing latency · Node C (measured)
~1 ms
measured
Legend: Standard BNG (tail-drop) · Suite — game ping (illustrative of mechanism) · Suite — queuing latency (measured production)
Methodology note: game-ping figures are illustrative representations of the bufferbloat mechanism — they are consistent with the literature and operator experience but were not directly instrumented in the production trial. The queuing-latency figures (0.6 / 0.84 / 1 ms) are real measured values from three independent live production nodes. The mechanism connection between low queuing latency and low game ping is direct and causal: queuing latency at the BNG is a primary component of game ping on a saturated residential line.
4 · Production evidence — three independent live nodes (real data)
These are not simulations or lab benchmarks. Each node below is an independent live production BNG node on real residential traffic. Node identifiers are anonymized; no hostnames, IP addresses, or customer names are included.
NODE A · ~950 SUBS · i40e 40G · NATIVE XDP
IFP + L4S fully instrumented
1.15 B
pure TCP ACKs protected (the asymmetric-link ACK fix)
1.15 billion pure TCP ACKs classified and fast-pathed by IFP — the dominant interactive class on a residential access node, as expected
216,000+ interactive packets actively saved from drop during real congestion bursts
617 million thin/interactive flows detected by rate-aware classification
13.6 million L4S ECN no-loss congestion signals issued — zero packet drops on those flows
Only 6,691 classic drops (legacy flows, not ECN-capable) — 2,000:1 ratio of no-loss signals to drops
Avg queuing latency: ~0.6 ms under load; peak excursions to ~19 ms (caught and signalled)
CPU: ~2% — the full suite running inline in XDP
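The 2,000:1 ratio quoted for Node A follows directly from the two counters above. A quick check, using the rounded 13.6 million figure as stated (variable names are illustrative):

```python
no_loss_signals = 13_600_000  # L4S ECN no-loss signals on Node A (rounded figure from the text)
classic_drops = 6_691         # classic drops on Node A

ratio = no_loss_signals / classic_drops
drop_share = classic_drops / (no_loss_signals + classic_drops)

print(f"{ratio:,.0f}:1 no-loss signals per classic drop")
print(f"{drop_share:.3%} of all congestion signals were drops")
```

The ratio comes out slightly above 2,000:1, so the headline figure is a conservative rounding.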
NODE B · ~2,530 SUBS · NATIVE XDP (LARGEST NODE)
High-density node — suite at scale
2.5 M
interactive packets protected from drop during a real congestion burst
208 million total interactive packets classified by IFP
2.5 million interactive packets saved from drop during a congestion event — the insurance firing when it mattered
3.4 million L4S no-loss ECN signals — zero packet loss on ECN-capable flows
Avg queuing latency: ~0.84 ms — held flat under load
Adaptive AQM controller at minimum aggressiveness throughout — correct behaviour, throughput fully protected
~95–97% of protected interactive traffic was pure TCP ACKs — consistent with the fleet-wide pattern
NODE C · ~400 SUBS · NATIVE XDP
Smaller node — L4S signal volume
31.9 M
L4S no-loss ECN congestion signals issued
31.9 million no-loss ECN congestion signals — every one a lossless "ease off" to an ECN-capable sender
Avg queuing latency: ~1 ms — within target throughout
Adaptive AQM controller correctly idle — latency was healthy; no unnecessary drops
Zero subscriber disruption at deployment; full throughput preserved
Consistent with Nodes A and B: same software, same per-subscriber behaviour regardless of node density
Fleet summary — three live production nodes, measured in production on real residential traffic
All figures from independent live production nodes. No lab setup, no synthetic load. ~95–97% of IFP-protected traffic is pure TCP ACKs across all nodes. Suite deployed with zero subscriber disruption on all nodes.
1.15 B+
TCP ACKs protected · Node A · asymmetric-link ACK fix
13.6 M+
no-loss L4S signals · Node A · 2,000:1 vs classic drops
~0.6 ms
avg queuing latency · Node A · under real load
~2%
CPU overhead · full suite · AQM + L4S + IFP inline XDP
Honest interpretation of the production data: What the IFP drop-protection counter means: the 216,000+ packets saved from drop on Node A (2.5 million on Node B) represent real congestion events where the subscriber's line saturated and interactive packets would otherwise have been dropped or AQM-marked. On a healthy, uncongested line this counter stays near zero — because there is nothing to protect against. This is the correct behaviour. IFP is armed insurance: it engages automatically under congestion with conservative gates, and does nothing when it is not needed.
What runs continuously regardless: the ACK-prioritization (1.15 billion ACKs on Node A), the interactive-flow classification, and the L4S ECN marking all operate continuously — not only during peak congestion. These are the baseline benefits that are always active.
The adaptive AQM controller: it held at minimum aggressiveness on every node throughout peak. This is the correct outcome — it means the network was healthy, so the controller protected throughput by doing as little as possible. Smart enough to do nothing when nothing is wrong.
5 · What each subscriber type experiences
The following table maps each common traffic type to the concrete subscriber-felt benefit of the suite — and what the alternative looks like on a standard BNG without it.
Online gaming (UDP, ~40–120 B packets)
Without the suite (standard BNG): Ping jumps from ~20 ms to 150–250 ms the instant a download starts. Rubber-banding, lost inputs, disconnects. "My internet is broken when my family is home."
With the low-latency suite: Game packets hit the small-packet IFP gate and take the protected lane. They never wait behind the download burst. Ping stays low and stable throughout, so competitive play works on a busy household line.
Which mechanism: IFP (small-packet gate + rate-aware) protects game packets. AQM holds queue depth. L4S ensures the download doesn't cause retransmit oscillation.

Voice / VoIP calls (RTP, ~160–200 B frames)
Without the suite (standard BNG): Audio frames wait behind upload or download bursts. Call is choppy, people talk over each other, "you're frozen," reconnect loops. Critical for work-from-home subscribers.
With the low-latency suite: VoIP/RTP audio frames are small, so they pass through the IFP small-packet gate. The protected lane keeps them out of the bulk queue. Calls stay clear and natural even while the household is saturating the line.
Which mechanism: IFP (small-packet gate) protects audio frames. AQM prevents queue buildup. L4S prevents the background download from dropping and recovering repeatedly.

Video calls (Zoom / Teams / Meet)
Without the suite (standard BNG): Video keyframes are larger but audio is the perceptual bottleneck. Intermittent queue depth from streaming or downloads causes audio jitter. "The video call quality is terrible when my kids are streaming."
With the low-latency suite: Audio packets are protected via small-packet and VoIP gates. Video call audio stays continuous. Video may occasionally see a brief keyframe delay, but the call remains usable and clear. Critical for the growing WFH subscriber segment.
Which mechanism: IFP protects audio frames. AQM holds overall queue latency below the 5 ms target. L4S prevents the streaming burst from causing drop-and-recover oscillation.

Large downloads (OS updates, game installs)
Without the suite (standard BNG): Tail-drop causes retransmit stalls when the queue fills; throughput oscillates below rated speed. The download also degrades everyone else's experience in the household simultaneously.
With the low-latency suite: L4S issues ECN CE marks to ECN-capable download senders (modern OS, CDNs) instead of dropping: no retransmit stall, smooth throughput. The download still gets full rated speed. Other household traffic is unaffected by the download's bulk queue.
Which mechanism: L4S (ECN CE mark) prevents retransmit stalls for ECN-capable flows. CoDel path handles legacy flows. IFP ensures other household traffic is not blocked by this download's queue.

Video streaming (Netflix, YouTube, 4K)
Without the suite (standard BNG): Buffer fills → latency spikes → adaptive bitrate drops to lower quality → video visibly degrades. When multiple devices stream, one wins and others degrade.
With the low-latency suite: AQM keeps the queue short, so streaming buffers refill quickly and consistently. L4S ECN marking lets adaptive-bitrate algorithms respond to real congestion signals without drop events. Streams stay at higher quality for longer.
Which mechanism: AQM holds queuing latency near target. L4S ECN marks guide ABR algorithms. IFP ensures stream control packets (DNS, SYN, small bursts) are not blocked.

Web browsing (DNS + TCP SYN critical path)
Without the suite (standard BNG): DNS queries and TCP SYN packets queue behind background downloads. Pages take an extra 100–300 ms to start loading. "My internet feels slow even though the speed test shows 200 Mbps."
With the low-latency suite: DNS (port 53) and SYN packets are IFP-protected; they pass without joining the bulk queue. Pages snap open instantly even when the line is saturated by a download. The "snappiness" that subscribers use to judge service quality is preserved at all times.
Which mechanism: IFP (DNS gate + SYN gate) protects the critical-path packets. This is the gut-feel "is my internet good?" test subscribers run constantly, and the fix is free.

Downloads on asymmetric links (upload running at the same time)
Without the suite (standard BNG): Upload traffic (video call sending, cloud backup) fills the slow upstream queue. TCP ACKs for the download stream wait behind upload bulk packets → download throughput collapses even though the download line is empty.
With the low-latency suite: Pure TCP ACKs (≤64 B, no payload) are fast-pathed by IFP's ACK gate. They can never be delayed by upload bulk. Download performance is fully recovered while the upload continues unaffected. Directly measured: 1.15 billion ACKs protected on Node A.
Which mechanism: IFP (pure-ACK gate, upload direction). Well-understood bufferbloat remedy that works on every asymmetric link, continuously, always-on regardless of congestion state.
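A pure-ACK check of the kind described above can be sketched in a few lines. This is an illustration under assumptions: the source gives only "≤64 B, no payload", so measuring the limit against the IP total length, excluding SYN/FIN/RST segments, and the name `is_pure_ack` are choices made here, not confirmed product behaviour.

```python
TCP_FIN, TCP_SYN, TCP_RST, TCP_ACK = 0x01, 0x02, 0x04, 0x10  # standard TCP flag bits


def is_pure_ack(ip_total_len: int, ip_hdr_len: int, tcp_hdr_len: int,
                tcp_flags: int, max_len: int = 64) -> bool:
    """True for a TCP segment that acknowledges data and carries none."""
    payload = ip_total_len - ip_hdr_len - tcp_hdr_len
    if payload != 0 or ip_total_len > max_len:
        return False
    # ACK must be set; SYN, FIN and RST change connection state, so a segment
    # carrying them is not treated as a pure ACK in this sketch.
    state_flags = TCP_SYN | TCP_FIN | TCP_RST
    return bool(tcp_flags & TCP_ACK) and not (tcp_flags & state_flags)
```

A typical ACK with the timestamp option (20-byte IP header, 32-byte TCP header, no data) passes: `is_pure_ack(52, 20, 32, TCP_ACK)` is true, while a full-size data segment or a SYN-ACK is not.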
6 · How the three functions work together — the coordinated stack
Three mechanisms — one integrated answer
AQM limits queue depth. L4S signals ECN senders without loss. IFP keeps interactive traffic out of the queue entirely.
AQM (Active Queue Management)
Watches how long each packet waits in the subscriber's shaper queue (sojourn time). When sojourn time exceeds the target, acts early — before buffer overflow. Keeps average queuing latency near the configured target (e.g. 5 ms). Adaptive controller self-tunes per node: ramps up under sustained congestion, backs off to minimum when latency is healthy. This is the baseline: without AQM, a full queue simply drops packets with no warning.
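The sojourn-time logic can be illustrated with a reduced form of the published CoDel control law (RFC 8289): tolerate delay above target for one interval, then signal at a rate that increases with the square root of the signal count. This is a generic sketch, not the suite's adaptive controller, whose internals are not described here; the class name and the 100 ms interval are assumptions.

```python
import math


class CoDelSketch:
    def __init__(self, target_ms: float = 5.0, interval_ms: float = 100.0):
        self.target = target_ms
        self.interval = interval_ms
        self.first_above = None   # time at which sustained delay starts to count
        self.signalling = False
        self.count = 0
        self.next_signal = 0.0

    def on_dequeue(self, now_ms: float, sojourn_ms: float) -> bool:
        """Return True if this packet should be marked or dropped."""
        if sojourn_ms < self.target:
            # Queue is healthy: leave the signalling state entirely.
            self.first_above = None
            self.signalling = False
            return False
        if self.first_above is None:
            # First packet above target: start the grace interval.
            self.first_above = now_ms + self.interval
            return False
        if not self.signalling:
            if now_ms < self.first_above:
                return False
            # Delay stayed above target for a full interval: start signalling.
            self.signalling = True
            self.count = 1
        elif now_ms < self.next_signal:
            return False
        else:
            self.count += 1
        # Control law: signals get closer together while delay persists.
        self.next_signal = now_ms + self.interval / math.sqrt(self.count)
        return True
```

A short burst above target produces no signal at all, which matches the "do as little as possible on a healthy line" behaviour described in this brochure; only delay that persists for a full interval triggers a mark or drop.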
L4S (Low Latency, Low Loss, Scalable Throughput)
Dual-queue architecture separates ECN-capable flows from legacy flows. ECN-capable senders receive a Congestion Experienced mark — not a drop. The sender eases off without losing a packet, without triggering a retransmission stall, and without the throughput oscillation that classic drop causes. L4S is the congestion signal; AQM is the policy for when to send it. They are designed to work together. Validated: 13.6 M no-loss signals on Node A, 31.9 M on Node C.
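At the packet level, the mark-instead-of-drop signal is a change to the two ECN bits of the IP header. The codepoint values below are the standard ones, and RFC 9331 uses ECT(1) and CE to identify L4S traffic; the function names and the exact queue-selection policy are illustrative assumptions rather than the suite's implementation.

```python
# ECN field: the two low-order bits of the IPv4 TOS / IPv6 Traffic Class byte.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def ecn_bits(tos: int) -> int:
    return tos & 0b11


def queue_for(tos: int) -> str:
    # Per RFC 9331, ECT(1) and CE select the low-latency queue;
    # Not-ECT and ECT(0) stay in the classic queue.
    return "low-latency" if ecn_bits(tos) in (ECT_1, CE) else "classic"


def signal_congestion(tos: int):
    """Return the TOS byte with CE set, or None when the packet is not
    ECN-capable and the only available signal is a drop."""
    if ecn_bits(tos) == NOT_ECT:
        return None
    # DSCP (upper six bits) is preserved. On IPv4 the header checksum
    # must be updated after this change.
    return tos | CE
```

For example, `signal_congestion(0b10)` returns `0b11` (ECT(0) becomes CE), while a Not-ECT packet returns `None` and falls to the early-drop path.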
+
together
IFP (Interactive Flow Protection)
Classifies every packet as bulk or interactive using five signal gates: pure-ACK, SYN, DNS, small-packet (≤256 B), and optional DSCP/port lists. Interactive packets take the protected lane — no policer delay, no AQM mark/drop. They were not the cause of the congestion. Abuse-gated with a bounded per-subscriber priority allowance (default 10% of shaped rate). ACK-prioritization (pure-ACK gate on upload) recovers download performance on asymmetric links and runs continuously regardless of congestion state.
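The gate-plus-allowance idea can be sketched as follows. The five gates, the 256-byte limit, and the 10% default come from the description above; everything else (the `Pkt` fields, a once-per-second refill, the class and function names) is a hypothetical simplification for illustration.

```python
from dataclasses import dataclass

SMALL_PACKET_MAX = 256  # bytes, small-packet gate
PRIORITY_SHARE = 0.10   # default allowance: 10% of shaped rate


@dataclass
class Pkt:
    length: int
    pure_ack: bool = False
    syn: bool = False
    dns: bool = False
    listed: bool = False  # matched an optional operator DSCP/port list


def matches_gate(p: Pkt) -> bool:
    return p.pure_ack or p.syn or p.dns or p.listed or p.length <= SMALL_PACKET_MAX


class PriorityBudget:
    """Per-subscriber byte budget for the protected lane (refill period assumed: 1 s)."""

    def __init__(self, shaped_rate_bps: float):
        self.per_second = shaped_rate_bps / 8 * PRIORITY_SHARE
        self.remaining = self.per_second

    def refill(self) -> None:
        self.remaining = self.per_second

    def admit(self, p: Pkt) -> bool:
        if not matches_gate(p) or p.length > self.remaining:
            return False  # packet takes the normal bulk path instead
        self.remaining -= p.length
        return True
```

The budget is what makes the lane abuse-resistant: a subscriber who floods small packets exhausts the allowance, and the excess is treated like any other bulk traffic.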
Plain framing: AQM limits how deep the queue gets. L4S ensures the signal to ease off never causes a packet loss. IFP ensures the gamer's packets and the Zoom audio frame never join that queue in the first place. A subscriber doing a 4K stream download and a video call simultaneously: AQM holds the shaper latency below 5 ms; L4S signals the streaming sender loss-free; IFP ensures the Zoom audio packets are never delayed by the stream's burst. All three run inline in XDP at line rate — no coordination overhead, no central state.
Diagram 4: The latency stack — where each mechanism applies
Total subscriber-felt latency has three components. Each is controlled (or not) by a different mechanism. The suite covers everything within the BNG's scope.
7 · Hardware coverage — runs on your existing fleet
The suite runs across all supported NIC types without modification. The XDP program variant is selected automatically at startup according to the NIC type present; no operator configuration is required.
▪
Intel i40e — 40G NIC
Native XDP mode. Full suite: AQM + L4S dual-queue + IFP Tier 1 + Tier 2 (rate-aware sparse-flow detection). Highest headroom — all features including advanced rate-aware IFP classification. Used on Node A in production (~950 subs, 40G link). This is the recommended configuration for new deployments and high-density access nodes.
NATIVE XDP · FULL SUITE
▪
Intel ixgbe — 10G NIC
Native XDP mode. Full suite: AQM + L4S dual-queue + IFP Tier 1 (ACK, SYN, DNS, small-packet protection). Same per-subscriber behaviour as i40e; 10G link speed. Suitable for smaller access nodes and edge deployments. No forklift upgrade required — runs on existing ixgbe hardware already deployed in the field.
NATIVE XDP · FULL SUITE
□
VMware vmxnet3 — Virtual / SKB mode
SKB (kernel stack) XDP mode, automatically selected at startup for vmxnet3 NICs. AQM + L4S + IFP Tier 1 all supported in SKB mode. Enables the complete low-latency suite to run in fully virtualized environments — VMware ESXi, cloud-hosted BNGs, lab environments — without any native NIC. No forklift upgrade; no hardware dependency. The vmxnet3 SKB mode selection is automatic.
SKB/VIRTUAL · FULL SUITE
No forklift upgrade. No hardware dependency. The suite runs on i40e, ixgbe, and vmxnet3 NICs today, plus any other NIC supported by the XDP BNG, so an operator can roll it out across a mixed fleet (physical 40G and 10G nodes alongside virtual BNGs) without buying new hardware. The XDP program variant (native vs. SKB) is selected automatically; the operator does not need to configure it. On vmxnet3 with kernel 7.0+, the BNG auto-selects SKB mode: native XDP would be silently ignored by the driver, so the automatic selection is a safety guarantee as well as a convenience.
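The automatic selection amounts to a lookup from the NIC's kernel driver to an XDP attach mode. The mapping below restates the three drivers listed in this section; the function names are hypothetical, and reading the driver from sysfs is one common way to do it on Linux, not necessarily how the BNG does it.

```python
import os

NATIVE_XDP_DRIVERS = {"i40e", "ixgbe"}  # native XDP, per the list above
SKB_XDP_DRIVERS = {"vmxnet3"}           # SKB mode, per the list above


def driver_of(ifname: str) -> str:
    """Read the kernel driver bound to a Linux network interface from sysfs."""
    link = os.path.realpath(f"/sys/class/net/{ifname}/device/driver")
    return os.path.basename(link)


def xdp_mode_for(driver: str) -> str:
    if driver in NATIVE_XDP_DRIVERS:
        return "native"
    if driver in SKB_XDP_DRIVERS:
        return "skb"
    # Other NICs may be supported by the BNG; this sketch covers only the
    # three drivers named in this section.
    raise ValueError(f"driver {driver!r} is not covered by this sketch")
```

Failing loudly on an unknown driver mirrors the safety point made above: a mode that the driver silently ignores is worse than an explicit fallback.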
8 · What to expect — rollout, behaviour, and honest framing
The suite is designed so you can turn it on without a maintenance window, a hardware order, or a leap of faith. Each component has an observe-first mode that measures and counts without changing any behaviour.
STEP 1 · OBSERVE
Measure only — zero behaviour change
Both AQM and IFP ship with observe mode: classification, queue measurement, and counting only
See your real interactive-packet volumes, ACK rates, queuing latency, and ECN-capable traffic share on real subscribers — zero risk
This is exactly how all production nodes were initially measured
Default-off per node — opt-in, nothing changes until you enable it
STEP 2 · ENFORCE
Protected lane and AQM active
Flip to enforce with one operator command per node — instant on/off, always reversible
ACK-prioritization and interactive-flow protection engage; AQM dual-queue begins marking and controlling
Drop-protection counters now fire during congestion events, confirming protection
Roll out node-by-node at your own pace; nodes are independent
ALWAYS · TELEMETRY
NOC-visible per-subscriber SLA
Per-subscriber queuing latency, ECN mark counts, IFP protection events visible via bngxdpctl metrics and bngxdpctl sub show
Distinguishes BNG-controllable queuing latency from path RTT (physics) — the NOC can prove which delay is theirs to fix
Fleet-wide "interactive flows protected" rollup — auditable proof of engagement at peak
"Armed insurance" — how the suite behaves across the operating cycle: Idle / uncongested line (most of the time): the suite is near-silent. Adaptive AQM controller holds at minimum aggressiveness — zero unnecessary drops. IFP drop-protection counters stay near zero because there is nothing to protect against. ACK-prioritization and interactive classification run continuously as a baseline benefit.
Peak / busy line (evenings, shared households): the suite engages proportionally. AQM begins issuing ECN marks to ECN-capable senders; CoDel path handles legacy flows. IFP drop-protection fires for interactive flows that would otherwise be dropped. Drop-protection counter events are visible in the NOC telemetry.
Sustained congestion: the adaptive controller ramps aggressiveness to hold queuing latency near the configured target. Demonstrated capability: average queuing latency reduced from ~5 ms to ~2.9 ms in a controlled sustained-congestion run on a production node.
One-flag disable: the entire suite can be disabled per node with a single command at any time. Instant, reversible, no subscriber impact. Individual components (AQM, IFP) can also be toggled independently.
9 · Measure it, manage it — the operator visibility layer
Section 8 describes the "ALWAYS · TELEMETRY" card — per-subscriber queuing latency and ECN mark counts visible via bngxdpctl metrics and bngxdpctl sub show. The four capabilities below are the concrete realisation of that promise: a complete visibility and management layer that turns "we have low latency" into "we measure every subscriber's experience, continuously, and act on it before exhaustion or before a fault occurs."
Why a visibility layer matters: the data-plane suite can hold queuing latency to sub-millisecond averages and protect millions of interactive packets — but without the right operator tooling, none of that is actionable. Support cannot triage a subscriber complaint without a score. The NOC cannot plan capacity without headroom numbers. The product team cannot sell a "low-latency tier" without a KPI that proves it. These four tools close that gap — all running on the same BNG, same datapath, same deploy.
9.1 · Subscriber Experience Score — bngxdpctl ses
Every subscriber on the node receives a single 0–100 quality score, computed continuously from the data-plane telemetry the suite already collects. Support staff can open a ticket, look up the subscriber's current SES, and immediately answer "this customer scores 58 — here is why" rather than asking the subscriber to run a speed test. The NOC and product team get a real KPI instead of raw counters: not "average queuing latency was 0.8 ms" but "92% of subscribers scored Good or better at peak hour."
The score is latency-centric by design, reflecting what broadband subscribers actually feel. Typical weighting: queuing latency contributes the majority of the score, consistency and spike avoidance contribute a further quarter, and congestion-related loss accounts for the remainder. Crucially, a subscriber running at their contracted plan rate cap does not score worse for it — the score measures experience quality, not utilisation, and a subscriber who is getting exactly what they paid for should not be flagged as a problem.
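A latency-centric score of this shape can be sketched as a weighted sum. The text gives only the proportions in words (a majority, a further quarter, the remainder), so the 0.60 / 0.25 / 0.15 weights, the "bad" reference values used for normalisation, and the function name `ses` are all assumptions for illustration, not the product's formula.

```python
# Assumed weights: latency is the majority, consistency a quarter, loss the rest.
W_LATENCY, W_CONSISTENCY, W_LOSS = 0.60, 0.25, 0.15


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def ses(avg_queue_ms: float, peak_queue_ms: float, loss_pct: float,
        latency_bad_ms: float = 20.0, spike_bad_ms: float = 100.0,
        loss_bad_pct: float = 2.0) -> int:
    """0-100 score; each component is 1.0 when perfect and 0.0 at its 'bad' reference."""
    latency_q = 1.0 - clamp01(avg_queue_ms / latency_bad_ms)
    consistency_q = 1.0 - clamp01(peak_queue_ms / spike_bad_ms)
    loss_q = 1.0 - clamp01(loss_pct / loss_bad_pct)
    total = W_LATENCY * latency_q + W_CONSISTENCY * consistency_q + W_LOSS * loss_q
    return round(100 * total)
```

Utilisation is deliberately not an input, which is how a subscriber running at their plan cap avoids being penalised: only delay, spikes, and loss move the score.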
Diagram 5: Subscriber Experience Score — fleet band distribution (illustrative)
The fleet view groups subscribers into four bands. The "worst-N" list surfaces the lowest-scoring subscribers instantly for proactive support outreach — before they raise a ticket. Lower score = worse experience; the goal is to push the fleet distribution toward Excellent.
Excellent · 80–100
majority of fleet at peak · sub-millisecond queuing latency
Bands: Excellent (80–100) · Good (60–79) · Fair (40–59) · Poor (0–39)
The histogram is a live fleet snapshot. The "worst-N" list is the actionable output: the N lowest-scoring subscribers, ordered, ready for proactive support review. The operator runs bngxdpctl ses for the fleet view or bngxdpctl ses <ip> for a single subscriber.
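The band boundaries and the worst-N list are simple to express in code. The band edges below are the ones given above; the function names, the sample scores, and the documentation-range IP addresses are illustrative only.

```python
from collections import Counter

BANDS = [(80, "Excellent"), (60, "Good"), (40, "Fair"), (0, "Poor")]


def band(score: int) -> str:
    for floor, name in BANDS:
        if score >= floor:
            return name
    raise ValueError("score must be between 0 and 100")


def fleet_view(scores: dict, n: int):
    """Return (band histogram, the n lowest-scoring subscribers, worst first)."""
    histogram = Counter(band(s) for s in scores.values())
    worst = sorted(scores.items(), key=lambda kv: kv[1])[:n]
    return histogram, worst


if __name__ == "__main__":
    sample = {"192.0.2.1": 93, "192.0.2.2": 58, "192.0.2.3": 71, "192.0.2.4": 35}
    print(fleet_view(sample, 2))
```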
Operator value: SES turns the NOC's "we have low latency" marketing claim into a continuous, per-subscriber measurement. Support can triage complaints in seconds. The NOC has a real SLA KPI. Product can package "guaranteed SES ≥ 80 on our premium tier." And it requires no new infrastructure — the score is computed from telemetry the suite already generates.
9.2 · Application-aware QoS — bngxdpctl app
Traffic is classified into named application categories — video-streaming, gaming, conferencing, software-updates, bulk, and others — by destination and port at line rate in the XDP fast path, with no deep packet inspection and no additional hardware. This gives operators three compounding capabilities on the same platform.
VISIBILITY
See where the bandwidth actually goes
Per-category traffic breakdown per subscriber and fleet-wide
Answers "which applications consume the most bandwidth on this node?" in real time
Informs peering and capacity decisions with real traffic-mix data rather than estimates
Visible via bngxdpctl app show at any time — no sampling delay
POLICY
Per-subscriber, per-category rate policy
Apply different rate treatment to different categories for a given subscriber
Gaming and conferencing packets can be given priority treatment; software-updates and bulk are shaped more conservatively
Policy is applied in the fast path — no separate policy engine, no additional hop
Configured via bngxdpctl app policy
TIERED PLANS
Differentiated product tiers
Define named profiles ("gaming-priority", "standard", "basic") with different per-category treatment
Assign profiles per subscriber — the correct profile applies automatically at line rate
Enables a "gaming-priority tier" as a real, measurable product with a tangible difference subscribers can feel
Managed via bngxdpctl app profile
The compounding benefit: application-aware QoS on the existing fast path means the operator can simultaneously (a) see the real app mix for capacity planning and peering decisions, (b) sell differentiated plans backed by a measurable per-category policy, and (c) give gaming and conferencing traffic the same protected-lane treatment that IFP already provides for raw interactive flows — all without adding a DPI appliance, a new hop, or a per-subscriber licence.
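Conceptually, a tiered plan is a per-subscriber profile that maps each application category to a treatment. The profile and category names below come from this section; the treatment labels, the default profile, and the lookup function are hypothetical and do not represent the `bngxdpctl app` configuration syntax.

```python
PROFILES = {
    "gaming-priority": {
        "gaming": "priority",
        "conferencing": "priority",
        "software-updates": "conservative",
        "bulk": "conservative",
    },
    "standard": {"software-updates": "conservative", "bulk": "conservative"},
    "basic": {},
}
DEFAULT_TREATMENT = "normal"

# Hypothetical assignment table, keyed by subscriber IP (documentation range).
SUBSCRIBER_PROFILE = {"192.0.2.10": "gaming-priority", "192.0.2.11": "basic"}


def treatment(subscriber_ip: str, category: str) -> str:
    """Resolve the treatment for one subscriber and one application category."""
    profile = SUBSCRIBER_PROFILE.get(subscriber_ip, "standard")
    return PROFILES[profile].get(category, DEFAULT_TREATMENT)
```

Keeping the profile as plain data is what allows the same lookup to serve both visibility (count by category) and policy (act by category).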
9.3 · Capacity planning — bngxdpctl capacity
A single command gives the operator a "how full is this node, and what runs out first?" view across every finite resource simultaneously. Rather than waiting for a fault (subscriber map full → new sessions silently fail; CGNAT port-blocks exhausted → NAT allocation fails; CPU saturated → packet loss begins), the operator sees used / headroom / status for each resource and can order capacity before exhaustion occurs.
Resource | What it measures | Why it matters | Status levels
CPU (data-plane share) | softirq and XDP processing CPU share specifically, not just total system CPU | Total CPU can look healthy while the XDP data-plane is saturated; this separates the two | OK / WARN / CRIT
Subscriber slots | Customer map entries used vs. total capacity | Subscriber map is a finite BPF map; a full map silently fails new session allocation | OK / WARN / CRIT
Connection-tracking table | Active conntrack entries used vs. table size | CT table exhaustion causes new connection drops, visible to subscribers as random failures | OK / WARN / CRIT
CGNAT port-block pool | Public-IP port-blocks allocated vs. total available, per public IP | Shows exhaustion approaching before it happens, rather than at the first NAT allocation failure under load | OK / WARN / CRIT
The output names the single tightest resource on the node — "CGNAT port-blocks at 78% — approaching WARN threshold; all other resources OK" — so the operator knows exactly where to focus. The command is bngxdpctl capacity; run it in a monitoring loop or integrate the structured JSON output into an existing NOC dashboard via bngxdpctl metrics.
Proactive vs. reactive: every finite resource on the table above has previously caused a production incident when it was discovered by a fault rather than by monitoring. Capacity-planning telemetry turns each of those into a known, graphed headroom number — ordered capacity before exhaustion, not after.
9.4 · Live diagnostics on demand — bngxdpctl debug
When troubleshooting a specific subscriber or a node anomaly, an operator can stream the node's debug detail to their console for a bounded window (typically a few minutes), after which the node auto-reverts to normal verbosity. No configuration change is required, no central logging is flooded, and no restart is needed. It is a small but meaningful "operate it safely in production" capability: the detail needed for a diagnosis is available when required and gone when it is not.
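The general pattern behind a bounded, self-reverting debug window can be sketched as follows. This is not the bngxdpctl implementation, whose internals are not described here; the class name, the level strings, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class BoundedDebugWindow:
    """Verbosity that rises to 'debug' for a fixed window, then reverts on its own."""

    def __init__(self, normal: str = "info",
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._normal = normal
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._expires_at: Optional[float] = None

    def enable(self, seconds: float) -> None:
        """Open a debug window that closes by itself after `seconds`."""
        if seconds <= 0:
            raise ValueError("window must be positive")
        self._expires_at = self._clock() + seconds

    def level(self) -> str:
        """Current verbosity: 'debug' inside the window, the normal level otherwise."""
        if self._expires_at is not None and self._clock() < self._expires_at:
            return "debug"
        return self._normal


window = BoundedDebugWindow()
window.enable(180)     # a three-minute window, as an example duration
print(window.level())  # 'debug' until the window expires, then back to 'info'
```

Because the expiry is checked on every read, nothing has to remember to switch debug off again: the revert needs no configuration change and no restart.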
10 · Cost effectiveness and monetization
Business angle
"Gaming smooth, calls clear, pages snappy — even at peak hour, even when the household is busy"
This sentence is what converts a subscriber who submits "my internet lags when someone downloads" into a loyal customer on a premium tier. The low-latency suite makes it technically true and continuously measurable. It runs on your existing XDP BNG fleet at ~2% CPU with zero new hardware — making the incremental cost per subscriber effectively zero. That creates two distinct revenue opportunities:
Silent retention improvement: deploy across all plans as a quality-of-service baseline. Reduce "lags under load" support tickets. Reduce churn among gaming households, WFH subscribers, and families with multiple simultaneous users. The ROI is in ticket deflection and retention.
Premium tier monetization: package as a "Gaming / Pro / Low-Latency" add-on tier with continuous, per-subscriber, NOC-auditable protection. Charge a £3–5/month premium (or equivalent). At 5% uptake on a 50,000-subscriber base, that is 2,500 paying subscribers and £7,500–12,500/month (£90,000–150,000/year) in new revenue for a feature with zero incremental CapEx. The protection is real, measurable, and demonstrable, not just a marketing claim.
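The premium-tier projection is plain arithmetic on three inputs: subscriber base, uptake, and monthly premium. A minimal sketch, using the example inputs from the text and function and variable names chosen purely for illustration:

```python
def monthly_premium_revenue(subscribers: int, uptake: float, price_per_month: float) -> float:
    """Monthly revenue = subscribers on the premium tier x monthly premium."""
    return subscribers * uptake * price_per_month


SUBSCRIBERS = 50_000          # example subscriber base from the text
UPTAKE = 0.05                 # 5% of subscribers take the premium tier
PRICE_LOW, PRICE_HIGH = 3.0, 5.0  # monthly premium range in GBP

low = monthly_premium_revenue(SUBSCRIBERS, UPTAKE, PRICE_LOW)
high = monthly_premium_revenue(SUBSCRIBERS, UPTAKE, PRICE_HIGH)

print(f"paying subscribers: {SUBSCRIBERS * UPTAKE:,.0f}")
print(f"monthly revenue: GBP {low:,.0f} to {high:,.0f}")
print(f"annual revenue:  GBP {low * 12:,.0f} to {high * 12:,.0f}")
```

With these inputs the result is 2,500 paying subscribers, GBP 7,500 to 12,500 per month, and GBP 90,000 to 150,000 per year. Substituting an operator's own base, uptake, and price gives the equivalent figure for its market.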
Cost comparison — low-latency protection via the suite vs. alternatives
Operators routinely buy dedicated DPI / QoE / traffic-management appliances to improve subscriber experience. The suite delivers the same outcome — measured and controlled latency under load — in pure software on the BNG you already operate.
| Approach | Hardware | Per-subscriber licence | Adds a hop / latency? | Interactive-flow protection | Self-tuning? |
| --- | --- | --- | --- | --- | --- |
| Dedicated DPI / QoE appliance | New appliance(s) per site | Typically yes | Yes: another box in the path | Varies; often per-app DPI policy | Rarely |
| BNGSOFT low-latency suite | None: existing XDP BNG | None | No: same XDP datapath | Built in · all subscribers · all flows | Yes: adaptive AQM controller |
SLA-grade, self-tuning, standards-based low-latency protection for zero added CapEx, zero per-subscriber licence cost, and zero new latency — on infrastructure you have already deployed.
11 · Fleet-scale numbers (measured + projected)
Measured on live production nodes — projected linearly to fleet scale
What the suite delivers across a large operator fleet
Every production figure below is from the three independent live BNG nodes described in Section 4. Fleet-scale projections are transparent linear extrapolations from the measured per-subscriber rates — clearly labelled. The mechanism is per-packet in XDP with no shared state; a large fleet is more independent nodes, each running identically.
13.6 M
no-loss L4S signals · Node A measured · 2,000:1 vs classic drops
~0.6 ms
avg queuing latency · Node A measured · under real load
0
new hardware required · runs on existing i40e / ixgbe / vmxnet3
| Metric | Measured (live production nodes) | Behaviour at any density |
| --- | --- | --- |
| IFP interactive packets classified | 1.15 B+ on Node A (~950 subs) · 208 M on Node B (~2,530 subs) · 95–97% are pure TCP ACKs on all nodes | Per-subscriber and per-packet: same rate regardless of node density. Scales linearly. |
| IFP drop-protection events | 216,000+ on Node A · 2.5 M on Node B (larger node, congestion burst observed) | Engages under congestion; near-zero on healthy lines. Correct behaviour, not a weakness. |
| L4S no-loss ECN signals | 13.6 M on Node A · 3.4 M on Node B · 31.9 M on Node C; zero packet loss on marked flows | Per-subscriber, per-packet: independent of node density. Scales with ECN-capable traffic share. |
| Average queuing latency under load | ~0.6 ms (Node A) · ~0.84 ms (Node B) · ~1 ms (Node C); all nodes, all peak windows | Per-node property. Independent nodes, not a shared resource. Same behaviour at any density up to ~131k subs/node map capacity. |
| CPU overhead | ~2% with the full suite (AQM + L4S + IFP) running inline in XDP on production nodes | Per-packet cost scales linearly with traffic volume. The NIC and CGNAT map capacity are the node bottlenecks, not the suite. |
| Subscriber disruption at deploy | Zero: all three nodes deployed with full throughput and subscriber count preserved | Software-update deploy. Observe-first mode available. Instant on/off per node. |
The architecture key: the suite operates per-packet in XDP — it does not coordinate across subscribers, has no central state, and does not become more expensive per subscriber as node density grows. Each BNG node runs independently. The per-node subscriber map capacity is ~131,000 subscribers; the practical node limit is the NIC link and CGNAT. At a large operator, this means the low-latency suite runs on your existing fleet — no new hardware, no new appliances, no per-subscriber licence, no central bottleneck to scale.
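The linear projection used in this section can be sketched as below. The Node A figures (13.6 M no-loss ECN signals, ~950 subscribers) and the ~131,000 map capacity come from the text; the 1,000,000-subscriber fleet is a hypothetical input, and the function names are illustrative. Measurement windows varied per node, so a per-subscriber rate derived this way applies only to that node's window and should not be compared across nodes.

```python
import math

NODE_MAP_CAPACITY = 131_000  # approximate per-node subscriber map capacity (from the text)


def per_subscriber_rate(measured_events: float, node_subscribers: int) -> float:
    """Events per subscriber over the node's own measurement window."""
    return measured_events / node_subscribers


def project_to_fleet(measured_events: float, node_subscribers: int,
                     fleet_subscribers: int) -> float:
    """Linear extrapolation: per-subscriber rate times fleet subscriber count."""
    return per_subscriber_rate(measured_events, node_subscribers) * fleet_subscribers


def min_nodes_by_map_capacity(fleet_subscribers: int) -> int:
    """Lower bound on node count from map capacity alone.

    The practical limit is the NIC link and CGNAT, so real deployments
    will generally need more nodes than this floor.
    """
    return math.ceil(fleet_subscribers / NODE_MAP_CAPACITY)


NODE_A_ECN_SIGNALS = 13_600_000  # measured on Node A
NODE_A_SUBSCRIBERS = 950         # approximate
FLEET_SUBSCRIBERS = 1_000_000    # hypothetical fleet size, for illustration only

rate = per_subscriber_rate(NODE_A_ECN_SIGNALS, NODE_A_SUBSCRIBERS)
projected = project_to_fleet(NODE_A_ECN_SIGNALS, NODE_A_SUBSCRIBERS, FLEET_SUBSCRIBERS)

print(f"ECN signals per subscriber (Node A window): {rate:,.0f}")
print(f"projected ECN signals for the fleet: {projected:,.0f}")
print(f"minimum nodes by map capacity: {min_nodes_by_map_capacity(FLEET_SUBSCRIBERS)}")
```

The projection holds only as far as the stated assumption does: per-packet processing with no shared state, so each added node behaves like the measured ones.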
The bottom line
Speed is what you sell. Latency under load is what subscribers feel — especially during peak hours, especially in multi-person households, and especially for the traffic types that matter most: gaming, video calls, voice, web browsing. The low-latency suite answers the question that the speed test never catches: "why does my internet lag when someone downloads?"
Three functions. One coordinated stack. Validated on three independent live production nodes across ~950, ~2,530, and ~400 subscribers respectively. Zero new hardware. ~2% CPU. Software-deployed on i40e, ixgbe, and vmxnet3 NICs. Instant per-node on/off. Observe first, enforce when ready.
The operator visibility layer (Section 9) completes the picture: per-subscriber Subscriber Experience Scores, application-aware QoS classification and policy, capacity-planning telemetry across every finite node resource, and live on-demand diagnostics — all on the same BNG, same datapath, same deploy. "We have low latency" becomes "we measure every subscriber's experience, continuously."
1.15 B+ interactive packets protected · 13.6 M no-loss ECN signals · ~0.6 ms avg queuing latency
3 live nodes · i40e / ixgbe / vmxnet3 · ~2% CPU · 0 new hardware · 0 per-subscriber licence + per-subscriber SES · app-aware QoS · capacity telemetry · live diagnostics
Recommendation: enable OBSERVE mode on your first node today — see your real interactive-packet volumes, ACK rates, and queuing latency on real subscribers at zero risk. Then roll out to enforce node-by-node, one command at a time. The suite runs on the fleet you already operate.
Methodology and honest framing: all production figures are from three independent live BNG nodes (anonymized as Node A, Node B, Node C — no hostnames, IP addresses, or customer names included) running on real residential traffic. Node A carries approximately 950 subscribers on an Intel i40e 40G NIC in native XDP mode. Node B carries approximately 2,530 subscribers in native XDP mode. Node C carries approximately 400 subscribers in native XDP mode. IFP and L4S measurement periods varied per node; figures reflect the measurement window available for each node. All three nodes were deployed with zero subscriber disruption and full throughput preservation.
IFP drop-protection counters: on a healthy, uncongested line the drop-protection counters (packets saved from drop or mark) stay near zero because there is nothing to protect against. This is correct behaviour. The 216,000+ events on Node A and 2.5 million events on Node B reflect real congestion periods observed on those nodes. The ACK-prioritization function operates continuously regardless of congestion state; 1.15 billion ACKs on Node A reflects this continuous always-on operation.
Game ping and before/after latency claims: the "standard BNG" game ping figure (150–250 ms during a download) is an illustrative representation of typical bufferbloat observed on residential lines with tail-drop shapers. It is consistent with published measurements in the bufferbloat literature and common operator experience, but was not directly instrumented in this production trial. The "after" game ping (~20 ms) is a reasonable lower bound given that the BNG queuing latency contribution (the dominant variable during local saturation) was measured at ~0.6–1 ms on all nodes; the remainder is path RTT to the game server, which is unchanged by the suite. These figures are clearly labelled as illustrative of the mechanism and anchored by the real queuing-latency measurements.
Hardware compatibility: i40e and ixgbe support native XDP mode; vmxnet3 (VMware) automatically uses SKB (kernel stack) XDP mode on kernel 7.0+, as the driver silently accepts but never runs native XDP on that kernel version — the automatic SKB fallback is a safety behaviour built into the BNG software. All suite features (AQM + L4S + IFP Tier 1) are supported in SKB mode.
CPU overhead: the ~2% figure reflects the full suite running inline in XDP on production nodes. The per-packet cost scales linearly with traffic volume; the stated ~2% is not a hard guarantee for all NIC types and traffic mixes.
Monetization figures: the example premium-tier revenue projection (£3–5/month, 5% uptake, 50,000 subscribers) is illustrative; actual uptake and pricing depend on the operator's market, subscriber base, and commercial decisions.
Standards references: IETF L4S: RFC 9330 (architecture), RFC 9331 (ECN protocol and L4S identifier), RFC 9332 (dual-queue coupled AQM). CoDel AQM: RFC 8289. Prepared as a management and planning overview for large-scale ISP operators. This brochure is a companion to and superset of the individual L4S Dual-Queue AQM and Interactive Flow Protection (IFP) brochures, which contain additional per-function technical depth and methodology notes.