Flow Intelligence · per-flow loss · access/transit RTT · elephant scavenging · per-sub QoE · auto-remediation

Engine Brief · Subscriber Experience in the XDP Data Plane

Flow Intelligence — Measure, Scavenge, Score & Remediate Subscriber Experience

Ping looks fine, but the customer says it "feels slow." BNGSOFT Flow Intelligence closes that gap with a loop that runs inline in XDP: it measures real per-flow download loss with no probes; scavenges the bulk/elephant flows that are actually starving the line, but only under real contention; scores each subscriber's experience 0–100; and auto-remediates the ones that suffer. Default-off and observe-first: until you opt in, the data plane is byte-identical to a box without Flow Intelligence.

AQM keeps the average queue short and Interactive Flow Protection shields the victim packet — but neither measures the subscriber's real experience, and neither penalizes the flow causing the pain. Flow Intelligence does both, at line rate, in the same program that already does CGNAT, QoS and security.

Passive

no probes · no CPE agents

measures real download loss
from the traffic already flowing

QoE 0–100

one number support can read

per subscriber + a fleet
rollup for the NOC

L4S CE-mark

demote without loss

ECN-capable flows are marked,
not dropped — loss-free relief

Default-OFF

observe before you act

recommend-only before you
ever change a packet

1 · The problem L4S and IFP don't solve

Modern AQM and Interactive Flow Protection are real wins — but they answer the wrong half of the question. A subscriber rings support: "the internet is slow." You ping their line and the latency is fine. What's actually happening is that a few bulk / elephant flows — a cloud backup, an OS update, one fat download — have filled the subscriber's queue and are starving the interactive traffic that makes the connection feel fast: the game, the video call, the page load.

And nobody in the NOC can answer the simplest question (is this subscriber actually having a good experience right now, and if not, why not?) because the BNG never measures real per-flow loss or latency. It shapes blind.

The gap: AQM keeps the average queue short. IFP shields the victim packet. Neither one measures the subscriber's experience, and neither one penalizes the cause — the dominant bulk flow that is eating the line. You get a shorter queue and a protected ACK, but no number that says "this customer is suffering," and nothing that leans on the flow doing the damage.

2 · The idea — a closed loop in the fast path

Flow Intelligence is a control loop that lives inside the XDP data plane: measure → scavenge → score → remediate. It runs inline alongside CGNAT, QoS and security in the same eBPF program — not a separate shaping box, not a hairpin to an external appliance, not a kernel scheduler.

Figure (illustrative): the measure → scavenge → score → remediate loop. Measure real per-flow download loss passively. Scavenge the bulk flows that are starving the line, but only when the subscriber is actually contended. Score each subscriber's experience as one 0–100 number. Remediate by auto-tuning how hard scavenging leans on suffering subscribers, with hysteresis. The loop runs inline in XDP next to CGNAT and QoS, not in a separate shaping box.

3 · The four pillars

MEASURE · new: access/transit RTT

Real per-subscriber loss and RTT — split into line vs internet.

Per-flow download loss inferred from TCP sequence-regression — a retransmit means a segment was lost.
Per-flow path RTT + jitter — from the TCP handshake and refreshed continuously via TCP timestamps as the flow runs.
Access RTT vs Transit RTT (new): the same passive timestamps split the latency into the access leg (BNG↔CPE, the line & Wi-Fi the customer owns) and the transit leg (BNG↔server, the internet), so support sees which side of the demarc the lag is on. Runs on native QoS-only boxes too.
No probes, no CPE agents, no synthetic traffic — it reads loss and latency out of the packets already flowing.
Reorder-aware: genuine out-of-order delivery is separated from true retransmits, so wireless/reordered paths don't inflate the number.

SCAVENGE

Lean on the cause — only when it hurts.

Detects elephant / bulk flows (cloud backup, OS update, big download).
Demotes them only under real contention — drop, or CE-mark if ECN-capable (L4S = loss-free relief).
Per-flow fairness weights the fattest flow hardest — CAKE-like — so the dominant bulk flow yields first.
Idle / uncongested subscribers are never touched.

SCORE

One number support can read.

Per-subscriber QoE 0–100 derived from the measured loss signal — low loss → high score.
A fleet rollup for the NOC: which subscribers, PoPs or boxes are suffering, right now.
Worst-sufferers list (new): names the exact subscribers hurting this moment (per-sub windowed loss), so the NOC sees who, not just how many.
Visible per-sub via sub show and fleet-wide via metrics — no per-subscriber configuration.

REMEDIATE

Fix the ones that suffer — automatically.

An auto-tuning controller leans harder on just the suffering subscribers, demoting their bulk flows more aggressively while healthy subs are left untouched, and relaxes once they recover (new: per-sub).
Hysteresis stops it flapping; a cooldown bounds how often it moves; it reacts to a recent-window loss spike, not a lifetime average.
Runs in observe (recommend-only) or enforce mode, and always layers on top of your operator baseline, never below it.

4 · What it helps

SUPPORT / NOC

Answer the unanswerable call

"Is the lag the customer's line, or the transit?"

Now answered with two numbers — access RTT (the line & Wi-Fi) vs transit RTT (the internet) — beside the loss % and QoE score. A vague "it's slow" ticket becomes a measured answer that points at the right side of the demarc, with no truck roll and no CPE agent.

SUBSCRIBER EXPERIENCE

Responsive under load

Games, calls and video stay smooth while a big download runs.

Scavenging leans on the bulk flow that's starving the line, so the interactive traffic the customer notices keeps flowing.

VISIBILITY

See the whole fleet

Per-sub QoE plus a fleet rollup, in the NOC.

Spot suffering subscribers, PoPs and boxes at a glance — with no per-subscriber configuration to maintain.

IFP protects the victim packet; Flow Intelligence penalizes the cause. Together they approximate CAKE per-flow fairness — in the data plane, at line rate, integrated with the BNG, with no kernel scheduler in the path. It is comparable in spirit to LibreQoS, Preseem and CAKE, but it runs inline in XDP on the box that already terminates the subscriber.

Where it sits. Flow Intelligence does not replace AQM or IFP — it completes them. AQM bounds the queue, IFP shields the latency-sensitive packet, and Flow Intelligence supplies the two things they were never designed to do: a measurement of real experience and a penalty on the flow causing the contention. All three run in the same XDP program.

How it works, under the hood

The marketing story above is the whole of it; the rest of this brief is for the network engineer who wants to know exactly how each pillar is implemented in the data plane and the daemon.

5 · Measurement — passive per-flow loss

A per-flow LRU hash, keyed on the subscriber four-tuple, tracks the highest end-sequence seen for the flow. Every download TCP segment is classified against that high-water mark into one of four buckets:

Classification	Test	Counts as
New data	end-sequence advances past the high-water mark	forward progress — updates the mark
Retransmit	segment falls before the mark by more than the reorder window	loss proxy — `tcp_retransmits`
Out-of-order	segment falls before the mark but within the reorder window	genuine reordering — `tcp_ooo_packets`, not loss
Pure ACK	zero-length / no new payload	ignored for loss accounting

The reorder window (default ~3000 bytes ≈ 2×MSS) is what separates genuine out-of-order delivery — which is counted separately and not treated as loss — from a true retransmit, the loss proxy. That is why a wireless or reordered path does not inflate the measured loss. Counters land in a per-CPU QoE map (tcp_retransmits, tcp_dl_segments, tcp_ooo_packets), and the loss figure is simply retransmits ÷ segments.
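The classification table and the loss ratio above can be sketched in a few lines. This is a minimal Python model, not the XDP program: the names `FlowState`, `QoeCounters`, `classify` and `seq_diff` are hypothetical, the distance to the high-water mark is assumed to be measured from the segment's end-sequence, and every payload-carrying segment is assumed to count toward `tcp_dl_segments`.

```python
from dataclasses import dataclass

REORDER_WINDOW = 3000   # bytes; the brief's default (~2x MSS)
SEQ_MOD = 1 << 32       # TCP sequence numbers wrap at 2**32


def seq_diff(a: int, b: int) -> int:
    """Signed distance a - b in 32-bit sequence space (wrap-safe)."""
    d = (a - b) % SEQ_MOD
    return d - SEQ_MOD if d >= SEQ_MOD // 2 else d


@dataclass
class FlowState:
    high_water: int  # highest end-sequence seen so far for this flow


@dataclass
class QoeCounters:
    tcp_dl_segments: int = 0
    tcp_retransmits: int = 0
    tcp_ooo_packets: int = 0

    def loss_pct(self) -> float:
        # The brief's loss figure: retransmits / segments.
        if self.tcp_dl_segments == 0:
            return 0.0
        return 100.0 * self.tcp_retransmits / self.tcp_dl_segments


def classify(flow: FlowState, counters: QoeCounters, seq: int,
             payload_len: int, reorder_window: int = REORDER_WINDOW) -> str:
    """Classify one download segment against the flow's high-water mark."""
    if payload_len == 0:
        return "pure_ack"                 # ignored for loss accounting
    counters.tcp_dl_segments += 1
    end_seq = (seq + payload_len) % SEQ_MOD
    behind = seq_diff(flow.high_water, end_seq)  # > 0: ends before the mark
    if behind < 0:
        flow.high_water = end_seq         # forward progress
        return "new_data"
    if behind > reorder_window:
        counters.tcp_retransmits += 1     # loss proxy
        return "retransmit"
    counters.tcp_ooo_packets += 1         # genuine reordering, not loss
    return "out_of_order"
```

The wrap-safe comparison matters on long flows: a download that crosses the 2^32 sequence boundary would otherwise look like one enormous regression.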

Cost. This is an ePPing-class technique: it samples the loss signal rather than tracking per-packet state for every flow, so the data-plane cost is a fraction of the line-rate budget — measurement is cheap enough to leave on everywhere.

Path RTT & jitter (new) are timed the same passive way. The upload side stamps the instant a flow’s SYN leaves into the per-flow map; the matching SYN-ACK on the way back yields one handshake RTT (now − syn_time), folded into a per-subscriber ring that gives average RTT and jitter. And it doesn’t stop at the handshake: a sampled, throttled TCP-timestamp (TSval/TSecr) match (also new) refreshes the RTT continuously through the life of a long flow, so the figure tracks the path in real time. This all runs in the native QoS-only data plane as well as full-CGNAT, so even a QoS-only box can tell support whether a subscriber’s latency lives on the access line or out in the transit.
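A rough sketch of the handshake timing and the per-subscriber ring follows. The brief does not say how the ring is sized or how jitter is computed, so the ring size of 16 and the jitter definition (mean absolute difference between consecutive samples) are assumptions, and `RttRing` and `HandshakeTimer` are hypothetical names.

```python
from collections import deque


class RttRing:
    """Fixed-size ring of RTT samples for one subscriber (size is assumed)."""

    def __init__(self, size: int = 16):
        self.samples = deque(maxlen=size)

    def add(self, rtt_us: int) -> None:
        self.samples.append(rtt_us)

    def average_us(self) -> int:
        return sum(self.samples) // len(self.samples) if self.samples else 0

    def jitter_us(self) -> int:
        # Assumed definition: mean absolute delta between consecutive samples.
        s = list(self.samples)
        if len(s) < 2:
            return 0
        return sum(abs(b - a) for a, b in zip(s, s[1:])) // (len(s) - 1)


class HandshakeTimer:
    """Stamp the SYN on upload, time the matching SYN-ACK on download."""

    def __init__(self):
        self.syn_time_us = {}   # flow key -> time the SYN was seen
        self.rings = {}         # subscriber -> RttRing

    def on_syn(self, flow_key, now_us: int) -> None:
        self.syn_time_us[flow_key] = now_us

    def on_syn_ack(self, flow_key, subscriber, now_us: int):
        syn_time = self.syn_time_us.pop(flow_key, None)
        if syn_time is None:
            return None                      # no SYN seen for this flow
        rtt_us = now_us - syn_time           # now - syn_time
        self.rings.setdefault(subscriber, RttRing()).add(rtt_us)
        return rtt_us
```

Popping the SYN stamp on the first SYN-ACK means a retransmitted SYN-ACK yields no second sample, which keeps one handshake from being counted twice.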

Access vs Transit: the split itself (new) is the elegant part. The continuous match above already times the BNG↔server leg (the subscriber’s timestamp, echoed back by the server). To isolate the access leg, the download direction records the server’s own timestamp into the same flow entry; when the subscriber’s next segment echoes that value back, the elapsed time is the BNG↔CPE round trip in isolation: the access line and the home Wi-Fi, nothing else. Two passive measurements, one flow entry, no extra map walk: the result is transit RTT (the internet/server leg) and access RTT (the customer’s own line) as two distinct per-subscriber numbers. Because the access figure also includes the subscriber device’s own response time, it reads as an honest upper bound on last-mile responsiveness under load, exactly the signal that exposes Wi-Fi and last-mile bufferbloat, and a plausibility cap keeps a stray sample from skewing the average.
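The two-sided timestamp match might look roughly like this. It is a sketch under assumptions: `TsFlowEntry`, `on_upload` and `on_download` are hypothetical names, the 2-second plausibility cap is a made-up value, and one outstanding stamp per direction stands in for the sampling and throttling the brief mentions.

```python
from dataclasses import dataclass
from typing import Optional

PLAUSIBILITY_CAP_US = 2_000_000  # hypothetical cap; the brief gives no value


@dataclass
class TsFlowEntry:
    up_tsval: Optional[int] = None    # subscriber TSval awaiting server echo
    up_time_us: int = 0
    down_tsval: Optional[int] = None  # server TSval awaiting subscriber echo
    down_time_us: int = 0


def _take_sample(sent_us: int, now_us: int) -> Optional[int]:
    sample = now_us - sent_us
    return sample if 0 <= sample <= PLAUSIBILITY_CAP_US else None


def on_upload(e: TsFlowEntry, tsval: int, tsecr: int, now_us: int):
    """Subscriber -> server packet. Returns an access RTT sample or None."""
    access = None
    if e.down_tsval is not None and tsecr == e.down_tsval:
        access = _take_sample(e.down_time_us, now_us)   # BNG <-> CPE leg
        e.down_tsval = None
    # Stamp one outstanding TSval at a time; replace it if it went stale.
    if e.up_tsval is None or now_us - e.up_time_us > PLAUSIBILITY_CAP_US:
        e.up_tsval, e.up_time_us = tsval, now_us
    return access


def on_download(e: TsFlowEntry, tsval: int, tsecr: int, now_us: int):
    """Server -> subscriber packet. Returns a transit RTT sample or None."""
    transit = None
    if e.up_tsval is not None and tsecr == e.up_tsval:
        transit = _take_sample(e.up_time_us, now_us)    # BNG <-> server leg
        e.up_tsval = None
    if e.down_tsval is None or now_us - e.down_time_us > PLAUSIBILITY_CAP_US:
        e.down_tsval, e.down_time_us = tsval, now_us
    return transit
```

Both legs share a single flow entry, which mirrors the "one flow entry, no extra map walk" point above: each packet does at most one compare and one store per direction.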

6 · Scavenging — elephants, contention-gated

An elephant is a flow that has pushed past a cumulative-byte threshold (default 3 MB) and, optionally, clears a windowed-rate gate: a per-second window measures the flow's rate and only flows above a configurable kbps qualify — so a big-but-slow flow (a long, gentle download) is left alone.
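As an illustration of the two gates, the sketch below tracks cumulative bytes and a one-second windowed rate per flow. The names `BulkFlow`, `account` and `is_elephant` are hypothetical, 3 MB is read as 3,000,000 bytes (an assumption), and the rate is taken from the last completed window.

```python
from dataclasses import dataclass
from typing import Optional

ELEPHANT_BYTES = 3_000_000   # brief default "3 MB"; decimal MB is assumed
WINDOW_US = 1_000_000        # per-second rate window


@dataclass
class BulkFlow:
    total_bytes: int = 0
    window_start_us: int = -1   # -1 means no packet seen yet
    window_bytes: int = 0
    rate_kbps: int = 0          # rate over the last completed window


def account(flow: BulkFlow, pkt_len: int, now_us: int) -> None:
    """Update cumulative bytes and the windowed rate for one packet."""
    if flow.window_start_us < 0:
        flow.window_start_us = now_us
    elapsed = now_us - flow.window_start_us
    if elapsed >= WINDOW_US:
        # bytes * 8 bits over elapsed microseconds, expressed in kbps
        flow.rate_kbps = flow.window_bytes * 8000 // elapsed
        flow.window_start_us = now_us
        flow.window_bytes = 0
    flow.total_bytes += pkt_len
    flow.window_bytes += pkt_len


def is_elephant(flow: BulkFlow, min_rate_kbps: Optional[int] = None) -> bool:
    """Byte threshold first; the rate gate applies only when configured."""
    if flow.total_bytes <= ELEPHANT_BYTES:
        return False
    if min_rate_kbps is None:
        return True
    return flow.rate_kbps > min_rate_kbps
```

For example, a flow sending 1500-byte packets every 100 ms passes the byte threshold after a few minutes but measures only about 120 kbps, so a 10,000 kbps rate gate leaves it alone.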

Demotion is contention-gated. The data plane computes the subscriber's download virtual-queue sojourn:

sojourn µs = queue-depth-bytes × 8000 ÷ shaped-rate-kbps

An idle subscriber's queue depth is ~0, so the sojourn is ~0, and the subscriber is never demoted.

Scavenging only acts when that sojourn exceeds a target (default 5000 µs, the AQM target). When the subscriber is contended, an eligible elephant packet is demoted with a configurable probability (default 50%):

ECN-capable → CE-mark

The packet is marked, not dropped — L4S-style, loss-free congestion signalling. The sender backs off without a retransmit.

Not ECN-capable → drop

A classic drop is the congestion signal, applied probabilistically to the elephant rather than to the subscriber's interactive traffic.
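Put together, the contention gate and the mark-or-drop choice can be modelled as below. The sojourn formula and the defaults (5000 µs target, 50% probability) come from the brief; the function names, the string return values and the injectable `rng` parameter are illustrative assumptions.

```python
import random

SOJOURN_TARGET_US = 5000      # brief default (the AQM target)
DEMOTE_PROBABILITY = 0.50     # brief default


def sojourn_us(queue_depth_bytes: int, shaped_rate_kbps: int) -> int:
    """Virtual-queue sojourn: bytes * 8000 / kbps gives microseconds."""
    if shaped_rate_kbps <= 0:
        return 0
    return queue_depth_bytes * 8000 // shaped_rate_kbps


def demote_action(queue_depth_bytes: int, shaped_rate_kbps: int,
                  is_elephant: bool, ecn_capable: bool,
                  probability: float = DEMOTE_PROBABILITY,
                  rng=random.random) -> str:
    """Return 'pass', 'ce_mark' or 'drop' for one download packet."""
    if not is_elephant:
        return "pass"                      # interactive traffic is untouched
    if sojourn_us(queue_depth_bytes, shaped_rate_kbps) <= SOJOURN_TARGET_US:
        return "pass"                      # not contended: never demote
    if rng() >= probability:
        return "pass"                      # probabilistic, not every packet
    return "ce_mark" if ecn_capable else "drop"
```

As a worked check of the formula: a 100 Mbps plan (100,000 kbps) with 125,000 bytes queued gives 125,000 × 8000 ÷ 100,000 = 10,000 µs, twice the default target, so eligible elephant packets on that subscriber would be demoted about half the time.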

Per-flow fairness (optional) scales that demotion probability by how many multiples of the rate threshold the flow's windowed rate represents — 2×, 4×, 8× map to progressively higher probability, computed division-free. The dominant bulk flow therefore yields hardest, while a borderline flow that is only just over the line is spared. That is the CAKE-like behaviour, expressed as a probability shift inside the XDP program.
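One way to express that scaling without a divide is to compare the flow's rate against shifted copies of the threshold and then shift the "miss" probability. The brief does not give the actual mapping, so the halving scheme, the 0 to 256 probability scale and both function names here are assumptions for illustration only.

```python
PROB_SCALE = 256  # probabilities as integers out of 256 (assumed scale)


def fairness_level(rate_kbps: int, threshold_kbps: int) -> int:
    """0 below 2x the threshold, then 1, 2, 3 at 2x, 4x, 8x. Shifts only."""
    level = 0
    if rate_kbps >= threshold_kbps << 1:
        level = 1
    if rate_kbps >= threshold_kbps << 2:
        level = 2
    if rate_kbps >= threshold_kbps << 3:
        level = 3
    return level


def scaled_probability(base: int, rate_kbps: int, threshold_kbps: int) -> int:
    """Halve the chance of escaping demotion once per level (hypothetical)."""
    miss = PROB_SCALE - base
    return PROB_SCALE - (miss >> fairness_level(rate_kbps, threshold_kbps))
```

With a 50% base (128 of 256) and a 10,000 kbps threshold, a flow at 11,000 kbps stays at 128, while flows at 20,000, 40,000 and 80,000 kbps rise to 192, 224 and 240: the fattest flow yields hardest and the result never exceeds the scale.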

7 · Score & auto-remediation

The per-subscriber QoE 0–100 is derived from the measured loss: low loss maps to a high score, rising loss pulls it down. On top of that sits the auto-remediation controller — a daemon control loop, not a data-plane component.
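The brief states only the direction of the loss-to-score mapping, not its shape, so the sketch below uses a hypothetical linear penalty (10 points per percent of loss) purely to make the idea concrete. It also shows how a worst-sufferers list could be ranked from per-subscriber windowed loss; the 100-segment noise floor and all names are assumptions.

```python
import heapq

LOSS_PENALTY_PER_PCT = 10    # hypothetical slope; tuned per deployment
MIN_WINDOW_SEGMENTS = 100    # hypothetical floor to ignore near-idle subs


def qoe_score(loss_pct: float) -> int:
    """Low loss -> high score, clamped to 0..100 (linear shape is assumed)."""
    score = 100.0 - LOSS_PENALTY_PER_PCT * loss_pct
    return int(round(max(0.0, min(100.0, score))))


def windowed_loss_pct(new_retransmits: int, new_segments: int) -> float:
    if new_segments <= 0:
        return 0.0
    return 100.0 * new_retransmits / new_segments


def worst_sufferers(window: dict, n: int = 3) -> list:
    """window maps subscriber -> (new retransmits, new segments)."""
    ranked = [
        (windowed_loss_pct(retx, segs), sub)
        for sub, (retx, segs) in window.items()
        if segs >= MIN_WINDOW_SEGMENTS
    ]
    return [(sub, loss) for loss, sub in heapq.nlargest(n, ranked)]
```

Ranking on the window rather than on lifetime counters is what lets the list name who is hurting right now instead of who had a bad hour yesterday.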

Figure (illustrative): the escalation ladder. The controller samples the box's recent windowed loss and new bulk-flow activity on a few-second cadence and walks a small ladder with hysteresis and a cooldown, bounded by a maximum probability.

The controller samples the box every few seconds and measures the loss in the most recent window (the retransmits and segments that appeared since the last sample, not a lifetime average), so it reacts to a real spike as it builds and relaxes again once it clears. It pairs that with the new bulk-flow activity in the same window, so it only acts when fresh elephant traffic is genuinely present. Hysteresis requires several consecutive bad windows to step up, and more good ones to step down; a cooldown sits between changes; a maximum probability bounds the top. It runs in one of two modes: observe logs the decision it would make and never touches the data plane, while enforce applies it. Escalation is always layered on the operator baseline, never below it, and reverts on reload or restart.

Critically, it acts per subscriber, not box-wide (new): the controller writes a small per-sub demote-boost that the scavenger reads, so a struggling subscriber’s bulk flows yield harder while every healthy subscriber on the same box is left completely untouched. Surgical, not a blanket policy.
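A compact model of that ladder for one subscriber is sketched below. Every threshold in `ControllerConfig` is a placeholder (the brief gives none beyond the 50% baseline), and `SubController` with its method names is hypothetical; the point is the shape: windowed deltas, asymmetric hysteresis, a cooldown, a ceiling, and observe versus enforce.

```python
from dataclasses import dataclass


@dataclass
class ControllerConfig:
    loss_bad_pct: float = 2.0        # placeholder thresholds, not defaults
    loss_good_pct: float = 0.5
    up_windows: int = 3              # consecutive bad windows to step up
    down_windows: int = 5            # more good windows to step down
    cooldown_s: float = 30.0
    step: int = 10                   # boost per step, percentage points
    baseline_probability: int = 50   # operator baseline (brief default)
    max_probability: int = 90
    enforce: bool = False            # False = observe (recommend-only)


class SubController:
    def __init__(self, cfg: ControllerConfig):
        self.cfg = cfg
        self.recommended_boost = 0   # what the ladder would apply
        self.applied_boost = 0       # what the data plane actually reads
        self._bad = self._good = 0
        self._last_change_s = float("-inf")
        self._prev = (0, 0, 0)       # cumulative retx, segments, bulk flows

    def effective_probability(self) -> int:
        # Layered on the baseline, so it can never fall below it.
        return self.cfg.baseline_probability + self.applied_boost

    def sample(self, now_s, retransmits, segments, bulk_flows) -> str:
        cfg = self.cfg
        d_retx = retransmits - self._prev[0]
        d_segs = segments - self._prev[1]
        d_bulk = bulk_flows - self._prev[2]
        self._prev = (retransmits, segments, bulk_flows)
        loss_pct = 100.0 * d_retx / d_segs if d_segs > 0 else 0.0

        if loss_pct >= cfg.loss_bad_pct and d_bulk > 0:
            self._bad, self._good = self._bad + 1, 0
        elif loss_pct <= cfg.loss_good_pct:
            self._bad, self._good = 0, self._good + 1
        else:
            self._bad = self._good = 0

        if now_s - self._last_change_s < cfg.cooldown_s:
            return "hold"
        ceiling = cfg.max_probability - cfg.baseline_probability
        if self._bad >= cfg.up_windows and self.recommended_boost < ceiling:
            self.recommended_boost = min(self.recommended_boost + cfg.step, ceiling)
            decision = "step_up"
        elif self._good >= cfg.down_windows and self.recommended_boost > 0:
            self.recommended_boost = max(self.recommended_boost - cfg.step, 0)
            decision = "step_down"
        else:
            return "hold"
        self._bad = self._good = 0
        self._last_change_s = now_s
        if cfg.enforce:
            self.applied_boost = self.recommended_boost
        return decision
```

In observe mode the ladder still walks and returns its decision, but `applied_boost` stays at zero, so the effective probability remains the operator baseline until enforce is switched on.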

8 · Operating it — `bngxdpctl`

Everything is driven from bngxdpctl; there is no per-subscriber configuration to maintain. A single subscriber's experience is one glance away:

$ bngxdpctl sub show 100.96.12.41
  DL Loss:     0.13 %
  Transit RTT:  18.4 ms  (internet / server leg)
  Access RTT:   9.2 ms  (line / Wi-Fi)
  FI QoE:      99/100 (good)
  DL Bulk:     36 elephant flow(s) (scavenge observe)

One subscriber, one glance: low loss, the latency split into transit vs access so support knows which side of the demarc to look at, a healthy QoE score, and 36 bulk flows currently observed, with scavenging in observe mode: recommending but not yet acting.

9 · Safety, deployment & limits

Default-OFF and observe-first. Measurement is passive and harmless — it only reads loss out of traffic that is already flowing. Scavenging and remediation act only on explicit opt-in; until you enable enforce, the data plane is byte-identical to a box without Flow Intelligence. And because demotion is contention-gated, an uncongested subscriber is never touched even after you turn it on.

MATURITY — HONEST FRAMING

What is proven, and what to validate.

Passive loss measurement is live across native QoS-only and full-CGNAT production boxes.
Elephant detection + demotion is proven under load; the auto-remediation escalation ladder is demonstrated end-to-end.
Scores and thresholds are tuned per deployment — validate before enabling enforce.

Bottom line on safety. You can deploy Flow Intelligence to measure and score every subscriber with zero risk to the data path, decide from real numbers which subscribers are suffering, and only then opt those paths into observe and — once you trust the recommendations — enforce. Every escalation reverts on reload or restart, and nothing ever drops below your operator baseline.

See every subscriber's real experience — then fix the ones that suffer, automatically.

Flow Intelligence turns "it feels slow" into a measured number, leans on the bulk flows that cause the pain — only when they actually do — and auto-tunes the relief, all inline in the XDP data plane next to CGNAT, QoS and security. Passive. Default-off. Observe-first.

Turn on measurement today, read the QoE scores, and enforce only when the numbers earn your trust.

Sources & honest framing: This is an engine brief, not a benchmark report. BNGSOFT Flow Intelligence measures per-flow download loss via TCP sequence-regression with a reorder window (an ePPing-class passive-measurement technique; see the eBPF Passive Ping project, github.com/xdp-project/bpf-examples (pping)) and applies contention-gated elephant scavenging with optional L4S-style ECN CE-marking (Low Latency, Low Loss, Scalable throughput; IETF RFC 9330, RFC 9331) and per-flow fairness in spirit comparable to CAKE (bufferbloat.net (CAKE)). Default thresholds quoted here (elephant ~3 MB, sojourn target 5000 µs, demotion probability 50%, reorder window ~3000 bytes ≈ 2×MSS) are engineering defaults and are tuned per deployment. Maturity: passive loss and RTT measurement (handshake, continuous, and the access-vs-transit split) is live across native QoS-only and full-CGNAT production boxes; elephant detection and demotion are proven under load; the auto-remediation escalation ladder is demonstrated end-to-end; scores and thresholds must be validated per deployment before enabling enforce. The larger native feature set depends on the BPF verifier instruction-ceiling raise (1M → 4M); on 1M-kernel boxes passive measurement runs but native enforce wants the 4M kernel. The example sub show snippet is illustrative of the output format, not a benchmark. Competitor and project names (LibreQoS, Preseem, CAKE, ePPing/xdp-project) are referenced for positioning only and are trademarks of their respective owners; BNGSOFT is not affiliated with them. No specific performance figures are claimed in this brief; all figures and scores are validated per deployment. Related per-topic briefs (L4S/AQM, Interactive Flow Protection, QoS, NOC2) are available alongside this guide.