linux-stable/include/trace/events
Jakub Kicinski 6025b9135f net: dqs: add NIC stall detector based on BQL
softnet_data->time_squeeze is sometimes used as a proxy for
host overload or as an indication of scheduling problems. In
practice this statistic is very noisy and has hard-to-grasp
units - e.g. is 10 squeezes a second to be expected, or high?

Delaying network (NAPI) processing leads to drops on NIC queues
but also RTT bloat, impacting pacing and CA decisions.
Stalls are a little hard to detect on the Rx side, because
there may simply not have been any packets received in a given
period of time. Packet timestamps help a little bit, but
again we don't know if packets are stale because we're
not keeping up or because someone (*cough* cgroups)
disabled IRQs for a long time.

We can, however, use Tx as a proxy for Rx stalls. Most drivers
use combined Rx+Tx NAPIs so if Tx gets starved so will Rx.
On the Tx side we know exactly when packets get queued,
and completed, so there is no uncertainty.

This patch adds stall checks to BQL. Why BQL? Because
it's a convenient place to add such checks, already
called by most drivers, and it has copious free space
in its structures (this patch adds no extra cache
references or dirtying to the fast path).

The algorithm takes one parameter, the max delay (AKA stall
threshold), and increments a counter whenever NAPI is delayed
for at least that amount of time. It also records the length
of the longest stall.

To be precise, every time NAPI has not polled for at least
stall_thrs, we check whether any Tx packets were queued
between the last NAPI run and now - stall_thrs/2.
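
The check described above can be sketched in plain user-space C.
This is a simplified model, not the code the patch adds: the
struct, the sd_* function names and the tick-based clock are all
hypothetical, and it remembers only the first Tx packet queued
since the last completion run rather than a full queueing history.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; every name here is illustrative. */
struct stall_detector {
	uint64_t stall_thrs;   /* stall threshold in ticks, 0 disables the check */
	uint64_t last_reap;    /* tick of the previous completion (NAPI) run */
	uint64_t first_queued; /* tick of the first Tx queued since last_reap */
	bool has_queued;       /* true when first_queued is valid */
	uint64_t stall_cnt;    /* number of stalls detected so far */
	uint64_t stall_max;    /* longest stall seen, in ticks */
};

/* Call when a Tx packet is handed to the NIC at tick @now. */
static void sd_queued(struct stall_detector *sd, uint64_t now)
{
	if (!sd->has_queued) {
		sd->first_queued = now;
		sd->has_queued = true;
	}
}

/* Call when Tx completions are processed (NAPI ran) at tick @now. */
static void sd_completed(struct stall_detector *sd, uint64_t now)
{
	assert(now >= sd->last_reap); /* the sketch assumes a monotonic clock */

	/* No poll for at least stall_thrs, and a packet was queued no later
	 * than now - stall_thrs/2: count a stall and track the longest one.
	 */
	if (sd->stall_thrs && now - sd->last_reap >= sd->stall_thrs &&
	    sd->has_queued && sd->first_queued <= now - sd->stall_thrs / 2) {
		uint64_t len = now - sd->first_queued;

		sd->stall_cnt++;
		if (len > sd->stall_max)
			sd->stall_max = len;
	}

	sd->last_reap = now;
	sd->has_queued = false;
}
```

In this model a long gap between completion runs counts as a
stall only if a packet was actually waiting for at least half
the threshold, so an idle queue never increments the counter.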

Unlike the classic Tx watchdog this mechanism does not
ignore stalls caused by Tx being disabled, or loss of link.
I don't think the check is worth the complexity, and
stall is a stall, whether due to host overload, flow
control, link down... doesn't matter much to the application.

We have been running this detector in production at Meta
for 2 years, with a threshold of 8ms. It's the lowest
value where false positives become rare. There's still
a constant stream of reported stalls (especially without
the ksoftirqd deferral patches reverted); those who like
their stall metrics to be 0 may prefer a higher value.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-08 10:23:26 +00:00
..
9p.h 9p: prevent read overrun in protocol dump tracepoint 2023-12-05 21:18:44 +09:00
afs.h afs: Fix error handling with lookup via FS.InlineBulkStatus 2024-01-22 22:30:14 +00:00
alarmtimer.h
asoc.h
avc.h
bcache.h
block.h fs: add CONFIG_BUFFER_HEAD 2023-08-02 09:13:09 -06:00
bpf_test_run.h
bridge.h net: bridge: Add a tracepoint for MDB overflows 2023-02-06 08:48:25 +00:00
btrfs.h btrfs: use the flags of an extent map to identify the compression type 2023-12-15 22:59:02 +01:00
cachefiles.h fscache,cachefiles: add prepare_ondemand_read() callback 2022-12-07 10:56:29 +08:00
cgroup.h
clk.h clk: Add trace events for rate requests 2022-12-07 13:54:09 -08:00
cma.h trace: cma: remove unnecessary event class cma_alloc_class 2023-04-05 19:42:58 -07:00
compaction.h mm: compaction: add trace event for fast freepages isolation 2023-06-09 16:25:43 -07:00
context_tracking.h
cpuhp.h
csd.h smp: Change function signatures to use call_single_data_t 2023-09-13 14:59:24 +02:00
damon.h mm/damon/core: use nr_accesses_bp as a source of damos_before_apply tracepoint 2023-10-04 10:32:31 -07:00
devfreq.h
devlink.h devlink: Fix TP_STRUCT_entry in trace of devlink health report 2023-02-15 19:15:44 -08:00
dlm.h fs: dlm: add plock dev tracepoints 2023-08-10 10:33:03 -05:00
dma_fence.h
erofs.h erofs: adapt folios for z_erofs_read_folio() 2023-08-23 23:47:33 +08:00
error_report.h panic: use error_report_end tracepoint on warnings 2022-01-20 08:52:55 +02:00
ext4.h ext4: remove 'needed' in trace_ext4_discard_preallocations 2024-01-18 10:52:45 -05:00
f2fs.h f2fs: add tracepoint for f2fs_vm_page_mkwrite() 2023-12-11 13:37:53 -08:00
fib.h net: Replace strlcpy with strscpy 2023-07-04 19:40:16 +01:00
fib6.h net: Replace strlcpy with strscpy 2023-07-04 19:40:16 +01:00
filelock.h
filemap.h
fs_dax.h
fscache.h fscache: Fix oops due to race with cookie_lru and use_cookie 2022-12-07 11:49:18 -08:00
fsi.h fsi: core: Add trace events for scan and unregister 2023-08-09 15:43:28 +09:30
fsi_master_aspeed.h fsi: Add trace events in initialization path 2022-02-21 19:38:54 +10:30
fsi_master_ast_cf.h
fsi_master_gpio.h
fsi_master_i2cr.h fsi: Add IBM I2C Responder virtual FSI master 2023-08-11 13:32:14 +09:30
gpio.h
gpu_mem.h
habanalabs.h accel/habanalabs: minor cosmetics update to trace file 2023-10-09 12:37:23 +03:00
handshake.h net/handshake: Trace events for TLS Alert helpers 2023-07-28 14:07:59 -07:00
host1x.h
huge_memory.h mm/khugepaged: skip shmem with userfaultfd 2023-04-18 16:29:52 -07:00
hwmon.h
i2c.h
i2c_slave.h i2c: add tracepoints for I2C slave events 2022-03-20 00:11:05 +01:00
ib_mad.h IB/mad: Don't call to function that might sleep while in atomic context 2022-11-10 10:57:15 +02:00
ib_umad.h
initcall.h
intel-sst.h
intel_ifs.h platform/x86/intel/ifs: Gen2 Scan test support 2023-10-06 13:05:18 +03:00
intel_ish.h
io_uring.h io_uring: rename trace_io_uring_submit_sqe() tracepoint 2023-04-03 07:16:15 -06:00
iocost.h blk-iocost: Trace vtime_base_rate instead of vtime_rate 2022-12-01 07:44:12 -07:00
iommu.h iommu: Remove detach_dev callback 2023-01-13 16:39:18 +01:00
ipi.h trace: Add trace_ipi_send_cpu() 2023-03-24 11:01:29 +01:00
irq.h softirq: Add trace points for tasklet entry/exit 2023-04-15 10:17:16 +02:00
irq_matrix.h
iscsi.h scsi: iscsi: tracing: Use the new __vstring() helper 2022-07-19 11:20:25 -04:00
jbd2.h jbd2: remove journal_clean_one_cp_list() 2023-07-10 23:09:21 -04:00
kmem.h mm: convert mm's rss stats into percpu_counter 2022-11-30 15:58:40 -08:00
ksm.h mm/ksm: add tracepoint for ksm advisor 2023-12-29 11:58:27 -08:00
kvm.h KVM: remove CONFIG_HAVE_KVM_IRQFD 2023-12-08 15:43:33 -05:00
kyber.h kyber: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
libata.h ata: libata: add qc->flags in ata_qc_complete_template tracepoint 2022-06-17 16:30:03 +09:00
lock.h locking/mutex: Make contention tracepoints more consistent wrt adaptive spinning 2022-04-05 10:24:36 +02:00
maple_tree.h Maple Tree: add new data structure 2022-09-26 19:46:13 -07:00
mce.h
mctp.h mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control 2022-02-09 12:00:11 +00:00
mdio.h
migrate.h mm/migrate: add nr_split to trace_mm_migrate_pages stats. 2023-10-25 16:47:13 -07:00
mlxsw.h
mmap.h mm: mmap: remove newline at the end of the trace 2023-03-23 17:18:36 -07:00
mmap_lock.h
mmc.h
mmflags.h arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
module.h
mptcp.h net: implement lockless SO_MAX_PACING_RATE 2023-10-01 19:09:54 +01:00
napi.h net: dqs: add NIC stall detector based on BQL 2024-03-08 10:23:26 +00:00
nbd.h
neigh.h neighbor: tracing: Move pin6 inside CONFIG_IPV6=y section 2023-10-18 11:16:43 +01:00
net.h net: fix net_dev_start_xmit trace event vs skb_transport_offset() 2023-07-03 09:13:23 +01:00
net_probe_common.h
netfs.h netfs: Implement a write-through caching option 2023-12-28 09:45:27 +00:00
netlink.h
nilfs2.h fs/nilfs2: Use the enum req_op and blk_opf_t types 2022-07-14 12:14:33 -06:00
nmi.h
notifier.h notifiers: add tracepoints to the notifiers infrastructure 2023-04-08 13:45:38 -07:00
objagg.h
oom.h
osnoise.h
page_isolation.h
page_pool.h page_pool: split types and declarations from page_pool.h 2023-08-07 13:05:19 -07:00
page_ref.h
pagemap.h
percpu.h include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" 2022-05-25 10:47:48 -07:00
power.h cpuidle: Add cpu_idle_miss trace event 2022-08-03 17:50:58 +02:00
power_cpu_migrate.h
preemptirq.h
printk.h
pwc.h
pwm.h pwm/tracing: Also record trace events for failed API calls 2022-12-06 12:46:23 +01:00
qdisc.h tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string 2024-03-04 09:35:54 +00:00
qla.h scsi: qla2xxx: tracing: Use the new __vstring() helper 2022-07-19 11:20:25 -04:00
qrtr.h net: qrtr: correct types of trace event parameters 2023-04-04 18:58:43 -07:00
rcu.h RCU Changes for 6.4: 2023-04-24 12:16:14 -07:00
rdma_core.h
regulator.h
rpcgss.h SUNRPC: Record gss_wrap() errors in svcauth_gss_wrap_priv() 2023-02-20 09:20:25 -05:00
rpcrdma.h svcrdma: Copy construction of svc_rqst::rq_arg to rdma_read_complete() 2024-01-07 17:54:33 -05:00
rpm.h
rseq.h tracing/rseq: Add mm_cid field to rseq_update 2022-12-27 12:52:15 +01:00
rtc.h
rv.h rv/monitor: Add the wwnr monitor 2022-07-30 14:01:30 -04:00
rwmmio.h asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info 2022-11-21 22:02:10 +01:00
rxrpc.h rxrpc: Use ktimes for call timeout tracking and set the timer lazily 2024-03-05 23:35:25 +00:00
sched.h sched: Remove vruntime from trace_sched_stat_runtime() 2023-11-15 09:57:49 +01:00
scmi.h include: trace: Add platform and channel instance references 2023-01-20 11:40:57 +00:00
scsi.h scsi: core: Trace SCSI sense data 2023-05-31 11:05:34 -04:00
sctp.h
signal.h
siox.h
skb.h net: add location to trace_consume_skb() 2023-02-20 08:28:49 +00:00
smbus.h
sock.h inet: preserve const qualifier in inet_sk() 2023-03-17 08:56:37 +00:00
sof.h ASoC: SOF: replace ipc4-loader dev_vdbg with tracepoints 2022-09-19 15:44:08 +01:00
sof_intel.h ASoC: SOF: Intel: replace dev_vdbg with tracepoints 2022-09-19 15:44:06 +01:00
spi.h spi: Fix spelling typos and acronyms capitalization 2023-07-11 14:14:32 +01:00
spmi.h spmi: trace: fix stack-out-of-bound access in SPMI tracing functions 2022-07-24 16:16:44 +02:00
sunrpc.h SUNRPC: Remove RQ_SPLICE_OK 2024-01-07 17:54:26 -05:00
sunvnet.h
swiotlb.h swiotlb: make the swiotlb_init interface more useful 2022-04-18 07:21:11 +02:00
syscalls.h
target.h
task.h tracing: Replace strlcpy with strscpy in trace/events/task.h 2023-09-01 21:00:00 -04:00
tcp.h tcp: add tracing of skbaddr in tcp_event_skb class 2024-03-07 15:29:15 +01:00
tegra_apb_dma.h
thermal_pressure.h arch_topology: Trace the update thermal pressure 2022-05-06 09:57:38 +02:00
thp.h powerpc/book3s64/mm: enable transparent pud hugepage 2023-08-18 10:12:55 -07:00
timer.h tracing/timers: Add tracepoint for tracking timer base is_idle flag 2023-12-20 16:49:38 +01:00
tlb.h
udp.h
ufs.h scsi: ufs: core: Include the SCSI ID in UFS command tracing output 2023-09-13 20:44:59 -04:00
v4l2.h
vb2.h
vmalloc.h mm: vmalloc: add free_vmap_area_noflush trace event 2022-11-08 17:37:17 -08:00
vmscan.h mm, vmscan: remove ISOLATE_UNMAPPED 2023-10-04 10:32:29 -07:00
vsock_virtio_transport_common.h vsock/virtio: MSG_ZEROCOPY flag support 2023-09-21 12:34:00 +02:00
watchdog.h watchdog: Add tracing events for the most usual watchdog events 2022-10-12 09:47:02 +02:00
wbt.h blk-wbt: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
workqueue.h workqueue: Fix type of cpu in trace event 2022-06-07 07:09:47 -10:00
writeback.h writeback: fix dereferencing NULL mapping->host on writeback_page_template 2023-06-19 13:19:31 -07:00
xdp.h net: invert the netdevice.h vs xdp.h dependency 2023-08-03 08:38:07 -07:00
xen.h x86/xen: move paravirt lazy code 2023-09-19 07:04:49 +02:00