linux-stable/kernel/bpf
Kumar Kartikeya Dwivedi f0c5941ff5 bpf: Support bpf_list_head in map values
Add the support on the map side to parse, recognize, verify, and build
metadata table for a new special field of the type struct bpf_list_head.
To parameterize the bpf_list_head for a certain value type and the
list_node member it will accept in that value type, we use BTF
declaration tags.

The definition of bpf_list_head in a map value will be done as follows:

struct foo {
	struct bpf_list_node node;
	int data;
};

struct map_value {
	struct bpf_list_head head __contains(foo, node);
};

Then, the bpf_list_head only allows adding to the list 'head' using the
bpf_list_node 'node' for the type struct foo.

The 'contains' annotation is a BTF declaration tag composed of four
parts, "contains:name:node" where the name is then used to look up the
type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
the lookup. The node defines name of the member in this type that has
the type struct bpf_list_node, which is actually used for linking into
the linked list. For now, 'kind' part is hardcoded as struct.

This allows building intrusive linked lists in BPF, using container_of
to obtain pointer to entry, while being completely type safe from the
perspective of the verifier. The verifier knows exactly the type of the
nodes, and knows that list helpers return that type at some fixed offset
where the bpf_list_node member used for this list exists. The verifier
also uses this information to disallow adding types that are not
accepted by a certain list.

For now, no elements can be added to such lists. Support for that is
coming in future patches, hence draining and freeing items is done with
a TODO that will be resolved in a future patch.

Note that the bpf_list_head_free function moves the list out to a local
variable under the lock and releases it, doing the actual draining of
the list items outside the lock. While this helps with not holding the
lock for too long pessimizing other concurrent list operations, it is
also necessary for deadlock prevention: unless every function called in
the critical section would be notrace, a fentry/fexit program could
attach and call bpf_map_update_elem again on the map, leading to the
same lock being acquired if the key matches and lead to a deadlock.
While this requires some special effort on part of the BPF programmer to
trigger and is highly unlikely to occur in practice, it is always better
if we can avoid such a condition.

While notrace would prevent this, doing the draining outside the lock
has advantages of its own, hence it is used to also fix the deadlock
related problem.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-14 21:52:45 -08:00
..
preload bpf: iterators: Build and use lightweight bootstrap version of bpftool 2022-07-15 12:01:30 -07:00
arraymap.c bpf: Consolidate spin_lock, timer management into btf_record 2022-11-03 22:19:40 -07:00
bloom_filter.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
bpf_cgrp_storage.c bpf: Implement cgroup storage available to non-cgroup-attached bpf progs 2022-10-25 23:19:19 -07:00
bpf_inode_storage.c bpf: Refactor some inode/task/sk storage functions for reuse 2022-10-25 23:19:19 -07:00
bpf_iter.c bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog() 2022-08-18 17:06:13 -07:00
bpf_local_storage.c bpf: Consolidate spin_lock, timer management into btf_record 2022-11-03 22:19:40 -07:00
bpf_lru_list.c bpf_lru_list: Read double-checked variable once without lock 2021-02-10 15:54:26 -08:00
bpf_lru_list.h printk: stop including cache.h from printk.h 2022-05-13 07:20:07 -07:00
bpf_lsm.c Networking changes for 6.1. 2022-10-04 13:38:03 -07:00
bpf_struct_ops.c bpf: Remove is_valid_bpf_tramp_flags() 2022-07-11 21:04:58 +02:00
bpf_struct_ops_types.h bpf: Add dummy BPF STRUCT_OPS for test purpose 2021-11-01 14:10:00 -07:00
bpf_task_storage.c bpf: Refactor some inode/task/sk storage functions for reuse 2022-10-25 23:19:19 -07:00
btf.c bpf: Support bpf_list_head in map values 2022-11-14 21:52:45 -08:00
cgroup.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-10-03 17:44:18 -07:00
cgroup_iter.c bpf-next-for-netdev 2022-11-02 08:18:27 -07:00
core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-10-20 17:49:10 -07:00
cpumap.c docs/bpf: Document BPF_MAP_TYPE_CPUMAP map 2022-11-11 11:32:54 -08:00
devmap.c bpf: Use bpf_map_area_alloc consistently on bpf map creation 2022-08-10 11:50:43 -07:00
disasm.c bpf: Relicense disassembler as GPL-2.0-only OR BSD-2-Clause 2021-09-02 14:49:23 +02:00
disasm.h bpf: Relicense disassembler as GPL-2.0-only OR BSD-2-Clause 2021-09-02 14:49:23 +02:00
dispatcher.c bpf: Fix dispatcher patchable function entry to 5 bytes nop 2022-10-20 18:57:51 -07:00
hashtab.c bpf: Consolidate spin_lock, timer management into btf_record 2022-11-03 22:19:40 -07:00
helpers.c bpf: Support bpf_list_head in map values 2022-11-14 21:52:45 -08:00
inode.c bpf: Convert bpf_preload.ko to use light skeleton. 2022-02-10 23:31:51 +01:00
Kconfig rcu: Make the TASKS_RCU Kconfig option be selected 2022-04-20 16:52:58 -07:00
link_iter.c bpf: Add bpf_link iterator 2022-05-10 11:20:45 -07:00
local_storage.c bpf: Consolidate spin_lock, timer management into btf_record 2022-11-03 22:19:40 -07:00
lpm_trie.c bpf: Use bpf_map_area_alloc consistently on bpf map creation 2022-08-10 11:50:43 -07:00
Makefile bpf: Implement cgroup storage available to non-cgroup-attached bpf progs 2022-10-25 23:19:19 -07:00
map_in_map.c bpf: Consolidate spin_lock, timer management into btf_record 2022-11-03 22:19:40 -07:00
map_in_map.h
map_iter.c bpf: Introduce MEM_RDONLY flag 2021-12-18 13:27:41 -08:00
memalloc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-10-24 13:44:11 -07:00
mmap_unlock_work.h bpf: Introduce helper bpf_find_vma 2021-11-07 11:54:51 -08:00
net_namespace.c net: Add includes masked by netdevice.h including uapi/bpf.h 2021-12-29 20:03:05 -08:00
offload.c bpf: Use bpf_map_area_alloc consistently on bpf map creation 2022-08-10 11:50:43 -07:00
percpu_freelist.c bpf: Simplify code by using for_each_cpu_wrap() 2022-09-10 16:18:55 -07:00
percpu_freelist.h
prog_iter.c
queue_stack_maps.c bpf: Remove unneeded memset in queue_stack_map creation 2022-08-10 11:48:22 -07:00
reuseport_array.c net: Fix suspicious RCU usage in bpf_sk_reuseport_detach() 2022-08-17 16:42:59 -07:00
ringbuf.c bpf: Add bpf_user_ringbuf_drain() helper 2022-09-21 16:24:58 -07:00
stackmap.c perf/bpf: Always use perf callchains if exist 2022-09-13 15:03:22 +02:00
syscall.c bpf: Support bpf_list_head in map values 2022-11-14 21:52:45 -08:00
sysfs_btf.c
task_iter.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
tnum.c bpf, tnums: Provably sound, faster, and more precise algorithm for tnum_mul 2021-06-01 13:34:15 +02:00
trampoline.c bpf: Remove prog->active check for bpf_lsm and bpf_iter 2022-10-25 23:11:46 -07:00
verifier.c bpf: Support bpf_list_head in map values 2022-11-14 21:52:45 -08:00