linux-stable/kernel/bpf
Daniel Borkmann 107e215c29 bpf, lru: avoid messing with eviction heuristics upon syscall lookup
commit 50b045a8c0 upstream.

One of the biggest issues we face right now with picking an LRU map
over a regular hash table is that a map walk from user space, for
example to just dump the existing entries or to remove certain ones,
will completely mess up the LRU eviction heuristics, and wrong entries
such as just-created ones will get evicted instead. The reason for
this is that we mark an entry as "in use" via bpf_lru_node_set_ref()
from the system call lookup side as well. Thus upon a walk, all
entries are marked, so the information about the actual least
recently used ones is "lost".

In the case of Cilium, where an LRU map can be used (among other
things) as a BPF-based connection tracker, the current behavior causes
disruption upon control plane changes that need to walk the map from
user space to evict certain entries. The outcome of the discussion at
bpfconf [0] was that we should simply remove the marking from the
system call side, as no good use case could be found where it is
actually needed there. Therefore this patch removes the marking for
the regular LRU and the per-CPU flavor. If there should ever be a need
in the future, the behavior could be selected via a map creation flag,
but for the reason mentioned we avoid this here.

  [0] http://vger.kernel.org/bpfconf.html
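
The effect described above can be illustrated with a toy model. This is
a minimal sketch and not the kernel implementation: every toy_* name is
hypothetical, the "map" is a fixed array, and the eviction rule (pick the
first entry whose reference bit is clear) is a deliberate simplification
of the real LRU lists. The only assumption carried over from the commit
is that a program-side lookup marks the entry as in use, while the
syscall-side lookup after this patch does not.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR 4

/* Toy entry: "ref" loosely stands in for the bit that
 * bpf_lru_node_set_ref() sets in the kernel (hypothetical model). */
struct toy_node {
	int key;
	bool ref;
};

static struct toy_node toy_map[TOY_NR];

/* Fill the map with keys 0..TOY_NR-1 and clear all reference bits. */
static void toy_reset(void)
{
	for (int i = 0; i < TOY_NR; i++) {
		toy_map[i].key = i;
		toy_map[i].ref = false;
	}
}

/* Plain search without any side effects. */
static struct toy_node *toy_find(int key)
{
	for (int i = 0; i < TOY_NR; i++)
		if (toy_map[i].key == key)
			return &toy_map[i];
	return NULL;
}

/* Lookup as done from the BPF program side: marks the entry as in use. */
static struct toy_node *toy_lookup(int key)
{
	struct toy_node *n = toy_find(key);

	if (n)
		n->ref = true;
	return n;
}

/* Lookup as done from the syscall side after the patch: no marking. */
static struct toy_node *toy_lookup_sys_only(int key)
{
	return toy_find(key);
}

/* Walk every key the way a user space dump would. With mark == true
 * this models the old behavior, with mark == false the new one. */
static void toy_walk(bool mark)
{
	for (int key = 0; key < TOY_NR; key++) {
		if (mark)
			toy_lookup(key);
		else
			toy_lookup_sys_only(key);
	}
}

/* Simplified eviction choice: first entry without a reference bit.
 * Returns -1 when every entry is marked, i.e. the usage information
 * no longer distinguishes any entry from the others. */
static int toy_pick_victim(void)
{
	for (int i = 0; i < TOY_NR; i++)
		if (!toy_map[i].ref)
			return toy_map[i].key;
	return -1;
}
```

In this model, after toy_reset() followed by toy_lookup(0) and
toy_lookup(1), toy_pick_victim() selects key 2. A walk with
toy_walk(false) leaves that choice unchanged, whereas toy_walk(true)
marks every entry and toy_pick_victim() can no longer tell the entries
apart.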

Fixes: 29ba732acb ("bpf: Add BPF_MAP_TYPE_LRU_HASH")
Fixes: 8f8449384e ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-25 18:23:48 +02:00
arraymap.c bpf: decouple btf from seq bpf fs dump and enable more maps 2018-08-13 00:52:45 +02:00
bpf_lru_list.c
bpf_lru_list.h
btf.c bpf: btf: Fix end boundary calculation for type section 2018-09-12 22:00:23 +02:00
cgroup.c bpf: introduce update_effective_progs() 2018-08-07 14:29:55 +02:00
core.c bpf: enable access to ax register also from verifier rewrite 2019-01-31 08:14:40 +01:00
cpumap.c bpf: fix redirect to map under tail calls 2018-08-17 15:56:23 -07:00
devmap.c bpf: fix redirect to map under tail calls 2018-08-17 15:56:23 -07:00
disasm.c bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
disasm.h bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
hashtab.c bpf, lru: avoid messing with eviction heuristics upon syscall lookup 2019-05-25 18:23:48 +02:00
helpers.c bpf: introduce the bpf_get_local_storage() helper function 2018-08-03 00:47:32 +02:00
inode.c bpf: relax inode permission check for retrieving bpf program 2019-05-25 18:23:47 +02:00
local_storage.c bpf: allocate local storage buffers using GFP_ATOMIC 2018-12-17 09:24:33 +01:00
lpm_trie.c bpf, lpm: fix lookup bug in map_delete_elem 2019-03-23 20:09:51 +01:00
Makefile bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY 2018-08-11 01:58:46 +02:00
map_in_map.c bpf: fix inner map masking to prevent oob under speculation 2019-01-31 08:14:41 +01:00
map_in_map.h
offload.c bpf: offload: allow program and map sharing per-ASIC 2018-07-18 15:10:34 +02:00
percpu_freelist.c bpf: fix lockdep false positive in percpu_freelist 2019-03-13 14:02:36 -07:00
percpu_freelist.h bpf: fix lockdep false positive in percpu_freelist 2019-03-13 14:02:36 -07:00
reuseport_array.c bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY 2018-08-11 01:58:46 +02:00
sockmap.c bpf: sockmap, fix transition through disconnect without close 2018-09-22 02:46:41 +02:00
stackmap.c bpf: fix lockdep false positive in stackmap 2019-03-23 20:09:48 +01:00
syscall.c bpf: add map_lookup_elem_sys_only for lookups from syscall side 2019-05-25 18:23:48 +02:00
tnum.c bpf/verifier: improve register value range tracking with ARSH 2018-04-29 08:45:53 -07:00
verifier.c bpf: do not restore dst_reg when cur_state is freed 2019-04-03 06:26:30 +02:00
xskmap.c xsk: do not call synchronize_net() under RCU read lock 2018-10-11 10:19:01 +02:00