linux-stable/kernel/bpf
Martin KaFai Lau 96eabe7a40 bpf: Allow selecting numa node during map creation
The current map creation API does not allow to provide the numa-node
preference.  The memory usually comes from where the map-creation-process
is running.  The performance is not ideal if the bpf_prog is known to
always run in a numa node different from the map-creation-process.

One of the use case is sharding on CPU to different LRU maps (i.e.
an array of LRU maps).  Here is the test result of map_perf_test on
the INNER_LRU_HASH_PREALLOC test if we force the lru map used by
CPU0 to be allocated from a remote numa node:

[ The machine has 20 cores. CPU0-9 at node 0. CPU10-19 at node 1 ]

># taskset -c 10 ./map_perf_test 512 8 1260000 8000000
5:inner_lru_hash_map_perf pre-alloc 1628380 events per sec
4:inner_lru_hash_map_perf pre-alloc 1626396 events per sec
3:inner_lru_hash_map_perf pre-alloc 1626144 events per sec
6:inner_lru_hash_map_perf pre-alloc 1621657 events per sec
2:inner_lru_hash_map_perf pre-alloc 1621534 events per sec
1:inner_lru_hash_map_perf pre-alloc 1620292 events per sec
7:inner_lru_hash_map_perf pre-alloc 1613305 events per sec
0:inner_lru_hash_map_perf pre-alloc 1239150 events per sec  #<<<

After specifying numa node:
># taskset -c 10 ./map_perf_test 512 8 1260000 8000000
5:inner_lru_hash_map_perf pre-alloc 1629627 events per sec
3:inner_lru_hash_map_perf pre-alloc 1628057 events per sec
1:inner_lru_hash_map_perf pre-alloc 1623054 events per sec
6:inner_lru_hash_map_perf pre-alloc 1616033 events per sec
2:inner_lru_hash_map_perf pre-alloc 1614630 events per sec
4:inner_lru_hash_map_perf pre-alloc 1612651 events per sec
7:inner_lru_hash_map_perf pre-alloc 1609337 events per sec
0:inner_lru_hash_map_perf pre-alloc 1619340 events per sec #<<<

This patch adds one field, numa_node, to the bpf_attr.  Since numa node 0
is a valid node, a new flag BPF_F_NUMA_NODE is also added.  The numa_node
field is honored if and only if the BPF_F_NUMA_NODE flag is set.

Numa node selection is not supported for percpu map.

This patch does not change all the kmalloc.  F.e.
'htab = kzalloc()' is not changed since the object
is small enough to stay in the cache.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-19 21:35:43 -07:00
..
arraymap.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Add percpu LRU list 2016-11-15 11:50:20 -05:00
cgroup.c bpf: BPF support for sock_ops 2017-07-01 16:15:13 -07:00
core.c bpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER 2017-08-16 15:34:13 -07:00
devmap.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
hashtab.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
helpers.c bpf: rename ARG_PTR_TO_STACK 2017-01-09 16:56:27 -05:00
inode.c bpf: Implement show_options 2017-07-06 03:31:46 -04:00
lpm_trie.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
Makefile bpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER 2017-08-16 15:34:13 -07:00
map_in_map.c bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
percpu_freelist.c bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
percpu_freelist.h bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
sockmap.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
stackmap.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
syscall.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
tnum.c bpf/verifier: track signed and unsigned min/max values 2017-08-08 17:51:34 -07:00
verifier.c bpf: Fix map-in-map checking in the verifier 2017-08-18 16:25:00 -07:00