Commit graph

1279 commits

Author SHA1 Message Date
Martin KaFai Lau
a96d1cfb2d selftests/bpf: Check bpf_sk_storage has uncharged sk_omem_alloc
This patch checks the sk_omem_alloc has been uncharged by bpf_sk_storage
during the __sk_destruct.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230901231129.578493-4-martin.lau@linux.dev
2023-09-06 11:08:47 +02:00
Xu Kuohai
c1970e26bd selftests/bpf: Fix a CI failure caused by vsock write
While commit 90f0074cd9 ("selftests/bpf: fix a CI failure caused by vsock sockmap test")
fixes a receive failure of vsock sockmap test, there is still a write failure:

Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
  ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1501
  ./test_progs:vsock_unix_redir_connectible:1501: ingress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1501
  ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1501

The reason is that the vsock connection in the test is set to ESTABLISHED state
by function virtio_transport_recv_pkt, which is executed in a workqueue thread,
so when the user space test thread runs before the workqueue thread, this
problem occurs.

To fix it, before writing the connection, wait for it to be connected.

Fixes: d61bd8c1fd ("selftests/bpf: add a test case for vsock sockmap")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230901031037.3314007-1-xukuohai@huaweicloud.com
2023-09-01 10:56:00 +02:00
Jiri Olsa
d11ae1b16b selftests/bpf: Fix d_path test
Recent commit [1] broke d_path test, because now filp_close is not called
directly from sys_close, but eventually later when the file is finally
released.

As suggested by Hou Tao we don't need to re-hook the bpf program, but just
instead we can use sys_close_range to trigger filp_close synchronously.

  [1] 021a160abf ("fs: use __fput_sync in close(2)")

Suggested-by: Hou Tao <houtao@huaweicloud.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230831141103.359810-1-jolsa@kernel.org
2023-08-31 17:18:53 +02:00
Yonghong Song
5439cfa7fe selftests/bpf: Fix flaky cgroup_iter_sleepable subtest
Occasionally, with './test_progs -j' on my vm, I will hit the
following failure:

  test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
  test_cgroup_iter_sleepable:PASS:skel_open 0 nsec
  test_cgroup_iter_sleepable:PASS:skel_load 0 nsec
  test_cgroup_iter_sleepable:PASS:attach_iter 0 nsec
  test_cgroup_iter_sleepable:PASS:iter_create 0 nsec
  test_cgroup_iter_sleepable:FAIL:cgroup_id unexpected cgroup_id: actual 1 != expected 2812
  #48/5    cgrp_local_storage/cgroup_iter_sleepable:FAIL
  #48      cgrp_local_storage:FAIL

Finally, I decided to do some investigation since the test is introduced
by myself. It turns out the reason is due to cgroup_fd with value 0.
In cgroup_iter, a cgroup_fd of value 0 means the root cgroup.

	/* from cgroup_iter.c */
        if (fd)
                cgrp = cgroup_v1v2_get_from_fd(fd);
        else if (id)
                cgrp = cgroup_get_from_id(id);
        else /* walk the entire hierarchy by default. */
                cgrp = cgroup_get_from_path("/");

That is why we got cgroup_id 1 instead of expected 2812.

Why we got a cgroup_fd 0? Nobody should really touch 'stdin' (fd 0) in
test_progs. I traced 'close' syscall with stack trace and found the root
cause, which is a bug in bpf_obj_pinning.c. Basically, the code closed
fd 0 although it should not. Fixing the bug in bpf_obj_pinning.c also
resolved the above cgroup_iter_sleepable subtest failure.

Fixes: 3b22f98e5a ("selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests")
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230827150551.1743497-1-yonghong.song@linux.dev
2023-08-30 08:45:05 +02:00
Dave Marchevsky
ba2464c86f bpf: Reenable bpf_refcount_acquire
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.

This a revert of:
 * commit f3514a5d67 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
 * commit 7deca5eae8 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-25 09:23:16 -07:00
Yonghong Song
001fedacc9 selftests/bpf: Add a local kptr test with no special fields
Add a local kptr test with no special fields in the struct. Without the
previous patch, the following warning will hit:

  [   44.683877] WARNING: CPU: 3 PID: 485 at kernel/bpf/syscall.c:660 bpf_obj_free_fields+0x220/0x240
  [   44.684640] Modules linked in: bpf_testmod(OE)
  [   44.685044] CPU: 3 PID: 485 Comm: kworker/u8:5 Tainted: G           OE      6.5.0-rc5-01703-g260d855e9b90 #248
  [   44.685827] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [   44.686693] Workqueue: events_unbound bpf_map_free_deferred
  [   44.687297] RIP: 0010:bpf_obj_free_fields+0x220/0x240
  [   44.687775] Code: e8 55 17 1f 00 49 8b 74 24 08 4c 89 ef e8 e8 14 05 00 e8 a3 da e2 ff e9 55 fe ff ff 0f 0b e9 4e fe ff
                       ff 0f 0b e9 47 fe ff ff <0f> 0b e8 d9 d9 e2 ff 31 f6 eb d5 48 83 c4 10 5b 41 5c e
  [   44.689353] RSP: 0018:ffff888106467cb8 EFLAGS: 00010246
  [   44.689806] RAX: 0000000000000000 RBX: ffff888112b3a200 RCX: 0000000000000001
  [   44.690433] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8881128ad988
  [   44.691094] RBP: 0000000000000002 R08: ffffffff81370bd0 R09: 1ffff110216231a5
  [   44.691643] R10: dffffc0000000000 R11: ffffed10216231a6 R12: ffff88810d68a488
  [   44.692245] R13: ffff88810767c288 R14: ffff88810d68a400 R15: ffff88810d68a418
  [   44.692829] FS:  0000000000000000(0000) GS:ffff8881f7580000(0000) knlGS:0000000000000000
  [   44.693484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   44.693964] CR2: 000055c7f2afce28 CR3: 000000010fee4002 CR4: 0000000000370ee0
  [   44.694513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   44.695102] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   44.695747] Call Trace:
  [   44.696001]  <TASK>
  [   44.696183]  ? __warn+0xfe/0x270
  [   44.696447]  ? bpf_obj_free_fields+0x220/0x240
  [   44.696817]  ? report_bug+0x220/0x2d0
  [   44.697180]  ? handle_bug+0x3d/0x70
  [   44.697507]  ? exc_invalid_op+0x1a/0x50
  [   44.697887]  ? asm_exc_invalid_op+0x1a/0x20
  [   44.698282]  ? btf_find_struct_meta+0xd0/0xd0
  [   44.698634]  ? bpf_obj_free_fields+0x220/0x240
  [   44.699027]  ? bpf_obj_free_fields+0x1e2/0x240
  [   44.699414]  array_map_free+0x1a3/0x260
  [   44.699763]  bpf_map_free_deferred+0x7b/0xe0
  [   44.700154]  process_one_work+0x46d/0x750
  [   44.700523]  worker_thread+0x49e/0x900
  [   44.700892]  ? pr_cont_work+0x270/0x270
  [   44.701224]  kthread+0x1ae/0x1d0
  [   44.701516]  ? kthread_blkcg+0x50/0x50
  [   44.701860]  ret_from_fork+0x34/0x50
  [   44.702178]  ? kthread_blkcg+0x50/0x50
  [   44.702508]  ret_from_fork_asm+0x11/0x20
  [   44.702880]  </TASK>

With the previous patch, there is no warnings.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824063422.203097-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:15:16 -07:00
Yafang Shao
0072e3624b selftests/bpf: Add selftest for allow_ptr_leaks
- Without prev commit

  $ tools/testing/selftests/bpf/test_progs --name=tc_bpf
  #232/1   tc_bpf/tc_bpf_root:OK
  test_tc_bpf_non_root:PASS:set_cap_bpf_cap_net_admin 0 nsec
  test_tc_bpf_non_root:PASS:disable_cap_sys_admin 0 nsec
  0: R1=ctx(off=0,imm=0) R10=fp0
  ; if ((long)(iph + 1) > (long)skb->data_end)
  0: (61) r2 = *(u32 *)(r1 +80)         ; R1=ctx(off=0,imm=0) R2_w=pkt_end(off=0,imm=0)
  ; struct iphdr *iph = (void *)(long)skb->data + sizeof(struct ethhdr);
  1: (61) r1 = *(u32 *)(r1 +76)         ; R1_w=pkt(off=0,r=0,imm=0)
  ; if ((long)(iph + 1) > (long)skb->data_end)
  2: (07) r1 += 34                      ; R1_w=pkt(off=34,r=0,imm=0)
  3: (b4) w0 = 1                        ; R0_w=1
  4: (2d) if r1 > r2 goto pc+1
  R2 pointer comparison prohibited
  processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
  test_tc_bpf_non_root:FAIL:test_tc_bpf__open_and_load unexpected error: -13
  #233/2   tc_bpf_non_root:FAIL

- With prev commit

  $ tools/testing/selftests/bpf/test_progs --name=tc_bpf
  #232/1   tc_bpf/tc_bpf_root:OK
  #232/2   tc_bpf/tc_bpf_non_root:OK
  #232     tc_bpf:OK
  Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230823020703.3790-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-23 09:37:29 -07:00
Yonghong Song
fb30159426 selftests/bpf: Add a failure test for bpf_kptr_xchg() with local kptr
For a bpf_kptr_xchg() with local kptr, if the map value kptr type and
allocated local obj type does not match, with the previous patch,
the below verifier error message will be logged:
  R2 is of type <allocated local obj type> but <map value kptr type> is expected

Without the previous patch, the test will have unexpected success.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230822050058.2887354-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22 09:43:55 -07:00
Jiri Olsa
8909a9392b selftests/bpf: Add extra link to uprobe_multi tests
Attaching extra program to same functions system wide for api
and link tests.

This way we can test the pid filter works properly when there's
extra system wide consumer on the same uprobe that will trigger
the original uprobe handler.

We expect to have the same counts as before.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-29-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
d571efae0f selftests/bpf: Add uprobe_multi pid filter tests
Running api and link tests also with pid filter and checking
the probe gets executed only for specific pid.

Spawning extra process to trigger attached uprobes and checking
we get correct counts from executed programs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-28-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
e7cf9a48f8 selftests/bpf: Add uprobe_multi cookie test
Adding test for cookies setup/retrieval in uprobe_link uprobes
and making sure bpf_get_attach_cookie works properly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-27-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
85209e839f selftests/bpf: Add uprobe_multi usdt bench test
Adding test that attaches 50k usdt probes in usdt_multi binary.

After the attach is done we run the binary and make sure we get
proper amount of hits.

With current uprobes:

  # perf stat --null ./test_progs -n 254/6
  #254/6   uprobe_multi_test/bench_usdt:OK
  #254     uprobe_multi_test:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

   Performance counter stats for './test_progs -n 254/6':

      1353.659680562 seconds time elapsed

With uprobe_multi link:

  # perf stat --null ./test_progs -n 254/6
  #254/6   uprobe_multi_test/bench_usdt:OK
  #254     uprobe_multi_test:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

   Performance counter stats for './test_progs -n 254/6':

         0.322046364 seconds time elapsed

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-26-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
3706919ee0 selftests/bpf: Add uprobe_multi bench test
Adding test that attaches 50k uprobes in uprobe_multi binary.

After the attach is done we run the binary and make sure we
get proper amount of hits.

The resulting attach/detach times on my setup:

  test_bench_attach_uprobe:PASS:uprobe_multi__open 0 nsec
  test_bench_attach_uprobe:PASS:uprobe_multi__attach 0 nsec
  test_bench_attach_uprobe:PASS:uprobes_count 0 nsec
  test_bench_attach_uprobe: attached in   0.346s
  test_bench_attach_uprobe: detached in   0.419s
  #262/5   uprobe_multi_test/bench_uprobe:OK

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-24-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
a93d22ea60 selftests/bpf: Add uprobe_multi link test
Adding uprobe_multi test for bpf_link_create attach function.

Testing attachment using the struct bpf_link_create_opts.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-22-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
ffc6890361 selftests/bpf: Add uprobe_multi api test
Adding uprobe_multi test for bpf_program__attach_uprobe_multi
attach function.

Testing attachment using glob patterns and via bpf_uprobe_multi_opts
paths/syms fields.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-21-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
75b3715720 selftests/bpf: Add uprobe_multi skel test
Adding uprobe_multi test for skeleton load/attach functions,
to test skeleton auto attach for uprobe_multi link.

Test that bpf_get_func_ip works properly for uprobe_multi
attachment.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-20-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:27 -07:00
Jiri Olsa
3830d04a74 selftests/bpf: Move get_time_ns to testing_helpers.h
We'd like to have single copy of get_time_ns used b bench and test_progs,
but we can't just include bench.h, because of conflicting 'struct env'
objects.

Moving get_time_ns to testing_helpers.h which is being included by both
bench and test_progs objects.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-19-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-21 15:51:26 -07:00
Dave Marchevsky
63ae8eb2c5 selftests/bpf: Add CO-RE relocs kfunc flavors tests
This patch adds selftests that exercise kfunc flavor relocation
functionality added in the previous patch. The actual kfunc defined
in kernel/bpf/helpers.c is:

  struct task_struct *bpf_task_acquire(struct task_struct *p)

The following relocation behaviors are checked:

  struct task_struct *bpf_task_acquire___one(struct task_struct *name)
    * Should succeed despite differing param name

  struct task_struct *bpf_task_acquire___two(struct task_struct *p, void *ctx)
    * Should fail because there is no two-param bpf_task_acquire

  struct task_struct *bpf_task_acquire___three(void *ctx)
    * Should fail because, despite vmlinux's bpf_task_acquire having one param,
      the types don't match

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230817225353.2570845-2-davemarchevsky@fb.com
2023-08-18 18:12:59 +02:00
Yan Zhai
6c77997bc6 selftests/bpf: Add lwt_xmit tests for BPF_REROUTE
There is no lwt test case for BPF_REROUTE yet. Add test cases for both
normal and abnormal situations. The abnormal situation is set up with an
fq qdisc on the reroute target device. Without proper fixes, overflow
this qdisc queue limit (to trigger a drop) would panic the kernel.

Signed-off-by: Yan Zhai <yan@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/62c8ddc1e924269dcf80d2e8af1a1e632cee0b3a.1692326837.git.yan@cloudflare.com
2023-08-18 16:05:27 +02:00
Yan Zhai
43a7c3ef8a selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT
There is no lwt_xmit test case for BPF_REDIRECT yet. Add test cases for
both normal and abnormal situations. For abnormal test cases, devices
are set down or have its carrier set down. Without proper fixes,
BPF_REDIRECT to either ingress or egress of such device would panic the
kernel.

Signed-off-by: Yan Zhai <yan@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/96bf435243641939d9c9da329fab29cb45f7df22.1692326837.git.yan@cloudflare.com
2023-08-18 16:05:26 +02:00
Geliang Tang
ddba122428 selftests/bpf: Add mptcpify test
Implement a new test program mptcpify: if the family is AF_INET or
AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or
IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in
update_socket_protocol().

Extend the MPTCP test base, add a selftest test_mptcpify() for the
mptcpify case. Open and load the mptcpify test prog to mptcpify the
TCP sockets dynamically, then use start_server() and connect_to_fd()
to create a TCP socket, but actually what's created is an MPTCP
socket, which can be verified through 'getsockopt(SOL_PROTOCOL)'
and 'getsockopt(MPTCP_INFO)'.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/364e72f307e7bb38382ec7442c182d76298a9c41.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-16 11:42:33 -07:00
Geliang Tang
2077465502 selftests/bpf: Fix error checks of mptcp open_and_load
Return libbpf_get_error(), instead of -EIO, for the error from
mptcp_sock__open_and_load().

Load success means prog_fd and map_fd are always valid. So drop these
unneeded ASSERT_GE checks for them in mptcp run_test().

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/db5fcb93293df9ab173edcbaf8252465b80da6f2.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-16 10:22:16 -07:00
Geliang Tang
97c9c65208 selftests/bpf: Add two mptcp netns helpers
Add two netns helpers for mptcp tests: create_netns() and
cleanup_netns(). Use them in test_base().

These new helpers will be re-used in the following commits
introducing new tests.

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/7506371fb6c417b401cc9d7365fe455754f4ba3f.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-16 10:22:16 -07:00
Yafang Shao
23cf7aa539 selftests/bpf: Add selftest for fill_link_info
Add selftest for the fill_link_info of uprobe, kprobe and tracepoint.
The result:

  $ tools/testing/selftests/bpf/test_progs --name=fill_link_info
  #79/1    fill_link_info/kprobe_link_info:OK
  #79/2    fill_link_info/kretprobe_link_info:OK
  #79/3    fill_link_info/kprobe_invalid_ubuff:OK
  #79/4    fill_link_info/tracepoint_link_info:OK
  #79/5    fill_link_info/uprobe_link_info:OK
  #79/6    fill_link_info/uretprobe_link_info:OK
  #79/7    fill_link_info/kprobe_multi_link_info:OK
  #79/8    fill_link_info/kretprobe_multi_link_info:OK
  #79/9    fill_link_info/kprobe_multi_invalid_ubuff:OK
  #79      fill_link_info:OK
  Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED

The test case for kprobe_multi won't be run on aarch64, as it is not
supported.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230813141900.1268-3-laoar.shao@gmail.com
2023-08-16 16:44:28 +02:00
Daniel Borkmann
ccd9a8be2e selftests/bpf: Add various more tcx test cases
Add several new tcx test cases to improve test coverage. This also includes
a few new tests with ingress instead of clsact qdisc, to cover the fix from
commit dc644b540a ("tcx: Fix splat in ingress_destroy upon tcx_entry_free").

  # ./test_progs -t tc
  [...]
  #234     tc_links_after:OK
  #235     tc_links_append:OK
  #236     tc_links_basic:OK
  #237     tc_links_before:OK
  #238     tc_links_chain_classic:OK
  #239     tc_links_chain_mixed:OK
  #240     tc_links_dev_cleanup:OK
  #241     tc_links_dev_mixed:OK
  #242     tc_links_ingress:OK
  #243     tc_links_invalid:OK
  #244     tc_links_prepend:OK
  #245     tc_links_replace:OK
  #246     tc_links_revision:OK
  #247     tc_opts_after:OK
  #248     tc_opts_append:OK
  #249     tc_opts_basic:OK
  #250     tc_opts_before:OK
  #251     tc_opts_chain_classic:OK
  #252     tc_opts_chain_mixed:OK
  #253     tc_opts_delete_empty:OK
  #254     tc_opts_demixed:OK
  #255     tc_opts_detach:OK
  #256     tc_opts_detach_after:OK
  #257     tc_opts_detach_before:OK
  #258     tc_opts_dev_cleanup:OK
  #259     tc_opts_invalid:OK
  #260     tc_opts_mixed:OK
  #261     tc_opts_prepend:OK
  #262     tc_opts_replace:OK
  #263     tc_opts_revision:OK
  [...]
  Summary: 44/38 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/8699efc284b75ccdc51ddf7062fa2370330dc6c0.1692029283.git.daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-14 21:54:59 -07:00
Yipeng Zou
811915db67 selftests/bpf: Fix repeat option when kfunc_call verification fails
There is no way where topts.repeat can be set to 1 when tc_test fails.
Fix the typo where the break statement slipped by one line.

Fixes: fb66223a24 ("selftests/bpf: add test for accessing ctx from syscall program type")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230814031434.3077944-1-zouyipeng@huawei.com
2023-08-14 18:03:00 +02:00
Jakub Kicinski
6a1ed1430d bpf-next pull-request 2023-08-09
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRdM/uy1Ege0+EN1fNar9k/UBDW4wUCZNRx8QAKCRBar9k/UBDW
 46MBAQC3YDFsEfPzX4P7ZnlM5Lf1NynjNbso5bYW0TF/dp/Y+gD+M8wdM5Vj2Mb0
 Zr56TnwCJei0kGBemiel4sStt3e4qwY=
 =+0u+
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2023-08-09

We've added 19 non-merge commits during the last 6 day(s) which contain
a total of 25 files changed, 369 insertions(+), 141 deletions(-).

The main changes are:

1) Fix array-index-out-of-bounds access when detaching from an
   already empty mprog entry from Daniel Borkmann.

2) Adjust bpf selftest because of a recent llvm change
   related to the cpu-v4 ISA from Eduard Zingerman.

3) Add uprobe support for the bpf_get_func_ip helper from Jiri Olsa.

4) Fix a KASAN splat due to the kernel incorrectly accepted
   an invalid program using the recent cpu-v4 instruction from
   Yonghong Song.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  bpf: btf: Remove two unused function declarations
  bpf: lru: Remove unused declaration bpf_lru_promote()
  selftests/bpf: relax expected log messages to allow emitting BPF_ST
  selftests/bpf: remove duplicated functions
  bpf, docs: Fix small typo and define semantics of sign extension
  selftests/bpf: Add bpf_get_func_ip test for uprobe inside function
  selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entry
  bpf: Add support for bpf_get_func_ip helper for uprobe program
  selftests/bpf: Add a movsx selftest for sign-extension of R10
  bpf: Fix an incorrect verification success with movsx insn
  bpf, docs: Formalize type notation and function semantics in ISA standard
  bpf: change bpf_alu_sign_string and bpf_movsx_string to static
  libbpf: Use local includes inside the library
  bpf: fix bpf_dynptr_slice() to stop return an ERR_PTR.
  bpf: fix inconsistent return types of bpf_xdp_copy_buf().
  selftests/bpf: fix the incorrect verification of port numbers.
  selftests/bpf: Add test for detachment on empty mprog entry
  bpf: Fix mprog detachment for empty mprog entry
  bpf: bpf_struct_ops: Remove unnecessary initial values of variables
====================

Link: https://lore.kernel.org/r/20230810055123.109578-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10 14:12:34 -07:00
Jakub Kicinski
4d016ae42e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/intel/igc/igc_main.c
  06b412589e ("igc: Add lock to safeguard global Qbv variables")
  d3750076d4 ("igc: Add TransmissionOverrun counter")

drivers/net/ethernet/microsoft/mana/mana_en.c
  a7dfeda6fd ("net: mana: Fix MANA VF unload when hardware is unresponsive")
  a9ca9f9cef ("page_pool: split types and declarations from page_pool.h")
  92272ec410 ("eth: add missing xdp.h includes in drivers")

net/mptcp/protocol.h
  511b90e392 ("mptcp: fix disconnect vs accept race")
  b8dc6d6ce9 ("mptcp: fix rcv buffer auto-tuning")

tools/testing/selftests/net/mptcp/mptcp_join.sh
  c8c101ae39 ("selftests: mptcp: join: fix 'implicit EP' test")
  03668c65d1 ("selftests: mptcp: join: rework detailed report")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10 14:10:53 -07:00
Xu Kuohai
a4b7193d8e selftests/bpf: Add sockmap test for redirecting partial skb data
Add a test case to check whether sockmap redirection works correctly
when data length returned by stream_parser is less than skb->len.

In addition, this test checks whether strp_done is called correctly.
The reason is that we returns skb->len - 1 from the stream_parser, so
the last byte in the skb will be held by strp->skb_head. Therefore,
if strp_done is not called to free strp->skb_head, we'll get a memleak
warning.

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20230804073740.194770-5-xukuohai@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-09 20:29:02 -07:00
Xu Kuohai
90f0074cd9 selftests/bpf: fix a CI failure caused by vsock sockmap test
BPF CI has reported the following failure:

Error: #200/79 sockmap_listen/sockmap VSOCK test_vsock_redir
  Error: #200/79 sockmap_listen/sockmap VSOCK test_vsock_redir
  ./test_progs:vsock_unix_redir_connectible:1506: egress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1506
  ./test_progs:vsock_unix_redir_connectible:1506: ingress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1506
  ./test_progs:vsock_unix_redir_connectible:1506: egress: write: Transport endpoint is not connected
  vsock_unix_redir_connectible:FAIL:1506
  ./test_progs:vsock_unix_redir_connectible:1514: ingress: recv() err, errno=11
  vsock_unix_redir_connectible:FAIL:1514
  ./test_progs:vsock_unix_redir_connectible:1518: ingress: vsock socket map failed, a != b
  vsock_unix_redir_connectible:FAIL:1518
  ./test_progs:vsock_unix_redir_connectible:1525: ingress: want pass count 1, have 0

It’s because the recv(... MSG_DONTWAIT) syscall in the test case is
called before the queued work sk_psock_backlog() in the kernel finishes
executing. So the data to be read is still queued in psock->ingress_skb
and cannot be read by the user program. Therefore, the non-blocking
recv() reads nothing and reports an EAGAIN error.

So replace recv(... MSG_DONTWAIT) with xrecv_nonblock(), which calls
select() to wait for data to be readable or timeout before calls recv().

Fixes: d61bd8c1fd ("selftests/bpf: add a test case for vsock sockmap")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20230804073740.194770-4-xukuohai@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-09 20:29:02 -07:00
Eduard Zingerman
898f55f50a selftests/bpf: relax expected log messages to allow emitting BPF_ST
Update [1] to LLVM BPF backend seeks to enable generation of BPF_ST
instruction when CPUv4 is selected. This affects expected log messages
for the following selftests:
- log_fixup/missing_map
- spin_lock/lock_id_mapval_preserve
- spin_lock/lock_id_innermapval_preserve

Expected messages in these tests hard-code instruction numbers for BPF
programs compiled from C. These instruction numbers change when
BPF_ST is allowed because single BPF_ST instruction replaces a pair of
BPF_MOV/BPF_STX instructions, e.g.:

    r1 = 42;
    *(u32 *)(r10 - 8) = r1;  --->  *(u32 *)(r10 - 8) = 42;

This commit updates expected log messages to avoid matching specific
instruction numbers (program position still could be uniquely
identified).

[1] https://reviews.llvm.org/D140804
    "[BPF] support for BPF_ST instruction in codegen"

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230808162755.392606-1-eddyz87@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-08 17:02:22 -07:00
Kui-Feng Lee
96ead1e702 selftests/bpf: remove duplicated functions
The file cgroup_tcp_skb.c contains redundant implementations of the similar
functions (create_server_sock_v6(), connect_client_server_v6() and
get_sock_port_v6()) found in network_helpers.c. Let's eliminate these
duplicated functions.

Changes from v1:

 - Remove get_sock_port_v6() as well.

v1: https://lore.kernel.org/all/20230807193840.567962-1-thinker.li@gmail.com/

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20230808162858.326871-1-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-08 17:01:19 -07:00
Jiri Olsa
7febf573a5 selftests/bpf: Add bpf_get_func_ip test for uprobe inside function
Adding get_func_ip test for uprobe inside function that validates
the get_func_ip helper returns correct probe address value.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230807085956.2344866-4-jolsa@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07 16:42:58 -07:00
Jiri Olsa
e43163ed1c selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entry
Adding get_func_ip tests for uprobe on function entry that
validates that bpf_get_func_ip returns proper values from
both uprobe and return uprobe.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230807085956.2344866-3-jolsa@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07 16:42:58 -07:00
Kui-Feng Lee
9eab71bd88 selftests/bpf: fix the incorrect verification of port numbers.
Check port numbers before calling htons().

According to Dan Carpenter's report, Smatch identified incorrect port
number checks. It is expected that the returned port number is an integer,
with negative numbers indicating errors. However, the value was mistakenly
verified after being translated by htons().

Major changes from v1:

 - Move the variable 'port' to the same line of 'err'.

Fixes: 539c7e67aa ("selftests/bpf: Verify that the cgroup_skb filters receive expected packets.")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/bpf/cafd6585-d5a2-4096-b94f-7556f5aa7737@moroto.mountain/
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20230804165831.173627-1-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-04 10:20:33 -07:00
Daniel Borkmann
21ce6abe17 selftests/bpf: Add test for detachment on empty mprog entry
Add a detachment test case with miniq present to assert that with and
without the miniq we get the same error.

  # ./test_progs -t tc_opts
  #244     tc_opts_after:OK
  #245     tc_opts_append:OK
  #246     tc_opts_basic:OK
  #247     tc_opts_before:OK
  #248     tc_opts_chain_classic:OK
  #249     tc_opts_delete_empty:OK
  #250     tc_opts_demixed:OK
  #251     tc_opts_detach:OK
  #252     tc_opts_detach_after:OK
  #253     tc_opts_detach_before:OK
  #254     tc_opts_dev_cleanup:OK
  #255     tc_opts_invalid:OK
  #256     tc_opts_mixed:OK
  #257     tc_opts_prepend:OK
  #258     tc_opts_replace:OK
  #259     tc_opts_revision:OK
  Summary: 16/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20230804131112.11012-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-04 09:35:39 -07:00
Leon Hwang
7fedbf32fc selftests/bpf: Add testcase for xdp attaching failure tracepoint
Add a test case for the tracepoint of xdp attaching failure by bpf
tracepoint when attach XDP to a device with invalid flags option.

The bpf tracepoint retrieves error message from the tracepoint, and
then put the error message to a perf buffer. The testing code receives
error message from perf buffer, and then ASSERT "Invalid XDP flags for
BPF link attachment".

Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20230801142621.7925-3-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-02 14:21:12 -07:00
Daniel Xu
c313eae739 bpf: selftests: Add defrag selftests
These selftests tests 2 major scenarios: the BPF based defragmentation
can successfully be done and that packet pointers are invalidated after
calls to the kfunc. The logic is similar for both ipv4 and ipv6.

In the first scenario, we create a UDP client and UDP echo server. The
the server side is fairly straightforward: we attach the prog and simply
echo back the message.

The on the client side, we send fragmented packets to and expect the
reassembled message back from the server.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/33e40fdfddf43be93f2cb259303f132f46750953.1689970773.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28 16:52:08 -07:00
Yonghong Song
0c606571ae selftests/bpf: Test ldsx with more complex cases
The following ldsx cases are tested:
  - signed readonly map value
  - read/write map value
  - probed memory
  - not-narrowed ctx field access
  - narrowed ctx field access.

Without previous proper verifier/git handling, the test will fail.

If cpuv4 is not supported either by compiler or by jit,
the test will be skipped.

  # ./test_progs -t ldsx_insn
  #113/1   ldsx_insn/map_val and probed_memory:SKIP
  #113/2   ldsx_insn/ctx_member_sign_ext:SKIP
  #113/3   ldsx_insn/ctx_member_narrow_sign_ext:SKIP
  #113     ldsx_insn:SKIP
  Summary: 1/0 PASSED, 3 SKIPPED, 0 FAILED

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011336.3723434-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:17 -07:00
Yonghong Song
613dad4980 selftests/bpf: Add unit tests for new gotol insn
Add unit tests for gotol insn.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011329.3721881-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:17 -07:00
Yonghong Song
de1c26809e selftests/bpf: Add unit tests for new sdiv/smod insns
Add unit tests for sdiv/smod insns.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011321.3720500-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:17 -07:00
Yonghong Song
79dbabc175 selftests/bpf: Add unit tests for new bswap insns
Add unit tests for bswap insns.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011314.3720109-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:17 -07:00
Yonghong Song
f02ec3ff3f selftests/bpf: Add unit tests for new sign-extension mov insns
Add unit tests for movsx insns.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011309.3719295-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:17 -07:00
Yonghong Song
147c8f4470 selftests/bpf: Add unit tests for new sign-extension load insns
Add unit tests for new ldsx insns. The test includes sign-extension
with a single value or with a value range.

If cpuv4 is not supported due to
  (1) older compiler, e.g., less than clang version 18, or
  (2) test runner test_progs and test_progs-no_alu32 which tests
      cpu v2 and v3, or
  (3) non-x86_64 arch not supporting new insns in jit yet,
a dummy program is added with below output:
  #318/1   verifier_ldsx/cpuv4 is not supported by compiler or jit, use a dummy test:OK
  #318     verifier_ldsx:OK
to indicate the test passed with a dummy test instead of actually
testing cpuv4. I am using a dummy prog to avoid changing the
verifier testing infrastructure. Once clang 18 is widely available
and other architectures support cpuv4, at least for CI run,
the dummy program can be removed.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011304.3719139-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-27 18:54:16 -07:00
Daniel Borkmann
22408d58a4 selftests/bpf: Test that SO_REUSEPORT can be used with sk_assign helper
We use two programs to check that the new reuseport logic is executed
appropriately.

The first is a TC clsact program which bpf_sk_assigns
the skb to a UDP or TCP socket created by user space. Since the test
communicates via lo we see both directions of packets in the eBPF.
Traffic ingressing to the reuseport socket is identified by looking
at the destination port. For TCP, we additionally need to make sure
that we only assign the initial SYN packets towards our listening
socket. The network stack then creates a request socket which
transitions to ESTABLISHED after the 3WHS.

The second is a reuseport program which shares the fact that
it has been executed with user space. This tells us that the delayed
lookup mechanism is working.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Cc: Joe Stringer <joe@cilium.io>
Link: https://lore.kernel.org/r/20230720-so-reuseport-v6-8-7021b683cdae@isovalent.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-07-25 14:02:47 -07:00
Daniel Borkmann
c6d479b334 selftests/bpf: Add mprog API tests for BPF tcx links
Add a big batch of test coverage to assert all aspects of the tcx link API:

  # ./vmtest.sh -- ./test_progs -t tc_links
  [...]
  #225     tc_links_after:OK
  #226     tc_links_append:OK
  #227     tc_links_basic:OK
  #228     tc_links_before:OK
  #229     tc_links_chain_classic:OK
  #230     tc_links_dev_cleanup:OK
  #231     tc_links_invalid:OK
  #232     tc_links_prepend:OK
  #233     tc_links_replace:OK
  #234     tc_links_revision:OK
  Summary: 10/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20230719140858.13224-9-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19 10:07:28 -07:00
Daniel Borkmann
cd13c91d92 selftests/bpf: Add mprog API tests for BPF tcx opts
Add a big batch of test coverage to assert all aspects of the tcx opts
attach, detach and query API:

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  #238     tc_opts_after:OK
  #239     tc_opts_append:OK
  #240     tc_opts_basic:OK
  #241     tc_opts_before:OK
  #242     tc_opts_chain_classic:OK
  #243     tc_opts_demixed:OK
  #244     tc_opts_detach:OK
  #245     tc_opts_detach_after:OK
  #246     tc_opts_detach_before:OK
  #247     tc_opts_dev_cleanup:OK
  #248     tc_opts_invalid:OK
  #249     tc_opts_mixed:OK
  #250     tc_opts_prepend:OK
  #251     tc_opts_replace:OK
  #252     tc_opts_revision:OK
  Summary: 15/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20230719140858.13224-8-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19 10:07:28 -07:00
Dave Marchevsky
f3514a5d67 selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled
The test added in previous patch will fail with bpf_refcount_acquire
disabled. Until all races are fixed and bpf_refcount_acquire is
re-enabled on bpf-next, disable the test so CI doesn't complain.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-6-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-18 17:23:10 -07:00
Dave Marchevsky
fdf48dc2d0 selftests/bpf: Add rbtree test exercising race which 'owner' field prevents
This patch adds a runnable version of one of the races described by
Kumar in [0]. Specifically, this interleaving:

(rbtree1 and list head protected by lock1, rbtree2 protected by lock2)

Prog A                          Prog B
======================================
n = bpf_obj_new(...)
m = bpf_refcount_acquire(n)
kptr_xchg(map, m)

                                m = kptr_xchg(map, NULL)
                                lock(lock2)
				bpf_rbtree_add(rbtree2, m->r, less)
				unlock(lock2)

lock(lock1)
bpf_list_push_back(head, n->l)
/* make n non-owning ref */
bpf_rbtree_remove(rbtree1, n->r)
unlock(lock1)

The above interleaving, the node's struct bpf_rb_node *r can be used to
add it to either rbtree1 or rbtree2, which are protected by different
locks. If the node has been added to rbtree2, we should not be allowed
to remove it while holding rbtree1's lock.

Before changes in the previous patch in this series, the rbtree_remove
in the second part of Prog A would succeed as the verifier has no way of
knowing which tree owns a particular node at verification time. The
addition of 'owner' field results in bpf_rbtree_remove correctly
failing.

The test added in this patch splits "Prog A" above into two separate BPF
programs - A1 and A2 - and uses a second mapval + kptr_xchg to pass n
from A1 to A2 similarly to the pass from A1 to B. If the test is run
without the fix applied, the remove will succeed.

Kumar's example had the two programs running on separate CPUs. This
patch doesn't do this as it's not necessary to exercise the broken
behavior / validate fixed behavior.

  [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-18 17:23:10 -07:00
Dave Marchevsky
c3c510ce43 bpf: Add 'owner' field to bpf_{list,rb}_node
As described by Kumar in [0], in shared ownership scenarios it is
necessary to do runtime tracking of {rb,list} node ownership - and
synchronize updates using this ownership information - in order to
prevent races. This patch adds an 'owner' field to struct bpf_list_node
and bpf_rb_node to implement such runtime tracking.

The owner field is a void * that describes the ownership state of a
node. It can have the following values:

  NULL           - the node is not owned by any data structure
  BPF_PTR_POISON - the node is in the process of being added to a data
                   structure
  ptr_to_root    - the pointee is a data structure 'root'
                   (bpf_rb_root / bpf_list_head) which owns this node

The field is initially NULL (set by bpf_obj_init_field default behavior)
and transitions states in the following sequence:

  Insertion: NULL -> BPF_PTR_POISON -> ptr_to_root
  Removal:   ptr_to_root -> NULL

Before a node has been successfully inserted, it is not protected by any
root's lock, and therefore two programs can attempt to add the same node
to different roots simultaneously. For this reason the intermediate
BPF_PTR_POISON state is necessary. For removal, the node is protected
by some root's lock so this intermediate hop isn't necessary.

Note that bpf_list_pop_{front,back} helpers don't need to check owner
before removing as the node-to-be-removed is not passed in as input and
is instead taken directly from the list. Do the check anyways and
WARN_ON_ONCE in this unexpected scenario.

Selftest changes in this patch are entirely mechanical: some BTF
tests have hardcoded struct sizes for structs that contain
bpf_{list,rb}_node fields, those were adjusted to account for the new
sizes. Selftest additions to validate the owner field are added in a
further patch in the series.

  [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-4-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-18 17:23:10 -07:00