Commit graph

949101 commits

Author SHA1 Message Date
Alexei Starovoitov
07be4c4a3e bpf: Add bpf_copy_from_user() helper.
Sleepable BPF programs can now use copy_from_user() to access user memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
1e6c62a882 bpf: Introduce sleepable BPF programs
Introduce sleepable BPF programs that can request such property for themselves
via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
to use helpers like bpf_copy_from_user() that might sleep. At present only
fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
when they are attached to kernel functions that are known to allow sleeping.

The non-sleepable programs are relying on implicit rcu_read_lock() and
migrate_disable() to protect life time of programs, maps that they use and
per-cpu kernel structures used to pass info between bpf programs and the
kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
should not be enclosed in migrate_disable() as well. Therefore
rcu_read_lock_trace is used to protect the life time of sleepable progs.

There are many networking and tracing program types. In many cases the
'struct bpf_prog *' pointer itself is rcu protected within some other kernel
data structure and the kernel code is using rcu_dereference() to load that
program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
Instead sleepable bpf programs are allowed with bpf trampoline only. The
program pointers are hard-coded into generated assembly of bpf trampoline and
synchronize_rcu_tasks_trace() is used to protect the life time of the program.
The same trampoline can hold both sleepable and non-sleepable progs.

When rcu_read_lock_trace is held it means that some sleepable bpf program is
running from bpf trampoline. Those programs can use bpf arrays and preallocated
hash/lru maps. These map types are waiting on programs to complete via
synchronize_rcu_tasks_trace();

Updates to trampoline now has to do synchronize_rcu_tasks_trace() and
synchronize_rcu_tasks() to wait for sleepable progs to finish and for
trampoline assembly to finish.

This is the first step of introducing sleepable progs. Eventually dynamically
allocated hash maps can be allowed and networking program types can become
sleepable too.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
76cd61739f mm/error_inject: Fix allow_error_inject function signatures.
'static' and 'static noinline' function attributes make no guarantees that
gcc/clang won't optimize them. The compiler may decide to inline 'static'
function and in such case ALLOW_ERROR_INJECT becomes meaningless. The compiler
could have inlined __add_to_page_cache_locked() in one callsite and didn't
inline in another. In such case injecting errors into it would cause
unpredictable behavior. It's worse with 'static noinline' which won't be
inlined, but it still can be optimized. Like the compiler may decide to remove
one argument or constant propagate the value depending on the callsite.

To avoid such issues make sure that these functions are global noinline.

Fixes: af3b854492 ("mm/page_alloc.c: allow error injection")
Fixes: cfcbfb1382 ("mm/filemap.c: enable error injection at add_to_page_cache()")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-2-alexei.starovoitov@gmail.com
2020-08-28 21:20:32 +02:00
Martin KaFai Lau
d557ea39a5 bpf: selftests: Add test for different inner map size
This patch tests the inner map size can be different
for reuseport_sockarray but has to be the same for
arraymap.  A new subtest "diff_size" is added for this.

The existing test is moved to a subtest "lookup_update".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200828011819.1970825-1-kafai@fb.com
2020-08-28 15:41:30 +02:00
Martin KaFai Lau
134fede4ee bpf: Relax max_entries check for most of the inner map types
Most of the maps do not use max_entries during verification time.
Thus, those map_meta_equal() do not need to enforce max_entries
when it is inserted as an inner map during runtime.  The max_entries
check is removed from the default implementation bpf_map_meta_equal().

The prog_array_map and xsk_map are exception.  Its map_gen_lookup
uses max_entries to generate inline lookup code.  Thus, they will
implement its own map_meta_equal() to enforce max_entries.
Since there are only two cases now, the max_entries check
is not refactored and stays in its own .c file.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200828011813.1970516-1-kafai@fb.com
2020-08-28 15:41:30 +02:00
Martin KaFai Lau
f4d0525921 bpf: Add map_meta_equal map ops
Some properties of the inner map is used in the verification time.
When an inner map is inserted to an outer map at runtime,
bpf_map_meta_equal() is currently used to ensure those properties
of the inserting inner map stays the same as the verification
time.

In particular, the current bpf_map_meta_equal() checks max_entries which
turns out to be too restrictive for most of the maps which do not use
max_entries during the verification time.  It limits the use case that
wants to replace a smaller inner map with a larger inner map.  There are
some maps do use max_entries during verification though.  For example,
the map_gen_lookup in array_map_ops uses the max_entries to generate
the inline lookup code.

To accommodate differences between maps, the map_meta_equal is added
to bpf_map_ops.  Each map-type can decide what to check when its
map is used as an inner map during runtime.

Also, some map types cannot be used as an inner map and they are
currently black listed in bpf_map_meta_alloc() in map_in_map.c.
It is not unusual that the new map types may not aware that such
blacklist exists.  This patch enforces an explicit opt-in
and only allows a map to be used as an inner map if it has
implemented the map_meta_equal ops.  It is based on the
discussion in [1].

All maps that support inner map has its map_meta_equal points
to bpf_map_meta_equal in this patch.  A later patch will
relax the max_entries check for most maps.  bpf_types.h
counts 28 map types.  This patch adds 23 ".map_meta_equal"
by using coccinelle.  -5 for
	BPF_MAP_TYPE_PROG_ARRAY
	BPF_MAP_TYPE_(PERCPU)_CGROUP_STORAGE
	BPF_MAP_TYPE_STRUCT_OPS
	BPF_MAP_TYPE_ARRAY_OF_MAPS
	BPF_MAP_TYPE_HASH_OF_MAPS

The "if (inner_map->inner_map_meta)" check in bpf_map_meta_alloc()
is moved such that the same error is returned.

[1]: https://lore.kernel.org/bpf/20200522022342.899756-1-kafai@fb.com/

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200828011806.1970400-1-kafai@fb.com
2020-08-28 15:41:30 +02:00
Yonghong Song
b0c9eb3781 bpf: Make bpf_link_info.iter similar to bpf_iter_link_info
bpf_link_info.iter is used by link_query to return bpf_iter_link_info
to user space. Fields may be different, e.g., map_fd vs. map_id, so
we cannot reuse the exact structure. But make them similar, e.g.,

  struct bpf_link_info {
     /* common fields */
     union {
	struct { ... } raw_tracepoint;
	struct { ... } tracing;
	...
	struct {
	    /* common fields for iter */
	    union {
		struct {
		    __u32 map_id;
		} map;
		/* other structs for other targets */
	    };
	};
    };
 };

so the structure is extensible the same way as bpf_iter_link_info.

Fixes: 6b0a249a30 ("bpf: Implement link_query for bpf iterators")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200828051922.758950-1-yhs@fb.com
2020-08-28 14:33:24 +02:00
Jesper Dangaard Brouer
661b37cd43 tools, bpf/build: Cleanup feature files on make clean
The system for "Auto-detecting system features" located under
tools/build/ are (currently) used by perf, libbpf and bpftool. It can
contain stalled feature detection files, which are not cleaned up by
libbpf and bpftool on make clean (side-note: perf tool is correct).

Fix this by making the users invoke the make clean target.

Some details about the changes. The libbpf Makefile already had a
clean-config target (which seems to be copy-pasted from perf), but this
target was not "connected" (a make dependency) to clean target. Choose
not to rename target as someone might be using it. Did change the output
from "CLEAN config" to "CLEAN feature-detect", to make it more clear
what happens.

This is related to the complaint and troubleshooting in the following
link: https://lore.kernel.org/lkml/20200818122007.2d1cfe2d@carbon/

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/lkml/20200818122007.2d1cfe2d@carbon/
Link: https://lore.kernel.org/bpf/159851841661.1072907.13770213104521805592.stgit@firesoul
2020-08-28 14:04:27 +02:00
Andrii Nakryiko
2e80be60c4 libbpf: Fix compilation warnings for 64-bit printf args
Fix compilation warnings due to __u64 defined differently as `unsigned long`
or `unsigned long long` on different architectures (e.g., ppc64le differs from
x86-64). Also cast one argument to size_t to fix printf warning of similar
nature.

Fixes: eacaaed784 ("libbpf: Implement enum value-based CO-RE relocations")
Fixes: 50e09460d9 ("libbpf: Skip well-known ELF sections when iterating ELF")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200827041109.3613090-1-andriin@fb.com
2020-08-26 22:13:38 -07:00
Yonghong Song
f5493c514c selftests/bpf: Add verifier tests for xor operation
Added some test_verifier bounds check test cases for
xor operations.
  $ ./test_verifier
  ...
  #78/u bounds check for reg = 0, reg xor 1 OK
  #78/p bounds check for reg = 0, reg xor 1 OK
  #79/u bounds check for reg32 = 0, reg32 xor 1 OK
  #79/p bounds check for reg32 = 0, reg32 xor 1 OK
  #80/u bounds check for reg = 2, reg xor 3 OK
  #80/p bounds check for reg = 2, reg xor 3 OK
  #81/u bounds check for reg = any, reg xor 3 OK
  #81/p bounds check for reg = any, reg xor 3 OK
  #82/u bounds check for reg32 = any, reg32 xor 3 OK
  #82/p bounds check for reg32 = any, reg32 xor 3 OK
  #83/u bounds check for reg > 0, reg xor 3 OK
  #83/p bounds check for reg > 0, reg xor 3 OK
  #84/u bounds check for reg32 > 0, reg32 xor 3 OK
  #84/p bounds check for reg32 > 0, reg32 xor 3 OK
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200825064609.2018077-1-yhs@fb.com
2020-08-26 21:47:32 -07:00
Yonghong Song
2921c90d47 bpf: Fix a verifier failure with xor
bpf selftest test_progs/test_sk_assign failed with llvm 11 and llvm 12.
Compared to llvm 10, llvm 11 and 12 generates xor instruction which
is not handled properly in verifier. The following illustrates the
problem:

  16: (b4) w5 = 0
  17: ... R5_w=inv0 ...
  ...
  132: (a4) w5 ^= 1
  133: ... R5_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
  ...
  37: (bc) w8 = w5
  38: ... R5=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
          R8_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
  ...
  41: (bc) w3 = w8
  42: ... R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ...
  45: (56) if w3 != 0x0 goto pc+1
   ... R3_w=inv0 ...
  46: (b7) r1 = 34
  47: R1_w=inv34 R7=pkt(id=0,off=26,r=38,imm=0)
  47: (0f) r7 += r1
  48: R1_w=invP34 R3_w=inv0 R7_w=pkt(id=0,off=60,r=38,imm=0)
  48: (b4) w9 = 0
  49: R1_w=invP34 R3_w=inv0 R7_w=pkt(id=0,off=60,r=38,imm=0)
  49: (69) r1 = *(u16 *)(r7 +0)
  invalid access to packet, off=60 size=2, R7(id=0,off=60,r=38)
  R7 offset is outside of the packet

At above insn 132, w5 = 0, but after w5 ^= 1, we give a really conservative
value of w5. At insn 45, in reality the condition should be always false.
But due to conservative value for w3, the verifier evaluates it could be
true and this later leads to verifier failure complaining potential
packet out-of-bound access.

This patch implemented proper XOR support in verifier.
In the above example, we have:
  132: R5=invP0
  132: (a4) w5 ^= 1
  133: R5_w=invP1
  ...
  37: (bc) w8 = w5
  ...
  41: (bc) w3 = w8
  42: R3_w=invP1
  ...
  45: (56) if w3 != 0x0 goto pc+1
  47: R3_w=invP1
  ...
  processed 353 insns ...
and the verifier can verify the program successfully.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200825064608.2017937-1-yhs@fb.com
2020-08-26 21:47:32 -07:00
Alex Gartrell
ef05afa66c libbpf: Fix unintentional success return code in bpf_object__load
There are code paths where EINVAL is returned directly without setting
errno. In that case, errno could be 0, which would mask the
failure. For example, if a careless programmer set log_level to 10000
out of laziness, they would have to spend a long time trying to figure
out why.

Fixes: 4f33ddb4e3 ("libbpf: Propagate EPERM to caller on program load")
Signed-off-by: Alex Gartrell <alexgartrell@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200826075549.1858580-1-alexgartrell@gmail.com
2020-08-26 15:05:35 -07:00
Alexei Starovoitov
1fc0e18b6e Merge branch 'resolve_prog_type'
Udip Pant says:

====================
This patch series adds changes in verifier to make decisions such as granting
of read / write access or enforcement of return code status based on
the program type of the target program while using dynamic program
extension (of type BPF_PROG_TYPE_EXT).

The BPF_PROG_TYPE_EXT type can be used to extend types such as XDP, SKB
and others. Since the BPF_PROG_TYPE_EXT program type on itself is just a
placeholder for those, we need this extended check for those extended
programs to actually work with proper access, while using this option.

Patch #1 includes changes in the verifier.
Patch #2 adds selftests to verify write access on a packet for a valid
extension program type
Patch #3 adds selftests to verify proper check for the return code
Patch #4 adds selftests to ensure access permissions and restrictions
for some map types such sockmap.

Changelogs:
  v2 -> v3:
    * more comprehensive resolution of the program type in the verifier
      based on the target program (and not just for the packet access)
    * selftests for checking return code and map access
    * Also moved this patch to 'bpf-next' from 'bpf' tree
  v1 -> v2:
    * extraction of the logic to resolve prog type into a separate method
    * selftests to check for packet access for a valid freplace prog
====================

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-08-26 12:50:36 -07:00
Udip Pant
1410620cf2 selftests/bpf: Test for map update access from within EXT programs
This adds further tests to ensure access permissions and restrictions
are applied properly for some map types such as sock-map.
It also adds another negative tests to assert static functions cannot be
replaced. In the 'unreliable' mode it still fails with error 'tracing progs
cannot use bpf_spin_lock yet' with the change in the verifier

Signed-off-by: Udip Pant <udippant@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825232003.2877030-5-udippant@fb.com
2020-08-26 12:47:56 -07:00
Udip Pant
50d19736af selftests/bpf: Test for checking return code for the extended prog
This adds test to enforce same check for the return code for the extended prog
as it is enforced for the target program. It asserts failure for a
return code, which is permitted without the patch in this series, while
it is restricted after the application of this patch.

Signed-off-by: Udip Pant <udippant@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825232003.2877030-4-udippant@fb.com
2020-08-26 12:47:56 -07:00
Udip Pant
6dc03dc713 selftests/bpf: Add test for freplace program with write access
This adds a selftest that tests the behavior when a freplace target program
attempts to make a write access on a packet. The expectation is that the read or write
access is granted based on the program type of the linked program and
not itself (which is of type, for e.g., BPF_PROG_TYPE_EXT).

This test fails without the associated patch on the verifier.

Signed-off-by: Udip Pant <udippant@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825232003.2877030-3-udippant@fb.com
2020-08-26 12:47:56 -07:00
Udip Pant
7e40781cc8 bpf: verifier: Use target program's type for access verifications
This patch adds changes in verifier to make decisions such as granting
of read / write access or enforcement of return code status based on
the program type of the target program while using dynamic program
extension (of type BPF_PROG_TYPE_EXT).

The BPF_PROG_TYPE_EXT type can be used to extend types such as XDP, SKB
and others. Since the BPF_PROG_TYPE_EXT program type on itself is just a
placeholder for those, we need this extended check for those extended
programs to actually work with proper access, while using this option.

Specifically, it introduces following changes:
- may_access_direct_pkt_data:
    allow access to packet data based on the target prog
- check_return_code:
    enforce return code based on the target prog
    (currently, this check is skipped for EXT program)
- check_ld_abs:
    check for 'may_access_skb' based on the target prog
- check_map_prog_compatibility:
    enforce the map compatibility check based on the target prog
- may_update_sockmap:
    allow sockmap update based on the target prog

Some other occurrences of prog->type is left as it without replacing
with the 'resolved' type:
- do_check_common() and check_attach_btf_id():
    already have specific logic to handle the EXT prog type
- jit_subprogs() and bpf_check():
    Not changed for jit compilation or while inferring env->ops

Next few patches in this series include selftests for some of these cases.

Signed-off-by: Udip Pant <udippant@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825232003.2877030-2-udippant@fb.com
2020-08-26 12:47:56 -07:00
Colin Ian King
7100ff7c62 selftests/bpf: Fix spelling mistake "scoket" -> "socket"
There is a spelling mistake in a check error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200826085907.43095-1-colin.king@canonical.com
2020-08-26 09:19:34 -07:00
Jiri Olsa
d83971761f selftests/bpf: Fix open call in trigger_fstat_events
Alexei reported compile breakage on newer systems with
following error:

  In file included from /usr/include/fcntl.h:290:0,
  4814                 from ./test_progs.h:29,
  4815                 from
  .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:3:
  4816In function ‘open’,
  4817    inlined from ‘trigger_fstat_events’ at
  .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:50:10,
  4818    inlined from ‘test_d_path’ at
  .../bpf-next/tools/testing/selftests/bpf/prog_tests/d_path.c:119:6:
  4819/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:4: error: call to
  ‘__open_missing_mode’ declared with attribute error: open with O_CREAT
  or O_TMPFILE in second argument needs 3 arguments
  4820    __open_missing_mode ();
  4821    ^~~~~~~~~~~~~~~~~~~~~~

We're missing permission bits as 3rd argument
for open call with O_CREAT flag specified.

Fixes: e4d1af4b16 ("selftests/bpf: Add test for d_path helper")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200826101845.747617-1-jolsa@kernel.org
2020-08-26 07:20:48 -07:00
Jiri Olsa
cd04b04de1 selftests/bpf: Add set test to resolve_btfids
Adding test to for sets resolve_btfids. We're checking that
testing set gets properly resolved and sorted.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-15-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Jiri Olsa
e4d1af4b16 selftests/bpf: Add test for d_path helper
Adding test for d_path helper which is pretty much
copied from Wenbo Zhang's test for bpf_get_fd_path,
which never made it in.

The test is doing fstat/close on several fd types,
and verifies we got the d_path helper working on
kernel probes for vfs_getattr/filp_close functions.

Original-patch-by: Wenbo Zhang <ethercflow@gmail.com>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-14-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Jiri Olsa
762f851568 selftests/bpf: Add verifier test for d_path helper
Adding verifier test for attaching tracing program and
calling d_path helper from within and testing that it's
allowed for dentry_open function and denied for 'd_path'
function with appropriate error.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-13-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Jiri Olsa
68a26bc792 bpf: Update .BTF_ids section in btf.rst with sets info
Updating btf.rst doc with info about BTF_SET_START/END macros.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-12-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Jiri Olsa
6e22ab9da7 bpf: Add d_path helper
Adding d_path helper function that returns full path for
given 'struct path' object, which needs to be the kernel
BTF 'path' object. The path is returned in buffer provided
'buf' of size 'sz' and is zero terminated.

  bpf_d_path(&file->f_path, buf, size);

The helper calls directly d_path function, so there's only
limited set of function it can be called from. Adding just
very modest set for the start.

Updating also bpf.h tools uapi header and adding 'path' to
bpf_helpers_doc.py script.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-11-jolsa@kernel.org
2020-08-25 15:41:15 -07:00
Jiri Olsa
eae2e83e62 bpf: Add BTF_SET_START/END macros
Adding support to define sorted set of BTF ID values.

Following defines sorted set of BTF ID values:

  BTF_SET_START(btf_allowlist_d_path)
  BTF_ID(func, vfs_truncate)
  BTF_ID(func, vfs_fallocate)
  BTF_ID(func, dentry_open)
  BTF_ID(func, vfs_getattr)
  BTF_ID(func, filp_close)
  BTF_SET_END(btf_allowlist_d_path)

It defines following 'struct btf_id_set' variable to access
values and count:

  struct btf_id_set btf_allowlist_d_path;

Adding 'allowed' callback to struct bpf_func_proto, to allow
verifier the check on allowed callers.

Adding btf_id_set_contains function, which will be used by
allowed callbacks to verify the caller's BTF ID value is
within allowed set.

Also removing extra '\' in __BTF_ID_LIST macro.

Added BTF_SET_START_GLOBAL macro for global sets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-10-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
faaf4a790d bpf: Add btf_struct_ids_match function
Adding btf_struct_ids_match function to check if given address provided
by BTF object + offset is also address of another nested BTF object.

This allows to pass an argument to helper, which is defined via parent
BTF object + offset, like for bpf_d_path (added in following changes):

  SEC("fentry/filp_close")
  int BPF_PROG(prog_close, struct file *file, void *id)
  {
    ...
    ret = bpf_d_path(&file->f_path, ...

The first bpf_d_path argument is hold by verifier as BTF file object
plus offset of f_path member.

The btf_struct_ids_match function will walk the struct file object and
check if there's nested struct path object on the given offset.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-9-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
1c6d28a6ac bpf: Factor btf_struct_access function
Adding btf_struct_walk function that walks through the
struct type + given offset and returns following values:

  enum bpf_struct_walk_result {
       /* < 0 error */
       WALK_SCALAR = 0,
       WALK_PTR,
       WALK_STRUCT,
  };

WALK_SCALAR - when SCALAR_VALUE is found
WALK_PTR    - when pointer value is found, its ID is stored
              in 'next_btf_id' output param
WALK_STRUCT - when nested struct object is found, its ID is stored
              in 'next_btf_id' output param

It will be used in following patches to get all nested
struct objects for given type and offset.

The btf_struct_access now calls btf_struct_walk function,
as long as it gets nested structs as return value.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-8-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
dafe58fc19 bpf: Remove recursion call in btf_struct_access
Andrii suggested we can simply jump to again label
instead of making recursion call.

Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-7-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
887c31a39c bpf: Add type_id pointer as argument to __btf_resolve_size
Adding type_id pointer as argument to __btf_resolve_size
to return also BTF ID of the resolved type. It will be
used in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-6-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
69ff304792 bpf: Add elem_id pointer as argument to __btf_resolve_size
If the resolved type is array, make btf_resolve_size return also
ID of the elem type. It will be needed in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-5-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
6298399bfc bpf: Move btf_resolve_size into __btf_resolve_size
Moving btf_resolve_size into __btf_resolve_size and
keeping btf_resolve_size public with just first 3
arguments, because the rest of the arguments are not
used by outside callers.

Following changes are adding more arguments, which
are not useful to outside callers. They will be added
to the __btf_resolve_size function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-4-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
a5f53b1d59 tools resolve_btfids: Add support for set symbols
The set symbol does not have the unique number suffix,
so we need to give it a special parsing function.

This was omitted in the first batch, because there was
no set support yet, so it slipped in the testing.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-3-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa
193a983c5b tools resolve_btfids: Add size check to get_id function
To make sure we don't crash on malformed symbols.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-2-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Alexei Starovoitov
2532f849b5 bpf: Disallow BPF_PRELOAD in allmodconfig builds
The CC_CAN_LINK checks that the host compiler can link, but bpf_preload
relies on libbpf which in turn needs libelf to be present during linking.
allmodconfig runs in odd setups with cross compilers and missing host
libraries like libelf. Instead of extending kconfig with every possible
library that bpf_preload might need disallow building BPF_PRELOAD in
such build-only configurations.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-08-25 15:23:46 -07:00
KP Singh
cd324d7abb bpf: Add selftests for local_storage
inode_local_storage:

* Hook to the file_open and inode_unlink LSM hooks.
* Create and unlink a temporary file.
* Store some information in the inode's bpf_local_storage during
  file_open.
* Verify that this information exists when the file is unlinked.

sk_local_storage:

* Hook to the socket_post_create and socket_bind LSM hooks.
* Open and bind a socket and set the sk_storage in the
  socket_post_create hook using the start_server helper.
* Verify if the information is set in the socket_bind hook.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-8-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh
30897832d8 bpf: Allow local storage to be used from LSM programs
Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used
in LSM programs. These helpers are not used for tracing programs
(currently) as their usage is tied to the life-cycle of the object and
should only be used where the owning object won't be freed (when the
owning object is passed as an argument to the LSM hook). Thus, they
are safer to use in LSM hooks than tracing. Usage of local storage in
tracing programs will probably follow a per function based whitelist
approach.

Since the UAPI helper signature for bpf_sk_storage expect a bpf_sock,
it, leads to a compilation warning for LSM programs, it's also updated
to accept a void * pointer instead.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-7-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh
8ea636848a bpf: Implement bpf_local_storage for inodes
Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of storage is managed with the life-cycle of the inode.
i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob which are now stackable and can co-exist with other LSMs.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh
450af8d0f6 bpf: Split bpf_local_storage to bpf_sk_storage
A purely mechanical change:

	bpf_sk_storage.c = bpf_sk_storage.c + bpf_local_storage.c
	bpf_sk_storage.h = bpf_sk_storage.h + bpf_local_storage.h

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-5-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh
f836a56e84 bpf: Generalize bpf_sk_storage
Refactor the functionality in bpf_sk_storage.c so that concept of
storage linked to kernel objects can be extended to other objects like
inode, task_struct etc.

Each new local storage will still be a separate map and provide its own
set of helpers. This allows for future object specific extensions and
still share a lot of the underlying implementation.

This includes the changes suggested by Martin in:

  https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/

adding new map operations to support bpf_local_storage maps:

* storages for different kernel objects to optionally have different
  memory charging strategy (map_local_storage_charge,
  map_local_storage_uncharge)
* Functionality to extract the storage pointer from a pointer to the
  owning object (map_owner_storage_ptr)

Co-developed-by: Martin KaFai Lau <kafai@fb.com>

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-4-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh
4cc9ce4e73 bpf: Generalize caching for sk_storage.
Provide the a ability to define local storage caches on a per-object
type basis. The caches and caching indices for different objects should
not be inter-mixed as suggested in:

  https://lore.kernel.org/bpf/20200630193441.kdwnkestulg5erii@kafai-mbp.dhcp.thefacebook.com/

  "Caching a sk-storage at idx=0 of a sk should not stop an
  inode-storage to be cached at the same idx of a inode."

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-3-kpsingh@chromium.org
2020-08-25 15:00:03 -07:00
KP Singh
1f00d375af bpf: Renames in preparation for bpf_local_storage
A purely mechanical change to split the renaming from the actual
generalization.

Flags/consts:

  SK_STORAGE_CREATE_FLAG_MASK	BPF_LOCAL_STORAGE_CREATE_FLAG_MASK
  BPF_SK_STORAGE_CACHE_SIZE	BPF_LOCAL_STORAGE_CACHE_SIZE
  MAX_VALUE_SIZE		BPF_LOCAL_STORAGE_MAX_VALUE_SIZE

Structs:

  bucket			bpf_local_storage_map_bucket
  bpf_sk_storage_map		bpf_local_storage_map
  bpf_sk_storage_data		bpf_local_storage_data
  bpf_sk_storage_elem		bpf_local_storage_elem
  bpf_sk_storage		bpf_local_storage

The "sk" member in bpf_local_storage is also updated to "owner"
in preparation for changing the type to void * in a subsequent patch.

Functions:

  selem_linked_to_sk			selem_linked_to_storage
  selem_alloc				bpf_selem_alloc
  __selem_unlink_sk			bpf_selem_unlink_storage_nolock
  __selem_link_sk			bpf_selem_link_storage_nolock
  selem_unlink_sk			__bpf_selem_unlink_storage
  sk_storage_update			bpf_local_storage_update
  __sk_storage_lookup			bpf_local_storage_lookup
  bpf_sk_storage_map_free		bpf_local_storage_map_free
  bpf_sk_storage_map_alloc		bpf_local_storage_map_alloc
  bpf_sk_storage_map_alloc_check	bpf_local_storage_map_alloc_check
  bpf_sk_storage_map_check_btf		bpf_local_storage_map_check_btf

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-2-kpsingh@chromium.org
2020-08-25 14:59:58 -07:00
Yonghong Song
0fcdfffe80 selftests/bpf: Enable tc verbose mode for test_sk_assign
Currently test_sk_assign failed verifier with llvm11/llvm12.
During debugging, I found the default verifier output is
truncated like below
  Verifier analysis:

  Skipped 2200 bytes, use 'verb' option for the full verbose log.
  [...]
  off=23,r=34,imm=0) R5=inv0 R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=34,imm=0) R10=fp0
  80: (0f) r7 += r2
  last_idx 80 first_idx 21
  regs=4 stack=0 before 78: (16) if w3 == 0x11 goto pc+1
when I am using "./test_progs -vv -t assign".

The reason is tc verbose mode is not enabled.

This patched enabled tc verbose mode and the output looks like below
  Verifier analysis:

  0: (bf) r6 = r1
  1: (b4) w0 = 2
  2: (61) r1 = *(u32 *)(r6 +80)
  3: (61) r7 = *(u32 *)(r6 +76)
  4: (bf) r2 = r7
  5: (07) r2 += 14
  6: (2d) if r2 > r1 goto pc+61
   R0_w=inv2 R1_w=pkt_end(id=0,off=0,imm=0) R2_w=pkt(id=0,off=14,r=14,imm=0)
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200824222807.100200-1-yhs@fb.com
2020-08-24 21:15:13 -07:00
Daniel T. Lee
f0c328f8af samples: bpf: Refactor tracepoint tracing programs with libbpf
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing tracepoint tracing programs with libbbpf
bpf loader.

    - Adding a tracepoint event and attaching a bpf program to it was done
    through bpf_program_attach().
    - Instead of using the existing BPF MAP definition, MAP definition
    has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-4-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
Daniel T. Lee
3677d0a131 samples: bpf: Refactor kprobe tracing programs with libbpf
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing kprobe tracing programs with libbbpf
bpf loader.

    - For kprobe events pointing to system calls, the SYSCALL() macro in
    trace_common.h was used.
    - Adding a kprobe event and attaching a bpf program to it was done
    through bpf_program_attach().
    - Instead of using the existing BPF MAP definition, MAP definition
    has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-3-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
Daniel T. Lee
35a8b6dd33 samples: bpf: Cleanup bpf_load.o from Makefile
Since commit cc7f641d63 ("samples: bpf: Refactor BPF map performance
test with libbpf") has ommited the removal of bpf_load.o from Makefile,
this commit removes the bpf_load.o rule for targets where bpf_load.o is
not used.

Fixes: cc7f641d63 ("samples: bpf: Refactor BPF map performance test with libbpf")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-2-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
Lorenz Bauer
8c3b3d971f selftests: bpf: Fix sockmap update nits
Address review by Yonghong, to bring the new tests in line with the
usual code style.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200824084523.13104-1-lmb@cloudflare.com
2020-08-24 14:51:46 -07:00
Andrii Nakryiko
f872e4bc47 libbpf: Fix type compatibility check copy-paste error
Fix copy-paste error in types compatibility check. Local type is accidentally
used instead of target type for the very first type check strictness check.
This can result in potentially less strict candidate comparison. Fix the
error.

Fixes: 3fc32f40c4 ("libbpf: Implement type-based CO-RE relocations support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821225653.2180782-1-andriin@fb.com
2020-08-24 14:50:00 -07:00
Andrii Nakryiko
3418c56de8 libbpf: Avoid false unuinitialized variable warning in bpf_core_apply_relo
Some versions of GCC report uninitialized targ_spec usage. GCC is wrong, but
let's avoid unnecessary warnings.

Fixes: ddc7c30426 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821225556.2178419-1-andriin@fb.com
2020-08-24 14:48:19 -07:00
Jakub Sitnicki
07ff4f0126 bpf: sk_lookup: Add user documentation
Describe the purpose of BPF sk_lookup program, how it can be attached, when
it gets invoked, and what information gets passed to it. Point the reader
to examples and further documentation.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200821100226.403844-1-jakub@cloudflare.com
2020-08-24 14:46:50 -07:00
Jianlin Lv
4d0d167341 docs: Correct subject prefix and update LLVM info
bpf_devel_QA.rst:152 The subject prefix information is not accurate, it
should be 'PATCH bpf-next v2'

Also update LLVM version info and add information about
‘-DLLVM_TARGETS_TO_BUILD’ to prompt the developer to build the desired
target.

Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200821052817.46887-1-Jianlin.Lv@arm.com
2020-08-24 14:43:48 -07:00