Commit graph

888778 commits

Author SHA1 Message Date
Brian Vazquez
858e284f0e libbpf: Fix unneeded extra initialization in bpf_map_batch_common
bpf_attr doesn't required to be declared with '= {}' as memset is used
in the code.

Fixes: 2ab3d86ea1 ("libbpf: Add libbpf support to batch ops")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200116045918.75597-1-brianvv@google.com
2020-01-16 15:31:52 +01:00
Andrii Nakryiko
b65053cd94 selftests/bpf: Add whitelist/blacklist of test names to test_progs
Add ability to specify a list of test name substrings for selecting which
tests to run. So now -t is accepting a comma-separated list of strings,
similarly to how -n accepts a comma-separated list of test numbers.

Additionally, add ability to blacklist tests by name. Blacklist takes
precedence over whitelist. Blacklisting is important for cases where it's
known that some tests can't pass (e.g., due to perf hardware events that are
not available within VM). This is going to be used for libbpf testing in
Travis CI in its Github repo.

Example runs with just whitelist and whitelist + blacklist:

  $ sudo ./test_progs -tattach,core/existence
  #1 attach_probe:OK
  #6 cgroup_attach_autodetach:OK
  #7 cgroup_attach_multi:OK
  #8 cgroup_attach_override:OK
  #9 core_extern:OK
  #10/44 existence:OK
  #10/45 existence___minimal:OK
  #10/46 existence__err_int_sz:OK
  #10/47 existence__err_int_type:OK
  #10/48 existence__err_int_kind:OK
  #10/49 existence__err_arr_kind:OK
  #10/50 existence__err_arr_value_type:OK
  #10/51 existence__err_struct_type:OK
  #10 core_reloc:OK
  #19 flow_dissector_reattach:OK
  #60 tp_attach_query:OK
  Summary: 8/8 PASSED, 0 SKIPPED, 0 FAILED

  $ sudo ./test_progs -tattach,core/existence -bcgroup,flow/arr
  #1 attach_probe:OK
  #9 core_extern:OK
  #10/44 existence:OK
  #10/45 existence___minimal:OK
  #10/46 existence__err_int_sz:OK
  #10/47 existence__err_int_type:OK
  #10/48 existence__err_int_kind:OK
  #10/51 existence__err_struct_type:OK
  #10 core_reloc:OK
  #60 tp_attach_query:OK
  Summary: 4/6 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Julia Kartseva <hex@fb.com>
Link: https://lore.kernel.org/bpf/20200116005549.3644118-1-andriin@fb.com
2020-01-15 18:38:39 -08:00
Alexei Starovoitov
7bcfea9615 Merge branch 'bpftool-improvements'
Martin Lau says:

====================
When a map is storing a kernel's struct, its
map_info->btf_vmlinux_value_type_id is set.  The first map type
supporting it is BPF_MAP_TYPE_STRUCT_OPS.

This series adds support to dump this kind of map with BTF.
The first two patches are bug fixes which are only applicable to
bpf-next.

Please see individual patches for details.

v3:
- Remove unnecessary #include "libbpf_internal.h" from patch 5

v2:
- Expose bpf_find_kernel_btf() as a LIBBPF_API in patch 3 (Andrii)
- Cache btf_vmlinux in bpftool/map.c (Andrii)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-15 15:24:17 -08:00
Martin KaFai Lau
4e1ea33292 bpftool: Support dumping a map with btf_vmlinux_value_type_id
This patch makes bpftool support dumping a map's value properly
when the map's value type is a type of the running kernel's btf.
(i.e. map_info.btf_vmlinux_value_type_id is set instead of
map_info.btf_value_type_id).  The first usecase is for the
BPF_MAP_TYPE_STRUCT_OPS.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230044.1103008-1-kafai@fb.com
2020-01-15 15:23:27 -08:00
Martin KaFai Lau
84c72ceee9 bpftool: Add struct_ops map name
This patch adds BPF_MAP_TYPE_STRUCT_OPS to "struct_ops" name mapping
so that "bpftool map show" can print the "struct_ops" map type
properly.

[root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool map show id 8
8: struct_ops  name dctcp  flags 0x0
	key 4B  value 256B  max_entries 1  memlock 4096B
	btf_id 7

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230037.1102674-1-kafai@fb.com
2020-01-15 15:23:27 -08:00
Martin KaFai Lau
fb2426ad00 libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API
This patch exposes bpf_find_kernel_btf() as a LIBBPF_API.
It will be used in 'bpftool map dump' in a following patch
to dump a map with btf_vmlinux_value_type_id set.

bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf()
and moved to btf.c.  As <linux/kernel.h> is included,
some of the max/min type casting needs to be fixed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com
2020-01-15 15:23:27 -08:00
Martin KaFai Lau
188a486619 bpftool: Fix missing BTF output for json during map dump
The btf availability check is only done for plain text output.
It causes the whole BTF output went missing when json_output
is used.

This patch simplifies the logic a little by avoiding passing "int btf" to
map_dump().

For plain text output, the btf_wtr is only created when the map has
BTF (i.e. info->btf_id != 0).  The nullness of "json_writer_t *wtr"
in map_dump() alone can decide if dumping BTF output is needed.
As long as wtr is not NULL, map_dump() will print out the BTF-described
data whenever a map has BTF available (i.e. info->btf_id != 0)
regardless of json or plain-text output.

In do_dump(), the "int btf" is also renamed to "int do_plain_btf".

Fixes: 99f9863a0c ("bpftool: Match maps by name")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Paul Chaignon <paul.chaignon@orange.com>
Link: https://lore.kernel.org/bpf/20200115230025.1101828-1-kafai@fb.com
2020-01-15 15:23:27 -08:00
Martin KaFai Lau
d7de72674a bpftool: Fix a leak of btf object
When testing a map has btf or not, maps_have_btf() tests it by actually
getting a btf_fd from sys_bpf(BPF_BTF_GET_FD_BY_ID). However, it
forgot to btf__free() it.

In maps_have_btf() stage, there is no need to test it by really
calling sys_bpf(BPF_BTF_GET_FD_BY_ID). Testing non zero
info.btf_id is good enough.

Also, the err_close case is unnecessary, and also causes double
close() because the calling func do_dump() will close() all fds again.

Fixes: 99f9863a0c ("bpftool: Match maps by name")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Paul Chaignon <paul.chaignon@orange.com>
Link: https://lore.kernel.org/bpf/20200115230019.1101352-1-kafai@fb.com
2020-01-15 15:23:27 -08:00
Alexei Starovoitov
990bca1fc8 Merge branch 'bpf-batch-ops'
Brian Vazquez says:

====================
This patch series introduce batch ops that can be added to bpf maps to
lookup/lookup_and_delete/update/delete more than 1 element at the time,
this is specially useful when syscall overhead is a problem and in case
of hmap it will provide a reliable way of traversing them.

The implementation inclues a generic approach that could potentially be
used by any bpf map and adds it to arraymap, it also includes the specific
implementation of hashmaps which are traversed using buckets instead
of keys.

The bpf syscall subcommands introduced are:

  BPF_MAP_LOOKUP_BATCH
  BPF_MAP_LOOKUP_AND_DELETE_BATCH
  BPF_MAP_UPDATE_BATCH
  BPF_MAP_DELETE_BATCH

The UAPI attribute is:

  struct { /* struct used by BPF_MAP_*_BATCH commands */
         __aligned_u64   in_batch;       /* start batch,
                                          * NULL to start from beginning
                                          */
         __aligned_u64   out_batch;      /* output: next start batch */
         __aligned_u64   keys;
         __aligned_u64   values;
         __u32           count;          /* input/output:
                                          * input: # of key/value
                                          * elements
                                          * output: # of filled elements
                                          */
         __u32           map_fd;
         __u64           elem_flags;
         __u64           flags;
  } batch;

in_batch and out_batch are only used for lookup and lookup_and_delete since
those are the only two operations that attempt to traverse the map.

update/delete batch ops should provide the keys/values that user wants
to modify.

Here are the previous discussions on the batch processing:
 - https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
 - https://lore.kernel.org/bpf/20190829064502.2750303-1-yhs@fb.com/
 - https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/

Changelog sinve v4:
 - Remove unnecessary checks from libbpf API (Andrii Nakryiko)
 - Move DECLARE_LIBBPF_OPTS with all var declarations (Andrii Nakryiko)
 - Change bucket internal buffer size to 5 entries (Yonghong Song)
 - Fix some minor bugs in hashtab batch ops implementation (Yonghong Song)

Changelog sinve v3:
 - Do not use copy_to_user inside atomic region (Yonghong Song)
 - Use _opts approach on libbpf APIs (Andrii Nakryiko)
 - Drop generic_map_lookup_and_delete_batch support
 - Free malloc-ed memory in tests (Yonghong Song)
 - Reverse christmas tree (Yonghong Song)
 - Add acked labels

Changelog sinve v2:
 - Add generic batch support for lpm_trie and test it (Yonghong Song)
 - Use define MAP_LOOKUP_RETRIES for retries (John Fastabend)
 - Return errors directly and remove labels (Yonghong Song)
 - Insert new API functions into libbpf alphabetically (Yonghong Song)
 - Change hlist_nulls_for_each_entry_rcu to
   hlist_nulls_for_each_entry_safe in htab batch ops (Yonghong Song)

Changelog since v1:
 - Fix SOB ordering and remove Co-authored-by tag (Alexei Starovoitov)

Changelog since RFC:
 - Change batch to in_batch and out_batch to support more flexible opaque
   values to iterate the bpf maps.
 - Remove update/delete specific batch ops for htab and use the generic
   implementations instead.
====================

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-15 14:07:47 -08:00
Brian Vazquez
f0fac2cec2 selftests/bpf: Add batch ops testing to array bpf map
Tested bpf_map_lookup_batch() and bpf_map_update_batch()
functionality.

  $ ./test_maps
      ...
        test_array_map_batch_ops:PASS
      ...

Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-10-brianvv@google.com
2020-01-15 14:00:35 -08:00
Yonghong Song
30ff3c5913 selftests/bpf: Add batch ops testing for htab and htab_percpu map
Tested bpf_map_lookup_batch(), bpf_map_lookup_and_delete_batch(),
bpf_map_update_batch(), and bpf_map_delete_batch() functionality.
  $ ./test_maps
    ...
      test_htab_map_batch_ops:PASS
      test_htab_percpu_map_batch_ops:PASS
    ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-9-brianvv@google.com
2020-01-15 14:00:35 -08:00
Yonghong Song
2ab3d86ea1 libbpf: Add libbpf support to batch ops
Added four libbpf API functions to support map batch operations:
  . int bpf_map_delete_batch( ... )
  . int bpf_map_lookup_batch( ... )
  . int bpf_map_lookup_and_delete_batch( ... )
  . int bpf_map_update_batch( ... )

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com
2020-01-15 14:00:35 -08:00
Yonghong Song
a1e3a3b8ba tools/bpf: Sync uapi header bpf.h
sync uapi header include/uapi/linux/bpf.h to
tools/include/uapi/linux/bpf.h

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com
2020-01-15 14:00:35 -08:00
Yonghong Song
057996380a bpf: Add batch ops to all htab bpf map
htab can't use generic batch support due some problematic behaviours
inherent to the data structre, i.e. while iterating the bpf map  a
concurrent program might delete the next entry that batch was about to
use, in that case there's no easy solution to retrieve the next entry,
the issue has been discussed multiple times (see [1] and [2]).

The only way hmap can be traversed without the problem previously
exposed is by making sure that the map is traversing entire buckets.
This commit implements those strict requirements for hmap, the
implementation follows the same interaction that generic support with
some exceptions:

 - If keys/values buffer are not big enough to traverse a bucket,
   ENOSPC will be returned.
 - out_batch contains the value of the next bucket in the iteration, not
   the next key, but this is transparent for the user since the user
   should never use out_batch for other than bpf batch syscalls.

This commits implements BPF_MAP_LOOKUP_BATCH and adds support for new
command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete
batch ops it is possible to use the generic implementations.

[1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
[2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com
2020-01-15 14:00:35 -08:00
Brian Vazquez
c60f2d2861 bpf: Add lookup and update batch ops to arraymap
This adds the generic batch ops functionality to bpf arraymap, note that
since deletion is not a valid operation for arraymap, only batch and
lookup are added.

Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200115184308.162644-5-brianvv@google.com
2020-01-15 14:00:35 -08:00
Brian Vazquez
aa2e93b8e5 bpf: Add generic support for update and delete batch ops
This commit adds generic support for update and delete batch ops that
can be used for almost all the bpf maps. These commands share the same
UAPI attr that lookup and lookup_and_delete batch ops use and the
syscall commands are:

  BPF_MAP_UPDATE_BATCH
  BPF_MAP_DELETE_BATCH

The main difference between update/delete and lookup batch ops is that
for update/delete keys/values must be specified for userspace and
because of that, neither in_batch nor out_batch are used.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-4-brianvv@google.com
2020-01-15 14:00:35 -08:00
Brian Vazquez
cb4d03ab49 bpf: Add generic support for lookup batch op
This commit introduces generic support for the bpf_map_lookup_batch.
This implementation can be used by almost all the bpf maps since its core
implementation is relying on the existing map_get_next_key and
map_lookup_elem. The bpf syscall subcommand introduced is:

  BPF_MAP_LOOKUP_BATCH

The UAPI attribute is:

  struct { /* struct used by BPF_MAP_*_BATCH commands */
         __aligned_u64   in_batch;       /* start batch,
                                          * NULL to start from beginning
                                          */
         __aligned_u64   out_batch;      /* output: next start batch */
         __aligned_u64   keys;
         __aligned_u64   values;
         __u32           count;          /* input/output:
                                          * input: # of key/value
                                          * elements
                                          * output: # of filled elements
                                          */
         __u32           map_fd;
         __u64           elem_flags;
         __u64           flags;
  } batch;

in_batch/out_batch are opaque values use to communicate between
user/kernel space, in_batch/out_batch must be of key_size length.

To start iterating from the beginning in_batch must be null,
count is the # of key/value elements to retrieve. Note that the 'keys'
buffer must be a buffer of key_size * count size and the 'values' buffer
must be value_size * count, where value_size must be aligned to 8 bytes
by userspace if it's dealing with percpu maps. 'count' will contain the
number of keys/values successfully retrieved. Note that 'count' is an
input/output variable and it can contain a lower value after a call.

If there's no more entries to retrieve, ENOENT will be returned. If error
is ENOENT, count might be > 0 in case it copied some values but there were
no more entries to retrieve.

Note that if the return code is an error and not -EFAULT,
count indicates the number of elements successfully processed.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-3-brianvv@google.com
2020-01-15 14:00:35 -08:00
Brian Vazquez
15c14a3dca bpf: Add bpf_map_{value_size, update_value, map_copy_value} functions
This commit moves reusable code from map_lookup_elem and map_update_elem
to avoid code duplication in kernel/bpf/syscall.c.

Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200115184308.162644-2-brianvv@google.com
2020-01-15 14:00:34 -08:00
Eelco Chaudron
83e4b88be1 selftests/bpf: Add a test for attaching a bpf fentry/fexit trace to an XDP program
Add a test that will attach a FENTRY and FEXIT program to the XDP test
program. It will also verify data from the XDP context on FENTRY and
verifies the return code on exit.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157909410480.47481.11202505690938004673.stgit@xdp-tutorial
2020-01-15 13:22:25 -08:00
Andrii Nakryiko
9173cac3b6 libbpf: Support .text sub-calls relocations
The LLVM patch https://reviews.llvm.org/D72197 makes LLVM emit function call
relocations within the same section. This includes a default .text section,
which contains any BPF sub-programs. This wasn't the case before and so libbpf
was able to get a way with slightly simpler handling of subprogram call
relocations.

This patch adds support for .text section relocations. It needs to ensure
correct order of relocations, so does two passes:
- first, relocate .text instructions, if there are any relocations in it;
- then process all the other programs and copy over patched .text instructions
for all sub-program calls.

v1->v2:
- break early once .text program is processed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115190856.2391325-1-andriin@fb.com
2020-01-15 11:51:48 -08:00
Alexei Starovoitov
5640a771d7 Merge branch 'bpf_send_signal_thread'
Yonghong Song says:

====================
Commit 8b401f9ed2 ("bpf: implement bpf_send_signal() helper")
added helper bpf_send_signal() which permits bpf program to
send a signal to the current process. The signal may send to
any thread of the process.

This patch implemented a new helper bpf_send_signal_thread()
to send a signal to the thread corresponding to the kernel current task.
This helper can simplify user space code if the thread context of
bpf sending signal is needed in user space. Please see Patch #1 for
details of use case and kernel implementation.

Patch #2 added some bpf self tests for the new helper.

Changelogs:
  v2 -> v3:
    - More simplification for skeleton codes by removing not-needed
      mmap code and redundantly created tracepoint link.
  v1 -> v2:
    - More description for the difference between bpf_send_signal()
      and bpf_send_signal_thread() in the uapi header bpf.h.
    - Use skeleton and mmap for send_signal test.
====================

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-15 11:46:32 -08:00
Yonghong Song
ab8b7f0cb3 tools/bpf: Add self tests for bpf_send_signal_thread()
The test_progs send_signal() is amended to test
bpf_send_signal_thread() as well.

   $ ./test_progs -n 40
   #40/1 send_signal_tracepoint:OK
   #40/2 send_signal_perf:OK
   #40/3 send_signal_nmi:OK
   #40/4 send_signal_tracepoint_thread:OK
   #40/5 send_signal_perf_thread:OK
   #40/6 send_signal_nmi_thread:OK
   #40 send_signal:OK
   Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED

Also took this opportunity to rewrite the send_signal test
using skeleton framework and array mmap to make code
simpler and more readable.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035003.602425-1-yhs@fb.com
2020-01-15 11:44:51 -08:00
Yonghong Song
8482941f09 bpf: Add bpf_send_signal_thread() helper
Commit 8b401f9ed2 ("bpf: implement bpf_send_signal() helper")
added helper bpf_send_signal() which permits bpf program to
send a signal to the current process. The signal may be
delivered to any threads in the process.

We found a use case where sending the signal to the current
thread is more preferable.
  - A bpf program will collect the stack trace and then
    send signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, user application will need
to check whether the thread receiving the signal matches
the thread collecting the stack by checking thread id.
If not, it will need to send signal to another thread
through pthread_kill().

This patch proposed a new helper bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
bpf_program execution context and user space signal handling
context are the same thread.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
2020-01-15 11:44:51 -08:00
Magnus Karlsson
d3a56931f9 xsk: Support allocations of large umems
When registering a umem area that is sufficiently large (>1G on an
x86), kmalloc cannot be used to allocate one of the internal data
structures, as the size requested gets too large. Use kvmalloc instead
that falls back on vmalloc if the allocation is too large for kmalloc.

Also add accounting for this structure as it is triggered by a user
space action (the XDP_UMEM_REG setsockopt) and it is by far the
largest structure of kernel allocated memory in xsk.

Reported-by: Ryan Goodfellow <rgoodfel@isi.edu>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1578995365-7050-1-git-send-email-magnus.karlsson@intel.com
2020-01-15 11:41:52 -08:00
Li RongQing
0a29275b63 bpf: Return -EBADRQC for invalid map type in __bpf_tx_xdp_map
A negative value should be returned if map->map_type is invalid
although that is impossible now, but if we run into such situation
in future, then xdpbuff could be leaked.

Daniel Borkmann suggested:

-EBADRQC should be returned to stay consistent with generic XDP
for the tracepoint output and not to be confused with -EOPNOTSUPP
from other locations like dev_map_enqueue() when ndo_xdp_xmit is
missing and such.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1578618277-18085-1-git-send-email-lirongqing@baidu.com
2020-01-14 20:32:24 +01:00
Martin KaFai Lau
3b4130418f bpf: Fix seq_show for BPF_MAP_TYPE_STRUCT_OPS
Instead of using bpf_struct_ops_map_lookup_elem() which is
not implemented, bpf_struct_ops_map_seq_show_elem() should
also use bpf_struct_ops_map_sys_lookup_elem() which does
an inplace update to the value.  The change allocates
a value to pass to bpf_struct_ops_map_sys_lookup_elem().

[root@arch-fb-vm1 bpf]# cat /sys/fs/bpf/dctcp
{{{1}},BPF_STRUCT_OPS_STATE_INUSE,{{00000000df93eebc,00000000df93eebc},0,2, ...

Fixes: 85d33df357 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200114072647.3188298-1-kafai@fb.com
2020-01-14 09:54:31 -08:00
Alexei Starovoitov
6dd42aa17f Merge branch 'runqslower'
Andrii Nakryiko says:

====================
Based on recent BPF CO-RE, tp_btf, and BPF skeleton changes, re-implement
BCC-based runqslower tool as a portable pre-compiled BPF CO-RE-based tool.
Make sure it's built as part of selftests to ensure it doesn't bit rot.

As part of this patch set, augment `format c` output of `bpftool btf dump`
sub-command with applying `preserve_access_index` attribute to all structs and
unions. This makes all such structs and unions automatically relocatable under
BPF CO-RE, which improves user experience of writing TRACING programs with
direct kernel memory read access.

Also, further clean up selftest/bpf Makefile output and make it conforming to
libbpf and bpftool succinct output format.

v1->v2:
- build in-tree bpftool for runqslower (Yonghong);
- drop `format core` and augment `format c` instead (Alexei);
- move runqslower under tools/bpf (Daniel).
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-14 09:23:20 -08:00
Andrii Nakryiko
3a0d3092a4 selftests/bpf: Build runqslower from selftests
Ensure runqslower tool is built as part of selftests to prevent it from bit
rotting.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-7-andriin@fb.com
2020-01-14 09:23:12 -08:00
Andrii Nakryiko
9c01546d26 tools/bpf: Add runqslower tool to tools/bpf
Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches
its BCC-based counterpart 1-to-1, supporting all the same parameters and
functionality.

runqslower tool utilizes BPF skeleton, auto-generated from BPF object file,
as well as memory-mapped interface to global (read-only, in this case) data.
Its Makefile also ensures auto-generation of "relocatable" vmlinux.h, which is
necessary for BTF-typed raw tracepoints with direct memory access.

  [0] 11bf5d02c8/tools/runqslower.py

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-6-andriin@fb.com
2020-01-13 17:48:13 -08:00
Andrii Nakryiko
1cf5b23988 bpftool: Apply preserve_access_index attribute to all types in BTF dump
This patch makes structs and unions, emitted through BTF dump, automatically
CO-RE-relocatable (unless disabled with `#define BPF_NO_PRESERVE_ACCESS_INDEX`,
specified before including generated header file).

This effectivaly turns usual bpf_probe_read() call into equivalent of
bpf_core_read(), by automatically applying builtin_preserve_access_index to
any field accesses of types in generated C types header.

This is especially useful for tp_btf/fentry/fexit BPF program types. They
allow direct memory access, so BPF C code just uses straightfoward a->b->c
access pattern to read data from kernel. But without kernel structs marked as
CO-RE relocatable through preserve_access_index attribute, one has to enclose
all the data reads into a special __builtin_preserve_access_index code block,
like so:

__builtin_preserve_access_index(({
    x = p->pid; /* where p is struct task_struct *, for example */
}));

This is very inconvenient and obscures the logic quite a bit. By marking all
auto-generated types with preserve_access_index attribute the above code is
reduced to just a clean and natural `x = p->pid;`.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-5-andriin@fb.com
2020-01-13 17:48:13 -08:00
Andrii Nakryiko
2cc51d34d9 selftests/bpf: Conform selftests/bpf Makefile output to libbpf and bpftool
Bring selftest/bpf's Makefile output to the same format used by libbpf and
bpftool: 2 spaces of padding on the left + 8-character left-aligned build step
identifier.

Also, hide feature detection output by default. Can be enabled back by setting
V=1.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-4-andriin@fb.com
2020-01-13 17:48:13 -08:00
Andrii Nakryiko
292e1d73b1 libbpf: Clean up bpf_helper_defs.h generation output
bpf_helpers_doc.py script, used to generate bpf_helper_defs.h, unconditionally
emits one informational message to stderr. Remove it and preserve stderr to
contain only relevant errors. Also make sure script invocations command is
muted by default in libbpf's Makefile.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-3-andriin@fb.com
2020-01-13 17:48:13 -08:00
Andrii Nakryiko
533420a415 tools: Sync uapi/linux/if_link.h
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during
libbpf build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-13 17:48:12 -08:00
Andrii Nakryiko
ac065870d9 selftests/bpf: Add BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macros
Streamline BPF_TRACE_x macro by moving out return type and section attribute
definition out of macro itself. That makes those function look in source code
similar to other BPF programs. Additionally, simplify its usage by determining
number of arguments automatically (so just single BPF_TRACE vs a family of
BPF_TRACE_1, BPF_TRACE_2, etc). Also, allow more natural function argument
syntax without commas inbetween argument type and name.

Given this helper is useful not only for tracing tp_btf/fenty/fexit programs,
but could be used for LSM programs and others following the same pattern,
rename BPF_TRACE macro into more generic BPF_PROG. Existing BPF_TRACE_x
usages in selftests are converted to new BPF_PROG macro.

Following the same pattern, define BPF_KPROBE and BPF_KRETPROBE macros for
nicer usage of kprobe/kretprobe arguments, respectively. BPF_KRETPROBE, adopts
same convention used by fexit programs, that last defined argument is probed
function's return result.

v4->v5:
- fix test_overhead test (__set_task_comm is void) (Alexei);

v3->v4:
- rebased and fixed one more BPF_TRACE_x occurence (Alexei);

v2->v3:
- rename to shorter and as generic BPF_PROG (Alexei);

v1->v2:
- verified GCC handles pragmas as expected;
- added descriptions to macros;
- converted new STRUCT_OPS selftest to BPF_HANDLER (worked as expected);
- added original context as 'ctx' parameter, for cases where it has to be
  passed into BPF helpers. This might cause an accidental naming collision,
  unfortunately, but at least it's easy to work around. Fortunately, this
  situation produces quite legible compilation error:

progs/bpf_dctcp.c:46:6: error: redefinition of 'ctx' with a different type: 'int' vs 'unsigned long long *'
        int ctx = 123;
            ^
progs/bpf_dctcp.c:42:6: note: previous definition is here
void BPF_HANDLER(dctcp_init, struct sock *sk)
     ^
./bpf_trace_helpers.h:58:32: note: expanded from macro 'BPF_HANDLER'
____##name(unsigned long long *ctx, ##args)

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110211634.1614739-1-andriin@fb.com
2020-01-10 14:02:07 -08:00
Andrii Nakryiko
1d1a3bcffe libbpf: Poison kernel-only integer types
It's been a recurring issue with types like u32 slipping into libbpf source
code accidentally. This is not detected during builds inside kernel source
tree, but becomes a compilation error in libbpf's Github repo. Libbpf is
supposed to use only __{s,u}{8,16,32,64} typedefs, so poison {s,u}{8,16,32,64}
explicitly in every .c file. Doing that in a bit more centralized way, e.g.,
inside libbpf_internal.h breaks selftests, which are both using kernel u32 and
libbpf_internal.h.

This patch also fixes a new u32 occurence in libbpf.c, added recently.

Fixes: 590a008882 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200110181916.271446-1-andriin@fb.com
2020-01-10 10:38:00 -08:00
Daniel Borkmann
7a2d070f91 Merge branch 'bpf-global-funcs'
Alexei Starovoitov says:

====================
Introduce static vs global functions and function by function verification.
This is another step toward dynamic re-linking (or replacement) of global
functions. See patch 2 for details.

v2->v3:
- cleaned up a check spotted by Song.
- rebased and dropped patch 2 that was trying to improve BTF based on ELF.
- added one more unit test for scalar return value from global func.

v1->v2:
- addressed review comments from Song, Andrii, Yonghong
- fixed memory leak in error path
- added modified ctx check
- added more tests in patch 7
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-01-10 17:20:19 +01:00
Alexei Starovoitov
360301a6c2 selftests/bpf: Add unit tests for global functions
test_global_func[12] - check 512 stack limit.
test_global_func[34] - check 8 frame call chain limit.
test_global_func5    - check that non-ctx pointer cannot be passed into
                       a function that expects context.
test_global_func6    - check that ctx pointer is unmodified.
test_global_func7    - check that global function returns scalar.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-7-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
e528d1c012 selftests/bpf: Modify a test to check global functions
Make two static functions in test_xdp_noinline.c global:
before: processed 2790 insns
after: processed 2598 insns

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-6-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
6db2d81a46 selftests/bpf: Add a test for a large global function
test results:
pyperf50 with always_inlined the same function five times: processed 46378 insns
pyperf50 with global function: processed 6102 insns

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-5-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
7608e4db6d selftests/bpf: Add fexit-to-skb test for global funcs
Add simple fexit prog type to skb prog type test when subprogram is a global
function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-4-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
51c39bb1d5 bpf: Introduce function-by-function verification
New llvm and old llvm with libbpf help produce BTF that distinguish global and
static functions. Unlike arguments of static function the arguments of global
functions cannot be removed or optimized away by llvm. The compiler has to use
exactly the arguments specified in a function prototype. The argument type
information allows the verifier validate each global function independently.
For now only supported argument types are pointer to context and scalars. In
the future pointers to structures, sizes, pointer to packet data can be
supported as well. Consider the following example:

static int f1(int ...)
{
  ...
}

int f3(int b);

int f2(int a)
{
  f1(a) + f3(a);
}

int f3(int b)
{
  ...
}

int main(...)
{
  f1(...) + f2(...) + f3(...);
}

The verifier will start its safety checks from the first global function f2().
It will recursively descend into f1() because it's static. Then it will check
that arguments match for the f3() invocation inside f2(). It will not descend
into f3(). It will finish f2() that has to be successfully verified for all
possible values of 'a'. Then it will proceed with f3(). That function also has
to be safe for all possible values of 'b'. Then it will start subprog 0 (which
is main() function). It will recursively descend into f1() and will skip full
check of f2() and f3(), since they are global. The order of processing global
functions doesn't affect safety, since all global functions must be proven safe
based on their arguments only.

Such function by function verification can drastically improve speed of the
verification and reduce complexity.

Note that the stack limit of 512 still applies to the call chain regardless whether
functions were static or global. The nested level of 8 also still applies. The
same recursion prevention checks are in place as well.

The type information and static/global kind is preserved after the verification
hence in the above example global function f2() and f3() can be replaced later
by equivalent functions with the same types that are loaded and verified later
without affecting safety of this main() program. Such replacement (re-linking)
of global functions is a subject of future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-3-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
2d3eb67f64 libbpf: Sanitize global functions
In case the kernel doesn't support BTF_FUNC_GLOBAL sanitize BTF produced by the
compiler for global functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org
2020-01-10 17:20:07 +01:00
Alexei Starovoitov
f41aa387a7 Merge branch 'selftest-makefile-cleanup'
Andrii Nakryiko says:

====================
Fix issues with bpf_helper_defs.h usage in selftests/bpf. As part of that, fix
the way clean up is performed for libbpf and selftests/bpf. Some for Makefile
output clean ups as well.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-09 21:56:16 -08:00
Andrii Nakryiko
965b9fee28 selftests/bpf: Further clean up Makefile output
Further clean up Makefile output:
- hide "entering directory" messages;
- silvence sub-Make command echoing;
- succinct MKDIR messages.

Also remove few test binaries that are not produced anymore from .gitignore.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-4-andriin@fb.com
2020-01-09 21:55:08 -08:00
Andrii Nakryiko
6910d7d386 selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir
Reorder includes search path to ensure $(OUTPUT) and $(CURDIR) go before
libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in
such a way as to leverage includes search path. This allows selftests to not
use libbpf's local and potentially stale bpf_helper_defs.h. It's important
because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in
seltests' output directory, not the one in libbpf's directory.

Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to
reduce staleness.

Fixes: fa633a0f89 ("libbpf: Fix build on read-only filesystems")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com
2020-01-09 21:55:08 -08:00
Andrii Nakryiko
2031af28a4 libbpf,selftests/bpf: Fix clean targets
Libbpf's clean target should clean out generated files in $(OUTPUT) directory
and not make assumption that $(OUTPUT) directory is current working directory.

Selftest's Makefile should delegate cleaning of libbpf-generated files to
libbpf's Makefile. This ensures more robust clean up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110051716.1591485-2-andriin@fb.com
2020-01-09 21:55:08 -08:00
Andrii Nakryiko
492ab0205f libbpf: Make bpf_map order and indices stable
Currently, libbpf re-sorts bpf_map structs after all the maps are added and
initialized, which might change their relative order and invalidate any
bpf_map pointer or index taken before that. This is inconvenient and
error-prone. For instance, it can cause .kconfig map index to point to a wrong
map.

Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps,
so it's just an unnecessary complication right now. This patch drops sorting
of maps and makes their relative positions fixed. If efficient index is ever
needed, it's better to have a separate array of pointers as a search index,
instead of reordering bpf_map struct in-place. This will be less error-prone
and will allow multiple independent orderings, if necessary (e.g., either by
section index or by name).

Fixes: 166750bc1d ("libbpf: Support libbpf-provided extern variables")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com
2020-01-09 21:17:01 -08:00
Andrey Ignatov
f5bfcd953d bpf: Document BPF_F_QUERY_EFFECTIVE flag
Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects
attach_flags what may not be obvious and what may lead to confision.

Specifically attach_flags is returned only for target_fd but if programs
are inherited from an ancestor cgroup then returned attach_flags for
current cgroup may be confusing. For example, two effective programs of
same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
2020-01-09 09:40:06 -08:00
Alexei Starovoitov
417759f7d4 Merge branch 'tcp-bpf-cc'
Martin Lau says:

====================
This series introduces BPF STRUCT_OPS.  It is an infra to allow
implementing some specific kernel's function pointers in BPF.
The first use case included in this series is to implement
TCP congestion control algorithm in BPF  (i.e. implement
struct tcp_congestion_ops in BPF).

There has been attempt to move the TCP CC to the user space
(e.g. CCP in TCP).   The common arguments are faster turn around,
get away from long-tail kernel versions in production...etc,
which are legit points.

BPF has been the continuous effort to join both kernel and
userspace upsides together (e.g. XDP to gain the performance
advantage without bypassing the kernel).  The recent BPF
advancements (in particular BTF-aware verifier, BPF trampoline,
BPF CO-RE...) made implementing kernel struct ops (e.g. tcp cc)
possible in BPF.

The idea is to allow implementing tcp_congestion_ops in bpf.
It allows a faster turnaround for testing algorithm in the
production while leveraging the existing (and continue growing) BPF
feature/framework instead of building one specifically for
userspace TCP CC.

Please see individual patch for details.

The bpftool support will be posted in follow-up patches.

v4:
- Expose tcp_ca_find() to tcp.h in patch 7.
  It is used to check the same bpf-tcp-cc
  does not exist to guarantee the register()
  will succeed.
- set_memory_ro() and then set_memory_x() only after all
  trampolines are written to the image in patch 6. (Daniel)
  spinlock is replaced by mutex because set_memory_*
  requires sleepable context.

v3:
- Fix kbuild error by considering CONFIG_BPF_SYSCALL (kbuild)
- Support anonymous bitfield in patch 4 (Andrii, Yonghong)
- Push boundary safety check to a specific arch's trampoline function
  (in patch 6) (Yonghong).
  Reuse the WANR_ON_ONCE check in arch_prepare_bpf_trampoline() in x86.
- Check module field is 0 in udata in patch 6 (Yonghong)
- Check zero holes in patch 6 (Andrii)
- s/_btf_vmlinux/btf/ in patch 5 and 7 (Andrii)
- s/check_xxx/is_xxx/ in patch 7 (Andrii)
- Use "struct_ops/" convention in patch 11 (Andrii)
- Use the skel instead of bpf_object in patch 11 (Andrii)
- libbpf: Decide BPF_PROG_TYPE_STRUCT_OPS at open phase by using
          find_sec_def()
- libbpf: Avoid a debug message at open phase (Andrii)
- libbpf: Add bpf_program__(is|set)_struct_ops() for consistency (Andrii)
- libbpf: Add "struct_ops" to section_defs (Andrii)
- libbpf: Some code shuffling in init_kern_struct_ops() (Andrii)
- libbpf: A few safety checks (Andrii)

v2:
- Dropped cubic for now.  They will be reposted
  once there are more clarity in "jiffies" on both
  bpf side (about the helper) and
  tcp_cubic side (some of jiffies usages are being replaced
  by tp->tcp_mstamp)
- Remove unnecssary check on bitfield support from btf_struct_access()
  (Yonghong)
- BTF_TYPE_EMIT macro (Yonghong, Andrii)
- value_name's length check to avoid an unlikely
  type match during truncation case (Yonghong)
- BUILD_BUG_ON to ensure no trampoline-image overrun
  in the future (Yonghong)
- Simplify get_next_key() (Yonghong)
- Added comment to explain how to check mandatory
  func ptr in net/ipv4/bpf_tcp_ca.c (Yonghong)
- Rename "__bpf_" to "bpf_struct_ops_" for value prefix (Andrii)
- Add comment to highlight the bpf_dctcp.c is not necessarily
  the same as tcp_dctcp.c. (Alexei, Eric)
- libbpf: Renmae "struct_ops" to ".struct_ops" for elf sec (Andrii)
- libbpf: Expose struct_ops as a bpf_map (Andrii)
- libbpf: Support multiple struct_ops in SEC(".struct_ops") (Andrii)
- libbpf: Add bpf_map__attach_struct_ops()  (Andrii)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-01-09 09:15:25 -08:00
Martin KaFai Lau
09903869f6 bpf: Add bpf_dctcp example
This patch adds a bpf_dctcp example.  It currently does not do
no-ECN fallback but the same could be done through the cgrp2-bpf.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003517.3856825-1-kafai@fb.com
2020-01-09 08:46:18 -08:00