Commit Graph

27002 Commits

Author SHA1 Message Date
Linus Torvalds 9cccec2bf3 x86:
- missing TLB flush
 
 - nested virtualization fixes for SMM (secure boot on nested hypervisor)
   and other nested SVM fixes
 
 - syscall fuzzing fixes
 
 - live migration fix for AMD SEV
 
 - mirror VMs now work for SEV-ES too
 
 - fixes for reset
 
 - possible out-of-bounds access in IOAPIC emulation
 
 - fix enlightened VMCS on Windows 2022
 
 ARM:
 
 - Add missing FORCE target when building the EL2 object
 
 - Fix a PMU probe regression on some platforms
 
 Generic:
 
 - KCSAN fixes
 
 selftests:
 
 - random fixes, mostly for clang compilation
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFN0EwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNqaQf/Vx7ePFTqwWpo+8wKapnc6JN9SLjC
 hM4jipxfc1WyQWcfCt8ZuPhCnhF7o8mG/mrqTm+JB+oGqIsydHW19DiUT8ekv09F
 dQ+XYSiR4B547wUH5XLQc4xG9imwYlXGEOHqrE7eJvGH3LOqVFX2fLRBnFefZbO8
 GKhRJrGXwG3/JSAP6A0c22iVU+pLbfV9gpKwrAj0V7o8nzT2b3Wmh74WBNb47BzE
 a4+AwKpWO4rqJGOwdYwy67pdFHh1YmrlZ59cFZc7fzlXE+o0D0bitaJyioZALpOl
 4mRGdzoYkNB++ZjDzVFnAClCYQV/oNxCNGFaFF2mh/gzXG1TLmN7B8zGDg==
 =7oVh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit late... I got sidetracked by back-from-vacation routines and
  conferences. But most of these patches are already a few weeks old and
  things look more calm on the mailing list than what this pull request
  would suggest.

  x86:

   - missing TLB flush

   - nested virtualization fixes for SMM (secure boot on nested
     hypervisor) and other nested SVM fixes

   - syscall fuzzing fixes

   - live migration fix for AMD SEV

   - mirror VMs now work for SEV-ES too

   - fixes for reset

   - possible out-of-bounds access in IOAPIC emulation

   - fix enlightened VMCS on Windows 2022

  ARM:

   - Add missing FORCE target when building the EL2 object

   - Fix a PMU probe regression on some platforms

  Generic:

   - KCSAN fixes

  selftests:

   - random fixes, mostly for clang compilation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  selftests: KVM: Explicitly use movq to read xmm registers
  selftests: KVM: Call ucall_init when setting up in rseq_test
  KVM: Remove tlbs_dirty
  KVM: X86: Synchronize the shadow pagetable before link it
  KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
  KVM: x86: nSVM: don't copy virt_ext from vmcb12
  KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
  KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
  KVM: x86: nSVM: restore int_vector in svm_clear_vintr
  kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit
  KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
  KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  ...
2021-09-27 13:58:23 -07:00
Like Xu 4da8b12188 perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
If the 'perf iostat' user specifies two or more iio_root_ports and also
specifies the cpu(s) by -C which is not *connected to all* the above iio
ports, the iostat_print_metric() will run into trouble:

For example:

  $ perf iostat list
  S0-uncore_iio_0<0000:16>
  S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0

  $ perf iostat 0000:16,0000:97 -C 1 -- ls
  port 	Inbound Read(MB)	Inbound Write(MB)	Outbound Read(MB)	Outbound
  Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault
  (core dumped) perf stat --iostat$DELIMITER$*

The core-dump stack says, in the above corner case, the returned
(struct perf_counts_values *) count will be NULL, and the caller
iostat_print_metric() apparently doesn't not handle this case.

  433	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
  434
  435	if (count->run && count->ena) {
  (gdb) p count
  $1 = (struct perf_counts_values *) 0x0

The deeper reason is that there are actually no statistics from the user
specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with
minimum cost by adding a NULL check in the user space.

Fixes: f9ed693e8b ("perf stat: Enable iostat mode for x86 platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:41:07 -03:00
Like Xu e4fe5d7349 perf iostat: Use system-wide mode if the target cpu_list is unspecified
An iostate use case like "perf iostat 0000:16,0000:97 -- ls" should be
implemented to work in system-wide mode to ensure that the output from
print_header() is consistent with the user documentation perf-iostat.txt,
rather than incorrectly assuming that the kernel does not support it:

 Error:
 The sys_perf_event_open() syscall returned with 22 (Invalid argument) \
 for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
 /bin/dmesg | grep -i perf may provide additional information.

This error is easily fixed by assigning system-wide mode by default
for IOSTAT_RUN only when the target cpu_list is unspecified.

Fixes: f07952b179 ("perf stat: Basic support for iostat in perf")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:39:30 -03:00
Like Xu a827c007c7 perf config: Refine error message to eliminate confusion
If there is no configuration file at first, the user can write any pair
of "key.subkey=value" to the newly created configuration file, while
value validation against a valid configurable key is *deferred* until
the next execution or the implied execution of "perf config ... ".

For example:

  $ rm ~/.perfconfig
  $ perf config call-graph.dump-size=65529
  $ cat ~/.perfconfig
  # this file is auto-generated.
  [call-graph]
 	dump-size = 65529
  $ perf config call-graph.dump-size=2048
  callchain: Incorrect stack dump size (max 65528): 65529
  Error: wrong config key-value pair call-graph.dump-size=65529

The user might expect that the second value 2048 is valid and can be
updated to the configuration file, but the error message is very
confusing because the first value 65529 is not reported as an error
during the last configuration.

It is recommended not to change the current behavior of delayed
validation (as more effort is needed), but to refine the original error
message to *clearly indicate* that the cause of the error is the
configuration file.

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210924115817.58689-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Like Xu 4da6552c5d perf doc: Fix typos all over the place
Considering that perf and its subcommands have so many parameters, the
documentation is always the first stop for perf beginners. Fixing some
spelling errors will relax the eyes of some readers a little bit.

 s/specicfication/specification/
 s/caheline/cacheline/
 s/tranasaction/transaction/
 s/complan/complain/
 s/sched_wakep/sched_wakeup/
 s/possble/possible/
 s/methology/methodology/

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210924081942.38368-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers c6613bd4a5 perf arm: Fix off-by-one directory paths.
Relative path include works in the regular build due to -I paths but may
fail in other situations.

v2. Rebase. Comments on v1 were that we should handle include paths
    differently and it is agreed that can be a sensible refactor but
    beyond the scope of this change.
https://lore.kernel.org/lkml/20210504191227.793712-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210923154254.737657-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Colin Ian King 774f2c0890 perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
There is a spelling mistake in the description text, fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210916081314.41751-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
James Clark 0f892fd1bd perf tests: Fix flaky test 'Object code reading'
This test occasionally fails on aarch64 when a sample is taken in
free@plt and it fails with "Bytes read differ from those read by
objdump".

This is because that symbol is near a section boundary in the elf file.
Despite the -z option to always output zeros, objdump uses
bfd_map_over_sections() to iterate through the elf file so it doesn't
see outside of the sections where these zeros are and can't print them.

For example this boundary proceeds free@plt in libc with a gap of 48
bytes between .plt and .text:

  objdump -d -z --start-address=0x23cc8 --stop-address=0x23d08 libc-2.30.so

  libc-2.30.so:     file format elf64-littleaarch64

  Disassembly of section .plt:

  0000000000023cc8 <*ABS*+0x7fd00@plt+0x8>:
     23cc8:	91018210 	add	x16, x16, #0x60
     23ccc:	d61f0220 	br	x17

  Disassembly of section .text:

  0000000000023d00 <abort@@GLIBC_2.17-0x98>:
     23d00:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
     23d04:	910003fd 	mov	x29, sp

Taking a sample in free@plt is very rare because it is so small, but the
test can be forced to fail almost every time on any platform by linking
the test with a shared library that has a single empty function and
calling it in a loop.

The fix is to zero the buffers so that when there is a jump in the
addresses output by objdump, zeros are already filled in between.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210906152238.3415467-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers 5c34aea341 perf test: Fix DWARF unwind for optimized builds.
To ensure the stack frames are on the stack tail calls optimizations
need to be inhibited. If your compiler supports an attribute use it,
otherwise use an asm volatile barrier.

The barrier fix was suggested here:
https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/
Tested with an optimized clang build and by forcing the asm barrier
route with an optimized clang build.

A GCC bug tracking a proper disable_tail_calls is:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97831

Fixes: 9ae1e990f1 ("perf tools: Remove broken __no_tail_call
       attribute")

v2. is a rebase. The original fix patch generated quite a lot of
    discussion over the right place for the fix:
    https://lore.kernel.org/lkml/20201114000803.909530-1-irogers@google.com/
    The patch reflects my preference of it being near the use, so that
    future code cleanups don't break this somewhat special usage.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210922173812.456348-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Petr Machata b69c99463d selftests: net: fib_nexthops: Wait before checking reported idle time
The purpose of this test is to verify that after a short activity passes,
the reported time is reasonable: not zero (which could be reported by
mistake), and not something outrageous (which would be indicative of an
issue in used units).

However, the idle time is reported in units of clock_t, or hundredths of
second. If the initial sequence of commands is very quick, it is possible
that the idle time is reported as just flat-out zero. When this test was
recently enabled in our nightly regression, we started seeing spurious
failures for exactly this reason.

Therefore buffer the delay leading up to the test with a sleep, to make
sure there is no legitimate way of reporting 0.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:15:36 +01:00
Martin KaFai Lau ef979017b8 bpf: selftest: Add verifier tests for <8-byte scalar spill and refill
This patch adds a few verifier tests for <8-byte spill and refill.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004953.627183-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Martin KaFai Lau 54ea6079b7 bpf: selftest: A bpf prog that has a 32bit scalar spill
It is a simplified example that can trigger a 32bit scalar spill.
The const scalar is refilled and added to a skb->data later.
Since the reg state of the 32bit scalar spill is not saved now,
adding the refilled reg to skb->data and then comparing it with
skb->data_end cannot verify the skb->data access.

With the earlier verifier patch and the llvm patch [1].  The verifier
can correctly verify the bpf prog.

Here is the snippet of the verifier log that leads to verifier conclusion
that the packet data is unsafe to read.  The log is from the kerne
without the previous verifier patch to save the <8-byte scalar spill.
67: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0
67: (63) *(u32 *)(r10 -12) = r5
68: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmm????
...
101: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
101: (61) r1 = *(u32 *)(r10 -12)
102: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
102: (bc) w1 = w1
103: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
103: (0f) r7 += r1
last_idx 103 first_idx 67
regs=2 stack=0 before 102: (bc) w1 = w1
regs=2 stack=0 before 101: (61) r1 = *(u32 *)(r10 -12)
104: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
...
127: R0_w=inv1 R1=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
127: (bf) r1 = r7
128: R0_w=inv1 R1_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
128: (07) r1 += 8
129: R0_w=inv1 R1_w=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
129: (b4) w0 = 1
130: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
130: (2d) if r1 > r8 goto pc-66
 R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: (69) r6 = *(u16 *)(r7 +0)
invalid access to packet, off=0 size=2, R7(id=3,off=0,r=0)
R7 offset is outside of the packet

[1]: https://reviews.llvm.org/D109073

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004947.626286-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Linus Torvalds 5bb7b2107f A set of fixes for X86:
- Prevent sending the wrong signal when protection keys are enabled and
    the kernel handles a fault in the vsyscall emulation.
 
  - Invoke early_reserve_memory() before invoking e820_memory_setup() which
    is required to make the Xen dom0 e820 hooks work correctly.
 
  - Use the correct data type for the SETZ operand in the EMQCMDS
    instruction wrapper.
 
  - Prevent undefined behaviour to the potential unaligned accesss in the
    instroction decoder library.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmFQQaITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaZjD/0TF0mE8QUhI4tyGELdNgwvje5iZ9vg
 Nd9KJpR4hUALHgfUD44NVl9JWawFY2d8FXyIPoAFEcvmy6o4f1w0ia8US3hQWA0Y
 EdLSigXi/eYSstkONaJUEBCxlLbwy7JDzaazA9DeKOEuRc7NWSyZURYvzTAkPK1Y
 mbE9kjKhjFa5NGnSB8HbSF2yEzFsKaTo4nreWP/OkzDjnEMshLR1/FUOUvZmlsgA
 CWjMxAVYFqeJN3QhDgR/vRKPoz1sOjDL1s4AsU+xdy63WyFJZ7Z1b8t6bOBoYh6w
 UztkuOkzZ6pIdzz4O1WGoFx4/FJ74qNx0vO/hOB+cKH6rgJs6AkHAvwlnjI/fE2C
 Y+IsuE4PBXMRpkaayTCsAq/enabwgKsmLSUu916APrhVvuUtb3GJgyhedLE3mEBw
 yZXezzRDhNpYop2yQSRXDeKebpoQgl+zqEP5g1O8pAFnud8FGHnz64eJV7Su7Y7C
 BCac0hmv+drlqb/jOSYqjsfo6QfhvR60WwDIgTplOMMLa3plEJFx/rIuU2xVg5g9
 w0m2QUsZboyT2yBnl8gRrqrcQmv2t4iX6TAj9Wm23Lx41h94JQMRtZyJT9bcNqY9
 jMJu27BcNSveciZA7W2DVUlFf/gTF3bwpF7ZDWRt/VSrHPtkI9WKlERhQaywo1L0
 rF8SGCEuNU2ktw==
 =h7v1
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the EMQCMDS
     instruction wrapper.

   - Prevent undefined behaviour to the potential unaligned accesss in
     the instruction decoder library"

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26 10:09:20 -07:00
Linus Torvalds a3b397b4ff Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25 16:20:34 -07:00
Linus Torvalds 90316e6ea0 linux-kselftest-fixes-5.15-rc3
This Kselftest fixes update for Linux 5.15-rc3 consists of:
 
 - fix to Kselftest common framework header install to run before
   other targets for it work correctly in parallel build case.
 - fixes to kvm test to not ignore fscanf() returns which could
   result in inconsistent test behavior and failures.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFOL14ACgkQCwJExA0N
 Qxyz9hAA6LKcNpv7BGXWpTNB2z/p9KcbSWdA3eejvb+3kiovd2AyFPvf1U1fT5kA
 VlJWk4d3cbhPkCrXmkpsI2vBuSb+u5HcwoEy0WH4T4q2aDjoEOnyn3xr2ktnBdXl
 oeWqPusSRKTJn7gCCZN+H6s5JaUbGZxTWsL2n0v5Pzb6sW4hikc8nXpd+1IeoTKq
 0xg7FM6NcaWRzKb6D97wFpQEo6mnC9Zifv6TalxQn71d//n9MXGW2600ZSDDfBRG
 XfolWqVUHGI2Lyy0mIT788fmntA7xOka3Tajzk4WrfmcgczABJFRJr8ZG6iZ+J0j
 dT+KaTqEZHcL1L9Pusf2VehJZTUwKMenc2NmOZ+n6pK+PUL/KZNobcXoFW01I9jI
 z4UYeH9DUnTiaTP8b7OJ1RB7H5XU3CncuO2gELkera212XckdmiTldpox0ywwyh9
 x+X6mk4lzbDSFMPrSPJrVTasvLMEZv6A05I8Td8Gw7mgeYIxupyk0/k5jgjMlZGN
 6CHyQu4iGre3oM72snuthomtzqwArF3bhWAm7ooZYG4qLQ6z6naDdvcpWYmszIV0
 RuC74FZCHZWGXXojVJWPquf47C79QPFQyQcwXxfAaSMY3XezVsI7c/QaXLoT1VeA
 BZbsUoAPvLa8mLRiFU2jZmLbx57rxej5zM+UvKJjI1JdCL9+cUA=
 =Q9zx
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - fix to Kselftest common framework header install to run before other
   targets for it work correctly in parallel build case.

 - fixes to kvm test to not ignore fscanf() returns which could result
   in inconsistent test behavior and failures.

* tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
  selftests: kvm: move get_run_delay() into lib/test_util
  selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
  selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
  selftests: be sure to make khdr before other targets
2021-09-25 15:30:29 -07:00
Linus Torvalds 2c4e969c38 USB driver fixes for 5.15-rc3
Here are some USB driver fixes and new device ids for 5.15-rc3.
 
 They include:
 	- usb-storage quirk additions
 	- usb-serial new device ids
 	- usb-serial driver fixes
 	- USB roothub registration bugfix to resolve a long-reported
 	  issue
 	- usb gadget driver fixes for a large number of small things
 	- dwc2 driver fixes
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYU8uKQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynjlgCeIE84XRGpjE/jCK/63Sjve9zyJjoAn2ZUFwLN
 lcLJxlHV3XHK8coC5/YZ
 =hNgV
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some USB driver fixes and new device ids for 5.15-rc3.

  They include:

   - usb-storage quirk additions

   - usb-serial new device ids

   - usb-serial driver fixes

   - USB roothub registration bugfix to resolve a long-reported issue

   - usb gadget driver fixes for a large number of small things

   - dwc2 driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  MAINTAINERS: usb, update Peter Korsgaard's entries
  usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
  usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
  Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property
  usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval
  usb: gadget: f_uac2: Add missing companion descriptor for feedback EP
  usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
  usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd()
  xhci: Set HCD flag to defer primary roothub registration
  usb: core: hcd: Add support for deferring roothub registration
  usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
  usb: dwc3: core: balance phy init and exit
  Revert "USB: bcma: Add a check for devm_gpiod_get"
  ...
2021-09-25 10:10:38 -07:00
Jakub Kicinski 7fe7f3182a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

1) ipset limits the max allocatable memory via kvmalloc() to MAX_INT,
   from Jozsef Kadlecsik.

2) Check ip_vs_conn_tab_bits value to be in the range specified
   in Kconfig, from Andrea Claudi.

3) Initialize fragment offset in ip6tables, from Jeremy Sowden.

4) Make conntrack hash chain length random, from Florian Westphal.

5) Add zone ID to conntrack and NAT hashtuple again, also from Florian.

6) Add selftests for bidirectional zone support and colliding tuples,
   from Florian Westphal.

7) Unlink table before synchronize_rcu when cleaning tables with
   owner, from Florian.

8) ipset limits the max allocatable memory via kvmalloc() to MAX_INT.

9) Release conntrack entries via workqueue in masquerade, from Florian.

10) Fix bogus net_init in iptables raw table definition, also from Florian.

11) Work around missing softdep in log extensions, from Florian Westphal.

12) Serialize hash resizes and cleanups with mutex, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: conntrack: serialize hash resizes and cleanups
  netfilter: log: work around missing softdep backend module
  netfilter: iptable_raw: drop bogus net_init annotation
  netfilter: nf_nat_masquerade: defer conntrack walk to work queue
  netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic
  netfilter: nf_tables: Fix oversized kvmalloc() calls
  netfilter: nf_tables: unlink table before deleting it
  selftests: netfilter: add zone stress test with colliding tuples
  selftests: netfilter: add selftest for directional zone support
  netfilter: nat: include zone id in nat table hash again
  netfilter: conntrack: include zone id in tuple hash again
  netfilter: conntrack: make max chain length random
  netfilter: ip6_tables: zero-initialize fragment offset
  ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
  netfilter: ipset: Fix oversized kvmalloc() calls
====================

Link: https://lore.kernel.org/r/20210924221113.348767-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-24 17:27:20 -07:00
Changbin Du ebaeab2fe8 tools/vm/page-types: remove dependency on opt_file for idle page tracking
Idle page tracking can also be used for process address space, not only
file mappings.

Without this change, using with '-i' option for process address space
encounters below errors reported.

  $ sudo ./page-types -p $(pidof bash) -i
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  ...

Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:35 -07:00
Yonghong Song 091037fb77 selftests/bpf: Fix btf_dump __int128 test failure with clang build kernel
With clang build kernel (adding LLVM=1 to kernel and selftests/bpf build
command line), I hit the following test failure:

  $ ./test_progs -t btf_dump
  ...
  btf_dump_data:PASS:ensure expected/actual match 0 nsec
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  test_btf_dump_int_data:FAIL:dump __int128 unexpected error: -2 (errno 2)
  #15/9 btf_dump/btf_dump: int_data:FAIL

Further analysis showed gcc build kernel has type "__int128" in dwarf/BTF
and it doesn't exist in clang build kernel. Code searching for kernel code
found the following:
  arch/s390/include/asm/types.h:  unsigned __int128 pair;
  crypto/ecc.c:   unsigned __int128 m = (unsigned __int128)left * right;
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  lib/ubsan.h:typedef __int128 s_max;
  lib/ubsan.h:typedef unsigned __int128 u_max;

In my case, CONFIG_UBSAN is not enabled. Even if we only have "unsigned __int128"
in the code, somehow gcc still put "__int128" in dwarf while clang didn't.
Hence current test works fine for gcc but not for clang.

Enabling CONFIG_UBSAN is an option to provide __int128 type into dwarf
reliably for both gcc and clang, but not everybody enables CONFIG_UBSAN
in their kernel build. So the best choice is to use "unsigned __int128" type
which is available in both clang and gcc build kernels. But clang and gcc
dwarf encoded names for "unsigned __int128" are different:

  [$ ~] cat t.c
  unsigned __int128 a;
  [$ ~] gcc -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000031 "__int128 unsigned")
                  DW_AT_name      ("__int128 unsigned")
  [$ ~] clang -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000033 "unsigned __int128")
                  DW_AT_name      ("unsigned __int128")

The test change in this patch tries to test type name before
doing actual test.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210924025856.2192476-1-yhs@fb.com
2021-09-24 15:35:03 -07:00
Linus Torvalds 1b7eaf5701 arm64 fixes:
- It turns out that the optimised string routines merged in 5.14 are not
   safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the
   end of a string (strcmp, strncmp). Such reading may go across a 16
   byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is
   enabled, use the generic strcmp/strncmp C implementation.
 
 - An errata workaround for ThunderX relied on the CPU capabilities being
   enabled in a specific order. This disappeared with the automatic
   generation of the cpucaps.h file (sorted alphabetically). Fix it by
   checking the current CPU only rather than the system-wide capability.
 
 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.
 
 - kselftests: skip arm64 tests if the required features are missing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmFOC7oACgkQa9axLQDI
 XvHGhBAAkTzrLtyveymoeD5bk0Lh4Co2i0PKqoOl82ltvagjcvUY5lDLfgKV8ac2
 0fZ6E7Jjt6Khq0VAWryRf/RkjxTXmuqDmyb7hZ/t0TNciRXKMfQyE8o8KhsWOvRF
 hD0COUOnJ4CrgQuAJwsW9uagcrix9NucK9WziHB65RACxnD+k7foffCNv1b9Tw9s
 oqAfZ144B+L1Z0WFoqG4uGkngIyG/lcSPJMhDRo45iRWtNUnlR/m/ApWKb+r+X1X
 /Aq2dWklUnrGBB1MdO3WAaRH/lvulH6t4KVhTJw2Gkq5ogy+mYLO15wGYmM+FAWT
 ExS8fy4/G2vX4RAxYRBZF7Xl3zN7F9n3RYKyYfxgGwSy30TnOLYUMoemYvA9Eh27
 O6cjaZacyPVfaB5LUwj6H84NbDbcJKCO775sLuIW8IEMklyZhorS8HG7nYnL6rBk
 jXM5AVPC2j8kXCgawW9u+s/pEQAvLOvM9y7Q0pjj3n2WNeV09bxzkgX5wStSaMYJ
 Ok95zCK1SbKBYdA7yQnFcQkM52XkqpML8vccBHUdxVbtAetuYxoyGKKs/9G6mPEH
 vWSxP4ymvRza62EofmUJzNIEMWwwNUgyNeY/bVJXYf1oc324Wx4T1ewQagrGVQve
 miEgOVEpn+ygHFkEO3mc0wnFfQ3m5CSBvOjeWlTbv7lgekRDyx4=
 =JIq+
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Numfor Mbiziwo-Tiapo 5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Oliver Upton 386ca9d7fd selftests: KVM: Explicitly use movq to read xmm registers
Compiling the KVM selftests with clang emits the following warning:

>> include/x86_64/processor.h:297:25: error: variable 'xmm0' is uninitialized when used here [-Werror,-Wuninitialized]
>>                return (unsigned long)xmm0;

where xmm0 is accessed via an uninitialized register variable.

Indeed, this is a misuse of register variables, which really should only
be used for specifying register constraints on variables passed to
inline assembly. Rather than attempting to read xmm registers via
register variables, just explicitly perform the movq from the desired
xmm register.

Fixes: 783e9e5126 ("kvm: selftests: add API testing infrastructure")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210924005147.1122357-1-oupton@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:58 -04:00
Oliver Upton fbf094ce52 selftests: KVM: Call ucall_init when setting up in rseq_test
While x86 does not require any additional setup to use the ucall
infrastructure, arm64 needs to set up the MMIO address used to signal a
ucall to userspace. rseq_test does not initialize the MMIO address,
resulting in the test spinning indefinitely.

Fix the issue by calling ucall_init() during setup.

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210923220033.4172362-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:12 -04:00
Linus Torvalds f10f0481a5 A fix for a bug with restartable sequences and KVM. KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
 informing rseq and leads to stale data in userspace's rseq struct.
 
 I'm sending this as a separate pull request since it's not code
 that I usually touch.  In particular, patch 2 ("entry: rseq: Call
 rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
 cleanup to try and make future bugs less likely.  If you prefer this to
 be sent via Thomas and only in 5.16, please speak up.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFLPYgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNCowf6A9pTuDspCC2IRICfQnHj/Q/Xc7HH
 UmwYo26GctfYaq9AyIAGzcRx3iOEGjb9fZVJ6mzPUGIygio9ZyuiPHogf7lMAb+x
 39ts5uSOp+N+8e0fvX578WFfmG5hQa4Tp9W3T2Y5KsVgK2Nf8F08DckzIgD8cbkN
 NQKTRIi8AYgb20y3NFZjzsPRxF8850QK7xVCI+LBjryyWpEGT5ZsthrYUeexiJPz
 XN+VOYJen5GXVBCar2JbA7EVSrMZbKSy+M3fJ1vuW5dZHySaiu69JXJHop71jTnJ
 5BGue917MfH6RTDzIFFUcg7NmwcuXHpw4dsFeiyExYFNw1uWWQpk0efC1g==
 =/xlE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Jakub Kicinski 2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Linus Torvalds 9bc62afe03 Networking fixes for 5.15-rc3.
Current release - regressions:
 
  - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()
 
 Previous releases - regressions:
 
  - introduce a shutdown method to mdio device drivers, and make DSA
    switch drivers compatible with masters disappearing on shutdown;
    preventing infinite reference wait
 
  - fix issues in mdiobus users related to ->shutdown vs ->remove
 
  - virtio-net: fix pages leaking when building skb in big mode
 
  - xen-netback: correct success/error reporting for the SKB-with-fraglist
 
  - dsa: tear down devlink port regions when tearing down the devlink
         port on error
 
  - nexthop: fix division by zero while replacing a resilient group
 
  - hns3: check queue, vf, vlan ids range before using
 
 Previous releases - always broken:
 
  - napi: fix race against netpoll causing NAPI getting stuck
 
  - mlx4_en: ensure link operstate is updated even if link comes up
             before netdev registration
 
  - bnxt_en: fix TX timeout when TX ring size is set to the smallest
 
  - enetc: fix illegal access when reading affinity_hint;
           prevent oops on sysfs access
 
  - mtk_eth_soc: avoid creating duplicate offload entries
 
 Misc:
 
  - core: correct the sock::sk_lock.owned lockdep annotations
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFMr4MACgkQMUZtbf5S
 Irv6Ag/+Ml4q6/IVO0jBppztZO1RrSalb3YE9JjPQyMchauVcdcADNpYF+Jo/gcH
 4q+/Oikfp6gkQpJTFd0Y9X7UhwA4Jm4wWtEisqc6PJeHOagDZmVUn353WtgpnCNL
 CgYBfa5k3msGudkqgeXIyiP/2sekBevTy+fOptubLZClyBrEwNUUUZBlpT9aI9Sj
 ru1eMYklfcxP60AQgNhqq6ZwJnRELgN75fSR6ypVCGcRnTK4UGL/b6TvnPYn8uYY
 zeNuMZZzYZK5B73tC6rWpteHWZ7VW3Km0WvIKs+ORM8nYchz/EprKZ0HCLPYrWvf
 ib5Wi7HyL7/n9k9NUTCGrQY3tkOWNzXOepjpiBZPqCG9r2hc3JSR7Q2lFwL+gKv/
 sh2y+T2xfp0WFGmG2XiU2MgnkypMSKah1sC/XRE7YLw02vPAnWQxxl/KVNek4j7M
 CH/Tg9ErVKDRLN7KO/kKl3s8I8N4hdctms/YUt9QD5J9Rw/Jqwr/79bq1uLy6d4o
 //ipmCTHex57Nvy80PtgcuKJhoeqGwR/Av6BvBMRZ1SOYs/C6q45skHTlYyiNY3+
 Dyj9+nfrhsyE835GKPe8lqBFZONBXpXw+EUNXeYRiv0Pcd+JKek07bbajSQVSpd8
 8nqQwylpGII0iPGyOc9wKajzh7O5W2odFIdOwtY/5yVjrcFgBd0=
 =VcL3
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Maxim Levitsky 1ad32105d7 KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
Test that if:

* L1 disables virtual interrupt masking, and INTR intercept.

* L1 setups a virtual interrupt to be injected to L2 and enters L2 with
  interrupts disabled, thus the virtual interrupt is pending.

* Now an external interrupt arrives in L1 and since
  L1 doesn't intercept it, it should be delivered to L2 when
  it enables interrupts.

  to do this L0 (abuses) V_IRQ to setup an
  interrupt window, and returns to L2.

* L2 enables interrupts.
  This should trigger the interrupt window,
  injection of the external interrupt and delivery
  of the virtual interrupt that can now be done.

* Test that now L2 gets those interrupts.

This is the test that demonstrates the issue that was
fixed in the previous patch.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:05:07 -04:00
David Matlack 7c236b816e KVM: selftests: Create a separate dirty bitmap per slot
The calculation to get the per-slot dirty bitmap was incorrect leading
to a buffer overrun. Fix it by splitting out the dirty bitmap into a
separate bitmap per slot.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210917173657.44011-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack 9f2fc5554a KVM: selftests: Refactor help message for -s backing_src
All selftests that support the backing_src option were printing their
own description of the flag and then calling backing_src_help() to dump
the list of available backing sources. Consolidate the flag printing in
backing_src_help() to align indentation, reduce duplicated strings, and
improve consistency across tests.

Note: Passing "-s" to backing_src_help is unnecessary since every test
uses the same flag. However I decided to keep it for code readability
at the call sites.

While here this opportunistically fixes the incorrectly interleaved
printing -x help message and list of backing source types in
dirty_log_perf_test.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack a1e638da1b KVM: selftests: Change backing_src flag to -s in demand_paging_test
Every other KVM selftest uses -s for the backing_src, so switch
demand_paging_test to match.

Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:13 -04:00
Oliver Upton 01f91acb55 selftests: KVM: Align SMCCC call with the spec in steal_time
The SMC64 calling convention passes a function identifier in w0 and its
parameters in x1-x17. Given this, there are two deviations in the
SMC64 call performed by the steal_time test: the function identifier is
assigned to a 64 bit register and the parameter is only 32 bits wide.

Align the call with the SMCCC by using a 32 bit register to handle the
function identifier and increasing the parameter width to 64 bits.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210921171121.2148982-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Oliver Upton 90b54129e8 selftests: KVM: Fix check for !POLLIN in demand_paging_test
The logical not operator applies only to the left hand side of a bitwise
operator. As such, the check for POLLIN not being set in revents wrong.
Fix it by adding parentheses around the bitwise expression.

Fixes: 4f72180eb4 ("KVM: selftests: Add demand paging content to the demand paging test")
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210921171121.2148982-2-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Sean Christopherson 2da4a23599 KVM: selftests: Remove __NR_userfaultfd syscall fallback
Revert the __NR_userfaultfd syscall fallback added for KVM selftests now
that x86's unistd_{32,63}.h overrides are under uapi/ and thus not in
KVM selftests' search path, i.e. now that KVM gets x86 syscall numbers
from the installed kernel headers.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson 61e52f1630 KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
Add a test to verify an rseq's CPU ID is updated correctly if the task is
migrated while the kernel is handling KVM_RUN.  This is a regression test
for a bug introduced by commit 72c3c0fe54 ("x86/kvm: Use generic xfer
to guest work function"), where TIF_NOTIFY_RESUME would be cleared by KVM
without updating rseq, leading to a stale CPU ID and other badness.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Message-Id: <20210901203030.1292304-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson de5f4213da tools: Move x86 syscall number fallbacks to .../uapi/
Move unistd_{32,64}.h from x86/include/asm to x86/include/uapi/asm so
that tools/selftests that install kernel headers, e.g. KVM selftests, can
include non-uapi tools headers, e.g. to get 'struct list_head', without
effectively overriding the installed non-tool uapi headers.

Swapping KVM's search order, e.g. to search the kernel headers before
tool headers, is not a viable option as doing results in linux/type.h and
other core headers getting pulled from the kernel headers, which do not
have the kernel-internal typedefs that are used through tools, including
many files outside of selftests/kvm's control.

Prior to commit cec07f53c3 ("perf tools: Move syscall number fallbacks
from perf-sys.h to tools/arch/x86/include/asm/"), the handcoded numbers
were actual fallbacks, i.e. overriding unistd_{32,64}.h from the kernel
headers was unintentional.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:01 -04:00
Jiri Benc 17b52c226a seltests: bpf: test_tunnel: Use ip neigh
The 'arp' command is deprecated and is another dependency of the selftest.
Just use 'ip neigh', the test depends on iproute2 already.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/40f24b9d3f0f53b5c44471b452f9a11f4d13b7af.1632236133.git.jbenc@redhat.com
2021-09-21 20:40:16 -07:00
Andrii Nakryiko cc10623c68 libbpf: Add legacy uprobe attaching support
Similarly to recently added legacy kprobe attach interface support
through tracefs, support attaching uprobes using the legacy interface if
host kernel doesn't support newer FD-based interface.

For uprobes event name consists of "libbpf_" prefix, PID, sanitized
binary path and offset within that binary. Structuraly the code is
aligned with kprobe logic refactoring in previous patch. struct
bpf_link_perf is re-used and all the same legacy_probe_name and
legacy_is_retprobe fields are used to ensure proper cleanup on
bpf_link__destroy().

Users should be aware, though, that on old kernels which don't support
FD-based interface for kprobe/uprobe attachment, if the application
crashes before bpf_link__destroy() is called, uprobe legacy
events will be left in tracefs. This is the same limitation as with
legacy kprobe interfaces.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko 46ed5fc33d libbpf: Refactor and simplify legacy kprobe code
Refactor legacy kprobe handling code to follow the same logic as uprobe
legacy logic added in the next patchs:
  - add append_to_file() helper that makes it simpler to work with
    tracefs file-based interface for creating and deleting probes;
  - move out probe/event name generation outside of the code that
    adds/removes it, which simplifies bookkeeping significantly;
  - change the probe name format to start with "libbpf_" prefix and
    include offset within kernel function;
  - switch 'unsigned long' to 'size_t' for specifying kprobe offsets,
    which is consistent with how uprobes define that, simplifies
    printf()-ing internally, and also avoids unnecessary complications on
    architectures where sizeof(long) != sizeof(void *).

This patch also implicitly fixes the problem with invalid open() error
handling present in poke_kprobe_events(), which (the function) this
patch removes.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko d3b0e3b03c selftests/bpf: Adopt attach_probe selftest to work on old kernels
Make sure to not use ref_ctr_off feature when running on old kernels
that don't support this feature. This allows to test libbpf's legacy
kprobe and uprobe logic on old kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-3-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko 303a257223 libbpf: Fix memory leak in legacy kprobe attach logic
In some error scenarios legacy_probe string won't be free()'d. Fix this.
This was reported by Coverity static analysis.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org
2021-09-21 19:40:08 -07:00
Cristian Marussi 0e3dbf765f kselftest/arm64: signal: Skip tests if required features are missing
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.

Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.

Fixes: f96bf43403 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210920121228.35368-1-cristian.marussi@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-21 18:12:03 +01:00
Florian Westphal cb89f63ba6 selftests: netfilter: add zone stress test with colliding tuples
Add 20k entries to the connection tracking table, once from the
data plane, once via ctnetlink.

In both cases, each entry lives in a different conntrack zone
and addresses/ports are identical.

Expectation is that insertions work and occurs in constant time:

PASS: added 10000 entries in 1215 ms (now 10000 total, loop 1)
PASS: added 10000 entries in 1214 ms (now 20000 total, loop 2)
PASS: inserted 20000 entries from packet path in 2434 ms total
PASS: added 10000 entries in 57631 ms (now 10000 total)
PASS: added 10000 entries in 58572 ms (now 20000 total)
PASS: inserted 20000 entries via ctnetlink in 116205 ms

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Florian Westphal 0f1148abb2 selftests: netfilter: add selftest for directional zone support
Add a script to exercise NAT port clash resolution with directional zones.

Add net namespaces that use the same IP address and connect them to a
gateway.

Gateway uses policy routing based on iif/mark and conntrack zones to
isolate the client namespaces.  In server direction, same zone with NAT
to single address is used.

Then, connect to a server from each client netns, using identical
connection id, i.e.  saddr:sport -> daddr:dport.

Expectation is for all connections to succeeed: NAT gatway is
supposed to do port reallocation for each of the (clashing) connections.

This is based on the description/use case provided in the commit message of
deedb59039 ("netfilter: nf_conntrack: add direction support for zones").

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Grant Seltzer 97c140d94e libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:
- bpf_object__find_map_by_name()
- bpf_map__fd()
- bpf_map__is_internal()
- libbpf_get_error()
- libbpf_num_possible_cpus()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com
2021-09-20 17:23:31 -07:00
Linus Torvalds 62453a460a powerpc fixes for 5.15 #2
Fix crashes when scv (System Call Vectored) is used to make a syscall when a transaction
 is active, on Power9 or later.
 
 Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend mode.
 
 Fix crashes when handling machine checks in LPARs using the Hash MMU.
 
 Partly revert a recent change to our XICS interrupt controller code, which broke the
 recently added Microwatt support.
 
 Thanks to: Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel Stanley,
 Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmFHDfsTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNlAD/oC/14jV2fCv54vH5Al1XAz2WuPhhEu
 UsN+MLlx7JshXfLeGzzEogvvakfLFuVkpIxsv+g8a9H6jejjCj/A4wjUy34Im0hS
 nhsMsa1vZPZY/Vz9v3WzdYmqjHVZDKo6ZdUmn0z4OjPnv7wQaEEbaXmoe0+CaXXQ
 jIY2TjVQxbgXeUI1QcPUWh36hiSaYn4O8fAD4VdjxdStl2z6q8zbo/5nzQ8dkBpP
 lldC0T/HRCKLe1ODUExFfbBFqFuLVdbALq0RIuh0GXd+dAqpK1iOkNsxUWChU6Q7
 gu0PrmhRydX3bmL9jmOsZfUk/hIR0fEgwSGITQAb8v0LCvZI0JRn+yvkl7uVZ9Ce
 OWY7B8HZQldvoQqA4/f7YCpAIS2ag7ugDrcAfuDVkDl2o/qJHoTi58IU24wI6+LL
 lwJZjNJoxZ8n1tl1l+gtCNp4E4YYOfuKZi0zH0oJoLiSWeK6PYOvwEGuEC2iQ9cJ
 BEdyYVXqSBuc2anCQ4k+y8gu/zJwP1MCpDNMPMU4P/JcaG9lnrLUhvctf4GKtOSy
 UiktHSTL52J5ldMxSMicpw9qCVgePsQX78Sk3UTHRilRmix6TtWOWYslm4mL+zeD
 u3wpnK+NKkju72LRS/AB1U6cWvModW+WBANPws9j6caoOorh397xsDGEYiKWvp0Z
 xSG/dzXzi2oscQ==
 =yC9U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix crashes when scv (System Call Vectored) is used to make a syscall
   when a transaction is active, on Power9 or later.

 - Fix bad interactions between rfscv (Return-from scv) and Power9
   fake-suspend mode.

 - Fix crashes when handling machine checks in LPARs using the Hash MMU.

 - Partly revert a recent change to our XICS interrupt controller code,
   which broke the recently added Microwatt support.

Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero,
Joel Stanley, Nicholas Piggin.

* tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xics: Set the IRQ chip data for the ICS native backend
  powerpc/mce: Fix access error in mce handler
  KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers
  powerpc/64s: system call rfscv workaround for TM bugs
  selftests/powerpc: Add scv versions of the basic TM syscall tests
  powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
2021-09-19 13:00:23 -07:00
Shuah Khan e30cd812df selftests: net: af_unix: Fix makefile to use TEST_GEN_PROGS
Makefile uses TEST_PROGS instead of TEST_GEN_PROGS to define
executables. TEST_PROGS is for shell scripts that need to be
installed and run by the common lib.mk framework. The common
framework doesn't touch TEST_PROGS when it does build and clean.

As a result "make kselftest-clean" and "make clean" fail to remove
executables. Run and install work because the common framework runs
and installs TEST_PROGS. Build works because the Makefile defines
"all" rule which is unnecessary if TEST_GEN_PROGS is used.

Use TEST_GEN_PROGS so the common framework can handle build/run/
install/clean properly.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:22:18 +01:00
Shuah Khan 48514a2233 selftests: net: af_unix: Fix incorrect args in test result msg
Fix the args to fprintf(). Splitting the message ends up passing
incorrect arg for "sigurg %d" and an extra arg overall. The test
result message ends up incorrect.

test_unix_oob.c: In function ‘main’:
test_unix_oob.c:274:43: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘char *’ [-Wformat=]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",
      |                                          ~^
      |                                           |
      |                                           int
      |                                          %s
  275 |   "atmark %d\n", signal_recvd, len, oob, atmark);
      |   ~~~~~~~~~~~~~
      |   |
      |   char *
test_unix_oob.c:274:19: warning: too many arguments for format [-Wformat-extra-args]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:15:19 +01:00
Andrii Nakryiko 219d720e6d perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against old version of
libbpf shared library. Unfortunately this causes compilation warning
when perf is compiled against libbpf v0.6+.

For now, just ignore deprecation warning, but there might be a better
solution, depending on perf's needs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: kernel-team@fb.com
LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:47:02 -03:00
Ian Rogers aba5daeb64 libperf evsel: Make use of FD robust.
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case of with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.

  $ sudo gdb --args perf stat --iostat  list
  ...
  Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
  50      {
  (gdb) bt
   #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
   #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
      threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
   #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
      at util/evsel.c:2045
   #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
      at util/evsel.c:2065
   #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
      config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
   #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:833
   #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:1048
   #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
   #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
      argv=0x7fffffffe4a0) at perf.c:313
   #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
   #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
   #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
  ...
  (gdb) c
  Continuing.
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
  /bin/dmesg | grep -i perf may provide additional information.

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
  166                     if (FD(evsel, cpu, thread) >= 0)

v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
    backward.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:06 -03:00
Michael Petlan 57f0ff059e perf machine: Initialize srcline string member in add_location struct
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Adrian Hunter ff6f41fbce perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to
output type is corrected.

Example on ADL using taskset to select an atom cpu:

 # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

 Before:

  # perf script | head
         taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
           uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

 After:

  # perf script | head
         taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
           uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Ravi Bangoria 7efbcc8c07 perf annotate: Fix fused instr logic for assembly functions
Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
with branch instructions, and thus perf annotate highlight such valid
pairs as fused.

When annotated with source, perf uses struct disasm_line to contain
either source or instruction line from objdump output. Usually, a C
statement generates multiple instructions which include such
cmp/test/ALU + branch instruction pairs. But in case of assembly
function, each individual assembly source line generate one
instruction.

The 'perf annotate' instruction fusion logic assumes the previous
disasm_line as the previous instruction line, which is wrong because,
for assembly function, previous disasm_line contains source line.  And
thus perf fails to highlight valid fused instruction pairs for assembly
functions.

Fix it by searching backward until we find an instruction line and
consider that disasm_line as fused with current branch instruction.

Before:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │      cmp    %rcx,0x88(%rsp)
         │    je      .Lerror_bad_iret      <--- Source line
    0.14 │   ┌──je     b4                   <--- Instruction line
         │   │movl    %ecx, %eax

After:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │   ┌──cmp    %rcx,0x88(%rsp)
         │   │je      .Lerror_bad_iret
    0.14 │   ├──je     b4
         │   │movl    %ecx, %eax

Reviewed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Florian Westphal ce9979129a selftests: mptcp: add mptcp getsockopt test cases
Add a test program that retrieves the three info types:
1. mptcp meta information
2. tcp info for subflow
3. subflow endpoint addresses

For all three rudimentary checks are added.

1. Meta information checks that the logical mptcp
   sequence numbers advance as expected, based on the bytes read
   (init seq + bytes_received/sent) and the connection state
   (after close, we should exect 1 extra byte due to FIN).

2. TCP info checks the number of bytes sent/received vs.
   sums of read/write syscall return values.

3. Subflow endpoint addresses are checked vs. getsockname/getpeername
   result.

Tests for forward compatibility (0-initialisation of output-only
fields in mptcp_subflow_data structure) are added as well.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:20:01 +01:00
Dave Marchevsky a42effb0b2 bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments
Since the data_len in these two functions is a byte len of the preceding
u64 *data array, it must always be a multiple of 8. If this isn't the
case both helpers error out, so let's make the requirement explicit so
users don't need to infer it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky 7606729fe2 selftests/bpf: Add trace_vprintk test prog
This commit adds a test prog for vprintk which confirms that:
  * bpf_trace_vprintk is writing to /sys/kernel/debug/tracing/trace_pipe
  * __bpf_vprintk macro works as expected
  * >3 args are printed
  * bpf_printk w/ 0 format args compiles
  * bpf_trace_vprintk call w/ a fmt specifier but NULL fmt data fails

Approach and code are borrowed from trace_printk test.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-9-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky d313d45a22 selftests/bpf: Migrate prog_tests/trace_printk CHECKs to ASSERTs
Guidance for new tests is to use ASSERT macros instead of CHECK. Since
trace_vprintk test will borrow heavily from trace_printk's, migrate its
CHECKs so it remains obvious that the two are closely related.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-8-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 4190c299a4 bpftool: Only probe trace_vprintk feature in 'full' mode
Since commit 368cb0e7cd ("bpftool: Make probes which emit dmesg
warnings optional"), some helpers aren't probed by bpftool unless
`full` arg is added to `bpftool feature probe`.

bpf_trace_vprintk can emit dmesg warnings when probed, so include it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-7-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 6c66b0e7c9 libbpf: Use static const fmt string in __bpf_printk
The __bpf_printk convenience macro was using a 'char' fmt string holder
as it predates support for globals in libbpf. Move to more efficient
'static const char', but provide a fallback to the old way via
BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky c2758baa97 libbpf: Modify bpf_printk to choose helper based on arg count
Instead of being a thin wrapper which calls into bpf_trace_printk,
libbpf's bpf_printk convenience macro now chooses between
bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding
format string) is >3, use bpf_trace_vprintk, otherwise use the older
helper.

The motivation behind this added complexity - instead of migrating
entirely to bpf_trace_vprintk - is to maintain good developer experience
for users compiling against new libbpf but running on older kernels.
Users who are passing <=3 args to bpf_printk will see no change in their
bytecode.

__bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF
macros elsewhere in the file - it allows use of bpf_trace_vprintk
without manual conversion of varargs to u64 array. Previous
implementation of bpf_printk macro is moved to __bpf_printk for use by
the new implementation.

This does change behavior of bpf_printk calls with >3 args in the "new
libbpf, old kernels" scenario. Before this patch, attempting to use 4
args to bpf_printk results in a compile-time error. After this patch,
using bpf_printk with 4 args results in a trace_vprintk helper call
being emitted and a load-time failure on older kernels.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 10aceb629e bpf: Add bpf_trace_vprintk helper
This helper is meant to be "bpf_trace_printk, but with proper vararg
support". Follow bpf_snprintf's example and take a u64 pseudo-vararg
array. Write to /sys/kernel/debug/tracing/trace_pipe using the same
mechanism as bpf_trace_printk. The functionality of this helper was
requested in the libbpf issue tracker [0].

[0] Closes: https://github.com/libbpf/libbpf/issues/315

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 84b4c52960 selftests/bpf: Stop using bpf_program__load
bpf_program__load is not supposed to be used directly. Replace it with
bpf_object__ APIs for the reference_tracking prog_test, which is the
last offender in bpf selftests.

Some additional complexity is added for this test, namely the use of one
bpf_object to iterate through progs, while a second bpf_object is
created and opened/closed to test actual loading of progs. This is
because the test was doing bpf_program__load then __unload to test
loading of individual progs and same semantics with
bpf_object__load/__unload result in failure to load an __unload-ed obj.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-3-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Jakub Kicinski af54faab84 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-09-17

We've added 63 non-merge commits during the last 12 day(s) which contain
a total of 65 files changed, 2653 insertions(+), 751 deletions(-).

The main changes are:

1) Streamline internal BPF program sections handling and
   bpf_program__set_attach_target() in libbpf, from Andrii.

2) Add support for new btf kind BTF_KIND_TAG, from Yonghong.

3) Introduce bpf_get_branch_snapshot() to capture LBR, from Song.

4) IMUL optimization for x86-64 JIT, from Jie.

5) xsk selftest improvements, from Magnus.

6) Introduce legacy kprobe events support in libbpf, from Rafael.

7) Access hw timestamp through BPF's __sk_buff, from Vadim.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
  selftests/bpf: Fix a few compiler warnings
  libbpf: Constify all high-level program attach APIs
  libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
  selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
  libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
  libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
  selftests/bpf: Stop using relaxed_core_relocs which has no effect
  libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
  bpf: Update bpf_get_smp_processor_id() documentation
  libbpf: Add sphinx code documentation comments
  selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
  docs/bpf: Add documentation for BTF_KIND_TAG
  selftests/bpf: Add a test with a bpf program with btf_tag attributes
  selftests/bpf: Test BTF_KIND_TAG for deduplication
  selftests/bpf: Add BTF_KIND_TAG unit tests
  selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
  selftests/bpf: Test libbpf API function btf__add_tag()
  bpftool: Add support for BTF_KIND_TAG
  libbpf: Add support for BTF_KIND_TAG
  libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  ...
====================

Link: https://lore.kernel.org/r/20210917173738.3397064-1-ast@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 12:40:21 -07:00
Yonghong Song ca21a3e5ed selftests/bpf: Fix a few compiler warnings
With clang building selftests/bpf, I hit a few warnings like below:

  .../bpf_iter.c:592:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../bpf_iter.c:688:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../tc_redirect.c:657:6: warning: variable 'target_fd' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
  if (!ASSERT_OK_PTR(nstoken, "setns " NS_FWD))
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  .../tc_redirect.c:743:6: note: uninitialized use occurs here
  if (target_fd >= 0)
      ^~~~~~~~~

Removing unused variables and initializing the previously-uninitialized variable
to ensure these warnings are gone.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210917043343.3711917-1-yhs@fb.com
2021-09-17 09:10:54 -07:00
Andrii Nakryiko 942025c9f3 libbpf: Constify all high-level program attach APIs
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so
change all struct bpf_program and struct bpf_map pointers to const
pointers. This is completely backwards compatible with no functional
change.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 91b555d73e libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption
that bpf_object contains either only single freplace BPF program or all
of BPF programs in BPF object are freplaces intended to replace
different subprograms of the same target BPF program. This seems both
a bit confusing, too assuming, and limiting.

We've had bpf_program__set_attach_target() API which allows more
fine-grained control over this, on a per-program level. As such, mark
open_opts.attach_prog_fd as deprecated starting from v0.7, so that we
have one more universal way of setting freplace targets. With previous
change to allow NULL attach_func_name argument, and especially combined
with BPF skeleton, arguable bpf_program__set_attach_target() is a more
convenient and explicit API as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 60aed22076 selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
Switch fexit_bpf2bpf selftest to bpf_program__set_attach_target()
instead of using bpf_object_open_opts.attach_prog_fd, which is going to
be deprecated. These changes also demonstrate the new mode of
set_attach_target() in which it allows NULL when the target is BPF
program (attach_prog_fd != 0).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-6-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 2d5ec1c66e libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
Allow to use bpf_program__set_attach_target to only set target attach
program FD, while letting libbpf to use target attach function name from
SEC() definition. This might be useful for some scenarios where
bpf_object contains multiple related freplace BPF programs intended to
replace different sub-programs in target BPF program. In such case all
programs will have the same attach_prog_fd, but different
attach_func_name. It's convenient to specify such target function names
declaratively in SEC() definitions, but attach_prog_fd is a dynamic
runtime setting.

To simplify such scenario, allow bpf_program__set_attach_target() to
delay BTF ID resolution till the BPF program load time by providing NULL
attach_func_name. In that case the behavior will be similar to using
bpf_object_open_opts.attach_prog_fd (which is marked deprecated since
v0.7), but has the benefit of allowing more control by user in what is
attached to what. Such setup allows having BPF programs attached to
different target attach_prog_fd with target functions still declaratively
recorded in BPF source code in SEC() definitions.

Selftests changes in the next patch should make this more obvious.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org
2021-09-17 09:05:00 -07:00
Andrii Nakryiko 277641859e libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
It's relevant and hasn't been doing anything for a long while now.
Deprecated it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko 23a7baaa93 selftests/bpf: Stop using relaxed_core_relocs which has no effect
relaxed_core_relocs option hasn't had any effect for a while now, stop
specifying it. Next patch marks it as deprecated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-3-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko f11f86a393 libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
Don't perform another search for sec_def inside
libbpf_find_attach_btf_id(), as each recognized bpf_program already has
prog->sec_def set.

Also remove unnecessary NULL check for prog->sec_name, as it can never
be NULL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Vladimir Oltean 3c9cfb5269 net: update NXP copyright text
NXP Legal insists that the following are not fine:

- Saying "NXP Semiconductors" instead of "NXP", since the company's
  registered name is "NXP"

- Putting a "(c)" sign in the copyright string

- Putting a comma in the copyright string

The only accepted copyright string format is "Copyright <year-range> NXP".

This patch changes the copyright headers in the networking files that
were sent by me, or derived from code sent by me.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 13:52:17 +01:00
Jakub Kicinski 561bed688b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts!

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-16 13:58:38 -07:00
Linus Torvalds fc0c0548c1 Networking fixes for 5.15-rc2, including fixes from bpf.
Current release - regressions:
 
  - vhost_net: fix OoB on sendmsg() failure
 
  - mlx5: bridge, fix uninitialized variable usage
 
  - bnxt_en: fix error recovery regression
 
 Current release - new code bugs:
 
  - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()
 
 Previous releases - regressions:
 
  - r6040: restore MDIO clock frequency after MAC reset
 
  - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
 
  - dsa: flush switchdev workqueue before tearing down CPU/DSA ports
 
 Previous releases - always broken:
 
  - ptp: dp83640: don't define PAGE0, avoid compiler warning
 
  - igc: fix tunnel segmentation offloads
 
  - phylink: update SFP selected interface on advertising changes
 
  - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
 
  - mlx5e: fix mutual exclusion between CQE compression and HW TS
 
 Misc:
 
  - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode
 
  - sfc: fallback for lack of xdp tx queues
 
  - hns3: add option to turn off page pool feature
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFDnXIACgkQMUZtbf5S
 Irs+pA//UvT0UqL+G/9He9eVMaX/7mnvoShbU/hHaymLuvJE4dx6B3qw8im5fQ0i
 b7ovy1dxjMjSQI4E+2FleajRBgsdXvWgMCopFZr7h35+qPXziLxTTztwsaOIjD8U
 42HBpdwFrVtdrOhSsiKsGKe8KXqIehHhFAp4vQrRlsvsz1Yk/93o6wYaXaXqR6pQ
 qlTVgI7nAuLAw4VfFZ3k7iAiQsyrXrn5kUXq5/Wy6kK7iFnIN7X9j+epTJP/EXHn
 +17SW/iGnjUfwXsRSC2ysgICtOPatu48Cke5hq5XOGqXAtRVhhLfMDiZlr8a6+z4
 gS+QETakBXLrAfvYK6neLjlMhg1FX+Y952xd3XIU4jz+DgIh03r2Nd1xBsp17Yhu
 RRBc96w2KxuHQ6USEXbi5dY1K6o9DK52m/WRmnqXnaNX4ADeesgcrxXHqZ6O6I5R
 +ddvVWylaHxvngCQSL4DvZdYNJVoKjdlo6fHrwD1qu0Uv0rGz6O+AEI9o8DkK5Af
 ZhFBETvjTKzjCykd3UlaRo/ibYChIPYwpGTK+TTI9rUm8OWNw1+sC/Z14bx0/3Vl
 RB53rkJ93/qi6Wb9ljUxM8CPGViuPvMkQJjpJOaktrj0GOgQz3EBdDkIQMaplZx5
 r2AeHW7X5VGxCWMRg4NS/x7Do1jh8m/ch8/EulOAB5fG7KQPfYs=
 =xd6p
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Shuah Khan f5013d412a selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
Fix get_run_delay() to check fscanf() return value to get rid of the
following warning. When fscanf() fails return MIN_RUN_DELAY_NS from
get_run_delay(). Move MIN_RUN_DELAY_NS from steal_time.c to test_util.h
so get_run_delay() and steal_time.c can use it.

lib/test_util.c: In function ‘get_run_delay’:
lib/test_util.c:316:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  316 |  fscanf(fp, "%ld %ld ", &val[0], &val[1]);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:32 -06:00
Shuah Khan 20175d5eac selftests: kvm: move get_run_delay() into lib/test_util
get_run_delay() is defined static in xen_shinfo_test and steal_time test.
Move it to lib and remove code duplication.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:26 -06:00
Shuah Khan 3a4f0cc693 selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
Fix get_trans_hugepagesz() to check fscanf() return value to get rid
of the following warning:

lib/test_util.c: In function ‘get_trans_hugepagesz’:
lib/test_util.c:138:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  138 |  fscanf(f, "%ld", &size);
      |  ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:11 -06:00
Shuah Khan 39a71f712d selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:

x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   85 |  fscanf(f, "%d", &warnings);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:02 -06:00
Xiang wangx 98dc68f8b0 selftests: nci: replace unsigned int with int
Should not use comparison of unsigned expressions < 0.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:55:51 +01:00
Matteo Croce 336562752a bpf: Update bpf_get_smp_processor_id() documentation
BPF programs run with migration disabled regardless of preemption, as
they are protected by migrate_disable(). Update the uapi documentation
accordingly.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com
2021-09-15 22:39:55 +02:00
Grant Seltzer 69cd823956 libbpf: Add sphinx code documentation comments
This adds comments above five functions in btf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com
2021-09-15 13:16:02 -07:00
Masami Hiramatsu 80be5998ad tools/bootconfig: Define memblock_free_ptr() to fix build error
The lib/bootconfig.c file is shared with the 'bootconfig' tooling, and
as a result, the changes incommit 77e02cf57b ("memblock: introduce
saner 'memblock_free_ptr()' interface") need to also be reflected in the
tooling header file.

So define the new memblock_free_ptr() wrapper, and remove unused __pa()
and memblock_free().

Fixes: 77e02cf57b ("memblock: introduce saner 'memblock_free_ptr()' interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15 09:49:48 -07:00
Li Zhijian 8914a7a247 selftests: be sure to make khdr before other targets
LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?

Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.

Here we mark target khdr as NOTPARALLEL to make it be always built
first.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-15 10:34:21 -06:00
Yonghong Song 2220ecf55c selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
Commit c240ba2878 ("selftests/bpf: Add a test with a bpf
program with btf_tag attributes") added btf_tag selftest
to test BTF_KIND_TAG generation from C source code, and to
test kernel validation of generated BTF types.
But if an old clang (clang 13 or earlier) is used, the
following compiler warning may be seen:
  progs/tag.c:23:20: warning: unknown attribute 'btf_tag' ignored
and the test itself is marked OK. The compiler warning is bad
and the test itself shouldn't be marked OK.

This patch added the check for btf_tag attribute support.
If btf_tag is not supported by the clang, the attribute will
not be used in the code and the test will be marked as skipped.
For example, with clang 13:
  ./test_progs -t btf_tag
  #21 btf_tag:SKIP
  Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED

The selftests/README.rst is updated to clarify when the btf_tag
test may be skipped.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210915061036.2577971-1-yhs@fb.com
2021-09-15 08:49:49 -07:00
Jakub Kicinski 2d6a58996e selftests: net: test ethtool -L vs mq
Add a selftest for checking mq children are visible after ethtool -L.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Yonghong Song c240ba2878 selftests/bpf: Add a test with a bpf program with btf_tag attributes
Add a bpf program with btf_tag attributes. The program is
loaded successfully with the kernel. With the command
  bpftool btf dump file ./tag.o
the following dump shows that tags are properly encoded:
  [8] STRUCT 'key_t' size=12 vlen=3
          'a' type_id=2 bits_offset=0
          'b' type_id=2 bits_offset=32
          'c' type_id=2 bits_offset=64
  [9] TAG 'tag1' type_id=8 component_id=-1
  [10] TAG 'tag2' type_id=8 component_id=-1
  [11] TAG 'tag1' type_id=8 component_id=1
  [12] TAG 'tag2' type_id=8 component_id=1
  ...
  [21] FUNC_PROTO '(anon)' ret_type_id=2 vlen=1
          'x' type_id=2
  [22] FUNC 'foo' type_id=21 linkage=static
  [23] TAG 'tag1' type_id=22 component_id=0
  [24] TAG 'tag2' type_id=22 component_id=0
  [25] TAG 'tag1' type_id=22 component_id=-1
  [26] TAG 'tag2' type_id=22 component_id=-1
  ...
  [29] VAR 'total' type_id=27, linkage=global
  [30] TAG 'tag1' type_id=29 component_id=-1
  [31] TAG 'tag2' type_id=29 component_id=-1

If an old clang compiler, which does not support btf_tag attribute,
is used, these btf_tag attributes will be silently ignored.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223058.248949-1-yhs@fb.com
2021-09-14 18:45:53 -07:00
Yonghong Song ad526474ae selftests/bpf: Test BTF_KIND_TAG for deduplication
Add unit tests for BTF_KIND_TAG deduplication for
  - struct and struct member
  - variable
  - func and func argument

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223052.248535-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 35baba7a83 selftests/bpf: Add BTF_KIND_TAG unit tests
Test good and bad variants of BTF_KIND_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223047.248223-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 3df3bd68d4 selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
BTF_KIND_TAG ELF format has a component_idx which might have value -1.
test_btf may confuse it with common_type.name as NAME_NTH checkes
high 16bit to be 0xffff. Change NAME_NTH high 16bit check to be
0xfffe so it won't confuse with component_idx.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223041.248009-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 71d29c2d47 selftests/bpf: Test libbpf API function btf__add_tag()
Add btf_write tests with btf__add_tag() function.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223036.247560-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 5c07f2fec0 bpftool: Add support for BTF_KIND_TAG
Added bpftool support to dump BTF_KIND_TAG information.
The new bpftool will be used in later patches to dump
btf in the test bpf program object file.

Currently, the tags are not emitted with
  bpftool btf dump file <path> format c
and they are silently ignored.  The tag information is
mostly used in the kernel for verification purpose and the kernel
uses its own btf to check. With adding these tags
to vmlinux.h, tags will be encoded in program's btf but
they will not be used by the kernel, at least for now.
So let us delay adding these tags to format C header files
until there is a real need.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223031.246951-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 5b84bd1036 libbpf: Add support for BTF_KIND_TAG
Add BTF_KIND_TAG support for parsing and dedup.
Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not
supported in the kernel, sanitize it to INTs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 30025e8bd8 libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
This patch renames functions btf_{hash,equal}_int() to
btf_{hash,equal}_int_tag() so they can be reused for
BTF_KIND_TAG support. There is no functionality change for
this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song b5ea834dde bpf: Support for new btf kind BTF_KIND_TAG
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

For linux kernel, the btf_tag can be applied
in various places to specify user pointer,
function pre- or post- condition, function
allow/deny in certain context, etc. Such information
will be encoded in vmlinux BTF and can be used
by verifier.

The btf_tag can also be applied to bpf programs
to help global verifiable functions, e.g.,
specifying preconditions, etc.

This patch added basic parsing and checking support
in kernel for new BTF_KIND_TAG kind.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D106621
 [3] https://reviews.llvm.org/D106622
 [4] https://reviews.llvm.org/D109560

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 41ced4cd88 btf: Change BTF_KIND_* macros to enums
Change BTF_KIND_* macros to enums so they are encoded in dwarf and
appear in vmlinux.h. This will make it easier for bpf programs
to use these constants without macro definitions.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Andrii Nakryiko 8987ede3ed selftests/bpf: Fix .gitignore to not ignore test_progs.c
List all possible test_progs flavors explicitly to avoid accidentally
ignoring valid source code files. In this case, test_progs.c was still
ignored after recent 809ed84de8 ("selftests/bpf: Whitelist test_progs.h
from .gitignore") fix that added exception only for test_progs.h.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914162228.3995740-1-andrii@kernel.org
2021-09-14 18:41:14 -07:00
Jie Meng c035407743 bpf,x64 Emit IMUL instead of MUL for x86-64
IMUL allows for multiple operands and saving and storing rax/rdx is no
longer needed. Signedness of the operands doesn't matter here because
the we only keep the lower 32/64 bit of the product for 32/64 bit
multiplications.

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913211337.1564014-1-jmeng@fb.com
2021-09-14 18:36:36 -07:00
Andrii Nakryiko b6291a6f30 libbpf: Minimize explicit iterator of section definition array
Remove almost all the code that explicitly iterated BPF program section
definitions in favor of using find_sec_def(). The only remaining user of
section_defs is libbpf_get_type_names that has to iterate all of them to
construct its result.

Having one internal API entry point for section definitions will
simplify further refactorings around libbpf's program section
definitions parsing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 5532dfd42e libbpf: Simplify BPF program auto-attach code
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF
programs, as it is already recorded at bpf_object__open() time for all
recognized type of BPF programs. This further reduces number of explicit
calls to find_sec_def(), simplifying further refactorings.

No functional changes are done by this patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 91b4d1d1d5 libbpf: Ensure BPF prog types are set before relocations
Refactor bpf_object__open() sequencing to perform BPF program type
detection based on SEC() definitions before we get to relocations
collection. This allows to have more information about BPF program by
the time we get to, say, struct_ops relocation gathering. This,
subsequently, simplifies struct_ops logic and removes the need to
perform extra find_sec_def() resolution.

With this patch libbpf will require all struct_ops BPF programs to be
marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations.
Real-world applications are already doing that through something like
selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's
internal handling of SEC() definitions and is in the sprit of
upcoming libbpf-1.0 section strictness changes ([0]).

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 53df63ccdc selftests/bpf: Update selftests to always provide "struct_ops" SEC
Update struct_ops selftests to always specify "struct_ops" section
prefix. Libbpf will require a proper BPF program type set in the next
patch, so this prevents tests breaking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-2-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Rafael David Tinoco ca304b40c2 libbpf: Introduce legacy kprobe events support
Allow kprobe tracepoint events creation through legacy interface, as the
kprobe dynamic PMUs support, used by default, was only created in v4.17.

Store legacy kprobe name in struct bpf_perf_link, instead of creating
a new "subclass" off of bpf_perf_link. This is ok as it's just two new
fields, which are also going to be reused for legacy uprobe support in
follow up patches.

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com
2021-09-14 14:44:45 -07:00
David S. Miller 2865ba8247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-14

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 13 day(s) which contain
a total of 18 files changed, 334 insertions(+), 193 deletions(-).

The main changes are:

1) Fix mmap_lock lockdep splat in BPF stack map's build_id lookup, from Yonghong Song.

2) Fix BPF cgroup v2 program bypass upon net_cls/prio activation, from Daniel Borkmann.

3) Fix kvcalloc() BTF line info splat on oversized allocation attempts, from Bixuan Cui.

4) Fix BPF selftest build of task_pt_regs test for arm64/s390, from Jean-Philippe Brucker.

5) Fix BPF's disasm.{c,h} to dual-license so that it is aligned with bpftool given the former
   is a build dependency for the latter, from Daniel Borkmann with ACKs from contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 13:09:54 +01:00
Faizel K B f81c08f897 usb: testusb: Fix for showing the connection speed
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from  the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade.  The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.

sudo ./testusb -a
high speed      /dev/bus/usb/001/011    0
/dev/bus/usb/001/011 test 0,    0.000015 secs
/dev/bus/usb/001/011 test 1,    0.194208 secs
/dev/bus/usb/001/011 test 2,    0.077289 secs
/dev/bus/usb/001/011 test 3,    0.170604 secs
/dev/bus/usb/001/011 test 4,    0.108335 secs
/dev/bus/usb/001/011 test 5,    2.788076 secs
/dev/bus/usb/001/011 test 6,    2.594610 secs
/dev/bus/usb/001/011 test 7,    2.905459 secs
/dev/bus/usb/001/011 test 8,    2.795193 secs
/dev/bus/usb/001/011 test 9,    8.372651 secs
/dev/bus/usb/001/011 test 10,    6.919731 secs
/dev/bus/usb/001/011 test 11,   16.372687 secs
/dev/bus/usb/001/011 test 12,   16.375233 secs
/dev/bus/usb/001/011 test 13,    2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17,    0.148826 secs
/dev/bus/usb/001/011 test 18,    0.068718 secs
/dev/bus/usb/001/011 test 19,    0.125992 secs
/dev/bus/usb/001/011 test 20,    0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24,    4.133763 secs
/dev/bus/usb/001/011 test 27,    2.140066 secs
/dev/bus/usb/001/011 test 28,    2.120713 secs
/dev/bus/usb/001/011 test 29,    0.507762 secs

Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14 10:31:41 +02:00
Daniel Borkmann 43d2b88c29 bpf, selftests: Add test case for mixed cgroup v1/v2
Minimal selftest which implements a small BPF policy program to the
connect(2) hook which rejects TCP connection requests to port 60123
with EPERM. This is being attached to a non-root cgroup v2 path. The
test asserts that this works under cgroup v2-only and under a mixed
cgroup v1/v2 environment where net_classid is set in the former case.

Before fix:

  # ./test_progs -t cgroup_v1v2
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  test_cgroup_v1v2:PASS:client_fd 0 nsec
  test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  run_test:PASS:join_classid 0 nsec
  (network_helpers.c:219: errno: None) Unexpected success to connect to server
  test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 0)
  #27 cgroup_v1v2:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

After fix:

  # ./test_progs -t cgroup_v1v2
  #27 cgroup_v1v2:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-3-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Daniel Borkmann d8079d8026 bpf, selftests: Add cgroup v1 net_cls classid helpers
Minimal set of helpers for net_cls classid cgroupv1 management in order
to set an id, join from a process, initiate setup and teardown. cgroupv2
helpers are left as-is, but reused where possible.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-2-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Nathan Chancellor d0ee23f9d7 tools: compiler-gcc.h: Guard error attribute use with __has_attribute
When building objtool with HOSTCC=clang, there are several errors along
the lines of

  orc_dump.c:201:28: error: unknown attribute 'error' ignored [-Werror,-Wunknown-attributes]

This occurs after commit 4e59869aa6 ("compiler-gcc.h: drop checks for
older GCC versions"), which removed the GCC_VERSION gating.  The removed
version check just so happened to prevent __compiletime_error() from
being defined with clang because it pretends to be GCC 4.2.1 for
compatibility but the error attribute was not added to clang until
14.0.0.

Commit 815f0ddb34 ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") and commit a3f8a30f3f ("Compiler Attributes: use
feature checks instead of version checks") refactored the handling of
attributes in the main kernel to avoid situations like this but that
refactoring has never been done for the tools directory.

Refactoring is a rather large undertaking and this has never been an
issue before so instead, just guard the definition of
__compiletime_error() with __has_attribute() so that there are no more
errors.

Fixes: 4e59869aa6 ("compiler-gcc.h: drop checks for older GCC versions")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 15:51:41 -07:00
Andrii Nakryiko 2f38304127 libbpf: Make libbpf_version.h non-auto-generated
Turn previously auto-generated libbpf_version.h header into a normal
header file. This prevents various tricky Makefile integration issues,
simplifies the overall build process, but also allows to further extend
it with some more versioning-related APIs in the future.

To prevent accidental out-of-sync versions as defined by libbpf.map and
libbpf_version.h, Makefile checks their consistency at build time.

Simultaneously with this change bump libbpf.map to v0.6.

Also undo adding libbpf's output directory into include path for
kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary
because libbpf_version.h is just a normal header like any other.

Fixes: 0b46b75505 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org
2021-09-13 15:36:47 -07:00
Daniel Borkmann dbd7eb14e0 bpf, selftests: Replicate tailcall limit test for indirect call case
The tailcall_3 test program uses bpf_tail_call_static() where the JIT
would patch a direct jump. Add a new tailcall_6 test program replicating
exactly the same test just ensuring that bpf_tail_call() uses a map
index where the verifier cannot make assumptions this time.

In other words, this will now cover both on x86-64 JIT, meaning, JIT
images with emit_bpf_tail_call_direct() emission as well as JIT images
with emit_bpf_tail_call_indirect() emission.

  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  #136/7 tailcalls/tailcall_bpf2bpf_1:OK
  #136/8 tailcalls/tailcall_bpf2bpf_2:OK
  #136/9 tailcalls/tailcall_bpf2bpf_3:OK
  #136/10 tailcalls/tailcall_bpf2bpf_4:OK
  #136/11 tailcalls/tailcall_bpf2bpf_5:OK
  #136 tailcalls:OK
  Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

  # echo 0 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  [...]

For interpreter, the tailcall_1-6 tests are passing as well. The later
tailcall_bpf2bpf_* are failing due lack of bpf2bpf + tailcall support
in interpreter, so this is expected.

Also, manual inspection shows that both loaded programs from tailcall_3
and tailcall_6 test case emit the expected opcodes:

* tailcall_3 disasm, emit_bpf_tail_call_direct():

  [...]
   b:   push   %rax
   c:   push   %rbx
   d:   push   %r13
   f:   mov    %rdi,%rbx
  12:   movabs $0xffff8d3f5afb0200,%r13
  1c:   mov    %rbx,%rdi
  1f:   mov    %r13,%rsi
  22:   xor    %edx,%edx                 _
  24:   mov    -0x4(%rbp),%eax          |  limit check
  2a:   cmp    $0x20,%eax               |
  2d:   ja     0x0000000000000046       |
  2f:   add    $0x1,%eax                |
  32:   mov    %eax,-0x4(%rbp)          |_
  38:   nopl   0x0(%rax,%rax,1)
  3d:   pop    %r13
  3f:   pop    %rbx
  40:   pop    %rax
  41:   jmpq   0xffffffffffffe377
  [...]

* tailcall_6 disasm, emit_bpf_tail_call_indirect():

  [...]
  47:   movabs $0xffff8d3f59143a00,%rsi
  51:   mov    %edx,%edx
  53:   cmp    %edx,0x24(%rsi)
  56:   jbe    0x0000000000000093        _
  58:   mov    -0x4(%rbp),%eax          |  limit check
  5e:   cmp    $0x20,%eax               |
  61:   ja     0x0000000000000093       |
  63:   add    $0x1,%eax                |
  66:   mov    %eax,-0x4(%rbp)          |_
  6c:   mov    0x110(%rsi,%rdx,8),%rcx
  74:   test   %rcx,%rcx
  77:   je     0x0000000000000093
  79:   pop    %rax
  7a:   mov    0x30(%rcx),%rcx
  7e:   add    $0xb,%rcx
  82:   callq  0x000000000000008e
  87:   pause
  89:   lfence
  8c:   jmp    0x0000000000000087
  8e:   mov    %rcx,(%rsp)
  92:   retq
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Paul Chaignon <paul@cilium.io>
Link: https://lore.kernel.org/bpf/CAM1=_QRyRVCODcXo_Y6qOm1iT163HoiSj8U2pZ8Rj3hzMTT=HQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20210910091900.16119-1-daniel@iogearbox.net
2021-09-13 14:52:22 -07:00
Song Liu 025bd7c753 selftests/bpf: Add test for bpf_get_branch_snapshot
This test uses bpf_get_branch_snapshot from a fexit program. The test uses
a target function (bpf_testmod_loop_test) and compares the record against
kallsyms. If there isn't enough record matching kallsyms, the test fails.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-4-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Song Liu 856c02dbce bpf: Introduce helper bpf_get_branch_snapshot
Introduce bpf_get_branch_snapshot(), which allows tracing pogram to get
branch trace from hardware (e.g. Intel LBR). To use the feature, the
user need to create perf_event with proper branch_record filtering
on each cpu, and then calls bpf_get_branch_snapshot in the bpf function.
On Intel CPUs, VLBR event (raw event 0x1b00) can be use for this.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Linus Torvalds 316346243b Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version)
Merge patch series from Nick Desaulniers to update the minimum gcc
version to 5.1.

This is some of the left-overs from the merge window that I didn't want
to deal with yesterday, so it comes in after -rc1 but was sent before.

Gcc-4.9 support has been an annoyance for some time, and with -Werror I
had the choice of applying a fairly big patch from Kees Cook to remove a
fair number of initializer warnings (still leaving some), or this patch
series from Nick that just removes the source of the problem.

The initializer cleanups might still be worth it regardless, but
honestly, I preferred just tackling the problem with gcc-4.9 head-on.
We've been more aggressiuve about no longer having to care about
compilers that were released a long time ago, and I think it's been a
good thing.

I added a couple of patches on top to sort out a few left-overs now that
we no longer support gcc-4.x.

As noted by Arnd, as a result of this minimum compiler version upgrade
we can probably change our use of '--std=gnu89' to '--std=gnu11', and
finally start using local loop declarations etc.  But this series does
_not_ yet do that.

Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

* emailed patches from Nick Desaulniers <ndesaulniers@google.com>:
  Drop some straggling mentions of gcc-4.9 as being stale
  compiler_attributes.h: drop __has_attribute() support for gcc4
  vmlinux.lds.h: remove old check for GCC 4.9
  compiler-gcc.h: drop checks for older GCC versions
  Makefile: drop GCC < 5 -fno-var-tracking-assignments workaround
  arm64: remove GCC version check for ARCH_SUPPORTS_INT128
  powerpc: remove GCC version check for UPD_CONSTR
  riscv: remove Kconfig check for GCC version for ARCH_RV64I
  Kconfig.debug: drop GCC 5+ version check for DWARF5
  mm/ksm: remove old GCC 4.9+ check
  compiler.h: drop fallback overflow checkers
  Documentation: raise minimum supported version of GCC to 5.1
2021-09-13 10:43:04 -07:00
Nick Desaulniers 4e59869aa6 compiler-gcc.h: drop checks for older GCC versions
Now that GCC 5.1 is the minimally supported default, drop the values we
don't use.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:29 -07:00
Nick Desaulniers 4eb6bd55cf compiler.h: drop fallback overflow checkers
Once upgrading the minimum supported version of GCC to 5.1, we can drop
the fallback code for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW.

This is effectively a revert of commit f0907827a8 ("compiler.h: enable
builtin overflow checkers and add fallback code")

Link: https://github.com/ClangBuiltLinux/linux/issues/1438#issuecomment-916745801
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:28 -07:00
Nicholas Piggin 5379ef2a60 selftests/powerpc: Add scv versions of the basic TM syscall tests
The basic TM vs syscall test code hard codes an sc instruction for the
system call, which fails to cover scv even when the userspace libc has
support for it.

Duplicate the tests with hard coded scv variants so both are tested
when possible.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix build on old toolchains by using .long for scv]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210903125707.1601269-2-npiggin@gmail.com
2021-09-13 22:34:11 +10:00
Linus Torvalds b5b65f1398 perf tools changes for v5.15: 2nd batch
- Add missing fields and remove some duplicate fields when printing a perf_event_attr.
 
 - Fix hybrid config terms list corruption.
 
 - Update kernel header copies, some resulted in new kernel features being
   automagically added to 'perf trace' syscall/tracepoint argument id->string translators.
 
 - Add a file generated during the documentation build to .gitignore.
 
 - Add an option to build without libbfd, as some distros, like Debian consider
   its ABI unstable.
 
 - Add support to print a textual representation of IBS raw sample data in 'perf report'.
 
 - Fix bpf 'perf test' sample mismatch reporting
 
 - Fix passing arguments to stackcollapse report in a 'perf script' python script.
 
 - Allow build-id with trailing zeros.
 
 - Look for ImageBase in PE file to compute .text offset.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYT0+hwAKCRCyPKLppCJ+
 JxcPAQDO+iCKK/sF3TVN8f0T8xkFD6y8krBXPAtQHCAhVBeiqAD9F4R0VMX6nwy3
 8rJnsNd2ODjywgFBO4uPy0N2fxBWjwo=
 =/hH1
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Add missing fields and remove some duplicate fields when printing a
   perf_event_attr.

 - Fix hybrid config terms list corruption.

 - Update kernel header copies, some resulted in new kernel features
   being automagically added to 'perf trace' syscall/tracepoint argument
   id->string translators.

 - Add a file generated during the documentation build to .gitignore.

 - Add an option to build without libbfd, as some distros, like Debian
   consider its ABI unstable.

 - Add support to print a textual representation of IBS raw sample data
   in 'perf report'.

 - Fix bpf 'perf test' sample mismatch reporting

 - Fix passing arguments to stackcollapse report in a 'perf script'
   python script.

 - Allow build-id with trailing zeros.

 - Look for ImageBase in PE file to compute .text offset.

* tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits)
  tools headers UAPI: Update tools's copy of drm.h headers
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools headers UAPI: Sync linux/fs.h with the kernel sources
  tools headers UAPI: Sync linux/in.h copy with the kernel sources
  perf tools: Add an option to build without libbfd
  perf tools: Allow build-id with trailing zeros
  perf tools: Fix hybrid config terms list corruption
  perf tools: Factor out copy_config_terms() and free_config_terms()
  perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
  perf tools: Ignore Documentation dependency file
  perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
  tools include UAPI: Update linux/mount.h copy
  perf beauty: Cover more flags in the  move_mount syscall argument beautifier
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
  perf report: Add support to print a textual representation of IBS raw sample data
  perf report: Add tools/arch/x86/include/asm/amd-ibs.h
  perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
  ...
2021-09-12 16:18:15 -07:00
Andrea Claudi 1b704b27be selftest: net: fix typo in altname test
If altname deletion of the short alternative name fails, the error
message printed is: "Failed to add short alternative name".
This is obviously a typo, as we are testing altname deletion.

Fix this using a proper error message.

Fixes: f95e6c9c46 ("selftest: net: add alternative names test")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-12 10:48:26 +01:00
Linus Torvalds 78e709522d virtio,vdpa,vhost: features, fixes
vduse driver supporting blk
 virtio-vsock support for end of record with SEQPACKET
 vdpa: mac and mq support for ifcvf and mlx5
 vdpa: management netlink for ifcvf
 virtio-i2c, gpio dt bindings
 
 misc fixes, cleanups
 
 NB: when merging this with
 b542e383d8 ("eventfd: Make signal recursion protection a task bit")
 from Linus' tree, replace eventfd_signal_count with
 eventfd_signal_allowed, and drop the export of eventfd_wake_count from
 ("eventfd: Export eventfd_wake_count to modules").
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmE1+awPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpt6EIAJy0qrc62lktNA0IiIVJSLbUbTMmFj8MzkGR
 8UxZdhpjWqBPJPyaOuNeksAqTGm/UAPEYx3C2c95Jhej7anFpy7dbCtIXcPHLJME
 DjcJg+EDrlNCj8m0FcsHpHWsFzPMERJpyEZNxgB5WazQbv+yWhGrg2FN5DCnF0Ro
 ZFYeKSVty148pQ0nHl8X0JM2XMtqit+O+LvKN2HQZ+fubh7BCzMxzkHY0QLHIzUS
 UeZqd3Qm8YcbqnlX38P5D6k+NPiTEgknmxaBLkPxg6H3XxDAmaIRFb8Ldd1rsgy1
 zTLGDiSGpVDIpawRnuEAzqJThV3Y5/MVJ1WD+mDYQ96tmhfp+KY=
 =DBH/
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - vduse driver ("vDPA Device in Userspace") supporting emulated virtio
   block devices

 - virtio-vsock support for end of record with SEQPACKET

 - vdpa: mac and mq support for ifcvf and mlx5

 - vdpa: management netlink for ifcvf

 - virtio-i2c, gpio dt bindings

 - misc fixes and cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
  Documentation: Add documentation for VDUSE
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: Implement an MMU-based software IOTLB
  vdpa: Support transferring virtual addressing during DMA mapping
  vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
  vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
  vhost-iotlb: Add an opaque pointer for vhost IOTLB
  vhost-vdpa: Handle the failure of vdpa_reset()
  vdpa: Add reset callback in vdpa_config_ops
  vdpa: Fix some coding style issues
  file: Export receive_fd() to modules
  eventfd: Export eventfd_wake_count to modules
  iova: Export alloc_iova_fast() and free_iova_fast()
  virtio-blk: remove unneeded "likely" statements
  virtio-balloon: Use virtio_find_vqs() helper
  vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
  vsock_test: update message bounds test for MSG_EOR
  af_vsock: rename variables in receive loop
  virtio/vsock: support MSG_EOR bit processing
  vhost/vsock: support MSG_EOR bit processing
  ...
2021-09-11 14:48:42 -07:00
Arnaldo Carvalho de Melo 17a99e521f tools headers UAPI: Update tools's copy of drm.h headers
Picking the changes from:

  17ce9c61c7 ("drm: document DRM_IOCTL_MODE_RMFB")

Doesn't result in any tooling changes:

  $ tools/perf/trace/beauty/drm_ioctl.sh  > before
  $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
  $ tools/perf/trace/beauty/drm_ioctl.sh  > after
  $ diff -u before after

Silencing these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
  diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h

Cc: Simon Ser <contact@emersion.fr>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:24:10 -03:00
Arnaldo Carvalho de Melo 4dc24d7cf4 tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick the changes in:

  b65a948973 ("drm/i915/userptr: Probe existence of backing struct pages upon creation")
  ee242ca704 ("drm/i915/guc: Implement GuC priority management")
  81340cf3bd ("drm/i915/uapi: reject set_domain for discrete")
  7961c5b60f ("drm/i915: Add TTM offset argument to mmap.")
  aef7b67a79 ("drm/i915/uapi: convert drm_i915_gem_userptr to kernel doc")
  e7737b67ab ("drm/i915/uapi: reject caching ioctls for discrete")
  3aa8c57fe2 ("drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc")
  289f5a7200 ("drm/i915/uapi: convert drm_i915_gem_caching to kernel doc")
  4a766ae40e ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
  6ff6d61dd2 ("drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP")
  fe4751c3d5 ("drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE")
  577729533c ("drm/i915: Document the Virtual Engine uAPI")
  c649432e86 ("drm/i915: Fix busy ioctl commentary")

That doesn't result in any changes to tooling as no new ioctl were
added (at least not perceived by tools/perf/trace/beauty/drm_ioctl.sh).

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:21:22 -03:00
Arnaldo Carvalho de Melo 2bae3e64ec tools headers UAPI: Sync linux/fs.h with the kernel sources
To pick the change in:

  7957d93bf3 ("block: add ioctl to read the disk sequence number")

It adds a new ioctl, but we are still not using that to generate tables
for 'perf trace', so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
  diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:14:53 -03:00
Arnaldo Carvalho de Melo ee286c60c2 tools headers UAPI: Sync linux/in.h copy with the kernel sources
To pick the changes in:

  db243b7964 ("net/ipv4/ipv6: Replace one-element arraya with flexible-array members")
  2d3e5caf96 ("net/ipv4: Replace one-element array with flexible-array member")

That don't result in any change in tooling, the structs changed remains
with the same layout.

This addresses this build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:12:26 -03:00
Ian Rogers 0d1c50ac48 perf tools: Add an option to build without libbfd
Some distributions, like debian, don't link perf with libbfd. Add a
build flag to make this configuration buildable and testable.

This was inspired by:

  https://lore.kernel.org/linux-perf-users/20210910102307.2055484-1-tonyg@leastfixedpoint.com/T/#u

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tony garnock-jones <tonyg@leastfixedpoint.com>
Link: http://lore.kernel.org/lkml/20210910225756.729087-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:06:52 -03:00
Namhyung Kim 4a86d41404 perf tools: Allow build-id with trailing zeros
Currently perf saves a build-id with size but old versions assumes the
size of 20.  In case the build-id is less than 20 (like for MD5), it'd
fill the rest with 0s.

I saw a problem when old version of perf record saved a binary in the
build-id cache and new version of perf reads the data.  The symbols
should be read from the build-id cache (as the path no longer has the
same binary) but it failed due to mismatch in the build-id.

  symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.

The build-id event in the data has 20 byte build-ids, but it saw a
different size (16) when it reads the build-id of the elf file in the
build-id cache.

  $ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf

  Displaying notes found in: .note.gnu.build-id
    Owner                Data size 	Description
    GNU                  0x00000010	NT_GNU_BUILD_ID (unique build ID bitstring)
      Build ID: 53e4c2f42a4c61a2d632d92a72afa08f

Let's fix this by allowing trailing zeros if the size is different.

Fixes: 39be8d0115 ("perf tools: Pass build_id object to dso__build_id_equal()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:04:47 -03:00
Adrian Hunter 99fc5941b8 perf tools: Fix hybrid config terms list corruption
A config terms list was spliced twice, resulting in a never-ending loop
when the list was traversed. Fix by using list_splice_init() and copying
and freeing the lists as necessary.

This patch also depends on patch "perf tools: Factor out
copy_config_terms() and free_config_terms()"

Example on ADL:

 Before:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  # jobs
  [1]+  Running                    perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname
  # perf top -E 10
    PerfTop:    4071 irqs/sec  kernel: 6.9%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles],  (all, 24 CPUs)
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    97.60%  perf           [.] __evsel__get_config_term
     0.25%  [kernel]       [k] kallsyms_expand_symbol.constprop.13
     0.24%  perf           [.] kallsyms__parse
     0.15%  [kernel]       [k] _raw_spin_lock
     0.14%  [kernel]       [k] number
     0.13%  [kernel]       [k] advance_transaction
     0.08%  [kernel]       [k] format_decode
     0.08%  perf           [.] map__process_kallsym_symbol
     0.08%  perf           [.] rb_insert_color
     0.08%  [kernel]       [k] vsnprintf
  exiting.
  # kill %1

After:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.060 MB perf.data ]
  # perf script | head
       perf-exec   604 [001]  1827.312293:                            psb:  psb offs: 0                       ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms])

Fixes: 30def61f64 ("perf parse-events Create two hybrid cache events")
Fixes: 94da591b1c ("perf parse-events Create two hybrid raw events")
Fixes: 9cbfa2f64c ("perf parse-events Create two hybrid hardware events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:34 -03:00
Adrian Hunter a7d212fc6c perf tools: Factor out copy_config_terms() and free_config_terms()
Factor out copy_config_terms() and free_config_terms() so that they can
be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:13 -03:00
Adrian Hunter eb34363ae1 perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
Some fields are missing and text_poke is duplicated. Fix that up.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911120550.12203-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:58:36 -03:00
Ian Rogers da4572d62d perf tools: Ignore Documentation dependency file
When building directly on the checked out repository the build process
produces a file that should be ignored, so add it to .gitignore.

Fixes: a81df63a5d ("perf doc: Fix doc.dep")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910232249.739661-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:36:16 -03:00
Linus Torvalds dd4703876e - Add the tegra3 thermal sensor and fix the compilation testing on
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
   (Dmitry Osipenko)
 
 - Fix the error code for the exynos when devm_get_clk() fails (Dan
   Carpenter)
 
 - Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
 
 - Add support for hardware trip points for the rcar gen3 thermal
   driver and store TSC id as unsigned int (Niklas Söderlund)
 
 - Replace the deprecated CPU-hotplug functions get_online_cpus() and
   put_online_cpus (Sebastian Andrzej Siewior)
 
 - Add the thermal tools directory in the MAINTAINERS file (Daniel
   Lezcano)
 
 - Fix the Makefile and the cross compilation flags for the userspace
   'tmon' tool (Rolf Eike Beer)
 
 - Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
   Pawnikar)
 
 - Fix the stub thermal_cooling_device_register() function prototype
   which does not match the real function (Arnd Bergmann)
 
 - Make the thermal trip point optional in the DT bindings (Maxime
   Ripard)
 
 - Fix a typo in a comment in the core code (Geert Uytterhoeven)
 
 - Reduce the verbosity of the trace in the SoC thermal tegra driver
   (Dmitry Osipenko)
 
 - Add the support for the LMh (Limit Management hardware) driver on
   the QCom platforms (Thara Gopinath)
 
 - Allow processing of HWP interrupt by adding a weak function in the
   Intel driver (Srinivas Pandruvada)
 
 - Prevent an abort of the sensor probe is a channel is not used
   (Matthias Kaehlcke)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmE7yAcACgkQqDIjiipP
 6E8WKwgAmI+kdOURCz1CtCIv6FnQFx20suxZidfATPULBPZ8zcHqdjxJECXJpvDz
 y8T8dGAPRqXmMBJm2vF/qVraHaDJvscHXaflhsurhyEp0tiGptNeugBosso0jDEQ
 JFrQ6Rny/NCmNMOkWW4YlSfkik0zQ43xlHocy3UfOKfcshhYWGsPTbIXdA6wH/zh
 3tFW8j327tkyenrKZiaioJOSX88Oyy7Ft8l5k0HP+hPYkrvihWDJuEfEEmpfkdA7
 Q70Lacf2QtFP8mUfGKpQxURSgndjphDPU7tzP5NqqWvLuT94aTspvJx4ywVwK8Iv
 o+jNvR33yBIs5FXXB7B1ymkNE1h8uQ==
 =blIQ
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal updates from Daniel Lezcano:

 - Add the tegra3 thermal sensor and fix the compilation testing on
   tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
   (Dmitry Osipenko)

 - Fix the error code for the exynos when devm_get_clk() fails (Dan
   Carpenter)

 - Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)

 - Add support for hardware trip points for the rcar gen3 thermal driver
   and store TSC id as unsigned int (Niklas Söderlund)

 - Replace the deprecated CPU-hotplug functions get_online_cpus() and
   put_online_cpus (Sebastian Andrzej Siewior)

 - Add the thermal tools directory in the MAINTAINERS file (Daniel
   Lezcano)

 - Fix the Makefile and the cross compilation flags for the userspace
   'tmon' tool (Rolf Eike Beer)

 - Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
   Pawnikar)

 - Fix the stub thermal_cooling_device_register() function prototype
   which does not match the real function (Arnd Bergmann)

 - Make the thermal trip point optional in the DT bindings (Maxime
   Ripard)

 - Fix a typo in a comment in the core code (Geert Uytterhoeven)

 - Reduce the verbosity of the trace in the SoC thermal tegra driver
   (Dmitry Osipenko)

 - Add the support for the LMh (Limit Management hardware) driver on the
   QCom platforms (Thara Gopinath)

 - Allow processing of HWP interrupt by adding a weak function in the
   Intel driver (Srinivas Pandruvada)

 - Prevent an abort of the sensor probe is a channel is not used
   (Matthias Kaehlcke)

* tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
  thermal/drivers/intel: Allow processing of HWP interrupt
  dt-bindings: thermal: Add dt binding for QCOM LMh
  thermal/drivers/qcom: Add support for LMh driver
  firmware: qcom_scm: Introduce SCM calls to access LMh
  thermal/drivers/tegra-soctherm: Silence message about clamped temperature
  thermal: Spelling s/scallbacks/callbacks/
  dt-bindings: thermal: Make trips node optional
  thermal/core: Fix thermal_cooling_device_register() prototype
  thermal/drivers/int340x: Use IMOK independently
  tools/thermal/tmon: Add cross compiling support
  thermal/tools/tmon: Improve the Makefile
  MAINTAINERS: Add missing userspace thermal tools to the thermal section
  thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.
  thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
  thermal/drivers/rcar_gen3_thermal: Add support for hardware trip points
  drivers/thermal/intel: Add TCC cooling support for AlderLake platform
  thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
  thermal/drivers/tegra: Correct compile-testing of drivers
  thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
2021-09-11 09:20:57 -07:00
Arnaldo Carvalho de Melo 218e7b775d perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
The btf__get_from_id() function was deprecated in favour of
btf__load_from_kernel_by_id(), but it is still avaiable, so use it to
provide a weak function btf__load_from_kernel_by_id() for older libbpf
when building perf with LIBBPF_DYNAMIC=1, i.e. using the system's libbpf
package.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:32:04 -03:00
Vadim Fedorenko 3384c7c764 selftests/bpf: Test new __sk_buff field hwtstamp
Analogous to the gso_segs selftests introduced in commit d9ff286a0f
("bpf: allow BPF programs access skb_shared_info->gso_segs field").

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-3-vfedorenko@novek.ru
2021-09-10 23:20:13 +02:00
Vadim Fedorenko f64c4acea5 bpf: Add hardware timestamp field to __sk_buff
BPF programs may want to know hardware timestamps if NIC supports
such timestamping.

Expose this data as hwtstamp field of __sk_buff the same way as
gso_segs/gso_size. This field could be accessed from the same
programs as tstamp field, but it's read-only field. Explicit test
to deny access to padding data is added to bpf_skb_is_valid_access.

Also update BPF_PROG_TEST_RUN tests of the feature.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru
2021-09-10 23:19:58 +02:00
Arnaldo Carvalho de Melo 37ce9e4fc5 tools include UAPI: Update linux/mount.h copy
To pick the changes from:

  9ffb14ef61 ("move_mount: allow to add a mount into an existing group")

That ends up adding support for the new MOVE_MOUNT_SET_GROUP move_mount
flag.

  $ tools/perf/trace/beauty/move_mount_flags.sh > before
  $ cp include/uapi/linux/mount.h tools/include/uapi/linux/mount.h
  $ tools/perf/trace/beauty/move_mount_flags.sh > after
  $ diff -u before after
  --- before	2021-09-10 12:28:43.865279808 -0300
  +++ after	2021-09-10 12:28:50.183429184 -0300
  @@ -5,4 +5,5 @@
   	[ilog2(0x00000010) + 1] = "T_SYMLINKS",
   	[ilog2(0x00000020) + 1] = "T_AUTOMOUNTS",
   	[ilog2(0x00000040) + 1] = "T_EMPTY_PATH",
  +	[ilog2(0x00000100) + 1] = "SET_GROUP",
   };
  $

So now one can use it in --filter expressions for tracepoints.

This silences this perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
  diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo 155ed9f1b5 perf beauty: Cover more flags in the move_mount syscall argument beautifier
Previously the regext expected MOVE_MOUNT_[FT]_*, but in the next patch
a flag that doesn't match that expression will be added, MOVE_MOUNT_SET_GROUP

To make this more future proof, take advantage of the fact that the only
one we don't need to cover is MOVE_MOUNT__MASK and use MOVE_MOUNT_[^_]+_*_.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo 2c3ef25c4a tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick the changes in:

  433c38f40f ("arm64: mte: change ASYNC and SYNC TCF settings into bitfields")
  e893bb1bb4 ("x86, prctl: Hook L1D flushing in via prctl")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Balbir Singh <sblbir@amazon.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo f9f018e4d9 tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:

  81be109349 ("ALSA: pcm: Add SNDRV_PCM_INFO_EXPLICIT_SYNC flag")

Which entails no changes in the tooling side as it doesn't introduce new
ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo dfa00459c6 tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  f95937ccf5 ("KVM: stats: Support linear and logarithmic histogram statistics")
  f0376edb1d ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
  ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, so that will
pick the new KVM_STATS_TYPE_LINEAR_HIST and KVM_STATS_TYPE_LOG_HIST
defines.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Jing Zhang <jingzhangos@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo 03d6f3fe54 tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  61e5f69ef0 ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Kim Phillips 291dcb98d7 perf report: Add support to print a textual representation of IBS raw sample data
Perf records IBS (Instruction Based Sampling) extra sample data when
'perf record --raw-samples' is used with an IBS-compatible event, on a
machine that supports IBS.  IBS support is indicated in
CPUID_Fn80000001_ECX bit #10.

Up until now, users have been able to see the extra sample data solely
in raw hex format using 'perf report --dump-raw-trace'.  From there,
users could decode the data either manually, or by using an external
script.

Enable the built-in 'perf report --dump-raw-trace' to do the decoding of
the extra sample data bits, so manual or external script decoding isn't
necessary.

Example usage:

  $ sudo perf record -c 10000001 -a --raw-samples -e ibs_fetch/rand_en=1/,ibs_op/cnt_ctl=1/ -C 0,1 taskset -c 0,1 7za b -mmt2 | perf report --dump-raw-trace

Stdout contains IBS Fetch samples, e.g.:

  ibs_fetch_ctl:	02170007ffffffff MaxCnt 1048560 Cnt 1048560 Lat     7 En 1 Val 1 Comp 1 IcMiss 0 PhyAddrValid 1 L1TlbPgSz 4KB L1TlbMiss 0 L2TlbMiss 0 RandEn 1 L2Miss 0
  IbsFetchLinAd:	000056016b2ead40
  IbsFetchPhysAd:	000000115cedfd40
  c_ibs_ext_ctl:	0000000000000000 IbsItlbRefillLat   0

..and IBS Op samples, e.g.:

  ibs_op_ctl:	0000009e009e8968 MaxCnt  10000000 En 1 Val 1 CntCtl 1=uOps CurCnt       158
  IbsOpRip:	000056016b2ea73d
  ibs_op_data:	00000000000b0002 CompToRetCtr     2 TagToRetCtr    11 BrnRet 0  RipInvalid 0 BrnFuse 0 Microcode 0
  ibs_op_data2:	0000000000000002 CacheHitSt 0=M-state RmtNode 0 DataSrc 2=Local node cache
  ibs_op_data3:	0000000000c60002 LdOp 0 StOp 1 DcL1TlbMiss 0 DcL2TlbMiss 0 DcL1TlbHit2M 0 DcL1TlbHit1G 0 DcL2TlbHit2M 0 DcMiss 0 DcMisAcc 0 DcWcMemAcc 0 DcUcMemAcc 0 DcLockedOp 0 DcMissNoMabAlloc 0 DcLinAddrValid 1 DcPhyAddrValid 1 DcL2TlbHit1G 0 L2Miss 0 SwPf 0 OpMemWidth  4 bytes OpDcMissOpenMemReqs  0 DcMissLat     0 TlbRefillLat     0
  IbsDCLinAd:	00007f133c319ce0
  IbsDCPhysAd:	0000000270485ce0

Committer notes:

Fixed up this:

  util/amd-sample-raw.c: In function ‘evlist__amd_sample_raw’:
  util/amd-sample-raw.c:125:42: error: ‘ bytes’ directive output may be truncated writing 6 bytes into a region of size between 4 and 7 [-Werror=format-truncation=]
    125 |                          " OpMemWidth %2d bytes", 1 << (reg.op_mem_width - 1));
        |                                          ^~~~~~
  In file included from /usr/include/stdio.h:866,
                   from util/amd-sample-raw.c:7:
  /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 21 and 24 bytes into a destination of size 21
     71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     72 |                                    __glibc_objsize (__s), __fmt,
        |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     73 |                                    __va_arg_pack ());
        |                                    ~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

As that %2d won't limit the number of chars to 2, just state that 2 is
the minimal width:

  $ cat printf.c
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char *argv[])
  {
  	char bf[64];
  	int len = snprintf(bf, sizeof(bf), "%2d", atoi(argv[1]));

  	printf("strlen(%s): %u\n", bf, len);

  	return 0;
  }
  $ ./printf 1
  strlen( 1): 2
  $ ./printf 12
  strlen(12): 2
  $ ./printf 123
  strlen(123): 3
  $ ./printf 1234
  strlen(1234): 4
  $ ./printf 12345
  strlen(12345): 5
  $ ./printf 123456
  strlen(123456): 6
  $

And since we probably don't want that output to be truncated, just
assume the worst case, as the compiler did, and add a few more chars to
that buffer.

Also use sizeof(var) instead of sizeof(dup-of-wanted-format-string) to
avoid bugs when changing one but not the other.

I also had to change this:

  -#include <asm/amd-ibs.h>
  +#include "../../arch/x86/include/asm/amd-ibs.h"

To make it build on other architectures, just like intel-pt does.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-4-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:21 -03:00
Kim Phillips dde994dd54 perf report: Add tools/arch/x86/include/asm/amd-ibs.h
This is a tools/-side patch for the patch that adds the original copy
of the IBS header file, in arch/x86/include/asm/.

We also add an entry to check-headers.sh, so future changes continue
to be copied.

Committer notes:

Had to add this

  -#include <asm/msr-index.h>
  +#include "msr-index.h"

And change the check-headers.sh entry to ignore this line when diffing
with the original kernel header.

This is needed so that we can use 'perf report' on a perf.data with IBS
data on a !x86 system, i.e. building on ARM fails without this as there
is no asm/msr-index.h there.

This was done on the next patch in this series and is done for things
like Intel PT and ARM CoreSight.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-3-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:13:20 -03:00
Magnus Karlsson 909f0e2820 selftests: xsk: Add tests for 2K frame size
Add tests for 2K frame size. Both a standard send and receive test and
one testing for invalid descriptors when the frame size is 2K.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-21-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson 0d1b7f3a00 selftests: xsk: Add tests for invalid xsk descriptors
Add tests for invalid xsk descriptors in the Tx ring. A number of
handcrafted nasty invalid descriptors are created and submitted to the
tx ring to check that they are validated correctly. Corner case valid
ones are also sent. The tests are run for both aligned and unaligned
mode.

pkt_stream_set() is introduced to be able to create a hand-crafted
packet stream where every single packet is specified in detail.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-20-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson 6ce67b5165 selftests: xsk: Eliminate test specific if-statement in test runner
Eliminate a test specific if-statement for the RX_FILL_EMTPY stats
test that is present in the test runner. We can do this as we now have
the use_addr_for_fill option. Just create and empty Rx packet stream
and indicated that the test runner should use the addresses in that to
populate the fill ring. As there are no packets in the stream, the
fill ring will be empty and we will get the error stats that we want
to test.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-19-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson a4ba98dd0c selftests: xsk: Add test for unaligned mode
Add a test for unaligned mode in which packet buffers can be placed
anywhere within the umem. Some packets are made to straddle page
boundaries in order to check for correctness. On the Tx side, buffers
are now allocated according to the addresses found in the packet
stream. Thus, the placement of buffers can be controlled with the
boolean use_addr_for_fill in the packet stream.

One new pkt_stream interface is introduced: pkt_stream_replace_half()
that replaces every other packet in the default packet stream with the
specified new packet. The constant DEFAULT_OFFSET is also
introduced. It specifies at what offset from the start of a chunk a Tx
packet is placed by the sending thread. This is just to be able to
test that it is possible to send packets at an offset not equal to
zero.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-18-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson 605091c510 selftests: xsk: Introduce replacing the default packet stream
Introduce the concept of a default packet stream that is the set of
packets sent by most tests. Then add the ability to replace it for a
test that would like to send or receive something else through the use
of the function pkt_stream_replace() and then restored with
pkt_stream_restore_default(). These are then used to convert the
STAT_TEST_TX_INVALID to use these new APIs.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-17-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson 8abf6f725a selftests: xsk: Allow for invalid packets
Allow for invalid packets to be sent. These are verified by the Rx
thread not to be received. Or put in another way, if they are
received, the test will fail. This feature will be used to eliminate
an if statement for a stats test and will also be used by other tests
in later patches. The previous code could only deal with valid
packets.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-16-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson 8ce7192b50 selftests: xsk: Eliminate MAX_SOCKS define
Remove the MAX_SOCKS define as it always will be one for the forseable
future and the code does not work for any other case anyway.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-15-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson e2d850d534 selftests: xsx: Make pthreads local scope
Make the pthread_t variables local scope instead of global. No reason
for them to be global.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-14-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson af6731d1e1 selftests: xsk: Make xdp_flags and bind_flags local
Make xdp_flags and bind_flags local instead of global by moving them
into the interface object. These flags decide if the socket should be
created in SKB mode or in DRV mode and therefore they are sticky and
will survive a test_spec_reset. Since every test is first run in SKB
mode then in DRV mode, this change only happens once. With this
change, the configured_mode global variable can also be
erradicated. The first test_spec_init() also becomes superfluous and
can be eliminated.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-13-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson 85c6c95739 selftests: xsk: Specify number of sockets to create
Add the ability in the test specification to specify numbers of
sockets to create. The default is one socket. This is then used to
remove test specific if-statements around the bpf_res tests.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-12-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson 55be575dc1 selftests: xsk: Replace second_step global variable
Replace the second_step global variable with a test specification
variable called total_steps that a test can be set to indicate how
many times the packet stream should be sent without reinitializing any
sockets. This eliminates test specific code in the test runner around
the bidirectional test.

The total_steps variable is 1 by default as most tests only need a
single round of packets.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-11-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00