If failed to set link1_1 to netns client, we should delete link1_1 in the
cleanup path. But if set link1_1 to netns client successfully, delete
link1_1 will report warning. So it will be safer creating directly the
devices in the target namespaces.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Closes: https://lore.kernel.org/all/ZNyJx1HtXaUzOkNA@Laptop-X1/
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for using NLM_F_REPLACE, _EXCL, _CREATE and _APPEND flags
in requests.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-10-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for the 'array-nest' attribute type that is used by several
netlink-raw families.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-9-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move decode_fixed_header into YnlFamily and add a _fixed_header_size
method to allow extack decoding to skip the fixed header.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We use a tempfile for code generation, to avoid wiping the target
file out if the code generator crashes. File contents are copied
from tempfile to actual destination at the end of main().
uAPI generation is relatively simple so when generating the uAPI
header we return from main() early, and never reach the "copy code
over" stage. Since commit under Fixes uAPI headers are not updated
by ynl-gen.
Move the copy/commit of the code into CodeWriter, to make it
easier to call at any point in time. Hook it into the destructor
to make sure we don't miss calling it.
Fixes: f65f305ae0 ("tools: ynl-gen: use temporary file for rendering")
Link: https://lore.kernel.org/r/20230824212431.1683612-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZOjkTAAKCRDbK58LschI
gx32AP9gaaHFBtOYBfoenKTJfMgv1WhtQHIBas+WN9ItmBx9MAEA4gm/VyQ6oD7O
EBjJKJQ2CZ/QKw7cNacXw+l5jF7/+Q0=
=8P7g
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-25
We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).
The main changes are:
1) Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds,
from Jiri Olsa.
2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
from Xu Kuohai.
3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
from Pu Lehui.
4) Fix LWT BPF xmit hooks wrt their return values where propagating
the result from skb_do_redirect() would trigger a use-after-free,
from Yan Zhai.
5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
where the map's value kptr type and locally allocated obj type
mismatch, from Yonghong Song.
6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
root/node which bypassed reg->off == 0 enforcement,
from Kumar Kartikeya Dwivedi.
7) Lift BPF verifier restriction in networking BPF programs to treat
comparison of packet pointers not as a pointer leak,
from Yafang Shao.
8) Remove unmaintained XDP BPF samples as they are maintained
in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.
9) Batch of fixes for the tracing programs from BPF samples in order
to make them more libbpf-aware, from Daniel T. Lee.
10) Fix a libbpf signedness determination bug in the CO-RE relocation
handling logic, from Andrii Nakryiko.
11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
fixes for bpf_refcount shared ownership implementation,
both from Dave Marchevsky.
12) Add a new bpf_object__unpin() API function to libbpf,
from Daniel Xu.
13) Fix a memory leak in libbpf to also free btf_vmlinux
when the bpf_object gets closed, from Hao Luo.
14) Small error output improvements to test_bpf module, from Helge Deller.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
bpf: Consider non-owning refs to refcounted nodes RCU protected
bpf: Reenable bpf_refcount_acquire
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
bpf: Consider non-owning refs trusted
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
selftests/bpf: Enable cpu v4 tests for RV64
riscv, bpf: Support unconditional bswap insn
riscv, bpf: Support signed div/mod insns
riscv, bpf: Support 32-bit offset jmp insn
riscv, bpf: Support sign-extension mov insns
riscv, bpf: Support sign-extension load insns
riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
samples/bpf: Cleanup .gitignore
samples/bpf: Remove the xdp_sample_pkts utility
samples/bpf: Remove the xdp1 and xdp2 utilities
samples/bpf: Remove the xdp_rxq_info utility
samples/bpf: Remove the xdp_redirect* utilities
...
====================
Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge devfreq changes and power management tools changes for 6.6-rc1:
- Fix memory leak in devfreq_dev_release() (Boris Brezillon).
- Rewrite devfreq_monitor_start() kerneldoc comment (Manivannan
Sadhasivam).
- Explicitly include correct DT includes in devfreq (Rob Herring).
- Add turbo-boost support to cpupower (Wyes Karny).
- Add support for amd_pstate mode change to cpupower (Wyes Karny).
- Fix 'cpupower idle_set' command to accept only numeric values of
arguments (Likhitha Korrapati).
* pm-devfreq:
PM / devfreq: Fix leak in devfreq_dev_release()
PM / devfreq: Reword the kernel-doc comment for devfreq_monitor_start() API
PM / devfreq: Explicitly include correct DT includes
* pm-tools:
cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.
cpupower: Add turbo-boost support in cpupower
cpupower: Add support for amd_pstate mode change
cpupower: Add EPP value change support
cpupower: Add is_valid_path API
cpupower: Recognise amd-pstate active mode driver
cpupower: Bump soname version
or aren't considered suitable for a -stable backport.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZOjuGgAKCRDdBJ7gKXxA
jkLlAQDY9sYxhQZp1PFLirUIPeOBjEyifVy6L6gCfk9j0snLggEA2iK+EtuJt2Dc
SlMfoTq29zyU/YgfKKwZEVKtPJZOHQU=
=oTcj
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
issues or aren't considered suitable for a -stable backport"
* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
shmem: fix smaps BUG sleeping while atomic
selftests: cachestat: catch failing fsync test on tmpfs
selftests: cachestat: test for cachestat availability
maple_tree: disable mas_wr_append() when other readers are possible
madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
mm: multi-gen LRU: don't spin during memcg release
mm: memory-failure: fix unexpected return value in soft_offline_page()
radix tree: remove unused variable
mm: add a call to flush_cache_vmap() in vmap_pfn()
selftests/mm: FOLL_LONGTERM need to be updated to 0x100
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
selftests: cgroup: fix test_kmem_basic less than error
mm: enable page walking API to lock vmas during the walk
smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
Confirm that the following sleepable prog states fail verification:
* bpf_rcu_read_unlock before bpf_spin_unlock
* RCU CS will last at least as long as spin_lock CS
Also confirm that correct usage passes verification, specifically:
* Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
* Implied RCU CS due to spin_lock CS
None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.
This a revert of:
* commit f3514a5d67 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
* commit 7deca5eae8 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Explicitly set the exception vector to #UD when potentially injecting an
exception in sync_regs_test's subtests that try to detect TOCTOU bugs
in KVM's handling of exceptions injected by userspace. A side effect of
the original KVM bug was that KVM would clear the vector, but relying on
KVM to clear the vector (i.e. make it #DE) makes it less likely that the
test would ever find *new* KVM bugs, e.g. because only the first iteration
would run with a legal vector to start.
Explicitly inject #UD for race_events_inj_pen() as well, e.g. so that it
doesn't inherit the illegal 255 vector from race_events_exc(), which
currently runs first.
Link: https://lore.kernel.org/r/20230817233430.1416463-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reload known good vCPU state if the vCPU triple faults in any of the
race_sync_regs() subtests, e.g. if KVM successfully injects an exception
(the vCPU isn't configured to handle exceptions). On Intel, the VMCS
is preserved even after shutdown, but AMD's APM states that the VMCB is
undefined after a shutdown and so KVM synthesizes an INIT to sanitize
vCPU/VMCB state, e.g. to guard against running with a garbage VMCB.
The synthetic INIT results in the vCPU never exiting to userspace, as it
gets put into Real Mode at the reset vector, which is full of zeros (as is
GPA 0 and beyond), and so executes ADD for a very, very long time.
Fixes: 60c4063b47 ("KVM: selftests: Extend x86's sync_regs_test to check for event vector races")
Cc: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230817233430.1416463-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Casts were necessary for older versions of libslang, however, these
are now 15 years old and so we no longer need to care about supporting
them. Tidy the casts and remove unnecessary logic.
Move the ENABLE_SLFUTURE_CONST to the libslang.h common include file,
and also enable ENABLE_SLFUTURE_VOID.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Initialize realname to NULL, rather than name.
This avoids a cast and as realpath is either NULL or an allocated
string, free can be called unconditionally.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The struct pmu id is initialized from pmu_id that is read into allocated
memory from a file, as such it needs free-ing in pmu__delete().
Make the id value const so that we can remove casts in tests.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This avoids casts in tests. Use zfree in a few places to avoid
warnings about a freeing a const pointer.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PMU caps are written as HEADER_PMU_CAPS or for the special case of the
PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not
any "core" PMU, the logic had become broken and core PMUs not called
"cpu" were not having their caps written.
This affects ARM and s390 non-hybrid PMUs.
Simplify the PMU caps writing logic to scan one fewer time and to be
more explicit in its behavior.
Fixes: 178ddf3bad ("perf header: Avoid hybrid PMU list in write_pmu_caps")
Reported-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* for-next/selftests: (22 commits)
kselftest/arm64: Fix hwcaps selftest build
kselftest/arm64: add jscvt feature to hwcap test
kselftest/arm64: add pmull feature to hwcap test
kselftest/arm64: add AES feature check to hwcap test
kselftest/arm64: add SHA1 and related features to hwcap test
kselftest/arm64: build BTI tests in output directory
kselftest/arm64: fix a memleak in zt_regs_run()
kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
kselftest/arm64: add lse and lse2 features to hwcap test
kselftest/arm64: add test item that support to capturing the SIGBUS signal
kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers
kselftest/arm64: add crc32 feature to hwcap test
kselftest/arm64: add float-point feature to hwcap test
kselftest/arm64: Use the tools/include compiler.h rather than our own
kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
kselftest/arm64: Make the tools/include headers available
tools include: Add some common function attributes
tools compiler.h: Add OPTIMIZER_HIDE_VAR()
kselftest/arm64: Exit streaming mode after collecting signal context
kselftest/arm64: add RCpc load-acquire to hwcap test
...
- Fix ring buffer being permanently disabled due to missed record_disabled()
Changing the trace cpu mask will disable the ring buffers for the CPUs no
longer in the mask. But it fails to update the snapshot buffer. If a snapshot
takes place, the accounting for the ring buffer being disabled is corrupted
and this can lead to the ring buffer being permanently disabled.
- Add test case for snapshot and cpu mask working together
- Fix memleak by the function graph tracer not getting closed properly.
The iterator is used to read the ring buffer. When it opens, it calls
the open function of a tracer, and when it is closed, it calls the close
iteration. While a trace is being read, it is still possible to change
the tracer. If this happens between the function graph tracer and the
wakeup tracer (which uses function graph tracing), the tracers are not
closed properly during when the iterator sees the switch, and the wakeup
function did not initialize its private pointer to NULL, which is used
to know if the function graph tracer was the last tracer. It could be
fooled in thinking it is, but then on exit it does not call the close
function of the function graph tracer to clean up its data.
- Fix synthetic events on big endian machines, by introducing a union
that does the conversions properly.
- Fix synthetic events from printing out the number of elements in the
stacktrace when it shouldn't.
- Fix synthetic events stacktrace to not print a bogus value at the end.
- Introduce a pipe_cpumask that prevents the trace_pipe files from being
opened by more than one task (file descriptor). There was a race found
where if splice is called, the iter->ent could become stale and events
could be missed. There's no point reading a producer/consumer file by
more than one task as they will corrupt each other anyway. Add a cpumask
that keeps track of the per_cpu trace_pipe files as well as the global
trace_pipe file that prevents more than one open of a trace_pipe file
that represents the same ring buffer. This prevents the race from
happening.
- Fix ftrace samples for arm64 to work with older compilers.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEXtmkj8VMCiLR0IBM68Js21pW3nMFAmTn7hsUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ68Js21pW3nOSVg/9HVFTUX52yqvT9YLv8b1QZYMo2I5n
to1nSbF9UUYIAMqkyNHuXU2TBj1I77JVkbsPEFsjYTN97CzJ/Zc8jNa6p7HgVure
1xUuaCgtOFPDjU6OpTa6wFRt4usU1UM8Noc/ii0WDUGsA+RKBAKmGUsNmuDHx1Ae
a3uJTLR4VHMeAkbtth/8f6RHBVocDVSYPQbKnC0PksGW1wg9e9PU2G6+o3069I1Q
qbOZJqvGDpeaapjyG2LYrDwkVGOPSvUJuJNfcjcNcyKoHSqxZzBb+feY16NhWDm9
idpyHLE4qF1nvlh1SIpErFl8Bu5MG9CN8a+xrx3ufd6i1jO4bcHyRD9XmjG4p72+
zVshoS86DI4KCK9wHxJ/5/SU6XuL6JoNTP7NhmDIX83QCuZwgTOa8C2xzHKHu58F
In13IhJqS5ob6Jy25a/bAy0CbdTl0cjQvMfXrrYK0ZWuEBWgBUDqwB/eKY6oq79D
oTKmFNOZsuiLAhPywAoY5cAqQtftpy5Ul8o6ed3RAw3th0WC4EeXIk5eUu5g/ABI
1ZfnpeY5al/JROFGXWxyn964RRjoVpbC1M6NVTz33e7No+r2KSRlLb5ZEWZVVIcZ
My96QiZXamBZ/EOR7x72yFWxXBSuACu9nvbSSZRnbEGujoKwDb89xtgWzSNuR/wi
GexDcj6qWfODrfc=
=CI9f
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix ring buffer being permanently disabled due to missed
record_disabled()
Changing the trace cpu mask will disable the ring buffers for the
CPUs no longer in the mask. But it fails to update the snapshot
buffer. If a snapshot takes place, the accounting for the ring buffer
being disabled is corrupted and this can lead to the ring buffer
being permanently disabled.
- Add test case for snapshot and cpu mask working together
- Fix memleak by the function graph tracer not getting closed properly.
The iterator is used to read the ring buffer. When it opens, it calls
the open function of a tracer, and when it is closed, it calls the
close iteration. While a trace is being read, it is still possible to
change the tracer.
If this happens between the function graph tracer and the wakeup
tracer (which uses function graph tracing), the tracers are not
closed properly during when the iterator sees the switch, and the
wakeup function did not initialize its private pointer to NULL, which
is used to know if the function graph tracer was the last tracer. It
could be fooled in thinking it is, but then on exit it does not call
the close function of the function graph tracer to clean up its data.
- Fix synthetic events on big endian machines, by introducing a union
that does the conversions properly.
- Fix synthetic events from printing out the number of elements in the
stacktrace when it shouldn't.
- Fix synthetic events stacktrace to not print a bogus value at the
end.
- Introduce a pipe_cpumask that prevents the trace_pipe files from
being opened by more than one task (file descriptor).
There was a race found where if splice is called, the iter->ent could
become stale and events could be missed. There's no point reading a
producer/consumer file by more than one task as they will corrupt
each other anyway. Add a cpumask that keeps track of the per_cpu
trace_pipe files as well as the global trace_pipe file that prevents
more than one open of a trace_pipe file that represents the same ring
buffer. This prevents the race from happening.
- Fix ftrace samples for arm64 to work with older compilers.
* tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
samples: ftrace: Replace bti assembly with hint for older compiler
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
tracing: Fix memleak due to race between current_tracer and trace
tracing/synthetic: Allocate one additional element for size
tracing/synthetic: Skip first entry for stack traces
tracing/synthetic: Use union instead of casts
selftests/ftrace: Add a basic testcase for snapshot
tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
Differentiate between empty list and None for member lists.
New families may want to create request responses with no attribute.
If we treat those the same as None we end up rendering
a full parsing policy in user space, instead of an empty one.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We look for attributes inside do.request, but there's another
layer of nesting in the spec, look inside do.request.attributes.
This bug had no effect as all global policies we generate (fou)
seem to be full, anyway, and we treat full and empty the same.
Next patch will change the treatment of empty policies.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Recent changes made us assume that input for binary data is in hex.
When using YNL as a Python library it's possible to pass in raw bytes.
Bring the ability to do that back.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove comparing pointer to 0 to avoid this warning from coccinelle:
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0, suggest !E
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0
Link: https://lkml.kernel.org/r/20230817160033.90079-1-tuananhlfc@gmail.com
Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently, not all kernel memory usage is being accounted for. This
commit switches to using the kernel entry within memory.stat which
already includes kernel_stack, pagetables, and slab. The kernel entry
also includes vmalloc and other additional kernel memory use cases which
were missing.
Link: https://lkml.kernel.org/r/bvrhe2tpsts2azaroq4ubp2slawmop6orndsswrewuscw3ugvk@kmemmrttsnc7
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "New page table range API", v6.
This patchset changes the API used by the MM to set up page table entries.
The four APIs are:
set_ptes(mm, addr, ptep, pte, nr)
update_mmu_cache_range(vma, addr, ptep, nr)
flush_dcache_folio(folio)
flush_icache_pages(vma, page, nr)
flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you.
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.
One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking. This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.
The point of all this is better performance, and Fengwei Yin has measured
improvement on x86. I suspect you'll see improvement on your architecture
too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.
This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.
This patch (of 38):
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type. Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.
Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Unit with the PMU name is appended to desc in jevents.py, but on
hybrid platforms it causes the desc to differ from the regular
non-hybrid system with a PMU of 'cpu'. Having differing descs means
the events don't deduplicate. To make the perf list output not differ,
append the Unit on again in the perf list printing code.
On x86 reduces the binary size by 409,600 bytes or about 4%. Update
pmu-events test expectations to match the differently generated
pmu-events.c code.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824183212.374787-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cachestat kselftest runs a test on a normal file, which is created
temporarily in the current directory. Among the tests it runs there is a
call to fsync(), which is expected to clean all dirty pages used by the
file.
However the tmpfs filesystem implements fsync() as noop_fsync(), so the
call will not even attempt to clean anything when this test file happens
to live on a tmpfs instance. This happens in an initramfs, or when the
current directory is in /dev/shm or sometimes /tmp.
To avoid this test failing wrongly, use statfs() to check which filesystem
the test file lives on. If that is "tmpfs", we skip the fsync() test.
Since the fsync test is only one part of the "normal file" test, we now
execute this twice, skipping the fsync part on the first call. This way
only the second test, including the fsync part, would be skipped.
Link: https://lkml.kernel.org/r/20230821160534.3414911-3-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests: cachestat: fix run on older kernels", v2.
I ran all kernel selftests on some test machine, and stumbled upon
cachestat failing (among others). These patches fix the run on older
kernels and when the current directory is on a tmpfs instance.
This patch (of 2):
As cachestat is a new syscall, it won't be available on older kernels, for
instance those running on a development machine. At the moment the test
reports all tests as "not ok" in this case.
Test for the cachestat syscall availability first, before doing further
tests, and bail out early with a TAP SKIP comment.
This also uses the opportunity to add the proper TAP headers, and add one
check for proper error handling (illegal file descriptor).
Link: https://lkml.kernel.org/r/20230821160534.3414911-1-andre.przywara@arm.com
Link: https://lkml.kernel.org/r/20230821160534.3414911-2-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
All required libraries have been imported and make sure that none of
them are external dependencies. To achieve this, created a virt env and
verified.
Modified usage information and added combined command.
Modified the main() function to read the --save-only command-line option
and set the output_file variable accordingly.
Modified the trace_end() function to check for the output_file variable.
If it is set, the profiler data is saved to a local file in Gecko
Profile format, or the profiler.firefox.com is opened on the default
browser.
Included trace_begin() to initialize the Firefox Profiler and launch the
default browser to display the profiler.firefox.com.
Added a new function launchFirefox() to start a local server and launch
the profiler UI on the default browser with the appropriate URL.
Created the "CORSRequestHandler" class to enable Cross-Origin Resource
Sharing.
Summary:
This integration now includes a exiting feature to conveniently host the
Gecko Profile data on a local server and open it directly in the default
web browser.
This means that users can now effortlessly visualize and analyze the
profiler results with just a single click.
The addition of the --save-only command-line option allows users to save
the profiler output to a local file in Gecko Profile format, but the
real highlight lies in the capability to seamlessly launch a local
server, making the data accessible to Firefox Profiler via a web
browser.
In addition, it's important to highlight that all data are hosted
locally, eliminating any concerns about data privacy rules and
regulations.
Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Refines the argument handling mechanism in the "gecko-report" script to
enable better compatibility and improved user experience.
The script now differentiates between scenarios where arguments are
provided for record and report cases where gecko.py arguments are
passed.
Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Enable cpu v4 tests for RV64, and the relevant tests have passed.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-8-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
and netfilter
Fixes to fixes:
- nf_tables:
- GC transaction race with abort path
- defer gc run if previous batch is still pending
Previous releases - regressions:
- ipv4: fix data-races around inet->inet_id
- phy: fix deadlocking in phy_error() invocation
- mdio: fix C45 read/write protocol
- ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
- ice: fix NULL pointer deref during VF reset
- i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
- tg3: use slab_build_skb() when needed
- mtk_eth_soc: fix NULL pointer on hw reset
Previous releases - always broken:
- core: validate veth and vxcan peer ifindexes
- sched: fix a qdisc modification with ambiguous command request
- devlink: add missing unregister linecard notification
- wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
- batman:
- do not get eth header before batadv_check_management_packet
- fix batadv_v_ogm_aggr_send memory leak
- bonding: fix macvlan over alb bond support
- mlxsw: set time stamp fields also when its type is MIRROR_UTC
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTnJIQSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkt7kP/jy6HOMwSOMFbtxQD2m89EImr6ZlLUPg
H09seQzC5nwRbgZrdzukmM27HDKEkYe1sPyxhpS8E4iAslFaefEvnWqOY0oiQSpH
OuF4mP/cS9QKb62NwKVrau3SCARS9arLmOF0mcJNdDOWwucE+SoFaebxSMitAU/w
k8hHVsLwc5dwZAYznOl2/qsmPBnIUsxfymNJE/RuFqj1nHccGybh9mJKpAxc0knj
QEjqno//PgAXPV/X3mH/wG0fcsXs0OlAnBS9yA95GNzuR2yWrh7bD/et99En/elS
8paUio+O3P6Y6WaewgDYFm44pf/x+hFb18Irtab82BkdRw+lgFyF23g8IH7ToJAE
mEaxwdS7AQ4XEunNyJsjwiffWUG1nFaoIhaGb0Lo1qmgLHDo+rrNhkrBWvZxSf0Q
8QlMnCXopJ1c5Qltz5QNVaWPErpCcanxV3cpNlG+lTpfamWBrUpuv/EhHCUF/fr3
hlgJEm+WoFTvexO+QC3CyJDz2JYLLMaaYaoUZ1aJS2dtTTc3tfUjEL8VcopfXI87
2FXJ3qEtCkvfdtfFjhofw97qHDvGrTXa9r2JSh1Pp8v15pKdM2P/lMYxd4B0cSEw
9udW/3bWkvHZayzBWvqDEiz3UTID1+uX0/qpBWY40QzTdIXo6sBrCCk93tjJUdcA
kXjw9HkSqW6H
=WKil
-----END PGP SIGNATURE-----
Merge tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from wifi, can and netfilter.
Fixes to fixes:
- nf_tables:
- GC transaction race with abort path
- defer gc run if previous batch is still pending
Previous releases - regressions:
- ipv4: fix data-races around inet->inet_id
- phy: fix deadlocking in phy_error() invocation
- mdio: fix C45 read/write protocol
- ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
- ice: fix NULL pointer deref during VF reset
- i40e: fix potential NULL pointer dereferencing of pf->vf in
i40e_sync_vsi_filters()
- tg3: use slab_build_skb() when needed
- mtk_eth_soc: fix NULL pointer on hw reset
Previous releases - always broken:
- core: validate veth and vxcan peer ifindexes
- sched: fix a qdisc modification with ambiguous command request
- devlink: add missing unregister linecard notification
- wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
- batman:
- do not get eth header before batadv_check_management_packet
- fix batadv_v_ogm_aggr_send memory leak
- bonding: fix macvlan over alb bond support
- mlxsw: set time stamp fields also when its type is MIRROR_UTC"
* tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
selftests: bonding: add macvlan over bond testing
selftest: bond: add new topo bond_topo_2d1c.sh
bonding: fix macvlan over alb bond support
rtnetlink: Reject negative ifindexes in RTM_NEWLINK
netfilter: nf_tables: defer gc run if previous batch is still pending
netfilter: nf_tables: fix out of memory error handling
netfilter: nf_tables: use correct lock to protect gc_list
netfilter: nf_tables: GC transaction race with abort path
netfilter: nf_tables: flush pending destroy work before netlink notifier
netfilter: nf_tables: validate all pending tables
ibmveth: Use dcbf rather than dcbfl
i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
net/sched: fix a qdisc modification with ambiguous command request
igc: Fix the typo in the PTM Control macro
batman-adv: Hold rtnl lock during MTU update via netlink
igb: Avoid starting unnecessary workqueues
can: raw: add missing refcount for memory leak fix
can: isotp: fix support for transmission of SF without flow control
bnx2x: new flag for track HW resource allocation
sfc: allocate a big enough SKB for loopback selftest packet
...
Test the basic locking stuff on 2 fds: multiple read locks,
conflicts between read and write locks, use of len==0 for queries.
Also tests for F_UNLCK F_OFD_GETLK extension.
[ jlayton: fix unlink() pathname in selftest ]
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Sort the strings within the big C string based on whether they were
for a metric and then by when they were added. This helps group
related strings and reduce minor faults by approximately 10 in 1740,
about 0.57%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Event info is only needed when an event is parsed or when merging data
from an JSON and sysfs event. Be lazy in its loading to reduce file
accesses.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Scan sysfs PMU's type early so that format and aliases aren't
attempted to be loaded if the PMU name is invalid.
This is the case for event_pmu tokens in parse-events.y where a wildcard
name is first assumed to be a PMU name.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.
Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cache the JSON events table so that finding it isn't done per
event/alias.
Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.
Update usage to always use the PMU variable.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than load all sysfs events then parsing all JSON events and
merging with ones that already exist. When a sysfs event is loaded, look
for a corresponding JSON event and merge immediately.
To simplify the logic, early exit the perf_pmu__new_alias function if an
alias is attempted to be added twice - as merging has already been
explicitly handled.
Fix the copying of terms to a merged alias and some ENOMEM paths.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The aliases list is part of the PMU. Rather than pass the aliases
list, pass the full PMU simplifying some callbacks.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than read a sysfs events file into a 256 byte char buffer, pass
the FILE* directly to the lex/yacc parser.
This avoids there being a maximum events file size.
While changing the API, constify some arguments to remove unnecessary
casts.
Allocating the read buffer decreases the performance of pmu-scan by
around 3%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
jevents stores events sorted by name. Add a find function that will
binary search event names avoiding the need to linearly search through
events.
Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000
is returned. If the event is found but no callback function given, 0 is
returned.
This allows the find function also act as a test for existence.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass the PMU to pmu_events_table__for_each_event so that entries that
don't match don't need to be processed by callback.
If a NULL PMU is passed then all PMUs are processed.
'perf bench internals pmu-scan's "Average PMU scanning" performance is
reduced by about 5% on an Intel tigerlake.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than scanning all PMUs for a counter name, scan the PMU
associated with the evsel of the sample. This is done to remove a
dependence on pmu-events.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Double setting information for an event would produce an error message
associated with the PMU rather than the term that was double setting.
Improve the error message to be on the term.
Before:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: 'cpu/inst_retired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Run 'perf list' for a list of valid events
$
After:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: '..etired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Initial error:
event syntax error: '..etired.any,inst_retired.any/'
\___ Attempt to set event's scale twice
Run 'perf list' for a list of valid events
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Prior to this change a cpuid would map to a list of events where the PMU
would be encoded alongside the event information. This change breaks
apart each group of events so that there is a group per PMU. A new table
is added with the PMU's name and the list of events, the original table
now holding an array of these per PMU tables.
These changes are to make it easier to get per PMU information about
events, rather than the current approach of scanning all events. The
perf binary size with BPF skeletons on x86 is reduced by about 1%. The
unidentified PMU is now always expanded to "cpu".
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add extra underscore before "for" of pmu_events_table_for_each_event
and pmu_metrics_table_for_each_metric.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In order to be able to lazily compute aliases/events for a PMU, move
the struct perf_pmu_alias into pmu.c.
Add perf_pmu__find_event and perf_pmu__for_each_event that take a
callback that is called for the found event or for each event.
The layout of struct pmu and the event/alias list is unchanged but the
API is altered so that aliases are no longer directly accessed, allowing
for later changes.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new testing topo bond_topo_2d1c.sh which is used more commonly.
Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and add the
extra link.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we
need to check that first before inferring signedness.
Closes: https://github.com/libbpf/libbpf/issues/704
Reported-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
It seems like it was forgotten to add uprobe_multi binary to .gitignore.
Fix this trivial omission.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
For bpf_object__pin_programs() there is bpf_object__unpin_programs().
Likewise bpf_object__unpin_maps() for bpf_object__pin_maps().
But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds
symmetry to the API.
It's also convenient for cleanup in application code. It's an API I
would've used if it was available for a repro I was writing earlier.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
Add tests that enforce mmap hint address behavior. mmap should default
to sv48. mmap will provide an address at the highest address space that
can fit into the hint address, unless the hint address is less than sv39
and not 0, then it will return a sv39 address.
These tests are split into two files: mmap_default.c and mmap_bottomup.c
because a new process must be exec'd in order to change the mmap layout.
The run_mmap.sh script sets the stack to be unlimited for the
mmap_bottomup.c test which triggers a bottomup layout.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20230809232218.849726-3-charlie@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This also puts an unconditional -Werror under control of WERROR. The
clang includes added during the build can lead to a warning that may be
turned into an error. In addition, hardened clang produces a warning
about lack of support for -fstack-protector* options for the BPF target:
clang -g -O2 -target bpf -Wall -Werror -Ilinux/tools/perf/util/bpf_skel/.tmp/.. \
-I -idirafter /usr/lib/llvm/16/bin/../../../../lib/clang/16/include -idirafter /usr/local/include \
-idirafter /usr/include -Ilinux/tools/include/uapi -c util/bpf_skel/bperf_follower.bpf.c \
-o linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o && llvm-strip -g linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o
clang-16: error: /usr/lib/llvm/16/bin/../../../../lib/clang/16/include: 'linker' input unused [-Werror,-Wunused-command-line-argument]
clang-16: error: ignoring '-fstack-protector-strong' option as it is not currently supported for target 'bpf' [-Werror,-Woption-ignored]
make[1]: *** [Makefile.perf:1082: linux/tools/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o] Error 1
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZOZQ2LDA+3Wg8x2T@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass the pmu so the aliases and format list can be better abstracted
and later lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass the PMU so the format list can be better abstracted and later
lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-8-irogers@google.com
[ Did missing conversions in tools/perf/arch/arm*/util/cs-etm.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass the pmu so the format list can be better abstracted and later
lazily loaded.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Abstract the format list better, hiding it in the PMU, by changing
perf_pmu__config_terms() the PMU rather than the format list in the PMU.
Change the PMU test to pass a dummy PMU for this purpose. Changing the
test allows perf_pmu__del_formats() to become static.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move declaration from header file to pmu.y and make static.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid having the function in the C and header file, as it is only used
locally by pmu.y.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than read a base path and append into a 2nd path, read the base
path directly into output buffer and append to that.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Done to reduce dependencies on pmu-events.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on commit 7d54a4acd8 ("perf test: Skip watchpoint tests if
no watchpoints available"), hardware breakpoints are not available for
power9 platform and because of that 'perf bench breakpoint' run fails on
power9 platform.
Add code to check for the return value of perf_event_open() in the
breakpoint run and skip the 'perf bench breakpoint' run, if hardware
breakpoints are not available.
Result on power9 system before patch changes:
[command]# perf bench breakpoint thread
perf_event_open: No such device
Result on power9 system after patch changes:
[command]# ./perf bench breakpoint thread
Skipping perf bench breakpoint thread: No hardware support
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Naveen N Rao <naveen@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230823075103.190565-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Having __sysret() as an inline function has the unfortunate effect of
adding casts and large constants comparisons after the syscall returns
that significantly inflate some light code that's otherwise syscall-
heavy. Even nolibc-test grew by ~1%.
Let's switch back to a macro for this, and use it only with signed
arguments. Note that it is also possible to design a slightly more
complex macro covering unsigned and pointers but we only have 3 such
syscalls so it is pointless, and these were just addressed not to use
this macro anymore. Now for the argument (the local variable containing
the syscall return value), any negative value is an error, that results
in -1 being returned and errno to be assigned the opposite value.
This may be revisited again in the future if really needed but for now
let's get back to something sane.
Fixes: 428905da6e ("tools/nolibc: sys.h: add a syscall return helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The __sysret() function causes some undesirable casts so we'll revert
it. In order to keep it simple it will now only support integer return
values like in the past, so we must basically revert the changes that
were made to these 3 syscalls which return a pointer so that they
simply rely on their own test and the SET_ERRNO() macro.
Fixes: 4201cfce15 ("tools/nolibc: clean up sbrk() routine")
Fixes: 924e9539ae ("tools/nolibc: clean up mmap() routine")
Fixes: d27447bc2e ("tools/nolibc: sys.h: apply __sysret() helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Silence the following warnings reported by the new -Wall -Wextra options
with pure assembly code.
In file included from sysroot/powerpc/include/stdio.h:13,
from nolibc-test.c:13:
sysroot/powerpc/include/arch.h: In function '_start':
sysroot/powerpc/include/arch.h:192:32: warning: unused variable 'r2' [-Wunused-variable]
192 | register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
| ^~
sysroot/powerpc/include/arch.h:187:97: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
187 | void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
| ^~~~~~
Since only elfv2 ABI requires to save the TOC/GOT pointer to r2
register, when using elfv1 ABI, the old C code is simply ignored by the
compiler, but the compiler can not ignore the inline assembly code and
will introduce build failure or running segfaults. So, let's further
only add the new assembly code for elfv2 ABI with the checking of
_CALL_ELF == 2.
Link: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
Link: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
libc-test is mainly added to compare the behavior of nolibc to the
system libc, it is meaningless and error-prone with cross compiling.
Let's use HOSTCC instead of CC to avoid wrongly use cross compiler when
CROSS_COMPILE is passed or customized.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Fixes: cfb672f94f ("selftests/nolibc: add run-libc-test target")
Signed-off-by: Willy Tarreau <w@1wt.eu>
This allows to generate smaller text/data/dec size.
As the _start_c() function added by crt.h, __stack_chk_init() is called
from _start_c() instead of the assembly _start. So, it is able to mark
it with static now.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
After the tests finish, it is valuable to report and summarize with
existing test log.
This avoid rerun or run the tests again when not necessary.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64 variant for big endian 64-bit PowerPC, users can pass XARCH=ppc64
to test it.
The powernv machine of qemu-system-ppc64 is used with
powernv_be_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
used as an hypervisor OS, running KVM guests, or simply as a host OS.
Notes,
- differs from little endian 64-bit PowerPC, vmlinux is used instead of
zImage, because big endian zImage [2] only boot on qemu with x-vof=on
(added from qemu v7.0) and a fixup patch [3] for qemu v7.0.51:
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
the load multiple word instructions and the store multiple word
instructions. Those instructions do not work when the processor is in
little-endian mode (except PPC740/PPC750), so, we only enable it
for big endian powerpc.
- for big endian ppc64, as the help message from arch/powerpc/Kconfig
shows, the V2 ABI is standard for 64-bit little-endian, but for
big-endian it is less well tested by kernel and toolchain, so, use
elfv1 as-is, no need to explicitly ask toolchain to use elfv2 here.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
[2]: https://github.com/linuxppc/issues/issues/402
[3]: https://lore.kernel.org/qemu-devel/20220504065536.3534488-1-aik@ozlabs.ru/
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722121019.GD17311@1wt.eu/
Link: https://lore.kernel.org/lkml/20230719043353.GC5331@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64le variant for little endian 64-bit PowerPC, users can pass
XARCH=ppc64le to test it.
The powernv machine of qemu-system-ppc64le is used for there is just a
working powernv_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
used as an hypervisor OS, running KVM guests, or simply as a host OS.
Notes,
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- little endian ppc64 prefers elfv2 to elfv1 if the toolchain (e.g. gcc
13.1.0) supports it, let's align with kernel, otherwise, our elfv1
binary will not run on kernel with elfv2 ABI.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722120747.GC17311@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc variant for 32-bit PowerPC and uses it as the default variant of
powerpc architecture.
Users can pass XARCH=ppc (or ARCH=powerpc) to test 32-bit PowerPC.
The default qemu-system-ppc g3beige machine [1] is used to run 32-bit
powerpc kernel with pmac32_defconfig. The missing PMACZILOG serial tty
and console are enabled in another patch [2].
Note,
- zImage doesn't boot due to "qemu-system-ppc: Some ROM regions are
overlapping" error, so, vmlinux is used instead.
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
the load multiple word instructions and the store multiple word
instructions. Those instructions do not work when the processor is in
little-endian mode (except PPC740/PPC750), so, we only enable it
for big endian powerpc.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powermac.html
[2]: https://lore.kernel.org/lkml/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.1690982937.git.tanyuan@tinylab.org/
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZL9leVOI25S2+0+g@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Most of the CPU architectures have different variants, but kernel
usually only accepts parts of them via the ARCH variable, the others
should be customized via kernel config files.
To simplify testing, a new XARCH variable is added to extend the
kernel's ARCH with a few variants of the same architecture, and it is
used to customize variant specific variables, at last XARCH is converted
to the kernel's ARCH:
e.g. make run XARCH=<one of the supported variants>
| \
| `-> variant specific variables:
| IMAGE, DEFCONFIG, QEMU_ARCH, QEMU_ARGS, CFLAGS ...
\
`---> kernel's ARCH
XARCH and ARCH are carefully mapped to allow users to pass architecture
variants via XARCH or pass architecture via ARCH from cmdline.
PowerPC is the first user and also a very good reference architecture of
this mapping, it has variants with different combinations of
32-bit/64-bit and bit endian/little endian.
To use this mapping, the other architectures can refer to PowerPC, If
the target architecture only has one variant, XARCH is simply an alias
of ARCH, no additional mapping required.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702171715.GD16233@1wt.eu/
Link: https://lore.kernel.org/lkml/20230730061801.GA7690@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This follows the 64-bit PowerPC ABI [1], refers to the slides: "A new
ABI for little-endian PowerPC64 Design & Implementation" [2] and the
musl code in arch/powerpc64/crt_arch.h.
First, stdu and clrrdi are used instead of stwu and clrrwi for
powerpc64.
Second, the stack frame size is increased to 32 bytes for powerpc64, 32
bytes is the minimal stack frame size supported described in [2].
Besides, the TOC pointer (GOT pointer) must be saved to r2.
This works on both little endian and big endian 64-bit PowerPC.
[1]: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
[2]: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Both syscall declarations and _start code definition are added for
powerpc to nolibc.
Like mips, powerpc uses a register (exactly, the summary overflow bit)
to record the error occurred, and uses another register to return the
value [1]. So, the return value of every syscall declaration must be
normalized to match the __sysret() helper, return -value when there is
an error, otheriwse, return value directly.
Glibc and musl use different methods to check the summary overflow bit,
glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register
to r0 at first, and then check the summary overflow bit in cr0:
mfcr r0
r0 & (1 << 28) ? -r3 : r3
-->
10003c14: 7c 00 00 26 mfcr r0
10003c18: 74 09 10 00 andis. r9,r0,4096
10003c1c: 41 82 00 08 beq 0x10003c24
10003c20: 7c 63 00 d0 neg r3,r3
Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow
bit with the 'bns' instruction, it is smaller:
/* no summary overflow bit means no error, return value directly */
bns+ 1f
/* otherwise, return negated value */
neg r3, r3
1:
-->
10000418: 40 a3 00 08 bns 0x10000420
1000041c: 7c 63 00 d0 neg r3,r3
Like musl, Linux (arch/powerpc/include/asm/vdso/gettimeofday.h) uses the
same method for do_syscall_2() too.
Here applies the second method to get smaller size.
[1]: https://man7.org/linux/man-pages/man2/syscall.2.html
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It will help the developers to avoid cruft and detect some bugs.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Binary size is not important for nolibc-test and some debugging
information is nice to have, so don't strip the binary during linking.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If read() fails and returns -1 (or returns garbage for some other
reason) buf would be accessed out of bounds.
Only use the return value of read() after it has been validated.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Avoid truncating values before comparing them.
As printf in nolibc doesn't support ssize_t add casts to int for
printing.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
These warnings will be enabled later so avoid triggering them.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This warning will be enabled later so avoid triggering it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This allows the compiler to generate warnings if they go unused.
Functions that are supposed to be used as breakpoints should not be
static, so un-statify those if necessary.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Recent fix ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
had the annoying side effect of always returning skipped tests, which
are normally supposed to happen only when certain features are missing
to run the test (missing kernel options, toolchain not supporting
stack-protector etc). As such there are now always warnings. Let's
modify the test to not use the condition and instead use a ternary
expression to check the result.
Fixes: ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
Cc: Thomas WeiÃ<9F>schuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Otherwise both gcc and clang may generate warnings about type
mismatches:
sysroot/mips/include/string.h:12:14: warning: mismatch in argument 1 type of built-in function 'malloc'; expected 'unsigned int' [-Wbuiltin-declaration-mismatch]
12 | static void *malloc(size_t len);
| ^~~~~~
The compiler provides __SIZE_TYPE__ as the type that corresponds to size_t
(typically "long unsigned int" or "unsigned int"). It was verified to be
available at least since gcc-3.4 and clang-3.8, so from now on we'll use
this definition for size_t.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/20230805161929.GA15284@1wt.eu/
Signed-off-by: Willy Tarreau <w@1wt.eu>
getauxval() returns an unsigned long but the overall type of the ternary
operator needs to be signed.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This warning will be enabled later so avoid triggering it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It's documented as returning int which is also implemented by glibc and
musl, so adopt that return type.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add a test case of pipe that sends and receives message in a single
process.
Suggested-by: Thomas Weißschuh <thomas@t-8ch.de>
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/all/c5de2d13-3752-4e1b-90d9-f58cca99c702@t-8ch.de/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
[wt: fixed the "len" type to size_t to address a sign-compare warning
with upcoming patches]
Signed-off-by: Willy Tarreau <w@1wt.eu>
According to manual page [1], posix spec [2] and source code like
arch/mips/kernel/syscall.c, for historic reasons, the sys_pipe() syscall
on some architectures has an unusual calling convention. It returns
results in two registers which means there is no need for it to do
verify the validity of a userspace pointer argument. Historically that
used to be expensive in Linux. These days the performance advantage is
negligible.
Nolibc doesn't support the unusual calling convention above, luckily
Linux provides a generic sys_pipe2() with an additional flags argument
from 2.6.27. If flags is 0, then pipe2() is the same as pipe(). So here
we use sys_pipe2() to implement the pipe().
pipe2() is also provided to allow users to use flags argument on demand.
[1]: https://man7.org/linux/man-pages/man2/pipe.2.html
[2]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pipe.html
Suggested-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/all/20230729100401.GA4577@1wt.eu/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The other tests use 1 as failure, mmap_munmap_good uses -1 as failure,
let's fix up this.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If the test description is longer than the status alignment the
parameter 'n' to putcharn() would lead to a signed underflow that then
gets converted to a very large unsigned value.
This in turn leads out-of-bound writes in memset() crashing the
application.
The failure case of EXPECT_PTRER() used in "mmap_bad" exhibits this
exact behavior.
Fixes: 29f5540be3 ("selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add a minimal implementation of setvbuf(), which error checks the mode
argument (as required by spec) and returns. Since nolibc never buffers
output, nothing needs to be done.
The kselftest framework recently added a call to setvbuf(). As a result,
any tests that use the kselftest framework and nolibc cause a compiler
error due to missing function. This provides an urgent fix for the
problem which is preventing arm64 testing on linux-next.
Example:
clang --target=aarch64-linux-gnu -fintegrated-as
-Werror=unknown-warning-option -Werror=ignored-optimization-argument
-Werror=option-ignored -Werror=unused-command-line-argument
--target=aarch64-linux-gnu -fintegrated-as
-fno-asynchronous-unwind-tables -fno-ident -s -Os -nostdlib \
-include ../../../../include/nolibc/nolibc.h -I../..\
-static -ffreestanding -Wall za-fork.c
build/kselftest/arm64/fp/za-fork-asm.o
-o build/kselftest/arm64/fp/za-fork
In file included from <built-in>:1:
In file included from ./../../../../include/nolibc/nolibc.h:97:
In file included from ./../../../../include/nolibc/arch.h:25:
./../../../../include/nolibc/arch-aarch64.h:178:35: warning: unknown
attribute 'optimize' ignored [-Wunknown-attributes]
void __attribute__((weak,noreturn,optimize("omit-frame-pointer")))
__no_stack_protector _start(void)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from za-fork.c:12:
../../kselftest.h:123:2: error: call to undeclared function 'setvbuf';
ISO C99 and later do not support implicit function declarations
[-Wimplicit-function-declaration]
setvbuf(stdout, NULL, _IOLBF, 0);
^
../../kselftest.h:123:24: error: use of undeclared identifier '_IOLBF'
setvbuf(stdout, NULL, _IOLBF, 0);
^
1 warning and 2 errors generated.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/linux-kselftest/CA+G9fYus3Z8r2cg3zLv8uH8MRrzLFVWdnor02SNr=rCz+_WGVg@mail.gmail.com/
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the head comment of nolibc-test.c shows, it can be built in 3 ways:
$(CC) -nostdlib -include /path/to/nolibc.h => NOLIBC already defined
$(CC) -nostdlib -I/path/to/nolibc/sysroot => _NOLIBC_* guards are present
$(CC) with default libc => NOLIBC* never defined
Only last two of them are tested currently, let's allow test the first one too.
This may help to find issues about using nolibc.h to build programs. it
derives from this change:
commit 3a8039e289 ("tools/nolibc: Fix build of stdio.h due to header ordering")
Usage:
// test with sysroot by default
$ make run-user
// test without sysroot, using nolibc.h directly
$ make run-user NOLIBC_SYSROOT=0
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It is able to run nolibc-test directly without qemu-user when the target
machine is the same as the host machine.
Sometimes, the result running locally may help a lot when the qemu-user
package is too old.
When the target machine differs from the host machine, it is also able
to run nolibc-test directly with qemu-user-static + binfmt_misc.
Link: https://lore.kernel.org/lkml/ZKutZwIOfy5MqedG@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The startup code is critical to get the right argc, argv, envp/environ
and _auxv, let's add a startup test group and the corresponding
testcases.
The "environ" test case is also moved from the stdlib test group to this
new startup test group and it is renamed to "environ_envp".
Since argv0 has been used by many other test cases, let's add testcases
to gurantee it too.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
4 new pointer compare macros are added, they are similar to the integer
compare macros.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Also clean up the instructions in delayed slots.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As suggested by Thomas, It is able to move the stackprotector
initialization from the assembly _start to the beginning of the new
_start_c(). Let's call __stack_chk_init() in _start_c() as a
preparation.
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/a00284a6-54b1-498c-92aa-44997fa78403@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Let's define an empty __stack_chk_init for the !_NOLIBC_STACKPROTECTOR
branch.
This allows to remove #ifdef around every call of __stack_chk_init().
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the environ and _auxv support added for nolibc, the assembly _start
function becomes more and more complex and therefore makes the porting
of nolibc to new architectures harder and harder.
To simplify portability, this C version of _start_c() is added to do
most of the assembly start operations in C, which reduces the complexity
a lot and will eventually simplify the porting of nolibc to the new
architectures.
The new _start_c() only requires a stack pointer argument, it will find
argc, argv, envp/environ and _auxv for us, and then call main(),
finally, it exit() with main's return status. With this new _start_c(),
the future new architectures only require to add very few assembly
instructions.
As suggested by Thomas, users may use a different signature of main
(e.g. void main(void)), a _nolibc_main alias is added for main to
silence the warning about potential conflicting types.
As suggested by Willy, the code is carefully polished for both smaller
size and better readability with local variables and the right types.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230715095729.GC24086@1wt.eu/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The statx manpage [1] shows that it has been supported from Linux 4.11
and glibc 2.28, the Linux support can be checked for all of the
architectures with this command:
$ git grep -r statx v4.11 arch/ include/uapi/asm-generic/unistd.h \
| grep -E "aarch64|arm|mips|s390|x86|:include/uapi"
Besides riscv and loongarch, all of the nolibc supported architectures
have added sys_statx from Linux v4.11. riscv is mainlined to v4.15,
loongarch is mainlined to v5.19, both of them use the generic unistd.h,
so, they have added sys_statx from their first mainline versions.
The current oldest stable branch is v4.14, only reserving sys_statx
still preserves compatibility with all of the supported stable branches,
So, let's remove the old arch related and dependent sys_stat support
completely.
This is friendly to the future new architecture porting.
[1]: https://man7.org/linux/man-pages/man2/statx.2.html
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As gcc doc [1] shows:
Most optimizations are completely disabled at -O0 or if an -O level is
not set on the command line, even if individual optimization flags are
specified.
Test result [2] shows, gcc>=11.1.0 deviates from the above description,
but before gcc 11.1.0, "-O0" still forcely uses frame pointer in the
_start function even if the individual optimize("omit-frame-pointer")
flag is specified.
The frame pointer related operations will change the stack pointer (e.g.
In x86_64, an extra "push %rbp" will be inserted at the beginning of
_start) and make it differs from the one we expected, as a result, break
the whole startup function.
To fix up this issue, as suggested by Thomas, the individual "Os" and
"omit-frame-pointer" optimize flags are used together on _start function
to disable frame pointer completely even if the -O0 is set on the
command line.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2]: https://lore.kernel.org/lkml/20230714094723.140603-1-falcon@tinylab.org/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/34b21ba5-7b59-4b3b-9ed6-ef9a3a5e06f7@t-8ch.de/
Fixes: 7f85485896 ("tools/nolibc: make compiler and assembler agree on the section around _start")
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Fix up such errors reported by scripts/checkpatch.pl:
ERROR: space required after that ',' (ctx:VxV)
#148: FILE: tools/include/nolibc/arch-aarch64.h:148:
+void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
^
ERROR: space required after that ',' (ctx:VxV)
#148: FILE: tools/include/nolibc/arch-aarch64.h:148:
+void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
^
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the test numbers are based on line numbers gaps without testcases are
to be avoided.
Instead use the already existing test condition logic to implement
conditional execution.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
pad_spc() is only ever used to print the status message of testcases.
The line size is always constant, the return value is never used and the
format string is never used as such.
Remove all the unneeded logic and simplify the API and its users.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If "cond" is a multi-token statement the behavior of the preprocessor
will lead to the negation "!" to be only applied to the first token.
Although currently no test uses such multi-token conditions but it can
happen at any time.
Put braces around "cond" to ensure the negation works as expected.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
In commit 52e423f5b9 ("tools/nolibc: export environ as a weak symbol on i386")
and friends the asm startup logic was extended to directly populate the
"environ" array.
This makes it impossible for "environ" to be dropped by the linker.
Therefore also drop the other logic to handle non-present "environ".
Also add a testcase to validate the initialization of environ.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
a newline is inserted just before the test failures to avoid mixing the
test failures with the raw test log.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
two newlines are added around the test summary line to extrude the test
status.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
align the test values for different runs and different architectures.
Since the total number of tests is not bigger than 1000 currently, let's
align them with "%3d".
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
[wt: s/%03d/%3d/ as discussed with Zhangjin]
Link: https://lore.kernel.org/lkml/20230709185112.97236-1-falcon@tinylab.org/
Signed-off-by: Willy Tarreau <w@1wt.eu>
Let's count and print the total number of tests, now, the data of
passed, skipped and failed have the same format.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
one of the test status: success, warning and failure is printed to
summarize the passed, skipped and failed values.
- "success" means no skipped and no failed.
- "warning" means has at least one skipped and no failed.
- "failure" means all tests are failed.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702164358.GB16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
argv0 is readable and chmodable, let's use it for chmod test, but a safe
umask should be used, the readable and executable modes should be
reserved.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Since argv0 also works for CONFIG_PROC_FS=n, let's use it instead of
'/proc/self/exe'.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
'/proc/self/' is a good path which doesn't have stale time info but it
is only available for CONFIG_PROC_FS=y.
When CONFIG_PROC_FS=n, use argv0 instead of '/proc/self', use '/' for the
worst case.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The PWD environment variable has the path of the nolibc-test program,
the current path must be the same as it, otherwise, the test cases will
fail with relative path (e.g. ./nolibc-test).
Since only chdir_root really changes the current path, let's restore it
with the PWD environment variable.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The vfprintf test case require to open a temporary file to write, the
old memfd_create() method is perfect but has strong dependency on
MEMFD_CREATE and also TMPFS or HUGETLBFS (see fs/Kconfig):
config MEMFD_CREATE
def_bool TMPFS || HUGETLBFS
And from v6.2, MFD_NOEXEC_SEAL must be passed for the non-executable
memfd, otherwise, The kernel warning will be output to the test result
like this:
Running test 'vfprintf'
0 emptymemfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'init'
"" = "" [OK]
To avoid such warning and also to remove the MEMFD_CREATE dependency,
let's open a file from tmpfs directly.
The /tmp directory is used to detect the existing of tmpfs, if not
there, skip instead of fail.
And further, for pid == 1, the initramfs is loaded as ramfs, which can
be used as tmpfs, so, it is able to further remove TMPFS dependency too.
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/9ad51430-b7c0-47dc-80af-20c86539498d@t-8ch.de
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
create a /tmp directory. If it succeeds, the directory is writable,
which is normally the case when booted from an initramfs anyway.
This will be used instead of procfs for some tests.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/lkml/20230710050600.9697-1-falcon@tinylab.org/
[wt: removed the unneeded mount() call]
Signed-off-by: Willy Tarreau <w@1wt.eu>
For CONFIG_PROC_FS=n, the /proc is not mountable, but the /proc
directory has been created in the prepare() stage whenever /proc is
there or not.
so, the checking of /proc in the run_syscall() stage will be always true
and at last it will fail all of the procfs dependent test cases, which
deviates from the 'cond' check design of the EXPECT_xx macros, without
procfs, these test cases should be skipped instead of failed.
To solve this issue, one method is checking /proc/self instead of /proc,
another method is removing the /proc directory completely for
CONFIG_PROC_FS=n, we apply the second method to avoid misleading the
users.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
A new rmdir_blah test case is added to remove a non-existing /blah,
which expects failure with ENOENT errno.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
a reverse operation of mkdir() is meaningful, add rmdir() here.
required by nolibc-test to remove /proc while CONFIG_PROC_FS is not
enabled.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
For CONFIG_NET=n, there would be no /proc/self/net, so, use
/proc/self/cmdline instead.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
kernel parameters allow pass two types of strings, one type is like
'noapic', another type is like 'panic=5', the first type is passed as
arguments of the init program, the second type is passed as environment
variables of the init program.
when users pass kernel parameters like this:
noapic NOLIBC_TEST=syscall
our nolibc-test program will use the test setting from argv[1] and
ignore the one from NOLIBC_TEST environment variable, and at last, it
will print the following line and ignore the whole test setting.
Ignoring unknown test name 'noapic'
reversing the parsing order does solve the above issue:
test = getenv("NOLIBC_TEST");
if (test)
test = argv[1];
but it still doesn't work with such kernel parameters (without
NOLIBC_TEST environment variable):
noapic FOO=bar
To support all of the potential kernel parameters, let's verify the test
setting from both of argv[1] and NOLIBC_TEST environment variable.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Since both glibc and musl provide RB_ flags via <sys/reboot.h>, and we
just add RB_ flags for nolibc, let's use RB_ flags instead of
LINUX_REBOOT_ flags and only reserve the required <sys/reboot.h> header.
This allows compile libc-test for musl libc without the linux headers.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Both glibc and musl provide RB_ flags via <sys/reboot.h> for reboot(),
they don't need to include <linux/reboot.h>, let nolibc provide RB_
flags too.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
musl limits the fast signed int in 32bit, but glibc and nolibc don't, to
let such test cases work on musl, let's provide the type based
SINT_MAX_OF_TYPE(type) and SINT_MIN_OF_TYPE(type).
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/bc635c4f-67fe-4e86-bfdf-bcb4879b928d@t-8ch.de/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
_GNU_SOURCE Implies _LARGEFILE64_SOURCE in glibc, but in musl, the
default configuration doesn't enable _LARGEFILE64_SOURCE.
>From include/dirent.h of musl, getdents64 is provided as getdents when
_LARGEFILE64_SOURCE is defined.
#if defined(_LARGEFILE64_SOURCE)
...
#define getdents64 getdents
#endif
Let's define _LARGEFILE64_SOURCE to fix up this compile error:
tools/testing/selftests/nolibc/nolibc-test.c: In function ‘test_getdents64’:
tools/testing/selftests/nolibc/nolibc-test.c:453:8: warning: implicit declaration of function ‘getdents64’; did you mean ‘getdents’? [-Wimplicit-function-declaration]
453 | ret = getdents64(fd, (void *)buffer, sizeof(buffer));
| ^~~~~~~~~~
| getdents
/usr/bin/ld: /tmp/ccKILm5u.o: in function `test_getdents64':
nolibc-test.c:(.text+0xe3e): undefined reference to `getdents64'
collect2: error: ld returned 1 exit status
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the gettid manpage [1] shows, glibc 2.30 has gettid support, so,
let's enable the test for glibc >= 2.30.
gettid works on musl too.
[1]: https://man7.org/linux/man-pages/man2/gettid.2.html
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
mmap() a file with a good offset and then munmap() it. a non-zero offset
is passed to test the 6th argument of my_syscall6().
Note, it is not easy to find a unique file for mmap() in different
scenes, so, a file list is used to search the right one:
- /dev/zero: is commonly used to allocate anonymous memory and is likely
present and readable
- /proc/1/exe: for 'run' and 'run-user' target, 'run-user' can not find
'/proc/self/exe'
- /proc/self/exe: for 'libc-test' target, normal program 'libc-test' has
no permission to access '/proc/1/exe'
- argv0: the path of the program itself, let it pass even with worst
case scene: no procfs and no /dev/zero
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702193306.GK16233@1wt.eu/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/bff82ea6-610b-4471-a28b-6c76c28604a6@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The addr argument of munmap() must be a multiple of the page size,
passing invalid (void *)1 addr expects failure with -EINVAL.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The length argument of mmap() must be greater than 0, passing a zero
length argument expects failure with -EINVAL.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
>From musl 0.9.14 (to the latest version 1.2.3), both sbrk() and brk()
have almost been disabled for they conflict with malloc, only sbrk(0) is
still permitted as a way to get the current location of the program
break, let's support such case.
EXPECT_PTRNE() is used to expect sbrk() always successfully getting the
current break.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The syscalls like sbrk() and mmap() return pointers, to test them, more
pointer compare test macros are required, add them:
- EXPECT_PTREQ() expects two equal pointers.
- EXPECT_PTRNE() expects two non-equal pointers.
- EXPECT_PTRER() expects failure with a specified errno.
- EXPECT_PTRER2() expects failure with one of two specified errnos.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
/dev/zero is commonly used to allocate anonymous memory, it is a very
good file for tests, let's prepare it.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702193306.GK16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
argv0 is the path to nolibc-test program itself, which is a very good
always existing readable file for some tests, let's export it.
Note, the path may be absolute or relative, please make sure the tests
work with both of them. If it is relative, we must make sure the current
path is the one specified by the PWD environment variable.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZKKbS3cwKcHgnGwu@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Fix up the error reported by scripts/checkpatch.pl:
ERROR: do not use assignment in if condition
#95: FILE: tools/include/nolibc/sys.h:95:
+ if ((ret = sys_brk(0)) && (sys_brk(ret + inc) == ret + inc))
Apply the new generic __sysret() to merge the SET_ERRNO() and return
lines.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Do several cleanups together:
- Since all supported architectures have my_syscall6() now, remove the
#ifdef check.
- Move the mmap() related macros to tools/include/nolibc/types.h and
reuse most of them from <linux/mman.h>
- Apply the new generic __sysret() to convert the calling of sys_map()
to oneline code
Note, since MAP_FAILED is -1 on Linux, so we can use the generic
__sysret() which returns -1 upon error and still satisfy user land that
checks for MAP_FAILED.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702192347.GJ16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
No official reference states the errno range, here aligns with musl and
glibc and uses [-MAX_ERRNO, -1] instead of all negative ones.
- musl: src/internal/syscall_ret.c
- glibc: sysdeps/unix/sysv/linux/sysdep.h
The MAX_ERRNO used by musl and glibc is 4095, just like the one nolibc
defined in tools/include/nolibc/errno.h.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZKKdD%2Fp4UkEavru6@1wt.eu/
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Link: https://lore.kernel.org/linux-riscv/94dd5170929f454fbc0a10a2eb3b108d@AcuMS.aculab.com/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It is able to pass the 6th argument like the 5th argument via the stack
for mips, let's add a new my_syscall6() now, see [1] for details:
The mips/o32 system call convention passes arguments 5 through 8 on
the user stack.
Both mmap() and pselect6() require my_syscall6().
[1]: https://man7.org/linux/man-pages/man2/syscall.2.html
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
my_syscall<N> share the same long clobber list, define a macro for them.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
my_syscall<N> share the same long clobber list, define a macro for them.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
replace "__asm__ volatile" with "__asm__ volatile" and insert necessary
whitespace before "\" to make sure the lines are aligned.
$ sed -i -e 's/__asm__ volatile ( /__asm__ volatile ( /g' tools/include/nolibc/*.h
Note, arch-s390.h uses post-tab instead of post-whitespaces, must avoid
insert whitespace just before the tabs:
$ sed -i -e 's/__asm__ volatile (\t/__asm__ volatile (\t/g' tools/include/nolibc/arch-*.h
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
More than 8 whitespaces of the code indent are replaced with "tab +
whitespaces" to fix up such errors reported by scripts/checkpatch.pl:
ERROR: code indent should use tabs where possible
#64: FILE: tools/include/nolibc/arch-mips.h:64:
+^I \$
ERROR: code indent should use tabs where possible
#72: FILE: tools/include/nolibc/arch-mips.h:72:
+^I "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \$
This command is used:
$ sed -i -e '/^\t* /{s/ /\t/g}' tools/include/nolibc/arch-*.h
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Since commit 53fcfafa8c ("tools/nolibc/unistd: add syscall()") nolibc
has support for syscall(2).
Use it to get rid of some ifdef-ery.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add test cases for accessing the data structure fields using BTF info.
This includes the field access from parameters and retval, and accessing
string information.
Link: https://lore.kernel.org/all/169272161265.160970.14048619786574971276.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Assume the fprobe event is a return event if there is $retval is
used in the probe's argument without %return. e.g.
echo 'f:myevent vfs_read $retval' >> dynamic_events
then 'myevent' is a return probe event.
Link: https://lore.kernel.org/all/169272160261.160970.13613040161560998787.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
I hit a memory leak when testing bpf_program__set_attach_target().
Basically, set_attach_target() may allocate btf_vmlinux, for example,
when setting attach target for bpf_iter programs. But btf_vmlinux
is freed only in bpf_object_load(), which means if we only open
bpf object but not load it, setting attach target may leak
btf_vmlinux.
So let's free btf_vmlinux in bpf_object__close() anyway.
Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com
I noticed some error with:
# perf list ex_ret_brn
lzma: fopen failed on /usr/lib/modules/5.15.14-100.fc34.x86_64/kernel/net/bluetooth/bnep/bnep.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/gpu/drm/drm_kms_helper.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.18.16-200.fc36.x86_64/kernel/arch/x86/crypto/crct10dif-pclmul.ko.xz: 'No such file or directory'
lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/i2c/busses/i2c-piix4.ko.xz: 'No such file or directory'
<BIG SNIP>
Then using 'perf probe' + 'perf trace' to debug 'perf list', it seems
its some inconsistency in the ~/.debug/ cache where broken build id
symlinks that ends up making it try to uncompress some kernel modules
using the lzma routines:
395.309 perf/3594447 probe_perf:lzma_decompress_to_file(__probe_ip: 6118448, input_string: "/usr/lib/modules/5.18.17-200.fc36.x86_64/kernel/drivers/nvme/host/nvme.ko.xz")
lzma_decompress_to_file (/var/home/acme/bin/perf)
filename__decompress (/var/home/acme/bin/perf)
filename__read_build_id (/var/home/acme/bin/perf)
filename__sprintf_build_id (inlined)
build_id_cache__valid_id (inlined)
build_id_cache__list_all (/var/home/acme/bin/perf)
print_sdt_events (/var/home/acme/bin/perf)
cmd_list (/var/home/acme/bin/perf)
run_builtin (/var/home/acme/bin/perf)
handle_internal_command (inlined)
run_argv (inlined)
main (/var/home/acme/bin/perf)
__libc_start_call_main (/usr/lib64/libc.so.6)
__libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
_start (/var/home/acme/bin/perf)
But callers of filename__decompress() already check its return and use
pr_debug(), so be consistent and make functions it calls also use
pr_debug().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZOUD0+GkuCVkYF7n@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a selftest for the fix provided in the previous commit. Without the
fix, the selftest passes the verifier while it should fail. The special
logic for detecting graph root or node for reg->off and bypassing
reg->off == 0 guarantee for release helpers/kfuncs has been dropped.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230822175140.1317749-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For a bpf_kptr_xchg() with local kptr, if the map value kptr type and
allocated local obj type does not match, with the previous patch,
the below verifier error message will be logged:
R2 is of type <allocated local obj type> but <map value kptr type> is expected
Without the previous patch, the test will have unexpected success.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230822050058.2887354-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Before adding a port to bond, it need to be set down first. In the
lacpdu test the author set the port down specifically. But commit
a4abfa627c ("net: rtnetlink: Enslave device before bringing it up")
changed the operation order, the kernel will set the port down _after_
adding to bond. So all the ports will be down at last and the test failed.
In fact, the veth interfaces are already inactive when added. This
means there's no need to set them down again before adding to the bond.
Let's just remove the link down operation.
Fixes: a4abfa627c ("net: rtnetlink: Enslave device before bringing it up")
Reported-by: Zhengchao Shao <shaozhengchao@huawei.com>
Closes: https://lore.kernel.org/netdev/a0ef07c7-91b0-94bd-240d-944a330fcabd@huawei.com/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230817082459.1685972-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Attaching extra program to same functions system wide for api
and link tests.
This way we can test the pid filter works properly when there's
extra system wide consumer on the same uprobe that will trigger
the original uprobe handler.
We expect to have the same counts as before.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-29-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Running api and link tests also with pid filter and checking
the probe gets executed only for specific pid.
Spawning extra process to trigger attached uprobes and checking
we get correct counts from executed programs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-28-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test for cookies setup/retrieval in uprobe_link uprobes
and making sure bpf_get_attach_cookie works properly.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-27-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test that attaches 50k usdt probes in usdt_multi binary.
After the attach is done we run the binary and make sure we get
proper amount of hits.
With current uprobes:
# perf stat --null ./test_progs -n 254/6
#254/6 uprobe_multi_test/bench_usdt:OK
#254 uprobe_multi_test:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Performance counter stats for './test_progs -n 254/6':
1353.659680562 seconds time elapsed
With uprobe_multi link:
# perf stat --null ./test_progs -n 254/6
#254/6 uprobe_multi_test/bench_usdt:OK
#254 uprobe_multi_test:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Performance counter stats for './test_progs -n 254/6':
0.322046364 seconds time elapsed
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-26-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding code in uprobe_multi test binary that defines 50k usdts
and will serve as attach point for uprobe_multi usdt bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-25-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test that attaches 50k uprobes in uprobe_multi binary.
After the attach is done we run the binary and make sure we
get proper amount of hits.
The resulting attach/detach times on my setup:
test_bench_attach_uprobe:PASS:uprobe_multi__open 0 nsec
test_bench_attach_uprobe:PASS:uprobe_multi__attach 0 nsec
test_bench_attach_uprobe:PASS:uprobes_count 0 nsec
test_bench_attach_uprobe: attached in 0.346s
test_bench_attach_uprobe: detached in 0.419s
#262/5 uprobe_multi_test/bench_uprobe:OK
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-24-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test program that defines 50k uprobe_multi_func_*
functions and will serve as attach point for uprobe_multi bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-23-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for bpf_link_create attach function.
Testing attachment using the struct bpf_link_create_opts.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-22-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for bpf_program__attach_uprobe_multi
attach function.
Testing attachment using glob patterns and via bpf_uprobe_multi_opts
paths/syms fields.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-21-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for skeleton load/attach functions,
to test skeleton auto attach for uprobe_multi link.
Test that bpf_get_func_ip works properly for uprobe_multi
attachment.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-20-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We'd like to have single copy of get_time_ns used b bench and test_progs,
but we can't just include bench.h, because of conflicting 'struct env'
objects.
Moving get_time_ns to testing_helpers.h which is being included by both
bench and test_progs objects.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-19-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support for usdt_manager_attach_usdt to use uprobe_multi
link to attach to usdt probes.
The uprobe_multi support is detected before the usdt program is
loaded and its expected_attach_type is set accordingly.
If uprobe_multi support is detected the usdt_manager_attach_usdt
gathers uprobes info and calls bpf_program__attach_uprobe to
create all needed uprobes.
If uprobe_multi support is not detected the old behaviour stays.
Also adding usdt.s program section for sleepable usdt probes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe-multi link detection. It will be used later in
bpf_program__attach_usdt function to check and use uprobe_multi
link over standard uprobe links.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support for several uprobe_multi program sections
to allow auto attach of multi_uprobe programs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding bpf_program__attach_uprobe_multi function that
allows to attach multiple uprobes with uprobe_multi link.
The user can specify uprobes with direct arguments:
binary_path/func_pattern/pid
or with struct bpf_uprobe_multi_opts opts argument fields:
const char **syms;
const unsigned long *offsets;
const unsigned long *ref_ctr_offsets;
const __u64 *cookies;
User can specify 2 mutually exclusive set of inputs:
1) use only path/func_pattern/pid arguments
2) use path/pid with allowed combinations of:
syms/offsets/ref_ctr_offsets/cookies/cnt
- syms and offsets are mutually exclusive
- ref_ctr_offsets and cookies are optional
Any other usage results in error.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding new uprobe_multi struct to bpf_link_create_opts object
to pass multiple uprobe data to link_create attr uapi.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf_resolve_pattern_offsets function that looks up
offsets for symbols specified by pattern argument.
The 'pattern' argument allows wildcards (*?' supported).
Offsets are returned in allocated array together with its
size and needs to be released by the caller.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf_resolve_syms_offsets function that looks up
offsets for symbols specified in syms array argument.
Offsets are returned in allocated array with the 'cnt' size,
that needs to be released by the caller.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf symbol iterator object (and some functions) that follow
open-coded iterator pattern and some functions to ease up iterating
elf object symbols.
The idea is to iterate single symbol section with:
struct elf_sym_iter iter;
struct elf_sym *sym;
if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM))
goto error;
while ((sym = elf_sym_iter_next(&iter))) {
...
}
I considered opening the elf inside the iterator and iterate all symbol
sections, but then it gets more complicated wrt user checks for when
the next section is processed.
Plus side is the we don't need 'exit' function, because caller/user is
in charge of that.
The returned iterated symbol object from elf_sym_iter_next function
is placed inside the struct elf_sym_iter, so no extra allocation or
argument is needed.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>