perf tools changes for v6.1: 2nd batch

- Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.
 
 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.
 
   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.
 
 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.
 
 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.
 
 - Sync the msr-index.h copy with the kernel sources, adding support
   for using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as
   well as decoding it when printing the MSR tracepoint arguments.
 
 - Fix program header size and alignment when generating a JIT ELF
   in 'perf inject'.
 
 - Add multiple new Intel PT 'perf test' entries, including a jitdump one.
 
 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.
 
 - Fix the 'perf test' for arm_coresight failures on the ARM Juno system.
 
 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this option
   to the or expression expected in the intercepted perf_event_open() syscall.
 
 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the 'perf
   annotate' asm parser.
 
 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus being
   ignored, requiring using '-- -C CPUs' as a workaround.
 
 - Improvements and tidy ups for 'perf test' shell infra.
 
 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY0vyXAAKCRCyPKLppCJ+
 J3EgAQDgr9FhTCTG+u46iGqPG4lxc46ZWKB3MgZwPuX6P2jwLwD9GCwGow4qHQVP
 F/m7S/3tK/ShPfPWB2m4nVHd9xp7uwM=
 =F1IB
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.

 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.

   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.

 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.

 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.

 - Sync the msr-index.h copy with the kernel sources, adding support for
   using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
   as decoding it when printing the MSR tracepoint arguments.

 - Fix program header size and alignment when generating a JIT ELF in
   'perf inject'.

 - Add multiple new Intel PT 'perf test' entries, including a jitdump
   one.

 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.

 - Fix the 'perf test' for arm_coresight failures on the ARM Juno
   system.

 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
   option to the or expression expected in the intercepted
   perf_event_open() syscall.

 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
   'perf annotate' asm parser.

 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus
   being ignored, requiring using '-- -C CPUs' as a workaround.

 - Improvements and tidy ups for 'perf test' shell infra.

 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.

* tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
  perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
  perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
  perf tests stat+json_output: Include sanity check for topology
  perf tests stat+csv_output: Include sanity check for topology
  perf intel-pt: Fix system_wide dummy event for hybrid
  perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
  perf test: Fix attr tests for PERF_FORMAT_LOST
  perf test: test_intel_pt.sh: Add 9 tests
  perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
  perf test: test_intel_pt.sh: Add jitdump test
  perf test: test_intel_pt.sh: Tidy some alignment
  perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
  perf test: test_intel_pt.sh: Tidy some perf record options
  perf test: test_intel_pt.sh: Fix return checking again
  perf: Skip and warn on unknown format 'configN' attrs
  perf list: Fix metricgroups title message
  perf mem: Fix -C option behavior for perf mem record
  perf annotate: Add missing condition flags for arm64
  ...
This commit is contained in:
Linus Torvalds 2022-10-16 15:14:29 -07:00
commit 8636df94ec
36 changed files with 1265 additions and 71 deletions

View File

@ -155,6 +155,11 @@
* Return Stack Buffer Predictions.
*/
#define ARCH_CAP_XAPIC_DISABLE BIT(21) /*
* IA32_XAPIC_DISABLE_STATUS MSR
* supported
*/
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
* Writeback and invalidate the
@ -585,6 +590,9 @@
#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL 0xc0000301
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR 0xc0000302
/* AMD Last Branch Record MSRs */
#define MSR_AMD64_LBR_SELECT 0xc000010e
/* Fam 17h MSRs */
#define MSR_F17H_IRPERF 0xc00000e9
@ -756,6 +764,8 @@
#define MSR_AMD_DBG_EXTN_CFG 0xc000010f
#define MSR_AMD_SAMP_BR_FROM 0xc0010300
#define DBG_EXTN_CFG_LBRV2EN BIT_ULL(6)
#define MSR_IA32_MPERF 0x000000e7
#define MSR_IA32_APERF 0x000000e8
@ -1054,4 +1064,12 @@
#define MSR_IA32_HW_FEEDBACK_PTR 0x17d0
#define MSR_IA32_HW_FEEDBACK_CONFIG 0x17d1
/* x2APIC locked status */
#define MSR_IA32_XAPIC_DISABLE_STATUS 0xBD
#define LEGACY_XAPIC_DISABLED BIT(0) /*
* x2APIC mode is locked and
* disabling x2APIC will cause
* a #GP
*/
#endif /* _ASM_X86_MSR_INDEX_H */

View File

@ -6,7 +6,6 @@
#include <linux/types.h>
#include <linux/limits.h>
#include <linux/bpf.h>
#include <linux/compiler.h>
#include <sys/types.h> /* pid_t */
#define event_contains(obj, mem) ((obj).header.size > offsetof(typeof(obj), mem))
@ -207,7 +206,7 @@ struct perf_record_range_cpu_map {
__u16 end_cpu;
};
struct __packed perf_record_cpu_map_data {
struct perf_record_cpu_map_data {
__u16 type;
union {
/* Used when type == PERF_CPU_MAP__CPUS. */
@ -219,7 +218,7 @@ struct __packed perf_record_cpu_map_data {
/* Used when type == PERF_CPU_MAP__RANGE_CPUS. */
struct perf_record_range_cpu_map range_cpu_data;
};
};
} __attribute__((packed));
#pragma GCC diagnostic pop

View File

@ -4,9 +4,11 @@
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*/
#include <dirent.h>
#include <stdbool.h>
#include <linux/coresight-pmu.h>
#include <linux/zalloc.h>
#include <api/fs/fs.h>
#include "../../../util/auxtrace.h"
#include "../../../util/debug.h"
@ -14,6 +16,7 @@
#include "../../../util/pmu.h"
#include "cs-etm.h"
#include "arm-spe.h"
#include "hisi-ptt.h"
static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
{
@ -50,42 +53,114 @@ static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
return arm_spe_pmus;
}
static struct perf_pmu **find_all_hisi_ptt_pmus(int *nr_ptts, int *err)
{
const char *sysfs = sysfs__mountpoint();
struct perf_pmu **hisi_ptt_pmus = NULL;
struct dirent *dent;
char path[PATH_MAX];
DIR *dir = NULL;
int idx = 0;
snprintf(path, PATH_MAX, "%s" EVENT_SOURCE_DEVICE_PATH, sysfs);
dir = opendir(path);
if (!dir) {
pr_err("can't read directory '%s'\n", EVENT_SOURCE_DEVICE_PATH);
*err = -EINVAL;
return NULL;
}
while ((dent = readdir(dir))) {
if (strstr(dent->d_name, HISI_PTT_PMU_NAME))
(*nr_ptts)++;
}
if (!(*nr_ptts))
goto out;
hisi_ptt_pmus = zalloc(sizeof(struct perf_pmu *) * (*nr_ptts));
if (!hisi_ptt_pmus) {
pr_err("hisi_ptt alloc failed\n");
*err = -ENOMEM;
goto out;
}
rewinddir(dir);
while ((dent = readdir(dir))) {
if (strstr(dent->d_name, HISI_PTT_PMU_NAME) && idx < *nr_ptts) {
hisi_ptt_pmus[idx] = perf_pmu__find(dent->d_name);
if (hisi_ptt_pmus[idx])
idx++;
}
}
out:
closedir(dir);
return hisi_ptt_pmus;
}
static struct perf_pmu *find_pmu_for_event(struct perf_pmu **pmus,
int pmu_nr, struct evsel *evsel)
{
int i;
if (!pmus)
return NULL;
for (i = 0; i < pmu_nr; i++) {
if (evsel->core.attr.type == pmus[i]->type)
return pmus[i];
}
return NULL;
}
struct auxtrace_record
*auxtrace_record__init(struct evlist *evlist, int *err)
{
struct perf_pmu *cs_etm_pmu;
struct evsel *evsel;
bool found_etm = false;
struct perf_pmu *found_spe = NULL;
struct perf_pmu *cs_etm_pmu = NULL;
struct perf_pmu **arm_spe_pmus = NULL;
struct perf_pmu **hisi_ptt_pmus = NULL;
struct evsel *evsel;
struct perf_pmu *found_etm = NULL;
struct perf_pmu *found_spe = NULL;
struct perf_pmu *found_ptt = NULL;
int auxtrace_event_cnt = 0;
int nr_spes = 0;
int i = 0;
int nr_ptts = 0;
if (!evlist)
return NULL;
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
hisi_ptt_pmus = find_all_hisi_ptt_pmus(&nr_ptts, err);
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu &&
evsel->core.attr.type == cs_etm_pmu->type)
found_etm = true;
if (cs_etm_pmu && !found_etm)
found_etm = find_pmu_for_event(&cs_etm_pmu, 1, evsel);
if (!nr_spes || found_spe)
continue;
if (arm_spe_pmus && !found_spe)
found_spe = find_pmu_for_event(arm_spe_pmus, nr_spes, evsel);
for (i = 0; i < nr_spes; i++) {
if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
found_spe = arm_spe_pmus[i];
break;
}
}
if (hisi_ptt_pmus && !found_ptt)
found_ptt = find_pmu_for_event(hisi_ptt_pmus, nr_ptts, evsel);
}
free(arm_spe_pmus);
if (found_etm && found_spe) {
pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
free(arm_spe_pmus);
free(hisi_ptt_pmus);
if (found_etm)
auxtrace_event_cnt++;
if (found_spe)
auxtrace_event_cnt++;
if (found_ptt)
auxtrace_event_cnt++;
if (auxtrace_event_cnt > 1) {
pr_err("Concurrent AUX trace operation not currently supported\n");
*err = -EOPNOTSUPP;
return NULL;
}
@ -96,6 +171,9 @@ struct auxtrace_record
#if defined(__aarch64__)
if (found_spe)
return arm_spe_recording_init(err, found_spe);
if (found_ptt)
return hisi_ptt_recording_init(err, found_ptt);
#endif
/*

View File

@ -10,6 +10,7 @@
#include <linux/string.h>
#include "arm-spe.h"
#include "hisi-ptt.h"
#include "../../../util/pmu.h"
struct perf_event_attr
@ -22,6 +23,8 @@ struct perf_event_attr
#if defined(__aarch64__)
} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
return arm_spe_pmu_default_config(pmu);
} else if (strstarts(pmu->name, HISI_PTT_PMU_NAME)) {
pmu->selectable = true;
#endif
}

View File

@ -102,7 +102,7 @@ static int arm64__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
if (err)
goto out_free_arm;
/* b, b.cond, br, cbz/cbnz, tbz/tbnz */
err = regcomp(&arm->jump_insn, "^[ct]?br?\\.?(cc|cs|eq|ge|gt|hi|le|ls|lt|mi|ne|pl)?n?z?$",
err = regcomp(&arm->jump_insn, "^[ct]?br?\\.?(cc|cs|eq|ge|gt|hi|hs|le|lo|ls|lt|mi|ne|pl|vc|vs)?n?z?$",
REG_EXTENDED);
if (err)
goto out_free_call;

View File

@ -11,4 +11,4 @@ perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
perf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
../../arm/util/auxtrace.o \
../../arm/util/cs-etm.o \
arm-spe.o mem-events.o
arm-spe.o mem-events.o hisi-ptt.o

View File

@ -0,0 +1,188 @@
// SPDX-License-Identifier: GPL-2.0
/*
* HiSilicon PCIe Trace and Tuning (PTT) support
* Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
*/
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/bitops.h>
#include <linux/log2.h>
#include <linux/zalloc.h>
#include <time.h>
#include <internal/lib.h> // page_size
#include "../../../util/auxtrace.h"
#include "../../../util/cpumap.h"
#include "../../../util/debug.h"
#include "../../../util/event.h"
#include "../../../util/evlist.h"
#include "../../../util/evsel.h"
#include "../../../util/hisi-ptt.h"
#include "../../../util/pmu.h"
#include "../../../util/record.h"
#include "../../../util/session.h"
#include "../../../util/tsc.h"
#define KiB(x) ((x) * 1024)
#define MiB(x) ((x) * 1024 * 1024)
struct hisi_ptt_recording {
struct auxtrace_record itr;
struct perf_pmu *hisi_ptt_pmu;
struct evlist *evlist;
};
static size_t
hisi_ptt_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct evlist *evlist __maybe_unused)
{
return HISI_PTT_AUXTRACE_PRIV_SIZE;
}
static int hisi_ptt_info_fill(struct auxtrace_record *itr,
struct perf_session *session,
struct perf_record_auxtrace_info *auxtrace_info,
size_t priv_size)
{
struct hisi_ptt_recording *pttr =
container_of(itr, struct hisi_ptt_recording, itr);
struct perf_pmu *hisi_ptt_pmu = pttr->hisi_ptt_pmu;
if (priv_size != HISI_PTT_AUXTRACE_PRIV_SIZE)
return -EINVAL;
if (!session->evlist->core.nr_mmaps)
return -EINVAL;
auxtrace_info->type = PERF_AUXTRACE_HISI_PTT;
auxtrace_info->priv[0] = hisi_ptt_pmu->type;
return 0;
}
static int hisi_ptt_set_auxtrace_mmap_page(struct record_opts *opts)
{
bool privileged = perf_event_paranoid_check(-1);
if (!opts->full_auxtrace)
return 0;
if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
if (privileged) {
opts->auxtrace_mmap_pages = MiB(16) / page_size;
} else {
opts->auxtrace_mmap_pages = KiB(128) / page_size;
if (opts->mmap_pages == UINT_MAX)
opts->mmap_pages = KiB(256) / page_size;
}
}
/* Validate auxtrace_mmap_pages */
if (opts->auxtrace_mmap_pages) {
size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
size_t min_sz = KiB(8);
if (sz < min_sz || !is_power_of_2(sz)) {
pr_err("Invalid mmap size for HISI PTT: must be at least %zuKiB and a power of 2\n",
min_sz / 1024);
return -EINVAL;
}
}
return 0;
}
static int hisi_ptt_recording_options(struct auxtrace_record *itr,
struct evlist *evlist,
struct record_opts *opts)
{
struct hisi_ptt_recording *pttr =
container_of(itr, struct hisi_ptt_recording, itr);
struct perf_pmu *hisi_ptt_pmu = pttr->hisi_ptt_pmu;
struct evsel *evsel, *hisi_ptt_evsel = NULL;
struct evsel *tracking_evsel;
int err;
pttr->evlist = evlist;
evlist__for_each_entry(evlist, evsel) {
if (evsel->core.attr.type == hisi_ptt_pmu->type) {
if (hisi_ptt_evsel) {
pr_err("There may be only one " HISI_PTT_PMU_NAME "x event\n");
return -EINVAL;
}
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = 1;
evsel->needs_auxtrace_mmap = true;
hisi_ptt_evsel = evsel;
opts->full_auxtrace = true;
}
}
err = hisi_ptt_set_auxtrace_mmap_page(opts);
if (err)
return err;
/*
* To obtain the auxtrace buffer file descriptor, the auxtrace event
* must come first.
*/
evlist__to_front(evlist, hisi_ptt_evsel);
evsel__set_sample_bit(hisi_ptt_evsel, TIME);
/* Add dummy event to keep tracking */
err = parse_event(evlist, "dummy:u");
if (err)
return err;
tracking_evsel = evlist__last(evlist);
evlist__set_tracking_event(evlist, tracking_evsel);
tracking_evsel->core.attr.freq = 0;
tracking_evsel->core.attr.sample_period = 1;
evsel__set_sample_bit(tracking_evsel, TIME);
return 0;
}
static u64 hisi_ptt_reference(struct auxtrace_record *itr __maybe_unused)
{
return rdtsc();
}
static void hisi_ptt_recording_free(struct auxtrace_record *itr)
{
struct hisi_ptt_recording *pttr =
container_of(itr, struct hisi_ptt_recording, itr);
free(pttr);
}
struct auxtrace_record *hisi_ptt_recording_init(int *err,
struct perf_pmu *hisi_ptt_pmu)
{
struct hisi_ptt_recording *pttr;
if (!hisi_ptt_pmu) {
*err = -ENODEV;
return NULL;
}
pttr = zalloc(sizeof(*pttr));
if (!pttr) {
*err = -ENOMEM;
return NULL;
}
pttr->hisi_ptt_pmu = hisi_ptt_pmu;
pttr->itr.pmu = hisi_ptt_pmu;
pttr->itr.recording_options = hisi_ptt_recording_options;
pttr->itr.info_priv_size = hisi_ptt_info_priv_size;
pttr->itr.info_fill = hisi_ptt_info_fill;
pttr->itr.free = hisi_ptt_recording_free;
pttr->itr.reference = hisi_ptt_reference;
pttr->itr.read_finish = auxtrace_record__read_finish;
pttr->itr.alignment = 0;
*err = 0;
return &pttr->itr;
}

View File

@ -866,7 +866,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
* User space tasks can migrate between CPUs, so when tracing
* selected CPUs, sideband for all CPUs is still needed.
*/
need_system_wide_tracking = evlist->core.has_user_cpus &&
need_system_wide_tracking = opts->target.cpu_list &&
!intel_pt_evsel->core.attr.exclude_user;
tracking_evsel = evlist__add_aux_dummy(evlist, need_system_wide_tracking);

View File

@ -60,7 +60,7 @@ int cmd_list(int argc, const char **argv)
setup_pager();
if (!raw_dump && pager_in_use())
printf("\nList of pre-defined events (to be used in -e):\n\n");
printf("\nList of pre-defined events (to be used in -e or -M):\n\n");
if (hybrid_type) {
pmu_name = perf_pmu__hybrid_type_to_pmu(hybrid_type);

View File

@ -97,6 +97,9 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
else
rec_argc = argc + 9 * perf_pmu__hybrid_pmu_num();
if (mem->cpu_list)
rec_argc += 2;
rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (!rec_argv)
return -1;
@ -159,6 +162,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (all_kernel)
rec_argv[i++] = "--all-kernel";
if (mem->cpu_list) {
rec_argv[i++] = "-C";
rec_argv[i++] = mem->cpu_list;
}
for (j = 0; j < argc; j++, i++)
rec_argv[i] = argv[j];

View File

@ -9,7 +9,7 @@ size=128
config=0
sample_period=*
sample_type=263
read_format=0|4
read_format=0|4|20
disabled=1
inherit=1
pinned=0

View File

@ -11,7 +11,7 @@ size=128
config=9
sample_period=4000
sample_type=455
read_format=4
read_format=4|20
# Event will be enabled right away.
disabled=0
inherit=1

View File

@ -7,14 +7,14 @@ ret = 1
fd=1
group_fd=-1
sample_type=327
read_format=4
read_format=4|20
[event-2:base-record]
fd=2
group_fd=1
config=1
sample_type=327
read_format=4
read_format=4|20
mmap=0
comm=0
task=0

View File

@ -7,7 +7,7 @@ ret = 1
fd=1
group_fd=-1
sample_type=343
read_format=12
read_format=12|28
inherit=0
[event-2:base-record]
@ -21,8 +21,8 @@ config=3
# default | PERF_SAMPLE_READ
sample_type=343
# PERF_FORMAT_ID | PERF_FORMAT_GROUP
read_format=12
# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST
read_format=12|28
task=0
mmap=0
comm=0

View File

@ -7,7 +7,7 @@ ret = 1
fd=1
group_fd=-1
sample_type=327
read_format=4
read_format=4|20
[event-2:base-record]
fd=2
@ -15,7 +15,7 @@ group_fd=1
type=0
config=1
sample_type=327
read_format=4
read_format=4|20
mmap=0
comm=0
task=0

View File

@ -9,7 +9,7 @@ group_fd=-1
config=0|1
sample_period=1234000
sample_type=87
read_format=12
read_format=12|28
inherit=0
freq=0
@ -19,7 +19,7 @@ group_fd=1
config=0|1
sample_period=6789000
sample_type=87
read_format=12
read_format=12|28
disabled=0
inherit=0
mmap=0

View File

@ -6,6 +6,8 @@
set -e
skip_test=0
function commachecker()
{
local -i cnt=0
@ -156,14 +158,47 @@ check_per_socket()
echo "[Success]"
}
# The perf stat options for per-socket, per-core, per-die
# and -A ( no_aggr mode ) uses the info fetched from this
# directory: "/sys/devices/system/cpu/cpu*/topology". For
# example, socket value is fetched from "physical_package_id"
# file in topology directory.
# Reference: cpu__get_topology_int in util/cpumap.c
# If the platform doesn't expose topology information, values
# will be set to -1. For example, incase of pSeries platform
# of powerpc, value for "physical_package_id" is restricted
# and set to -1. Check here validates the socket-id read from
# topology file before proceeding further
FILE_LOC="/sys/devices/system/cpu/cpu*/topology/"
FILE_NAME="physical_package_id"
check_for_topology()
{
if ! ParanoidAndNotRoot 0
then
socket_file=`ls $FILE_LOC/$FILE_NAME | head -n 1`
[ -z $socket_file ] && return 0
socket_id=`cat $socket_file`
[ $socket_id == -1 ] && skip_test=1
return 0
fi
}
check_for_topology
check_no_args
check_system_wide
check_system_wide_no_aggr
check_interval
check_event
check_per_core
check_per_thread
check_per_die
check_per_node
check_per_socket
if [ $skip_test -ne 1 ]
then
check_system_wide_no_aggr
check_per_core
check_per_die
check_per_socket
else
echo "[Skip] Skipping tests for system_wide_no_aggr, per_core, per_die and per_socket since socket id exposed via topology is invalid"
fi
exit 0

View File

@ -6,6 +6,8 @@
set -e
skip_test=0
pythonchecker=$(dirname $0)/lib/perf_json_output_lint.py
if [ "x$PYTHON" == "x" ]
then
@ -134,14 +136,47 @@ check_per_socket()
echo "[Success]"
}
# The perf stat options for per-socket, per-core, per-die
# and -A ( no_aggr mode ) uses the info fetched from this
# directory: "/sys/devices/system/cpu/cpu*/topology". For
# example, socket value is fetched from "physical_package_id"
# file in topology directory.
# Reference: cpu__get_topology_int in util/cpumap.c
# If the platform doesn't expose topology information, values
# will be set to -1. For example, incase of pSeries platform
# of powerpc, value for "physical_package_id" is restricted
# and set to -1. Check here validates the socket-id read from
# topology file before proceeding further
FILE_LOC="/sys/devices/system/cpu/cpu*/topology/"
FILE_NAME="physical_package_id"
check_for_topology()
{
if ! ParanoidAndNotRoot 0
then
socket_file=`ls $FILE_LOC/$FILE_NAME | head -n 1`
[ -z $socket_file ] && return 0
socket_id=`cat $socket_file`
[ $socket_id == -1 ] && skip_test=1
return 0
fi
}
check_for_topology
check_no_args
check_system_wide
check_system_wide_no_aggr
check_interval
check_event
check_per_core
check_per_thread
check_per_die
check_per_node
check_per_socket
if [ $skip_test -ne 1 ]
then
check_system_wide_no_aggr
check_per_core
check_per_die
check_per_socket
else
echo "[Skip] Skipping tests for system_wide_no_aggr, per_core, per_die and per_socket since socket id exposed via topology is invalid"
fi
exit 0

View File

@ -70,7 +70,7 @@ perf_report_instruction_samples() {
# 68.12% touch libc-2.27.so [.] _dl_addr
# 5.80% touch libc-2.27.so [.] getenv
# 4.35% touch ld-2.27.so [.] _dl_fixup
perf report --itrace=i1000i --stdio -i ${perfdata} 2>&1 | \
perf report --itrace=i20i --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +$1" > /dev/null 2>&1
}

View File

@ -22,6 +22,8 @@ outfile="${temp_dir}/test-out.txt"
errfile="${temp_dir}/test-err.txt"
workload="${temp_dir}/workload"
awkscript="${temp_dir}/awkscript"
jitdump_workload="${temp_dir}/jitdump_workload"
maxbrstack="${temp_dir}/maxbrstack.py"
cleanup()
{
@ -42,6 +44,21 @@ trap_cleanup()
trap trap_cleanup EXIT TERM INT
# perf record for testing without decoding
perf_record_no_decode()
{
# Options to speed up recording: no post-processing, no build-id cache update,
# and no BPF events.
perf record -B -N --no-bpf-event "$@"
}
# perf record for testing should not need BPF events
perf_record_no_bpf()
{
# Options for no BPF events
perf record --no-bpf-event "$@"
}
have_workload=false
cat << _end_of_file_ | /usr/bin/cc -o "${workload}" -xc - -pthread && have_workload=true
#include <time.h>
@ -76,7 +93,7 @@ _end_of_file_
can_cpu_wide()
{
echo "Checking for CPU-wide recording on CPU $1"
if ! perf record -o "${tmpfile}" -B -N --no-bpf-event -e dummy:u -C "$1" true >/dev/null 2>&1 ; then
if ! perf_record_no_decode -o "${tmpfile}" -e dummy:u -C "$1" true >/dev/null 2>&1 ; then
echo "No so skipping"
return 2
fi
@ -93,7 +110,7 @@ test_system_wide_side_band()
can_cpu_wide 1 || return $?
# Record on CPU 0 a task running on CPU 1
perf record -B -N --no-bpf-event -o "${perfdatafile}" -e intel_pt//u -C 0 -- taskset --cpu-list 1 uname
perf_record_no_decode -o "${perfdatafile}" -e intel_pt//u -C 0 -- taskset --cpu-list 1 uname
# Should get MMAP events from CPU 1 because they can be needed to decode
mmap_cnt=$(perf script -i "${perfdatafile}" --no-itrace --show-mmap-events -C 1 2>/dev/null | grep -c MMAP)
@ -109,7 +126,14 @@ test_system_wide_side_band()
can_kernel()
{
perf record -o "${tmpfile}" -B -N --no-bpf-event -e dummy:k true >/dev/null 2>&1 || return 2
if [ -z "${can_kernel_trace}" ] ; then
can_kernel_trace=0
perf_record_no_decode -o "${tmpfile}" -e dummy:k true >/dev/null 2>&1 && can_kernel_trace=1
fi
if [ ${can_kernel_trace} -eq 0 ] ; then
echo "SKIP: no kernel tracing"
return 2
fi
return 0
}
@ -235,7 +259,7 @@ test_per_thread()
wait_for_threads ${w1} 2
wait_for_threads ${w2} 2
perf record -B -N --no-bpf-event -o "${perfdatafile}" -e intel_pt//u"${k}" -vvv --per-thread -p "${w1},${w2}" 2>"${errfile}" >"${outfile}" &
perf_record_no_decode -o "${perfdatafile}" -e intel_pt//u"${k}" -vvv --per-thread -p "${w1},${w2}" 2>"${errfile}" >"${outfile}" &
ppid=$!
echo "perf PID is $ppid"
wait_for_perf_to_start ${ppid} "${errfile}" || return 1
@ -254,6 +278,342 @@ test_per_thread()
return 0
}
test_jitdump()
{
echo "--- Test tracing self-modifying code that uses jitdump ---"
script_path=$(realpath "$0")
script_dir=$(dirname "$script_path")
jitdump_incl_dir="${script_dir}/../../util"
jitdump_h="${jitdump_incl_dir}/jitdump.h"
if [ ! -e "${jitdump_h}" ] ; then
echo "SKIP: Include file jitdump.h not found"
return 2
fi
if [ -z "${have_jitdump_workload}" ] ; then
have_jitdump_workload=false
# Create a workload that uses self-modifying code and generates its own jitdump file
cat <<- "_end_of_file_" | /usr/bin/cc -o "${jitdump_workload}" -I "${jitdump_incl_dir}" -xc - -pthread && have_jitdump_workload=true
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include "jitdump.h"
#define CHK_BYTE 0x5a
static inline uint64_t rdtsc(void)
{
unsigned int low, high;
asm volatile("rdtsc" : "=a" (low), "=d" (high));
return low | ((uint64_t)high) << 32;
}
static FILE *open_jitdump(void)
{
struct jitheader header = {
.magic = JITHEADER_MAGIC,
.version = JITHEADER_VERSION,
.total_size = sizeof(header),
.pid = getpid(),
.timestamp = rdtsc(),
.flags = JITDUMP_FLAGS_ARCH_TIMESTAMP,
};
char filename[256];
FILE *f;
void *m;
snprintf(filename, sizeof(filename), "jit-%d.dump", getpid());
f = fopen(filename, "w+");
if (!f)
goto err;
/* Create an MMAP event for the jitdump file. That is how perf tool finds it. */
m = mmap(0, 4096, PROT_READ | PROT_EXEC, MAP_PRIVATE, fileno(f), 0);
if (m == MAP_FAILED)
goto err_close;
munmap(m, 4096);
if (fwrite(&header,sizeof(header),1,f) != 1)
goto err_close;
return f;
err_close:
fclose(f);
err:
return NULL;
}
static int write_jitdump(FILE *f, void *addr, const uint8_t *dat, size_t sz, uint64_t *idx)
{
struct jr_code_load rec = {
.p.id = JIT_CODE_LOAD,
.p.total_size = sizeof(rec) + sz,
.p.timestamp = rdtsc(),
.pid = getpid(),
.tid = gettid(),
.vma = (unsigned long)addr,
.code_addr = (unsigned long)addr,
.code_size = sz,
.code_index = ++*idx,
};
if (fwrite(&rec,sizeof(rec),1,f) != 1 ||
fwrite(dat, sz, 1, f) != 1)
return -1;
return 0;
}
static void close_jitdump(FILE *f)
{
fclose(f);
}
int main()
{
/* Get a memory page to store executable code */
void *addr = mmap(0, 4096, PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
/* Code to execute: mov CHK_BYTE, %eax ; ret */
uint8_t dat[] = {0xb8, CHK_BYTE, 0x00, 0x00, 0x00, 0xc3};
FILE *f = open_jitdump();
uint64_t idx = 0;
int ret = 1;
if (!f)
return 1;
/* Copy executable code to executable memory page */
memcpy(addr, dat, sizeof(dat));
/* Record it in the jitdump file */
if (write_jitdump(f, addr, dat, sizeof(dat), &idx))
goto out_close;
/* Call it */
ret = ((int (*)(void))addr)() - CHK_BYTE;
out_close:
close_jitdump(f);
return ret;
}
_end_of_file_
fi
if ! $have_jitdump_workload ; then
echo "SKIP: No jitdump workload"
return 2
fi
# Change to temp_dir so jitdump collateral files go there
cd "${temp_dir}"
perf_record_no_bpf -o "${tmpfile}" -e intel_pt//u "${jitdump_workload}"
perf inject -i "${tmpfile}" -o "${perfdatafile}" --jit
decode_br_cnt=$(perf script -i "${perfdatafile}" --itrace=b | wc -l)
# Note that overflow and lost errors are suppressed for the error count
decode_err_cnt=$(perf script -i "${perfdatafile}" --itrace=e-o-l | grep -ci error)
cd -
# Should be thousands of branches
if [ "${decode_br_cnt}" -lt 1000 ] ; then
echo "Decode failed, only ${decode_br_cnt} branches"
return 1
fi
# Should be no errors
if [ "${decode_err_cnt}" -ne 0 ] ; then
echo "Decode failed, ${decode_err_cnt} errors"
perf script -i "${perfdatafile}" --itrace=e-o-l --show-mmap-events | cat
return 1
fi
echo OK
return 0
}
test_packet_filter()
{
echo "--- Test with MTC and TSC disabled ---"
# Disable MTC and TSC
perf_record_no_decode -o "${perfdatafile}" -e intel_pt/mtc=0,tsc=0/u uname
# Should not get MTC packet
mtc_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "MTC 0x")
if [ "${mtc_cnt}" -ne 0 ] ; then
echo "Failed to filter with mtc=0"
return 1
fi
# Should not get TSC package
tsc_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "TSC 0x")
if [ "${tsc_cnt}" -ne 0 ] ; then
echo "Failed to filter with tsc=0"
return 1
fi
echo OK
return 0
}
test_disable_branch()
{
echo "--- Test with branches disabled ---"
# Disable branch
perf_record_no_decode -o "${perfdatafile}" -e intel_pt/branch=0/u uname
# Should not get branch related packets
tnt_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "TNT 0x")
tip_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "TIP 0x")
fup_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "FUP 0x")
if [ "${tnt_cnt}" -ne 0 ] || [ "${tip_cnt}" -ne 0 ] || [ "${fup_cnt}" -ne 0 ] ; then
echo "Failed to disable branches"
return 1
fi
echo OK
return 0
}
test_time_cyc()
{
echo "--- Test with/without CYC ---"
# Check if CYC is supported
cyc=$(cat /sys/bus/event_source/devices/intel_pt/caps/psb_cyc)
if [ "${cyc}" != "1" ] ; then
echo "SKIP: CYC is not supported"
return 2
fi
# Enable CYC
perf_record_no_decode -o "${perfdatafile}" -e intel_pt/cyc/u uname
# should get CYC packets
cyc_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "CYC 0x")
if [ "${cyc_cnt}" = "0" ] ; then
echo "Failed to get CYC packet"
return 1
fi
# Without CYC
perf_record_no_decode -o "${perfdatafile}" -e intel_pt//u uname
# Should not get CYC packets
cyc_cnt=$(perf script -i "${perfdatafile}" -D 2>/dev/null | grep -c "CYC 0x")
if [ "${cyc_cnt}" -gt 0 ] ; then
echo "Still get CYC packet without cyc"
return 1
fi
echo OK
return 0
}
test_sample()
{
echo "--- Test recording with sample mode ---"
# Check if recording with sample mode is working
if ! perf_record_no_decode -o "${perfdatafile}" --aux-sample=8192 -e '{intel_pt//u,branch-misses:u}' uname ; then
echo "perf record failed with --aux-sample"
return 1
fi
echo OK
return 0
}
test_kernel_trace()
{
echo "--- Test with kernel trace ---"
# Check if recording with kernel trace is working
can_kernel || return 2
if ! perf_record_no_decode -o "${perfdatafile}" -e intel_pt//k -m1,128 uname ; then
echo "perf record failed with intel_pt//k"
return 1
fi
echo OK
return 0
}
test_virtual_lbr()
{
echo "--- Test virtual LBR ---"
# Python script to determine the maximum size of branch stacks
cat << "_end_of_file_" > "${maxbrstack}"
from __future__ import print_function
bmax = 0
def process_event(param_dict):
if "brstack" in param_dict:
brstack = param_dict["brstack"]
n = len(brstack)
global bmax
if n > bmax:
bmax = n
def trace_end():
print("max brstack", bmax)
_end_of_file_
# Check if virtual lbr is working
perf_record_no_bpf -o "${perfdatafile}" --aux-sample -e '{intel_pt//,cycles}:u' uname
times_val=$(perf script -i "${perfdatafile}" --itrace=L -s "${maxbrstack}" 2>/dev/null | grep "max brstack " | cut -d " " -f 3)
case "${times_val}" in
[0-9]*) ;;
*) times_val=0;;
esac
if [ "${times_val}" -lt 2 ] ; then
echo "Failed with virtual lbr"
return 1
fi
echo OK
return 0
}
test_power_event()
{
echo "--- Test power events ---"
# Check if power events are supported
power_event=$(cat /sys/bus/event_source/devices/intel_pt/caps/power_event_trace)
if [ "${power_event}" != "1" ] ; then
echo "SKIP: power_event_trace is not supported"
return 2
fi
if ! perf_record_no_decode -o "${perfdatafile}" -a -e intel_pt/pwr_evt/u uname ; then
echo "perf record failed with pwr_evt"
return 1
fi
echo OK
return 0
}
test_no_tnt()
{
echo "--- Test with TNT packets disabled ---"
# Check if TNT disable is supported
notnt=$(cat /sys/bus/event_source/devices/intel_pt/caps/tnt_disable)
if [ "${notnt}" != "1" ] ; then
echo "SKIP: tnt_disable is not supported"
return 2
fi
perf_record_no_decode -o "${perfdatafile}" -e intel_pt/notnt/u uname
# Should be no TNT packets
tnt_cnt=$(perf script -i "${perfdatafile}" -D | grep -c TNT)
if [ "${tnt_cnt}" -ne 0 ] ; then
echo "TNT packets still there after notnt"
return 1
fi
echo OK
return 0
}
test_event_trace()
{
echo "--- Test with event_trace ---"
# Check if event_trace is supported
event_trace=$(cat /sys/bus/event_source/devices/intel_pt/caps/event_trace)
if [ "${event_trace}" != 1 ] ; then
echo "SKIP: event_trace is not supported"
return 2
fi
if ! perf_record_no_decode -o "${perfdatafile}" -e intel_pt/event/u uname ; then
echo "perf record failed with event trace"
return 1
fi
echo OK
return 0
}
count_result()
{
if [ "$1" -eq 2 ] ; then
@ -265,13 +625,22 @@ count_result()
return
fi
err_cnt=$((err_cnt + 1))
ret=0
}
ret=0
test_system_wide_side_band || ret=$? ; count_result $ret
test_per_thread "" "" || ret=$? ; count_result $ret
test_per_thread "k" "(incl. kernel) " || ret=$? ; count_result $ret
test_system_wide_side_band || ret=$? ; count_result $ret ; ret=0
test_per_thread "" "" || ret=$? ; count_result $ret ; ret=0
test_per_thread "k" "(incl. kernel) " || ret=$? ; count_result $ret ; ret=0
test_jitdump || ret=$? ; count_result $ret ; ret=0
test_packet_filter || ret=$? ; count_result $ret ; ret=0
test_disable_branch || ret=$? ; count_result $ret ; ret=0
test_time_cyc || ret=$? ; count_result $ret ; ret=0
test_sample || ret=$? ; count_result $ret ; ret=0
test_kernel_trace || ret=$? ; count_result $ret ; ret=0
test_virtual_lbr || ret=$? ; count_result $ret ; ret=0
test_power_event || ret=$? ; count_result $ret ; ret=0
test_no_tnt || ret=$? ; count_result $ret ; ret=0
test_event_trace || ret=$? ; count_result $ret ; ret=0
cleanup

View File

@ -118,6 +118,8 @@ perf-$(CONFIG_AUXTRACE) += intel-pt.o
perf-$(CONFIG_AUXTRACE) += intel-bts.o
perf-$(CONFIG_AUXTRACE) += arm-spe.o
perf-$(CONFIG_AUXTRACE) += arm-spe-decoder/
perf-$(CONFIG_AUXTRACE) += hisi-ptt.o
perf-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/
perf-$(CONFIG_AUXTRACE) += s390-cpumsf.o
ifdef CONFIG_LIBOPENCSD

View File

@ -52,6 +52,7 @@
#include "intel-pt.h"
#include "intel-bts.h"
#include "arm-spe.h"
#include "hisi-ptt.h"
#include "s390-cpumsf.h"
#include "util/mmap.h"
@ -1320,6 +1321,9 @@ int perf_event__process_auxtrace_info(struct perf_session *session,
case PERF_AUXTRACE_S390_CPUMSF:
err = s390_cpumsf_process_auxtrace_info(event, session);
break;
case PERF_AUXTRACE_HISI_PTT:
err = hisi_ptt_process_auxtrace_info(event, session);
break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;

View File

@ -48,6 +48,7 @@ enum auxtrace_type {
PERF_AUXTRACE_CS_ETM,
PERF_AUXTRACE_ARM_SPE,
PERF_AUXTRACE_S390_CPUMSF,
PERF_AUXTRACE_HISI_PTT,
};
enum itrace_period_type {

View File

@ -43,6 +43,18 @@ struct {
__uint(value_size, sizeof(struct bpf_perf_event_value));
} cgrp_readings SEC(".maps");
/* new kernel cgroup definition */
struct cgroup___new {
int level;
struct cgroup *ancestors[];
} __attribute__((preserve_access_index));
/* old kernel cgroup definition */
struct cgroup___old {
int level;
u64 ancestor_ids[];
} __attribute__((preserve_access_index));
const volatile __u32 num_events = 1;
const volatile __u32 num_cpus = 1;
@ -50,6 +62,21 @@ int enabled = 0;
int use_cgroup_v2 = 0;
int perf_subsys_id = -1;
static inline __u64 get_cgroup_v1_ancestor_id(struct cgroup *cgrp, int level)
{
/* recast pointer to capture new type for compiler */
struct cgroup___new *cgrp_new = (void *)cgrp;
if (bpf_core_field_exists(cgrp_new->ancestors)) {
return BPF_CORE_READ(cgrp_new, ancestors[level], kn, id);
} else {
/* recast pointer to capture old type for compiler */
struct cgroup___old *cgrp_old = (void *)cgrp;
return BPF_CORE_READ(cgrp_old, ancestor_ids[level]);
}
}
static inline int get_cgroup_v1_idx(__u32 *cgrps, int size)
{
struct task_struct *p = (void *)bpf_get_current_task();
@ -77,7 +104,7 @@ static inline int get_cgroup_v1_idx(__u32 *cgrps, int size)
break;
// convert cgroup-id to a map index
cgrp_id = BPF_CORE_READ(cgrp, ancestors[i], kn, id);
cgrp_id = get_cgroup_v1_ancestor_id(cgrp, i);
elem = bpf_map_lookup_elem(&cgrp_idx, &cgrp_id);
if (!elem)
continue;

View File

@ -2,6 +2,8 @@
#ifndef __GENELF_H__
#define __GENELF_H__
#include <linux/math.h>
/* genelf.c */
int jit_write_elf(int fd, uint64_t code_addr, const char *sym,
const void *code, int csize, void *debug, int nr_debug_entries,
@ -76,6 +78,6 @@ int jit_add_debug_info(Elf *e, uint64_t code_addr, void *debug, int nr_debug_ent
#endif
/* The .text section is directly after the ELF header */
#define GEN_ELF_TEXT_OFFSET sizeof(Elf_Ehdr)
#define GEN_ELF_TEXT_OFFSET round_up(sizeof(Elf_Ehdr) + sizeof(Elf_Phdr), 16)
#endif

View File

@ -0,0 +1 @@
perf-$(CONFIG_AUXTRACE) += hisi-ptt-pkt-decoder.o

View File

@ -0,0 +1,164 @@
// SPDX-License-Identifier: GPL-2.0
/*
* HiSilicon PCIe Trace and Tuning (PTT) support
* Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <endian.h>
#include <byteswap.h>
#include <linux/bitops.h>
#include <stdarg.h>
#include "../color.h"
#include "hisi-ptt-pkt-decoder.h"
/*
* For 8DW format, the bit[31:11] of DW0 is always 0x1fffff, which can be
* used to distinguish the data format.
* 8DW format is like:
* bits [ 31:11 ][ 10:0 ]
* |---------------------------------------|-------------------|
* DW0 [ 0x1fffff ][ Reserved (0x7ff) ]
* DW1 [ Prefix ]
* DW2 [ Header DW0 ]
* DW3 [ Header DW1 ]
* DW4 [ Header DW2 ]
* DW5 [ Header DW3 ]
* DW6 [ Reserved (0x0) ]
* DW7 [ Time ]
*
* 4DW format is like:
* bits [31:30] [ 29:25 ][24][23][22][21][ 20:11 ][ 10:0 ]
* |-----|---------|---|---|---|---|-------------|-------------|
* DW0 [ Fmt ][ Type ][T9][T8][TH][SO][ Length ][ Time ]
* DW1 [ Header DW1 ]
* DW2 [ Header DW2 ]
* DW3 [ Header DW3 ]
*/
enum hisi_ptt_8dw_pkt_field_type {
HISI_PTT_8DW_CHK_AND_RSV0,
HISI_PTT_8DW_PREFIX,
HISI_PTT_8DW_HEAD0,
HISI_PTT_8DW_HEAD1,
HISI_PTT_8DW_HEAD2,
HISI_PTT_8DW_HEAD3,
HISI_PTT_8DW_RSV1,
HISI_PTT_8DW_TIME,
HISI_PTT_8DW_TYPE_MAX
};
enum hisi_ptt_4dw_pkt_field_type {
HISI_PTT_4DW_HEAD1,
HISI_PTT_4DW_HEAD2,
HISI_PTT_4DW_HEAD3,
HISI_PTT_4DW_TYPE_MAX
};
static const char * const hisi_ptt_8dw_pkt_field_name[] = {
[HISI_PTT_8DW_PREFIX] = "Prefix",
[HISI_PTT_8DW_HEAD0] = "Header DW0",
[HISI_PTT_8DW_HEAD1] = "Header DW1",
[HISI_PTT_8DW_HEAD2] = "Header DW2",
[HISI_PTT_8DW_HEAD3] = "Header DW3",
[HISI_PTT_8DW_TIME] = "Time"
};
static const char * const hisi_ptt_4dw_pkt_field_name[] = {
[HISI_PTT_4DW_HEAD1] = "Header DW1",
[HISI_PTT_4DW_HEAD2] = "Header DW2",
[HISI_PTT_4DW_HEAD3] = "Header DW3",
};
union hisi_ptt_4dw {
struct {
uint32_t format : 2;
uint32_t type : 5;
uint32_t t9 : 1;
uint32_t t8 : 1;
uint32_t th : 1;
uint32_t so : 1;
uint32_t len : 10;
uint32_t time : 11;
};
uint32_t value;
};
static void hisi_ptt_print_pkt(const unsigned char *buf, int pos, const char *desc)
{
const char *color = PERF_COLOR_BLUE;
int i;
printf(".");
color_fprintf(stdout, color, " %08x: ", pos);
for (i = 0; i < HISI_PTT_FIELD_LENTH; i++)
color_fprintf(stdout, color, "%02x ", buf[pos + i]);
for (i = 0; i < HISI_PTT_MAX_SPACE_LEN; i++)
color_fprintf(stdout, color, " ");
color_fprintf(stdout, color, " %s\n", desc);
}
static int hisi_ptt_8dw_kpt_desc(const unsigned char *buf, int pos)
{
int i;
for (i = 0; i < HISI_PTT_8DW_TYPE_MAX; i++) {
/* Do not show 8DW check field and reserved fields */
if (i == HISI_PTT_8DW_CHK_AND_RSV0 || i == HISI_PTT_8DW_RSV1) {
pos += HISI_PTT_FIELD_LENTH;
continue;
}
hisi_ptt_print_pkt(buf, pos, hisi_ptt_8dw_pkt_field_name[i]);
pos += HISI_PTT_FIELD_LENTH;
}
return hisi_ptt_pkt_size[HISI_PTT_8DW_PKT];
}
static void hisi_ptt_4dw_print_dw0(const unsigned char *buf, int pos)
{
const char *color = PERF_COLOR_BLUE;
union hisi_ptt_4dw dw0;
int i;
dw0.value = *(uint32_t *)(buf + pos);
printf(".");
color_fprintf(stdout, color, " %08x: ", pos);
for (i = 0; i < HISI_PTT_FIELD_LENTH; i++)
color_fprintf(stdout, color, "%02x ", buf[pos + i]);
for (i = 0; i < HISI_PTT_MAX_SPACE_LEN; i++)
color_fprintf(stdout, color, " ");
color_fprintf(stdout, color,
" %s %x %s %x %s %x %s %x %s %x %s %x %s %x %s %x\n",
"Format", dw0.format, "Type", dw0.type, "T9", dw0.t9,
"T8", dw0.t8, "TH", dw0.th, "SO", dw0.so, "Length",
dw0.len, "Time", dw0.time);
}
static int hisi_ptt_4dw_kpt_desc(const unsigned char *buf, int pos)
{
int i;
hisi_ptt_4dw_print_dw0(buf, pos);
pos += HISI_PTT_FIELD_LENTH;
for (i = 0; i < HISI_PTT_4DW_TYPE_MAX; i++) {
hisi_ptt_print_pkt(buf, pos, hisi_ptt_4dw_pkt_field_name[i]);
pos += HISI_PTT_FIELD_LENTH;
}
return hisi_ptt_pkt_size[HISI_PTT_4DW_PKT];
}
int hisi_ptt_pkt_desc(const unsigned char *buf, int pos, enum hisi_ptt_pkt_type type)
{
if (type == HISI_PTT_8DW_PKT)
return hisi_ptt_8dw_kpt_desc(buf, pos);
return hisi_ptt_4dw_kpt_desc(buf, pos);
}

View File

@ -0,0 +1,31 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* HiSilicon PCIe Trace and Tuning (PTT) support
* Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
*/
#ifndef INCLUDE__HISI_PTT_PKT_DECODER_H__
#define INCLUDE__HISI_PTT_PKT_DECODER_H__
#include <stddef.h>
#include <stdint.h>
#define HISI_PTT_8DW_CHECK_MASK GENMASK(31, 11)
#define HISI_PTT_IS_8DW_PKT GENMASK(31, 11)
#define HISI_PTT_MAX_SPACE_LEN 10
#define HISI_PTT_FIELD_LENTH 4
enum hisi_ptt_pkt_type {
HISI_PTT_4DW_PKT,
HISI_PTT_8DW_PKT,
HISI_PTT_PKT_MAX
};
static int hisi_ptt_pkt_size[] = {
[HISI_PTT_4DW_PKT] = 16,
[HISI_PTT_8DW_PKT] = 32,
};
int hisi_ptt_pkt_desc(const unsigned char *buf, int pos, enum hisi_ptt_pkt_type type);
#endif

192
tools/perf/util/hisi-ptt.c Normal file
View File

@ -0,0 +1,192 @@
// SPDX-License-Identifier: GPL-2.0
/*
* HiSilicon PCIe Trace and Tuning (PTT) support
* Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
*/
#include <byteswap.h>
#include <endian.h>
#include <errno.h>
#include <inttypes.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
#include <linux/log2.h>
#include <linux/types.h>
#include <linux/zalloc.h>
#include <stdlib.h>
#include <unistd.h>
#include "auxtrace.h"
#include "color.h"
#include "debug.h"
#include "evsel.h"
#include "hisi-ptt.h"
#include "hisi-ptt-decoder/hisi-ptt-pkt-decoder.h"
#include "machine.h"
#include "session.h"
#include "tool.h"
#include <internal/lib.h>
struct hisi_ptt {
struct auxtrace auxtrace;
u32 auxtrace_type;
struct perf_session *session;
struct machine *machine;
u32 pmu_type;
};
struct hisi_ptt_queue {
struct hisi_ptt *ptt;
struct auxtrace_buffer *buffer;
};
static enum hisi_ptt_pkt_type hisi_ptt_check_packet_type(unsigned char *buf)
{
uint32_t head = *(uint32_t *)buf;
if ((HISI_PTT_8DW_CHECK_MASK & head) == HISI_PTT_IS_8DW_PKT)
return HISI_PTT_8DW_PKT;
return HISI_PTT_4DW_PKT;
}
static void hisi_ptt_dump(struct hisi_ptt *ptt __maybe_unused,
unsigned char *buf, size_t len)
{
const char *color = PERF_COLOR_BLUE;
enum hisi_ptt_pkt_type type;
size_t pos = 0;
int pkt_len;
type = hisi_ptt_check_packet_type(buf);
len = round_down(len, hisi_ptt_pkt_size[type]);
color_fprintf(stdout, color, ". ... HISI PTT data: size %zu bytes\n",
len);
while (len > 0) {
pkt_len = hisi_ptt_pkt_desc(buf, pos, type);
if (!pkt_len)
color_fprintf(stdout, color, " Bad packet!\n");
pos += pkt_len;
len -= pkt_len;
}
}
static void hisi_ptt_dump_event(struct hisi_ptt *ptt, unsigned char *buf,
size_t len)
{
printf(".\n");
hisi_ptt_dump(ptt, buf, len);
}
static int hisi_ptt_process_event(struct perf_session *session __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct perf_tool *tool __maybe_unused)
{
return 0;
}
static int hisi_ptt_process_auxtrace_event(struct perf_session *session,
union perf_event *event,
struct perf_tool *tool __maybe_unused)
{
struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
auxtrace);
int fd = perf_data__fd(session->data);
int size = event->auxtrace.size;
void *data = malloc(size);
off_t data_offset;
int err;
if (!data)
return -errno;
if (perf_data__is_pipe(session->data)) {
data_offset = 0;
} else {
data_offset = lseek(fd, 0, SEEK_CUR);
if (data_offset == -1)
return -errno;
}
err = readn(fd, data, size);
if (err != (ssize_t)size) {
free(data);
return -errno;
}
if (dump_trace)
hisi_ptt_dump_event(ptt, data, size);
return 0;
}
static int hisi_ptt_flush(struct perf_session *session __maybe_unused,
struct perf_tool *tool __maybe_unused)
{
return 0;
}
static void hisi_ptt_free_events(struct perf_session *session __maybe_unused)
{
}
static void hisi_ptt_free(struct perf_session *session)
{
struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
auxtrace);
session->auxtrace = NULL;
free(ptt);
}
static bool hisi_ptt_evsel_is_auxtrace(struct perf_session *session,
struct evsel *evsel)
{
struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt, auxtrace);
return evsel->core.attr.type == ptt->pmu_type;
}
static void hisi_ptt_print_info(__u64 type)
{
if (!dump_trace)
return;
fprintf(stdout, " PMU Type %" PRId64 "\n", (s64) type);
}
int hisi_ptt_process_auxtrace_info(union perf_event *event,
struct perf_session *session)
{
struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
struct hisi_ptt *ptt;
if (auxtrace_info->header.size < HISI_PTT_AUXTRACE_PRIV_SIZE +
sizeof(struct perf_record_auxtrace_info))
return -EINVAL;
ptt = zalloc(sizeof(*ptt));
if (!ptt)
return -ENOMEM;
ptt->session = session;
ptt->machine = &session->machines.host; /* No kvm support */
ptt->auxtrace_type = auxtrace_info->type;
ptt->pmu_type = auxtrace_info->priv[0];
ptt->auxtrace.process_event = hisi_ptt_process_event;
ptt->auxtrace.process_auxtrace_event = hisi_ptt_process_auxtrace_event;
ptt->auxtrace.flush_events = hisi_ptt_flush;
ptt->auxtrace.free_events = hisi_ptt_free_events;
ptt->auxtrace.free = hisi_ptt_free;
ptt->auxtrace.evsel_is_auxtrace = hisi_ptt_evsel_is_auxtrace;
session->auxtrace = &ptt->auxtrace;
hisi_ptt_print_info(auxtrace_info->priv[0]);
return 0;
}

View File

@ -0,0 +1,19 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* HiSilicon PCIe Trace and Tuning (PTT) support
* Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
*/
#ifndef INCLUDE__PERF_HISI_PTT_H__
#define INCLUDE__PERF_HISI_PTT_H__
#define HISI_PTT_PMU_NAME "hisi_ptt"
#define HISI_PTT_AUXTRACE_PRIV_SIZE sizeof(u64)
struct auxtrace_record *hisi_ptt_recording_init(int *err,
struct perf_pmu *hisi_ptt_pmu);
int hisi_ptt_process_auxtrace_info(union perf_event *event,
struct perf_session *session);
#endif

View File

@ -4046,6 +4046,7 @@ static const char * const intel_pt_info_fmts[] = {
[INTEL_PT_SNAPSHOT_MODE] = " Snapshot mode %"PRId64"\n",
[INTEL_PT_PER_CPU_MMAPS] = " Per-cpu maps %"PRId64"\n",
[INTEL_PT_MTC_BIT] = " MTC bit %#"PRIx64"\n",
[INTEL_PT_MTC_FREQ_BITS] = " MTC freq bits %#"PRIx64"\n",
[INTEL_PT_TSC_CTC_N] = " TSC:CTC numerator %"PRIu64"\n",
[INTEL_PT_TSC_CTC_D] = " TSC:CTC denominator %"PRIu64"\n",
[INTEL_PT_CYC_BIT] = " CYC bit %#"PRIx64"\n",
@ -4060,8 +4061,12 @@ static void intel_pt_print_info(__u64 *arr, int start, int finish)
if (!dump_trace)
return;
for (i = start; i <= finish; i++)
fprintf(stdout, intel_pt_info_fmts[i], arr[i]);
for (i = start; i <= finish; i++) {
const char *fmt = intel_pt_info_fmts[i];
if (fmt)
fprintf(stdout, fmt, arr[i]);
}
}
static void intel_pt_print_info_str(const char *name, const char *str)

View File

@ -246,6 +246,9 @@ __add_event(struct list_head *list, int *idx,
struct perf_cpu_map *cpus = pmu ? perf_cpu_map__get(pmu->cpus) :
cpu_list ? perf_cpu_map__new(cpu_list) : NULL;
if (pmu)
perf_pmu__warn_invalid_formats(pmu);
if (pmu && attr->type == PERF_TYPE_RAW)
perf_pmu__warn_invalid_config(pmu, attr->config, name);

View File

@ -1005,6 +1005,23 @@ err:
return NULL;
}
void perf_pmu__warn_invalid_formats(struct perf_pmu *pmu)
{
struct perf_pmu_format *format;
/* fake pmu doesn't have format list */
if (pmu == &perf_pmu__fake)
return;
list_for_each_entry(format, &pmu->format, list)
if (format->value >= PERF_PMU_FORMAT_VALUE_CONFIG_END) {
pr_warning("WARNING: '%s' format '%s' requires 'perf_event_attr::config%d'"
"which is not supported by this version of perf!\n",
pmu->name, format->name, format->value);
return;
}
}
static struct perf_pmu *pmu_find(const char *name)
{
struct perf_pmu *pmu;

View File

@ -17,6 +17,7 @@ enum {
PERF_PMU_FORMAT_VALUE_CONFIG,
PERF_PMU_FORMAT_VALUE_CONFIG1,
PERF_PMU_FORMAT_VALUE_CONFIG2,
PERF_PMU_FORMAT_VALUE_CONFIG_END,
};
#define PERF_PMU_FORMAT_BITS 64
@ -139,6 +140,7 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu);
void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config,
const char *name);
void perf_pmu__warn_invalid_formats(struct perf_pmu *pmu);
bool perf_pmu__has_hybrid(void);
int perf_pmu__match(char *pattern, char *name, char *tok);

View File

@ -27,8 +27,6 @@ num_dec [0-9]+
{num_dec} { return value(10); }
config { return PP_CONFIG; }
config1 { return PP_CONFIG1; }
config2 { return PP_CONFIG2; }
- { return '-'; }
: { return ':'; }
, { return ','; }

View File

@ -18,7 +18,7 @@ do { \
%}
%token PP_CONFIG PP_CONFIG1 PP_CONFIG2
%token PP_CONFIG
%token PP_VALUE PP_ERROR
%type <num> PP_VALUE
%type <bits> bit_term
@ -45,18 +45,11 @@ PP_CONFIG ':' bits
$3));
}
|
PP_CONFIG1 ':' bits
PP_CONFIG PP_VALUE ':' bits
{
ABORT_ON(perf_pmu__new_format(format, name,
PERF_PMU_FORMAT_VALUE_CONFIG1,
$3));
}
|
PP_CONFIG2 ':' bits
{
ABORT_ON(perf_pmu__new_format(format, name,
PERF_PMU_FORMAT_VALUE_CONFIG2,
$3));
$2,
$4));
}
bits: