Commit Graph

17547 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo eefa09b499 perf beauty: Add generator for 'move_mount' flags argument
$ tools/perf/trace/beauty/move_mount_flags.sh
  static const char *move_mount_flags[] = {
	  [ilog2(0x00000001) + 1] = "F_SYMLINKS",
	  [ilog2(0x00000002) + 1] = "F_AUTOMOUNTS",
	  [ilog2(0x00000004) + 1] = "F_EMPTY_PATH",
	  [ilog2(0x00000010) + 1] = "T_SYMLINKS",
	  [ilog2(0x00000020) + 1] = "T_AUTOMOUNTS",
	  [ilog2(0x00000040) + 1] = "T_EMPTY_PATH",
  };
  $

Will be wired up to the 'perf trace' arg in a followup patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-px7v33suw1k2ehst52l7bwa3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo 8a70c6b162 perf augmented_raw_syscalls: Fix up comment
Cut'n'paste error, the second comment is about the syscalls that have as
its second arg a string.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-zo5s6rloy42u41acsf6q3pvi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Jiri Olsa fb5a88d413 perf tools: Preserve eBPF maps when loading kcore
We need to preserve eBPF maps even if they are covered by kcore, because
we need to access eBPF dso for source data.

Add the map_groups__merge_in function to do that.  It merges a map into
map_groups by splitting the new map within the existing map regions.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Jiri Olsa 8529f2e673 perf machine: Keep zero in pgoff BPF map
With pgoff set to zero, the map__map_ip function will return BPF
addresses based from 0, which is what we need when we read the data from
a BPF DSO.

Adding BPF symbols with mapped IP addresses as well.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Adrian Hunter a2d8a1585e perf intel-pt: Fix itrace defaults for perf script intel-pt documentation
Fix intel-pt documentation to reflect the change of itrace defaults for
perf script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 4eb0681571 ("perf script: Make itrace script default to all calls")
Link: http://lkml.kernel.org/r/20190520113728.14389-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Adrian Hunter 355200e0f6 perf auxtrace: Fix itrace defaults for perf script
Commit 4eb0681571 ("perf script: Make itrace script default to all
calls") does not work for the case when '--itrace' only is used, because
default_no_sample is not being passed.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 4eb0681571 ("perf script: Make itrace script default to all calls")
Link: http://lkml.kernel.org/r/20190520113728.14389-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Adrian Hunter 26f19c2eb7 perf intel-pt: Fix itrace defaults for perf script
Commit 4eb0681571 ("perf script: Make itrace script default to all
calls") does not work because 'use_browser' is being used to determine
whether to default to periodic sampling (i.e. better for perf report).
The result is that nothing but CBR events display for perf script when
no --itrace option is specified.

Fix by using 'default_no_sample' and 'inject' instead.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.20+
Fixes: 90e457f7be ("perf tools: Add Intel PT support")
Link: http://lkml.kernel.org/r/20190520113728.14389-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Adrian Hunter a685c7a4a2 perf-with-kcore.sh: Always allow fix_buildid_cache_permissions
The user's buildid cache may contain entries added by root even if root
has its own home directory (e.g. by using perfconfig to specify the same
buildid dir), so remove that validation.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190412113830.4126-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo a7350998a2 tools headers UAPI: Sync kvm.h headers with the kernel sources
dd53f6102c ("Merge tag 'kvmarm-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")
  59c5c58c5b ("Merge tag 'kvm-ppc-next-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD")
  d7547c55cb ("KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2")
  6520ca64cd ("KVM: PPC: Book3S HV: XIVE: Add a mapping for the source ESB pages")
  39e9af3de5 ("KVM: PPC: Book3S HV: XIVE: Add a TIMA mapping")
  e4945b9da5 ("KVM: PPC: Book3S HV: XIVE: Add get/set accessors for the VP XIVE state")
  e6714bd167 ("KVM: PPC: Book3S HV: XIVE: Add a control to dirty the XIVE EQ pages")
  7b46b6169a ("KVM: PPC: Book3S HV: XIVE: Add a control to sync the sources")
  5ca8064748 ("KVM: PPC: Book3S HV: XIVE: Add a global reset control")
  13ce3297c5 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
  e8676ce50e ("KVM: PPC: Book3S HV: XIVE: Add a control to configure a source")
  4131f83c3d ("KVM: PPC: Book3S HV: XIVE: add a control to initialize a source")
  eacc56bb9d ("KVM: PPC: Book3S HV: XIVE: Introduce a new capability KVM_CAP_PPC_IRQ_XIVE")
  90c73795af ("KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation mode")
  4f45b90e1c ("KVM: s390: add deflate conversion facilty to cpu model")
  a243c16d18 ("KVM: arm64: Add capability to advertise ptrauth for guest")
  a22fa321d1 ("KVM: arm64: Add userspace flag to enable pointer authentication")
  4bd774e57b ("KVM: arm64/sve: Simplify KVM_REG_ARM64_SVE_VLS array sizing")
  8ae6efdde4 ("KVM: arm64/sve: Clean up UAPI register ID definitions")
  173aec2d5a ("KVM: s390: add enhanced sort facilty to cpu model")
  555f3d03e7 ("KVM: arm64: Add a capability to advertise SVE support")
  9033bba4b5 ("KVM: arm64/sve: Add pseudo-register for the guest's vector lengths")
  7dd32a0d01 ("KVM: arm/arm64: Add KVM_ARM_VCPU_FINALIZE ioctl")
  e1c9c98345 ("KVM: arm64/sve: Add SVE support to register access ioctl interface")
  2b953ea348 ("KVM: Allow 2048-bit register access via ioctl interface")

None entails changes in tooling, the closest to that were some new arch
specific ioctls, that are still not handled by the tools/perf/trace/beauty/
library, that needs to create per-arch tables to convert ioctl cmd->string (and
back).

From a quick look the arch specific kvm-stat.c files at:

  $ ls -1 tools/perf/arch/*/util/kvm-stat.c
  tools/perf/arch/powerpc/util/kvm-stat.c
  tools/perf/arch/s390/util/kvm-stat.c
  tools/perf/arch/x86/util/kvm-stat.c
  $

Are not affected.

This silences these perf building warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
  Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/kvm.h' differs from latest version at 'arch/powerpc/include/uapi/asm/kvm.h'
  diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h
  Warning: Kernel ABI header at 'tools/arch/s390/include/uapi/asm/kvm.h' differs from latest version at 'arch/s390/include/uapi/asm/kvm.h'
  diff -u tools/arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm.h
  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cédric Le Goater <clg@kaod.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Peter Xu <peterx@redhat.com>
Link: https://lkml.kernel.org/n/tip-3msmqjenlmb7eygcdnmlqaq1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Thomas Richter 6738028dd5 perf record: Fix s390 missing module symbol and warning for non-root users
Command 'perf record' and 'perf report' on a system without kernel
debuginfo packages uses /proc/kallsyms and /proc/modules to find
addresses for kernel and module symbols. On x86 this works for root and
non-root users.

On s390, when invoked as non-root user, many of the following warnings
are shown and module symbols are missing:

    proc/{kallsyms,modules} inconsistency while looking for
        "[sha1_s390]" module!

Command 'perf record' creates a list of module start addresses by
parsing the output of /proc/modules and creates a PERF_RECORD_MMAP
record for the kernel and each module. The following function call
sequence is executed:

  machine__create_kernel_maps
    machine__create_module
      modules__parse
        machine__create_module --> for each line in /proc/modules
          arch__fix_module_text_start

Function arch__fix_module_text_start() is s390 specific. It opens
file /sys/module/<name>/sections/.text to extract the module's .text
section start address. On s390 the module loader prepends a header
before the first section, whereas on x86 the module's text section
address is identical the the module's load address.

However module section files are root readable only. For non-root the
read operation fails and machine__create_module() returns an error.
Command perf record does not generate any PERF_RECORD_MMAP record
for loaded modules. Later command perf report complains about missing
module maps.

To fix this function arch__fix_module_text_start() always returns
success. For root users there is no change, for non-root users
the module's load address is used as module's text start address
(the prepended header then counts as part of the text section).

This enable non-root users to use module symbols and avoid the
warning when perf report is executed.

Output before:

  [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
  0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text

Output after:

  [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
  0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
  0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz
  0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz
  0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Jiri Olsa ed9adb2035 perf machine: Read also the end of the kernel
We mark the end of kernel based on the first module, but that could
cover some bpf program maps. Reading _etext symbol if it's present to
get precise kernel map end.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo 93f678b9ae perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms
No need to search for aliases for the symbol that marks the end of the
kernel text segment, the following patch will make such symbols not to
be found when searching in the kallsyms maps causing this test to fail.

So as a prep patch to avoid breaking bisection, ignore such symbols.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lkml.kernel.org/n/tip-qfwuih8cvmk9doh7k5k244eq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Namhyung Kim acd244b84b perf session: Add missing swap ops for namespace events
In case it's recorded in a different arch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> <hbathini@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Fixes: f3b3614a28 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/20190522053250.207156-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Namhyung Kim 6584140ba9 perf namespace: Protect reading thread's namespace
It seems that the current code lacks holding the namespace lock in
thread__namespaces().  Otherwise it can see inconsistent results.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo 9903c64f0f tools headers UAPI: Sync drm/drm.h with the kernel
To pick up the changes in these csets:

  060cebb20c ("drm: introduce a capability flag for syncobj timeline support")
  50d1ebef79 ("drm/syncobj: add timeline signal ioctl for syncobj v5")
  ea569910cb ("drm/syncobj: add transition iotcls between binary and timeline v2")
  27b575a9aa ("drm/syncobj: add timeline payload query ioctl v6")
  01d6c35783 ("drm/syncobj: add support for timeline point wait v8")
  783195ec1c ("drm/syncobj: disable the timeline UAPI for now v2")
  48197bc564 ("drm: add syncobj timeline support v9")

Which automagically results in the following new ioctls being recognized
by the 'perf trace' ioctl cmd arg beautifier:

  $ tools/perf/trace/beauty/drm_ioctl.sh > /tmp/before
  $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
  $ tools/perf/trace/beauty/drm_ioctl.sh > /tmp/after
  $ diff -u /tmp/before /tmp/after
  --- /tmp/before	2019-05-22 10:25:31.443151182 -0300
  +++ /tmp/after	2019-05-22 10:25:46.449354819 -0300
  @@ -103,6 +103,10 @@
   	[0xC7] = "MODE_LIST_LESSEES",
   	[0xC8] = "MODE_GET_LEASE",
   	[0xC9] = "MODE_REVOKE_LEASE",
  +	[0xCA] = "SYNCOBJ_TIMELINE_WAIT",
  +	[0xCB] = "SYNCOBJ_QUERY",
  +	[0xCC] = "SYNCOBJ_TRANSFER",
  +	[0xCD] = "SYNCOBJ_TIMELINE_SIGNAL",
   	[DRM_COMMAND_BASE + 0x00] = "I915_INIT",
   	[DRM_COMMAND_BASE + 0x01] = "I915_FLUSH",
   	[DRM_COMMAND_BASE + 0x02] = "I915_FLIP",
    $

I.e. the strace like raw_tracepoint:sys_enter handler in 'perf trace'
will get the cmd integer value and map it to the string.

At some point it should be possible to translate from string to integer
and use to filter using expressions such as:

   # perf trace -e ioctl/cmd==DRM_IOCTL_SYNCOBJ*/

Or some more suitable syntax to express that only these ioctls when
acting on DRM fds should be shown.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jrc9ogw33w4zgqc3pu7o1l3g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:11 -03:00
Arnaldo Carvalho de Melo e6aff9f8bf tools headers UAPI: Sync drm/i915_drm.h with the kernel
To pick up the changes from:

  d1172ab3d4 ("drm/i915: Introduce struct class_instance for engines across the uAPI")
  96fd2c6633 ("drm/i915: Drop new chunks of context creation ABI (for now)")
  ea593dbba4 ("drm/i915: Allow contexts to share a single timeline across all engines")
  b917154172 ("drm/i915: Extend CONTEXT_CREATE to set parameters upon construction")
  e0695db729 ("drm/i915: Create/destroy VM (ppGTT) for use with contexts")
  9d1305ef80 ("drm/i915: Introduce the i915_user_extension_method")
  c8b502422b ("drm/i915: Remove last traces of exec-id (GEM_BUSY)")
  d90c06d570 ("drm/i915: Fix I915_EXEC_RING_MASK")
  e886196469 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
  be03564bd7 ("drm/i915: Include reminders about leaving no holes in uAPI enums")
  ba4fda620a ("drm/i915: Optionally disable automatic recovery after a GPU reset")

We still don't take into account the _IOC_SIZE() to differentiate ioctl cmds,
so more work is needed to support the extension mechanism that is being used
here so that we can differentiate DRM_IOCTL_I915_GEM_CONTEXT_CREATE from the
newly introduced DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT cmd.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://lkml.kernel.org/n/tip-csn0vanmc7pevyka5qcg0xyw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Arnaldo Carvalho de Melo b5b999dca6 tools headers UAPI: Sync linux/fs.h with the kernel
To pick up the changes in:

  c553ea4fdf ("fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback")

That should be used to beautify the 'sync_file_range' syscall 'flags'
arg.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
  diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-at3uoqcvmqdkwaysmvbj1wpv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Arnaldo Carvalho de Melo c27de2b891 tools headers UAPI: Sync linux/sched.h with the kernel
To pick up the change in:

  b3e5838252 ("clone: add CLONE_PIDFD")

This requires changes in the 'perf trace' beautification routines for
the 'clone' syscall args, which is done in a followup patch.

This silences the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from latest version at 'include/uapi/linux/sched.h'
  diff -u tools/include/uapi/linux/sched.h include/uapi/linux/sched.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lenja6gmy26dkt0ybk747qgq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Arnaldo Carvalho de Melo b979540a75 tools arch x86: Sync asm/cpufeatures.h with the with the kernel
To pick up the changes in:

  ed5194c273 ("x86/speculation/mds: Add basic bug infrastructure for MDS")
  e261f209c3 ("x86/speculation/mds: Add BUG_MSBDS_ONLY")

That don't affect anything in tools/.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/n/tip-jp1afecx3ql1jkuirpgkqfad@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Arnaldo Carvalho de Melo fba29f1820 tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
Copy the headers changed by these csets:

  d8076bdb56 ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]")
  9c8ad7a2ff ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")
  cf3cba4a42 ("vfs: syscall: Add fspick() to select a superblock for reconfiguration")
  93766fbd26 ("vfs: syscall: Add fsmount() to create a mount for a superblock")
  ecdab150fd ("vfs: syscall: Add fsconfig() for configuring and managing a context")
  24dcb3d90a ("vfs: syscall: Add fsopen() to prepare for superblock creation")
  2db154b3ea ("vfs: syscall: Add move_mount(2) to move mounts around")
  a07b200047 ("vfs: syscall: Add open_tree(2) to reference or clone a mount")

We need to create tables for all the flags argument in the new syscalls,
in followup patches.

This silences these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
  diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-knpqr1u2ffvz6641056z2mwu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Vitaly Chikunov f95d050cdc perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel
When a host system has kernel headers that are newer than a compiling
kernel, mksyscalltbl fails with errors such as:

  <stdin>: In function 'main':
  <stdin>:271:44: error: '__NR_kexec_file_load' undeclared (first use in this function)
  <stdin>:271:44: note: each undeclared identifier is reported only once for each function it appears in
  <stdin>:272:46: error: '__NR_pidfd_send_signal' undeclared (first use in this function)
  <stdin>:273:43: error: '__NR_io_uring_setup' undeclared (first use in this function)
  <stdin>:274:43: error: '__NR_io_uring_enter' undeclared (first use in this function)
  <stdin>:275:46: error: '__NR_io_uring_register' undeclared (first use in this function)
  tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: line 48: /tmp/create-table-xvUQdD: Permission denied

mksyscalltbl is compiled with default host includes, but run with
compiling kernel tree includes, causing some syscall numbers to being
undeclared.

Committer testing:

Before this patch, in my cross build environment, no build problems, but
these new syscalls were not in the syscalls.c generated from the
unistd.h file, which is a bug, this patch fixes it:

perfbuilder@6e20056ed532:/git/perf$ tail /tmp/build/perf/arch/arm64/include/generated/asm/syscalls.c
	[292] = "io_pgetevents",
	[293] = "rseq",
	[294] = "kexec_file_load",
	[424] = "pidfd_send_signal",
	[425] = "io_uring_setup",
	[426] = "io_uring_enter",
	[427] = "io_uring_register",
	[428] = "syscalls",
};
perfbuilder@6e20056ed532:/git/perf$ strings /tmp/build/perf/perf | egrep '^(io_uring_|pidfd_|kexec_file)'
kexec_file_load
pidfd_send_signal
io_uring_setup
io_uring_enter
io_uring_register
perfbuilder@6e20056ed532:/git/perf$
$

Well, there is that last "syscalls" thing, but that looks like some
other bug.

Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Michael Petlan <mpetlan@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190521030203.1447-1-vt@altlinux.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Shawn Landden 97acec7df1 perf data: Fix 'strncat may truncate' build failure with recent gcc
This strncat() is safe because the buffer was allocated with zalloc(),
however gcc doesn't know that. Since the string always has 4 non-null
bytes, just use memcpy() here.

    CC       /home/shawn/linux/tools/perf/util/data-convert-bt.o
  In file included from /usr/include/string.h:494,
                   from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27,
                   from util/data-convert-bt.c:22:
  In function ‘strncat’,
      inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4:
  /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation]
    136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shawn Landden <shawn@git.icu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 20190518183238.10954-1-shawn@git.icu
Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:49:03 -03:00
Linus Torvalds 78e0365184 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:1) Use after free in __dev_map_entry_free(), from Eric Dumazet.

 1) Use after free in __dev_map_entry_free(), from Eric Dumazet.

 2) Fix TCP retransmission timestamps on passive Fast Open, from Yuchung
    Cheng.

 3) Orphan NFC, we'll take the patches directly into my tree. From
    Johannes Berg.

 4) We can't recycle cloned TCP skbs, from Eric Dumazet.

 5) Some flow dissector bpf test fixes, from Stanislav Fomichev.

 6) Fix RCU marking and warnings in rhashtable, from Herbert Xu.

 7) Fix some potential fib6 leaks, from Eric Dumazet.

 8) Fix a _decode_session4 uninitialized memory read bug fix that got
    lost in a merge. From Florian Westphal.

 9) Fix ipv6 source address routing wrt. exception route entries, from
    Wei Wang.

10) The netdev_xmit_more() conversion was not done %100 properly in mlx5
    driver, fix from Tariq Toukan.

11) Clean up botched merge on netfilter kselftest, from Florian
    Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (74 commits)
  of_net: fix of_get_mac_address retval if compiled without CONFIG_OF
  net: fix kernel-doc warnings for socket.c
  net: Treat sock->sk_drops as an unsigned int when printing
  kselftests: netfilter: fix leftover net/net-next merge conflict
  mlxsw: core: Prevent reading unsupported slave address from SFP EEPROM
  mlxsw: core: Prevent QSFP module initialization for old hardware
  vsock/virtio: Initialize core virtio vsock before registering the driver
  net/mlx5e: Fix possible modify header actions memory leak
  net/mlx5e: Fix no rewrite fields with the same match
  net/mlx5e: Additional check for flow destination comparison
  net/mlx5e: Add missing ethtool driver info for representors
  net/mlx5e: Fix number of vports for ingress ACL configuration
  net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled
  net/mlx5e: Fix wrong xmit_more application
  net/mlx5: Fix peer pf disable hca command
  net/mlx5: E-Switch, Correct type to u16 for vport_num and int for vport_index
  net/mlx5: Add meaningful return codes to status_to_err function
  net/mlx5: Imply MLXFW in mlx5_core
  Revert "tipc: fix modprobe tipc failed after switch order of device registration"
  vsock/virtio: free packets during the socket release
  ...
2019-05-20 08:21:07 -07:00
Linus Torvalds 1ba3b5dc14 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling updates from Ingo Molnar:
 "perf.data:

   - Streaming compression of perf ring buffer into
     PERF_RECORD_COMPRESSED user space records, resulting in ~3-5x
     perf.data file size reduction on variety of tested workloads what
     saves storage space on larger server systems where perf.data size
     can easily reach several tens or even hundreds of GiBs, especially
     when profiling with DWARF-based stacks and tracing of context
     switches.

  perf record:

   - Improve -user-regs/intr-regs suggestions to overcome errors

  perf annotate:

   - Remove hist__account_cycles() from callback, speeding up branch
     processing (perf record -b)

  perf stat:

   - Add a 'percore' event qualifier, e.g.: -e
     cpu/event=0,umask=0x3,percore=1/, that sums up the event counts for
     both hardware threads in a core.

     We can already do this with --per-core, but it's often useful to do
     this together with other metrics that are collected per hardware
     thread.

     I.e. now its possible to do this per-event, and have it mixed with
     other events not aggregated by core.

  arm64:

   - Map Brahma-B53 CPUID to cortex-a53 events.

   - Add Cortex-A57 and Cortex-A72 events.

  csky:

   - Add DWARF register mappings for libdw, allowing --call-graph=dwarf
     to work on the C-SKY arch.

  x86:

   - Add support for recording and printing XMM registers, available,
     for instance, on Icelake.

   - Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON
     support. UPI replaced the Intel QuickPath Interconnect (QPI) in
     Xeon Skylake-SP.

  Intel PT:

   - Fix instructions sampling rate.

   - Timestamp fixes.

   - Improve exported-sql-viewer GUI, allowing, for instance, to
     copy'n'paste the trees, useful for e-mailing"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  perf stat: Support 'percore' event qualifier
  perf stat: Factor out aggregate counts printing
  perf tools: Add a 'percore' event qualifier
  perf docs: Add description for stderr
  perf intel-pt: Fix sample timestamp wrt non-taken branches
  perf intel-pt: Fix improved sample timestamp
  perf intel-pt: Fix instructions sampling rate
  perf regs x86: Add X86 specific arch__intr_reg_mask()
  perf parse-regs: Add generic support for arch__intr/user_reg_mask()
  perf parse-regs: Split parse_regs
  perf vendor events arm64: Add Cortex-A57 and Cortex-A72 events
  perf vendor events arm64: Map Brahma-B53 CPUID to cortex-a53 events
  perf vendor events arm64: Remove [[:xdigit:]] wildcard
  perf jevents: Remove unused variable
  perf test zstd: Fixup verbose mode output
  perf tests: Implement Zstd comp/decomp integration test
  perf inject: Enable COMPRESSED record decompression
  perf report: Implement perf.data record decompression
  perf record: Implement -z,--compression_level[=<n>] option
  perf report: Add stub processing of compressed events for -D
  ...
2019-05-19 11:20:22 -07:00
Linus Torvalds 1335d9a1fb Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Ingo Molnar:
 "This fixes a particularly thorny munmap() bug with MPX, plus fixes a
  host build environment assumption in objtool"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Allow AR to be overridden with HOSTAR
  x86/mpx, mm/core: Fix recursive munmap() corruption
2019-05-19 10:23:24 -07:00
Florian Westphal c50a42b8f6 kselftests: netfilter: fix leftover net/net-next merge conflict
In nf-next, I had extended this script to also cover NAT support for the
inet family.

In nf, I extended it to cover a regression with 'fully-random' masquerade.

Make this script work again by resolving the conflicts as needed.

Fixes: 8b44836583 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-18 18:15:13 -07:00
Ingo Molnar 62e1c09418 perf/core improvements and fixes:
perf.data:
 
   Alexey Budankov:
 
   - Streaming compression of perf ring buffer into PERF_RECORD_COMPRESSED
     user space records, resulting in ~3-5x perf.data file size reduction
     on variety of tested workloads what saves storage space on larger
     server systems where perf.data size can easily reach several tens or
     even hundreds of GiBs, especially when profiling with DWARF-based
     stacks and tracing of context switches.
 
 perf record:
 
   Arnaldo Carvalho de Melo
 
   - Improve -user-regs/intr-regs suggestions to overcome errors.
 
 perf annotate:
 
   Jin Yao:
 
   - Remove hist__account_cycles() from callback, speeding up branch processing
     (perf record -b).
 
 perf stat:
 
   - Add a 'percore' event qualifier, e.g.: -e cpu/event=0,umask=0x3,percore=1/,
     that sums up the event counts for both hardware threads in a core.
 
     We can already do this with --per-core, but it's often useful to do
     this together with other metrics that are collected per hardware thread.
 
     I.e. now its possible to do this per-event, and have it mixed with other
     events not aggregated by core.
 
 core libraries:
 
   Donald Yandt:
 
   - Check for errors when doing fgets(/proc/version).
 
   Jiri Olsa:
 
   - Speed up report for perf compiled with linbunwind.
 
 tools headers:
 
   Arnaldo Carvalho de Melo
 
   - Update memcpy_64.S, x86's kvm.h and pt_regs.h.
 
 arm64:
 
   Florian Fainelli:
 
   - Map Brahma-B53 CPUID to cortex-a53 events.
 
   - Add Cortex-A57 and Cortex-A72 events.
 
 csky:
 
   Mao Han:
 
   - Add DWARF register mappings for libdw, allowing --call-graph=dwarf to work
     on the C-SKY arch.
 
 x86:
 
   Andi Kleen/Kan Liang:
 
   - Add support for recording and printing XMM registers, available, for
     instance, on Icelake.
 
   Kan Liang:
 
   - Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON support.
     UPI replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP.
 
 Intel PT:
 
   Adrian Hunter
 
   . Fix instructions sampling rate.
 
   . Timestamp fixes.
 
   . Improve exported-sql-viewer GUI, allowing, for instance, to copy'n'paste
     the trees, useful for e-mailing.
 
 Documentation:
 
   Thomas Richter:
 
   - Add description for 'perf --debug stderr=1', which redirects stderr to stdout.
 
 libtraceevent:
 
   Tzvetomir Stoyanov:
 
   - Add man pages for the various APIs.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXN8KEwAKCRCyPKLppCJ+
 JxbAAPsE4bDD7UyTJgVWVXFyqMkaFURv9eGtmhbH/EhnZ/ADfwEAoUG6C+G2DLre
 IMtgTXSlRShQucQbum5Rcp9lcXGL+gY=
 =mGtw
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.2-20190517' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf.data:

  Alexey Budankov:

  - Streaming compression of perf ring buffer into PERF_RECORD_COMPRESSED
    user space records, resulting in ~3-5x perf.data file size reduction
    on variety of tested workloads what saves storage space on larger
    server systems where perf.data size can easily reach several tens or
    even hundreds of GiBs, especially when profiling with DWARF-based
    stacks and tracing of context switches.

perf record:

  Arnaldo Carvalho de Melo

  - Improve -user-regs/intr-regs suggestions to overcome errors.

perf annotate:

  Jin Yao:

  - Remove hist__account_cycles() from callback, speeding up branch processing
    (perf record -b).

perf stat:

  - Add a 'percore' event qualifier, e.g.: -e cpu/event=0,umask=0x3,percore=1/,
    that sums up the event counts for both hardware threads in a core.

    We can already do this with --per-core, but it's often useful to do
    this together with other metrics that are collected per hardware thread.

    I.e. now its possible to do this per-event, and have it mixed with other
    events not aggregated by core.

core libraries:

  Donald Yandt:

  - Check for errors when doing fgets(/proc/version).

  Jiri Olsa:

  - Speed up report for perf compiled with linbunwind.

tools headers:

  Arnaldo Carvalho de Melo

  - Update memcpy_64.S, x86's kvm.h and pt_regs.h.

arm64:

  Florian Fainelli:

  - Map Brahma-B53 CPUID to cortex-a53 events.

  - Add Cortex-A57 and Cortex-A72 events.

csky:

  Mao Han:

  - Add DWARF register mappings for libdw, allowing --call-graph=dwarf to work
    on the C-SKY arch.

x86:

  Andi Kleen/Kan Liang:

  - Add support for recording and printing XMM registers, available, for
    instance, on Icelake.

  Kan Liang:

  - Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON support.
    UPI replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP.

Intel PT:

  Adrian Hunter

  . Fix instructions sampling rate.

  . Timestamp fixes.

  . Improve exported-sql-viewer GUI, allowing, for instance, to copy'n'paste
    the trees, useful for e-mailing.

Documentation:

  Thomas Richter:

  - Add description for 'perf --debug stderr=1', which redirects stderr to stdout.

libtraceevent:

  Tzvetomir Stoyanov:

  - Add man pages for the various APIs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-18 10:24:43 +02:00
David S. Miller 5a35c8ea7c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-05-18

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix bpftool's raw BTF dump in relation to forward declarations of union/
   structs, and another fix to unexport logging helpers, from Andrii.

2) Fix inode permission check for retrieving bpf programs, from Chenbo.

3) Fix bpftool to raise rlimit earlier as otherwise libbpf's feature probing
   can fail and subsequently it refuses to load an object, from Yonghong.

4) Fix declaration of bpf_get_current_task() in kselftests, from Alexei.

5) Fix up BPF kselftest .gitignore to add generated files, from Stanislav.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-17 16:33:06 -07:00
Linus Torvalds 0ef0fd3515 * ARM: support for SVE and Pointer Authentication in guests, PMU improvements
* POWER: support for direct access to the POWER9 XIVE interrupt controller,
 memory and performance optimizations.
 
 * x86: support for accessing memory not backed by struct page, fixes and refactoring
 
 * Generic: dirty page tracking improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJc3qV/AAoJEL/70l94x66Dn3QH/jX1Bn0P/RZAIt4w0SySklSg
 PqxUKDyBQqB9vN9Qeb9jWXAKPH2CtM3+up/rz7oRnBWp7qA6vXcC/R/QJYAvzdXE
 nklsR/oYCsflR1KdlVYuDvvPCPP2fLBU5zfN83OsaBQ8fNRkm3gN+N5XQ2SbXbLy
 Mo9tybS4otY201UAC96e8N0ipwwyCRpDneQpLcl+F5nH3RBt63cVbs04O+70MXn7
 eT4I+8K3+Go7LATzT8hglD21D/7uvE31qQb6yr5L33IfhU4GB51RZzBXTNaAdY8n
 hT1rMrRkAMAFWYZPQDfoMadjWU3i5DIfstKjDxOr9oTfuOEp5Z+GvJwvVnUDg1I=
 =D0+p
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - support for SVE and Pointer Authentication in guests
   - PMU improvements

  POWER:
   - support for direct access to the POWER9 XIVE interrupt controller
   - memory and performance optimizations

  x86:
   - support for accessing memory not backed by struct page
   - fixes and refactoring

  Generic:
   - dirty page tracking improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
  kvm: fix compilation on aarch64
  Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
  kvm: x86: Fix L1TF mitigation for shadow MMU
  KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
  KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
  KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
  KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
  kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
  tests: kvm: Add tests for KVM_SET_NESTED_STATE
  KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
  tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
  tests: kvm: Add tests to .gitignore
  KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
  KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
  KVM: Fix the bitmap range to copy during clear dirty
  KVM: arm64: Fix ptrauth ID register masking logic
  KVM: x86: use direct accessors for RIP and RSP
  KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
  KVM: x86: Omit caching logic for always-available GPRs
  kvm, x86: Properly check whether a pfn is an MMIO or not
  ...
2019-05-17 10:33:30 -07:00
Andrii Nakryiko 9c3ddee124 bpftool: fix BTF raw dump of FWD's fwd_kind
kflag bit determines whether FWD is for struct or union. Use that bit.

Fixes: c93cc69004 ("bpftool: add ability to dump BTF types")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-17 14:21:29 +02:00
Alexei Starovoitov 7ed4b4e60b selftests/bpf: fix bpf_get_current_task
Fix bpf_get_current_task() declaration.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-17 13:19:30 +02:00
Nathan Chancellor 8ea58f1e8b objtool: Allow AR to be overridden with HOSTAR
Currently, this Makefile hardcodes GNU ar, meaning that if it is not
available, there is no way to supply a different one and the build will
fail.

  $ make AR=llvm-ar CC=clang LD=ld.lld HOSTAR=llvm-ar HOSTCC=clang \
         HOSTLD=ld.lld HOSTLDFLAGS=-fuse-ld=lld defconfig modules_prepare
  ...
    AR       /out/tools/objtool/libsubcmd.a
  /bin/sh: 1: ar: not found
  ...

Follow the logic of HOST{CC,LD} and allow the user to specify a
different ar tool via HOSTAR (which is used elsewhere in other
tools/ Makefiles).

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/80822a9353926c38fd7a152991c6292491a9d0e8.1558028966.git.jpoimboe@redhat.com
Link: https://github.com/ClangBuiltLinux/linux/issues/481
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-17 11:10:42 +02:00
Linus Torvalds 4c7b63a32d linux-kselftest-5.2-rc1-2
This kselftest second update for Linux 5.2-rc1 consists of
 
 Kselftest framework fixes from Shuah Khan
 
 - kselftest framework bpf build/test workflow regression fix
 - Fix to kselftest install to use default install path
 - Fix to kselftest KBUILD_OUTPUT builds to not clutter main
   KBUILD_OUTPUT directory with selftest objects
 
 - .gitignore fixes from Kelsey Skunberg
 
 - rseq selftests updates from Mathieu Desnoyers and Martin Schwidefsky:
 
   They change the per-architecture pre-abort signatures to ensure those
   are valid trap instructions.
 
   The way exit points are presented to debuggers is enhanced, ensuring
   all exit points are present, so debuggers don't have to disassemble
   rseq critical section to properly skip over them.
 
   Discussions with the glibc community is reaching a consensus of exposing
   a __rseq_handled symbol from glibc to coexist with rseq early adopters.
   Update the rseq selftest code to expose and use this symbol.
 
   Support for compiling asm goto with clang is added with the
   "-no-integrated-as" compiler switch, similarly to the top level kernel
   Makefile.
 
 - kselftest Makefile test run output refactoring and making test
   output TAP13 compliant from Kees Cook:
 
   This re-factors the selftest Makefiles to extract the test running logic
   to be reused between "run_tests" and "emit_tests", while also fixing
   up the test output to be TAP version 13 compliant:
 	- added "plan" line
 	- fixed result line syntax
 	- moved all test output to be "# "-prefixed as TAP "diagnostic"
 	  lines
 
   The prefixing code includes a fallback mode for limited execution
   environments.
 
   Additionally, the plan lines are fixed for all callers of kselftest.h.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAlzdwpcACgkQCwJExA0N
 Qxya+xAAm/+ozRxrVGuhQt44y/lfbCNqgiHp2PPAsuTISTtujea96VQ20DLhihvy
 hdpcOvNS00G5Fs6Nn3x/JLw7tftkTlchgOPZ0VwZXG23YAdhbQADBm8piELmzNM4
 j+sA7O7MMJ55Hmh5GOGf7E/Wt6mlWrkKwzcAt986iWzB1j+cbEx7bX6APRh3E0fn
 SplH4+DclfAFHCTI4Ns++DAtJvH6nCnaZgEYib+wMUr4jRNVB1fe4q31Bamzag46
 QXO7Jgn/CAYq1+wTPyfKkAJb9wlXvNVi1KxJyLTxP2Pir47HuDtaugg3sVHk8BCX
 o08U8c9z8H7X8y1eXcP/DqMMGFVo0hNT2MC8RpG8GDD/U2PLKeRegjyxEG9ssDJc
 48efizxCJffrJTplN6fANAb28EezdQ5l+NOuccXhf1D2RIXJuUlTtbyCm7bRkgDB
 yDzFrTWtp16AFjaS5Bvnkk57bjCnlHnTq5YuQscK0b5CnWggIzipGh/Sl6H5cYQ2
 JqphN00A48IfJDVFxjwoPKUXQEcy9U7EtHoKET7L+dMZ8W3yEZy9me73Ncc7dGym
 htLcuzLsEIfkRZVwhh4DegXodrFFzbpXf1nCV5/ULJNVTFgjRD5quzfnGo4xj//Z
 0iD/AybtgrAeEKL5wIuYLNRd2j9uVO+KvuDDmnF+BZ5Hsi2ko2c=
 =LZRz
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull more kselftest updates from Shuah Khan:

 - kselftest framework bpf build/test workflow regression fix

 - Fix to kselftest install to use default install path

 - Fix to kselftest KBUILD_OUTPUT builds to not clutter main
   KBUILD_OUTPUT directory with selftest objects

 - .gitignore fixes (Kelsey Skunberg)

 - rseq selftests updates (Mathieu Desnoyers and Martin Schwidefsky)

   They change the per-architecture pre-abort signatures to ensure those
   are valid trap instructions.

   The way exit points are presented to debuggers is enhanced, ensuring
   all exit points are present, so debuggers don't have to disassemble
   rseq critical section to properly skip over them.

   Discussions with the glibc community is reaching a consensus of
   exposing a __rseq_handled symbol from glibc to coexist with rseq
   early adopters. Update the rseq selftest code to expose and use this
   symbol.

   Support for compiling asm goto with clang is added with the
   "-no-integrated-as" compiler switch, similarly to the top level
   kernel Makefile.

 - kselftest Makefile test run output refactoring and making test output
   TAP13 compliant from Kees Cook:

   This re-factors the selftest Makefiles to extract the test running
   logic to be reused between "run_tests" and "emit_tests", while also
   fixing up the test output to be TAP version 13 compliant:
	- added "plan" line
	- fixed result line syntax
	- moved all test output to be "# "-prefixed as TAP "diagnostic"
	  lines

   The prefixing code includes a fallback mode for limited execution
   environments.

   Additionally, the plan lines are fixed for all callers of
   kselftest.h.

* tag 'linux-kselftest-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits)
  selftests: avoid KBUILD_OUTPUT dir cluttering with selftest objects
  selftests: drivers: Create .gitignore to include /dma-buf/udmabuf
  selftests: pidfd: Create .gitignore to include pidfd_test
  selftests: fix bpf build/test workflow regression when KBUILD_OUTPUT is set
  selftests: fix install target to use default install path
  rseq/selftests: add -no-integrated-as for clang
  rseq/selftests: mips: use break instruction for RSEQ_SIG
  rseq/selftests: powerpc code signature: generate valid instructions
  rseq/selftests: aarch64 code signature: handle big-endian environment
  rseq/selftests: arm: use udf instruction for RSEQ_SIG
  rseq/selftests: s390: use trap4 for RSEQ_SIG
  rseq/selftests: x86: use ud1 instruction as RSEQ_SIG opcode
  rseq/selftests: s390: use jg instruction for jumps outside of the asm
  rseq/selftests: Use __rseq_handled symbol to coexist with glibc
  rseq/selftests: Introduce __rseq_cs_ptr_array, rename __rseq_table to __rseq_cs
  rseq/selftests: Add __rseq_exit_point_array section for debuggers
  rseq/selftests: x86: Work-around bogus gcc-8 optimisation
  selftests: Add test plan API to kselftest.h and adjust callers
  selftests: Remove KSFT_TAP_LEVEL
  selftests: Move test output to diagnostic lines
  ...
2019-05-16 18:57:58 -07:00
David Ahern 9a6c8bf91b selftests: pmtu.sh: Remove quotes around commands in setup_xfrm
The first command in setup_xfrm is failing resulting in the test getting
skipped:

+ ip netns exec ns-B ip -6 xfrm state add src fd00:1::a dst fd00:1::b spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel
+ out=RTNETLINK answers: Function not implemented
...
  xfrm6 not supported
TEST: vti6: PMTU exceptions                                         [SKIP]
  xfrm4 not supported
TEST: vti4: PMTU exceptions                                         [SKIP]
...

The setup command started failing when the run_cmd option was added.
Removing the quotes fixes the problem:
...
TEST: vti6: PMTU exceptions                                         [ OK ]
TEST: vti4: PMTU exceptions                                         [ OK ]
...

Fixes: 56490b623a ("selftests: Add debugging options to pmtu.sh")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-16 14:28:22 -07:00
Andrii Nakryiko d72386fe7a libbpf: move logging helpers into libbpf_internal.h
libbpf_util.h header was recently exposed as public as a dependency of
xsk.h. In addition to memory barriers, it contained logging helpers,
which are not supposed to be exposed. This patch moves those into
libbpf_internal.h, which is kept as an internal header.

Cc: Stanislav Fomichev <sdf@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 7080da8909 ("libbpf: add libbpf_util.h to header install.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-16 12:47:47 -07:00
Yonghong Song ac4e0e055f tools/bpftool: move set_max_rlimit() before __bpf_object__open_xattr()
For a host which has a lower rlimit for max locked memory (e.g., 64KB),
the following error occurs in one of our production systems:
  # /usr/sbin/bpftool prog load /paragon/pods/52877437/home/mark.o \
    /sys/fs/bpf/paragon_mark_21 type cgroup/skb \
    map idx 0 pinned /sys/fs/bpf/paragon_map_21
  libbpf: Error in bpf_object__probe_name():Operation not permitted(1).
    Couldn't load basic 'r0 = 0' BPF program.
  Error: failed to open object file

The reason is due to low locked memory during bpf_object__probe_name()
which probes whether program name is supported in kernel or not
during __bpf_object__open_xattr().

bpftool program load already tries to relax mlock rlimit before
bpf_object__load(). Let us move set_max_rlimit() before
__bpf_object__open_xattr(), which fixed the issue here.

Fixes: 47eff61777 ("bpf, libbpf: introduce bpf_object__probe_caps to test BPF capabilities")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-16 11:44:50 -07:00
Stanislav Fomichev bca844a8c9 selftests/bpf: add test_sysctl and map_tests/tests.h to .gitignore
Missing files are:
* tools/testing/selftests/bpf/map_tests/tests.h - autogenerated
* tools/testing/selftests/bpf/test_sysctl - binary

Fixes: 51a0e301a5 ("bpf: Add BPF_MAP_TYPE_SK_STORAGE test to test_maps")
Fixes: 1f5fa9ab6e ("selftests/bpf: Test BPF_CGROUP_SYSCTL")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-16 11:41:31 -07:00
Linus Torvalds b2ca74d32b Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Ingo Molnar:
 "A handful of objtool updates, plus a documentation addition for
  __ab_c_size()"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix whitelist documentation typo
  objtool: Fix function fallthrough detection
  objtool: Don't use ignore flag for fake jumps
  overflow.h: Add comment documenting __ab_c_size()
2019-05-16 10:29:00 -07:00
Jin Yao 4fc4d8dfa0 perf stat: Support 'percore' event qualifier
With this patch, we can use the 'percore' event qualifier in perf-stat.

  root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
    1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
    1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
    1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is expected (tiny difference is ignored).

Note that, the 'percore' event qualifier needs to use with option '-A'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Jin Yao 40480a8136 perf stat: Factor out aggregate counts printing
Move the aggregate counts printing to a new function
print_counter_aggrdata, which will be used in following patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Jin Yao 064b4e82aa perf tools: Add a 'percore' event qualifier
Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.

This can be implemented in only the user tool, no kernel support needed.

 v4:
 ---
 1. Add Arnaldo's patch which updates the documentation for
    this new qualifier.
 2. Rebase to latest perf/core branch

 v3:
 ---
 Simplify the code according to Jiri's comments.
 Before:
   "return term->val.percore ? true : false;"
 Now:
   "return term->val.percore;"

 v2:
 ---
 Change the qualifier name from 'coresum' to 'percore' according to
 comments from Jiri and Andi.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Thomas Richter 6cf6265639 perf docs: Add description for stderr
'perf report' displays recorded data on the screen and emits warnings
and debug messages in the status line (last one on screen).

perf also supports the possibility to write all debug messages to stderr
(instead of writing them to the status line).

This is achieved with the following command:

  # ./perf --debug stderr=1 report -vvvvv -i ~/fast.data 2>/tmp/2
  # ll /tmp/2
  -rw-rw-r-- 1 tmricht tmricht 5420835 May  7 13:46 /tmp/2
  #

The usage of variable stderr=1 is not documented, so add it to the perf
man page.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190513080220.91966-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Adrian Hunter 1b6599a9d8 perf intel-pt: Fix sample timestamp wrt non-taken branches
The sample timestamp is updated to ensure that the timestamp represents
the time of the sample and not a branch that the decoder is still
walking towards. The sample timestamp is updated when the decoder
returns, but the decoder does not return for non-taken branches. Update
the sample timestamp then also.

Note that commit 3f04d98e97 ("perf intel-pt: Improve sample
timestamp") was also a stable fix and appears, for example, in v4.4
stable tree as commit a4ebb58fd1 ("perf intel-pt: Improve sample
timestamp").

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 3f04d98e97 ("perf intel-pt: Improve sample timestamp")
Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Adrian Hunter 61b6e08dc8 perf intel-pt: Fix improved sample timestamp
The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached.

The intel_pt_sample_time() function decides which is which, but was not
handling TNT packets exactly correctly.

In the case of TNT, the timestamp applies to the first branch, so the
decoder must first walk to that branch.

That means intel_pt_sample_time() should return true for TNT, and this
patch makes that change. However, if the first branch is a non-taken
branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false
for subsequent taken branches in the same TNT packet.

To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to
distinguish the cases.

Note that commit 3f04d98e97 ("perf intel-pt: Improve sample
timestamp") was also a stable fix and appears, for example, in v4.4
stable tree as commit a4ebb58fd1 ("perf intel-pt: Improve sample
timestamp").

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 3f04d98e97 ("perf intel-pt: Improve sample timestamp")
Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:23 -03:00
Adrian Hunter 7ba8fa20e2 perf intel-pt: Fix instructions sampling rate
The timestamp used to determine if an instruction sample is made, is an
estimate based on the number of instructions since the last known
timestamp. A consequence is that it might go backwards, which results in
extra samples. Change it so that a sample is only made when the
timestamp goes forwards.

Note this does not affect a sampling period of 0 or sampling periods
specified as a count of instructions.

Example:

 Before:

 $ perf script --itrace=i10us
 ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:         10 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:          8 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:          6 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:          4 instructions:u:      7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222728:      16423 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222734:      12731 instructions:u:      7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so)
 ...

 After:
 $ perf script --itrace=i10us
 ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
 ls 13812 [003] 2167315.222728:      16479 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
 ...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: f4aa081949 ("perf tools: Add Intel PT decoder")
Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:23 -03:00
Kan Liang 6466ec14aa perf regs x86: Add X86 specific arch__intr_reg_mask()
XMM registers can be collected on Icelake and later platforms.

Add specific arch__intr_reg_mask(), which creating an event to check if
the kernel and hardware can collect XMM registers.

Test on Skylake which doesn't support XMM registers collection. There is
nothing changed.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
   R10 R11 R12 R13 R14 R15

   Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xff0fff

Test on Icelake which support XMM registers collection.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
   R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
   XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff

Committer notes:

Don't set attr.sample_period as a named struct init, as it is part of an
unnamed union in 'struct perf_event_attr', and doing so breaks the build
on older gcc versions, such as:

  gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
  gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)

  arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
  arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
  cc1: warnings being treated as errors
  arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
  arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>')

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
[ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:23 -03:00
Kan Liang af785e75bf perf parse-regs: Add generic support for arch__intr/user_reg_mask()
There may be different register mask for use with intr or user on some
platforms, e.g. Icelake.

Add weak functions arch__intr_reg_mask() and arch__user_reg_mask() to
return intr and user register mask respectively.

Check mask before printing or comparing the register name.

Generic code always return PERF_REGS_MASK. No functional change.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1557865174-56264-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:12 -03:00
Linus Torvalds 83f3ef3de6 libnvdimm fixes 5.2-rc1
* Fix a long standing namespace label corruption scenario when
   re-provisioning capacity for a namespace.
 
 * Restore the ability of the dax_pmem module to be built-in.
 
 * Harden the build for the 'nfit_test' unit test modules so that the
   userspace test harness can ensure all required test modules are
   available.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc3KYfAAoJEB7SkWpmfYgCGtUP/1eZkLk6X2xYYZw6mMKbaVUm
 F4f7uOhpFFNonor0EhZVgTXqLEjFE9ux3+kZi0EZkvJhOSWAftICo1mzqLBnDxSW
 QiNMOcFeM16Df3a6kwMbrcsjVRMyI63E4qH2puaPcW4sSRVhrMzKcklx+iWtubtk
 q/bXx4B4n78E509FMagF3Irt6iWh235YqAXapbh4jSLGs8BJ0LRE8WVnMridCYcD
 MV4QEYHWwHh0SlQ7HM/jdGtCwJyaFiHK+G6CDqUqTR7NKFpkpAwKDT2UULkdpzSo
 1YUJQfUcdpwp7GUaTvGWL8BDyVklh+pLQtp+lepQfpPbSLPMC6qRC+hZXAuxlX7h
 Dj94P9DYZWJk9b0Z4NaqJCjADi/iKIdCHa9dIOPb4XmbXgTLnS15HsG0asBeoTuF
 UfDNdOLo8Tz+3dwNvmJ9Avb8ShqYLkwiOfkOeBDGu/+OWTiUrbghbAhhM4iOd8ey
 cFYc5MWD0HA4F0f4jw0o0fKQ3qGfhqEabdN1Z54nZ53y+t9MFx2fTAXAq26f+oly
 HM3ehus7EiNUS9gjMC55AAPTt5S/S4nu+YiMQUwlRfj2ErkYvrRsQAl7x4QjEWdu
 RSfrGCMb37OaMnzFGw49GGJsZqPUeT1O2anxVDVTM+RvMCi6fJY75XRJVMs2A6CG
 WNVEIGIQQbHwEyidOAPC
 =QM33
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "Just a small collection of fixes this time around.

  The new virtio-pmem driver is nearly ready, but some last minute
  device-mapper acks and virtio questions made it prudent to await v5.3.

  Other major topics that were brewing on the linux-nvdimm mailing list
  like sub-section hotplug, and other devm_memremap_pages() reworks will
  go upstream through Andrew's tree.

  Summary:

   - Fix a long standing namespace label corruption scenario when
     re-provisioning capacity for a namespace.

   - Restore the ability of the dax_pmem module to be built-in.

   - Harden the build for the 'nfit_test' unit test modules so that the
     userspace test harness can ensure all required test modules are
     available"

* tag 'libnvdimm-fixes-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  drivers/dax: Allow to include DEV_DAX_PMEM as builtin
  libnvdimm/namespace: Fix label tracking error
  tools/testing/nvdimm: add watermarks for dax_pmem* modules
  dax/pmem: Fix whitespace in dax_pmem
2019-05-15 18:56:50 -07:00
David S. Miller c7d5ec26ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-05-16

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a use after free in __dev_map_entry_free(), from Eric.

2) Several sockmap related bug fixes: a splat in strparser if
   it was never initialized, remove duplicate ingress msg list
   purging which can race, fix msg->sg.size accounting upon
   skb to msg conversion, and last but not least fix a timeout
   bug in tcp_bpf_wait_data(), from John.

3) Fix LRU map to avoid messing with eviction heuristics upon
   syscall lookup, e.g. map walks from user space side will
   then lead to eviction of just recently created entries on
   updates as it would mark all map entries, from Daniel.

4) Don't bail out when libbpf feature probing fails. Also
   various smaller fixes to flow_dissector test, from Stanislav.

5) Fix missing brackets for BTF_INT_OFFSET() in UAPI, from Gary.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-15 18:28:44 -07:00
Linus Torvalds b06ed1e7a2 Updates to ktest.pl
- Handle meta data in GRUB_MENU
 
  - Add variable to cusomize what return value the reboot code should return.
 
  - Add support for grub2bls boot loader
 
  - Show name and test iteration number in error message sent in mail
 
  - Minor fixes and clean ups
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXNxRZxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qrVCAP4rgnMgMsPOt0cMtKty1Z3uA6njfrZc
 UU1gNeHEvKr1MQEAhYy4N5FCigBygALEczmUIYwrzVq3luNPTwgUeUIH3AY=
 =BTC7
 -----END PGP SIGNATURE-----

Merge tag 'ktest-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest

Pull more ktest updates from Steven Rostedt:

 - Add support for grub2bls boot loader

 - Show name and test iteration number in error message sent in mail

 - Minor fixes and clean ups

* tag 'ktest-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest: update sample.conf for grub2bls
  ktest: remove get_grub2_index
  ktest: pass KERNEL_VERSION to POST_KTEST
  ktest: introduce grub2bls REBOOT_TYPE option
  ktest: cleanup get_grub_index
  ktest: introduce _get_grub_index
2019-05-15 16:46:32 -07:00