linux-stable/tools/perf
Arnaldo Carvalho de Melo 980b68ec06 perf annotate: Use absolute addresses to calculate jump target offsets
These types of jumps were confusing the annotate browser:

entry_SYSCALL_64  /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux

entry_SYSCALL_64  /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
  Percent│ffffffff81a00020:   swapgs
  <SNIP>
         │ffffffff81a00128: ↓ jae    ffffffff81a00139 <syscall_return_via_sysret+0x53>
  <SNIP>
         │ffffffff81a00155: → jmpq   *0x825d2d(%rip)   # ffffffff82225e88 <pv_cpu_ops+0xe8>

I.e. the syscall_return_via_sysret function is actually "inside" the
entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53)
are relative to syscall_return_via_sysret, not to syscall_return_via_sysret.

Or this may be some artifact in how the assembler marks the start and
end of a function and how this ends up in the ELF symtab for vmlinux,
i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but
just right after it.

From readelf -sw vmlinux:

 80267: ffffffff81a00020   315 NOTYPE  GLOBAL DEFAULT    1 entry_SYSCALL_64
   316: ffffffff81a000e6     0 NOTYPE  LOCAL  DEFAULT    1 syscall_return_via_sysret

 0xffffffff81a00020 + 315 > 0xffffffff81a000e6

So instead of looking for offsets after that last '+' sign, calculate
offsets for jump target addresses that are inside the function being
disassembled from the absolute address, 0xffffffff81a00139 in this case,
subtracting from it the objdump address for the start of the function
being disassembled, entry_SYSCALL_64() in this case.

So, before this patch:

entry_SYSCALL_64  /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
Percent│       pop    %r10
       │       pop    %r9
       │       pop    %r8
       │       pop    %rax
       │       pop    %rsi
       │       pop    %rdx
       │       pop    %rsi
       │       mov    %rsp,%rdi
       │       mov    %gs:0x5004,%rsp
       │       pushq  0x28(%rdi)
       │       pushq  (%rdi)
       │       push   %rax
       │     ↑ jmp    6c
       │       mov    %cr3,%rdi
       │     ↑ jmp    62
       │       mov    %rdi,%rax
       │       and    $0x7ff,%rdi
       │       bt     %rdi,%gs:0x2219a
       │     ↑ jae    53
       │       btr    %rdi,%gs:0x2219a
       │       mov    %rax,%rdi
       │     ↑ jmp    5b

After:

entry_SYSCALL_64  /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
  0.65 │     → jne    swapgs_restore_regs_and_return_to_usermode
       │       pop    %r10
       │       pop    %r9
       │       pop    %r8
       │       pop    %rax
       │       pop    %rsi
       │       pop    %rdx
       │       pop    %rsi
       │       mov    %rsp,%rdi
       │       mov    %gs:0x5004,%rsp
       │       pushq  0x28(%rdi)
       │       pushq  (%rdi)
       │       push   %rax
       │     ↓ jmp    132
       │       mov    %cr3,%rdi
       │    ┌──jmp    128
       │    │  mov    %rdi,%rax
       │    │  and    $0x7ff,%rdi
       │    │  bt     %rdi,%gs:0x2219a
       │    │↓ jae    119
       │    │  btr    %rdi,%gs:0x2219a
       │    │  mov    %rax,%rdi
       │    │↓ jmp    121
       │119:│  mov    %rax,%rdi
       │    │  bts    $0x3f,%rdi
       │121:│  or     $0x800,%rdi
       │128:└─→or     $0x1000,%rdi
       │       mov    %rdi,%cr3
       │132:   pop    %rax
       │       pop    %rdi
       │       pop    %rsp
       │     → jmpq   *0x825d2d(%rip)        # ffffffff82225e88 <pv_cpu_ops+0xe8>

With those at least navigating to the right destination, an improvement
for these cases seems to be to be to somehow mark those inner functions,
which in this case could be:

entry_SYSCALL_64  /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
       │syscall_return_via_sysret:
       │       pop    %r15
       │       pop    %r14
       │       pop    %r13
       │       pop    %r12
       │       pop    %rbp
       │       pop    %rbx
       │       pop    %rsi
       │       pop    %r10
       │       pop    %r9
       │       pop    %r8
       │       pop    %rax
       │       pop    %rsi
       │       pop    %rdx
       │       pop    %rsi
       │       mov    %rsp,%rdi
       │       mov    %gs:0x5004,%rsp
       │       pushq  0x28(%rdi)
       │       pushq  (%rdi)
       │       push   %rax
       │     ↓ jmp    132
       │       mov    %cr3,%rdi
       │    ┌──jmp    128
       │    │  mov    %rdi,%rax
       │    │  and    $0x7ff,%rdi
       │    │  bt     %rdi,%gs:0x2219a
       │    │↓ jae    119
       │    │  btr    %rdi,%gs:0x2219a
       │    │  mov    %rax,%rdi
       │    │↓ jmp    121
       │119:│  mov    %rax,%rdi
       │    │  bts    $0x3f,%rdi
       │121:│  or     $0x800,%rdi
       │128:└─→or     $0x1000,%rdi
       │       mov    %rdi,%cr3
       │132:   pop    %rax
       │       pop    %rdi
       │       pop    %rsp
       │     → jmpq   *0x825d2d(%rip)        # ffffffff82225e88 <pv_cpu_ops+0xe8>

This all gets much better viewed if one uses 'perf report --ignore-vmlinux'
forcing the usage of /proc/kcore + /proc/kallsyms, when the above
actually gets down to:

  # perf report --ignore-vmlinux
  ## do '/64', will show the function names containing '64',
  ## navigate to /entry_SYSCALL_64_after_hwframe.annotation,
  ## press 'A' to annotate, then 'P' to print that annotation
  ## to a file
  ## From another xterm (or see on screen, this 'P' thing is for
  ## getting rid of those right side scroll bars/spaces):
  # cat /entry_SYSCALL_64_after_hwframe.annotation
  entry_SYSCALL_64_after_hwframe() /proc/kcore
  Event: cycles:ppp

  Percent
              Disassembly of section load0:

              ffffffff9aa00044 <load0>:
   11.97        push   %rax
    4.85        push   %rdi
                push   %rsi
    2.59        push   %rdx
    2.27        push   %rcx
    0.32        pushq  $0xffffffffffffffda
    1.29        push   %r8
                xor    %r8d,%r8d
    1.62        push   %r9
    0.65        xor    %r9d,%r9d
    1.62        push   %r10
                xor    %r10d,%r10d
    5.50        push   %r11
                xor    %r11d,%r11d
    3.56        push   %rbx
                xor    %ebx,%ebx
    4.21        push   %rbp
                xor    %ebp,%ebp
    2.59        push   %r12
    0.97        xor    %r12d,%r12d
    3.24        push   %r13
                xor    %r13d,%r13d
    2.27        push   %r14
                xor    %r14d,%r14d
    4.21        push   %r15
                xor    %r15d,%r15d
    0.97        mov    %rsp,%rdi
    5.50      → callq  do_syscall_64
   14.56        mov    0x58(%rsp),%rcx
    7.44        mov    0x80(%rsp),%r11
    0.32        cmp    %rcx,%r11
              → jne    swapgs_restore_regs_and_return_to_usermode
    0.32        shl    $0x10,%rcx
    0.32        sar    $0x10,%rcx
    3.24        cmp    %rcx,%r11
              → jne    swapgs_restore_regs_and_return_to_usermode
    2.27        cmpq   $0x33,0x88(%rsp)
    1.29      → jne    swapgs_restore_regs_and_return_to_usermode
                mov    0x30(%rsp),%r11
    8.74        cmp    %r11,0x90(%rsp)
              → jne    swapgs_restore_regs_and_return_to_usermode
    0.32        test   $0x10100,%r11
              → jne    swapgs_restore_regs_and_return_to_usermode
    0.32        cmpq   $0x2b,0xa0(%rsp)
    0.65      → jne    swapgs_restore_regs_and_return_to_usermode

I.e. using kallsyms makes the function start/end be done differently
than using what is in the vmlinux ELF symtab and actually the hits
goes to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the
start of entry_SYSCALL_64:

  ENTRY(entry_SYSCALL_64)
          UNWIND_HINT_EMPTY
  <SNIP>
          pushq   $__USER_CS                      /* pt_regs->cs */
          pushq   %rcx                            /* pt_regs->ip */
  GLOBAL(entry_SYSCALL_64_after_hwframe)
          pushq   %rax                            /* pt_regs->orig_ax */

          PUSH_AND_CLEAR_REGS rax=$-ENOSYS

And it goes and ends at:

          cmpq    $__USER_DS, SS(%rsp)            /* SS must match SYSRET */
          jne     swapgs_restore_regs_and_return_to_usermode

          /*
           * We win! This label is here just for ease of understanding
           * perf profiles. Nothing jumps here.
           */
  syscall_return_via_sysret:
          /* rcx and r11 are already restored (see code above) */
          UNWIND_HINT_EMPTY
          POP_REGS pop_rdi=0 skip_r11rcx=1

So perhaps some people should really just play with '--ignore-vmlinux'
to force /proc/kcore + kallsyms.

One idea is to do both, i.e. have a vmlinux annotation and a
kcore+kallsyms one, when possible, and even show the patched location,
etc.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23 16:46:53 -03:00
..
arch perf annotate: Pass function descriptor to its instruction parsing routines 2018-03-21 16:19:41 -03:00
bench perf perf: Remove duplicate includes 2017-12-27 12:15:49 -03:00
Documentation perf report: Introduce --ignore-vmlinux command line option 2018-03-21 12:53:42 -03:00
jvmti perf jvmti: Generate correct debug information for inlined code 2017-12-18 11:54:08 -03:00
pmu-events perf vendor events: Update POWER9 events 2018-03-16 13:57:08 -03:00
python perf python: Make twatch.py work with both python2 and python3 2018-02-19 12:28:08 -03:00
scripts perf tools: Add Python 3 support 2018-02-19 12:28:23 -03:00
tests perf tests bp_account: Fix build with clang-6 2018-03-19 13:51:54 -03:00
trace perf trace beauty flock: Move to separate object file 2018-01-25 06:37:31 -03:00
ui perf annotate: Support jumping from one function to another 2018-03-23 16:46:18 -03:00
util perf annotate: Use absolute addresses to calculate jump target offsets 2018-03-23 16:46:53 -03:00
.gitignore perf tools: Add trace/beauty/generated/ into .gitignore 2018-02-05 13:58:02 -03:00
Build perf trace: Remove audit-libs dependency if syscall tables are present 2018-01-23 09:51:38 -03:00
builtin-annotate.c perf annotate: Introduce --ignore-vmlinux command line option 2018-03-21 12:53:42 -03:00
builtin-bench.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
builtin-buildid-cache.c perf buildid-cache: Update help text for purge command 2017-11-16 14:49:54 -03:00
builtin-buildid-list.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-c2c.c perf c2c report: Add cacheline address count column 2018-03-16 13:53:38 -03:00
builtin-config.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-data.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
builtin-diff.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-evlist.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-ftrace.c perf ftrace: Append an EOL when write tracing files 2018-02-19 09:49:12 -03:00
builtin-help.c perf trace: Remove audit-libs dependency if syscall tables are present 2018-01-23 09:51:38 -03:00
builtin-inject.c perf tools: Get rid of unused 'swapped' parameter from perf_event__synthesize_sample() 2018-01-18 09:01:23 -03:00
builtin-kallsyms.c
builtin-kmem.c mm: remove __GFP_COLD 2017-11-15 18:21:06 -08:00
builtin-kvm.c perf mmap: Simplify perf_mmap__read_init() 2018-03-08 10:05:53 -03:00
builtin-list.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-lock.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-mem.c Merge branch 'linus' into perf/core, to fix conflicts 2017-11-07 10:30:18 +01:00
builtin-probe.c perf buildid-cache: Support binary objects from other namespaces 2017-07-18 23:14:11 -03:00
builtin-record.c perf record: Synthesize features before events in pipe mode 2018-03-16 13:56:50 -03:00
builtin-report.c perf report: Introduce --ignore-vmlinux command line option 2018-03-21 12:53:42 -03:00
builtin-sched.c perf sched map: Re-annotate shortname if thread comm changed 2018-03-07 10:22:26 -03:00
builtin-script.c perf tools: Fix snprint warnings for gcc 8 2018-03-19 10:00:43 -03:00
builtin-stat.c perf stat: Fix core dump when flag T is used 2018-03-16 13:55:29 -03:00
builtin-timechart.c perf tools: Add struct perf_data_file 2017-10-30 13:37:37 -03:00
builtin-top.c perf annotate: Move the default annotate options to the library 2018-03-21 12:53:40 -03:00
builtin-trace.c perf mmap: Simplify perf_mmap__read_init() 2018-03-08 10:05:53 -03:00
builtin-version.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
builtin.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
check-headers.sh tools include powerpc: Grab a copy of arch/powerpc/include/uapi/asm/unistd.h 2018-02-16 14:55:47 -03:00
command-list.txt
CREDITS
design.txt
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Makefile.config perf tools arm64: Add libdw DWARF post unwind support for ARM64 2018-03-16 13:53:46 -03:00
Makefile.perf perf tools: Update tags with .cpp files 2018-03-08 11:30:47 -03:00
MANIFEST perf tools: Get all of tools/{arch,include}/ in the MANIFEST 2017-09-25 10:39:43 -03:00
perf-archive.sh License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
perf-completion.sh perf tools: Auto-complete for events with ':' 2017-12-27 12:16:00 -03:00
perf-read-vdso.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
perf-sys.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
perf-with-kcore.sh
perf.c perf trace: Remove audit-libs dependency if syscall tables are present 2018-01-23 09:51:38 -03:00
perf.h perf record: Fix crash in pipe mode 2018-03-05 11:52:41 -03:00