Commit graph

4802 commits

Author SHA1 Message Date
Linus Torvalds
05de45383b Fix stack trace entry size to stop showing garbage
The macro that creates both the structure and the format displayed
 to user space for the stack trace event was changed a while ago
 to fix the parsing by user space tooling. But this change also modified
 the structure used to store the stack trace event. It changed the
 caller array field from [0] to [8]. Even though the size in the ring
 buffer is dynamic and can be something other than 8 (user space knows
 how to handle this), the 8 extra words was not accounted for when
 reserving the event on the ring buffer, and added 8 more entries, due
 to the calculation of "sizeof(*entry) + nr_entries * sizeof(long)",
 as the sizeof(*entry) now contains 8 entries. The size of the caller
 field needs to be subtracted from the size of the entry to create
 the correct allocation size.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYGccURQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qiboAPwNM1q8A7EFLDGfj+3tXksvp4H3hXd3
 ErMd2OMlsNQtRAD9GGmYyt2OtFdxZWzKOSEC07vdxq2TYTz50mqJM81YbgE=
 =7hwx
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix stack trace entry size to stop showing garbage

  The macro that creates both the structure and the format displayed to
  user space for the stack trace event was changed a while ago to fix
  the parsing by user space tooling. But this change also modified the
  structure used to store the stack trace event. It changed the caller
  array field from [0] to [8].

  Even though the size in the ring buffer is dynamic and can be
  something other than 8 (user space knows how to handle this), the 8
  extra words was not accounted for when reserving the event on the ring
  buffer, and added 8 more entries, due to the calculation of
  "sizeof(*entry) + nr_entries * sizeof(long)", as the sizeof(*entry)
  now contains 8 entries.

  The size of the caller field needs to be subtracted from the size of
  the entry to create the correct allocation size"

* tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix stack trace event size
2021-04-02 08:39:00 -07:00
Steven Rostedt (VMware)
9deb193af6 tracing: Fix stack trace event size
Commit cbc3b92ce0 fixed an issue to modify the macros of the stack trace
event so that user space could parse it properly. Originally the stack
trace format to user space showed that the called stack was a dynamic
array. But it is not actually a dynamic array, in the way that other
dynamic event arrays worked, and this broke user space parsing for it. The
update was to make the array look to have 8 entries in it. Helper
functions were added to make it parse it correctly, as the stack was
dynamic, but was determined by the size of the event stored.

Although this fixed user space on how it read the event, it changed the
internal structure used for the stack trace event. It changed the array
size from [0] to [8] (added 8 entries). This increased the size of the
stack trace event by 8 words. The size reserved on the ring buffer was the
size of the stack trace event plus the number of stack entries found in
the stack trace. That commit caused the amount to be 8 more than what was
needed because it did not expect the caller field to have any size. This
produced 8 entries of garbage (and reading random data) from the stack
trace event:

          <idle>-0       [002] d... 1976396.837549: <stack trace>
 => trace_event_raw_event_sched_switch
 => __traceiter_sched_switch
 => __schedule
 => schedule_idle
 => do_idle
 => cpu_startup_entry
 => secondary_startup_64_no_verify
 => 0xc8c5e150ffff93de
 => 0xffff93de
 => 0
 => 0
 => 0xc8c5e17800000000
 => 0x1f30affff93de
 => 0x00000004
 => 0x200000000

Instead, subtract the size of the caller field from the size of the event
to make sure that only the amount needed to store the stack trace is
reserved.

Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/

Cc: stable@vger.kernel.org
Fixes: cbc3b92ce0 ("tracing: Set kernel_stack's caller size properly")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01 14:06:33 -04:00
Linus Torvalds
d19cc4bfbf Add check of order < 0 before calling free_pages()
The function addresses that are traced by ftrace are stored in pages,
 and the size is held in a variable. If there's some error in creating
 them, the allocate ones will be freed. In this case, it is possible that
 the order of pages to be freed may end up being negative due to a size of
 zero passed to get_count_order(), and then that negative number will cause
 free_pages() to free a very large section. Make sure that does not happen.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYGR30BQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnbDAP9yEhTLcDRUi3VLWnEq19Dt4Lsg86Bf
 QRpbWG6Ze9EbZQEAgYAOe1fsNCNEIMXXh/4nlKVpKKH+vviS0ux9Z6uhpQQ=
 =Veyq
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fix from Steven Rostedt:
 "Add check of order < 0 before calling free_pages()

  The function addresses that are traced by ftrace are stored in pages,
  and the size is held in a variable. If there's some error in creating
  them, the allocate ones will be freed. In this case, it is possible
  that the order of pages to be freed may end up being negative due to a
  size of zero passed to get_count_order(), and then that negative
  number will cause free_pages() to free a very large section.

  Make sure that does not happen"

* tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Check if pages were allocated before calling free_pages()
2021-03-31 10:14:55 -07:00
Steven Rostedt (VMware)
59300b36f8 ftrace: Check if pages were allocated before calling free_pages()
It is possible that on error pg->size can be zero when getting its order,
which would return a -1 value. It is dangerous to pass in an order of -1
to free_pages(). Check if order is greater than or equal to zero before
calling free_pages().

Link: https://lore.kernel.org/lkml/20210330093916.432697c7@gandalf.local.home/

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-30 09:58:38 -04:00
Alexei Starovoitov
8a141dd7f7 ftrace: Fix modify_ftrace_direct.
The following sequence of commands:
  register_ftrace_direct(ip, addr1);
  modify_ftrace_direct(ip, addr1, addr2);
  unregister_ftrace_direct(ip, addr2);
will cause the kernel to warn:
[   30.179191] WARNING: CPU: 2 PID: 1961 at kernel/trace/ftrace.c:5223 unregister_ftrace_direct+0x130/0x150
[   30.180556] CPU: 2 PID: 1961 Comm: test_progs    W  O      5.12.0-rc2-00378-g86bc10a0a711-dirty #3246
[   30.182453] RIP: 0010:unregister_ftrace_direct+0x130/0x150

When modify_ftrace_direct() changes the addr from old to new it should update
the addr stored in ftrace_direct_funcs. Otherwise the final
unregister_ftrace_direct() won't find the address and will cause the splat.

Fixes: 0567d68091 ("ftrace: Add modify_ftrace_direct()")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/bpf/20210316195815.34714-1-alexei.starovoitov@gmail.com
2021-03-17 00:43:12 +01:00
Steven Rostedt (VMware)
ee666a1855 tracing: Skip selftests if tracing is disabled
If tracing is disabled for some reason (traceoff_on_warning, command line,
etc), the ftrace selftests are guaranteed to fail, as their results are
defined by trace data in the ring buffers. If the ring buffers are turned
off, the tests will fail, due to lack of data.

Because tracing being disabled is for a specific reason (warning, user
decided to, etc), it does not make sense to enable tracing to run the self
tests, as the test output may corrupt the reason for the tracing to be
disabled.

Instead, simply skip the self tests and report that they are being skipped
due to tracing being disabled.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:51:25 -05:00
Vamshi K Sthambamkadi
f40fc799af tracing: Fix memory leak in __create_synth_event()
kmemleak report:
unreferenced object 0xc5a6f708 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
    00 c1 3d 60 14 83 1f 8a                          ..=`....
  backtrace:
    [<f0aa4ac4>] __kmalloc_track_caller+0x2a6/0x460
    [<7d3d60a6>] kstrndup+0x37/0x70
    [<45a0e739>] argv_split+0x1c/0x120
    [<c17982f8>] __create_synth_event+0x192/0xb00
    [<0708b8a3>] create_synth_event+0xbb/0x150
    [<3d1941e1>] create_dyn_event+0x5c/0xb0
    [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
    [<04deb2ef>] dyn_event_write+0x10/0x20
    [<8779ac95>] vfs_write+0xa9/0x3c0
    [<ed93722a>] ksys_write+0x89/0xc0
    [<b9ca0507>] __ia32_sys_write+0x15/0x20
    [<7ce02d85>] __do_fast_syscall_32+0x45/0x80
    [<cb0ecb35>] do_fast_syscall_32+0x29/0x60
    [<2467454a>] do_SYSENTER_32+0x15/0x20
    [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc
unreferenced object 0xc5a6f078 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
    08 f7 a6 c5 00 00 00 00                          ........
  backtrace:
    [<bbac096a>] __kmalloc+0x2b6/0x470
    [<aa2624b4>] argv_split+0x82/0x120
    [<c17982f8>] __create_synth_event+0x192/0xb00
    [<0708b8a3>] create_synth_event+0xbb/0x150
    [<3d1941e1>] create_dyn_event+0x5c/0xb0
    [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
    [<04deb2ef>] dyn_event_write+0x10/0x20
    [<8779ac95>] vfs_write+0xa9/0x3c0
    [<ed93722a>] ksys_write+0x89/0xc0
    [<b9ca0507>] __ia32_sys_write+0x15/0x20
    [<7ce02d85>] __do_fast_syscall_32+0x45/0x80
    [<cb0ecb35>] do_fast_syscall_32+0x29/0x60
    [<2467454a>] do_SYSENTER_32+0x15/0x20
    [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc

In __create_synth_event(), while iterating field/type arguments, the
argv_split() will return array of atleast 2 elements even when zero
arguments(argc=0) are passed. for e.g. when there is double delimiter
or string ends with delimiter

To fix call argv_free() even when argc=0.

Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos

Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:57 -05:00
Steven Rostedt (VMware)
6549de1fe3 ring-buffer: Add a little more information and a WARN when time stamp going backwards is detected
When the CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is enabled, and the time
stamps are detected as not being valid, it reports information about the
write stamp, but does not show the before_stamp which is still useful
information. Also, it should give a warning once, such that tests detect
this happening.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Steven Rostedt (VMware)
6f6be606e7 ring-buffer: Force before_stamp and write_stamp to be different on discard
Part of the logic of the new time stamp code depends on the before_stamp and
the write_stamp to be different if the write_stamp does not match the last
event on the buffer, as it will be used to calculate the delta of the next
event written on the buffer.

The discard logic depends on this, as the next event to come in needs to
inject a full timestamp as it can not rely on the last event timestamp in
the buffer because it is unknown due to events after it being discarded. But
by changing the write_stamp back to the time before it, it forces the next
event to use a full time stamp, instead of relying on it.

The issue came when a full time stamp was used for the event, and
rb_time_delta() returns zero in that case. The update to the write_stamp
(which subtracts delta) made it not change. Then when the event is removed
from the buffer, because the before_stamp and write_stamp still match, the
next event written would calculate its delta from the write_stamp, but that
would be wrong as the write_stamp is of the time of the event that was
discarded.

In the case that the delta change being made to write_stamp is zero, set the
before_stamp to zero as well, and this will force the next event to inject a
full timestamp and not use the current write_stamp.

Cc: stable@vger.kernel.org
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Rolf Eike Beer
69268094a1 tracing: Fix help text of TRACEPOINT_BENCHMARK in Kconfig
It's "cond_resched()" not "cond_sched()".

Link: https://lkml.kernel.org/r/1863065.aFVDpXsuPd@devpool47

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Yordan Karadzhov (VMware)
70d443d846 tracing: Remove duplicate declaration from trace.h
A declaration of function "int trace_empty(struct trace_iterator *iter)"
shows up twice in the header file kernel/trace/trace.h

Link: https://lkml.kernel.org/r/20210304092348.208033-1-y.karadz@gmail.com

Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:44:47 -05:00
Linus Torvalds
3ab6608e66 block-5.12-2021-02-27
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmA6njIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgprolD/9zWti9LsZvA7yE+PhVwrwF3CsNzLfQlClw
 99HaA7HxtAc/VLJrnD/SubhCAPdBC5B2xPv6faajdwF2iUR3Rr1Uc93CQ3uP2KKq
 kvm6ALTpzPTMI6YSABhY74sg9BkkoDbMo54JQYVQPleiE+5eDLbuFZck6ObfUHyY
 a4aaImlndWp/t14GzrClL4hucF+5KJy846P+QCVclkh0yl8xSsqZ5LIFU7tu3iQb
 HpZ5HKLT/2ma/EOr3wknnsIe97AUZQU0q5aMparhYlm+qR511eop3QXx850FL/oC
 tEGceKLij6qazmkiocKVzML8Fs+Y9/a4vCMjLCScWJmzDlmKdlH2uudeahN6b9Hm
 15qRQHOjl1Hc2bdr5ZVn87nq9RWhSm18C+SRMwOKHCOnEhwxqM3RjRfAgj4BJ6QB
 PFbFqdY+8Y1YLPFmn9hph72ePaEcN4L2IXW6TI/WX8mot8ODAnkq9Hr38dKwzO+i
 0mon6DVyJKKho6XwvVu5IYurkR2beQprjeVUxwZjjT6DxUgsc+J6itK5LDHFSkeZ
 qZlXn5Di8MkiXg0DFJYDQiFXnO0Z5GlRWOGPVfBaOr3x+1dqzDdHGw4oz1oGqvnr
 GNNYCsYIpDGm7eauX5lqL5MUFpjqRCceXy5JSHPhnWWw617nYkr4H9jdsV9HiTX1
 tQFx05QW3w==
 =ccMs
 -----END PGP SIGNATURE-----

Merge tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
 "A few stragglers (and one due to me missing it originally), and fixes
  for changes in this merge window mostly. In particular:

   - blktrace cleanups (Chaitanya, Greg)

   - Kill dead blk_pm_* functions (Bart)

   - Fixes for the bio alloc changes (Christoph)

   - Fix for the partition changes (Christoph, Ming)

   - Fix for turning off iopoll with polled IO inflight (Jeffle)

   - nbd disconnect fix (Josef)

   - loop fsync error fix (Mauricio)

   - kyber update depth fix (Yang)

   - max_sectors alignment fix (Mikulas)

   - Add bio_max_segs helper (Matthew)"

* tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
  block: Add bio_max_segs
  blktrace: fix documentation for blk_fill_rw()
  block: memory allocations in bounce_clone_bio must not fail
  block: remove the gfp_mask argument to bounce_clone_bio
  block: fix bounce_clone_bio for passthrough bios
  block-crypto-fallback: use a bio_set for splitting bios
  block: fix logging on capacity change
  blk-settings: align max_sectors on "logical_block_size" boundary
  block: reopen the device in blkdev_reread_part
  block: don't skip empty device in in disk_uevent
  blktrace: remove debugfs file dentries from struct blk_trace
  nbd: handle device refs for DESTROY_ON_DISCONNECT properly
  kyber: introduce kyber_depth_updated()
  loop: fix I/O error on fsync() in detached loop devices
  block: fix potential IO hang when turning off io_poll
  block: get rid of the trace rq insert wrapper
  blktrace: fix blk_rq_merge documentation
  blktrace: fix blk_rq_issue documentation
  blktrace: add blk_fill_rwbs documentation comment
  block: remove superfluous param in blk_fill_rwbs()
  ...
2021-02-28 11:23:38 -08:00
Alexander Potapenko
9c0dee54eb tracing: add error_report_end trace point
Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3.

This patchset adds a tracepoint, error_repor_end, that is to be used by
KFENCE, KASAN, and potentially other bug detection tools, when they print
an error report.  One of the possible use cases is userspace collection of
kernel error reports: interested parties can subscribe to the tracing
event via tracefs, and get notified when an error report occurs.

This patch (of 3):

Introduce error_report_end tracepoint.  It can be used in debugging tools
like KASAN, KFENCE, etc.  to provide extensions to the error reporting
mechanisms (e.g.  allow tests hook into error reporting, ease error report
collection from production kernels).  Another benefit would be making use
of ftrace for debugging or benchmarking the tools themselves.

Should we need it, the tracepoint name leaves us with the possibility to
introduce a complementary error_report_start tracepoint in the future.

Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com
Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:02 -08:00
Chaitanya Kulkarni
94d4bffdda blktrace: fix documentation for blk_fill_rw()
Add missing ":" after rwbs function parameter documentation that fixes
following warning :-

./kernel/trace/blktrace.c:1877: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 1f83bb4b49 ("blktrace: add blk_fill_rwbs documentation comment")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-24 08:55:30 -07:00
Linus Torvalds
414eece95b clang-lto for v5.12-rc1 (part2)
- Generate __mcount_loc in objtool (Peter Zijlstra)
 - Support running objtool against vmlinux.o (Sami Tolvanen)
 - Clang LTO enablement for x86 (Sami Tolvanen)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA1fn8ACgkQiXL039xt
 wCbswQ//Zmnq912Ubyn5uPe9SOS/kumGDoqtxGzlZwo/pSB3qFArhD6G07sJ49XD
 nu/05ZcOda760wubnhcuK91n2fY5i/eGLXMSjfgtdVcco4Q67nPQydc+LGdhuDco
 FlhL8TAIwqYN1f2nJK1IggZpZFxz5r/r1Pq8q1S0oQRqDenxDBQwNtBba4B1OIxw
 /FE/1Hp3xwRnuJEP2jREBeY1yQ+Y1n859pZcDgSOWlTArcp8EVUi5hIWJ9DwIe73
 mqnx6PcFWEYB0zLNZmZz2gpEac+ncGyme6ChayeuQfInbL5dhx97jFGt3S6/+NSY
 mF2zyaR/+JsGGuM8dVqH3izKCJXCEAGirrdMO1ndb9HdwS3KnYEiag2ciNWL0wm3
 UEM4r0i2B14sU3pkyotKgsJdOSgorMKkQUPb2wW+OUfnkZNEWKLqylMgNXBD80l4
 WG5vYQRwwFN9jRBik6Z5YFGnwGsNIoGg1F1GRNMjh6h51adYQeBN/1QJE1FJ5L4D
 iKzmZYqimKUINXWfI6TNyqiv9TctOt65pxnRyq+MHxfTDzHGyc3MUeCeCiR1a1yI
 S5QhcgfSnC/NjDA0+oYC6yRlcBtfhjtUqFTGoZ4q4q/LF1BVU1bPyIXZrROLc05s
 LNMMBcWbJetJxFtm/gYfiVFuNitYtxbBV1krVtsWznCA2nKGJ9w=
 =htKJ
 -----END PGP SIGNATURE-----

Merge tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more clang LTO updates from Kees Cook:
 "Clang LTO x86 enablement.

  Full disclosure: while this has _not_ been in linux-next (since it
  initially looked like the objtool dependencies weren't going to make
  v5.12), it has been under daily build and runtime testing by Sami for
  quite some time. These x86 portions have been discussed on lkml, with
  Peter, Josh, and others helping nail things down.

  The bulk of the changes are to get objtool working happily. The rest
  of the x86 enablement is very small.

  Summary:

   - Generate __mcount_loc in objtool (Peter Zijlstra)

   - Support running objtool against vmlinux.o (Sami Tolvanen)

   - Clang LTO enablement for x86 (Sami Tolvanen)"

Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/

* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: lto: force rebuilds when switching CONFIG_LTO
  x86, build: allow LTO to be selected
  x86, cpu: disable LTO for cpu.c
  x86, vdso: disable LTO only for vDSO
  kbuild: lto: postpone objtool
  objtool: Split noinstr validation from --vmlinux
  x86, build: use objtool mcount
  tracing: add support for objtool mcount
  objtool: Don't autodetect vmlinux.o
  objtool: Fix __mcount_loc generation with Clang's assembler
  objtool: Add a pass for generating __mcount_loc
2021-02-23 15:13:45 -08:00
Sami Tolvanen
22c8542d7b tracing: add support for objtool mcount
This change adds build support for using objtool to generate
__mcount_loc sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2021-02-23 12:46:57 -08:00
Linus Torvalds
21a6ab2131 Modules updates for v5.12
Summary of modules changes for the 5.12 merge window:
 
 - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export
   types were introduced between 2006 - 2008. All the of the unused symbols have
   been long removed and gpl future symbols were converted to gpl quite a long
   time ago, and I don't believe these export types have been used ever since.
   So, I think it should be safe to retire those export types now. (Christoph Hellwig)
 
 - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig)
 
 - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as
   it is the only caller (Christoph Hellwig)
 
 - Unexport find_module() and module_mutex and fix the last module
   callers to not rely on these anymore. Make module_mutex internal to
   the module loader. (Christoph Hellwig)
 
 - Harden ELF checks on module load and validate ELF structures before checking
   the module signature (Frank van der Linden)
 
 - Fix undefined symbol warning for clang (Fangrui Song)
 
 - Fix smatch warning (Dan Carpenter)
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmA0/KMQHGpleXVAa2Vy
 bmVsLm9yZwAKCRDARX44zjvBcu0uD/4nmRp18EKAtdUZivsZHat0aEWGlkmrVueY
 5huYw6iwM8b/wIAl3xwLki1Iv0/l0a83WXZhLG4ekl0/Nj8kgllA+jtBrZWpoLMH
 CZusN5dS9YwwyD2vu3ak83ARcehcDEPeA9thvc3uRFGis6Hi4bt1rkzGdrzsgqR4
 tybfN4qaQx4ZAKFxA8bnS58l7QTFwUzTxJfM6WWzl1Q+mLZDr/WP+loJ/f1/oFFg
 ufN31KrqqFpdQY5UKq5P4H8FVq/eXE1Mwl8vo3HsnAj598fznyPUmA3D/j+N4GuR
 sTGBVZ9CSehUj7uZRs+Qgg6Bd+y3o44N29BrdZWA6K3ieTeQQpA+VgPUNrDBjGhP
 J/9Y4ms4PnuNEWWRaa73m9qsVqAsjh9+T2xp9PYn9uWLCM8BvQFtWcY7tw4/nB0/
 INmyiP/tIRpwWkkBl47u1TPR09FzBBGDZjBiSn3lm3VX+zCYtHoma5jWyejG11cf
 ybDrTsci9ANyHNP2zFQsUOQJkph78PIal0i3k4ODqGJvaC0iEIH3Xjv+0dmE14rq
 kGRrG/HN6HhMZPjashudVUktyTZ63+PJpfFlQbcUzdvjQQIkzW0vrCHMWx9vD1xl
 Na7vZLl4Nb03WSJp6saY6j2YSRKL0poGETzGqrsUAHEhpEOPHduaiCVlAr/EmeMk
 p6SrWv8+UQ==
 =T29Q
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:

 - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These
   export types were introduced between 2006 - 2008. All the of the
   unused symbols have been long removed and gpl future symbols were
   converted to gpl quite a long time ago, and I don't believe these
   export types have been used ever since. So, I think it should be safe
   to retire those export types now (Christoph Hellwig)

 - Refactor and clean up some aged code cruft in the module loader
   (Christoph Hellwig)

 - Build {,module_}kallsyms_on_each_symbol only when livepatching is
   enabled, as it is the only caller (Christoph Hellwig)

 - Unexport find_module() and module_mutex and fix the last module
   callers to not rely on these anymore. Make module_mutex internal to
   the module loader (Christoph Hellwig)

 - Harden ELF checks on module load and validate ELF structures before
   checking the module signature (Frank van der Linden)

 - Fix undefined symbol warning for clang (Fangrui Song)

 - Fix smatch warning (Dan Carpenter)

* tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: potential uninitialized return in module_kallsyms_on_each_symbol()
  module: remove EXPORT_UNUSED_SYMBOL*
  module: remove EXPORT_SYMBOL_GPL_FUTURE
  module: move struct symsearch to module.c
  module: pass struct find_symbol_args to find_symbol
  module: merge each_symbol_section into find_symbol
  module: remove each_symbol_in_section
  module: mark module_mutex static
  kallsyms: only build {,module_}kallsyms_on_each_symbol when required
  kallsyms: refactor {,module_}kallsyms_on_each_symbol
  module: use RCU to synchronize find_module
  module: unexport find_module and module_mutex
  drm: remove drm_fb_helper_modinit
  powerpc/powernv: remove get_cxl_module
  module: harden ELF info handling
  module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
2021-02-23 10:15:33 -08:00
Linus Torvalds
79db4d2293 clang-lto series for v5.12-rc1
- Clang LTO build infrastructure and arm64-specific enablement (Sami Tolvanen)
 - Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA0OEYACgkQiXL039xt
 wCYGJw/8CcyvQUGmXYEZVDLMahKz93RYijiGuSTVnhl0pNAyfOojaZ8Z//eD1VNA
 s82azW1XybbA6RnPGD7YQzYz27cSF2qUFDmplwVfE4mwBnPXzRxtVBDLSxksP1HS
 77sCOu91QhbovPCWET4dSHLJB3DVc78FiW4lVlRgrglyAz+dut1iXYar5e7VNoS0
 S4MwnqwteHC6YXP619rubhpdDoj7njuw1uxRIaodt9S/zRSpl5MCUgHmzQusgezs
 yWDdPHPWHnF7xxKgwSvE7AKZPdOnIxKxRi6Yd6vUIyrYB3qLZkFe75nUsgMroAhs
 /Bgrn69U2McMiJsOdh0ERzP2VNYfvMacBQ308nb45j83Bgv5l6uj8QOZU4ZogmXV
 PsDzsfUe9GsxgYexfozGX61rpd6JinzQKVyoDW3oTT54fbBxO3uDqT8kOBw72dPT
 9nkOxTzyb+UO0dpb/MhXLGkGcv8+lTA5ffVIKUx5UxKngRbukc3dxwVJgO4HmucK
 bwVQGD83D+/if5/JL9WtQRjDwFEn+IFmdv+3cAXkRo4IIS18LPZB1MJncTeWr8Z9
 HlkuDXlJOncUWCABGd1IKu1j0S2HpXV4qhqQXJ6PdfOvUPEaD9qgqEAjD5FxxyXF
 wpaV2MWya5i1FGwD5UKhi8hVnAFJyF0/w+enjiPwlmIbjdyEVXE=
 =6peY
 -----END PGP SIGNATURE-----

Merge tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull clang LTO updates from Kees Cook:
 "Clang Link Time Optimization.

  This is built on the work done preparing for LTO by arm64 folks,
  tracing folks, etc. This includes the core changes as well as the
  remaining pieces for arm64 (LTO has been the default build method on
  Android for about 3 years now, as it is the prerequisite for the
  Control Flow Integrity protections).

  While x86 LTO enablement is done, it depends on some pending objtool
  clean-ups. It's possible that I'll send a "part 2" pull request for
  LTO that includes x86 support.

  For merge log posterity, and as detailed in commit dc5723b02e
  ("kbuild: add support for Clang LTO"), here is the lt;dr to do an LTO
  build:

        make LLVM=1 LLVM_IAS=1 defconfig
        scripts/config -e LTO_CLANG_THIN
        make LLVM=1 LLVM_IAS=1

  (To do a cross-compile of arm64, add "CROSS_COMPILE=aarch64-linux-gnu-"
  and "ARCH=arm64" to the "make" command lines.)

  Summary:

   - Clang LTO build infrastructure and arm64-specific enablement (Sami
     Tolvanen)

   - Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)"

* tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: prevent CC_FLAGS_LTO self-bloating on recursive rebuilds
  arm64: allow LTO to be selected
  arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
  arm64: vdso: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  efi/libstub: disable LTO
  scripts/mod: disable LTO for empty.c
  modpost: lto: strip .lto from module names
  PCI: Fix PREL32 relocations for LTO
  init: lto: fix PREL32 relocations
  init: lto: ensure initcall ordering
  kbuild: lto: add a default list of used symbols
  kbuild: lto: merge module sections
  kbuild: lto: limit inlining
  kbuild: lto: fix module versioning
  kbuild: add support for Clang LTO
  tracing: move function tracer options to Kconfig
2021-02-23 09:28:51 -08:00
Greg Kroah-Hartman
c0ea57608b blktrace: remove debugfs file dentries from struct blk_trace
These debugfs dentries do not need to be saved for anything as the whole
directory and everything in it is properly cleaned up when the parent
directory is removed.  So remove them from struct blk_trace and don't
save them when created as it's not necessary.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23 09:54:51 -07:00
Linus Torvalds
c958423470 Tracing updates for 5.12
- Update to the way irqs and preemption is tracked via the trace event PC field
 
  - Fix handling of unregistering event failing due to allocate memory.
    This is only triggered by failure injection, as it is pretty much guaranteed
    to have less than a page allocation succeed.
 
  - Do not show the useless "filter" or "enable" files for the "ftrace" trace
    system, as they have no effect on doing anything.
 
  - Add a warning if kprobes are registered more than once.
 
  - Synthetic events now have their fields parsed by semicolons.
    Old formats without semicolons will still work, but new features will
    require them.
 
  - New option to allow trace events to show %p without hashing in trace file.
    The trace file can only be read by root, and reading the raw event buffer
    did not have any pointers hashed, so this does not expose anything new.
 
  - New directory in tools called tools/tracing, where a new tool that reads
    sequential latency reports from the ftrace latency tracers.
 
  - Other minor fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYDL2wBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qti6AP0RUcSU5U1onx8DcwPQLC5Xr3CPqJkm
 RvKeJDdgFP+sVgEAiMTFsy2UMc0gmlHZMFd5nZLSiJCu1I2hHmhS5yKbHgY=
 =fD9+
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - Update to the way irqs and preemption is tracked via the trace event
   PC field

 - Fix handling of unregistering event failing due to allocate memory.
   This is only triggered by failure injection, as it is pretty much
   guaranteed to have less than a page allocation succeed.

 - Do not show the useless "filter" or "enable" files for the "ftrace"
   trace system, as they have no effect on doing anything.

 - Add a warning if kprobes are registered more than once.

 - Synthetic events now have their fields parsed by semicolons. Old
   formats without semicolons will still work, but new features will
   require them.

 - New option to allow trace events to show %p without hashing in trace
   file. The trace file can only be read by root, and reading the raw
   event buffer did not have any pointers hashed, so this does not
   expose anything new.

 - New directory in tools called tools/tracing, where a new tool that
   reads sequential latency reports from the ftrace latency tracers.

 - Other minor fixes and cleanups.

* tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  kprobes: Fix to delay the kprobes jump optimization
  tracing/tools: Add the latency-collector to tools directory
  tracing: Make hash-ptr option default
  tracing: Add ptr-hash option to show the hashed pointer value
  tracing: Update the stage 3 of trace event macro comment
  tracing: Show real address for trace event arguments
  selftests/ftrace: Add '!event' synthetic event syntax check
  selftests/ftrace: Update synthetic event syntax errors
  tracing: Add a backward-compatibility check for synthetic event creation
  tracing: Update synth command errors
  tracing: Rework synthetic event command parsing
  tracing/dynevent: Delegate parsing to create function
  kprobes: Warn if the kprobe is reregistered
  ftrace: Remove unused ftrace_force_update()
  tracepoints: Code clean up
  tracepoints: Do not punish non static call users
  tracepoints: Remove unnecessary "data_args" macro parameter
  tracing: Do not create "enable" or "filter" files for ftrace event subsystem
  kernel: trace: preemptirq_delay_test: add cpu affinity
  tracepoint: Do not fail unregistering a probe due to memory failure
  ...
2021-02-22 14:07:15 -08:00
Chaitanya Kulkarni
1f83bb4b49 blktrace: add blk_fill_rwbs documentation comment
blk_fill_rwbs() is an expoted function, add kernel style documentation
comment.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22 06:37:41 -07:00
Chaitanya Kulkarni
179d160072 block: remove superfluous param in blk_fill_rwbs()
The last parameter for the function blk_fill_rwbs() was added in
5782138e47 ("tracing/events: convert block trace points to
TRACE_EVENT()") in order to signal read request and use of that parameter
was replaced with using switch case REQ_OP_READ with
1b9a9ab78b ("blktrace: use op accessors"), but the parameter was never
removed.

Remove the unused parameter and adjust the respective call sites.

Fixes: 1b9a9ab78b ("blktrace: use op accessors")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22 06:37:41 -07:00
Linus Torvalds
582cd91f69 for-5.12/block-2021-02-17
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmAtmIwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgplzLEAC5O+3rBM8QuiJdo39Yppmuw4hDJ6hOKynP
 EJQLKQQi0VfXgU+MprGvcbpFYmNbgICvUICQkEzJuk++kPCu/BJtJz0yErQeLgS+
 RdXiPV6enbF7iRML5TVRTr1q/z7sJMXcIIJ8Pz/rU/JNfGYExVd0WfnEY9mp1jOt
 Bl9V+qyTazdP+Ma4+uEPatSayqcdi1rxB5I+7v/sLiOvKZZWkaRZjUZ/mxAjUfvK
 dBOOPjMygEo3tCLkIyyA6lpLvr1r+SUZhLuebRLEKa3To3TW6RtoG0qwpKmI2iKw
 ylLeVLB60nM9RUxjflVOfBsHxz1bDg5Ve86y5nCjQd4Jo8x1c4DnecyGE5/Tu8Rg
 rgbsfD6nFWzhDCvcZT0XrfQ4ZAjIL2IfT+ypQiQ6UlRd3hvIKRmzWMkjuH2svr0u
 ey9Kq+lYerI4cM0F3W73gzUKdIQOuCzBCYxQuSQQomscBa7FCInyU192dAI9Aj6l
 Yd06mgKu6qCx6zLv6JfpBqaBHZMwyGE4dmZgPQFuuwO+b4N+Ck3Jm5fzEzw/xIxQ
 wdo/DlsAl60BXentB6FByGBJaCjVdSymRqN/xNCAbFKCjmr6TLBuXPfg1gYYO7xC
 VOcVjWe8iN3wWHZab3t2mxMKH9B9B/KKzIhu6TNHSmgtQ5paZPRCBx995pDyRw26
 WC22RGC2MA==
 =os1E
 -----END PGP SIGNATURE-----

Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Another nice round of removing more code than what is added, mostly
  due to Christoph's relentless pursuit of tech debt removal/cleanups.
  This pull request contains:

   - Two series of BFQ improvements (Paolo, Jan, Jia)

   - Block iov_iter improvements (Pavel)

   - bsg error path fix (Pan)

   - blk-mq scheduler improvements (Jan)

   - -EBUSY discard fix (Jan)

   - bvec allocation improvements (Ming, Christoph)

   - bio allocation and init improvements (Christoph)

   - Store bdev pointer in bio instead of gendisk + partno (Christoph)

   - Block trace point cleanups (Christoph)

   - hard read-only vs read-only split (Christoph)

   - Block based swap cleanups (Christoph)

   - Zoned write granularity support (Damien)

   - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)"

* tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits)
  mm: simplify swapdev_block
  sd_zbc: clear zone resources for non-zoned case
  block: introduce blk_queue_clear_zone_settings()
  zonefs: use zone write granularity as block size
  block: introduce zone_write_granularity limit
  block: use blk_queue_set_zoned in add_partition()
  nullb: use blk_queue_set_zoned() to setup zoned devices
  nvme: cleanup zone information initialization
  block: document zone_append_max_bytes attribute
  block: use bi_max_vecs to find the bvec pool
  md/raid10: remove dead code in reshape_request
  block: mark the bio as cloned in bio_iov_bvec_set
  block: set BIO_NO_PAGE_REF in bio_iov_bvec_set
  block: remove a layer of indentation in bio_iov_iter_get_pages
  block: turn the nr_iovecs argument to bio_alloc* into an unsigned short
  block: remove the 1 and 4 vec bvec_slabs entries
  block: streamline bvec_alloc
  block: factor out a bvec_alloc_gfp helper
  block: move struct biovec_slab to bio.c
  block: reuse BIO_INLINE_VECS for integrity bvecs
  ...
2021-02-21 11:02:48 -08:00
Linus Torvalds
51e6d17809 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Here is what we have this merge window:

   1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.

   2) Add RSS multi group support to octeontx2-pf driver, from Geetha
      Sowjanya.

   3) Add support for KS8851 PHY. From Marek Vasut.

   4) Add support for GarfieldPeak bluetooth controller from Kiran K.

   5) Add support for half-duplex tcan4x5x can controllers.

   6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw
      Liew.

   7) Rework RX port offload infrastructure, particularly wrt, UDP
      tunneling, from Jakub Kicinski.

   8) Add BCM72116 PHY support, from Florian Fainelli.

   9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
      Oltean.

  10) Add support for picosecond rx delay in dwmac-meson8b chips. From
      Martin Blumenstingl.

  11) Support TSO on xfrm interfaces from Eyal Birger.

  12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.

  13) Support BCM4908 integrated switch, from Rafał Miłecki.

  14) Support for directly accessing kernel module variables via module
      BTF info, from Andrii Naryiko.

  15) Add DASH (esktop and mobile Architecture for System Hardware)
      support to r8169 driver, from Heiner Kallweit.

  16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.

  17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.

  18) Support link aggregation in DSA, from Tobias Waldekranz.

  19) Support for bitwidse atomics in bpf, from Brendan Jackman.

  20) SmartEEE support in at803x driver, from Russell King.

  21) Add support for flow based tunneling to GTP, from Pravin B Shelar.

  22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder.

  23) TLS RX offload for bonding, from Tariq Toukan.

  24) RX decap offklload support in mac80211, from Felix Fietkou.

  25) devlink health saupport in octeontx2-af, from George Cherian.

  26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung

  27) Delegated actionss support in mptcp, from Paolo Abeni.

  28) Support receive timestamping when doin zerocopy tcp receive. From
      Arjun Ray.

  29) HTB offload support for mlx5, from Maxim Mikityanskiy.

  30) UDP GRO forwarding, from Maxim Mikityanskiy.

  31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.

  32) Weighted random twos choice algorithm for ipvs, from Darby Payne.

  33) Fix netdev registration deadlock, from Johannes Berg.

  34) Various conversions to new tasklet api, from EmilRenner Berthing.

  35) Bulk skb allocations in veth, from Lorenzo Bianconi.

  36) New ethtool interface for lane setting, from Danielle Ratson.

  37) Offload failiure notifications for routes, from Amit Cohen.

  38) BCM4908 support, from Rafał Miłecki.

  39) Support several new iwlwifi chips, from Ihab Zhaika.

  40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski.

  41) Support for mhi prrotocols, from Loic Poulain.

  42) Optimize bpf program stats.

  43) Implement RFC6056, for better port randomization, from Eric
      Dumazet.

  44) hsr tag offloading support from George McCollister.

  45) Netpoll support in qede, from Bhaskar Upadhaya.

  46) 2005/400g speed support in bonding 3ad mode, from Nikolay
      Aleksandrov.

  47) Netlink event support in mptcp, from Florian Westphal.

  48) Better skbuff caching, from Alexander Lobakin.

  49) MRP (Media Redundancy Protocol) offloading in DSA and a few
      drivers, from Horatiu Vultur.

  50) mqprio saupport in mvneta, from Maxime Chevallier.

  51) Remove of_phy_attach, no longer needed, from Florian Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
  octeontx2-pf: Fix otx2_get_fecparam()
  cteontx2-pf: cn10k: Prevent harmless double shift bugs
  net: stmmac: Add PCI bus info to ethtool driver query output
  ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
  ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
  ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
  ptp: ptp_clockmatrix: Clean-up dev_*() messages.
  ptp: ptp_clockmatrix: Remove unused header declarations.
  ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
  ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
  net: stmmac: dwmac-sun8i: Add a shutdown callback
  net: stmmac: dwmac-sun8i: Minor probe function cleanup
  net: stmmac: dwmac-sun8i: Use reset_control_reset
  net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
  net: stmmac: dwmac-sun8i: Return void from PHY unpower
  r8169: use macro pm_ptr
  net: mdio: Remove of_phy_attach()
  net: mscc: ocelot: select PACKING in the Kconfig
  net: re-solve some conflicts after net -> net-next merge
  net: dsa: tag_rtl4_a: Support also egress tags
  ...
2021-02-20 17:45:32 -08:00
David S. Miller
b8af417e4d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-02-16

The following pull-request contains BPF updates for your *net-next* tree.

There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:

  [...]
                lock_sock(sk);
                err = tcp_zerocopy_receive(sk, &zc, &tss);
                err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
                                                          &zc, &len, err);
                release_sock(sk);
  [...]

We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).

The main changes are:

1) Adds support of pointers to types with known size among global function
   args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.

2) Add bpf_iter for task_vma which can be used to generate information similar
   to /proc/pid/maps, from Song Liu.

3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
   rewriting bind user ports from BPF side below the ip_unprivileged_port_start
   range, both from Stanislav Fomichev.

4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
   as well as per-cpu maps for the latter, from Alexei Starovoitov.

5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
   for sleepable programs, both from KP Singh.

6) Extend verifier to enable variable offset read/write access to the BPF
   program stack, from Andrei Matei.

7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
   query device MTU from programs, from Jesper Dangaard Brouer.

8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
   tracing programs, from Florent Revest.

9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
   otherwise too many passes are required, from Gary Lin.

10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
    verification both related to zero-extension handling, from Ilya Leoshkevich.

11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.

12) Batch of AF_XDP selftest cleanups and small performance improvement
    for libbpf's xsk map redirect for newer kernels, from Björn Töpel.

13) Follow-up BPF doc and verifier improvements around atomics with
    BPF_FETCH, from Brendan Jackman.

14) Permit zero-sized data sections e.g. if ELF .rodata section contains
    read-only data from local variables, from Yonghong Song.

15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:14:06 -08:00
Song Liu
3d06f34aa8 bpf: Allow bpf_d_path in bpf_iter program
task_file and task_vma iter programs have access to file->f_path. Enable
bpf_d_path to print paths of these file.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210212183107.50963-3-songliubraving@fb.com
2021-02-12 12:56:53 -08:00
Linus Torvalds
e77a6817d4 tracing: Fix buffer overflow in trace event filter
It was reported that if an trace event was larger than a page
 and was filtered, that it caused memory corruption. The reason
 is that filtered events first go into a buffer to test the filter
 before being written into the ring buffer. Unfortunately,
 this write did not check the size.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYCaSHBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqOpAQCUSlZdBxLzs87zeHgXbkMudWvCYSbA
 mndzddqtxPXlXwEAsRnO8BERyZnasEdXnJ98JJwQaFFYH0dBCA2pTU2onQc=
 =NokV
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix buffer overflow in trace event filter.

  It was reported that if an trace event was larger than a page and was
  filtered, that it caused memory corruption. The reason is that
  filtered events first go into a buffer to test the filter before being
  written into the ring buffer. Unfortunately, this write did not check
  the size"

* tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Check length before giving out the filter buffer
2021-02-12 11:16:17 -08:00
Steven Rostedt (VMware)
99e22ce73c tracing: Make hash-ptr option default
Since the original behavior of the trace events is to hash the %p pointers,
make that the default, and have developers have to enable the option in
order to have them unhashed.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-12 11:52:59 -05:00
Florent Revest
c5dbb89fc2 bpf: Expose bpf_get_socket_cookie to tracing programs
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
2021-02-11 17:44:41 -08:00
Masami Hiramatsu
a345a6718b tracing: Add ptr-hash option to show the hashed pointer value
Add tracefs/options/hash-ptr option to show hashed pointer
value by %p in event printk format string.

For the security reason, normal printk will show the hashed
pointer value (encrypted by random number) with %p to printk
buffer to hide the real address. But the tracefs/trace always
shows real address for debug. To bridge those outputs, add an
option to switch the output format. Ftrace users can use it
to find the hashed value corresponding to the real address
in trace log.

Link: https://lkml.kernel.org/r/160277372504.29307.14909828808982012211.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-11 16:31:57 -05:00
Masami Hiramatsu
efbbdaa22b tracing: Show real address for trace event arguments
To help debugging kernel, show real address for trace event arguments
in tracefs/trace{,pipe} instead of hashed pointer value.

Since ftrace human-readable format uses vsprintf(), all %p are
translated to hash values instead of pointer address.

However, when debugging the kernel, raw address value gives a
hint when comparing with the memory mapping in the kernel.
(Those are sometimes used with crash log, which is not hashed too)
So converting %p with %px when calling trace_seq_printf().

Moreover, this is not improving the security because the tracefs
can be used only by root user and the raw address values are readable
from tracefs/percpu/cpu*/trace_pipe_raw file.

Link: https://lkml.kernel.org/r/160277370703.29307.5134475491761971203.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-11 16:31:57 -05:00
Steven Rostedt (VMware)
b220c049d5 tracing: Check length before giving out the filter buffer
When filters are used by trace events, a page is allocated on each CPU and
used to copy the trace event fields to this page before writing to the ring
buffer. The reason to use the filter and not write directly into the ring
buffer is because a filter may discard the event and there's more overhead
on discarding from the ring buffer than the extra copy.

The problem here is that there is no check against the size being allocated
when using this page. If an event asks for more than a page size while being
filtered, it will get only a page, leading to the caller writing more that
what was allocated.

Check the length of the request, and if it is more than PAGE_SIZE minus the
header default back to allocating from the ring buffer directly. The ring
buffer may reject the event if its too big anyway, but it wont overflow.

Link: https://lore.kernel.org/ath10k/1612839593-2308-1-git-send-email-wgong@codeaurora.org/

Cc: stable@vger.kernel.org
Fixes: 0fc1b09ff1 ("tracing: Use temp buffer when filtering events")
Reported-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-11 14:23:37 -05:00
Linus Torvalds
6016bf19b3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Another pile of networing fixes:

   1) ath9k build error fix from Arnd Bergmann

   2) dma memory leak fix in mediatec driver from Lorenzo Bianconi.

   3) bpf int3 kprobe fix from Alexei Starovoitov.

   4) bpf stackmap integer overflow fix from Bui Quang Minh.

   5) Add usb device ids for Cinterion MV31 to qmi_qwwan driver, from
      Christoph Schemmel.

   6) Don't update deleted entry in xt_recent netfilter module, from
      Jazsef Kadlecsik.

   7) Use after free in nftables, fix from Pablo Neira Ayuso.

   8) Header checksum fix in flowtable from Sven Auhagen.

   9) Validate user controlled length in qrtr code, from Sabyrzhan
      Tasbolatov.

  10) Fix race in xen/netback, from Juergen Gross,

  11) New device ID in cxgb4, from Raju Rangoju.

  12) Fix ring locking in rxrpc release call, from David Howells.

  13) Don't return LAPB error codes from x25_open(), from Xie He.

  14) Missing error returns in gsi_channel_setup() from Alex Elder.

  15) Get skb_copy_and_csum_datagram working properly with odd segment
      sizes, from Willem de Bruijn.

  16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean.

  17) Do teardown on probe failure in DSA, from Vladimir Oltean.

  18) Fix compilation failures of txtimestamp selftest, from Vadim
      Fedorenko.

  19) Limit rx per-napi gro queue size to fix latency regression, from
      Eric Dumazet.

  20) dpaa_eth xdp fixes from Camelia Groza.

  21) Missing txq mode update when switching CBS off, in stmmac driver,
      from Mohammad Athari Bin Ismail.

  22) Failover pending logic fix in ibmvnic driver, from Sukadev
      Bhattiprolu.

  23) Null deref fix in vmw_vsock, from Norbert Slusarek.

  24) Missing verdict update in xdp paths of ena driver, from Shay
      Agroskin.

  25) seq_file iteration fix in sctp from Neil Brown.

  26) bpf 32-bit src register truncation fix on div/mod, from Daniel
      Borkmann.

  27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann.

  28) Fix locking in vsock_shutdown(), from Stefano Garzarella.

  29) Various missing index bound checks in hns3 driver, from Yufeng Mo.

  30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from
      Vladimir Oltean.

  31) Don't mix up stp and mrp port states in bridge layer, from Horatiu
      Vultur.

  32) Fix locking during netif_tx_disable(), from Edwin Peer"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  bpf: Fix 32 bit src register truncation on div/mod
  bpf: Fix verifier jmp32 pruning decision logic
  bpf: Fix verifier jsgt branch analysis on max bound
  vsock: fix locking in vsock_shutdown()
  net: hns3: add a check for index in hclge_get_rss_key()
  net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
  net: hns3: add a check for queue_id in hclge_reset_vf_queue()
  net: dsa: felix: implement port flushing on .phylink_mac_link_down
  switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
  bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
  net: watchdog: hold device global xmit lock during tx disable
  netfilter: nftables: relax check for stateful expressions in set definition
  netfilter: conntrack: skip identical origin tuple in same zone only
  vsock/virtio: update credit only if socket is not closed
  net: fix iteration for sctp transport seq_files
  net: ena: Update XDP verdict upon failure
  net/vmw_vsock: improve locking in vsock_connect_timeout()
  net/vmw_vsock: fix NULL pointer dereference
  ibmvnic: Clear failover_pending if unable to schedule
  net: stmmac: set TxQ mode back to DCB after disabling CBS
  ...
2021-02-10 11:33:39 -08:00
Tom Zanussi
8b5ab6bd0b tracing: Add a backward-compatibility check for synthetic event creation
The synthetic event parsing rework now requires semicolons between
synthetic event fields.  That requirement breaks existing users who
might already have used the old synthetic event command format, so
this adds an inner loop that can parse more than one field, if
present, between semicolons.  For each field, parse_synth_field()
checks in which version that field was introduced, using
check_field_version().  The caller, __create_synth_event() can then use
that version information to determine whether or not to enforce the
requirement on the command as a whole.

In the future, if/when new features are added, the requirement will be
that any field/string containing the new feature must use semicolons,
and the check_field_version() check can then check for those and
enforce it.  Using a version number allows this scheme to be extended
if necessary.

Link: https://lkml.kernel.org/r/74fcc500d561b40ce91c5ee94818c70c6b0c9330.1612208610.git.zanussi@kernel.org

[ zanussi: added check_field_version() comment from rostedt@goodmis.org ]
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09 12:52:15 -05:00
Tom Zanussi
8d3e816523 tracing: Update synth command errors
Since array types are handled differently, errors referencing them
also need to be handled differently.  Add and use a new
INVALID_ARRAY_SPEC error.  Also add INVALID_CMD and INVALID_DYN_CMD to
catch and display the correct form for badly-formed commands, which
can also be used in place of CMD_INCOMPLETE, which is removed, and
remove CMD_TOO_LONG, since it's no longer used.

Link: https://lkml.kernel.org/r/b9dd434dc6458dcff11adc6ed616fe93a8794770.1612208610.git.zanussi@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09 12:52:15 -05:00
Tom Zanussi
c9e759b1e8 tracing: Rework synthetic event command parsing
Now that command parsing has been delegated to the create functions
and we're no longer constrained by argv_split(), we can modify the
synthetic event command parser to better match the higher-level
structure of the synthetic event commands, which is basically an event
name followed by a set of semicolon-separated fields.

Since we're also now passed the raw command, we can also save it
directly and can get rid of save_cmdstr().

Link: https://lkml.kernel.org/r/cb9e2be92d992ce59f2b4f132264a5d467f3933f.1612208610.git.zanussi@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09 12:52:15 -05:00
Masami Hiramatsu
d262271d04 tracing/dynevent: Delegate parsing to create function
Delegate command parsing to each create function so that the
command syntax can be customized.

This requires changes to the kprobe/uprobe/synthetic event handling,
which are also included here.

Link: https://lkml.kernel.org/r/e488726f49cbdbc01568618f8680584306c4c79f.1612208610.git.zanussi@kernel.org

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
[ zanussi@kernel.org: added synthetic event modifications ]
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-09 12:52:15 -05:00
Christoph Hellwig
a006050575 module: use RCU to synchronize find_module
Allow for a RCU-sched critical section around find_module, following
the lower level find_module_all helper, and switch the two callers
outside of module.c to use such a RCU-sched critical section instead
of module_mutex.

Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2021-02-08 12:21:40 +01:00
Steven Rostedt (VMware)
2d396cb3b1 tracing: Do not create "enable" or "filter" files for ftrace event subsystem
The ftrace event subsystem is only created for showing the format files of
events created by the ftrace tracers, and are not trace events. The ftrace
subsystem currently has both the "enable" and "filter" files that in other
subsystems are used to enable/disable all events within the subsystem or set
a filter for all the subsystem events.

As ftrace subsystem events do not use enable or filter operations, these
files are useless in the ftrace subsystem. Remove them.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-05 18:04:39 -05:00
Steven Rostedt (VMware)
256cfdd6fd tracing: Do not count ftrace events in top level enable output
The file /sys/kernel/tracing/events/enable is used to enable all events by
echoing in "1", or disabling all events when echoing in "0". To know if all
events are enabled, disabled, or some are enabled but not all of them,
cating the file should show either "1" (all enabled), "0" (all disabled), or
"X" (some enabled but not all of them). This works the same as the "enable"
files in the individule system directories (like tracing/events/sched/enable).

But when all events are enabled, the top level "enable" file shows "X". The
reason is that its checking the "ftrace" events, which are special events
that only exist for their format files. These include the format for the
function tracer events, that are enabled when the function tracer is
enabled, but not by the "enable" file. The check includes these events,
which will always be disabled, and even though all true events are enabled,
the top level "enable" file will show "X" instead of "1".

To fix this, have the check test the event's flags to see if it has the
"IGNORE_ENABLE" flag set, and if so, not test it.

Cc: stable@vger.kernel.org
Fixes: 553552ce17 ("tracing: Combine event filter_active and enable into single flags field")
Reported-by: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-05 15:40:04 -05:00
Alexei Starovoitov
548f1191d8 bpf: Unbreak BPF_PROG_TYPE_KPROBE when kprobe is called via do_int3
The commit 0d00449c7a ("x86: Replace ist_enter() with nmi_enter()")
converted do_int3 handler to be "NMI-like".
That made old if (in_nmi()) check abort execution of bpf programs
attached to kprobe when kprobe is firing via int3
(For example when kprobe is placed in the middle of the function).
Remove the check to restore user visible behavior.

Fixes: 0d00449c7a ("x86: Replace ist_enter() with nmi_enter()")
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20210203070636.70926-1-alexei.starovoitov@gmail.com
2021-02-03 15:54:22 +01:00
Song Chen
4b9091e1c1 kernel: trace: preemptirq_delay_test: add cpu affinity
The kernel thread executing test can run on any cpu, which might be
different cpu latency tracer is running on, as a result, the
big latency caused by preemptirq delay test can't be detected.

Therefore, the argument cpu_affinity is added to be passed to test,
ensure it's running on the same cpu with latency tracer.

e.g.
cyclictest -p 90 -m -c 0 -i 1000 -a 3
modprobe preemptirq_delay_test test_mode=preempt delay=500 \
burst_size=3 cpu_affinity=3

Link: https://lkml.kernel.org/r/1611797713-20965-1-git-send-email-chensong_2000@189.cn

Signed-off-by: Song Chen <chensong_2000@189.cn>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:07 -05:00
Tom Rix
f2a99ddfd0 tracing: Remove definition of DEBUG in trace_mmiotrace.c
Defining DEBUG should only be done in development.
So remove DEBUG.

Link: https://lkml.kernel.org/r/20210115153348.131791-1-trix@redhat.com

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:07 -05:00
Bean Huo
557d50e79d tracing: Fix a kernel doc warning
Add description for trace_array_put() parameter.

kernel/trace/trace.c:464: warning: Function parameter or member 'this_tr' not described in 'trace_array_put'

Link: https://lkml.kernel.org/r/20210112111202.23508-1-huobean@gmail.com

Signed-off-by: Bean Huo <beanhuo@micron.com>
[ Merged as one of the original fixes was already fixed by someone else ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:07 -05:00
Bhaskar Chowdhury
39bcdd6a96 tracing: Fix spelling of controlling in uprobes
s/controling/controlling/p

Link: https://lkml.kernel.org/r/20210112045008.29834-1-unixbhaskar@gmail.com

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:07 -05:00
Colin Ian King
28cc65a173 tracing: Fix spelling mistake in Kconfig "infinit" -> "infinite"
There is a spelling mistake in the Kconfig help text. Fix it.

Link: https://lkml.kernel.org/r/20201216114051.12056-1-colin.king@canonical.com

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:07 -05:00
Sebastian Andrzej Siewior
fe427886bf tracing: Use in_serving_softirq() to deduct softirq status.
PREEMPT_RT does not report "serving softirq" because the tracing core
looks at the preemption counter while PREEMPT_RT does not update it
while processing softirqs in order to remain preemptible. The
information is stored somewhere else.
The in_serving_softirq() macro and the SOFTIRQ_OFFSET define are still
working but not on the preempt-counter.

Use in_serving_softirq() macro which works on PREEMPT_RT. On !PREEMPT_RT
the compiler (gcc-10 / clang-11) is smart enough to optimize the
in_serving_softirq() related read of the preemption counter away.
The only difference I noticed by using in_serving_softirq() on
!PREEMPT_RT is that gcc-10 implemented tracing_gen_ctx_flags() as
reading FLAG, jmp _tracing_gen_ctx_flags(). Without in_serving_softirq()
it inlined _tracing_gen_ctx_flags() into tracing_gen_ctx_flags().

Link: https://lkml.kernel.org/r/20210125194511.3924915-4-bigeasy@linutronix.de

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:06 -05:00
Sebastian Andrzej Siewior
0c02006e6f tracing: Inline tracing_gen_ctx_flags()
Inline tracing_gen_ctx_flags(). This allows to have one ifdef
CONFIG_TRACE_IRQFLAGS_SUPPORT.

This requires to move `trace_flag_type' so tracing_gen_ctx_flags() can
use it.

Link: https://lkml.kernel.org/r/20210125194511.3924915-3-bigeasy@linutronix.de

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20210125140323.6b1ff20c@gandalf.local.home
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:06 -05:00
Sebastian Andrzej Siewior
36590c50b2 tracing: Merge irqflags + preempt counter.
The state of the interrupts (irqflags) and the preemption counter are
both passed down to tracing_generic_entry_update(). Only one bit of
irqflags is actually required: The on/off state. The complete 32bit
of the preemption counter isn't needed. Just whether of the upper bits
(softirq, hardirq and NMI) are set and the preemption depth is needed.

The irqflags and the preemption counter could be evaluated early and the
information stored in an integer `trace_ctx'.
tracing_generic_entry_update() would use the upper bits as the
TRACE_FLAG_* and the lower 8bit as the disabled-preemption depth
(considering that one must be substracted from the counter in one
special cases).

The actual preemption value is not used except for the tracing record.
The `irqflags' variable is mostly used only for the tracing record. An
exception here is for instance wakeup_tracer_call() or
probe_wakeup_sched_switch() which explicilty disable interrupts and use
that `irqflags' to save (and restore) the IRQ state and to record the
state.

Struct trace_event_buffer has also the `pc' and flags' members which can
be replaced with `trace_ctx' since their actual value is not used
outside of trace recording.

This will reduce tracing_generic_entry_update() to simply assign values
to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is moved
to _tracing_gen_ctx_flags() which replaces preempt_count() and
local_save_flags() invocations.

As an example, ftrace_syscall_enter() may invoke:
- trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update()
- event_trigger_unlock_commit()
  -> ftrace_trace_stack() -> … -> tracing_generic_entry_update()
  -> ftrace_trace_userstack() -> … -> tracing_generic_entry_update()

In this case the TRACE_FLAG_* bits were evaluated three times. By using
the `trace_ctx' they are evaluated once and assigned three times.

A build with all tracers enabled on x86-64 with and without the patch:

    text     data      bss      dec      hex    filename
21970669 17084168  7639260 46694097  2c87ed1 vmlinux.old
21970293 17084168  7639260 46693721  2c87d59 vmlinux.new

text shrank by 379 bytes, data remained constant.

Link: https://lkml.kernel.org/r/20210125194511.3924915-2-bigeasy@linutronix.de

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:06 -05:00
Qiujun Huang
c6358bacdc ring-buffer: Drop unneeded check in ring_buffer_resize()
Remove the cpumask check, as we has done it at the beginning of
the function.
Also fix a typo. s/also the on the/also on the/

Link: https://lkml.kernel.org/r/20201224144634.3210-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-02 17:02:06 -05:00