Commit graph

901490 commits

Author SHA1 Message Date
Ravi Bangoria
68aac855b6 perf annotate: Fix --show-total-period for tui/stdio2
perf annotate --show-total-period does not really show total period.

The reason is we have two separate variables for the same purpose.

One is in symbol_conf.show_total_period and another is
annotation_options.show_total_period.

We save command line option in symbol_conf.show_total_period but uses
annotation_option.show_total_period while rendering tui/stdio2 browser.

Though, we copy symbol_conf.show_total_period to
annotation__default_options.show_total_period but that is not really
effective as we don't use annotation__default_options once we copy
default options to dynamic variable annotate.opts in cmd_annotate().

Instead of all these complication, keep only one variable and use it all
over. symbol_conf.show_total_period is used by perf report/top as well.
So let's kill annotation_options.show_total_period.

On a side note, I've kept annotation_options.show_total_period
definition because it's still used by perf-config code. Follow up patch
to fix perf-config for annotate will remove
annotation_options.show_total_period.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-3-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-27 10:44:40 -03:00
Ravi Bangoria
54cf752cfb perf annotate/tui: Re-render title bar after switching back from script browser
The 'perf annotate' TUI browser provides a 'r' hot key to switch to a
script browser. But the annotate browser title bar becomes hidden while
switching back from script browser. Fix it.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-2-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-27 10:44:14 -03:00
Arnaldo Carvalho de Melo
0d6f94fd49 tools headers UAPI: Update tools's copy of kvm.h headers
Picking the changes from:

  5ef8acbdd6 ("KVM: nVMX: Emulate MTF when performing instruction emulation")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

No change in tooling ensues, just the x86 kvm tooling gets rebuilt as
those headers are included in its build:

  $ cp arch/x86/include/uapi/asm/kvm.h tools/arch/x86/include/uapi/asm/kvm.h
  $ make -C tools/perf
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j12' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  <SNIP>
  ...        disassembler-four-args: [ on  ]

    DESCEND  plugins
    CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
  <SNIP>
    LD       /tmp/build/perf/arch/x86/util/perf-in.o
    LD       /tmp/build/perf/arch/x86/perf-in.o
    LD       /tmp/build/perf/arch/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>
  $

As it doesn't seem to be used there:

  $ grep STATE tools/perf/arch/x86/util/kvm-stat.c
  $

And the 'perf trace' beautifier table generator isn't interested in
these things:

  $ grep regex= tools/perf/trace/beauty/kvm_ioctl.sh
  regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-27 09:51:30 -03:00
Arnaldo Carvalho de Melo
d8e3ee2e2b tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  21b5ee59ef ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF")

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ git diff
  diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
  index ebe1685e92dd..d5e517d1c3dd 100644
  --- a/tools/arch/x86/include/asm/msr-index.h
  +++ b/tools/arch/x86/include/asm/msr-index.h
  @@ -512,6 +512,8 @@
   #define MSR_K7_HWCR                    0xc0010015
   #define MSR_K7_HWCR_SMMLOCK_BIT                0
   #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
  +#define MSR_K7_HWCR_IRPERF_EN_BIT      30
  +#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
   #define MSR_K7_FID_VID_CTL             0xc0010041
   #define MSR_K7_FID_VID_STATUS          0xc0010042
  $

That don't result in any change in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

To silence this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-27 09:49:56 -03:00
Rafael J. Wysocki
1df97a02a9 Update devfreq for 5.6-rc4
Detailed description for this pull request:
 1. Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
 - This changes as devfreq(X) cause break some user space applications
 such as Android HAL from Unisoc and Hikey. In result, decide to revert it
 for preventing the HAL layer problem.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEsSpuqBtbWtRe4rLGnM3fLN7rz1MFAl5XQ1oWHGN3MDAuY2hv
 aUBzYW1zdW5nLmNvbQAKCRCczd8s3uvPU/QKD/0UjqchWhu1KDE1A6be087oacSd
 Q+Br5bWm0I7YfgoRdUM9la+Wob2M7MuNq9ON4yd4zpStZeuHcfzF8sQIwd5VoDwB
 G7rfVxYTVfgEQp8x0fHf3cFvoYRlQCBzwkgQZsm+jTyajbr5xZ1GXjwN++lmuJvo
 Ku/hE6mNI8CdHWx8isxgB2xtmnWc6ZuRaPp3KraX9QMN0T5JtzmxKazXr2VImWP2
 ZbCnA2j8DZyEpe/y0L9R6nibq/dhI36CK5utIf4dv2pb8YCfYpjJ7J78KW9M9bzJ
 zJGMbBwPaS1rM3X7pjLKxek79WC/oZ2I+UpijXSmLoR6RnMA6l87+igJXGDmXyVX
 8BrZGUR/bWd/vJvHo9ltA+BmZmVwhAPcXWkWVxoCnTW6Lnf3nPo2Di8Qy2roAo1b
 XH50qyAJ0ur51wb9azJkUew8WbadkkE1Djpa+QERrTMY88xVqG8sdfPeJYj1oyzf
 yyG+CZOkFcbah5NW2LUl9nHnRP2jbdIzXMfZyV8J3NSKL1IAqcRSXzLzMFipbCs5
 ZtpwKnotJhU6/+eLZrTVSNkmfCcpY8tiycMmk3p+ppBm9+I7FWptyELZN6wZrY0s
 Zi5gumRsVWfuSPXa+empfv3sUH0Y0+yrXjGp0Kq0xT5GJtFvzii/58XqE/+Vo1IH
 kuExMwlh6YBDlMGLDA==
 =OleH
 -----END PGP SIGNATURE-----

Merge tag 'devfreq-fixes-for-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull a devfreq fix for 5.6-rc4 from Chanwoo Choi:

"Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
 - This changes as devfreq(X) cause break some user space applications
 such as Android HAL from Unisoc and Hikey. In result, decide to revert it
 for preventing the HAL layer problem."

* tag 'devfreq-fixes-for-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
2020-02-27 11:21:23 +01:00
Vincent Guittot
289de35984 sched/fair: Fix statistics for find_idlest_group()
sgs->group_weight is not set while gathering statistics in
update_sg_wakeup_stats(). This means that a group can be classified as
fully busy with 0 running tasks if utilization is high enough.

This path is mainly used for fork and exec.

Fixes: 57abff067a ("sched/fair: Rework find_idlest_group()")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20200218144534.4564-1-vincent.guittot@linaro.org
2020-02-27 10:08:27 +01:00
Rafael J. Wysocki
f5739cb0b5 cpufreq: Fix policy initialization for internal governor drivers
Before commit 1e4f63aecb ("cpufreq: Avoid creating excessively
large stack frames") the initial value of the policy field in struct
cpufreq_policy set by the driver's ->init() callback was implicitly
passed from cpufreq_init_policy() to cpufreq_set_policy() if the
default governor was neither "performance" nor "powersave".  After
that commit, however, cpufreq_init_policy() must take that case into
consideration explicitly and handle it as appropriate, so make that
happen.

Fixes: 1e4f63aecb ("cpufreq: Avoid creating excessively large stack frames")
Link: https://lore.kernel.org/linux-pm/39fb762880c27da110086741315ca8b111d781cd.camel@gmail.com/
Reported-by: Artem Bityutskiy <dedekind1@gmail.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-02-27 08:57:48 +01:00
Karsten Graul
a2f2ef4a54 net/smc: check for valid ib_client_data
In smc_ib_remove_dev() check if the provided ib device was actually
initialized for SMC before.

Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com
Fixes: a4cf0443c4 ("smc: introduce SMC as an IB-client")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:56:25 -08:00
Aaro Koskinen
474a31e13a net: stmmac: fix notifier registration
We cannot register the same netdev notifier multiple times when probing
stmmac devices. Register the notifier only once in module init, and also
make debugfs creation/deletion safe against simultaneous notifier call.

Fixes: 481a7d154c ("stmmac: debugfs entry name is not be changed when udev rename device name.")
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:55:14 -08:00
Antoine Tenart
c87a9d6fc6 net: phy: mscc: fix firmware paths
The firmware paths for the VSC8584 PHYs not not contain the leading
'microchip/' directory, as used in linux-firmware, resulting in an
error when probing the driver. This patch fixes it.

Fixes: a5afc16780 ("net: phy: mscc: add support for VSC8584 PHY")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:53:09 -08:00
Paolo Abeni
dc24f8b4ec mptcp: add dummy icsk_sync_mss()
syzbot noted that the master MPTCP socket lacks the icsk_sync_mss
callback, and was able to trigger a null pointer dereference:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 8e171067 P4D 8e171067 PUD 93fa2067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 8984 Comm: syz-executor066 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc900020b7b80 EFLAGS: 00010246
RAX: 1ffff110124ba600 RBX: 0000000000000000 RCX: ffff88809fefa600
RDX: ffff8880994cdb18 RSI: 0000000000000000 RDI: ffff8880925d3140
RBP: ffffc900020b7bd8 R08: ffffffff870225be R09: fffffbfff140652a
R10: fffffbfff140652a R11: 0000000000000000 R12: ffff8880925d35d0
R13: ffff8880925d3140 R14: dffffc0000000000 R15: 1ffff110124ba6ba
FS:  0000000001a0b880(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a6d6f000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 cipso_v4_sock_setattr+0x34b/0x470 net/ipv4/cipso_ipv4.c:1888
 netlbl_sock_setattr+0x2a7/0x310 net/netlabel/netlabel_kapi.c:989
 smack_netlabel security/smack/smack_lsm.c:2425 [inline]
 smack_inode_setsecurity+0x3da/0x4a0 security/smack/smack_lsm.c:2716
 security_inode_setsecurity+0xb2/0x140 security/security.c:1364
 __vfs_setxattr_noperm+0x16f/0x3e0 fs/xattr.c:197
 vfs_setxattr fs/xattr.c:224 [inline]
 setxattr+0x335/0x430 fs/xattr.c:451
 __do_sys_fsetxattr fs/xattr.c:506 [inline]
 __se_sys_fsetxattr+0x130/0x1b0 fs/xattr.c:495
 __x64_sys_fsetxattr+0xbf/0xd0 fs/xattr.c:495
 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440199
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffcadc19e48 EFLAGS: 00000246 ORIG_RAX: 00000000000000be
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440199
RDX: 0000000020000200 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000003 R09: 00000000004002c8
R10: 0000000000000009 R11: 0000000000000246 R12: 0000000000401a20
R13: 0000000000401ab0 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
CR2: 0000000000000000

Address the issue adding a dummy icsk_sync_mss callback.
To properly sync the subflows mss and options list we need some
additional infrastructure, which will land to net-next.

Reported-by: syzbot+f4dfece964792d80b139@syzkaller.appspotmail.com
Fixes: 2303f994b3 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:49:50 -08:00
Sudheesh Mavila
4f31c532ad net: phy: corrected the return value for genphy_check_and_restart_aneg and genphy_c45_check_and_restart_aneg
When auto-negotiation is not required, return value should be zero.

Changes v1->v2:
- improved comments and code as Andrew Lunn and Heiner Kallweit suggestion
- fixed issue in genphy_c45_check_and_restart_aneg as Russell King
  suggestion.

Fixes: 2a10ab043a ("net: phy: add genphy_check_and_restart_aneg()")
Fixes: 1af9f16840 ("net: phy: add genphy_c45_check_and_restart_aneg()")
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:41:42 -08:00
yangerkun
f596c87005 slip: not call free_netdev before rtnl_unlock in slip_open
As the description before netdev_run_todo, we cannot call free_netdev
before rtnl_unlock, fix it by reorder the code.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:37:06 -08:00
Eric Dumazet
b6f6118901 ipv6: restrict IPV6_ADDRFORM operation
IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one.
While this operation sounds illogical, we have to support it.

One of the things it does for TCP socket is to switch sk->sk_prot
to tcp_prot.

We now have other layers playing with sk->sk_prot, so we should make
sure to not interfere with them.

This patch makes sure sk_prot is the default pointer for TCP IPv6 socket.

syzbot reported :
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD a0113067 P4D a0113067 PUD a8771067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427
 __sock_release net/socket.c:605 [inline]
 sock_close+0xe1/0x260 net/socket.c:1283
 __fput+0x2e4/0x740 fs/file_table.c:280
 ____fput+0x15/0x20 fs/file_table.c:313
 task_work_run+0x176/0x1b0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:164 [inline]
 prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195
 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278
 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c429
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429
RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004
RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000
R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c
Modules linked in:
CR2: 0000000000000000
---[ end trace 82567b5207e87bae ]---
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:20:58 -08:00
Ursula Braun
51e3dfa890 net/smc: fix cleanup for linkgroup setup failures
If an SMC connection to a certain peer is setup the first time,
a new linkgroup is created. In case of setup failures, such a
linkgroup is unusable and should disappear. As a first step the
linkgroup is removed from the linkgroup list in smc_lgr_forget().

There are 2 problems:
smc_listen_decline() might be called before linkgroup creation
resulting in a crash due to calling smc_lgr_forget() with
parameter NULL.
If a setup failure occurs after linkgroup creation, the connection
is never unregistered from the linkgroup, preventing linkgroup
freeing.

This patch introduces an enhanced smc_lgr_cleanup_early() function
which
* contains a linkgroup check for early smc_listen_decline()
  invocations
* invokes smc_conn_free() to guarantee unregistering of the
  connection.
* schedules fast linkgroup removal of the unusable linkgroup

And the unused function smcd_conn_free() is removed from smc_core.h.

Fixes: 3b2dec2603 ("net/smc: restructure client and server code in af_smc")
Fixes: 2a0674fffb ("net/smc: improve abnormal termination of link groups")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:18:07 -08:00
Nicolas Saenz Julienne
402482a6a7 net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed
Outdated Raspberry Pi 4 firmware might configure the external PHY as
rgmii although the kernel currently sets it as rgmii-rxid. This makes
connections unreliable as ID_MODE_DIS is left enabled. To avoid this,
explicitly clear that bit whenever we don't need it.

Fixes: da38802211 ("net: bcmgenet: Add RGMII_RXID support")
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 17:12:30 -08:00
Jiri Pirko
1521a67e60 sched: act: count in the size of action flags bitfield
The put of the flags was added by the commit referenced in fixes tag,
however the size of the message was not extended accordingly.

Fix this by adding size of the flags bitfield to the message size.

Fixes: e382267860 ("net: sched: update action implementations to support flags")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 17:10:44 -08:00
Masahiro Yamada
eabc8bcb29 kbuild: get rid of trailing slash from subdir- example
obj-* needs a trailing slash for a directory, but subdir-* does not.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-27 10:03:27 +09:00
Madhuparna Bhowmik
2eb51c75dc net: core: devlink.c: Use built-in RCU list checking
list_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled.

The devlink->lock is held when devlink_dpipe_table_find()
is called in non RCU read side section. Therefore, pass struct devlink
to devlink_dpipe_table_find() for lockdep checking.

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 16:59:18 -08:00
Florian Fainelli
98c5f7d44f net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec
We are still experiencing some packet loss with the existing advanced
congestion buffering (ACB) settings with the IMP port configured for
2Gb/sec, so revert to conservative link speeds that do not produce
packet loss until this is resolved.

Fixes: 8f1880cbe8 ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec")
Fixes: de34d7084e ("net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 16:38:23 -08:00
David S. Miller
574b238f64 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes:

1) Perform garbage collection from workqueue to fix rcu detected
   stall in ipset hash set types, from Jozsef Kadlecsik.

2) Fix the forceadd evaluation path, also from Jozsef.

3) Fix nft_set_pipapo selftest, from Stefano Brivio.

4) Crash when add-flush-add element in pipapo set, also from Stefano.
   Add test to cover this crash.

5) Remove sysctl entry under mutex in hashlimit, from Cong Wang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 16:30:17 -08:00
Linus Torvalds
bfdc6d91a2 platform/chrome fixes for v5.6-rc4
Includes this commit:
 platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path
 
 Fixes a compilation warning.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQCtZK6p/AktxXfkOlzbaomhzOwwgUCXlblyAAKCRBzbaomhzOw
 wt4ZAP9/aN/gan3EUCrJvFW0z2uZbHrWfblFSUNKH2//4KPjjAEAmp6HKG4b/D73
 7eeJCq9cVzCJoYL2MVhwpIwB2oxoDg4=
 =f5X+
 -----END PGP SIGNATURE-----

Merge tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform fix from Benson Leung:
 "Fix a build warning"

* tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path
2020-02-26 15:54:52 -08:00
Jonathan Lemon
9a005c3898 bnxt_en: add newline to netdev_*() format strings
Add missing newlines to netdev_* format strings so the lines
aren't buffered by the printk subsystem.

Nitpicked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 15:52:33 -08:00
Cong Wang
99b79c3900 netfilter: xt_hashlimit: unregister proc file before releasing mutex
Before releasing the global mutex, we only unlink the hashtable
from the hash list, its proc file is still not unregistered at
this point. So syzbot could trigger a race condition where a
parallel htable_create() could register the same file immediately
after the mutex is released.

Move htable_remove_proc_entry() back to mutex protection to
fix this. And, fold htable_destroy() into htable_put() to make
the code slightly easier to understand.

Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com
Fixes: c4a3922d2d ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26 23:25:07 +01:00
Jani Nikula
8e9a400c70 Merge tag 'gvt-fixes-2020-02-26' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2020-02-26

- Fix virtual display reset (Tina)
- Fix one use-after-free for dmabuf (Tina)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200226103016.GC10413@zhen-hp.sh.intel.com
2020-02-26 22:58:25 +02:00
Michal Kubecek
e34f1753ee ethtool: limit bitset size
Syzbot reported that ethnl_compact_sanity_checks() can be tricked into
reading past the end of ETHTOOL_A_BITSET_VALUE and ETHTOOL_A_BITSET_MASK
attributes and even the message by passing a value between (u32)(-31)
and (u32)(-1) as ETHTOOL_A_BITSET_SIZE.

The problem is that DIV_ROUND_UP(attr_nbits, 32) is 0 for such values so
that zero length ETHTOOL_A_BITSET_VALUE will pass the length check but
ethnl_bitmap32_not_zero() check would try to access up to 512 MB of
attribute "payload".

Prevent this overflow byt limiting the bitset size. Technically, compact
bitset format would allow bitset sizes up to almost 2^18 (so that the
nest size does not exceed U16_MAX) but bitsets used by ethtool are much
shorter. S16_MAX, the largest value which can be directly used as an
upper limit in policy, should be a reasonable compromise.

Fixes: 10b518d4e6 ("ethtool: netlink bitset handling")
Reported-by: syzbot+7fd4ed5b4234ab1fdccd@syzkaller.appspotmail.com
Reported-by: syzbot+709b7a64d57978247e44@syzkaller.appspotmail.com
Reported-by: syzbot+983cb8fb2d17a7af549d@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 11:27:31 -08:00
Amritha Nambiar
6e11d1578f net: Fix Tx hash bound checking
Fixes the lower and upper bounds when there are multiple TCs and
traffic is on the the same TC on the same device.

The lower bound is represented by 'qoffset' and the upper limit for
hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue
mapping when there are multiple TCs, as the queue indices for upper TCs
will be offset by 'qoffset'.

v2: Fixed commit description based on comments.

Fixes: 1b837d489e ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash")
Fixes: eadec877ce ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 11:14:10 -08:00
Daniel Vetter
eb12c95773 drm/radeon: Inline drm_get_pci_dev
It's the last user, and more importantly, it's the last non-legacy
user of anything in drm_pci.c.

The only tricky bit is the agp initialization. But a close look shows
that radeon does not use the drm_agp midlayer (the main use of that is
drm_bufs for legacy drivers), and instead could use the agp subsystem
directly (like nouveau does already). Hence we can just pull this in
too.

A further step would be to entirely drop the use of drm_device->agp,
but feels like too much churn just for this patch.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-26 14:02:41 -05:00
Daniel Vetter
8a3bddf67c drm/amdgpu: Drop DRIVER_USE_AGP
This doesn't do anything except auto-init drm_agp support when you
call drm_get_pci_dev(). Which amdgpu stopped doing with

commit b58c11314a
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jun 2 17:16:31 2017 -0400

    drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev

No idea whether this was intentional or accidental breakage, but I
guess anyone who manages to boot a this modern gpu behind an agp
bridge deserves a price. A price I never expect anyone to ever collect
:-)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xiaojie Yuan <xiaojie.yuan@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: "Tianci.Yin" <tianci.yin@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-26 14:02:41 -05:00
Linus Torvalds
91ad64a84e Tracing updates:
Change in API of bootconfig (before it comes live in a release)
   - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists
   - Set CONFIG_BOOT_CONFIG to 'n' by default
   - Show error if "bootconfig" on cmdline but not compiled in
   - Prevent redefining the same value
   - Have a way to append values
   - Added a SELECT BLK_DEV_INITRD to fix a build failure
 
  Synthetic event fixes:
   - Switch to raw_smp_processor_id() for recording CPU value in preempt
     section. (No care for what the value actually is)
   - Fix samples always recording u64 values
   - Fix endianess
   - Check number of values matches number of fields
   - Fix a printing bug
 
  Fix of trace_printk() breaking postponed start up tests
 
  Make a function static that is only used in a single file.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXlW4vxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtioAP0WLEm3dWO0z3321h/a0DSshC+Bslu3
 HDPTsGVGrXmvggEA/lr1ikRHd8PsO7zW8BfaZMxoXaTqXiuSrzEWxnMlFw0=
 =O8PM
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing and bootconfig updates:
 "Fixes and changes to bootconfig before it goes live in a release.

  Change in API of bootconfig (before it comes live in a release):
  - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig
    exists
  - Set CONFIG_BOOT_CONFIG to 'n' by default
  - Show error if "bootconfig" on cmdline but not compiled in
  - Prevent redefining the same value
  - Have a way to append values
  - Added a SELECT BLK_DEV_INITRD to fix a build failure

  Synthetic event fixes:
  - Switch to raw_smp_processor_id() for recording CPU value in preempt
    section. (No care for what the value actually is)
  - Fix samples always recording u64 values
  - Fix endianess
  - Check number of values matches number of fields
  - Fix a printing bug

  Fix of trace_printk() breaking postponed start up tests

  Make a function static that is only used in a single file"

* tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue
  bootconfig: Add append value operator support
  bootconfig: Prohibit re-defining value on same key
  bootconfig: Print array as multiple commands for legacy command line
  bootconfig: Reject subkey and value on same parent key
  tools/bootconfig: Remove unneeded error message silencer
  bootconfig: Add bootconfig magic word for indicating bootconfig explicitly
  bootconfig: Set CONFIG_BOOT_CONFIG=n by default
  tracing: Clear trace_state when starting trace
  bootconfig: Mark boot_config_checksum() static
  tracing: Disable trace_printk() on post poned tests
  tracing: Have synthetic event test use raw_smp_processor_id()
  tracing: Fix number printing bug in print_synth_event()
  tracing: Check that number of vals matches number of synth event fields
  tracing: Make synth_event trace functions endian-correct
  tracing: Make sure synth_event_trace() example always uses u64
2020-02-26 10:34:42 -08:00
Linus Torvalds
b98cce1ef5 linux-kselftest-kunit-5.6-rc4
This Kselftest kunit update consists of fixes to documentation and
 run-time tool from Brendan Higgins and Heidi Fahim.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl5VuwUACgkQCwJExA0N
 QxwJiw/+OgVUhIVw4GNvuyDfRruZBR77h41brG3yIlkiJeswxrJBvv6mgQWP69nu
 3V2MO7DrJ/Y4LINZ4ElGyiSMpoY+Tpex7GBX0WZy31FVrmOAd4AhZ/fHZar1k4ye
 7rnts9Py6PwIYVxO3hcuDAfpIhEa98qKTKhVrLfHxR2CxbcvKDXIWfvz1gcp5M3y
 n4D3KVXwmb6yy7q85l8VjwxXevdaFp/bGmRW5HwzpMPJkrtBJWQrFJBGxeX1LVTY
 IcNKGu61Efd2KP6K9WF6EyS/seD+GbyuFOMq9xOG3WM6f65EILq6K6A24EGZtUxV
 IpJySFvewf+in8lzQql6F0flCvThYXkf2Dofi3yoQAda0XrwcL+Z/rugeLMQoEHN
 bYgCKzwW/otwLpJHlWJLPxEnWfuY7A1025xG7Ly+k7qBVsKy2aMZk70gP9uPr6hh
 lCp+zRRrnMAwFgKNSD6hVC+yblw0ACXv0UmL+ccUtX5KtSa+yYJ3JFZhOFzhhHug
 vwXCF5eLYdGuBVNWAO39kyLyV02nUwXiNaoVW5NF9fNpq6HdA6XWcofcV70AM6WZ
 l3s2MDBq7hc7edYknnTHCgaFlHqIlWkFAm828HtJXBV3IpHAagPRFWUVWnkfPlU9
 FCQXfnbkteB2ZUlHQwjUGBZzh07ZV0iafzNZcYzgyFCjDlVeHDw=
 =Q7Zl
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kunit fixes from Shuah Khan:
 "This Kselftest kunit update consists of fixes to documentation and
  the run-time tool from Brendan Higgins and Heidi Fahim"

* tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: run kunit_tool from any directory
  kunit: test: Improve error messages for kunit_tool when kunitconfig is invalid
  Documentation: kunit: fixed sphinx error in code block
2020-02-26 10:28:59 -08:00
Linus Torvalds
2fcc74178f linux-kselftest-5.6-rc4
This Kselftest update for Linux 5.6-rc4 consists of:
 
 -  fixes to TIMEOUT failures and out-of-tree compilation compilation
    errors from Michael Ellerman.
 - Declutter git status fix from Christophe Leroy
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl5VlisACgkQCwJExA0N
 QxzxAhAAol+8YeyQNqkesjUUPZR+hc7fM1G3TfHlwar5ljhlwbIOFCtjp66b9EKA
 4Cxy5s2/Vhkbs6CFJPa78UXRoH1enMejff6Dd5njwwNmS+cE1wAatM8RBSJeB4X3
 hMjfXCwvjJXqNhayD8n+sHmpEVtCL8SmiG5kKfQu6s+qXN/4EEUw1AaUfms4WO9t
 VDDC8Cc8RKhl9ZM1YxZTMoS7xISoWeZM94+aK12kXfL/rlt86k0FcN1FoApf/kIo
 15ILTo4cZvWMCLdDxbpw6RSGSdB9+siNFNnWnVp5ytTaD8nVRjLSf/sHlu5B9dvh
 VHPA56lofJmXjMxz/cNoHP2jgVsu+hNuG8J3h/GYkaCd6mEG8f5k7kAdqJjQ1D1/
 3cA54DtxCxfmDji24bTJaD5+uG60NAAh1EjeNKiWkMK07zsUxzXqDgJLLUM67EFk
 cYYwTcT9Yqc/GKVV7e2BkiwOiIYQih0NTg2ugV2HEdmm/1EqycoS0McwzIAIa5+2
 k6iUQ3nlpjLnP7vz4950aLVD9a5CsrRM9dY+ngYcbaAX00g9s0G0sLVfRXW6Ls2t
 9KMYoio1ERILqwvkHgdDyEXGUW/uMYhVMpbx647ZjtRAVNSVTvxZe4jIewZ3o6lx
 6vJ+sxYrrXoyZPPUrQGq3NiHg3Wh8BDw5EZaXuuo8JHbVCpvrMk=
 =QRUz
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - fixes to TIMEOUT failures and out-of-tree compilation compilation
   errors from Michael Ellerman.

 - declutter git status fix from Christophe Leroy

* tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/rseq: Fix out-of-tree compilation
  selftests: Install settings files to fix TIMEOUT failures
  selftest/lkdtm: Don't pollute 'git status'
2020-02-26 10:06:56 -08:00
Christoph Hellwig
cfe2ce49b9 Revert "KVM: x86: enable -Werror"
This reverts commit ead68df94d.

Using the -Werror flag breaks the build for me due to mostly harmless
KASAN or similar warnings:

  arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
  arch/x86/kvm/x86.c:7209:1: error: the frame size of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

Feel free to add a CONFIG_WERROR if you care strong enough, but don't
break peoples builds for absolutely no good reason.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-26 09:59:58 -08:00
Linus Torvalds
fda31c5029 signal: avoid double atomic counter increments for user accounting
When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, on that machine for some reason seems to be much worse when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-26 09:54:03 -08:00
Jens Axboe
dd3db2a34c io_uring: drop file set ref put/get on switch
Dan reports that he triggered a warning on ring exit doing some testing:

percpu ref (io_file_data_ref_zero) <= 0 (0) after switching to atomic
WARNING: CPU: 3 PID: 0 at lib/percpu-refcount.c:160 percpu_ref_switch_to_atomic_rcu+0xe8/0xf0
Modules linked in:
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-rc3+ #5648
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0xe8/0xf0
Code: e7 ff 55 e8 eb d2 80 3d bd 02 d2 00 00 75 8b 48 8b 55 d8 48 c7 c7 e8 70 e6 81 c6 05 a9 02 d2 00 01 48 8b 75 e8 e8 3a d0 c5 ff <0f> 0b e9 69 ff ff ff 90 55 48 89 fd 53 48 89 f3 48 83 ec 28 48 83
RSP: 0018:ffffc90000110ef8 EFLAGS: 00010292
RAX: 0000000000000045 RBX: 7fffffffffffffff RCX: 0000000000000000
RDX: 0000000000000045 RSI: ffffffff825be7a5 RDI: ffffffff825bc32c
RBP: ffff8881b75eac38 R08: 000000042364b941 R09: 0000000000000045
R10: ffffffff825beb40 R11: ffffffff825be78a R12: 0000607e46005aa0
R13: ffff888107dcdd00 R14: 0000000000000000 R15: 0000000000000009
FS:  0000000000000000(0000) GS:ffff8881b9d80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f49e6a5ea20 CR3: 00000001b747c004 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 rcu_core+0x1e4/0x4d0
 __do_softirq+0xdb/0x2f1
 irq_exit+0xa0/0xb0
 smp_apic_timer_interrupt+0x60/0x140
 apic_timer_interrupt+0xf/0x20
 </IRQ>
RIP: 0010:default_idle+0x23/0x170
Code: ff eb ab cc cc cc cc 0f 1f 44 00 00 41 54 55 53 65 8b 2d 10 96 92 7e 0f 1f 44 00 00 e9 07 00 00 00 0f 00 2d 21 d0 51 00 fb f4 <65> 8b 2d f6 95 92 7e 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 e5 95

Turns out that this is due to percpu_ref_switch_to_atomic() only
grabbing a reference to the percpu refcount if it's not already in
atomic mode. io_uring drops a ref and re-gets it when switching back to
percpu mode. We attempt to protect against this with the FFD_F_ATOMIC
bit, but that isn't reliable.

We don't actually need to juggle these refcounts between atomic and
percpu switch, we can just do them when we've switched to atomic mode.
This removes the need for FFD_F_ATOMIC, which wasn't reliable.

Fixes: 05f3fb3c53 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-26 10:53:33 -07:00
John Garry
cae740a04b blk-mq: Remove some unused function arguments
The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(),
blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove
it.

Overall obj code size shows a minor reduction, before:
   text	   data	    bss	    dec	    hex	filename
  27306	   1312	      0	  28618	   6fca	block/blk-mq.o
   4303	    272	      0	   4575	   11df	block/blk-mq-tag.o

after:
  27282	   1312	      0	  28594	   6fb2	block/blk-mq.o
   4311	    272	      0	   4583	   11e7	block/blk-mq-tag.o

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
--
This minor patch had been carried as part of the blk-mq shared tags RFC,
I'd rather not carry it anymore as it required rebasing, so now or never..
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-26 10:34:41 -07:00
Masahiro Yamada
c473a8d03e kbuild: add dt_binding_check to PHONY in a correct place
The dt_binding_check is added to PHONY, but it is invisible when
$(dtstree) is empty. So, it is not specified as phony for
ARCH=x86 etc.

Add it to PHONY outside the ifneq ... endif block.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
2020-02-27 02:28:15 +09:00
Masahiro Yamada
964a596db8 kbuild: add dtbs_check to PHONY
The dtbs_check should be a phony target, but currently it is not
specified so.

'make dtbs_check' works even if a file named 'dtbs_check' exists
because it depends on another phony target, scripts_dtc, but we
should not rely on it.

Add dtbs_check to PHONY.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
2020-02-27 02:26:01 +09:00
Masahiro Yamada
fd63fab48f kbuild: remove unneeded semicolon at the end of cmd_dtb_check
This trailing semicolon is unneeded.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
2020-02-27 02:24:34 +09:00
Masahiro Yamada
7a04960560 kbuild: fix DT binding schema rule to detect command line changes
This if_change_rule is not working properly; it cannot detect any
command line change.

The reason is because cmd-check in scripts/Kbuild.include compares
$(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here.

For if_change_rule to work properly, the stem part of cmd_* and rule_*
must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must
be named rule_dtc.

Fixes: 4f0e3a57d6 ("kbuild: Add support for DT binding schema checks")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
2020-02-27 02:23:12 +09:00
Masahiro Yamada
eccbde4f6c kbuild: remove wrong documentation about mandatory-y
This sentence does not make sense in the section about mandatory-y.

This seems to be a copy-paste mistake of commit fcc8487d47 ("uapi:
export all headers under uapi directories").

The correct description would be "The convention is to list one
mandatory-y per line ...".

I just removed it instead of fixing it. If such information is needed,
it could be commented in include/asm-generic/Kbuild and
include/uapi/asm-generic/Kbuild.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-27 02:19:30 +09:00
Randy Dunlap
505b12b386 kbuild: add comment for V=2 mode
Complete the comments for valid values of KBUILD_VERBOSE,
specifically for KBUILD_VERBOSE=2.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-27 02:19:07 +09:00
Jason A. Donenfeld
be36f9e751 efi: READ_ONCE rng seed size before munmap
This function is consistent with using size instead of seed->size
(except for one place that this patch fixes), but it reads seed->size
without using READ_ONCE, which means the compiler might still do
something unwanted. So, this commit simply adds the READ_ONCE
wrapper.

Fixes: 636259880a ("efi: Add support for seeding the RNG from a UEFI ...")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-efi@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200217123354.21140-1-Jason@zx2c4.com
Link: https://lore.kernel.org/r/20200221084849.26878-5-ardb@kernel.org
2020-02-26 15:31:43 +01:00
Ard Biesheuvel
8319e9d5ad efi/x86: Handle by-ref arguments covering multiple pages in mixed mode
The mixed mode runtime wrappers are fragile when it comes to how the
memory referred to by its pointer arguments are laid out in memory, due
to the fact that it translates these addresses to physical addresses that
the runtime services can dereference when running in 1:1 mode. Since
vmalloc'ed pages (including the vmap'ed stack) are not contiguous in the
physical address space, this scheme only works if the referenced memory
objects do not cross page boundaries.

Currently, the mixed mode runtime service wrappers require that all by-ref
arguments that live in the vmalloc space have a size that is a power of 2,
and are aligned to that same value. While this is a sensible way to
construct an object that is guaranteed not to cross a page boundary, it is
overly strict when it comes to checking whether a given object violates
this requirement, as we can simply take the physical address of the first
and the last byte, and verify that they point into the same physical page.

When this check fails, we emit a WARN(), but then simply proceed with the
call, which could cause data corruption if the next physical page belongs
to a mapping that is entirely unrelated.

Given that with vmap'ed stacks, this condition is much more likely to
trigger, let's relax the condition a bit, but fail the runtime service
call if it does trigger.

Fixes: f6697df36b ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-efi@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200221084849.26878-4-ardb@kernel.org
2020-02-26 15:31:42 +01:00
Ard Biesheuvel
f80c9f6476 efi/x86: Remove support for EFI time and counter services in mixed mode
Mixed mode calls at runtime are rather tricky with vmap'ed stacks,
as we can no longer assume that data passed in by the callers of the
EFI runtime wrapper routines is contiguous in physical memory.

We need to fix this, but before we do, let's drop the implementations
of routines that we know are never used on x86, i.e., the RTC related
ones. Given that UEFI rev2.8 permits any runtime service to return
EFI_UNSUPPORTED at runtime, let's return that instead.

As get_next_high_mono_count() is never used at all, even on other
architectures, let's make that return EFI_UNSUPPORTED too.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-efi@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200221084849.26878-3-ardb@kernel.org
2020-02-26 15:31:42 +01:00
Ard Biesheuvel
63056e8b5e efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
Hans reports that his mixed mode systems running v5.6-rc1 kernels hit
the WARN_ON() in virt_to_phys_or_null_size(), caused by the fact that
efi_guid_t objects on the vmap'ed stack happen to be misaligned with
respect to their sizes. As a quick (i.e., backportable) fix, copy GUID
pointer arguments to the local stack into a buffer that is naturally
aligned to its size, so that it is guaranteed to cover only one
physical page.

Note that on x86, we cannot rely on the stack pointer being aligned
the way the compiler expects, so we need to allocate an 8-byte aligned
buffer of sufficient size, and copy the GUID into that buffer at an
offset that is aligned to 16 bytes.

Fixes: f6697df36b ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: linux-efi@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200221084849.26878-2-ardb@kernel.org
2020-02-26 15:31:41 +01:00
Ingo Molnar
4c45945aa4 perf/urgent fixes:
auxtrace:
 
   Adrian Hunter:
 
   - Fix endless record after being terminated on arm-spe.
 
   Wei Li:
 
   - Fix endless record after being terminated on Intel PT and BTS and
     on ARM's cs-etm.
 
 perf test:
 
   Thomas Richter
 
   - Fix test trace+probe_vfs_getname.sh on s390
 
 PowerPC:
 
   Arnaldo Carvalho de Melo:
 
   - Sync powerpc syscall.tbl with the kernel sources.
 
 BPF:
 
   Arnaldo Carvalho de Melo:
 
   - Remove extraneous bpf/ subdir from bpf.h headers used to build bpf events.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXk82oQAKCRCyPKLppCJ+
 J88IAP9m763x0sW44iNxv2RvA021JWYSbMb5XnsUTZ+ELBkqEAD/fU80obv1hAXP
 8LIc6bC6Zv/J+RMa3HHNx0qLiLqgHQo=
 =m+4d
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-5.6-20200220' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

auxtrace:

  Adrian Hunter:

  - Fix endless record after being terminated on arm-spe.

  Wei Li:

  - Fix endless record after being terminated on Intel PT and BTS and
    on ARM's cs-etm.

perf test:

  Thomas Richter

  - Fix test trace+probe_vfs_getname.sh on s390

PowerPC:

  Arnaldo Carvalho de Melo:

  - Sync powerpc syscall.tbl with the kernel sources.

BPF:

  Arnaldo Carvalho de Melo:

  - Remove extraneous bpf/ subdir from bpf.h headers used to build bpf events.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-02-26 15:18:05 +01:00
Jens Axboe
3a9015988b io_uring: import_single_range() returns 0/-ERROR
Unlike the other core import helpers, import_single_range() returns 0 on
success, not the length imported. This means that links that depend on
the result of non-vec based IORING_OP_{READ,WRITE} that were added for
5.5 get errored when they should not be.

Fixes: 3a6820f2bb ("io_uring: add non-vectored read/write commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-26 07:06:57 -07:00
Jens Axboe
2a44f46781 io_uring: pick up link work on submit reference drop
If work completes inline, then we should pick up a dependent link item
in __io_queue_sqe() as well. If we don't do so, we're forced to go async
with that item, which is suboptimal.

This also fixes an issue with io_put_req_find_next(), which always looks
up the next work item. That should only be done if we're dropping the
last reference to the request, to prevent multiple lookups of the same
work item.

Outside of being a fix, this also enables a good cleanup series for 5.7,
where we never have to pass 'nxt' around or into the work handlers.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-26 07:05:30 -07:00
Stefano Brivio
0954df70fb selftests: nft_concat_range: Add test for reported add/flush/add issue
Add a specific test for the crash reported by Phil Sutter and addressed
in the previous patch. The test cases that, in my intention, should
have covered these cases, that is, the ones from the 'concurrency'
section, don't run these sequences tightly enough and spectacularly
failed to catch this.

While at it, define a convenient way to add these kind of tests, by
adding a "reported issues" test section.

It's more convenient, for this particular test, to execute the set
setup in its own function. However, future test cases like this one
might need to call setup functions, and will typically need no tools
other than nft, so allow for this in check_tools().

The original form of the reproducer used here was provided by Phil.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26 14:33:09 +01:00