Commit graph

901515 commits

Author SHA1 Message Date
Qais Yousef
d9cb236b94 sched/rt: cpupri_find: Implement fallback mechanism for !fit case
When searching for the best lowest_mask with a fitness_fn passed, make
sure we record the lowest_level that returns a valid lowest_mask so that
we can use that as a fallback in case we fail to find a fitting CPU at
all levels.

The intention in the original patch was not to allow a down migration to
unfitting CPU. But this missed the case where we are already running on
unfitting one.

With this change now RT tasks can still move between unfitting CPUs when
they're already running on such CPU.

And as Steve suggested; to adhere to the strict priority rules of RT, if
a task is already running on a fitting CPU but due to priority it can't
run on it, allow it to downmigrate to unfitting CPU so it can run.

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-2-qais.yousef@arm.com
Link: https://lore.kernel.org/lkml/20200203142712.a7yvlyo2y3le5cpn@e107158-lin/
2020-03-06 12:57:26 +01:00
Vincent Guittot
5ab297bab9 sched/fair: Fix reordering of enqueue/dequeue_task_fair()
Even when a cgroup is throttled, the group se of a child cgroup can still
be enqueued and its gse->on_rq stays true. When a task is enqueued on such
child, we still have to update the load_avg and increase
h_nr_running of the throttled cfs. Nevertheless, the 1st
for_each_sched_entity() loop is skipped because of gse->on_rq == true and the
2nd loop because the cfs is throttled whereas we have to update both
load_avg with the old h_nr_running and increase h_nr_running in such case.

The same sequence can happen during dequeue when se moves to parent before
breaking in the 1st loop.

Note that the update of load_avg will effectively happen only once in order
to sync up to the throttled time. Next call for updating load_avg will stop
early because the clock stays unchanged.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 6d4d22468d ("sched/fair: Reorder enqueue/dequeue_task_fair path")
Link: https://lkml.kernel.org/r/20200306084208.12583-1-vincent.guittot@linaro.org
2020-03-06 12:57:25 +01:00
Vincent Guittot
6212437f0f sched/fair: Fix runnable_avg for throttled cfs
When a cfs_rq is throttled, its group entity is dequeued and its running
tasks are removed. We must update runnable_avg with the old h_nr_running
and update group_se->runnable_weight with the new h_nr_running at each
level of the hierarchy.

Reviewed-by: Ben Segall <bsegall@google.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 9f68395333 ("sched/pelt: Add a new runnable average signal")
Link: https://lkml.kernel.org/r/20200227154115.8332-1-vincent.guittot@linaro.org
2020-03-06 12:57:25 +01:00
Yu Chen
ba4f7bc1de sched/deadline: Make two functions static
Since commit 06a76fe08d ("sched/deadline: Move DL related code
from sched/core.c to sched/deadline.c"), DL related code moved to
deadline.c.

Make the following two functions static since they're only used in
deadline.c:

	dl_change_utilization()
	init_dl_rq_bw_ratio()

Signed-off-by: Yu Chen <chen.yu@easystack.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200228100329.16927-1-chen.yu@easystack.cn
2020-03-06 12:57:24 +01:00
Valentin Schneider
6f693dd5be arm64: defconfig: enable CONFIG_SCHED_SMT
The (CFS) scheduler has some extra logic catering to systems with SMT, but
that logic won't be compiled in unless the above config is set.

Note that the SMT-centric codepaths are gated by the sched_smt_present
static key, and the SMT sched_domains will only survive if the platform has
SMT. As such, the only impact on !SMT platforms should be a slightly
bigger kernel - no behavioural change.

Distro kernels already enable it, which makes sense since there already are
things like ThunderX2 out in the wild. Enable it for the defconfig.

Some deltas
===========

FWIW my ELF symbol table diff looks something like this:

  NAME                                BEFORE    AFTER     DELTA
  update_sd_lb_stats.constprop.135    0         1864      +1864
  find_idlest_group.isra.115          0         1808      +1808
  update_numa_stats.isra.121          0         628       +628
  select_task_rq_fair                 3236      3732      +496
  compute_energy.isra.112             0         420       +420
  score_nearby_nodes.part.120         0         380       +380
  __update_idle_core                  0         232       +232
  nohz_balance_exit_idle.part.127     0         216       +216
  sched_slice.isra.99                 0         172       +172
  update_load_avg.part.107            0         116       +116
  wakeup_preempt_entity.isra.101      0         92        +92
  sched_cpu_activate                  340       396       +56
  pick_next_task_idle                 8         56        +48
  sched_cpu_deactivate                252       292       +40
  show_smt_active                     44        80        +36
  cpu_smt_mask                        0         28        +28
  set_next_task_idle                  4         32        +28
  task_numa_work                      680       692       +12
  cpu_smt_flags                       0         8         +8
  enqueue_task_fair                   2608      2612      +4
  wakeup_preempt_entity.isra.104      92        0         -92
  update_load_avg                     1028      932       -96
  task_numa_migrate                   1824      1728      -96
  sched_slice.isra.102                172       0         -172
  nohz_balance_exit_idle.part.130     216       0         -216
  task_numa_find_cpu                  2116      1868      -248
  score_nearby_nodes.part.123         380       0         -380
  compute_energy.isra.115             420       0         -420
  update_numa_stats.isra.124          472       0         -472
  find_idlest_group.isra.118          1808      0         -1808
  update_sd_lb_stats.constprop.138    1864      0         -1864
  ------------------------------------------------------------------
  DELTA SUM                                               +820

As for the sched_domains, this is on a hikey960:

before:
  $ cat /proc/sys/kernel/sched_domain/cpu*/domain*/name | sort | uniq
  DIE
  MC

after:
  $ cat /proc/sys/kernel/sched_domain/cpu*/domain*/name | sort | uniq
  DIE
  MC

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200227191433.31994-3-valentin.schneider@arm.com
2020-03-06 12:57:23 +01:00
Valentin Schneider
38502ab4bf sched/topology: Don't enable EAS on SMT systems
EAS already requires asymmetric CPU capacities to be enabled, and mixing
this with SMT is an aberration, but better be safe than sorry.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Quentin Perret <qperret@google.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200227191433.31994-2-valentin.schneider@arm.com
2020-03-06 12:57:23 +01:00
Mel Gorman
0621df3154 sched/numa: Acquire RCU lock for checking idle cores during NUMA balancing
Qian Cai reported the following bug:

  The linux-next commit ff7db0bf24 ("sched/numa: Prefer using an idle CPU as a
  migration target instead of comparing tasks") introduced a boot warning,

  [   86.520534][    T1] WARNING: suspicious RCU usage
  [   86.520540][    T1] 5.6.0-rc3-next-20200227 #7 Not tainted
  [   86.520545][    T1] -----------------------------
  [   86.520551][    T1] kernel/sched/fair.c:5914 suspicious rcu_dereference_check() usage!
  [   86.520555][    T1]
  [   86.520555][    T1] other info that might help us debug this:
  [   86.520555][    T1]
  [   86.520561][    T1]
  [   86.520561][    T1] rcu_scheduler_active = 2, debug_locks = 1
  [   86.520567][    T1] 1 lock held by systemd/1:
  [   86.520571][    T1]  #0: ffff8887f4b14848 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x1d2/0x998
  [   86.520594][    T1]
  [   86.520594][    T1] stack backtrace:
  [   86.520602][    T1] CPU: 1 PID: 1 Comm: systemd Not tainted 5.6.0-rc3-next-20200227 #7

task_numa_migrate() checks for idle cores when updating NUMA-related statistics.
This relies on reading a RCU-protected structure in test_idle_cores() via this
call chain

task_numa_migrate
  -> update_numa_stats
    -> numa_idle_core
      -> test_idle_cores

While the locking could be fine-grained, it is more appropriate to acquire
the RCU lock for the entire scan of the domain. This patch removes the
warning triggered at boot time.

Reported-by: Qian Cai <cai@lca.pw>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: ff7db0bf24 ("sched/numa: Prefer using an idle CPU as a migration target instead of comparing tasks")
Link: https://lkml.kernel.org/r/20200227191804.GJ3818@techsingularity.net
2020-03-06 12:57:22 +01:00
Valentin Schneider
76c389ab2b sched/fair: Fix kernel build warning in test_idle_cores() for !SMT NUMA
Building against the tip/sched/core as ff7db0bf24 ("sched/numa: Prefer
using an idle CPU as a migration target instead of comparing tasks") with
the arm64 defconfig (which doesn't have CONFIG_SCHED_SMT set) leads to:

  kernel/sched/fair.c:1525:20: warning: 'test_idle_cores' declared 'static' but never defined [-Wunused-function]
   static inline bool test_idle_cores(int cpu, bool def);
		      ^~~~~~~~~~~~~~~

Rather than define it in its own CONFIG_SCHED_SMT #define island, bunch it
up with test_idle_cores().

Reported-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[mgorman@techsingularity.net: Edit changelog, minor style change]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: ff7db0bf24 ("sched/numa: Prefer using an idle CPU as a migration target instead of comparing tasks")
Link: https://lkml.kernel.org/r/20200303110258.1092-3-mgorman@techsingularity.net
2020-03-06 12:57:22 +01:00
Thara Gopinath
05289b90c2 sched/fair: Enable tuning of decay period
Thermal pressure follows pelt signals which means the decay period for
thermal pressure is the default pelt decay period. Depending on SoC
characteristics and thermal activity, it might be beneficial to decay
thermal pressure slower, but still in-tune with the pelt signals.  One way
to achieve this is to provide a command line parameter to set a decay
shift parameter to an integer between 0 and 10.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-10-thara.gopinath@linaro.org
2020-03-06 12:57:21 +01:00
Thara Gopinath
f12e4f66ab thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping
Thermal governors can request for a CPU's maximum supported frequency to
be capped in case of an overheat event. This in turn means that the
maximum capacity available for tasks to run on the particular CPU is
reduced. Delta between the original maximum capacity and capped maximum
capacity is known as thermal pressure. Enable cpufreq cooling device to
update the thermal pressure in event of a capped maximum frequency.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-9-thara.gopinath@linaro.org
2020-03-06 12:57:21 +01:00
Thara Gopinath
467b7d01c4 sched/fair: Update cpu_capacity to reflect thermal pressure
cpu_capacity initially reflects the maximum possible capacity of a CPU.
Thermal pressure on a CPU means this maximum possible capacity is
unavailable due to thermal events. This patch subtracts the average
thermal pressure for a CPU from its maximum possible capacity so that
cpu_capacity reflects the remaining maximum capacity.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-8-thara.gopinath@linaro.org
2020-03-06 12:57:20 +01:00
Thara Gopinath
b4eccf5f8e sched/fair: Enable periodic update of average thermal pressure
Introduce support in scheduler periodic tick and other CFS bookkeeping
APIs to trigger the process of computing average thermal pressure for a
CPU. Also consider avg_thermal.load_avg in others_have_blocked which
allows for decay of pelt signals.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-7-thara.gopinath@linaro.org
2020-03-06 12:57:20 +01:00
Thara Gopinath
8eab879c54 arm/topology: Populate arch_scale_thermal_pressure() for ARM platforms
Hook up topology_get_thermal_pressure to arch_scale_thermal_pressure thus
enabling scheduler to retrieve instantaneous thermal pressure of a CPU.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-6-thara.gopinath@linaro.org
2020-03-06 12:57:19 +01:00
Thara Gopinath
ae1677c0bb arm64/topology: Populate arch_scale_thermal_pressure() for arm64 platforms
Hook up topology_get_thermal_pressure to arch_scale_thermal_pressure thus
enabling scheduler to retrieve instantaneous thermal pressure of a CPU.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-5-thara.gopinath@linaro.org
2020-03-06 12:57:19 +01:00
Thara Gopinath
ad58cc5cc5 drivers/base/arch_topology: Add infrastructure to store and update instantaneous thermal pressure
Add architecture specific APIs to update and track thermal pressure on a
per CPU basis. A per CPU variable thermal_pressure is introduced to keep
track of instantaneous per CPU thermal pressure. Thermal pressure is the
delta between maximum capacity and capped capacity due to a thermal event.

topology_get_thermal_pressure can be hooked into the scheduler specified
arch_scale_thermal_pressure to retrieve instantaneous thermal pressure of
a CPU.

arch_set_thermal_pressure can be used to update the thermal pressure.

Considering topology_get_thermal_pressure reads thermal_pressure and
arch_set_thermal_pressure writes into thermal_pressure, one can argue for
some sort of locking mechanism to avoid a stale value.  But considering
topology_get_thermal_pressure can be called from a system critical path
like scheduler tick function, a locking mechanism is not ideal. This means
that it is possible the thermal_pressure value used to calculate average
thermal pressure for a CPU can be stale for up to 1 tick period.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-4-thara.gopinath@linaro.org
2020-03-06 12:57:18 +01:00
Thara Gopinath
36a0df85d2 sched/topology: Add callback to read per CPU thermal pressure
Introduce the arch_scale_thermal_pressure() callback to retrieve per CPU thermal
pressure.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-3-thara.gopinath@linaro.org
2020-03-06 12:57:17 +01:00
Thara Gopinath
765047932f sched/pelt: Add support to track thermal pressure
Extrapolating on the existing framework to track rt/dl utilization using
pelt signals, add a similar mechanism to track thermal pressure. The
difference here from rt/dl utilization tracking is that, instead of
tracking time spent by a CPU running a RT/DL task through util_avg, the
average thermal pressure is tracked through load_avg. This is because
thermal pressure signal is weighted time "delta" capacity unlike util_avg
which is binary. "delta capacity" here means delta between the actual
capacity of a CPU and the decreased capacity a CPU due to a thermal event.

In order to track average thermal pressure, a new sched_avg variable
avg_thermal is introduced. Function update_thermal_load_avg can be called
to do the periodic bookkeeping (accumulate, decay and average) of the
thermal pressure.

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-2-thara.gopinath@linaro.org
2020-03-06 12:57:17 +01:00
Chris Wilson
f1dfdab694 sched/vtime: Prevent unstable evaluation of WARN(vtime->state)
As the vtime is sampled under loose seqcount protection by kcpustat, the
vtime fields may change as the code flows. Where logic dictates a field
has a static value, use a READ_ONCE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 74722bb223 ("sched/vtime: Bring up complete kcpustat accessor")
Link: https://lkml.kernel.org/r/20200123180849.28486-1-frederic@kernel.org
2020-03-06 12:57:16 +01:00
Ingo Molnar
1b10d388d0 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-03-06 12:49:56 +01:00
Linus Torvalds
8b614cb8f1 five small cifs/smb3 fixes, two for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl5dnvEACgkQiiy9cAdy
 T1FaWAv/XnyYfYh6H4fhtgtfNxW9xt9mkHo/AohHcf2rk2erqjVz0lHVe7SuS9C5
 EpDYnijZKa//aiIV6VzDymPaMrXQZ+oCAExAzLPmWZnLeZ65Q02K2P1F3KvURdue
 4nLjuOyzyG4YYkoBi4wKneu1Ji377m9L6BpSfM+MzPScCOl8OV/vv/nBRY1N6gIY
 Rreq5iipRaDhifsaOgiA501sUu7mvpPEHNpluCtFmY4iTHQzYqjWZ5ZGXr2xz63n
 5VV8KWWn/p3nhJGt7L/1aynws59AdEd5GNZ5FbDQHokx9n3MMnyl4QGDzUehnhlY
 Ym6n50QA5QMn9I9NLg8I2aD6z4vNIj9kZxersoHduf4UsA9CyPaucUIyV81mt683
 AZIqtz8H21fgJXOQ3nv4uNc8Yyt1SGQfFDo1EfphwLl6LaE8rx3CFEnVoNLM+jqb
 nyRB/NxLtDWVQhYM8Bg/TP7iMqknHtarfZirv48LFdXLlhb83+qpSSHy0zVy9vli
 y/0B7rEI
 =zLW4
 -----END PGP SIGNATURE-----

Merge tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Five small cifs/smb3 fixes, two for stable (one for a reconnect
  problem and the other fixes a use case when renaming an open file)"

* tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Use #define in cifs_dbg
  cifs: fix rename() by ensuring source handle opened with DELETE bit
  cifs: add missing mount option to /proc/mounts
  cifs: fix potential mismatch of UNC paths
  cifs: don't leak -EAGAIN for stat() during reconnect
2020-03-03 17:31:19 -06:00
Linus Torvalds
2873dc2547 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes: a pkeys fix for a bug that triggers with weird BIOS
  settings, and two Xen PV fixes: a paravirt interface fix, and
  pagetable dumping fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix dump_pagetables with Xen PV
  x86/ioperm: Add new paravirt function update_io_bitmap()
  x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
2020-03-02 06:54:54 -06:00
Linus Torvalds
c105df5d86 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Fix a scheduler statistics bug"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix statistics for find_idlest_group()
2020-03-02 06:51:43 -06:00
Linus Torvalds
852fb4a728 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "No kernel side changes, all tooling fixes plus two tooling cleanups
  that were committed late in the merge window alongside the perf
  annotate fixes, delayed by Arnaldo's European trip"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  perf annotate: Fix segfault with source toggle
  perf annotate: Align struct annotate_args
  perf annotate: Simplify disasm_line allocation and freeing code
  perf annotate: Remove privsize from symbol__annotate() args
  perf probe: Check return value of strlist__add() for -ENOMEM
  perf config: Document missing config options
  perf annotate: Fix perf config option description
  perf annotate: Prefer cmdline option over default config
  perf annotate: Make perf config effective
  perf config: Introduce perf_config_u8()
  perf annotate: Fix --show-nr-samples for tui/stdio2
  perf annotate: Fix --show-total-period for tui/stdio2
  perf annotate/tui: Re-render title bar after switching back from script browser
  tools headers UAPI: Update tools's copy of kvm.h headers
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf arch powerpc: Sync powerpc syscall.tbl with the kernel sources
  perf auxtrace: Add auxtrace_record__read_finish()
  perf arm-spe: Fix endless record after being terminated
  perf cs-etm: Fix endless record after being terminated
  perf intel-bts: Fix endless record after being terminated
  ...
2020-03-02 06:46:39 -06:00
Linus Torvalds
e130a920f6 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "Three fixes to EFI mixed boot mode, mostly related to x86-64 vmap
  stacks activated years ago, bug-fixed recently for EFI, which had
  knock-on effects of various 1:1 mapping assumptions in mixed mode.

  There's also a READ_ONCE() fix for reading an mmap-ed EFI firmware
  data field only once, out of caution"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: READ_ONCE rng seed size before munmap
  efi/x86: Handle by-ref arguments covering multiple pages in mixed mode
  efi/x86: Remove support for EFI time and counter services in mixed mode
  efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
2020-03-02 06:41:41 -06:00
Linus Torvalds
98d54f81e3 Linux 5.6-rc4 2020-03-01 16:38:46 -06:00
Linus Torvalds
e70869821a Two more bug fixes (including a regression) for 5.6
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAl5cMPoACgkQ8vlZVpUN
 gaNYmgf/WX4/jMSYQu2fICudCqLr5fkLqsybvYGZGei3F8BaJ90zohQAQybNznWS
 iyF0JzrOp37b/o0haz7KfDr7xVB3lAVsKu9Bglq+zL8mc9IkPmjhCXuLbknUtOUw
 j3aVdntt4d6S3szbtP4PIZxNqh+/4KJDS2soWvuNWRpYMOv2yoMClptWWQtsimAt
 3fYpxasSz0Jrhtbuf+I1oID++wOycDT3RKiko5tpLlQiFVoKBzfou+0ZdkC4+UIl
 KvcpMBm1ijdGAaN9jfb2L2KCY5UdSvmeVui3sMXtHBEpKMJl2QsClylR1wGfgBKi
 +YMEsjBONxKo3kH2DaPJaU6LEm8JuQ==
 =rszH
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Two more bug fixes (including a regression) for 5.6"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
  jbd2: fix data races at struct journal_head
2020-03-01 16:35:08 -06:00
Linus Torvalds
f853ed90e2 More bugfixes, including a few remaining "make W=1" issues such
as too large frame sizes on some configurations.  On the
 ARM side, the compiler was messing up shadow stacks between
 EL1 and EL2 code, which is easily fixed with __always_inline.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJeXAT4AAoJEL/70l94x66DWywH/1kv4MmeGo6PI0Nxk/yvA7X8
 78iqIBchtxZX0v/9kqpTB7bYmHyTgmZHM+IkwtIUANDSaOvWqJwU+TLUfduOiuXF
 NxBHcZDyuMoftX5CSQ+bJ5PwxKijAdJsIkCZ13CnsTCkwcfamSGypFUCK8LacPeq
 WHvV5Ws5pFc51xrP3CH1DrRhLoulaBmt5xxqK9fxWtslrlsnm1uNza5vs8As8CzM
 apnmdRIf5p4v91Zic3PFH7/GXES0m1tjIBKdtZ4YHb8yrXV/kBsEVhhTjqE9mrUq
 qtRRl5waOFoP4yc9ey52PAbMm1x1Ho/pyunpM0xh40Yq8OPFwqXBPTnWfobSoiM=
 =LNQc
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "More bugfixes, including a few remaining "make W=1" issues such as too
  large frame sizes on some configurations.

  On the ARM side, the compiler was messing up shadow stacks between EL1
  and EL2 code, which is easily fixed with __always_inline"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: check descriptor table exits on instruction emulation
  kvm: x86: Limit the number of "kvm: disabled by bios" messages
  KVM: x86: avoid useless copy of cpufreq policy
  KVM: allow disabling -Werror
  KVM: x86: allow compiling as non-module with W=1
  KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
  KVM: Introduce pv check helpers
  KVM: let declaration of kvm_get_running_vcpus match implementation
  KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
  arm64: Ask the compiler to __always_inline functions used by KVM at HYP
  KVM: arm64: Define our own swab32() to avoid a uapi static inline
  KVM: arm64: Ask the compiler to __always_inline functions used at HYP
  kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
  KVM: arm/arm64: Fix up includes for trace.h
2020-03-01 15:16:35 -06:00
Oliver Upton
86f7e90ce8 KVM: VMX: check descriptor table exits on instruction emulation
KVM emulates UMIP on hardware that doesn't support it by setting the
'descriptor table exiting' VM-execution control and performing
instruction emulation. When running nested, this emulation is broken as
KVM refuses to emulate L2 instructions by default.

Correct this regression by allowing the emulation of descriptor table
instructions if L1 hasn't requested 'descriptor table exiting'.

Fixes: 07721feee4 ("KVM: nVMX: Don't emulate instructions in guest mode")
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-01 19:26:31 +01:00
Linus Torvalds
fb279f4e23 Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C has three driver bugfixes for you. We agreed on the Mac regression
  to go in via I2C"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  macintosh: therm_windtunnel: fix regression when instantiating devices
  i2c: altera: Fix potential integer overflow
  i2c: jz4780: silence log flood on txabrt
2020-02-29 19:16:46 -06:00
Dan Carpenter
37b0b6b8b9 ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
If sbi->s_flex_groups_allocated is zero and the first allocation fails
then this code will crash.  The problem is that "i--" will set "i" to
-1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
is type promoted to unsigned and becomes UINT_MAX.  Since UINT_MAX
is more than zero, the condition is true so we call kvfree(new_groups[-1]).
The loop will carry on freeing invalid memory until it crashes.

Fixes: 7c990728b9 ("ext4: fix potential race between s_flex_groups online resizing and access")
Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29 17:48:08 -05:00
Wolfram Sang
38b17afb0e macintosh: therm_windtunnel: fix regression when instantiating devices
Removing attach_adapter from this driver caused a regression for at
least some machines. Those machines had the sensors described in their
DT, too, so they didn't need manual creation of the sensor devices. The
old code worked, though, because manual creation came first. Creation of
DT devices then failed later and caused error logs, but the sensors
worked nonetheless because of the manually created devices.

When removing attach_adaper, manual creation now comes later and loses
the race. The sensor devices were already registered via DT, yet with
another binding, so the driver could not be bound to it.

This fix refactors the code to remove the race and only manually creates
devices if there are no DT nodes present. Also, the DT binding is updated
to match both, the DT and manually created devices. Because we don't
know which device creation will be used at runtime, the code to start
the kthread is moved to do_probe() which will be called by both methods.

Fixes: 3e7bed5271 ("macintosh: therm_windtunnel: drop using attach_adapter")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Tested-by: Erhard Furtner <erhard_f@mailbox.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org # v4.19+
2020-02-29 21:13:22 +01:00
Qian Cai
6c5d911249 jbd2: fix data races at struct journal_head
journal_head::b_transaction and journal_head::b_next_transaction could
be accessed concurrently as noticed by KCSAN,

 LTP: starting fsync04
 /dev/zero: Can't open blockdev
 EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
 EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
 ==================================================================
 BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]

 write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
  __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
  __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
  jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
  (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
  kjournald2+0x13b/0x450 [jbd2]
  kthread+0x1cd/0x1f0
  ret_from_fork+0x27/0x50

 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
  jbd2_write_access_granted+0x1b2/0x250 [jbd2]
  jbd2_write_access_granted at fs/jbd2/transaction.c:1155
  jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
  __ext4_journal_get_write_access+0x50/0x90 [ext4]
  ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
  ext4_mb_new_blocks+0x54f/0xca0 [ext4]
  ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
  ext4_map_blocks+0x3b4/0x950 [ext4]
  _ext4_get_block+0xfc/0x270 [ext4]
  ext4_get_block+0x3b/0x50 [ext4]
  __block_write_begin_int+0x22e/0xae0
  __block_write_begin+0x39/0x50
  ext4_write_begin+0x388/0xb50 [ext4]
  generic_perform_write+0x15d/0x290
  ext4_buffered_write_iter+0x11f/0x210 [ext4]
  ext4_file_write_iter+0xce/0x9e0 [ext4]
  new_sync_write+0x29c/0x3b0
  __vfs_write+0x92/0xa0
  vfs_write+0x103/0x260
  ksys_write+0x9d/0x130
  __x64_sys_write+0x4c/0x60
  do_syscall_64+0x91/0xb05
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 5 locks held by fsync04/25724:
  #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
  #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
  #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
  #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
  #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
 irq event stamp: 1407125
 hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
 softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
 softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

The plain reads are outside of jh->b_state_lock critical section which result
in data races. Fix them by adding pairs of READ|WRITE_ONCE().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29 13:40:02 -05:00
Linus Torvalds
7557c1b3f7 SCSI fixes on 20200229
Four small fixes.  Three are in drivers for fairly obvious bugs.  The
 fourth is a set of regressions introduced by the compat_ioctl changes
 because some of the compat updates wrongly replaced .ioctl instead of
 .compat_ioctl.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXlpxDCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSXsAPwOGPkU
 ObFbUs75Tdmk1M7jqtxgBsNhuNta0S8d7dJ3aAEA/YBtGGQWoeEGivUKwzwA4cwL
 1w1GbhPEblpMNO8keVA=
 =I7qk
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four small fixes.

  Three are in drivers for fairly obvious bugs. The fourth is a set of
  regressions introduced by the compat_ioctl changes because some of the
  compat updates wrongly replaced .ioctl instead of .compat_ioctl"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places
  scsi: zfcp: fix wrong data and display format of SFP+ temperature
  scsi: sd_sbc: Fix sd_zbc_report_zones()
  scsi: libfc: free response frame from GPN_ID
2020-02-29 09:58:47 -06:00
Juergen Gross
bba42affa7 x86/mm: Fix dump_pagetables with Xen PV
Commit 2ae27137b2 ("x86: mm: convert dump_pagetables to use
walk_page_range") broke Xen PV guests as the hypervisor reserved hole in
the memory map was not taken into account.

Fix that by starting the kernel range only at GUARD_HOLE_END_ADDR.

Fixes: 2ae27137b2 ("x86: mm: convert dump_pagetables to use walk_page_range")
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Julien Grall <julien@xen.org>
Link: https://lkml.kernel.org/r/20200221103851.7855-1-jgross@suse.com
2020-02-29 12:43:10 +01:00
Juergen Gross
99bcd4a6e5 x86/ioperm: Add new paravirt function update_io_bitmap()
Commit 111e7b15cf ("x86/ioperm: Extend IOPL config to control ioperm()
as well") reworked the iopl syscall to use I/O bitmaps.

Unfortunately this broke Xen PV domains using that syscall as there is
currently no I/O bitmap support in PV domains.

Add I/O bitmap support via a new paravirt function update_io_bitmap which
Xen PV domains can use to update their I/O bitmaps via a hypercall.

Fixes: 111e7b15cf ("x86/ioperm: Extend IOPL config to control ioperm() as well")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org> # 5.5
Link: https://lkml.kernel.org/r/20200218154712.25490-1-jgross@suse.com
2020-02-29 12:43:09 +01:00
Ingo Molnar
7977fed974 perf/urgent fixes:
perf annotate:
 
   Ravi Bangoria:
 
   - Fix segfault with source toggle.
 
   - Fix --show-total-period and --show-nr-samples for tui/stdio2.
 
   - Fix handling of settings in ~/.perfconfig versus the ones passed
     in the command line
 
   - Re-render title bar after switching back from script browser.
 
   - Fix options man page, document some missing ones.
 
 perf probe:
 
   He Zhe:
 
   - Check return value of strlist__add() for -ENOMEM.
 
 tools UAPI:
 
   Arnaldo Carvalho de Melo:
 
   - Sync x86's msr-index.h copy with the kernel sources.
 
   - Update tools's copy of x86's kvm.h headers.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXlkcOwAKCRCyPKLppCJ+
 J36pAP9bTih+YOItpzt2hX+j3yshW2DAhc53h5McYkJnX4oLOAD+Jo+OQgwp3ort
 ceQOKzghA+P0QB7b8Ntbi34qH26OzAE=
 =OQg+
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-5.6-20200228' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

perf annotate:

  Ravi Bangoria:

  - Fix segfault with source toggle.

  - Fix --show-total-period and --show-nr-samples for tui/stdio2.

  - Fix handling of settings in ~/.perfconfig versus the ones passed
    in the command line

  - Re-render title bar after switching back from script browser.

  - Fix options man page, document some missing ones.

perf probe:

  He Zhe:

  - Check return value of strlist__add() for -ENOMEM.

tools UAPI:

  Arnaldo Carvalho de Melo:

  - Sync x86's msr-index.h copy with the kernel sources.

  - Update tools's copy of x86's kvm.h headers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-02-29 10:10:03 +01:00
Linus Torvalds
29795de0d2 pci-v5.6-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl5ZZ4MUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxQKRAAq51gX+DRNfGDf42a0EwuQi4lZVfD
 mg9wEvKdreNL+pa5dSYopEwTcQPF2Pf5trFYZIFkZ61FaK4W/A3pOw/43i50LBFw
 rE2x/pvNfVDDLVXZUiDLQp1Yz3UIBAoxZQd0s0TzMApUBEtylVQDEN/9+NYGreFp
 QU+sb2wkU3gc99woyqpvcacybS8NgKDWI4jGGOkxod/QOlCGE3tbyOYYUNx5/IhE
 nt1AIKS8D/LqPJMh+pfZYatSF2uxfqk5YEN1k+S4Z0h/EAgIfVGRyVMo/HSeg+Al
 yKHiGQ3ApIeTqsOiscdeb00jPSn8IsED/1Uv/QCkmmlP72arAyCyNDs7ZVUqrhGx
 nufA/oHel4g/pYPcY2Dyjr/qR/dM2X17SE/7KhUkThAVim01KS26uGMAuL4RApUS
 d4gPS2iY3LqmHyQBbcUHjrkrKRmqyl6V+WMwz9DihZ66FiNw1w2gLoI8FiYN7e9e
 XwNVhuGNPLZiVw0gmAfIHoPDucj9B+IRjOmLU4/x93vAbhduKUwCIDHv91Mbe2Gt
 bHHzJeiyzOotw+SpgvRTrafX6JZ/K4fbe81yDryf8a8WSHxSLd6HpvyVukLJYdd0
 n0d7w+a2XxspptTzsGe593CQ7GG93I5Ena/GeXXMztQmFrSG9ufUcN54+PJia3P+
 frvcmHb1sZair5Q=
 =kzKz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski)

 - Update MAINTAINERS for recent Cadence driver file move (Lukas
   Bulwahn)

* tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Correct Cadence PCI driver path
  PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
2020-02-28 11:51:53 -08:00
Linus Torvalds
2edc78b9a4 block-5.6-2020-02-28
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl5ZXl0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmltEACSA4yxdvWsVMYRCijjm/FzBEq7C8PSsWNK
 H8KPmjQiNpbSiZSi1uMVsHMlhBmBM8ZQ6Zc+gbZSs6xMqa4yP/iRtmzxnGonC7TB
 f5Ne2QuC0+TKMFJJTG8cCTzrgEOrWYkFKkmabzDml7HtloJtuzgArrmPzRj2sUfY
 J+d0osdp1b4U4sqhhAnxSm/zYJkGrQb+9UgNdVjhZCUzaX6oCcuK8xUwu2reLGlM
 qPkSKOywnl3WHCSCJXsCrNLKX0QWtIfMzlWDr40GYgHauPBbWfa8+1yHR1/lWP4R
 zyxGk63I9f6/+iQSUC72wP77bAVWKW674c53jgd7r1pNL9TiuK+a3E4lgf7eU+rl
 ymA/rM6Iy3SjTgiLT57PPOecsILJns3cwZ6mhvSRs0+zpao7LOQZXWdu9V0+Fyqo
 jur+7Ll/Qfdv/CLlM94DeBJtwhaTWiHTfDoaDHlG9p1/vvcWWXTUTIVPwAD+YGbj
 geio/bIWECnQxDtZL5Jikf5zsC76aQ46vvxK4F6RJlXj6jaugIbN3mWLsg17sUVf
 Y4h+IEVtQr0zA0LkPrfVdAS9IqVlTrMRDCkrrlhsDt7FI0orCOag7JOcmN2/nPn/
 2H22nl6i02b0gdGrScU5pyBswSPaImddH5tqE9uL2rK4hrFe6oKxL5EicTFDZmTh
 tHnukoc+Yg==
 =1bzv
 -----END PGP SIGNATURE-----

Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Passthrough insertion fix (Ming)

 - Kill off some unused arguments (John)

 - blktrace RCU fix (Jan)

 - Dead fields removal for null_blk (Dongli)

 - NVMe polled IO fix (Bijan)

* tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
  nvme-pci: Hold cq_poll_lock while completing CQEs
  blk-mq: Remove some unused function arguments
  null_blk: remove unused fields in 'nullb_cmd'
  blktrace: Protect q->blk_trace with RCU
  blk-mq: insert passthrough request into hctx->dispatch directly
2020-02-28 11:43:30 -08:00
Linus Torvalds
74dea5d99d io_uring-5.6-2020-02-28
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl5ZXkgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgprZqEACOvhiprH9Q75Pp+ZwQknM3xGyJRWI3Mbj9
 ZOyTVTK0qhTeaq6rN4MLSYevXOh+L68x5WRt1YJ1UnQRE0i8+ZQyZczqLKxxl8gF
 trhbYDXjvXIWr9zvdtiL01PKKu4Vjjp6eZAomrbxCTFku0qn76fo9wDgGPRGL+Kx
 lNO/6QvCXr9EjDniEUhlQsxTad5xc4sL0cnL4s2i7RlTCYtW4WJXJMC/4Gkg69j+
 W5GBZyjJDa8Sj3pEbLjtDtA4ooE9VMaldb7ZvR62ONUVwGpftPsbN7UhVlhyhpW+
 8v4ZEf07CxB246+hj7oL0RvEW3+/nB2hym1ySMXyBzpbx4O1JOUG7hQtNgdLRbCZ
 27IOg2O36qbUKM1hUwn7Qm3XAfBPQdFpVmqE2+E9MEOKzigLzhRP6Bu5d9x9VQGh
 JDxsm3B8PRHFJVAasiYu0p7mlx/+BCLjB84UrMB3I9UCBuVfk4mtmuwZX+mcK2PR
 pV1xJlEMYKme3cz2/u6uB8p3Nq6ipE1nSVrI6AnfEvJbQ9sFL61KaG4wHKPvtb0y
 mlNgc4seSjiWcBR2/84561a4CSmlXAn9dWMIGdHFFA43mTPYGc5omTcM8FwcEDkW
 cTFGB8sFukcTNmOw62HUHYI1vPpowX6apV08lEQrScz7GiK5piTYqTFNneqEzcwZ
 3bIMisH3Gg==
 =WheR
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang)

 - Only show ->fdinfo if procfs is enabled (Tobias)

 - Fix for a chain with multiple personalities in the SQEs

 - Fix for a missing free of personality idr on exit

 - Removal of the spin-for-work optimization

 - Fix for next work lookup on request completion

 - Fix for non-vec read/write result progation in case of links

 - Fix for a fileset references on switch

 - Fix for a recvmsg/sendmsg 32-bit compatability mode

* tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
  io_uring: fix 32-bit compatability with sendmsg/recvmsg
  io_uring: define and set show_fdinfo only if procfs is enabled
  io_uring: drop file set ref put/get on switch
  io_uring: import_single_range() returns 0/-ERROR
  io_uring: pick up link work on submit reference drop
  io-wq: ensure work->task_pid is cleared on init
  io-wq: remove spin-for-work optimization
  io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL
  io_uring: fix personality idr leak
  io_uring: handle multiple personalities in link chains
2020-02-28 11:39:14 -08:00
Jens Axboe
5b8ea58b6a Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6
Pull NVMe fix from Keith.

* 'nvme-5.6-rc4' of git://git.infradead.org/nvme:
  nvme-pci: Hold cq_poll_lock while completing CQEs
2020-02-28 10:02:36 -07:00
Linus Torvalds
c60c040213 ACPI fixes for 5.6-rc4
Fix a couple of configuration issues in the ACPI watchdog (WDAT)
 driver (Mika Westerberg) and make it possible to disable that
 driver at boot time in case it still does not work as
 expected (Jean Delvare).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl5Y6UYSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxj7YP/1qVMcIJo3CcItiQVV7+2S+WS+kK+Rf8
 t6g+8F5KL0DBrDqji7l19kv0O+kPpn2uGnREcNk3T3mHEu0jSywghvBI+nzrym2P
 Q+p2+2Fryy5zgiNTyhq7imA3IH4Zw5CVt6GRJjITvzFwhMGM38NQsDUbZEpZtJrz
 HTVQyqPgmhvox5sa2Qk1+vZOtclWlkPenWmtJEs8m+Pg8w+Ejfs4Lj+CQYlI5RB8
 GGhTgBiBMS7A80yTVxl8LULsrini6PWAGGY2yMScwDUYNZTAUuT8QSUx4/QBmyyP
 U5K1sIK2pd/JuE2QqSBztrYC7x1fpfoBJylF8B0HK5uNBpZth7eXhe6t+xDYVTpw
 U31unYfmcx6UT47lF/yWWNuhP6EDvdvZ0Xx4DHlgBJqKNuyfhz4Uazsr7HyLi2sI
 /YrBPimtgvICFHYorrNVwEheTUEp+QXtNZp6aTugLGVX6zbQ3nZp7ajifNO7hdsQ
 K+IjjRgH/YsVi9GQY/U3A/zTO+39fSupQgKiH33pG8Yp5bZbreoo1kPKzlnt23JJ
 DXuaN7wH74sgT2EqZm30lFapF/0wv+iIpdWmorlE/4WwTwIeDck/ULeBiGFqpUI6
 9Di10O6w+TJ+VevyJ5Hqkzukjlj93Vq+kCY8eOiWVpTga2ieQiYKFSbxDPE3OC1k
 j2wkDwKqkh0o
 =eIlw
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "Fix a couple of configuration issues in the ACPI watchdog (WDAT)
  driver (Mika Westerberg) and make it possible to disable that driver
  at boot time in case it still does not work as expected (Jean
  Delvare)"

* tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: watchdog: Set default timeout in probe
  ACPI: watchdog: Fix gas->access_width usage
  ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro
  ACPI: watchdog: Allow disabling WDAT at boot
2020-02-28 09:02:18 -08:00
Linus Torvalds
3642859812 Power management fixes for 5.6-rc4
Fix a recent cpufreq initialization regression (Rafael Wysocki),
 revert a devfreq commit that made incompatible changes and broke
 user land on some systems (Orson Zhai), drop a stale reference to
 a document that has gone away recently (Jonathan Neuschäfer) and
 fix a typo in a hibernation code comment (Alexandre Belloni).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl5Y6MYSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxqCAP/3zmwDsBMG07c+g9kOEJzuGLkpUrGIFp
 t1vR2+EmV7bCq7onCLSBz090wvogLCmNPpXmq45Ddt5hx3ltZSx4stwYdbsS/xmU
 t+9pFiCSSFufs1vaUzZqCuw53jpizRD4KKoUbCT2kpCgH4wjF7ftRIC5y2DYI57Y
 hOIfrI9XzwR2UZYRWZqCYtwpPkMicao8lXLfhs1mrZFk+AxwwwUFSp8/Gw94LBPD
 +ESctvsrGfmqEDEvcZUaYd2i0i5PsqnbnVy6wxWb3rhmhSXJTfiCzCGECItBJuSA
 HcZj3m2dohOXDxXK/Yevwiv+fBkqeeeP+KVlvtjzmYzTDW1bryLYQpxAsdpeYGEE
 bANjAl6MARqBDKUjvw+N6yDdAPPNbsT6zEK3f0w4eQNRpfmjhHP7TxrOsjC0DRJu
 EZpN6hWGQs1Vmndr+xaK0ITeRXB+7IQeXG6t6XDYZg2gS65hkhF5z1E5vjmn1Cen
 wpQsqvek6K7/DqqLqCH27yATNNKff7PNNCfpIazPE6CjDKBAiM9jsMpWkQf4E9/7
 VKJFzmhy5P9VsHnUNonRN1K8Vl5eo+b7YOfmR8kvfsmNpfvKqaXM813owVMkOEq6
 44kc2D38t3TnlH+GNgGfyPBnHchqlw/P0y0mypQ73Wyg+PQB+fx+W9vAXWDe6fBW
 /tIhZYILu90f
 =ySea
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Fix a recent cpufreq initialization regression (Rafael Wysocki),
  revert a devfreq commit that made incompatible changes and broke user
  land on some systems (Orson Zhai), drop a stale reference to a
  document that has gone away recently (Jonathan Neuschäfer), and fix a
  typo in a hibernation code comment (Alexandre Belloni)"

* tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Fix policy initialization for internal governor drivers
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
  PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
  Documentation: power: Drop reference to interface.rst
2020-02-28 08:49:52 -08:00
Linus Torvalds
bfeb4f9977 zonefs fixes for 5.6-rc4
Two fixes in this pull request:
 * Revert the initial decision to silently ignore IOCB_NOWAIT for
   asynchronous direct IOs to sequential zone files. Instead, return an
   error to the user to signal that the feature is not supported (from
   Christoph)
 * A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures if
   no other file system already selected this option (from Johannes).
 
 Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCXljJWAAKCRDdoc3SxdoY
 dmztAP9Sj74cHVTxac+HoDKwf6DYWfjPWonT5tO4wc8q0PBDOgEAhKzHQJZNqJvd
 a0BrEf/t6RLWDgsi75cB/U6HsiGkiA0=
 =+maQ
 -----END PGP SIGNATURE-----

Merge tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:
 "Two fixes in here:

   - Revert the initial decision to silently ignore IOCB_NOWAIT for
     asynchronous direct IOs to sequential zone files. Instead, return
     an error to the user to signal that the feature is not supported
     (from Christoph)

   - A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures
     if no other file system already selected this option (from
     Johannes)"

* tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: select FS_IOMAP
  zonefs: fix IOCB_NOWAIT handling
2020-02-28 08:34:47 -08:00
Paolo Bonzini
e951445f4d KVM/arm fixes for 5.6, take #1
- Fix compilation on 32bit
 - Move  VHE guest entry/exit into the VHE-specific entry code
 - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl5VsNMPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDLhUQAIsecO9IyYjy1J0Q5AxaKLL7NuKYlAaty2xX
 uY6UkTfPNsEaHFXSGYXWPDxrmkgArp2wuy4WVQB59Om00+LE7h9kiz7+xKpcUy1G
 UoHa5mzMlqoOeUIWO/oSU6LYHhYDnIpHTDco93YrscU4nNRevJZ/GVeuQeMblzuZ
 Sg7cWc+0V43FXUt9Jw8BsNhXH/D0l0p3v86p7GZLcSfFAccO62YfOwC8J/znLPym
 4S+O9RYQkCczvzFeQVYQwqImOAunaOb0OzERUbm8icOF6ekYGwywjrtlmAC/3q+q
 1g/te1yfwQ8fpprWl4QSH0sQVdfAcxdDZqcWtN2LhNaEShZtNa5yKpsRGn1V0eAS
 tIO8eexAKCXoASHrrwfSkizYjRAeDabmodBQmS50/isY9OdBE2tDel+BLrCjzBJ2
 hABwEZ3Q78216EuoqsZqWaEUZ3ck0iSW3IcXglmHE4TC8Iq6dwskvOPjay+msHr9
 dcHDCxFIN4jzv9QcpKN8LkxfmW0Us28bzap3OhKfrz0nv7b4n+j0q1xbKL1QnN/l
 RcDPW0dQeXuX9vYMeYIUDQcV4IgTUkF6IPDCRW7KCApi98HfPTbrfQ97nir79zDp
 pD8NXaNFr4PtxJoheYYia3sjZMt/fgfvP2dM32iOpsMu7W1FXdfQN7heNSc6MQmO
 ciyhf/mj
 =NpPo
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm fixes for 5.6, take #1

- Fix compilation on 32bit
- Move  VHE guest entry/exit into the VHE-specific entry code
- Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
2020-02-28 11:50:06 +01:00
Erwan Velu
ef935c25fd kvm: x86: Limit the number of "kvm: disabled by bios" messages
In older version of systemd(219), at boot time, udevadm is called with :
	/usr/bin/udevadm trigger --type=devices --action=add"

This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent,
leading to the "kvm: disabled by bios" message in case of your Bios disabled
the virtualization extensions.

On a modern system running up to 256 CPU threads, this pollutes the Kernel logs.

This patch offers to ratelimit this message to avoid any userspace program triggering
this uevent printing this message too often.

This patch is only a workaround but greatly reduce the pollution without
breaking the current behavior of printing a message if some try to instantiate
KVM on a system that doesn't support it.

Note that recent versions of systemd (>239) do not have trigger this behavior.

This patch will be useful at least for some using older systemd with recent Kernels.

Signed-off-by: Erwan Velu <e.velu@criteo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 11:37:20 +01:00
Rafael J. Wysocki
189c6967fe Merge branches 'pm-sleep' and 'pm-devfreq'
* pm-sleep:
  PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
  Documentation: power: Drop reference to interface.rst

* pm-devfreq:
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
2020-02-28 11:00:50 +01:00
Paolo Bonzini
aaec7c03de KVM: x86: avoid useless copy of cpufreq policy
struct cpufreq_policy is quite big and it is not a good idea
to allocate one on the stack.  Just use cpufreq_cpu_get and
cpufreq_cpu_put which is even simpler.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:54:50 +01:00
Paolo Bonzini
4f337faf1c KVM: allow disabling -Werror
Restrict -Werror to well-tested configurations and allow disabling it
via Kconfig.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:45:28 +01:00
Valdis Klētnieks
575b255c16 KVM: x86: allow compiling as non-module with W=1
Compile error with CONFIG_KVM_INTEL=y and W=1:

  CC      arch/x86/kvm/vmx/vmx.o
arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=]
   68 | static const struct x86_cpu_id vmx_cpu_id[] = {
      |                                ^~~~~~~~~~
cc1: all warnings being treated as errors

When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a
reference to the structure (or any code at all).  This makes W=1 compiles
unhappy.

Wrap both in a #ifdef to avoid the issue.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
[Do the same for CONFIG_KVM_AMD. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:35:37 +01:00
Wanpeng Li
8a9442f49c KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
Nick Desaulniers Reported:

  When building with:
  $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
  The following warning is observed:
  arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
  function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
  static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
  vector)
              ^
  Debugging with:
  https://github.com/ClangBuiltLinux/frame-larger-than
  via:
  $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
    kvm_send_ipi_mask_allbutself
  points to the stack allocated `struct cpumask newmask` in
  `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is
  potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for
  the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as
  8192, making a single instance of a `struct cpumask` 1024 B.

This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for
both pv tlb and pv ipis..

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:34:25 +01:00