Commit graph

713599 commits

Author SHA1 Message Date
Marek Vasut
ae6647c78f ARM: dts: socfpga: Fix NAND controller clock supply
commit 4eda9b766b upstream.

The Denali NAND x-clock should be supplied by nand_x_clk, not by
nand_clk. Fix this, otherwise the Denali driver gets incorrect
clock frequency information and incorrectly configures the NAND
timing.

Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: d837a80d19 ("ARM: dts: socfpga: add nand controller nodes")
Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:52 +02:00
Marek Vasut
3482130d8d ARM: dts: socfpga: Fix NAND controller node compatible
commit d9a695f3c8 upstream.

The compatible string for the Denali NAND controller is incorrect,
fix it by replacing it with one matching the DT bindings and the
driver.

Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: d837a80d19 ("ARM: dts: socfpga: add nand controller nodes")
Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Thor Thayer
3db24d2e19 ARM: dts: Fix SPI node for Arria10
commit 975ba94c2c upstream.

Remove the unused bus-num node and change num-chipselect
to num-cs to match SPI bindings.

Cc: stable@vger.kernel.org
Fixes: f2d6f8f817 ("ARM: dts: socfpga: Add SPI Master1 for Arria10 SR chip")
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
David Rivshin
eda170a9fe ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
commit 76ed0b803a upstream.

NUMREGBYTES (which is used as the size for gdb_regs[]) is incorrectly
based on DBG_MAX_REG_NUM instead of GDB_MAX_REGS. DBG_MAX_REG_NUM
is the number of total registers, while GDB_MAX_REGS is the number
of 'unsigned longs' it takes to serialize those registers. Since
FP registers require 3 'unsigned longs' each, DBG_MAX_REG_NUM is
smaller than GDB_MAX_REGS.

This causes GDB 8.0 give the following error on connect:
"Truncated register 19 in remote 'g' packet"

This also causes the register serialization/deserialization logic
to overflow gdb_regs[], overwriting whatever follows.

Fixes: 834b2964b7 ("kgdb,arm: fix register dump")
Cc: <stable@vger.kernel.org> # 2.6.37+
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Vaibhav Jain
c9debbd1a5 cxl: Disable prefault_mode in Radix mode
commit b6c84ba22f upstream.

Currently we see a kernel-oops reported on Power-9 while attaching a
context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set
to anything other than 'none'. The backtrace of the oops is of this
form:

  Unable to handle kernel paging request for data at address 0x00000080
  Faulting instruction address: 0xc00800000bcf3b20
  cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800]
      pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl]
      lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl]
      sp: c00000037f003a80
     msr: 9000000000009033
     dar: 80
   dsisr: 40000000
    current = 0xc00000037f280000
    paca    = 0xc0000003ffffe600   softe: 3        irq_happened: 0x01
      pid   = 3529, comm = afp_no_int
  <snip>
  cxl_prefault+0xfc/0x248 [cxl]
  process_element_entry_psl9+0xd8/0x1a0 [cxl]
  cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl]
  native_attach_process+0xc0/0x130 [cxl]
  afu_ioctl+0x3f4/0x5e0 [cxl]
  do_vfs_ioctl+0xdc/0x890
  ksys_ioctl+0x68/0xf0
  sys_ioctl+0x40/0xa0
  system_call+0x58/0x6c

The issue is caused as on Power-8 the AFU attr 'prefault_mode' was
used to improve initial storage fault performance by prefaulting
process segments. However on Power-9 with radix mode we don't have
Storage-Segments that we can prefault. Also prefaulting process Pages
will be too costly and fine-grained.

Hence, since the prefaulting mechanism doesn't makes sense of
radix-mode, this patch updates prefault_mode_store() to not allow any
other value apart from CXL_PREFAULT_NONE when radix mode is enabled.

Fixes: f24be42aab ("cxl: Add psl9 specific code")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Finley Xiao
971a55574b soc: rockchip: power-domain: Fix wrong value when power up pd with writemask
commit 9e59c5f66c upstream.

Solve the pd could only ever turn off but never turn them on again,
if the pd registers have the writemask bits.

So far this affects the rk3328 only.

Fixes: 79bb17ce8e ("soc: rockchip: power-domain: Support domain control in hiword-registers")
Cc: stable@vger.kernel.org
Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Signed-off-by: Elaine Zhang <zhangqing@rock-chips.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Mahesh Salgaonkar
56fbab60aa powerpc/fadump: Unregister fadump on kexec down path.
commit 722cde76d6 upstream.

Unregister fadump on kexec down path otherwise the fadump registration
in new kexec-ed kernel complains that fadump is already registered.
This makes new kernel to continue using fadump registered by previous
kernel which may lead to invalid vmcore generation. Hence this patch
fixes this issue by un-registering fadump in fadump_cleanup() which is
called during kexec path so that new kernel can register fadump with
new valid values.

Fixes: b500afff11 ("fadump: Invalidate registration and release reserved memory for general use.")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Gautham R. Shenoy
3b185e667b cpuidle: powernv: Fix promotion from snooze if next state disabled
commit 0a4ec6aa03 upstream.

The commit 78eaa10f02 ("cpuidle: powernv/pseries: Auto-promotion of
snooze to deeper idle state") introduced a timeout for the snooze idle
state so that it could be eventually be promoted to a deeper idle
state. The snooze timeout value is static and set to the target
residency of the next idle state, which would train the cpuidle
governor to pick the next idle state eventually.

The unfortunate side-effect of this is that if the next idle state(s)
is disabled, the CPU will forever remain in snooze, despite the fact
that the system is completely idle, and other deeper idle states are
available.

This patch fixes the issue by dynamically setting the snooze timeout
to the target residency of the next enabled state on the device.

Before Patch:
  POWER8 : Only nap disabled.
  $ cpupower monitor sleep 30
  sleep took 30.01297 seconds and exited with status 0
                |Idle_Stats
  PKG |CORE|CPU | snoo | Nap  | Fast
     0|   8|   0| 96.41|  0.00|  0.00
     0|   8|   1| 96.43|  0.00|  0.00
     0|   8|   2| 96.47|  0.00|  0.00
     0|   8|   3| 96.35|  0.00|  0.00
     0|   8|   4| 96.37|  0.00|  0.00
     0|   8|   5| 96.37|  0.00|  0.00
     0|   8|   6| 96.47|  0.00|  0.00
     0|   8|   7| 96.47|  0.00|  0.00

  POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1,
  stop2) disabled:
  $ cpupower monitor sleep 30
  sleep took 30.05033 seconds and exited with status 0
                |Idle_Stats
  PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop
     0|  16|   0| 89.79|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00
     0|  16|   1| 90.12|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00
     0|  16|   2| 90.21|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00
     0|  16|   3| 90.29|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00

After Patch:
  POWER8 : Only nap disabled.
  $ cpupower monitor sleep 30
  sleep took 30.01200 seconds and exited with status 0
                |Idle_Stats
  PKG |CORE|CPU | snoo | Nap  | Fast
     0|   8|   0| 16.58|  0.00| 77.21
     0|   8|   1| 18.42|  0.00| 75.38
     0|   8|   2|  4.70|  0.00| 94.09
     0|   8|   3| 17.06|  0.00| 81.73
     0|   8|   4|  3.06|  0.00| 95.73
     0|   8|   5|  7.00|  0.00| 96.80
     0|   8|   6|  1.00|  0.00| 98.79
     0|   8|   7|  5.62|  0.00| 94.17

  POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1,
  stop2) disabled:

  $ cpupower monitor sleep 30
  sleep took 30.02110 seconds and exited with status 0
                |Idle_Stats
  PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop
     0|   0|   0|  0.69|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  9.39| 89.70
     0|   0|   1|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.05| 93.21
     0|   0|   2|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00| 89.93
     0|   0|   3|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00|  0.00| 93.26

Fixes: 78eaa10f02 ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Akshay Adiga
a5d49dfb20 powerpc/powernv/cpuidle: Init all present cpus for deep states
commit ac9816dcba upstream.

Init all present cpus for deep states instead of "all possible" cpus.
Init fails if a possible cpu is guarded. Resulting in making only
non-deep states available for cpuidle/hotplug.

Stewart says, this means that for single threaded workloads, if you
guard out a CPU core you'll not get WoF (Workload Optimised
Frequency), which means that performance goes down when you wouldn't
expect it to.

Fixes: 77b54e9f21 ("powernv/powerpc: Add winkle support for offline cpus")
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:51 +02:00
Haren Myneni
134e70c22e powerpc/powernv: copy/paste - Mask SO bit in CR
commit 7574364906 upstream.

NX can set the 3rd bit in CR register for XER[SO] (Summary overflow)
which is not related to paste request. The current paste function
returns failure for a successful request when this bit is set. So mask
this bit and check the proper return status.

Fixes: 2392c8c8c0 ("powerpc/powernv/vas: Define copy/paste interfaces")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Alexey Kardashevskiy
0e8bb91c6d powerpc/powernv/ioda2: Remove redundant free of TCE pages
commit 98fd72fe82 upstream.

When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().

Since iommu_tce_table_put() calls it_ops::free when the last reference
to the table is released, explicit call to pnv_pci_ioda2_table_free_pages()
is not needed so let's remove it.

This should fix double free in the case of PCI hotuplug as
pnv_pci_ioda2_table_free_pages() does not reset neither
iommu_table::it_base nor ::it_size.

This was not exposed by SRIOV as it uses different code path via
pnv_pcibios_sriov_disable().

IODA1 does not inialize it_ops::free so it does not have this issue.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Michael Neuling
919c9b8187 powerpc/ptrace: Fix enforcement of DAWR constraints
commit cd6ef7eebf upstream.

Back when we first introduced the DAWR, in commit 4ae7ebe952
("powerpc: Change hardware breakpoint to allow longer ranges"), we
screwed up the constraint making it a 1024 byte boundary rather than a
512. This makes the check overly permissive. Fortunately GDB is the
only real user and it always did they right thing, so we never
noticed.

This fixes the constraint to 512 bytes.

Fixes: 4ae7ebe952 ("powerpc: Change hardware breakpoint to allow longer ranges")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Anju T Sudhakar
1ab9092356 powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
commit d2032678e5 upstream.

Currently memory is allocated for core-imc based on cpu_present_mask,
which has bit 'cpu' set iff cpu is populated. We use (cpu number / threads
per core) as the array index to access the memory.

Under some circumstances firmware marks a CPU as GUARDed CPU and boot the
system, until cleared of errors, these CPU's are unavailable for all
subsequent boots. GUARDed CPUs are possible but not present from linux
view, so it blows a hole when we assume the max length of our allocation
is driven by our max present cpus, where as one of the cpus might be online
and be beyond the max present cpus, due to the hole.
So (cpu number / threads per core) value bounds the array index and leads
to memory overflow.

Call trace observed during a guard test:

Faulting instruction address: 0xc000000000149f1c
cpu 0x69: Vector: 380 (Data Access Out of Range) at [c000003fea303420]
    pc:c000000000149f1c: prefetch_freepointer+0x14/0x30
    lr:c00000000014e0f8: __kmalloc+0x1a8/0x1ac
    sp:c000003fea3036a0
   msr:9000000000009033
   dar:c9c54b2c91dbf6b7
  current = 0xc000003fea2c0000
  paca    = 0xc00000000fddd880	 softe: 3	 irq_happened: 0x01
    pid   = 1, comm = swapper/104
Linux version 4.16.7-openpower1 (smc@smc-desktop) (gcc version 6.4.0
(Buildroot 2018.02.1-00006-ga8d1126)) #2 SMP Fri May 4 16:44:54 PDT 2018
enter ? for help
call trace:
	 __kmalloc+0x1a8/0x1ac
	 (unreliable)
	 init_imc_pmu+0x7f4/0xbf0
	 opal_imc_counters_probe+0x3fc/0x43c
	 platform_drv_probe+0x48/0x80
	 driver_probe_device+0x22c/0x308
	 __driver_attach+0xa0/0xd8
	 bus_for_each_dev+0x88/0xb4
	 driver_attach+0x2c/0x40
	 bus_add_driver+0x1e8/0x228
	 driver_register+0xd0/0x114
	 __platform_driver_register+0x50/0x64
	 opal_imc_driver_init+0x24/0x38
	 do_one_initcall+0x150/0x15c
	 kernel_init_freeable+0x250/0x254
	 kernel_init+0x1c/0x150
	 ret_from_kernel_thread+0x5c/0xc8

Allocating memory for core-imc based on cpu_possible_mask, which has
bit 'cpu' set iff cpu is populatable, will fix this issue.

Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Fixes: 39a846db1d ("powerpc/perf: Add core IMC PMU support")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Michael Neuling
c12d241616 powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
commit 4f7c06e26e upstream.

In commit e2a800beac ("powerpc/hw_brk: Fix off by one error when
validating DAWR region end") we fixed setting the DAWR end point to
its max value via PPC_PTRACE_SETHWDEBUG. Unfortunately we broke
PTRACE_SET_DEBUGREG when setting a 512 byte aligned breakpoint.

PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to
zero (memset() in hw_breakpoint_init()). This worked with
arch_validate_hwbkpt_settings() before the above patch was applied but
is now broken if the breakpoint is 512byte aligned.

This sets the length of the breakpoint to 8 bytes when using
PTRACE_SET_DEBUGREG.

Fixes: e2a800beac ("powerpc/hw_brk: Fix off by one error when validating DAWR region end")
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Aneesh Kumar K.V
5fefd9a5d9 powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
commit 91d0697188 upstream.

Currently we do not have an isync, or any other context synchronizing
instruction prior to the slbie/slbmte in _switch() that updates the
SLB entry for the kernel stack.

However that is not correct as outlined in the ISA.

From Power ISA Version 3.0B, Book III, Chapter 11, page 1133:

  "Changing the contents of ... the contents of SLB entries ... can
   have the side effect of altering the context in which data
   addresses and instruction addresses are interpreted, and in which
   instructions are executed and data accesses are performed.
   ...
   These side effects need not occur in program order, and therefore
   may require explicit synchronization by software.
   ...
   The synchronizing instruction before the context-altering
   instruction ensures that all instructions up to and including that
   synchronizing instruction are fetched and executed in the context
   that existed before the alteration."

And page 1136:

  "For data accesses, the context synchronizing instruction before the
   slbie, slbieg, slbia, slbmte, tlbie, or tlbiel instruction ensures
   that all preceding instructions that access data storage have
   completed to a point at which they have reported all exceptions
   they will cause."

We're not aware of any bugs caused by this, but it should be fixed
regardless.

Add the missing isync when updating kernel stack SLB entry.

Cc: stable@vger.kernel.org
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Flesh out change log with more ISA text & explanation]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:50 +02:00
Miklos Szeredi
69829f749a fuse: fix control dir setup and teardown
commit 6becdb601b upstream.

syzbot is reporting NULL pointer dereference at fuse_ctl_remove_conn() [1].
Since fc->ctl_ndents is incremented by fuse_ctl_add_conn() when new_inode()
failed, fuse_ctl_remove_conn() reaches an inode-less dentry and tries to
clear d_inode(dentry)->i_private field.

Fix by only adding the dentry to the array after being fully set up.

When tearing down the control directory, do d_invalidate() on it to get rid
of any mounts that might have been added.

[1] https://syzkaller.appspot.com/bug?id=f396d863067238959c91c0b7cfc10b163638cac6
Reported-by: syzbot <syzbot+32c236387d66c4516827@syzkaller.appspotmail.com>
Fixes: bafa96541b ("[PATCH] fuse: add control filesystem")
Cc: <stable@vger.kernel.org> # v2.6.18
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Tetsuo Handa
3a37d85a90 fuse: don't keep dead fuse_conn at fuse_fill_super().
commit 543b8f8662 upstream.

syzbot is reporting use-after-free at fuse_kill_sb_blk() [1].
Since sb->s_fs_info field is not cleared after fc was released by
fuse_conn_put() when initialization failed, fuse_kill_sb_blk() finds
already released fc and tries to hold the lock. Fix this by clearing
sb->s_fs_info field after calling fuse_conn_put().

[1] https://syzkaller.appspot.com/bug?id=a07a680ed0a9290585ca424546860464dd9658db

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+ec3986119086fe4eec97@syzkaller.appspotmail.com>
Fixes: 3b463ae0c6 ("fuse: invalidation reverse calls")
Cc: John Muir <john@jmuir.com>
Cc: Csaba Henk <csaba@gluster.com>
Cc: Anand Avati <avati@redhat.com>
Cc: <stable@vger.kernel.org> # v2.6.31
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Miklos Szeredi
2f7bf369b5 fuse: atomic_o_trunc should truncate pagecache
commit df0e91d488 upstream.

Fuse has an "atomic_o_trunc" mode, where userspace filesystem uses the
O_TRUNC flag in the OPEN request to truncate the file atomically with the
open.

In this mode there's no need to send a SETATTR request to userspace after
the open, so fuse_do_setattr() checks this mode and returns.  But this
misses the important step of truncating the pagecache.

Add the missing parts of truncation to the ATTR_OPEN branch.

Reported-by: Chad Austin <chadaustin@fb.com>
Fixes: 6ff958edbf ("fuse: add atomic open+truncate support")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Tejun Heo
02832578eb fuse: fix congested state leak on aborted connections
commit 8a301eb16d upstream.

If a connection gets aborted while congested, FUSE can leave
nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
wait spuriously which can lead to severe performance degradation.

The leak is caused by gating congestion state clearing with
fc->connected test in request_end().  This was added way back in 2009
by 26c3679101 ("fuse: destroy bdi on umount").  While the commit
description doesn't explain why the test was added, it most likely was
to avoid dereferencing bdi after it got destroyed.

Since then, bdi lifetime rules have changed many times and now we're
always guaranteed to have access to the bdi while the superblock is
alive (fc->sb).

Drop fc->connected conditional to avoid leaking congestion states.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joshua Miller <joshmiller@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org # v2.6.29+
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Tetsuo Handa
a47c3c4876 printk: fix possible reuse of va_list variable
commit 988a35f8da upstream.

I noticed that there is a possibility that printk_safe_log_store() causes
kernel oops because "args" parameter is passed to vsnprintf() again when
atomic_cmpxchg() detected that we raced. Fix this by using va_copy().

Link: http://lkml.kernel.org/r/201805112002.GIF21216.OFVHFOMLJtQFSO@I-love.SAKURA.ne.jp
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: dvyukov@google.com
Cc: syzkaller@googlegroups.com
Cc: fengguang.wu@intel.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 42a0bb3f71 ("printk/nmi: generic solution for safe printk in NMI")
Cc: 4.7+ <stable@vger.kernel.org> # v4.7+
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Amit Pundir
affd84024c Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
commit 7dc5fe0814 upstream.

AOSP use userspace firmware loader to load firmwares, which will
return -EAGAIN in case qca/rampatch_00440302.bin is not found.
Since there is no rampatch for dragonboard820c QCA controller
revision, just make it work as is.

CC: Loic Poulain <loic.poulain@linaro.org>
CC: Nicolas Dechesne <nicolas.dechesne@linaro.org>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johan Hedberg <johan.hedberg@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Corey Minyard
3ffecef63d ipmi:bt: Set the timeout before doing a capabilities check
commit fe50a7d039 upstream.

There was one place where the timeout value for an operation was
not being set, if a capabilities request was done from idle.  Move
the timeout value setting to before where that change might be
requested.

IMHO the cause here is the invisible returns in the macros.  Maybe
that's a job for later, though.

Reported-by: Nordmark Claes <Claes.Nordmark@tieto.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Mikulas Patocka
26e03f8dcd branch-check: fix long->int truncation when profiling branches
commit 2026d35741 upstream.

The function __builtin_expect returns long type (see the gcc
documentation), and so do macros likely and unlikely. Unfortunatelly, when
CONFIG_PROFILE_ANNOTATED_BRANCHES is selected, the macros likely and
unlikely expand to __branch_check__ and __branch_check__ truncates the
long type to int. This unintended truncation may cause bugs in various
kernel code (we found a bug in dm-writecache because of it), so it's
better to fix __branch_check__ to return long.

Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1805300818140.24812@file01.intranet.prod.int.rdu2.redhat.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1f0d69a9fc ("tracing: profile likely and unlikely annotations")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:49 +02:00
Matthias Schiffer
5eff5dbf31 mips: ftrace: fix static function graph tracing
commit 6fb8656646 upstream.

ftrace_graph_caller was never run after calling ftrace_trace_function,
breaking the function graph tracer. Fix this, bringing it in line with the
x86 implementation.

While we're at it, also streamline the control flow of _mcount a bit to
reduce the number of branches.

This issue was reported before:
https://www.linux-mips.org/archives/linux-mips/2014-11/msg00295.html

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Tested-by: Matt Redfearn <matt.redfearn@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/18929/
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Steven Rostedt (VMware)
5f7a15af64 ftrace/selftest: Have the reset_trigger code be a bit more careful
commit 756b56a9e8 upstream.

The trigger code is picky in how it can be disabled as there may be
dependencies between different events and synthetic events. Change the order
on how triggers are reset.

 1) Reset triggers of all synthetic events first
 2) Remove triggers with actions attached to them
 3) Remove all other triggers

If this order isn't followed, then some triggers will not be reset, and an
error may happen because a trigger is busy.

Cc: stable@vger.kernel.org
Fixes: cfa0963dc4 ("kselftests/ftrace : Add event trigger testcases")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Geert Uytterhoeven
ea0ac01f68 lib/vsprintf: Remove atomic-unsafe support for %pCr
commit 666902e42f upstream.

"%pCr" formats the current rate of a clock, and calls clk_get_rate().
The latter obtains a mutex, hence it must not be called from atomic
context.

Remove support for this rarely-used format, as vsprintf() (and e.g.
printk()) must be callable from any context.

Any remaining out-of-tree users will start seeing the clock's name
printed instead of its rate.

Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Fixes: 900cca2944 ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks")
Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be
To: Jia-Ju Bai <baijiaju1990@gmail.com>
To: Jonathan Corbet <corbet@lwn.net>
To: Michael Turquette <mturquette@baylibre.com>
To: Stephen Boyd <sboyd@kernel.org>
To: Zhang Rui <rui.zhang@intel.com>
To: Eduardo Valentin <edubezval@gmail.com>
To: Eric Anholt <eric@anholt.net>
To: Stefan Wahren <stefan.wahren@i2se.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-clk@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Geert Uytterhoeven
9fcc267de2 clk: renesas: cpg-mssr: Stop using printk format %pCr
commit ef4b0be626 upstream.

Printk format "%pCr" will be removed soon, as clk_get_rate() must not be
called in atomic context.

Replace it by open-coding the operation.  This is safe here, as the code
runs in task context.

Link: http://lkml.kernel.org/r/1527845302-12159-2-git-send-email-geert+renesas@glider.be
To: Jia-Ju Bai <baijiaju1990@gmail.com>
To: Jonathan Corbet <corbet@lwn.net>
To: Michael Turquette <mturquette@baylibre.com>
To: Stephen Boyd <sboyd@kernel.org>
To: Zhang Rui <rui.zhang@intel.com>
To: Eduardo Valentin <edubezval@gmail.com>
To: Eric Anholt <eric@anholt.net>
To: Stefan Wahren <stefan.wahren@i2se.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-clk@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org # 4.5+
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Geert Uytterhoeven
0cf93821e3 thermal: bcm2835: Stop using printk format %pCr
commit bd2a07f71a upstream.

Printk format "%pCr" will be removed soon, as clk_get_rate() must not be
called in atomic context.

Replace it by printing the variable that already holds the clock rate.
Note that calling clk_get_rate() is safe here, as the code runs in task
context.

Link: http://lkml.kernel.org/r/1527845302-12159-3-git-send-email-geert+renesas@glider.be
To: Jia-Ju Bai <baijiaju1990@gmail.com>
To: Jonathan Corbet <corbet@lwn.net>
To: Michael Turquette <mturquette@baylibre.com>
To: Stephen Boyd <sboyd@kernel.org>
To: Zhang Rui <rui.zhang@intel.com>
To: Eduardo Valentin <edubezval@gmail.com>
To: Eric Anholt <eric@anholt.net>
To: Stefan Wahren <stefan.wahren@i2se.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-clk@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Alexander Sverdlin
b2291a435c ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
commit 5d302ed3cc upstream.

According to "EP93xx User’s Guide", I2STXLinCtrlData and I2SRXLinCtrlData
registers actually have different format. The only currently used bit
(Left_Right_Justify) has different position. Fix this and simplify the
whole setup taking into account the fact that both registers have zero
default value.

The practical effect of the above is repaired SND_SOC_DAIFMT_RIGHT_J
support (currently unused).

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Alexander Sverdlin
b5e8118779 ASoC: cirrus: i2s: Fix LRCLK configuration
commit 2d534113be upstream.

The bit responsible for LRCLK polarity is i2s_tlrs (0), not i2s_trel (2)
(refer to "EP93xx User's Guide").

Previously card drivers which specified SND_SOC_DAIFMT_NB_IF actually got
SND_SOC_DAIFMT_NB_NF, an adaptation is necessary to retain the old
behavior.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Paul Handrigan
7a1d79de77 ASoC: cs35l35: Add use_single_rw to regmap config
commit 6a6ad7face upstream.

Add the use_single_rw flag to regmap config since the
device does not support bulk transactions over i2c.

Signed-off-by: Paul Handrigan <Paul.Handrigan@cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:48 +02:00
Srinivas Kandagatla
040fecfd71 ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
commit ff2faf1289 upstream.

dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the
paths pointer dangling in the list.

This leads to system crash when we try to unload and reload sound card.
I hit this bug during ADSP crash/reboot test case on Dragon board DB410c.

Without this patch, on SLAB Poisoning enabled build, kernel crashes with
"BUG kmalloc-128 (Tainted: G        W        ): Poison overwritten"

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Ingo Flaschberger
4e0ce7053a 1wire: family module autoload fails because of upper/lower case mismatch.
commit 065c09563c upstream.

1wire family module autoload fails because of upper/lower
  case mismatch.

Signed-off-by: Ingo Flaschberger <ingo.flaschberger@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Maxim Moseychuk
3c22218ed8 usb: do not reset if a low-speed or full-speed device timed out
commit 6e01827ed9 upstream.

Some low-speed and full-speed devices (for example, bluetooth)
do not have time to initialize. For them, ETIMEDOUT is a valid error.
We need to give them another try. Otherwise, they will
never be initialized correctly and in dmesg will be messages
"Bluetooth: hci0 command 0x1002 tx timeout" or similars.

Fixes: 264904ccc3 ("usb: retry reset if a device times out")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Maxim Moseychuk <franchesko.salias.hudro.pedros@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Waldemar Rymarkiewicz
8b03376580 PM / OPP: Update voltage in case freq == old_freq
commit c5c2a97b3a upstream.

This commit fixes a rare but possible case when the clk rate is updated
without update of the regulator voltage.

At boot up, CPUfreq checks if the system is running at the right freq. This
is a sanity check in case a bootloader set clk rate that is outside of freq
table present with cpufreq core. In such cases system can be unstable so
better to change it to a freq that is preset in freq-table.

The CPUfreq takes next freq that is >= policy->cur and this is our
target_freq that needs to be set now.

dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the
old_freq (a current rate). If these are equal it returns early. If not,
it searches for OPP (old_opp) that fits best to old_freq (not listed in
the table) and updates old_freq (!).

Here, we can end up with old_freq = old_opp.rate = target_freq, which
is not handled in _generic_set_opp_regulator(). It's supposed to update
voltage only when freq > old_freq  || freq > old_freq.

if (freq > old_freq) {
		ret = _set_opp_voltage(dev, reg, new_supply);
[...]
if (freq < old_freq) {
		ret = _set_opp_voltage(dev, reg, new_supply);
		if (ret)

It results in, no voltage update while clk rate is updated.

Example:
freq-table = {
	1000MHz   1.15V
	 666MHZ   1.10V
	 333MHz   1.05V
}
boot-up-freq        = 800MHz   # not listed in freq-table
freq = target_freq  = 1GHz
old_freq            = 800Mhz
old_opp = _find_freq_ceil(opp_table, &old_freq);  #(old_freq is modified!)
old_freq            = 1GHz

Fixes: 6a0712f6f1 ("PM / OPP: Add dev_pm_opp_set_rate()")
Cc: 4.6+ <stable@vger.kernel.org> # v4.6+
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Rafael J. Wysocki
ba0be5973f PM / core: Fix supplier device runtime PM usage counter imbalance
commit 47e5abfb54 upstream.

If a device link is added via device_link_add() by the driver of the
link's consumer device, the supplier's runtime PM usage counter is
going to be dropped by the pm_runtime_put_suppliers() call in
driver_probe_device().  However, in that case it is not incremented
unless the supplier driver is already present and the link is not
stateless.  That leads to a runtime PM usage counter imbalance for
the supplier device in a few cases.

To prevent that from happening, bump up the supplier runtime
PM usage counter in device_link_add() for all links with the
DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
time.  Use pm_runtime_get_noresume() for that as the callers of
device_link_add() who want the supplier to be resumed by it are
expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
additionally resume the supplier if the link is added during
consumer driver probe to retain the existing behavior for the
callers depending on it.

Fixes: 21d5c57b37 (PM / runtime: Use device links)
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Ulf Hansson
b7ac038977 PM / Domains: Fix error path during attach in genpd
commit 72038df3c5 upstream.

In case the PM domain fails to be powered on in genpd_dev_pm_attach(), it
returns -EPROBE_DEFER, but keeping the device attached to its PM domain.
This leads to problems when the next attempt to attach is re-tried. More
precisely, in that situation an -EEXIST error code is returned, because the
device already has its PM domain pointer assigned, from the first attempt.

Now, because of the sloppy error handling by the existing callers of
dev_pm_domain_attach(), probing is allowed to continue when -EEXIST is
returned. However, in such case there are no guarantees that the PM domain
is powered on by genpd, which may lead to hangs when buses/drivers tried to
access their devices.

Let's fix this behaviour, simply by detaching the device when powering on
fails in genpd_dev_pm_attach().

Cc: v4.11+ <stable@vger.kernel.org> # v4.11+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Eric W. Biederman
8ae5d476a3 signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
commit 7de712ccc0 upstream.

While working on changing this code to use force_sig_fault I
discovered that do_unaliged_user is sets si_signo to SIGBUS and passes
SIGSEGV to force_sig_info.  Which is just b0rked.

The code is reporting a SIGBUS error so replace the SIGSEGV with SIGBUS.

Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-xtensa@linux-xtensa.org
Cc: stable@vger.kernel.org
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Fixes: 5a0015d626 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 3")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Daniel Wagner
980899da5d serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
commit 8afb1d2c12 upstream.

Commit 40f70c03e3 ("serial: sh-sci: add locking to console write
function to avoid SMP lockup") copied the strategy to avoid locking
problems in conjuncture with the console from the UART8250
driver. Instead using directly spin_{try}lock_irqsave(),
local_irq_save() followed by spin_{try}lock() was used. While this is
correct on mainline, for -rt it is a problem. spin_{try}lock() will
check if it is running in a valid context. Since the local_irq_save()
has already been executed, the context has changed and
spin_{try}lock() will complain. The reason why spin_{try}lock()
complains is that on -rt the spin locks are turned into mutexes and
therefore can sleep. Sleeping with interrupts disabled is not valid.

BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995
in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0
CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12
Hardware name: Generic RZ/G1 (Flattened Device Tree)
Backtrace:
[<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c)
 r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0
[<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94)
[<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194)
 r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000
[<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74)
 r5:c06f4d60 r4:c06f4d60
[<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118)
 r5:c06f4d60 r4:c06f4d60
[<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124)
 r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798
 r4:c06b9914 r3:c02576e4
[<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430)
 r10:c06d30d8 r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798
 r4:c06d2798 r3:00000028
[<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0)
 r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027
 r4:00000003
[<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30)
 r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c
 r4:c0062fa8
[<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14)
[<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84)
[<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc)
 r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523
 r4:00000006
[<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178)
 r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c
 r4:dfbcf680
[<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248)
 r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000
 r4:df525180
[<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c)
 r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00
 r4:decac000
[<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c)
 r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:47 +02:00
Finn Thain
60711b27c5 m68k/mac: Fix SWIM memory resource end address
commit 3e2816c107 upstream.

The resource size is 0x2000 == end - start + 1.
Therefore end == start + 0x2000 - 1.

Cc: Laurent Vivier <lvivier@redhat.com>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Michael Schmitz
da9ad89c72 m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
commit 3f90f9ef2d upstream.

If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap
between mappings which is added to the vm_struct representing the
mapping.  __ioremap() uses the actual requested size (after alignment),
while __iounmap() is passed the size from the vm_struct.

On 020/030, early termination descriptors are used to set up mappings of
extent 'size', which are validated on unmapping. The unmapped gap of
size IO_SIZE defeats the sanity check of the pmd tables, causing
__iounmap() to loop forever on 030.

On 040/060, unmapping of page table entries does not check for a valid
mapping, so the umapping loop always completes there.

Adjust size to be unmapped by the gap that had been added in the
vm_struct prior.

This fixes the hang in atari_platform_init() reported a long time ago,
and a similar one reported by Finn recently (addressed by removing
ioremap() use from the SWIM driver.

Tested on my Falcon in 030 mode - untested but should work the same on
040/060 (the extra page tables cleared there would never have been set
up anyway).

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Minor commit description improvements]
[geert: This was fixed in 2.4.23, but not in 2.5.x]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Siarhei Liakh
ab693cc665 x86: Call fixup_exception() before notify_die() in math_error()
commit 3ae6295ccb upstream.

fpu__drop() has an explicit fwait which under some conditions can trigger a
fixable FPU exception while in kernel. Thus, we should attempt to fixup the
exception first, and only call notify_die() if the fixup failed just like
in do_general_protection(). The original call sequence incorrectly triggers
KDB entry on debug kernels under particular FPU-intensive workloads.

Andy noted, that this makes the whole conditional irq enable thing even
more inconsistent, but fixing that it outside the scope of this.

Signed-off-by: Siarhei Liakh <siarhei.liakh@concurrent-rt.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Borislav  Petkov" <bpetkov@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/DM5PR11MB201156F1CAB2592B07C79A03B17D0@DM5PR11MB2011.namprd11.prod.outlook.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Borislav Petkov
64d44661e2 x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
commit 1f74c8a647 upstream.

mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.

However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.

Which leads to MCE records with an empty status value:

  mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
  mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}

In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.

Tony: read the rest of the MCA bank to populate the struct mce fully.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Tony Luck
5b8e086891 x86/mce: Fix incorrect "Machine check from unknown source" message
commit 40c36e2741 upstream.

Some injection testing resulted in the following console log:

  mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
  mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
  mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
  mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
  mce: [Hardware Error]: Run the above through 'mcelog --ascii'
  Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).

[ bp: Remove unneeded severity assignment. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Tony Luck
d292f33b74 x86/mce: Check for alternate indication of machine check recovery on Skylake
commit 4c5717da1d upstream.

Currently we just check the "CAPID0" register to see whether the CPU
can recover from machine checks.

But there are also some special SKUs which do not have all advanced
RAS features, but do enable machine check recovery for use with NVDIMMs.

Add a check for any of bits {8:5} in the "CAPID5" register (each
reports some NVDIMM mode available, if any of them are set, then
the system supports memory machine check recovery).

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.9
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/03cbed6e99ddafb51c2eadf9a3b7c8d7a0cc204e.1527283897.git.tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Tony Luck
1d1dd2011a x86/mce: Improve error message when kernel cannot recover
commit c7d606f560 upstream.

Since we added support to add recovery from some errors inside the kernel in:

commit b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")

we have done a less than stellar job at reporting the cause of recoverable
machine checks that occur in other parts of the kernel. The user just gets
the unhelpful message:

	mce: [Hardware Error]: Machine check: Action required: unknown MCACOD

doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD
and see that it is listed as one of the standard recoverable values.

Add an extra rule to the MCE severity table to catch this case and report it
as:

	mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel

Fixes: b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.6+
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Juergen Gross
dbb37d98b9 x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
commit 74899d92e6 upstream.

Commit:

  1f50ddb4f4 ("x86/speculation: Handle HT correctly on AMD")

... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence.

speculative_store_bypass_ht_init() needs to be called on each CPU for
PV guests, too.

Reported-by: Brian Woods <brian.woods@amd.com>
Tested-by: Brian Woods <brian.woods@amd.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: xen-devel@lists.xenproject.org
Fixes: 1f50ddb4f4 ("x86/speculation: Handle HT correctly on AMD")
Link: https://lore.kernel.org/lkml/20180621084331.21228-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Dan Williams
3ce79716a9 x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
commit eab6870fee upstream.

Mark Rutland noticed that GCC optimization passes have the potential to elide
necessary invocations of the array_index_mask_nospec() instruction sequence,
so mark the asm() volatile.

Mark explains:

"The volatile will inhibit *some* cases where the compiler could lift the
 array_index_nospec() call out of a branch, e.g. where there are multiple
 invocations of array_index_nospec() with the same arguments:

        if (idx < foo) {
                idx1 = array_idx_nospec(idx, foo)
                do_something(idx1);
        }

        < some other code >

        if (idx < foo) {
                idx2 = array_idx_nospec(idx, foo);
                do_something_else(idx2);
        }

 ... since the compiler can determine that the two invocations yield the same
 result, and reuse the first result (likely the same register as idx was in
 originally) for the second branch, effectively re-writing the above as:

        if (idx < foo) {
                idx = array_idx_nospec(idx, foo);
                do_something(idx);
        }

        < some other code >

        if (idx < foo) {
                do_something_else(idx);
        }

 ... if we don't take the first branch, then speculatively take the second, we
 lose the nospec protection.

 There's more info on volatile asm in the GCC docs:

   https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
 "

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: babdde2698 ("x86: Implement array_index_mask_nospec")
Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:46 +02:00
Greg Kroah-Hartman
a26899e0ba Linux 4.14.52 2018-06-26 08:06:33 +08:00
Vlastimil Babka
1d26c11295 mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
commit 7810e6781e upstream.

In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies.  The zonelist is obtained
from current CPU's node.  This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g.  because the
allocating thread has been migrated to a different CPU.

This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended.  If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.

Current SLAB implementation seems to be immune by luck thanks to commit
511e3a0588 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.

We can fix it by simply removing the zonelist reset completely.  There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place.  This was different
when commit 183f6371aa ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.

We might consider this for 4.17 although I don't know if there's
anything currently broken.

SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a0588 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
ones I'll have to check.

So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes.  BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.

Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aa ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26 08:06:33 +08:00