Commit graph

708304 commits

Author SHA1 Message Date
Dmitry V. Levin
33e58deefa uapi: fix linux/tls.h userspace compilation error
commit b9f3eb499d upstream.

Move inclusion of a private kernel header <net/tcp.h>
from uapi/linux/tls.h to its only user - net/tls.h,
to fix the following linux/tls.h userspace compilation error:

/usr/include/linux/tls.h:41:21: fatal error: net/tcp.h: No such file or directory

As to this point uapi/linux/tls.h was totaly unusuable for userspace,
cleanup this header file further by moving other redundant includes
to net/tls.h.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:40 +00:00
Philip Derrin
30d3389d80 ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
commit 3b0c0c922f upstream.

When CONFIG_ARM_LPAE is set, the PMD dump relies on the software
read-only bit to determine whether a page is writable. This
concealed a bug which left the kernel text section writable
(AP2=0) while marked read-only in the software bit.

In a kernel with the AP2 bug, the dump looks like this:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M ro x  SHD
    0xc0600000-0xc0800000           2M ro NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

The fix is to check that the software and hardware bits are both
set before displaying "ro". The dump then shows the true perms:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M RW x  SHD
    0xc0600000-0xc0800000           2M RW NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

Fixes: ded9477984 ("ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE")
Signed-off-by: Philip Derrin <philip@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:40 +00:00
Philip Derrin
cb7cc998a0 ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
commit 400eeffaff upstream.

Currently, for ARM kernels with CONFIG_ARM_LPAE and
CONFIG_STRICT_KERNEL_RWX enabled, the 2MiB pages mapping the
kernel code and rodata are writable. They are marked read-only in
a software bit (L_PMD_SECT_RDONLY) but the hardware read-only bit
is not set (PMD_SECT_AP2).

For user mappings, the logic that propagates the software bit
to the hardware bit is in set_pmd_at(); but for the kernel,
section_update() writes the PMDs directly, skipping this logic.

The fix is to set PMD_SECT_AP2 for read-only sections in
section_update(), at the same time as L_PMD_SECT_RDONLY.

Fixes: 1e3479225a ("ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error")
Signed-off-by: Philip Derrin <philip@cog.systems>
Reported-by: Neil Dick <neil@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Tested-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:40 +00:00
Catalin Marinas
ecc3c4795b arm64: Implement arch-specific pte_access_permitted()
commit 6218f96c58 upstream.

The generic pte_access_permitted() implementation only checks for
pte_present() (together with the write permission where applicable).
However, for both kernel ptes and PROT_NONE mappings pte_present() also
returns true on arm64 even though such mappings are not user accessible.
Additionally, arm64 now supports execute-only user permission
(PROT_EXEC) which is implemented by clearing the PTE_USER bit.

With this patch the arm64 implementation of pte_access_permitted()
checks for the PTE_VALID and PTE_USER bits together with writable access
if applicable.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:40 +00:00
Andi Kleen
c35be33486 perf/x86/intel: Hide TSX events when RTM is not supported
commit 58ba4d5a25 upstream.

0day testing reported a perf test regression on Haswell systems without
RTM. Commit a5df70c35 hides the in_tx/in_tx_cp attributes when RTM is not
available, but the TSX events are still available in sysfs. Due to the
missing attributes the event parser fails on those files.

Don't show the TSX events in sysfs when RTM is not available on
Haswell/Broadwell/Skylake.

Fixes: a5df70c354 (perf/x86: Only show format attributes when supported)
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20171109000718.14137-1-andi@firstfloor.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:40 +00:00
Andy Lutomirski
f9a64e23a9 x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
commit ca37e57bbe upstream.

Running this code with IRQs enabled (where dummy_lock is a spinlock):

static void check_load_gs_index(void)
{
	/* This will fail. */
	load_gs_index(0xffff);

	spin_lock(&dummy_lock);
	spin_unlock(&dummy_lock);
}

Will generate a lockdep warning.  The issue is that the actual write
to %gs would cause an exception with IRQs disabled, and the exception
handler would, as an inadvertent side effect, update irqflag tracing
to reflect the IRQs-off status.  native_load_gs_index() would then
turn IRQs back on and return with irqflag tracing still thinking that
IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
error and warn.

Fix it by adding the missing tracing.

Apparently nothing did this in a context where it mattered.  I haven't
tried to find a code path that would actually exhibit the warning if
appropriately nasty user code were running.

I suspect that the security impact of this bug is very, very low --
production systems don't run with lockdep enabled, and the warning is
mostly harmless anyway.

Found during a quick audit of the entry code to try to track down an
unrelated bug that Ingo found in some still-in-development code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Andy Lutomirski
c91f3fc241 x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
commit 548c3050ea upstream.

When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
before it.  This means that users of entry_SYSCALL_64_after_hwframe()
were responsible for invoking TRACE_IRQS_OFF, and the one and only
user (Xen, added in the same commit) got it wrong.

I think this would manifest as a warning if a Xen PV guest with
CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
context tracking bit is to cause lockdep to get invoked before we
turn IRQs back on.)  I haven't tested that for real yet because I
can't get a kernel configured like that to boot at all on Xen PV.

Move TRACE_IRQS_OFF below the label.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8a9949bc71 ("x86/xen/64: Rearrange the SYSCALL entries")
Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Masami Hiramatsu
00d5e292a8 x86/decoder: Add new TEST instruction pattern
commit 12a78d43de upstream.

The kbuild test robot reported this build warning:

  Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c

  Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx)
  Warning: objdump says 3 bytes, but insn_get_length() says 2
  Warning: decoded and checked 1569014 instructions with 1 warnings

This sequence seems to be a new instruction not in the opcode map in the Intel SDM.

The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8.
Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of
the ModR/M Byte (bits 2,1,0 in parenthesis)"

In that table, opcodes listed by the index REG bits as:

  000         001       010 011  100        101        110         111
 TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX

So, it seems TEST Ib is assigned to 001.

Add the new pattern.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Tom Lendacky
46855f80d1 x86/boot: Fix boot failure when SMP MP-table is based at 0
commit ac5292e9a2 upstream.

When crosvm is used to boot a kernel as a VM, the SMP MP-table is found
at physical address 0x0. This causes mpf_base to be set to 0 and a
subsequent "if (!mpf_base)" check in default_get_smp_config() results in
the MP-table not being parsed.  Further into the boot this results in an
oops when attempting a read_apic_id().

Add a boolean variable that is set to true when the MP-table is found.
Use this variable for testing if the MP-table was found so that even a
value of 0 for mpf_base will result in continued parsing of the MP-table.

Fixes: 5997efb967 ("x86/boot: Use memremap() to map the MPF and MPC data")
Reported-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: regression@leemhuis.info
Link: https://lkml.kernel.org/r/20171106201753.23059.86674.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Eric Biggers
ce922b7b4a lib/mpi: call cond_resched() from mpi_powm() loop
commit 1d9ddde12e upstream.

On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
largest permitted inputs (16384 bits), the kernel spends 10+ seconds
doing modular exponentiation in mpi_powm() without rescheduling.  If all
threads do it, it locks up the system.  Moreover, it can cause
rcu_sched-stall warnings.

Notwithstanding the insanity of doing this calculation in kernel mode
rather than in userspace, fix it by calling cond_resched() as each bit
from the exponent is processed.  It's still noninterruptible, but at
least it's preemptible now.

Do the cond_resched() once per bit rather than once per MPI limb because
each limb might still easily take 100+ milliseconds on slow CPUs.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Paul E. McKenney
9f088f6a67 sched: Make resched_cpu() unconditional
commit 7c2102e56a upstream.

The current implementation of synchronize_sched_expedited() incorrectly
assumes that resched_cpu() is unconditional, which it is not.  This means
that synchronize_sched_expedited() can hang when resched_cpu()'s trylock
fails as follows (analysis by Neeraj Upadhyay):

o	CPU1 is waiting for expedited wait to complete:

	sync_rcu_exp_select_cpus
	     rdp->exp_dynticks_snap & 0x1   // returns 1 for CPU5
	     IPI sent to CPU5

	synchronize_sched_expedited_wait
		 ret = swait_event_timeout(rsp->expedited_wq,
					   sync_rcu_preempt_exp_done(rnp_root),
					   jiffies_stall);

	expmask = 0x20, CPU 5 in idle path (in cpuidle_enter())

o	CPU5 handles IPI and fails to acquire rq lock.

	Handles IPI
	     sync_sched_exp_handler
		 resched_cpu
		     returns while failing to try lock acquire rq->lock
		 need_resched is not set

o	CPU5 calls  rcu_idle_enter() and as need_resched is not set, goes to
	idle (schedule() is not called).

o	CPU 1 reports RCU stall.

Given that resched_cpu() is now used only by RCU, this commit fixes the
assumption by making resched_cpu() unconditional.

Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Suggested-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Johan Hovold
668a128518 serdev: fix registration of second slave
commit 08fcee289f upstream.

Serdev currently only supports a single slave device, but the required
sanity checks to prevent further registration attempts were missing.

If a serial-port node has two child nodes with compatible properties,
the OF code would try to register two slave devices using the same id
and name. Driver core will not allow this (and there will be loud
complaints), but the controller's slave pointer would already have been
set to address of the soon to be deallocated second struct
serdev_device. As the first slave device remains registered, this can
lead to later use-after-free issues when the slave callbacks are
accessed.

Note that while the serdev registration helpers are exported, they are
typically only called by serdev core. Any other (out-of-tree) callers
must serialise registration and deregistration themselves.

Fixes: cd6484e183 ("serdev: Introduce new bus for serial attached devices")
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Viresh Kumar
b79974945e cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq
commit 07458f6a51 upstream.

'cached_raw_freq' is used to get the next frequency quickly but should
always be in sync with sg_policy->next_freq. There is a case where it is
not and in such cases it should be reset to avoid switching to incorrect
frequencies.

Consider this case for example:

 - policy->cur is 1.2 GHz (Max)
 - New request comes for 780 MHz and we store that in cached_raw_freq.
 - Based on 780 MHz, we calculate the effective frequency as 800 MHz.
 - We then see the CPU wasn't idle recently and choose to keep the next
   freq as 1.2 GHz.
 - Now we have cached_raw_freq is 780 MHz and sg_policy->next_freq is
   1.2 GHz.
 - Now if the utilization doesn't change in then next request, then the
   next target frequency will still be 780 MHz and it will match with
   cached_raw_freq. But we will choose 1.2 GHz instead of 800 MHz here.

Fixes: b7eaf1aab9 (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Lv Zheng
3fe36d0c58 ACPI / EC: Fix regression related to triggering source of EC event handling
commit 53c5eaabae upstream.

Originally the Samsung quirks removed by commit 4c237371 can be covered
by commit e923e8e7 and ec_freeze_events=Y mode. But commit 9c40f956
changed ec_freeze_events=Y back to N, making this problem re-surface.

Actually, if commit e923e8e7 is robust enough, we can freely change
ec_freeze_events mode, so this patch fixes the issue by improving
commit e923e8e7.

Related commits listed in the merged order:

 Commit: e923e8e79e
 Subject: ACPI / EC: Fix an issue that SCI_EVT cannot be detected
          after event is enabled

 Commit: 4c237371f2
 Subject: ACPI / EC: Remove old CLEAR_ON_RESUME quirk

 Commit: 9c40f956ce
 Subject: Revert "ACPI / EC: Enable event freeze mode..." to fix
          a regression

This patch not only fixes the reported post-resume EC event triggering
source issue, but also fixes an unreported similar issue related to the
driver bind by adding EC event triggering source in ec_install_handlers().

Fixes: e923e8e79e (ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled)
Fixes: 4c237371f2 (ACPI / EC: Remove old CLEAR_ON_RESUME quirk)
Fixes: 9c40f956ce (Revert "ACPI / EC: Enable event freeze mode..." to fix a regression)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196833
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reported-by: Alistair Hamilton <ahpatent@gmail.com>
Tested-by: Alistair Hamilton <ahpatent@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:39 +00:00
Ville Syrjälä
ef2b11c0a3 ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock
commit ff1656790b upstream.

acpi_remove_pm_notifier() ends up calling flush_workqueue() while
holding acpi_pm_notifier_lock, and that same lock is taken by
by the work via acpi_pm_notify_handler(). This can deadlock.

To fix the problem let's split the single lock into two: one to
protect the dev->wakeup between the work vs. add/remove, and
another one to handle notifier installation vs. removal.

After commit a1d14934ea "workqueue/lockdep: 'Fix' flush_work()
annotation" I was able to kill the machine (Intel Braswell)
very easily with 'powertop --auto-tune', runtime suspending i915,
and trying to wake it up via the USB keyboard. The cases when
it didn't die are presumably explained by lockdep getting disabled
by something else (cpu hotplug locking issues usually).

Fortunately I still got a lockdep report over netconsole
(trickling in very slowly), even though the machine was
otherwise practically dead:

[  112.179806] ======================================================
[  114.670858] WARNING: possible circular locking dependency detected
[  117.155663] 4.13.0-rc6-bsw-bisect-00169-ga1d14934ea4b #119 Not tainted
[  119.658101] ------------------------------------------------------
[  121.310242] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
[  121.313294] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
[  121.313346] xhci_hcd 0000:00:14.0: HC died; cleaning up
[  121.313485] usb 1-6: USB disconnect, device number 3
[  121.313501] usb 1-6.2: USB disconnect, device number 4
[  134.747383] kworker/0:2/47 is trying to acquire lock:
[  137.220790]  (acpi_pm_notifier_lock){+.+.}, at: [<ffffffff813cafdf>] acpi_pm_notify_handler+0x2f/0x80
[  139.721524]
[  139.721524] but task is already holding lock:
[  144.672922]  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  147.184450]
[  147.184450] which lock already depends on the new lock.
[  147.184450]
[  154.604711]
[  154.604711] the existing dependency chain (in reverse order) is:
[  159.447888]
[  159.447888] -> #2 ((&dpc->work)){+.+.}:
[  164.183486]        __lock_acquire+0x1255/0x13f0
[  166.504313]        lock_acquire+0xb5/0x210
[  168.778973]        process_one_work+0x1b9/0x720
[  171.030316]        worker_thread+0x4c/0x440
[  173.257184]        kthread+0x154/0x190
[  175.456143]        ret_from_fork+0x27/0x40
[  177.624348]
[  177.624348] -> #1 ("kacpi_notify"){+.+.}:
[  181.850351]        __lock_acquire+0x1255/0x13f0
[  183.941695]        lock_acquire+0xb5/0x210
[  186.046115]        flush_workqueue+0xdd/0x510
[  190.408153]        acpi_os_wait_events_complete+0x31/0x40
[  192.625303]        acpi_remove_notify_handler+0x133/0x188
[  194.820829]        acpi_remove_pm_notifier+0x56/0x90
[  196.989068]        acpi_dev_pm_detach+0x5f/0xa0
[  199.145866]        dev_pm_domain_detach+0x27/0x30
[  201.285614]        i2c_device_probe+0x100/0x210
[  203.411118]        driver_probe_device+0x23e/0x310
[  205.522425]        __driver_attach+0xa3/0xb0
[  207.634268]        bus_for_each_dev+0x69/0xa0
[  209.714797]        driver_attach+0x1e/0x20
[  211.778258]        bus_add_driver+0x1bc/0x230
[  213.837162]        driver_register+0x60/0xe0
[  215.868162]        i2c_register_driver+0x42/0x70
[  217.869551]        0xffffffffa0172017
[  219.863009]        do_one_initcall+0x45/0x170
[  221.843863]        do_init_module+0x5f/0x204
[  223.817915]        load_module+0x225b/0x29b0
[  225.757234]        SyS_finit_module+0xc6/0xd0
[  227.661851]        do_syscall_64+0x5c/0x120
[  229.536819]        return_from_SYSCALL_64+0x0/0x7a
[  231.392444]
[  231.392444] -> #0 (acpi_pm_notifier_lock){+.+.}:
[  235.124914]        check_prev_add+0x44e/0x8a0
[  237.024795]        __lock_acquire+0x1255/0x13f0
[  238.937351]        lock_acquire+0xb5/0x210
[  240.840799]        __mutex_lock+0x75/0x940
[  242.709517]        mutex_lock_nested+0x1c/0x20
[  244.551478]        acpi_pm_notify_handler+0x2f/0x80
[  246.382052]        acpi_ev_notify_dispatch+0x44/0x5c
[  248.194412]        acpi_os_execute_deferred+0x14/0x30
[  250.003925]        process_one_work+0x1ec/0x720
[  251.803191]        worker_thread+0x4c/0x440
[  253.605307]        kthread+0x154/0x190
[  255.387498]        ret_from_fork+0x27/0x40
[  257.153175]
[  257.153175] other info that might help us debug this:
[  257.153175]
[  262.324392] Chain exists of:
[  262.324392]   acpi_pm_notifier_lock --> "kacpi_notify" --> (&dpc->work)
[  262.324392]
[  267.391997]  Possible unsafe locking scenario:
[  267.391997]
[  270.758262]        CPU0                    CPU1
[  272.431713]        ----                    ----
[  274.060756]   lock((&dpc->work));
[  275.646532]                                lock("kacpi_notify");
[  277.260772]                                lock((&dpc->work));
[  278.839146]   lock(acpi_pm_notifier_lock);
[  280.391902]
[  280.391902]  *** DEADLOCK ***
[  280.391902]
[  284.986385] 2 locks held by kworker/0:2/47:
[  286.524895]  #0:  ("kacpi_notify"){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  288.112927]  #1:  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  289.727725]

Fixes: c072530f39 (ACPI / PM: Revork the handling of ACPI device wakeup notifications)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Vasily Gorbik
6047980008 s390/disassembler: increase show_code buffer size
commit b192571d1a upstream.

Current buffer size of 64 is too small. objdump shows that there are
instructions which would require up to 75 bytes buffer (with current
formating). 128 bytes "ought to be enough for anybody".

Also replaces 8 spaces with a single tab to reduce the memory footprint.

Fixes the following KASAN finding:

BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538
Write of size 1 at addr 000000005a4a75a0 by task bash/1282

CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<000000000011eeb6>] show_stack+0x56/0x88)
 [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0
 [<00000000004e2994>] print_address_description+0xf4/0x288
 [<00000000004e2cf2>] kasan_report+0x13a/0x230
 [<0000000000e38ae6>] number+0x3fe/0x538
 [<0000000000e3dfe4>] vsnprintf+0x194/0x948
 [<0000000000e3ea42>] sprintf+0xa2/0xb8
 [<00000000001198dc>] print_insn+0x374/0x500
 [<0000000000119346>] show_code+0x4ee/0x538
 [<000000000011f234>] show_registers+0x34c/0x388
 [<000000000011f2ae>] show_regs+0x3e/0xa8
 [<000000000011f502>] die+0x1ea/0x2e8
 [<0000000000138f0e>] do_no_context+0x106/0x168
 [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0
 [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0
 [<000000000090639e>] sysrq_handle_crash+0x46/0x58
([<0000000000000007>] 0x7)
 [<00000000009073fa>] __handle_sysrq+0x102/0x218
 [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100
 [<000000000061d67a>] proc_reg_write+0xb2/0x128
 [<0000000000520be6>] __vfs_write+0xee/0x368
 [<0000000000521222>] vfs_write+0x21a/0x278
 [<000000000052156a>] SyS_write+0xda/0x178
 [<0000000000e555cc>] system_call+0xc4/0x270

The buggy address belongs to the page:
page:000003d1016929c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x0()
raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000
raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
 000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00
>000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
                               ^
 000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8
 000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00
==================================================================

Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Heiko Carstens
15e82cdb4d s390/disassembler: add missing end marker for e7 table
commit 5c50538752 upstream.

The e7 opcode table does not have an end marker. Hence when trying to
find an unknown e7 instruction the code will access memory behind the
table until it finds something that matches the opcode, or the kernel
crashes, whatever comes first.

This affects not only the in-kernel disassembler but also uprobes and
kprobes which refuse to set a probe on unknown instructions, and
therefore search the opcode tables to figure out if instructions are
known or not.

Fixes: 3585cb0280 ("s390/disassembler: add vector instructions")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Heiko Carstens
7ee3f026ae s390/guarded storage: fix possible memory corruption
commit fa1edf3f63 upstream.

For PREEMPT enabled kernels the guarded storage (GS) code contains a
possible use-after-free bug. If a task that makes use of GS exits, it
will execute do_exit() while still enabled for preemption.

That function will call exit_thread_runtime_instr() via exit_thread().
If exit_thread_gs() gets preempted after the GS control block of the
task has been freed but before the pointer to it is set to NULL, then
save_gs_cb(), called from switch_to(), will write to already freed
memory.

Avoid this and simply disable preemption while freeing the control
block and setting the pointer to NULL.

Fixes: 916cda1aa1 ("s390: add a system call for guarded storage")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Heiko Carstens
27576413d9 s390/runtime instrumention: fix possible memory corruption
commit d6e646ad7c upstream.

For PREEMPT enabled kernels the runtime instrumentation (RI) code
contains a possible use-after-free bug. If a task that makes use of RI
exits, it will execute do_exit() while still enabled for preemption.

That function will call exit_thread_runtime_instr() via
exit_thread(). If exit_thread_runtime_instr() gets preempted after the
RI control block of the task has been freed but before the pointer to
it is set to NULL, then save_ri_cb(), called from switch_to(), will
write to already freed memory.

Avoid this and simply disable preemption while freeing the control
block and setting the pointer to NULL.

Fixes: e4b8b3f33f ("s390: add support for runtime instrumentation")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Heiko Carstens
21caac65d2 s390/noexec: execute kexec datamover without DAT
commit d0e810eeb3 upstream.

Rebooting into a new kernel with kexec fails (system dies) if tried on
a machine that has no-execute support. Reason for this is that the so
called datamover code gets executed with DAT on (MMU is active) and
the page that contains the datamover is marked as non-executable.
Therefore when branching into the datamover an unexpected program
check happens and afterwards the machine is dead.

This can be simply avoided by disabling DAT, which also disables any
no-execute checks, just before the datamover gets executed.

In fact the first thing done by the datamover is to disable DAT. The
code in the datamover that disables DAT can be removed as well.

Thanks to Michael Holzheu and Gerald Schaefer for tracking this down.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 57d7f939e7 ("s390: add no-execute support")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Heiko Carstens
236f6e7283 s390: fix transactional execution control register handling
commit a1c5befc1c upstream.

Dan Horák reported the following crash related to transactional execution:

User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000]
CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
task: 00000000fafc8000 task.stack: 00000000fafc4000
User PSW : 0705200180000000 000003ff93c14e70
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e
           0000000000000000 0000000000000002 0000000000000000 000003ff00000000
           0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770
           000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0
User Code: 000003ff93c14e62: 60e0b030            std     %f14,48(%r11)
           000003ff93c14e66: 60f0b038            std     %f15,56(%r11)
          #000003ff93c14e6a: e5600000ff0e        tbegin  0,65294
          >000003ff93c14e70: a7740006            brc     7,3ff93c14e7c
           000003ff93c14e74: a7080000            lhi     %r0,0
           000003ff93c14e78: a7f40023            brc     15,3ff93c14ebe
           000003ff93c14e7c: b2220000            ipm     %r0
           000003ff93c14e80: 8800001c            srl     %r0,28

There are several bugs with control register handling with respect to
transactional execution:

- on task switch update_per_regs() is only called if the next task has
  an mm (is not a kernel thread). This however is incorrect. This
  breaks e.g. for user mode helper handling, where the kernel creates
  a kernel thread and then execve's a user space program. Control
  register contents related to transactional execution won't be
  updated on execve. If the previous task ran with transactional
  execution disabled then the new task will also run with
  transactional execution disabled, which is incorrect. Therefore call
  update_per_regs() unconditionally within switch_to().

- on startup the transactional execution facility is not enabled for
  the idle thread. This is not really a bug, but an inconsistency to
  other facilities. Therefore enable the facility if it is available.

- on fork the new thread's per_flags field is not cleared. This means
  that a child process inherits the PER_FLAG_NO_TE flag. This flag can
  be set with a ptrace request to disable transactional execution for
  the current process. It should not be inherited by new child
  processes in order to be consistent with the handling of all other
  PER related debugging options. Therefore clear the per_flags field in
  copy_thread_tls().

Reported-and-tested-by: Dan Horák <dan@danny.cz>
Fixes: d35339a42d ("s390: add support for transactional memory")
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:40:38 +00:00
Greg Kroah-Hartman
f9f0b03ded Linux 4.14.2 2017-11-24 08:37:05 +01:00
Corey Minyard
8d8564fce8 ipmi: Prefer ACPI system interfaces over SMBIOS ones
commit 7e030d6dff upstream.

The recent changes to add SMBIOS (DMI) IPMI interfaces as platform
devices caused DMI to be selected before ACPI, causing ACPI type
of operations to not work.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:05 +01:00
Jan Harkes
71d3850bf1 coda: fix 'kernel memory exposure attempt' in fsync
commit d337b66a4c upstream.

When an application called fsync on a file in Coda a small request with
just the file identifier was allocated, but the declared length was set
to the size of union of all possible upcall requests.

This bug has been around for a very long time and is now caught by the
extra checking in usercopy that was introduced in Linux-4.8.

The exposure happens when the Coda cache manager process reads the fsync
upcall request at which point it is killed. As a result there is nobody
servicing any further upcalls, trapping any processes that try to access
the mounted Coda filesystem.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:05 +01:00
Jaewon Kim
cb58b8f8a8 mm/page_ext.c: check if page_ext is not prepared
commit e492080e64 upstream.

online_page_ext() and page_ext_init() allocate page_ext for each
section, but they do not allocate if the first PFN is !pfn_present(pfn)
or !pfn_valid(pfn).  Then section->page_ext remains as NULL.
lookup_page_ext checks NULL only if CONFIG_DEBUG_VM is enabled.  For a
valid PFN, __set_page_owner will try to get page_ext through
lookup_page_ext.  Without CONFIG_DEBUG_VM lookup_page_ext will misuse
NULL pointer as value 0.  This incurrs invalid address access.

This is the panic example when PFN 0x100000 is not valid but PFN
0x13FC00 is being used for page_ext.  section->page_ext is NULL,
get_entry returned invalid page_ext address as 0x1DFA000 for a PFN
0x13FC00.

To avoid this panic, CONFIG_DEBUG_VM should be removed so that page_ext
will be checked at all times.

  Unable to handle kernel paging request at virtual address 01dfa014
  ------------[ cut here ]------------
  Kernel BUG at ffffff80082371e0 [verbose debug info unavailable]
  Internal error: Oops: 96000045 [#1] PREEMPT SMP
  Modules linked in:
  PC is at __set_page_owner+0x48/0x78
  LR is at __set_page_owner+0x44/0x78
    __set_page_owner+0x48/0x78
    get_page_from_freelist+0x880/0x8e8
    __alloc_pages_nodemask+0x14c/0xc48
    __do_page_cache_readahead+0xdc/0x264
    filemap_fault+0x2ac/0x550
    ext4_filemap_fault+0x3c/0x58
    __do_fault+0x80/0x120
    handle_mm_fault+0x704/0xbb0
    do_page_fault+0x2e8/0x394
    do_mem_abort+0x88/0x124

Pre-4.7 kernels also need commit f86e427197 ("mm: check the return
value of lookup_page_ext for all call sites").

Link: http://lkml.kernel.org/r/20171107094131.14621-1-jaewon31.kim@samsung.com
Fixes: eefa864b70 ("mm/page_ext: resurrect struct page extending code for debugging")
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:05 +01:00
Pavel Tatashin
5a77c92fa1 mm/page_alloc.c: broken deferred calculation
commit d135e57502 upstream.

In reset_deferred_meminit() we determine number of pages that must not
be deferred.  We initialize pages for at least 2G of memory, but also
pages for reserved memory in this node.

The reserved memory is determined in this function:
memblock_reserved_memory_within(), which operates over physical
addresses, and returns size in bytes.  However, reset_deferred_meminit()
assumes that that this function operates with pfns, and returns page
count.

The result is that in the best case machine boots slower than expected
due to initializing more pages than needed in single thread, and in the
worst case panics because fewer than needed pages are initialized early.

Link: http://lkml.kernel.org/r/20171021011707.15191-1-pasha.tatashin@oracle.com
Fixes: 864b9a393d ("mm: consider memblock reservations for deferred memory initialization sizing")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:05 +01:00
Corey Minyard
3a46857604 ipmi: fix unsigned long underflow
commit 392a17b10e upstream.

When I set the timeout to a specific value such as 500ms, the timeout
event will not happen in time due to the overflow in function
check_msg_timeout:
...
	ent->timeout -= timeout_period;
	if (ent->timeout > 0)
		return;
...

The type of timeout_period is long, but ent->timeout is unsigned long.
This patch makes the type consistent.

Reported-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
alex chen
44ec0aecc7 ocfs2: should wait dio before inode lock in ocfs2_setattr()
commit 28f5a8a7c0 upstream.

we should wait dio requests to finish before inode lock in
ocfs2_setattr(), otherwise the following deadlock will happen:

process 1                  process 2                    process 3
truncate file 'A'          end_io of writing file 'A'   receiving the bast messages
ocfs2_setattr
 ocfs2_inode_lock_tracker
  ocfs2_inode_lock_full
 inode_dio_wait
  __inode_dio_wait
  -->waiting for all dio
  requests finish
                                                        dlm_proxy_ast_handler
                                                         dlm_do_local_bast
                                                          ocfs2_blocking_ast
                                                           ocfs2_generic_handle_bast
                                                            set OCFS2_LOCK_BLOCKED flag
                        dio_end_io
                         dio_bio_end_aio
                          dio_complete
                           ocfs2_dio_end_io
                            ocfs2_dio_end_io_write
                             ocfs2_inode_lock
                              __ocfs2_cluster_lock
                               ocfs2_wait_for_mask
                               -->waiting for OCFS2_LOCK_BLOCKED
                               flag to be cleared, that is waiting
                               for 'process 1' unlocking the inode lock
                           inode_dio_end
                           -->here dec the i_dio_count, but will never
                           be called, so a deadlock happened.

Link: http://lkml.kernel.org/r/59F81636.70508@huawei.com
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Changwei Ge
e250a19937 ocfs2: fix cluster hang after a node dies
commit 1c01967116 upstream.

When a node dies, other live nodes have to choose a new master for an
existed lock resource mastered by the dead node.

As for ocfs2/dlm implementation, this is done by function -
dlm_move_lockres_to_recovery_list which marks those lock rsources as
DLM_LOCK_RES_RECOVERING and manages them via a list from which DLM
changes lock resource's master later.

So without invoking dlm_move_lockres_to_recovery_list, no master will be
choosed after dlm recovery accomplishment since no lock resource can be
found through ::resource list.

What's worse is that if DLM_LOCK_RES_RECOVERING is not marked for lock
resources mastered a dead node, it will break up synchronization among
nodes.

So invoke dlm_move_lockres_to_recovery_list again.

Fixs: 'commit ee8f7fcbe6 ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down")'
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED6E0F9@H3CMLB14-EX.srv.huawei-3com.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reported-by: Vitaly Mayatskih <v.mayatskih@gmail.com>
Tested-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Jann Horn
bbce81fc96 mm/pagewalk.c: report holes in hugetlb ranges
commit 373c4557d2 upstream.

This matters at least for the mincore syscall, which will otherwise copy
uninitialized memory from the page allocator to userspace.  It is
probably also a correctness error for /proc/$pid/pagemap, but I haven't
tested that.

Removing the `walk->hugetlb_entry` condition in walk_hugetlb_range() has
no effect because the caller already checks for that.

This only reports holes in hugetlb ranges to callers who have specified
a hugetlb_entry callback.

This issue was found using an AFL-based fuzzer.

v2:
 - don't crash on ->pte_hole==NULL (Andrew Morton)
 - add Cc stable (Andrew Morton)

Fixes: 1e25a271c8 ("mincore: apply page table walker on do_mincore()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Neeraj Upadhyay
3594216fc6 rcu: Fix up pending cbs check in rcu_prepare_for_idle
commit 135bd1a230 upstream.

The pending-callbacks check in rcu_prepare_for_idle() is backwards.
It should accelerate if there are pending callbacks, but the check
rather uselessly accelerates only if there are no callbacks.  This commit
therefore inverts this check.

Fixes: 15fecf89e4 ("srcu: Abstract multi-tail callback list handling")
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Alexander Steffen
a371abb25a tpm-dev-common: Reject too short writes
commit ee70bc1e7b upstream.

tpm_transmit() does not offer an explicit interface to indicate the number
of valid bytes in the communication buffer. Instead, it relies on the
commandSize field in the TPM header that is encoded within the buffer.
Therefore, ensure that a) enough data has been written to the buffer, so
that the commandSize field is present and b) the commandSize field does not
announce more data than has been written to the buffer.

This should have been fixed with CVE-2011-1161 long ago, but apparently
a correct version of that patch never made it into the kernel.

Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Ji-Ze Hong (Peter Hong)
432a1a50d3 serial: 8250_fintek: Fix finding base_port with activated SuperIO
commit fd97e66c55 upstream.

The SuperIO will be configured at boot time by BIOS, but some BIOS
will not deactivate the SuperIO when the end of configuration. It'll
lead to mismatch for pdata->base_port in probe_setup_port(). So we'll
deactivate all SuperIO before activate special base_port in
fintek_8250_enter_key().

Tested on iBASE MI802.

Tested-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
Signed-off-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
Reviewd-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Lukas Wunner
c75289a7b7 serial: omap: Fix EFR write on RTS deassertion
commit 2a71de2f73 upstream.

Commit 348f9bb31c ("serial: omap: Fix RTS handling") sought to enable
auto RTS upon manual RTS assertion and disable it on deassertion.
However it seems the latter was done incorrectly, it clears all bits in
the Extended Features Register *except* auto RTS.

Fixes: 348f9bb31c ("serial: omap: Fix RTS handling")
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
Roberto Sassu
87eb84b9a0 ima: do not update security.ima if appraisal status is not INTEGRITY_PASS
commit 020aae3ee5 upstream.

Commit b65a9cfc2c ("Untangling ima mess, part 2: deal with counters")
moved the call of ima_file_check() from may_open() to do_filp_open() at a
point where the file descriptor is already opened.

This breaks the assumption made by IMA that file descriptors being closed
belong to files whose access was granted by ima_file_check(). The
consequence is that security.ima and security.evm are updated with good
values, regardless of the current appraisal status.

For example, if a file does not have security.ima, IMA will create it after
opening the file for writing, even if access is denied. Access to the file
will be allowed afterwards.

Avoid this issue by checking the appraisal status before updating
security.ima.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Eric W. Biederman
8d02869481 net/sctp: Always set scope_id in sctp_inet6_skb_msgname
[ Upstream commit 7c8a61d9ee ]

Alexandar Potapenko while testing the kernel with KMSAN and syzkaller
discovered that in some configurations sctp would leak 4 bytes of
kernel stack.

Working with his reproducer I discovered that those 4 bytes that
are leaked is the scope id of an ipv6 address returned by recvmsg.

With a little code inspection and a shrewd guess I discovered that
sctp_inet6_skb_msgname only initializes the scope_id field for link
local ipv6 addresses to the interface index the link local address
pertains to instead of initializing the scope_id field for all ipv6
addresses.

That is almost reasonable as scope_id's are meaniningful only for link
local addresses.  Set the scope_id in all other cases to 0 which is
not a valid interface index to make it clear there is nothing useful
in the scope_id field.

There should be no danger of breaking userspace as the stack leak
guaranteed that previously meaningless random data was being returned.

Fixes: 372f525b495c ("SCTP:  Resync with LKSCTP tree.")
History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Huacai Chen
7b6868743d fealnx: Fix building error on MIPS
[ Upstream commit cc54c1d32e ]

This patch try to fix the building error on MIPS. The reason is MIPS
has already defined the LONG macro, which conflicts with the LONG enum
in drivers/net/ethernet/fealnx.c.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Bjørn Mork
060dd7a476 net: cdc_ncm: GetNtbFormat endian fix
[ Upstream commit 6314dab4b8 ]

The GetNtbFormat and SetNtbFormat requests operate on 16 bit little
endian values. We get away with ignoring this most of the time, because
we only care about USB_CDC_NCM_NTB16_FORMAT which is 0x0000.  This
fails for USB_CDC_NCM_NTB32_FORMAT.

Fix comparison between LE value from device and constant by converting
the constant to LE.

Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: 2b02c20ce0 ("cdc_ncm: Set NTB format again after altsetting switch for Huawei devices")
Cc: Enrico Mioso <mrkiko.rs@gmail.com>
Cc: Christian Panton <christian@panton.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-By: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Xin Long
829af2cd8a vxlan: fix the issue that neigh proxy blocks all icmpv6 packets
[ Upstream commit 8bff3685a4 ]

Commit f1fb08f633 ("vxlan: fix ND proxy when skb doesn't have transport
header offset") removed icmp6_code and icmp6_type check before calling
neigh_reduce when doing neigh proxy.

It means all icmpv6 packets would be blocked by this, not only ns packet.
In Jianlin's env, even ping6 couldn't work through it.

This patch is to bring the icmp6_code and icmp6_type check back and also
removed the same check from neigh_reduce().

Fixes: f1fb08f633 ("vxlan: fix ND proxy when skb doesn't have transport header offset")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Jason A. Donenfeld
5856c858c6 af_netlink: ensure that NLMSG_DONE never fails in dumps
[ Upstream commit 0642840b8b ]

The way people generally use netlink_dump is that they fill in the skb
as much as possible, breaking when nla_put returns an error. Then, they
get called again and start filling out the next skb, and again, and so
forth. The mechanism at work here is the ability for the iterative
dumping function to detect when the skb is filled up and not fill it
past the brim, waiting for a fresh skb for the rest of the data.

However, if the attributes are small and nicely packed, it is possible
that a dump callback function successfully fills in attributes until the
skb is of size 4080 (libmnl's default page-sized receive buffer size).
The dump function completes, satisfied, and then, if it happens to be
that this is actually the last skb, and no further ones are to be sent,
then netlink_dump will add on the NLMSG_DONE part:

  nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);

It is very important that netlink_dump does this, of course. However, in
this example, that call to nlmsg_put_answer will fail, because the
previous filling by the dump function did not leave it enough room. And
how could it possibly have done so? All of the nla_put variety of
functions simply check to see if the skb has enough tailroom,
independent of the context it is in.

In order to keep the important assumptions of all netlink dump users, it
is therefore important to give them an skb that has this end part of the
tail already reserved, so that the call to nlmsg_put_answer does not
fail. Otherwise, library authors are forced to find some bizarre sized
receive buffer that has a large modulo relative to the common sizes of
messages received, which is ugly and buggy.

This patch thus saves the NLMSG_DONE for an additional message, for the
case that things are dangerously close to the brim. This requires
keeping track of the errno from ->dump() across calls.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Michael Lyle
5d62da3a8e bio: ensure __bio_clone_fast copies bi_partno
commit 62530ed8b1 upstream.

A new field was introduced in 74d46992e0, bi_partno, instead of using
bdev->bd_contains and encoding the partition information in the bi_bdev
field.  __bio_clone_fast was changed to copy the disk information, but
not the partition information.  At minimum, this regressed bcache and
caused data corruption.

Signed-off-by: Michael Lyle <mlyle@lyle.org>
Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reported-by: Pavel Goran <via-bcache@pvgoran.name>
Reported-by: Campbell Steven <casteven@gmail.com>
Reviewed-by: Coly Li <colyli@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:03 +01:00
Greg Kroah-Hartman
780a781dd6 Linux 4.14.1 2017-11-21 09:49:25 +01:00
Nitin Gupta
ea826dbc30 sparc64: Fix page table walk for PUD hugepages
[ Upstream commit 70f3c8b7c2 ]

For a PUD hugepage entry, we need to propagate bits [32:22]
from virtual address to resolve at 4M granularity. However,
the current code was incorrectly propagating bits [29:19].
This bug can cause incorrect data to be returned for pages
backed with 16G hugepages.

Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:25 +01:00
Guenter Roeck
e4b57c4bc1 sparc64: mmu_context: Add missing include files
commit 01c3f0a42a upstream.

Fix the following build errors.

In file included from arch/sparc/include/asm/mmu_context.h:4:0,
                 from include/linux/mmu_context.h:4,
		 from drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:29,
		 from drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:23:
arch/sparc/include/asm/mmu_context_64.h:22:37: error:
	unknown type name 'per_cpu_secondary_mm'

arch/sparc/include/asm/mmu_context_64.h: In function 'switch_mm':
arch/sparc/include/asm/mmu_context_64.h:79:2: error:
	implicit declaration of function 'smp_processor_id'

Fixes: 70539bd795 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
David S. Miller
05e15a59ee sparc32: Add cmpxchg64().
commit 23198ddffb upstream.

This fixes the build with i40e driver enabled.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
Johan Hovold
9aacd82304 spi: fix use-after-free at controller deregistration
commit 67f7b2781f upstream.

The controller is typically freed as part of device_unregister() so
store the bus id before deregistration to avoid use-after-free when the
id is later released.

Fixes: 9b61e30221 ("spi: Pick spi bus number from Linux idr or spi alias")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
Hans de Goede
d3e36fd07b staging: rtl8188eu: Revert 4 commits breaking ARP
commit 66d32fdcbf upstream.

Commit 2ba8444c97 ("staging:r8188eu: move IV/ICV trimming into
decrypt() and also place it after rtl88eu_mon_recv_hook()") breaks ARP.

After this commit ssh-ing to a laptop with r8188eu wifi no longer works
if the machine connecting has never communicated with the laptop before.
This is 100% reproducable using "arp -d <ipv4> && ssh <ipv4>" to ssh to
a laptop with r8188eu wifi.

This commit reverts 4 commits in total:

1. Commit 79650ffde3 ("staging:r8188eu: trim IV/ICV fields in
   validate_recv_data_frame()")
This commit depends on 2 of the other commits being reverted.

2. Commit 02b19b4c49 ("staging:r8188eu: inline unprotect_frame() in
   mon_recv_decrypted_recv()")
The inline code is wrong the un-inlined version contains:
	if (skb->len < hdr_len + iv_len + icv_len)
		return;
	...
Where as the inline-ed code introduced by this commit does:
	if (skb->len < hdr_len + iv_len + icv_len) {
		...
Note the same check, but now to actually continue doing ... instead
of to not do it, so this commit is no good.

3. Commit d86e16da6a ("staging:r8188eu: use different mon_recv_decrypted()
   inside rtl88eu_mon_recv_hook() and rtl88eu_mon_xmit_hook().")
This commit introduced a 1:1 copy of a function so that one of the
2 copies can be modified in the 2 commits we're already reverting.

4. Commit 2ba8444c97 ("staging:r8188eu: move IV/ICV trimming into
   decrypt() and also place it after rtl88eu_mon_recv_hook()")
This is the commit actually breaking ARP.

Note this commit is a straight-forward squash of the revert of these
4 commits, without any changes.

Cc: Ivan Safonov <insafonov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
Hans de Goede
ccc04bde3a staging: vboxvideo: Fix reporting invalid suggested-offset-properties
commit ce10d7b4e8 upstream.

The x and y hints receives from the host are unsigned 32 bit integers and
they get set to -1 (0xffffffff) when invalid. Before this commit the
vboxvideo driver was storing them in an u16 causing the -1 to be truncated
to 65535 which, once reported to userspace, was breaking gnome 3.26+
in Wayland mode.

This commit stores the host values in 32 bit variables, removing the
truncation and checks for -1, replacing it with 0 as -1 is not a valid
suggested-offset-property value. Likewise the properties are now
initialized to 0 instead of -1, since -1 is not a valid value.
This fixes gnome 3.26+ in Wayland mode not working with the vboxvideo
driver.

Reported-by: Gianfranco Costamagna <locutusofborg@debian.org>
Cc: Michael Thayer <michael.thayer@oracle.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
Johan Hovold
5abe8d4847 staging: greybus: spilib: fix use-after-free after deregistration
commit 770b03c2ca upstream.

Remove erroneous spi_master_put() after controller deregistration which
would access the already freed spi controller.

Note that spi_unregister_master() drops our only controller reference.

Fixes: ba3e67001b ("greybus: SPI: convert to a gpbridge driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00
Gilad Ben-Yossef
75559e0f48 staging: ccree: fix 64 bit scatter/gather DMA ops
commit e0b3f39092 upstream.

Fix a wrong offset used in splitting a 64 DMA address to MSB/LSB
parts needed for scatter/gather HW descriptors causing operations
relying on them to fail on 64 bit platforms.

Fixes: c6f7f2f459 ("staging: ccree: refactor LLI access macros")
Reported-by: Stuart Yoder <stuart.yoder@arm.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-21 09:49:24 +01:00