Commit graph

826607 commits

Author SHA1 Message Date
Dan Williams
6041186a32 init: initialize jump labels before command line option parsing
When a module option, or core kernel argument, toggles a static-key it
requires jump labels to be initialized early.  While x86, PowerPC, and
ARM64 arrange for jump_label_init() to be called before parse_args(),
ARM does not.

  Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
  page_alloc_shuffle+0x12c/0x1ac
  static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
  before call to jump_label_init()
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted
  5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1
  Hardware name: ARM Integrator/CP (Device Tree)
  [<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18)
  [<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24)
  [<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108)
  [<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c)
  [<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>]
  (page_alloc_shuffle+0x12c/0x1ac)
  [<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48)
  [<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350)
  [<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488)

Move the fallback call to jump_label_init() to occur before
parse_args().

The redundant calls to jump_label_init() in other archs are left intact
in case they have static key toggling use cases that are even earlier
than option parsing.

Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Guenter Roeck <groeck@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Sergey Senozhatsky
8f4a8c12ca kernel/watchdog_hld.c: hard lockup message should end with a newline
Separate print_modules() and hard lockup error message.

Before the patch:

  NMI watchdog: Watchdog detected hard LOCKUP on cpu 1Modules linked in: nls_cp437

Link: http://lkml.kernel.org/r/20190412062557.2700-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Mark Rutland
40453c4f9b kcov: improve CONFIG_ARCH_HAS_KCOV help text
The help text for CONFIG_ARCH_HAS_KCOV is stale, and describes the
feature as being enabled only for x86_64, when it is now enabled for
several architectures, including arm, arm64, powerpc, and s390.

Let's remove that stale help text, and update it along the lines of hat
for ARCH_HAS_FORTIFY_SOURCE, better describing when an architecture
should select CONFIG_ARCH_HAS_KCOV.

Link: http://lkml.kernel.org/r/20190412102733.5154-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Johannes Weiner
3b991208b8 mm: fix inactive list balancing between NUMA nodes and cgroups
During !CONFIG_CGROUP reclaim, we expand the inactive list size if it's
thrashing on the node that is about to be reclaimed.  But when cgroups
are enabled, we suddenly ignore the node scope and use the cgroup scope
only.  The result is that pressure bleeds between NUMA nodes depending
on whether cgroups are merely compiled into Linux.  This behavioral
difference is unexpected and undesirable.

When the refault adaptivity of the inactive list was first introduced,
there were no statistics at the lruvec level - the intersection of node
and memcg - so it was better than nothing.

But now that we have that infrastructure, use lruvec_page_state() to
make the list balancing decision always NUMA aware.

[hannes@cmpxchg.org: fix bisection hole]
  Link: http://lkml.kernel.org/r/20190417155241.GB23013@cmpxchg.org
Link: http://lkml.kernel.org/r/20190412144438.2645-1-hannes@cmpxchg.org
Fixes: 2a2e48854d ("mm: vmscan: fix IO/refault regression in cache workingset transition")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Qian Cai
1a9f219157 mm/hotplug: treat CMA pages as unmovable
has_unmovable_pages() is used by allocating CMA and gigantic pages as
well as the memory hotplug.  The later doesn't know how to offline CMA
pool properly now, but if an unused (free) CMA page is encountered, then
has_unmovable_pages() happily considers it as a free memory and
propagates this up the call chain.  Memory offlining code then frees the
page without a proper CMA tear down which leads to an accounting issues.
Moreover if the same memory range is onlined again then the memory never
gets back to the CMA pool.

State after memory offline:

 # grep cma /proc/vmstat
 nr_free_cma 205824

 # cat /sys/kernel/debug/cma/cma-kvm_cma/count
 209920

Also, kmemleak still think those memory address are reserved below but
have already been used by the buddy allocator after onlining.  This
patch fixes the situation by treating CMA pageblocks as unmovable except
when has_unmovable_pages() is called as part of CMA allocation.

  Offlined Pages 4096
  kmemleak: Cannot insert 0xc000201f7d040008 into the object search tree (overlaps existing)
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    create_object+0x344/0x380
    __kmalloc_node+0x3ec/0x860
    kvmalloc_node+0x58/0x110
    seq_read+0x41c/0x620
    __vfs_read+0x3c/0x70
    vfs_read+0xbc/0x1a0
    ksys_read+0x7c/0x140
    system_call+0x5c/0x70
  kmemleak: Kernel memory leak detector disabled
  kmemleak: Object 0xc000201cc8000000 (size 13757317120):
  kmemleak:   comm "swapper/0", pid 0, jiffies 4294937297
  kmemleak:   min_count = -1
  kmemleak:   count = 0
  kmemleak:   flags = 0x5
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       cma_declare_contiguous+0x2a4/0x3b0
       kvm_cma_reserve+0x11c/0x134
       setup_arch+0x300/0x3f8
       start_kernel+0x9c/0x6e8
       start_here_common+0x1c/0x4b0
  kmemleak: Automatic memory scanning thread ended

[cai@lca.pw: use is_migrate_cma_page() and update commit log]
  Link: http://lkml.kernel.org/r/20190416170510.20048-1-cai@lca.pw
Link: http://lkml.kernel.org/r/20190413002623.8967-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Alexey Dobriyan
68545aa1cd proc: fixup proc-pid-vm test
Silly sizeof(pointer) vs sizeof(uint8_t[]) bug.

Link: http://lkml.kernel.org/r/20190414123009.GA12971@avx2
Fixes: e483b02087 ("proc: test /proc/*/maps, smaps, smaps_rollup, statm")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Alexey Dobriyan
8cd40d1d41 proc: fix map_files test on F29
F29 bans mapping first 64KB even for root making test fail.  Iterate
from address 0 until mmap() works.

Gentoo (root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0

Gentoo (non-root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EPERM (Operation not permitted)
	mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x1000

F29 (root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x2000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x3000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x4000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x5000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x6000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x7000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x8000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x9000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xa000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xb000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xc000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xd000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xe000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xf000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x10000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x10000

Now all proc tests succeed on F29 if run as root, at last!

Link: http://lkml.kernel.org/r/20190414123612.GB12971@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Konstantin Khlebnikov
e8277b3b52 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
Commit 58bc4c34d2 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
depends on skipping vmstat entries with empty name introduced in
7aaf772723 ("mm: don't show nr_indirectly_reclaimable in
/proc/vmstat") but reverted in b29940c1ab ("mm: rename and change
semantics of nr_indirectly_reclaimable_bytes").

So skipping no longer works and /proc/vmstat has misformatted lines " 0".

This patch simply shows debug counters "nr_tlb_remote_*" for UP.

Link: http://lkml.kernel.org/r/155481488468.467.4295519102880913454.stgit@buzz
Fixes: 58bc4c34d2 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <guro@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
zhong jiang
37803841c9 mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
When adding memory by probing a memory block in the sysfs interface,
there is an obvious issue where we will unlock the device_hotplug_lock
when we failed to takes it.

That issue was introduced in 8df1d0e4a2 ("mm/memory_hotplug: make
add_memory() take the device_hotplug_lock").

We should drop out in time when failing to take the device_hotplug_lock.

Link: http://lkml.kernel.org/r/1554696437-9593-1-git-send-email-zhongjiang@huawei.com
Fixes: 8df1d0e4a2 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reported-by: Yang yingliang <yangyingliang@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins
af53d3e9e0 mm: swapoff: shmem_unuse() stop eviction without igrab()
The igrab() in shmem_unuse() looks good, but we forgot that it gives no
protection against concurrent unmounting: a point made by Konstantin
Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae
("tmpfs: fix race between umount and swapoff").  The current 5.1-rc
swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs.
Self-destruct in 5 seconds.  Have a nice day..." followed by GPF.

Once again, give up on using igrab(); but don't go back to making such
heavy-handed use of shmem_swaplist_mutex as last time: that would spoil
the new design, and I expect could deadlock inside shmem_swapin_page().

Instead, shmem_unuse() just raise a "stop_eviction" count in the shmem-
specific inode, and shmem_evict_inode() wait for that to go down to 0.
Call it "stop_eviction" rather than "swapoff_busy" because it can be put
to use for others later (huge tmpfs patches expect to use it).

That simplifies shmem_unuse(), protecting it from both unlink and
unmount; and in practice lets it locate all the swap in its first try.
But do not rely on that: there's still a theoretical case, when
shmem_writepage() might have been preempted after its get_swap_page(),
before making the swap entry visible to swapoff.

[hughd@google.com: remove incorrect list_del()]
  Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904091133570.1898@eggly.anvils
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081259400.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins
64165b1aff mm: swapoff: take notice of completion sooner
The old try_to_unuse() implementation was driven by find_next_to_unuse(),
which terminated as soon as all the swap had been freed.

Add inuse_pages checks now (alongside signal_pending()) to stop scanning
mms and swap_map once finished.

The same ought to be done in shmem_unuse() too, but never was before,
and needs a different interface: so leave it as is for now.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081258200.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins
dd862deb15 mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
SWAP_UNUSE_MAX_TRIES 3 appeared to work well in earlier testing, but
further testing has proved it to be a source of unnecessary swapoff
EBUSY failures (which can then be followed by unmount EBUSY failures).

When mmget_not_zero() or shmem's igrab() fails, there is an mm exiting
or inode being evicted, freeing up swap independent of try_to_unuse().
Those typically completed much sooner than the old quadratic swapoff,
but now it's more common that swapoff may need to wait for them.

It's possible to move those cases from init_mm.mmlist and shmem_swaplist
to separate "exiting" swaplists, and try_to_unuse() then wait for those
lists to be emptied; but we've not bothered with that in the past, and
don't want to risk missing some other forgotten case.  So just revert to
cycling around until the swap is gone, without any retries limit.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081256170.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins
8703954654 mm: swapoff: shmem_find_swap_entries() filter out other types
Swapfile "type" was passed all the way down to shmem_unuse_inode(), but
then forgotten from shmem_find_swap_entries(): with the result that
removing one swapfile would try to free up all the swap from shmem - no
problem when only one swapfile anyway, but counter-productive when more,
causing swapoff to be unnecessarily OOM-killed when it should succeed.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081254470.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Qian Cai
1a62b18d51 slab: store tagged freelist for off-slab slabmgmt
Commit 51dedad06b ("kasan, slab: make freelist stored without tags")
calls kasan_reset_tag() for off-slab slab management object leading to
freelist being stored non-tagged.

However, cache_grow_begin() calls alloc_slabmgmt() which calls
kmem_cache_alloc_node() assigns a tag for the address and stores it in
the shadow address.  As the result, it causes endless errors below
during boot due to drain_freelist() -> slab_destroy() ->
kasan_slab_free() which compares already untagged freelist against the
stored tag in the shadow address.

Since off-slab slab management object freelist is such a special case,
just store it tagged.  Non-off-slab management object freelist is still
stored untagged which has not been assigned a tag and should not cause
any other troubles with this inconsistency.

  BUG: KASAN: double-free or invalid-free in slab_destroy+0x84/0x88
  Pointer tag: [ff], memory tag: [99]

  CPU: 0 PID: 1376 Comm: kworker/0:4 Tainted: G        W 5.1.0-rc3+ #8
  Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.0.6 07/10/2018
  Workqueue: cgroup_destroy css_killed_work_fn
  Call trace:
   print_address_description+0x74/0x2a4
   kasan_report_invalid_free+0x80/0xc0
   __kasan_slab_free+0x204/0x208
   kasan_slab_free+0xc/0x18
   kmem_cache_free+0xe4/0x254
   slab_destroy+0x84/0x88
   drain_freelist+0xd0/0x104
   __kmem_cache_shrink+0x1ac/0x224
   __kmemcg_cache_deactivate+0x1c/0x28
   memcg_deactivate_kmem_caches+0xa0/0xe8
   memcg_offline_kmem+0x8c/0x3d4
   mem_cgroup_css_offline+0x24c/0x290
   css_killed_work_fn+0x154/0x618
   process_one_work+0x9cc/0x183c
   worker_thread+0x9b0/0xe38
   kthread+0x374/0x390
   ret_from_fork+0x10/0x18

  Allocated by task 1625:
   __kasan_kmalloc+0x168/0x240
   kasan_slab_alloc+0x18/0x20
   kmem_cache_alloc_node+0x1f8/0x3a0
   cache_grow_begin+0x4fc/0xa24
   cache_alloc_refill+0x2f8/0x3e8
   kmem_cache_alloc+0x1bc/0x3bc
   sock_alloc_inode+0x58/0x334
   alloc_inode+0xb8/0x164
   new_inode_pseudo+0x20/0xec
   sock_alloc+0x74/0x284
   __sock_create+0xb0/0x58c
   sock_create+0x98/0xb8
   __sys_socket+0x60/0x138
   __arm64_sys_socket+0xa4/0x110
   el0_svc_handler+0x2c0/0x47c
   el0_svc+0x8/0xc

  Freed by task 1625:
   __kasan_slab_free+0x114/0x208
   kasan_slab_free+0xc/0x18
   kfree+0x1a8/0x1e0
   single_release+0x7c/0x9c
   close_pdeo+0x13c/0x43c
   proc_reg_release+0xec/0x108
   __fput+0x2f8/0x784
   ____fput+0x1c/0x28
   task_work_run+0xc0/0x1b0
   do_notify_resume+0xb44/0x1278
   work_pending+0x8/0x10

  The buggy address belongs to the object at ffff809681b89e00
   which belongs to the cache kmalloc-128 of size 128
  The buggy address is located 0 bytes inside of
   128-byte region [ffff809681b89e00, ffff809681b89e80)
  The buggy address belongs to the page:
  page:ffff7fe025a06e00 count:1 mapcount:0 mapping:01ff80082000fb00
  index:0xffff809681b8fe04
  flags: 0x17ffffffc000200(slab)
  raw: 017ffffffc000200 ffff7fe025a06d08 ffff7fe022ef7b88 01ff80082000fb00
  raw: ffff809681b8fe04 ffff809681b80000 00000001000000e0 0000000000000000
  page dumped because: kasan: bad access detected
  page allocated via order 0, migratetype Unmovable, gfp_mask
  0x2420c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE)
   prep_new_page+0x4e0/0x5e0
   get_page_from_freelist+0x4ce8/0x50d4
   __alloc_pages_nodemask+0x738/0x38b8
   cache_grow_begin+0xd8/0xa24
   ____cache_alloc_node+0x14c/0x268
   __kmalloc+0x1c8/0x3fc
   ftrace_free_mem+0x408/0x1284
   ftrace_free_init_mem+0x20/0x28
   kernel_init+0x24/0x548
   ret_from_fork+0x10/0x18

  Memory state around the buggy address:
   ffff809681b89c00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
   ffff809681b89d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  >ffff809681b89e00: 99 99 99 99 99 99 99 99 fe fe fe fe fe fe fe fe
                     ^
   ffff809681b89f00: 43 43 43 43 43 fe fe fe fe fe fe fe fe fe fe fe
   ffff809681b8a000: 6d fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe

Link: http://lkml.kernel.org/r/20190403022858.97584-1-cai@lca.pw
Fixes: 51dedad06b ("kasan, slab: make freelist stored without tags")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Christian König
bd4264112f drm/ttm: fix re-init of global structures
When a driver unloads without unloading TTM we don't correctly
clear the global structures leading to errors on re-init.

Next step should probably be to remove the global structures and
kobjs all together, but this is tricky since we need to maintain
backward compatibility.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
CC: stable@vger.kernel.org # 5.0.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-19 11:11:20 -05:00
Andi Kleen
1de7edbb59 x86/cpu/bugs: Use __initconst for 'const' init data
Some of the recently added const tables use __initdata which causes section
attribute conflicts.

Use __initconst instead.

Fixes: fa1202ef22 ("x86/speculation: Add command line control")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190330004743.29541-9-andi@firstfloor.org
2019-04-19 17:11:39 +02:00
Masami Hiramatsu
b191fa96ea x86/kprobes: Avoid kretprobe recursion bug
Avoid kretprobe recursion loop bg by setting a dummy
kprobes to current_kprobe per-CPU variable.

This bug has been introduced with the asm-coded trampoline
code, since previously it used another kprobe for hooking
the function return placeholder (which only has a nop) and
trampoline handler was called from that kprobe.

This revives the old lost kprobe again.

With this fix, we don't see deadlock anymore.

And you can see that all inner-called kretprobe are skipped.

  event_1                                  235               0
  event_2                                19375           19612

The 1st column is recorded count and the 2nd is missed count.
Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed)
(some difference are here because the counter is racy)

Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: c9becf58d9 ("[PATCH] kretprobe: kretprobe-booster")
Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:07 +02:00
Masami Hiramatsu
fabe38ab6b kprobes: Mark ftrace mcount handler functions nokprobe
Mark ftrace mcount handler functions nokprobe since
probing on these functions with kretprobe pushes
return address incorrectly on kretprobe shadow stack.

Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:06 +02:00
Masami Hiramatsu
3ff9c075cc x86/kprobes: Verify stack frame on kretprobe
Verify the stack frame pointer on kretprobe trampoline handler,
If the stack frame pointer does not match, it skips the wrong
entry and tries to find correct one.

This can happen if user puts the kretprobe on the function
which can be used in the path of ftrace user-function call.
Such functions should not be probed, so this adds a warning
message that reports which function should be blacklisted.

Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:05 +02:00
Andrew Morton
b50776ae01 locking/atomics: Don't assume that scripts are executable
patch(1) doesn't set the x bit on files.  So if someone downloads and
applies patch-4.21.xz, their kernel won't build.  Fix that by executing
/bin/sh.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:21:43 +02:00
Guoqing Jiang
c53051128b sc16is7xx: put err_spi and err_i2c into correct #ifdef
err_spi is only called within SERIAL_SC16IS7XX_SPI
while err_i2c is called inside SERIAL_SC16IS7XX_I2C.
So we need to put err_spi and err_i2c into each #ifdef
accordingly.

This change fixes ("sc16is7xx: move label 'err_spi'
to correct section").

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-19 14:09:23 +02:00
Christoph Hellwig
144ec97493 scsi: aic7xxx: fix EISA support
Instead of relying on the now removed NULL argument to
pci_alloc_consistent, switch to the generic DMA API, and store the struct
device so that we can pass it.

Fixes: 4167b2ad51 ("PCI: Remove NULL device handling from PCI DMA API")
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:43:10 -04:00
Saurav Kashyap
0228034d8e Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
This patch clears FC_RP_STARTED flag during logoff, because of this
re-login(flogi) didn't happen to the switch.

This reverts commit 1550ec458e.

Fixes: 1550ec458e ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO")
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Hannes Reinecke <hare@#suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:40:00 -04:00
Jakub Kicinski
9188d5ca45 net/tls: fix refcount adjustment in fallback
Unlike atomic_add(), refcount_add() does not deal well
with a negative argument.  TLS fallback code reallocates
the skb and is very likely to shrink the truesize, leading to:

[  189.513254] WARNING: CPU: 5 PID: 0 at lib/refcount.c:81 refcount_add_not_zero_checked+0x15c/0x180
 Call Trace:
  refcount_add_checked+0x6/0x40
  tls_enc_skb+0xb93/0x13e0 [tls]

Once wmem_allocated count saturates the application can no longer
send data on the socket.  This is similar to Eric's fixes for GSO,
TCP:
commit 7ec318feee ("tcp: gso: avoid refcount_t warning from tcp_gso_segment()")
and UDP:
commit 575b65bc5b ("udp: avoid refcount_t saturation in __udp_gso_segment()").

Unlike the GSO case, for TLS fallback it's likely that the skb has
shrunk, so the "likely" annotation is the other way around (likely
branch being "sub").

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 16:51:03 -07:00
Su Bao Cheng
e0c1d14a1a stmmac: pci: Adjust IOT2000 matching
Since there are more IOT2040 variants with identical hardware but
different asset tags, the asset tag matching should be adjusted to
support them.

For the board name "SIMATIC IOT2000", currently there are 2 types of
hardware, IOT2020 and IOT2040. The IOT2020 is identified by its unique
asset tag. Match on it first. If we then match on the board name only,
we will catch all IOT2040 variants. In the future there will be no other
devices with the "SIMATIC IOT2000" DMI board name but different
hardware.

Signed-off-by: Su Bao Cheng <baocheng.su@siemens.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 11:48:02 -07:00
Colin Ian King
a7cf2cc3cd firestream: fix spelling mistake "tramsitted" -> "transmitted"
There is a spelling mistake in a debug message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 11:38:22 -07:00
Colin Ian King
d5f6db3538 net: ipv6: addrlabel: fix spelling mistake "requewst" -> "request"
There is a spelling mistake in a NL_SET_ERR_MSG_MOD error message,
fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 10:44:17 -07:00
David S. Miller
7275a7edf9 Merge branch 'mlxsw-Few-small-fixes'
Ido Schimmel says:

====================
mlxsw: Few small fixes

Patch #1, from Petr, adjusts mlxsw to provide the same QoS behavior for
both Spectrum-1 and Spectrum-2. The fix is required due to a difference
in the behavior of Spectrum-2 compared to Spectrum-1. The problem and
solution are described in the detail in the changelog.

Patch #2 increases the time period in which the driver waits for the
firmware to signal it has finished its initialization. The issue will be
fixed in future firmware versions and the timeout will be decreased.

Patch #3, from Amit, fixes a display problem where the autoneg status in
ethtool is not updated in case the netdev is not running.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 10:37:30 -07:00
Amit Cohen
151f0dddbb mlxsw: spectrum: Fix autoneg status in ethtool
If link is down and autoneg is set to on/off, the status in ethtool does
not change.

The reason is when the link is down the function returns with zero
before changing autoneg value.

Move the checking of link state (up/down) to be performed after setting
autoneg value, in order to be sure that autoneg will change in any case.

Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 10:37:30 -07:00
Ido Schimmel
1ab3030193 mlxsw: pci: Reincrease PCI reset timeout
During driver initialization the driver sends a reset to the device and
waits for the firmware to signal that it is ready to continue.

Commit d2f372ba09 ("mlxsw: pci: Increase PCI SW reset timeout")
increased the timeout to 13 seconds due to longer PHY calibration in
Spectrum-2 compared to Spectrum-1.

Recently it became apparent that this timeout is too short and therefore
this patch increases it again to a safer limit that will be reduced in
the future.

Fixes: c3ab435466 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Fixes: d2f372ba09 ("mlxsw: pci: Increase PCI SW reset timeout")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 10:37:30 -07:00
Petr Machata
f476b3f809 mlxsw: spectrum: Put MC TCs into DWRR mode
Both Spectrum-1 and Spectrum-2 chips are currently configured such that
pairs of TC n (which is used for UC traffic) and TC n+8 (which is used
for MC traffic) are feeding into the same subgroup. Strict
prioritization is configured between the two TCs, and by enabling
MC-aware mode on the switch, the lower-numbered (UC) TCs are favored
over the higher-numbered (MC) TCs.

On Spectrum-2 however, there is an issue in configuration of the
MC-aware mode. As a result, MC traffic is prioritized over UC traffic.
To work around the issue, configure the MC TCs with DWRR mode (while
keeping the UC TCs in strict mode).

With this patch, the multicast-unicast arbitration results in the same
behavior on both Spectrum-1 and Spectrum-2 chips.

Fixes: 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-18 10:37:30 -07:00
Linus Torvalds
6d906f9981 Avoid compiler uninitialised warning introduced by recent arm64 futex fix
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAly4sd0ACgkQa9axLQDI
 XvGEbw/+I3511KdExOVY3MSubeSOUmY6sPmTkTOp/Gl1DxFK8GkWGQ4QrZupimqg
 YREir7frY6Mufd6qknQClzBlcOfi1tjWe7hjW1bFLHdI3aG1Aui25RsyDsRZkHV7
 p5HWX48fNgBwWtx6z/4jIYjbqt5pzNnNxxhvC7wMttMDVgZUqSnGT+cucUbh4hth
 LrnsoyvRSSUQqtkls64TQIWMfd3/cMzxtxdi6BZltv6C//LKeaZiEZtf2jGecdMT
 K2FINRbxp8Cya/pZRWHSasuhlAstV5l+8Vc9C9P/n1I5Xcxr1PwQAt53mQpu5zwf
 lQlJRzUKnWhwXyedvZe17hPoiByqn07h9JkFBu+ZLBu3ZfY/sEst3pWfyRs6fokb
 8YXnD+lcywU0tlitbZSzYlH/QmFc9Df7P6XUx57zPc9cqZj9Meolqg9VWc+xs5iX
 sr2MhLr0tUIvQZ9DI3FmcgZ1Xuoq7suCqn2/tQ715pb3zoDsVPog4wQHT95CyRWP
 9N8MzGcwIRnXsNe5gw47/ipyle8DN5nKi7pklYsnx6aYLqJQbgc5lJX/Czbv5cTu
 /VgFoCC4CW2vH/o2sccuc95+3JPXvzPXQCB75fpCPWG2Z3AAjxewAIUqLZ0LgEEp
 +rqnczLPWrSXEa6VDqRCwPPvrQRzSJ0YD4Ol2yyH894xOK9CBOs=
 =Z+9W
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Avoid compiler uninitialised warning introduced by recent arm64 futex
  fix"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: futex: Restore oldval initialization to work around buggy compilers
2019-04-18 10:24:48 -07:00
Nathan Chancellor
ff8acf9290 arm64: futex: Restore oldval initialization to work around buggy compilers
Commit 045afc2412 ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with
non-zero result value") removed oldval's zero initialization in
arch_futex_atomic_op_inuser because it is not necessary. Unfortunately,
Android's arm64 GCC 4.9.4 [1] does not agree:

../kernel/futex.c: In function 'do_futex':
../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized
in this function [-Wmaybe-uninitialized]
   return oldval == cmparg;
                 ^
In file included from ../kernel/futex.c:73:0:
../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here
  int oldval, ret, tmp;
      ^

GCC fails to follow that when ret is non-zero, futex_atomic_op_inuser
returns right away, avoiding the uninitialized use that it claims.
Restoring the zero initialization works around this issue.

[1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/

Cc: stable@vger.kernel.org
Fixes: 045afc2412 ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-18 18:17:08 +01:00
Christian Brauner
738a7832d2 signal: use fdget() since we don't allow O_PATH
As stated in the original commit for pidfd_send_signal() we don't allow
to signal processes through O_PATH file descriptors since it is
semantically equivalent to a write on the pidfd.

We already correctly error out right now and return EBADF if an O_PATH
fd is passed.  This is because we use file->f_op to detect whether a
pidfd is passed and O_PATH fds have their file->f_op set to empty_fops
in do_dentry_open() and thus fail the test.

Thus, there is no regression.  It's just semantically correct to use
fdget() and return an error right from there instead of taking a
reference and returning an error later.

Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jann@thejh.net>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-18 08:35:18 -07:00
Linus Torvalds
d22113a2cd s390 update with bug fixes for 5.1-rc6
- Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED
 
  - Fix integer overflow in the dasd driver resulting in incorrect number
    of blocks for large devices
 
  - Fix a lockdep false positive in the 3270 driver
 
  - Fix a deadlock in the zcrypt driver
 
  - Fix incorrect debug feature entries in the pkey api
 
  - Fix inline assembly constraints fallout with CONFIG_KASAN=y
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJcuIwlAAoJEDjwexyKj9rgCJEH/Agbx2sJlxApuKC6BMAjPZCy
 dJUu7hmumgtL+QI7IqQMRgfuxiC16xpJQd14GvfNoyjX1NVajepYI2+OUZHj7F6s
 La0fq/16fRriVByLiL+pqjV2y0c5kQ8Uhe15vwEvSmr3T/NgPLqUbIe5pwY5KS59
 O6B3WpVKeIb8UOUlEAQnezTYj70jWMT783NTyzhPcYLkiN7JJ7vI21FKPRM88ElS
 Q7f9qxGPv0hGsOqBXCBWztxgx+ezEHsnlH4t+he0tLOGXuuiy4BrgGtgbJx17frj
 xNNe8gOXV83bUbYTBygITnjMjS0m/1viJ4XTBgv/HLGgcKG06n3v6U4+zefB4N4=
 =+41g
 -----END PGP SIGNATURE-----

Merge tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 bug fixes from Martin Schwidefsky:

 - Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED

 - Fix integer overflow in the dasd driver resulting in incorrect number
   of blocks for large devices

 - Fix a lockdep false positive in the 3270 driver

 - Fix a deadlock in the zcrypt driver

 - Fix incorrect debug feature entries in the pkey api

 - Fix inline assembly constraints fallout with CONFIG_KASAN=y

* tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: correct some inline assembly constraints
  s390/pkey: add one more argument space for debug feature entry
  s390/zcrypt: fix possible deadlock situation on ap queue remove
  s390/3270: fix lockdep false positive on view->lock
  s390/dasd: Fix capacity calculation for large volumes
  s390/mem_detect: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD)
2019-04-18 08:15:06 -07:00
Linus Torvalds
2a852fd1ac AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXLGSOfu3V2unywtrAQKrLA/+MJfgZC9ejNsXqjnnqq48LtBf4B3FhYrO
 qdPZMKPuy5XxSWBIIGpXtgK7PspkpAD1NCe7qpvibQoa1SJqwa2gc46aN7+gzp3e
 HETkb4UEXNV5yXSjEA+YOQVa0MiLG2yL6satlqhN4KqoDB5Rj9KoaPFwT79K4SIU
 1iME9Q//tA2M7SU8B1FWqyx5kd350A38KoXmHOux7x8K1ZyjEB2DbyKqZBtn6Y/T
 nfH9ZZioep1JbPFwOuG5NYQH0hP2jUj8ogDnrzzWzn1rQQTWs6wVfigwAh8tc/Dv
 eDKA8rV9WtZT97ExOG6cQOe0UZsM3NFhnb4P/oSCpl+jtr9WHO3pDmJPLToCr0BA
 8xXW7vA+Ejx15dNz4Snnz9yj3FS0UeURSFyhn3g2AFQMK4v9Dd3ZZKOgjR9Czi98
 f1UNh768LKY+siOHmoOhgZNlrI8bgamCmXOucPayGzrQ/HVHqAwR6PiUGCzaH+vk
 +LfDC/Z/ezSqrfQiH/Ji5lLkQkhmYiUTcePn5xdi+kZGWqHwl9qGplVAFUVafs5O
 Hz+H2F7i8F6Oc4v5hzCHp5aDUP2oEKmruaXJimN+O9SOkUPi72aG8STRpBFnBqf7
 77WxNVG/XDFL6I8HukkgRMRePFDXi+2NwGOjB9k8PpAqyA1Vi6sSkCXVkwgAhLRS
 NCaOR2iDaqE=
 =cqBe
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20190413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:

 - Stop using the deprecated get_seconds().

 - Don't make tracepoint strings const as the section they go in isn't
   read-only.

 - Differentiate failure due to unmarshalling from other failure cases.
   We shouldn't abort with RXGEN_CC/SS_UNMARSHAL if it's not due to
   unmarshalling.

 - Add a missing unlock_page().

 - Fix the interaction between receiving a notification from a server
   that it has invalidated all outstanding callback promises and a
   client call that we're in the middle of making that will get a new
   promise.

* tag 'afs-fixes-20190413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix in-progess ops to ignore server-level callback invalidation
  afs: Unlock pages for __pagevec_release()
  afs: Differentiate abort due to unmarshalling from other errors
  afs: Avoid section confusion in CM_NAME
  afs: avoid deprecated get_seconds()
2019-04-18 08:10:22 -07:00
Linus Torvalds
d3ce3b1879 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "Fix a bug in the implementation of the x86 accelerated version of
  poly1305"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/poly1305 - fix overflow during partial reduction
2019-04-18 08:04:10 -07:00
Linus Torvalds
95ea55291e drm tegra, amdgpu fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJct/CWAAoJEAx081l5xIa+jkAP/0uiIUunGeRIGbNzbNSaIHcH
 E87Ggq5FB3bSIq7LY13JffrTYfeZRCTjpqVhqnOuqLPxrzxnjXmX9z+OLM2tuP5e
 paPYPtJSUCEBWf5Y8nHysIE4iKjS1Cc8HACfj/3LyYBRzG7x6bYG6qe4y0GIhyVP
 xypYuLpyYDGl0Ixgmk5MstO5UCOe2fnkyrvlw2Aoguk8sZcHqIZK0DJRO0teCx7t
 dgqO5x4JlsC26aCNLy0v0lhcmEXwCDgQLvSZlrS2c7OLiTWurlQJkdOvL9bx/ji2
 mShuuTqVNs9CVn7iIqFToOLMKgX0ZYmTD3W5367/plTdbd7DT6nkfyIjKpXpzmtH
 L2NrTl+Z8yiUlU4E4hhN9eGara4p5J9wI95SAQyvCADCB2LhfPuSSPpVBkjbAgl6
 8vLGASEyzR98337MvdTXcMeVXMwpnVbPTiTANeihMKDfIBN6LqL/fCFsxpcFQuTg
 NcrkmHLHAof80M7rYbjF6ECSFI7/uWMDRpLmRszO31xaJzwvWIDmO0Q5JGx3q3i7
 gnJz2NJsmWv/+IXbY1toiF6F4x7joCfeVL3uwY0eNpiKIe2cvZcsuU5DollXS2va
 awsgfMM0x9vSSLUr/E7QTMEZX8x1BsIxePG7OrazltLG6T9xHEun+Xp7BvGovFcF
 y7qVkYrKbGvMZ1+lBlB0
 =nAcQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Since Easter is looming for me, I'm just pushing whatever is in my
  tree, I'll see what else turns up and maybe I'll send another pull
  early next week if there is anything.

  tegra:
   - stream id programming fix
   - avoid divide by 0 for bad hdmi audio setup code

  ttm:
   - Hugepages fix
   - refcount imbalance in error path fix

  amdgpu:
   - GPU VM fixes for Vega/RV
   - DC AUX fix for active DP-DVI dongles
   - DC fix for multihead regression"

* tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drm:
  drm/tegra: hdmi: Setup audio only if configured
  drm/amd/display: If one stream full updates, full update all planes
  drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming
  drm/amdgpu: shadow in shadow_list without tbo.mem.start cause page fault in sriov TDR
  gpu: host1x: Program stream ID to bypass without SMMU
  drm/amd/display: extending AUX SW Timeout
  drm/ttm: fix dma_fence refcount imbalance on error path
  drm/ttm: fix incrementing the page pointer for huge pages
  drm/ttm: fix start page for huge page check in ttm_put_pages()
  drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
2019-04-18 07:56:05 -07:00
Paul Kocialkowski
f5a9ed867c
drm/sun4i: Fix component unbinding and component master deletion
For our component-backed driver to be properly removed, we need to
delete the component master in sun4i_drv_remove and make sure to call
component_unbind_all in the master's unbind so that all components are
unbound when the master is.

Fixes: 9026e0d122 ("drm: Add Allwinner A10 Display Engine support")
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418132727.5128-4-paul.kocialkowski@bootlin.com
2019-04-18 16:26:53 +02:00
Paul Kocialkowski
02b92adbe3
drm/sun4i: Set device driver data at bind time for use in unbind
Our sun4i_drv_unbind gets the drm device using dev_get_drvdata.
However, that driver data is never set in sun4i_drv_bind.

Set it there to avoid getting a NULL pointer at unbind time.

Fixes: 9026e0d122 ("drm: Add Allwinner A10 Display Engine support")
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418132727.5128-3-paul.kocialkowski@bootlin.com
2019-04-18 16:26:36 +02:00
Paul Kocialkowski
71adf60f0a
drm/sun4i: Add missing drm_atomic_helper_shutdown at driver unbind
A call to drm_atomic_helper_shutdown is required to properly release
the internal references taken by the core and avoid warnings about
leaking objects. Add it since it was missing.

Fixes: 9026e0d122 ("drm: Add Allwinner A10 Display Engine support")
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418132727.5128-2-paul.kocialkowski@bootlin.com
2019-04-18 16:26:22 +02:00
Herbert Xu
b257b48cd5 crypto: lrw - Fix atomic sleep when walking skcipher
When we perform a walk in the completion function, we need to ensure
that it is atomic.

Fixes: ac3c8f36c3 ("crypto: lrw - Do not use auxiliary buffer")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-18 22:13:46 +08:00
Herbert Xu
44427c0fbc crypto: xts - Fix atomic sleep when walking skcipher
When we perform a walk in the completion function, we need to ensure
that it is atomic.

Reported-by: syzbot+6f72c20560060c98b566@syzkaller.appspotmail.com
Fixes: 78105c7e76 ("crypto: xts - Drop use of auxiliary buffer")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-18 22:13:46 +08:00
Chang-An Chen
3f2552f7e9 timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze()
tick_freeze() introduced by suspend-to-idle in commit 124cf9117c ("PM /
sleep: Make it possible to quiesce timers during suspend-to-idle") uses
timekeeping_suspend() instead of syscore_suspend() during
suspend-to-idle. As a consequence generic sched_clock will keep going
because sched_clock_suspend() and sched_clock_resume() are not invoked
during suspend-to-idle which can result in a generic sched_clock wrap.

On a ARM system with suspend-to-idle enabled, sched_clock is registered
as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which
means the real wrapping duration is 8796093022202ns.

[  134.551779] suspend-to-idle suspend (timekeeping_suspend())
[ 1204.912239] suspend-to-idle resume (timekeeping_resume())
......
[ 1206.912239] suspend-to-idle suspend (timekeeping_suspend())
[ 5880.502807] suspend-to-idle resume (timekeeping_resume())
......
[ 6000.403724] suspend-to-idle suspend (timekeeping_suspend())
[ 8035.753167] suspend-to-idle resume  (timekeeping_resume())
......
[ 8795.786684] (2)[321:charger_thread]......
[ 8795.788387] (2)[321:charger_thread]......
[    0.057226] (0)[0:swapper/0]......
[    0.061447] (2)[0:swapper/2]......

sched_clock was not stopped during suspend-to-idle, and sched_clock_poll
hrtimer was not expired because timekeeping_suspend() was invoked during
suspend-to-idle. It makes sched_clock wrap at kernel time 8796s.

To prevent this, invoke sched_clock_suspend() and sched_clock_resume() in
tick_freeze() together with timekeeping_suspend() and timekeeping_resume().

Fixes: 124cf9117c (PM / sleep: Make it possible to quiesce timers during suspend-to-idle)
Signed-off-by: Chang-An Chen <chang-an.chen@mediatek.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: <linux-mediatek@lists.infradead.org>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: <kuohong.wang@mediatek.com>
Cc: <freddy.hsin@mediatek.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1553828349-8914-1-git-send-email-chang-an.chen@mediatek.com
2019-04-18 14:34:53 +02:00
Kim Phillips
3fe3331bb2 perf/x86/amd: Add event map for AMD Family 17h
Family 17h differs from prior families by:

 - Does not support an L2 cache miss event
 - It has re-enumerated PMC counters for:
   - L2 cache references
   - front & back end stalled cycles

So we add a new amd_f17h_perfmon_event_map[] so that the generic
perf event names will resolve to the correct h/w events on
family 17h and above processors.

Reference sections 2.1.13.3.3 (stalls) and 2.1.13.3.6 (L2):

  https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: e40ed1542d ("perf/x86: Add perf support for AMD family-17h processors")
[ Improved the formatting a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-18 14:31:54 +02:00
Baoquan He
ec3937107a x86/mm/KASLR: Fix the size of the direct mapping section
kernel_randomize_memory() uses __PHYSICAL_MASK_SHIFT to calculate
the maximum amount of system RAM supported. The size of the direct
mapping section is obtained from the smaller one of the below two
values:

  (actual system RAM size + padding size) vs (max system RAM size supported)

This calculation is wrong since commit

  b83ce5ee91 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52").

In it, __PHYSICAL_MASK_SHIFT was changed to be 52, regardless of whether
the kernel is using 4-level or 5-level page tables. Thus, it will always
use 4 PB as the maximum amount of system RAM, even in 4-level paging
mode where it should actually be 64 TB.

Thus, the size of the direct mapping section will always
be the sum of the actual system RAM size plus the padding size.

Even when the amount of system RAM is 64 TB, the following layout will
still be used. Obviously KALSR will be weakened significantly.

   |____|_______actual RAM_______|_padding_|______the rest_______|
   0            64TB                                            ~120TB

Instead, it should be like this:

   |____|_______actual RAM_______|_________the rest______________|
   0            64TB                                            ~120TB

The size of padding region is controlled by
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, which is 10 TB by default.

The above issue only exists when
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING is set to a non-zero value,
which is the case when CONFIG_MEMORY_HOTPLUG is enabled. Otherwise,
using __PHYSICAL_MASK_SHIFT doesn't affect KASLR.

Fix it by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.

 [ bp: Massage commit message. ]

Fixes: b83ce5ee91 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: frank.ramsay@hpe.com
Cc: herbert@gondor.apana.org.au
Cc: kirill@shutemov.name
Cc: mike.travis@hpe.com
Cc: thgarnie@google.com
Cc: x86-ml <x86@kernel.org>
Cc: yamada.masahiro@socionext.com
Link: https://lkml.kernel.org/r/20190417083536.GE7065@MiWiFi-R3L-srv
2019-04-18 10:42:58 +02:00
Arnd Bergmann
27b141fc23 s390: ctcm: fix ctcm_new_device error return code
clang points out that the return code from this function is
undefined for one of the error paths:

../drivers/s390/net/ctcm_main.c:1595:7: warning: variable 'result' is used uninitialized whenever 'if' condition is true
      [-Wsometimes-uninitialized]
                if (priv->channel[direction] == NULL) {
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/s390/net/ctcm_main.c:1638:9: note: uninitialized use occurs here
        return result;
               ^~~~~~
../drivers/s390/net/ctcm_main.c:1595:3: note: remove the 'if' if its condition is always false
                if (priv->channel[direction] == NULL) {
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/s390/net/ctcm_main.c:1539:12: note: initialize the variable 'result' to silence this warning
        int result;
                  ^

Make it return -ENODEV here, as in the related failure cases.
gcc has a known bug in underreporting some of these warnings
when it has already eliminated the assignment of the return code
based on some earlier optimization step.

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:25:35 -07:00
Colin Ian King
d003d772e6 nfp: abm: fix spelling mistake "offseting" -> "offsetting"
There are a couple of spelling mistakes in NL_SET_ERR_MSG_MOD error
messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:22:26 -07:00
YueHaibing
f87db4dbd5 net: stmmac: Use bfsize1 in ndesc_init_rx_desc
gcc warn this:

drivers/net/ethernet/stmicro/stmmac/norm_desc.c: In function ndesc_init_rx_desc:
drivers/net/ethernet/stmicro/stmmac/norm_desc.c:138:6: warning: variable 'bfsize1' set but not used [-Wunused-but-set-variable]

Like enh_desc_init_rx_desc, we should use bfsize1
in ndesc_init_rx_desc to calculate 'p->des1'

Fixes: 583e636141 ("net: stmmac: use correct DMA buffer size in the RX descriptor")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:20:25 -07:00
Guy Levi
7249c8ea22 IB/mlx5: Fix scatter to CQE in DCT QP creation
When scatter to CQE is enabled on a DCT QP it corrupts the mailbox command
since it tried to treat it as as QP create mailbox command instead of a
DCT create command.

The corrupted mailbox command causes userspace to malfunction as the
device doesn't create the QP as expected.

A new mlx5 capability is exposed to user-space which ensures that it will
not enable the feature on DCT without this fix in the kernel.

Fixes: 5d6ff1babe ("IB/mlx5: Support scatter to CQE for DC transport type")
Signed-off-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-18 03:13:41 -03:00