Commit graph

779291 commits

Author SHA1 Message Date
Yang Shi
c2231020ea mm: thp: register mm for khugepaged when merging vma for shmem
When merging anonymous page vma, if the size of the vma can fit in at
least one hugepage, the mm will be registered for khugepaged for
collapsing THP in the future.

But it skips shmem vmas.  Do so for shmem also, but not for file-private
mappings when merging a vma in order to increase the odds of collapsing
a hugepage via khugepaged.

hugepage_vma_check() sounds like a good fit to do the check.  And move
the definition of it before khugepaged_enter_vma_merge() to avoid a
build error.

Link: http://lkml.kernel.org/r/1529697791-6950-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Jia-Ju Bai
8cded8668e mm/mempool.c: remove unused argument in kasan_unpoison_element() and remove_element()
The argument "gfp_t flags" is not used in kasan_unpoison_element() and
remove_element(), so remove it.

Link: http://lkml.kernel.org/r/20180621070332.16633-1-baijiaju1990@gmail.com
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Greg Thelen
bb451fdf3d mm/vmscan.c: condense scan_control
Use smaller scan_control fields for order, priority, and reclaim_idx.
Convert fields from int => s8.  All easily fit within a byte:

 - allocation order range: 0..MAX_ORDER(64?)
 - priority range:         0..12(DEF_PRIORITY)
 - reclaim_idx range:      0..6(__MAX_NR_ZONES)

Since 6538b8ea88 ("x86_64: expand kernel stack to 16K") x86_64 stack
overflows are not an issue.  But it's inefficient to use ints.

Use s8 (signed byte) rather than u8 to allow for loops like:
	do {
		...
	} while (--sc.priority >= 0);

Add BUILD_BUG_ON to verify that s8 is capable of storing max values.

This reduces sizeof(struct scan_control):
 - 96 => 80 bytes (x86_64)
 - 68 => 56 bytes (i386)

scan_control structure field order is changed to utilize padding.  After
this patch there is 1 bit of scan_control padding.

akpm: makes my vmscan.o's .text 572 bytes smaller as well.

Link: http://lkml.kernel.org/r/20180530061212.84915-1-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Kirill A. Shutemov
10ed634152 mm/page_ext.c: constify lookup_page_ext() argument
lookup_page_ext() finds 'struct page_ext' for a given page.  It requires
only read access to the given struct page.

Current implemnentation takes 'struct page *' as an argument.  It makes
compiler complain when 'const struct page *' passed.

Change the argument to 'const struct page *'.

Link: http://lkml.kernel.org/r/20180531135457.20167-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Kirill A. Shutemov
b3a2369692 include/linux/page_ext.h: drop definition of unused PAGE_EXT_DEBUG_POISON
After commit bd33ef3681 ("mm: enable page poisoning early at boot")
PAGE_EXT_DEBUG_POISON is not longer used.  Remove it.

Link: http://lkml.kernel.org/r/20180531135457.20167-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Arnd Bergmann
46c9a946d7 shmem: use monotonic time for i_generation
get_seconds() is deprecated because it will lead to a 32-bit overflow in
2038 or 2106.  We don't need the i_generation to be strictly monotonic
anyway, and other file systems like ext4 and xfs just use prandom_u32(),
so let's use the same one here.

If this is considered too slow, we could also use ktime_get_seconds() or
ktime_get_real_seconds() to keep the previous behavior.  Both of these
return a time64_t and are not deprecated, but only return a unique value
once per second, and are predictable.

Link: http://lkml.kernel.org/r/20180620082556.581543-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Vlastimil Babka
d6a24df006 mm, page_alloc: actually ignore mempolicies for high priority allocations
__alloc_pages_slowpath() has for a long time contained code to ignore
node restrictions from memory policies for high priority allocations.
The current code that resets the zonelist iterator however does
effectively nothing after commit 7810e6781e ("mm, page_alloc: do not
break __GFP_THISNODE by zonelist reset") removed a buggy zonelist reset.
Even before that commit, mempolicy restrictions were still not ignored,
as they are passed in ac->nodemask which is untouched by the code.

We can either remove the code, or make it work as intended.  Since
ac->nodemask can be set from task's mempolicy via alloc_pages_current()
and thus also alloc_pages(), it may indeed affect kernel allocations,
and it makes sense to ignore it to allow progress for high priority
allocations.

Thus, this patch resets ac->nodemask to NULL in such cases.  This
assumes all callers can handle it (i.e.  there are no guarantees as in
the case of __GFP_THISNODE) which seems to be the case.  The same
assumption is already present in check_retry_cpuset() for some time.

The expected effect is that high priority kernel allocations in the
context of userspace tasks (e.g.  OOM victims) restricted by mempolicies
will have higher chance to succeed if they are restricted to nodes with
depleted memory, while there are other nodes with free memory left.

It's not a new intention, but for the first time the code will match the
intention, AFAICS.  It was intended by commit 183f6371aa ("mm: ignore
mempolicies when using ALLOC_NO_WATERMARK") in v3.6 but I think it never
really worked, as mempolicy restriction was already encoded in nodemask,
not zonelist, at that time.

So originally that was for ALLOC_NO_WATERMARK only.  Then it was
adjusted by e46e7b77c9 ("mm, page_alloc: recalculate the preferred
zoneref if the context can ignore memory policies") and cd04ae1e2d
("mm, oom: do not rely on TIF_MEMDIE for memory reserves access") to the
current state.  So even GFP_ATOMIC would now ignore mempolicies after
the initial attempts fail - if the code worked as people thought it
does.

Link: http://lkml.kernel.org/r/20180612122624.8045-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Christian Hansen
59ae96ffc3 tools/vm/page-types.c: add support for idle page tracking
Add a flag which causes page-types to use the kernels's idle page
tracking to mark pages idle.  As the tool already prints the idle flag
if set, subsequent runs will show which pages have been accessed since
last run.

[akpm@linux-foundation.org: simplify mark_page_idle()]
[chansen3@cisco.com: reorganize mark_page_idle() logic, add docs]
  Link: http://lkml.kernel.org/r/20180706172237.21691-1-chansen3@cisco.com
Link: http://lkml.kernel.org/r/20180612153223.13174-1-chansen3@cisco.com
Signed-off-by: Christian Hansen <chansen3@cisco.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Christian Hansen
7f1d23e607 tools/vm/page-types.c: include shared map counts
Add a new flag that will read kpagecount for each PFN and print out the
number of times the page is mapped along with the flags in the listing
view.

This information is useful in understanding and optimizing memory usage.
Identifying pages which are not shared allows us to focus on adjusting
the memory layout or access patterns for the sole owning process.
Knowing the number of processes that share a page tells us how many
other times we must make the same adjustments or how many processes to
potentially disable.

Truncated sample output:

  voffset map-cnt offset  len     flags
  561a3591e       1       15fe8   1       ___U_lA____Ma_b___________________________
  561a3591f       1       2b103   1       ___U_lA____Ma_b___________________________
  561a36ca4       1       2cc78   1       ___U_lA____Ma_b___________________________
  7f588bb4e       14      2273c   1       __RU_lA____M______________________________

[akpm@linux-foundation.org: coding-style fixes]
[chansen3@cisco.com: add documentation, tweak whitespace]
  Link: http://lkml.kernel.org/r/20180705181204.5529-1-chansen3@cisco.com
Link: http://lkml.kernel.org/r/20180612153205.12879-1-chansen3@cisco.com
Signed-off-by: Christian Hansen <chansen3@cisco.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Yang Shi
fadae29530 thp: use mm_file_counter to determine update which rss counter
Since commit eca56ff906 ("mm, shmem: add internal shmem resident
memory accounting"), MM_SHMEMPAGES is added to separate the shmem
accounting from regular files.  So, all shmem pages should be accounted
to MM_SHMEMPAGES instead of MM_FILEPAGES.

And, normal 4K shmem pages have been accounted to MM_SHMEMPAGES, so
shmem thp pages should be not treated differently.  Account them to
MM_SHMEMPAGES via mm_counter_file() since shmem pages are swap backed to
keep consistent with normal 4K shmem pages.

This will not change the rss counter of processes since shmem pages are
still a part of it.

The /proc/pid/status and /proc/pid/statm counters will however be more
accurate wrt shmem usage, as originally intended.  And as eca56ff906
("mm, shmem: add internal shmem resident memory accounting") mentioned,
oom also could report more accurate "shmem-rss".

Link: http://lkml.kernel.org/r/1529442518-17398-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Pavel Tatashin
720e14ebec mm: skip invalid pages block at a time in zero_resv_unresv()
The role of zero_resv_unavail() is to make sure that every struct page
that is allocated but is not backed by memory that is accessible by
kernel is zeroed and not in some uninitialized state.

Since struct pages are allocated in blocks (2M pages in x86 case), we
can skip pageblock_nr_pages at a time, when the first one is found to be
invalid.

This optimization may help since now on x86 every hole in e820 maps is
marked as reserved in memblock, and thus will go through this function.

This function is called before sched_clock() is initialized, so I used
my x86 early boot clock patches to measure the performance improvement.

With 1T hole on i7-8700 currently we would take 0.606918s of boot time,
but with this optimization 0.001103s.

Link: http://lkml.kernel.org/r/20180615155733.1175-1-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Souptick Joarder
50a7ca3c6f mm: convert return type of handle_mm_fault() caller to vm_fault_t
Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f422059 ("mm: change return type to vm_fault_t")

In this patch all the caller of handle_mm_fault() are changed to return
vm_fault_t type.

Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Vlastimil Babka
0882ff9190 mm, slub: restore the original intention of prefetch_freepointer()
In SLUB, prefetch_freepointer() is used when allocating an object from
cache's freelist, to make sure the next object in the list is cache-hot,
since it's probable it will be allocated soon.

Commit 2482ddec67 ("mm: add SLUB free list pointer obfuscation") has
unintentionally changed the prefetch in a way where the prefetch is
turned to a real fetch, and only the next->next pointer is prefetched.
In case there is not a stream of allocations that would benefit from
prefetching, the extra real fetch might add a useless cache miss to the
allocation.  Restore the previous behavior.

Link: http://lkml.kernel.org/r/20180809085245.22448-1-vbabka@suse.cz
Fixes: 2482ddec67 ("mm: add SLUB free list pointer obfuscation")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
NeilBrown
1f4aace60b fs/seq_file.c: simplify seq_file iteration code and interface
The documentation for seq_file suggests that it is necessary to be able
to move the iterator to a given offset, however that is not the case.
If the iterator is stored in the private data and is stable from one
read() syscall to the next, it is only necessary to support first/next
interactions.  Implementing this in a client is a little clumsy.

 - if ->start() is given a pos of zero, it should go to start of
   sequence.

 - if ->start() is given the name pos that was given to the most recent
   next() or start(), it should restore the iterator to state just
   before that last call

 - if ->start is given another number, it should set the iterator one
   beyond the start just before the last ->start or ->next call.

Also, the documentation says that the implementation can interpret the
pos however it likes (other than zero meaning start), but seq_file
increments the pos sometimes which does impose on the implementation.

This patch simplifies the interface for first/next iteration and
simplifies the code, while maintaining complete backward compatability.
Now:

 - if ->start() is given a pos of zero, it should return an iterator
   placed at the start of the sequence

 - if ->start() is given a non-zero pos, it should return the iterator
   in the same state it was after the last ->start or ->next.

This is particularly useful for interators which walk the multiple
chains in a hash table, e.g.  using rhashtable_walk*.  See
fs/gfs2/glock.c and drivers/staging/lustre/lustre/llite/vvp_dev.c

A large part of achieving this is to *always* call ->next after ->show
has successfully stored all of an entry in the buffer.  Never just
increment the index instead.  Also:

 - always pass &m->index to ->start() and ->next(), never a temp
   variable

 - don't clear ->from when ->count is zero, as ->from is dead when
   ->count is zero.

Some ->next functions do not increment *pos when they return NULL.  To
maintain compatability with this, we still need to increment m->index in
one place, if ->next didn't increment it.  Note that such ->next
functions are buggy and should be fixed.  A simple demonstration is

   dd if=/proc/swaps bs=1000 skip=1

Choose any block size larger than the size of /proc/swaps.  This will
always show the whole last line of /proc/swaps.

This patch doesn't work around buggy next() functions for this case.

[neilb@suse.com: ensure ->from is valid]
  Link: http://lkml.kernel.org/r/87601ryb8a.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>	[docs]
Tested-by: Jann Horn <jannh@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
NeilBrown
4cdfffc872 vfs: discard ATTR_ATTR_FLAG
This flag was introduce in 2.1.37pre1 and the only place it was tested
was removed in 2.1.43pre1.  The flag was never set.

Let's discard it properly.

Link: http://lkml.kernel.org/r/877en0hewz.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Tetsuo Handa
6cd00a01f0 fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot()
Since only dentry->d_name.len + 1 bytes out of DNAME_INLINE_LEN bytes
are initialized at __d_alloc(), we can't copy the whole size
unconditionally.

 WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff8fa27465ac50)
 636f6e66696766732e746d70000000000010000000000000020000000188ffff
  i i i i i i i i i i i i i u u u u u u u u u u i i i i i u u u u
                                  ^
 RIP: 0010:take_dentry_name_snapshot+0x28/0x50
 RSP: 0018:ffffa83000f5bdf8 EFLAGS: 00010246
 RAX: 0000000000000020 RBX: ffff8fa274b20550 RCX: 0000000000000002
 RDX: ffffa83000f5be40 RSI: ffff8fa27465ac50 RDI: ffffa83000f5be60
 RBP: ffffa83000f5bdf8 R08: ffffa83000f5be48 R09: 0000000000000001
 R10: ffff8fa27465ac00 R11: ffff8fa27465acc0 R12: ffff8fa27465ac00
 R13: ffff8fa27465acc0 R14: 0000000000000000 R15: 0000000000000000
 FS:  00007f79737ac8c0(0000) GS:ffffffff8fc30000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff8fa274c0b000 CR3: 0000000134aa7002 CR4: 00000000000606f0
  take_dentry_name_snapshot+0x28/0x50
  vfs_rename+0x128/0x870
  SyS_rename+0x3b2/0x3d0
  entry_SYSCALL_64_fastpath+0x1a/0xa4
  0xffffffffffffffff

Link: http://lkml.kernel.org/r/201709131912.GBG39012.QMJLOVFSFFOOtH@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Colin Ian King
480bd56485 ocfs2: make several functions and variables static (and some const)
There are a variety of functions and variables that are local to the
source and do not need to be in global scope, so make them static.  Also
make a couple of char arrays static const.

Cleans up sparse warnings:
  symbol 'o2hb_heartbeat_mode_desc' was not declared. Should it be static?
  symbol 'o2hb_heartbeat_mode' was not declared. Should it be static?
  symbol 'o2hb_dependent_users' was not declared. Should it be static?
  symbol 'o2hb_region_dec_user' was not declared. Should it be static?
  symbol 'o2nm_fence_method_desc' was not declared. Should it be static?
  symbol 'lockdep_keys' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20180628131659.12133-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
wangyan
229ba1f82a ocfs2: clean up some unnecessary code
Several functions have some unnecessary code, clean up these code.

Link: http://lkml.kernel.org/r/5B14DF72.5020800@huawei.com
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Jun Piao
93f5920d86 ocfs2: return -EROFS when filesystem becomes read-only
We should return -EROFS rather than other errno if filesystem becomes
read-only.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/5B191B26.9010501@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Nick Desaulniers
8d00d0c00c sh: prefer _THIS_IP_ to current_text_addr
As part of the effort to reduce the code duplication between _THIS_IP_
and current_text_addr(), let's consolidate callers of
current_text_addr() to use _THIS_IP_.

Link: http://lkml.kernel.org/r/20180801185331.39535-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Dmitry Torokhov
82f7c5103d sh: make use of for_each_node_by_type()
Instead of open-coding the loop, let's use canned macro.

Also make sure we are not leaking "cpus" node reference.

Link: http://lkml.kernel.org/r/20180624224252.GA220395@dtor-ws
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Kees Cook
ab62ef82ea ntfs: mft: remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
allocates the maximum size stack buffer.  Existing checks already
require that blocksize >= NTFS_BLOCK_SIZE and mft_record_size <=
PAGE_SIZE, so max_bhs can be at most PAGE_SIZE / NTFS_BLOCK_SIZE.
Sanity checks are added for robustness.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Link: http://lkml.kernel.org/r/20180626172909.41453-4-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Kees Cook
2c27ce9150 ntfs: decompress: remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
moves the stack buffer used during decompression to be allocated
externally.

The existing "dest_max_index" used in the VLA is bounded by cb_max_page.
cb_max_page is bounded by max_page, and max_page is bounded by nr_pages.
Since nr_pages is used for the "pages" allocation, it can similarly be
used for the "completed_pages" allocation and passed into the
decompression function.  The error paths are updated to free the new
allocation.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Link: http://lkml.kernel.org/r/20180626172909.41453-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Kees Cook
ac4ecf968a ntfs: aops: remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum size needed on the stack and adds a sanity check for
robustness: index.block_size cannot be larger than PAGE_SIZE nor less
than NTFS_BLOCK_SIZE.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Link: http://lkml.kernel.org/r/20180626172909.41453-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Sebastian Andrzej Siewior
a10dcebacd fs/ntfs/aops.c: don't disable interrupts during kmap_atomic()
ntfs_end_buffer_async_read() disables interrupts around kmap_atomic().
This is a leftover from the old kmap_atomic() implementation which
relied on fixed mapping slots, so the caller had to make sure that the
same slot could not be reused from an interrupting context.

kmap_atomic() was changed to dynamic slots long ago and commit
1ec9c5ddc1 ("include/linux/highmem.h: remove the second argument of
k[un]map_atomic()") removed the slot assignements, but the callers were
not checked for now redundant interrupt disabling.

Remove the conditional interrupt disable.

Link: http://lkml.kernel.org/r/20180611144913.gln5mklhqcrfsoom@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Jeremy Cline
bed95c43c1 scripts: add Python 3 compatibility to spdxcheck.py
"dict.has_key(key)" on dictionaries has been replaced with "key in
dict".  Additionally, when run under Python 3 some files don't decode
with the default encoding (tested with UTF-8).  To handle that, don't
open the file in text mode and decode text line-by-line, ignoring
encoding errors.

This remains compatible with Python 2 and should have no functional
change.

Link: http://lkml.kernel.org/r/20180717190635.29467-1-jcline@redhat.com
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Joe Perches
fde5e903fb scripts/spdxcheck.py: work with current HEAD LICENSES/ directory
Depending on how old your -next tree is, it may not have a master that
has the LICENSES directory.

Change the lookup to HEAD and find whatever LICENSE directory files are
used in that branch.

Miscellanea:

 - Remove the checkpatch test as it will have its own SPDX license
   identifier.

Link: http://lkml.kernel.org/r/7eeefc862194930c773e662cb2152e178441d3b8.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Arnd Bergmann
f08957d0ff fs/hpfs: extend gmt_to_local() conversion to 64-bit times
The VFS timestamps are all 64-bit now, the only missing piece for hpfs
is the internal conversion function.  One interesting bit about hpfs is
that it can already deal with moving the 136 year window of its
timestamps to support a much wider range than other file systems with
32-bit timestamps.  It also treats the timestamps as 'unsigned' on
64-bit architectures (but signed on 32-bit, because time_t always around
to negative numbers in 2038).

Changing the conversion to use time64_t makes 32-bit architectures
behave the same way as 64-bit.  For completeness, this also adds a
clamp_t call for each conversion, so we don't wrap the timestamps but
instead stay within the [0..U32_MAX] range of the on-disk timestamps.

Link: http://lkml.kernel.org/r/20180718115017.742609-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Arnd Bergmann
bcf451ecfc fs/ntfs: use timespec64 directly for timestamp conversion
Now that the VFS has been converted from timespec to timespec64
timestamps, only the conversion to/from ntfs timestamps uses 32-bit
seconds.

This changes that last missing piece to get the ntfs implementation
y2038 safe on 32-bit architectures.

Link: http://lkml.kernel.org/r/20180718115017.742609-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Arnd Bergmann
a3fda0ffea fs/ufs: use ktime_get_real_seconds for sb and cg timestamps
get_seconds() is deprecated because of the 32-bit overflow and will be
removed.  All callers in ufs also truncate to a 32-bit number, so
nothing changes during the conversion, but this should be harmless as
the superblock and cylinder group timestamps are not visible to user
space, except for checking the fs-dirty state, wich works fine across
the overflow.

This moves the call to get_seconds() into a new inline function, with a
comment explaining the constraints, while converting it to
ktime_get_real_seconds().

Link: http://lkml.kernel.org/r/20180718115017.742609-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Arnd Bergmann
2c1bb29aa6 firewire: use 64-bit time_t based interfaces
32-bit CLOCK_REALTIME timestamps overflow in year 2038, so all such
interfaces are deprecated now.  For the FW_CDEV_IOC_GET_CYCLE_TIMER2
ioctl, we already support 64-bit timestamps, but the implementation
still uses timespec.

This changes the code to use timespec64 instead with the appropriate
accessor functions.

Link: http://lkml.kernel.org/r/20180711124456.1023039-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Dave Jiang
e1fb4a0864 dax: remove VM_MIXEDMAP for fsdax and device dax
This patch is reworked from an earlier patch that Dan has posted:
https://patchwork.kernel.org/patch/10131727/

VM_MIXEDMAP is used by dax to direct mm paths like vm_normal_page() that
the memory page it is dealing with is not typical memory from the linear
map.  The get_user_pages_fast() path, since it does not resolve the vma,
is already using {pte,pmd}_devmap() as a stand-in for VM_MIXEDMAP, so we
use that as a VM_MIXEDMAP replacement in some locations.  In the cases
where there is no pte to consult we fallback to using vma_is_dax() to
detect the VM_MIXEDMAP special case.

Now that we have explicit driver pfn_t-flag opt-in/opt-out for
get_user_pages() support for DAX we can stop setting VM_MIXEDMAP.  This
also means we no longer need to worry about safely manipulating vm_flags
in a future where we support dynamically changing the dax mode of a
file.

DAX should also now be supported with madvise_behavior(), vma_merge(),
and copy_page_range().

This patch has been tested against ndctl unit test.  It has also been
tested against xfstests commit: 625515d using fake pmem created by
memmap and no additional issues have been observed.

Link: http://lkml.kernel.org/r/152847720311.55924.16999195879201817653.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Arnd Bergmann
e36488c83b bitfield: avoid gcc-8 -Wint-in-bool-context warning
Passing an enum into FIELD_GET() produces a long but harmless warning on
newer compilers:

                   from include/linux/linkage.h:7,
                   from include/linux/kernel.h:7,
                   from include/linux/skbuff.h:17,
                   from include/linux/if_ether.h:23,
                   from include/linux/etherdevice.h:25,
                   from drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c:63:
  drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c: In function 'iwl_mvm_rx_mpdu_mq':
  include/linux/bitfield.h:56:20: error: enum constant in boolean context [-Werror=int-in-bool-context]
     BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero"); \
                      ^
  ...
  include/linux/bitfield.h:103:3: note: in expansion of macro '__BF_FIELD_CHECK'
     __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
     ^~~~~~~~~~~~~~~~
  drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c:1025:21: note: in expansion of macro 'FIELD_GET'
      le16_encode_bits(FIELD_GET(IWL_RX_HE_PHY_SIBG_SYM_OR_USER_NUM_MASK,

The problem here is that the caller has no idea how the macro gets
expanding, leading to a false-positive.  It can be trivially avoided by
doing a comparison against zero.

This only recently started appearing as the iwlwifi driver was patched
to use FIELD_GET.

Link: http://lkml.kernel.org/r/20180813220950.194841-1-arnd@arndb.de
Fixes: 514c30696f ("iwlwifi: add support for IEEE802.11ax")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Dominique Martinet
6ed191ca90 9p: add Dominique Martinet to MAINTAINERS
Link: http://lkml.kernel.org/r/1533869305-29325-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Ron Minnich <rminnich@sandia.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:27 -07:00
Dominique Martinet
6b69cbd8b1 9p: remove Ron Minnich from MAINTAINERS
Ron Minnich has left Sandia in 2011, and has not been involved in any 9p
commit in recent years.  Also add a CREDITS entry to record his
contributions.

Link: http://lkml.kernel.org/r/1534486244-1055-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Ronald G. Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:26 -07:00
Daniel Borkmann
f6069b9aa9 bpf: fix redirect to map under tail calls
Commits 109980b894 ("bpf: don't select potentially stale ri->map
from buggy xdp progs") and 7c30013133 ("bpf: fix ri->map_owner
pointer on bpf_prog_realloc") tried to mitigate that buggy programs
using bpf_redirect_map() helper call do not leave stale maps behind.
Idea was to add a map_owner cookie into the per CPU struct redirect_info
which was set to prog->aux by the prog making the helper call as a
proof that the map is not stale since the prog is implicitly holding
a reference to it. This owner cookie could later on get compared with
the program calling into BPF whether they match and therefore the
redirect could proceed with processing the map safely.

In (obvious) hindsight, this approach breaks down when tail calls are
involved since the original caller's prog->aux pointer does not have
to match the one from one of the progs out of the tail call chain,
and therefore the xdp buffer will be dropped instead of redirected.
A way around that would be to fix the issue differently (which also
allows to remove related work in fast path at the same time): once
the life-time of a redirect map has come to its end we use it's map
free callback where we need to wait on synchronize_rcu() for current
outstanding xdp buffers and remove such a map pointer from the
redirect info if found to be present. At that time no program is
using this map anymore so we simply invalidate the map pointers to
NULL iff they previously pointed to that instance while making sure
that the redirect path only reads out the map once.

Fixes: 97f91a7cf0 ("bpf: add bpf_redirect_map helper routine")
Fixes: 109980b894 ("bpf: don't select potentially stale ri->map from buggy xdp progs")
Reported-by: Sebastiano Miano <sebastiano.miano@polito.it>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-17 15:56:23 -07:00
Linus Torvalds
9bd553929f This has been a large cycle for RDMA, with several major patch series
reworking parts of the core code.
 
 - Rework the so-called 'gid cache' and internal APIs to use a kref'd
   pointer to a struct instead of copying, push this upwards into the
   callers and add more stuff to the struct. The new design avoids some
   ugly races the old one suffered with. This is part of the namespace
   enablement work as the new struct is learning to be namespace aware.
 
 - Various uapi cleanups, moving more stuff to include/uapi and fixing some
   long standing bugs that have recently been discovered.
 
 - Driver updates for mlx5, mlx4 i40iw, rxe, cxgb4, hfi1, usnic, pvrdma,
   and hns
 
 - Provide max_send_sge and max_recv_sge attributes to better support HW
   where these values are asymmetric.
 
 - mlx5 user API 'devx' allows sending commands directly to the device FW,
   instead of trying to cram every wild and niche feature into the common
   API. Sort of like what GPU does.
 
 - Major write() and ioctl() API rework to cleanly support PCI device hot
   unplug and advance the ioctl conversion work
 
 - Sparse and compile warning cleanups
 
 - Add 'const' to the ib_poll_cq() signature, and permit a NULL 'bad_wr',
   which is the common use case
 
 - Various patches to avoid high order allocations across the stack
 
 - SRQ support for cxgb4, hns and qedr
 
 - Changes to IPoIB to better follow the netdev model for working with
   struct net_device liftime
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlt17oMACgkQOG33FX4g
 mxpRsQ//YZY1Gci1IoYLMuq0Rn9+/4lRHaBev+B728z1dvEFBW8m/i2DV5dPnSxO
 AUN9dZOKBYYhc08h8vphtnBdMEtYJz6Dl76F8W+mt5vSuM5D4+0ba415RYSnV1Dc
 d6Js33OTMVbQVHmYCIAXh9FNDX8lkywT346aXlMOpW3z74xoaLkkQ0cnfB0SEX0y
 q9jiu70s6eisLlu9zJsXmCCLQ1b8eUD6IZm7hX8wMheuhDWyfrOv8JBeBCQdICuI
 MASc2T7X8E++dvIePAL7Hgx/0SH/2Mit8zaJ0Sbt2OjBDcImLSs8bcple5gPoCPk
 3vnCdb2GKg8xlxe3n1S89sGC1b8MY2CtQFElSs9C6npIGCwr2XlrZDDa0tE45+8I
 miVhoswakmKW61KTCkVf2d9RXWcIh1qwUIpan1aZMsWdNnA6FYXIF054mMmJO44+
 HUi2C93zAhx3XhFuX6O2YAHkG6CSXcZPfO7U9zy++GwAoXtGU0g6OLZbaYdEfuQh
 lN8LLqxe3M5sMdDnHYc38AsLW9MmxyJXt+h2yLxtsdZ9jitypBDQxSVfAI68RNwL
 BB1qELflF9FtAousQU9qhdNHimsgwctJ9MoZ6I1Aa1+ovwcSQgmKoQlNJIHkFroB
 wUz2sz6q25OdLWDpFrGipmG7Kfnosg7xuBSYZUQMBzLmjg0HTVY=
 =F50c
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a large cycle for RDMA, with several major patch series
  reworking parts of the core code.

   - Rework the so-called 'gid cache' and internal APIs to use a kref'd
     pointer to a struct instead of copying, push this upwards into the
     callers and add more stuff to the struct. The new design avoids
     some ugly races the old one suffered with. This is part of the
     namespace enablement work as the new struct is learning to be
     namespace aware.

   - Various uapi cleanups, moving more stuff to include/uapi and fixing
     some long standing bugs that have recently been discovered.

   - Driver updates for mlx5, mlx4 i40iw, rxe, cxgb4, hfi1, usnic,
     pvrdma, and hns

   - Provide max_send_sge and max_recv_sge attributes to better support
     HW where these values are asymmetric.

   - mlx5 user API 'devx' allows sending commands directly to the device
     FW, instead of trying to cram every wild and niche feature into the
     common API. Sort of like what GPU does.

   - Major write() and ioctl() API rework to cleanly support PCI device
     hot unplug and advance the ioctl conversion work

   - Sparse and compile warning cleanups

   - Add 'const' to the ib_poll_cq() signature, and permit a NULL
     'bad_wr', which is the common use case

   - Various patches to avoid high order allocations across the stack

   - SRQ support for cxgb4, hns and qedr

   - Changes to IPoIB to better follow the netdev model for working with
     struct net_device liftime"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (312 commits)
  Revert "net/smc: Replace ib_query_gid with rdma_get_gid_attr"
  RDMA/hns: Fix usage of bitmap allocation functions return values
  IB/core: Change filter function return type from int to bool
  IB/core: Update GID entries for netdevice whose mac address changes
  IB/core: Add default GIDs of the bond master netdev
  IB/core: Consider adding default GIDs of bond device
  IB/core: Delete lower netdevice default GID entries in bonding scenario
  IB/core: Avoid confusing del_netdev_default_ips
  IB/core: Add comment for change upper netevent handling
  qedr: Add user space support for SRQ
  qedr: Add support for kernel mode SRQ's
  qedr: Add wrapping generic structure for qpidr and adjust idr routines.
  IB/mlx5: Fix leaking stack memory to userspace
  Update the e-mail address of Bart Van Assche
  IB/ucm: Fix compiling ucm.c
  IB/uverbs: Do not check for device disassociation during ioctl
  IB/uverbs: Remove struct uverbs_root_spec and all supporting code
  IB/uverbs: Use uverbs_api to unmarshal ioctl commands
  IB/uverbs: Use uverbs_alloc for allocations
  IB/uverbs: Add a simple allocator to uverbs_attr_bundle
  ...
2018-08-17 12:44:48 -07:00
Heiner Kallweit
bfdd19ad80 r8169: add missing Kconfig dependency
Now that we switched the r8169 driver to use phylib, there's a
dependency on the Realtek PHY drivers. This dependency was missing
in Kconfig.

Reported-by: Jouni Mettälä <jtmettala@gmail.com>
Fixes: f1e911d5d0 ("r8169: add basic phylib support")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-17 12:24:13 -07:00
Yonghong Song
a85da34e97 tools/bpf: fix bpf selftest test_cgroup_storage failure
The bpf selftest test_cgroup_storage failed in one of
our production test servers.
  # sudo ./test_cgroup_storage
  Failed to create map: Operation not permitted

It turns out this is due to insufficient locked memory
with system default 16KB.

Similar to other self tests, let us arm the process
with unlimited locked memory. With this change,
the test passed.
  # sudo ./test_cgroup_storage
  test_cgroup_storage:PASS

Fixes: 68cfa3ac6b ("selftests/bpf: add a cgroup storage test")
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-17 12:23:47 -07:00
Linus Torvalds
022ff62c3d msm a6xx new hardware support
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbdh8EAAoJEAx081l5xIa+DUYP/1M5JEJRH59bgmvzW2gVfVPU
 JCK300umHiGDiPmuGY/mHITw9deZ9Kgrgk6YmrjqeR3W2P/9nbxwhL6PLYKPI/j2
 qgdzKhJ+5BHjHgEbzl2w1Vq5T70djoqrlxkJHedG+NgypNsDkNxFQIW5qfD5CQ8R
 a+FhK0LetnhBzuGUTqMzCcewErC+omQtgilxbEmkHyv5l2qkkerNRLRZmNUlkH3m
 N+hhsBjWSFHG9TpBngObbY97MKJlx1CeGO8Q+wnLAgJZ/bqkpN3pG/5UTr41FjYq
 hsOKo7Drs8uTokjJyL5hVPZa8fQLrTWM+u+ieEm6ag2Sx7hbD+cqzcwkWVgUKjL9
 7ONyxrFZvyO7dqninC8hEvvlakYo7k9CrHFKt69bdZ6Z0IYWebEvkKZpK3tOFU3N
 pln3xTtEjOoMjA/vmEphlvksJV6XeJMJmdzbUYKDOR3VWONmNqlrJJ0wBmSTCAMY
 5KrHsepvQ1Qu5giXjNdyc31lWorHOdIfyRnK2EZ2217jHUzFfQQ6YU1DZYNh2AJY
 0cVlAVggWV9KsWpSwDWsk6wq2tWv701u8LxncjeyxH+8eW8FF9oyTA85nreykyze
 LDIGiLQF4CpZaBilXmxyWzl71ZDvfBs9OhZIWHL1Obq4hbRQW02JQNv0UTqgfC4i
 ZMJbHvBmCXV+bhEbKZrx
 =vyiz
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-08-17-1' of git://anongit.freedesktop.org/drm/drm

Pull drm msm support for adreno a6xx from Dave Airlie:
 "This is the support for new Qualcomm Snapdragon SoCs with the A6xx
  core. Userspace support is in mesa now"

* tag 'drm-next-2018-08-17-1' of git://anongit.freedesktop.org/drm/drm:
  drm/msm: a6xx: fix spelling mistake: "initalization" -> "initialization"
  drm/msm: Add A6XX device support
  drm/msm: update generated headers
  drm/msm/adreno: Load the firmware before bringing up the hardware
  drm/msm: Add a helper function to parse clock names
2018-08-17 12:13:15 -07:00
Linus Torvalds
f80a71b0c4 msm, i915 and amdgpu fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbdhlUAAoJEAx081l5xIa++08P/jdvasJkTu3eyAczNVW2EPyG
 JQcpIj774tyN0Dwy9rjkW5KxlNo5cQlchlAQ/LqPnRJp8c3qSe3obwjFzmXkQgxP
 j//1FB1XkxK/YZbXRjudV7xUov/sMyBnXIwvmWP3NDu5rDrWfLZDznvq6r7vDy/o
 ImmxZboWqI94oGhrtAuwMpcFjOOuOvJQg9FSHAOMkNhRHs1xl50y5R/WSeAoY1fC
 R22SZEcGQkQJuq6kHa2Dgysd1uMULLpgQnbw/9rD72PeQXzIIw48xdjJkTBjPu5A
 ulrCaMd+loaCO3xdtIdpLqbKo4XQwGCm1gShDUWZhgVy21Z3M78u6isEtBkYDbZZ
 MJECEYzbp8EYkm8QiqSzTTdqvrlH3CjukKhhZeNdpVNxmIvsjZDQGTKYp21mA3S1
 I+FVPFH6sykMFxIcpRa87bn4ImrJ2xSDSrWU3HhNQiWpJf+fSaZsKQkUCLdY9rxX
 WcwvtP5zspL0rWwtkStkKd0BSkBK+S6uZ17xlvUEK17kih2E2TTpJoGnqNE1HNUP
 7Kts/UgXrxobSGhRJLxf+b7gJqWwrLmeCfF4ZWRvMpG727k6Dw87mIfkMGy0v/fJ
 rKp2/RYqPGVF2A++2kp5GFPfIFlHtiCCDNYwoBJKqwCFkm+ow+ehFWwLXPXeBDFH
 PRfMeYj5freNp2C78TTD
 =RsR5
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-08-17' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "First round of fixes for -rc1. I'll follow this up with the msm new hw
  support pull request.

  This just has three sets of fixes, some for msm before the new hw, a
  bunch of AMD fixes (includiing some required firmware changes for new
  hw), and a set of i915 (+gvt) fixes"

* tag 'drm-next-2018-08-17' of git://anongit.freedesktop.org/drm/drm: (30 commits)
  drm/amdgpu: Use kvmalloc for allocating UVD/VCE/VCN BO backup memory
  drm/i915: set DP Main Stream Attribute for color range on DDI platforms
  drm/i915/selftests: Hold rpm for unparking
  drm/i915: Restore user forcewake domains across suspend
  drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw
  drm/i915/gvt: fix memory leak in intel_vgpu_ioctl()
  drm/i915/gvt: Off by one in intel_vgpu_write_fence()
  drm/i915/kvmgt: Fix potential Spectre v1
  drm/i915/gvt: return error on cmd access
  drm/i915/gvt: initialize dmabuf mutex in vgpu_create
  drm/i915/gvt: fix cleanup sequence in intel_gvt_clean_device
  drm/amd/display: Guard against null crtc in CRC IRQ
  drm/amd/display: Pass connector id when executing VBIOS CT
  drm/amd/display: Check if clock source in use before disabling
  drm/amd/display: Allow clock sharing b/w HDMI and DVI
  drm/amd/display: Fix warning observed in mode change on Vega
  drm/amd/display: fix single link DVI has no display
  drm/amdgpu/vce: VCE entity initialization relies on ring initializtion
  drm/amdgpu/uvd: UVD entity initialization relys on ring initialization
  drm/amdgpu:add VCN booting with firmware loaded by PSP
  ...
2018-08-17 12:10:22 -07:00
Linus Torvalds
edb0a20009 A couple of arm64 fixes
- Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled
 - Fix address truncation in pfn_valid() implementation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJbdp+EAAoJELescNyEwWM0Ld8H/iJqjvPwNLRC0KGL/rCQJH70
 D80qlNBnwlrs2eUJTeNeRVZC+t2l9vJIoT17W938WkjxV+DSGDsfFDy3/BQ7VTji
 7e33mwFBNoH+feAfMYmzht3sRlvyZ0oqXSIq/GrdZ8a4Gg/6iNVz7K1kpboBVFXp
 LFnFIN4I7mNwdl1nAyNmnU081MMWfyvgRB82Xd9eS00KCAm3ueHfkwBNcwkfulDg
 RT2ZXPzwd3Yxsdy3Z+r1vyXMHAw2GjcYpL5pjvHf34zMdvqkk03sMsx2yReuSR1U
 M6MpNCdZfWHgMlFWbsEoEOd0g0CF5s6TQK3hBqoUEE3AUVNrQ8ixZMip326axoQ=
 =C2YW
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "A couple of arm64 fixes

   - Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled

   - Fix address truncation in pfn_valid() implementation"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
  arm64: Avoid calling stop_machine() when patching jump labels
2018-08-17 11:48:04 -07:00
Linus Torvalds
5e2d059b52 powerpc updates for 4.19
Notable changes:
 
  - A fix for a bug in our page table fragment allocator, where a page table page
    could be freed and reallocated for something else while still in use, leading
    to memory corruption etc. The fix reuses pt_mm in struct page (x86 only) for
    a powerpc only refcount.
 
  - Fixes to our pkey support. Several are user-visible changes, but bring us in
    to line with x86 behaviour and/or fix outright bugs. Thanks to Florian Weimer
    for reporting many of these.
 
  - A series to improve the hvc driver & related OPAL console code, which have
    been seen to cause hardlockups at times. The hvc driver changes in particular
    have been in linux-next for ~month.
 
  - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y.
 
  - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in use
    anywhere other than as a paper weight.
 
  - An optimised memcmp implementation using Power7-or-later VMX instructions
 
  - Support for barrier_nospec on some NXP CPUs.
 
  - Support for flushing the count cache on context switch on some IBM CPUs
    (controlled by firmware), as a Spectre v2 mitigation.
 
  - A series to enhance the information we print on unhandled signals to bring it
    into line with other arches, including showing the offending VMA and dumping
    the instructions around the fault.
 
 Thanks to:
   Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Alexey
   Spirkov, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar,
   Arnd Bergmann, Bartosz Golaszewski, Benjamin Herrenschmidt, Bharat Bhushan,
   Bjoern Noetel, Boqun Feng, Breno Leitao, Bryant G. Ly, Camelia Groza,
   Christophe Leroy, Christoph Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt,
   Darren Stevens, Dave Young, David Gibson, Diana Craciun, Finn Thain, Florian
   Weimer, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand,
   Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel Stanley,
   Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus
   Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues, Michael Hanselmann, Michael
   Neuling, Michael Schmitz, Mukesh Ojha, Murilo Opsfelder Araujo, Nicholas
   Piggin, Parth Y Shah, Paul Mackerras, Paul Menzel, Ram Pai, Randy Dunlap,
   Rashmica Gupta, Reza Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff,
   Scott Wood, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson,
   Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat Rao
   B, zhong jiang.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAlt2O6cTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgC7hD/4+cj796Df7GsVsIMxzQm7SS9dklIdO
 JuKj2Nr5HRzTH59jWlXukLG9mfTNCFgFJB4gEpK1ArDOTcHTCI9RRsLZTZ/kum66
 7Pd+7T40dLYXB5uecuUs0vMXa2fI3syKh1VLzACSXv3Dh9BBIKQBwW/aD2eww4YI
 1fS5LnXZ2PSxfr6KNAC6ogZnuaiD0sHXOYrtGHq+S/TFC7+Z6ySa6+AnPS+hPVoo
 /rHDE1Khr66aj7uk+PP2IgUrCFj6Sbj6hTVlS/iAuwbMjUl9ty6712PmvX9x6wMZ
 13hJQI+g6Ci+lqLKqmqVUpXGSr6y4NJGPS/Hko4IivBTJApI+qV/tF2H9nxU+6X0
 0RqzsMHPHy13n2torA1gC7ttzOuXPI4hTvm6JWMSsfmfjTxLANJng3Dq3ejh6Bqw
 76EMowpDLexwpy7/glPpqNdsP4ySf2Qm8yq3mR7qpL4m3zJVRGs11x+s5DW8NKBL
 Fl5SqZvd01abH+sHwv6NLaLkEtayUyohxvyqu2RU3zu5M5vi7DhqstybTPjKPGu0
 icSPh7b2y10WpOUpC6lxpdi8Me8qH47mVc/trZ+SpgBrsuEmtJhGKszEnzRCOqos
 o2IhYHQv3lQv86kpaAFQlg/RO+Lv+Lo5qbJ209V+hfU5nYzXpEulZs4dx1fbA+ze
 fK8GEh+u0L4uJg==
 =PzRz
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - A fix for a bug in our page table fragment allocator, where a page
     table page could be freed and reallocated for something else while
     still in use, leading to memory corruption etc. The fix reuses
     pt_mm in struct page (x86 only) for a powerpc only refcount.

   - Fixes to our pkey support. Several are user-visible changes, but
     bring us in to line with x86 behaviour and/or fix outright bugs.
     Thanks to Florian Weimer for reporting many of these.

   - A series to improve the hvc driver & related OPAL console code,
     which have been seen to cause hardlockups at times. The hvc driver
     changes in particular have been in linux-next for ~month.

   - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y.

   - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in
     use anywhere other than as a paper weight.

   - An optimised memcmp implementation using Power7-or-later VMX
     instructions

   - Support for barrier_nospec on some NXP CPUs.

   - Support for flushing the count cache on context switch on some IBM
     CPUs (controlled by firmware), as a Spectre v2 mitigation.

   - A series to enhance the information we print on unhandled signals
     to bring it into line with other arches, including showing the
     offending VMA and dumping the instructions around the fault.

  Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey
  Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan,
  Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski,
  Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng,
  Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph
  Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave
  Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer,
  Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand,
  Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel
  Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh
  Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues,
  Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha,
  Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul
  Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza
  Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood,
  Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago
  Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat
  Rao, zhong jiang"

* tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits)
  powerpc/mm/book3s/radix: Add mapping statistics
  powerpc/uaccess: Enable get_user(u64, *p) on 32-bit
  powerpc/mm/hash: Remove unnecessary do { } while(0) loop
  powerpc/64s: move machine check SLB flushing to mm/slb.c
  powerpc/powernv/idle: Fix build error
  powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range
  powerpc/mm: remove warning about ‘type’ being set
  powerpc/32: Include setup.h header file to fix warnings
  powerpc: Move `path` variable inside DEBUG_PROM
  powerpc/powermac: Make some functions static
  powerpc/powermac: Remove variable x that's never read
  cxl: remove a dead branch
  powerpc/powermac: Add missing include of header pmac.h
  powerpc/kexec: Use common error handling code in setup_new_fdt()
  powerpc/xmon: Add address lookup for percpu symbols
  powerpc/mm: remove huge_pte_offset_and_shift() prototype
  powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled
  powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
  powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
  powerpc/fadump: handle crash memory ranges array index overflow
  ...
2018-08-17 11:32:50 -07:00
Linus Torvalds
d190775206 Modules updates for v4.19
Summary of modules changes for the 4.19 merge window:
 
 - Fix modules kallsyms for livepatch. Livepatch modules can have
   SHN_UNDEF symbols in their module symbol tables for later symbol
   resolution, but kallsyms shouldn't be returning these symbols
 
 - Some code cleanups and minor reshuffling in load_module() were done to
   log the module name when module signature verification fails
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJbdmqCAAoJEMBFfjjOO8FyR+AP/3nbWaCUYmxvae9QPt1ycXxm
 UV0TnRYJbrEZVlUpL1X/efd63jVizmhJSMPTGumje4s3nhSNGMbEYa+/VN0mcja3
 egKqKGD/zjUrOZxwu/zxlV4lVd6Wt8mbO+pFnc/0MlmtWVnkWxRXsf/k8Uo5bzSE
 l6aGjQJjF1Yj5hwgVoLlL0L+6mw5Usu1tKzvEvx5IZO9IvzenwifZ1N8JowSp7G2
 APd0TjxDTuZlR430JVONKVpzFCWtMqUzMxHQZl26uKxyrLkJ32GarQKTUxStOAXg
 LTZ1Nrjn4UToRZLx4VUmeaXBlW/dhnzF17WPwrN1AXFgcXdR6wlw0otDSJdZVh8X
 FM8mCANZt9Zi/HAmBTMypnnCGqyoY+rHz1TWcsNDgknwBb0lg+3Kx781fl1Z1v0g
 RHV8JYRw7hCt/WDfPuyrhDO5wmdqAPF3A1WlJ7ItNvLB4tP0jILCoOP5csSp8+hs
 q/7o1A66KoGgOCWJ4NnZ/uPadr1ahTH0CYcq4dziQWiEuzOU3XdgHJ2V+XBgs36v
 UkuJiqM9NyBs90+r9+8YBCkC9gCv8Rs/43aTN9sVoDoihmX57339jgVcKfLIqSNw
 djmJHsUW7eFU0I6m5c8bNt3VDgL4iIJp4gAwu/uuDV7nFnbaJQK9cENH8IJ23Wto
 y0CYhBExs26z6axi4BCv
 =HzLj
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:
 "Summary of modules changes for the 4.19 merge window:

   - Fix modules kallsyms for livepatch. Livepatch modules can have
     SHN_UNDEF symbols in their module symbol tables for later symbol
     resolution, but kallsyms shouldn't be returning these symbols

   - Some code cleanups and minor reshuffling in load_module() were done
     to log the module name when module signature verification fails"

* tag 'modules-for-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  kernel/module: Use kmemdup to replace kmalloc+memcpy
  ARM: module: fix modsign build error
  modsign: log module name in the event of an error
  module: replace VMLINUX_SYMBOL_STR() with __stringify() or string literal
  module: print sensible error code
  module: setup load info before module_sig_check()
  module: make it clear when we're handling the module copy in info->hdr
  module: exclude SHN_UNDEF symbols from kallsyms api
2018-08-17 10:51:22 -07:00
Linus Torvalds
84f5685230 VLA leftovers pull summary:
- bus/imx-weim: Use maximum register count to avoid VLA
 
 - drm/i2c/tda9950: Use maximum CEC message size to avoid VLA
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlt2C80WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJvVBD/96GT3cgROKIL1nrgtotAJobefS
 PpojBo9qHqG+8YqBQxZwM5PJpHhHdGeK1EE8gzruJL1146ExXdePX+Mvf1NgGsad
 EnMVZ4eQfiL1yTj/Sutn4WZeLs88kqkcvsUVN+V7uStuDGeSDMh2QYCsFBFL5heV
 aW6/DvB+LT4vmAkZdu7cU29vUo6hRQYDWD5VmYajujjkVHCerVuHX38ly/VS7jK8
 JhPp256nAMJSqvqyxQuS98GvzSLV7ovZB/i0PN8efg1Sph5XLHCql5S1hUIaMhqC
 ruVjJWqkDB78VZ/SLZcYVNurOwvdD3JREmmm4GvrYk3mgf1Lr8d1HvbAeTsz1QLi
 GJGS290OVEyVCZHwS5RAaIlEo+idPGtBymv7EIcVGQ3TdkDsruAfchF1OWNK0AYG
 T+85IDjYaUHWU7NkK98BdGTQNY7J95KDnMPgoN45ArIgBIrNAZ5pPlw3e3W0BL7Z
 rB4DwCEwrcq2C9uoJgQG0Oc8hmLsIINSicAZ1Y15DHkPGDHvdSNDQ6ya1q6IkcUS
 zfvxrh377Zo2l3cuJNkXnnT1sxO9Q86XjmqLbhGiifIzTHwIRcVUYWb4Hk6AfGks
 +5HhPli/02yaaRQcfa+8CI3MfiGmFjarwNx/AUDKvshfsHUcOqBxdop1ur8P4bpj
 E1frvFoUipoI2wUBTQ==
 =hM/H
 -----END PGP SIGNATURE-----

Merge tag 'vla-leftovers-v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull VLA removal leftovers from Kees Cook:

 - bus/imx-weim: Use maximum register count to avoid VLA

 - drm/i2c/tda9950: Use maximum CEC message size to avoid VLA

* tag 'vla-leftovers-v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  bus: imx-weim: Remove VLA usage
  drm/i2c: tda9950: Remove VLA usage
2018-08-17 10:40:09 -07:00
Sean Christopherson
f19f5c49bb x86/speculation/l1tf: Exempt zeroed PTEs from inversion
It turns out that we should *not* invert all not-present mappings,
because the all zeroes case is obviously special.

clear_page() does not undergo the XOR logic to invert the address bits,
i.e. PTE, PMD and PUD entries that have not been individually written
will have val=0 and so will trigger __pte_needs_invert(). As a result,
{pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones
(adjusted by the max PFN mask) instead of zero. A zeroed entry is ok
because the page at physical address 0 is reserved early in boot
specifically to mitigate L1TF, so explicitly exempt them from the
inversion when reading the PFN.

Manifested as an unexpected mprotect(..., PROT_NONE) failure when called
on a VMA that has VM_PFNMAP and was mmap'd to as something other than
PROT_NONE but never used. mprotect() sends the PROT_NONE request down
prot_none_walk(), which walks the PTEs to check the PFNs.
prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns
-EACCES because it thinks mprotect() is trying to adjust a high MMIO
address.

[ This is a very modified version of Sean's original patch, but all
  credit goes to Sean for doing this and also pointing out that
  sometimes the __pte_needs_invert() function only gets the protection
  bits, not the full eventual pte.  But zero remains special even in
  just protection bits, so that's ok.   - Linus ]

Fixes: f22cc87f6c ("x86/speculation/l1tf: Invert all not present mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 10:27:36 -07:00
Linus Torvalds
b0e5c29426 - A couple stable fixes for the DM writecache target.
- A stable fix for the DM cache target that fixes the potential for data
   corruption after an unclean shutdown of a cache device using writeback
   mode.
 
 - Update DM integrity target to allow the metadata to be stored on a
   separate device from data.
 
 - Fix DM kcopyd and the snapshot target to cond_resched() where
   appropriate and be more efficient with processing completed work.
 
 - A few fixes and improvements for DM crypt.
 
 - Add DM delay target feature to configure delay of flushes independent
   of writes.
 
 - Update DM thin-provisioning target to include metadata_low_watermark
   threshold in pool status.
 
 - Fix stale DM thin-provisioning Documentation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJbddbuAAoJEMUj8QotnQNaoZIH/1qxc4x26bUfkVwBCiqU0cqJ
 8PD8D5UBFB7+JXeI4P9p7ONVJNpk291QGeX+CBOXfhdeBBcXmuHnavJFcmH6+1Y3
 omPEIvKnsODEMyZuznA8HasvlZIDPNOESaDt/8vhDPkDmPLWi3h+fUO1ay+NRtua
 J+8ZTe35P5SY70uLFE3nTeZScZD8KAO9Py1W+5Lz1godS6UUHhNOAcm0zzw5Nvnc
 2gu2HZaCx0erFNou5Kj5Y4a/z/cITlAyXHQyzXBINk/X4sSwvzRr2tSy/FG91CuO
 kHduB0Tgo8eSeEta8jcOcYCm601XLlXzkIM69Z7c3fmCY+b8dY4ybHPxApTEv7c=
 =blC6
 -----END PGP SIGNATURE-----

Merge tag 'for-4.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - A couple stable fixes for the DM writecache target.

 - A stable fix for the DM cache target that fixes the potential for
   data corruption after an unclean shutdown of a cache device using
   writeback mode.

 - Update DM integrity target to allow the metadata to be stored on a
   separate device from data.

 - Fix DM kcopyd and the snapshot target to cond_resched() where
   appropriate and be more efficient with processing completed work.

 - A few fixes and improvements for DM crypt.

 - Add DM delay target feature to configure delay of flushes independent
   of writes.

 - Update DM thin-provisioning target to include metadata_low_watermark
   threshold in pool status.

 - Fix stale DM thin-provisioning Documentation.

* tag 'for-4.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits)
  dm writecache: fix a crash due to reading past end of dirty_bitmap
  dm crypt: don't decrease device limits
  dm cache metadata: set dirty on all cache blocks after a crash
  dm snapshot: remove stale FIXME in snapshot_map()
  dm snapshot: improve performance by switching out_of_order_list to rbtree
  dm kcopyd: avoid softlockup in run_complete_job
  dm cache metadata: save in-core policy_hint_size to on-disk superblock
  dm thin: stop no_space_timeout worker when switching to write-mode
  dm kcopyd: return void from dm_kcopyd_copy()
  dm thin: include metadata_low_watermark threshold in pool status
  dm writecache: report start_sector in status line
  dm crypt: convert essiv from ahash to shash
  dm crypt: use wake_up_process() instead of a wait queue
  dm integrity: recalculate checksums on creation
  dm integrity: flush journal on suspend when using separate metadata device
  dm integrity: use version 2 for separate metadata
  dm integrity: allow separate metadata device
  dm integrity: add ic->start in get_data_sector()
  dm integrity: report provided data sectors in the status
  dm integrity: implement fair range locks
  ...
2018-08-17 09:52:15 -07:00
Linus Torvalds
2645b9d1a4 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlt2mBQACgkQnJ2qBz9k
 QNntGQgAluTTnuJLjoUDjFfT37Fjf2x1ve8rg6xmYS3YIhYTWWA1oazUIeyBDfwa
 soutlfAZ/ix2bP1UEmeULxFhrCIXYBbWAe8s5MRqO/7s01QftNf0M72ASmd7gZRy
 rSVt2/BWpr745mWI38tEKlIF4sQJVD7IGrnc1cQslPzleeCqsCXA+uBkBPMlcDpJ
 ZWni2qK023y9E2dsg6RsJc1HemkQvrJtoLSVqRsdhty9GEuWseMbssdgz1zMXljQ
 eXIALE5BssoxISIpH6qVKZRlr7UWGxOmV4CDPmku7DFLOSiwMk/Ml0V80BwzjNNY
 hY8qfxcJOFOGZ8t82pWkVGMjgOAKjA==
 =IN6Y
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "fsnotify cleanups from Amir and a small inotify improvement"

* tag 'fsnotify_for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  inotify: Add flag IN_MASK_CREATE for inotify_add_watch()
  fanotify: factor out helpers to add/remove mark
  fsnotify: add helper to get mask from connector
  fsnotify: let connector point to an abstract object
  fsnotify: pass connp and object type to fsnotify_add_mark()
  fsnotify: use typedef fsnotify_connp_t for brevity
2018-08-17 09:41:28 -07:00
Linus Torvalds
46e62a072a \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlt2l2MACgkQnJ2qBz9k
 QNlZMAgAwVu/bMsRR6PbXJIAYEUNLehrmgUfSdYxIFqnZPq84ZfpOMQZKDYJIO5d
 WiLz9Z9pti/ldrQ33yllbJrsalAn8R+LB911eaKUvLscXyrIsoBxsBbOOtVZc9lZ
 jaQBUMLStdPvE6LgW93f1EwIg/Z8CSTzaeCO31wlZl7s7wsBhjg3MJ3f9sR6LG0G
 OKQZnjDxGbtsbeVl8cnOeeF3sd0kqYTT5EwSh+zkMIbHJQ0dbvEjj24TM9rHdzG2
 AN35+rzFZeMHRGnfWsQ/I6il1nTuWIyPRpoc57cwV/dcYwpg1Pi6MZzrFcDsWfwx
 rHgRJIkmSqi1S6Ic8o6s9fYsn6266A==
 =ljWe
 -----END PGP SIGNATURE-----

Merge tag 'for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull UDF and ext2 update from Jan Kara.

* tag 'for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: use ktime_get_real_seconds for timestamps
  udf: convert inode stamps to timespec64
2018-08-17 09:38:39 -07:00
Vinod Koul
3257d86182 Merge branch 'topic/pl330' into for-linus 2018-08-17 18:00:29 +05:30