Commit graph

57459 commits

Author SHA1 Message Date
Linus Torvalds
b5dd0c658c Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - some of the rest of MM

 - various misc things

 - dynamic-debug updates

 - checkpatch

 - some epoll speedups

 - autofs

 - rapidio

 - lib/, lib/lzo/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  samples/mic/mpssd/mpssd.h: remove duplicate header
  kernel/fork.c: remove duplicated include
  include/linux/relay.h: fix percpu annotation in struct rchan
  arch/nios2/mm/fault.c: remove duplicate include
  unicore32: stop printing the virtual memory layout
  MAINTAINERS: fix GTA02 entry and mark as orphan
  mm: create the new vm_fault_t type
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
  arch: simplify several early memory allocations
  openrisc: simplify pte_alloc_one_kernel()
  sh: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  powerpc: prefer memblock APIs returning virtual address
  lib/lzo: separate lzo-rle from lzo
  lib/lzo: implement run-length encoding
  lib/lzo: fast 8-byte copy on arm64
  lib/lzo: 64-bit CTZ on arm64
  lib/lzo: tidy-up ifdefs
  ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
  ipc: annotate implicit fall through
  ...
2019-03-07 19:25:37 -08:00
Oleg Nesterov
6eb3c3d0a5 exec: increase BINPRM_BUF_SIZE to 256
Large enterprise clients often run applications out of networked file
systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters.  Bumping this up
to the next order of two solves this problem in all but the most
egregious case while still fitting into a 512b slab.

[oleg@redhat.com: update comment, per Kees]
Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ben Woodard <woodard@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Vineet Gupta
26e152252e fs/exec.c: replace opencoded set_mask_bits()
Link: http://lkml.kernel.org/r/1548275584-18096-2-git-send-email-vgupta@synopsys.com
Link: http://lkml.kernel.org/g/20150807115710.GA16897@redhat.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Hou Tao
67ceb1eca0 fat: enable .splice_write to support splice on O_DIRECT file
Now splice() on O_DIRECT-opened fat file will return -EFAULT, that is
because the default .splice_write, namely default_file_splice_write(),
will construct an ITER_KVEC iov_iter and dio_refill_pages() in dio path
can not handle it.

Fix it by implementing .splice_write through iter_file_splice_write().

Spotted by xfs-tests generic/091.

Link: http://lkml.kernel.org/r/20190210094754.56355-1-houtao1@huawei.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
NeilBrown
660c9fc72e autofs: clear O_NONBLOCK on the pipe
autofs does not expect the pipe it is given to have O_NONBLOCK set -
specifically if __kernel_write() in autofs_write() returns -EAGAIN, this
is treated as a fatal error and the pipe is closed.

For safety autofs should, therefore, clear the O_NONBLOCK flag.

Releases of systemd prior to 8th February 2019 used
  pipe2(p, O_NONBLOCK|O_CLOEXEC)
and thus (inadvertently) set this flag.

Link: http://lkml.kernel.org/r/154993550902.3321.1183632970046073478.stgit@pluto-themaw-net
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent
874d22d62b fs/autofs/inode.c: use seq_puts() for simple strings in autofs_show_options()
Fix checkpatch.sh WARNING about the use of seq_printf() to print simple
strings in autofs_show_options(), use seq_puts() in this case.

Link: http://lkml.kernel.org/r/154889012613.4863.12231175554744203482.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent
60d6d04ca3 autofs: add ignore mount option
Add an autofs file system mount option that can be used to provide a
generic indicator to applications that the mount entry should be ignored
when displaying mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list a no-op mount option ("ignore" is
the one use on the most common OS) is allowed so that autofs file system
users can optionally use it.

The idea is that it be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

Prior to the change to link /etc/mtab to /proc/self/mounts all I needed
to do to achieve this was to use mount(2) and not update the mtab but
now that no longer works.

I know the symlinking happened a long time ago and I considered doing
this then but, at the time I couldn't remember the commonly used option
name and thought persuading the various utility maintainers would be too
hard.

But now I have a RHEL request to do this for compatibility for a widely
used product so I want to go ahead with it and try and enlist the help
of some utility package maintainers.

Clearly, without the option nothing can be done so it's at least a
start.

Link: http://lkml.kernel.org/r/154725123970.11260.6113771566924907275.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
49ac981965 fs/binfmt_elf.c: spread const a little
Link: http://lkml.kernel.org/r/20190204202830.GC27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
93f044e282 fs/binfmt_elf.c: use list_for_each_entry()
[adobriyan@gmail.com: fixup compilation]
  Link: http://lkml.kernel.org/r/20190205064334.GA2152@avx2
Link: http://lkml.kernel.org/r/20190204202800.GB27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
faf1c31520 fs/binfmt_elf.c: don't be afraid of overflow
Number of ELF program headers is 16-bit by spec, so total size
comfortably fits into "unsigned int".

Space savings: 7 bytes!

	add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-7 (-7)
	Function                                     old     new   delta
	load_elf_phdrs                               137     130      -7

Link: http://lkml.kernel.org/r/20190204202715.GA27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
a218cc4914 epoll: use rwlock in order to reduce ep_poll_callback() contention
The goal of this patch is to reduce contention of ep_poll_callback()
which can be called concurrently from different CPUs in case of high
events rates and many fds per epoll.  Problem can be very well
reproduced by generating events (write to pipe or eventfd) from many
threads, while consumer thread does polling.  In other words this patch
increases the bandwidth of events which can be delivered from sources to
the poller by adding poll items in a lockless way to the list.

The main change is in replacement of the spinlock with a rwlock, which
is taken on read in ep_poll_callback(), and then by adding poll items to
the tail of the list using xchg atomic instruction.  Write lock is taken
everywhere else in order to stop list modifications and guarantee that
list updates are fully completed (I assume that write side of a rwlock
does not starve, it seems qrwlock implementation has these guarantees).

The following are some microbenchmark results based on the test [1]
which starts threads which generate N events each.  The test ends when
all events are successfully fetched by the poller thread:

 spinlock
 ========

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268

 rwlock + xchg
 =============

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to the results bandwidth of delivered events is significantly
increased, thus execution time is reduced.

This patch was tested with different sort of microbenchmarks and
artificial delays (e.g.  "udelay(get_random_int() & 0xff)") introduced
in kernel on paths where items are added to lists.

[1] https://github.com/rouming/test-tools/blob/master/stress-epoll.c

Link: http://lkml.kernel.org/r/20190103150104.17128-5-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
c3e320b615 epoll: unify awaking of wakeup source on ep_poll_callback() path
Original comment "Activate ep->ws since epi->ws may get deactivated at
any time" indeed sounds loud, but it is incorrect, because the path
where we check epi->ws is a path where insert to ovflist happens, i.e.
ep_scan_ready_list() has taken ep->mtx and waits for this callback to
finish, thus ep_modify() (which unregisters wakeup source) waits for
ep_scan_ready_list().

Here in this patch I simply call ep_pm_stay_awake_rcu(), which is a bit
extra for this path (indirectly protected by main ep->mtx, so even rcu
is not needed), but I do not want to create another naked
__ep_pm_stay_awake() variant only for this particular case, so rcu variant
is just better for all the cases.

Link: http://lkml.kernel.org/r/20190103150104.17128-4-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
c141175d01 epoll: make sure all elements in ready list are in FIFO order
Patch series "use rwlock in order to reduce ep_poll_callback()
contention", v3.

The last patch targets the contention problem in ep_poll_callback(),
which can be very well reproduced by generating events (write to pipe or
eventfd) from many threads, while consumer thread does polling.

The following are some microbenchmark results based on the test [1]
which starts threads which generate N events each.  The test ends when
all events are successfully fetched by the poller thread:

 spinlock
 ========

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268

 rwlock + xchg
 =============

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to the results bandwidth of delivered events is significantly
increased, thus execution time is reduced.

This patch (of 4):

All coming events are stored in FIFO order and this is also should be
applicable to ->ovflist, which originally is stack, i.e.  LIFO.

Thus to keep correct FIFO order ->ovflist should reversed by adding
elements to the head of the read list but not to the tail.

Link: http://lkml.kernel.org/r/20190103150104.17128-2-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Rasmus Villemoes
afe1a715e8 btrfs: implement btrfs_debug* in terms of helper macro
First, the btrfs_debug macros open-code (one possible definition of)
DYNAMIC_DEBUG_BRANCH, so they don't benefit from the CONFIG_JUMP_LABEL
optimization.

Second, a planned change of struct _ddebug (to reduce its size on 64 bit
machines) requires that all descriptors in a translation unit use
distinct identifiers.

Using the new _dynamic_func_call_no_desc helper macro from
dynamic_debug.h takes care of both of these.  No functional change.

Link: http://lkml.kernel.org/r/20190212214150.4807-12-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Jason Baron <jbaron@akamai.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:00 -08:00
Rasmus Villemoes
f1fffbd447 linux/fs.h: move member alignment check next to definition of struct filename
Instead of doing this compile-time check in some slightly arbitrary user
of struct filename, put it next to the definition.

Link: http://lkml.kernel.org/r/20190208203015.29702-3-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:31:59 -08:00
Linus Torvalds
be37f21a08 audit/stable-5.1 PR 20190305
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlx+8ZgUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOlDhAAiGlirQ9syyG2fYzaARZZ2QoU/GGD
 PSAeiNmP3jvJzXArCvugRCw+YSNDdQOBM3SrLQC+cM0MAIDRYXN0NdcrsbTchlMA
 51Fx1egZ9Fyj+Ehgida3muh2lRUy7DQwMCL6tAVqwz7vYkSTGDUf+MlYqOqXDka5
 74pEExOS3Jdi7560BsE8b6QoW9JIJqEJnirXGkG9o2qC0oFHCR6PKxIyQ7TJrLR1
 F23aFTqLTH1nbPUQjnox2PTf13iQVh4j2gwzd+9c9KBfxoGSge3dmxId7BJHy2aG
 M27fPdCYTNZAGWpPVujsCPAh1WPQ9NQqg3mA9+g14PEbiLqPcqU+kWmnDU7T7bEw
 Qx0kt6Y8GiknwCqq8pDbKYclgRmOjSGdfutzd0z8uDpbaeunS4/NqnDb/FUaDVcr
 jA4d6ep7qEgHpYbL8KgOeZCexfaTfz6mcwRWNq3Uu9cLZbZqSSQ7PXolMADHvoRs
 LS7VH2jcP7q4p4GWmdfjv67xyUUo9HG5HHX74h5pLfQSYXiBWo4ht0UOAzX/6EcE
 CJNHAFHv+OanI5Rg/6JQ8b3/bJYxzAJVyLZpCuMtlKk6lYBGNeADk9BezEDIYsm8
 tSe4/GqqyR9+Qz8rSdpAZ0KKkfqS535IcHUPUJau7Bzg1xqSEP5gzZN6QsjdXg0+
 5wFFfdFICTfJFXo=
 =57/1
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "A lucky 13 audit patches for v5.1.

  Despite the rather large diffstat, most of the changes are from two
  bug fix patches that move code from one Kconfig option to another.

  Beyond that bit of churn, the remaining changes are largely cleanups
  and bug-fixes as we slowly march towards container auditing. It isn't
  all boring though, we do have a couple of new things: file
  capabilities v3 support, and expanded support for filtering on
  filesystems to solve problems with remote filesystems.

  All changes pass the audit-testsuite.  Please merge for v5.1"

* tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: mark expected switch fall-through
  audit: hide auditsc_get_stamp and audit_serial prototypes
  audit: join tty records to their syscall
  audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL
  audit: remove unused actx param from audit_rule_match
  audit: ignore fcaps on umount
  audit: clean up AUDITSYSCALL prototypes and stubs
  audit: more filter PATH records keyed on filesystem magic
  audit: add support for fcaps v3
  audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT
  audit: add syscall information to CONFIG_CHANGE records
  audit: hand taken context to audit_kill_trees for syscall logging
  audit: give a clue what CONFIG_CHANGE op was involved
2019-03-07 12:20:11 -08:00
Linus Torvalds
ae5906ceee Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

 - Extend LSM stacking to allow sharing of cred, file, ipc, inode, and
   task blobs. This paves the way for more full-featured LSMs to be
   merged, and is specifically aimed at LandLock and SARA LSMs. This
   work is from Casey and Kees.

 - There's a new LSM from Micah Morton: "SafeSetID gates the setid
   family of syscalls to restrict UID/GID transitions from a given
   UID/GID to only those approved by a system-wide whitelist." This
   feature is currently shipping in ChromeOS.

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (62 commits)
  keys: fix missing __user in KEYCTL_PKEY_QUERY
  LSM: Update list of SECURITYFS users in Kconfig
  LSM: Ignore "security=" when "lsm=" is specified
  LSM: Update function documentation for cap_capable
  security: mark expected switch fall-throughs and add a missing break
  tomoyo: Bump version.
  LSM: fix return value check in safesetid_init_securityfs()
  LSM: SafeSetID: add selftest
  LSM: SafeSetID: remove unused include
  LSM: SafeSetID: 'depend' on CONFIG_SECURITY
  LSM: Add 'name' field for SafeSetID in DEFINE_LSM
  LSM: add SafeSetID module that gates setid calls
  LSM: add SafeSetID module that gates setid calls
  tomoyo: Allow multiple use_group lines.
  tomoyo: Coding style fix.
  tomoyo: Swicth from cred->security to task_struct->security.
  security: keys: annotate implicit fall throughs
  security: keys: annotate implicit fall throughs
  security: keys: annotate implicit fall through
  capabilities:: annotate implicit fall through
  ...
2019-03-07 11:44:01 -08:00
Linus Torvalds
9e1fd794cb Changes for Linux 5.1:
- Fix online fsck to handle inode btrees correctly on 64k block
   filesystems.
 - Teach online fsck to check directory and attribute names for invalid
   characters.
 - Miscellanous fixes for online fsck.
 - Introduce a new panic mask so that we can halt immediately on
   metadata corruption (for debugging purposes)
 - Fix a block mapping race during writeback.
 - Cache unlinked inode list backrefs in memory to speed up list
   processing.
 - Separate the bnobt/cntbt and inobt/finobt buffer verifiers so that we
   can detect crosslinked btrees.
 - Refactor magic number verification so that we can standardize it.
 - Strengthen ondisk metadata structure offset build time verification.
 - Fix a memory corruption problem in the listxattr code.
 - Fix a shutdown problem during log recovery due to unreserved finobt
   expansion.
 - Fix a referential integrity problem where O_TMPFILE inodes were put on
   the unlinked list with nlink > 0 which would cause asserts during log
   recovery if the system went down immediately.
 - Refactor the delayed allocation allocator to be more clever about the
   possibility that its mapping might be stale.
 - Various fixes to the copy on write mechanism.
 - Make CoW preallocation suitable for use even with writes that wouldn't
   otherwise require it.
 - Refactor an internal API.
 - Fix some statx implementation bugs.
 - Fix miscellaneous compiler and static checker complaints.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlx5ZaIACgkQ+H93GTRK
 tOvZnhAApQ7bEsJIovMhW36yh/pXef9iKk2xAv6Kwn9jWEysQH+vlBvwEnaJTb0E
 8WBYm0Jv1DK8A6luSLmWbCIWB+RMkwhufXKadaTBtL8ySWq8uf776e9dVQflRyNU
 pN5k+JwQEfVDw8beAkUhr8yrZODv6iQi7/e2O0ARuf6CeNvX5ZR7OTjPdW3vX/zu
 F0t5UOegd2kGq+kkmIfDpAlQ8ju5U6P8sjImbXbjiTtSBydqkadgx3rCfa7MdD7u
 Y3Zauzba9PmJyJQ71JNsXgle58Ojq57LwWsRL/XEufy5B1upZsDzfDYTlAsDIkPJ
 pcG3YGPuGrBG/VX12XHmHqxt6DeW3S+GMVVxx7xBpSVDiSGeOazugLrAXk83NVzo
 eaUM96VVJO1lZArLxieZ7EBcuuNjW4PYYCOmUobGz/kxAzjubJU3dRMRLHiG5A1H
 TB2p0WlIP8cSxboF+SK8mhBmYDa1w16TCHQ6d6i9IFqcrmSc0JUvE3gy1UJzHhMb
 FESHasLTz9NTRhnCY3QjOYcT9Sa/s22+46+yZ8xBADG14N00dsYnCC5bbR/ALI09
 9RNawcDvzN6p9pcJMuRImWlXx5dpNjIAr/bEp+xZK1tnsHg8HKHub61e2x6PhmH8
 COJTErXV01IDRSVdQv6rFSE38/FACYppjzNyUdE7VydyKSoxBsM=
 =BAFq
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.1-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "Here are a number of new features and bug fixes for 5.1

  They've undergone a week's worth of fstesting and merge cleanly with
  master as of this morning

  Most of the changes center on improving metadata validation and fixing
  problems with online fsck, though there's also a new cache to speed up
  unlinked inode handling and cleanup of the copy on write code in
  preparation for future features

  Changes for Linux 5.1:

   - Fix online fsck to handle inode btrees correctly on 64k block
     filesystems

   - Teach online fsck to check directory and attribute names for
     invalid characters

   - Miscellanous fixes for online fsck

   - Introduce a new panic mask so that we can halt immediately on
     metadata corruption (for debugging purposes)

   - Fix a block mapping race during writeback

   - Cache unlinked inode list backrefs in memory to speed up list
     processing

   - Separate the bnobt/cntbt and inobt/finobt buffer verifiers so that
     we can detect crosslinked btrees

   - Refactor magic number verification so that we can standardize it

   - Strengthen ondisk metadata structure offset build time verification

   - Fix a memory corruption problem in the listxattr code

   - Fix a shutdown problem during log recovery due to unreserved finobt
     expansion

   - Fix a referential integrity problem where O_TMPFILE inodes were put
     on the unlinked list with nlink > 0 which would cause asserts
     during log recovery if the system went down immediately

   - Refactor the delayed allocation allocator to be more clever about
     the possibility that its mapping might be stale

   - Various fixes to the copy on write mechanism

   - Make CoW preallocation suitable for use even with writes that
     wouldn't otherwise require it

   - Refactor an internal API

   - Fix some statx implementation bugs

   - Fix miscellaneous compiler and static checker complaints"

* tag 'xfs-5.1-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (70 commits)
  xfs: fix reporting supported extra file attributes for statx()
  xfs: fix backwards endian conversion in scrub
  xfs: fix uninitialized error variables
  xfs: rework breaking of shared extents in xfs_file_iomap_begin
  xfs: don't pass iomap flags to xfs_reflink_allocate_cow
  xfs: fix uninitialized error variable
  xfs: introduce an always_cow mode
  xfs: report IOMAP_F_SHARED from xfs_file_iomap_begin_delay
  xfs: make COW fork unwritten extent conversions more robust
  xfs: merge COW handling into xfs_file_iomap_begin_delay
  xfs: also truncate holes covered by COW blocks
  xfs: don't use delalloc extents for COW on files with extsize hints
  xfs: fix SEEK_DATA for speculative COW fork preallocation
  xfs: make xfs_bmbt_to_iomap more useful
  xfs: fix xfs_buf magic number endian checks
  xfs: retry COW fork delalloc conversion when no extent was found
  xfs: remove the truncate short cut in xfs_map_blocks
  xfs: move xfs_iomap_write_allocate to xfs_aops.c
  xfs: move stat accounting to xfs_bmapi_convert_delalloc
  xfs: move transaction handling to xfs_bmapi_convert_delalloc
  ..
2019-03-07 09:38:51 -08:00
Linus Torvalds
b1e243957e for-5.1-part1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlx9czQACgkQxWXV+ddt
 WDvC9w/8CxJf1/eZBqb+b+aA38kgZhoaNixMud/IW/IFmIlicX0PoDxk6dh1ZTA+
 3uej/7fyfwjNCVvtrPVVxdT8zhZgyJouHrbhG1PlDWtmTEV2VqV5pBG1xQtCwmZy
 oinQI5oYYM5Le5EXxRGH8TQs6Z3tFuLx2kcrVWBLFKoZ2kZBZxe6KykGF9izve4a
 sVjtOL1CEL1e00vrNLzUmch8qss9Cu0i3qd3k8UANp3SgKIaOkJt4S/HeEcLfy5J
 kf6hVKlgPDuakVtAJKyhbLVQsfHVNkfiyvplta9lDot/iJchJITTRkadP6LblVeo
 knl8V+VO9kzQUvGauxtu66Q3DJ/7mqbzHUwPISetdKCV9ZXkuPFHnu0AEP577mrx
 e1JAPA/a8lF3up5QhqIb0uzH3sczOd8nNN/b1Xnxl7Kogyl8SUjhmX3FFy88borj
 /8Ptv/fFMQZs9IJ0QWlkh5TKRXAtSNAzVy2FpkvLaO0k0gJKQjyJuTKV5ezv/PGU
 +4m5kDtfpyz//KAOZxq4lERj4EMIEDhHhNbA8Qqmdeoj7oaKZ+gW57enOXohCTbi
 gVE6xDr2u4oQ85j3JuQo5W5mZA4uza35Gh4t43n5akdrrkLFVLW584hDxShGx9uS
 B0maToGbzOdGJTZXZ2SLHZ5Da14Lzb/TooCufF8GMAISb99vbjw=
 =D9zc
 -----END PGP SIGNATURE-----

Merge tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This contains usual mix of new features, core changes and fixes; full
  list below. I'm planning second pull request, with a few more fixes
  that arrived recently but too close to merge window, will send it next
  week.

  New features:

   - support zstd compression levels

   - new ioctl to unregister a device from the module (ie. reverse of
     device scan)

   - scrub prints a message to log when it's about to start or finish

  Core changes:

   - qgroups can now skip part of a tree that does not get updated
     during relocation, because this does not affect the quota
     accounting, estimated speedup in run time is about 20%

   - the compression workspace management had to be enhanced due to zstd
     requirements

   - various enospc fixes, when there's high fragmentation the
     over-reservation can cause ENOSPC that might not happen after a
     flush, in such cases try to wait if the situation improves

  Fixes:

   - various ioctls could overwrite previous return value if
     copy_to_user fails, fix this so the original error is reported

   - more reclaim vs GFP_KERNEL fixes

   - other cleanups and refactoring

   - fix a (valid) lockdep warning in a test when device replace is
     destroying worker threads

   - make qgroup async transaction commit more aggressive, this avoids
     some 'quota limit reached' errors if there are not enough data to
     trigger transaction in order to flush

   - fix deadlock between snapshot deletion and quotas when backref
     walking is called from context that already holds the same locks

   - fsync fixes:
      - fix fsync after succession of renames of different files
      - fix fsync after succession of renames and unlink/rmdir"

* tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (92 commits)
  btrfs: Remove unnecessary casts in btrfs_read_root_item
  Btrfs: remove assertion when searching for a key in a node/leaf
  Btrfs: add missing error handling after doing leaf/node binary search
  btrfs: drop the lock on error in btrfs_dev_replace_cancel
  btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
  btrfs: init csum_list before possible free
  Btrfs: remove no longer needed range length checks for deduplication
  Btrfs: fix fsync after succession of renames and unlink/rmdir
  Btrfs: fix fsync after succession of renames of different files
  btrfs: honor path->skip_locking in backref code
  btrfs: qgroup: Make qgroup async transaction commit more aggressive
  btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record
  btrfs: scrub: remove unused nocow worker pointer
  btrfs: scrub: add assertions for worker pointers
  btrfs: scrub: convert scrub_workers_refcnt to refcount_t
  btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get
  btrfs: scrub: fix circular locking dependency warning
  btrfs: fix comment its device list mutex not volume lock
  btrfs: extent_io: Kill the forward declaration of flush_write_bio
  btrfs: Fix grossly misleading argument names in extent io search
  ...
2019-03-07 09:07:30 -08:00
Linus Torvalds
0556161ff9 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlx5SzMACgkQnJ2qBz9k
 QNnP0AgAl9HDtk436P4QPPFhdeXBB6uRTYU8wgZSropXMxoyUzqZotWkeUqUusHs
 I8BAJeQojZeKUnMwET1/RA+dMLsgAlMcxkM3+3eDCLF9wh/+wO2B2G3ywh8KMVed
 Sa3C4V/NZvZzAossoGDV/yWmK+ZYrrW8l/DM3LU54GV1NfAL+Khn4FNwtgWiYiP5
 S4RkRflzwhIEZJSZByMlCLcsrHl/ehtMJuR1opUPY1c0CY8iAGcIobSSzVFqv4f5
 ScRB56rnXqTt6CBGBcDIkWxWqEE9XuTdAn1AVMC7327UezyFNocXUZaqDzJcGcR/
 rM3sf+seZftbA4nid8dIhSRoqAf4Yw==
 =/DQr
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fanotify updates from Jan Kara:
 "Support for fanotify directory events and changes to make waiting for
  fanotify permission event response killable"

* tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (25 commits)
  fanotify: Make waits for fanotify events only killable
  fanotify: Use interruptible wait when waiting for permission events
  fanotify: Track permission event state
  fanotify: Simplify cleaning of access_list
  fsnotify: Create function to remove event from notification list
  fanotify: Move locking inside get_one_event()
  fanotify: Fold dequeue_event() into process_access_response()
  fanotify: Select EXPORTFS
  fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID
  fanotify: add support for create/attrib/move/delete events
  fanotify: support events with data type FSNOTIFY_EVENT_INODE
  fanotify: check FS_ISDIR flag instead of d_is_dir()
  fsnotify: report FS_ISDIR flag with MOVE_SELF and DELETE_SELF events
  fanotify: use vfs_get_fsid() helper instead of vfs_statfs()
  vfs: add vfs_get_fsid() helper
  fanotify: cache fsid in fsnotify_mark_connector
  fanotify: enable FAN_REPORT_FID init flag
  fanotify: copy event fid info to user
  fanotify: encode file identifier for FAN_REPORT_FID
  fanotify: open code fill_event_metadata()
  ...
2019-03-07 09:03:38 -08:00
Linus Torvalds
a9913f23f3 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlx5SeAACgkQnJ2qBz9k
 QNlzLAf/c+1o6fd4mH9uBMqEHwo+g7cKcr76j00h60bZpMJ0N/k91o8KtUKDixLJ
 wG1o3FtaFyOpKXInjQOZZ83XQybjpDDFO67pCss/OYZ9bHtWM6ZfrYQzpxpIXu2E
 7/FFjZV7MlugmnqJbvYRvMr2Tx7IrqOeWZ0ZIUMRnghuBarLpbiOqFaTbGlqS8e1
 haFRhbxv0sA44YN9N40XVpg6P+cRsxJ4cHDSyQn4+X9CoYdKZ69utXyiiaV2L/Gc
 iNYn2fkh7IDkgxF8imwHSLhvvAVangWWphhTX/XVnCPq0FKTRw9e2tRdt77IlDlX
 w/GCHKnXaM6GnGDj4t83KV4yrdXGsQ==
 =xrvx
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull ext2 and udf fixes from Jan Kara:
 "A couple of fixes for udf and ext2. Namely:

   - fix making ext2 mountable (again) with 64k blocksize

   - fix for ext2 statx(2) handling

   - fix for udf handling of corrupted filesystem so that it doesn't get
     corrupted even further

   - couple smaller ext2 and udf cleanups"

* tag 'fs_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Drop pointless check from udf_sync_fs()
  ext2: support statx syscall
  udf: disallow RW mount without valid integrity descriptor
  udf: finalize integrity descriptor before writeback
  udf: factor out LVID finalization for reuse
  ext2: Fix underflow in ext2_max_size()
  ext2: Fix a typo in comment
  ext2: Remove redundant check for finding no group
  ext2: Annotate implicit fall through in __ext2_truncate_blocks
  ext2: Set superblock revision when enabling xattr feature
  ext2: Remove redundant check on s_inode_size
  ext2: set proper return code
2019-03-07 09:01:33 -08:00
Linus Torvalds
b39a07a5e0 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlx5R3AACgkQnJ2qBz9k
 QNlrLQf/f8puq1PgwvxuxnZATtKBWA0O84YCkIvf18LV9GsOIaYGBVOhpd3CNZ0u
 WFKKaWxmrWlHtjKb43mAnZbGDLBE7uJmBe3CweIxg/Dgl3i0zvcI1Sz2vgyD3g+Q
 cSW8KF8mmG53ltSpQV2NzQOSwtAGuBGfJt9b9aZ25Xl+Tpoq3PlRGNfA8oyVsL+f
 iZeiJ9UxB4eRBhO0fEqhpyW1ZvNLoHF1U1qhJaVLK85tBnAAGvRQtlP1n4gFNNXP
 /+Hhb0khunkhH5uXrXxYpxp5AX8mciqT28d0PPaFUxHIa4PDtgMDZoTkIjgFCusk
 SqiL6TkPOovAG/27rBTH14L2ZMf7bw==
 =8mdR
 -----END PGP SIGNATURE-----

Merge tag 'dtype_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull dtype handling cleanups from Jan Kara:
 "A reworked dtype cleanup patches based on your feedback to the
  previous version of these.

  Again the series includes only the generic code and ext2 cleanup as a
  sample. The plan is to push cleanups for other filesystems separately
  through respective trees once the generic code lands to reduce the
  number of conflicts"

* tag 'dtype_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: use common file type conversion
  fs: common implementation of file type
2019-03-07 08:23:17 -08:00
Linus Torvalds
e431f2d74e Driver core patches for 5.1-rc1
Here is the big driver core patchset for 5.1-rc1
 
 More patches than "normal" here this merge window, due to some work in
 the driver core by Alexander Duyck to rework the async probe
 functionality to work better for a number of devices, and independant
 work from Rafael for the device link functionality to make it work
 "correctly".
 
 Also in here is:
 	- lots of BUS_ATTR() removals, the macro is about to go away
 	- firmware test fixups
 	- ihex fixups and simplification
 	- component additions (also includes i915 patches)
 	- lots of minor coding style fixups and cleanups.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+euQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynyTgCfbV8CLums843sBnT8NnWrTMTdTCcAn1K4re0m
 ep8g+6oRLxJy414hogxQ
 =bLs2
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big driver core patchset for 5.1-rc1

  More patches than "normal" here this merge window, due to some work in
  the driver core by Alexander Duyck to rework the async probe
  functionality to work better for a number of devices, and independant
  work from Rafael for the device link functionality to make it work
  "correctly".

  Also in here is:

   - lots of BUS_ATTR() removals, the macro is about to go away

   - firmware test fixups

   - ihex fixups and simplification

   - component additions (also includes i915 patches)

   - lots of minor coding style fixups and cleanups.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (65 commits)
  driver core: platform: remove misleading err_alloc label
  platform: set of_node in platform_device_register_full()
  firmware: hardcode the debug message for -ENOENT
  driver core: Add missing description of new struct device_link field
  driver core: Fix PM-runtime for links added during consumer probe
  drivers/component: kerneldoc polish
  async: Add cmdline option to specify drivers to be async probed
  driver core: Fix possible supplier PM-usage counter imbalance
  PM-runtime: Fix __pm_runtime_set_status() race with runtime resume
  driver: platform: Support parsing GpioInt 0 in platform_get_irq()
  selftests: firmware: fix verify_reqs() return value
  Revert "selftests: firmware: remove use of non-standard diff -Z option"
  Revert "selftests: firmware: add CONFIG_FW_LOADER_USER_HELPER_FALLBACK to config"
  device: Fix comment for driver_data in struct device
  kernfs: Allocating memory for kernfs_iattrs with kmem_cache.
  sysfs: remove unused include of kernfs-internal.h
  driver core: Postpone DMA tear-down until after devres release
  driver core: Document limitation related to DL_FLAG_RPM_ACTIVE
  PM-runtime: Take suppliers into account in __pm_runtime_set_status()
  device.h: Add __cold to dev_<level> logging functions
  ...
2019-03-06 14:52:48 -08:00
Linus Torvalds
8dcd175bc3 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
  tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
  proc: more robust bulk read test
  proc: test /proc/*/maps, smaps, smaps_rollup, statm
  proc: use seq_puts() everywhere
  proc: read kernel cpu stat pointer once
  proc: remove unused argument in proc_pid_lookup()
  fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
  fs/proc/self.c: code cleanup for proc_setup_self()
  proc: return exit code 4 for skipped tests
  mm,mremap: bail out earlier in mremap_to under map pressure
  mm/sparse: fix a bad comparison
  mm/memory.c: do_fault: avoid usage of stale vm_area_struct
  writeback: fix inode cgroup switching comment
  mm/huge_memory.c: fix "orig_pud" set but not used
  mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
  mm/memcontrol.c: fix bad line in comment
  mm/cma.c: cma_declare_contiguous: correct err handling
  mm/page_ext.c: fix an imbalance with kmemleak
  mm/compaction: pass pgdat to too_many_isolated() instead of zone
  mm: remove zone_lru_lock() function, access ->lru_lock directly
  ...
2019-03-06 10:31:36 -08:00
Linus Torvalds
45802da05e Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - refcount conversions

   - Solve the rq->leaf_cfs_rq_list can of worms for real.

   - improve power-aware scheduling

   - add sysctl knob for Energy Aware Scheduling

   - documentation updates

   - misc other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  kthread: Do not use TIMER_IRQSAFE
  kthread: Convert worker lock to raw spinlock
  sched/fair: Use non-atomic cpumask_{set,clear}_cpu()
  sched/fair: Remove unused 'sd' parameter from select_idle_smt()
  sched/wait: Use freezable_schedule() when possible
  sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
  sched/fair: Explain LLC nohz kick condition
  sched/fair: Simplify nohz_balancer_kick()
  sched/topology: Fix percpu data types in struct sd_data & struct s_data
  sched/fair: Simplify post_init_entity_util_avg() by calling it with a task_struct pointer argument
  sched/fair: Fix O(nr_cgroups) in the load balancing path
  sched/fair: Optimize update_blocked_averages()
  sched/fair: Fix insertion in rq->leaf_cfs_rq_list
  sched/fair: Add tmp_alone_branch assertion
  sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock()
  sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
  sched/pelt: Skip updating util_est when utilization is higher than CPU's capacity
  sched/fair: Update scale invariance of PELT
  sched/fair: Move the rq_of() helper function
  sched/core: Convert task_struct.stack_refcount to refcount_t
  ...
2019-03-06 08:14:05 -08:00
Linus Torvalds
3478588b51 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The biggest part of this tree is the new auto-generated atomics API
  wrappers by Mark Rutland.

  The primary motivation was to allow instrumentation without uglifying
  the primary source code.

  The linecount increase comes from adding the auto-generated files to
  the Git space as well:

    include/asm-generic/atomic-instrumented.h     | 1689 ++++++++++++++++--
    include/asm-generic/atomic-long.h             | 1174 ++++++++++---
    include/linux/atomic-fallback.h               | 2295 +++++++++++++++++++++++++
    include/linux/atomic.h                        | 1241 +------------

  I preferred this approach, so that the full call stack of the (already
  complex) locking APIs is still fully visible in 'git grep'.

  But if this is excessive we could certainly hide them.

  There's a separate build-time mechanism to determine whether the
  headers are out of date (they should never be stale if we do our job
  right).

  Anyway, nothing from this should be visible to regular kernel
  developers.

  Other changes:

   - Add support for dynamic keys, which removes a source of false
     positives in the workqueue code, among other things (Bart Van
     Assche)

   - Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)

   - qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)

   - misc other updates and enhancements"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  locking/lockdep: Shrink struct lock_class_key
  locking/lockdep: Add module_param to enable consistency checks
  lockdep/lib/tests: Test dynamic key registration
  lockdep/lib/tests: Fix run_tests.sh
  kernel/workqueue: Use dynamic lockdep keys for workqueues
  locking/lockdep: Add support for dynamic keys
  locking/lockdep: Verify whether lock objects are small enough to be used as class keys
  locking/lockdep: Check data structure consistency
  locking/lockdep: Reuse lock chains that have been freed
  locking/lockdep: Fix a comment in add_chain_cache()
  locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
  locking/lockdep: Reuse list entries that are no longer in use
  locking/lockdep: Free lock classes that are no longer in use
  locking/lockdep: Update two outdated comments
  locking/lockdep: Make it easy to detect whether or not inside a selftest
  locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
  locking/lockdep: Initialize the locks_before and locks_after lists earlier
  locking/lockdep: Make zap_class() remove all matching lock order entries
  locking/lockdep: Reorder struct lock_class members
  locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
  ...
2019-03-06 07:17:17 -08:00
Alexey Dobriyan
08b5577513 proc: use seq_puts() everywhere
seq_printf() without format specifiers == faster seq_puts()

Link: http://lkml.kernel.org/r/20190114200545.GC9680@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:22 -08:00
Alexey Dobriyan
5713f35c05 proc: read kernel cpu stat pointer once
Help gcc generate better code:

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../vmlinux-001
	add/remove: 2/2 grow/shrink: 0/1 up/down: 92/-142 (-50)
	Function                                     old     new   delta
	get_iowait_time.isra                           -      46     +46
	get_idle_time.isra                             -      46     +46
	show_stat                                   1489    1477     -12
	get_iowait_time                               65       -     -65
	get_idle_time                                 65       -     -65

Link: http://lkml.kernel.org/r/20190114195907.GA9680@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
Zhikang Zhang
867aaccf1f proc: remove unused argument in proc_pid_lookup()
[adobriyan@gmail.com: delete "extern" from prototype]
Link: http://lkml.kernel.org/r/20190114195635.GA9372@avx2
Signed-off-by: Zhikang Zhang <zhangzhikang1@huawei.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
Chengguang Xu
45f68ab502 fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
Remove unnecessary ERR_PTR()/PTR_ERR() cast in proc_setup_thread_self().

Link: http://lkml.kernel.org/r/20190124030150.8472-2-cgxu519@gmx.com
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
Chengguang Xu
756ca74c7f fs/proc/self.c: code cleanup for proc_setup_self()
Remove unnecessary ERR_PTR()/PTR_ERR() cast in proc_setup_self().

Link: http://lkml.kernel.org/r/20190124030150.8472-1-cgxu519@gmx.com
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
Joel Fernandes (Google)
ab3948f58f mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd
Android uses ashmem for sharing memory regions.  We are looking forward
to migrating all usecases of ashmem to memfd so that we can possibly
remove the ashmem driver in the future from staging while also
benefiting from using memfd and contributing to it.  Note staging
drivers are also not ABI and generally can be removed at anytime.

One of the main usecases Android has is the ability to create a region
and mmap it as writeable, then add protection against making any
"future" writes while keeping the existing already mmap'ed
writeable-region active.  This allows us to implement a usecase where
receivers of the shared memory buffer can get a read-only view, while
the sender continues to write to the buffer.  See CursorWindow
documentation in Android for more details:

  https://developer.android.com/reference/android/database/CursorWindow

This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
which prevents any future mmap and write syscalls from succeeding while
keeping the existing mmap active.

A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week
where we don't need to modify core VFS structures to get the same
behavior of the seal.  This solves several side-effects pointed by Andy.
self-tests are provided in later patch to verify the expected semantics.

[1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/

Thanks a lot to Andy for suggestions to improve code.

Link: http://lkml.kernel.org/r/20190112203816.85534-2-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Marc-Andr Lureau <marcandre.lureau@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:19 -08:00
Aneesh Kumar K.V
04a8645304 mm: update ptep_modify_prot_commit to take old pte value as arg
Architectures like ppc64 require to do a conditional tlb flush based on
the old and new value of pte.  Enable that by passing old pte value as
the arg.

Link: http://lkml.kernel.org/r/20190116085035.29729-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V
0cbe3e26ab mm: update ptep_modify_prot_start/commit to take vm_area_struct as arg
Patch series "NestMMU pte upgrade workaround for mprotect", v5.

We can upgrade pte access (R -> RW transition) via mprotect.  We need to
make sure we follow the recommended pte update sequence as outlined in
commit bd5050e38a ("powerpc/mm/radix: Change pte relax sequence to
handle nest MMU hang") for such updates.  This patch series does that.

This patch (of 5):

Some architectures may want to call flush_tlb_range from these helpers.

Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Johannes Weiner
147e1a97c4 fs: kernfs: add poll file operation
Patch series "psi: pressure stall monitors", v3.

Android is adopting psi to detect and remedy memory pressure that
results in stuttering and decreased responsiveness on mobile devices.

Psi gives us the stall information, but because we're dealing with
latencies in the millisecond range, periodically reading the pressure
files to detect stalls in a timely fashion is not feasible.  Psi also
doesn't aggregate its averages at a high enough frequency right now.

This patch series extends the psi interface such that users can
configure sensitive latency thresholds and use poll() and friends to be
notified when these are breached.

As high-frequency aggregation is costly, it implements an aggregation
method that is optimized for fast, short-interval averaging, and makes
the aggregation frequency adaptive, such that high-frequency updates
only happen while monitored stall events are actively occurring.

With these patches applied, Android can monitor for, and ward off,
mounting memory shortages before they cause problems for the user.  For
example, using memory stall monitors in userspace low memory killer
daemon (lmkd) we can detect mounting pressure and kill less important
processes before device becomes visibly sluggish.

In our memory stress testing psi memory monitors produce roughly 10x
less false positives compared to vmpressure signals.  Having ability to
specify multiple triggers for the same psi metric allows other parts of
Android framework to monitor memory state of the device and act
accordingly.

The new interface is straightforward.  The user opens one of the
pressure files for writing and writes a trigger description into the
file descriptor that defines the stall state - some or full, and the
maximum stall time over a given window of time.  E.g.:

        /* Signal when stall time exceeds 100ms of a 1s window */
        char trigger[] = "full 100000 1000000";
        fd = open("/proc/pressure/memory");
        write(fd, trigger, sizeof(trigger));
        while (poll() >= 0) {
                ...
        }
        close(fd);

When the monitored stall state is entered, psi adapts its aggregation
frequency according to what the configured time window requires in order
to emit event signals in a timely fashion.  Once the stalling subsides,
aggregation reverts back to normal.

The trigger is associated with the open file descriptor.  To stop
monitoring, the user only needs to close the file descriptor and the
trigger is discarded.

Patches 1-4 prepare the psi code for polling support.  Patch 5
implements the adaptive polling logic, the pressure growth detection
optimized for short intervals, and hooks up write() and poll() on the
pressure files.

The patches were developed in collaboration with Johannes Weiner.

This patch (of 5):

Kernfs has a standardized poll/notification mechanism for waking all
pollers on all fds when a filesystem node changes.  To allow polling for
custom events, add a .poll callback that can override the default.

This is in preparation for pollable cgroup pressure files which have
per-fd trigger configurations.

Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:17 -08:00
Shakeel Butt
60cd4bcd62 memcg: localize memcg_kmem_enabled() check
Move the memcg_kmem_enabled() checks into memcg kmem charge/uncharge
functions, so, the users don't have to explicitly check that condition.

This is purely code cleanup patch without any functional change.  Only
the order of checks in memcg_charge_slab() can potentially be changed
but the functionally it will be same.  This should not matter as
memcg_charge_slab() is not in the hot path.

Link: http://lkml.kernel.org/r/20190103161203.162375-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:15 -08:00
David Hildenbrand
ca215086b1 mm: convert PG_balloon to PG_offline
PG_balloon was introduced to implement page migration/compaction for
pages inflated in virtio-balloon.  Nowadays, it is only a marker that a
page is part of virtio-balloon and therefore logically offline.

We also want to make use of this flag in other balloon drivers - for
inflated pages or when onlining a section but keeping some pages offline
(e.g.  used right now by XEN and Hyper-V via set_online_page_callback()).

We are going to expose this flag to dump tools like makedumpfile.  But
instead of exposing PG_balloon, let's generalize the concept of marking
pages as logically offline, so it can be reused for other purposes later
on.

Rename PG_balloon to PG_offline.  This is an indicator that the page is
logically offline, the content stale and that it should not be touched
(e.g.  a hypervisor would have to allocate backing storage in order for
the guest to dump an unused page).  We can then e.g.  exclude such pages
from dumps.

We replace and reuse KPF_BALLOON (23), as this shouldn't really harm
(and for now the semantics stay the same).  In following patches, we
will make use of this bit also in other balloon drivers.  While at it,
document PGTABLE.

[akpm@linux-foundation.org: fix comment text, per David]
Link: http://lkml.kernel.org/r/20181119101616.8901-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Pankaj gupta <pagupta@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Shuriyc Chu
5704a06810 fs/file.c: initialize init_files.resize_wait
(Taken from https://bugzilla.kernel.org/show_bug.cgi?id=200647)

'get_unused_fd_flags' in kthread cause kernel crash.  It works fine on
4.1, but causes crash after get 64 fds.  It also cause crash on
ubuntu1404/1604/1804, centos7.5, and the crash messages are almost the
same.

The crash message on centos7.5 shows below:

  start fd 61
  start fd 62
  start fd 63
  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: __wake_up_common+0x2e/0x90
  PGD 0
  Oops: 0000 [#1] SMP
  Modules linked in: test(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 joydev ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi virtio_console virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ata_piix serio_raw libata virtio_pci virtio_ring i2c_core
   virtio floppy dm_mirror dm_region_hash dm_log dm_mod
  CPU: 2 PID: 1820 Comm: test_fd Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.3.3.el7.x86_64 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
  task: ffff8e92b9431fa0 ti: ffff8e94247a0000 task.ti: ffff8e94247a0000
  RIP: 0010:__wake_up_common+0x2e/0x90
  RSP: 0018:ffff8e94247a2d18  EFLAGS: 00010086
  RAX: 0000000000000000 RBX: ffffffff9d09daa0 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff9d09daa0
  RBP: ffff8e94247a2d50 R08: 0000000000000000 R09: ffff8e92b95dfda8
  R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9d09daa8
  R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff8e9434e80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000017c686000 CR4: 00000000000207e0
  Call Trace:
    __wake_up+0x39/0x50
    expand_files+0x131/0x250
    __alloc_fd+0x47/0x170
    get_unused_fd_flags+0x30/0x40
    test_fd+0x12a/0x1c0 [test]
    kthread+0xd1/0xe0
    ret_from_fork_nospec_begin+0x21/0x21
  Code: 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 49 83 c4 08 53 48 83 ec 10 48 8b 47 08 89 55 cc 4c 89 45 d0 <48> 8b 08 49 39 c4 48 8d 78 e8 4c 8d 69 e8 75 08 eb 3b 4c 89 ef
  RIP   __wake_up_common+0x2e/0x90
   RSP <ffff8e94247a2d18>
  CR2: 0000000000000000

This issue exists since CentOS 7.5 3.10.0-862 and CentOS 7.4
(3.10.0-693.21.1 ) is ok.  Root cause: the item 'resize_wait' is not
initialized before being used.

Reported-by: Richard Zhang <zhang.zijian@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Vineet Gupta
a905737fdd fs/inode.c: inode_set_flags(): replace opencoded set_mask_bits()
It seems that commits 5f16f3225b and 00a1a053eb, both with same
commitlog ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()")
introduced the set_mask_bits API, but somehow missed not using it in ext4
in the end.

Also, set_mask_bits() is used in fs quite a bit and we can possibly come
up with a generic llsc based implementation (w/o the cmpxchg loop)

Link: http://lkml.kernel.org/r/1548275584-18096-3-git-send-email-vgupta@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Gustavo A. R. Silva
f402cf03fc ocfs2: Use zero-sized array and struct_size() in kzalloc()
Update the code to use a zero-sized array instead of a pointer in
structure ocfs2_slot_info and use struct_size() in kzalloc().

Notice that one of the more common cases of allocation size calculations
is finding the size of a structure that has a zero-sized array at the
end, along with memory for some number of elements for that array.  For
example:

  struct foo {
      int stuff;
      void *entry[];
  };

  instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

  instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190108191903.GA22056@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Gang He
5500ab4ed3 ocfs2: fix the application IO timeout when fstrim is running
The user reported this problem, the upper application IO was timeout
when fstrim was running on this ocfs2 partition.  the application
monitoring resource agent considered that this application did not work,
then this node was fenced by the cluster brain (e.g.  pacemaker).

The root cause is that fstrim thread always holds main_bm meta-file
related locks until all the cluster groups are trimmed.  This patch will
make fstrim thread release main_bm meta-file related locks when each
cluster group is trimmed, this will let the current application IO has a
chance to claim the clusters from main_bm meta-file.

Link: http://lkml.kernel.org/r/20190111090014.31645-1-ghe@suse.com
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Jia Guo
cc725ef3cb ocfs2: fix a panic problem caused by o2cb_ctl
In the process of creating a node, it will cause NULL pointer
dereference in kernel if o2cb_ctl failed in the interval (mkdir,
o2cb_set_node_attribute(node_num)] in function o2cb_add_node.

The node num is initialized to 0 in function o2nm_node_group_make_item,
o2nm_node_group_drop_item will mistake the node number 0 for a valid
node number when we delete the node before the node number is set
correctly.  If the local node number of the current host happens to be
0, cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while
o2hb_thread still running.  The panic stack is generated as follows:

  o2hb_thread
      \-o2hb_do_disk_heartbeat
          \-o2hb_check_own_slot
              |-slot = &reg->hr_slots[o2nm_this_node()];
              //o2nm_this_node() return O2NM_INVALID_NODE_NUM

We need to check whether the node number is set when we delete the node.

Link: http://lkml.kernel.org/r/133d8045-72cc-863e-8eae-5013f9f6bc51@huawei.com
Signed-off-by: Jia Guo <guojia12@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Linus Torvalds
b1b988a6a0 Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull year 2038 updates from Thomas Gleixner:
 "Another round of changes to make the kernel ready for 2038. After lots
  of preparatory work this is the first set of syscalls which are 2038
  safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceiv_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

  The syscall numbers are identical all over the architectures"

* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  riscv: Use latest system call ABI
  checksyscalls: fix up mq_timedreceive and stat exceptions
  unicore32: Fix __ARCH_WANT_STAT64 definition
  asm-generic: Make time32 syscall numbers optional
  asm-generic: Drop getrlimit and setrlimit syscalls from default list
  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
  compat ABI: use non-compat openat and open_by_handle_at variants
  y2038: add 64-bit time_t syscalls to all 32-bit architectures
  y2038: rename old time and utime syscalls
  y2038: remove struct definition redirects
  y2038: use time32 syscall names on 32-bit
  syscalls: remove obsolete __IGNORE_ macros
  y2038: syscalls: rename y2038 compat syscalls
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  timex: change syscalls to use struct __kernel_timex
  timex: use __kernel_timex internally
  sparc64: add custom adjtimex/clock_adjtime functions
  time: fix sys_timer_settime prototype
  time: Add struct __kernel_timex
  time: make adjtime compat handling available for 32 bit
  ...
2019-03-05 14:08:26 -08:00
Linus Torvalds
78f8601354 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt departement delivers this time:

   - New infrastructure to manage NMIs on platforms which have a sane
     NMI delivery, i.e. identifiable NMI vectors instead of a single
     lump.

   - Simplification of the interrupt affinity management so drivers
     don't have to implement ugly loops around the PCI/MSI enablement.

   - Speedup for interrupt statistics in /proc/stat

   - Provide a function to retrieve the default irq domain

   - A new interrupt controller for the Loongson LS1X platform

   - Affinity support for the SiFive PLIC

   - Better support for the iMX irqsteer driver

   - NUMA aware memory allocations for GICv3

   - The usual small fixes, improvements and cleanups all over the
     place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  irqchip/imx-irqsteer: Add multi output interrupts support
  irqchip/imx-irqsteer: Change to use reg_num instead of irq_group
  dt-bindings: irq: imx-irqsteer: Add multi output interrupts support
  dt-binding: irq: imx-irqsteer: Use irq number instead of group number
  irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code
  irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables
  irqdomain: Allow the default irq domain to be retrieved
  irqchip/sifive-plic: Implement irq_set_affinity() for SMP host
  irqchip/sifive-plic: Differentiate between PLIC handler and context
  irqchip/sifive-plic: Add warning in plic_init() if handler already present
  irqchip/sifive-plic: Pre-compute context hart base and enable base
  PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets
  genirq/affinity: Remove the leftovers of the original set support
  nvme-pci: Simplify interrupt allocation
  genirq/affinity: Add new callback for (re)calculating interrupt sets
  genirq/affinity: Store interrupt sets size in struct irq_affinity
  genirq/affinity: Code consolidation
  irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.
  irqchip/i8259: Fix shutdown order by moving syscore_ops registration
  dt-bindings: interrupt-controller: loongson ls1x intc
  ...
2019-03-05 12:21:47 -08:00
Linus Torvalds
08300f4402 a.out: remove core dumping support
We're (finally) phasing out a.out support for good.  As Borislav Petkov
points out, we've supported ELF binaries for about 25 years by now, and
coredumping in particular has bitrotted over the years.

None of the tool chains even support generating a.out binaries any more,
and the plan is to deprecate a.out support entirely for the kernel.  But
I want to start with just removing the core dumping code, because I can
still imagine that somebody actually might want to support a.out as a
simpler biinary format.

Particularly if you generate some random binaries on the fly, ELF is a
much more complicated format (admittedly ELF also does have a lot of
toolchain support, mitigating that complexity a lot and you really
should have moved over in the last 25 years).

So it's at least somewhat possible that somebody out there has some
workflow that still involves generating and running a.out executables.

In contrast, it's very unlikely that anybody depends on debugging any
legacy a.out core files.  But regardless, I want this phase-out to be
done in two steps, so that we can resurrect a.out support (if needed)
without having to resurrect the core file dumping that is almost
certainly not needed.

Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
cut at this had missed.

And Alan Cox points out that the a.out binary loader _could_ be done in
user space if somebody wants to, but we might keep just the loader in
the kernel if somebody really wants it, since the loader isn't that big
and has no really odd special cases like the core dumping does.

Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 10:00:35 -08:00
Linus Torvalds
63bdf4284c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:
   - Add helper for simple skcipher modes.
   - Add helper to register multiple templates.
   - Set CRYPTO_TFM_NEED_KEY when setkey fails.
   - Require neither or both of export/import in shash.
   - AEAD decryption test vectors are now generated from encryption
     ones.
   - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
     fuzzing.

  Algorithms:
   - Conversions to skcipher and helper for many templates.
   - Add more test vectors for nhpoly1305 and adiantum.

  Drivers:
   - Add crypto4xx prng support.
   - Add xcbc/cmac/ecb support in caam.
   - Add AES support for Exynos5433 in s5p.
   - Remove sha384/sha512 from artpec7 as hardware cannot do partial
     hash"

[ There is a merge of the Freescale SoC tree in order to pull in changes
  required by patches to the caam/qi2 driver. ]

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
  crypto: s5p - add AES support for Exynos5433
  dt-bindings: crypto: document Exynos5433 SlimSSS
  crypto: crypto4xx - add missing of_node_put after of_device_is_available
  crypto: cavium/zip - fix collision with generic cra_driver_name
  crypto: af_alg - use struct_size() in sock_kfree_s()
  crypto: caam - remove redundant likely/unlikely annotation
  crypto: s5p - update iv after AES-CBC op end
  crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
  crypto: caam - generate hash keys in-place
  crypto: caam - fix DMA mapping xcbc key twice
  crypto: caam - fix hash context DMA unmap size
  hwrng: bcm2835 - fix probe as platform device
  crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
  crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
  crypto: chelsio - Fixed Traffic Stall
  crypto: marvell - Remove set but not used variable 'ivsize'
  crypto: ccp - Update driver messages to remove some confusion
  crypto: adiantum - add 1536 and 4096-byte test vectors
  crypto: nhpoly1305 - add a test vector with len % 16 != 0
  crypto: arm/aes-ce - update IV after partial final CTR block
  ...
2019-03-05 09:09:55 -08:00
Linus Torvalds
6456300356 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Here we go, another merge window full of networking and #ebpf changes:

   1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
      range without dealing with floods of ARP traffic, from Linus
      Lüssing.

   2) Throttle buffered multicast packet transmission in mt76, from
      Felix Fietkau.

   3) Support adaptive interrupt moderation in ice, from Brett Creeley.

   4) A lot of struct_size conversions, from Gustavo A. R. Silva.

   5) Add peek/push/pop commands to bpftool, as well as bash completion,
      from Stanislav Fomichev.

   6) Optimize sk_msg_clone(), from Vakul Garg.

   7) Add SO_BINDTOIFINDEX, from David Herrmann.

   8) Be more conservative with local resends due to local congestion,
      from Yuchung Cheng.

   9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.

  10) Add health buffer support to devlink, from Eran Ben Elisha.

  11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.

  12) Add statistics to basic packet scheduler filter, from Cong Wang.

  13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.

  14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.

  15) Add 3ad stats to bonding, from Nikolay Aleksandrov.

  16) Lots of probing improvements for bpftool, from Quentin Monnet.

  17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski.

  18) Allow #ebpf programs to access gso_segs from skb shared info, from
      Eric Dumazet.

  19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.

  20) Support 22260 iwlwifi devices, from Luca Coelho.

  21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.

  22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.

  23) Add spinlock support to #ebpf, from Alexei Starovoitov.

  24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.

  25) Add device infomation API to devlink, from Jakub Kicinski.

  26) Add new timestamping socket options which are y2038 safe, from
      Deepa Dinamani.

  27) Add RX checksum offloading for various sh_eth chips, from Sergei
      Shtylyov.

  28) Flow offload infrastructure, from Pablo Neira Ayuso.

  29) Numerous cleanups, improvements, and bug fixes to the PHY layer
      and many drivers from Heiner Kallweit.

  30) Lots of changes to try and make packet scheduler classifiers run
      lockless as much as possible, from Vlad Buslov.

  31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.

  32) Add concurrency tests to tc-tests infrastructure, from Vlad
      Buslov.

  33) Add hwmon support to aquantia, from Heiner Kallweit.

  34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.

  And I would be remiss if I didn't thank the various major networking
  subsystem maintainers for integrating much of this work before I even
  saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
  Johannes Berg, Kalle Valo, and many others. Thank you!"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
  net/sched: avoid unused-label warning
  net: ignore sysctl_devconf_inherit_init_net without SYSCTL
  phy: mdio-mux: fix Kconfig dependencies
  net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
  net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
  selftest/net: Remove duplicate header
  sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
  net/mlx5e: Update tx reporter status in case channels were successfully opened
  devlink: Add support for direct reporter health state update
  devlink: Update reporter state to error even if recover aborted
  sctp: call iov_iter_revert() after sending ABORT
  team: Free BPF filter when unregistering netdev
  ip6mr: Do not call __IP6_INC_STATS() from preemptible context
  isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
  net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
  cxgb4/chtls: Prefix adapter flags with CXGB4
  net-sysfs: Switch to bitmap_zalloc()
  mellanox: Switch to bitmap_zalloc()
  bpf: add test cases for non-pointer sanitiation logic
  mlxsw: i2c: Extend initialization by querying resources data
  ...
2019-03-05 08:26:13 -08:00
Slavomir Kaslev
ee5e001196 fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
The current implementation of splice() and tee() ignores O_NONBLOCK set
on pipe file descriptors and checks only the SPLICE_F_NONBLOCK flag for
blocking on pipe arguments.  This is inconsistent since splice()-ing
from/to non-pipe file descriptors does take O_NONBLOCK into
consideration.

Fix this by promoting O_NONBLOCK, when set on a pipe, to
SPLICE_F_NONBLOCK.

Some context for how the current implementation of splice() leads to
inconsistent behavior.  In the ongoing work[1] to add VM tracing
capability to trace-cmd we stream tracing data over named FIFOs or
vsockets from guests back to the host.

When we receive SIGINT from user to stop tracing, we set O_NONBLOCK on
the input file descriptor and set SPLICE_F_NONBLOCK for the next call to
splice().  If splice() was blocked waiting on data from the input FIFO,
after SIGINT splice() restarts with the same arguments (no
SPLICE_F_NONBLOCK) and blocks again instead of returning -EAGAIN when no
data is available.

This differs from the splice() behavior when reading from a vsocket or
when we're doing a traditional read()/write() loop (trace-cmd's
--nosplice argument).

With this patch applied we get the same behavior in all situations after
setting O_NONBLOCK which also matches the behavior of doing a
read()/write() loop instead of splice().

This change does have potential of breaking users who don't expect
EAGAIN from splice() when SPLICE_F_NONBLOCK is not set.  OTOH programs
that set O_NONBLOCK and don't anticipate EAGAIN are arguably buggy[2].

 [1] https://github.com/skaslev/trace-cmd/tree/vsock
 [2] d47e3da175/fs/read_write.c (L1425)

Signed-off-by: Slavomir Kaslev <kaslevs@vmware.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04 16:10:17 -08:00
David S. Miller
18a4d8bf25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-04 13:26:15 -08:00
Linus Torvalds
4f9020ffde Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Assorted fixes that sat in -next for a while, all over the place"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aio: Fix locking in aio_poll()
  exec: Fix mem leak in kernel_read_file
  copy_mount_string: Limit string length to PATH_MAX
  cgroup: saner refcounting for cgroup_root
  fix cgroup_do_mount() handling of failure exits
2019-03-04 13:24:27 -08:00