Commit graph

950062 commits

Author SHA1 Message Date
Christoph Hellwig
26e328759b xfs: reuse _xfs_buf_read for re-reading the superblock
Instead of poking deeply into buffer cache internals when re-reading the
superblock during log recovery just generalize _xfs_buf_read and use it
there.  Note that we don't have to explicitly set up the ops as they
must be set from the initial read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
b3f8e08ca8 xfs: remove xfs_getsb
Merge xfs_getsb into its only caller, and clean that one up a little bit
as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
cead0b10f5 xfs: simplify xfs_trans_getsb
Remove the mp argument as this function is only called in transaction
context, and open code xfs_getsb given that the function already accesses
the buffer pointer in the mount point directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
22c10589a1 xfs: remove xlog_recover_iodone
The log recovery I/O completion handler does not substancially differ from
the normal one except for the fact that it:

 a) never retries failed writes
 b) can have log items that aren't on the AIL
 c) never has inode/dquot log items attached and thus don't need to
    handle them

Add conditionals for (a) and (b) to the ioend code, while (c) doesn't
need special handling anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
55b7d7115f xfs: clear the read/write flags later in xfs_buf_ioend
Clear the flags at the end of xfs_buf_ioend so that they can be used
during the completion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
b840e2ada8 xfs: use xfs_buf_item_relse in xfs_buf_item_done
Reuse xfs_buf_item_relse instead of duplicating it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:39 -07:00
Christoph Hellwig
70796c6b74 xfs: simplify the xfs_buf_ioend_disposition calling convention
Now that all the actual error handling is in a single place,
xfs_buf_ioend_disposition just needs to return true if took ownership of
the buffer, or false if not instead of the tristate.  Also move the
error check back in the caller to optimize for the fast path, and give
the function a better fitting name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
844c9358df xfs: lift the XBF_IOEND_FAIL handling into xfs_buf_ioend_disposition
Keep all the error handling code together.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
3cc498845a xfs: remove xfs_buf_ioerror_retry
Merge xfs_buf_ioerror_retry into its only caller to make the resubmission
flow a little easier to follow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
f58d0ea956 xfs: refactor xfs_buf_ioerror_fail_without_retry
xfs_buf_ioerror_fail_without_retry is a somewhat weird function in
that it has two trivial checks that decide the return value, while
the rest implements a ratelimited warning.  Just lift the two checks
into the caller, and give the remainder a suitable name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
6a7584b1d8 xfs: fold xfs_buf_ioend_finish into xfs_ioend
No need to keep a separate helper for this logic.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
664ffb8a42 xfs: move the buffer retry logic to xfs_buf.c
Move the buffer retry state machine logic to xfs_buf.c and call it once
from xfs_ioend instead of duplicating it three times for the three kinds
of buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
23fb5a93c2 xfs: refactor xfs_buf_ioend
Move the log recovery I/O completion handling entirely into the log
recovery code, and re-arrange the normal I/O completion handler flow
to prepare to lifting more logic into common code in the next commits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
76b2d32346 xfs: mark xfs_buf_ioend static
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Christoph Hellwig
12e164aa1f xfs: refactor the buf ioend disposition code
Handle the no-error case in xfs_buf_iodone_error as well, and to clarify
the code rename the function, use the actual enum type as return value
and then switch on it in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:38 -07:00
Dave Chinner
718ecc5035 xfs: xfs_iflock is no longer a completion
With the recent rework of the inode cluster flushing, we no longer
ever wait on the the inode flush "lock". It was never a lock in the
first place, just a completion to allow callers to wait for inode IO
to complete. We now never wait for flush completion as all inode
flushing is non-blocking. Hence we can get rid of all the iflock
infrastructure and instead just set and check a state flag.

Rename the XFS_IFLOCK flag to XFS_IFLUSHING, convert all the
xfs_iflock_nowait() test-and-set operations on that flag, and
replace all the xfs_ifunlock() calls to clear operations.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-06 18:05:51 -07:00
Carlos Maiolino
771915c4f6 xfs: remove kmem_realloc()
Remove kmem_realloc() function and convert its users to use MM API
directly (krealloc())

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-06 18:05:51 -07:00
Linus Torvalds
f4d51dffc6 Linux 5.9-rc4 2020-09-06 17:11:40 -07:00
Linus Torvalds
a8205e3100 io_uring-5.9-2020-09-06
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl9U/MMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpg4BEAC6TQ6ctc1yNTTWwiz4UrdIKMAWP7S7wepu
 k00A9+JjLLJBVdkz9rZ2Y/SyGe12qBM+riiRSn/gNTbkd4qq2rCY2d8U1vXXKyP5
 VDXPo12zsUD5WpqdEMfXUB4+DOQs0a3DAyDLT0+K/Qw0rpOQVZA26Ovn6GOh9+Kq
 KJCllynYkwUQ/7CXsdI13ktMI1HADFOx3149SlGkbuPOggwNQrGiLJAxyUOn/i+E
 uiHy8b8o3B2nun61+Y98q+MDISLf0xXYEbHeAsvETEy52ya2iadnMo1lwS5zHzM+
 p6jOBybHaM/wz5t1V44VTBvfog1KAUtp8K0gsxcB6Ezf7LhYfTVjtvfZqpwVz1sl
 txkxhnGEBURHpdr0aCtC15cIpbDGM85ymjP4RD0YlV7oyT0+Ufx7r9jJjLf7IhZO
 FMyEFmVwSPD/NdJE9ZNx0I2v/qdIzYwfeAix4Z4bXoe51BtPrf1uBgsGWVuqNVz/
 dKVf1vK1tPF8PgIiIW/o8GI4iF3RRVQcLwJGiAWMzBS11iniJLf9mUPRVl5Bxpeb
 YRd2chm+ppHC/IgtK44x7Ce6415hnbKehE2KUr43PHB7nIMNWcRsurxLizM7ZGei
 Gv2/K9PM3+4O/b1k20xLakz4Vw3Isk4W6/Flj9LoxheBo+CUG4Rsx5UyzFEB6apA
 uXxiF6ORLg==
 =v5QQ
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block

Pull more io_uring fixes from Jens Axboe:
 "Two followup fixes. One is fixing a regression from this merge window,
  the other is two commits fixing cancelation of deferred requests.

  Both have gone through full testing, and both spawned a few new
  regression test additions to liburing.

   - Don't play games with const, properly store the output iovec and
     assign it as needed.

   - Deferred request cancelation fix (Pavel)"

* tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
  io_uring: fix linked deferred ->files cancellation
  io_uring: fix cancel of deferred reqs with ->files
  io_uring: fix explicit async read/write mapping for large segments
2020-09-06 12:10:27 -07:00
Linus Torvalds
2ccdd9f8b2 IOMMU Fixes for Linux v5.9-rc3
Including:
 
 	- Three Intel VT-d fixes to fix address handling on 32bit, fix a
 	  NULL pointer dereference bug and serialize a hardware register
 	  access as required by the VT-d spec.
 
 	- Two patches for AMD IOMMU to force AMD GPUs into translation mode
 	  when memory encryption is active and disallow using IOMMUv2
 	  functionality. This makes the AMDGPU driver working when
 	  memory encryption is active.
 
 	- Two more fixes for AMD IOMMU to fix updating the Interrupt
 	  Remapping Table Entries.
 
 	- MAINTAINERS file update for the Qualcom IOMMU driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl9U+Q4ACgkQK/BELZcB
 GuOkexAA14LsYG8QWd1lwgf3Yh6/xGwhw/vDTHCMCNqFCMUHTLjte6uLZ9QH/Pxy
 fvwyX73ZOjsrKXOwkKvEFD9fCJDyMsPsfH3ssYNgOZfTFculcGdNc6NhLWhoKCc8
 pmM0tVikRKxhAMxWgc9EX4dwu9WRe8vobvZj0cAPFlMAkY2ZY3lDtl8bniX9cygW
 Oo2hmhud1CeLE3ffUWhWgT9T7GfTL1i6KFF8wUqrNFE53+b7pS4LbJaCWuHKjtKu
 MaBC0eE0rG0qvSng6ArdZx22a47o2j57zc0mG6IeHtGe7eUAgwNkRLQmL1RU35xW
 cDniCJXosQ7DJ0Hel12ahk1HlUQFFqMlZVZbNpYP04v8xqHNEsWvWHw2NS6xZg2T
 FozBWdnhw8HzaXpWmg14WQ+e9+rnsK/FndRxBjXP9H9bKQQNKQV8PaiUEQCZin0s
 QJohAgqyjyPLB3a/B9q1jzG5HHI2pK2s2/4ry9XBdijZ7MhADq9pAvAe9k8baU/5
 gEy6pa4PmZEMibbrfUNjIoeYz80hDaHIxIbW/fFgl5n1JJJoz+qsfZlYaddrN6nc
 XFWDg9dDgQjDaYDjK9uG2vJkbMFiqE0kRxtJdWfqZNGgSWxqVIMtf42ZvYlqH+Wb
 Zc9nBEuH27o4NZ//2mXzynwBooBTmOsLRw/kqSPOuBVMki08fqk=
 =2uqs
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
   pointer dereference bug and serialize a hardware register access as
   required by the VT-d spec.

 - two patches for AMD IOMMU to force AMD GPUs into translation mode
   when memory encryption is active and disallow using IOMMUv2
   functionality.  This makes the AMDGPU driver work when memory
   encryption is active.

 - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
   Table Entries.

 - MAINTAINERS file update for the Qualcom IOMMU driver.

* tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Handle 36bit addressing for x86-32
  iommu/amd: Do not use IOMMUv2 functionality when SME is active
  iommu/amd: Do not force direct mapping when SME is active
  iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
  iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
  iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
  iommu/vt-d: Serialize IOMMU GCMD register modifications
  MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
2020-09-06 11:58:15 -07:00
Linus Torvalds
015b3155c4 Misc fixes:
- Fix more generic entry code ABI fallout
  - Fix debug register handling bugs
  - Fix vmalloc mappings on 32-bit kernels
  - Fix kprobes instrumentation output on 32-bit kernels
  - Fix over-eager WARN_ON_ONCE() on !SMAP hardware
  - Fix NUMA debugging
  - Fix Clang related crash on !RETPOLINE kernels
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl9UljIRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g1yA//VecoyJOw4jb43LdkeKDGtUjCsPVZlt4w
 fw55nT4taqqbgl9mQjrJQlh8thtk7LvAqcsrEGk/SH+1fp/hDvBG0i3etyI1mPJ2
 t97MCVtD1bz2zyLpOtGN48tgiRxSazr4S9nZPCLTec+c75I3pmJssj44m/eJi/Z2
 hoj/syiO4J0BPa7a1ou++Jeyag6J+PgXdJTOMyjuqi99vqai1aTVKo8GdWMInext
 +fJNYd0ZQRj1FxVdMusDfzxOk7N7b8nAzvd30iJN67R6QwoEazO12K1F4IYQmHSq
 0rhHrwe0lTLtjmYdp/ef14kfzD7DRFN6Nv2gk/zyZsH+tjGflxTZConkFPnfoJEc
 33cNHfigh0V9TSVNDDhHnkRyy6dzCHkYHEf33KFuX3amC236TgrCEL7+oWE2rcNp
 9PJbPGlXCqNb2feNy2de4cY+KiZ2a1N/T4VcdMK6DEdENFh5T03EZgIChQEd0S99
 LNBYHqTWJdQEKfkzfAXlR4Bd2hX1LWLMM6rNcXxInrH7rWDXUCS0X9m3gLZR9DIs
 7/nXoK4OkaJdgH/D2CToDgwMNT5hlIiTGtVtB3H6Qz8eQQ4+fwTyboQDqpeG4Upy
 LfOH2h5Fo33FCgqnrua8IsgUKLwW2yJGdghJpcd9d0qfVUDEJuXGo6xe6SEHdSu/
 VEiQtFUf50U=
 =EhRy
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - more generic entry code ABI fallout

 - debug register handling bugfixes

 - fix vmalloc mappings on 32-bit kernels

 - kprobes instrumentation output fix on 32-bit kernels

 - fix over-eager WARN_ON_ONCE() on !SMAP hardware

 - NUMA debugging fix

 - fix Clang related crash on !RETPOLINE kernels

* tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Unbreak 32bit fast syscall
  x86/debug: Allow a single level of #DB recursion
  x86/entry: Fix AC assertion
  tracing/kprobes, x86/ptrace: Fix regs argument order for i386
  x86, fakenuma: Fix invalid starting node ID
  x86/mm/32: Bring back vmalloc faulting on x86_32
  x86/cmdline: Disable jump tables for cmdline.c
2020-09-06 10:28:00 -07:00
Linus Torvalds
68beef5710 xen: branch for v5.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX1Rn1wAKCRCAXGG7T9hj
 vlEjAQC/KGC3wYw5TweWcY48xVzgvued3JLAQ6pcDlOe6osd6AEAzZcZKgL948cx
 oY0T98dxb/U+lUhbIzhpBr/30g8JbAQ=
 =Xcxp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "A small series for fixing a problem with Xen PVH guests when running
  as backends (e.g. as dom0).

  Mapping other guests' memory is now working via ZONE_DEVICE, thus not
  requiring to abuse the memory hotplug functionality for that purpose"

* tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: add helpers to allocate unpopulated memory
  memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
  xen/balloon: add header guard
2020-09-06 09:59:27 -07:00
Pavel Begunkov
c127a2a1b7 io_uring: fix linked deferred ->files cancellation
While looking for ->files in ->defer_list, consider that requests there
may actually be links.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 16:02:42 -06:00
Pavel Begunkov
b7ddce3cbf io_uring: fix cancel of deferred reqs with ->files
While trying to cancel requests with ->files, it also should look for
requests in ->defer_list, otherwise it might end up hanging a thread.

Cancel all requests in ->defer_list up to the last request there with
matching ->files, that's needed to follow drain ordering semantics.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 15:59:51 -06:00
Linus Torvalds
dd9fb9bb33 A trivial patch for auxdisplay:
- Replace HTTP links with HTTPS ones (Alexander A. Klimov)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TdpwACgkQGXyLc2ht
 IW0MMxAAkh9gkR1ygJik6Ii9t63jbpG/NG3jJjlxCllh6OZrJ64SJBM7sF80NC1Y
 MAq+j16yPcyvfI1M378hfwD71c6OYuwfH4z1By8WbAVQ0d7S/Qd9UJMhn8f6U/aA
 /KLwvq/+YiCT5328MAZQHmmpsGqKBVnARMAf65fNviiYDwjeSd+9G9RtFF2KD2Q6
 m+OsOqAr7WvhqZ/1lMOuAXxkuee11WbEi/ZBijJl8ZGglld3eQcW1+aMKHsD/W/Q
 QXEi5QhdeLv60tu3HSQu6QyNYSPArDv7nZt1Yrxu05Sb+sryMzXgtuy03q6c0iUm
 zZmFDijcfneWJ5mJda+9301wJOC1NsQnEhJBi0RgjquRw8ydB175rKAzGBIKzaWp
 toQ9cN/LG/dJU/a4o1lrugIdqv7FJHUdvW2fsaL2twy/ly0sV8yv+dM/e3kItkv4
 hzCvGe9RKW2VF2JwhCFDtEkknXkwnaOX+yYB5si1NYSvODvPrln/MADRIB7TONw+
 eI8C9B52JQ+66YDb/Yg3OBhncobufYCOgs30taRv7q3sBd1P394RT598mW8JTRIN
 CAWKYsBfE4aTSL0PZeSfbQrNYxj+BxMrDjg9VQ9lGVzH4lw2vJZ51nSLtxgyuSr6
 fwSC6R3Y3WxKKj5M2E0tAzDF0C0V/kVvDzUetFgTHSJN/Kuht6o=
 =w1IL
 -----END PGP SIGNATURE-----
mergetag object 4e4bb89446
 type commit
 tag clang-format-for-linus-v5.9-rc4
 tagger Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> 1599306139 +0200
 
 The usual trivial update:
 
  - Update with the latest for_each macro list (Miguel Ojeda)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TecAACgkQGXyLc2ht
 IW0D6A/9Guy8Q+fQoZ+jVynb7zRklfq8+D7rp4LrI+TXshGE1kUWzpwrPQX+QDcD
 H1wce3TMteZxOJe1KiWF5UwsFD2sGKpI1O8NJLn8Af9oImbIwhoJKxtdT0NOJ0cI
 08bzCDfonBAMn5DA2abDkmuRpkeyRXn1AN2mHcxnZ33lTZttWf/SCyrrynhHP4Co
 AAyeOuJtD5duMFvVtnXU/3aRGUinZia6roVG9wDbWoWYD50H3OeVecVRbnM0Vonf
 DEUbU2yjQRKVHd8P/mhf8tRhiVCnAt6b+c4wwgzgpRcFhu69b0fjJPmZETRrt4Dc
 aQZT4ltY9H0R6Wz82Pt8peZieZEvq8cgX+YgCkpLuHNiv8+/Fce+Nllpaokg8lYb
 rirS+u4t5zX3AaVdKk1riPaYinAOuC5PirGr8JDap4T7kg2Zw+xUSSZgH0sqTK0N
 /UwvoqZK5wOnwotsn16vhQvxOtzbdgh4lkkkkBPbXMUXtN8PfkHiZ/4S5fSDdrTu
 hz2F4nHkZlvhkZYKd7tp8UytnN1XEJQF70Z17LTEnVyVTjSoAA2LfSoJ0NCIwdhe
 A12/H7DB1FlPjcoxdjJka5teoGnlV5JkSAklp6Q+FlC/Roytxq5hEpmm77YShW4a
 /KuaKlohInKCqA6/ZJq3H1jzz+dT14OHekwdSXk5ZsaMkXejEMA=
 =Lcl0
 -----END PGP SIGNATURE-----
mergetag object e5fc436f06
 type commit
 tag compiler-attributes-for-linus-v5.9-rc4
 tagger Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> 1599306822 +0200
 
 Two trivial changes and a fix for sparse:
 
  - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van Oostenryck)
 
  - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van Oostenryck)
 
  - Compiler Attributes: remove comment about sparse not supporting __has_attribute (Luc Van Oostenryck)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TfFAACgkQGXyLc2ht
 IW0vfQ//dtC2zaqJo3L0RMyLqAWwJbbML1eyy7tT8qPnw635arJXSqyjPliVoRwA
 3O3iQOjnss/vfL5bs9OSFEN5yjALP4wm9YZl2EponWZ8WiIKfC/YPSSH2iM2m2kW
 AK6bG6HPfoSy3Kx04CuxeqGN2F9/hZGLneLfF+y6gdfeAVBRTf1o+SvuCV/phLyW
 ek2UQK00+00BOVa62ZLZzqCKCHjTBSXtjDrrwg39g4MJ+gsGcUDoF13InRdgSWTY
 rgUD09OqOTjwaB2GeTiV7mXNn0yydyH0IoXz1EsfqRJV5mik3LMEpo/RWTBX95K1
 82N2mrL4ejOPX8gj+iKvcDjC8b1Fiwnb4pw6SsAOTJ5KG/euMVV2ZqAIS4SVVvSx
 VcGdErNcDBFrpM2py6rM3UhHBOafdlOYAtLKb596hMh4H8Kjt3LFIZAuyLVAJQfz
 q3xuBhkKSPw/j3ZVJ4iBFJSMIkxKRLUkOtF4dxAUknHOuYktnsIH0T/NcN0sHUqY
 c/3bcJFG6jQhzrA3DLlwcKWcgeihEjx5JpJFreN97heSEWsn7G1GAMMWUJ1y4KqJ
 kheczTljl5RhxUqUH2WBMj4MgdmUhwtXqW6Ht9HqP6nKDDotUjinjrSGOP5alej7
 3JYNz11c2pWwQNIzItHspmSnaiSUXbcn7WJoRTrO09kcPfsMw7E=
 =RrSY
 -----END PGP SIGNATURE-----

Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux

Pull misc fixes from Miguel Ojeda:
 "A trivial patch for auxdisplay:

   - Replace HTTP links with HTTPS ones (Alexander A. Klimov)

  The usual clang-format trivial update:

   - Update with the latest for_each macro list (Miguel Ojeda)

  And Luc requested me to pick a sparse fix on my queue, so here it goes
  along with other two trivial Compiler Attributes ones (also from Luc).

   - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van
     Oostenryck)

   - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van
     Oostenryck)

   - Compiler Attributes: remove comment about sparse not supporting
     __has_attribute (Luc Van Oostenryck)"

* tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  auxdisplay: Replace HTTP links with HTTPS ones

* tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list

* tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  sparse: use static inline for __chk_{user,io}_ptr()
  Compiler Attributes: fix comment concerning GCC 4.6
  Compiler Attributes: remove comment about sparse not supporting __has_attribute
2020-09-05 14:22:46 -07:00
Linus Torvalds
70187f7727 ARC fixes for 5.9-rc4
- HSDK-4xd Dev system: perf driver updates for sampling interrupt
 
  - HSDK* Dev System : Ethernet broken	[Evgeniy Didin]
 
  - HIGHMEM broken (2 memory banks)	[Mike Rapoport]
 
  - show_regs() rewrite once and for all
 
  - Other minor fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEOXpuCuR6hedrdLCJadfx3eKKwl4FAl9S5E4ACgkQadfx3eKK
 wl7y5xAAyExBJlAWH8tyJjKYnRnKUwP5xZxzSLt9k5sUF8+aVMBpSr6PCF9kf1ZP
 HTFJ3T9NLFOh4cI7Lk9QFkLXbOvuvjgkfqlSmbHjnlTYcuWCqCQfU93bLMhRvrvi
 oRnWzEXLXwRuYoYz4TVLzxw9SuQr/EtCvR+a4rAyMpUlg+9pWHTokpiv69nDeYSl
 t3ckCVkdKbqPsUW7fXclyzo5qtsdGgLhYh6pb8ZE/x/Qdk+IjaN4WEmjC5DCy10B
 VTKEKdK7VKfRGuhIJbPq/UDs0TwMMxic+5sCdQtM54vrQPuVTmpsUopOHQ4/2Y5D
 IVWsZ3Uy/jnZxS32FP4d4r4Rv+BT9r1c7o148glT9iI0poOZ13BZHNZbLuSxYi4l
 1UQ2r/o+E+XKOWzAOc1WIgEJGdxcDu9seJNSSmxs1kBveOIg8S8JjoDyS6ZyTWD+
 fbOttxNe3kqGWCdKkgtUGqrg2w0qr3kzJd5aF4ju4fE+KMSVQRylRSWBpM1hcovP
 rxhcVvxyd1xSS3krrXjBTsjgTlkPDAaLyku/G5VuwpG91LyTiWfSlg+IOfLO+zFm
 LAwMTlV0gXfV2GH/AlEDJXtvT7GPhXN9sqmqqqOpUmOcKbPE30RmUkUxTJJbGfPN
 s7ioUV8CJVzRAkxyo79IDQQjT20p13cbJrIeCfgaSKWBf1M64BU=
 =vOZp
 -----END PGP SIGNATURE-----

Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - HSDK-4xd Dev system: perf driver updates for sampling interrupt

 - HSDK* Dev System: Ethernet broken [Evgeniy Didin]

 - HIGHMEM broken (2 memory banks) [Mike Rapoport]

 - show_regs() rewrite once and for all

 - Other minor fixes

* tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id
  arc: fix memory initialization for systems with two memory banks
  irqchip/eznps: Fix build error for !ARC700 builds
  ARC: show_regs: fix r12 printing and simplify
  ARC: HSDK: wireup perf irq
  ARC: perf: don't bail setup if pct irq missing in device-tree
  ARC: pgalloc.h: delete a duplicated word + other fixes
2020-09-05 13:46:14 -07:00
Linus Torvalds
7514c0362f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "19 patches.

  Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
  checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
  hugetlb)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  include/linux/log2.h: add missing () around n in roundup_pow_of_two()
  mm/khugepaged.c: fix khugepaged's request size in collapse_file
  mm/hugetlb: fix a race between hugetlb sysctl handlers
  mm/hugetlb: try preferred node first when alloc gigantic page from cma
  mm/migrate: preserve soft dirty in remove_migration_pte()
  mm/migrate: remove unnecessary is_zone_device_page() check
  mm/rmap: fixup copying of soft dirty and uffd ptes
  mm/migrate: fixup setting UFFD_WP flag
  mm: madvise: fix vma user-after-free
  checkpatch: fix the usage of capture group ( ... )
  fork: adjust sysctl_max_threads definition to match prototype
  ipc: adjust proc_ipc_sem_dointvec definition to match prototype
  mm: track page table modifications in __apply_to_page_range()
  MAINTAINERS: IA64: mark Status as Odd Fixes only
  MAINTAINERS: add LLVM maintainers
  MAINTAINERS: update Cavium/Marvell entries
  mm: slub: fix conversion of freelist_corrupted()
  mm: memcg: fix memcg reclaim soft lockup
  memcg: fix use-after-free in uncharge_batch
2020-09-05 13:28:40 -07:00
Jason Gunthorpe
428fc0aff4 include/linux/log2.h: add missing () around n in roundup_pow_of_two()
Otherwise gcc generates warnings if the expression is complicated.

Fixes: 312a0c1709 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
David Howells
e5a59d308f mm/khugepaged.c: fix khugepaged's request size in collapse_file
collapse_file() in khugepaged passes PAGE_SIZE as the number of pages to
be read to page_cache_sync_readahead().  The intent was probably to read
a single page.  Fix it to use the number of pages to the end of the
window instead.

Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Muchun Song
17743798d8 mm/hugetlb: fix a race between hugetlb sysctl handlers
There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.

  CPU0:                                 CPU1:
                                        proc_sys_write
  hugetlb_sysctl_handler                  proc_sys_call_handler
  hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
    table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                table->data = &tmp;
      proc_doulongvec_minmax
        do_proc_doulongvec_minmax           sysctl_head_finish
          __do_proc_doulongvec_minmax         unuse_table
            i = table->data;
            *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table`, and only update the duplicate of
it.  And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.

The following oops was seen:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    Code: Bad RIP value.
    ...
    Call Trace:
     ? set_max_huge_pages+0x3da/0x4f0
     ? alloc_pool_huge_page+0x150/0x150
     ? proc_doulongvec_minmax+0x46/0x60
     ? hugetlb_sysctl_handler_common+0x1c7/0x200
     ? nr_hugepages_store+0x20/0x20
     ? copy_fd_bitmaps+0x170/0x170
     ? hugetlb_sysctl_handler+0x1e/0x20
     ? proc_sys_call_handler+0x2f1/0x300
     ? unregister_sysctl_table+0xb0/0xb0
     ? __fd_install+0x78/0x100
     ? proc_sys_write+0x14/0x20
     ? __vfs_write+0x4d/0x90
     ? vfs_write+0xef/0x240
     ? ksys_write+0xc0/0x160
     ? __ia32_sys_read+0x50/0x50
     ? __close_fd+0x129/0x150
     ? __x64_sys_write+0x43/0x50
     ? do_syscall_64+0x6c/0x200
     ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Li Xinhai
953f064aa6 mm/hugetlb: try preferred node first when alloc gigantic page from cma
Since commit cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic
hugepages using cma"), the gigantic page would be allocated from node
which is not the preferred node, although there are pages available from
that node.  The reason is that the nid parameter has been ignored in
alloc_gigantic_page().

Besides, the __GFP_THISNODE also need be checked if user required to
alloc only from the preferred node.

After this patch, the preferred node is tried first before other allowed
nodes, and don't try to allocate from other nodes if __GFP_THISNODE is
specified.  If user don't specify the preferred node, the current node
will be used as preferred node, which makes sure consistent behavior of
allocating gigantic and non-gigantic hugetlb page.

Fixes: cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Link: https://lkml.kernel.org/r/20200902025016.697260-1-lixinhai.lxh@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Ralph Campbell
3d321bf82c mm/migrate: preserve soft dirty in remove_migration_pte()
The code to remove a migration PTE and replace it with a device private
PTE was not copying the soft dirty bit from the migration entry.  This
could lead to page contents not being marked dirty when faulting the page
back from device private memory.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Link: https://lkml.kernel.org/r/20200831212222.22409-3-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Ralph Campbell
6128763fc3 mm/migrate: remove unnecessary is_zone_device_page() check
Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()".

I happened to notice this from code inspection after seeing Alistair
Popple's patch ("mm/rmap: Fixup copying of soft dirty and uffd ptes").

This patch (of 2):

The check for is_zone_device_page() and is_device_private_page() is
unnecessary since the latter is sufficient to determine if the page is a
device private page.  Simplify the code for easier reading.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Link: https://lkml.kernel.org/r/20200831212222.22409-1-rcampbell@nvidia.com
Link: https://lkml.kernel.org/r/20200831212222.22409-2-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Alistair Popple
ad7df764b7 mm/rmap: fixup copying of soft dirty and uffd ptes
During memory migration a pte is temporarily replaced with a migration
swap pte.  Some pte bits from the existing mapping such as the soft-dirty
and uffd write-protect bits are preserved by copying these to the
temporary migration swap pte.

However these bits are not stored at the same location for swap and
non-swap ptes.  Therefore testing these bits requires using the
appropriate helper function for the given pte type.

Unfortunately several code locations were found where the wrong helper
function is being used to test soft_dirty and uffd_wp bits which leads to
them getting incorrectly set or cleared during page-migration.

Fix these by using the correct tests based on pte type.

Fixes: a5430dda8a ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Fixes: 8c3328f1f3 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Fixes: f45ec5ff16 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200825064232.10023-2-alistair@popple.id.au
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Alistair Popple
ebdf8321ee mm/migrate: fixup setting UFFD_WP flag
Commit f45ec5ff16 ("userfaultfd: wp: support swap and page migration")
introduced support for tracking the uffd wp bit during page migration.
However the non-swap PTE variant was used to set the flag for zone device
private pages which are a type of swap page.

This leads to corruption of the swap offset if the original PTE has the
uffd_wp flag set.

Fixes: f45ec5ff16 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Link: https://lkml.kernel.org/r/20200825064232.10023-1-alistair@popple.id.au
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:30 -07:00
Yang Shi
7867fd7cc4 mm: madvise: fix vma user-after-free
The syzbot reported the below use-after-free:

  BUG: KASAN: use-after-free in madvise_willneed mm/madvise.c:293 [inline]
  BUG: KASAN: use-after-free in madvise_vma mm/madvise.c:942 [inline]
  BUG: KASAN: use-after-free in do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145
  Read of size 8 at addr ffff8880a6163eb0 by task syz-executor.0/9996

  CPU: 0 PID: 9996 Comm: syz-executor.0 Not tainted 5.9.0-rc1-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x18f/0x20d lib/dump_stack.c:118
    print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
    __kasan_report mm/kasan/report.c:513 [inline]
    kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
    madvise_willneed mm/madvise.c:293 [inline]
    madvise_vma mm/madvise.c:942 [inline]
    do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145
    do_madvise mm/madvise.c:1169 [inline]
    __do_sys_madvise mm/madvise.c:1171 [inline]
    __se_sys_madvise mm/madvise.c:1169 [inline]
    __x64_sys_madvise+0xd9/0x110 mm/madvise.c:1169
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Allocated by task 9992:
    kmem_cache_alloc+0x138/0x3a0 mm/slab.c:3482
    vm_area_alloc+0x1c/0x110 kernel/fork.c:347
    mmap_region+0x8e5/0x1780 mm/mmap.c:1743
    do_mmap+0xcf9/0x11d0 mm/mmap.c:1545
    vm_mmap_pgoff+0x195/0x200 mm/util.c:506
    ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 9992:
    kmem_cache_free.part.0+0x67/0x1f0 mm/slab.c:3693
    remove_vma+0x132/0x170 mm/mmap.c:184
    remove_vma_list mm/mmap.c:2613 [inline]
    __do_munmap+0x743/0x1170 mm/mmap.c:2869
    do_munmap mm/mmap.c:2877 [inline]
    mmap_region+0x257/0x1780 mm/mmap.c:1716
    do_mmap+0xcf9/0x11d0 mm/mmap.c:1545
    vm_mmap_pgoff+0x195/0x200 mm/util.c:506
    ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

It is because vma is accessed after releasing mmap_lock, but someone
else acquired the mmap_lock and the vma is gone.

Releasing mmap_lock after accessing vma should fix the problem.

Fixes: 692fe62433 ("mm: Handle MADV_WILLNEED through vfs_fadvise()")
Reported-by: syzbot+b90df26038d1d5d85c97@syzkaller.appspotmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>	[5.4+]
Link: https://lkml.kernel.org/r/20200816141204.162624-1-shy828301@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Mrinal Pandey
13e45417ce checkpatch: fix the usage of capture group ( ... )
The usage of "capture group (...)" in the immediate condition after `&&`
results in `$1` being uninitialized.  This issues a warning "Use of
uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl
line 2638".

I noticed this bug while running checkpatch on the set of commits from
v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in
their commit message.

This bug was introduced in the script by commit e518e9a59e
("checkpatch: emit an error when there's a diff in a changelog").  It
has been in the script since then.

The author intended to store the match made by capture group in variable
`$1`.  This should have contained the name of the file as `[\w/]+`
matched.  However, this couldn't be accomplished due to usage of capture
group and `$1` in the same regular expression.

Fix this by placing the capture group in the condition before `&&`.
Thus, `$1` can be initialized to the text that capture group matches
thereby setting it to the desired and required value.

Fixes: e518e9a59e ("checkpatch: emit an error when there's a diff in a changelog")
Signed-off-by: Mrinal Pandey <mrinalmni@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Joe Perches <joe@perches.com>
Link: https://lkml.kernel.org/r/20200714032352.f476hanaj2dlmiot@mrinalpandey
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Tobias Klauser
b0daa2c73f fork: adjust sysctl_max_threads definition to match prototype
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
definition of sysctl_max_threads to match its prototype in
linux/sysctl.h which fixes the following sparse error/warning:

  kernel/fork.c:3050:47: warning: incorrect type in argument 3 (different address spaces)
  kernel/fork.c:3050:47:    expected void *
  kernel/fork.c:3050:47:    got void [noderef] __user *buffer
  kernel/fork.c:3036:5: error: symbol 'sysctl_max_threads' redeclared with different type (incompatible argument 3 (different address spaces)):
  kernel/fork.c:3036:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )
  kernel/fork.c: note: in included file (through include/linux/key.h, include/linux/cred.h, include/linux/sched/signal.h, include/linux/sched/cputime.h):
  include/linux/sysctl.h:242:5: note: previously declared as:
  include/linux/sysctl.h:242:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20200825093647.24263-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Tobias Klauser
fff1662cc4 ipc: adjust proc_ipc_sem_dointvec definition to match prototype
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
signature of proc_ipc_sem_dointvec to match ctl_table.proc_handler which
fixes the following sparse error/warning:

  ipc/ipc_sysctl.c:94:47: warning: incorrect type in argument 3 (different address spaces)
  ipc/ipc_sysctl.c:94:47:    expected void *buffer
  ipc/ipc_sysctl.c:94:47:    got void [noderef] __user *buffer
  ipc/ipc_sysctl.c:194:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces))
  ipc/ipc_sysctl.c:194:35:    expected int ( [usertype] *proc_handler )( ... )
  ipc/ipc_sysctl.c:194:35:    got int ( * )( ... )

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20200825105846.5193-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Joerg Roedel
e80d3909be mm: track page table modifications in __apply_to_page_range()
__apply_to_page_range() is also used to change and/or allocate
page-table pages in the vmalloc area of the address space.  Make sure
these changes get synchronized to other page-tables in the system by
calling arch_sync_kernel_mappings() when necessary.

The impact appears limited to x86-32, where apply_to_page_range may miss
updating the PMD.  That leads to explosions in drivers like

  BUG: unable to handle page fault for address: fe036000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  *pde = 00000000
  Oops: 0002 [#1] SMP
  CPU: 3 PID: 1300 Comm: gem_concurrent_ Not tainted 5.9.0-rc1+ #16
  Hardware name:  /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
  EIP: __execlists_context_alloc+0x132/0x2d0 [i915]
  Code: 31 d2 89 f0 e8 2f 55 02 00 89 45 e8 3d 00 f0 ff ff 0f 87 11 01 00 00 8b 4d e8 03 4b 30 b8 5a 5a 5a 5a ba 01 00 00 00 8d 79 04 <c7> 01 5a 5a 5a 5a c7 81 fc 0f 00 00 5a 5a 5a 5a 83 e7 fc 29 f9 81
  EAX: 5a5a5a5a EBX: f60ca000 ECX: fe036000 EDX: 00000001
  ESI: f43b7340 EDI: fe036004 EBP: f6389cb8 ESP: f6389c9c
  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010286
  CR0: 80050033 CR2: fe036000 CR3: 2d361000 CR4: 001506d0
  DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: fffe0ff0 DR7: 00000400
  Call Trace:
    execlists_context_alloc+0x10/0x20 [i915]
    intel_context_alloc_state+0x3f/0x70 [i915]
    __intel_context_do_pin+0x117/0x170 [i915]
    i915_gem_do_execbuffer+0xcc7/0x2500 [i915]
    i915_gem_execbuffer2_ioctl+0xcd/0x1f0 [i915]
    drm_ioctl_kernel+0x8f/0xd0
    drm_ioctl+0x223/0x3d0
    __ia32_sys_ioctl+0x1ab/0x760
    __do_fast_syscall_32+0x3f/0x70
    do_fast_syscall_32+0x29/0x60
    do_SYSENTER_32+0x15/0x20
    entry_SYSENTER_32+0x9f/0xf2
  EIP: 0xb7f28559
  Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
  EAX: ffffffda EBX: 00000005 ECX: c0406469 EDX: bf95556c
  ESI: b7e68000 EDI: c0406469 EBP: 00000005 ESP: bf9554d8
  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
  Modules linked in: i915 x86_pkg_temp_thermal intel_powerclamp crc32_pclmul crc32c_intel intel_cstate intel_uncore intel_gtt drm_kms_helper intel_pch_thermal video button autofs4 i2c_i801 i2c_smbus fan
  CR2: 00000000fe036000

It looks like kasan, xen and i915 are vulnerable.

Actual impact is "on thinkpad X60 in 5.9-rc1, screen starts blinking
after 30-or-so minutes, and machine is unusable"

[sfr@canb.auug.org.au: ARCH_PAGE_TABLE_SYNC_MASK needs vmalloc.h]
  Link: https://lkml.kernel.org/r/20200825172508.16800a4f@canb.auug.org.au
[chris@chris-wilson.co.uk: changelog addition]
[pavel@ucw.cz: changelog addition]

Fixes: 2ba3e6947a ("mm/vmalloc: track which page-table levels were modified")
Fixes: 86cf69f1d8 ("x86/mm/32: implement arch_sync_kernel_mappings()")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>	[x86-32]
Tested-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>	[5.8+]
Link: https://lkml.kernel.org/r/20200821123746.16904-1-joro@8bytes.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Randy Dunlap
9d90dd188d MAINTAINERS: IA64: mark Status as Odd Fixes only
IA64 isn't really being maintained, so mark it as Odd Fixes only.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/7e719139-450f-52c2-59a2-7964a34eda1f@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Nick Desaulniers
b964428965 MAINTAINERS: add LLVM maintainers
Nominate Nathan and myself to be point of contact for clang/LLVM related
support, after a poll at the LLVM BoF at Linux Plumbers Conf 2020.

While corporate sponsorship is beneficial, its important to not entrust
the keys to the nukes with any one entity.  Should Nathan and I find
ourselves at the same employer, I would gladly step down.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lkml.kernel.org/r/20200825143540.2948637-1-ndesaulniers@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Robert Richter
f548a64570 MAINTAINERS: update Cavium/Marvell entries
I am leaving Marvell and already do not have access to my @marvell.com
email address.  So switching over to my korg mail address or removing my
address there another maintainer is already listed.  For the entries
there no other maintainer is listed I will keep looking into patches for
Cavium systems for a while until someone from Marvell takes it over.

Since I might have limited access to hardware and also limited time I
changed state to 'Odd Fixes' for those entries.

Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ganapatrao Kulkarni <gkulkarni@marvell.com>
Cc: Sunil Goutham <sgoutham@marvell.com>
CC: Borislav Petkov <bp@alien8.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Wolfram Sang <wsa@kernel.org>,
Link: https://lkml.kernel.org/r/20200824122050.31164-1-rric@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Eugeniu Rosca
dc07a728d4 mm: slub: fix conversion of freelist_corrupted()
Commit 52f2347808 ("mm/slub.c: fix corrupted freechain in
deactivate_slab()") suffered an update when picked up from LKML [1].

Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
created a no-op statement.  Fix it by sticking to the behavior intended
in the original patch [1].  In addition, make freelist_corrupted()
immune to passing NULL instead of &freelist.

The issue has been spotted via static analysis and code review.

[1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/

Fixes: 52f2347808 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dongli Zhang <dongli.zhang@oracle.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Xunlei Pang
e3336cab25 mm: memcg: fix memcg reclaim soft lockup
We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg
doesn't have any reclaimable memory.

It can be easily reproduced as below:

  watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
  CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
  Call Trace:
    shrink_lruvec+0x49f/0x640
    shrink_node+0x2a6/0x6f0
    do_try_to_free_pages+0xe9/0x3e0
    try_to_free_mem_cgroup_pages+0xef/0x1f0
    try_charge+0x2c1/0x750
    mem_cgroup_charge+0xd7/0x240
    __add_to_page_cache_locked+0x2fd/0x370
    add_to_page_cache_lru+0x4a/0xc0
    pagecache_get_page+0x10b/0x2f0
    filemap_fault+0x661/0xad0
    ext4_filemap_fault+0x2c/0x40
    __do_fault+0x4d/0xf9
    handle_mm_fault+0x1080/0x1790

It only happens on our 1-vcpu instances, because there's no chance for
oom reaper to run to reclaim the to-be-killed process.

Add a cond_resched() at the upper shrink_node_memcgs() to solve this
issue, this will mean that we will get a scheduling point for each memcg
in the reclaimed hierarchy without any dependency on the reclaimable
memory in that memcg thus making it more predictable.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Michal Hocko
f1796544a0 memcg: fix use-after-free in uncharge_batch
syzbot has reported an use-after-free in the uncharge_batch path

  BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
  BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
  BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
  BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
  BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
  Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304

  CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1f0/0x31e lib/dump_stack.c:118
    print_address_description+0x66/0x620 mm/kasan/report.c:383
    __kasan_report mm/kasan/report.c:513 [inline]
    kasan_report+0x132/0x1d0 mm/kasan/report.c:530
    check_memory_region_inline mm/kasan/generic.c:183 [inline]
    check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
    instrument_atomic_write include/linux/instrumented.h:71 [inline]
    atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
    atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
    page_counter_cancel mm/page_counter.c:54 [inline]
    page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
    uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
    uncharge_page+0x115/0x430 mm/memcontrol.c:6796
    uncharge_list mm/memcontrol.c:6835 [inline]
    mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
    release_pages+0x13a2/0x1550 mm/swap.c:911
    tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
    tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
    tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
    tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
    exit_mmap+0x296/0x550 mm/mmap.c:3185
    __mmput+0x113/0x370 kernel/fork.c:1076
    exit_mm+0x4cd/0x550 kernel/exit.c:483
    do_exit+0x576/0x1f20 kernel/exit.c:793
    do_group_exit+0x161/0x2d0 kernel/exit.c:903
    get_signal+0x139b/0x1d30 kernel/signal.c:2743
    arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
    exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
    exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
    syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Commit 1a3e1f4096 ("mm: memcontrol: decouple reference counting from
page accounting") reworked the memcg lifetime to be bound the the struct
page rather than charges.  It also removed the css_put_many from
uncharge_batch and that is causing the above splat.

uncharge_batch() is supposed to uncharge accumulated charges for all
pages freed from the same memcg.  The queuing is done by uncharge_page
which however drops the memcg reference after it adds charges to the
batch.  If the current page happens to be the last one holding the
reference for its memcg then the memcg is OK to go and the next page to
be freed will trigger batched uncharge which needs to access the memcg
which is gone already.

Fix the issue by taking a reference for the memcg in the current batch.

Fixes: 1a3e1f4096 ("mm: memcontrol: decouple reference counting from page accounting")
Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com
Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 12:14:29 -07:00
Linus Torvalds
9322c47b21 Fixes (2) for 5.9:
- Fix a broken metadata verifier that would incorrectly validate attr
 fork extents of a realtime file against the realtime volume.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl9RDTMACgkQ+H93GTRK
 tOseqA/8DTnwKRKuTXc6UboIXbgMwjGTJBv7xTD3KIDF3ls+Gmp2/RWHs7lfGRVY
 LVdvlgAwDLEe1al8aLaibhrsRCYX8MLvMp8dX6xoqMWCY0KF80VCpXXy+FXbwBmZ
 GQRqrThyjepseERJj15UC249p9ztBj4DAHQD0r2XD7JMKHBHo6cQ3eYY2NTzgqc3
 q0kYcP5xPZZ4R6UFUNkJpkeV8PKpkzWTXyMod0+e8h4njZoRAmHimj7Id9DMQ6HB
 ciHjZjNCd04Nu6JIpNBwlQE4epCPh79zX8hTQZf5nGNAh13CB9Wc2nHDvwCFBLxw
 HUdU5BpxMNWGujJ+T5X+RVbzA4VXXaL3iypag9EjXadg+vOV6XmPQv6Fa3WqkfBT
 GjEXB9+TG9ZuyxAObjP6yn1PRvob0l0iccAbtfnX5bE/mwqx1aDxN1QTfJrjG2Fv
 2EiUnqI+jxro66HdS5QU+W8ko7dG0tiQPMF3DCv7nfb3ZxEgfXiDrgbHyYbzdLJq
 pi5LqXBAfRHkDwRUzRU6G6pT7mplW31iTwh2AuiQogXnpPTWylb0XsUEVfYKpK9q
 Z0HzwRfHmwQSkIEDZ0rZxP3wH5ssniHyLdXh5FtfFcPSBTyDvflMqFpLmYmvDV/6
 MPfwCQAnMTrGv2GRJ8hbFiWd987bYTTfOKwJBEsFKDH01ZKYcV8=
 =5U8o
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Fix a broken metadata verifier that would incorrectly validate attr
  fork extents of a realtime file against the realtime volume"

* tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files
2020-09-05 10:04:53 -07:00
Mikulas Patocka
b17164e258 xfs: don't update mtime on COW faults
When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

 1. extract the Linux kernel tree on dax-mounted xfs filesystem
 2. run make clean
 3. run make -j12
 4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 10:00:05 -07:00
Mikulas Patocka
1ef6ea0efe ext2: don't update mtime on COW faults
When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the ext2 filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

 1. extract the Linux kernel tree on dax-mounted ext2 filesystem
 2. run make clean
 3. run make -j12
 4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 10:00:05 -07:00
Jens Axboe
c183edff33 io_uring: fix explicit async read/write mapping for large segments
If we exceed UIO_FASTIOV, we don't handle the transition correctly
between an allocated vec for requests that are queued with IOSQE_ASYNC.
Store the iovec appropriately and re-set it in the iter iov in case
it changed.

Fixes: ff6165b2d7 ("io_uring: retain iov_iter state over io_read/io_write calls")
Reported-by: Nick Hill <nick@nickhill.org>
Tested-by: Norman Maurer <norman.maurer@googlemail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 09:02:47 -06:00