No description
Find a file
Yang Shi 46c22e7b09 mm: gup: fix the fast GUP race against THP collapse
commit 70cbc3cc78 upstream.

Since general RCU GUP fast was introduced in commit 2667f50e8b ("mm:
introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
sufficient to handle concurrent GUP-fast in all cases, it only handles
traditional IPI-based GUP-fast correctly.  On architectures that send an
IPI broadcast on TLB flush, it works as expected.  But on the
architectures that do not use IPI to broadcast TLB flush, it may have the
below race:

   CPU A                                          CPU B
THP collapse                                     fast GUP
                                              gup_pmd_range() <-- see valid pmd
                                                  gup_pte_range() <-- work on pte
pmdp_collapse_flush() <-- clear pmd and flush
__collapse_huge_page_isolate()
    check page pinned <-- before GUP bump refcount
                                                      pin the page
                                                      check PTE <-- no change
__collapse_huge_page_copy()
    copy data to huge page
    ptep_clear()
install huge pmd for the huge page
                                                      return the stale page
discard the stale page

The race can be fixed by checking whether PMD is changed or not after
taking the page pin in fast GUP, just like what it does for PTE.  If the
PMD is changed it means there may be parallel THP collapse, so GUP should
back off.

Also update the stale comment about serializing against fast GUP in
khugepaged.

Link: https://lkml.kernel.org/r/20220907180144.555485-1-shy828301@gmail.com
Fixes: 2667f50e8b ("mm: introduce a general RCU get_user_pages_fast()")
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-12 09:53:26 +02:00
arch x86/alternative: Fix race in try_get_desc() 2022-10-05 10:39:44 +02:00
block block: blk_queue_enter() / __bio_queue_enter() must return -EAGAIN for nowait 2022-09-23 14:15:48 +02:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:22:01 +02:00
crypto KEYS: asymmetric: enforce SM2 signature use pkey algo 2022-08-17 14:24:28 +02:00
Documentation docs: update mediator information in CoC docs 2022-10-12 09:53:26 +02:00
drivers drm/i915/gem: Really move i915_gem_context.link under ref protection 2022-10-05 10:39:44 +02:00
fs fs: split off setxattr_copy and do_setxattr function from setxattr 2022-10-05 10:39:44 +02:00
include xsk: Inherit need_wakeup flag for shared sockets 2022-10-12 09:53:26 +02:00
init stack: Declare {randomize_,}kstack_offset to fix Sparse warnings 2022-08-17 14:23:10 +02:00
ipc ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() 2022-06-09 10:23:10 +02:00
kernel swiotlb: max mapping size takes min align mask into account 2022-10-05 10:39:40 +02:00
lib crypto: lib - remove unneeded selection of XOR_BLOCKS 2022-09-05 10:30:03 +02:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm mm: gup: fix the fast GUP race against THP collapse 2022-10-12 09:53:26 +02:00
net xsk: Inherit need_wakeup flag for shared sockets 2022-10-12 09:53:26 +02:00
samples samples/landlock: Format with clang-format 2022-06-09 10:23:23 +02:00
scripts Makefile.extrawarn: Move -Wcast-function-type-strict to W=1 2022-10-12 09:53:26 +02:00
security apparmor: Fix memleak in aa_simple_write_to_buffer() 2022-08-25 11:40:01 +02:00
sound ASoC: tas2770: Reinit regcache on reset 2022-10-05 10:39:41 +02:00
tools selftests: Fix the if conditions of in test_extra_filter() 2022-10-05 10:39:43 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:27:15 +01:00
virt KVM: SEV: add cache flush to solve SEV cache incoherency issues 2022-09-23 14:15:52 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Input: goodix - add a goodix.h header file 2022-07-12 16:34:51 +02:00
Makefile Linux 5.15.72 2022-10-05 10:39:44 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.