No description
Find a file
Vlastimil Babka b1683d6a7e mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
commit 803de9000f upstream.

Sven reports an infinite loop in __alloc_pages_slowpath() for costly order
__GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO.  Such combination
can happen in a suspend/resume context where a GFP_KERNEL allocation can
have __GFP_IO masked out via gfp_allowed_mask.

Quoting Sven:

1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER)
   with __GFP_RETRY_MAYFAIL set.

2. page alloc's __alloc_pages_slowpath tries to get a page from the
   freelist. This fails because there is nothing free of that costly
   order.

3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim,
   which bails out because a zone is ready to be compacted; it pretends
   to have made a single page of progress.

4. page alloc tries to compact, but this always bails out early because
   __GFP_IO is not set (it's not passed by the snd allocator, and even
   if it were, we are suspending so the __GFP_IO flag would be cleared
   anyway).

5. page alloc believes reclaim progress was made (because of the
   pretense in item 3) and so it checks whether it should retry
   compaction. The compaction retry logic thinks it should try again,
   because:
    a) reclaim is needed because of the early bail-out in item 4
    b) a zonelist is suitable for compaction

6. goto 2. indefinite stall.

(end quote)

The immediate root cause is confusing the COMPACT_SKIPPED returned from
__alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO to be
indicating a lack of order-0 pages, and in step 5 evaluating that in
should_compact_retry() as a reason to retry, before incrementing and
limiting the number of retries.  There are however other places that
wrongly assume that compaction can happen while we lack __GFP_IO.

To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO
evaluation and switch the open-coded test in try_to_compact_pages() to use
it.

Also use the new helper in:
- compaction_ready(), which will make reclaim not bail out in step 3, so
  there's at least one attempt to actually reclaim, even if chances are
  small for a costly order
- in_reclaim_compaction() which will make should_continue_reclaim()
  return false and we don't over-reclaim unnecessarily
- in __alloc_pages_slowpath() to set a local variable can_compact,
  which is then used to avoid retrying reclaim/compaction for costly
  allocations (step 5) if we can't compact and also to skip the early
  compaction attempt that we do in some cases

Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz
Fixes: 3250845d05 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sven van Ashbrook <svenva@chromium.org>
Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzPUVOZF%2Bg@mail.gmail.com/
Tested-by: Karthikeyan Ramasubramanian <kramasub@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Curtis Malainey <cujomalainey@chromium.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-03 15:11:40 +02:00
arch ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE 2024-04-03 15:11:40 +02:00
block Revert "block/mq-deadline: use correct way to throttling write requests" 2024-04-03 15:11:26 +02:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto Revert "crypto: pkcs7 - remove sha1 support" 2024-04-03 15:11:35 +02:00
Documentation docs: Restore "smart quotes" for quotes 2024-04-03 15:11:13 +02:00
drivers tee: optee: Fix kernel panic caused by incorrect error handling 2024-04-03 15:11:40 +02:00
fs fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion 2024-04-03 15:11:39 +02:00
include mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations 2024-04-03 15:11:40 +02:00
init init/Kconfig: lower GCC version check for -Warray-bounds 2024-04-03 15:11:37 +02:00
io_uring io_uring/waitid: always remove waitid entry for cancel all 2024-04-03 15:11:28 +02:00
ipc Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
kernel tracing: Use .flush() call to wake up readers 2024-04-03 15:11:37 +02:00
lib pci_iounmap(): Fix MMIO mapping leak 2024-04-03 15:11:07 +02:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations 2024-04-03 15:11:40 +02:00
net xfrm: Avoid clang fortify warning in copy_to_user_tmpl() 2024-04-03 15:11:36 +02:00
rust rust: Ignore preserve-most functions 2024-01-25 15:45:09 -08:00
samples eventfd: simplify eventfd_signal() 2024-04-03 15:11:23 +02:00
scripts kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1 2024-04-03 15:11:22 +02:00
security landlock: Warn once if a Landlock action is requested while disabled 2024-04-03 15:11:19 +02:00
sound ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 2024-04-03 15:11:40 +02:00
tools selftests: mptcp: diag: return KSFT_FAIL not test_cnt 2024-04-03 15:11:36 +02:00
usr arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
virt eventfd: simplify eventfd_signal() 2024-04-03 15:11:23 +02:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: generate kernel.spec in rpmbuild/SPECS/ 2023-10-03 20:49:09 +09:00
.mailmap 12 hotfixes. 2 are cc:stable and the remainder either address post-6.7 2024-01-05 13:46:18 -08:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS 12 hotfixes. 2 are cc:stable and the remainder either address post-6.7 2024-01-05 13:46:18 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS 12 hotfixes. 2 are cc:stable and the remainder either address post-6.7 2024-01-05 13:46:18 -08:00
Makefile Linux 6.7.11 2024-03-26 18:22:50 -04:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.