Go to file
Vlastimil Babka ea57485af8 mm, page_alloc: fix check for NULL preferred_zone
Patch series "fix premature OOM regression in 4.7+ due to cpuset races".

This is v2 of my attempt to fix the recent report based on LTP cpuset
stress test [1].  The intention is to go to stable 4.9 LTSS with this,
as triggering repeated OOMs is not nice.  That's why the patches try to
be not too intrusive.

Unfortunately why investigating I found that modifying the testcase to
use per-VMA policies instead of per-task policies will bring the OOM's
back, but that seems to be much older and harder to fix problem.  I have
posted a RFC [2] but I believe that fixing the recent regressions has a
higher priority.

Longer-term we might try to think how to fix the cpuset mess in a better
and less error prone way.  I was for example very surprised to learn,
that cpuset updates change not only task->mems_allowed, but also
nodemask of mempolicies.  Until now I expected the parameter to
alloc_pages_nodemask() to be stable.  I wonder why do we then treat
cpusets specially in get_page_from_freelist() and distinguish HARDWALL
etc, when there's unconditional intersection between mempolicy and
cpuset.  I would expect the nodemask adjustment for saving overhead in
g_p_f(), but that clearly doesn't happen in the current form.  So we
have both crazy complexity and overhead, AFAICS.

[1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
[2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz

This patch (of 4):

Since commit c33d6c06f6 ("mm, page_alloc: avoid looking up the first
zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
which can theoretically happen due to concurrent cpuset modification.  We
check the zoneref pointer which is never NULL and we should check the zone
pointer.  Also document this in first_zones_zonelist() comment per Michal
Hocko.

Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-24 16:26:14 -08:00
Documentation Documentation/filesystems/proc.txt: add VmPin 2017-01-24 16:26:14 -08:00
arch frv: add atomic64_add_unless() 2017-01-24 16:26:14 -08:00
block blk-mq: Remove unused variable 2017-01-18 15:14:15 -07:00
certs certs: Add a secondary system keyring that can be added to dynamically 2016-04-11 22:48:09 +01:00
crypto crypto: testmgr - Use heap buffer for acomp test input 2016-12-27 17:32:11 +08:00
drivers fbdev: color map copying bounds checking 2017-01-24 16:26:14 -08:00
firmware WHENCE: use https://linuxtv.org for LinuxTV URLs 2015-12-04 10:35:11 -02:00
fs proc: add a schedule point in proc_pid_readdir() 2017-01-24 16:26:14 -08:00
include mm, page_alloc: fix check for NULL preferred_zone 2017-01-24 16:26:14 -08:00
init cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig 2017-01-11 09:47:10 -05:00
ipc ipc/sem.c: fix incorrect sem_lock pairing 2017-01-10 18:31:55 -08:00
kernel kernel/panic.c: add missing \n 2017-01-24 16:26:14 -08:00
lib radix-tree: fix private list warnings 2017-01-24 16:26:14 -08:00
mm mm, page_alloc: fix check for NULL preferred_zone 2017-01-24 16:26:14 -08:00
net Three filesystem endianness fixes (one goes back to the 2.6 era, all 2017-01-20 12:15:48 -08:00
samples Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-01-15 11:37:43 -08:00
scripts gcc-plugins: update gcc-common.h for gcc-7 2017-01-03 12:08:59 -08:00
security Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
sound ASoC: Fixes for v4.10 2017-01-11 19:49:27 +01:00
tools virtio, vhost: fixes, cleanups 2017-01-22 12:40:09 -08:00
usr kbuild: initramfs cleanup, set target from Kconfig 2017-01-05 09:40:16 -08:00
virt KVM/ARM updates for 4.10-rc4 2017-01-17 15:04:59 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-08-02 16:48:52 -04:00
.mailmap mailmap: add codeaurora.org names for nameless email commits 2017-01-10 18:31:55 -08:00
COPYING
CREDITS CREDITS: Remove outdated address information 2016-12-21 15:21:29 -08:00
Kbuild scripts/gdb: provide linux constants 2016-05-23 17:04:14 -07:00
Kconfig
MAINTAINERS drm fixes across the board 2017-01-23 13:10:50 -08:00
Makefile Linux 4.10-rc5 2017-01-22 12:54:15 -08:00
README README: add a new README file, pointing to the Documentation/ 2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.