linux-stable/mm
Qian Cai cd02cf1ace mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
When onlining a memory block with DEBUG_PAGEALLOC, it unmaps the pages
in the block from kernel, However, it does not map those pages while
offlining at the beginning.  As the result, it triggers a panic below
while onlining on ppc64le as it checks if the pages are mapped before
unmapping.  However, the imbalance exists for all arches where
double-unmappings could happen.  Therefore, let kernel map those pages
in generic_online_page() before they have being freed into the page
allocator for the first time where it will set the page count to one.

On the other hand, it works fine during the boot, because at least for
IBM POWER8, it does,

early_setup
  early_init_mmu
    harsh__early_init_mmu
      htab_initialize [1]
        htab_bolt_mapping [2]

where it effectively map all memblock regions just like
kernel_map_linear_page(), so later mem_init() -> memblock_free_all()
will unmap them just fine without any imbalance.  On other arches
without this imbalance checking, it still unmap them once at the most.

[1]
for_each_memblock(memory, reg) {
        base = (unsigned long)__va(reg->base);
        size = reg->size;

        DBG("creating mapping for region: %lx..%lx (prot: %lx)\n",
                base, size, prot);

        BUG_ON(htab_bolt_mapping(base, base + size, __pa(base),
                prot, mmu_linear_psize, mmu_kernel_ssize));
        }

[2] linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80;
    kernel BUG at arch/powerpc/mm/hash_utils_64.c:1815!
    Oops: Exception in kernel mode, sig: 5 [#1]
    LE SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA pSeries
    CPU: 2 PID: 4298 Comm: bash Not tainted 5.0.0-rc7+ #15
    NIP:  c000000000062670 LR: c00000000006265c CTR: 0000000000000000
    REGS: c0000005bf8a75b0 TRAP: 0700   Not tainted  (5.0.0-rc7+)
    MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28422842
    XER: 00000000
    CFAR: c000000000804f44 IRQMASK: 1
    NIP [c000000000062670] __kernel_map_pages+0x2e0/0x4f0
    LR [c00000000006265c] __kernel_map_pages+0x2cc/0x4f0
    Call Trace:
       __kernel_map_pages+0x2cc/0x4f0
       free_unref_page_prepare+0x2f0/0x4d0
       free_unref_page+0x44/0x90
       __online_page_free+0x84/0x110
       online_pages_range+0xc0/0x150
       walk_system_ram_range+0xc8/0x120
       online_pages+0x280/0x5a0
       memory_subsys_online+0x1b4/0x270
       device_online+0xc0/0xf0
       state_store+0xc0/0x180
       dev_attr_store+0x3c/0x60
       sysfs_kf_write+0x70/0xb0
       kernfs_fop_write+0x10c/0x250
       __vfs_write+0x48/0x240
       vfs_write+0xd8/0x210
       ksys_write+0x70/0x120
       system_call+0x5c/0x70

Link: http://lkml.kernel.org/r/20190301220814.97339-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
..
kasan kasan: fix coccinelle warnings in kasan_p*_table 2019-03-05 21:07:13 -08:00
backing-dev.c writeback: synchronize sync(2) against cgroup writeback membership switches 2019-01-22 14:39:38 -07:00
balloon_compaction.c
cleancache.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
cma.c mm/cma.c: cma_declare_contiguous: correct err handling 2019-03-05 21:07:21 -08:00
cma.h
cma_debug.c mm/cma_debug.c: remove static scoped cma_debugfs_root 2019-03-05 21:07:20 -08:00
compaction.c mm/compaction: pass pgdat to too_many_isolated() instead of zone 2019-03-05 21:07:21 -08:00
debug.c mm/debug.c: fix __dump_page() for poisoned pages 2019-02-21 09:01:00 -08:00
debug_page_ref.c
dmapool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
early_ioremap.c
fadvise.c vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
failslab.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
filemap.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
frame_vector.c
frontswap.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
gup.c mm: update get_user_pages_longterm to migrate pages allocated from CMA region 2019-03-05 21:07:19 -08:00
gup_benchmark.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
highmem.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
hmm.c mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm 2018-12-28 12:11:52 -08:00
huge_memory.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
hugetlb.c mm: update get_user_pages_longterm to migrate pages allocated from CMA region 2019-03-05 21:07:19 -08:00
hugetlb_cgroup.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
hwpoison-inject.c
init-mm.c mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids 2018-07-17 09:35:30 +02:00
internal.h mm, compaction: capture a page under direct compaction 2019-03-05 21:07:17 -08:00
interval_tree.c
Kconfig ksm: replace jhash2 with xxhash 2018-12-28 12:11:46 -08:00
Kconfig.debug mm/page_owner: move config option to mm/Kconfig.debug 2019-03-05 21:07:18 -08:00
khugepaged.c mm: memcontrol: expose THP events on a per-memcg basis 2019-03-05 21:07:19 -08:00
kmemleak-test.c
kmemleak.c kmemleak: account for tagged pointers when calculating pointer range 2019-02-21 09:01:00 -08:00
ksm.c mm: ksm: do not block on page lock when searching stable tree 2019-03-05 21:07:19 -08:00
list_lru.c numa: make "nr_node_ids" unsigned int 2019-03-05 21:07:19 -08:00
maccess.c Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses" 2019-02-25 09:10:51 -08:00
madvise.c mm/mmu_notifier: use structure for invalidate_range_start/end calls v2 2018-12-28 12:11:50 -08:00
Makefile mm: remove nobootmem 2018-10-31 08:54:16 -07:00
memblock.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
memcontrol.c mm/memcontrol.c: fix bad line in comment 2019-03-05 21:07:21 -08:00
memfd.c mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd 2019-03-05 21:07:19 -08:00
memory-failure.c mm: hwpoison: fix thp split handing in soft_offline_in_use_page() 2019-03-05 21:07:13 -08:00
memory.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memory_hotplug.c mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC 2019-03-05 21:07:21 -08:00
mempolicy.c mm, mempolicy: fix uninit memory access 2019-03-05 21:07:18 -08:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memtest.c
migrate.c mm/migrate.c: cleanup expected_page_refs() 2019-03-05 21:07:20 -08:00
mincore.c Revert "Change mincore() to count "mapped" pages rather than "cached" pages" 2019-01-24 09:04:37 +13:00
mlock.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
mm_init.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
mmap.c mm: fix some typos in mm directory 2019-03-05 21:07:18 -08:00
mmu_context.c
mmu_gather.c mm: Replace call_rcu_sched() with call_rcu() 2018-11-27 09:21:46 -08:00
mmu_notifier.c mm/mmu_notifier: use structure for invalidate_range_start/end calls v2 2018-12-28 12:11:50 -08:00
mmzone.c
mprotect.c mm: update ptep_modify_prot_commit to take old pte value as arg 2019-03-05 21:07:18 -08:00
mremap.c mm: speed up mremap by 20x on large regions 2019-01-04 13:13:48 -08:00
msync.c
nommu.c mm/gup: cache dev_pagemap while pinning pages 2018-10-26 16:38:15 -07:00
oom_kill.c mm,oom: don't kill global init via memory.oom.group 2019-03-05 21:07:19 -08:00
page-writeback.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
page_alloc.c mm: unexport free_reserved_area 2019-03-05 21:07:20 -08:00
page_counter.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
page_ext.c mm/page_ext.c: fix an imbalance with kmemleak 2019-03-05 21:07:21 -08:00
page_idle.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
page_io.c mm/page_io.c: fix polled swap page in 2019-01-04 13:13:48 -08:00
page_isolation.c mm: only report isolation failures when offlining memory 2018-12-28 12:11:46 -08:00
page_owner.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
page_poison.c page_poison: play nicely with KASAN 2019-03-05 21:07:13 -08:00
page_vma_mapped.c mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly 2018-10-31 08:54:11 -07:00
pagewalk.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
percpu-internal.h
percpu-km.c percpu: convert spin_lock_irq to spin_lock_irqsave. 2018-12-18 09:04:08 -08:00
percpu-stats.c treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
percpu-vm.c percpu: allow select gfp to be passed to underlying allocators 2018-02-18 05:33:01 -08:00
percpu.c Merge branch 'for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2018-11-01 09:27:57 -07:00
pgtable-generic.c x86/mm: Page size aware flush_tlb_mm_range() 2018-10-09 16:51:11 +02:00
process_vm_access.c
quicklist.c
readahead.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
rmap.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
rodata_test.c
shmem.c mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd 2019-03-05 21:07:19 -08:00
slab.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
slab.h memcg: localize memcg_kmem_enabled() check 2019-03-05 21:07:15 -08:00
slab_common.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
slob.c slab: __GFP_ZERO is incompatible with a constructor 2018-06-07 17:34:34 -07:00
slub.c numa: make "nr_node_ids" unsigned int 2019-03-05 21:07:19 -08:00
sparse-vmemmap.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sparse.c mm, sparse: pass nid instead of pgdat to sparse_add_one_section() 2018-12-28 12:11:49 -08:00
swap.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
swap_cgroup.c
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c mm: swap: add comment for swap_vma_readahead 2019-03-05 21:07:16 -08:00
swapfile.c mm/swapfile.c: use struct_size() in kvzalloc() 2019-03-05 21:07:21 -08:00
truncate.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
usercopy.c mm/usercopy.c: no check page span for stack objects 2019-01-08 17:15:11 -08:00
userfaultfd.c hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" 2019-01-08 17:15:11 -08:00
util.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
vmpressure.c mm/vmpressure.c: convert to use match_string() helper 2018-06-07 17:34:36 -07:00
vmscan.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
vmstat.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
workingset.c mm/workingset: remove unused @mapping argument in workingset_eviction() 2019-03-05 21:07:21 -08:00
z3fold.c z3fold: fix possible reclaim races 2018-11-18 10:15:09 -08:00
zbud.c
zpool.c mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc 2018-02-21 15:35:43 -08:00
zsmalloc.c mm/zsmalloc.c: fix fall-through annotation 2018-10-26 16:26:35 -07:00
zswap.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00