linux-stable/mm
Sergey Senozhatsky 5a845e9f2d zsmalloc: rework compaction algorithm
The zsmalloc compaction algorithm has the potential to waste some CPU
cycles, particularly when compacting pages within the same fullness group.
This is due to the way it selects the head page of the fullness list for
source and destination pages, and how it reinserts those pages during each
iteration.  The algorithm may first use a page as a migration destination
and then as a migration source, leading to an unnecessary back-and-forth
movement of objects.

Consider the following fullness list:

PageA PageB PageC PageD PageE

During the first iteration, the compaction algorithm will select PageA as
the source and PageB as the destination.  All of PageA's objects will be
moved to PageB, and then PageA will be released while PageB is reinserted
into the fullness list.

PageB PageC PageD PageE

During the next iteration, the compaction algorithm will again select the
head of the list as the source and destination, meaning that PageB will
now serve as the source and PageC as the destination.  This will result in
the objects being moved away from PageB, the same objects that were just
moved to PageB in the previous iteration.

To prevent this avalanche effect, the compaction algorithm should not
reinsert the destination page between iterations.  By doing so, the most
optimal page will continue to be used and its usage ratio will increase,
reducing internal fragmentation.  The destination page should only be
reinserted into the fullness list if:
- It becomes full
- No source page is available.

TEST
====

It's very challenging to reliably test this series.  I ended up developing
my own synthetic test that has 100% reproducibility.  The test generates
significan fragmentation (for each size class) and then performs
compaction for each class individually and tracks the number of memcpy()
in zs_object_copy(), so that we can compare the amount work compaction
does on per-class basis.

Total amount of work (zram mm_stat objs_moved)
----------------------------------------------

Old fullness grouping, old compaction algorithm:
323977 memcpy() in zs_object_copy().

Old fullness grouping, new compaction algorithm:
262944 memcpy() in zs_object_copy().

New fullness grouping, new compaction algorithm:
213978 memcpy() in zs_object_copy().

Per-class compaction memcpy() comparison (T-test)
-------------------------------------------------

x Old fullness grouping, old compaction algorithm
+ Old fullness grouping, new compaction algorithm

    N           Min           Max        Median           Avg        Stddev
x 140           349          3513          2461     2314.1214     806.03271
+ 140           289          2778          2006     1878.1714     641.02073
Difference at 95.0% confidence
        -435.95 +/- 170.595
        -18.8387% +/- 7.37193%
        (Student's t, pooled s = 728.216)

x Old fullness grouping, old compaction algorithm
+ New fullness grouping, new compaction algorithm

    N           Min           Max        Median           Avg        Stddev
x 140           349          3513          2461     2314.1214     806.03271
+ 140           226          2279          1644     1528.4143     524.85268
Difference at 95.0% confidence
        -785.707 +/- 159.331
        -33.9527% +/- 6.88516%
        (Student's t, pooled s = 680.132)

Link: https://lkml.kernel.org/r/20230304034835.2082479-4-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:12 -07:00
..
damon mm/damon/paddr: fix folio_nr_pages() after folio_put() in damon_pa_mark_accessed_or_deactivate() 2023-03-07 17:04:55 -08:00
kasan kasan: test: fix test for new meminstrinsic instrumentation 2023-03-02 21:54:22 -08:00
kfence mm: kfence: fix handling discontiguous page 2023-03-28 15:24:32 -07:00
kmsan kmsan: add memsetXX tests 2023-03-28 16:20:11 -07:00
Kconfig zsmalloc: set default zspage chain size to 8 2023-02-02 22:33:23 -08:00
Kconfig.debug mm: move KMEMLEAK's Kconfig items from lib to mm 2023-02-02 22:33:26 -08:00
Makefile mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol 2022-10-03 14:03:36 -07:00
backing-dev.c mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob 2022-11-30 15:59:06 -08:00
balloon_compaction.c mm: Convert all PageMovable users to movable_operations 2022-08-02 12:34:03 -04:00
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-28 14:02:45 -07:00
cma.c mm/cma: fix potential memory loss on cma_declare_contiguous_nid 2023-02-02 22:33:24 -08:00
cma.h
cma_debug.c mm/cma_debug: show complete cma name in debugfs directories 2022-09-11 20:25:50 -07:00
cma_sysfs.c mm: cma: make kobj_type structure constant 2023-03-28 16:20:06 -07:00
compaction.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
debug.c mm/debug: use %pGt to display page_type in dump_page() 2023-03-28 16:20:09 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: replace pte_mkhuge() with arch_make_huge_pte() 2023-03-28 16:20:11 -07:00
dmapool.c
early_ioremap.c
fadvise.c mm: support POSIX_FADV_NOREUSE 2023-01-18 17:12:57 -08:00
failslab.c mm: fix unexpected changes to {failslab|fail_page_alloc}.attr 2022-11-22 18:50:44 -08:00
filemap.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
folio-compat.c mm: change to return bool for isolate_lru_page() 2023-02-20 12:46:17 -08:00
frontswap.c frontswap: don't call ->init if no ops are registered 2022-09-26 12:14:34 -07:00
gup.c mm: change to return bool for folio_isolate_lru() 2023-02-20 12:46:17 -08:00
gup_test.c mm/gup_test: free memory allocated via kvcalloc() using kvfree() 2022-12-15 16:37:48 -08:00
gup_test.h mm/gup_test: start/stop/read functionality for PIN LONGTERM test 2022-11-08 17:37:15 -08:00
highmem.c highmem: fix kmap_to_page() for kmap_local_page() addresses 2022-10-12 18:51:51 -07:00
hmm.c mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare 2023-01-18 17:12:39 -08:00
huge_memory.c mm: huge_memory: convert __do_huge_pmd_anonymous_page() to use a folio 2023-03-28 16:20:09 -07:00
hugetlb.c mm: hugetlb: change to return bool for isolate_hugetlb() 2023-02-20 12:46:17 -08:00
hugetlb_cgroup.c mm/hugetlb: increase use of folios in alloc_huge_page() 2023-02-13 15:54:27 -08:00
hugetlb_vmemmap.c mm: hugetlb_vmemmap: simplify hugetlb_vmemmap_init() a bit 2023-03-28 16:20:06 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability 2022-08-08 18:06:43 -07:00
hwpoison-inject.c mm/hwpoison: add __init/__exit annotations to module init/exit funcs 2022-10-03 14:03:05 -07:00
init-mm.c mm: remove rb tree. 2022-09-26 19:46:16 -07:00
internal.h mm, printk: introduce new format %pGt for page_type 2023-03-28 16:20:09 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: Add ioremap/iounmap_allowed() 2022-06-27 12:22:31 +01:00
khugepaged.c mm/khugepaged: cleanup memcg uncharge for failure path 2023-03-28 16:20:11 -07:00
kmemleak.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
ksm.c mm: add tracepoints to ksm 2023-03-28 16:20:08 -07:00
list_lru.c mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe 2022-06-16 19:48:31 -07:00
maccess.c maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault() 2022-11-11 11:44:46 -08:00
madvise.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
mapping_dirty_helpers.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
memblock.c memblock: small optimizations 2023-02-27 09:34:53 -08:00
memcontrol.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
memfd.c mm/memfd: add write seals when apply SEAL_EXEC to executable memfd 2023-01-18 17:12:37 -08:00
memory-failure.c mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON 2023-02-27 17:00:14 -08:00
memory-tiers.c memory tier: release the new_memtier in find_create_memory_tier() 2023-02-09 16:51:40 -08:00
memory.c mm: memory: use folio_throttle_swaprate() in do_cow_fault() 2023-03-28 16:20:10 -07:00
memory_hotplug.c mm/memory_hotplug: cleanup return value handing in do_migrate_range() 2023-02-20 12:46:18 -08:00
mempolicy.c mm: hugetlb: change to return bool for isolate_hugetlb() 2023-02-20 12:46:17 -08:00
mempool.c mempool: do not use ksize() for poisoning 2022-11-30 15:58:41 -08:00
memremap.c mm/memremap.c: fix outdated comment in devm_memremap_pages 2023-02-09 16:51:46 -08:00
memtest.c
migrate.c mm/migrate: drop pte_mkhuge() in remove_migration_pte() 2023-03-28 16:20:11 -07:00
migrate_device.c mm: change to return bool for isolate_lru_page() 2023-02-20 12:46:17 -08:00
mincore.c mm: teach mincore_hugetlb about pte markers 2023-03-07 17:04:53 -08:00
mlock.c mm: introduce vm_flags_reset_once to replace WRITE_ONCE vm_flags updates 2023-02-09 16:51:41 -08:00
mm_init.c memory: move hotplug memory notifier priority to same file for easy sorting 2022-11-08 17:37:17 -08:00
mm_slot.h mm: introduce common struct mm_slot 2022-10-03 14:02:43 -07:00
mmap.c mm: deduplicate error handling for map_deny_write_exec 2023-03-23 17:18:32 -07:00
mmap_lock.c
mmu_gather.c mm: mmu_gather: allow more than one batch of delayed rmaps 2022-12-11 18:12:21 -08:00
mmu_notifier.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
mmzone.c mm: multi-gen LRU: groundwork 2022-09-26 19:46:09 -07:00
mprotect.c mm: fix error handling for map_deny_write_exec 2023-03-23 17:18:33 -07:00
mremap.c x86/mm/pat: clear VM_PAT if copy_p4d_range failed 2023-03-28 16:20:07 -07:00
msync.c mm/msync: use vma_find() instead of vma linked list 2022-09-26 19:46:25 -07:00
nommu.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
oom_kill.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
page-writeback.c fs: convert writepage_t callback to pass a folio 2023-02-02 22:33:34 -08:00
page_alloc.c mm, page_alloc: reduce page alloc/free sanity checks 2023-03-28 16:20:06 -07:00
page_counter.c mm: page_counter: remove unneeded atomic ops for low/min 2022-09-11 20:26:01 -07:00
page_ext.c mm/page_ext: init page_ext early if there are no deferred struct pages 2023-02-02 22:33:22 -08:00
page_idle.c mm: page_idle: convert page idle to use a folio 2023-01-18 17:12:52 -08:00
page_io.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
page_isolation.c mm/page_isolation: fix clang deadcode warning 2022-10-28 13:37:22 -07:00
page_owner.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
page_poison.c
page_reporting.c mm/page_reporting: replace rcu_access_pointer() with rcu_dereference_protected() 2023-01-18 17:12:50 -08:00
page_reporting.h
page_table_check.c mm/page_ext: do not allocate space for page_ext->flags if not needed 2023-02-02 22:33:11 -08:00
page_vma_mapped.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
pagewalk.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
percpu-internal.h mm: percpu: fix incorrect size in pcpu_obj_full_size() 2023-02-16 20:43:55 -08:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: memcontrol: rename memcg_kmem_enabled() 2023-02-16 20:43:56 -08:00
pgalloc-track.h
pgtable-generic.c
process_vm_access.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-03 10:13:13 -07:00
readahead.c readahead: convert readahead_expand() to use a folio 2023-02-02 22:33:21 -08:00
rmap.c mm/rmap: use atomic_try_cmpxchg in set_tlb_ubc_flush_pending 2023-03-28 16:20:09 -07:00
rodata_test.c mm/rodata_test: use PAGE_ALIGNED() helper 2022-10-03 14:03:05 -07:00
secretmem.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
shmem.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
shrinker_debug.c mm: shrinkers: fix deadlock in shrinker debugfs 2023-02-09 15:56:51 -08:00
shuffle.c mm/shuffle: convert module_param_call to module_param_cb 2022-10-03 14:03:07 -07:00
shuffle.h
slab.c slab fix for 6.3-rc4 2023-03-24 10:12:14 -07:00
slab.h mm: memcontrol: rename memcg_kmem_enabled() 2023-02-16 20:43:56 -08:00
slab_common.c mm/kasan: simplify and refine kasan_cache code 2023-01-18 17:12:55 -08:00
slob.c Merge branch 'slab/for-6.1/kmalloc_size_roundup' into slab/for-next 2022-09-29 11:30:55 +02:00
slub.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
sparse-vmemmap.c mm/sparse-vmemmap: generalise vmemmap_populate_hugepages() 2022-12-11 18:12:12 -08:00
sparse.c mm/sparse: fix "unused function 'pgdat_to_phys'" warning 2023-02-02 22:33:29 -08:00
swap.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
swap.h mm: remove the __swap_writepage return value 2023-02-02 22:33:33 -08:00
swap_cgroup.c mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled 2022-10-03 14:03:36 -07:00
swap_slots.c mm/swap: convert put_swap_page() to put_swap_folio() 2022-10-03 14:02:46 -07:00
swap_state.c swap_state: update shadow_nodes for anonymous page 2023-02-02 22:33:24 -08:00
swapfile.c mm: swap: remove unneeded cgroup_throttle_swaprate() 2023-03-28 16:20:10 -07:00
truncate.c folio-compat: remove lru_cache_add() 2022-12-11 18:12:13 -08:00
usercopy.c mm: use kstrtobool() instead of strtobool() 2022-11-30 15:58:45 -08:00
userfaultfd.c mm/userfaultfd: support WP on multiple VMAs 2023-03-28 16:20:07 -07:00
util.c mm: fix typo in __vm_enough_memory warning 2023-02-13 15:54:33 -08:00
vmalloc.c mm, vmalloc: fix high order __GFP_NOFAIL allocations 2023-03-23 17:18:31 -07:00
vmpressure.c
vmscan.c mm: multi-gen LRU: improve design doc 2023-03-28 16:20:07 -07:00
vmstat.c mm: vmscan: split khugepaged stats from direct reclaim stats 2022-11-30 15:58:41 -08:00
workingset.c swap_state: update shadow_nodes for anonymous page 2023-02-02 22:33:24 -08:00
z3fold.c mm: remove PageMovable export 2023-01-18 17:12:57 -08:00
zbud.c zpool: clean out dead code 2022-12-11 18:12:10 -08:00
zpool.c zpool: clean out dead code 2022-12-11 18:12:10 -08:00
zsmalloc.c zsmalloc: rework compaction algorithm 2023-03-28 16:20:12 -07:00
zswap.c mm/zswap: try to avoid worst-case scenario on same element pages 2023-03-28 16:20:07 -07:00