Commit graph

1058783 commits

Author SHA1 Message Date
Zhou Qingyang
49a8bf50ca drm/i915/gem: Fix a NULL pointer dereference in igt_request_rewind()
In igt_request_rewind(), mock_context(i915, "A") is assigned to ctx[0]
and used in i915_gem_context_get_engine(). There is a dereference
of ctx[0] in i915_gem_context_get_engine(), which could lead to a NULL
pointer dereference on failure of mock_context(i915, "A") .

So as mock_context(i915, "B").

Although this bug is not serious for it belongs to testing code, it is
better to be fixed to avoid unexpected failure in testing.

Fix this bugs by adding checks about ctx[0] and ctx[1].

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_DRM_I915_SELFTEST=y show no new warnings,
and our static analyzer no longer warns about this code.

References: 591c0fb85d ("drm/i915: Exercise request cancellation using a mock selftest")
[tursulin: Replaced fixes with references to avoid.]
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211130141545.153899-1-zhou1615@umn.edu
2021-12-01 09:22:15 +00:00
Tvrtko Ursulin
cca0846923 drm/i915: Use per device iommu check
With both integrated and discrete Intel GPUs in a system, the current
global check of intel_iommu_gfx_mapped, as done from intel_vtd_active()
may not be completely accurate.

In this patch we add i915 parameter to intel_vtd_active() in order to
prepare it for multiple GPUs and we also change the check away from Intel
specific intel_iommu_gfx_mapped (global exported by the Intel IOMMU
driver) to probing the presence of IOMMU on a specific device using
device_iommu_mapped().

This will return true both for IOMMU pass-through and address translation
modes which matches the current behaviour. If in the future we wanted to
distinguish between these two modes we could either use
iommu_get_domain_for_dev() and check for __IOMMU_DOMAIN_PAGING bit
indicating address translation, or ask for a new API to be exported from
the IOMMU core code.

v2:
  * Check for dmar translation specifically, not just iommu domain. (Baolu)

v3:
 * Go back to plain "any domain" check for now, rewrite commit message.

v4:
 * Use device_iommu_mapped. (Robin, Baolu)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211126141424.493753-1-tvrtko.ursulin@linux.intel.com
2021-12-01 09:21:47 +00:00
Matthew Brost
44505168d7 drm/i915: Drop stealing of bits from i915_sw_fence function pointer
Rather than stealing bits from i915_sw_fence function pointer use
separate fields for function pointer and flags. If using two different
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
 (CI)
  - Set new function field rather than flags in __i915_sw_fence_init
v3:
 (Tvrtko)
  - Remove BUG_ON(!fence->flags) in reinit as that will now blow up
  - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is
    defined
v4:
  - Rebase, resend for CI

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116194929.10211-1-matthew.brost@intel.com
2021-11-30 17:52:15 -08:00
Umesh Nerlige Ramappa
2a67b18e67 drm/i915/pmu: Fix synchronization of PMU callback with reset
Since the PMU callback runs in irq context, it synchronizes with gt
reset using the reset count. We could run into a case where the PMU
callback could read the reset count before it is updated. This has a
potential of corrupting the busyness stats.

In addition to the reset count, check if the reset bit is set before
capturing busyness.

In addition save the previous stats only if you intend to update them.

v2:
- The 2 reset counts captured in the PMU callback can end up being the
  same if they were captured right after the count is incremented in the
  reset flow. This can lead to a bad busyness state. Ensure that reset
  is not in progress when the initial reset count is captured.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108211057.68783-1-umesh.nerlige.ramappa@intel.com
2021-11-30 17:08:07 -08:00
Thomas Hellström
95d3583888 dma_fence_array: Fix PENDING_ERROR leak in dma_fence_array_signaled()
If a dma_fence_array is reported signaled by a call to
dma_fence_is_signaled(), it may leak the PENDING_ERROR status.

Fix this by clearing the PENDING_ERROR status if we return true in
dma_fence_array_signaled().

v2:
- Update Cc list, and add R-b.

Fixes: 1f70b8b812 ("dma-fence: Propagate errors to dma-fence-array container")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211129152727.448908-1-thomas.hellstrom@linux.intel.com
2021-11-30 07:46:24 +01:00
Matthew Auld
3ccadbce85 drm/i915/gemfs: don't mark huge_opt as static
vfs_kernel_mount() modifies the passed in mount options, leaving us with
"huge", instead of "huge=within_size". Normally this shouldn't matter
with the usual module load/unload flow, however with the core_hotunplug
IGT we are hitting the following, when re-probing the memory regions:

i915 0000:00:02.0: [drm] Transparent Hugepage mode 'huge'
tmpfs: Bad value for 'huge'
[drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-22).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/4651
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211126110843.2028582-1-matthew.auld@intel.com
2021-11-26 14:46:01 +00:00
Thomas Hellström
8b91cdd4f8 drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
The capture code is typically run entirely in the fence signalling
critical path. We're about to add lockdep annotation in an upcoming patch
which reveals a lockdep splat similar to the below one.

Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
(which is the same as GFP_WAIT, but open-coded for clarity) rather than
GFP_KERNEL for memory allocation in the capture path. This has the
potential drawback that capture might fail in situations with memory
pressure.

[  234.842048] WARNING: possible circular locking dependency detected
[  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
[  234.842052] ------------------------------------------------------
[  234.842054] gem_exec_captur/1180 is trying to acquire lock:
[  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
[  234.842063]
               but task is already holding lock:
[  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842138]
               which lock already depends on the new lock.

[  234.842140]
               the existing dependency chain (in reverse order) is:
[  234.842142]
               -> #2 (dma_fence_map){++++}-{0:0}:
[  234.842145]        __dma_fence_might_wait+0x41/0xa0
[  234.842149]        dma_resv_lockdep+0x1dc/0x28f
[  234.842151]        do_one_initcall+0x58/0x2d0
[  234.842154]        kernel_init_freeable+0x273/0x2bf
[  234.842157]        kernel_init+0x16/0x120
[  234.842160]        ret_from_fork+0x1f/0x30
[  234.842163]
               -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  234.842166]        fs_reclaim_acquire+0x6d/0xd0
[  234.842168]        __kmalloc_node+0x51/0x3a0
[  234.842171]        alloc_cpumask_var_node+0x1b/0x30
[  234.842174]        native_smp_prepare_cpus+0xc7/0x292
[  234.842177]        kernel_init_freeable+0x160/0x2bf
[  234.842179]        kernel_init+0x16/0x120
[  234.842181]        ret_from_fork+0x1f/0x30
[  234.842184]
               -> #0 (fs_reclaim){+.+.}-{0:0}:
[  234.842186]        __lock_acquire+0x1161/0x1dc0
[  234.842189]        lock_acquire+0xb5/0x2b0
[  234.842192]        fs_reclaim_acquire+0xa1/0xd0
[  234.842193]        __kmalloc+0x4d/0x330
[  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
[  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.842504]        simple_attr_write+0xc1/0xe0
[  234.842507]        full_proxy_write+0x53/0x80
[  234.842509]        vfs_write+0xbc/0x350
[  234.842513]        ksys_write+0x58/0xd0
[  234.842514]        do_syscall_64+0x38/0x90
[  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.842519]
               other info that might help us debug this:

[  234.842521] Chain exists of:
                 fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

[  234.842526]  Possible unsafe locking scenario:

[  234.842528]        CPU0                    CPU1
[  234.842529]        ----                    ----
[  234.842531]   lock(dma_fence_map);
[  234.842532]                                lock(mmu_notifier_invalidate_range_start);
[  234.842535]                                lock(dma_fence_map);
[  234.842537]   lock(fs_reclaim);
[  234.842539]
                *** DEADLOCK ***

[  234.842540] 4 locks held by gem_exec_captur/1180:
[  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
[  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
[  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842656]
               stack backtrace:
[  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
[  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[  234.842664] Call Trace:
[  234.842666]  dump_stack_lvl+0x57/0x72
[  234.842669]  check_noncircular+0xde/0x100
[  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
[  234.842675]  __lock_acquire+0x1161/0x1dc0
[  234.842678]  lock_acquire+0xb5/0x2b0
[  234.842680]  ? __kmalloc+0x4d/0x330
[  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
[  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842734]  fs_reclaim_acquire+0xa1/0xd0
[  234.842737]  ? __kmalloc+0x4d/0x330
[  234.842739]  __kmalloc+0x4d/0x330
[  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842793]  ? capture_vma+0xbe/0x110 [i915]
[  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
[  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.843032]  ? __mutex_lock+0x81/0x830
[  234.843035]  ? simple_attr_write+0x3a/0xe0
[  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
[  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.843083]  ? _copy_from_user+0x45/0x80
[  234.843086]  simple_attr_write+0xc1/0xe0
[  234.843089]  full_proxy_write+0x53/0x80
[  234.843091]  vfs_write+0xbc/0x350
[  234.843094]  ksys_write+0x58/0xd0
[  234.843096]  do_syscall_64+0x38/0x90
[  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.843101] RIP: 0033:0x7fa467480877
[  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
[  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
[  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
[  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
[  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005

v5:
- Use __GFP_KSWAPD_RECLAIM rather than __GFP_NOWAIT for clarity.
  (Daniel Vetter)
v6:
- Include an instance in execlists_capture_work().
- Rework the commit message due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
2021-11-26 08:26:10 +01:00
Thomas Hellström
e45b98ba62 drm/i915: Avoid allocating a page array for the gpu coredump
The gpu coredump typically takes place in a dma_fence signalling
critical path, and hence can't use GFP_KERNEL allocations, as that
means we might hit deadlocks under memory pressure. However
changing to __GFP_KSWAPD_RECLAIM which will be done in an upcoming
patch will instead mean a lower chance of the allocation succeeding.
In particular large contigous allocations like the coredump page
vector.
Remove the page vector in favor of a linked list of single pages.
Use the page lru list head as the list link, as the page owner is
allowed to do that.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-2-thomas.hellstrom@linux.intel.com
2021-11-26 08:26:08 +01:00
Maarten Lankhorst
5c2625c4a0 drm/i915: Remove dma_resv_prune
The signaled bit is already used for quick testing if a fence is signaled.
On top of that, it's a terrible abuse of dma-fence api, and in the common
case where the object is already locked by the caller, the trylock will fail.

If it were useful, the core dma-api would have exposed the same functionality.

The fact that i915 has a dma_resv_utils.c file should be a warning that the
functionality either belongs in core, or is not very useful at all.
In this case the latter.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Improve commit message]
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211021103605.735002-3-maarten.lankhorst@linux.intel.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com> #irc
2021-11-25 14:33:26 +01:00
Tvrtko Ursulin
16d69a8919 Merge drm/drm-next into drm-intel-gt-next
Maarten requested a backmerge due his work depending on subtle semantic
changes introduced by:

  7e2e69ed46 ("drm/i915: Fix i915_request fence wait semantics")
  2cbb8d4d67 ("drm/i915: use new iterator in i915_gem_object_wait_reservation")

Both should probably have been merged to drm-intel-gt-next anyway.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-11-25 12:52:32 +00:00
Thomas Hellström
5652df829b drm/i915/ttm: Update i915_gem_obj_copy_ttm() to be asynchronous
Update the copy function i915_gem_obj_copy_ttm() to be asynchronous for
future users and update the only current user to sync the objects
as needed after this function.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-7-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:20 +01:00
Thomas Hellström
6385eb7ad8 drm/i915/ttm: Implement asynchronous TTM moves
Don't wait sync while migrating, but rather make the GPU blit await the
dependencies and add a moving fence to the object.

This also enables asynchronous VRAM management in that on eviction,
rather than waiting for the moving fence to expire before freeing VRAM,
it is freed immediately and the fence is stored with the VRAM manager and
handed out to newly allocated objects to await before clears and swapins,
or for kernel objects before setting up gpu vmas or mapping.

To collect dependencies before migrating, add a set of utilities that
coalesce these to a single dma_fence.

What is still missing for fully asynchronous operation is asynchronous vma
unbinding, which is still to be implemented.

This commit substantially reduces execution time in the gem_lmem_swapping
test.

v2:
- Make a couple of functions static.
v4:
- Fix some style issues (Matthew Auld)
- Audit and add more checks for ghost objects (Matthew Auld)
- Add more documentation for the i915_deps utility (Mattew Auld)
- Simplify the i915_deps_sync() function
v6:
- Re-check for fence signaled before returning -EBUSY (Matthew Auld)
- Use dma_resv_iter_is_exclusive() (Matthew Auld)
- Await all dma-resv fences before a migration blit (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-6-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:19 +01:00
Thomas Hellström
004746e4b1 drm/i915/ttm: Correctly handle waiting for gpu when shrinking
With async migration, the shrinker may end up wanting to release the
pages of an object while the migration blit is still running, since
the GT migration code doesn't set up VMAs and the shrinker is thus
oblivious to the fact that the GPU is still using the pages.

Add waiting for gpu in the shrinker_release_pages() op and an
argument to that function indicating whether the shrinker expects it
to not wait for gpu. In the latter case the shrinker_release_pages()
op will return -EBUSY if the object is not idle.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-5-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:18 +01:00
Thomas Hellström
8b1f7f92e5 drm/i915/ttm: Drop region reference counting
There is an interesting refcounting loop:
struct intel_memory_region has a struct ttm_resource_manager,
ttm_resource_manager->move may hold a reference to i915_request,
i915_request may hold a reference to intel_context,
intel_context may hold a reference to drm_i915_gem_object,
drm_i915_gem_object may hold a reference to intel_memory_region.

Break this loop by dropping region reference counting.

In addition, Have regions with a manager moving fence make sure
that all region objects are released before freeing the region.

v6:
- Fix a code comment.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-4-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:16 +01:00
Thomas Hellström
05d1c76107 drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
Move the i915_gem_obj_copy_ttm() function to i915_gem_ttm_move.h.
This will help keep a number of functions static when introducing
async moves.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-3-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:15 +01:00
Maarten Lankhorst
f6c466b84c drm/i915: Add support for moving fence waiting
For now, we will only allow async migration when TTM is used,
so the paths we care about are related to TTM.

The mmap path is handled by having the fence in ttm_bo->moving,
when pinning, the binding only becomes available after the moving
fence is signaled, and pinning a cpu map will only work after
the moving fence signals.

This should close all holes where userspace can read a buffer
before it's fully migrated.

v2:
- Fix a couple of SPARSE warnings
v3:
- Fix a NULL pointer dereference
v4:
- Ditch the moving fence waiting for i915_vma_pin_iomap() and
  replace with a verification that the vma is already bound.
  (Matthew Auld)
- Squash with a previous patch introducing moving fence waiting and
  accessing interfaces (Matthew Auld)
- Rename to indicated that we also add support for sync waiting.
v5:
- Fix check for NULL and unreferencing i915_vma_verify_bind_complete()
  (Matthew Auld)
- Fix compilation failure if !CONFIG_DRM_I915_DEBUG_GEM
- Fix include ordering. (Matthew Auld)
v7:
- Fix yet another compilation failure with clang if
  !CONFIG_DRM_I915_DEBUG_GEM

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-2-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:13 +01:00
Tvrtko Ursulin
b6b56df519 Revert "drm/i915/dmabuf: fix broken build"
This reverts commit 777226dac0 ("drm/i915/dmabuf: fix broken build").

Approach taken in the patch was rejected by Linus and the upstream tree
now already contains the required include directive via 304ac8032d
("Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm").

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 777226dac0 ("drm/i915/dmabuf: fix broken build")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122135758.85444-1-tvrtko.ursulin@linux.intel.com
[tursulin: fixup commit message sha format]
2021-11-24 09:22:53 +00:00
Tejas Upadhyay
d22d446f7a drm/i915/gt: Hold RPM wakelock during PXP suspend
selftest --r live shows failure in suspend tests when
RPM wakelock is not acquired during suspend.

This changes addresses below error :
<4> [154.177535] RPM wakelock ref not held during HW access
<4> [154.177575] WARNING: CPU: 4 PID: 5772 at
drivers/gpu/drm/i915/intel_runtime_pm.h:113
fwtable_write32+0x240/0x320 [i915]
<4> [154.177974] Modules linked in: i915(+) vgem drm_shmem_helper
fuse snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
ledtrig_audio mei_hdcp mei_pxp x86_pkg_temp_thermal coretemp
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg
snd_hda_codec snd_hwdep igc snd_hda_core ttm mei_me ptp
snd_pcm prime_numbers mei i2c_i801 pps_core i2c_smbus intel_lpss_pci
btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
<4> [154.178143] CPU: 4 PID: 5772 Comm: i915_selftest Tainted: G
U            5.15.0-rc6-CI-Patchwork_21432+ #1
<4> [154.178154] Hardware name: ASUS System Product Name/TUF GAMING
Z590-PLUS WIFI, BIOS 0811 04/06/2021
<4> [154.178160] RIP: 0010:fwtable_write32+0x240/0x320 [i915]
<4> [154.178604] Code: 15 7b e1 0f 0b e9 34 fe ff ff 80 3d a9 89 31
00 00 0f 85 31 fe ff ff 48 c7 c7 88 9e 4f a0 c6 05 95 89 31 00 01 e8
c0 15 7b e1 <0f> 0b e9 17 fe ff ff 8b 05 0f 83 58 e2 85 c0 0f 85 8d
00 00 00 48
<4> [154.178614] RSP: 0018:ffffc900016279f0 EFLAGS: 00010286
<4> [154.178626] RAX: 0000000000000000 RBX: ffff888204fe0ee0
RCX: 0000000000000001
<4> [154.178634] RDX: 0000000080000001 RSI: ffffffff823142b5
RDI: 00000000ffffffff
<4> [154.178641] RBP: 00000000000320f0 R08: 0000000000000000
R09: c0000000ffffcd5a
<4> [154.178647] R10: 00000000000f8c90 R11: ffffc90001627808
R12: 0000000000000000
<4> [154.178654] R13: 0000000040000000 R14: ffffffffa04d12e0
R15: 0000000000000000
<4> [154.178660] FS:  00007f7390aa4c00(0000) GS:ffff88844f000000(0000)
knlGS:0000000000000000
<4> [154.178669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [154.178675] CR2: 000055bc40595028 CR3: 0000000204474005
CR4: 0000000000770ee0
<4> [154.178682] PKRU: 55555554
<4> [154.178687] Call Trace:
<4> [154.178706]  intel_pxp_fini_hw+0x23/0x30 [i915]
<4> [154.179284]  intel_pxp_suspend+0x1f/0x30 [i915]
<4> [154.179807]  live_gt_resume+0x5b/0x90 [i915]

Changes since V2 :
	- Remove boolean in intel_pxp_runtime_preapre for
	  non-pxp configs. Solves build error
Changes since V2 :
	- Open-code intel_pxp_runtime_suspend - Daniele
	- Remove boolean in intel_pxp_runtime_preapre - Daniele
Changes since V1 :
	- split the HW access parts in gt_suspend_late - Daniele
	- Remove default PXP configs

Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Fixes: 0cfab4cb3c ("drm/i915/pxp: Enable PXP power management")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211117060321.3729343-1-tejaskumarx.surendrakumar.upadhyay@intel.com
2021-11-23 13:22:51 -08:00
Umesh Nerlige Ramappa
5979873ebb drm/i915/pmu: Increase the live_engine_busy_stats sample period
Irrespective of the backend for request submissions, busyness for an
engine with an active context is calculated using:

busyness = total + (current_time - context_switch_in_time)

In execlists mode of operation, the context switch events are handled
by the CPU. Context switch in/out time and current_time are captured
in CPU time domain using ktime_get().

In GuC mode of submission, context switch events are handled by GuC and
the times in the above formula are captured in GT clock domain. This
information is shared with the CPU through shared memory. This results
in 2 caveats:

1) The time taken between start of a batch and the time that CPU is able
to see the context_switch_in_time in shared memory is dependent on GuC
and memory bandwidth constraints.

2) Determining current_time requires an MMIO read that can take anywhere
between a few us to a couple ms. A reference CPU time is captured soon
after reading the MMIO so that the caller can compare the cpu delta
between 2 busyness samples. The issue here is that the CPU delta and the
busyness delta can be skewed because of the time taken to read the
register.

These 2 factors affect the accuracy of the selftest -
live_engine_busy_stats. For (1) the selftest waits until busyness stats
are visible to the CPU. The effects of (2) are more prominent for the
current busyness sample period of 100 us. Increase the busyness sample
period from 100 us to 10 ms to overccome (2).

v2: Fix checkpatch issues

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211115221640.30793-1-umesh.nerlige.ramappa@intel.com
2021-11-23 10:45:33 -08:00
Matthew Auld
be373fad54 drm/i915/ttm: fixup build failure
drm-intel-gt-next fails to build with:

drivers/gpu/drm/i915/gem/i915_gem_ttm.c: In function ‘vm_fault_ttm’:
drivers/gpu/drm/i915/gem/i915_gem_ttm.c:862:23: error: too many arguments to function ‘ttm_bo_vm_fault_reserved’
  862 |                 ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
      |                       ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211123125814.1703220-1-matthew.auld@intel.com
2021-11-23 16:36:22 +00:00
Randy Dunlap
0af4cbfa73 drm/i915/gem: placate scripts/kernel-doc
Correct kernel-doc warnings in i915_drm_object.c:

i915_gem_object.c:103: warning: expecting prototype for i915_gem_object_fini(). Prototype was for __i915_gem_object_fini() instead
i915_gem_object.c:110: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Mark up the object's coherency levels for a given cache_level
i915_gem_object.c:110: warning: missing initial short description on line:
 * Mark up the object's coherency levels for a given cache_level
i915_gem_object.c:457: warning: No description found for return value of 'i915_gem_object_read_from_page'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211123050928.20434-1-rdunlap@infradead.org
2021-11-23 09:38:11 +00:00
Dave Airlie
c18c889111 drm-misc-next for 5.17:
UAPI Changes:
 
  * Remove restrictions on DMA_BUF_SET_NAME ioctl
  * connector: State of privacy screen
  * sysfs: Send hotplug uevent
 
 Cross-subsystem Changes:
 
  * clk/bmc-2835: Fixes
 
  * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
    helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes
 
  * pwm: Introduce of_pwm_single_xlate()
 
 Core Changes:
 
  * Support for privacy screens
  * Make drm_irq.c legacy
  * Fix __stack_depot_* name conflict
  * Documentation fixes
  * Fixes and cleanups
 
  * dp-helper: Reuse 8b/10b link-training delay helpers
 
  * format-helper: Update interfaces
 
  * fb-helper: Allocate shadow buffer of correct size
 
  * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
 	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules
 
  * gem/shmem-helper: Interface cleanups
 
  * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
    Lockdep fixes
 
  * kms-helpers: Link several files from core into the KMS-helper module
 
 Driver Changes:
 
  * Use dma_resv_iter in several places
  * Fixes and cleanups
 
  * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
    at once
 
  * bridge: Switch to managed MIPI DSI helpers in several places; Register
    and attach during probe in several places; Convert to YAML in several
    places
 
  * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes
 
  * bridge/dw-hdmi: Allow interlace on bridge
 
  * bridge/ps8640: Enable PM; Support aux-bus
 
  * bridge/tc358768: Enabled reference clock; Support pulse mode;
    Modesetting fixes
 
  * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM
 
  * etnaviv: Get all fences at once
 
  * gma500: GEM object cleanups; Remove generic drivers in probe function
 
  * i915: Support VESA panel backlights
 
  * ingenic: Fixes and cleanups
 
  * kirin: Adjust probe order
 
  * kmb: Enable framebuffer console
 
  * lima: Kconfig fixes
 
  * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER
 
  * msm: Fixes and cleanups
 
  * msm/dsi: Adjust probe order
 
  * omap: Fixes and cleanups
 
  * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
    quantization to FULL; Fixes and cleanups
 
  * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
    Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
    BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
    drivers; Fixes and cleanups
 
  * panel/ili9881c: Orientation fixes
 
  * radeon: Use dma_resv_wait_timeout()
 
  * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
    fixes; Implement mmap in GEM object functions
 
  * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes
 
  * sun4i: Use CMA helpers without vmap support
 
  * tidss: Fixes and cleanups
 
  * v3d: Cleanups
 
  * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
    while disabling; Support 4k@60 Hz modes; Fixes and cleanups
 
  * video: Convert to sysfs_emit() in several places
 
  * video/omapfb: Fix fall-through
 
  * virtio: Overflow fixes
 
  * xen: Implement mmap as GEM object functions
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGWF0wACgkQaA3BHVML
 eiN55ggAr6QN7S7Uxu98XnqfAHC9RErY7r3PoTXXS6ODvxY41bWOpHk8TQzuw626
 JCNnpQCk6Gi8L3yl8r/l1fqoirGXrfDR1YvrnmG4I9xhPxOqBmgxDWw7HQrROm2B
 FctOvgFukvzn5jzQk2FqYgs5JBV20WqLrfEhttPFFMvjLGti/U31/+d+aGMJdRIQ
 kE2eVpPZtAlbBAP+S4mglKp6w+WrrzNHHULSGOPcGS5jwLrNFaQg7w75JiLInzyR
 RWum28USXSkKE0d6XBqHw+PWAwo3B/Vq9XSnbCX4+3GQX4tJh+hosyU7ju+rA0BY
 FJCyN/YgZvc+474o/LVtMh7PbKMEHQ==
 =TV6Q
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-11-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.17:

UAPI Changes:

 * Remove restrictions on DMA_BUF_SET_NAME ioctl
 * connector: State of privacy screen
 * sysfs: Send hotplug uevent

Cross-subsystem Changes:

 * clk/bmc-2835: Fixes

 * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
   helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes

 * pwm: Introduce of_pwm_single_xlate()

Core Changes:

 * Support for privacy screens
 * Make drm_irq.c legacy
 * Fix __stack_depot_* name conflict
 * Documentation fixes
 * Fixes and cleanups

 * dp-helper: Reuse 8b/10b link-training delay helpers

 * format-helper: Update interfaces

 * fb-helper: Allocate shadow buffer of correct size

 * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules

 * gem/shmem-helper: Interface cleanups

 * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
   Lockdep fixes

 * kms-helpers: Link several files from core into the KMS-helper module

Driver Changes:

 * Use dma_resv_iter in several places
 * Fixes and cleanups

 * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
   at once

 * bridge: Switch to managed MIPI DSI helpers in several places; Register
   and attach during probe in several places; Convert to YAML in several
   places

 * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes

 * bridge/dw-hdmi: Allow interlace on bridge

 * bridge/ps8640: Enable PM; Support aux-bus

 * bridge/tc358768: Enabled reference clock; Support pulse mode;
   Modesetting fixes

 * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM

 * etnaviv: Get all fences at once

 * gma500: GEM object cleanups; Remove generic drivers in probe function

 * i915: Support VESA panel backlights

 * ingenic: Fixes and cleanups

 * kirin: Adjust probe order

 * kmb: Enable framebuffer console

 * lima: Kconfig fixes

 * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER

 * msm: Fixes and cleanups

 * msm/dsi: Adjust probe order

 * omap: Fixes and cleanups

 * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
   quantization to FULL; Fixes and cleanups

 * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
   Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
   BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
   drivers; Fixes and cleanups

 * panel/ili9881c: Orientation fixes

 * radeon: Use dma_resv_wait_timeout()

 * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
   fixes; Implement mmap in GEM object functions

 * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes

 * sun4i: Use CMA helpers without vmap support

 * tidss: Fixes and cleanups

 * v3d: Cleanups

 * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
   while disabling; Support 4k@60 Hz modes; Fixes and cleanups

 * video: Convert to sysfs_emit() in several places

 * video/omapfb: Fix fall-through

 * virtio: Overflow fixes

 * xen: Implement mmap as GEM object functions

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YZYZSypIrr+qcih3@linux-uq9g.fritz.box
2021-11-23 09:38:55 +10:00
Dan Carpenter
6164807dd2 drm/i915/ttm: Fix error code in i915_ttm_eviction_valuable()
This function returns a bool type so returning -EBUSY is equivalent to
returning true.  It should return false instead.

Fixes: 7ae034590c ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122061438.GA2492@kili
2021-11-22 14:20:44 +00:00
Tvrtko Ursulin
8626afb170 Merge drm/drm-next into drm-intel-gt-next
Thomas needs the dma_resv_for_each_fence API for i915/ttm async migration
work.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2021-11-22 12:18:15 +00:00
Umesh Nerlige Ramappa
865fbc0f8d drm/i915/pmu: Avoid with_intel_runtime_pm within spinlock
When guc timestamp ping worker runs it takes the spinlock and calls
with_intel_runtime_pm.  Since with_intel_runtime_pm may sleep, move the
spinlock inside __update_guc_busyness_stats.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211120014201.26480-1-umesh.nerlige.ramappa@intel.com
2021-11-22 09:16:32 +00:00
Linus Torvalds
1360572566 Linux 5.16-rc2 2021-11-21 13:47:39 -08:00
Linus Torvalds
40c93d7fff Two X86 fixes:
- Move the command line preparation and the early command line parsing
    earlier so that the command line parameters which affect
    early_reserve_memory(), e.g. efi=nosftreserve, are taken into
    account. This was broken when the invocation of early_reserve_memory()
    was moved recently.
 
  - Use an atomic type for the SGX page accounting, which is read and
    written lockless, to plug various race conditions related to it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmGaYPoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTM7D/9bivpPDiNzfjUV7kNKx6aTUwPjdFer
 G0RuuDZqkpJm9j7+51VnQNFssIfAFtzKMJn/DuGVoXF0ERxXMEhJVHiSTeOlCjJU
 u1760qFYlAQ1mwvKVNLk2SenWuNZwwgUneY3VvvS4qYsSq7PsbYlekuddPeX0Nws
 AJ1llOoCoBkm5vNZ5c3/CmhY6iPSRQQkDmbA11cZZUyWl2uouSk21+ax24IDCvW3
 E8Aq9QqB6ND2uukB32kQ7Wp7/UZ4inJHTUXF9UF/8P+N1ftDWeKDjQz6y9U19Tsd
 ivuMr6NqqAos/Fpo9PhlGns07C8HeKGf4ronnt9cUMqjzYWfdS+pRT+0pQR+vIPa
 M8+jyHQplzeOX9/nKOkpV+u0tYP2zgx8e7yeu5Sion8TqsKqiNOy9+D0D2utUDmw
 1x3DzuzGx/mK2OX5gjGSx4ZbS4u0DIAWnF8vB9YfgEfcnqpxr6KdbrY0bLatIbKv
 ip9mh0rRYeTkTZ4FGmvy3hFgAmadCODWxva/7AhzbWVZoM+AShwnTDsipkRaaj3V
 nMdgcVix8qVDg9YIAn9ziZbxkXKQUXFJn7lZj3KBeWjKcV2svA89S/9YL6JTaSeW
 TJ4X6wK8EoApKhEasZhufXBNAl9EmQlBS1k9pHiIjKVuRgBGlzMuEhzvrqZM2+rA
 KaUQSwBN6Ij6Dg==
 =CJUK
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

 - Move the command line preparation and the early command line parsing
   earlier so that the command line parameters which affect
   early_reserve_memory(), e.g. efi=nosftreserve, are taken into
   account. This was broken when the invocation of
   early_reserve_memory() was moved recently.

 - Use an atomic type for the SGX page accounting, which is read and
   written locklessly, to plug various race conditions related to it.

* tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Fix free page accounting
  x86/boot: Pull up cmdline preparation and early param parsing
2021-11-21 11:25:19 -08:00
Linus Torvalds
af16bdeae8 A set of perf/X86 fixes:
- Remove unneded PEBS disabling when taking LBR snapshots to prevent an
    unchecked MSR access error.
 
  - Fix IIO event constraints for Snowridge and Skylake server chips.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmGaX44THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoarZEACxCahzh+3P3BD/yYs2R4qEWpiFf0ei
 KpoCSL/Xh2+5c+N0Z5qxMKs4//Vkab6GYdDlbQnfcfQPFgxFs9UI0kqBoXFWaqK6
 u12JJW8F3iBxV4UTwduCflve9x5iZ5B7OmCRrTmJm8GJwQqTIFmjx8kDsDbghYvu
 rURM3L7mln/Qx7xcjvhZehXrDupwPvak1tw1SbxPyNjz1dNAN8A7G9xVFDddlzDB
 AuYMEadipn5QsbQD224rwUPMj04jnby+421phLmaaPiduZx2Hi7QjJGEH718R7IQ
 IGA1OeSXJXIsPimUW3UJZEw5OGMG+UY7/raHgk8LnxUCqQoeIjU8vpt6HR7EAD0b
 0LuUvJ1ispVT8dY+7DdzcbuW+Zp4TPUQNlG/bBlsmuduuSkvUiDeMtknJbJ8xE0s
 xFbMAgcwlaylSmtwNGCgM/P1KWbLwZQXS6IP8Iy4bnfwEueTeeHwaEtssrfrhW/z
 9OCVgUIkO2LbAFMmlQK4tfFprmR2oJSDpohJ0e5QZMMyrEefSMbY2U47omnB4bln
 HDZ2Q+ZKI0G43ECyI2TZXJg77SS/cmJxCcgXx8iQGZTDn38iPDPQJWWWWWDYTz0C
 ERVEGvK7jc+9Pu54iWAaSQxGmZQUWHfETt0QuKqx+1Kgl2NmBfeJkLkss4dL/eW4
 zH8qAjblfH7a3g==
 =PeA1
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Thomas Gleixner:

 - Remove unneded PEBS disabling when taking LBR snapshots to prevent an
   unchecked MSR access error.

 - Fix IIO event constraints for Snowridge and Skylake server chips.

* tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/perf: Fix snapshot_branch_stack warning in VM
  perf/x86/intel/uncore: Fix IIO event constraints for Snowridge
  perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
  perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
2021-11-21 11:17:50 -08:00
Linus Torvalds
75603b14ed powerpc fixes for 5.16 #2
Fix a bug in copying of sigset_t for 32-bit systems, which caused X to not start.
 
 Fix handling of shared LSIs (rare) with the xive interrupt controller (Power9/10).
 
 Fix missing TOC setup in some KVM code, which could result in oopses depending on kernel
 data layout.
 
 Fix DMA mapping when we have persistent memory and only one DMA window available.
 
 Fix further problems with STRICT_KERNEL_RWX on 8xx, exposed by a recent fix.
 
 A couple of other minor fixes.
 
 Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Cédric Le Goater, Christian Zigotzky,
 Christophe Leroy, Daniel Axtens, Finn Thain, Greg Kurz, Masahiro Yamada, Nicholas Piggin,
 Uwe Kleine-König.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmGZzGMTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBrRD/4qE1A3+nXe+uZRJM3H5F8C/Ui2I/1G
 JPekyfW9aZklsv8SMlz8BotDTlK8vNwiEtkAuwqLOfPXPi1p/Y1do4sPtXAjUpuX
 mXZP3G9K2xXmALLedXMjJNO6YJjTT5LE7OT42QziSfY1ScS7iqfGNANg1zRjkCRW
 yf2cpBbMRnWdDhCgWyE/V/V4xdPyOTTnnWn3d4F3qNshV0luKgTJl/9yo0OmQrGe
 /T4Cw8jG5p+pSblNyFaACnYlKWF4bYTQIl5NWsvJY0A2cg3I5ah6+hexdGRN/JdI
 K3PWpJ8rx5RjICkTFE4cADI6xIF1bHhjMh3ytcaMH5USBMmW3fTUUfcFwjRkRDHa
 b8Z6V631mgK1v3L0RlrAn+PZ9R212wpupvQT6YOf4pFb5+BzOyaCQCzyQv+BnwoI
 Fwran0HEO6NUODq4off9MADEpNTjwhV2mDFojxiCJ9eb1oCIilLbs8BOUWRSHYe0
 1S22pdj9XSR7yxXt5DnjQBwhR47ZS7D3jXf9gjbmJ/qn6cRPAFzt/m/woSY2Vv7T
 UrZVjz5lb+skjij7vxw+L9jUIwLBd99cvBiHzJpWUNc0RTQeBlAh4QBK/1MNixCP
 93LTN7tsRdGknLRTJ5yfRhEhwuhTTH8SEPp3H+qOZj9sXwq3Bftl4Nm40AgoATHO
 G4kPlgrCMQBcRQ==
 =Ss4y
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc fixes from Michael Ellerman:

 - Fix a bug in copying of sigset_t for 32-bit systems, which caused X
   to not start.

 - Fix handling of shared LSIs (rare) with the xive interrupt controller
   (Power9/10).

 - Fix missing TOC setup in some KVM code, which could result in oopses
   depending on kernel data layout.

 - Fix DMA mapping when we have persistent memory and only one DMA
   window available.

 - Fix further problems with STRICT_KERNEL_RWX on 8xx, exposed by a
   recent fix.

 - A couple of other minor fixes.

Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Cédric Le Goater,
Christian Zigotzky, Christophe Leroy, Daniel Axtens, Finn Thain, Greg
Kurz, Masahiro Yamada, Nicholas Piggin, and Uwe Kleine-König.

* tag 'powerpc-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xive: Change IRQ domain to a tree domain
  powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX
  powerpc/signal32: Fix sigset_t copy
  powerpc/book3e: Fix TLBCAM preset at boot
  powerpc/pseries/ddw: Do not try direct mapping with persistent memory and one window
  powerpc/pseries/ddw: simplify enable_ddw()
  powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
  powerpc/pseries: Fix numa FORM2 parsing fallback code
  powerpc/pseries: rename numa_dist_table to form2_distances
  powerpc: clean vdso32 and vdso64 directories
  powerpc/83xx/mpc8349emitx: Drop unused variable
  KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
2021-11-21 10:26:35 -08:00
Geert Uytterhoeven
61eb495c83 pstore/blk: Use "%lu" to format unsigned long
On 32-bit:

    fs/pstore/blk.c: In function ‘__best_effort_init’:
    include/linux/kern_levels.h:5:18: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 3 has type ‘long unsigned int’ [-Wformat=]
	5 | #define KERN_SOH "\001"  /* ASCII Start Of Header */
	  |                  ^~~~~~
    include/linux/kern_levels.h:14:19: note: in expansion of macro ‘KERN_SOH’
       14 | #define KERN_INFO KERN_SOH "6" /* informational */
	  |                   ^~~~~~~~
    include/linux/printk.h:373:9: note: in expansion of macro ‘KERN_INFO’
      373 |  printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
	  |         ^~~~~~~~~
    fs/pstore/blk.c:314:3: note: in expansion of macro ‘pr_info’
      314 |   pr_info("attached %s (%zu) (no dedicated panic_write!)\n",
	  |   ^~~~~~~

Cc: stable@vger.kernel.org
Fixes: 7bb9557b48 ("pstore/blk: Use the normal block device I/O path")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210629103700.1935012-1-geert@linux-m68k.org
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-21 09:44:19 -08:00
Linus Torvalds
923dcc5eb0 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "15 patches.

  Subsystems affected by this patch series: ipc, hexagon, mm (swap,
  slab-generic, kmemleak, hugetlb, kasan, damon, and highmem), and proc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  proc/vmcore: fix clearing user buffer by properly using clear_user()
  kmap_local: don't assume kmap PTEs are linear arrays in memory
  mm/damon/dbgfs: fix missed use of damon_dbgfs_lock
  mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
  kasan: test: silence intentional read overflow warnings
  hugetlb, userfaultfd: fix reservation restore on userfaultfd error
  hugetlb: fix hugetlb cgroup refcounting during mremap
  mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
  hexagon: ignore vmlinux.lds
  hexagon: clean up timer-regs.h
  hexagon: export raw I/O routines for modules
  mm: emit the "free" trace report before freeing memory in kmem_cache_free()
  shm: extend forced shm destroy to support objects from several IPC nses
  ipc: WARN if trying to remove ipc object which is absent
  mm/swap.c:put_pages_list(): reinitialise the page list
2021-11-20 13:17:24 -08:00
Linus Torvalds
61564e7b3a block-5.16-2021-11-19
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmGYctEQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqXjD/9Ef3i/OXjvUOOG19gEk4B1rr0youVioJg9
 0HYmXYGDLHZZd01mBdOCrbvadYaJRw3SG+whzNyayekPhs38kFHTqBNHcXOGphm1
 4snuDVFz0kElDswRjnh/q9QoJNDGwDidy6L+KHXMbHhTPuXALXshc+6U3GGWJ0Gg
 5EEKwfJGRkIzJJ9fL9d9GbyAFMLq8xXr1pf9LdNJEZL2RaBkh4gI7Uu0q6vYGh88
 N1iz2LdF4D4uopB6GkT6Eup/5iKakGbv2M2edcHpICdG/3EKn3Q8pnUajxVvXkV/
 kOR4zRy2on7xYKIVZGKvq6/8e7Wde9yfBGVQ1D/45uzKIeTiDrP0MWo8ldP5vkbU
 yTc3/BiLW5Nkk67RXs3Llg5QDX+n69BHNtXepO8W6DBMdLRFqZSiM0Y1Xrba/+yU
 4TJif/wlArErjUUsWuWlnSzKrn1CyRLxxmewXfhxgpy12d4pTQsDLMKs7CojCmoF
 t265dmvzXeNtAymuS/WSk0GlnobEO+wvfVUzDUQir6PukDSvz1vw5OIdsMaf2dmX
 QDrcgVnGIjNQfLQVoX8FF0u52HcElP1+iikcs/XCNj67f9qImiBYsUYqDtfW6yFl
 56IKkq4akSWPgA/qnV9oJ/Cf/AV8WI7rXV32hivo41nXaclrJouRfrarffTRzV9n
 gHP0k9W6Cg==
 =IqIF
 -----END PGP SIGNATURE-----

Merge tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Flip a cap check to avoid a selinux error (Alistair)

 - Fix for a regression this merge window where we can miss a queue ref
   put (me)

 - Un-mark pstore-blk as broken, as the condition that triggered that
   change has been rectified (Kees)

 - Queue quiesce and sync fixes (Ming)

 - FUA insertion fix (Ming)

 - blk-cgroup error path put fix (Yu)

* tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block:
  blk-mq: don't insert FUA request with data into scheduler queue
  blk-cgroup: fix missing put device in error path from blkg_conf_pref()
  block: avoid to quiesce queue in elevator_init_mq
  Revert "mark pstore-blk as broken"
  blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release()
  block: fix missing queue put in error path
  block: Check ADMIN before NICE for IOPRIO_CLASS_RT
2021-11-20 11:05:10 -08:00
Linus Torvalds
b100274c70 Pin control fixes for the v5.16 kernel series:
- Fix some stubs causing compile issues for ACPI.
 
 - Fix some wakeups on AMD IRQs shared between GPIO and SCI.
 
 - Fix a build warning in the Tegra driver.
 
 - Fix a Kconfig issue in the Qualcomm driver.
 
 - Add a missing include the RALink driver.
 
 - Return a valid type for the Apple pinctrl IRQs.
 
 - Implement some Qualcomm SDM845 dual-edge errata.
 
 - Remove the unused <linux/sdb.h> header. (The subsystem was
   once deleted by the pinctrl maintainer...)
 
 - Fix a duplicate initialized in the Tegra driver.
 
 - Fix register offsets for UFS and SDC in the Qualcomm SM8350
   driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmGYW4MACgkQQRCzN7AZ
 XXNI/BAAmbPnEdjOpa/qjQRae7VV9ycCVhFjs37+0HSOOiMFjQieTz3n4dUQ7JX9
 guK7pqn9+ZPBqkya75X4pvDWVW7IuquifflVPg0c3V4yW/+tgt7ZR4JnZo18xt+L
 OzW/SnR1O8wXvV7O+6ee8jH3NL7g1SB2bdLuvAwIM1uMdBse0F0nDvdxfSiaLcGk
 zFdht2MVXOz4JT0Qq9HYujxw3cJ8Z8fBSS8Y7hdWaNRxYdQe3mVJzaSgCTnEXLj5
 DTFuzx64g44DNor5D1KzU/WYkHe+MX2tPxwnfXjckrnQbw1TZzl8Zmk2mUxViesi
 KaC1mTBYUjLDj++fiFW5MP3yK+sigcXZJ9COMAr2ue6zpdzc6ja097lIRZO0dreD
 iV5YkYj9uZOxji5m18jfuaTvjGbDjfDH9ZHRNmARUOPPmn7xGF+dPqkcKaSIn3KW
 gpP0L5oF1mP0iNuOU0bI9gi6J6UAjfJz9E3yukqrteObw+F4SMEulNPq+WQzxOYw
 FeNaakufIF8SYii7yoWKK6qG30zHds+BMBxxdj3dB+Px23J1J1R2kDGD8Y13fNkN
 bygFgK6z6A6Qw/4O4m8BcO99rrNet+0+dd1tA4mc8GNAqA4jXRCJgWeoy6eLB3y7
 Cx6QecJ0YOHnsyBrrpxxFiPDkhWsFL2DeBY6iQOqjagQPJWKKcI=
 =iBZ7
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "There is an ACPI stubs fix which is ACKed by the ACPI maintainer for
  merging through my tree.

  One item stand out and that is that I delete the <linux/sdb.h> header
  that is used by nothing. I deleted this subsystem (through the GPIO
  tree) a while back so I feel responsible for tidying up the floor.

  Other than that it is the usual mistakes, a bit noisy around build
  issue and Kconfig then driver fixes.

  Specifics:

   - Fix some stubs causing compile issues for ACPI.

   - Fix some wakeups on AMD IRQs shared between GPIO and SCI.

   - Fix a build warning in the Tegra driver.

   - Fix a Kconfig issue in the Qualcomm driver.

   - Add a missing include the RALink driver.

   - Return a valid type for the Apple pinctrl IRQs.

   - Implement some Qualcomm SDM845 dual-edge errata.

   - Remove the unused <linux/sdb.h> header. (The subsystem was once
     deleted by the pinctrl maintainer...)

   - Fix a duplicate initialized in the Tegra driver.

   - Fix register offsets for UFS and SDC in the Qualcomm SM8350 driver"

* tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: qcom: sm8350: Correct UFS and SDC offsets
  pinctrl: tegra194: remove duplicate initializer again
  Remove unused header <linux/sdb.h>
  pinctrl: qcom: sdm845: Enable dual edge errata
  pinctrl: apple: Always return valid type in apple_gpio_irq_type
  pinctrl: ralink: include 'ralink_regs.h' in 'pinctrl-mt7620.c'
  pinctrl: qcom: fix unmet dependencies on GPIOLIB for GPIOLIB_IRQCHIP
  pinctrl: tegra: Return const pointer from tegra_pinctrl_get_group()
  pinctrl: amd: Fix wakeups when IRQ is shared with SCI
  ACPI: Add stubs for wakeup handler functions
2021-11-20 10:59:03 -08:00
Linus Torvalds
6b38e2fb70 s390 updates for 5.16-rc2
- Add missing Kconfig option for ftrace direct multi sample, so it can
   be compiled again, and also add s390 support for this sample.
 
 - Update Christian Borntraeger's email address.
 
 - Various fixes for memory layout setup. Besides other this makes it
   possible to load shared DCSS segments again.
 
 - Fix copy to user space of swapped kdump oldmem.
 
 - Remove -mstack-guard and -mstack-size compile options when building
   vdso binaries. This can happen when CONFIG_VMAP_STACK is disabled
   and results in broken vdso code which causes more or less random
   exceptions. Also remove the not needed -nostdlib option.
 
 - Fix memory leak on cpu hotplug and return code handling in kexec
   code.
 
 - Wire up futex_waitv system call.
 
 - Replace snprintf with sysfs_emit where appropriate.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmGZBmIACgkQIg7DeRsp
 bsLevQ//XfCEcvJ1sB4OEiN97xyy5me4FoOo5rWuzG/ZN/YmUH0CkzJHIhjDcCg3
 2FslxH5doOA3zLEBCQKXtcW4uaLSgJcqDgFgpE0TZk/6VKB9RD5q2eSjd+akFMGh
 HFge54pfgpR7pYYwWRvbqOJRyzkU5oHAjMmt2UweOoX3qwynhMhTrT/03Y9pGMgK
 VBHhp+ocfdLGQk3nbehAWsh7AWItWwOtKblsTFoyJ6BW0pxb7Yc6+wrpyxLYCaRK
 rCbyXDStvDqjeBSdx2GZDrA7HbVsrZTHA7sSStIW8yIss1/YJXTP0J2PMXmYNbeE
 ou2WCg/iti1DNwN7AOR0OdPu1NfPQkyW6NmV8814Haa8Ub3GUc6RCo+U4wlCXAbo
 ZcHWlb8sgWgfQMzho3WfgkeXuEohO+nOV/x/JFt+NFcwidNTQKO7FQ8GsyylUcYo
 fBhElbn7p44eS1ivMFEwzptBbpH1JVbb30iV7tMWxyjJQ9TkzpsC3Ph14JimSChk
 oZuUnmgMztss/ikEMFcDLhd3DNedXfz10Boq6FucD8x46cW5j7o0scwIomcNtxmx
 C3Y9JCsDdiXAfS6Et6KGbsuWbigT3NjNKETK0+Be65GYNP/NPD5pXLeKywU++cHe
 e+Lucqiej9polcGN3X97lORMDEx0dXpGkM6ZK2rtX66e7rBbB7M=
 =n7BA
 -----END PGP SIGNATURE-----

Merge tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Add missing Kconfig option for ftrace direct multi sample, so it can
   be compiled again, and also add s390 support for this sample.

 - Update Christian Borntraeger's email address.

 - Various fixes for memory layout setup. Besides other this makes it
   possible to load shared DCSS segments again.

 - Fix copy to user space of swapped kdump oldmem.

 - Remove -mstack-guard and -mstack-size compile options when building
   vdso binaries. This can happen when CONFIG_VMAP_STACK is disabled and
   results in broken vdso code which causes more or less random
   exceptions. Also remove the not needed -nostdlib option.

 - Fix memory leak on cpu hotplug and return code handling in kexec
   code.

 - Wire up futex_waitv system call.

 - Replace snprintf with sysfs_emit where appropriate.

* tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  ftrace/samples: add s390 support for ftrace direct multi sample
  ftrace/samples: add missing Kconfig option for ftrace direct multi sample
  MAINTAINERS: update email address of Christian Borntraeger
  s390/kexec: fix memory leak of ipl report buffer
  s390/kexec: fix return code handling
  s390/dump: fix copying to user-space of swapped kdump oldmem
  s390: wire up sys_futex_waitv system call
  s390/vdso: filter out -mstack-guard and -mstack-size
  s390/vdso: remove -nostdlib compiler flag
  s390: replace snprintf in show functions with sysfs_emit
  s390/boot: simplify and fix kernel memory layout setup
  s390/setup: re-arrange memblock setup
  s390/setup: avoid using memblock_enforce_memory_limit
  s390/setup: avoid reserving memory above identity mapping
2021-11-20 10:55:50 -08:00
Linus Torvalds
b38bfc747c 3 small cifs/smb3 fixes, 2 to address minor coverity issues and one cleanup
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmGYKGIACgkQiiy9cAdy
 T1ED5wwAtDe73G0fLjajrlVaOfBgHVqRWg1QkGxrwe0Pk8BXJkrrfze5kizD7B5D
 xgVDbAWmF3wfkoGukggpjrLpwe/36F/xdHHIATRH2xG7zielDSab/RHPZQx4xPJQ
 Qz/F7f5N8cSvDYX5HdTYncAtF3yV6MM48n9N6fBoKTL43mDWK7EI90KM5EkL2Mdc
 DT1wYzTpuNoR1qY4oBIftV8mau6DAVtE/GIdpijzIbCf07xADvaM62QmA1qFLCFT
 zlya3RmgyTS2UtCV9pKnbzZ2o1rm7J/C6YWvqrggH24Fu7V4nGTAx5yY/X2Zm3iu
 uyoMTT1uvaiaihFCVUYN2e4jhYX/SuWA2Q0WczHAcx0LFXWWsIe5pOl78i9aQkU3
 UQhTk4G1HNd35CEtR53HEil8wt674p2D/kJpZ5VL6OIg7H1jgwG6UoterNEbEUVT
 qFvKCbePXyt8kDdgk9DqsAQQZTkJKMeiuk+hwVUngDRd/jsp7N7p4ISdBQNzNVG3
 JVtinWzM
 =IGYd
 -----END PGP SIGNATURE-----

Merge tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three small cifs/smb3 fixes: two to address minor coverity issues and
  one cleanup"

* tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: introduce cifs_ses_mark_for_reconnect() helper
  cifs: protect srv_count with cifs_tcp_ses_lock
  cifs: move debug print out of spinlock
2021-11-20 10:47:16 -08:00
David Hildenbrand
c1e6311771 proc/vmcore: fix clearing user buffer by properly using clear_user()
To clear a user buffer we cannot simply use memset, we have to use
clear_user().  With a virtio-mem device that registers a vmcore_cb and
has some logically unplugged memory inside an added Linux memory block,
I can easily trigger a BUG by copying the vmcore via "cp":

  systemd[1]: Starting Kdump Vmcore Save Service...
  kdump[420]: Kdump is using the default log level(3).
  kdump[453]: saving to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
  kdump[458]: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
  kdump[465]: saving vmcore-dmesg.txt complete
  kdump[467]: saving vmcore
  BUG: unable to handle page fault for address: 00007f2374e01000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0003) - permissions violation
  PGD 7a523067 P4D 7a523067 PUD 7a528067 PMD 7a525067 PTE 800000007048f867
  Oops: 0003 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 468 Comm: cp Not tainted 5.15.0+ #6
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014
  RIP: 0010:read_from_oldmem.part.0.cold+0x1d/0x86
  Code: ff ff ff e8 05 ff fe ff e9 b9 e9 7f ff 48 89 de 48 c7 c7 38 3b 60 82 e8 f1 fe fe ff 83 fd 08 72 3c 49 8d 7d 08 4c 89 e9 89 e8 <49> c7 45 00 00 00 00 00 49 c7 44 05 f8 00 00 00 00 48 83 e7 f81
  RSP: 0018:ffffc9000073be08 EFLAGS: 00010212
  RAX: 0000000000001000 RBX: 00000000002fd000 RCX: 00007f2374e01000
  RDX: 0000000000000001 RSI: 00000000ffffdfff RDI: 00007f2374e01008
  RBP: 0000000000001000 R08: 0000000000000000 R09: ffffc9000073bc50
  R10: ffffc9000073bc48 R11: ffffffff829461a8 R12: 000000000000f000
  R13: 00007f2374e01000 R14: 0000000000000000 R15: ffff88807bd421e8
  FS:  00007f2374e12140(0000) GS:ffff88807f000000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f2374e01000 CR3: 000000007a4aa000 CR4: 0000000000350eb0
  Call Trace:
   read_vmcore+0x236/0x2c0
   proc_reg_read+0x55/0xa0
   vfs_read+0x95/0x190
   ksys_read+0x4f/0xc0
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access
Prevention (SMAP)", which is used to detect wrong access from the kernel
to user buffers like this: SMAP triggers a permissions violation on
wrong access.  In the x86-64 variant of clear_user(), SMAP is properly
handled via clac()+stac().

To fix, properly use clear_user() when we're dealing with a user buffer.

Link: https://lkml.kernel.org/r/20211112092750.6921-1-david@redhat.com
Fixes: 997c136f51 ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:55 -08:00
Ard Biesheuvel
825c43f50e kmap_local: don't assume kmap PTEs are linear arrays in memory
The kmap_local conversion broke the ARM architecture, because the new
code assumes that all PTEs used for creating kmaps form a linear array
in memory, and uses array indexing to look up the kmap PTE belonging to
a certain kmap index.

On ARM, this cannot work, not only because the PTE pages may be
non-adjacent in memory, but also because ARM/!LPAE interleaves hardware
entries and extended entries (carrying software-only bits) in a way that
is not compatible with array indexing.

Fortunately, this only seems to affect configurations with more than 8
CPUs, due to the way the per-CPU kmap slots are organized in memory.

Work around this by permitting an architecture to set a Kconfig symbol
that signifies that the kmap PTEs do not form a lineary array in memory,
and so the only way to locate the appropriate one is to walk the page
tables.

Link: https://lore.kernel.org/linux-arm-kernel/20211026131249.3731275-1-ardb@kernel.org/
Link: https://lkml.kernel.org/r/20211116094737.7391-1-ardb@kernel.org
Fixes: 2a15ba82fa ("ARM: highmem: Switch to generic kmap atomic")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Quanyang Wang <quanyang.wang@windriver.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
SeongJae Park
d78f3853f8 mm/damon/dbgfs: fix missed use of damon_dbgfs_lock
DAMON debugfs is supposed to protect dbgfs_ctxs, dbgfs_nr_ctxs, and
dbgfs_dirs using damon_dbgfs_lock.  However, some of the code is
accessing the variables without the protection.  This fixes it by
protecting all such accesses.

Link: https://lkml.kernel.org/r/20211110145758.16558-3-sj@kernel.org
Fixes: 75c1c2b53c ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
SeongJae Park
db7a347b26 mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
Patch series "DAMON fixes".

This patch (of 2):

DAMON users can trigger below warning in '__alloc_pages()' by invoking
write() to some DAMON debugfs files with arbitrarily high count
argument, because DAMON debugfs interface allocates some buffers based
on the user-specified 'count'.

        if (unlikely(order >= MAX_ORDER)) {
                WARN_ON_ONCE(!(gfp & __GFP_NOWARN));
                return NULL;
        }

Because the DAMON debugfs interface code checks failure of the
'kmalloc()', this commit simply suppresses the warnings by adding
'__GFP_NOWARN' flag.

Link: https://lkml.kernel.org/r/20211110145758.16558-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211110145758.16558-2-sj@kernel.org
Fixes: 4bc05954d0 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Kees Cook
cab71f7495 kasan: test: silence intentional read overflow warnings
As done in commit d73dad4eb5 ("kasan: test: bypass __alloc_size
checks") for __write_overflow warnings, also silence some more cases
that trip the __read_overflow warnings seen in 5.16-rc1[1]:

  In file included from include/linux/string.h:253,
                   from include/linux/bitmap.h:10,
                   from include/linux/cpumask.h:12,
                   from include/linux/mm_types_task.h:14,
                   from include/linux/mm_types.h:5,
                   from include/linux/page-flags.h:13,
                   from arch/arm64/include/asm/mte.h:14,
                   from arch/arm64/include/asm/pgtable.h:12,
                   from include/linux/pgtable.h:6,
                   from include/linux/kasan.h:29,
                   from lib/test_kasan.c:10:
  In function 'memcmp',
      inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
  include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    263 |                         __read_overflow();
        |                         ^~~~~~~~~~~~~~~~~
  In function 'memchr',
      inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
  include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    277 |                 __read_overflow();
        |                 ^~~~~~~~~~~~~~~~~

[1] http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/

Link: https://lkml.kernel.org/r/20211116004111.3171781-1-keescook@chromium.org
Fixes: d73dad4eb5 ("kasan: test: bypass __alloc_size checks")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Mina Almasry
cc30042df6 hugetlb, userfaultfd: fix reservation restore on userfaultfd error
Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
bail out using "goto out_release_unlock;" in the cases where idx >=
size, or !huge_pte_none(), the code will detect that new_pagecache_page
== false, and so call restore_reserve_on_error().  In this case I see
restore_reserve_on_error() delete the reservation, and the following
call to remove_inode_hugepages() will increment h->resv_hugepages
causing a 100% reproducible leak.

We should treat the is_continue case similar to adding a page into the
pagecache and set new_pagecache_page to true, to indicate that there is
no reservation to restore on the error path, and we need not call
restore_reserve_on_error().  Rename new_pagecache_page to
page_in_pagecache to make that clear.

Link: https://lkml.kernel.org/r/20211117193825.378528-1-almasrymina@google.com
Fixes: c7b1850dfb ("hugetlb: don't pass page cache pages to restore_reserve_on_error")
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reported-by: James Houghton <jthoughton@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Wei Xu <weixugc@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Bui Quang Minh
afe041c2d0 hugetlb: fix hugetlb cgroup refcounting during mremap
When hugetlb_vm_op_open() is called during copy_vma(), we may take the
reference to resv_map->css.  Later, when clearing the reservation
pointer of old_vma after transferring it to new_vma, we forget to drop
the reference to resv_map->css.  This leads to a reference leak of css.

Fixes this by adding a check to drop reservation css reference in
clear_vma_resv_huge_pages()

Link: https://lkml.kernel.org/r/20211113154412.91134-1-minhquangbui99@gmail.com
Fixes: 550a7d60bd ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Rustam Kovhaev
34dbc3aaf5 mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
When kmemleak is enabled for SLOB, system does not boot and does not
print anything to the console.  At the very early stage in the boot
process we hit infinite recursion from kmemleak_init() and eventually
kernel crashes.

kmemleak_init() specifies SLAB_NOLEAKTRACE for KMEM_CACHE(), but
kmem_cache_create_usercopy() removes it because CACHE_CREATE_MASK is not
valid for SLOB.

Let's fix CACHE_CREATE_MASK and make kmemleak work with SLOB

Link: https://lkml.kernel.org/r/20211115020850.3154366-1-rkovhaev@gmail.com
Fixes: d8843922fb ("slab: Ignore internal flags in cache creation")
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Glauber Costa <glommer@parallels.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Nathan Chancellor
eaac2f8989 hexagon: ignore vmlinux.lds
After building allmodconfig, there is an untracked vmlinux.lds file in
arch/hexagon/kernel:

    $ git ls-files . --exclude-standard --others
    arch/hexagon/kernel/vmlinux.lds

Ignore it as all other architectures have.

Link: https://lkml.kernel.org/r/20211115174250.1994179-4-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Nathan Chancellor
51f2ec5934 hexagon: clean up timer-regs.h
When building allmodconfig, there is a warning about TIMER_ENABLE being
redefined:

  drivers/clocksource/timer-oxnas-rps.c:39:9: error: 'TIMER_ENABLE' macro redefined [-Werror,-Wmacro-redefined]
  #define TIMER_ENABLE            BIT(7)
          ^
  arch/hexagon/include/asm/timer-regs.h:13:9: note: previous definition is here
  #define TIMER_ENABLE            0
           ^
  1 error generated.

The values in this header are only used in one file each, if they are
used at all.  Remove the header and sink all of the constants into their
respective files.

TCX0_CLK_RATE is only used in arch/hexagon/include/asm/timex.h

TIMER_ENABLE, RTOS_TIMER_INT, RTOS_TIMER_REGS_ADDR are only used in
arch/hexagon/kernel/time.c.

SLEEP_CLK_RATE and TIMER_CLR_ON_MATCH have both been unused since the
file's introduction in commit 71e4a47f32 ("Hexagon: Add time and timer
functions").

TIMER_ENABLE is redefined as BIT(0) so the shift is moved into the
definition, rather than its use.

Link: https://lkml.kernel.org/r/20211115174250.1994179-3-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Brian Cain <bcain@codeaurora.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Nathan Chancellor
ffb92ce826 hexagon: export raw I/O routines for modules
Patch series "Fixes for ARCH=hexagon allmodconfig", v2.

This series fixes some issues noticed with ARCH=hexagon allmodconfig.

This patch (of 3):

When building ARCH=hexagon allmodconfig, the following errors occur:

  ERROR: modpost: "__raw_readsl" [drivers/i3c/master/svc-i3c-master.ko] undefined!
  ERROR: modpost: "__raw_writesl" [drivers/i3c/master/dw-i3c-master.ko] undefined!
  ERROR: modpost: "__raw_readsl" [drivers/i3c/master/dw-i3c-master.ko] undefined!
  ERROR: modpost: "__raw_writesl" [drivers/i3c/master/i3c-master-cdns.ko] undefined!
  ERROR: modpost: "__raw_readsl" [drivers/i3c/master/i3c-master-cdns.ko] undefined!

Export these symbols so that modules can use them without any errors.

Link: https://lkml.kernel.org/r/20211115174250.1994179-1-nathan@kernel.org
Link: https://lkml.kernel.org/r/20211115174250.1994179-2-nathan@kernel.org
Fixes: 013bf24c38 ("Hexagon: Provide basic implementation and/or stubs for I/O routines.")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Brian Cain <bcain@codeaurora.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Yunfeng Ye
9a543f007b mm: emit the "free" trace report before freeing memory in kmem_cache_free()
After the memory is freed, it can be immediately allocated by other
CPUs, before the "free" trace report has been emitted.  This causes
inaccurate traces.

For example, if the following sequence of events occurs:

    CPU 0                 CPU 1

  (1) alloc xxxxxx
  (2) free  xxxxxx
                         (3) alloc xxxxxx
                         (4) free  xxxxxx

Then they will be inaccurately reported via tracing, so that they appear
to have happened in this order:

    CPU 0                 CPU 1

  (1) alloc xxxxxx
                         (2) alloc xxxxxx
  (3) free  xxxxxx
                         (4) free  xxxxxx

This makes it look like CPU 1 somehow managed to allocate memory that
CPU 0 still had allocated for itself.

In order to avoid this, emit the "free xxxxxx" tracing report just
before the actual call to free the memory, instead of just after it.

Link: https://lkml.kernel.org/r/374eb75d-7404-8721-4e1e-65b0e5b17279@huawei.com
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Alexander Mikhalitsyn
85b6d24646 shm: extend forced shm destroy to support objects from several IPC nses
Currently, the exit_shm() function not designed to work properly when
task->sysvshm.shm_clist holds shm objects from different IPC namespaces.

This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it
leads to use-after-free (reproducer exists).

This is an attempt to fix the problem by extending exit_shm mechanism to
handle shm's destroy from several IPC ns'es.

To achieve that we do several things:

1. add a namespace (non-refcounted) pointer to the struct shmid_kernel

2. during new shm object creation (newseg()/shmget syscall) we
   initialize this pointer by current task IPC ns

3. exit_shm() fully reworked such that it traverses over all shp's in
   task->sysvshm.shm_clist and gets IPC namespace not from current task
   as it was before but from shp's object itself, then call
   shm_destroy(shp, ns).

Note: We need to be really careful here, because as it was said before
(1), our pointer to IPC ns non-refcnt'ed.  To be on the safe side we
using special helper get_ipc_ns_not_zero() which allows to get IPC ns
refcounter only if IPC ns not in the "state of destruction".

Q/A

Q: Why can we access shp->ns memory using non-refcounted pointer?
A: Because shp object lifetime is always shorther than IPC namespace
   lifetime, so, if we get shp object from the task->sysvshm.shm_clist
   while holding task_lock(task) nobody can steal our namespace.

Q: Does this patch change semantics of unshare/setns/clone syscalls?
A: No. It's just fixes non-covered case when process may leave IPC
   namespace without getting task->sysvshm.shm_clist list cleaned up.

Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com
Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com
Fixes: ab602f7991 ("shm: make exit_shm work proportional to task activity")
Co-developed-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Alexander Mikhalitsyn
126e8bee94 ipc: WARN if trying to remove ipc object which is absent
Patch series "shm: shm_rmid_forced feature fixes".

Some time ago I met kernel crash after CRIU restore procedure,
fortunately, it was CRIU restore, so, I had dump files and could do
restore many times and crash reproduced easily.  After some
investigation I've constructed the minimal reproducer.  It was found
that it's use-after-free and it happens only if sysctl
kernel.shm_rmid_forced = 1.

The key of the problem is that the exit_shm() function not handles shp's
object destroy when task->sysvshm.shm_clist contains items from
different IPC namespaces.  In most cases this list will contain only
items from one IPC namespace.

How can this list contain object from different namespaces? The
exit_shm() function is designed to clean up this list always when
process leaves IPC namespace.  But we made a mistake a long time ago and
did not add a exit_shm() call into the setns() syscall procedures.

The first idea was just to add this call to setns() syscall but it
obviously changes semantics of setns() syscall and that's
userspace-visible change.  So, I gave up on this idea.

The first real attempt to address the issue was just to omit forced
destroy if we meet shp object not from current task IPC namespace [1].
But that was not the best idea because task->sysvshm.shm_clist was
protected by rwsem which belongs to current task IPC namespace.  It
means that list corruption may occur.

Second approach is just extend exit_shm() to properly handle shp's from
different IPC namespaces [2].  This is really non-trivial thing, I've
put a lot of effort into that but not believed that it's possible to
make it fully safe, clean and clear.

Thanks to the efforts of Manfred Spraul working an elegant solution was
designed.  Thanks a lot, Manfred!

Eric also suggested the way to address the issue in ("[RFC][PATCH] shm:
In shm_exit destroy all created and never attached segments") Eric's
idea was to maintain a list of shm_clists one per IPC namespace, use
lock-less lists.  But there is some extra memory consumption-related
concerns.

An alternative solution which was suggested by me was implemented in
("shm: reset shm_clist on setns but omit forced shm destroy").  The idea
is pretty simple, we add exit_shm() syscall to setns() but DO NOT
destroy shm segments even if sysctl kernel.shm_rmid_forced = 1, we just
clean up the task->sysvshm.shm_clist list.

This chages semantics of setns() syscall a little bit but in comparision
to the "naive" solution when we just add exit_shm() without any special
exclusions this looks like a safer option.

[1] https://lkml.org/lkml/2021/7/6/1108
[2] https://lkml.org/lkml/2021/7/14/736

This patch (of 2):

Let's produce a warning if we trying to remove non-existing IPC object
from IPC namespace kht/idr structures.

This allows us to catch possible bugs when the ipc_rmid() function was
called with inconsistent struct ipc_ids*, struct kern_ipc_perm*
arguments.

Link: https://lkml.kernel.org/r/20211027224348.611025-1-alexander.mikhalitsyn@virtuozzo.com
Link: https://lkml.kernel.org/r/20211027224348.611025-2-alexander.mikhalitsyn@virtuozzo.com
Co-developed-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Matthew Wilcox
3cd018b4d6 mm/swap.c:put_pages_list(): reinitialise the page list
While free_unref_page_list() puts pages onto the CPU local LRU list, it
does not remove them from the list they were passed in on.  That makes
the list_head appear to be non-empty, and would lead to various
corruption problems if we didn't have an assertion that the list was
empty.

Reinitialise the list after calling free_unref_page_list() to avoid this
problem.

Link: https://lkml.kernel.org/r/YYp40A2lNrxaZji8@casper.infradead.org
Fixes: 988c69f1bc ("mm: optimise put_pages_list()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Steve French <stfrench@microsoft.com>
Reported-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Steve French <stfrench@microsoft.com>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Hyeoncheol Lee <hyc.lee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00