Peter Newman 3030f11f27 x86/resctrl: Fix task CLOSID/RMID update race
[ Upstream commit fe1f071438 ]

When the user moves a running task to a new rdtgroup using the tasks
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
CPU(s) running the task(s).

x86 allows loads to be reordered ahead of prior stores. If the CPU
hoists the task_curr() check above the stores in the CLOSID/RMID update
and the task starts running in between, __rdtgroup_move_task() fails to
determine that the task needs to be interrupted to obtain the new
CLOSID/RMID. The task then runs with the old CLOSID/RMID until it is
switched in again.

Refer to the diagram below:

CPU 0                                   CPU 1
-----                                   -----
__rdtgroup_move_task():
  curr <- t1->cpu->rq->curr
                                        __schedule():
                                          rq->curr <- t1
                                        resctrl_sched_in():
                                          t1->{closid,rmid} -> {1,1}
  t1->{closid,rmid} <- {2,2}
  if (curr == t1) // false
   IPI(t1->cpu)

A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.

In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr().  In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.
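
The ordering described above can be sketched in userspace C11. This is
only an illustrative model, not the kernel code: struct task, struct
runqueue and move_task() are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) stands in for the role that
smp_mb() plays in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task_struct and runqueue. */
struct task {
    _Atomic unsigned int closid;
    _Atomic unsigned int rmid;
};

struct runqueue {
    struct task *_Atomic curr;  /* models rq->curr */
};

/*
 * Models the fixed update path: store the new IDs, issue a full barrier,
 * and only then sample rq->curr. Returns true when the caller would have
 * to interrupt (IPI) the CPU that is running the task.
 */
static bool move_task(struct task *t, struct runqueue *rq,
                      unsigned int closid, unsigned int rmid)
{
    assert(t != NULL && rq != NULL);

    atomic_store_explicit(&t->closid, closid, memory_order_relaxed);
    atomic_store_explicit(&t->rmid, rmid, memory_order_relaxed);

    /* Full fence (assumed analogue of smp_mb()): keeps the load below
     * from being reordered ahead of the two stores above. */
    atomic_thread_fence(memory_order_seq_cst);

    return atomic_load_explicit(&rq->curr, memory_order_relaxed) == t;
}
```

Without the fence, the load of rq->curr could be satisfied before the
stores become visible, which is the window shown in the diagram.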

It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.
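
For comparison, the two-pass alternative that was not implemented might
look roughly like the userspace C11 sketch below. All names here
(struct task, struct runqueue, move_group_two_pass(), the updated[]
array) are hypothetical, the model tracks only a CLOSID, and
atomic_thread_fence(memory_order_seq_cst) is assumed to stand in for
smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct task;

/* Hypothetical stand-in for a per-CPU runqueue. */
struct runqueue {
    struct task *_Atomic curr;  /* models rq->curr */
};

/* Hypothetical stand-in for task_struct. */
struct task {
    _Atomic unsigned int closid;
    struct runqueue *rq;        /* runqueue of the task's CPU */
};

/*
 * Moves every task whose closid equals `from` to `to` using a single
 * barrier. updated[] must have room for n entries; it is the "means of
 * remembering" which tasks were changed in the first loop.
 * Returns how many updated tasks are currently running and would
 * therefore need an IPI.
 */
static size_t move_group_two_pass(struct task **tasks, size_t n,
                                  unsigned int from, unsigned int to,
                                  bool *updated)
{
    size_t need_ipi = 0;

    /* Pass 1: store the new ID and record which tasks were touched. */
    for (size_t i = 0; i < n; i++) {
        updated[i] = atomic_load_explicit(&tasks[i]->closid,
                                          memory_order_relaxed) == from;
        if (updated[i])
            atomic_store_explicit(&tasks[i]->closid, to,
                                  memory_order_relaxed);
    }

    /* One full fence orders all stores above before all loads below. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Pass 2: check whether each updated task is currently running. */
    for (size_t i = 0; i < n; i++) {
        if (updated[i] &&
            atomic_load_explicit(&tasks[i]->rq->curr,
                                 memory_order_relaxed) == tasks[i])
            need_ipi++;
    }

    return need_ipi;
}
```

The extra array and second loop are the bookkeeping cost that the
simple approach avoids by paying for one barrier per matching task.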

Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.

CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)

  # mkdir /sys/fs/resctrl/test
  # echo $$ > /sys/fs/resctrl/test/tasks
  # perf bench sched messaging -g 40 -l 100000

task-clock time ranges collected using:

  # perf stat rmdir /sys/fs/resctrl/test

Baseline:                     1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms

  [ bp: Massage commit message. ]

Fixes: ae28d1aae4 ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be94 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman <peternewman@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18 11:42:05 +01:00
arch x86/resctrl: Fix task CLOSID/RMID update race 2023-01-18 11:42:05 +01:00
block blk-mq: fix possible memleak when register 'hctx' failed 2023-01-18 11:41:38 +01:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:11:22 +02:00
crypto crypto: tcrypt - Fix multibuffer skcipher speed test mem leak 2023-01-18 11:41:19 +01:00
Documentation docs: Fix the docs build with Sphinx 6.0 2023-01-18 11:42:01 +01:00
drivers iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() 2023-01-18 11:42:05 +01:00
fs ext4: fix uninititialized value in 'ext4_evict_inode' 2023-01-18 11:42:03 +01:00
include quota: Factor out setup of quota inode 2023-01-18 11:42:02 +01:00
init init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash 2022-12-08 11:22:59 +01:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:23:06 +01:00
kernel tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line 2023-01-18 11:41:48 +01:00
lib mm/highmem: Lift memcpy_[to|from]_page to core 2023-01-18 11:41:55 +01:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm, compaction: fix fast_isolate_around() to stay within boundaries 2023-01-18 11:41:44 +01:00
net net/sched: act_mpls: Fix warning during failed attribute validation 2023-01-18 11:42:04 +01:00
samples samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe() 2023-01-18 11:41:26 +01:00
scripts scripts/faddr2line: Fix regression in name resolution on ppc64le 2022-12-08 11:23:02 +01:00
security device_cgroup: Roll back to original exceptions after copy failure 2023-01-18 11:41:50 +01:00
sound ALSA: hda/hdmi: Add a HP device 0x8715 to force connect list 2023-01-18 11:42:01 +01:00
tools perf auxtrace: Fix address filter duplicate symbol selection 2023-01-18 11:42:01 +01:00
usr initramfs: restore default compression behavior 2020-04-08 09:08:38 +02:00
virt KVM: arm64: vgic: Fix exit condition in scan_its_table() 2022-10-29 10:20:35 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS MAINTAINERS: add Chandan as xfs maintainer for 5.4.y 2022-09-28 11:03:58 +02:00
Makefile Linux 5.4.228 2022-12-19 12:24:17 +01:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.