Waiman Long 1fd7ab3fac driver/base/cpu: Retry online operation if -EBUSY
Booting the kernel with "maxcpus=1" is a common technique for CPU
partitioning and isolation. It delays the CPU bringup process until
the bootup scripts are ready to bring CPUs online by writing 1 to
/sys/devices/system/cpu/cpu<X>/online. However, it was found that not
all the CPUs were online after bootup. The set of offline CPUs
is different after every reboot.
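
As a rough illustration of the user-space side described above, the
following sketch writes 1 to a CPU's "online" file and reports the errno
on failure. The helper names cpu_online_path() and write_online() are
hypothetical and are not part of the patch; real bootup scripts usually do
this from a shell.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: build the sysfs path of one CPU's "online" file. */
static int cpu_online_path(char *buf, size_t len, unsigned int cpu)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);
    if (n < 0 || (size_t)n >= len)
        return -ENAMETOOLONG;
    return 0;
}

/* Hypothetical helper: write "1" to path; returns 0 or a negative errno. */
static int write_online(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -errno;

    int ret = 0;
    if (fputs("1", f) == EOF)
        ret = -errno;
    /*
     * stdio buffers the write, so an error reported by the kernel
     * (such as EBUSY) may only surface when fclose() flushes.
     */
    if (fclose(f) == EOF && ret == 0)
        ret = -errno;
    return ret;
}
```

A caller would build the path with cpu_online_path() and pass it to
write_online(); a return value of -EBUSY corresponds to the failure
described in this commit message.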

Further investigation reveals that some "online" write operations
fail with an -EBUSY error. This error is returned while CPU hotplug is
temporarily disabled by a call to cpu_hotplug_disable().

During bootup, the main caller of cpu_hotplug_disable() is
pci_call_probe() for PCI device initialization. Measurements of the time
spent with cpu_hotplug_disabled set on a typical 2-socket server show
that most such periods last less than 10ms. However, a few can last
hundreds of ms. Note that the cpu_hotplug_disabled periods of different
devices can overlap, leading to a longer cpu_hotplug_disabled hold time.

Since the CPU hotplug disable condition is transient and it is not
that easy to modify all the existing bootup scripts to handle this
condition, the kernel can help by retrying the online operation when
an -EBUSY error is returned. This patch retries the online operation
in cpu_subsys_online() up to 5 times when an -EBUSY error is returned,
calling msleep() before each retry with an exponentially increasing
delay that adds up to a total of at least 620ms of waiting time.
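
The retry policy described above can be sketched in user-space C as
follows. This is an illustration under stated assumptions, not the patch
itself: the constants are chosen so that a doubling delay over 5 retries
sums to 620ms (20 + 40 + 80 + 160 + 320), the names online_with_retry()
and busy_n_times() are hypothetical, and the sleep is only accumulated in
a counter instead of calling msleep().

```c
#include <assert.h>
#include <errno.h>

/* Assumed values: 5 retries, delays of 10ms << retry, totalling 620ms. */
#define MAX_RETRIES   5
#define BASE_DELAY_MS 10

typedef int (*online_fn)(void *ctx);

/* Delay before retry number `retry` (1-based): 20, 40, 80, 160, 320 ms. */
static unsigned int retry_delay_ms(unsigned int retry)
{
    assert(retry >= 1 && retry <= MAX_RETRIES);
    return BASE_DELAY_MS * (1u << retry);
}

/*
 * Call try_online() and retry while it returns -EBUSY, up to
 * MAX_RETRIES times. *slept_ms receives the total simulated wait.
 */
static int online_with_retry(online_fn try_online, void *ctx,
                             unsigned int *slept_ms)
{
    unsigned int retries = 0;
    int ret;

    *slept_ms = 0;
    for (;;) {
        ret = try_online(ctx);
        if (ret != -EBUSY)
            return ret;              /* success or a non-transient error */
        if (++retries > MAX_RETRIES)
            return ret;              /* still busy after all retries */
        *slept_ms += retry_delay_ms(retries);  /* stand-in for msleep() */
    }
}

/* Stub operation: returns -EBUSY until the counter in ctx reaches zero. */
static int busy_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -EBUSY;
    }
    return 0;
}
```

With the stub busy for 3 calls, the operation succeeds after 140ms of
simulated waiting; if it stays busy, -EBUSY is returned after the full
620ms.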

With this patch in place, booting the kernel with "maxcpus=1"
did not leave any CPU in an offline state in 10 reboot attempts.

Reported-by: Vishal Agrawal <vagrawal@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20230724143826.3996163-1-longman@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-05 08:31:41 +02:00
arch x86: 2023-07-30 11:19:08 -07:00
block block-6.5-2023-07-21 2023-07-22 11:05:15 -07:00
certs
crypto
Documentation docs: stable-kernel-rules: make rule section more straight forward 2023-08-05 08:31:41 +02:00
drivers driver/base/cpu: Retry online operation if -EBUSY 2023-08-05 08:31:41 +02:00
fs four small client fixes 2023-07-29 20:49:13 -07:00
include driver core: Move dev_err_probe() to where it belogs 2023-08-05 08:31:41 +02:00
init
io_uring io_uring-6.5-2023-07-28 2023-07-28 10:19:44 -07:00
ipc
kernel Probe fixes for 6.5-rc3: 2023-07-30 11:27:22 -07:00
lib kobject: Add helper kobj_ns_type_is_valid() 2023-08-05 08:31:41 +02:00
LICENSES
mm 11 hotfixes. Five are cc:stable and the remainder address post-6.4 issues 2023-07-28 17:19:52 -07:00
net A patch to reduce the potential for erroneous RBD exclusive lock 2023-07-28 10:47:24 -07:00
rust
samples
scripts x86: 2023-07-30 11:19:08 -07:00
security security: keys: perform capable check only on privileged operations 2023-07-28 18:07:41 +00:00
sound ASoC: Fixes for v6.5 2023-07-27 14:54:23 +02:00
tools Probe fixes for 6.5-rc3: 2023-07-30 11:27:22 -07:00
usr
virt KVM: Grab a reference to KVM for VM and vCPU stats file descriptors 2023-07-29 11:05:28 -04:00
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap mailmap: update remaining active codeaurora.org email addresses 2023-07-27 13:07:05 -07:00
.rustfmt.toml
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS USB fixes for 6.5-rc4 2023-07-30 11:57:51 -07:00
Makefile Linux 6.5-rc4 2023-07-30 13:23:47 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.