Go to file
Douglas Anderson 9e957a1550 serial: qcom-geni: Don't cancel/abort if we can't get the port lock
As of commit d7402513c9 ("arm64: smp: IPI_CPU_STOP and
IPI_CPU_CRASH_STOP should try for NMI"), if we've got pseudo-NMI
enabled then we'll use it to stop CPUs at panic time. This is nice,
but it does mean that there's a pretty good chance that we'll end up
stopping a CPU while it holds the port lock for the console
UART. Specifically, I see a CPU get stopped while holding the port
lock nearly 100% of the time on my sc7180-trogdor based Chromebook by
enabling the "buddy" hardlockup detector and then doing:

  sysctl -w kernel.hardlockup_all_cpu_backtrace=1
  sysctl -w kernel.hardlockup_panic=1
  echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

UART drivers are _supposed_ to handle this case OK and this is why
UART drivers check "oops_in_progress" and only do a "trylock" in that
case. However, before we enabled pseudo-NMI to stop CPUs it wasn't a
very well-tested situation.

Now that we're testing the situation a lot, it can be seen that the
Qualcomm GENI UART driver is pretty broken. Specifically, when I run
my test case and look at the console output I just see a bunch of
garbled output like:

  [  201.069084] NMI backtrace[  201.069084] NM[  201.069087] CPU: 6
  PID: 10296 Comm: dnsproxyd Not tainted 6.7.0-06265-gb13e8c0ede12
  #1 01112b9f14923cbd0b[  201.069090] Hardware name: Google Lazor
  ([  201.069092] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DI[
  201.069095] pc : smp_call_function_man[  201.069099]

That's obviously not so great. This happens because each call to the
console driver exits after the data has been written to the FIFO but
before it's actually been flushed out of the serial port. When we have
multiple calls into the console one after the other then (if we can't
get the lock) each call tells the UART to throw away any data in the
FIFO that hadn't been transferred yet.

I've posted up a patch to change the arm64 core to avoid this
situation most of the time [1] much like x86 seems to do, but even if
that patch lands the GENI driver should still be fixed.

>From testing, it appears that we can just delete the cancel/abort in
the case where we weren't able to get the UART lock and the output
looks good. It makes sense that we'd be able to do this since that
means we'll just call into __qcom_geni_serial_console_write() and
__qcom_geni_serial_console_write() looks much like
qcom_geni_serial_poll_put_char() but with a loop. However, it seems
safest to poll the FIFO and make sure it's empty before our
transfer. This should reliably make sure that we're not
interrupting/clobbering any existing transfers.

As part of this change, we'll also avoid re-setting up a TX at the end
of the console write function if we weren't able to get the lock,
since accessing "port->tx_remaining" without the lock is not
safe. This is only needed to re-start userspace initiated transfers.

[1] https://lore.kernel.org/r/20231207170251.1.Id4817adef610302554b8aa42b090d57270dc119c@changeid

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20240112150307.2.Idb1553d1d22123c377f31eacb4486432f6c9ac8d@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-27 19:01:51 -08:00
Documentation Documentation: add console.rst 2024-01-27 18:08:55 -08:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
arch serial: 8250: Move hp300_setup_serial_console() to <linux/serial_8250.h> 2024-01-27 19:01:39 -08:00
block for-6.8/block-2024-01-18 2024-01-18 18:22:40 -08:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto crypto: scomp - fix req->dst buffer overflow 2023-12-29 11:25:56 +08:00
drivers serial: qcom-geni: Don't cancel/abort if we can't get the port lock 2024-01-27 19:01:51 -08:00
fs More bcachefs updates for 6.7-rc1 2024-01-21 14:01:12 -08:00
include soc: qcom: geni-se: Add M_TX_FIFO_NOT_EMPTY bit definition 2024-01-27 19:01:51 -08:00
init Driver core changes for 6.8-rc1 2024-01-18 09:48:40 -08:00
io_uring for-6.8/io_uring-2024-01-18 2024-01-18 18:17:57 -08:00
ipc shm: Slim down dependencies 2023-12-20 19:26:31 -05:00
kernel Updates for time and clocksources: 2024-01-21 11:14:40 -08:00
lib RISC-V Patches for the 6.8 Merge Window, Part 4 2024-01-20 11:06:04 -08:00
mm vfs-6.8.netfs 2024-01-19 09:10:23 -08:00
net Assorted CephFS fixes and cleanups with nothing standing out. 2024-01-19 09:58:55 -08:00
rust Rust changes for v6.8 2024-01-11 13:05:41 -08:00
samples RISC-V Patches for the 6.8 Merge Window, Part 4 2024-01-20 11:06:04 -08:00
scripts Coccinelle change for v6.8 2024-01-20 14:20:34 -08:00
security + Features 2024-01-19 10:53:55 -08:00
sound treewide, serdev: change receive_buf() return type to size_t 2024-01-27 18:13:53 -08:00
tools RISC-V Patches for the 6.8 Merge Window, Part 4 2024-01-20 11:06:04 -08:00
usr Kbuild updates for v6.8 2024-01-18 17:57:07 -08:00
virt Generic: 2024-01-17 13:03:37 -08:00
.clang-format clang-format: Update with v6.7-rc4's `for_each` macro list 2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.mailmap Char/Misc and other Driver changes for 6.8-rc1 2024-01-17 16:47:17 -08:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS Including fixes from bpf and netfilter. 2024-01-18 17:33:50 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Various smb client fixes, including multichannel and for SMB3.1.1 POSIX extensions 2024-01-20 16:48:07 -08:00
Makefile Linux 6.8-rc1 2024-01-21 14:11:32 -08:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.