Commit Graph

527 Commits

Author SHA1 Message Date
Linus Torvalds 3c31041e37 printk changes for 6.6
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmT1pbAACgkQUqAMR0iA
 lPLoxBAAl18gKo6C8zIBNBoYNl7FxvChrJORjK7RQONs5RYKt8drHjSrJGazhjiV
 TIdbZt9juqs+UT/f6DnkJznrqQ0W70fQsgUpw+q7n7+cnkIoXAiAs+plexdQXigB
 6nx67b2oub41jTwzn/uV7R/eTwq2VnoZqudS/o9iAI/Ia9JzkqmGx08hQedvOoqX
 2SWs140iY/zXsFUyEfe8UTXwJUnC/n9pwtpe5sLpmtyupGc/KumUimTQ+LFJbV9o
 n8QhcQn10CE93M5fH/R2JXjZO7wuSmCHt/V8oSnoOwdCBBe7Tc6aBx5wUwc4XCuC
 450h5hlYBKq97lA1PnWGC01uAkeDTRw8963LVRRqWvohoFuHXR0cisF9FTM9LXfs
 bg90XjzYAK7Ns9fJ0dZHOSbUtRaa06hiExUnO3ctyv2G6h8qKfv86LCuP0CMFmQU
 rflfk1dPiMW20HT3ZJNtMCy3Vu6kVQSdSaGKTnwzTcUWop5tCQxhmWYBXH6q/1LH
 aD7xT1xJnBGqLUqq5C8twoOea+w47x/vtjTLi7mJarP5Wfh8xv6axdkwE8N4NrYp
 cc2RR83a1BZ7At3YkAjfjHmhaZ97gSSe6+Yqk9UzvUEQY/WILEGnb0DKO1jCSB34
 D2NPh1MHF5MFQjazdt/dSyMJVxDlTeY/S5wqfLLhNZts48LJ8n0=
 =D7ZU
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Do not try to get the console lock when it is not need or useful in
   panic()

 - Replace the global console_suspended state by a per-console flag

 - Export symbols needed for dumping the raw printk buffer in panic()

 - Fix documentation of printf formats for integer types

 - Moved Sergey Senozhatsky to the reviewer role

 - Misc cleanups

* tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: export symbols for debug modules
  lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
  printk: ringbuffer: Fix truncating buffer size min_t cast
  printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
  printk: Add per-console suspended state
  printk: Consolidate console deferred printing
  printk: Do not take console lock for console_flush_on_panic()
  printk: Keep non-panic-CPUs out of console lock
  printk: Reduce console_unblank() usage in unsafe scenarios
  kdb: Do not assume write() callback available
  docs: printk-formats: Treat char as always unsigned
  docs: printk-formats: Fix hex printing of signed values
  MAINTAINERS: adjust printk/vsprintf entries
2023-09-04 13:20:19 -07:00
Petr Mladek f0f6923953 Merge branch 'rework/misc-cleanups' into for-linus 2023-09-04 11:37:37 +02:00
Enlin Mu 3e00123a13 printk: export symbols for debug modules
the module is out-of-tree, it saves kernel logs when panic

Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230815020711.2604939-1-yunlong.xing@unisoc.com
2023-08-16 17:06:38 +02:00
Kees Cook 53e9e33ede printk: ringbuffer: Fix truncating buffer size min_t cast
If an output buffer size exceeded U16_MAX, the min_t(u16, ...) cast in
copy_data() was causing writes to truncate. This manifested as output
bytes being skipped, seen as %NUL bytes in pstore dumps when the available
record size was larger than 65536. Fix the cast to no longer truncate
the calculation.

Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Reported-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Link: https://lore.kernel.org/lkml/d8bb1ec7-a4c5-43a2-9de0-9643a70b899f@linux.microsoft.com/
Fixes: b6cf8b3f33 ("printk: add lockless ringbuffer")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Reviewed-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
Tested-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230811054528.never.165-kees@kernel.org
2023-08-14 13:05:22 +02:00
John Ogness 132a90d152 printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
Currently abandon_console_lock_in_panic() is only used to determine if
the current CPU should immediately release the console lock because
another CPU is in panic. However, later this function will be used by
the CPU to immediately release other resources in this situation.

Rename the function to other_cpu_in_panic(), which is a better
description and does not assume it is related to the console lock.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-8-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
John Ogness 9e70a5e109 printk: Add per-console suspended state
Currently the global @console_suspended is used to determine if
consoles are in a suspended state. Its primary purpose is to allow
usage of the console_lock when suspended without causing console
printing. It is synchronized by the console_lock.

Rather than relying on the console_lock to determine suspended
state, make it an official per-console state that is set within
console->flags. This allows the state to be queried via SRCU.

Remove @console_suspended. Console printing will still be avoided
when suspended because console_is_usable() returns false when
the new suspended flag is set for that console.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-7-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
John Ogness 696ffaf50e printk: Consolidate console deferred printing
Printing to consoles can be deferred for several reasons:

- explicitly with printk_deferred()
- printk() in NMI context
- recursive printk() calls

The current implementation is not consistent. For printk_deferred(),
irq work is scheduled twice. For NMI und recursive, panic CPU
suppression and caller delays are not properly enforced.

Correct these inconsistencies by consolidating the deferred printing
code so that vprintk_deferred() is the top-level function for
deferred printing and vprintk_emit() will perform whichever irq_work
queueing is appropriate.

Also add kerneldoc for wake_up_klogd() and defer_console_output() to
clarify their differences and appropriate usage.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-6-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
John Ogness eacb04ff3c printk: Do not take console lock for console_flush_on_panic()
Currently console_flush_on_panic() will attempt to acquire the
console lock when flushing the buffer on panic. If it fails to
acquire the lock, it continues anyway because this is the last
chance to get any pending records printed.

The reason why the console lock was attempted at all was to
prevent any other CPUs from acquiring the console lock for
printing while the panic CPU was printing. But as of the
previous commit, non-panic CPUs will no longer attempt to
acquire the console lock in a panic situation. Therefore it is
no longer strictly necessary for a panic CPU to acquire the
console lock.

Avoiding taking the console lock when flushing in panic has
the additional benefit of avoiding possible deadlocks due to
semaphore usage in NMI context (semaphores are not NMI-safe)
and avoiding possible deadlocks if another CPU accesses the
semaphore and is stopped while holding one of the semaphore's
internal spinlocks.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-5-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
John Ogness 51a1d258e5 printk: Keep non-panic-CPUs out of console lock
When in a panic situation, non-panic CPUs should avoid holding the
console lock so as not to contend with the panic CPU. This is already
implemented with abandon_console_lock_in_panic(), which is checked
after each printed line. However, non-panic CPUs should also avoid
trying to acquire the console lock during a panic.

Modify console_trylock() to fail and console_lock() to block() when
called from a non-panic CPU during a panic.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-4-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
John Ogness 7b23a66db5 printk: Reduce console_unblank() usage in unsafe scenarios
A semaphore is not NMI-safe, even when using down_trylock(). Both
down_trylock() and up() are using internal spinlocks and up()
might even call wake_up_process().

In the panic() code path it gets even worse because the internal
spinlocks of the semaphore may have been taken by a CPU that has
been stopped.

To reduce the risk of deadlocks caused by the console semaphore in
the panic path, make the following changes:

- First check if any consoles have implemented the unblank()
  callback. If not, then there is no reason to take the console
  semaphore anyway. (This check is also useful for the non-panic
  path since the locking/unlocking of the console lock can be
  quite expensive due to console printing.)

- If the panic path is in NMI context, bail out without attempting
  to take the console semaphore or calling any unblank() callbacks.
  Bailing out is acceptable because console_unblank() would already
  bail out if the console semaphore is contended. The alternative of
  ignoring the console semaphore and calling the unblank() callbacks
  anyway is a bad idea because these callbacks are also not NMI-safe.

If consoles with unblank() callbacks exist and console_unblank() is
called from a non-NMI panic context, it will still attempt a
down_trylock(). This could still result in a deadlock if one of the
stopped CPUs is holding the semaphore internal spinlock. But this
is a risk that the kernel has been (and continues to be) willing
to take.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-3-john.ogness@linutronix.de
2023-07-20 13:06:22 +02:00
Peter Zijlstra d16317de9b seqlock/latch: Provide raw_read_seqcount_latch_retry()
The read side of seqcount_latch consists of:

  do {
    seq = raw_read_seqcount_latch(&latch->seq);
    ...
  } while (read_seqcount_latch_retry(&latch->seq, seq));

which is asymmetric in the raw_ department, and sure enough,
read_seqcount_latch_retry() includes (explicit) instrumentation where
raw_read_seqcount_latch() does not.

This inconsistency becomes a problem when trying to use it from
noinstr code. As such, fix it by renaming and re-implementing
raw_read_seqcount_latch_retry() without the instrumentation.

Specifically the instrumentation in question is kcsan_atomic_next(0)
in do___read_seqcount_retry(). Loosing this annotation is not a
problem because raw_read_seqcount_latch() does not pass through
kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>  # Hyper-V
Link: https://lore.kernel.org/r/20230519102715.233598176@infradead.org
2023-06-05 21:11:03 +02:00
Linus Torvalds 7fa8a8ee94 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
 
 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
 
 - zsmalloc performance improvements from Sergey Senozhatsky.
 
 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.
 
 - VFS rationalizations from Christoph Hellwig:
 
   - removal of most of the callers of write_one_page().
 
   - make __filemap_get_folio()'s return value more useful
 
 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing.  Use `mount -o noswap'.
 
 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.
 
 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).
 
 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.
 
 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
   than exclusive, and has fixed a bunch of errors which were caused by its
   unintuitive meaning.
 
 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.
 
 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.
 
 - Cleanups to do_fault_around() by Lorenzo Stoakes.
 
 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.
 
 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.
 
 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.
 
 - Lorenzo has also contributed further cleanups of vma_merge().
 
 - Chaitanya Prakash provides some fixes to the mmap selftesting code.
 
 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.
 
 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.
 
 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.
 
 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.
 
 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.
 
 - Yosry Ahmed has improved the performance of memcg statistics flushing.
 
 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.
 
 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.
 
 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.
 
 - Pankaj Raghav has made page_endio() unneeded and has removed it.
 
 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.
 
 - Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
 
 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.
 
 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.
 
 - Peter Xu fixes a few issues with uffd-wp at fork() time.
 
 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
 jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
 eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
 =J+Dj
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page recalim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
2023-04-27 19:42:02 -07:00
Linus Torvalds b6a7828502 modules-6.4-rc1
The summary of the changes for this pull requests is:
 
  * Song Liu's new struct module_memory replacement
  * Nick Alcock's MODULE_LICENSE() removal for non-modules
  * My cleanups and enhancements to reduce the areas where we vmalloc
    module memory for duplicates, and the respective debug code which
    proves the remaining vmalloc pressure comes from userspace.
 
 Most of the changes have been in linux-next for quite some time except
 the minor fixes I made to check if a module was already loaded
 prior to allocating the final module memory with vmalloc and the
 respective debug code it introduces to help clarify the issue. Although
 the functional change is small it is rather safe as it can only *help*
 reduce vmalloc space for duplicates and is confirmed to fix a bootup
 issue with over 400 CPUs with KASAN enabled. I don't expect stable
 kernels to pick up that fix as the cleanups would have also had to have
 been picked up. Folks on larger CPU systems with modules will want to
 just upgrade if vmalloc space has been an issue on bootup.
 
 Given the size of this request, here's some more elaborate details
 on this pull request.
 
 The functional change change in this pull request is the very first
 patch from Song Liu which replaces the struct module_layout with a new
 struct module memory. The old data structure tried to put together all
 types of supported module memory types in one data structure, the new
 one abstracts the differences in memory types in a module to allow each
 one to provide their own set of details. This paves the way in the
 future so we can deal with them in a cleaner way. If you look at changes
 they also provide a nice cleanup of how we handle these different memory
 areas in a module. This change has been in linux-next since before the
 merge window opened for v6.3 so to provide more than a full kernel cycle
 of testing. It's a good thing as quite a bit of fixes have been found
 for it.
 
 Jason Baron then made dynamic debug a first class citizen module user by
 using module notifier callbacks to allocate / remove module specific
 dynamic debug information.
 
 Nick Alcock has done quite a bit of work cross-tree to remove module
 license tags from things which cannot possibly be module at my request
 so to:
 
   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area
      is active with no clear solution in sight.
 
   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags
 
 In so far as a) is concerned, although module license tags are a no-op
 for non-modules, tools which would want create a mapping of possible
 modules can only rely on the module license tag after the commit
 8b41fc4454 ("kbuild: create modules.builtin without Makefile.modbuiltin
 or tristate.conf").  Nick has been working on this *for years* and
 AFAICT I was the only one to suggest two alternatives to this approach
 for tooling. The complexity in one of my suggested approaches lies in
 that we'd need a possible-obj-m and a could-be-module which would check
 if the object being built is part of any kconfig build which could ever
 lead to it being part of a module, and if so define a new define
 -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
 suggested would be to have a tristate in kconfig imply the same new
 -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
 mapping to modules always, and I don't think that's the case today. I am
 not aware of Nick or anyone exploring either of these options. Quite
 recently Josh Poimboeuf has pointed out that live patching, kprobes and
 BPF would benefit from resolving some part of the disambiguation as
 well but for other reasons. The function granularity KASLR (fgkaslr)
 patches were mentioned but Joe Lawrence has clarified this effort has
 been dropped with no clear solution in sight [1].
 
 In the meantime removing module license tags from code which could never
 be modules is welcomed for both objectives mentioned above. Some
 developers have also welcomed these changes as it has helped clarify
 when a module was never possible and they forgot to clean this up,
 and so you'll see quite a bit of Nick's patches in other pull
 requests for this merge window. I just picked up the stragglers after
 rc3. LWN has good coverage on the motivation behind this work [2] and
 the typical cross-tree issues he ran into along the way. The only
 concrete blocker issue he ran into was that we should not remove the
 MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
 they can never be modules. Nick ended up giving up on his efforts due
 to having to do this vetting and backlash he ran into from folks who
 really did *not understand* the core of the issue nor were providing
 any alternative / guidance. I've gone through his changes and dropped
 the patches which dropped the module license tags where an SPDX
 license tag was missing, it only consisted of 11 drivers.  To see
 if a pull request deals with a file which lacks SPDX tags you
 can just use:
 
   ./scripts/spdxcheck.py -f \
 	$(git diff --name-only commid-id | xargs echo)
 
 You'll see a core module file in this pull request for the above,
 but that's not related to his changes. WE just need to add the SPDX
 license tag for the kernel/module/kmod.c file in the future but
 it demonstrates the effectiveness of the script.
 
 Most of Nick's changes were spread out through different trees,
 and I just picked up the slack after rc3 for the last kernel was out.
 Those changes have been in linux-next for over two weeks.
 
 The cleanups, debug code I added and final fix I added for modules
 were motivated by David Hildenbrand's report of boot failing on
 a systems with over 400 CPUs when KASAN was enabled due to running
 out of virtual memory space. Although the functional change only
 consists of 3 lines in the patch "module: avoid allocation if module is
 already present and ready", proving that this was the best we can
 do on the modules side took quite a bit of effort and new debug code.
 
 The initial cleanups I did on the modules side of things has been
 in linux-next since around rc3 of the last kernel, the actual final
 fix for and debug code however have only been in linux-next for about a
 week or so but I think it is worth getting that code in for this merge
 window as it does help fix / prove / evaluate the issues reported
 with larger number of CPUs. Userspace is not yet fixed as it is taking
 a bit of time for folks to understand the crux of the issue and find a
 proper resolution. Worst come to worst, I have a kludge-of-concept [3]
 of how to make kernel_read*() calls for modules unique / converge them,
 but I'm currently inclined to just see if userspace can fix this
 instead.
 
 [0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
 [1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
 [2] https://lwn.net/Articles/927569/
 [3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
 hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
 oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
 jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
 YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
 WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
 p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
 Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
 c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
 eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
 tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
 agiwdHUMVN7X
 =56WK
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454 ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00
Pavankumar Kondeti 1f6ab566cb printk: export console trace point for kcsan/kasan/kfence/kmsan
The console tracepoint is used by kcsan/kasan/kfence/kmsan test modules. 
Since this tracepoint is not exported, these modules iterate over all
available tracepoints to find the console trace point.  Export the trace
point so that it can be directly used.

Link: https://lkml.kernel.org/r/20230413100859.1492323-1-quic_pkondeti@quicinc.com
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Marco Elver <elver@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:11 -07:00
Peter Zijlstra 48380368de Change DEFINE_SEMAPHORE() to take a number argument
Fundamentally semaphores are a counted primitive, but
DEFINE_SEMAPHORE() does not expose this and explicitly creates a
binary semaphore.

Change DEFINE_SEMAPHORE() to take a number argument and use that in the
few places that open-coded it using __SEMAPHORE_INITIALIZER().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[mcgrof: add some tribal knowledge about why some folks prefer
 binary sempahores over mutexes]
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-18 11:15:24 -07:00
Stanislav Kinsburskii bee43904dc printk: Remove obsoleted check for non-existent "user" object
The original check for non-null "user" object was introduced by commit
e11fea92e1 ("kmsg: export printk records to the /dev/kmsg interface")
when "user" could be NULL if /dev/ksmg was opened for writing.

Subsequent change 750afe7bab ("printk: add kernel parameter to control
writes to /dev/kmsg") made "user" context required for files opened for
write, but didn't remove now redundant checks for it to be non-NULL.

This patch removes the dead code while preserving the current logic.

Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburski@gmail.com>
CC: Petr Mladek <pmladek@suse.com>
CC: Sergey Senozhatsky <senozhatsky@chromium.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: John Ogness <john.ogness@linutronix.de>
CC: linux-kernel@vger.kernel.org
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/167929571877.2810.9926967619100618792.stgit@skinsburskii.localdomain
2023-04-03 12:05:17 +02:00
Linus Torvalds d876315445 printk changes for 6.3
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmP0vCsACgkQUqAMR0iA
 lPKjHxAAldxJr7B5XZWRbYOw9VAQ/Qv0whfZwJ0Wl1crhx/PrB6JBkTbz/vHMj+c
 vZSQXOID5g858+iR7KdvejWFTnr7RdBP28rkRj6zijXYBAlrNwQtV+aXS/AHyCjR
 2/cMuP6+7vqaVp8BqDyToQrZil6Us5Io/a6l4C0RQp1nQmY5A8qLg7kdtpPOcIoW
 yBWY+5pMJVfryr95yOjE7WxBVjpIh5eo/D1WoexoxcJNsHpe/2mAvXfZMNLMHBKL
 Io0RSZfrD0Pw+oHyhpX7bLxRbmYO8kajBMtMvPa2NIMQyk/Cm+2SSvDS39n1B4vm
 eXym5hUfSZIZUGV1cegia/exZgfvj3yCEc+8HJ0WEenlFruxXCtqwPbFwtkjsznK
 EjgOEehygaHGiQ7LY9cEA5ON5e7MIGsYUp+EAS2vNja81yBmAK89TlOtezzKIyU2
 U4q4UUsXi4TMMNzvNu0bs4gEu2HOCTovE1O0h7C56haoVHSWWqexepvRwNhAWELY
 ztQz5hd8S7UIILGDTnx8nCc7W8RddfLZfl2XBuzr9gWp79SqeaGKOk10jkjBkatC
 1STqkCN+Jwx5uIdF8LLg4EaNJ8QQCdDaCLvQpSJbQpFIYz7KCwk/hhczdCLsWrrp
 feGEwSD5u2ek5cEk840WfvKlle6fsqtLb6m98UV2/WJpjW9qtSo=
 =cVoE
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Refactor printk code for formatting messages that are shown on
   consoles. This is a preparatory step for introducing atomic consoles
   which could not share the global buffers

 - Prevent memory leak when removing printk index in debugfs

 - Dump also the newest printk message by the sample gdbmacro

 - Fix a compiler warning

* tag 'printk-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printf: fix errname.c list
  kernel/printk/index.c: fix memory leak with using debugfs_lookup()
  printk: Use scnprintf() to print the message about the dropped messages on a console
  printk: adjust string limit macros
  printk: use printk_buffers for devkmsg
  printk: introduce console_prepend_dropped() for dropped messages
  printk: introduce printk_get_next_message() and printk_message
  printk: introduce struct printk_buffers
  console: Document struct console
  console: Use BIT() macros for @flags values
  printk: move size limit macros into internal.h
  docs: gdbmacros: print newest record
2023-02-23 13:49:45 -08:00
Petr Mladek 392143c9f2 Merge branch 'rework/buffers-cleanup' into for-linus 2023-02-21 13:41:17 +01:00
Greg Kroah-Hartman 55bf243c51 kernel/printk/index.c: fix memory leak with using debugfs_lookup()
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230202151411.2308576-1-gregkh@linuxfoundation.org
2023-02-03 10:42:02 +01:00
Ingo Molnar 57a30218fa Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-rc6' into sched/core, to pick up fixes

Pick up fixes before merging another batch of cpuidle updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31 15:01:20 +01:00
Petr Mladek 21493c6e96 Merge branch 'rework/console-list-lock' into for-linus 2023-01-19 14:56:38 +01:00
Petr Mladek d551afc258 printk: Use scnprintf() to print the message about the dropped messages on a console
Use scnprintf() for printing the message about dropped messages on
a console. It returns the really written length of the message.
It prevents potential buffer overflow when the returned length is
later used to copy the buffer content.

Note that the previous code was safe because the scratch buffer was
big enough and the message always fit in. But scnprintf() makes
it more safe, definitely.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1530570 ("Memory - corruptions")
Fixes: c4fcc617e1 ("printk: introduce console_prepend_dropped() for dropped messages")
Link: https://lore.kernel.org/r/202301131544.D9E804CCD@keescook
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230117161031.15499-1-pmladek@suse.com
2023-01-18 10:13:50 +01:00
Anuradha Weeraman 4fe59a130c kernel/printk/printk.c: Fix W=1 kernel-doc warning
Fix W=1 kernel-doc warning:

kernel/printk/printk.c:
 - Include function parameter in console_lock_spinning_disable_and_check()

Signed-off-by: Anuradha Weeraman <anuradha@debian.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230116125635.374567-1-anuradha@debian.org
2023-01-16 16:59:17 +01:00
John Ogness 3ef5abd9b5 tty: serial: kgdboc: fix mutex locking order for configure_kgdboc()
Several mutexes are taken while setting up console serial ports. In
particular, the tty_port->mutex and @console_mutex are taken:

  serial_pnp_probe
    serial8250_register_8250_port
      uart_add_one_port (locks tty_port->mutex)
        uart_configure_port
          register_console (locks @console_mutex)

In order to synchronize kgdb's tty_find_polling_driver() with
register_console(), commit 6193bc9084 ("tty: serial: kgdboc:
synchronize tty_find_polling_driver() and register_console()") takes
the @console_mutex. However, this leads to the following call chain
(with locking):

  platform_probe
    kgdboc_probe
      configure_kgdboc (locks @console_mutex)
        tty_find_polling_driver
          uart_poll_init (locks tty_port->mutex)
            uart_set_options

This is clearly deadlock potential due to the reverse lock ordering.

Since uart_set_options() requires holding @console_mutex in order to
serialize early initialization of the serial-console lock, take the
@console_mutex in uart_poll_init() instead of configure_kgdboc().

Since configure_kgdboc() was using @console_mutex for safe traversal
of the console list, change it to use the SRCU iterator instead.

Add comments to uart_set_options() kerneldoc mentioning that it
requires holding @console_mutex (aka the console_list_lock).

Fixes: 6193bc9084 ("tty: serial: kgdboc: synchronize tty_find_polling_driver() and register_console()")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
[pmladek@suse.com: Export console_srcu_read_lock_is_held() to fix build kgdboc as a module.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230112161213.1434854-1-john.ogness@linutronix.de
2023-01-16 16:44:53 +01:00
Peter Zijlstra 880970b56b printk: Remove trace_.*_rcuidle() usage
The problem, per commit fc98c3c8c9 ("printk: use rcuidle console
tracepoint"), was printk usage from the cpuidle path where RCU was
already disabled.

Per the patches earlier in this series, this is no longer the case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.865735001@infradead.org
2023-01-13 11:48:16 +01:00
John Ogness b0975c47c2 printk: adjust string limit macros
The various internal size limit macros have names and/or values that
do not fit well to their current usage.

Rename the macros so that their purpose is clear and, if needed,
provide a more appropriate value. In general, the new macros and
values will lead to less memory usage. The new macros are...

PRINTK_MESSAGE_MAX:

This is the maximum size for a formatted message on a console,
devkmsg, or syslog. It does not matter which format the message has
(normal or extended). It replaces the use of CONSOLE_EXT_LOG_MAX for
console and devkmsg. It replaces the use of CONSOLE_LOG_MAX for
syslog.

Historically, normal messages have been allowed to print up to 1kB,
whereas extended messages have been allowed to print up to 8kB.
However, the difference in lengths of these message types is not
significant and in multi-line records, normal messages are probably
larger. Also, because 1kB is only slightly above the allowed record
size, multi-line normal messages could be easily truncated during
formatting.

This new macro should be significantly larger than the allowed
record size to allow sufficient space for extended or multi-line
prefix text. A value of 2kB should be plenty of space. For normal
messages this represents a doubling of the historically allowed
amount. For extended messages it reduces the excessive 8kB size,
thus reducing memory usage needed for message formatting.

PRINTK_PREFIX_MAX:

This is the maximum size allowed for a record prefix (used by
console and syslog). It replaces PREFIX_MAX. The value is left
unchanged.

PRINTKRB_RECORD_MAX:

This is the maximum size allowed to be reserved for a record in the
ringbuffer. It is used by all readers and writers with the printk
ringbuffer. It replaces LOG_LINE_MAX.

Previously this was set to "1kB - PREFIX_MAX", which makes some
sense if 1kB is the limit for normal message output and prefixes are
enabled. However, with the allowance of larger output and the
existence of multi-line records, the value is rather bizarre.

Round the value up to 1kB.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-9-john.ogness@linutronix.de
2023-01-11 15:35:12 +01:00
John Ogness ea308da119 printk: use printk_buffers for devkmsg
Replace the buffers in struct devkmsg_user with a struct
printk_buffers. This reduces the number of buffers to keep track of.

As a side-effect, @text_buf was 8kB large, even though it only
needed to be the max size of a ringbuffer record. By switching to
struct printk_buffers, ~7kB less memory is allocated when opening
/dev/kmsg.

And since struct printk_buffers will be used now, reduce duplicate
code by calling printk_get_next_message() to handle the record
reading and formatting.

Note that since /dev/kmsg never suppresses records based on
loglevel, printk_get_next_message() is extended with an extra
bool argument to specify if suppression is allowed.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-8-john.ogness@linutronix.de
2023-01-11 15:35:12 +01:00
John Ogness c4fcc617e1 printk: introduce console_prepend_dropped() for dropped messages
Currently "dropped messages" are separately printed immediately
before printing the printk message. Since normal consoles are
now using an output buffer that is much larger than previously,
the "dropped message" could be prepended to the printk message
and then output everything in a single write() call.

Introduce a helper function console_prepend_dropped() to prepend
an existing message with a "dropped message". This simplifies
the code by allowing all message formatting to be handled
together and then only requires a single write() call to output
the full message. And since this helper does not require any
locking, it can be used in the future for other console printing
contexts as well.

Note that console_prepend_dropped() is defined as a NOP for
!CONFIG_PRINTK. Although the function will never be called for
!CONFIG_PRINTK, compiling the function can lead to warnings of
"always true" conditionals due to the size macro values used
in !CONFIG_PRINTK.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-7-john.ogness@linutronix.de
2023-01-11 15:35:11 +01:00
John Ogness 2830eec14a printk: introduce printk_get_next_message() and printk_message
Code for performing the console output is intermixed with code that
is formatting the output for that console. Introduce a new helper
function printk_get_next_message() to handle the reading and
formatting of the printk text. The helper does not require any
locking so that in the future it can be used for other printing
contexts as well.

This also introduces a new struct printk_message to wrap the struct
printk_buffers, adding metadata about its contents. This allows
users of printk_get_next_message() to receive all relevant
information about the message that was read and formatted.

Why is struct printk_message a wrapper struct?

It is intentional that a wrapper struct is introduced instead of
adding the metadata directly to struct printk_buffers. The upcoming
atomic consoles support multiple printing contexts per CPU. This
means that while a CPU is formatting a message, it can be
interrupted and the interrupting context may also format a (possibly
different) message. Since the printk buffers are rather large,
there will only be one struct printk_buffers per CPU and it must be
shared by the possible contexts of that CPU.

If the metadata was part of struct printk_buffers, interrupting
contexts would clobber the metadata being prepared by the
interrupted context. This could be handled by robustifying the
message formatting functions to cope with metadata unexpectedly
changing. However, this would require significant amounts of extra
data copying, also adding significant complexity to the code.

Instead, the metadata can live on the stack of the formatting
context and the message formatting functions do not need to be
concerned about the metadata changing underneath them.

Note that the message formatting functions can handle unexpected
text buffer changes. So it is perfectly OK if a shared text buffer
is clobbered by an interrupting context. The atomic console
implementation will recognize the interruption and avoid printing
the (probably garbage) text buffer.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-6-john.ogness@linutronix.de
2023-01-11 15:35:11 +01:00
John Ogness daaab5b5bb printk: introduce struct printk_buffers
Introduce a new struct printk_buffers to contain all the buffers
needed to read and format a printk message for output. Putting the
buffers inside a struct reduces the number of buffer pointers that
need to be tracked. Also, it allows usage of the sizeof() macro for
the buffer sizes, rather than expecting certain sized buffers being
passed in.

Note that since the output buffer for normal consoles is now
CONSOLE_EXT_LOG_MAX instead of CONSOLE_LOG_MAX, multi-line
messages that may have been previously truncated will now be
printed in full. This should be considered a feature and not a bug
since the CONSOLE_LOG_MAX restriction was about limiting static
buffer usage rather than limiting printed text.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-5-john.ogness@linutronix.de
2023-01-11 15:35:11 +01:00
John Ogness 2364b40682 printk: move size limit macros into internal.h
The size limit macros are located further down in printk.c and
behind ifdef conditionals. This complicates their usage for
upcoming changes. Move the macros into internal.h so that they
are still invisible outside of printk, but easily accessible
for printk.

Also, the maximum size of formatted extended messages does not
need to be known by any code outside of printk, so move it to
internal.h as well. And like CONSOLE_LOG_MAX, for !CONFIG_PRINTK
set CONSOLE_EXT_LOG_MAX to 0 to reduce the static memory
footprint.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230109100800.1085541-2-john.ogness@linutronix.de
2023-01-11 15:35:11 +01:00
Petr Mladek 6b2b0d839a Merge branch 'rework/console-list-lock' into for-linus 2022-12-08 11:46:56 +01:00
John Ogness 5074ffbec6 printk: htmldocs: add missing description
Variable and return descriptions were missing from the SRCU read
lock functions. Add them.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87zgcjpdvo.fsf@jogness.linutronix.de
2022-12-02 11:25:02 +01:00
John Ogness 848a9c1066 printk: relieve console_lock of list synchronization duties
The console_list_lock provides synchronization for console list and
console->flags updates. All call sites that were using the console_lock
for this synchronization have either switched to use the
console_list_lock or the SRCU list iterator.

Remove console_lock usage for console list updates and console->flags
updates.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-40-john.ogness@linutronix.de
2022-12-02 11:25:02 +01:00
John Ogness 6f8836756f printk, xen: fbfront: create/use safe function for forcing preferred
With commit 9e124fe16ff2("xen: Enable console tty by default in domU
if it's not a dummy") a hack was implemented to make sure that the
tty console remains the console behind the /dev/console device. The
main problem with the hack is that, after getting the console pointer
to the tty console, it is assumed the pointer is still valid after
releasing the console_sem. This assumption is incorrect and unsafe.

Make the hack safe by introducing a new function
console_force_preferred_locked() and perform the full operation
under the console_list_lock.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-33-john.ogness@linutronix.de
2022-12-02 11:25:02 +01:00
John Ogness 1fd4224a6b console: introduce console_is_registered()
Currently it is not possible for drivers to detect if they have
already successfully registered their console. Several drivers
have multiple paths that lead to console registration. To avoid
attempting a 2nd registration (which leads to a WARN), drivers
are implementing their own solution.

Introduce console_is_registered() so drivers can easily identify
if their console is currently registered. A _locked() variant
is also provided if the caller is already holding the
console_list_lock.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-22-john.ogness@linutronix.de
2022-12-02 11:25:01 +01:00
John Ogness 8cb15f7f49 printk: console_device: use srcu console list iterator
Use srcu console list iteration for console list traversal. It is
acceptable because the consoles might come and go at any time.
Strict synchronizing with console registration code would not bring
any advantage over srcu.

Document why the console_lock is still necessary. Note that this
is a preparatory change for when console_lock no longer provides
synchronization for the console list.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-21-john.ogness@linutronix.de
2022-12-02 11:25:01 +01:00
John Ogness 87f2e4b7d9 printk: console_flush_on_panic: use srcu console list iterator
With SRCU it is now safe to traverse the console list, even if
the console_trylock() failed. However, overwriting console->seq
when console_trylock() failed is still an issue.

Switch to SRCU iteration and document remaining issue with
console->seq.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-20-john.ogness@linutronix.de
2022-12-02 11:25:01 +01:00
John Ogness d792db6f6b printk: console_unblank: use srcu console list iterator
Use srcu console list iteration for console list traversal.

Document why the console_lock is still necessary. Note that this
is a preparatory change for when console_lock no longer provides
synchronization for the console list.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-19-john.ogness@linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness 12f1da5fc4 printk: console_is_usable: use console_srcu_read_flags
All users of console_is_usable() are SRCU iterators. Use the
appropriate wrapper function to locklessly read the flags.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-18-john.ogness@linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness eb7f1ed250 printk: __pr_flush: use srcu console list iterator
Use srcu console list iteration for console list traversal.

Document why the console_lock is still necessary. Note that this
is a preparatory change for when console_lock no longer provides
synchronization for the console list.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-17-john.ogness@linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness fc956ae0de printk: console_flush_all: use srcu console list iterator
Guarantee safe iteration of the console list by using SRCU.

Note that in the case of a handover, the SRCU read lock is also
released. This is documented in the function description and as
comments in the code. It is a bit tricky, but this preserves the
lockdep lock ordering for the context handing over the
console_lock:

  console_lock()
  | mutex_acquire(&console_lock_dep_map)       <-- console lock
  |
  console_unlock()
  | console_flush_all()
  | | srcu_read_lock(&console_srcu)            <-- srcu lock
  | | console_emit_next_record()
  | | | console_lock_spinning_disable_and_check()
  | | | | srcu_read_unlock(&console_srcu)      <-- srcu unlock
  | | | | mutex_release(&console_lock_dep_map) <-- console unlock

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-16-john.ogness@linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness 100bdef2c1 console: introduce wrappers to read/write console flags
After switching to SRCU for console list iteration, some readers
will begin readings console->flags as a data race. Locklessly
reading console->flags provides a consistent value because there
is at most one CPU modifying console->flags and that CPU is
using only read-modify-write operations.

Introduce a wrapper for SRCU iterators to read console flags.
Introduce a matching wrapper to write to flags of registered
consoles. Writing to flags of registered consoles is synchronized
by the console_list_lock.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-13-john.ogness@linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness 4dc64682ad printk: introduce console_list_lock
Currently there exist races in register_console(), where the types
of registered consoles are checked (without holding the console_lock)
and then after acquiring the console_lock, it is assumed that the list
has not changed. Also, some code that performs console_unregister()
make similar assumptions.

It might be possible to fix these races using the console_lock. But
it would require a complex analysis of all console drivers to make
sure that the console_lock is not taken in match() and setup()
callbacks. And we really prefer to split up and reduce the
responsibilities of console_lock rather than expand its complexity.
Therefore, introduce a new console_list_lock to provide full
synchronization for any console list changes.

In addition, also use console_list_lock for synchronization of
console->flags updates. All flags are either static or modified only
during the console registration. There are only two exceptions.

The first exception is CON_ENABLED, which is also modified by
console_start()/console_stop(). Therefore, these functions must
also take the console_list_lock.

The second exception is when the flags are modified by the console
driver init code before the console is registered. These will be
ignored because they are not visible to the rest of the system
via the console_drivers list.

Note that one of the various responsibilities of the console_lock is
also intended to provide console list and console->flags
synchronization. Later changes will update call sites relying on the
console_lock for these purposes. Once all call sites have been
updated, the console_lock will be relieved of synchronizing
console_list and console->flags updates.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87sficwokr.fsf@jogness.linutronix.de
2022-12-02 11:25:00 +01:00
John Ogness a424276093 printk: fix setting first seq for consoles
It used to be that all consoles were synchronized with respect to
which message they were printing. After commit a699449bb1 ("printk:
refactor and rework printing logic"), all consoles have their own
@seq for tracking which message they are on. That commit also changed
how the initial sequence number was chosen. Instead of choosing the
next non-printed message, it chose the sequence number of the next
message that will be added to the ringbuffer.

That change created a possibility that a non-boot console taking over
for a boot console might skip messages if the boot console was behind
and did not have a chance to catch up before being unregistered.

Since it is not known which boot console is the same device, flush
all consoles and, if necessary, start with the message of the enabled
boot console that is the furthest behind. If no boot consoles are
enabled, begin with the next message that will be added to the
ringbuffer.

Also, since boot consoles are meant to be used at boot time, handle
them the same as CON_PRINTBUFFER to ensure that no initial messages
are skipped.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-7-john.ogness@linutronix.de
2022-12-02 11:24:59 +01:00
John Ogness b80ea0e81b printk: move @seq initialization to helper
The code to initialize @seq for a new console needs to consider
more factors when choosing an initial value. Move the code into
a helper function console_init_seq() "as is" so this code can
be expanded without causing register_console() to become too
long. A later commit will implement the additional code.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-6-john.ogness@linutronix.de
2022-12-02 11:24:59 +01:00
John Ogness 1145703612 printk: register_console: use "registered" for variable names
The @bootcon_enabled and @realcon_enabled local variables actually
represent if such console types are registered. In general there
has been a confusion about enabled vs. registered. Incorrectly
naming such variables promotes such confusion.

Rename the variables to _registered.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-5-john.ogness@linutronix.de
2022-12-02 11:24:59 +01:00
John Ogness 6c4afa7914 printk: Prepare for SRCU console list protection
Provide an NMI-safe SRCU protected variant to walk the console list.

Note that all console fields are now set before adding the console
to the list to avoid the console becoming visible by SCRU readers
before being fully initialized.

This is a preparatory change for a new console infrastructure which
operates independent of the console BKL.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-4-john.ogness@linutronix.de
2022-12-02 11:24:59 +01:00
Thomas Gleixner d9a4af5690 printk: Convert console_drivers list to hlist
Replace the open coded single linked list with a hlist so a conversion
to SRCU protected list walks can reuse the existing primitives.

Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20221116162152.193147-3-john.ogness@linutronix.de
2022-12-02 11:24:59 +01:00
Xu Panda 7365df19e8 printk: use strscpy() to instead of strlcpy()
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.

Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/202211301601416229001@zte.com.cn
2022-12-01 11:57:51 +01:00
Wang Honghui 7b0592a23e printk: fix a typo of comment
Fix a typo of comment

Signed-off-by: Wang Honghui <honghui.wang@ucas.com.cn>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/77522C532189E547+Y3yG91g6XALbtdJr@TP-P15V.lan
Link: https://lore.kernel.org/r/0C7C980DB815FAE1+Y3yNXJCqZ3Nzxa5V@TP-P15V.lan
2022-11-22 12:10:15 +01:00
Thomas Gleixner 78ba392c84 printk: Mark __printk percpu data ready __ro_after_init
This variable cannot change post boot.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220924000454.3319186-6-john.ogness@linutronix.de
2022-09-29 15:20:52 +02:00
Thomas Gleixner eb4531b346 printk: Remove bogus comment vs. boot consoles
The comment about unregistering boot consoles is just not matching the
reality. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220924000454.3319186-5-john.ogness@linutronix.de
2022-09-29 15:20:44 +02:00
Thomas Gleixner 7fc11a521e printk: Remove write only variable nr_ext_console_drivers
Commit a699449bb1 ("printk: refactor and rework printing logic")
removed the need for @nr_ext_console_drivers. Remove the unneeded
variable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220924000454.3319186-4-john.ogness@linutronix.de
2022-09-29 15:20:36 +02:00
Thomas Gleixner c60ba2d346 printk: Make pr_flush() static
No user outside the printk code and no reason to export this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220924000454.3319186-2-john.ogness@linutronix.de
2022-09-29 15:20:22 +02:00
John Ogness 9023ca0866 printk: do not wait for consoles when suspended
The console_stop() and console_start() functions call pr_flush().
When suspending, these functions are called by the serial subsystem
while the serial port is suspended. In this scenario, if there are
any pending messages, a call to pr_flush() will always result in a
timeout because the serial port cannot make forward progress. This
causes longer suspend and resume times.

Add a check in pr_flush() so that it will immediately timeout if
the consoles are suspended.

Fixes: 3b604ca812 ("printk: add pr_flush()")
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220715061042.373640-2-john.ogness@linutronix.de
2022-07-15 10:52:11 +02:00
Petr Mladek 07a22b6194 Revert "printk: add functions to prefer direct printing"
This reverts commit 2bb2b7b57f.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-7-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek 5831788afb Revert "printk: add kthread console printers"
This reverts commit 09c5ba0aa2.

This reverts commit b87f02307d.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-6-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek 2d9ef940f8 Revert "printk: extend console_lock for per-console locking"
This reverts commit 8e27473211.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-5-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek 007eeab7e9 Revert "printk: remove @console_locked"
This reverts commit ab406816fc.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-4-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek 05c96b3713 Revert "printk: Block console kthreads when direct printing will be required"
This reverts commit c3230283e2.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-3-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek 20fb0c8272 Revert "printk: Wait for the global console lock when the system is going down"
This reverts commit b87f02307d.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-2-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek b87f02307d printk: Wait for the global console lock when the system is going down
There are reports that the console kthreads block the global console
lock when the system is going down, for example, reboot, panic.

First part of the solution was to block kthreads in these problematic
system states so they stopped handling newly added messages.

Second part of the solution is to wait when for the kthreads when
they are actively printing. It solves the problem when a message
was printed before the system entered the problematic state and
the kthreads managed to step in.

A busy waiting has to be used because panic() can be called in any
context and in an unknown state of the scheduler.

There must be a timeout because the kthread might get stuck or sleeping
and never release the lock. The timeout 10s is an arbitrary value
inspired by the softlockup timeout.

Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220615162805.27962-3-pmladek@suse.com
2022-06-15 22:04:15 +02:00
Petr Mladek c3230283e2 printk: Block console kthreads when direct printing will be required
There are known situations when the console kthreads are not
reliable or does not work in principle, for example, early boot,
panic, shutdown.

For these situations there is the direct (legacy) mode when printk() tries
to get console_lock() and flush the messages directly. It works very well
during the early boot when the console kthreads are not available at all.
It gets more complicated in the other situations when console kthreads
might be actively printing and block console_trylock() in printk().

The same problem is in the legacy code as well. Any console_lock()
owner could block console_trylock() in printk(). It is solved by
a trick that the current console_lock() owner is responsible for
printing all pending messages. It is actually the reason why there
is the risk of softlockups and why the console kthreads were
introduced.

The console kthreads use the same approach. They are responsible
for printing the messages by definition. So that they handle
the messages anytime when they are awake and see new ones.
The global console_lock is available when there is nothing
to do.

It should work well when the problematic context is correctly
detected and printk() switches to the direct mode. But it seems
that it is not enough in practice. There are reports that
the messages are not printed during panic() or shutdown()
even though printk() tries to use the direct mode here.

The problem seems to be that console kthreads become active in these
situation as well. They steel the job before other CPUs are stopped.
Then they are stopped in the middle of the job and block the global
console_lock.

First part of the solution is to block console kthreads when
the system is in a problematic state and requires the direct
printk() mode.

Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Suggested-by: John Ogness <john.ogness@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220615162805.27962-2-pmladek@suse.com
2022-06-15 22:03:38 +02:00
John Ogness 809631e2bf Revert "printk: wake up all waiters"
This reverts commit 938ba4084a.

The wait queue @log_wait never has exclusive waiters, so there
is no need to use wake_up_interruptible_all(). Using
wake_up_interruptible() was the correct function to wake all
waiters.

Since there are no exclusive waiters, erroneously changing
wake_up_interruptible() to wake_up_interruptible_all() did not
result in any behavior change. However, using
wake_up_interruptible_all() on a wait queue without exclusive
waiters is fundamentally wrong.

Go back to using wake_up_interruptible() to wake all waiters.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220526203056.81123-1-john.ogness@linutronix.de
2022-05-27 13:04:46 +02:00
Marco Elver 701850dc0c printk, tracing: fix console tracepoint
The original intent of the 'console' tracepoint per the commit 9510035849
("printk/tracing: Add console output tracing") had been to "[...] record
any printk messages into the trace, regardless of the current console
loglevel. This can help correlate (existing) printk debugging with other
tracing."

Petr points out [1] that calling trace_console_rcuidle() in
call_console_driver() had been the wrong thing for a while, because
"printk() always used console_trylock() and the message was flushed to
the console only when the trylock succeeded. And it was always deferred
in NMI or when printed via printk_deferred()."

With the commit 09c5ba0aa2 ("printk: add kthread console printers"),
things only got worse, and calls to call_console_driver() no longer
happen with typical printk() calls but always appear deferred [2].

As such, the tracepoint can no longer serve its purpose to clearly
correlate printk() calls and other tracing, as well as breaks usecases
that expect every printk() call to result in a callback of the console
tracepoint. Notably, the KFENCE and KCSAN test suites, which want to
capture console output and assume a printk() immediately gives us a
callback to the console tracepoint.

Fix the console tracepoint by moving it into printk_sprint() [3].

One notable difference is that by moving tracing into printk_sprint(),
the 'text' will no longer include the "header" (loglevel and timestamp),
but only the raw message. Arguably this is less of a problem now that
the console tracepoint happens on the printk() call and isn't delayed.

Link: https://lore.kernel.org/all/Ym+WqKStCg%2FEHfh3@alley/ [1]
Link: https://lore.kernel.org/all/CA+G9fYu2kS0wR4WqMRsj2rePKV9XLgOU1PiXnMvpT+Z=c2ucHA@mail.gmail.com/ [2]
Link: https://lore.kernel.org/all/87fslup9dx.fsf@jogness.linutronix.de/ [3]
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Marco Elver <elver@google.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: John Ogness <john.ogness@linutronix.de>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220503073844.4148944-1-elver@google.com
2022-05-06 10:55:24 +02:00
John Ogness ab406816fc printk: remove @console_locked
The static global variable @console_locked is used to help debug
VT code to make sure that certain code paths are running with
the console_lock held. However, this information is also available
with the static global variable @console_kthreads_blocked (for
locking via console_lock()), and the static global variable
@console_kthreads_active (for locking via console_trylock()).

Remove @console_locked and update is_console_locked() to use the
alternative variables.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-16-john.ogness@linutronix.de
2022-04-26 14:33:59 +02:00
John Ogness 8e27473211 printk: extend console_lock for per-console locking
Currently threaded console printers synchronize against each
other using console_lock(). However, different console drivers
are unrelated and do not require any synchronization between
each other. Removing the synchronization between the threaded
console printers will allow each console to print at its own
speed.

But the threaded consoles printers do still need to synchronize
against console_lock() callers. Introduce a per-console mutex
and a new console boolean field @blocked to provide this
synchronization.

console_lock() is modified so that it must acquire the mutex
of each console in order to set the @blocked field. Console
printing threads will acquire their mutex while printing a
record. If @blocked was set, the thread will go back to sleep
instead of printing.

The reason for the @blocked boolean field is so that
console_lock() callers do not need to acquire multiple console
mutexes simultaneously, which would introduce unnecessary
complexity due to nested mutex locking. Also, a new field
was chosen instead of adding a new @flags value so that the
blocked status could be checked without concern of reading
inconsistent values due to @flags updates from other contexts.

Threaded console printers also need to synchronize against
console_trylock() callers. Since console_trylock() may be
called from any context, the per-console mutex cannot be used
for this synchronization. (mutex_trylock() cannot be called
from atomic contexts.) Introduce a global atomic counter to
identify if any threaded printers are active. The threaded
printers will also check the atomic counter to identify if the
console has been locked by another task via console_trylock().

Note that @console_sem is still used to provide synchronization
between console_lock() and console_trylock() callers.

A locking overview for console_lock(), console_trylock(), and the
threaded printers is as follows (pseudo code):

console_lock()
{
        down(&console_sem);
        for_each_console(con) {
                mutex_lock(&con->lock);
                con->blocked = true;
                mutex_unlock(&con->lock);
        }
        /* console_lock acquired */
}

console_trylock()
{
        if (down_trylock(&console_sem) == 0) {
                if (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0) {
                        /* console_lock acquired */
                }
        }
}

threaded_printer()
{
        mutex_lock(&con->lock);
        if (!con->blocked) {
		/* console_lock() callers blocked */

                if (atomic_inc_unless_negative(&console_kthreads_active)) {
                        /* console_trylock() callers blocked */

                        con->write();

                        atomic_dec(&console_lock_count);
                }
        }
        mutex_unlock(&con->lock);
}

The console owner and waiter logic now only applies between contexts
that have taken the console_lock via console_trylock(). Threaded
printers never take the console_lock, so they do not have a
console_lock to handover. Tasks that have used console_lock() will
block the threaded printers using a mutex and if the console_lock
is handed over to an atomic context, it would be unable to unblock
the threaded printers. However, the console_trylock() case is
really the only scenario that is interesting for handovers anyway.

@panic_console_dropped must change to atomic_t since it is no longer
protected exclusively by the console_lock.

Since threaded printers remain asleep if they see that the console
is locked, they now must be explicitly woken in __console_unlock().
This means wake_up_klogd() calls following a console_unlock() are
no longer necessary and are removed.

Also note that threaded printers no longer need to check
@console_suspended. The check for the @blocked field implicitly
covers the suspended console case.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/878rrs6ft7.fsf@jogness.linutronix.de
2022-04-26 14:32:00 +02:00
John Ogness 09c5ba0aa2 printk: add kthread console printers
Create a kthread for each console to perform console printing. During
normal operation (@system_state == SYSTEM_RUNNING), the kthread
printers are responsible for all printing on their respective
consoles.

During non-normal operation, console printing is done as it has been:
within the context of the printk caller or within irqwork triggered
by the printk caller, referred to as direct printing.

Since threaded console printers are responsible for all printing
during normal operation, this also includes messages generated via
deferred printk calls. If direct printing is in effect during a
deferred printk call, the queued irqwork will perform the direct
printing. To make it clear that this is the only time that the
irqwork will perform direct printing, rename the flag
PRINTK_PENDING_OUTPUT to PRINTK_PENDING_DIRECT_OUTPUT.

Threaded console printers synchronize against each other and against
console lockers by taking the console lock for each message that is
printed.

Note that the kthread printers do not care about direct printing.
They will always try to print if new records are available. They can
be blocked by direct printing, but will be woken again once direct
printing is finished.

Console unregistration is a bit tricky because the associated
kthread printer cannot be stopped while the console lock is held.
A policy is implemented that states: whichever task clears
con->thread (under the console lock) is responsible for stopping
the kthread. unregister_console() will clear con->thread while
the console lock is held and then stop the kthread after releasing
the console lock.

For consoles that have implemented the exit() callback, the kthread
is stopped before exit() is called.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-14-john.ogness@linutronix.de
2022-04-22 21:30:58 +02:00
John Ogness 2bb2b7b57f printk: add functions to prefer direct printing
Once kthread printing is available, console printing will no longer
occur in the context of the printk caller. However, there are some
special contexts where it is desirable for the printk caller to
directly print out kernel messages. Using pr_flush() to wait for
threaded printers is only possible if the caller is in a sleepable
context and the kthreads are active. That is not always the case.

Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
functions to explicitly (and globally) activate/deactivate preferred
direct console printing. The term "direct console printing" refers to
printing to all enabled consoles from the context of the printk
caller. The term "prefer" is used because this type of printing is
only best effort. If the console is currently locked or other
printers are already actively printing, the printk caller will need
to rely on the other contexts to handle the printing.

This preferred direct printing is how all printing has been handled
until now (unless it was explicitly deferred).

When kthread printing is introduced, there may be some unanticipated
problems due to kthreads being unable to flush important messages.
In order to minimize such risks, preferred direct printing is
activated for the primary important messages when the system
experiences general types of major errors. These are:

 - emergency reboot/shutdown
 - cpu and rcu stalls
 - hard and soft lockups
 - hung tasks
 - warn
 - sysrq

Note that since kthread printing does not yet exist, no behavior
changes result from this commit. This is only implementing the
counter and marking the various places where preferred direct
printing is active.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org> # for RCU
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-13-john.ogness@linutronix.de
2022-04-22 21:30:58 +02:00
John Ogness 3b604ca812 printk: add pr_flush()
Provide a might-sleep function to allow waiting for console printers
to catch up to the latest logged message.

Use pr_flush() whenever it is desirable to get buffered messages
printed before continuing: suspend_console(), resume_console(),
console_stop(), console_start(), console_unblank().

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-12-john.ogness@linutronix.de
2022-04-22 21:30:58 +02:00
John Ogness 03a749e628 printk: move buffer definitions into console_emit_next_record() caller
Extended consoles print extended messages and do not print messages about
dropped records.

Non-extended consoles print "normal" messages as well as extra messages
about dropped records.

Currently the buffers for these various message types are defined within
the functions that might use them and their usage is based upon the
CON_EXTENDED flag. This will be a problem when moving to kthread printers
because each printer must be able to provide its own buffers.

Move all the message buffer definitions outside of
console_emit_next_record(). The caller knows if extended or dropped
messages should be printed and can specify the appropriate buffers to
use. The console_emit_next_record() and call_console_driver() functions
can know what to print based on whether specified buffers are non-NULL.

With this change, buffer definition/allocation/specification is separated
from the code that does the various types of string printing.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-11-john.ogness@linutronix.de
2022-04-22 21:30:58 +02:00
John Ogness a699449bb1 printk: refactor and rework printing logic
Refactor/rework printing logic in order to prepare for moving to
threaded console printing.

- Move @console_seq into struct console so that the current
  "position" of each console can be tracked individually.

- Move @console_dropped into struct console so that the current drop
  count of each console can be tracked individually.

- Modify printing logic so that each console independently loads,
  prepares, and prints its next record.

- Remove exclusive_console logic. Since console positions are
  handled independently, replaying past records occurs naturally.

- Update the comments explaining why preemption is disabled while
  printing from printk() context.

With these changes, there is a change in behavior: the console
replaying the log (formerly exclusive console) will no longer block
other consoles. New messages appear on the other consoles while the
newly added console is still replaying.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-10-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 1fc0ca9e0d printk: add con_printk() macro for console details
It is useful to generate log messages that include details about
the related console. Rather than duplicate the code to assemble
the details, put that code into a macro con_printk().

Once console printers become threaded, this macro will find more
users.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-9-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 1f47e8af45 printk: call boot_delay_msec() in printk_delay()
boot_delay_msec() is always called immediately before printk_delay()
so just call it from within printk_delay().

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-8-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 9f0844de49 printk: get caller_id/timestamp after migration disable
Currently the local CPU timestamp and caller_id for the record are
collected while migration is enabled. Since this information is
CPU-specific, it should be collected with migration disabled.

Migration is disabled immediately after collecting this information
anyway, so just move the information collection to after the
migration disabling.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-7-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 5341b93dea printk: wake waiters for safe and NMI contexts
When printk() is called from safe or NMI contexts, it will directly
store the record (vprintk_store()) and then defer the console output.
However, defer_console_output() only causes console printing and does
not wake any waiters of new records.

Wake waiters from defer_console_output() so that they also are aware
of the new records from safe and NMI contexts.

Fixes: 03fc7f9c99 ("printk/nmi: Prevent deadlock when accessing the main log buffer in NMI")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-6-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 938ba4084a printk: wake up all waiters
There can be multiple tasks waiting for new records. They should
all be woken. Use wake_up_interruptible_all() instead of
wake_up_interruptible().

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-5-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness 1f5d783094 printk: add missing memory barrier to wake_up_klogd()
It is important that any new records are visible to preparing
waiters before the waker checks if the wait queue is empty.
Otherwise it is possible that:

- there are new records available
- the waker sees an empty wait queue and does not wake
- the preparing waiter sees no new records and begins to wait

This is exactly the problem that the function description of
waitqueue_active() warns about.

Use wq_has_sleeper() instead of waitqueue_active() because it
includes the necessary full memory barrier.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-4-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
John Ogness faebd693c5 printk: rename cpulock functions
Since the printk cpulock is CPU-reentrant and since it is used
in all contexts, its usage must be carefully considered and
most likely will require programming locklessly. To avoid
mistaking the printk cpulock as a typical lock, rename it to
cpu_sync. The main functions then become:

    printk_cpu_sync_get_irqsave(flags);
    printk_cpu_sync_put_irqrestore(flags);

Add extra notes of caution in the function description to help
developers understand the requirements for correct usage.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-2-john.ogness@linutronix.de
2022-04-22 21:30:57 +02:00
Linus Torvalds 3ef4ea3d84 printk changes for 5.18
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmI4ggsACgkQUqAMR0iA
 lPLrlA//R12HGCGSzdpdyynl+5wByIqcHe8RANHOAj9f9qxBtmYv2ZK69mzSvhHO
 6kAGdb3vBtxo1NCHeqxlXpds9GP/zOGEWmEJP2P7pIZ8ci8QtwrXCtQ8XIW9UGhJ
 WHzXpXkfzcIDsRZs6B1pxN5cRXuW2VVzfgxyu6L+hvNV0o0PPO4A48ptzNBZh8rj
 URAid+n/aGs9SOXM0h8SRjjBYEqjiB2RZ3gLg5XGZmcATtitBO135LGZnBR2fwnX
 RZKckbdA/fBzqS4Njsp2rV5Rqldwj7mHzQbcsQm4YDrxSdl8d78XxQdAA5sNyaCD
 ToDw6/DeegXzgtPJpuBH/ymF9RczIu4l3eawO1FBMCB5EPq56zVHWErxry8qaTgi
 yQFqhBgifNN5NqfQCn7dyF10usmsvImFczre7ZxJvL7vmzqDsYYqdZG5oouLudR4
 iOphFwX71v4X+RsxbOXqEt+mS3AwqEJc1SZl5rrDc4TSUOE1qCd+ncLTAuAf3Wfm
 1xaZ+siomahcZAKrgmSw6AcD5bU+JJpr6FktKAddiO7J1+nIdT1lYEbpUsfWZ/p8
 Kx8A2M2ula+whJ6CgtnGTsbsacsFi+j/MioMTGZIU+Fubkig3XEeIp3QUO6sEN+9
 /sUQ6Wj6c95miWdttff9o6ap8py9NbfuKIw/HMOesfLVKP82rmw=
 =EUpJ
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Make %pK behave the same as %p for kptr_restrict == 0 also with
   no_hash_pointers parameter

 - Ignore the default console in the device tree also when console=null
   or console="" is used on the command line

 - Document console=null and console="" behavior

 - Prevent a deadlock and a livelock caused by console_lock in panic()

 - Make console_lock available for panicking CPU

 - Fast query for the next to-be-used sequence number

 - Use the expected return values in printk.devkmsg __setup handler

 - Use the correct atomic operations in wake_up_klogd() irq_work handler

 - Avoid possible unaligned access when handling %4cc printing format

* tag 'printk-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: fix return value of printk.devkmsg __setup handler
  vsprintf: Fix %pK with kptr_restrict == 0
  printk: make suppress_panic_printk static
  printk: Set console_set_on_cmdline=1 when __add_preferred_console() is called with user_specified == true
  Docs: printk: add 'console=null|""' to admin/kernel-parameters
  printk: use atomic updates for klogd work
  printk: Drop console_sem during panic
  printk: Avoid livelock with heavy printk during panic
  printk: disable optimistic spin during panic
  printk: Add panic_in_progress helper
  vsprintf: Move space out of string literals in fourcc_string()
  vsprintf: Fix potential unaligned access
  printk: ringbuffer: Improve prb_next_seq() performance
2022-03-23 10:54:27 -07:00
Petr Mladek 5eb17c1f45 Merge branch 'rework/fast-next-seq' into for-linus 2022-03-21 14:46:19 +01:00
Petr Mladek 1ccd85f5ac Merge branch 'for-5.18-panic-deadlocks' into for-linus 2022-03-21 14:45:01 +01:00
Randy Dunlap b665eae7a7 printk: fix return value of printk.devkmsg __setup handler
If an invalid option value is used with "printk.devkmsg=<value>",
it is silently ignored.
If a valid option value is used, it is honored but the wrong return
value (0) is used, indicating that the command line option had an
error and was not handled. This string is not added to init's
environment strings due to init/main.c::unknown_bootoption()
checking for a '.' in the boot option string and then considering
that string to be an "Unused module parameter".

Print a warning message if a bad option string is used.
Always return 1 from the __setup handler to indicate that the command
line option has been handled.

Fixes: 750afe7bab ("printk: add kernel parameter to control writes to /dev/kmsg")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220228220556.23484-1-rdunlap@infradead.org
2022-03-02 09:01:37 +01:00
Jiapeng Chong ce06e863f3 printk: make suppress_panic_printk static
This symbol is not used outside of printk.c, so marks it static.

Fix the following sparse warning:

kernel/printk/printk.c💯19: warning: symbol 'suppress_panic_printk'
was not declared. Should it be static?

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220216031957.9761-1-jiapeng.chong@linux.alibaba.com
2022-02-21 16:55:07 +01:00
Andre Kalb a5a763b2b2 printk: Set console_set_on_cmdline=1 when __add_preferred_console() is called with user_specified == true
In case of using console="" or console=null
set console_set_on_cmdline=1 to disable "stdout-path" node from DT.

We basically need to set it every time when __add_preferred_console()
is called with parameter 'user_specified' set.
Therefore we can move setting it into a helper function that is
called from __add_preferred_console().

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andre Kalb <andre.kalb@sma.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/YgzU4ho8l6XapyG2@pc6682
2022-02-21 15:52:08 +01:00
John Ogness 2ba3673d70 printk: use atomic updates for klogd work
The per-cpu @printk_pending variable can be updated from
sleepable contexts, such as:

  get_random_bytes()
    warn_unseeded_randomness()
      printk_deferred()
        defer_console_output()

and can be updated from interrupt contexts, such as:

  handle_irq_event_percpu()
    __irq_wake_thread()
      wake_up_process()
        try_to_wake_up()
          select_task_rq()
            select_fallback_rq()
              printk_deferred()
                defer_console_output()

and can be updated from NMI contexts, such as:

  vprintk()
    if (in_nmi()) defer_console_output()

Therefore the atomic variant of the updating functions must be used.

Replace __this_cpu_xchg() with this_cpu_xchg().
Replace __this_cpu_or() with this_cpu_or().

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87iltld4ue.fsf@jogness.linutronix.de
2022-02-15 16:10:55 +01:00
Stephen Brennan 8ebc476fd5 printk: Drop console_sem during panic
If another CPU is in panic, we are about to be halted. Try to gracefully
abandon the console_sem, leaving it free for the panic CPU to grab.

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220202171821.179394-5-stephen.s.brennan@oracle.com
2022-02-14 13:39:20 +01:00
Stephen Brennan 13fb0f74d7 printk: Avoid livelock with heavy printk during panic
During panic(), if another CPU is writing heavily the kernel log (e.g.
via /dev/kmsg), then the panic CPU may livelock writing out its messages
to the console. Note when too many messages are dropped during panic and
suppress further printk, except from the panic CPU. This could result in
some important messages being dropped. However, messages are already
being dropped, so this approach at least prevents a livelock.

Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220202171821.179394-4-stephen.s.brennan@oracle.com
2022-02-14 13:39:20 +01:00
Stephen Brennan d51507098f printk: disable optimistic spin during panic
A CPU executing with console lock spinning enabled might be halted
during a panic. Before the panicking CPU calls console_flush_on_panic(),
it may call console_trylock(), which attempts to optimistically spin,
deadlocking the panic CPU:

CPU 0 (panic CPU)             CPU 1
-----------------             ------
                              printk() {
                                vprintk_func() {
                                  vprintk_default() {
                                    vprintk_emit() {
                                      console_unlock() {
                                        console_lock_spinning_enable();
                                        ... printing to console ...
panic() {
  crash_smp_send_stop() {
    NMI  -------------------> HALT
  }
  atomic_notifier_call_chain() {
    printk() {
      ...
      console_trylock_spinnning() {
        // optimistic spin infinitely

This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:

    #!/bin/bash
    date
    # 991 chars (based on log buffer size):
    chars="$(printf 'a%.0s' {1..991})"
    while :; do
        echo $chars > /dev/kmsg
    done &
    echo c > /proc/sysrq-trigger &
    date
    exit

To avoid this deadlock, ensure that console_trylock_spinning() does not
allow spinning once a panic has begun.

Fixes: dbdda842fe ("printk: Add console owner and waiter logic to load balance console writes")

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220202171821.179394-3-stephen.s.brennan@oracle.com
2022-02-14 13:39:20 +01:00
Stephen Brennan 7749861785 printk: Add panic_in_progress helper
This will be used help avoid deadlocks during panics. Although it would
be better to include this in linux/panic.h, it would require that header
to include linux/atomic.h as well. On some architectures, this results
in a circular dependency as well. So instead add the helper directly to
printk.c.

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220202171821.179394-2-stephen.s.brennan@oracle.com
2022-02-14 13:39:20 +01:00
Mickaël Salaün 1f2cfdd349 printk: Fix incorrect __user type in proc_dointvec_minmax_sysadmin()
The move of proc_dointvec_minmax_sysadmin() from kernel/sysctl.c to
kernel/printk/sysctl.c introduced an incorrect __user attribute to the
buffer argument.  I spotted this change in [1] as well as the kernel
test robot.  Revert this change to please sparse:

  kernel/printk/sysctl.c:20:51: warning: incorrect type in argument 3 (different address spaces)
  kernel/printk/sysctl.c:20:51:    expected void *
  kernel/printk/sysctl.c:20:51:    got void [noderef] __user *buffer

Fixes: faaa357a55 ("printk: move printk sysctl to printk/sysctl.c")
Link: https://lore.kernel.org/r/20220104155024.48023-2-mic@digikod.net [1]
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20220203145029.272640-1-mic@digikod.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-03 11:27:38 -08:00
Petr Mladek f244b4dc53 printk: ringbuffer: Improve prb_next_seq() performance
prb_next_seq() always iterates from the first known sequence number.
In the worst case, it might loop 8k times for 256kB buffer,
15k times for 512kB buffer, and 64k times for 2MB buffer.

It was reported that polling and reading using syslog interface
might occupy 50% of CPU.

Speedup the search by storing @id of the last finalized descriptor.

The loop is still needed because the @id is stored and read in the best
effort way. An atomic variable is used to keep the @id consistent.
But the stores and reads are not serialized against each other.
The descriptor could get reused in the meantime. The related sequence
number will be used only when it is still valid.

An invalid value should be read _only_ when there is a flood of messages
and the ringbuffer is rapidly reused. The performance is the least
problem in this case.

Reported-by: Chunlei Wang <chunlei.wang@mediatek.com>
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/1642770388-17327-1-git-send-email-quic_mojha@quicinc.com
Link: https://lore.kernel.org/lkml/YXlddJxLh77DKfIO@alley/T/#m43062e8b2a17f8dbc8c6ccdb8851fb0dbaabbb14
2022-01-26 16:00:32 +01:00
Xiaoming Ni fdcd4073fc printk: fix build warning when CONFIG_PRINTK=n
build warning when CONFIG_PRINTK=n

	kernel/printk/printk.c:175:5: warning: no previous prototype for
	 'devkmsg_sysctl_set_loglvl' [-Wmissing-prototypes]

devkmsg_sysctl_set_loglvl() is only used in sysctl.c when
CONFIG_PRINTK=y, but it participates in the build when CONFIG_PRINTK=n.
So add compile dependency CONFIG_PRINTK=y && CONFIG_SYSCTL=y to fix the
build warning.

Link: https://lkml.kernel.org/r/20211129211943.640266-5-mcgrof@kernel.org
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Lukas Middendorf <kernel@tuxforce.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Stephen Kitt <steve@sk2.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:36 +02:00
Xiaoming Ni faaa357a55 printk: move printk sysctl to printk/sysctl.c
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to places
where they actually belong.  The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.

So move printk sysctl from kernel/sysctl.c to kernel/printk/sysctl.c.
Use register_sysctl() to register the sysctl interface.

[mcgrof@kernel.org: fixed compile issues when PRINTK is not set, commit log update]

Link: https://lkml.kernel.org/r/20211124231435.1445213-6-mcgrof@kernel.org
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Kees Cook <keescook@chromium.org>
Cc: Lukas Middendorf <kernel@tuxforce.de>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Qing Wang <wangqing@vivo.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:35 +02:00
Petr Mladek 5e8ba485b2 printk/console: Clean up boot console handling in register_console()
The variable @bcon has two meanings. It is used several times for iterating
the list of registered consoles. In the meantime, it holds the information
whether a boot console is first in @console_drivers list.

The information about the 1st console driver used to be important for
the decision whether to install the new console by default or not.
It allowed to re-evaluate the variable @need_default_console when
a real console with tty binding has been unregistered in the meantime.

The decision about the default console is not longer affected by @bcon
variable. The current code checks whether the first driver is real
and has tty binding directly.

The information about the first console is still used for two more
decisions:

  1. It prevents duplicate output on non-boot consoles with
     CON_CONSDEV flag set.

  2. Early/boot consoles are unregistered when a real console with
     CON_CONSDEV is registered and @keep_bootcon is not set.

The behavior in the real life is far from obvious. @bcon is set according
to the first console @console_drivers list. But the first position in
the list is special:

  1. Consoles with CON_CONSDEV flag are put at the beginning of
     the list. It is either the preferred console or any console
     with tty binding registered by default.

  2. Another console might become the first in the list when
     the first console in the list is unregistered. It might
     happen either explicitly or automatically when boot
     consoles are unregistered.

There is one more important rule:

  + Boot consoles can't be registered when any real console
    is already registered.

It is a puzzle. The main complication is the dependency on the first
position is the list and the complicated rules around it.

Let's try to make it easier:

1. Add variable @bootcon_enabled and set it by iterating all registered
   consoles. The variable has obvious meaning and more predictable
   behavior. Any speed optimization and other tricks are not worth it.

2. Use a generic name for the variable that is used to iterate
   the list on registered console drivers.

Behavior change:

No, maybe surprisingly, there is _no_ behavior change!

Let's provide the proof by contradiction. Both operations, duplicate
output prevention and boot consoles removal, are done only when
the newly added console has CON_CONSDEV flag set. The behavior
would change when the new @bootcon_enabled has different value
than the original @bcon.

By other words, the behavior would change when the following conditions
are true:

   + a console with CON_CONSDEV flag is added
   + a real (non-boot) console is the first in the list
   + a boot console is later in the list

Now, a real console might be first in the list only when:

   + It was the first registered console. In this case, there can't be
     any boot console because any later ones were rejected.

   + It was put at the first position because it had CON_CONSDEV flag
     set. It was either the preferred console or it was a console with
     tty binding registered by default. We are interested only in
     a real consoles here. And real console with tty binding fulfills
     conditions of the default console.

     Now, there is always only one console that is either preferred
     or fulfills conditions of the default console. It can't be already
     in the list and being registered at the same time.

As a result, the above three conditions could newer be "true" at
the same time. Therefore the behavior can't change.

Final dilemma:

OK, the new code has the same behavior. But is the change in the right
direction? What if the handling of @console_drivers is updated in
the future?

OK, let's look at it from another angle:

1. The ordering of @console_drivers list is important only in
   console_device() function. The first console driver with tty
   binding gets associated with /dev/console.

2. CON_CONSDEV flag is shown in /proc/consoles. And it should be set
   for the driver that is returned by console_device().

3. A boot console is removed and the duplicated output is prevented
   when the real console with CON_CONSDEV flag is registered.

Now, in the ideal world:

+ The driver associated with /dev/console should be either a console
  preferred via the command line, device tree, or SPCR. Or it should
  be the first real console with tty binding registered by default.

+ The code should match the related boot and real console drivers.
  It should unregister only the obsolete boot driver. And the duplicated
  output should be prevented only on the related real driver.

It is clear that it is not guaranteed by the current code. Instead,
the current code looks like a maze of heuristics that try to achieve
the above.

It is result of adding several features over last few decades. For example,
a possibility to register more consoles, unregister consoles, boot
consoles, consoles without tty binding, device tree, SPCR, braille
consoles.

Anyway, there is no reason why the decision, about removing boot consoles
and preventing duplicated output, should depend on the first console
in the list. The current code does the decisions primary by CON_CONSDEV
flag that is used for the preferred console. It looks like a
good compromise. And the change seems to be in the right direction.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211122132649.12737-6-pmladek@suse.com
2021-12-06 14:07:57 +01:00
Petr Mladek 4f54693925 printk/console: Remove need_default_console variable
The variable @need_default_console is used to decide whether a newly
registered console should get enabled by default.

The logic is complicated. It can be modified in a register_console()
call. But it is always re-evaluated in the next call by the following
condition:

	if (need_default_console || bcon || !console_drivers)
		need_default_console = preferred_console < 0;

In short, the value is updated when either of the condition is valid:

  + the value is still, or again, "true"
  + boot/early console is still the first in @console_driver list
  + @console_driver list is empty

The value is updated according to @preferred_console. In particular,
it is set to "false" when a @preferred_console was set by
__add_preferred_console(). This happens when a non-braille console
was added via the command line, device tree, or SPCR.

It far from clear what this all means together. Let's look at
@need_default_console from another angle:

1. The value is "true" by default. It means that it is always set
   according to @preferred_console during the first register_console()
   call.

   By other words, the first register_console() call will register
   the console by default only when none non-braille console was defined
   via the command line, device tree, or SPCR.

2. The value will always stay "false" when @preferred_console is set.

   By other words, try_enable_default_console() will never get called
   when a non-braille console is explicitly required.

4. The value might be set to "false" in try_enable_default_console()
   when a console with tty binding (driver) gets enabled.

   In this case CON_CONSDEV is set as well. It causes that the console
   will be inserted as first into the list @console_driver. It might
   be either real or boot/early console.

5. The value will be set _back_ to "true" in the next register_console()
   call when:

      + The console added by the previous register_console() had been
	a boot/early one.

      + The last console has been unregistered in the meantime and
	a boot/early console became first in @console_drivers list
	again. Or the list became empty.

   By other words, the value will stay "false" only when the last
   registered console was real, had tty binding, and was not removed
   in the mean time.

The main logic looks clear:

  + Consoles are enabled by default only when no one is preferred
    via the command line, device tree, or SPCR.

  + By default, any console is enabled until a real console
    with tty binding gets registered.

The behavior when the real console with tty binding is later removed
is a bit unclear:

  + By default, any new console is registered again only when there
    is no console or the first console in the list is a boot one.

The question is why the code is suddenly happy when a real console
without tty binding is the first in the list. It looks like an overlook
and bug.

Conclusion:

The state of @preferred_console and the first console in @console_driver
list should be enough to decide whether we need to enable the given console
by default.

The rules are simple. New consoles are _not_ enabled by default
when either of the following conditions is true:

  + @preferred_console is set. It means that a non-braille console
    is explicitly configured via the command line, device tree, or SPCR.

  + A real console with tty binding is registered. Such a console will
    have CON_CONSDEV flag set and will always be the first in
    @console_drivers list.

Note:

The new code does not use @bcon variable. The meaning of the variable
is far from clear. The direct check of the first console in the list
makes it more clear that only real console fulfills requirements
of the default console.

Behavior change:

As already discussed above. There was one situation where the original
code worked a strange way. Let's have:

	+ console A: real console without tty binding
	+ console B: real console with tty binding

and do:

	register_console(A);	/* 1st step */
	register_console(B);	/* 2nd step */
	unregister_console(B);	/* 3rd step */
	register_console(B);	/* 4th step */

The original code will not register the console B in the 4th step.
@need_default_console is set to "false" in 2nd step. The real console
with tty binding (driver) is then removed in the 3rd step.
But @need_default_console will stay "false" in the 4th step because
there is no boot/early console and @registered_consoles list is not
empty.

The new code will register the console B in the 4th step because
it checks whether the first console has tty binding (->driver)

This behavior change should acceptable:

  1. The scenario requires manual intervention (console removal).
     The system should boot with the same consoles as before.

  2. Console B is registered again probably because the user wants
     to use it. The most likely scenario is that the related
     module is reloaded.

  3. It makes the behavior more consistent and predictable.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211122132649.12737-5-pmladek@suse.com
2021-12-06 14:07:57 +01:00
Petr Mladek f873efe841 printk/console: Remove unnecessary need_default_console manipulation
There is no need to clear @need_default_console when a console
preferred by the command line, device tree, or SPCR, gets enabled.

The code is called only when some non-braille console matched a console
in @console_cmdline array. It means that a non-braille console was added
in __add_preferred_console() and the variable preferred_console is set
to a number >= 0. As a result, @need_default_console is always set to
"false" in the magic condition:

	if (need_default_console || bcon || !console_drivers)
		need_default_console = preferred_console < 0;

This is one small step in removing the above magic condition
that is hard to follow.

The patch removes one superfluous assignment and should not change
the functionality.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211122132649.12737-4-pmladek@suse.com
2021-12-06 14:07:57 +01:00
Petr Mladek a6953370d2 printk/console: Rename has_preferred_console to need_default_console
The logic around the variable @has_preferred_console made my head
spin many times. Part of the problem is the ambiguous name.

There is the variable @preferred_console. It points to the last
non-braille console in @console_cmdline array. This array contains
consoles preferred via the command line, device tree, or SPCR.

Then there is the variable @has_preferred_console. It is set to
"true" when @preferred_console is enabled or when a console with
tty binding gets enabled by default.

It might get reset back by the magic condition:

	if (!has_preferred_console || bcon || !console_drivers)
		has_preferred_console = preferred_console >= 0;

It is a puzzle. Dumb explanation is that it gets re-evaluated
when:

	+ it was not set before (see above when it gets set)
	+ there is still an early console enabled (bcon)
	+ there is no console enabled (!console_drivers)

This is still a puzzle.

It gets more clear when we see where the value is checked. The only
meaning of the variable is to decide whether we should try to enable
the new console by default.

Rename the variable according to the single situation where
the value is checked.

The rename requires an inverted logic. Otherwise, it is a simple
search & replace. It does not change the functionality.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://lore.kernel.org/r/20211122132649.12737-3-pmladek@suse.com
2021-12-06 14:07:57 +01:00
Petr Mladek ed758b30d5 printk/console: Split out code that enables default console
Put the code enabling a console by default into a separate function
called try_enable_default_console().

Rename try_enable_new_console() to try_enable_preferred_console() to
make the purpose of the different variants more clear.

It is a code refactoring without any functional change.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://lore.kernel.org/r/20211122132649.12737-2-pmladek@suse.com
2021-12-06 14:07:57 +01:00