Commit graph

308 commits

Author SHA1 Message Date
Gustavo A. R. Silva
a38283da05 printk: ringbuffer: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-10-30 16:57:42 -05:00
Linus Torvalds
8119c4332d Urgent printk fix for 5.10
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl+JtNYACgkQUqAMR0iA
 lPLvaQ//Xo1gAe/ZZznTJ/hMeJQf0COAFeQ/oriqiHThu9rRxv9cfQ434JYIQaz4
 f1cKhnMNrdm6/AZynjKeLwoO+Ui9N3DUJ8Wf3V0r3W1mbWXrxSEhFE9AHc8BNWFZ
 vzlVOrHBfPt9RqFQ12k/IUcAElAiuCmceLJ3LsTLfVF2gc07cBsZdpsZLdPcb0fz
 Oo8Qu9afsVWUu9X/pepgKcNO0XbfDJM0cuFpmCrziRpToLvdaTEBihI7C0ktg9WU
 aNZz+mEuhX9mHGt9PvsHsuDsWj3gJJypkc7ccWRdWEbl0lR1HV40tr2j2WdtYtSb
 GvwD34ApxneoX87mWgaSSHWNfZ5B+SDbK1XQVswRkFmZhEqoAulGYd4Uktt+3Yxy
 4cGC1pzEzU5ekj6XcmUjTHxmTKRNju2qC1XTtSE0F7ozDTlMtPovqSMZAQoNNuQK
 +F1TBy49ikEVKB1V6xoGH/IepStO2OSee5JN6yJ+ZzLKPq9hqqtAC6CjA75NQNR0
 tB9e7EKUjGkwZohnCkPpVCA1BYzqWG8+II0Z5EZilZxJiNnh5CuJ4vr0wkIfnVkI
 BlvkoiPHiCBPhov8UJfnrxBASsSYt/EnzKfHisqQPs3/hN9ZgMbF+wbApwosEwNx
 l1fswzO0mBbb2NNJWydQzG305GjoxnkQjFOewOrCii4Z0ZrfMVo=
 =FqTN
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:
 "Prevent overflow in the new lockless ringbuffer"

* tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: ringbuffer: Wrong data pointer when appending small string
2020-10-16 12:52:37 -07:00
Linus Torvalds
bbf6259903 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
 "The latest advances in computer science from the trivial queue"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  xtensa: fix Kconfig typo
  spelling.txt: Remove some duplicate entries
  mtd: rawnand: oxnas: cleanup/simplify code
  selftests: vm: add fragment CONFIG_GUP_BENCHMARK
  perf: Fix opt help text for --no-bpf-event
  HID: logitech-dj: Fix spelling in comment
  bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
  MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
  scif: Fix spelling of EACCES
  printk: fix global comment
  lib/bitmap.c: fix spello
  fs: Fix missing 'bit' in comment
2020-10-15 15:11:56 -07:00
Petr Mladek
eac48eb6ce printk: ringbuffer: Wrong data pointer when appending small string
data_realloc() returns wrong data pointer when the block is wrapped and
the size is not increased. It might happen when pr_cont() wants to
add only few characters and there is already a space for them because
of alignment.

It might cause writing outsite the buffer. It has been detected by LTP
tests with KASAN enabled:

[  221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in
[  221.922108] ==================================================================
[  221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0
[  221.922112] Write of size 2 at addr ffffffffba51dbcd by task
memcg_test_1/11282
[  221.922113]
[  221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted
5.9.0-next-20201013 #1
[  221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[  221.922116] Call Trace:
[  221.922117]  dump_stack+0xa4/0xd9
[  221.922118]  print_address_description.constprop.0+0x21/0x210
[  221.922119]  ? _raw_write_lock_bh+0xe0/0xe0
[  221.922120]  ? vprintk_store+0x362/0x3d0
[  221.922121]  kasan_report.cold+0x37/0x7c
[  221.922122]  ? vprintk_store+0x362/0x3d0
[  221.922123]  check_memory_region+0x18c/0x1f0
[  221.922124]  memcpy+0x3c/0x60
[  221.922125]  vprintk_store+0x362/0x3d0
[  221.922125]  ? __ia32_sys_syslog+0x50/0x50
[  221.922126]  ? _raw_spin_lock_irqsave+0x9b/0x100
[  221.922127]  ? _raw_spin_lock_irq+0xf0/0xf0
[  221.922128]  ? __kasan_check_write+0x14/0x20
[  221.922129]  vprintk_emit+0x8d/0x1f0
[  221.922130]  vprintk_default+0x1d/0x20
[  221.922131]  vprintk_func+0x5a/0x100
[  221.922132]  printk+0xb2/0xe3
[  221.922133]  ? swsusp_write.cold+0x189/0x189
[  221.922134]  ? kernfs_vfs_xattr_set+0x60/0x60
[  221.922134]  ? _raw_write_lock_bh+0xe0/0xe0
[  221.922135]  ? trace_hardirqs_on+0x38/0x100
[  221.922136]  pr_cont_kernfs_path.cold+0x49/0x4b
[  221.922137]  mem_cgroup_print_oom_context.cold+0x74/0xc3
[  221.922138]  dump_header+0x340/0x3bf
[  221.922139]  oom_kill_process.cold+0xb/0x10
[  221.922140]  out_of_memory+0x1e9/0x860
[  221.922141]  ? oom_killer_disable+0x210/0x210
[  221.922142]  mem_cgroup_out_of_memory+0x198/0x1c0
[  221.922143]  ? mem_cgroup_count_precharge_pte_range+0x250/0x250
[  221.922144]  try_charge+0xa9b/0xc50
[  221.922145]  ? arch_stack_walk+0x9e/0xf0
[  221.922146]  ? memory_high_write+0x230/0x230
[  221.922146]  ? avc_has_extended_perms+0x830/0x830
[  221.922147]  ? stack_trace_save+0x94/0xc0
[  221.922148]  ? stack_trace_consume_entry+0x90/0x90
[  221.922149]  __memcg_kmem_charge+0x73/0x120
[  221.922150]  ? cred_has_capability+0x10f/0x200
[  221.922151]  ? mem_cgroup_can_attach+0x260/0x260
[  221.922152]  ? selinux_sb_eat_lsm_opts+0x2f0/0x2f0
[  221.922153]  ? obj_cgroup_charge+0x16b/0x220
[  221.922154]  ? kmem_cache_alloc+0x78/0x4c0
[  221.922155]  obj_cgroup_charge+0x122/0x220
[  221.922156]  ? vm_area_alloc+0x20/0x90
[  221.922156]  kmem_cache_alloc+0x78/0x4c0
[  221.922157]  vm_area_alloc+0x20/0x90
[  221.922158]  mmap_region+0x3ed/0x9a0
[  221.922159]  ? cap_mmap_addr+0x1d/0x80
[  221.922160]  do_mmap+0x3ee/0x720
[  221.922161]  vm_mmap_pgoff+0x16a/0x1c0
[  221.922162]  ? randomize_stack_top+0x90/0x90
[  221.922163]  ? copy_page_range+0x1980/0x1980
[  221.922163]  ksys_mmap_pgoff+0xab/0x350
[  221.922164]  ? find_mergeable_anon_vma+0x110/0x110
[  221.922165]  ? __audit_syscall_entry+0x1a6/0x1e0
[  221.922166]  __x64_sys_mmap+0x8d/0xb0
[  221.922167]  do_syscall_64+0x38/0x50
[  221.922168]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  221.922169] RIP: 0033:0x7fe8f5e75103
[  221.922172] Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74
56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66
2e 0f
[  221.922173] RSP: 002b:00007ffd38c90198 EFLAGS: 00000246 ORIG_RAX:
0000000000000009
[  221.922175] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe8f5e75103
[  221.922176] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000
[  221.922178] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  221.922179] R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003
[  221.922180] R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000
[  221.922181]
[  213O[  221.922182] The buggy address belongs to the variable:
[  221.922183]  clear_seq+0x2d/0x40
[  221.922183]
[  221.922184] Memory state around the buggy address:
[  221.922185]  ffffffffba51da80: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
[  221.922187]  ffffffffba51db00: 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
[  221.922188] >ffffffffba51db80: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
00 f9 f9 f9
[  221.922189]                                               ^
[  221.922190]  ffffffffba51dc00: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
00 f9 f9 f9
[  221.922191]  ffffffffba51dc80: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9
00 f9 f9 f9
[  221.922193] ==================================================================
[  221.922194] Disabling lock debugging due to kernel taint
[  221.922196] ,task=memcg_test_1,pid=11280,uid=0
[  221.922205] Memory cgroup out of memory: Killed process 11280

Link: https://lore.kernel.org/r/CA+G9fYt46oC7-BKryNDaaXPJ9GztvS2cs_7GjYRjanRi4+ryCQ@mail.gmail.com
Fixes: 4cfc7258f8 ("printk: ringbuffer: add finalization/extension support")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20201014175051.GC13775@alley
2020-10-15 12:21:13 +02:00
Petr Mladek
70333f4ff9 Merge branch 'printk-rework' into for-linus 2020-10-12 13:01:37 +02:00
Gustavo A. R. Silva
4e797e6ec7 printk: Use fallthrough pseudo-keyword
Replace /* FALL THRU */ comment with the new pseudo-keyword macro
fallthrough[1].

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20201002224627.GA30475@embeddedor
2020-10-05 15:56:58 +02:00
John Ogness
0463d04ea0 printk: reduce setup_text_buf size to LOG_LINE_MAX
@setup_text_buf only copies the original text messages (without any
prefix or extended text). It only needs to be LOG_LINE_MAX in size.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200930090134.8723-3-john.ogness@linutronix.de
2020-09-30 13:54:21 +02:00
John Ogness
59f8bcca1e printk: avoid and/or handle record truncation
If a reader provides a buffer that is smaller than the message text,
the @text_len field of @info will have a value larger than the buffer
size. If readers blindly read @text_len bytes of data without
checking the size, they will read beyond their buffer.

Add this check to record_print_text() to properly recognize when such
truncation has occurred.

Add a maximum size argument to the ringbuffer function to extend
records so that records can not be created that are larger than the
buffer size of readers.

When extending records (LOG_CONT), do not extend records beyond
LOG_LINE_MAX since that is the maximum size available in the buffers
used by consoles and syslog.

Fixes: f5f022e53b ("printk: reimplement log_cont using record extension")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200930090134.8723-2-john.ogness@linutronix.de
2020-09-30 13:30:28 +02:00
John Ogness
f35efc78ad printk: remove dict ring
Since there is no code that will ever store anything into the dict
ring, remove it. If any future dictionary properties are to be
added, these should be added to the struct printk_info.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200918223421.21621-4-john.ogness@linutronix.de
2020-09-22 11:39:18 +02:00
John Ogness
74caba7f2a printk: move dictionary keys to dev_printk_info
Dictionaries are only used for SUBSYSTEM and DEVICE properties. The
current implementation stores the property names each time they are
used. This requires more space than otherwise necessary. Also,
because the dictionary entries are currently considered optional,
it cannot be relied upon that they are always available, even if the
writer wanted to store them. These issues will increase should new
dictionary properties be introduced.

Rather than storing the subsystem and device properties in the
dict ring, introduce a struct dev_printk_info with separate fields
to store only the property values. Embed this struct within the
struct printk_info to provide guaranteed availability.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87mu1jl6ne.fsf@jogness.linutronix.de
2020-09-22 11:27:48 +02:00
John Ogness
cfe2790b16 printk: move printk_info into separate array
The majority of the size of a descriptor is taken up by meta data,
which is often not of interest to the ringbuffer (for example,
when performing state checks). Since descriptors are often
temporarily stored on the stack, keeping their size minimal will
help reduce stack pressure.

Rather than embedding the printk_info into the descriptor, create
a separate printk_info array. The index of a descriptor in the
descriptor array corresponds to the printk_info with the same
index in the printk_info array. The rules for validity of a
printk_info match the existing rules for the data blocks: the
descriptor must be in a consistent state.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200918223421.21621-2-john.ogness@linutronix.de
2020-09-22 11:09:42 +02:00
John Ogness
f5f022e53b printk: reimplement log_cont using record extension
Use the record extending feature of the ringbuffer to implement
continuous messages. This preserves the existing continuous message
behavior.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-7-john.ogness@linutronix.de
2020-09-15 16:39:50 +02:00
John Ogness
4cfc7258f8 printk: ringbuffer: add finalization/extension support
Add support for extending the newest data block. For this, introduce
a new finalization state (desc_finalized) denoting a committed
descriptor that cannot be extended.

Until a record is finalized, a writer can reopen that record to
append new data. Reopening a record means transitioning from the
desc_committed state back to the desc_reserved state.

A writer can explicitly finalize a record if there is no intention
of extending it. Also, records are automatically finalized when a
new record is reserved. This relieves writers of needing to
explicitly finalize while also making such records available to
readers sooner. (Readers can only traverse finalized records.)

Four new memory barrier pairs are introduced. Two of them are
insignificant additions (data_realloc:A/desc_read:D and
data_realloc:A/data_push_tail:B) because they are alternate path
memory barriers that exactly match the purpose, pairing, and
context of the two existing memory barrier pairs they provide an
alternate path for. The other two new memory barrier pairs are
significant additions:

desc_reopen_last:A / _prb_commit:B - When reopening a descriptor,
    ensure the state transitions back to desc_reserved before
    fully trusting the descriptor data.

_prb_commit:B / desc_reserve:D - When committing a descriptor,
    ensure the state transitions to desc_committed before checking
    the head ID to see if the descriptor needs to be finalized.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-6-john.ogness@linutronix.de
2020-09-15 16:35:27 +02:00
John Ogness
10dcb06d40 printk: ringbuffer: change representation of states
Rather than deriving the state by evaluating bits within the flags
area of the state variable, assign the states explicit values and
set those values in the flags area. Introduce macros to make it
simple to read and write state values for the state variable.

Although the functionality is preserved, the binary representation
for the states is changed.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-5-john.ogness@linutronix.de
2020-09-15 15:52:49 +02:00
John Ogness
cc5c7041c6 printk: ringbuffer: clear initial reserved fields
prb_reserve() will set some meta data values and leave others
uninitialized (or rather, containing the values of the previous
wrap). Simplify the API by always clearing out all the fields.
Only the sequence number is filled in. The caller is now
responsible for filling in the rest of the meta data fields.
In particular, for correctly filling in text and dict lengths.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-4-john.ogness@linutronix.de
2020-09-15 15:47:19 +02:00
John Ogness
e3bc0401c1 printk: ringbuffer: add BLK_DATALESS() macro
Rather than continually needing to explicitly check @begin and @next
to identify a dataless block, introduce and use a BLK_DATALESS()
macro.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-3-john.ogness@linutronix.de
2020-09-15 15:44:49 +02:00
John Ogness
2a7f87ed05 printk: ringbuffer: relocate get_data()
Move the internal get_data() function as-is above prb_reserve() so
that a later change can make use of the static function.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-2-john.ogness@linutronix.de
2020-09-15 15:41:04 +02:00
John Ogness
e7c1fe2104 printk: ringbuffer: avoid memcpy() on state_var
@state_var is copied as part of the descriptor copying via
memcpy(). This is not allowed because @state_var is an atomic type,
which in some implementations may contain a spinlock.

Avoid using memcpy() with @state_var by explicitly copying the other
fields of the descriptor. @state_var is set using atomic set
operator before returning.

Fixes: b6cf8b3f33 ("printk: add lockless ringbuffer")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914094803.27365-2-john.ogness@linutronix.de
2020-09-15 14:42:09 +02:00
John Ogness
ce003d67ad printk: ringbuffer: fix setting state in desc_read()
It is expected that desc_read() will always set at least the
@state_var field. However, if the descriptor is in an inconsistent
state, no fields are set.

Also, the second load of @state_var is not stored in @desc_out and
so might not match the state value that is returned.

Always set the last loaded @state_var into @desc_out, regardless of
the descriptor consistency.

Fixes: b6cf8b3f33 ("printk: add lockless ringbuffer")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914094803.27365-1-john.ogness@linutronix.de
2020-09-15 14:23:37 +02:00
John Ogness
d397820f36 printk: ringbuffer: support dataless records
With commit 896fbe20b4 ("printk: use the lockless ringbuffer"),
printk() started silently dropping messages without text because such
records are not supported by the new printk ringbuffer.

Add support for such records.

Currently dataless records are denoted by INVALID_LPOS in order
to recognize failed prb_reserve() calls. Change the ringbuffer
to instead use two different identifiers (FAILED_LPOS and
NO_LPOS) to distinguish between failed prb_reserve() records and
successful dataless records, respectively.

Fixes: 896fbe20b4 ("printk: use the lockless ringbuffer")
Fixes: https://lkml.kernel.org/r/20200718121053.GA691245@elver.google.com
Reported-by: Marco Elver <elver@google.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200721132528.9661-1-john.ogness@linutronix.de
2020-09-08 09:32:59 +02:00
Jiri Kosina
ead5d1f4d8 Merge branch 'master' into for-next
Sync with Linus' branch in order to be able to apply fixups
of more recent patches.
2020-09-01 14:19:48 +02:00
Randy Dunlap
547bbf7d21 kernel: printk: delete repeated words in comments
Drop repeated words "the" in kernel/printk/.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200807033227.8349-1-rdunlap@infradead.org
2020-08-10 17:23:44 +02:00
Linus Torvalds
a754292348 Printk changes for 5.9
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl8pk84ACgkQUqAMR0iA
 lPIrTxAAhD6fosJx+7LCrDRABIw/ZybeS5MIxTuPsNtdMmGBemigew5Ao1wYY6Ww
 3BFiNC2LpDXPxSOCQpz0Zm5/oCLhShPJmS6ukjLbufDsiw0MezliKCAa2Bfw3W31
 6xntQtf7ps+bmTEQDyuznu8Kfg+I3lmdGUOEBBluHIP4gb7XKQE8ttyUHB6qdiXI
 3eAl53Q8dOMMjtk5eNBXA19JY43g4JmLZRBumrAUc1vsv15KTDmSyWKlV8+tLH9K
 JbQAHe0pNVec4sJUIYLvIwDZXvtsvxjdJyX3tTeZ7xJ/ARcvRLoixVGqWxKhqdth
 j5U/L+YQfCJifyqvEVo03yy4Ti+OraliRpGcRf/bM2HpmFBA2+dISr7/VEqRwkG7
 Sy8HuvBHHyUqdrPjB7izhv8iyRN+LxFfpdT5LMnzsvxMxAJ+QwNjxb13RA4kkeRU
 5SgOhfGWgTsLy71J6qdSeXYB2oPFw4Onp5yAtoUsOJVYqWkN9x0zdl+9HmqIHF7T
 dY+KNriEO6gmpsQrMR4FC/GVMtwYWf8AoqeZen5O5SQULmzuKQ5AkOo0IAMrU92i
 iAdFrSZj35HAQjIJRccPNGZ3FwTd1Z4r5GT7VRvrN+nq2wVopzbbz924/lmsGoAS
 YppAw31sKfXDc5uWE8jP8GP3OJqhORn2PPXq3D5Q3XSVbGgey0Q=
 =ZcMq
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Herbert Xu made printk header file self-contained.

 - Andy Shevchenko and Sergey Senozhatsky cleaned up console->setup()
   error handling.

 - Andy Shevchenko did some cleanups (e.g. sparse warning) in vsprintf
   code.

 - Minor documentation updates.

* tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  lib/vsprintf: Force type of flags value for gfp_t
  lib/vsprintf: Replace custom spec to print decimals with generic one
  lib/vsprintf: Replace hidden BUILD_BUG_ON() with static_assert()
  printk: Make linux/printk.h self-contained
  doc:kmsg: explicitly state the return value in case of SEEK_CUR
  Replace HTTP links with HTTPS ones: vsprintf
  hvc: unify console setup naming
  console: Fix trivia typo 'change' -> 'chance'
  console: Propagate error code from console ->setup()
  tty: hvc: Return proper error code from console ->setup() hook
  serial: sunzilog: Return proper error code from console ->setup() hook
  serial: sunsab: Return proper error code from console ->setup() hook
  mips: Return proper error code from console ->setup() hook
2020-08-04 22:22:25 -07:00
Petr Mladek
57e60db3bc Merge branch 'for-5.9-console-return-codes' into for-linus 2020-08-04 16:27:43 +02:00
Bruno Meneguele
bc885f1ab6 doc:kmsg: explicitly state the return value in case of SEEK_CUR
The commit 625d344978 ("Revert "kernel/printk: add kmsg SEEK_CUR
handling"") reverted a change done to the return value in case a SEEK_CUR
operation was performed for kmsg buffer based on the fact that different
userspace apps were handling the new return value (-ESPIPE) in different
ways, breaking them.

At the same time -ESPIPE was the wrong decision because kmsg /does support/
seek() but doesn't follow the "normal" behavior userspace is used to.
Because of that and also considering the time -EINVAL has been used, it was
decided to keep this way to avoid more userspace breakage.

This patch adds an official statement to the kmsg documentation pointing to
the current return value for SEEK_CUR, -EINVAL, thus userspace libraries
and apps can refer to it for a definitive guide on what to expect.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200710174423.10480-1-bmeneg@redhat.com
2020-07-13 15:07:45 +02:00
John Ogness
896fbe20b4 printk: use the lockless ringbuffer
Replace the existing ringbuffer usage and implementation with
lockless ringbuffer usage. Even though the new ringbuffer does not
require locking, all existing locking is left in place. Therefore,
this change is purely replacing the underlining ringbuffer.

Changes that exist due to the ringbuffer replacement:

- The VMCOREINFO has been updated for the new structures.

- Dictionary data is now stored in a separate data buffer from the
  human-readable messages. The dictionary data buffer is set to the
  same size as the message buffer. Therefore, the total required
  memory for both dictionary and message data is
  2 * (2 ^ CONFIG_LOG_BUF_SHIFT) for the initial static buffers and
  2 * log_buf_len (the kernel parameter) for the dynamic buffers.

- Record meta-data is now stored in a separate array of descriptors.
  This is an additional 72 * (2 ^ (CONFIG_LOG_BUF_SHIFT - 5)) bytes
  for the static array and 72 * (log_buf_len >> 5) bytes for the
  dynamic array.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200709132344.760-5-john.ogness@linutronix.de
2020-07-10 08:48:55 +02:00
John Ogness
8749efc0c0 Revert "printk: lock/unlock console only for new logbuf entries"
This reverts commit 3ac37a93fa.

This optimization will not apply once the transition to a lockless
printk is complete. Rather than porting this optimization through
the transition only to remove it anyway, just revert it now to
simplify the transition.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200709132344.760-4-john.ogness@linutronix.de
2020-07-10 08:48:35 +02:00
John Ogness
b6cf8b3f33 printk: add lockless ringbuffer
Introduce a multi-reader multi-writer lockless ringbuffer for storing
the kernel log messages. Readers and writers may use their API from
any context (including scheduler and NMI). This ringbuffer will make
it possible to decouple printk() callers from any context, locking,
or console constraints. It also makes it possible for readers to have
full access to the ringbuffer contents at any time and context (for
example from any panic situation).

The printk_ringbuffer is made up of 3 internal ringbuffers:

desc_ring:
A ring of descriptors. A descriptor contains all record meta data
(sequence number, timestamp, loglevel, etc.) as well as internal state
information about the record and logical positions specifying where in
the other ringbuffers the text and dictionary strings are located.

text_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the text
string of the record.

dict_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the dictionary
string of the record.

The internal state information of a descriptor is the key element to
allow readers and writers to locklessly synchronize access to the data.

Co-developed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200709132344.760-3-john.ogness@linutronix.de
2020-07-10 08:48:19 +02:00
Andy Shevchenko
504603767a console: Fix trivia typo 'change' -> 'chance'
I bet the word 'chance' has to be used in 'had a chance to be called',
but, alas, I'm not native speaker...

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200618164751.56828-7-andriy.shevchenko@linux.intel.com
2020-06-25 14:24:54 +02:00
Andy Shevchenko
bba18a1af3 console: Propagate error code from console ->setup()
Since console ->setup() hook returns meaningful error codes,
propagate it to the caller of try_enable_new_console().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200618164751.56828-6-andriy.shevchenko@linux.intel.com
2020-06-25 14:24:07 +02:00
Jason A. Donenfeld
625d344978 Revert "kernel/printk: add kmsg SEEK_CUR handling"
This reverts commit 8ece3b3eb5.

This commit broke userspace. Bash uses ESPIPE to determine whether or
not the file should be read using "unbuffered I/O", which means reading
1 byte at a time instead of 128 bytes at a time. I used to use bash to
read through kmsg in a really quite nasty way:

    while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
       echo "SARU $line"
    done < /dev/kmsg

This will show all lines that can fit into the 128 byte buffer, and skip
lines that don't. That's pretty awful, but at least it worked.

With this change, bash now tries to do 1-byte reads, which means it
skips all the lines, which is worse than before.

Now, I don't really care very much about this, and I'm already look for
a workaround. But I did just spend an hour trying to figure out why my
scripts were broken. Either way, it makes no difference to me personally
whether this is reverted, but it might be something to consider. If you
declare that "trying to read /dev/kmsg with bash is terminally stupid
anyway," I might be inclined to agree with you. But do note that bash
uses lseek(fd, 0, SEEK_CUR)==>ESPIPE to determine whether or not it's
reading from a pipe.

Cc: Bruno Meneguele <bmeneg@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-21 20:47:20 -07:00
Linus Torvalds
5c2fb57af0 One more printk change for 5.8
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl7jLTcACgkQUqAMR0iA
 lPJiaw/9FWnHlGHq1RMJ2cQTdgDStVDP12+eXUrSXBiXedGNoMfsfRWHHNmiqkSS
 4fmbjNu+//yz44QNzZPF783zix3rdY6IOaNOd95Pi1kjZ2wrcW3ioL7fk6Q0/vr0
 +pC1zeC+G2JzYdXdInvAM7HI0W5R7D0YBUaIORf3bTD4nW1CPbpDSknX6TkfjCRz
 fM9MZxWz1r788uk2HpwhLtjk6qoiNXihTzB/pRbiK9z9MlwQd5331W93RuARYIKy
 8gCM/MUWZ/iD7a4Tvn7+vdtqu1gEo+c2wfgY7ilK56JJsyqL/u1DkOxDQme73ffe
 PAtDgceEKz51N86Lsagp3UdRnP4K78BS7TKOUJ6mWaXb6uPHhB5o5qUN2tmQiEE9
 G5b+0wxiD4iwVK5XaP2g2bfaEBKcaj8CJmPP8u44urFQCEVDKGa3nLanUGvVkzaQ
 JUAZWLhgblfVV4IAJ4WM3JO10Cwt1Js5gUEgqJ8TE6wIKz4zCvFsPYqLl6EUdX/o
 BCn6NI4rX79WvqQVB6KjJ9fdIAJqs6Wgps0ceW/U+Xs0HWfyQL/Ow9ShSiG/ZJvf
 dF3o2wyrnQQq7jgc/2JpQG+mBFHrnRgwD24i8LzaymsadlwNxmmoJxljO/KG4ZSJ
 KniL33BrN92BSBEYMC6YHYAHXCyGYZxTGWugXIeS1N2iRt4A3Gw=
 =+1iq
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:
 "One more printk change for 5.8: make sure that messages printed from
  KDB context are redirected to KDB console handlers. It did not work
  when KDB interrupted NMI or printk_safe contexts.

  Arm people started hitting this problem more often recently. I forgot
  to add the fix into the previous pull request by mistake"

* tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk/kdb: Redirect printk messages into kdb in any context
2020-06-12 12:13:36 -07:00
Petr Mladek
2a9e5ded95 printk/kdb: Redirect printk messages into kdb in any context
kdb has to get messages on consoles even when the system is stopped.
It uses kdb_printf() internally and calls console drivers on its own.

It uses a hack to reuse an existing code. It sets "kdb_trap_printk"
global variable to redirect even the normal printk() into the
kdb_printf() variant.

The variable "kdb_trap_printk" is checked in printk_default() and
it is ignored when printk is redirected to printk_safe in NMI context.
Solve this by moving the check into printk_func().

It is obvious that it is not fully safe. But it does not make things
worse. The console drivers are already called in this context by
db_printf() direct calls.

Reported-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200520102233.GC3464@linux-b0ei
2020-06-11 08:48:44 +02:00
Linus Torvalds
cb8e59cc87 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
2020-06-03 16:27:18 -07:00
Linus Torvalds
2227e5b21a The RCU updates for this cycle were:
- RCU-tasks update, including addition of RCU Tasks Trace for
    BPF use and TASKS_RUDE_RCU
  - kfree_rcu() updates.
  - Remove scheduler locking restriction
  - RCU CPU stall warning updates.
  - Torture-test updates.
  - Miscellaneous fixes and other updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7U/r0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hSNxAAirKhPGBoLI9DW1qde4OFhZg+BlIpS+LD
 IE/0eGB8hGwhb1793RGbzIJfSnRQpSOPxWbWc6DJZ4Zpi5/ZbVkiPKsuXpM1xGxs
 kuBCTOhWy1/p3iCZ1JH/JCrCAdWGZkIzEoaV7ipnHtV/+UrRbCWH5PB7R0fYvcbI
 q5bUcWJyEp/bYMxQn8DhAih6SLPHx+F9qaGAqqloLSHstTYG2HkBhBGKnqcd/Jex
 twkLK53poCkeP/c08V1dyagU2IRWj2jGB1NjYh/Ocm+Sn/vru15CVGspjVjqO5FF
 oq07lad357ddMsZmKoM2F5DhXbOh95A+EqF9VDvIzCvfGMUgqYI1oxWF4eycsGhg
 /aYJgYuN23YeEe2DkDzJB67GvBOwl4WgdoFaxKRzOiCSfrhkM8KqM4G9Fz1JIepG
 abRJCF85iGcLslU9DkrShQiDsd/CRPzu/jz6ybK0I2II2pICo6QRf76T7TdOvKnK
 yXwC6OdL7/dwOht20uT6XfnDXMCWI4MutiUrb8/C1DbaihwEaI2denr3YYL+IwrB
 B38CdP6sfKZ5UFxKh0xb+sOzWrw0KA+ThSAXeJhz3tKdxdyB6nkaw3J9lFg8oi20
 XGeAujjtjMZG5cxt2H+wO9kZY0RRau/nTqNtmmRrCobd5yJjHHPHH8trEd0twZ9A
 X5Wjh11lv3E=
 =Yisx
 -----END PGP SIGNATURE-----

Merge tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "The RCU updates for this cycle were:

   - RCU-tasks update, including addition of RCU Tasks Trace for BPF use
     and TASKS_RUDE_RCU

   - kfree_rcu() updates.

   - Remove scheduler locking restriction

   - RCU CPU stall warning updates.

   - Torture-test updates.

   - Miscellaneous fixes and other updates"

* tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
  rcu: Allow for smp_call_function() running callbacks from idle
  rcu: Provide rcu_irq_exit_check_preempt()
  rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()
  rcu: Provide __rcu_is_watching()
  rcu: Provide rcu_irq_exit_preempt()
  rcu: Make RCU IRQ enter/exit functions rely on in_nmi()
  rcu/tree: Mark the idle relevant functions noinstr
  x86: Replace ist_enter() with nmi_enter()
  x86/mce: Send #MC singal from task work
  x86/entry: Get rid of ist_begin/end_non_atomic()
  sched,rcu,tracing: Avoid tracing before in_nmi() is correct
  sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exception
  lockdep: Always inline lockdep_{off,on}()
  hardirq/nmi: Allow nested nmi_enter()
  arm64: Prepare arch_nmi_enter() for recursion
  printk: Disallow instrumenting print_nmi_enter()
  printk: Prepare for nested printk_nmi_enter()
  rcutorture: Convert ULONG_CMP_LT() to time_before()
  torture: Add a --kasan argument
  torture: Save a few lines by using config_override_param initially
  ...
2020-06-01 12:56:29 -07:00
Linus Torvalds
ca1f5df23f Printk changes for 5.8
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl7U1TIACgkQUqAMR0iA
 lPI0mQ//TcVlRJgts/iwv0M2Simje28t9tziOHgWmEeiyGwE7vwDPDzt8QFiKzBa
 IrJ5iTRMtCrEF+eapqeH4g+Ve4Npm5Cobl8/h9JEiVu3SNC48TuaiUzU3+Bfl1zV
 vcDfZtN9QD1/CdLGlyKO75xjkCOaJRCFnx5ToXnd3llshMKI2XebUCnEH4TDe6Fz
 NGTjJL4kCPwzmae++UhlMfKwkayBtNbqcLkaTb7d67Tw2DcuuIVixUER77oC9QPN
 SfxdS07s0UVc4C9bCVe3KtYZR5YU/riOjKNJNutzP3JDtQNugywrrtI0qBwEisqM
 puMJ3xLeLssTn10FJxRK9ewRlXy2zT9mmcCuaVU6LtiyGnHOuEwIU+Ewu0itiJGu
 JuFovsNqTvOiZqFP7+pkDktOjffF2hsY/a6NxHr6aof8CrdO2w9dmCAgzGvnwV1N
 /zlmmPSEigVLz47eeivIIdQrPUejrEV7g1wOYYApnIlNCmGjdkGnXKNaUrLGDehQ
 QIlpx1uvmhntjiw1hTSbOV7KOQxLGtuy6DWLYC7uHD3H3aGDQyQO8dZYXSfOY1Qu
 cvQ/K8ykW0Kq2JKEHjkwSRHKDpxhvbQg7N5JCHQA49ahdZD3O5bOfeTofw+7nFO0
 j9g2Qv6nWFjLbAIAFtqjh6wz0UtGNoTl2cqQswsS10wlDcsq8vs=
 =pUBG
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Benjamin Herrenschmidt solved a problem with non-matched console
   aliases by first checking consoles defined on the command line. It is
   a more conservative approach than the previous attempts.

 - Benjamin also made sure that the console accessible via /dev/console
   always has CON_CONSDEV flag.

 - Andy Shevchenko added the %ptT modifier for printing struct time64_t.
   It extends the existing %ptR handling for struct rtc_time.

 - Bruno Meneguele fixed /dev/kmsg error value returned by unsupported
   SEEK_CUR.

 - Tetsuo Handa removed unused pr_cont_once().

... and a few small fixes.

* tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove pr_cont_once()
  printk: handle blank console arguments passed in.
  kernel/printk: add kmsg SEEK_CUR handling
  printk: Fix a typo in comment "interator"->"iterator"
  usb: pulse8-cec: Switch to use %ptT
  ARM: bcm2835: Switch to use %ptT
  lib/vsprintf: Print time64_t in human readable format
  lib/vsprintf: update comment about simple_strto<foo>() functions
  printk: Correctly set CON_CONSDEV even when preferred console was not registered
  printk: Fix preferred console selection with multiple matches
  printk: Move console matching logic into a separate function
  printk: Convert a use of sprintf to snprintf in console_unlock
2020-06-01 12:13:30 -07:00
Petr Mladek
d053cf0d77 Merge branch 'for-5.8' into for-linus 2020-06-01 10:15:16 +02:00
Petr Mladek
6a0af9fc8c Merge branch 'for-5.7-preferred-console' into for-linus 2020-06-01 10:13:51 +02:00
Kees Cook
fb13cb8a04 printk: Introduce kmsg_dump_reason_str()
The pstore subsystem already had a private version of this function.
With the coming addition of the pstore/zone driver, this needs to be
shared. As it really should live with printk, move it there instead.

Link: https://lore.kernel.org/lkml/20200515184434.8470-4-keescook@chromium.org/
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30 10:34:03 -07:00
Pavel Tatashin
b1f6f161b2 printk: honor the max_reason field in kmsg_dumper
kmsg_dump() allows to dump kmesg buffer for various system events: oops,
panic, reboot, etc. It provides an interface to register a callback
call for clients, and in that callback interface there is a field
"max_reason", but it was getting ignored when set to any "reason"
higher than KMSG_DUMP_OOPS unless "always_kmsg_dump" was passed as
kernel parameter.

Allow clients to actually control their "max_reason", and keep the
current behavior when "max_reason" is not set.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/lkml/20200515184434.8470-3-keescook@chromium.org/
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30 10:34:03 -07:00
Shreyas Joshi
48021f9813 printk: handle blank console arguments passed in.
If uboot passes a blank string to console_setup then it results in
a trashed memory. Ultimately, the kernel crashes during freeing up
the memory.

This fix checks if there is a blank parameter being
passed to console_setup from uboot. In case it detects that
the console parameter is blank then it doesn't setup the serial
device and it gracefully exits.

Link: https://lore.kernel.org/r/20200522065306.83-1-shreyas.joshi@biamp.com
Signed-off-by: Shreyas Joshi <shreyas.joshi@biamp.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: Better format the commit message and code, remove unnecessary brackets.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2020-05-22 10:34:34 +02:00
Bruno Meneguele
8ece3b3eb5 kernel/printk: add kmsg SEEK_CUR handling
Userspace libraries, e.g. glibc's dprintf(), perform a SEEK_CUR operation
over any file descriptor requested to make sure the current position isn't
pointing to junk due to previous manipulation of that same fd. And whenever
that fd doesn't have support for such operation, the userspace code expects
-ESPIPE to be returned.

However, when the fd in question references the /dev/kmsg interface, the
current kernel code state returns -EINVAL instead, causing an unexpected
behavior in userspace: in the case of glibc, when -ESPIPE is returned it
gets ignored and the call completes successfully, while returning -EINVAL
forces dprintf to fail without performing any action over that fd:

  if (_IO_SEEKOFF (fp, (off64_t)0, _IO_seek_cur, _IOS_INPUT|_IOS_OUTPUT) ==
  _IO_pos_BAD && errno != ESPIPE)
    return NULL;

With this patch we make sure to return the correct value when SEEK_CUR is
requested over kmsg and also add some kernel doc information to formalize
this behavior.

Link: https://lore.kernel.org/r/20200317103344.574277-1-bmeneg@redhat.com
Cc: linux-kernel@vger.kernel.org
Cc: rostedt@goodmis.org,
Cc: David.Laight@ACULAB.COM
Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2020-05-21 13:32:25 +02:00
Ethon Paul
325606af57 printk: Fix a typo in comment "interator"->"iterator"
There is a typo in comment, fix it.

Signed-off-by: Ethon Paul <ethp@qq.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2020-05-21 13:31:33 +02:00
Peter Zijlstra
b0f51883f5 printk: Disallow instrumenting print_nmi_enter()
It happens early in nmi_enter(), no tracing, probing or other funnies
allowed. Specifically as nmi_enter() will be used in do_debug(), which
would cause recursive exceptions when kprobed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134101.139720912@linutronix.de
2020-05-19 15:51:16 +02:00
Petr Mladek
8c4e93c362 printk: Prepare for nested printk_nmi_enter()
There is plenty of space in the printk_context variable. Reserve one byte
there for the NMI context to be on the safe side.

It should never overflow. The BUG_ON(in_nmi() == NMI_MASK) in nmi_enter()
will trigger much earlier.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134100.681374113@linutronix.de
2020-05-19 15:51:16 +02:00
Randy Dunlap
c1a371cf80 printk: fix global comment
Fix typo/spello.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-18 12:41:32 +02:00
Christoph Hellwig
32927393dc sysctl: pass kernel pointers to ->proc_handler
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from  userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-27 02:07:40 -04:00
Sergey Senozhatsky
ab6f762f0f printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
printk_deferred(), similarly to printk_safe/printk_nmi, does not
immediately attempt to print a new message on the consoles, avoiding
calls into non-reentrant kernel paths, e.g. scheduler or timekeeping,
which potentially can deadlock the system.

Those printk() flavors, instead, rely on per-CPU flush irq_work to print
messages from safer contexts.  For same reasons (recursive scheduler or
timekeeping calls) printk() uses per-CPU irq_work in order to wake up
user space syslog/kmsg readers.

However, only printk_safe/printk_nmi do make sure that per-CPU areas
have been initialised and that it's safe to modify per-CPU irq_work.
This means that, for instance, should printk_deferred() be invoked "too
early", that is before per-CPU areas are initialised, printk_deferred()
will perform illegal per-CPU access.

Lech Perczak [0] reports that after commit 1b710b1b10 ("char/random:
silence a lockdep splat with printk()") user-space syslog/kmsg readers
are not able to read new kernel messages.

The reason is printk_deferred() being called too early (as was pointed
out by Petr and John).

Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU
areas are initialized.

Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
Reported-by: Lech Perczak <l.perczak@camlintechnologies.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Jann Horn <jannh@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 13:18:57 -07:00
Benjamin Herrenschmidt
33225d7b0a printk: Correctly set CON_CONSDEV even when preferred console was not registered
CON_CONSDEV flag was historically used to put/keep the preferred console
first in console_drivers list. Where the preferred console is the last
on the command line.

The ordering is important only when opening /dev/console:

  + tty_kopen()
    + tty_lookup_driver()
      + console_device()

The flag was originally an implementation detail. But it was later
made accessible from userspace via /proc/consoles. It was used,
for example, by the tool "showconsole" to show the real tty
accessible via /dev/console, see
https://github.com/bitstreamout/showconsole

Now, the current code sets CON_CONSDEV only for the preferred
console or when a fallback console is added. The flag is not
set when the preferred console is defined on the command line
but it is not registered from some reasons.

Simple solution is to set CON_CONSDEV flag for the first
registered console. It will work most of the time because:

  + Most real consoles have console->device defined.

  + Boot consoles are removed in printk_late_init().

  + unregister_console() moves CON_CONSDEV flag to the next
    console.

Clean solution would require checking con->device when the
preferred console is registered and in unregister_console().

Conclusion:

Use the simple solution for now. It is better than the current
state and good enough.

The clean solution is not worth it. It would complicate the already
complicated code without too much gain. Instead the code would deserve
a complete rewrite.

Link: https://lore.kernel.org/r/20200213095133.23176-4-pmladek@suse.com
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[pmladek@suse.com: Correct reasoning in the commit message, comment update.]
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2020-02-18 09:35:24 +01:00
Benjamin Herrenschmidt
e369d8227f printk: Fix preferred console selection with multiple matches
In the following circumstances, the rule of selecting the console
corresponding to the last "console=" entry on the command line as
the preferred console (CON_CONSDEV, ie, /dev/console) fails. This
is a specific example, but it could happen with different consoles
that have a similar name aliasing mechanism.

  - The kernel command line has both console=tty0 and console=ttyS0
    in that order (the latter with speed etc... arguments).
    This is common with some cloud setups such as Amazon Linux.

  - add_preferred_console is called early to register "uart0". In
    our case that happens from acpi_parse_spcr() on arm64 since the
    "enable_console" argument is true on that architecture. This causes
    "uart0" to become entry 0 of the console_cmdline array.

Now, because of the above, what happens is:

  - add_preferred_console is called by the cmdline parsing for tty0
    and ttyS0 respectively, thus occupying entries 1 and 2 of the
    console_cmdline array (since this happens after ACPI SPCR parsing).
    At that point preferred_console is set to 2 as expected.

  - When the tty layer kicks in, it will call register_console for tty0.
    This will match entry 1 in console_cmdline array. It isn't our
    preferred console but because it's our only console at this point,
    it will end up "first" in the consoles list.

  - When 8250 probes the actual serial port later on, it calls
    register_console for ttyS0. At that point the loop in register_console
    tries to match it with the entries in the console_cmdline array.
    Ideally this should match ttyS0 in entry 2, which is preferred, causing
    it to be inserted first and to replace tty0 as CONSDEV. However, 8250
    provides a "match" hook in its struct console, and that hook will match
    "uart" as an alias to "ttyS". So we match uart0 at entry 0 in the array
    which is not the preferred console and will not match entry 2 which is
    since we break out of the loop on the first match. As a result,
    we don't set CONSDEV and don't insert it first, but second in
    the console list.

    As a result, we end up with tty0 remaining first in the array, and thus
    /dev/console going there instead of the last user specified one which
    is ttyS0.

This tentative fix register_console() to scan first for consoles
specified on the command line, and only if none is found, to then
scan for consoles specified by the architecture.

Link: https://lore.kernel.org/r/20200213095133.23176-3-pmladek@suse.com
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2020-02-18 09:34:42 +01:00