Commit graph

11 commits

Author SHA1 Message Date
Pavel Begunkov
2b8e976b98 io_uring: user registered clockid for wait timeouts
Add a new registration opcode IORING_REGISTER_CLOCK, which allows the
user to select which clock id it wants to use with CQ waiting timeouts.
It only allows a subset of all posix clocks and currently supports
CLOCK_MONOTONIC and CLOCK_BOOTTIME.

Suggested-by: Lewis Baker <lewissbaker@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/98f2bc8a3c36cdf8f0e6a275245e81e903459703.1723039801.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-25 08:27:01 -06:00
Gabriel Krisman Bertazi
6bc9199d0c io_uring: Allocate only necessary memory in io_probe
We write at most IORING_OP_LAST entries in the probe buffer, so we don't
need to allocate temporary space for more than that.  As a side effect,
we no longer can overflow "size".

Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20240619020620.5301-3-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19 08:58:00 -06:00
Gabriel Krisman Bertazi
3e05b22238 io_uring: Fix probe of disabled operations
io_probe checks io_issue_def->not_supported, but we never really set
that field, as we mark non-supported functions through a specific ->prep
handler.  This means we end up returning IO_URING_OP_SUPPORTED, even for
disabled operations.  Fix it by just checking the prep handler itself.

Fixes: 66f4af93da ("io_uring: add support for probing opcodes")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20240619020620.5301-2-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19 08:58:00 -06:00
Jens Axboe
200f3abd14 io_uring/eventfd: move eventfd handling to separate file
This is pretty nicely abstracted already, but let's move it to a separate
file rather than have it in the main io_uring file. With that, we can
also move the io_ev_fd struct and enum out of global scope.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-16 14:54:55 -06:00
Jens Axboe
60b6c075e8 io_uring/eventfd: move to more idiomatic RCU free usage
In some ways, it just "happens to work" currently with using the ops
field for both the free and signaling bit. But it depends on ordering
of operations in terms of freeing and signaling. Clean it up and use the
usual refs == 0 under RCU read side lock to determine if the ev_fd is
still valid, and use the reference to gate the freeing as well.

Fixes: 21a091b970 ("io_uring: signal registered eventfd to process deferred task work")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-16 14:54:55 -06:00
Hagar Hemdan
73254a297c io_uring: fix possible deadlock in io_register_iowq_max_workers()
The io_register_iowq_max_workers() function calls io_put_sq_data(),
which acquires the sqd->lock without releasing the uring_lock.
Similar to the commit 009ad9f0c6 ("io_uring: drop ctx->uring_lock
before acquiring sqd->lock"), this can lead to a potential deadlock
situation.

To resolve this issue, the uring_lock is released before calling
io_put_sq_data(), and then it is re-acquired after the function call.

This change ensures that the locks are acquired in the correct
order, preventing the possibility of a deadlock.

Suggested-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Hagar Hemdan <hagarhem@amazon.com>
Link: https://lore.kernel.org/r/20240604130527.3597-1-hagarhem@amazon.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-04 07:39:17 -06:00
Jens Axboe
1da2f311ba io_uring: fix warnings on shadow variables
There are a few of those:

io_uring/fdinfo.c:170:16: warning: declaration shadows a local variable [-Wshadow]
  170 |                 struct file *f = io_file_from_index(&ctx->file_table, i);
      |                              ^
io_uring/fdinfo.c:53:67: note: previous declaration is here
   53 | __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
      |                                                                   ^
io_uring/cancel.c:187:25: warning: declaration shadows a local variable [-Wshadow]
  187 |                 struct io_uring_task *tctx = node->task->io_uring;
      |                                       ^
io_uring/cancel.c:166:31: note: previous declaration is here
  166 |                              struct io_uring_task *tctx,
      |                                                    ^
io_uring/register.c:371:25: warning: declaration shadows a local variable [-Wshadow]
  371 |                 struct io_uring_task *tctx = node->task->io_uring;
      |                                       ^
io_uring/register.c:312:24: note: previous declaration is here
  312 |         struct io_uring_task *tctx = NULL;
      |                               ^

and a simple cleanup gets rid of them. For the fdinfo case, make a
distinction between the file being passed in (for the ring), and the
registered files we iterate. For the other two cases, just get rid of
shadowed variable, there's no reason to have a new one.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15 08:10:26 -06:00
Stefan Roesch
ef1186c1a8 io_uring: add register/unregister napi function
This adds an api to register and unregister the napi for io-uring. If
the arg value is specified when unregistering, the current napi setting
for the busy poll timeout is copied into the user structure. If this is
not required, NULL can be passed as the arg value.

Signed-off-by: Stefan Roesch <shr@devkernel.io>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230608163839.2891748-7-shr@devkernel.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-09 11:54:32 -07:00
Jens Axboe
baf5977134 io_uring/register: guard compat syscall with CONFIG_COMPAT
Add compat.h include to avoid a potential build issue:

io_uring/register.c:281:6: error: call to undeclared function 'in_compat_syscall'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]

if (in_compat_syscall()) {
    ^
1 warning generated.
io_uring/register.c:282:9: error: call to undeclared function 'compat_get_bitmap'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]
ret = compat_get_bitmap(cpumask_bits(new_mask),
      ^

Fixes: c43203154d ("io_uring/register: move io_uring_register(2) related code to register.c")
Reported-by: Manu Bretelle <chantra@meta.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-17 09:45:18 -07:00
Jens Axboe
d293b1a896 io_uring/kbuf: add method for returning provided buffer ring head
The tail of the provided ring buffer is shared between the kernel and
the application, but the head is private to the kernel as the
application doesn't need to see it. However, this also prevents the
application from knowing how many buffers the kernel has consumed.
Usually this is fine, as the information is inherently racy in that
the kernel could be consuming buffers continually, but for cleanup
purposes it may be relevant to know how many buffers are still left
in the ring.

Add IORING_REGISTER_PBUF_STATUS which will return status for a given
provided buffer ring. Right now it just returns the head, but space
is reserved for more information later in, if needed.

Link: https://github.com/axboe/liburing/discussions/1020
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-21 09:47:06 -07:00
Jens Axboe
c43203154d io_uring/register: move io_uring_register(2) related code to register.c
Most of this code is basically self contained, move it out of the core
io_uring file to bring a bit more separation to the registration related
bits. This moves another ~10% of the code into register.c.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-19 08:54:20 -07:00