Diffstat (limited to 'man/io_uring_register.2')
-rw-r--r-- | man/io_uring_register.2 | 315 |
1 files changed, 302 insertions, 13 deletions
diff --git a/man/io_uring_register.2 b/man/io_uring_register.2 index 5326a87..1e91caf 100644 --- a/man/io_uring_register.2 +++ b/man/io_uring_register.2 @@ -88,14 +88,107 @@ then issuing a new call to .BR io_uring_register () with the new buffers. -Note that registering buffers will wait for the ring to idle. If the application -currently has requests in-flight, the registration will wait for those to -finish before proceeding. +Note that before 5.13 registering buffers would wait for the ring to idle. +If the application currently has requests in-flight, the registration will +wait for those to finish before proceeding. An application need not unregister buffers explicitly before shutting down the io_uring instance. Available since 5.1. .TP +.B IORING_REGISTER_BUFFERS2 +Register buffers for I/O. Similar to +.B IORING_REGISTER_BUFFERS +but aims to have a more extensible ABI. + +.I arg +points to a +.I struct io_uring_rsrc_register, +and +.I nr_args +should be set to the number of bytes in the structure. + +.PP +.in +8n +.EX +struct io_uring_rsrc_register { + __u32 nr; + __u32 resv; + __u64 resv2; + __aligned_u64 data; + __aligned_u64 tags; +}; + +.EE +.in +.PP + +.in +8n + +The +.I data +field contains a pointer to a +.I struct iovec +array of +.I nr +entries. +The +.I tags +field should either be 0, then tagging is disabled, or point to an array +of +.I nr +"tags" (unsigned 64 bit integers). If a tag is zero, then tagging for this +particular resource (a buffer in this case) is disabled. Otherwise, after the +resource had been unregistered and it's not used anymore, a CQE will be +posted with +.I user_data +set to the specified tag and all other fields zeroed. + +Note that resource updates, e.g. +.B IORING_REGISTER_BUFFERS_UPDATE, +don't necessarily deallocate resources by the time it returns, but they might +be held alive until all requests using it complete. + +Available since 5.13. 
+ +.TP +.B IORING_REGISTER_BUFFERS_UPDATE +Updates registered buffers with new ones, either turning a sparse entry into +a real one, or replacing an existing entry. + +.I arg +must contain a pointer to a struct io_uring_rsrc_update2, which contains +an offset on which to start the update, and an array of +.I struct iovec. +.I tags +points to an array of tags. +.I nr +must contain the number of descriptors in the passed in arrays. +See +.B IORING_REGISTER_BUFFERS2 +for the resource tagging description. + +.PP +.in +8n +.EX + +struct io_uring_rsrc_update2 { + __u32 offset; + __u32 resv; + __aligned_u64 data; + __aligned_u64 tags; + __u32 nr; + __u32 resv2; +}; +.EE +.in +.PP + +.in +8n + +Available since 5.13. + +.TP .B IORING_UNREGISTER_BUFFERS This operation takes no argument, and .I arg @@ -128,25 +221,60 @@ See .B IORING_REGISTER_FILES_UPDATE for how to update files in place. -Note that registering files will wait for the ring to idle. If the application -currently has requests in-flight, the registration will wait for those to -finish before proceeding. See +Note that before 5.13 registering files would wait for the ring to idle. +If the application currently has requests in-flight, the registration will +wait for those to finish before proceeding. See .B IORING_REGISTER_FILES_UPDATE for how to update an existing set without that limitation. Files are automatically unregistered when the io_uring instance is -torn down. An application need only unregister if it wishes to +torn down. An application needs only unregister if it wishes to register a new set of fds. Available since 5.1. .TP +.B IORING_REGISTER_FILES2 +Register files for I/O. Similar to +.B IORING_REGISTER_FILES. + +.I arg +points to a +.I struct io_uring_rsrc_register, +and +.I nr_args +should be set to the number of bytes in the structure. + +The +.I data +field contains a pointer to an array of +.I nr +file descriptors (signed 32 bit integers). 
+.I tags +field should either be 0 or or point to an array of +.I nr +"tags" (unsigned 64 bit integers). See +.B IORING_REGISTER_BUFFERS2 +for more info on resource tagging. + +Note that resource updates, e.g. +.B IORING_REGISTER_FILES_UPDATE, +don't necessarily deallocate resources, they might be held until all requests +using that resource complete. + +Available since 5.13. + +.TP .B IORING_REGISTER_FILES_UPDATE This operation replaces existing files in the registered file set with new -ones, either turning a sparse entry (one where fd is equal to -1) into a -real one, removing an existing entry (new one is set to -1), or replacing -an existing entry with a new existing entry. +ones, either turning a sparse entry (one where fd is equal to +.B -1 +) into a real one, removing an existing entry (new one is set to +.B -1 +), or replacing an existing entry with a new existing entry. .I arg -must contain a pointer to a struct io_uring_files_update, which contains +must contain a pointer to a +.I struct io_uring_files_update, +which contains an offset on which to start the update, and an array of file descriptors to use for the update. .I nr_args @@ -158,6 +286,32 @@ File descriptors can be skipped if they are set to Skipping an fd will not touch the file associated with the previous fd at that index. Available since 5.12. +.TP +.B IORING_REGISTER_FILES_UPDATE2 +Similar to IORING_REGISTER_FILES_UPDATE, replaces existing files in the +registered file set with new ones, either turning a sparse entry (one where +fd is equal to +.B -1 +) into a real one, removing an existing entry (new one is set to +.B -1 +), or replacing an existing entry with a new existing entry. + +.I arg +must contain a pointer to a +.I struct io_uring_rsrc_update2, +which contains +an offset on which to start the update, and an array of file descriptors to +use for the update stored in +.I data. +.I tags +points to an array of tags. 
+.I nr +must contain the number of descriptors in the passed in arrays. +See +.B IORING_REGISTER_BUFFERS2 +for the resource tagging description. + +Available since 5.13. .TP .B IORING_UNREGISTER_FILES @@ -174,7 +328,13 @@ registered through this operation. .I arg must contain a pointer to the eventfd file descriptor, and .I nr_args -must be 1. Available since 5.2. +must be 1. Note that while io_uring generally takes care to avoid spurious +events, they can occur. Similarly, batched completions of CQEs may only trigger +a single eventfd notification even if multiple CQEs are posted. The application +should make no assumptions on number of events being available having a direct +correlation to eventfd notifications posted. An eventfd notification must thus +only be treated as a hint to check the CQ ring for completions. Available since +5.2. An application can temporarily disable notifications, coming through the registered eventfd, by setting the @@ -292,11 +452,140 @@ must be specified in the call to Available since 5.10. +.TP +.B IORING_REGISTER_IOWQ_AFF +By default, async workers created by io_uring will inherit the CPU mask of its +parent. This is usually all the CPUs in the system, unless the parent is being +run with a limited set. If this isn't the desired outcome, the application +may explicitly tell io_uring what CPUs the async workers may run on. +.I arg +must point to a +.B cpu_set_t +mask, and +.I nr_args +the byte size of that mask. + +Available since 5.14. + +.TP +.B IORING_UNREGISTER_IOWQ_AFF +Undoes a CPU mask previously set with +.B IORING_REGISTER_IOWQ_AFF. +Must not have +.I arg +or +.I nr_args +set. + +Available since 5.14. + +.TP +.B IORING_REGISTER_IOWQ_MAX_WORKERS +By default, io_uring limits the unbounded workers created to the maximum +processor count set by +.I RLIMIT_NPROC +and the bounded workers is a function of the SQ ring size and the number +of CPUs in the system. 
Sometimes this can be excessive (or too little, for +bounded), and this command provides a way to change the count per ring (per NUMA +node) instead. + +.I arg +must be set to an +.I unsigned int +pointer to an array of two values, with the values in the array being set to +the maximum count of workers per NUMA node. Index 0 holds the bounded worker +count, and index 1 holds the unbounded worker count. On successful return, the +passed in array will contain the previous maximum valyes for each type. If the +count being passed in is 0, then this command returns the current maximum values +and doesn't modify the current setting. +.I nr_args +must be set to 2, as the command takes two values. + +Available since 5.15. + +.TP +.B IORING_REGISTER_RING_FDS +Whenever +.BR io_uring_enter (2) +is called to submit request or wait for completions, the kernel must grab a +reference to the file descriptor. If the application using io_uring is threaded, +the file table is marked as shared, and the reference grab and put of the file +descriptor count is more expensive than it is for a non-threaded application. + +Similarly to how io_uring allows registration of files, this allow registration +of the ring file descriptor itself. This reduces the overhead of the +.BR io_uring_enter (2) +system call. + +.I arg +must be set to an unsigned int pointer to an array of type +.I struct io_uring_rsrc_register +of +.I nr_args +number of entries. The +.B data +field of this struct must point to an io_uring file descriptor, and the +.B offset +field can be either +.B -1 +or an explicit offset desired for the registered file descriptor value. If +.B -1 +is used, then upon successful return of this system call, the field will +contain the value of the registered file descriptor to be used for future +.BR io_uring_enter (2) +system calls. 
+ +On successful completion of this request, the returned descriptors may be used +instead of the real file descriptor for +.BR io_uring_enter (2), +provided that +.B IORING_ENTER_REGISTERED_RING +is set in the +.I flags +for the system call. This flag tells the kernel that a registered descriptor +is used rather than a real file descriptor. + +Each thread or process using a ring must register the file descriptor directly +by issuing this request.o + +The maximum number of supported registered ring descriptors is currently +limited to +.B 16. + +Available since 5.18. + +.TP +.B IORING_UNREGISTER_RING_FDS +Unregister descriptors previously registered with +.B IORING_REGISTER_RING_FDS. + +.I arg +must be set to an unsigned int pointer to an array of type +.I struct io_uring_rsrc_register +of +.I nr_args +number of entries. Only the +.B offset +field should be set in the structure, containing the registered file descriptor +offset previously returned from +.B IORING_REGISTER_RING_FDS +that the application wishes to unregister. + +Note that this isn't done automatically on ring exit, if the thread or task +that previously registered a ring file descriptor isn't exiting. It is +recommended to manually unregister any previously registered ring descriptors +if the ring is closed and the task persists. This will free up a registration +slot, making it available for future use. + +Available since 5.18. + .SH RETURN VALUE On success, .BR io_uring_register () -returns 0. On error, -1 is returned, and +returns 0. On error, +.B -1 +is returned, and .I errno is set accordingly. |
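
As a reading aid for the IORING_REGISTER_BUFFERS2 text added in this patch, the argument structure can be filled in roughly as sketched below. This is only an illustration under stated assumptions: `rsrc_register_sketch` is a local mirror of the `struct io_uring_rsrc_register` layout quoted in the diff (real code would presumably take the definition from `<linux/io_uring.h>`), and `make_buffer_registration` is a hypothetical helper name that does not appear in the patch. The sketch only prepares the argument and makes no system call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Local mirror of the layout quoted in the patch (assumption: fixed-width
 * types stand in for __u32 / __aligned_u64 from the kernel headers). */
struct rsrc_register_sketch {
    uint32_t nr;     /* number of entries in the data (and tags) array */
    uint32_t resv;   /* reserved, left as zero */
    uint64_t resv2;  /* reserved, left as zero */
    uint64_t data;   /* pointer to an array of nr struct iovec */
    uint64_t tags;   /* 0 (tagging disabled) or pointer to nr 64-bit tags */
};

/* 4 + 4 + 8 + 8 + 8 bytes, with no padding expected. */
static_assert(sizeof(struct rsrc_register_sketch) == 32,
              "layout should match the structure shown in the patch");

/* Hypothetical helper: build the argument for a tagged buffer registration.
 * Passing tags == NULL stores 0 in the tags field, which the patch text
 * describes as disabling tagging for the whole registration. */
struct rsrc_register_sketch
make_buffer_registration(const struct iovec *iovs, const uint64_t *tags,
                         uint32_t nr)
{
    struct rsrc_register_sketch reg = {0};

    reg.nr = nr;
    reg.data = (uint64_t)(uintptr_t)iovs;
    reg.tags = (uint64_t)(uintptr_t)tags;
    return reg;
}
```

Going by the patch text, the resulting structure would be passed as `arg` to `io_uring_register()` with `IORING_REGISTER_BUFFERS2`, with `nr_args` set to the size of the structure in bytes. A per-entry tag of zero disables tagging for that one buffer. A non-zero tag is later reported in `user_data` of a CQE once the buffer has been unregistered and is no longer in use.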