.\" $NetBSD: aio.3,v 1.5 2010/05/19 06:35:20 jruoho Exp $ $
.\"
.\" Copyright (c) 2010 Jukka Ruohonen <jruohonen@iki.fi>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY Softweyr LLC AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL Softweyr LLC OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd May 19, 2010
.Dt AIO 3
.Os
.Sh NAME
.Nm aio
.Nd asynchronous I/O (REALTIME)
.Sh LIBRARY
.Lb librt
.Sh SYNOPSIS
.In aio.h
.Sh DESCRIPTION
The
.St -p1003.1-2001
standard defines an interface for asynchronous input and output.
Although in
.Nx
this is provided as part of the
.Lb librt ,
the implementation largely resides in the kernel.
.Ss Rationale
The rationale can be roughly summarized with the following points.
.Bl -enum -offset 2n
.It
To increase performance by providing a mechanism to carry out
.Tn I/O
without blocking.
Theoretically, if
.Tn I/O
would never block,
neither at the software nor at the hardware level,
the overhead of
.Tn I/O
would become zero, and processes would no longer be
.Tn I/O
bound.
.It
To segregate the different
.Tn I/O
operations into logically distinctive procedures.
Unlike with the standard
.Xr stdio 3 ,
the
.Nm
interface separates queuing and submitting
.Tn I/O
operations to the kernel, and
receiving notifications of operation completion from the kernel.
.It
To provide an uniform and standardized framework for asynchronous
.Tn I/O .
For instance,
.Nm
avoids the need for (and the overhead of) extra worker threads
sometimes used to perform asynchronous
.Tn I/O .
.El
.Ss Asynchronous I/O Control Block
The Asynchronous I/O Control Block is the basic operational unit behind
.Nm .
This is required since an arbitrary number of operations can be started
at once, and because each operation can be either input or output.
This block is represented by the
.Em aiocb
structure, which is defined in the
.In aio.h
header.
The following fields are available for user applications:
.Bd -literal -offset indent
off_t aio_offset;
void *aio_buf;
size_t aio_nbytes;
int aio_fildes;
int aio_lio_opcode;
int aio_reqprio;
struct sigevent aio_sigevent;
.Ed
.Pp
The fields are:
.Bl -enum -offset indent
.It
The
.Va aio_offset
specifies the implicit file offset at which the
.Tn I/O
operations are performed.
This cannot be expected to be the actual read/write offset of the
file descriptor.
.It
The
.Va aio_buf
member is a pointer to the buffer to which data is going to be written or
to which the read operation stores data.
.It
The
.Va aio_nbytes
specifies the length of
.Va aio_buf .
.It
The
.Va aio_fildes
specifies the used file descriptor.
.It
The
.Va aio_lio_opcode
is used by the
.Fn lio_listio
function to initialize a list of
.Tn I/O
requests with a single call.
.It
The
.Va aio_reqprio
member can be used to lower the scheduling priority of an
.Nm
operation.
This is only available if
.Dv _POSIX_PRIORITIZED_IO
and
.Dv _POSIX_PRIORITY_SCHEDULING
are defined, and the associated file descriptor supports it.
.It
The
.Va aio_sigevent
member is used to specify how the calling process is notified once an
.Nm
operation completes.
.El
.Pp
The members
.Va aio_buf ,
.Va aio_fildes ,
and
.Va aio_nbytes
are conceptually similar to the parameters
.Sq buf ,
.Sq fildes ,
and
.Sq nbytes
used in the standard
.Xr read 2
and
.Xr write 2
functions.
For example, the caller can read
.Va aio_nbytes
from a file associated with the file descriptor
.Va aio_fildes
into the buffer
.Va aio_buf .
All appropriate fields should be initialized by the caller before
.Fn aio_read
or
.Fn aio_write
is called.
.Ss File Offsets
Asynchronous
.Tn I/O
operations are not strictly sequential;
operations are carried out in arbitrary order and more than one
operation for one file descriptor can be started.
The requested read or write operation starts
from the absolute position specified by
.Va aio_offset ,
as if
.Xr lseek 2
would have been called with
.Dv SEEK_SET
immediately prior to the operation.
The
.Tn POSIX
standard does not specify what happens after an
.Nm
operation has been successfully completed.
Depending on the implementation,
the actual file offset may or may not be updated.
.Ss Errors and Completion
Asynchronous
.Tn I/O
operations are said to be complete when:
.Bl -bullet -offset 2n
.It
An error is detected.
.It
The
.Tn I/O
transfer is performed successfully.
.It
The operation is canceled.
.El
.Pp
If an error condition is detected that prevents
an operation from being started, the request is not enqueued.
In this case the read and write functions,
.Fn aio_read
and
.Fn aio_write ,
return immediately, setting the global
.Va errno
to indicate the cause of the error.
.Pp
After an operation has been successfully enqueued,
.Fn aio_error
and
.Fn aio_return
must be used to determine the status of the operation and to determine
any error conditions.
This includes the conditions reported by the standard
.Xr read 2 ,
.Xr write 2 ,
and
.Xr fsync 2 .
The request remains enqueued and consumes process and
system resources until
.Fn aio_return
is called.
.Ss Waiting for Completion
The
.Nm
interface supports both polling and notification models.
The first can be implemented by simply repeatedly calling the
.Fn aio_error
function to test the status of an operation.
Once the operation has completed,
.Fn aio_return
is used to free the
.Va aiocb
structure for re-use.
.Pp
The notification model is implemented by using the
.Va aio_sigevent
member of the Asynchronous I/O Control Block.
The operational model and the used structure are described in
.Xr sigevent 3 .
.Pp
The
.Fn aio_suspend
function can be used to wait for the completion of one or more operations.
It is possible to set a timeout so that the process can continue the
execution and take recovery actions if the
.Nm
operations do not complete as expected.
.Ss Cancellation and Synchronization
The
.Fn aio_cancel
function can be used to request cancellation of an asynchronous
.Tn I/O
operation.
Note however that not all of them can be canceled.
The same
.Va aiocb
used to start the operation may be used as a handle for identification.
It is also possible to request cancellation of all operations pending
for a file.
.Pp
Comparable to
.Xr fsync 2 ,
the
.Fn aio_fsync
function can be used to synchronize the contents of
permanent storage when multiple asynchronous
.Tn I/O
operations are outstanding for the file or device.
The synchronization operation includes only those requests that have
already been successfully enqueued.
.Sh FUNCTIONS
The following functions comprise the
.Tn API
of the
.Nm
interface:
.Bl -column -offset indent "aio_suspend " "XXX"
.It Sy Function Ta Sy Description
.It Xr aio_cancel 3 Ta cancel an outstanding asynchronous I/O operation
.It Xr aio_error 3 Ta retrieve error status of asynchronous I/O operation
.It Xr aio_fsync 3 Ta asynchronous data synchronization of file
.It Xr aio_read 3 Ta asynchronous read from a file
.It Xr aio_return 3 Ta get return status of asynchronous I/O operation
.It Xr aio_suspend 3 Ta suspend until operations or timeout complete
.It Xr aio_write 3 Ta asynchronous write to a file
.It Xr lio_listio 3 Ta list directed I/O
.El
.Sh COMPATIBILITY
Unfortunately, the
.Tn POSIX
asynchronous
.Tn I/O
implementations vary slightly.
Some implementations provide a slightly different
.Tn API
with possible extensions.
For instance, the
.Fx
implementation uses a function
.Sq Fn aio_waitcomplete
to wait for the next completion of an
.Nm aio
request.
.Sh STANDARDS
The
.Nm
interface is expected to conform to the
.St -p1003.1-2001
standard.
.Sh HISTORY
The
.Nm
interface first appeared in
.Nx 5.0 .
.Sh CAVEATS
Few limitations can be mentioned:
.Bl -bullet
.It
Undefined behavior results if simultaneous asynchronous operations
use the same Asynchronous I/O Control Block.
.It
When an asynchronous read operation is outstanding,
undefined behavior may follow if the contents of
.Va aiocb
are altered, or if memory associated with the structure, or the
.Va aio_buf
buffer, is deallocated.
.El