How to improve program performance with AIO Technology

Preface

This article is about asynchronous I/O (AIO). It compares several common I/O models in detail and introduces the main AIO-related APIs.

This article is my translation of an English original. I did it partly for my own study and partly to help readers who are not used to reading English materials.

Original English Address:

The English title is:
Boost application performance using asynchronous I/O

Start of text

AIO introduction

Linux asynchronous I/O is a recent addition to the Linux kernel. This is a standard feature of the 2.6 kernel and appears as a patch in version 2.4. The basic idea behind AIO is to allow processes to start many I/O operations without blocking or waiting for any operation to complete. Then, after receiving the I/O completion notification later, the process can further query the I/O results.

I/O model

Before going deeper into the AIO API, let's review the I/O models available under Linux so that the differences between them are clear. Figure 1 shows the synchronous and asynchronous models, as well as the blocking and non-blocking models.

From the figure, we can see the location of AIO.

Synchronous blocking I/O

One of the most common models is the synchronous blocking I/O model. In this model, the user space application performs a system call, which causes blocking. This means that the application blocks until the system call is complete (data transfer complete or error). The caller application is in a state of waiting for a response, but does not consume CPU. From this point of view, it is fairly efficient.

Figure 2 shows the traditional blocking I/O model, which is also the most commonly used model in applications. Its behavior is well understood, and it works well for typical applications. When the read system call is invoked, the application blocks and control switches to the kernel. The kernel then starts the read, and when the response returns (from the device being read), the data is moved to the user-space buffer. The application is then unblocked (and the read call returns).

From the application's perspective, the read call lasts a long time. In fact, the application is simply blocked for that time while the kernel carries out the read.
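The blocking model can be sketched in a few lines of C. This is only an illustration, assuming a descriptor that is already open for reading; the helper name read_blocking is hypothetical and does not come from the original article.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes, blocking until data arrives,
 * end-of-file is reached, or an error occurs. The call is retried if a
 * signal interrupts it before any data was transferred. */
ssize_t read_blocking(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);   /* the caller sleeps inside this call */
    } while (n < 0 && errno == EINTR);

    return n;                     /* bytes read, 0 at EOF, -1 on error */
}
```

While read is sleeping, the calling thread can do nothing else, which is exactly the limitation the later models try to remove.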

Synchronous non-blocking I/O

A less efficient variant of synchronous blocking is synchronous non-blocking I/O. In this model, the device is opened in non-blocking mode. This means that a read may not complete the I/O immediately; instead, it may return an error code (EAGAIN or EWOULDBLOCK) indicating that the command cannot be satisfied immediately, as shown in Figure 3.

Non-blocking means that an I/O command may not be satisfied immediately, so the application has to make multiple calls until the operation completes. This can be very inefficient, because in many cases the application must busy-wait until data is available or try to do other work while the command is carried out in the kernel. As shown in Figure 3, this method also introduces latency into the I/O: any gap between the moment the data becomes available in the kernel and the moment the application calls read to fetch it reduces overall data throughput.
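As a rough sketch of this model, the two helpers below put a descriptor into non-blocking mode and make a single read attempt. The names set_nonblocking and try_read, and the -2 return convention, are assumptions made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch an open descriptor to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: make one read attempt.
 * Returns the byte count, 0 at end-of-file, -1 on a real error,
 * or -2 when no data is ready yet and the caller should try again. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller would typically loop on try_read, doing other work whenever it returns -2. The time between data arriving and the next attempt is the latency described above.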

Asynchronous blocking I/O

Another blocking paradigm is non-blocking I/O with blocking notifications. In this model, non-blocking I/O is configured, and then the blocking select system call is used to determine when there is any activity on an I/O descriptor. The interesting thing about the select call is that it can provide notification for many descriptors, not just one. For each descriptor, you can request notification that the descriptor can accept written data, that data is available to read, or that an error has occurred.

The main problem with the select call is that it is not very efficient. Although it is a convenient model for asynchronous notification, it is not recommended for high-performance I/O.
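A minimal sketch of the blocking-notification step, assuming a single descriptor whose value is below FD_SETSIZE. The helper name wait_readable and its return convention are hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: block until fd becomes readable or timeout_ms
 * milliseconds pass. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);            /* ask about read availability only */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

The same fd_set could hold many descriptors, and the second and third set arguments of select are where write readiness and exceptional conditions would be requested.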

Asynchronous non-blocking I/O

Finally, the asynchronous non-blocking I/O model overlaps processing with I/O. The read request returns immediately, indicating that the read was started successfully. The application can then perform other processing while the read completes in the background. When the read response arrives, a signal or a thread-based callback can be generated to complete the I/O transaction.

The ability to overlap computation and multiple I/O requests in a single process exploits the gap between processing speed and I/O speed. While one or more slow I/O requests are pending, the CPU can perform other tasks or, more commonly, operate on already completed I/O while further requests are started.
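To make the overlap concrete, here is a small sketch that starts a read, does some CPU work while the read is pending, and then collects the result. It uses the POSIX AIO calls that are introduced later in this article. The names read_overlapped and busy_work are hypothetical, and the loop in busy_work merely stands in for real processing.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for useful CPU work done while I/O is pending. */
static unsigned long busy_work(unsigned long rounds)
{
    unsigned long acc = 0;

    for (unsigned long i = 0; i < rounds; i++)
        acc += i;
    return acc;
}

/* Hypothetical helper: start a read at offset 0, compute while it is in
 * flight, then wait for it. Returns the bytes read or -1 on error. */
ssize_t read_overlapped(int fd, void *buf, size_t len, unsigned long *work_out)
{
    struct aiocb cb;
    const struct aiocb *list[1] = { &cb };

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) < 0)            /* returns once the request is queued */
        return -1;

    *work_out = busy_work(1000000UL); /* runs while the read is pending */

    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);   /* sleep until the read finishes */

    return aio_return(&cb);
}
```

On older glibc versions the AIO functions live in librt, so the program may need to be linked with -lrt.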

The next section will take a closer look at this model and its API.

Motivation for asynchronous I/O

From the previous classification of I/O models, we can see the motivation for AIO. The blocking models require the initiating application to block when the I/O starts. This means that processing and I/O cannot overlap. The synchronous non-blocking model allows processing and I/O to overlap, but it requires the application to check the status of the I/O repeatedly. This leaves asynchronous non-blocking I/O, which allows processing and I/O to overlap and also provides notification of I/O completion.

The select function (asynchronous blocking I/O) provides functionality similar to AIO, except that it still blocks. The difference is that it blocks while waiting for notifications rather than in the I/O call itself.

Asynchronous I/O in Linux

This section explores the asynchronous I/O model of Linux to help you understand how to apply it in your applications.

AIO first entered the Linux kernel in 2.5 and is now a standard feature of the 2.6 production kernel.

In the traditional I/O model, there is an I/O channel identified by a unique handle. In UNIX®, these are called file descriptors (the same for files, pipes, sockets, and so on). In the blocking I/O model, you start a transfer, and the system call returns when the transfer completes or an error occurs.

In the asynchronous non-blocking I/O model, multiple transfers can be started at the same time. This requires a unique context for each transfer so that it can be identified when it completes. In AIO, a structure called aiocb (AIO I/O control block) plays this role. This structure contains all the information about a transfer, including the user buffer for the data. When an I/O notification (called a completion) occurs, the aiocb structure is provided to uniquely identify the completed I/O. The API section that follows shows how to do this.

AIO API introduction

The AIO interface API is quite simple. It provides the functions needed for data transfer, with several different notification models.

Each of these API functions uses the aiocb structure to start or check a request. This structure has many members; Listing 1 below shows only the ones you need to use.

struct aiocb {

  int aio_fildes;               // File descriptor
  int aio_lio_opcode;           // Valid only for lio_listio (r/w/nop)
  volatile void *aio_buf;       // Data buffer
  size_t aio_nbytes;            // Number of bytes in the data buffer
  off_t aio_offset;             // File offset of the transfer
  struct sigevent aio_sigevent; // Notification structure

  /* Internal fields */
  ...

};

The sigevent structure tells AIO what to do when I/O is completed. This structure will be discussed later. Now let's look at how the various API functions of AIO work and how to use them.


The aio_read function requests an asynchronous read on a valid file descriptor. The descriptor can refer to a file, a socket, or even a pipe. The aio_read function has the following prototype:

int aio_read( struct aiocb *aiocbp );

The aio_read function returns immediately after the request is queued. The return value is 0 on success and -1 on error, in which case errno is set.

To perform a read, the application must initialize the aiocb structure. The following example populates the aiocb request structure and uses aio_read to issue an asynchronous read request (ignoring notification for now). It also shows the use of the aio_error function, which is explained later.

  #include <aio.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <strings.h>

  /* BUFSIZE is not defined in the original listing; this value is assumed */
  #define BUFSIZE 4096

  int main( void )
  {
    int fd, ret;
    struct aiocb my_aiocb;

    fd = open( "file.txt", O_RDONLY );
    if (fd < 0) { perror("open"); return 1; }

    /* Zero out the aiocb structure (recommended) */
    bzero( (char *)&my_aiocb, sizeof(struct aiocb) );

    /* Allocate a data buffer for the aiocb request */
    my_aiocb.aio_buf = malloc(BUFSIZE+1);
    if (!my_aiocb.aio_buf) { perror("malloc"); return 1; }

    /* Initialize the necessary fields in the aiocb */
    my_aiocb.aio_fildes = fd;
    my_aiocb.aio_nbytes = BUFSIZE;
    my_aiocb.aio_offset = 0;

    ret = aio_read( &my_aiocb );
    if (ret < 0) { perror("aio_read"); return 1; }

    /* Busy-wait until the request is no longer in progress */
    while ( aio_error( &my_aiocb ) == EINPROGRESS ) ;

    if ((ret = aio_return( &my_aiocb )) > 0) {
      /* got ret bytes on the read */
    } else {
      /* ret == 0 means end of file; ret < 0 means the read failed */
    }

    return 0;
  }
After opening the file from which you want to read data, clear the aiocb structure and allocate a data buffer.

A reference to the data buffer is placed in aio_buf, and the buffer size is stored in aio_nbytes. aio_offset is set to zero (the first offset in the file), and the file descriptor to read from goes into aio_fildes. After setting these fields, call aio_read to request the read. You can then call aio_error to determine the status of the aio_read. As long as the status is EINPROGRESS, the request has not completed. Otherwise, the request has either succeeded or failed.

Note the similarities to reading from a file with the standard library functions. Besides the asynchronous nature of aio_read, the other difference is how the read offset is set. In a typical read call, the offset is maintained for you in the file descriptor context. After each read, the offset is updated so that subsequent reads address the next block of data.

This is not possible with asynchronous I/O, because many read requests can be in progress at the same time, so you must specify the offset for each request yourself.


aio_error is used to determine the status of the request. Its prototype is:

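int aio_error( const struct aiocb *aiocbp );

The function may return the following values:

  • EINPROGRESS: the request has not yet completed
  • ECANCELED: the request was canceled by the application
  • 0: the request completed successfully
  • any other positive error number: the request failed, and the value is the errno code the equivalent synchronous call would have produced
  • -1: the aio_error call itself failed; consult errno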


Another difference between asynchronous I/O and standard blocking I/O is that you do not have immediate access to the return status of the call, because you are not blocking on the read. In a standard read call, the return status is provided when the function returns. With asynchronous I/O, you use the aio_return function instead. Its prototype is:

ssize_t aio_return( struct aiocb *aiocbp );

Call this function only after aio_error has determined that the request has completed (successfully or with an error). The return value of aio_return is the same as that of the synchronous read or write system call (the number of bytes transferred, or -1 on error).


aio_write is used to perform asynchronous write operations. Its prototype is:

int aio_write( struct aiocb *aiocbp );

The aio_write function returns immediately, indicating that the request has been queued (0 on success, -1 on failure, with errno set accordingly).

This works much like aio_read, but one behavioral difference is worth noting. Recall that the offset is important for read calls. For writes, however, the offset matters only when the file was opened without the O_APPEND option. If O_APPEND is set, the offset is ignored and the data is appended to the end of the file. Otherwise, the aio_offset field determines the offset at which the data is written to the file.
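The following sketch shows one way to issue a single asynchronous write at an explicit offset and collect its result. The helper name write_at is hypothetical; it waits for the request right away, so it illustrates the calls rather than any real overlap of work.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: queue one asynchronous write at the given offset,
 * wait for it to finish, and return the number of bytes written
 * (or -1 with errno set to the error of the request). */
ssize_t write_at(int fd, const void *data, size_t len, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *list[1] = { &cb };
    ssize_t n;
    int err;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = (void *)data;  /* aio_buf is not const-qualified */
    cb.aio_nbytes = len;
    cb.aio_offset = offset;     /* ignored if fd was opened with O_APPEND */

    if (aio_write(&cb) < 0)
        return -1;

    /* Sleep until the request finishes; loop again if a signal wakes us. */
    while ((err = aio_error(&cb)) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    n = aio_return(&cb);        /* also releases the request's resources */
    if (err != 0)
        errno = err;
    return n;
}
```

For example, calling write_at(fd, "XY", 2, 1) on a file that contains "hello" and was opened without O_APPEND leaves the file containing "hXYlo".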


You can use the aio_suspend function to suspend (or block) the calling process until an asynchronous I/O request has completed, a signal is raised, or an optional timeout occurs. The caller provides a list of aiocb references, and the completion of at least one of them causes aio_suspend to return. The prototype of aio_suspend is:

int aio_suspend( const struct aiocb *const cblist[],
                  int n, const struct timespec *timeout );

aio_suspend is very simple to use. Provide a list of aiocb references. If any of them completes, the call returns 0. Otherwise, -1 is returned, indicating that an error has occurred. See the following example:

const struct aiocb *cblist[MAX_LIST];

/* Clear the list. */
bzero( (char *)cblist, sizeof(cblist) );

/* Load one or more references into the list */
cblist[0] = &my_aiocb;

ret = aio_read( &my_aiocb );

ret = aio_suspend( cblist, MAX_LIST, NULL );

Note that the second argument to aio_suspend is the number of elements in cblist, not the number of aiocb references. aio_suspend ignores any NULL elements in cblist.

If a timeout is supplied to aio_suspend and it expires, the call returns -1 and errno contains EAGAIN.
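A short sketch of the timeout case, assuming one outstanding request. The helper name wait_with_timeout and its return convention are hypothetical.

```c
#include <aio.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for one request.
 * Returns 0 once it has completed, 1 if the timeout expired first,
 * and -1 on any other error. */
int wait_with_timeout(const struct aiocb *cb, long timeout_ms)
{
    const struct aiocb *list[1] = { cb };
    struct timespec ts;

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    for (;;) {
        if (aio_suspend(list, 1, &ts) == 0)
            return 0;            /* the request has completed */
        if (errno == EAGAIN)
            return 1;            /* the timeout expired first */
        if (errno != EINTR)
            return -1;
        /* EINTR: a signal interrupted the wait. Retrying restarts the
         * full timeout, which is acceptable for a sketch. */
    }
}
```

After a return value of 0, the caller still needs aio_error and aio_return to find out whether the request succeeded and how many bytes were transferred.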


The aio_cancel function lets you cancel one or all outstanding I/O requests for a given file descriptor. Its prototype is:

int aio_cancel( int fd, struct aiocb *aiocbp );

To cancel a single request, provide the file descriptor and the aiocb reference. If the request is canceled successfully, the function returns AIO_CANCELED. If the request cannot be canceled because it is already in progress, the function returns AIO_NOTCANCELED, and if it has already completed, AIO_ALLDONE.

To cancel all requests for a given file descriptor, provide the file descriptor and a NULL reference for aiocbp. If all requests are canceled, the function returns AIO_CANCELED. If at least one request cannot be canceled, AIO_NOTCANCELED is returned. If all requests have already completed, AIO_ALLDONE is returned. You can then use aio_error to evaluate each individual AIO request. If a request was canceled, aio_error returns ECANCELED and aio_return returns -1.


Finally, AIO provides a way to start multiple transfers at the same time with the lio_listio API function. This function is important because it means you can start many I/O operations in the context of a single system call (that is, one kernel context switch). From a performance perspective, this is well worth exploring. The lio_listio API function has the following prototype:

int lio_listio( int mode, struct aiocb *const list[], int nent,
                   struct sigevent *sig );

The mode argument can be LIO_WAIT or LIO_NOWAIT. LIO_WAIT blocks the call until all I/O has completed. LIO_NOWAIT returns as soon as the operations have been queued. The list argument is a list of aiocb references, and nent gives the maximum number of elements. Note that elements of list may be NULL, in which case lio_listio ignores them. The sigevent reference defines the method of signal notification when all I/O is complete.

A request for lio_listio differs slightly from a typical read or write request, because the operation must be specified. The following example illustrates this.

struct aiocb aiocb1, aiocb2;
struct aiocb *list[MAX_LIST];


/* Prepare the first aiocb */
bzero( (char *)&aiocb1, sizeof(aiocb1) );
aiocb1.aio_fildes = fd;
aiocb1.aio_buf = malloc( BUFSIZE+1 );
aiocb1.aio_nbytes = BUFSIZE;
aiocb1.aio_offset = next_offset;
aiocb1.aio_lio_opcode = LIO_READ;

/* aiocb2 must be prepared in the same way (not shown in this listing) */

bzero( (char *)list, sizeof(list) );
list[0] = &aiocb1;
list[1] = &aiocb2;

ret = lio_listio( LIO_WAIT, list, MAX_LIST, NULL );

The read operation is specified in the aio_lio_opcode field with LIO_READ. For a write operation, use LIO_WRITE; LIO_NOP is also valid and means no operation.
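As a fuller sketch, the helper below reads two consecutive blocks of a file with one lio_listio call in LIO_WAIT mode. The name read_two_blocks and its parameters are hypothetical.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read the first two len-byte blocks of a file with a
 * single lio_listio call. Returns the total bytes read, or -1 if either
 * request failed. */
ssize_t read_two_blocks(int fd, void *first, void *second, size_t len)
{
    struct aiocb cb[2];
    struct aiocb *list[2] = { &cb[0], &cb[1] };
    void *bufs[2] = { first, second };
    ssize_t total = 0;
    int failed;

    memset(cb, 0, sizeof cb);
    for (int i = 0; i < 2; i++) {
        cb[i].aio_fildes = fd;
        cb[i].aio_buf = bufs[i];
        cb[i].aio_nbytes = len;
        cb[i].aio_offset = (off_t)i * (off_t)len;
        cb[i].aio_lio_opcode = LIO_READ;   /* required by lio_listio */
    }

    /* LIO_WAIT: return only after both requests have finished. */
    failed = (lio_listio(LIO_WAIT, list, 2, NULL) != 0);

    for (int i = 0; i < 2; i++) {
        const struct aiocb *one[1] = { &cb[i] };
        ssize_t n;

        /* Only needed if the wait above was interrupted by a signal. */
        while (aio_error(&cb[i]) == EINPROGRESS)
            aio_suspend(one, 1, NULL);

        n = aio_return(&cb[i]);
        if (n < 0)
            failed = 1;
        else
            total += n;
    }
    return failed ? -1 : total;
}
```

With LIO_NOWAIT the call would return as soon as both requests were queued, and completion would have to be detected through one of the notification methods or through aio_suspend.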

AIO notification

Now that you know the available AIO functions, this section will delve into the methods that can be used for asynchronous notification. I'll illustrate asynchronous notification in terms of signals and function callbacks.

Asynchronous notification using signals

Interprocess communication (IPC) using signals is a traditional mechanism in UNIX, and AIO also supports it. In this example, the application defines a signal handler that is called when a specified signal occurs.

The application then specifies that the asynchronous request will signal when the request completes. As part of the signal context, specific aiocb requests are provided to track multiple potentially outstanding requests. The following example demonstrates this notification method.

void aio_completion_handler( int signo, siginfo_t *info, void *context );

void setup_io( ... )
{
  int fd, ret;
  struct sigaction sig_act;
  struct aiocb my_aiocb;   /* must stay valid until the request completes */


  /* Set up the signal handler */
  sigemptyset( &sig_act.sa_mask );
  sig_act.sa_flags = SA_SIGINFO;
  sig_act.sa_sigaction = aio_completion_handler;

  /* Set up the AIO request */
  bzero( (char *)&my_aiocb, sizeof(struct aiocb) );
  my_aiocb.aio_fildes = fd;
  my_aiocb.aio_buf = malloc(BUF_SIZE+1);
  my_aiocb.aio_nbytes = BUF_SIZE;
  my_aiocb.aio_offset = next_offset;

  /* Link the AIO request with the Signal Handler */
  my_aiocb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
  my_aiocb.aio_sigevent.sigev_signo = SIGIO;
  my_aiocb.aio_sigevent.sigev_value.sival_ptr = &my_aiocb;

  /* Map the Signal to the Signal Handler */
  ret = sigaction( SIGIO, &sig_act, NULL );


  ret = aio_read( &my_aiocb );

}


void aio_completion_handler( int signo, siginfo_t *info, void *context )
{
  struct aiocb *req;
  int ret;

  /* Ensure it's our signal */
  if (info->si_signo == SIGIO) {

    req = (struct aiocb *)info->si_value.sival_ptr;

    /* Did the request complete? */
    if (aio_error( req ) == 0) {

      /* Request completed successfully, get the return status */
      ret = aio_return( req );

    }

  }

  return;
}

In this example, the signal handler is set up to catch the SIGIO signal in the aio_completion_handler function.

The aio_sigevent structure is then initialized to raise SIGIO for notification (specified by the SIGEV_SIGNAL value in sigev_notify). When the read completes, the signal handler extracts the particular aiocb from the signal's si_value structure and checks the error status and return status to determine that the I/O is complete.

In terms of performance, the completion handler is an ideal place to continue I/O by requesting the next asynchronous transfer. In this way, when one transmission is completed, the next one starts immediately.

Asynchronous notification using callback

Another notification mechanism is the system callback. Instead of raising a signal for notification, this mechanism calls a function in user space. The aiocb reference is loaded into the sigevent structure to uniquely identify the particular request being completed. Here is an example:

void aio_completion_handler( union sigval sigval );

void setup_io( ... )
{
  int fd, ret;
  struct aiocb my_aiocb;   /* must stay valid until the request completes */


  /* Set up the AIO request */
  bzero( (char *)&my_aiocb, sizeof(struct aiocb) );
  my_aiocb.aio_fildes = fd;
  my_aiocb.aio_buf = malloc(BUF_SIZE+1);
  my_aiocb.aio_nbytes = BUF_SIZE;
  my_aiocb.aio_offset = next_offset;

  /* Link the AIO request with a thread callback */
  my_aiocb.aio_sigevent.sigev_notify = SIGEV_THREAD;
  my_aiocb.aio_sigevent.sigev_notify_function = aio_completion_handler;
  my_aiocb.aio_sigevent.sigev_notify_attributes = NULL;
  my_aiocb.aio_sigevent.sigev_value.sival_ptr = &my_aiocb;


  ret = aio_read( &my_aiocb );

}


void aio_completion_handler( union sigval sigval )
{
  struct aiocb *req;
  int ret;

  req = (struct aiocb *)sigval.sival_ptr;

  /* Did the request complete? */
  if (aio_error( req ) == 0) {

    /* Request completed successfully, get the return status */
    ret = aio_return( req );

  }

  return;
}

In this example, after the aiocb request is created, a thread callback is requested as the notification method using SIGEV_THREAD.

Then, specify a specific notification handler and load the context to pass to the handler (in this case, a reference to the aiocb request itself). In the handler, just convert the incoming sigval pointer and use the AIO function to verify the completion of the request.


Using asynchronous I/O can help you build faster and more efficient I/O applications. If your application can overlap processing and I/O, AIO can help you make more efficient use of the available CPU resources. Although this I/O model differs from the traditional blocking patterns found in most Linux applications, the asynchronous notification model is conceptually simple and can simplify your design.


Added by rami on Thu, 03 Feb 2022 04:46:31 +0200