File redirection and pipeline

Standard file descriptors

What a file descriptor is: a file descriptor is a non-negative integer that indexes an entry in the process's open-file table (numbering starts at 0). To the process, it is a handle for operating on an open file (which may also be a device file or a socket connection). The table entry stores a pointer to the corresponding entry in the system-wide open-file table, which in turn holds the file control block (FCB) for the open file.

One hallmark of Unix (including Linux) is its large collection of small software tools. These tools are built on standard I/O, so redirection and pipes let you combine them flexibly. The standard file descriptors are standard input (stdin), standard output (stdout) and standard error (stderr), which correspond to file descriptors 0, 1 and 2 respectively. By convention, a tool reads its input from standard input, writes its results to standard output, and writes error messages (for example, those produced by perror()) to standard error.

Clearly, our programs never create file descriptors 0, 1 and 2 themselves. These three descriptors are inherited from the parent process (the shell), and by default all of them are connected to the terminal device (that is, the keyboard and the display).

Principle of lowest available file descriptor

Because a file descriptor is essentially an index into the process's open-file table, closing a file releases its table entry, and that descriptor number becomes available again. The next time the process opens a file, it gets the lowest-numbered available descriptor, which keeps the open-file table as small as possible.

We can see from the following program that the process always uses the lowest available file descriptor.

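#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>   /* close() */

int main(void) {
    int fd1, fd2;

    /* 0, 1 and 2 are normally in use already, so these get the next two. */
    fd1 = open("/usr/include/stdio.h", O_RDONLY);
    printf("%d\n", fd1);

    fd2 = open("/usr/include/stdio.h", O_RDONLY);
    printf("%d\n", fd2);

    /* Free descriptor 0, then release the two descriptors opened above. */
    close(0);
    printf("close stdin\n");

    close(fd1);
    close(fd2);

    /* The first open() now reuses 0, the lowest free descriptor. */
    fd1 = open("/usr/include/stdio.h", O_RDONLY);
    printf("%d\n", fd1);

    fd2 = open("/usr/include/stdio.h", O_RDONLY);
    printf("%d\n", fd2);

    close(fd1);
    close(fd2);

    return 0;
}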

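Operation results (the output listing is missing from this copy; when the program starts with only descriptors 0, 1 and 2 open, the expected output is 3, 4, "close stdin", 0 and 3, one per line):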

I/O redirection

Example: using I/O redirection in shell

In this example, who normally writes its output to the terminal. After the output of who is redirected to a file, the file contains what who printed.

By convention, every software tool reads its input from standard input and writes its results to standard output. If we connect standard input to a file in advance, the process still behaves as if it were reading the user's input from the terminal, but it is actually reading data from the file. The same applies to standard output and standard error.

I/O redirection method 1: close... open

close(0) closes standard input, which makes file descriptor 0 available. The file opened next receives descriptor 0 and is therefore connected to standard input. From then on, reading from standard input reads the contents of the file.

close(1) closes standard output, which makes file descriptor 1 available. Provided descriptor 0 is still open, the file opened next receives descriptor 1 and is therefore connected to standard output. From then on, anything written to standard output, for example by printf, goes to the file instead of being displayed on the terminal.

Example: the output of printf is redirected to a file

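#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>   /* close() */

int main(void) {
    int fd;

    close(1);                                     /* free descriptor 1 */
    fd = open("haha", O_WRONLY | O_CREAT, 0644);  /* gets the lowest free descriptor */
    if (fd < 0)
        return 1;

    /* stdout now refers to the file, so this line is written to "haha". */
    printf("%d\n", fd);

    return 0;
}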

I/O redirection method 2: open... close... dup... close

In some cases the file descriptor is already open, for example a socket connection, so it is too late to use method 1. In that case, first close the standard file descriptor to free the lowest descriptor number, then duplicate (dup) the descriptor you want to redirect to. dup returns the lowest available descriptor, so the standard descriptor now refers to the same file. Finally, close the original descriptor. The redirection is complete.

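#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd;

    fd = open("haha", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    close(1);     /* free descriptor 1 */
    dup(fd);      /* the copy lands on 1, the lowest free descriptor */
    close(fd);    /* the original descriptor is no longer needed */

    execlp("who", "who", (char *)NULL);
    perror("execlp");   /* reached only if exec failed */
    return 1;
}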

I/O redirection method 3: open... dup2... close

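Method 3 is a simplification of method 2: the close(1) and dup(fd) pair is replaced by a single dup2(fd, 1) call (or dup2(fd, 0) when redirecting standard input). dup2 closes the target descriptor first if it is open.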

The schematic diagram of I/O redirection methods 1 and 2 is as follows:

Pipes

What we usually call a pipe is an unnamed pipe. Intuitively, it feeds the output of one process to another process as its input. More than two processes can also be chained together with pipes.

For example, in ls | sort, sort sorts the output of ls.

A pipe connects the standard output of one process to the standard input of another. Because tools conventionally read from standard input and write to standard output, a data stream can be processed by a series of tools in turn.

Note:

  1. The underlying part of a pipe lives in the kernel; that is, the pipe is managed and controlled by the kernel.
  2. A pipe has a direction, just like the drain pipe at home: it is not two-way.

Pipe creation

A pipe can be created within a single process with the pipe() system call.

See the manual page pipe(2). The prototype of pipe is int pipe(int pipefd[2]); where pipefd[0] is the file descriptor of the read end and pipefd[1] is the file descriptor of the write end.

Next, duplicate the process with the fork() system call. Because the child inherits the parent's open file descriptors, both processes now hold both ends of the pipe, which is the state shown in the following figure.

At this point a pipe exists between the parent and the child. Each process closes the file descriptors it does not need and then runs the program we want it to run. The unnamed pipe is complete.

Example: this program demonstrates the direction of the two ends of a pipe.

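#include <stdio.h>
#include <unistd.h>

int main(void) {
    int pipefd[2];
    char buf[1024];
    int n;

    if (pipe(pipefd) < 0) {
        perror("pipe");
        return 1;
    }

    write(pipefd[1], "haha", 4);             /* pipefd[1] is the write end */
    n = read(pipefd[0], buf, sizeof(buf));   /* pipefd[0] is the read end */
    if (n < 0) {
        perror("read");
        return 1;
    }

    printf("%.*s\n", n, buf);
    return 0;
}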

Example: implement cat /usr/include/stdio.h | sort

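#include <stdio.h>
#include <unistd.h>

int main(void) {
    int pipefd[2];
    int rv;

    if (pipe(pipefd) < 0) {
        perror("pipe");
        return 1;
    }

    rv = fork();
    if (rv < 0) {
        perror("fork");
        return 1;
    }

    if (0 == rv) {
        /* Child: the read end becomes standard input, then run sort. */
        close(pipefd[1]);
        close(0);
        dup(pipefd[0]);
        close(pipefd[0]);
        execlp("sort", "sort", (char *)NULL);
    } else {
        /* Parent: the write end becomes standard output, then run cat. */
        close(pipefd[0]);
        close(1);
        dup(pipefd[1]);
        close(pipefd[1]);
        execlp("cat", "cat", "/usr/include/stdio.h", (char *)NULL);
    }

    perror("execlp");   /* reached only if exec failed */
    return 1;
}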

Keywords: Linux Unix server

Added by twilightnights on Wed, 02 Feb 2022 07:53:53 +0200