10 ways to read files in Go

There are many ways to read and write file contents in Go, most of which are higher-level wrappers around the syscall or os packages. Different packages suit different scenarios. To help beginners avoid detours, I took some time to sort them out.

This is the first of two articles. It introduces 10 ways of reading files; the ways of writing files will follow in a couple of days.

# 1. The entire file is read into memory

Reading the whole file into memory in a single call is the simplest approach and is usually the fastest, but it is only suitable for small files. For large files it is a poor fit, because the entire content has to be held in memory at once.

1.1 Read directly by file name

There are two ways.

First: use os.ReadFile

package main

import (
    "fmt"
    "os"
)

func main() {
    content, err := os.ReadFile("a.txt")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(content))
}

Second: use ioutil.ReadFile

package main

import (
    "io/ioutil"
    "fmt"
)

func main() {
    content, err := ioutil.ReadFile("a.txt")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(content))
}

In fact, starting with Go 1.16, ioutil.ReadFile is equivalent to os.ReadFile. The two are exactly the same, as the ioutil source shows:

// ReadFile reads the file named by filename and returns the contents.
// A successful call returns err == nil, not err == EOF. Because ReadFile
// reads the whole file, it does not treat an EOF from Read as an error
// to be reported.
//
// As of Go 1.16, this function simply calls os.ReadFile.
func ReadFile(filename string) ([]byte, error) {
    return os.ReadFile(filename)
}

1.2 Create a file handle before reading

If you only need to read, you can use the higher-level function os.Open:

package main

import (
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    file, err := os.Open("a.txt")
    if err != nil {
        panic(err)
    }
    defer file.Close()
    content, err := ioutil.ReadAll(file)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(content))
}

os.Open is called a higher-level function because it is simply os.OpenFile in read-only mode:

// Open opens the named file for reading. If successful, methods on
// the returned file can be used for reading; the associated file
// descriptor has mode O_RDONLY.
// If there is an error, it will be of type *PathError.
func Open(name string) (*File, error) {
    return OpenFile(name, O_RDONLY, 0)
}

Therefore, you can also call os.OpenFile directly; it just takes two more parameters:

package main

import (
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    file, err := os.OpenFile("a.txt", os.O_RDONLY, 0)
    if err != nil {
        panic(err)
    }
    defer file.Close()
    content, err := ioutil.ReadAll(file)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(content))
}

# 2. Read only one line at a time

Reading all the data at once uses too much memory, so you can read just one line at a time instead. bufio.Reader offers three methods for this:

  1. ReadLine()

  2. ReadBytes('\n')

  3. ReadString('\n')

The source code comments of bufio say that ReadLine() is a low-level line-reading primitive, and that most callers should use ReadBytes('\n') or ReadString('\n') instead to read a single line of data.

Therefore, ReadLine() will not be introduced here.

2.1 Using ReadBytes

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strings"
)

func main() {
    // Create the file handle
    fi, err := os.Open("christmas_apple.py")
    if err != nil {
        panic(err)
    }
    defer fi.Close()

    // Create the Reader
    r := bufio.NewReader(fi)

    for {
        lineBytes, err := r.ReadBytes('\n')
        if err != nil && err != io.EOF {
            panic(err)
        }
        // At EOF, lineBytes may still hold a final line without a trailing '\n'
        if len(lineBytes) > 0 {
            fmt.Println(strings.TrimSpace(string(lineBytes)))
        }
        if err == io.EOF {
            break
        }
    }
}

2.2 Using ReadString

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strings"
)

func main() {
    // Create the file handle
    fi, err := os.Open("a.txt")
    if err != nil {
        panic(err)
    }
    defer fi.Close()

    // Create the Reader
    r := bufio.NewReader(fi)

    for {
        line, err := r.ReadString('\n')
        if err != nil && err != io.EOF {
            panic(err)
        }
        // At EOF, line may still hold a final line without a trailing '\n'
        if len(line) > 0 {
            fmt.Println(strings.TrimSpace(line))
        }
        if err == io.EOF {
            break
        }
    }
}

# 3. Only a fixed number of bytes are read at a time

Reading only one line at a time solves the problem of excessive memory consumption. Note, however, that not all files contain the line break \n.

Therefore, for some large files without line breaks, we have to think of other ways.

3.1 Using the os package

The common approach is:

  • First create a file handle with os.Open or os.OpenFile

  • Then create a Reader with bufio.NewReader

  • Then call the Reader's Read method in a for loop to read only a fixed number of bytes at a time.

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

func main() {
    // Create the file handle
    fi, err := os.Open("a.txt")
    if err != nil {
        panic(err)
    }
    defer fi.Close()

    // Create the Reader
    r := bufio.NewReader(fi)

    // Read up to 1024 bytes at a time
    buf := make([]byte, 1024)
    for {
        n, err := r.Read(buf)
        if err != nil && err != io.EOF {
            panic(err)
        }

        if n == 0 {
            break
        }
        // Print without adding a newline, so chunk boundaries do not alter the content
        fmt.Print(string(buf[:n]))
    }
}

3.2 Using the syscall package

The os package is essentially built on top of the syscall package, but because syscall is so low-level, it is generally not used directly unless there is a special need.

For the sake of completeness, here is an example that uses syscall.

In this example, up to 100 bytes of data are read at a time and sent to a channel, and another goroutine receives them from the channel and prints them.

package main

import (
    "fmt"
    "sync"
    "syscall"
)

func main() {
    fd, err := syscall.Open("christmas_apple.py", syscall.O_RDONLY, 0)
    if err != nil {
        fmt.Println("Failed on open: ", err)
        return
    }
    defer syscall.Close(fd)

    var wg sync.WaitGroup
    wg.Add(2)
    dataChan := make(chan []byte)

    // Producer: read up to 100 bytes at a time and send them to the channel
    go func() {
        defer wg.Done()
        defer close(dataChan)
        for {
            data := make([]byte, 100)
            n, err := syscall.Read(fd, data)
            if err != nil {
                fmt.Println("Failed on read: ", err)
                return
            }
            if n == 0 {
                return // end of file
            }
            // Only the first n bytes are valid
            dataChan <- data[:n]
        }
    }()

    // Consumer: print each chunk until the channel is closed
    go func() {
        defer wg.Done()
        for data := range dataChan {
            fmt.Print(string(data))
        }
    }()
    wg.Wait()
}

Keywords: Go

Added by Codein on Tue, 04 Jan 2022 05:46:33 +0200