Distributed file storage: simple use and principle analysis of FastDFS

Introduction

FastDFS is a distributed file system that is well suited to small and medium-sized projects. I first came across it when I took over maintenance of my company's image service. This article summarizes what I have learned about FastDFS.

The cluster is deployed on two Alibaba Cloud servers, each with 2 cores and 4 GB of memory. For the specific deployment steps, please refer to: https://github.com/happyfish100/fastdfs/wiki

1. Overview of FastDFS distributed file system

FastDFS is a lightweight, open-source distributed file system written by Yu Qing, a senior architect at Taobao. It addresses distributed file storage and highly concurrent access, provides load balancing, is suited to storing images, videos, documents and other files, and supports adding storage servers online.

2. FastDFS architecture

The FastDFS server has two roles: tracker and storage. Tracker is mainly used for scheduling and load balancing. Storage is responsible for file access, synchronization and other operations.

FastDFS system structure:

2.1 Client

The client is whatever accesses the FastDFS distributed storage, typically a back-end application.

2.2 Tracker

The Tracker has two functions in a FastDFS cluster:

  • It manages the Storage cluster. When a Storage service starts, it registers itself with the Tracker and then regularly reports its status, including remaining disk space, file synchronization status, upload and download counts, and other statistics.
  • It balances load. Before the Client accesses a Storage service, it must first ask the Tracker, which dynamically returns the connection information of a Storage service.

2.3 Storage

Storage is the data storage server, where files and their metadata are stored. It has the following characteristics:

  • Data is stored in a highly available way.
  • In a FastDFS cluster, Storage provides services by group (also called a volume). Storage servers in different groups do not communicate with each other, while Storage servers in the same group connect to each other to synchronize files.
  • The Storage service uses binlog files to record file uploads, deletions and other update operations. The binlog records only the file name, not the content.
  • File synchronization happens only between Storage services in the same group and uses push mode: the source server pushes changes to the target server.
  • FastDFS saves each file and its description information (MetaData) in the Storage service. After a file is stored, a unique file ID is returned. The file ID consists of the group name and the file name. MetaData is descriptive information about the file, such as width=1024,height=768.
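
To make the file ID structure concrete, here is a minimal sketch that splits a file ID into its group name and file name. The class FileId is a hypothetical helper, not part of FastDFS or its Java client, and the sample ID used with it is made up. It assumes only the "group name/file name" layout described above.

```java
// Hypothetical helper: not part of FastDFS or fastdfs-client-java.
final class FileId {
    final String groupName;      // for example "group1"
    final String remoteFileName; // for example "M00/00/00/example.jpg"

    private FileId(String groupName, String remoteFileName) {
        this.groupName = groupName;
        this.remoteFileName = remoteFileName;
    }

    /** Splits "group/path" at the first '/' into group name and file name. */
    static FileId parse(String fileId) {
        int slash = fileId.indexOf('/');
        // Reject IDs with no slash, an empty group name, or an empty file name.
        if (slash <= 0 || slash == fileId.length() - 1) {
            throw new IllegalArgumentException("not a valid file ID: " + fileId);
        }
        return new FileId(fileId.substring(0, slash), fileId.substring(slash + 1));
    }

    /** Joins the two parts back into the full file ID. */
    @Override
    public String toString() {
        return groupName + "/" + remoteFileName;
    }
}
```

For example, FileId.parse("group1/M00/00/00/example.jpg") yields the group name group1 and the file name M00/00/00/example.jpg.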

3. Principle of file upload

The principle of file upload is as follows:

  1. The Client asks the Tracker which Storage it can upload to.
  2. The Tracker returns the connection information of an available Storage.
  3. The Client communicates directly with that Storage to upload the file.
  4. After the Storage saves the file, it returns the file ID (group name, file name) to the Client.

4. File download principle

The principle of file download is as follows:

  1. The Client asks the Tracker which Storage holds the file. The parameter is the file ID (group name, file name).
  2. The Tracker returns an available Storage.
  3. The Client communicates directly with that Storage to download the file.
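
The upload and download steps in sections 3 and 4 can be sketched as a small in-memory model. This is not the FastDFS client API: SimTracker, SimStorage and SimClient are hypothetical classes, each group holds a single Storage, the Tracker picks upload targets round-robin as a simple stand-in for its real selection logic, and the generated file names are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of the upload/download flow; all class names are hypothetical.
final class SimStorage {
    final String group;
    private final Map<String, byte[]> files = new HashMap<>();
    private int counter = 0;

    SimStorage(String group) { this.group = group; }

    /** Saves the content and returns the file ID (group name + file name). */
    String upload(byte[] content, String ext) {
        String name = "M00/00/00/file" + (counter++) + "." + ext; // invented naming
        files.put(name, content.clone());
        return group + "/" + name;
    }

    /** Returns a copy of the stored content, or null if the file is unknown. */
    byte[] download(String remoteFileName) {
        byte[] data = files.get(remoteFileName);
        return data == null ? null : data.clone();
    }
}

final class SimTracker {
    private final List<SimStorage> storages = new ArrayList<>();
    private int next = 0;

    /** A Storage registers itself with the Tracker when it starts. */
    void register(SimStorage s) { storages.add(s); }

    /** Upload steps 1-2: return a Storage to upload to (round-robin here). */
    SimStorage queryStoreStorage() {
        if (storages.isEmpty()) throw new IllegalStateException("no storage registered");
        SimStorage s = storages.get(next % storages.size());
        next++;
        return s;
    }

    /** Download steps 1-2: find a Storage in the group named by the file ID. */
    SimStorage queryFetchStorage(String fileId) {
        String group = fileId.substring(0, fileId.indexOf('/'));
        for (SimStorage s : storages) {
            if (s.group.equals(group)) return s;
        }
        throw new IllegalArgumentException("unknown group: " + group);
    }
}

final class SimClient {
    private final SimTracker tracker;

    SimClient(SimTracker tracker) { this.tracker = tracker; }

    String upload(byte[] content, String ext) {
        SimStorage storage = tracker.queryStoreStorage(); // ask the Tracker first
        return storage.upload(content, ext);              // then talk to the Storage directly
    }

    byte[] download(String fileId) {
        SimStorage storage = tracker.queryFetchStorage(fileId);
        String remoteFileName = fileId.substring(fileId.indexOf('/') + 1);
        return storage.download(remoteFileName);
    }
}
```

The point of the model is the division of labor: the Tracker only answers "which Storage?", and the file bytes travel between Client and Storage without passing through the Tracker.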

5. File synchronization principle

  • Storage services in the same group are peers. Uploads, deletions and other operations can be performed on any of them, and the data is then synchronized within the group.
  • File synchronization (upload, delete, update) uses push mode: the source server pushes changes to the target server.
  • Only source data is synchronized. If backup data (data that was itself received through synchronization) were synchronized again, it would form a loop.
  • When a new Storage service is added, an existing Storage synchronizes all of its data (both source data and backup data) to the new server.
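
The rule that only source data is pushed can be illustrated with a simplified model. SyncNode and its fields are hypothetical names, the operation codes 'C' and 'D' are chosen for illustration, and the per-peer position map merely plays the role of a progress marker. The full copy to a newly added server is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of push-style sync within one group; names are hypothetical.
final class SyncNode {
    /** One binlog entry: the operation, the file name, and where it originated. */
    static final class Record {
        final char op;          // 'C' = create, 'D' = delete (illustrative codes)
        final String fileName;
        final boolean source;   // true: a client wrote here; false: received via sync

        Record(char op, String fileName, boolean source) {
            this.op = op;
            this.fileName = fileName;
            this.source = source;
        }
    }

    final String name;
    final Map<String, byte[]> files = new HashMap<>();
    final List<Record> binlog = new ArrayList<>();
    // Per-peer read position in this node's binlog.
    private final Map<SyncNode, Integer> marks = new HashMap<>();

    SyncNode(String name) { this.name = name; }

    /** A client uploads directly to this node: logged as source data. */
    void clientUpload(String fileName, byte[] content) {
        files.put(fileName, content);
        binlog.add(new Record('C', fileName, true));
    }

    /** A client deletes directly on this node: logged as source data. */
    void clientDelete(String fileName) {
        files.remove(fileName);
        binlog.add(new Record('D', fileName, true));
    }

    /** Applies a pushed operation and logs it as backup data. */
    private void applyFromPeer(Record r, byte[] content) {
        if (r.op == 'C') {
            files.put(r.fileName, content);
        } else {
            files.remove(r.fileName);
        }
        binlog.add(new Record(r.op, r.fileName, false));
    }

    /** Pushes not-yet-synced source records to one peer; returns how many were pushed. */
    int pushTo(SyncNode peer) {
        int pos = marks.getOrDefault(peer, 0);
        int pushed = 0;
        for (; pos < binlog.size(); pos++) {
            Record r = binlog.get(pos);
            if (!r.source) {
                continue; // skipping backup data is what prevents a sync loop
            }
            peer.applyFromPeer(r, files.get(r.fileName));
            pushed++;
        }
        marks.put(peer, pos);
        return pushed;
    }
}
```

With two nodes a and b that each receive one client upload, a.pushTo(b) followed by b.pushTo(a) transfers exactly one record in each direction; the record b received from a is logged as backup data and is never pushed back.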

6. Server file directory

6.1 TrackerServer

${base_path}
|__data
|     |__storage_groups.dat: group information
|     |__storage_servers.dat: storage server list
|__logs
      |__trackerd.log: tracker server log file

6.2 StorageServer

${base_path}
|__data
|     |__.data_init_flag: initialization information of the current storage server
|     |__storage_stat.dat: statistics of the current storage server
|     |__sync: files related to data synchronization
|     |     |__binlog.index: index of the current binlog file
|     |     |__binlog.###: update operation records (logs)
|     |     |__${ip_addr}_${port}.mark: synchronization progress for the given storage server
|     |
|     |__first-level directories: 256 directories for data files, such as 00, 1F
|           |__second-level directories: 256 directories for data files
|__logs
      |__storaged.log: storage server log file
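
The two-level data directory naming in the tree above can be shown with a tiny sketch. DataDirs is a hypothetical helper, and it assumes the directory names are two-digit uppercase hexadecimal numbers, as the examples 00 and 1F suggest. It says nothing about how FastDFS chooses a directory for a given file.

```java
// Hypothetical helper; assumes two-digit uppercase hex directory names.
final class DataDirs {
    static final int DIRS_PER_LEVEL = 256;

    /** Returns the relative path "XX/YY" for two directory indexes in 0..255. */
    static String subDir(int first, int second) {
        if (first < 0 || first >= DIRS_PER_LEVEL || second < 0 || second >= DIRS_PER_LEVEL) {
            throw new IllegalArgumentException("index out of range 0..255");
        }
        return String.format("%02X/%02X", first, second);
    }

    /** Total number of second-level directories under one data directory. */
    static int totalLeafDirs() {
        return DIRS_PER_LEVEL * DIRS_PER_LEVEL;
    }
}
```

Under this assumption, index 0 maps to 00 and index 31 maps to 1F, and the two levels together provide 256 × 256 leaf directories.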

7. Communication protocol between server and client

7.1 Introduction to the communication protocol

The FastDFS server uses a custom communication protocol when communicating with the client.

A protocol packet consists of two parts: a header and a body.

  • The header is 10 bytes in total, in the following format:
    • 8 bytes: body length
    • 1 byte: command
    • 1 byte: status
  • The format of the body depends on the specific command. The body can be empty.
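
As a rough illustration of the header layout, the following sketch encodes and decodes the 10 bytes with java.nio.ByteBuffer. ProtoHeader is a hypothetical class, not the client library's own implementation. It assumes the 8-byte body length is stored in big-endian (network) byte order, which the description above does not state.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating the 10-byte header layout.
final class ProtoHeader {
    static final int HEADER_LENGTH = 10;

    final long bodyLength;
    final byte command;
    final byte status;

    ProtoHeader(long bodyLength, byte command, byte status) {
        this.bodyLength = bodyLength;
        this.command = command;
        this.status = status;
    }

    /** Encodes the header: 8-byte body length, 1-byte command, 1-byte status. */
    byte[] encode() {
        // ByteBuffer writes multi-byte values big-endian by default (assumed order).
        return ByteBuffer.allocate(HEADER_LENGTH)
                .putLong(bodyLength)
                .put(command)
                .put(status)
                .array();
    }

    /** Decodes a 10-byte header produced by encode(). */
    static ProtoHeader decode(byte[] header) {
        if (header.length != HEADER_LENGTH) {
            throw new IllegalArgumentException("header must be " + HEADER_LENGTH + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        long bodyLength = buf.getLong();
        byte command = buf.get();
        byte status = buf.get();
        return new ProtoHeader(bodyLength, command, status);
    }
}
```

A receiver would read exactly 10 bytes, decode them, and then read bodyLength more bytes as the body.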

7.2 Command codes and communication status codes

7.2.1 Tracker management command codes

Name                                             Command
Delete storage                                   93
Get download node (QUERY_FETCH_ONE)              102
Get update node (QUERY_UPDATE)                   103
Get a storage node without specifying a group    101
Get a storage node by group                      104
Get group list                                   91
Get storage node list                            92

7.2.2 Storage file command codes

Name                      Command  Description
File upload               11       General file upload; the uploaded file is a master file
Upload slave file         21       Upload a slave file; for example, the master file is xxx.jpg and the slave file (thumbnail) is xxx-150_150.jpg
Delete file               12       Delete a file
Set file metadata         13       Upload file metadata such as creation date, labels, etc.
File download             14
Get file metadata         15
Query file information    22       Query file information
Create resumable file     23       Create a file that supports resumable upload
Resumable upload          24       Upload to a file that supports resumable upload; for example, cut a large file into several parts and upload them separately
File modification         34       Modify a file that supports resumable upload
Truncate file             36       Truncate (clear) a file that supports resumable upload

7.2.3 Message communication status codes

Name                                    Code
Client closes the connection            82
Connection status check                 111
Server returns the message correctly    100
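
To show how these codes sit in the 10-byte header, here is a small sketch with a few of the values from the tables as constants. ProtoCodes and its method names are hypothetical. The sketch also assumes that a status byte of 0 means success, which the tables above do not spell out.

```java
// Hypothetical constants class; the numeric values come from the tables above.
final class ProtoCodes {
    static final byte CMD_UPLOAD_FILE = 11;
    static final byte CMD_DOWNLOAD_FILE = 14;
    static final byte CMD_QUIT = 82;          // client closes the connection
    static final byte CMD_RESPONSE = 100;     // server returns the message correctly
    static final byte CMD_ACTIVE_TEST = 111;  // connection status check

    /** Builds a header for a request that has no body. */
    static byte[] emptyRequest(byte command) {
        byte[] header = new byte[10]; // bytes 0-7 (body length) stay zero
        header[8] = command;          // byte 8: command
        header[9] = 0;                // byte 9: status, unused in requests here
        return header;
    }

    /** True if a reply header carries the response command and status 0 (assumed success). */
    static boolean isOk(byte[] replyHeader) {
        return replyHeader.length == 10
                && replyHeader[8] == CMD_RESPONSE
                && replyHeader[9] == 0;
    }
}
```

A connection check, for instance, would send emptyRequest(CMD_ACTIVE_TEST) and then test the 10-byte reply with isOk.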

8. Simple use

I use fastdfs-client-java-1.27-SNAPSHOT.jar, from the happyfish100/fastdfs-client-java repository.

The library had received no updates since June 5, 2017, but its code has recently started to be updated again, so it looks like it will continue to be maintained.

A simple connection pool wrapper around the client makes it more convenient to use:

  • Connections are created at system startup and managed by the pool
  • A heartbeat confirms whether each connection is still usable
  • The connection pool is created in builder style
  • The client is used in callback style

Source address:

ClawHub/FastDFS-Pool

The following is the core code:

1.1 Initialize the connection pool

    /**
     * Builds the FastDFS connection pool.
     *
     * @return this pool, initialized and ready for use
     */
    public FastDFSConnPool build() {
        // Initialize the idle connection queue, bounded by maxPoolSize
        idleConnectionPool = new LinkedBlockingQueue<>(maxPoolSize);
        // Load the global client settings from the configuration file
        try {
            ClientGlobal.init(confFileName);
        } catch (IOException | MyException e) {
            throw new RuntimeException("init client global exception.", e);
        }
        // Pre-create minPoolSize tracker connections
        for (int i = 0; i < minPoolSize; i++) {
            TrackerServer trackerServer = createTrackerServer();
            if (trackerServer != null) {
                // Put the new connection into the idle queue
                idleConnectionPool.offer(trackerServer);
            }
        }
        // Start the heartbeat that checks idle connections
        new HeartBeat(this).beat();

        return this;
    }

1.2 Client execution request

    /**
     * Runs a callback with a storage client backed by a pooled tracker connection.
     *
     * @param <T>    the result type of the callback
     * @param invoke the operation to run against the storage client
     * @return the callback's result
     */
    public <T> T processFdfs(CallBack<T> invoke) {
        TrackerServer trackerServer = null;
        try {
            // Borrow a tracker connection from the pool
            trackerServer = fastDFSConnPool.checkOut();
            // Create a storage client on top of the tracker connection
            StorageClient1 storageClient = new StorageClient1(trackerServer, null);
            // Run the caller's operation
            T t = invoke.invoke(storageClient);
            // Return the healthy connection to the pool
            fastDFSConnPool.checkIn(trackerServer);
            return t;
        } catch (Exception e) {
            // The connection may be broken: drop it instead of returning it
            if (trackerServer != null) {
                fastDFSConnPool.drop(trackerServer);
            }
            throw new RuntimeException(e);
        }
    }
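
The checkOut, checkIn and drop calls used above are not shown in this article. The following is a minimal sketch of how such a pool could work, written generically so that it runs without the FastDFS client. SimplePool is a hypothetical class; the real FastDFSConnPool in ClawHub/FastDFS-Pool may behave differently, for example by closing sockets in drop or by retrying failed connections.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical generic pool; not the actual FastDFSConnPool implementation.
final class SimplePool<C> {
    private final LinkedBlockingQueue<C> idle;
    private final Supplier<C> factory;
    private final int maxPoolSize;
    private final long waitSeconds;
    private final AtomicInteger total = new AtomicInteger(); // idle + checked out

    SimplePool(int maxPoolSize, long waitSeconds, Supplier<C> factory) {
        this.idle = new LinkedBlockingQueue<>(maxPoolSize);
        this.maxPoolSize = maxPoolSize;
        this.waitSeconds = waitSeconds;
        this.factory = factory;
    }

    /** Borrows a connection: reuse an idle one, create one if under the cap, else wait. */
    C checkOut() {
        C conn = idle.poll();
        if (conn != null) {
            return conn;
        }
        // Reserve a slot atomically so concurrent callers cannot exceed maxPoolSize.
        while (true) {
            int n = total.get();
            if (n >= maxPoolSize) {
                break;
            }
            if (total.compareAndSet(n, n + 1)) {
                try {
                    return factory.get();
                } catch (RuntimeException e) {
                    total.decrementAndGet(); // creation failed: release the slot
                    throw e;
                }
            }
        }
        try {
            // Pool is at capacity: wait for another caller to check a connection in.
            conn = idle.poll(waitSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        if (conn == null) {
            throw new IllegalStateException("no connection available");
        }
        return conn;
    }

    /** Returns a healthy connection to the idle queue. */
    void checkIn(C conn) {
        if (conn != null && !idle.offer(conn)) {
            total.decrementAndGet(); // queue unexpectedly full: forget the connection
        }
    }

    /** Forgets a broken connection so a replacement can be created later. */
    void drop(C conn) {
        if (conn != null) {
            total.decrementAndGet(); // a real pool would also close the socket here
        }
    }

    /** Number of connections currently owned by the pool (idle plus checked out). */
    int size() {
        return total.get();
    }
}
```

Separating checkIn from drop is the key design choice: a connection that failed during use is never handed to the next caller, and dropping it frees a slot so the pool can create a fresh one.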

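1.3 Heartbeat

    /**
     * Heartbeat task: checks every idle connection and discards dead ones.
     */
    private class HeartBeatTask implements Runnable {

        @Override
        public void run() {
            LinkedBlockingQueue<TrackerServer> idleConnectionPool = fastDFSConnPool.getIdleConnectionPool();
            // Check each connection that is idle right now exactly once
            int size = idleConnectionPool.size();
            for (int i = 0; i < size; i++) {
                // Declared inside the loop so a failure never refers to a previous connection
                TrackerServer ts = null;
                try {
                    ts = idleConnectionPool.poll(fastDFSConnPool.getWaitTimes(), TimeUnit.SECONDS);
                    if (ts == null) {
                        // No idle connection left
                        break;
                    }
                    if (ProtoCommon.activeTest(ts.getSocket())) {
                        // Still alive: put it back into the idle queue
                        idleConnectionPool.add(ts);
                    } else {
                        // The server answered with an error status: discard it
                        fastDFSConnPool.drop(ts);
                    }
                } catch (Exception e) {
                    // The connection is dead: drop it so that it can be rebuilt
                    logger.error("heartbeat found a dead connection, reconnecting.", e);
                    if (ts != null) {
                        fastDFSConnPool.drop(ts);
                    }
                }
            }
        }
    }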

1.4 Usage

        // Initialize the connection pool
        FastDFSConnPool fastDFSConnPool = new FastDFSConnPool()
                .confFileName("./config/fdfs_client.conf")
                .maxPoolSize(8)
                .minPoolSize(1)
                .reConnNum(2)
                .waitTimes(2).build();

        // Create the client
        FastDFSClient client = new FastDFSClient(fastDFSConnPool);
        // Upload. fileName: full path of the local file; extName: file extension without the dot (.); meta: file metadata
        String parts = client.processFdfs(storageClient -> storageClient.upload_file1("fileName", "extName", new NameValuePair[0]));
        // Download. fileId looks like: group1/M00/00/00/wkgrsvjtwpsaxgwkaaweezrjw471.jpg
        byte[] bytes = client.processFdfs(storageClient -> storageClient.download_file1("fileId"));
        // Delete. -1 means failure, 0 means success
        int result = client.processFdfs(storageClient -> storageClient.delete_file1("fileId"));
        // Get file information from the remote server. groupName: the file's group name, for example group1; remoteFileName: for example M00/00/00/wkgrsvjtwpsaxgwkaaaweezrjw471.jpg
        FileInfo fileInfo = client.processFdfs(storageClient -> storageClient.get_file_info("groupName", "remoteFileName"));



Added by DigitalExpl0it on Thu, 12 Dec 2019 15:55:27 +0200