Hadoop: getting familiar with common HDFS operations and installing IDEA

Getting familiar with common HDFS operations

1. Create a txt file in the "/home/hadoop/" directory of the local Linux file system and enter a few words of your choice
2. View the file's location locally (ls)
3. Display the file's content locally

cd /home/hadoop
ls
touch test.txt
vim test.txt 
cat test.txt 

4. Use a command to upload the txt file from the local file system to the input directory under the current user's home directory in HDFS.
5. List the files in HDFS (-ls)
6. Display the content of this file in HDFS

cd /usr/local/hadoop
./sbin/start-all.sh # Start the services
./bin/hdfs dfs -put /home/hadoop/test.txt input # Upload the file
./bin/hdfs dfs -ls input # List the files
./bin/hdfs dfs -cat input/test.txt # Display the file content

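If the upload fails

The console may print a message such as: SASL SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

This line is an INFO-level log message and is not an error by itself. If the file really was not uploaded, first check that the target directory exists, for example by creating it with hdfs dfs -mkdir -p input.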

Common commands:

hdfs dfs -mkdir  /input         #Create the /input folder
hdfs dfs -rm -r /input          #Delete the /input folder and its contents
hdfs dfs -put a.log /input      #Upload a.log to the /input folder
hdfs dfs -get /input/a.log      #Download a.log to the local computer
hdfs dfs -cat /input/a.log      #View the contents of a.log
hdfs dfs -rmdir /input          #Delete the /input directory (only if it is empty)

Experimental content

Upload file:

echo "test"+$(date) > test.txt
echo $(date) > time.txt
#Upload file
hdfs dfs -put ./test.txt input 
hdfs dfs -put ./time.txt input 

Append to a file:

hdfs dfs -appendToFile time.txt input/test.txt
hdfs dfs -ls input
hdfs dfs -cat input/test.txt

Overwrite remote files from local:

hdfs dfs -copyFromLocal -f time.txt input/test.txt
hdfs dfs -cat input/test.txt

The content of time.txt is displayed.

Download a file

and rename it if a local file with the same name already exists

if [ -f ./test.txt ]
then 
hadoop fs -get input/test.txt ./test2.txt
else
hadoop fs -get input/test.txt ./test.txt
fi

Display the read/write permissions, size, creation time, path, and other information of a specified file in HDFS

hdfs dfs -ls input/test.txt
hdfs dfs -ls  -h input/test.txt  # -h shows the file size in human-readable format (K, M, etc.)
hdfs dfs -ls  -t input/test.txt  # -t sorts the output by modification time (most recent first)
hdfs dfs -ls  -S input/test.txt  # -S sorts the output by file size

Given a directory in HDFS, output the read/write permissions, size, creation time, path, and other information of all files in the directory. If an entry is itself a directory, recursively output the information for all files in it:

hadoop fs -mkdir -p /input/1/2/3/4 # -p creates missing parent directories
hadoop fs -ls -R /input

Given a path to a file in HDFS, create and delete the file. If the directory containing the file does not exist, create it automatically:

vim 6.sh
#Write the following
#!/bin/bash
hadoop fs -test -d /input/6
if [ "$?" != "0" ]
then 
hadoop fs -mkdir -p /input/6    # Create the parent directory if it is missing
fi
hadoop fs -touchz /input/6/6.txt    # Create the empty file

#Add execution permission to the script
chmod +x 6.sh
#Execute script
./6.sh
#View run results
hdfs dfs -ls -R /input
#Delete file
hdfs dfs -rm /input/6/6.txt
#View run results
hdfs dfs -ls -R /input

Given a path to a directory in HDFS, create and delete the directory. When creating it, create any missing parent directories automatically; when deleting it, let the user decide whether to delete the directory if it is not empty:

vim 7.sh
#Write the following
#!/bin/bash
hadoop fs -test -d /input/7
if [ "$?" != "0" ]
then 
hadoop fs -mkdir -p /input/7
fi
#Add execution permission to the script
chmod +x 7.sh
#Execute script
./7.sh
#View run results
hdfs dfs -ls -R /input
#Delete directory (succeeds only if it is empty)
hdfs dfs -rmdir /input/7 
#If the previous command fails, the directory is not empty; delete it recursively
hdfs dfs -rm -r /input/7
#View run results
hdfs dfs -ls -R /input

Install IDEA

Download

Download directly from the official website: https://www.jetbrains.com/idea/download/#section=linux

https://download.jetbrains.com.cn/idea/ideaIU-2021.1.1.tar.gz

Install

# Copy to the target path
sudo cp ideaIU-2021.1.1.tar.gz /usr/local 
cd /usr/local 
# Extract the archive
sudo tar -zxvf ideaIU-2021.1.1.tar.gz
# Delete the archive
sudo rm -rf ideaIU-2021.1.1.tar.gz

su -
# Run IDEA
/usr/local/idea-IU-211.7142.45/bin/idea.sh

java

Configure

Create a new project, keep the default options, and click Next.

Open Project Structure and add the JAR packages.

Import all the JAR packages required by the project (all of the following are needed):
(1) hadoop-common-3.1.3.jar and hadoop-nfs-3.1.3.jar in the /usr/local/hadoop/share/hadoop/common directory;
(2) All JAR packages in the /usr/local/hadoop/share/hadoop/common/lib directory;
(3) The following JAR packages in the /usr/local/hadoop/share/hadoop/hdfs directory:
hadoop-hdfs-3.1.3.jar
hadoop-hdfs-nfs-3.1.3.jar
hadoop-hdfs-client-3.1.3.jar
(4) All JAR packages in the /usr/local/hadoop/share/hadoop/hdfs/lib directory.

It is recommended to import all packages under /usr/local/hadoop/share/hadoop.

2

Start Hadoop.
Create a class named MyFSDataInputStream.
Code:

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.commons.io.IOUtils;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.fs.Path;

public class MyFSDataInputStream extends FSDataInputStream {
	private static Configuration conf ;
	static{
		URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
	}
	MyFSDataInputStream(InputStream in)
	{
		super(in);
	}
	public static void Config(){
		conf = new Configuration();
		conf.set("fs.defaultFS","hdfs://localhost:9000");
		conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
	}

	public static int ReadLine(String path) throws IOException{
		FileSystem fs = FileSystem.get(conf);
		Path file = new Path(path);
		FSDataInputStream getIt = fs.open(file);
		BufferedReader d = new BufferedReader(new InputStreamReader(getIt));

		String content; // Holds the first line of the file
		if((content=d.readLine())!=null){
			System.out.println(content);
		}
		d.close(); // Close the file
		fs.close(); // Close the HDFS connection
		return 0;
	}
	
	public static void PrintFile() throws MalformedURLException, IOException{
		String FilePath="hdfs://localhost:9000/hadoop/test.txt";
		InputStream in=null;
		in=new URL(FilePath).openStream();
		IOUtils.copy(in,System.out);
	
	}
	
	
	public static void main(String[] arg) throws IOException{
		MyFSDataInputStream.Config();
		MyFSDataInputStream.ReadLine("test.txt");
		MyFSDataInputStream.PrintFile();
	}
}
	

Save the class, then run it directly in IDEA.

Errors

Class org.apache.hadoop.hdfs.DistributedFileSystem not found
Fix: import the hadoop-hdfs-client-3.1.3.jar package from /usr/local/hadoop/share/hadoop/hdfs.

RPC response exceeds maximum data length

Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "nick-virtual-machine/127.0.1.1"; destination host is: "localhost":9870;

In this message the destination port is 9870, which is the NameNode web UI port, not the RPC port that the client should connect to.

Run jps to check whether NameNode and SecondaryNameNode have started.

Visit localhost:9870 to confirm that HDFS is running and that the NameNode address is localhost:9000.

Or check the port directly: netstat -ano | grep 9000

The address used in the code must match the value of fs.default.name (fs.defaultFS in newer releases) in the server's core-site.xml file.
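To avoid copying the address by hand, the property can be looked up in the text of core-site.xml. The following is only a minimal sketch using the JDK's XML parser; the class CoreSiteReader and its method names are hypothetical and not part of Hadoop, and the XML string in main is sample content rather than a real configuration file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: looks up one property in core-site.xml text.
class CoreSiteReader {
    /** Returns the value of the named property, or null if it is not present. */
    static String property(String xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (key.equals(childText(p, "name"))) {
                    return childText(p, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag, or null. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Sample content; the real file is etc/hadoop/core-site.xml under the Hadoop directory.
        String xml = "<configuration><property><name>fs.defaultFS</name>"
                + "<value>hdfs://localhost:9000</value></property></configuration>";
        System.out.println(property(xml, "fs.defaultFS")); // prints hdfs://localhost:9000
    }
}
```

The returned string is what would be passed to conf.set in the client code, so the port in the program and the port the NameNode listens on cannot drift apart.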

Another common problem is that the file cannot be found. In that case, use Utilities in the web UI to browse to the file and copy its path.

Implement the above shell commands in code

Code reference: http://dblab.xmu.edu.cn/blog/2808-2/

1 upload files

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Determine whether the path exists
     */
    public static boolean test(Configuration conf, String path) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path(path));
    }
 
    /**
     * Copy files to the specified path
     * If the path already exists, overwrite it
     */
    public static void copyFromLocalFile(Configuration conf, String localFilePath, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path localPath = new Path(localFilePath);
        Path remotePath = new Path(remoteFilePath);
        /* In fs.copyFromLocalFile, the first parameter indicates whether to delete the source file, and the second whether to overwrite the target */
        fs.copyFromLocalFile(false, true, localPath, remotePath);
        fs.close();
    }
 
    /**
     * Add file content
     */
    public static void appendToFile(Configuration conf, String localFilePath, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        /* Create a file input stream */
        FileInputStream in = new FileInputStream(localFilePath);
        /* Create a file output stream, and the output content will be appended to the end of the file */
        FSDataOutputStream out = fs.append(remotePath);
        /* Read and write file contents */
        byte[] data = new byte[1024];
        int read = -1;
        while ( (read = in.read(data)) > 0 ) {
            out.write(data, 0, read);
        }
        out.close();
        in.close();
        fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String localFilePath = "/home/hadoop/text.txt";    // Local path
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS path
        String choice = "append";    // If the file exists, append to the end of the file
//      String choice = "overwrite";    //  Overwrite if file exists
 
        try {
            /* Determine whether the file exists */
            Boolean fileExists = false;
            if (HDFSApi.test(conf, remoteFilePath)) {
                fileExists = true;
                System.out.println(remoteFilePath + " already exists.");
            } else {
                System.out.println(remoteFilePath + " does not exist.");
            }
            /* Processing */
            if ( !fileExists) { // If the file does not exist, upload it
                HDFSApi.copyFromLocalFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " uploaded to " + remoteFilePath);
            } else if ( choice.equals("overwrite") ) {    // Overwrite
                HDFSApi.copyFromLocalFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " overwrote " + remoteFilePath);
            } else if ( choice.equals("append") ) {   // Append
                HDFSApi.appendToFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " appended to " + remoteFilePath);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

2 download files to local

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Download a file to the local file system
     * If the local path already exists, the file is renamed automatically
     */
    public static void copyToLocal(Configuration conf, String remoteFilePath, String localFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        File f = new File(localFilePath);
        /* If the file name exists, rename it automatically (append _0, _1, ... to the file name) */
        if (f.exists()) {
            System.out.println(localFilePath + " already exists.");
            Integer i = 0;
            while (true) {
                f = new File(localFilePath + "_" + i.toString());
                if (!f.exists()) {
                    localFilePath = localFilePath + "_" + i.toString();
                    break;
                }
                i++;    // Try the next suffix; without this the loop never ends when _0 exists
            }
            System.out.println("Rename to: " + localFilePath);
        }
 
        // Download files locally
       Path localPath = new Path(localFilePath);
       fs.copyToLocalFile(remotePath, localPath);
       fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String localFilePath = "/home/hadoop/text.txt";    // Local path
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS path
 
        try {
            HDFSApi.copyToLocal(conf, remoteFilePath, localFilePath);
            System.out.println("Download complete");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
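The renaming rule used by copyToLocal (append _0, _1, ... until an unused name is found) can be tried on its own, without a cluster. This is a minimal sketch; LocalNameSketch and firstFreeName are hypothetical names introduced only for illustration, and the existence check is passed in as a predicate so the rule can be exercised against an in-memory set of names.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper illustrating the renaming rule; not part of the Hadoop API.
class LocalNameSketch {
    /** Returns path itself if it is unused, otherwise the first unused path_0, path_1, ... */
    static String firstFreeName(String path, Predicate<String> exists) {
        if (!exists.test(path)) {
            return path;
        }
        int i = 0;
        while (exists.test(path + "_" + i)) {
            i++; // advance the suffix after every taken name so the loop terminates
        }
        return path + "_" + i;
    }

    /** Convenience overload that checks the local file system. */
    static String firstFreeName(String path) {
        return firstFreeName(path, p -> new File(p).exists());
    }

    public static void main(String[] args) {
        // Pretend these names already exist locally.
        Set<String> taken = new HashSet<>(Arrays.asList("text.txt", "text.txt_0"));
        System.out.println(firstFreeName("text.txt", taken::contains)); // prints text.txt_1
    }
}
```

The important detail is that the suffix counter advances on every failed attempt; a loop that keeps testing the same suffix never finishes once that name is taken.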

3 display file content

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Read file contents
     */
    public static void cat(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        FSDataInputStream in = fs.open(remotePath);
        BufferedReader d = new BufferedReader(new InputStreamReader(in));
        String line = null;
        while ( (line = d.readLine()) != null ) {
            System.out.println(line);
        }
       d.close();
       in.close();
       fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS path
 
        try {
            System.out.println("read file: " + remoteFilePath);
            HDFSApi.cat(conf, remoteFilePath);
            System.out.println("\n Read complete");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

4 file information

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
import java.text.SimpleDateFormat;
 
public class HDFSApi {
    /**
     * Displays information about the specified file
     */
    public static void ls(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        FileStatus[] fileStatuses = fs.listStatus(remotePath);
        for (FileStatus s : fileStatuses) {
            System.out.println("path: " + s.getPath().toString());
            System.out.println("permissions: " + s.getPermission().toString());
            System.out.println("size: " + s.getLen());
            /* The returned time stamp is converted into time date format */
            Long timeStamp = s.getModificationTime();
            SimpleDateFormat format =  new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            String date = format.format(timeStamp);  
            System.out.println("time: " + date);
        }
        fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS path
 
        try {
            System.out.println("Read file information: " + remoteFilePath);
            HDFSApi.ls(conf, remoteFilePath);
            System.out.println("\n Read complete");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
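The modification time is returned as milliseconds since the epoch, and the ls method converts it with SimpleDateFormat. A small sketch of that conversion in isolation follows; TimestampSketch is a hypothetical class, and the explicit time zone is an assumption added here so that the output does not depend on the machine's default zone (the listing above uses the default).

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper showing the timestamp conversion on its own.
class TimestampSketch {
    /** Formats epoch milliseconds as "yyyy-MM-dd HH:mm:ss" in the given time zone. */
    static String format(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(format(0L, "UTC"));        // prints 1970-01-01 00:00:00
        System.out.println(format(86400000L, "UTC")); // prints 1970-01-02 00:00:00
    }
}
```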

5 directory file information

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
import java.text.SimpleDateFormat;
 
public class HDFSApi {
    /**
     * Displays the information of all files in the specified folder (recursive)
     */
    public static void lsDir(Configuration conf, String remoteDir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dirPath = new Path(remoteDir);
        /* Recursively obtain all files in the directory */
        RemoteIterator<LocatedFileStatus> remoteIterator = fs.listFiles(dirPath, true);
        /* Output information for each file */
        while (remoteIterator.hasNext()) {
            FileStatus s = remoteIterator.next();
            System.out.println("path: " + s.getPath().toString());
            System.out.println("permissions: " + s.getPermission().toString());
            System.out.println("size: " + s.getLen());
            /* The returned time stamp is converted into time date format */
            Long timeStamp = s.getModificationTime();
            SimpleDateFormat format =  new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            String date = format.format(timeStamp);  
            System.out.println("time: " + date);
            System.out.println();
        }
        fs.close();
    }    
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteDir = "/user/hadoop";    // HDFS path
 
        try {
            System.out.println("Recursively reading the information of all files in the directory: " + remoteDir);
            HDFSApi.lsDir(conf, remoteDir);
            System.out.println("Read complete");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

6 file creation / deletion

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Determine whether the path exists
     */
    public static boolean test(Configuration conf, String path) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path(path));
    }
 
    /**
     * Create directory
     */
    public static boolean mkdir(Configuration conf, String remoteDir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dirPath = new Path(remoteDir);
        boolean result = fs.mkdirs(dirPath);
        fs.close();
        return result;
    }
 
    /**
     * create a file
     */
    public static void touchz(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        FSDataOutputStream outputStream = fs.create(remotePath);
        outputStream.close();
        fs.close();
    }
 
    /**
     * Delete file
     */
    public static boolean rm(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        boolean result = fs.delete(remotePath, false);
        fs.close();
        return result;
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/input/text.txt";    // HDFS path
        String remoteDir = "/user/hadoop/input";    // Directory corresponding to HDFS path
 
        try {
            /* Judge whether the path exists. If it exists, delete it. Otherwise, create it */
            if ( HDFSApi.test(conf, remoteFilePath) ) {
                HDFSApi.rm(conf, remoteFilePath); // delete
                System.out.println("Delete path: " + remoteFilePath);
            } else {
                if ( !HDFSApi.test(conf, remoteDir) ) { // If the directory does not exist, create it
                    HDFSApi.mkdir(conf, remoteDir);
                    System.out.println("create folder: " + remoteDir);
                }
                HDFSApi.touchz(conf, remoteFilePath);
                System.out.println("Create path: " + remoteFilePath);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

7 directory creation / deletion

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Determine whether the path exists
     */
    public static boolean test(Configuration conf, String path) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path(path));
    }
 
    /**
     * Determine whether the directory is empty
     * true: empty, false: not empty
     */
    public static boolean isDirEmpty(Configuration conf, String remoteDir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dirPath = new Path(remoteDir);
        RemoteIterator<LocatedFileStatus> remoteIterator = fs.listFiles(dirPath, true);
        return !remoteIterator.hasNext();
    }
 
    /**
     * Create directory
     */
    public static boolean mkdir(Configuration conf, String remoteDir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dirPath = new Path(remoteDir);
        boolean result = fs.mkdirs(dirPath);
        fs.close();
        return result;
    }
 
    /**
     * Delete directory
     */
    public static boolean rmDir(Configuration conf, String remoteDir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path dirPath = new Path(remoteDir);
        /* The second parameter indicates whether to delete all files recursively */
        boolean result = fs.delete(dirPath, true);
        fs.close();
        return result;
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteDir = "/user/hadoop/input";    // HDFS directory
        Boolean forceDelete = false;  // Force deletion
 
        try {
            /* If the directory does not exist, create it; if it exists, delete it */
            if ( !HDFSApi.test(conf, remoteDir) ) {
                HDFSApi.mkdir(conf, remoteDir); // Create directory
                System.out.println("Create directory: " + remoteDir);
            } else {
                if ( HDFSApi.isDirEmpty(conf, remoteDir) || forceDelete ) { // Directory is empty or forcibly deleted
                    HDFSApi.rmDir(conf, remoteDir);
                    System.out.println("Delete directory: " + remoteDir);
                } else  { // Directory is not empty
                    System.out.println("The directory is not empty and cannot be deleted: " + remoteDir);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
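The decision in main (delete only when the directory is empty or deletion is forced) can be tried on the local file system without a cluster. The sketch below is a local analogue that uses java.nio.file instead of the HDFS API; DeleteDirSketch and its methods are hypothetical names. One difference to note: this emptiness check counts subdirectories as entries, whereas the isDirEmpty method above only looks for files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-file analogue of the delete decision; uses java.nio.file, not the HDFS API.
class DeleteDirSketch {
    /** Returns true if the directory has no entries at all. */
    static boolean isEmpty(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return !entries.findAny().isPresent();
        }
    }

    /** Deletes dir if it is empty or force is true; returns whether it was deleted. */
    static boolean deleteDir(Path dir, boolean force) {
        try {
            if (!force && !isEmpty(dir)) {
                return false; // not empty and not forced: leave it alone
            }
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temporary non-empty directory and tries both modes. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("input");
            Files.createFile(dir.resolve("text.txt"));
            boolean withoutForce = deleteDir(dir, false);
            boolean withForce = deleteDir(dir, true);
            return withoutForce + "," + withForce + "," + Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints false,true,false
    }
}
```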

8 append content

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Determine whether the path exists
     */
    public static boolean test(Configuration conf, String path) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path(path));
    }
 
    /**
     * Append text content
     */
    public static void appendContentToFile(Configuration conf, String content, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        /* Create a file output stream, and the output content will be appended to the end of the file */
        FSDataOutputStream out = fs.append(remotePath);
        out.write(content.getBytes());
        out.close();
        fs.close();
    }
 
    /**
     * Add file content
     */
    public static void appendToFile(Configuration conf, String localFilePath, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        /* Create a file input stream */
        FileInputStream in = new FileInputStream(localFilePath);
        /* Create a file output stream, and the output content will be appended to the end of the file */
        FSDataOutputStream out = fs.append(remotePath);
        /* Read and write file contents */
        byte[] data = new byte[1024];
        int read = -1;
        while ( (read = in.read(data)) > 0 ) {
            out.write(data, 0, read);
        }
        out.close();
        in.close();
        fs.close();
    }
 
    /**
     * Move a file to the local file system
     * The source file is deleted after the move
     */
    public static void moveToLocalFile(Configuration conf, String remoteFilePath, String localFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        Path localPath = new Path(localFilePath);
        fs.moveToLocalFile(remotePath, localPath);
    }
 
    /**
     * create a file
     */
    public static void touchz(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        FSDataOutputStream outputStream = fs.create(remotePath);
        outputStream.close();
        fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS file
        String content = "New additions\n";
        String choice = "after";        //Append to end of file
//      String choice = "before";    //  Append to beginning of file
 
        try {
            /* Determine whether the file exists */
            if ( !HDFSApi.test(conf, remoteFilePath) ) {
                System.out.println("file does not exist: " + remoteFilePath);
            } else {
                if ( choice.equals("after") ) { // Append at the end of the file
                    HDFSApi.appendContentToFile(conf, content, remoteFilePath);
                    System.out.println("Appended to end of file: " + remoteFilePath);
                } else if ( choice.equals("before") )  { // Prepend to beginning of file
                    /* There is no API for prepending directly, so move the file to local storage first, */
                    /* then create a new HDFS file and append the contents in order */
                    String localTmpPath = "/home/hadoop/tmp.txt";    // Local temporary path
                    // Move to local
                    HDFSApi.moveToLocalFile(conf, remoteFilePath, localTmpPath);
                    // Create a new file
                    HDFSApi.touchz(conf, remoteFilePath); 
                    // Write new content first
                    HDFSApi.appendContentToFile(conf, content, remoteFilePath);
                    // Write the original content again
                    HDFSApi.appendToFile(conf, localTmpPath, remoteFilePath); 
                    System.out.println("Appended content to beginning of file: " + remoteFilePath);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
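The "before" branch works around the missing prepend operation with three steps: move the original aside, recreate the file with the new content, then append the original content. The same sequence can be tried on a local file. This is only a sketch of the idea using java.nio.file rather than the HDFS API; PrependSketch and its methods are hypothetical names, and the file contents in demo are sample data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Local-file analogue of the "before" branch; uses java.nio.file, not the HDFS API.
class PrependSketch {
    /** Puts content in front of the existing content of file. */
    static void prepend(Path file, String content) {
        try {
            Path tmp = Files.createTempFile("prepend", ".tmp");
            // 1. Move the original aside
            Files.move(file, tmp, StandardCopyOption.REPLACE_EXISTING);
            // 2. Recreate the file with the new content first
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            // 3. Append the original content after it
            Files.write(file, Files.readAllBytes(tmp), StandardOpenOption.APPEND);
            Files.delete(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the steps on a temporary file and returns the final content. */
    static String demo() {
        try {
            Path file = Files.createTempFile("text", ".txt");
            Files.write(file, "old line\n".getBytes(StandardCharsets.UTF_8));
            prepend(file, "New additions\n");
            String result = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo()); // prints "New additions" and then "old line" on separate lines
    }
}
```

Because the whole original is copied again, the cost grows with the file size, which is worth keeping in mind before using this approach on large HDFS files.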

9 delete specified file

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * Delete file
     */
    public static boolean rm(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        boolean result = fs.delete(remotePath, false);
        fs.close();
        return result;
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS file
 
        try {
            if ( HDFSApi.rm(conf, remoteFilePath) ) {
                System.out.println("Deleted file: " + remoteFilePath);
            } else {
                System.out.println("Operation failed (file does not exist or deletion failed)");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

10 moving files

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;
 
public class HDFSApi {
    /**
     * move file
     */
    public static boolean mv(Configuration conf, String remoteFilePath, String remoteToFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path srcPath = new Path(remoteFilePath);
        Path dstPath = new Path(remoteToFilePath);
        boolean result = fs.rename(srcPath, dstPath);
        fs.close();
        return result;
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "hdfs:///user/hadoop/text.txt";    // Source file HDFS path
        String remoteToFilePath = "hdfs:///user/hadoop/new.txt";    // Destination HDFS path
 
        try {
            if ( HDFSApi.mv(conf, remoteFilePath, remoteToFilePath) ) {
                System.out.println("Moved file " + remoteFilePath + " to " + remoteToFilePath);
            } else {
                System.out.println("Operation failed (the source file does not exist or the move failed)");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

MyFSDataInputStream

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;
 
public class MyFSDataInputStream extends FSDataInputStream {
    public MyFSDataInputStream(InputStream in) {
        super(in);
    }
 
    /**
     * Read by line
     * Read one character at a time until "\n" and return the line without the line terminator
     */
    public static String readline(BufferedReader br) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean readAny = false;
        int c;
        // br continues from where the previous call stopped, so each call starts a fresh line
        while ( (c = br.read()) != -1 ) {
            readAny = true;
            if (c == '\n') {
                break;
            }
            line.append((char) c);
        }

        // Return null only at end of stream, so empty lines are still reported
        return readAny ? line.toString() : null;
    }
 
    /**
     * Read file contents
     */
    public static void cat(Configuration conf, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        FSDataInputStream in = fs.open(remotePath);
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        String line = null;
        while ( (line = MyFSDataInputStream.readline(br)) != null ) {
            System.out.println(line);
        }
        br.close();
        in.close();
        fs.close();
    }
 
    /**
     * Main function
     */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.default.name","hdfs://localhost:9000");
        String remoteFilePath = "/user/hadoop/text.txt";    // HDFS path
        try {
            MyFSDataInputStream.cat(conf, remoteFilePath);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
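Character-by-character line reading can be checked without a running cluster by feeding it an in-memory reader instead of an HDFS stream. The sketch below assumes a StringReader as the data source; ReadlineSketch and readAll are hypothetical names used only for this check, and the readline here returns each line without its trailing "\n".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical offline check of line-by-line reading; no Hadoop classes are needed.
class ReadlineSketch {
    /** Reads up to the next "\n"; returns null only when the stream is exhausted. */
    static String readline(BufferedReader br) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean readAny = false;
        int c;
        while ((c = br.read()) != -1) {
            readAny = true;
            if (c == '\n') {
                break;
            }
            line.append((char) c);
        }
        return readAny ? line.toString() : null;
    }

    /** Splits text into lines by calling readline until it returns null. */
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = readline(br)) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readAll("first\nsecond\n")); // prints [first, second]
    }
}
```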

HDFS reads through URL

import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.IOUtils;
import java.io.*;
import java.net.URL;
 
public class HDFSApi {
    static{  
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());  
    }
 
    /**
     * Main function 
     */
    public static void main(String[] args) throws Exception {
        String remoteFilePath = "hdfs:///user/hadoop/text.txt";    // HDFS file
        InputStream in = null; 
        try{  
            /* Open the data stream through the URL object and read data from it */
            in = new URL(remoteFilePath).openStream();  
            IOUtils.copyBytes(in,System.out,4096,false);  
        } finally{  
            IOUtils.closeStream(in);  
        }  
    }
}

Keywords: Hadoop

Added by alexjb on Thu, 17 Feb 2022 11:21:35 +0200