Java - Capture a page of a PDF as a thumbnail

Apache PDFBox is an open source Java library that supports the development and transformation of PDF documents.

It provides the following features:

  • Extract Text - Extract Unicode text from PDF files.

  • Split & Merge - Split a single PDF into multiple files, or merge multiple files into one.

  • Fill Forms - Fill in form data in a document.

  • Print - Print PDF files using the standard Java printing API.

  • Save as Image - Save PDF pages as image files such as PNG or JPEG.

  • Create PDFs - Create new PDF files from a Java program, including images and fonts.

  • Signing - Add digital signatures to PDF files.

For a detailed introduction to PDFBox, see the tutorial "PDFBox - Quick Guide": https://iowiki.com/pdfbox/pdfbox_quick_guide.html

Add the dependency

<!-- start: PDFBox, used to render the first page of a PDF as an image -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.20</version>
</dependency>
<!-- end: PDFBox -->

Note that the code below also uses FileUtils from Apache Commons IO and MockMultipartFile from spring-test, so those libraries must be on the classpath as well. PDDocument.load(File) is the PDFBox 2.x API; in PDFBox 3.x, documents are loaded through Loader.loadPDF instead.

Business code

/**
 * Renders one page of a PDF as an image and uploads it as a thumbnail.
 *
 * @param pdfFileName URL of the PDF file on the network
 * @return URL of the uploaded thumbnail, or an empty string if rendering or upload fails
 */
public String PDFFramer(String pdfFileName) {

    // Download the PDF from the network to a local temporary file
    File file = URLToFile(pdfFileName);
    // new File(...) can only open local paths; for a PDF already on disk, use it directly:
    // File file = new File("C:\\Users\\Administrator\\Downloads\\(important must see).pdf");

    String pdfUrl = "";
    log.info("Start rendering PDF page");
    // PDDocument.load() opens an existing PDF; try-with-resources closes it even if rendering fails
    try (PDDocument pdfDocument = PDDocument.load(file)) {
        // PDFRenderer renders pages of the document as AWT BufferedImage objects
        PDFRenderer pdfRenderer = new PDFRenderer(pdfDocument);

        // Zero-based index of the page to render (0 = first page)
        int pageNumber = 0;
        // Render resolution in DPI
        int dpi = 300;
        // renderImageWithDPI() renders the given page at the given resolution
        BufferedImage buffImage = pdfRenderer.renderImageWithDPI(pageNumber, dpi, ImageType.RGB);
        // Wrap the image as a MultipartFile for upload
        MultipartFile multipartFile = fileCase(buffImage);
        log.info("Start uploading thumbnail");
        pdfUrl = fileLoad(multipartFile);
        log.info("Thumbnail uploaded: {}", pdfUrl);
    } catch (InvalidPasswordException e) {
        log.error("PDF is password-protected: {}", pdfFileName, e);
    } catch (IOException e) {
        log.error("Failed to render or upload PDF: {}", pdfFileName, e);
    } finally {
        // Delete the temporary file whether or not rendering succeeded
        String tempPath = threadLocal.get();
        log.info("Temporary file path: {}", tempPath);
        if (tempPath != null) {
            boolean deleted = new File(tempPath).delete();
            log.info("Temporary file deleted: {}", deleted);
        }
        threadLocal.remove();
    }
    return pdfUrl;
}
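
At 300 DPI the rendered page is much larger than a typical thumbnail. PDFBox scales a page by dpi/72, so a US Letter page (612 x 792 points) comes out at roughly 2550 x 3300 pixels. One option is to shrink the BufferedImage before uploading it. The sketch below uses only the JDK; the class ThumbnailScaler and its methods are hypothetical helpers, not part of the original code.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper (not in the original code): shrinks a rendered page to thumbnail size.
class ThumbnailScaler {

    // Approximate pixel size of a page dimension given in PDF points (1 point = 1/72 inch).
    static int pointsToPixels(float points, float dpi) {
        return (int) Math.floor(points * dpi / 72f);
    }

    // Scales the image to the target width, preserving the aspect ratio.
    static BufferedImage scaleToWidth(BufferedImage source, int targetWidth) {
        float ratio = targetWidth / (float) source.getWidth();
        int targetHeight = Math.max(1, Math.round(source.getHeight() * ratio));
        BufferedImage scaled = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }

    public static void main(String[] args) {
        // A blank image stands in for the page rendered by PDFRenderer.
        int width = pointsToPixels(612f, 300f);
        int height = pointsToPixels(792f, 300f);
        BufferedImage page = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = scaleToWidth(page, 300);
        System.out.println(width + "x" + height + " -> " + thumb.getWidth() + "x" + thumb.getHeight());
    }
}
```

In PDFFramer, such a helper would sit between renderImageWithDPI and fileCase. If only a small thumbnail is ever needed, rendering at a lower DPI in the first place is cheaper than rendering at 300 DPI and scaling down.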

Reading a PDF from the network and converting the file type

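/**
 * Downloads a PDF from the network to a local temporary file.
 *
 * @param url URL of the PDF file
 * @return the local temporary file (its absolute path is also stored in threadLocal for later cleanup)
 */
public File URLToFile(String url) {
    log.info("Reading PDF from FastDFS");
    // Temporary file; the relative path resolves against the working directory
    // (typically where the jar is launched). Note: with a fixed file name,
    // concurrent calls overwrite each other's download.
    File file1 = new File("Temporary.pdf");
    try {
        URL url1 = new URL(url);
        // Commons IO: copy the bytes at the URL into the file
        FileUtils.copyURLToFile(url1, file1);
    } catch (IOException e) {
        log.error("Failed to download PDF from {}", url, e);
    }
    File absoluteFile = file1.getAbsoluteFile();
    threadLocal.set(absoluteFile.toString());
    log.info("PDF stored locally at {}", absoluteFile);
    return file1;
}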
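
Because URLToFile always writes to Temporary.pdf, two requests running at the same time would overwrite each other's download. A minimal sketch of one alternative, using only the JDK, is to give each download a uniquely named temporary file and hand its path back to the caller for cleanup. The names TempDownload and downloadToTemp are hypothetical, and a local file stands in for the remote PDF so the example runs offline.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not in the original code): per-call temporary files for downloads.
class TempDownload {

    // Copies the content at the URL into a uniquely named temporary file.
    static Path downloadToTemp(URL source) throws IOException {
        Path target = Files.createTempFile("pdf-thumbnail-", ".pdf");
        try (InputStream in = source.openStream()) {
            // createTempFile already created the file, so it must be replaced
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave an empty or partial file behind on failure
            Files.deleteIfExists(target);
            throw e;
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path original = Files.createTempFile("source-", ".pdf");
        Files.write(original, "%PDF-1.4 placeholder".getBytes());
        Path copy = downloadToTemp(original.toUri().toURL());
        try {
            System.out.println("Downloaded " + Files.size(copy) + " bytes to a temporary file");
        } finally {
            // The caller owns the file and deletes it when done
            Files.deleteIfExists(copy);
            Files.deleteIfExists(original);
        }
    }
}
```

Returning the path also removes the need to pass the file location through a ThreadLocal: the caller that receives the path is the one that deletes it.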



/**
 * Converts a BufferedImage to a MultipartFile so that it can be uploaded.
 *
 * @param image the rendered page image
 * @return a MultipartFile holding the JPEG-encoded image, or null if encoding fails
 */
public static MultipartFile fileCase(BufferedImage image) {
    MultipartFile multipartFile = null;
    try {
        // Encode the BufferedImage as JPEG into an in-memory buffer
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        ImageIO.write(image, "jpg", os);
        // Wrap the bytes as a MultipartFile (MockMultipartFile comes from spring-test);
        // the content type must match the JPEG data
        multipartFile = new MockMultipartFile("file", "file.jpg", "image/jpeg", os.toByteArray());
    } catch (IOException e) {
        e.printStackTrace();
    }
    return multipartFile;
}
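
ImageIO.write returns false, rather than throwing, when no registered writer can handle the image. The article renders with ImageType.RGB, so this should not occur here, but on recent JDKs it can happen when a JPEG is written from an image with an alpha channel (for example an ARGB render). A sketch of a more defensive encoder, using only the JDK; JpegEncoder and toJpegBytes are hypothetical names.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not in the original code): encodes an image as JPEG bytes.
class JpegEncoder {

    static byte[] toJpegBytes(BufferedImage image) throws IOException {
        BufferedImage rgb = image;
        if (image.getColorModel().hasAlpha()) {
            // JPEG has no alpha channel: paint the image onto a white RGB canvas first
            rgb = new BufferedImage(image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
            Graphics2D g = rgb.createGraphics();
            try {
                g.setColor(Color.WHITE);
                g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
                g.drawImage(image, 0, 0, null);
            } finally {
                g.dispose();
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A false return value means nothing was written
        if (!ImageIO.write(rgb, "jpg", out)) {
            throw new IOException("No JPEG writer available for this image type");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage withAlpha = new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB);
        byte[] jpeg = toJpegBytes(withAlpha);
        System.out.println("Encoded " + jpeg.length + " bytes");
    }
}
```

The resulting byte array can be passed straight to the MockMultipartFile constructor that takes a byte[] as content.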

Keywords: Java

Added by simmsy on Mon, 20 Sep 2021 15:33:00 +0300