Reading PDF files from the network and converting file types
Apache PDFBox is an open source Java library for creating and converting PDF documents.
Its main features are:
- Extract Text: extract Unicode text from PDF files.
- Split & Merge: split a single PDF into multiple files, or merge multiple PDFs into one.
- Fill Forms: fill in form data in a document.
- Print: print PDF files using the standard Java printing API.
- Save as Image: save PDF pages as image files such as PNG or JPEG.
- Create PDFs: create new PDF files from a Java program, including embedded images and fonts.
- Signing: add digital signatures to PDF files.
A dedicated tutorial covers PDFBox in detail, so I won't go into more here.
Add the dependency
<!-- start: PDF, get the image of the first page -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.20</version>
</dependency>
<!-- end: PDF, get the image of the first page -->
Business code
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.springframework.web.multipart.MultipartFile;

/**
 * Render one page of a PDF as a thumbnail image and upload (save) it.
 * @param pdfFileName URL of the PDF on the network
 * @return URL of the uploaded image, or an empty string on failure
 */
public String PDFFramer(String pdfFileName) {
    // Download the network PDF to a local file; new File() can only access local files
    File file = URLToFile(pdfFileName);
    // To read a local file instead:
    // File file = new File("C:\Users\Administrator\Downloads \ \ (important must see). pdf");
    String pdfUrl = "";
    log.info("Start rendering PDF");
    // PDDocument.load() loads an existing PDF document;
    // try-with-resources closes it even if rendering or uploading fails
    try (PDDocument pdfDocument = PDDocument.load(file)) {
        // PDFRenderer renders PDF pages as AWT BufferedImage objects
        PDFRenderer pdfRenderer = new PDFRenderer(pdfDocument);
        // Zero-based index of the page to render (0 is the first page)
        int pageNumber = 0;
        // Render resolution in dots per inch
        int dpi = 300;
        // renderImageWithDPI() renders the given page at the given resolution
        BufferedImage buffImage = pdfRenderer.renderImageWithDPI(pageNumber, dpi, ImageType.RGB);
        // Convert the image to a MultipartFile for upload
        MultipartFile multipartFile = fileCase(buffImage);
        log.info("Start uploading image");
        pdfUrl = fileLoad(multipartFile);
        log.info("Upload succeeded: {}", pdfUrl);
    } catch (InvalidPasswordException e) {
        log.error("PDF is password-protected: {}", pdfFileName, e);
    } catch (IOException e) {
        log.error("Failed to render PDF: {}", pdfFileName, e);
    } finally {
        // Delete the temporary file whether or not rendering succeeded
        String tempPath = threadLocal.get();
        threadLocal.remove();
        if (tempPath != null) {
            boolean deleted = new File(tempPath).delete();
            log.info("Temporary file {} deleted: {}", tempPath, deleted);
        }
    }
    return pdfUrl;
}
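The choice of 300 dpi above decides how large the rendered image gets, so here is a small sketch of the arithmetic. It relies on PDF user space defaulting to 72 units per inch. The class `RenderSize`, its method names, the rounding-down rule, and the 4 bytes per pixel (as in `BufferedImage.TYPE_INT_RGB`) are my own assumptions for illustration, not part of PDFBox, so the real renderer may differ by a pixel.

```java
public class RenderSize {
    // PDF user space defaults to 72 units (points) per inch.
    private static final double UNITS_PER_INCH = 72.0;

    /** Pixels along one side of a page that measures `points` units, rendered at `dpi`. */
    static int pixels(double points, int dpi) {
        // Rounding down is an assumption; a renderer may round differently.
        return (int) Math.max(Math.floor(points * dpi / UNITS_PER_INCH), 1);
    }

    /** Rough size of the raster in bytes, assuming 4 bytes per pixel. */
    static long rasterBytes(double widthPt, double heightPt, int dpi) {
        return 4L * pixels(widthPt, dpi) * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        // A4 page size in points, rounded to whole units
        double a4Width = 595;
        double a4Height = 842;
        for (int dpi : new int[] {72, 150, 300}) {
            double mib = rasterBytes(a4Width, a4Height, dpi) / (1024.0 * 1024.0);
            System.out.printf("%d dpi: %d x %d px, about %.1f MiB%n",
                    dpi, pixels(a4Width, dpi), pixels(a4Height, dpi), mib);
        }
    }
}
```

For an A4 page, 300 dpi works out to 2479 x 3508 pixels, roughly 33 MiB under these assumptions. If the image is only used as a thumbnail, a lower DPI is probably enough and keeps memory use and upload size down.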
Reading PDF files from the network and converting file types
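import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import javax.imageio.ImageIO;
import org.apache.commons.io.FileUtils;
import org.springframework.mock.web.MockMultipartFile;
import org.springframework.web.multipart.MultipartFile;

/**
 * Download a PDF from the network to a local temporary file.
 * @param url URL of the PDF
 * @return the local temporary file
 */
public File URLToFile(String url) {
    log.info("Reading PDF from FastDFS");
    // Temporary file, resolved relative to the working directory (where the jar runs).
    // The name is fixed, so concurrent calls overwrite each other's file.
    File file1 = new File("Temporary.pdf");
    try {
        URL url1 = new URL(url);
        FileUtils.copyURLToFile(url1, file1);
    } catch (IOException e) {
        log.error("Failed to download PDF: {}", url, e);
    }
    File absoluteFile = file1.getAbsoluteFile();
    // Remember the path so the caller can delete the file afterwards
    threadLocal.set(absoluteFile.toString());
    log.info("PDF stored locally at {}", absoluteFile);
    return file1;
}

/**
 * Convert a BufferedImage to a MultipartFile so it can be uploaded.
 * @param image the rendered page
 * @return the image as a JPEG MultipartFile, or null if encoding fails
 */
public static MultipartFile fileCase(BufferedImage image) {
    MultipartFile multipartFile = null;
    try {
        // Encode the BufferedImage as JPEG into an in-memory buffer
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        ImageIO.write(image, "jpg", os);
        // Expose the encoded bytes as an InputStream
        InputStream input = new ByteArrayInputStream(os.toByteArray());
        // MockMultipartFile comes from spring-test; the content type matches the JPEG payload
        multipartFile = new MockMultipartFile("file", "file.jpg", "image/jpeg", input);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return multipartFile;
}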
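Because URLToFile always writes to the same file name, two requests running at once can step on each other. One possible alternative, sketched here with only the JDK, is to download into a uniquely named temporary file and let the caller delete it. The class `TempDownload` and the method `download` are hypothetical names for illustration, and the `file:` URL in `main` only stands in for the remote PDF so the sketch runs offline.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempDownload {
    /** Copies the resource at `url` into a uniquely named temp file and returns its path. */
    static Path download(URL url) throws IOException {
        // createTempFile picks a unique name, so concurrent calls do not collide
        Path target = Files.createTempFile("pdf-", ".pdf");
        try (InputStream in = url.openStream()) {
            // The temp file already exists (empty), so it must be replaced
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            // Do not leave an empty file behind when the download fails
            Files.deleteIfExists(target);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // A local file stands in for the remote PDF
        Path source = Files.createTempFile("source-", ".pdf");
        Files.write(source, "%PDF-1.4 demo".getBytes(StandardCharsets.US_ASCII));
        Path copy = download(source.toUri().toURL());
        try {
            System.out.println(Files.size(copy) + " bytes at " + copy);
        } finally {
            // The caller owns the temp file and removes it when done
            Files.deleteIfExists(copy);
            Files.deleteIfExists(source);
        }
    }
}
```

Returning the path directly also means the ThreadLocal is no longer needed to find the file for deletion. Note that `URL.openStream()` sets no timeouts, so a real service would likely want to configure those on the connection.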