How to Read a PDF and Write It to a Text File in Java? PDF to Text File in Java

Copy PDF text and paste it into a text file in Java

Extract text from an existing PDF document to a text file in Java

In this article, we will see how to create a new text file and extract the text of a PDF document into it.

We will use Apache PDFBox to extract the PDF text. To use Apache PDFBox, we can either create a Maven project and add the PDFBox dependency, or create a Dynamic Web Project and add the PDFBox JAR file to it. In this article we will use a Dynamic Web Project.

Step 1 : Create a new Dynamic Web Project in Eclipse

Go to File -> New -> Dynamic Web Project

Then create a Java class in the project. The code in this article uses a class named FileReadWrite.

Step 2 : Add the PDFBox JAR file to the project

Download the Apache PDFBox JAR file from the Apache PDFBox download page.

To include the JAR in the project, follow the steps below :

  1. Right click on the project -> Build Path -> Configure Build Path
  2. Go to the Libraries tab -> Click on the Add External JARs button. Select the Apache PDFBox JAR.
  3. Click the Apply and Close button.

Now everything is set for extracting the PDF text and storing it in a text file.
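
Before moving on, it can help to confirm that the JAR really ended up on the classpath. The following is only a minimal sketch and not part of the original steps: the class name ClasspathCheck and the method isOnClasspath are hypothetical names chosen for illustration. It uses nothing but Class.forName from the standard library and looks up org.apache.pdfbox.pdmodel.PDDocument, the class imported by the code in Step 3.

```java
// Hypothetical helper for illustration; not part of PDFBox.
class ClasspathCheck {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class itself is missing.
            // LinkageError: the class was found but something it depends on is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String pdfBoxClass = "org.apache.pdfbox.pdmodel.PDDocument";
        if (isOnClasspath(pdfBoxClass)) {
            System.out.println("PDFBox found on the classpath");
        } else {
            System.out.println("PDFBox is missing; check the Build Path settings");
        }
    }
}
```

If the check reports that PDFBox is missing even though the JAR was added, one possible cause is that a library PDFBox itself depends on is not on the build path.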

Step 3 : Java code to extract PDF text to a text file

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class FileReadWrite {

    public static void main(String[] args) {
        // Folder that contains the input PDF and receives the output text file
        String filePath = "D:\\JavaFileDemo\\";

        // The PDF file (name and full path) that you want to extract
        File input = new File(filePath + "input.pdf");

        // The text file (name and full path) where the extracted text is stored
        File output = new File(filePath + "output.txt");

        // try-with-resources closes the document and the writer even if an exception is thrown.
        // PDDocument.load(File) is the PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF(File) instead.
        try (PDDocument pd = PDDocument.load(input);
             BufferedWriter wr = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(output)))) {

            PDFTextStripper stripper = new PDFTextStripper();

            // Extract the text of every page and write it to the output file
            stripper.writeText(pd, wr);

            System.out.println("Successfully extracted : PDF to text file");
        } catch (IOException e) {
            // Reached when the PDF cannot be read or the text file cannot be written
            e.printStackTrace();
        }
    }
}

If the above code compiles and runs successfully, the message "Successfully extracted : PDF to text file" is printed.

Go to the folder used in filePath and check that output.txt has been created and that the text of the PDF has been written into it.
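
The same check can also be done from code instead of the file explorer. This is a rough sketch under a few assumptions: the class name OutputCheck and the method sizeOrMissing are hypothetical names introduced here, and the path simply repeats the D:\JavaFileDemo folder and output.txt file name used in the extraction code. It only reports whether the file exists and how many bytes it holds; it does not compare the content with the PDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of PDFBox or the code above.
class OutputCheck {

    // Returns the size of the file in bytes, or -1 if it is missing or cannot be read.
    static long sizeOrMissing(Path file) {
        try {
            return Files.isRegularFile(file) ? Files.size(file) : -1L;
        } catch (IOException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        // Same folder and file name as in the extraction code
        Path output = Paths.get("D:\\JavaFileDemo", "output.txt");
        long size = sizeOrMissing(output);
        if (size < 0) {
            System.out.println("output.txt was not created");
        } else if (size == 0) {
            System.out.println("output.txt exists but is empty");
        } else {
            System.out.println("output.txt contains " + size + " bytes");
        }
    }
}
```

An empty output.txt does not necessarily mean the code failed. A PDF that contains only scanned images has no text layer, so there is nothing for PDFTextStripper to extract.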

