Copy PDF text and paste it into a text file in Java
In this article, we will see how to create a new text file and extract the text of a PDF document into it.
We will use Apache pdfbox to extract the text from the PDF. To use Apache pdfbox, we can either create a Maven project and include the dependency, or create a Dynamic Web Project and add the pdfbox JAR file. In this article we will use a Dynamic Web Project.
Step 1 : Create a new Dynamic Web Project in Eclipse
Go to File -> New -> Dynamic Web Project
Then create a Java class in the project (FileReadWrite in this example).
Step 2 : Add the pdfbox JAR file to the project
Download the Apache pdfbox JAR file (available from the Apache PDFBox project's download page).
To include the JAR in the project, follow the steps below :
- Right-click the project -> Build Path -> Configure Build Path
- Go to the Libraries tab -> Click the Add External JARs button. Select the Apache pdfbox JAR.
- Click the Apply and Close button.
Now everything is set to extract the PDF text and store it in a text file.
Step 3 : Java code to extract PDF text to a text file
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class FileReadWrite {
    public static void main(String[] args) {
        // Folder that holds the input PDF and receives the output text file
        String filePath = "D:\\JavaFileDemo\\";
        // The PDF file name and full path that you want to extract
        File input = new File(filePath + "input.pdf");
        // The text file name and its path where you want to store the text
        File output = new File(filePath + "output.txt");
        // try-with-resources closes the document and the writer even if extraction fails.
        // PDDocument.load(File) is the pdfbox 2.x API; pdfbox 3.x uses Loader.loadPDF(File) instead.
        try (PDDocument pd = PDDocument.load(input);
             BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(output), StandardCharsets.UTF_8))) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Write the text of every page of the PDF to the output file
            stripper.writeText(pd, wr);
            System.out.println("Successfully extracted : PDF to text file");
        } catch (IOException e) {
            // Reached when the PDF cannot be read or the text file cannot be written
            e.printStackTrace();
        }
    }
}
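As a side note, the "create a new text file" part of the task can also be sketched on its own with the java.nio.file API from the standard library, without pdfbox. This is a minimal sketch and not part of the program above: the class name TextFileWriter, the helper writeText, and the temporary folder used in main are hypothetical names chosen for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TextFileWriter {

    // Creates (or overwrites) fileName inside folder and writes text to it as UTF-8.
    // Hypothetical helper for illustration only.
    static Path writeText(Path folder, String fileName, String text) throws IOException {
        // Create the folder (and any missing parents); does nothing if it already exists
        Files.createDirectories(folder);
        // resolve() joins folder and file name with the platform's separator
        Path target = folder.resolve(fileName);
        // The writer is closed automatically, which also flushes the text to disk
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed demo location under the system temp directory
        Path folder = Paths.get(System.getProperty("java.io.tmpdir"), "JavaFileDemo");
        Path written = writeText(folder, "output.txt", "Sample text\n");
        System.out.println("Wrote " + written);
    }
}
```

Building the path with resolve() avoids hand-typing separators, and naming the charset explicitly keeps the file contents the same regardless of the machine's default encoding.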
If the FileReadWrite program compiles and runs successfully, it prints the message "Successfully extracted : PDF to text file".
Go to the folder you used as the path and check that output.txt has been created and that the text of the PDF has been written into it.
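The manual check can also be done in code. The following is a minimal sketch that uses only the standard library; the class name OutputCheck, the helper hasContent, and the default path are assumptions made for illustration, so adjust the path to wherever your output.txt is stored.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputCheck {

    // Returns true when the path points to an existing regular file that is not empty.
    // Hypothetical helper for illustration only.
    static boolean hasContent(Path file) throws IOException {
        return Files.isRegularFile(file) && Files.size(file) > 0;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another path as the first argument if needed
        Path output = Paths.get(args.length > 0 ? args[0] : "D:\\JavaFileDemo\\output.txt");
        if (hasContent(output)) {
            System.out.println("Found extracted text: " + Files.size(output) + " bytes in " + output);
        } else {
            System.out.println("No extracted text found at " + output);
        }
    }
}
```

An empty output.txt usually means the PDF holds no extractable text, for example when its pages are scanned images.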