Here’s example C# code that extracts text from a PDF file using the iTextSharp library:
using System;
using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class Program
{
    static void Main()
    {
        // Define the path of the PDF file
        string filePath = "path/to/pdf/file.pdf";

        // Create an instance of the PdfReader class to read the PDF file
        PdfReader pdfReader = new PdfReader(filePath);

        // Collect the text page by page; a StringBuilder avoids repeated string copies
        StringBuilder pageText = new StringBuilder();

        // Loop through each page of the PDF file (page numbers start at 1)
        for (int i = 1; i <= pdfReader.NumberOfPages; i++)
        {
            // Extract text from the current page
            pageText.Append(PdfTextExtractor.GetTextFromPage(pdfReader, i));
        }

        // Close the PdfReader object
        pdfReader.Close();

        // Store the extracted text in a string
        string extractedText = pageText.ToString();

        // Write the extracted text to a text file
        File.WriteAllText("path/to/output/file.txt", extractedText);

        // Display a message indicating that the extraction is complete
        Console.WriteLine("Extraction complete.");
    }
}
This code uses the PdfReader class to read the PDF file, loops through each page of the file (pages are numbered from 1), and uses the PdfTextExtractor class to extract the text from each page. The extracted text is collected in a string and then written to a text file using the File.WriteAllText() method. Finally, a message is printed to the console to indicate that the extraction is complete.
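As a possible variation on the same loop, the sketch below keeps each page's text separate and passes an explicit extraction strategy, assuming iTextSharp 5.x, where PdfTextExtractor.GetTextFromPage() has an overload that takes an ITextExtractionStrategy. The names PdfTextDump, ExtractPages, and JoinPages, and the "--- Page N ---" marker, are made up for illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfTextDump
{
    // Hypothetical helper: returns the text of each page as a separate list entry.
    public static List<string> ExtractPages(string pdfPath)
    {
        var pages = new List<string>();
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                // Use a fresh strategy per page, because a strategy object
                // accumulates the text it has been shown.
                ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
                pages.Add(PdfTextExtractor.GetTextFromPage(reader, i, strategy));
            }
        }
        finally
        {
            // Release the file even if extraction throws.
            reader.Close();
        }
        return pages;
    }

    // Hypothetical helper: joins page texts, writing a marker line before each page.
    public static string JoinPages(IList<string> pages)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < pages.Count; i++)
        {
            sb.AppendLine("--- Page " + (i + 1) + " ---");
            sb.AppendLine(pages[i]);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Placeholder paths, as in the example above.
        List<string> pages = ExtractPages("path/to/pdf/file.pdf");
        File.WriteAllText("path/to/output/file.txt", JoinPages(pages));
        Console.WriteLine("Extracted " + pages.Count + " page(s).");
    }
}
```

Keeping the pages in a list makes it easy to tell where one page ends and the next begins in the output file. LocationTextExtractionStrategy orders text by its position on the page, which may read better than the default for some layouts; whether it helps depends on the PDF.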