PDF Page Extractor Using Python
A Python-based PDF page extraction tool built with the PyPDF2 library. It lets users select a specific range of pages from any PDF file and save them as a new PDF. Ideal for splitting documents, trimming study notes, and extracting important sections.
Explanation
A PDF Page Extractor is a utility that allows users to extract a specific range of pages from a large PDF file and save them as a new PDF document. This is especially useful when you need only selected sections from large books, reports, question papers, or e-books.
This program uses Python's PyPDF2 library to read a PDF file, select a range of pages, and write those pages into a new PDF file. The user supplies 1-based start and end page numbers, and every page in that range, including both the start and end page, is copied into the output.
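The read, select, write flow described above can be sketched as a reusable function. This is a minimal sketch rather than the program's own code: the names `copy_page_range` and `extract_pages` are hypothetical helpers, and PyPDF2 is imported inside `extract_pages` so the page-copying logic can be tried without the library installed.

```python
def copy_page_range(reader, writer, start_page, end_page):
    """Copy 1-based pages start_page..end_page (inclusive) from reader to writer.

    `reader` needs a `.pages` sequence and `writer` an `.add_page()` method,
    which is the interface PyPDF2's PdfReader and PdfWriter provide.
    """
    for index in range(start_page - 1, end_page):
        writer.add_page(reader.pages[index])
    return end_page - start_page + 1  # number of pages copied


def extract_pages(input_path, output_path, start_page, end_page):
    """Hypothetical wrapper: read input_path, write the range to output_path."""
    from PyPDF2 import PdfReader, PdfWriter  # imported lazily (assumption: PyPDF2 is installed)

    reader = PdfReader(input_path)
    writer = PdfWriter()
    copied = copy_page_range(reader, writer, start_page, end_page)
    with open(output_path, "wb") as out_pdf:
        writer.write(out_pdf)
    return copied
```

Under these assumptions, a call such as `extract_pages("sample.pdf", "extracted.pdf", 2, 5)` would copy four pages.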
Requirements
- pip install PyPDF2
  Installs the PyPDF2 library required for reading and writing PDF pages.
- Python version 3 or higher
- A valid PDF file to extract pages from
- User must provide start and end page numbers
- Generated output PDF is saved with user-given filename
Code Explanation
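from PyPDF2 import PdfReader, PdfWriter
Imports the required classes for reading an existing PDF and writing a new one.
print("=== PDF Splitter ===")
Displays a title to make the program more user-friendly.
input_pdf = input("Enter input PDF path: ").strip() or "output.pdf"
Takes the input PDF file path from the user. If the input is left blank, the path defaults to "output.pdf".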
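The fall-back-to-default step can be paired with a check that the file exists before it is opened. The helper below is a hedged sketch: `resolve_input_path` is a hypothetical name, and the default of "output.pdf" simply mirrors the value used in this program.

```python
from pathlib import Path


def resolve_input_path(raw_text, default="output.pdf"):
    """Return the typed path, or `default` when the user typed nothing."""
    candidate = Path(raw_text.strip() or default)
    # Fail early with a clear message instead of inside the PDF reader.
    if not candidate.is_file():
        raise FileNotFoundError(f"No such PDF file: {candidate}")
    return candidate
```

It could be used as `input_pdf = resolve_input_path(input("Enter input PDF path: "))`.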
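start_page = int(input("Enter start page number: ").strip())
Asks the user for the first page to extract, counted from 1.
end_page = int(input("Enter end page number: ").strip())
Asks the user for the final page to extract. This page is included in the output.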
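Because `int()` raises a `ValueError` on non-numeric text, the two prompts above stop the program with a traceback if the user mistypes. One possible way to handle this is sketched below, with `parse_page_number` as a hypothetical helper name.

```python
def parse_page_number(text):
    """Convert user input to a positive, 1-based page number."""
    try:
        value = int(text.strip())
    except ValueError:
        raise ValueError(f"Not a whole number: {text!r}") from None
    if value < 1:
        raise ValueError(f"Page numbers start at 1, got {value}")
    return value
```

A caller could wrap `parse_page_number(input("Enter start page number: "))` in a loop that repeats the prompt until no `ValueError` is raised.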
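output_pdf = input("Enter output PDF name (e.g., output.pdf): ").strip() or "outputs.pdf"
Takes the output PDF filename from the user, or uses the default "outputs.pdf" if left blank.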
reader = PdfReader(input_pdf)
writer = PdfWriter()
Loads the input PDF and prepares a new PDF writer object.
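start = start_page - 1
end = end_page
Converts the 1-based start page to the zero-based index that PyPDF2 uses. The end page is left unchanged because range(start, end) excludes its upper bound, so the last index visited is end_page - 1, which is the end page itself.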
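The index conversion can be checked with a short worked example. The function name `page_indices` is hypothetical and exists only to make the arithmetic visible.

```python
def page_indices(start_page, end_page):
    """Return the zero-based indices covered by a 1-based, inclusive page range."""
    start = start_page - 1  # page 1 is index 0
    end = end_page          # range() stops before `end`, i.e. at index end_page - 1
    return list(range(start, end))


print(page_indices(2, 5))  # → [1, 2, 3, 4]
```

Pages 2 through 5 therefore map to four indices, and a single-page request such as `page_indices(3, 3)` yields `[2]`.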
for page_num in range(start, end):
    writer.add_page(reader.pages[page_num])
Adds each selected page from the input PDF into the new output PDF.
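The loop above trusts the page numbers it is given. An end page beyond the last page would make `reader.pages[page_num]` fail part-way through, and a start page of 0 would turn into index -1. A minimal guard, assuming the total page count is obtained with `len(reader.pages)`, might look like this (`check_page_range` is a hypothetical name):

```python
def check_page_range(start_page, end_page, total_pages):
    """Raise ValueError unless 1 <= start_page <= end_page <= total_pages."""
    if not 1 <= start_page <= end_page <= total_pages:
        raise ValueError(
            f"Invalid range {start_page}-{end_page} "
            f"for a document with {total_pages} pages"
        )
```

In the program, this could be called as `check_page_range(start_page, end_page, len(reader.pages))` just before the loop.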
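with open(output_pdf, "wb") as out_pdf:
    writer.write(out_pdf)
Opens the output file in binary write mode and saves the extracted pages into it.
print(f"PDF split successfully! Saved as: {output_pdf}")
Prints a success message after generating the PDF.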
Key Points
- Extracts only selected pages from a PDF file.
- Uses Python’s lightweight PyPDF2 library.
- Zero-based indexing is used internally for page selection.
- User chooses start page, end page, input file, and output filename.
- Easy to automate splitting PDFs or pulling out sections for study/work.
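As an illustration of the automation point, the sketch below computes the page ranges needed to split a document into fixed-size parts. `chunk_ranges` is a hypothetical helper. It only produces 1-based, inclusive (start, end) pairs, each of which could then be fed to the extraction steps described above.

```python
def chunk_ranges(total_pages, chunk_size):
    """Split pages 1..total_pages into consecutive (start, end) ranges."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    ranges = []
    for start in range(1, total_pages + 1, chunk_size):
        end = min(start + chunk_size - 1, total_pages)  # last chunk may be shorter
        ranges.append((start, end))
    return ranges


print(chunk_ranges(10, 4))  # → [(1, 4), (5, 8), (9, 10)]
```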
Full Python Program
from PyPDF2 import PdfReader, PdfWriter

print("=== PDF Splitter ===")

# Collect the input path, the 1-based page range, and the output name.
input_pdf = input("Enter input PDF path: ").strip() or "output.pdf"
start_page = int(input("Enter start page number: ").strip())
end_page = int(input("Enter end page number: ").strip())
output_pdf = input("Enter output PDF name (e.g., output.pdf): ").strip() or "outputs.pdf"

reader = PdfReader(input_pdf)
writer = PdfWriter()

# Page indices are zero-based, and range() excludes its upper bound,
# so range(start_page - 1, end_page) covers both the start and end page.
start = start_page - 1
end = end_page

for page_num in range(start, end):
    writer.add_page(reader.pages[page_num])

# Write the selected pages to the output file in binary mode.
with open(output_pdf, "wb") as out_pdf:
    writer.write(out_pdf)

print(f"PDF split successfully! Saved as: {output_pdf}")
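For scripted use, the interactive prompts could be swapped for command-line arguments. The following is a sketch of that alternative using the standard-library `argparse` module, not part of the original program. The argument names and the `-o` option are assumptions chosen to mirror the prompts, and the parsed values would still need to be passed to the extraction code.

```python
import argparse


def build_parser():
    """Build a parser whose arguments mirror the program's four prompts."""
    parser = argparse.ArgumentParser(description="Extract a page range from a PDF.")
    parser.add_argument("input_pdf", help="path to the source PDF")
    parser.add_argument("start_page", type=int, help="first page to extract (1-based)")
    parser.add_argument("end_page", type=int, help="last page to extract (inclusive)")
    parser.add_argument("-o", "--output", default="outputs.pdf", help="output PDF name")
    return parser


# Parse an explicit argument list here; a real script would call parse_args()
# with no arguments so that the values come from the command line.
args = build_parser().parse_args(["sample.pdf", "2", "5", "-o", "extracted.pdf"])
print(args.input_pdf, args.start_page, args.end_page, args.output)
```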
Output:
> py pdfExtractor.py
=== PDF Splitter ===
Enter input PDF path: sample.pdf
Enter start page number: 2
Enter end page number: 5
Enter output PDF name (e.g., output.pdf): extracted.pdf
PDF split successfully! Saved as: extracted.pdf
