pdf-splicer

A Python utility script to create new PDF files from pages in other PDF files.

Use this script to create one or more PDF files that are compiled from single pages of other PDF files. A table of contents file contains the mapping of pages in the source PDF files to pages in destination PDF files.

Multiple destination PDFs can be created from multiple source PDFs in the same table of contents file. This script processes the mappings in the table of contents file and produces the resulting PDFs.

This has been useful for issuing page formatted reports that are compilations of PDF output from different sources. You might have PDFs that you collect from others, along with visualizations from R scripts, Python scripts, or Power BI exports. This script allows you to compile a single PDF digest from those separate documents using a template that can be rerun as often as needed.

Command Line

usage: pdf_splicer.py [-h] toc src_folder dest_folder    
positional arguments:    
    toc             Required: Table of contents file    
    src_folder      Required: Folder with source PDF files        
    dest_folder     Required: Folder for destination PDF files

Table of Contents File

The table of contents file instructs the script as to how many destination PDFs will result from running the script, and where to source each page of the destination PDFs.

It has six columns:

Report: (Required) The name of the destination PDF file
Page: (Required) The page number in the destination PDF file
Bookmark: (Optional) The bookmark for this page
Source File: (Required) The name of the source PDF file for this page
Source Page: (Required) The page number in the source PDF file to use
Folder: (Optional) A sub-folder in the output folder to write to

Each row in the table of contents file is intended to be a new page in a destination PDF file. Multiple destination PDF files can be mapped in one table of contents file.

Any table of contents record missing a value in the Report, Source File, or Source Page columns will be skipped. Any records missing a value in the Page column will use the default page number zero.

Technically, the script produces pages in the same order as the values in the Page column, not necessarily at the exact page number given in the table of contents file. Having more than one page with the same destination page number will not stop the script from working. The ordering of pages that share the same page number is not specified.

When sequential pages share the same bookmark, the bookmark will only be applied to the first page in the sequence.

Example

An example table of contents file, referrals_example_toc, can be found in the repository. This example uses input PDFs from my sample referrals report in referrals_powerbi.

The input for this example is a single PDF of referral process measures for multiple clinics. This one PDF will be split into a report for each of those clinics. Another PDF with cover sheet pages for each clinic is spliced into each of the destination PDFs for each clinic.

This example also recreates the input PDF while adding bookmarks for each clinic. The bookmarks make it much easier for viewers to use the report. The bookmark names and pages are supplied in the table of contents file.

The shell command to run this example is:

python pdf_splicer.py referrals_example_toc.csv in out

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
in		in
out/referrals		out/referrals
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pdf_splicer.py		pdf_splicer.py
referrals_example_toc.csv		referrals_example_toc.csv
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pdf-splicer

Command Line

Table of Contents File

Example

About

Releases

Packages

Languages

License

907sjl/pdf-splicer

Folders and files

Latest commit

History

Repository files navigation

pdf-splicer

Command Line

Table of Contents File

Example

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages