DON'S FREEWARE CORNER - JAN 2020

SPLITTING A PDF INTO SEPARATE PDFS


Don's Freeware Corner articles are printed in the UTAH VALLEY TECHNOLOGY AND GENEALOGY GROUP (UVTAGG) Newsletter TAGGology each month and are posted on his Class Notes Page https://uvtagg.org/classes/dons/dons-classes.html where there may be corrections and updates.

2020 Donald R. Snow - Last updated 2020-02-09

PDF = PORTABLE DOCUMENT FORMAT

PDF is a document file format developed by Adobe, which made the specification freely available many years ago; it later became an open standard. A pdf page displays exactly the same way on any computer and operating system, so many organizations, including the Church of Jesus Christ of Latter-day Saints, have adopted it for releasing their documents. It is used for magazines, handbooks, reports, manuals, and more. It is one of the most widely used document formats in the world, so we need to be aware of it and know how to use it.

This article deals with separating a pdf into smaller parts. For example, at many family history conferences you are given a flash drive, or a link to a website, with one large pdf that contains the handouts for all of the talks given at the conference. It is usually more helpful to have each paper in its own pdf file. There are several ways to get these from the large pdf, and here we will discuss two. The first method is to separate the pdf into individual pages and then combine the pages that belong to each paper. The second is to note the first and last page numbers of each paper and type those into the split program, so that each new file contains exactly the pages you want. Both methods take time, and there are probably other ways to do this, too.

PDFSAM BASIC = PDF SPLIT AND MERGE BASIC

This is the free program that I will discuss in this article. It is available from https://pdfsam.org/ . Scroll down from the top of the window to get to the BASIC version. This is completely free and for many pdf tasks is all you need. After download and installation, when you run it, you see 11 icons with descriptions of things it will do. It has several splitting and extracting features. We'll be discussing the Split feature and the Merge feature. Here is a screenshot of the home page of PDFSAM BASIC.

[Screenshot: PDFSAM BASIC home page]

METHOD 1: SEPARATING THE LARGE PDF INTO INDIVIDUAL PAGES FIRST

To use PDFSAM for this, go to its Split icon. Drag and drop the pdf file you want to split into the top window, where it asks for the original file. Set the output parameters the way you want, e.g. Split at Every Page, and tell it where to put the split pages and what to call them. The name you give will be added as a prefix to each file after the split. When you run it (the Run icon is at the bottom left), it goes through the entire pdf and makes a new pdf of each page, since you told it to split at every page. These are numbered sequentially, so if they are all in the same folder, together they make up the entire original pdf, page by page and in order. The next step in this method is to select the sets of consecutive pages you want to combine.
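One caution about the page files being "in order": a folder listing that sorts names alphabetically will show page 10 before page 2. For readers comfortable with a little Python, here is a minimal sketch of sorting the page files by page number instead. The file names are made up for illustration, and the sketch assumes the page number is the first number in each name; the names you actually get depend on the prefix you chose.

```python
import re


def page_number(filename):
    """Return the first run of digits in a file name as an integer."""
    match = re.search(r"\d+", filename)
    if match is None:
        raise ValueError(f"no page number found in {filename!r}")
    return int(match.group())


def in_page_order(filenames):
    """Sort single-page file names by page number, not alphabetically."""
    return sorted(filenames, key=page_number)


# Hypothetical file names, listed the way an alphabetical sort would show them.
pages = ["10_handout.pdf", "1_handout.pdf", "2_handout.pdf", "9_handout.pdf"]
print(in_page_order(pages))
```

If your prefix itself contains digits and comes before the page number, the `page_number` function would need to be adjusted to pick out the right number.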

COMBINING THE SEPARATE PAGES INTO FILES

The next step is to combine each set of pages into its appropriate file. This can also be done with PDFSAM by going to the Merge option and dragging and dropping in the pages that are to form each new pdf. To do this you will have to know which pages go into each file. You can find out either by going through the original file and writing down the page numbers for each sub-file, or by opening a copy of the original file, looking at the page numbers, and moving those page files into the Merge option of PDFSAM. Be sure the pages are in the order you want in PDFSAM before you merge them, or they will be merged in the wrong order. Both ways of selecting the pages work, and as you merge the files you can give them the names you want. For example, if these are papers from a family history conference, you might want to name each one with the name of the speaker, then the title of the paper, and perhaps the meeting the talk was given at. That way they will be easy to find later.
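The naming scheme just described (speaker, then title, then meeting) can be sketched in Python as follows. This is only an illustration: the function name, the " - " separator, and the sample speaker and title are my own assumptions, not anything PDFSAM provides. The sketch also drops the characters that Windows does not allow in file names.

```python
import re


def paper_filename(speaker, title, meeting=""):
    """Build 'Speaker - Title - Meeting.pdf' from the three pieces."""
    cleaned = []
    for part in (speaker, title, meeting):
        # Remove characters that Windows does not allow in file names.
        part = re.sub(r'[\\/:*?"<>|]', "", part)
        # Collapse any extra spaces left behind.
        part = " ".join(part.split())
        if part:
            cleaned.append(part)
    return " - ".join(cleaned) + ".pdf"


# Made-up speaker, title, and meeting, for illustration only.
print(paper_filename("Jane Doe", "Finding Records: A Guide", "Saturday Session"))
```

Putting the speaker first means the files sort by speaker in a folder listing; put the title first if you would rather browse by topic.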

METHOD 2: SPLITTING THE ORIGINAL FILE INTO THE FINAL PARTS AT THE START

This requires telling the splitting program, e.g. PDFSAM, the range of pages to split off to form each new file. So you have to type the first and last page numbers of each new file into the Split option of PDFSAM, for example, 4-9. This tells PDFSAM to form a separate file of pages 4 through 9, so it puts each new file together as it works. To have the page numbers available, you will need to write them down first, or else open a copy of the original file on your desktop so you can see the page numbers to include in each new file. Again, this takes time if there are many papers in the original file. As an example, writing down the numbers first and then using this method, it took me about two hours to separate out about 64 papers from a 244-page pdf, and that didn't include the time to rename the files. This method probably won't save much time over Method 1, since splitting the entire pdf takes only a minute or so.
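If you write down only the page each paper starts on, the last page of each paper can be worked out, since a paper ends on the page before the next one begins. Here is a minimal Python sketch of that arithmetic. It assumes the papers follow one another with no gaps and that the last paper runs to the end of the file; the function name and the sample numbers are hypothetical.

```python
def page_ranges(first_pages, total_pages):
    """Turn each paper's first page into an inclusive (first, last) range."""
    starts = sorted(set(first_pages))
    if not starts:
        return []
    if starts[0] < 1 or starts[-1] > total_pages:
        raise ValueError("first pages must lie between 1 and total_pages")
    ranges = []
    for i, first in enumerate(starts):
        # A paper ends on the page before the next paper begins;
        # the final paper runs to the end of the file.
        last = starts[i + 1] - 1 if i + 1 < len(starts) else total_pages
        ranges.append((first, last))
    return ranges


# Hypothetical 20-page file with papers starting on pages 1, 4, 10, and 15.
for first, last in page_ranges([1, 4, 10, 15], total_pages=20):
    print(f"{first}-{last}")
```

Any pages before the first starting page, such as a cover or table of contents, are left out of every range, so include page 1 in the list if you want them kept.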

CONCLUSIONS

I'm still looking for a simpler way to accomplish this. I thought I had found one using freeware, but it turned out that it only worked for one or two files and then started putting watermarks on the rest. Of course, if you pay for a good commercial program, you can do this much more easily, but that's not freeware. I'll keep looking and will write another article if I find a better way. It would be helpful if family history conference organizers would give us a flash drive or a webpage in which each paper was in a separate file and the entire collection could be downloaded all at once. Then, if you wanted one large file with all the papers, it would be easy to merge them. It's separating them out that takes the time.