DEALING WITH PDFS

2020 by Donald R. Snow
This page was last updated 2020-01-19.
Return to the  Utah Valley Technology and Genealogy Group Home Page  or  Don's Class Listings Page .
ABSTRACT:  PDF stands for Portable Document Format. It was developed by Adobe, which made the specification freely available so that anyone could create and read pdfs.  A pdf page displays the same on any device or computer, regardless of the operating system.  Because of this, many organizations, including the Church of Jesus Christ of Latter-day Saints, have adopted it for their manuals, handbooks, magazines, documents, and reports.  Thus the need to know how to work with pdfs: editing them, separating them into parts, rearranging pages, and making them searchable if they aren't already.  Many forms are in pdf, and free programs allow you to fill in the forms electronically.  There are many programs to do these tasks, including many free ones, and we will discuss some of them.  The class notes and related articles, all with active Internet links, are on Don's website  http://uvtagg.org/classes/dons/dons-classes.html .

    WELCOME AND INTRODUCTION

  1. Instructor is Donald R. Snow of St. George and Provo, Utah ( snowd@math.byu.edu ).
  2. These notes, with active Internet links and other related articles, are on Don's website  http://uvtagg.org/classes/dons/dons-classes.html .
  3. Tips:  (1)  To put an icon on your desktop for the URL for these notes, or any webpage, just drag the icon that is in front of the address in your browser to your desktop.  (2)  To open a link while keeping your place on this page, hold down the Control key while clicking the link, so it opens in a new tab.
  4. The problem for today:  What are pdfs and what are some free programs to work with them?

    DOCUMENTS AND IMAGES OF DOCUMENTS

  6. Documents can be in many formats: .odt, .doc, .docx, .txt, etc., but these formats don't display the same on all computers.
  7. Images of documents (pictures of the text) can be in many formats: .jpg, .tif, .png, etc., and these don't display the same on all computers, either.
  8. .pdf is a format for documents that displays the same on all computers and all operating systems; hence the benefit of pdfs.
  9. Remember:  a pdf works like a PICTURE of each document page, so it looks exactly the same on any computer, whether Windows, Apple, Linux, or anything else.
  10. pdf was developed by Adobe in the 1990s, and the specification was made freely available so everyone could create and use pdfs; it became an open international standard (ISO 32000) in 2008 -- see the history of pdf in  Wikipedia -- many organizations world-wide have adopted this format, including the Church of Jesus Christ of Latter-day Saints
  11. Types of pdf's
    1. pdf's with no text layer -- These are images (pictures) of the text page and are not searchable.
    2. pdf's with text layer -- these have a layer over the picture of the page that identifies the symbols as words in some language and hence are searchable; adding this text layer is called OCR'ing (Optical Character Recognition)   
    3. pdfs can also have additional features, such as encryption and password protection, but we won't discuss those here.
    4. Scanned pdfs come in different resolutions, e.g. 150 dpi (dots per inch), 300 dpi, or 600 dpi; the higher the resolution, the sharper and more accurate the image, but also the larger the file size
  12. pdfs can be read, searched, printed, edited, split, merged, rotated, annotated, emailed, uploaded to FamilySearch, etc.
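
As a rough illustration of the resolution note above, the following Python sketch estimates the uncompressed size of one scanned page. It assumes a US Letter page (8.5 x 11 inches) and 24-bit color (3 bytes per pixel); both are assumptions chosen for illustration, and the function name raw_page_bytes is hypothetical. Real pdfs are compressed, so actual files are much smaller. The point is only that doubling the dpi roughly quadruples the amount of image data.

```python
def raw_page_bytes(dpi, width_in=8.5, height_in=11.0, bytes_per_pixel=3):
    """Uncompressed size in bytes of one scanned page.

    Defaults assume a US Letter page in 24-bit color (illustrative values).
    """
    width_px = round(width_in * dpi)    # dots across the page
    height_px = round(height_in * dpi)  # dots down the page
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    for dpi in (150, 300, 600):
        megabytes = raw_page_bytes(dpi) / 1_000_000
        print(f"{dpi} dpi: about {megabytes:.1f} MB before compression")
```

Going from 150 dpi to 300 dpi doubles the dots in each direction, so the page holds four times as many dots; this is why high-resolution scans of a whole book can become very large.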

    SOURCES OF PDFS

  14. Word processors (Word, LibreOffice, WordPerfect, etc.) can save pages as pdfs; this is usually in an Export or Save As menu
  15. Windows 10 has a built-in "printer" to print anything to pdf -- Microsoft Print to PDF -- I set that as my default printer since I usually only want a pdf and not a printed hardcopy
  16. Flatbed and other types of scanners produce pdfs, usually without the text layer, so they are not searchable without OCR'ing them; as an example, you can scan the forms your doctor sends and write on them electronically and print a copy to take to your appointment  
  17. Screen capture programs such as FastStone Capture (old free version is available from  FastStone Capture 5.3) can save as pdf    
  18. Books downloaded from  FamilySearch , Google , Internet Archive , and HeritageQuest Online are usually in pdf 
  19. Downloads from websites such as  https://www.churchofjesuschrist.org/?lang=eng -- Church manuals, handbooks, magazines, and conference reports are all pdf
  20. Ordinance forms from FamilySearch are pdfs.
  21. You can convert from many formats by just printing the file to a pdf printer; two free conversion programs are  IrfanView  and  Hexonic

    PDF READERS

  23. There are many free pdf reader programs for computers, tablets, and smartphones; they have various settings, e.g. to show only one page or two pages side-by-side, as in a book  
  24. List of pdf programs in various categories -- Wikipedia  
  25. Adobe Reader is a free pdf reader and (partial) editor --this is NOT the full and expensive Adobe program; the free Adobe Reader is available from many websites including  https://acrobat.adobe.com/us/en/acrobat/pdf-reader.html -- (download the free version, not the commercial trial version)  
  26. Sumatra PDF -- freeware, very fast reader, easy to use 
  27. Gizmo's List of top 5 pdf Readers 
  28. Calibre -- good book cataloger, reader, organizer, and converter -- drag-and-drop pdfs and ePubs and it organizes and catalogs them; allows reading and conversion to and from various formats    

    PDF EDITORS

  30. Printing to pdf -- Windows 10 has a built-in Microsoft Print to PDF printer which you can set as your default printer, so anything you print from any program goes to pdf; there are also many free pdf printer programs you can download and install
  31. Good list and description of free pdf programs -- Gizmo's Review of Best Free PDF Tools 
  32. Two helpful and free pdf editors are PDF-XChange Editor --  https://www.tracker-software.com/product/downloads/enduser/pdf-xchange-editor  -- and Adobe Acrobat Reader --  https://acrobat.adobe.com/us/en/acrobat/pdf-reader.html
  33. Writing on pdfs 
    1. Can write on, annotate, put sticky notes on, etc., with the above two free programs
    2. LibreOffice -- freeware, can import pdf's into its DRAW program, edit them, and export as edited pdf's, but it's not as simple as writing text
    3. For forms to fill out, you can scan them to pdf and fill them in with an editor; then save the filled-in form and print it if you need to, and you always have the original copy on your computer
  34. Splitting, merging, rearranging, or rotating pages
    1. Split, merge, and rotate pdf's (create several pdf's from one, or a single pdf from several) -- pdfsam,  7-PDF Maker,  PDFill Free PDF Tools, and 7-PDF Split and Merge ( https://www.7-pdf.com/downloads ) -- see review and link on  dotTech
    2. Splitting out every page -- sometimes called "bursting" the file -- you can usually set the numbering so each file name contains the page number and the files sort in order
    3. To reassemble a split pdf you can open a copy of the original file in another window so you can see which pages are the ones you want to combine.
  35. Renaming
    1. Use Windows Explorer with the Preview pane (right side) open to see what's in a pdf without having to open it -- you can then rename the file without opening it
    2. May need to add leading 0's to make the pages sort in order, e.g. 001, 002, ...
    3. To sort pages with same name use numbering as ###a, ###b, etc. 
    4. It sometimes helps to write the page number in pencil on the original paper before scanning so you know which page it is
    5. Bulk File Namer -- very useful to rename collections of files
    6. See other class notes for my file naming system to make files findable and sort in order regardless of location 
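
The leading-zero tip above can be sketched in Python. This is a minimal illustration and not part of any program mentioned in these notes: the function name pad_page_number, the three-digit width, and the folder name split_pages are all assumptions. It takes a bare file name (no folder) and pads only the last run of digits in it, so a name like page12a.pdf keeps its letter suffix.

```python
import re
from pathlib import Path


def pad_page_number(filename, width=3):
    """Zero-pad the last run of digits in a file name, e.g. page7.pdf -> page007.pdf."""
    path = Path(filename)
    # Look for digit groups in the name without its extension.
    matches = list(re.finditer(r"\d+", path.stem))
    if not matches:
        return filename  # no page number found, leave the name alone
    last = matches[-1]
    padded = last.group().zfill(width)
    new_stem = path.stem[:last.start()] + padded + path.stem[last.end():]
    return new_stem + path.suffix


# Hypothetical usage on a folder of split pages (try it on a copy first):
# for pdf in list(Path("split_pages").glob("*.pdf")):
#     pdf.rename(pdf.with_name(pad_page_number(pdf.name)))
```

Without padding, page10.pdf sorts before page2.pdf because names are compared character by character; with padding, page002.pdf comes before page010.pdf as expected.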

    SEARCHING AND OCR'ING PDF'S

  36. To be searchable, pdf's must have the text layer from OCR (Optical Character Recognition); books from  FamilySearch Books  and  Internet Archive  already have this, but books from Google do not
  37. Adobe Acrobat Reader will now do OCR -- see link above
  38. A freeware program with built-in OCR is PDF-XChange Editor from  https://www.tracker-software.com/product/downloads/enduser/pdf-xchange-editor -- open the pdf and click on OCR; the new file is searchable; save it with a new name so you don't overwrite your old file; PDF-XChange Editor does a reasonable job, but is not as good as the full and expensive commercial OCR programs

    CONCLUSIONS

  40. Working with pdf's is important in family history and in everyday computer work.
  41. This has been just an introduction to dealing with pdf's and there is much more.  Some additional information is in other class notes and Freeware Corner articles on my webpage.
