PDF'S AND DOCUMENTS
©2015 by Donald R. Snow
- Welcome and Introduction
- Document Images
- Where PDF's Come From
- PDF Readers
- PDF Editors
- Searching and OCR'ing PDF's
- Conclusions
This page was last updated 2015-02-20
Return to the Utah Valley Technology and Genealogy Group Home Page or
Don's Class Listings Page.
WELCOME AND INTRODUCTION
- Instructor is Donald R. Snow (snowd@math.byu.edu) of Provo and
St. George, Utah.
- These notes are posted on http://uvtagg.org/classes/dons/dons-classes.html
- Tips: (1) Easy to put an icon on your desktop for the
URL for these notes or any URL; just drag the icon from in front of
the address in your browser to your desktop. (2) To open a
link from here in another tab, but keep your place in these notes,
hold down the Control key while clicking the link.
- The problem for today: How do you work with pdf's (Portable
Document Format) and documents? Documentation is extremely
important in family history; without it our genealogy is only hearsay.
DOCUMENT IMAGES
- Images of documents can be in any image format such as pdf, jpg,
tif, png, and others (ePub is a related e-book format), but pdf is the
most common format for images of documents
- pdf's fix the layout of each page, much like a picture of it, so
they look exactly the same on any computer, whether Windows, Apple,
Linux, etc.; because of this many organizations, including the LDS
Church, have adopted the format for their documents
- Adobe developed the format in the early 1990's and released it to
the general public a few years later; now used world-wide -- see
history of pdf in Wikipedia
- Types of pdf's
- pdf's with text layer -- hence, searchable -- the text layer is
an overlay that identifies symbols of the image as words
- pdf's without text layer -- hence, not searchable
- Some pdf's are encrypted, password-protected, etc., but we won't
discuss those here
- Can be in many resolutions, e.g. 150 dpi (dots per inch), 300
dpi, and 600 dpi
- Things people do with pdf's -- read, print, search, edit, convert,
split, merge, rotate, annotate, email, upload to FamilySearch, etc.
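To get a feel for what the resolutions listed above mean, here is a small Python sketch that turns dpi into pixel counts. It assumes a US Letter page of 8.5 x 11 inches; the page size and the function name are my assumptions for illustration, not part of any scanner or pdf program.

```python
# Pixel dimensions of a scanned page at a given resolution.
# Assumption: US Letter paper, 8.5 x 11 inches (chosen for illustration).
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def pixel_size(dpi, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (150, 300, 600):
    width, height = pixel_size(dpi)
    megapixels = width * height / 1_000_000
    print(f"{dpi} dpi: {width} x {height} pixels, about {megapixels:.1f} megapixels")
```

Doubling the dpi doubles both the width and the height in pixels, so the scan holds four times as many pixels. That is why a 600 dpi scan is so much larger than a 300 dpi scan of the same page.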
WHERE PDF'S COME FROM
- Word processors such as LibreOffice, OpenOffice, Word, WordPerfect,
etc. -- they print or export directly to pdf from their menus
- Scanners, flatbed and other types
- Screen capture programs, such as the older FastStone Capture 5.3
- Downloading books from FamilySearch, Google, Internet Archive, and
HeritageQuest Online
- Downloading from websites such as https://www.lds.org/?lang=eng
-- LDS Church manuals, handbooks, magazines, conference reports are
all in pdf
- Conversions from many other formats
- A good conversion-to-pdf program is 7-PDF Maker -- see review and
link on dotTech
- IrfanView splits tif's into separate pages or creates multipage
tif's from separate tif's, e.g. when you scan both sides of a photo
on PS Photo Scanner at FS Lib; sequence of commands is Image >
Add Files > tell it where to put them and click Create -- can later
convert to pdf's, if wanted
- Hexonic does batch conversions from tif's to pdf's, also does pdf
editing with splits and merges
- Can convert from many formats by just printing the
file to a pdf printer -- see below
PDF READERS
- Many freeware programs read pdf's on computers, tablets, and
smartphones; some show one page at a time, some show two pages open
like a book
- List of pdf programs in various categories -- Wikipedia
- Adobe Reader XI -- freeware, NOT the full expensive Adobe Acrobat
program; be careful when downloading and installing or you get
bloatware with it -- Reader XI has some annotation features, so it's
also a pdf editor
- Sumatra PDF -- freeware, very fast reader, easy to use
- Gizmo's List of top 5 pdf Readers
- Calibre -- good book cataloger, reader, and converter; organizes all
the pdf's, ePub's, etc., that it finds on your computer and allows
you to read or convert them in various ways and formats
PDF EDITORS
- Printing to pdf
- Instead of printing hard copies, you can print to pdf from any
program that prints by using a pdf printer -- a pdf printer installs
just like an ordinary printer on your computer; select it instead of
your hard-copy printer and you get a pdf of whatever you were going
to print on paper -- I leave my computer set to print to pdf so I can
see exactly what the print will look like and then print from the
pdf; frequently I don't even need a hard copy, so I just save the pdf
- Gizmo's Reviews of pdf writers
- Many good pdf printers are available, e.g. CutePDF
- Most word processors will print directly to pdf, e.g. LibreOffice,
OpenOffice, and Word -- sometimes called "Publish" in the menu
- Many good free pdf editors are available -- Adobe Acrobat is a good
program, but it is expensive
- List of best free pdf editors -- Gizmo's Review of Best Free PDF
Tools -- this page has lots of other good information about working
with pdf's
- LibreOffice -- freeware,
can import pdf's into its DRAW program, edit them, and export as
edited pdf's, but it's not as simple as writing text
- Annotating -- putting text and/or sticky notes anywhere on the
page -- PDF-XChange Viewer and Nitro Reader Free
- Can save a pdf form, fill it in with an editor, save filled
in form, then print it for the doctor, etc.
- Splitting, merging, and rotating pages
- Split, merge, and rotate pdf's -- pdfsam, 7-PDF Maker, and PDFill
Free PDF Tools -- create several pdf's from one or create a single
pdf from several
- Splitting out every page -- sometimes called "bursting"
the file -- it helps to have some way of numbering the pages if
you do split out every page
- Renaming
- Use Windows Explorer with the Preview Pane (right side) open to see
what's in the pdf without having to open it -- you can then easily
rename the file
- If you want pages to sort in order, rename the files with numbers
### in front of file name; include leading 0's, if needed
- If several pages have same file name, can use numbering with ###a,
###b, etc.
- If you are scanning a large document or book, I find it helps to
number the pages in pencil before scanning, so you can see the page
number later
- Bulk File Namer -- very useful to rename collections of files
- My file naming system makes files findable and sorts them in order
-- see my Supplementary Notes pages
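The numbering tips above (### in front of the file name, leading 0's, and ###a, ###b for pages that share a name) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample file names are made up, and it is not how Bulk File Namer or any other program mentioned here works. The last function renames real files, so try it on a copy of your folder first.

```python
from pathlib import Path


def padded_name(number, name, width=3, suffix=""):
    """Build a name like '007 deed.pdf' or '007a deed.pdf'."""
    return f"{number:0{width}d}{suffix} {name}"


def plan_renames(names, width=3):
    """Pair each name, in the order given, with its numbered replacement."""
    return [(old, padded_name(i, old, width)) for i, old in enumerate(names, start=1)]


def apply_renames(folder, plan):
    """Rename files inside `folder`; skip any target name that already exists."""
    folder = Path(folder)
    for old, new in plan:
        target = folder / new
        if not target.exists():
            (folder / old).rename(target)


# Why the leading zeros matter: many programs sort names as plain text,
# one character at a time, so "10" lands before "2" without padding.
unpadded = ["1 deed.pdf", "2 deed.pdf", "10 deed.pdf"]
padded = [padded_name(n, "deed.pdf") for n in (1, 2, 10)]
print(sorted(unpadded))
print(sorted(padded))
```

With three-digit numbers the files sort correctly up to 999 pages; use width=4 for longer books. A letter suffix such as 007a and 007b keeps related pages together, since they sort after 007 and before 008.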
SEARCHING AND OCR'ING PDF'S
- To be searchable, pdf's must include the text layer, which is
created by OCR (Optical Character Recognition) and identifies the
characters on the page as words; books downloaded from FamilySearch
Books and Internet Archive have the text layer, but books from Google
do not
- A freeware program with built-in OCR is PDF-XChange Viewer -- open
the pdf, click on OCR, and when it finishes, save the file with a new
name so you don't wipe out your old version; the new pdf is then
searchable in any reader with search capabilities; PDF-XChange Viewer
does a reasonable job, but not as good as commercial OCR programs
CONCLUSIONS
- Working with pdf's is very helpful in family history since you can
do things like save documents in this format, OCR them so they are
searchable, print them later, upload them to FamilySearch Family Tree,
email them to family and other researchers, and read them on mobile
devices.
- There is much more to working with pdf's; other pages of my notes
and Supplementary Notes pages have additional information and
programs, including my naming system for files so they sort in
chronological order for the person or event.