DON'S FREEWARE CORNER - MAY 2017
MAKING SEARCHABLE THE INDEX TO THE JOURNAL HISTORY OF THE
LDS CHURCH
©2017 Donald R. Snow - This page was last updated 2017-05-10.
These Freeware Corner notes are published in TAGGology,
our Utah Valley Technology and Genealogy Group (UVTAGG)
monthly newsletter. They are also posted on my Freeware
Corner Notes page on http://uvtagg.org/classes/dons/dons-classes.html
where the links are active and there may be corrections and
additions and other related notes and articles.
JOURNAL HISTORY OF THE LDS CHURCH
The LDS Church Historian's Office made a daily scrapbook of
happenings related to the Church, covering its founding in 1830
up to the year 2008. It is called the Journal History of
the Church and consists of hundreds of notebooks with
looseleaf pages in chronological order of the event or
article. It was actually started in 1906 by Andrew
Jenson when he was the Assistant Church Historian and was
based on information he had collected and published in his
books about Church Chronology, including those in 1899
and 1914, which are available to download from
https://archive.org/index.php . Information about
Andrew Jenson is at Wikipedia, FamilySearch Memories, BYU
Religious Studies Center, and Dictionary
of Mormon Biography. He was a Mormon
convert from Denmark and changed the spelling of his name
from Jensen to Jenson when coming to Utah in 1866.
Whenever he found a new article, journal entry, or
information piece, he added it to the Journal History by
pasting it onto a new page and putting the page in its
chronological place in the notebooks. The pages of
each day's items were numbered sequentially. Since
these items included articles, letters, events, addresses,
statements, etc., they form a valuable collection for family
history. To index it, 3x5 inch cards were typed with
the main name or event at the top of the card followed by
each reference to that name or event with its date.
New cards were started as others were filled. The
cards were kept in card file drawers like a library card
catalog and that card index file was microfilmed several
times. The images of the Journal History, and later
the Index images, were released on DVD by the Church
Historian's Office in 2002 and are now posted on the Church
History Department website at https://history.lds.org/article/journal_history_guide?lang=eng .
To search them you click on the digitized film that contains
the name or event you want and go through the cards one by
one. The cards are alphabetical by the surname or event
at the top, but there is much other information on the
cards. For example, letters written by a person may be
indexed on that person's card, but not on the card of the
person they were addressed to. Hence, searching is
limited to the main person or event on the card. The same
is true of many other such indexes, e.g., the Early Church
Information File (ECIF) and the Susan Easton-Black Early Church
Membership volumes. This article
will show a way to make the card files for the Journal History
Index completely searchable for any word or date anywhere on
the cards. This method is possible since the cards were
typewritten and the CHO website allows you to download the
digital films of the Index. The method can be used on
other card file indexes, if you can somehow download large
collections of the cards.
VIEWING THE JOURNAL HISTORY INDEX
The Journal History Index, 1830-1972, is online at https://eadview.lds.org/findingaid/CR%20100%20142 .
These are the images from the microfilms. Clicking on
Overview shows that there are 192 microfilms. Clicking
on Browse the Collection takes you to a page with 4
alphabetical sets with roughly 50 digitized films in each
set. Click on the set that includes the name or event
you are looking for. The title of each file gives the
first and last cards on that
film. Click on the file you want and you see the first
card of the collection of 1500 or more card images in
alphabetical order. For example, Snow is on the film
"Snafu - Socialized Medicine". Highlighting the film
name shows a link to View Digital Object and gives its full
name and where it is stored in the Church Historian's
Office. Click on Browse the Collection to view the card
images one by one. The Journal History Index cards are
typewritten, but some are not very clear. On the right
side is a panel with image numbers and "Way Points" of
sections of the collection so you can jump to later images
closer to the name or event you are looking for. The
first card with Snow on it is Image 196 on that film, but it
is about snowfalls. The cards pertaining to the Snow
family start with Image 197 and go to Image 1135. These
include people and places named Snow, e.g. Snow College,
Erastus Snow, Lorenzo Snow, Willard Snow, etc. There is
also a small icon near the bottom with 4 small squares which
gives you thumbnail views of the cards with many on screen at
once. The size of these thumbnails can be increased or
decreased by holding down the CONTROL key while rolling the
mouse wheel. Clicking on the single rectangle icon takes it
back to showing a single card and its size can also be made
larger or smaller by CTRL+Mouse wheel. On that
particular film there are 1796 card images. You can
click on the cards one at a time to see the references and
download or print any card you want, but that is slow since it
has to download the next card each time you click.
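
Since each film title gives the first and last cards on that film, picking the right film is just an alphabetical range check. Here is a minimal Python sketch of that idea. The function name film_contains is made up for illustration, and it assumes the cards are filed in plain letter-by-letter order, which may not match the actual filing rules exactly.

```python
def film_contains(film_title: str, name: str) -> bool:
    """Return True if `name` sorts between the first and last card in the title.

    Assumes titles look like "First Card - Last Card" and that cards are
    filed in simple case-insensitive alphabetical order (an assumption).
    """
    # Split on the first " - " separator into the two boundary cards.
    first, last = (part.strip().casefold() for part in film_title.split(" - ", 1))
    return first <= name.strip().casefold() <= last


print(film_contains("Snafu - Socialized Medicine", "Snow"))   # True
print(film_contains("Snafu - Socialized Medicine", "Smith"))  # False
```

Running the same check over the titles in each of the 4 sets would point to the one film to open.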
DOWNLOADING A JOURNAL HISTORY INDEX FILE
On the CHO website, when you are at one of the digitized index
files, at the top right is an icon labeled Download.
When highlighted you see a note that the jpg images can be
downloaded to your computer as pdf's. Clicking there
takes you to an information card that says the Church
Historian's Office has no objection to you downloading and
using this index file for non-commercial purposes, as long as
you give them credit. Clicking that you agree to the
terms gets it ready to download the pdf of the entire film,
all 1796 images for this particular film, and asks you what
you want to name the download file. I named it something
like JournalHistoryIndex-SnafuToSocializedMedicine--[the CHO
name on the file]--[Date in International Date Format
YYYY-MM-DD].pdf . It was 95 megabytes in size and took
about 10 minutes to download. This pdf file itself is
much quicker to read through than doing it online, since you
don't have to bring down a new image each time. Better still,
this pdf can be OCR'd (run through Optical Character
Recognition) to make it searchable for every word on the
cards. This is possible since the cards were typewritten and not
handwritten.
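
If you download several films, it helps to build the file names the same way every time. The following Python sketch follows the naming pattern described above. The function name download_name and the "CHO-NAME" placeholder are hypothetical; put in whatever name the CHO website shows for the file.

```python
from datetime import date


def download_name(first_card: str, last_card: str, cho_name: str, when: date) -> str:
    """Build a name like JournalHistoryIndex-<First>To<Last>--<CHO name>--YYYY-MM-DD.pdf"""

    def squash(text: str) -> str:
        # Capitalize the first letter of each word and drop the spaces.
        return "".join(word[:1].upper() + word[1:] for word in text.split())

    # date.isoformat() gives the International Date Format YYYY-MM-DD.
    return (
        f"JournalHistoryIndex-{squash(first_card)}To{squash(last_card)}"
        f"--{cho_name}--{when.isoformat()}.pdf"
    )


# "CHO-NAME" is a placeholder and the date is only an example.
print(download_name("Snafu", "Socialized Medicine", "CHO-NAME", date(2017, 5, 10)))
```

Putting the date last in YYYY-MM-DD form keeps files for the same film sorted in download order.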
OCR'ING THE PDF TO MAKE IT EVERY-WORD SEARCHABLE
There are commercial OCR (Optical Character Recognition)
programs like the full Adobe program (not the free Adobe
Reader) and ABBYY. Recently, Tracker Software has
included a good OCR program in their non-commercial version of
PDF-XCHANGE EDITOR. Download the program from https://www.tracker-software.com/product/pdf-xchange-editor .
It downloads as a zip (compressed) file which you then install
on your computer. You may need to install it as an
Administrator and I always use the Custom Installation, so it
doesn't change my default settings. Once it is
installed, run it and click File > Open and open the pdf
you downloaded. You will see the first card in the
window and can click on through to see other cards. To
OCR the file click on Document > OCR (Pages). I set
it to OCR all pages, Accuracy = Low (since some of the
typewritten cards are hard to read), Output Type = Create New
Searchable PDF, Quality = 150 dpi, and check Deskew.
Deskew means to straighten up (de-skew) each card so the card
and printing are always horizontal in the new pdf. The
cards were microfilmed by just laying them on a black
background and the images of most are slanted. As the
OCR'ing takes place a bar shows how far it has gone and on my
desktop computer it took about 20 minutes to do the entire
file of 1796 images. You can now save the file with a
new name so it doesn't overwrite the original. This file
is now every-word searchable in most pdf readers such as Adobe
Reader (free), Sumatra, or PDF-XCHANGE EDITOR. Using a
better OCR program might do a better OCR job, but this seems
to be adequate and you can now click through all occurrences
of whatever word or date you want. With the deskewing,
the cards whose images were slanted are now straightened, so the
black background rectangle is at an angle. There is an
autocrop feature on PDF-XCHANGE EDITOR that you can use to
eliminate the black background on each card automatically, but
using that feature in the free version puts large watermarks
at both sides of the top of each card. I used this
feature, but left enough room at the top of each card so these
watermarks didn't cover any of the writing. There is
probably a good freeware autocropping program and, if I find
one, I'll update these notes to include it. To show you
what the final result looks like, I have posted in the
Erastus Snow section of my website the part of this
searchable pdf with the cards for Erastus Snow and his sons
named Erastus. This is 164 cards and the searchable pdf is 116
MB, so it will take a few minutes to download, but you can see
what the cards look like. If the text could be edited to
be html, the file would be much smaller, but I haven't tried
doing that yet.
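
To illustrate what every-word searching adds over the alphabetical headings, here is a small Python sketch that looks for a word or date anywhere in the text of each card. It assumes the OCR text has already been pulled out of the pdf, one string per card, by some other tool not shown here. The function name find_cards is hypothetical and the sample strings are invented stand-ins, not real card text.

```python
def find_cards(card_texts, term):
    """Return 1-based card numbers whose text contains `term`, ignoring case."""
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(card_texts, start=1)
        if needle in text.casefold()
    ]


# Invented stand-ins for OCR output, only to exercise the function.
sample = [
    "DOE, John. Letter to Richard Roe, placeholder date A",
    "ROE, Richard. Address given, placeholder date B",
    "EXAMPLE EVENT. Article, placeholder date A",
]

print(find_cards(sample, "roe"))               # [1, 2]
print(find_cards(sample, "placeholder date a"))  # [1, 3]
```

Notice that the search for "roe" finds the first card, even though that card is filed under a different name at the top. That is exactly the kind of reference the heading-only index misses.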
CONCLUSIONS
It would be nice to have all 192 digitized films of the
Journal History Index done this way so they could all be
searched for every word on them and perhaps that could be done
by "crowd-sourcing" the project. Better still
would be having the entire Journal History pdf'd and
searchable, so we wouldn't even need an Index. Until
that is done, a searchable index is a major help since the Journal History
of the LDS Church is a valuable historical resource of
information about ancestors.
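
For a rough idea of the size of such a project, here is a back-of-the-envelope Python sketch. It assumes, without any check, that every one of the 192 films is about like the one timed above (95 MB, about 10 minutes to download, about 20 minutes to OCR). The actual films will vary, so treat the results only as an order of magnitude.

```python
FILMS = 192        # films in the Index, per the Overview page
DOWNLOAD_MIN = 10  # minutes to download the one film timed above
OCR_MIN = 20       # minutes to OCR that film on a desktop computer
SIZE_MB = 95       # size of that film's pdf before OCR


def total_hours(films=FILMS, per_film_min=DOWNLOAD_MIN + OCR_MIN, volunteers=1):
    """Hours of computer time per volunteer if the films are split evenly."""
    return films * per_film_min / 60 / volunteers


print(total_hours())               # 96.0 hours for one person
print(total_hours(volunteers=12))  # 8.0 hours each for 12 volunteers
print(FILMS * SIZE_MB / 1000)      # 18.24 GB of downloads, roughly
```

Under these assumptions one person would need about 96 hours of computer time, while a dozen volunteers would need about 8 hours each.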
===================================