DON'S FREEWARE CORNER - MAY 2017
MAKING SEARCHABLE THE INDEX TO THE JOURNAL HISTORY OF THE LDS CHURCH

© 2017 Donald R. Snow - This page was last updated 2017-05-10.
These Freeware Corner notes are published in TAGGology, our Utah Valley Technology and Genealogy Group (UVTAGG) monthly newsletter.  They are also posted on my Freeware Corner Notes page on  http://uvtagg.org/classes/dons/dons-classes.html  where the links are active and there may be corrections and additions and other related notes and articles.

JOURNAL HISTORY OF THE LDS CHURCH

The LDS Church Historian's Office (CHO) kept a daily scrapbook of happenings related to the Church, covering the period from its beginning in 1830 up to 2008.  It is called the Journal History of the Church and consists of hundreds of notebooks of looseleaf pages arranged chronologically by the date of the event or article.  It was actually started in 1906 by Andrew Jenson when he was the Assistant Church Historian and was based on information he had collected and published in his books on Church chronology, including those of  1899  and  1914 , which are available to download from https://archive.org/index.php .  Information about Andrew Jenson is at  Wikipedia ,  FamilySearch Memories ,  BYU Religious Studies Center , and  Dictionary of Mormon Biography .  He was a Mormon convert from Denmark who changed the spelling of his name from Jensen to Jenson when he came to Utah in 1866.  Whenever he found a new article, journal entry, or other piece of information, he added it to the Journal History by pasting it onto a new page and putting the page in its chronological place in the notebooks.  The pages of each day's items were numbered sequentially.  Since these items include articles, letters, events, addresses, statements, etc., they form a valuable collection for family history.
To index the Journal History, 3x5 inch cards were typed with the main name or event at the top of the card, followed by each reference to that name or event with its date.  New cards were started as earlier ones filled up.  The cards were kept in card file drawers like a library card catalog, and that card index file was microfilmed several times.  The images of the Journal History, and later the Index images, were released on DVD by the Church Historian's Office in 2002 and are now posted on the Church History Department website at  https://history.lds.org/article/journal_history_guide?lang=eng .  To search them you click on the digitized film that contains the name or event you want and go through the cards one by one.
The cards are alphabetical by the surname or event at the top, but there is a lot of other information on the cards.  For example, letters written by a person may be indexed on that person's card, but not on the card of the person they were addressed to.  Hence, searching is limited to the main person or event on the card.  This is the case with many other such indexes, e.g., the Early Church Information File (ECIF) and the Susan Easton Black Early Church Membership volumes.  This article shows a way to make the card files for the Journal History Index completely searchable for any word or date anywhere on the cards.  This method is possible since the cards were typewritten and the CHO website allows you to download the digital films of the Index.  The method can be used on other card file indexes, if you can somehow download large collections of the cards.

VIEWING THE JOURNAL HISTORY INDEX

The Journal History Index, 1830-1972, is online at  https://eadview.lds.org/findingaid/CR%20100%20142 .  These are the images from the microfilms.  Clicking on Overview shows that there are 192 microfilms.  Clicking on Browse the Collection takes you to a page with 4 alphabetical sets of roughly 50 digitized films each.  Click on the set that includes the name or event you are looking for.  The title of each file in the set gives the first and last cards on that film.  Click on the file you want and you see the first card of a collection of 1500 or more card images in alphabetical order.  For example, Snow is on the film "Snafu - Socialized Medicine".  Highlighting the film name shows a link to View Digital Object and gives its full name and where it is stored in the Church Historian's Office.  Click on Browse the Collection to view the card images one by one.  The Journal History Index cards are typewritten, but some are not very clear.
On the right side is a panel with image numbers and "Way Points" marking sections of the collection, so you can jump to later images closer to the name or event you are looking for.  On that particular film there are 1796 card images.  The first card with Snow on it is Image 196, but it is about snowfalls.  The cards pertaining to the Snow family start with Image 197 and go to Image 1135.  These include people and places named Snow, e.g. Snow College, Erastus Snow, Lorenzo Snow, Willard Snow, etc.  There is also a small icon near the bottom with 4 small squares which gives you thumbnail views of the cards with many on screen at once.  The size of these thumbnails can be increased or decreased by holding down the CONTROL key while rolling the mouse wheel.  Clicking on the single rectangle icon goes back to showing a single card, and its size can also be made larger or smaller by CTRL+Mouse wheel.
You can click on the cards one at a time to see the references and download or print any card you want, but that is slow since it has to download the next card each time you click. 
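Since each film title gives the first and last cards on that film, picking the right film is just an alphabetical range check.  Here is a rough Python sketch of that idea.  Only the title "Snafu - Socialized Medicine" comes from the website; the other two titles are made up for illustration, and plain string comparison is only an approximation of how the cards were actually filed.

```python
# Film titles give the first and last card on each film.
# Only "Snafu - Socialized Medicine" is a real title from the article;
# the other two are hypothetical placeholders.
FILM_TITLES = [
    "Smith, A - Smoot",              # hypothetical
    "Snafu - Socialized Medicine",   # from the article
    "Society - Sperry",              # hypothetical
]


def parse_title(title):
    """Split 'First - Last' into a lowercase (first, last) pair."""
    first, last = title.split(" - ", 1)
    return first.strip().lower(), last.strip().lower()


def find_film(name, titles=FILM_TITLES):
    """Return the title of the film whose card range should contain name."""
    key = name.strip().lower()
    for title in titles:
        first, last = parse_title(title)
        # The heading belongs on this film if it sorts between
        # the first and last card headings.
        if first <= key <= last:
            return title
    return None


print(find_film("Snow"))
```

With these sample titles the lookup for "Snow" lands on the "Snafu - Socialized Medicine" film, and a name outside every range returns None.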

DOWNLOADING A JOURNAL HISTORY INDEX FILE

On the CHO website, when you are at one of the digitized index files, there is an icon labeled Download at the top right.  When it is highlighted you see a note that the jpg images can be downloaded to your computer as a pdf.  Clicking there takes you to an information card that says the Church Historian's Office has no objection to your downloading and using this index file for non-commercial purposes, as long as you give them credit.  Clicking that you agree to the terms gets the pdf of the entire film ready to download, all 1796 images for this particular film, and asks you what you want to name the download file.  I named it something like JournalHistoryIndex-SnafuToSocializedMedicine--[the CHO name on the file]--[Date in International Date Format YYYY-MM-DD].pdf .  It was 95 megabytes in size and took about 10 minutes to download.  This pdf file by itself is much quicker to read through than the online version, since you don't have to bring down a new image each time.  Better still, this pdf can be OCR'd (run through Optical Character Recognition) to make it searchable for every word on the cards.  This is possible since the cards were typewritten and not handwritten.
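If you download many films, it helps to build the file names the same way each time.  The following is a small Python sketch of the naming pattern described above.  The function name download_name and the placeholder "CHO-NAME" are my own inventions for illustration; substitute whatever name the CHO website shows for the file.

```python
import re
from datetime import date


def download_name(first_card, last_card, cho_name, when=None):
    """Build a file name following the pattern described in the notes."""
    when = when or date.today()

    def squash(text):
        # Capitalize each word and drop spaces and punctuation.
        words = re.findall(r"[A-Za-z0-9]+", text)
        return "".join(w[:1].upper() + w[1:] for w in words)

    return (
        f"JournalHistoryIndex-{squash(first_card)}To{squash(last_card)}"
        f"--{cho_name}--{when.isoformat()}.pdf"
    )


# "CHO-NAME" is a placeholder, not the real CHO name of the film.
print(download_name("Snafu", "Socialized Medicine", "CHO-NAME", date(2017, 5, 10)))
```

The date comes out in YYYY-MM-DD form, so the files sort chronologically within each film name.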

OCR'ING THE PDF TO MAKE IT EVERY-WORD SEARCHABLE

There are commercial OCR (Optical Character Recognition) programs like the full Adobe Acrobat program (not the free Adobe Reader) and ABBYY.  Recently, Tracker Software has included a good OCR program in their non-commercial version of PDF-XCHANGE EDITOR.  Download the program from  https://www.tracker-software.com/product/pdf-xchange-editor .  It downloads as a zip (compressed) file which you then install on your computer.  You may need to install it as an Administrator, and I always use the Custom Installation so it doesn't change my default settings.  Once it is installed, run it and click File > Open and open the pdf you downloaded.  You will see the first card in the window and can click on through to see other cards.
To OCR the file click on Document > OCR (Pages).  I set it to OCR all pages, Accuracy = Low (since some of the typewritten cards are hard to read), Output Type = Create New Searchable PDF, Quality = 150 dpi, and check Deskew.  Deskew means to straighten up (de-skew) each card so the card and its printing are always horizontal in the new pdf.  The cards were microfilmed by just laying them on a black background, and the images of most are slanted.  After deskewing, the cards whose images were slanted are straightened, so it is now the black background rectangle that sits at an angle.  As the OCR'ing takes place a bar shows how far it has gone, and on my desktop computer it took about 20 minutes to do the entire file of 1796 images.  You can now save the file with a new name so it doesn't overwrite the original.  This file is now every-word searchable in most pdf readers such as Adobe Reader (free), Sumatra, or PDF-XCHANGE EDITOR.  A better OCR program might do a better job, but this seems to be adequate, and you can now click through all occurrences of whatever word or date you want.
There is an autocrop feature in PDF-XCHANGE EDITOR that you can use to eliminate the black background on each card automatically, but using that feature in the free version puts large watermarks at both sides of the top of each card.  I used this feature, but left enough room at the top of each card so these watermarks didn't cover any of the writing.  There is probably a good freeware autocropping program and, if I find one, I'll update these notes to include it.  To show you what the final result looks like, I have posted on my website in the Erastus Snow section the part of this searchable pdf with the cards for Erastus Snow and his sons named Erastus.  This is 164 cards and the searchable pdf is 116 MB, so it will take a few minutes to download, but you can see what the cards look like.  If the text could be converted to html, the file would be much smaller, but I haven't tried doing that yet.
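To illustrate what "every-word searchable" means, here is a rough Python sketch of the kind of lookup a pdf reader does once the OCR text exists.  It assumes the text of each card has already been pulled out of the pdf into a list, one string per image, by some other tool that is not shown here.  The function names and the sample card text are made up for illustration and are not transcribed from real cards.

```python
import re
from collections import defaultdict


def build_index(card_texts):
    """Map each lowercase word or date to the image numbers containing it."""
    index = defaultdict(set)
    for image_number, text in enumerate(card_texts, start=1):
        # Keep letters and digits, plus separators used inside dates
        # and names (hyphen, slash, apostrophe).
        for token in re.findall(r"[A-Za-z0-9]+(?:[-/'][A-Za-z0-9]+)*", text):
            index[token.lower()].add(image_number)
    return index


def search(index, term):
    """Return sorted image numbers whose text contains term as a whole token."""
    return sorted(index.get(term.lower(), set()))


# Made-up card text for illustration only.
cards = [
    "SNOW, Erastus  Letter to J. Doe  1852-03-14",
    "SNOW, Lorenzo  Address at conference  1852-03-14",
    "SNOWFALL  Heavy storm in the valley  1860-01-02",
]
index = build_index(cards)
print(search(index, "Doe"))          # a name that is not the card heading
print(search(index, "1852-03-14"))   # a date anywhere on the cards
```

The point is that "Doe" is found even though it is not the heading of any card, which is exactly what the original card index could not do.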

CONCLUSIONS

It would be nice to have all 192 digitized films of the Journal History Index done this way so they could all be searched for every word on them, and perhaps that could be done by "crowd-sourcing" the project.  But better still would be having the entire Journal History pdf'd and searchable, so we wouldn't even need an Index.  Until that is done, a searchable index is a major help, since the Journal History of the LDS Church is a valuable historical resource of information about ancestors.
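To get a feel for the size of such a project, here is a back-of-the-envelope estimate in Python.  It assumes, without any checking, that every one of the 192 films resembles the one film described in these notes (95 MB, about 10 minutes to download, about 20 minutes to OCR, 1796 images), so treat the totals as very rough.

```python
FILMS = 192                  # number of microfilms shown under Overview
MB_PER_FILM = 95             # size of the one pdf downloaded in these notes
DOWNLOAD_MIN_PER_FILM = 10   # approximate download time for that pdf
OCR_MIN_PER_FILM = 20        # approximate OCR time on a desktop computer
IMAGES_PER_FILM = 1796       # images on that one film; others will differ

total_gb = FILMS * MB_PER_FILM / 1000
total_hours = FILMS * (DOWNLOAD_MIN_PER_FILM + OCR_MIN_PER_FILM) / 60
total_images = FILMS * IMAGES_PER_FILM

print(f"About {total_gb:.1f} GB to download")
print(f"About {total_hours:.0f} hours of downloading and OCR'ing")
print(f"About {total_images:,} card images")
```

Under those assumptions the whole Index is on the order of 18 GB and roughly 96 hours of computer time, which is why splitting the films among volunteers looks attractive.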
===================================