DON'S FREEWARE CORNER - AUG 2019

SCANNING AND TRANSCRIBING HANDWRITTEN DOCUMENTS SUCH AS LETTERS

Don's Freeware Corner articles are printed in the UTAH VALLEY TECHNOLOGY AND GENEALOGY GROUP (UVTAGG) Newsletter TAGGology each month and are posted on his Class Notes Page https://uvtagg.org/classes/dons/dons-classes.html where there may be corrections and updates.

©2019 Donald R. Snow - Last updated 2019-10-19

HANDWRITTEN DOCUMENTS

Most of us have collections of handwritten documents in our family history archives: letters, deeds, wills, censuses, contracts, etc. Some of these we would like to transcribe so they are more readable and searchable. There are people and companies that will do this for a fee, but most of us want to do it ourselves or have our family members help. Such handwritten documents could be from ancestors or even from yourself, e.g., your missionary letters home. Digitizing them by scanning is a first step, so they can be backed up and distributed. Digital copies in several places mean they won't be lost in the event of a disaster such as a fire or flood. Scanned copies are easier to work with too, since they can be shown on screen, enlarged, darkened, etc., to make them more readable. In text format, after transcription, they can be examined and searched for names, words, or phrases, and people can read them more easily. So the first step is to scan them into digital format; then they can be transcribed in various ways. You may even want to "farm out" some of the scans to other family members to help with the transcription.

SCANNING DOCUMENTS

Scanners are not expensive, and you may have access to a good one in your local Family History Center (FHC). These scanners scan directly to flash drives, so you take your hard-copy documents and a flash drive to the FHC and scan them to the flash drive. At home you copy the scans to your computer and rename the files so that the names tell you what's in them without opening them and so that they sort easily and in order for the person they pertain to. There are some helpful ideas and programs in the class notes on my website https://uvtagg.org/classes/dons/dons-classes.html . Scanned documents can be edited to make them more readable or to take out extraneous parts.
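As a rough illustration of the renaming idea, here is a small Python sketch that builds file names that sort by person and then by date. The naming pattern (surname, given name, date, short description), the function name sortable_name, and the sample person and dates are all my own assumptions for the example; use whatever pattern suits your files.

```python
import re
from datetime import date


def sortable_name(surname, given, written, description, ext="pdf"):
    """Build a file name that sorts by person first and then by date.

    The pattern Surname_Given_YYYY-MM-DD_description.ext is only an
    assumed convention for this example.
    """
    # Lowercase the description and turn runs of other characters into hyphens
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # ISO dates (YYYY-MM-DD) sort chronologically when sorted as plain text
    return f"{surname}_{given}_{written.isoformat()}_{slug}.{ext}"


# Hypothetical person and dates, purely for illustration
names = [
    sortable_name("Doe", "John", date(1958, 11, 2), "Letter to parents"),
    sortable_name("Doe", "John", date(1958, 3, 14), "Letter to parents"),
]
for name in sorted(names):
    print(name)
```

Putting the year first in the date is what makes an ordinary alphabetical sort in the file manager come out in date order.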

Bleed-through is ink soaking through the paper so that writing on one side shows on the other, and some old documents have so much of it that the words are hard to read. There are electronic ways to reduce it, and some scanners at FHCs have settings that minimize it during scanning. My missionary letters from Mexico and Guatemala to my parents in Los Angeles were written on very thin "onion skin" paper so we didn't have to pay so much for airmail postage. For some reason, unknown to me now, I wrote them all with a green-ink fountain pen and it bled through the thin paper. (sigh) I didn't discover the scanner setting until after I had already scanned all my letters, so I went back and rescanned them all. The ink still shows as green, but most of the bleed-through is gone, so you see only what I wrote on one side of the paper, and the scans are much more readable than the originals. Other things you can do to make scans more readable are to darken the very light ones or lighten the very dark ones.

To help you decide what scanner settings to use, see the other articles and class notes on my webpage. I usually scan black and white documents to pdf at 150 dpi (dots per inch). Most handwritten documents don't contain photos, but if they do, I use a higher resolution. For photos the rule of thumb from the Library of Congress is to scan them so the final copy is about 250 dpi, that is, each final inch has 250 dots or pixels. That means that, if you are scanning a 2 x 3 inch photo and want the final copy to be 4 x 6 inches, you are doubling each dimension, so scan the original at 2 x 250 = 500 dpi. For most handwritten documents scanning at 150 dpi is sufficient.
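The arithmetic for the photo rule of thumb can be written as a tiny Python function. This is only a sketch: the function name scan_dpi is my own, and taking the larger of the two enlargement ratios when the width and height ratios differ is my assumption, not part of the rule as quoted.

```python
import math


def scan_dpi(original_inches, final_inches, target_dpi=250):
    """Return the scanner dpi needed so the enlarged copy has about target_dpi."""
    orig_w, orig_h = original_inches
    final_w, final_h = final_inches
    # Enlargement factor; if the two ratios differ, use the larger one
    # so that neither dimension falls short of the target (an assumption)
    factor = max(final_w / orig_w, final_h / orig_h)
    return math.ceil(factor * target_dpi)


# The example from the text: a 2 x 3 inch photo enlarged to 4 x 6 inches
print(scan_dpi((2, 3), (4, 6)))  # 500
```

If the final copy is the same size as the original, the factor is 1 and the function simply returns the 250 dpi target.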

Once the documents are scanned, you can back them up and put copies on flash drives and/or email them to family members for backups or to help you with transcription. You can also post them on websites such as FamilySearch Family Tree, so they are preserved and others can benefit from them.

TRANSCRIBING BY TYPING

The National Archives has a Transcription Tips website at https://www.archives.gov/citizen-archivist/transcribe/tips . It suggests you type exactly what you see, including misspellings, words out of place, strike-outs, etc. No one has come up with a satisfactory program to read handwriting yet, so you have to do the work yourself. The simplest way is to just read and type what you see into your favorite text editor. That always works and is sometimes the best way to do it. Several programs are available to help with this; see -- https://abundantgenealogy.com/word-word-document-transcribing-technology/ .

TRANSCRIPT

A free-for-non-commercial-use program called TRANSCRIPT is available from http://www.jacobboerema.nl/en/Freeware.htm . It requires that the image be in a digital (graphic) format such as .jpg or .tif. The program doesn't have many image editing features, so you want the image already color-corrected and clear enough to read. It has two panels, one above the other in the free version, with the image in the top panel and what you type in the bottom panel. One nice feature is that, as you type and press Enter to go to the next line of your typed text, the image moves up too, so you don't have to stop and move the image yourself. The text you type can be saved in several formats including .rtf (rich text format) and .ods (open document format), both of which are readable by most text editing programs. The main problem that I have found with using TRANSCRIPT is getting the scanned image into a readable form before opening it in TRANSCRIPT. Below is a screenshot of TRANSCRIPT in action.

[Screenshot of TRANSCRIPT in action]

GENSCRIBER

Another program to help with transcription is GENSCRIBER. This is also free for non-commercial use. Dick Eastman wrote an article about it in his Eastman's Online Genealogy Newsletter; see -- https://blog.eogn.com/2014/12/16/genscriber-a-free-transcription-tool-for-genealogy-research/ . The program can be downloaded from http://genscriber.com/genapps/en/start . To get it to install, I had to right-click the installation file and run it as an Administrator, even though it says you shouldn't have to. To handle pdfs you download and install a free add-on. I haven't had much experience with this program, so I can't say much more about it, but it looks helpful.

TRANSCRIBING BY READING INTO SPEECH-RECOGNITION PROGRAMS

Another way to transcribe documents is to read them into speech- or voice-recognition programs. There are several such programs, both free and commercial. The premier commercial one is Dragon NaturallySpeaking, which is updated regularly and costs about $100 when you find it on sale. All speech-recognition programs have you train them to recognize your voice by having you read a script into a microphone while the program analyzes the way you speak. Dragon NaturallySpeaking gives good accuracy for slow and clear speech, but it can't handle "continuous speech" very well. Continuous speech is what they call the way we all talk to each other. Several years ago I talked to the Dragon Support group and they told me that it was not really capable of getting high accuracy for continuous speech. I had phoned them to ask how to transcribe my daily journal, which I do on a digital recorder and used to do on a cassette tape. So, with any speech-recognition program, you have to plan on going back to correct errors. However, the result you get without editing may be sufficient to search and find names, etc., until you have time to correct it all. As far as I know, there are no programs that will index audio speech yet.
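To show what searching an uncorrected transcript for names might look like, here is a minimal Python sketch. The function name find_mentions and the sample text are made up for the example. It only finds exact spellings (ignoring upper and lower case), so a name the speech-recognition program got wrong will be missed.

```python
def find_mentions(transcript, phrase):
    """Return (line_number, line) pairs for lines containing phrase, ignoring case."""
    wanted = phrase.casefold()
    hits = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        if wanted in line.casefold():
            hits.append((number, line.strip()))
    return hits


# Made-up sample text standing in for an uncorrected transcript
sample = """Dear Mother and Dad,
We arrived in guatemala city on Tuesday.
Elder Smith sends his regards.
The weather in Guatemala has been wet."""

for number, line in find_mentions(sample, "Guatemala"):
    print(number, line)
```

The same function works on text read from a saved transcript file, which you could load with Python's built-in open() before calling it.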

WINDOWS 10 has speech-recognition software built in and that will be the subject of another Freeware Corner article. Also, there are several apps on smartphones and tablets that could be useful as speech recognition to transcribe what you read from handwritten documents. One of these is Ava, whose icon is an ampersand "&". It is free for 5 hours every month. If you buy the commercial version, your use time is unlimited. I learned about it from the Utah State Hearing Impaired program classes I attended, since it can be used to transcribe a person's speech when you can't hear them. You hold your smartphone near their mouth and they talk and you read what they say. When using such a mobile device to transcribe something, you have to know how to get the text from that program into whichever text program you use.

CONCLUSIONS

Automatic transcription of handwriting is still in the future, but is coming. Some computer scientists told me recently that neural networks are proving more accurate in reading handwriting than anything else. These are computer networks that are "trained" by giving them lots of examples and telling them "yes" or "no" when the network "thinks" a word is a particular word. The computer scientists told me that such networks were getting better accuracy than approaches that try to describe to the program which curves mean which letters.

Most of us have documents that need transcribing to make them readable and searchable. This can always be done by hiring someone to do it, e.g., a grandkid. The first step for any document is to scan it so you can back it up, work with it to make it more readable, and give copies to others. The next step is to find a way to transcribe it, and for most of us that means doing it ourselves. This article has shown a few computer programs that help and there are many others.
=================================