DON'S FREEWARE CORNER -- JUL 2014
USING HTTRACK TO DOWNLOAD WEBSITES
FOR PRESERVATION AND READING OFFLINE
This page was last updated 2014-07-08
=============================================================================================
©2014 Donald R. Snow
These notes are published in TAGGology, our Utah Valley
Technology and Genealogy Group (UVTAGG) monthly newsletter, and
are posted here on http://uvtagg.org/classes/dons/dons-classes.html
where there may be updates, corrections, or additions.
CAUTION
The UVTAGG website has malware that we are trying to get rid of.
It adds unwanted links, and sometimes even pages, when you click
on links. This is NOT in my notes; it comes in from an external
source through our website. It happens with multiple browsers
on multiple computers in multiple places, so it's not just on
one computer. I have noticed that it sometimes opens a new tab
in the browser and takes me there without my clicking on
anything. Sometimes closing the new tab clears the problem, and
it doesn't recur until I open the browser again. I am sorry for
the problem, but there is nothing I can do about it at present.
BTW, I have noticed that downloading pages from my website with
HTTRACK doesn't download the malware along with them, so the
downloaded copies show up clean.
WHY USE HTTRACK TO DOWNLOAD A WEBSITE
Websites change all the time and even disappear. You may
want to preserve a copy or just have it available on your own
computer to read without being connected to the Internet.
There are several free programs that will do this, and this note
discusses one of them, HTTRACK, available from
http://www.httrack.com/ . The home page has information about
the current version, a download button, and links to a manual, a
forum, a blog, and other information. The Windows version of
HTTRACK is called WinHTTrack. There is a set of step-by-step
instructions and suggestions on how to use it at
http://www.httrack.com/html/step.html . There are also some
no-no's there about using it incorrectly. For older versions of
websites that have since changed on the Internet, you can
frequently find them at the Internet Archive,
https://archive.org/ . The Internet Archive has been taking
"snapshots" of much of the public web since 1996; how often a
given website is captured varies. These are copies of the
static parts of the web, not the dynamic parts that are
generated when you fill in a blank on a form. For example, on
FamilySearch Family Tree you have to enter your name or someone
else's, so you can't use HTTRACK to download the entire Family
Tree.
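To check whether the Internet Archive holds an older copy of a page, you can query its Wayback Machine availability service. The Python sketch below builds the query address and picks the archived address out of the reply. The function names are made up for this illustration, and the lookup function needs an Internet connection, so treat it as a starting point rather than a finished tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address of the Wayback Machine availability service.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query address; timestamp is optional, e.g. "20140708"."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the archived address from a parsed reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp=None):
    """Ask the service over the network (needs an Internet connection)."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling lookup("http://uvtagg.org/", "20140708") would return the address of the snapshot closest to that date, or None if nothing has been archived.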
USING HTTRACK
After downloading and installing HTTRACK, whenever you want to
save a website, open HTTRACK and click on File > New Project >
Next. Here you give your project a name such as
HTTRACKWEBSITE-FH-HELPS in the category FAMILY HISTORY, and
include the folder you want to save it in, e.g.
C:\DownloadedWebsites. I have found that including
HTTRACKWEBSITE in the title allows me to find these easily on my
computer using the freeware program EVERYTHING that I have
discussed repeatedly in these notes. The pick arrows
(downward pointing triangles at the ends of the lines) show the
other projects and categories you have set up earlier. You
can save many websites into the category you select, but give
the projects descriptive names, and maybe even dates, so you can
tell exactly what you downloaded and when, e.g.
HTTRACKWEBSITE-UVTAGG-VideoLibrary-2014-07-08. Then click
Next and fill in the URL of the website you want to download,
e.g. http://uvtagg.org/videolibrary/ . Now click on Set
Options. Only a few of the many options here need to be
changed from the defaults. Click on the tab Limits and set
the Maximum Mirroring Depth, which is how many levels of links
down into the website you want to download. This will depend
on the website you are downloading. For the UVTAGG Video
Library you would probably only need 2 levels since there are no
links that go below those levels. You can set the number of
levels low to start and update the download later if you need
more levels. The Maximum External Depth refers to levels of
links that take you to pages that are not on the website you are
downloading. To start, set this at 0 until you see if you
need more. On the tab Log, Index, Cache, put a check in
Store All Files in Cache. Leave the options in all the
rest of the tabs at their defaults and click OK, then Next >
Finish. As it works you see a list of the files it is
downloading with progress bars to indicate how it's doing.
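The same settings can also be given on the command line, since HTTRACK includes a command-line program named httrack. The Python sketch below only builds the command and does not run it. It assumes httrack is installed and on your PATH, and the helper names project_name and mirror_command are made up for this illustration. The options -O, -r, -%e, and -k are the command-line counterparts of the save folder, Maximum Mirroring Depth, Maximum External Depth, and Store All Files in Cache.

```python
import datetime


def project_name(site, label, day=None):
    """Make a dated project name in the style used in these notes."""
    day = day or datetime.date.today()
    return f"HTTRACKWEBSITE-{site}-{label}-{day.isoformat()}"


def mirror_command(url, output_dir, depth=2, external_depth=0):
    """Build an httrack command line that mirrors url into output_dir."""
    return [
        "httrack", url,
        "-O", output_dir,        # folder to save the downloaded website in
        f"-r{depth}",            # maximum mirroring depth
        f"-%e{external_depth}",  # maximum external depth
        "-k",                    # store all files in cache
    ]


name = project_name("UVTAGG", "VideoLibrary", datetime.date(2014, 7, 8))
command = mirror_command("http://uvtagg.org/videolibrary/", name)
print(" ".join(command))
```

To actually start the download from Python, you could pass the list to subprocess.run(command, check=True).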
You can cancel the operation at any time. If you have set
it to download many levels, it may take a long time (hours) to
download. If you set it for only a few, it will probably
only take a couple of minutes to finish. When it finishes,
click Next and you see a panel on the left with your computer's
file structure and the folder showing the projects you have
downloaded. You can click on the Browse the Downloaded
Websites button or else click on the file labeled index.html to
see a list in HTTRACK of all the projects you have downloaded in
that folder. Clicking on any one opens it in your
browser. For the URL in the address bar at the top of your
browser you will see something like file:///C:/ ..., which
indicates you are looking at the downloaded webpage as it is now
stored on your computer. If there are links beyond the
levels you have downloaded and you are connected to the
Internet, clicking there will take you to the online URL and you
see the full address in the address bar. So, by watching
the address in the browser you can tell if you are looking at a
downloaded copy or the online version. Once you have
downloaded the website you can read it in your browser without
being connected to the Internet. You can copy the website
folder to a flash drive and transfer it to another computer, if
you want. This is a good way to read information without
being connected to the Internet and to keep your own archived
copy of the website. Be careful of the size of the websites
you download, since some are very large, take a long time to
download, and take up much storage space. Unfortunately, there
seems to be no way to tell how much of the website you have
downloaded until it finishes.
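One way to keep an eye on storage space is to add up the sizes of the files in the project folder, which you can do even while a download is still running. This does not tell you what fraction of the website is done, only how much is on disk so far. The Python sketch below is a minimal example; the function names are made up for this illustration, and the folder in the usage note is the example folder from these notes.

```python
import os


def folder_size_bytes(folder):
    """Add up the sizes of all files under folder, including subfolders."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # a file may disappear or be locked during a download
    return total


def human_readable(num_bytes):
    """Format a byte count with a convenient unit."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

For example, print(human_readable(folder_size_bytes(r"C:\DownloadedWebsites"))) would show the current total for that folder.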
While it is working you can be working on other things on your
computer. If it seems to be taking too long, there is a
Cancel button which gives you the option of stopping and
retaining what you have already downloaded or going back and
deleting all the already-downloaded files. Remember
copyrights so you don't break copyright laws. I don't
think it is breaking copyright laws to have a downloaded copy of
a website on your own computer to read later, as long as you
don't pass it on, change it, or sell it, but I'm not an
attorney. You can update copies of websites on your
computer by opening HTTRACK and using the update feature.
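Updating can also be done with the command-line program httrack, which refreshes the copy in the current folder when run with the --update option. The Python sketch below assumes httrack is installed and on your PATH and that project_dir is the folder of a project you downloaded earlier. The function names are made up for this illustration.

```python
import subprocess


def update_command():
    """Command line that asks httrack to update the copy in the current folder."""
    return ["httrack", "--update"]


def update_mirror(project_dir):
    """Run the update inside project_dir (this starts a real download)."""
    return subprocess.run(update_command(), cwd=project_dir, check=True)
```

Nothing is downloaded until update_mirror is called with the path of a project folder.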
Once the website has been downloaded, you can use it in your
browser to make pdf's, text files of the pages, screenshots, and
even scrolling window screenshots with freeware. But
that's another article.
VIEWING YOUR DOWNLOADED WEBSITES
There are two ways to view your downloaded websites.
First, open HTTRACK and click on the link for the project you want
to view. Second, without opening HTTRACK, go to the
download folder and click on the index.html file. This
opens your browser and you see the list of downloaded
websites. For either method of viewing the downloaded
websites the links will take you to downloaded pages on your own
computer as far down in levels as you saved, and after that,
will take you online, if your computer is connected to the
Internet.
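The check of the address bar described above can be written as a short test: an address that starts with file: is a downloaded copy, and anything else is online. The Python sketch below is a minimal example, and the function names and the sample file path are made up for this illustration.

```python
from urllib.parse import urlparse


def is_local_copy(address):
    """True when the address points at a file stored on this computer."""
    return urlparse(address).scheme == "file"


def describe(address):
    """Say in words which kind of page the address refers to."""
    return "downloaded copy" if is_local_copy(address) else "online version"


print(describe("file:///C:/DownloadedWebsites/index.html"))
print(describe("http://uvtagg.org/videolibrary/"))
```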
WEBSITES WITH PASSWORDS
In the FAQ's (Frequently Asked Questions) on the Helps page is
an example of how to download a website that requires a user
name and password. Here's the format to put in the HTTRACK
URL box, replacing [user] and [password] (brackets included)
with your own user name and password:
http://[user]:[password]@www.somewebpage.com/mybox.html .
However, if the website requires you to enter additional
information before continuing on, I don't know how to do that.
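If you build such an address often, a small helper can put the pieces together. The Python sketch below follows the format shown above and percent-encodes characters such as @ and : in the user name and password, which is the general convention for web addresses. Whether HTTRACK accepts every encoded character is an assumption here, and the function name and sample login are made up for this illustration.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_with_login(url, user, password):
    """Insert user:password@ in front of the host name of url."""
    parts = urlsplit(url)
    # Encode characters like "@" and ":" so they cannot be mistaken
    # for the separators in the address.
    login = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{login}@{parts.netloc}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(url_with_login("http://www.somewebpage.com/mybox.html", "don", "s3cret"))
```

Keep in mind that an address built this way contains your password in plain text, so be careful where you save or share it.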
CONCLUSIONS
You may want to experiment with downloading some useful websites
to learn how to use HTTRACK and then keep it in mind for saving
something you really need later.
=================================================================================