How-To Geek
How Can I Download an Entire Web Site?

You don’t just want an article or an individual image; you want the whole web site. What’s the easiest way to siphon it all?
Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites.
Image available as wallpaper at GoodFon.
The Question
SuperUser reader Joe has a simple request:
How can I download all pages from a website?
Any platform is fine.
Every page, no exception. Joe’s on a mission.
The Answer
SuperUser contributor Axxmasterr offers an application recommendation:
HTTRACK works like a champ for copying the contents of an entire site. This tool can even grab the pieces needed to make a website with active code content work offline. I am amazed at the stuff it can replicate offline.
This program will do all you require of it.
Happy hunting!
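HTTrack also ships a command-line version (httrack) for Linux and macOS, so you don’t need the Windows GUI to use it. A minimal sketch, where example.com, the example-mirror output folder, and the domain filter are placeholders for your own site and destination:
httrack "https://example.com/" -O "./example-mirror" "+*.example.com/*" -v
Here -O sets the output directory, the "+*.example.com/*" filter keeps the crawl on the site’s own domain, and -v turns on verbose output.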
We can heartily recommend HTTRACK. It’s a mature application that gets the job done. What about archivists on non-Windows platforms? Another contributor, Jonik, suggests another mature and powerful tool:
Wget is a classic command-line tool for this kind of task. It comes with most Unix/Linux systems, and you can get it for Windows too (newer 1.13.4 available here).
You’d do something like:
wget -r --no-parent http://site.com/songs/
For more details, see the Wget Manual and its examples.
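If you want the mirror to be comfortably browsable offline, a few more flags help: --page-requisites pulls in the images, stylesheets, and scripts each page needs, --convert-links rewrites links to point at your local copies, and --adjust-extension saves pages with an .html extension. A rough sketch, with site.com standing in for the real host:
wget --mirror --page-requisites --convert-links --adjust-extension --no-parent http://site.com/songs/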
Have something to add to the explanation? Sound off in the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.
This is kind of misleading...
"How Can I Download an Entire Web Site?" The short answer... these days, that is impossible.
You can download the HTML, CSS, JS, images, and any media files... but that is NOT the entire website.
It would not be possible to download the PHP, XML, and other server-side files, because most systems don't tell you what files are needed. If you had direct links to each PHP file you could probably download them, except for those that include die commands when they are accessed directly.
Databases are also impossible to download without permission to do so.
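To see why, look at what the server actually returns for a PHP page: it executes the script and sends back the generated HTML, never the source. A quick check against a hypothetical example.com:
curl -I http://example.com/index.php
The response headers will report something like Content-Type: text/html; the PHP source, and the database behind it, never leave the server.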
So with HTTRACK, Wget, or any other kind of downloader... what the question asks for is not possible.
If the word "Entire" wasn't in the question then it would be a totally different conversation.
wget --header="Accept-Language: en-us,en;q=0.5" --header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" --header="Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" --user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" -r -p -k -c -np -E --tries=1 --timeout=5 -e robots=off http://the.web.site
....You can do a base install of Arch Linux with less text.
SiteSucker has been my go-to app on the Mac for a number of years.
https://itunes.apple.com/us/app/sitesucker/id442168834?mt=12
That will bypass .htaccess, but it will not allow proper downloading of PHP files, and it absolutely will not allow anyone to download databases.
The idea of downloading an "entire" website is impossible now, regardless of any legit tools.
You could black-hat attack a website to get at the contents, but short of that it is not possible... and in some cases even that isn't possible.
Site Sucker LIMITATIONS.
http://www.sitesucker.us/mac/limitations.html
I've always wanted a browser feature (e.g. extension) that would automatically do this for every website I visited. Especially now with huge HDDs this wouldn't be a problem (it could be set to clear the saves every 3-7 days, for example). It wouldn't need to be a "functioning" website, but rather like a screenshot of the page as you currently see it. Sometimes when traveling I may be without any connection for a while, so it would be great if that Wikipedia article I opened before I left (but didn't remember to manually save an offline copy of) was saved and ready for me to use for reference in my homework, for example.