Comments
www.httrack.com
Great, thanks. I'm on Mac but can run Parallels.
HTTrack could do it, but it would also copy the code snippets the Wayback Machine (web.archive.org) injects into each page, so you'd have to strip those from every page afterwards.
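For what it's worth, a two-step sketch of that from the command line (the capture URL, output directory, and depth are placeholders; it assumes the injected snippet sits between the Wayback Machine's BEGIN/END WAYBACK TOOLBAR INSERT HTML comments):

    # mirror a Wayback capture into ./mirror, following links up to 6 levels deep
    httrack "https://web.archive.org/web/20130101000000/http://example.com/" -O ./mirror -r6

    # strip the injected toolbar from every saved page
    # (BSD sed on macOS needs -i ''; GNU sed takes plain -i)
    find ./mirror -name '*.html' -exec sed -i '' \
        '/BEGIN WAYBACK TOOLBAR INSERT/,/END WAYBACK TOOLBAR INSERT/d' {} +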
It only seems to save index.html and goes no deeper, even with no depth limits set in the settings.
SiteSucker
@n1kko Set the permissions so it ignores the robots.txt file.
It shouldn't honor the robots.txt file; that's what lets it go deeper. This works for most websites, though I'm not sure about the Wayback Machine.
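That default would also explain HTTrack stopping at index.html, since it honors robots.txt unless told otherwise. If I remember the option right, -s0 turns that off (same placeholder URL as above):

    # -s0: never follow robots.txt or meta robots tags
    httrack "https://web.archive.org/web/20130101000000/http://example.com/" -O ./mirror -r6 -s0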
Tried SiteSucker and it just saves a robots.txt containing this (web.archive.org's robots.txt as of 2013-10-02):

    User-agent: *
    Disallow: /

    User-agent: ia_archiver
    Allow: /

So every crawler is blocked from everything except the archive's own ia_archiver, which is why the crawl stops at robots.txt.
SiteSucker has an "Ignore Robot Exclusions" option in its settings; with that enabled, it's working now.
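For anyone doing the same thing from the command line instead of SiteSucker, wget has an equivalent switch; a rough sketch with a placeholder URL and a small delay to go easy on the server:

    # -e robots=off is wget's version of "ignore robot exclusions"
    wget --mirror --page-requisites --convert-links --wait=1 -e robots=off \
        "https://web.archive.org/web/20130101000000/http://example.com/"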