Comments
www.httrack.com
Great, thanks. I'm on Mac but can run Parallels.
HTTrack could do it, but it would also copy the code snippets the Wayback Machine (web.archive.org) injects into each page, so you'd have to strip those from every page afterwards.
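For what it's worth, a two-step sketch of that from the command line (the capture URL, output directory, and depth are placeholders; it assumes the injected snippet sits between the Wayback Machine's BEGIN/END WAYBACK TOOLBAR INSERT HTML comments):

    # mirror a Wayback capture into ./mirror, following links up to 6 levels deep
    httrack "https://web.archive.org/web/20130101000000/http://example.com/" -O ./mirror -r6

    # strip the injected toolbar from every saved page
    # (BSD sed on macOS needs -i ''; GNU sed takes plain -i)
    find ./mirror -name '*.html' -exec sed -i '' \
        '/BEGIN WAYBACK TOOLBAR INSERT/,/END WAYBACK TOOLBAR INSERT/d' {} +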
It only seems to save index.html and goes no deeper, even with no depth limits set in the settings.
SiteSucker
@n1kko Set the permissions so it ignores the robots.txt file.
It shouldn't honor the robots.txt file; that's what lets it go deeper. This works for most websites, though I'm not sure about the Wayback Machine.
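That default would also explain HTTrack stopping at index.html, since it honors robots.txt unless told otherwise. If I remember the option right, -s0 turns that off (same placeholder URL as above):

    # -s0: never follow robots.txt or meta robots tags
    httrack "https://web.archive.org/web/20130101000000/http://example.com/" -O ./mirror -r6 -s0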
Tried SiteSucker and it just saves a robots.txt containing this (web.archive.org's robots.txt as of 2013-10-02):

    User-agent: *
    Disallow: /

    User-agent: ia_archiver
    Allow: /

So every crawler is blocked from everything except the archive's own ia_archiver, which is why the crawl stops at robots.txt.
SiteSucker has an "Ignore Robot Exclusions" option in its settings; with that enabled, it's working now.
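For anyone doing the same thing from the command line instead of SiteSucker, wget has an equivalent switch; a rough sketch with a placeholder URL and a small delay to go easy on the server:

    # -e robots=off is wget's version of "ignore robot exclusions"
    wget --mirror --page-requisites --convert-links --wait=1 -e robots=off \
        "https://web.archive.org/web/20130101000000/http://example.com/"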