Wayback machine?

Is there any way to save/download an old site from waybackmachine.org?

Comments

  • www.httrack.com

  • n1kko Member

    Great thanks, I'm on Mac but can run parallels :D

  • HTTrack can do it, but it will also copy the code snippets waybackmachine.org adds to every page, so you'll have to remove them from each page
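Removing the added snippets from each saved page can be scripted. The following is a minimal sketch, not a complete cleaner: it assumes the injected toolbar sits between the HTML comments `<!-- BEGIN WAYBACK TOOLBAR INSERT -->` and `<!-- END WAYBACK TOOLBAR INSERT -->`, and the function name `strip_wayback_toolbar` and the sample page are made up for illustration. Any other scripts or rewritten links the archive adds are not handled here.

```python
import re

# Assumption: the injected toolbar is wrapped in these two HTML comments.
# Check a saved page first; if the markers differ, adjust the pattern.
TOOLBAR_RE = re.compile(
    r"<!-- BEGIN WAYBACK TOOLBAR INSERT -->.*?<!-- END WAYBACK TOOLBAR INSERT -->",
    re.DOTALL,  # the toolbar block spans many lines
)


def strip_wayback_toolbar(html: str) -> str:
    """Return html with the (assumed) toolbar block removed."""
    return TOOLBAR_RE.sub("", html)


# Hypothetical saved page, shortened for the example.
sample = (
    "<html><body>"
    "<!-- BEGIN WAYBACK TOOLBAR INSERT --><div id='wm-ipp'>toolbar</div>"
    "<!-- END WAYBACK TOOLBAR INSERT -->"
    "<p>Original page</p></body></html>"
)

cleaned = strip_wayback_toolbar(sample)
print(cleaned)
```

To clean a whole mirror, the same function could be applied to every `.html` file the downloader saved.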

  • n1kko Member

    Only seems to save index.html and goes no deeper. No limits set in settings either :(

  • @n1kko said:
    Great thanks, I'm on Mac but can run parallels :D

    SiteSucker

  • @n1kko set the option to ignore the robots.txt file

    The crawler must not follow robots.txt if it is to go deeper. This works for most websites, not sure about waybackmachine.org

  • n1kko Member

    Tried SiteSucker and it just saves robots.txt, with this content:

    robots.txt web.archive.org 2013-10-02

    User-agent: *
    Disallow: /

    User-agent: ia_archiver
    Allow: /

  • n1kko Member

    SiteSucker has an "ignore robot exclusions" option in its settings. Working now :D
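Why the downloaders stopped at robots.txt can be checked with Python's standard `urllib.robotparser`. This is a rough sketch using the rules quoted in the thread; the helper name `allowed` and the example URL are made up for illustration, and real download tools may match user agents differently.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the thread (robots.txt from web.archive.org, 2013-10-02).
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: ia_archiver
Allow: /
"""

# Hypothetical archived-page URL, used only for illustration.
EXAMPLE_URL = "http://web.archive.org/web/2013/http://example.com/"


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the quoted rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Only ia_archiver is allowed; every other agent falls under "Disallow: /".
for agent in ("ia_archiver", "HTTrack", "SiteSucker"):
    print(agent, allowed(agent, EXAMPLE_URL))
```

A crawler that honours these rules has nothing it may fetch, which matches what was seen above: the download only goes deeper once robot exclusions are ignored.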
