How would you provide clients access to easily download extremely large backups?

Francisco Top Host, Host Rep, Veteran

For NameCrane Email we already keep hourly offsite backups using rsync. We have that sitting on top of ZFS, which does hourly/daily/weekly/monthly snapshot rotating.

This is great for us as it completes an entire node-wide sync in under 5 minutes. It doesn't, however, give end users an easy way to download a copy of these backups. We could tell people to just use imapsync or similar, but then they won't be getting backups of their calendars, notes, contacts, cloud storage, meetings, etc.

Since a domain could grow huge (up to 10TB even) with emails, documents, your favorite nightly material, etc., generating tar.gz/zips of this data isn't viable. Even if we rip 10Gbit/sec generating that backup, it could take multiple hours to complete a single large backup for a single domain, and then users are still stuck trying to pull down a 10TB archive.
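As a rough check on the "multiple hours" figure, here is a back-of-the-envelope sketch. It assumes decimal units (1 TB = 8000 Gbit) and a perfectly saturated link with no overhead; the 1 Gbit/s client link is a made-up comparison point, not a figure from this thread.

```python
def transfer_hours(size_tb: float, link_gbit_per_s: float) -> float:
    """Hours needed to move size_tb terabytes over a link of link_gbit_per_s."""
    size_gbit = size_tb * 1000 * 8  # decimal units: 1 TB = 1000 GB = 8000 Gbit
    return size_gbit / link_gbit_per_s / 3600


if __name__ == "__main__":
    # 10 Gbit/s is the rate mentioned above; 1 Gbit/s is an assumed client link.
    for label, gbit in [("10 Gbit/s", 10.0), ("1 Gbit/s (assumed client link)", 1.0)]:
        print(f"10 TB at {label}: {transfer_hours(10, gbit):.1f} h")
```

Under those assumptions the archive takes a bit over two hours to generate at line rate, and roughly ten times that for a client pulling it at 1 Gbit/s.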

Remember, we don't charge for backups in any capacity, so we want to make sure that it doesn't become a huge burden on either infrastructure or our time. Our current rsync setup requires ~no upkeep, but again, those aren't exposable to the end user.

Our ideas so far:

Some sort of streaming zip/tar file

We're going to still have the same issue of the archive being a single big blob and a pain in the ass for the user to download.
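For reference, a minimal sketch of what "streaming" could mean here, using Python's standard `tarfile` stream mode. `stream_tar` and `CountingSink` are hypothetical names for illustration; the sink stands in for something like an HTTP response body. It avoids staging the archive on disk, but the result is still one big blob for the user.

```python
import io
import os
import tarfile
import tempfile


def stream_tar(src_dir: str, out_stream) -> None:
    """Write src_dir as a gzip-compressed tar to out_stream without seeking."""
    # Mode "w|gz" is tarfile's stream mode: output is written sequentially,
    # so out_stream only needs a write() method.
    arcname = os.path.basename(src_dir.rstrip(os.sep))
    with tarfile.open(fileobj=out_stream, mode="w|gz") as tar:
        tar.add(src_dir, arcname=arcname)


class CountingSink:
    """Write-only stand-in for a network response body (hypothetical)."""

    def __init__(self):
        self.bytes_written = 0
        self.chunks = []

    def write(self, data):
        self.bytes_written += len(data)
        self.chunks.append(bytes(data))
        return len(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "example.com")
        os.mkdir(root)
        with open(os.path.join(root, "inbox.txt"), "w") as fh:
            fh.write("hello")
        sink = CountingSink()
        stream_tar(root, sink)
        print(f"streamed {sink.bytes_written} bytes")
```

Because the archive is generated on the fly, an interrupted download generally cannot be resumed with a range request, which is the same problem described above.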

Biweekly/Monthly PUSH to a user supplied S3 bucket

This was the original idea and not a bad one. We'd only be able to do 1 - 2 PUSH's per month though due to bucket restrictions, excessive bandwidth usage, etc.

Provide per domain restic/borg2 repositories a user can connect to

I like this one, but it is more technically involved for the end user. Restic works well on both Windows & Linux (which is great since Smartermail does too). Borg2 is still in development, and while it works fine, it's still a long way out.

The plus with this one is we'd store all backups to the same repository, so an end user would have full access to all available snapshots, not just whatever the latest is. You can cherry pick single files, folders, mount it, etc.
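To give an idea of what the end user would actually run, here is a small sketch that builds the restic command lines for listing snapshots, cherry-picking a folder, and mounting. The repository URL and paths are hypothetical; the `snapshots`, `restore --target --include`, and `mount` subcommands are standard restic, but how a per-domain repository would be exposed is purely an assumption.

```python
def restic_cmd(repo: str, *args: str) -> list[str]:
    """Build a restic argv list for the given repository."""
    return ["restic", "-r", repo, *args]


def list_snapshots(repo: str) -> list[str]:
    # Shows every snapshot in the repository, not just the latest one.
    return restic_cmd(repo, "snapshots")


def restore_path(repo: str, snapshot: str, include: str, target: str) -> list[str]:
    # Restores only the paths matching `include` from one snapshot.
    return restic_cmd(repo, "restore", snapshot, "--target", target, "--include", include)


def mount(repo: str, mountpoint: str) -> list[str]:
    # Browses all snapshots as a filesystem; relies on FUSE being available.
    return restic_cmd(repo, "mount", mountpoint)


if __name__ == "__main__":
    # Hypothetical repository URL, for illustration only.
    repo = "rest:https://backup.example.com/example.com"
    for argv in (
        list_snapshots(repo),
        restore_path(repo, "latest", "/mail/alice", "/tmp/restore"),
        mount(repo, "/mnt/restic"),
    ):
        print(" ".join(argv))
```

The commands are only printed here; a real client would also need the repository password supplied (for example via restic's `RESTIC_PASSWORD` environment variable).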

Some sort of restricted SSH access

This comes off pretty hacky and there's lots of places for it to break.

Screw it, imapsync baby!

This would be us keeping DR backups on your behalf, but you'd have to find your own way to export your data. This probably runs afoul of GDPR.

Ideas? :)

Francisco

Comments

  • cats Member
    edited April 2025

    In Discord someone mentioned PBS, which might be a reasonable option? It's really, really good at dedupe + incremental backups. Using proxmox-backup-client + PBS might work, since it supports backing up directories into a pxar archive (their archive format, which also has its own tool: https://pbs.proxmox.com/docs/pxar-tool.html).

    pxar can also mount things as a filesystem

  • Options 2 & 3 seem fine to me. As a user who doesn't have 10TB of backups, I'm fine.

  • @Francisco said: Biweekly/Monthly PUSH to a user supplied S3 bucket

    This would be ideal.

    @Francisco said: Some sort of restricted SSH access

    This sounds pretty dangerous for mail servers.

    Thanked by 1gbzret4d
  • Francisco Top Host, Host Rep, Veteran

    @dedipromo said: This sounds pretty dangerous for mail servers.

    I hate it in all aspects for sure, but it would've been on a read-only remote mirror. We'd then restrict SSH to just sftp and rsync or something like that. It's the clunkiest option that's come up.

    @dedipromo said: This would be ideal.

    Yeah, this was still the original plan. If we build it on top of rclone then we could allow a remote SFTP/FTP/S3. Probably not Google Drive since it gets a bit funky with permissions/tokens/etc.

    Francisco

    Thanked by 1dedipromo
  • The only thing I can think of right now is making your own wrapper app around borg for convenience, with read-only access to the backups.

  • kdjma Member

    (my understanding) it wouldn't run foul of gdpr, if someone asked you'd have to export but it doesn't need to be available for self service.

    Why would the S3 push option be more bandwidth than the zip streaming, does it not support incremental backups?

    I wonder what % of users would actually enable these backups, maybe the bandwidth usage won't be excessive.

    S3 push would be the most ideal for me, but I need more frequent than biweekly, I'd want daily to be honest, I'm happy to put in work so restic or borg2 is fine.

    I want to add, thanks for looking into client backups, some providers simply ignore this.

  • How would you provide clients access to easily download extremely large backups?
    ...
    Some sort of streaming zip/tar file

    You might find something useful at http://dar.linux.free.fr/doc/presentation.html

  • yoursunny Member, IPv6 Advocate

    Offer backups for free.
    Charge $25 per tarball download or $150/hour for custom restore, unless it's the provider that screwed up and lost data.

  • Francisco Top Host, Host Rep, Veteran
    edited April 2025

    @kdjma said: Why would the S3 push option be more bandwidth than the zip streaming, does it not support incremental backups?

    I'm mostly thinking just the likelihood of the download getting interrupted, meaning you have to restart from the beginning. Bandwidth on our end isn't really an issue, especially in markets where we're on an IX (like Kansas & Netherlands).

    @kdjma said: I wonder what % of users would actually enable these backups, maybe the bandwidth usage won't be excessive.

    Probably < 25%. BuyVM has a ~15% pickup rate for paid backups. Users would have to pay for that S3 capacity somewhere.

    S3's usage could get high though since hard links aren't a thing. If we allow someone to specify a dynamic 'folder' (say, /%Y-%m-%d/), then we have to do a full upload each time.

    If someone doesn't specify that (or we don't allow it), then a sync/push job wouldn't touch files that haven't changed (which is 90%+ of all data).
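    A small sketch of why the dated 'folder' forces a full upload. `plan_uploads` is a hypothetical helper, and the fingerprints stand in for whatever a real sync tool compares (size, mtime, checksum); this is not how rclone or any specific tool is implemented.

```python
from datetime import date


def plan_uploads(local: dict[str, str], remote: dict[str, str], prefix: str = "") -> list[str]:
    """Return the object keys that would need uploading.

    local maps relative path -> fingerprint; remote maps existing key -> fingerprint.
    """
    uploads = []
    for path, fingerprint in sorted(local.items()):
        key = prefix + path
        # Skip objects that already exist remotely with the same fingerprint.
        if remote.get(key) != fingerprint:
            uploads.append(key)
    return uploads


if __name__ == "__main__":
    local = {"mail/a.eml": "v1", "mail/b.eml": "v2", "files/doc.pdf": "v1"}
    # The bucket already holds yesterday's push; only b.eml changed since then.
    remote = {"mail/a.eml": "v1", "mail/b.eml": "v1", "files/doc.pdf": "v1"}

    print(plan_uploads(local, remote))  # fixed keys: only the changed file
    dated = date(2025, 4, 2).strftime("%Y-%m-%d/")
    print(plan_uploads(local, remote, prefix=dated))  # dated prefix: every key is new
```

    With fixed keys only the changed file is pushed; with a dated prefix every key is new to the bucket, so everything goes up again.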

    @kdjma said: I want to add, thanks for looking into client backups, some providers simply ignore this.

    You're welcome. It's not only good for users, but good for the product, so users feel better with the idea of lifetime plans.

    Francisco

  • kdjma Member

    @Francisco said: If someone doesn't specify that (or we don't allow it), then a sync/push job wouldn't touch files that haven't changed (which is 90%+ of all data).

    I was envisioning this, a daily incremental push to a user's S3 bucket, that should limit resource usage.

    Each domain to a different bucket if I can pile on the requests lol.

  • Jency Member
    edited April 2025

    Pushing backups to a user’s own S3 bucket sounds clean and easy for most users. For more tech-savvy clients, offering access through restic repositories would be perfect, they can pick what they need without huge downloads. Maybe give both options based on user skill level, so it's flexible but not heavy on your side.

    Thanked by 1ypmLA77zcs
  • Francisco Top Host, Host Rep, Veteran

    @kdjma said: I was envisioning this, a daily incremental push to a user's S3 bucket, that should limit resource usage.

    That was the idea from the get go with a warning about it bankrupting someone if they aren't careful :P

    @kdjma said: Each domain to a different bucket if I can pile on the requests lol.

    The setting would be per domain, so you can put completely different settings on each.

    Francisco

    Thanked by 1kdjma
  • Francisco Top Host, Host Rep, Veteran

    @Jency said: Maybe give both options based on user skill level, so it's flexible but not heavy on your side.

    I think that becomes too much for us to maintain w/o charging more. I'm trying to pick 1 option that works pretty well. For S3 we could have frequency controls on the backend so if we want someone to have more (be it they pay, buy the boys a round at the bar, etc), it can be done.

    Francisco

  • Since cranemail supports the native protocols (POP3/SMTP), I will download all my emails and back them up on my own computer.

    The question is: what happens to the emails you have backed up that I have not yet downloaded? Will they remain in the crane archives forever, or only until the next backup of my account, so that once my account is empty, “my” backup on your backup servers will be empty too? (Anything else would be blatant data retention, and every despot would be wet in the crotch at the thought.)

  • loay Member
    edited April 2025

    Biweekly/Monthly PUSH to a user supplied S3 bucket

    I prefer this option. S3 is easier to set up and scale compared to other solutions, and more secure than hosting on your own server.

    Thanked by 1Francisco
  • +1 for user supplied S3.

    Thanked by 1Francisco
  • Why not a combination? Explain how to use rclone and imapsync to get files and email, and generate a tar/zip for contacts, notes, etc.

    Other options are difficult to manage (just my opinion).

  • seenu Member
    edited April 2025

    instead, provide a backup option at extra cost (fixed or variable depending on their backup size)
    you maintain all s3 buckets but if users want to download, you generate keys or download links for them.

    this way...you write all code needed and users can still download backups from s3 (offloaded) anytime they want.


    OR you write program/UI where they connect their s3 bucket (or minio) and your program pushes backups there.

  • mw Member

    restic/borg for me

  • d2411 Member

    I wouldn’t recommend granting your customers access to the PBS if it's your only backup repository.

  • I think the borg solution is the best, honestly. Of your given choices, it is the closest to a (professional) backup solution that (technical) people would set up themselves. I also think it could be integrated into your system very well, and (imho) it surpasses the other solutions by far, feature-wise.

    (Additionally if not too much burden), why not offer zip/tar blobs for repos smaller than x GB as well? Any reseller/administrator with huge mailboxes should just not hang around with zip files for backups. Personal use, fine, go with it.

    Additionally, have you thought about offering a restore option as well? Keep that in mind for your proposed solution (if applicable), as you then need to think about how to implement the reverse direction (user uploads a zip file and you unpack it manually ..?)

    Thanked by 1tentor
  • lui Member

    You should be able to do S3 incremental daily without wasting too much bandwidth, I think. If you're worried about timeouts and interruptions, you can do multipart uploads so you can easily retry a smaller part.
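    A rough sketch of the multipart idea: split the object into numbered parts and retry only the part that failed. `plan_parts`, `upload_with_retry`, and the `send_part` callback are hypothetical; no real S3 client is called here. The 5 MiB minimum part size and 10,000-part cap are the limits AWS documents, and other S3-compatible stores may differ.

```python
MIN_PART = 5 * 1024**2  # AWS-documented minimum part size (the last part may be smaller)
MAX_PARTS = 10_000      # AWS-documented maximum number of parts per upload


def plan_parts(total_size: int, part_size: int = 64 * 1024**2) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    part_size = max(part_size, MIN_PART)
    # Grow the part size if needed so the object fits within MAX_PARTS parts.
    part_size = max(part_size, -(-total_size // MAX_PARTS))  # ceiling division
    parts, offset, number = [], 0, 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def upload_with_retry(parts, send_part, max_attempts: int = 3) -> dict[int, int]:
    """Call send_part(part) for each part, retrying only the part that failed.

    Returns a mapping of part number -> attempts used.
    """
    attempts_used = {}
    for part in parts:
        for attempt in range(1, max_attempts + 1):
            try:
                send_part(part)
                attempts_used[part[0]] = attempt
                break
            except OSError:
                if attempt == max_attempts:
                    raise
    return attempts_used


if __name__ == "__main__":
    parts = plan_parts(200 * 1024**2)  # a 200 MiB object with 64 MiB parts
    print(len(parts), "parts; last part is", parts[-1][2], "bytes")
```

    An interrupted transfer then costs one part's worth of re-upload rather than the whole object.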

    Thanked by 1kdjma
  • Rubben Member
    edited April 2025

    i wouldnt. backups are for pussies and i do NOT like pussies @sillycat agrees

    i like the restic/borg repo option though

    Thanked by 2nghialele ypmLA77zcs
  • @seenu said:
    instead, provide a backup option at extra cost (fixed or variable depending on their backup size)
    you maintain all s3 buckets but if users want to download, you generate keys or download links for them.

    this way...you write all code needed and users can still download backups from s3 (offloaded) anytime they want.


    OR you write program/UI where they connect their s3 bucket (or minio) and your program pushes backups there.

    Why not both? A UI where you specify S3 storage, but with a simple "Buy S3 storage from us" button that sets everything up and just sends you a bill.
    Users with their own S3 can use that, and users that have no clue can just buy it.

  • oplink Member, Patron Provider

    Maybe consider just having the client open a ticket to request the backup of their account vs putting a full download link for every user. But this could get annoying if you have too many users wanting a backup zip/tar. But as @yoursunny says you could charge a small fee for the zip/tar and still offer full protection to your users.

  • stefeman Member
    edited April 2025

    Allow user to supply his own NFS/CIFS/SMB/FTP/SFTP details and upload to there.

    Thanked by 1darkimmortal
  • trew Member

    @caracal said:
    +1 for user supplied S3.

    Just to expand, if doing this, then just make clear it's not just S3, but any 'S3 compatible' so user can choose any he wants/or self-host Minio and similar.

  • trew Member

    @Francisco starting point is always- what are your competitors offering? As you need to offer the same or better.

    Thanked by 1nghialele
  • Francisco Top Host, Host Rep, Veteran

    @trew said: Just to expand, if doing this, then just make clear it's not just S3, but any 'S3 compatible' so user can choose any he wants/or self-host Minio and similar.

    Exactly, it'd be listed as that with us recommending minio/seaweedfs for the selfhosters.

    @trew said: @Francisco starting point is always- what are your competitors offering? As you need to offer the same or better.

    Off the top of my head:

    Office 365

    Not aware of any free MS offering here. Everything I'm seeing says you have to license 3rd party apps to do the pull (Veeam, etc).

    Google Workspace

    Supports 'Takeout' that will generate a multi part ZIP file. Can only be triggered twice a month and can take literal days for the archive to be available, making it cumbersome for the end user.

    Inbox.eu

    No way to download their backups. You must extract backups with imapsync or the 'export' function on your email client.

    Protonmail

    First class support from them with their own inhouse 'Export Tool'. There's no automation/PUSH on this, but you can just cron the CLI tool most likely.

    Spacemail

    Inhouse DR backups are kept for up to 4 weeks. Doesn't seem to be any way to 'download' the backups, though you can ticket and ask them to restore.

    MXRoute

    No way to download their generated backups as they're for their own DR needs (like our rsync backups, maybe that's what they use, who knows). Not sure if the create backup option is available inside of Directadmin for scheduled jobs, would have to look.

    Supposedly you can ticket and they'll generate a full account backup.


    Proton & Google are the only ones that provide a way to download the backups (with protonmail being akin to imapsync it seems), but there's no vendor I'm aware of that does scheduled pushes of user backups.

    That'd be a first it seems.

    Francisco

  • What's the exact problem you are trying to solve here?

    How is the customer supposed to use that backup? Set up their own cranemail compatible service to access the data? Is that documented somewhere how to do that?

    Otherwise, imapsync is the safe option.
