
Cheap $5/mo 4-NODE redundant PVE Cluster Setup for Small Biz/Personal Use (low end needs)


Comments

  • @trycatchthis said: What's the software in the screenshot?

    It's Proxmox.

  • I think this is how to combine VPSes?
    But I tried and couldn't. lol

  • To @stoned and whoever may be interested in this thread:

    It looks like ZRAM does not discard by default:

    https://www.reddit.com/r/linux/comments/u0hb2a/tips_enable_discard_on_zram/

    So it seems to be a good idea to add -d to swapon, otherwise it might fill up and not reclaim space after some time (please correct me if I am wrong):

    # swapon -d /dev/zram0

    Cheers,

  • @trycatchthis said:

    @stoned said:

    What's the software in the screenshot?

    Proxmox VE. https://www.proxmox.com/en/proxmox-ve

  • @michae1 said:
    @stoned I like your idea! Sounds like a fun thing to try.

    Glad to hear. Have fun! :)

  • @noviap09 said:
    I think this is how to combine VPSes?
    But I tried and couldn't. lol

    What have you tried? What results did you get?

  • michae1 Member
    edited January 2023

    @stoned I still have a lot to learn. Hope you will stick around. I'll PM you or post here if things get stuck.
    Cheers!

  • stoned Member
    edited January 2023

    @64383042a said:
    To @stoned and whoever may be interested in this thread:

    It looks like ZRAM does not discard by default:

    https://www.reddit.com/r/linux/comments/u0hb2a/tips_enable_discard_on_zram/

    So it seems to be a good idea to add -d to swapon, otherwise it might fill up and not reclaim space after some time (please correct me if I am wrong):

    # swapon -d /dev/zram0

    Cheers,

    from man swapon

           -d, --discard[=policy]
               Enable swap discards, if the swap backing device supports the discard or trim
               operation. This may improve performance on some Solid State Devices, but often it does
               not. The option allows one to select between two available swap discard policies:
    
               --discard=once
                   to perform a single-time discard operation for the whole swap area at swapon; or
    
               --discard=pages
                   to asynchronously discard freed swap pages before they are available for reuse.
    
               If no policy is selected, the default behavior is to enable both discard types. The
               /etc/fstab mount options discard, discard=once, or discard=pages may also be used to
               enable discard flags.
    

    and from the kernel's mm/swapfile.c

    /*
     * swapon tell device that all the old swap contents can be discarded,
     * to allow the swap device to optimize its wear-levelling.
     */
    static int discard_swap(struct swap_info_struct *si)
    {
        struct swap_extent *se;
        sector_t start_block;
        sector_t nr_blocks;
        int err = 0;
    
        /* Do not discard the swap header page! */
        se = first_se(si);
        start_block = (se->start_block + 1) << (PAGE_SHIFT - 9);
        nr_blocks = ((sector_t)se->nr_pages - 1) << (PAGE_SHIFT - 9);
        if (nr_blocks) {
            err = blkdev_issue_discard(si->bdev, start_block,
                    nr_blocks, GFP_KERNEL);
            if (err)
                return err;
            cond_resched();
        }
    
        for (se = next_se(se); se; se = next_se(se)) {
            start_block = se->start_block << (PAGE_SHIFT - 9);
            nr_blocks = (sector_t)se->nr_pages << (PAGE_SHIFT - 9);
    
            err = blkdev_issue_discard(si->bdev, start_block,
                    nr_blocks, GFP_KERNEL);
            if (err)
                break;
    
            cond_resched();
        }
        return err;     /* That will often be -EOPNOTSUPP */
    }
    

    and

    /*
     * swap allocation tell device that a cluster of swap can now be discarded,
     * to allow the swap device to optimize its wear-levelling.
     */
    static void discard_swap_cluster(struct swap_info_struct *si,
                     pgoff_t start_page, pgoff_t nr_pages)
    {
        struct swap_extent *se = offset_to_swap_extent(si, start_page);
    
        while (nr_pages) {
            pgoff_t offset = start_page - se->start_page;
            sector_t start_block = se->start_block + offset;
            sector_t nr_blocks = se->nr_pages - offset;
    
            if (nr_blocks > nr_pages)
                nr_blocks = nr_pages;
            start_page += nr_blocks;
            nr_pages -= nr_blocks;
    
            start_block <<= PAGE_SHIFT - 9;
            nr_blocks <<= PAGE_SHIFT - 9;
            if (blkdev_issue_discard(si->bdev, start_block,
                        nr_blocks, GFP_NOIO))
                break;
    
            se = next_se(se);
        }
    }
    

    If no policy is selected, the default behavior is to enable both discard types.

    Discard is the Linux term for telling a storage device that sectors are no longer storing valid data, and it applies equally to both ATA and SCSI devices, i.e.:

    TRIM is the actual ATA-8 command that is sent to an SSD to cause a sector range or set of sector ranges to be discarded. As such, it should only apply to ATA devices, but it is often used generically. Given the prevalence of ATA devices, TRIM is often the most used of these terms.

    Seems like ZRAM is DISCARD-capable.

    #  lsblk --discard /dev/zram0 
    NAME  DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
    zram0        0        4K       2T         0
    

    Explanation: When files are removed, zram doesn't remove the compressed pages in memory, because it's not notified that the space is no longer used for data. The discard option performs a discard when a file is removed. If you use the discard mount option, zram will be notified about the unused pages and will resize accordingly.

    Not sure if this requires manually clearing out the ZRAM device or what...

    Thanks for the tip. I'll test it out.
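
    For anyone else who wants to try it, here's a minimal sketch of setting up a zram swap device with discard enabled (the size, compression algorithm, and priority below are just examples, not the exact settings from this thread):

    # load the zram module and create one device
    modprobe zram num_devices=1
    # pick a compression algorithm and size (assumes zstd is available; check /sys/block/zram0/comp_algorithm)
    echo zstd > /sys/block/zram0/comp_algorithm
    echo 512M > /sys/block/zram0/disksize
    # format it as swap and enable it with -d so freed pages get discarded back to zram
    mkswap /dev/zram0
    swapon -d -p 100 /dev/zram0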

  • @michae1 said:
    @stoned I still have a lot to learn. Hope you will stick around. I'll PM you or post here if things get stuck.
    Cheers!

    I am a full-time student, on crunch time, and cannot provide personal tech support. I can try to do my best here in the forums. Thank you.

  • stoned Member
    edited January 2023

    @banana_mcn said:
    May I know how to handle the inbound public IP address when the VPS migrates to another node?

    I'm not sure what you're asking. VPS migration? Are you asking about moving your PVE installation to a different VPS with a different public IP?

    If so, then first back up the containers on that node somewhere, then run pvecm delnode nodename_here to remove the node from the cluster. Then install PVE on a different VPS, join the cluster again with the new public IP, and restore your containers/VMs.
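
    Roughly, the workflow looks like this (just a sketch; the storage name, node name, and backup filename are placeholders, and pct restore is for containers - use qmrestore for VMs):

    # on the old node: back up the container (101 and "local" are examples)
    vzdump 101 --storage local --mode snapshot
    # copy the dump archive off the node, then on a remaining cluster member:
    pvecm delnode oldnode
    # on the freshly installed PVE node (new VPS, new public IP):
    pvecm add <ip-of-an-existing-cluster-node>
    # restore the container from the copied archive
    pct restore 101 /var/lib/vz/dump/vzdump-lxc-101-<timestamp>.tar.zst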

  • @stoned said:

    @banana_mcn said:
    May I know how to handle the inbound public IP address when the VPS migrates to another node?

    I'm not sure what you're asking, but the correct method is to first use pvecm delnode nodename to delete the node, then do your other operations, set up somewhere else with a new IP, and then add that new PVE installation to the cluster.

    Just want to learn more about HA on PVE.

    Public IP of Node A 120.0.0.2
    Public IP of Node B 120.0.0.3
    Public IP of Node C 120.0.0.4

    If I put my website (lxc101) on Node A, the inbound IP from outside should be 120.0.0.2.
    Then, if I migrate the website (lxc101) to Node B, will the IP change to 120.0.0.3?
    How do I handle the public inbound IP change while migrating the VPS/LXC container to a different node? An auto DNS update script, or a layer 4 load balancer?

  • Maounique Host Rep, Veteran

    I think this is a great tinkering project to kill time and learn things in the LE style.
    Is it useful otherwise? Not really.
    Still, it should be great fun in our community.

  • stoned Member
    edited January 2023

    @banana_mcn said:

    @stoned said:

    @banana_mcn said:
    May I know how to handle the inbound public IP address when the VPS migrates to another node?

    I'm not sure what you're asking, but the correct method is to first use pvecm delnode nodename to delete the node, then do your other operations, set up somewhere else with a new IP, and then add that new PVE installation to the cluster.

    Just want to learn more about HA on PVE.

    Public IP of Node A 120.0.0.2
    Public IP of Node B 120.0.0.3
    Public IP of Node C 120.0.0.4

    If I put my website (lxc101) on Node A, the inbound IP from outside should be 120.0.0.2.
    Then, if I migrate the website (lxc101) to Node B, will the IP change to 120.0.0.3?
    How do I handle the public inbound IP change while migrating the VPS/LXC container to a different node? An auto DNS update script, or a layer 4 load balancer?

    Hi, I've been busy. But here I am now.

    Yes, any time you move containers between nodes, since containers are behind NAT (unless you use routed IPv6), you need to update their IP addresses.

    Public IP of Node A 120.0.0.2
    Public IP of Node B 120.0.0.3
    Public IP of Node C 120.0.0.4

    So you have 3 servers. If you talk to .2 to reach a container on Node A, then after you move that container to Node B, you reach it through .3's NAT.

    Check my other post here about PVE setup: https://lowendtalk.com/discussion/183188/how-to-setup-ipv6-on-proxmox-on-naranja-tech-server-they-only-give-a-b-c-d-1-64-in-their-panel#latest
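
    On the "auto DNS update script" idea from your question - a minimal sketch, assuming the record lives in Cloudflare DNS with a low TTL (Cloudflare, the zone/record IDs, and the hostname are my assumptions, not something from this thread):

    # repoint the A record at the new node's public IP after a migration
    curl -s -X PUT "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
      -H "Authorization: Bearer $CF_API_TOKEN" \
      -H "Content-Type: application/json" \
      --data '{"type":"A","name":"www.example.com","content":"120.0.0.3","ttl":60,"proxied":false}'

    A layer 4 load balancer in front of all nodes avoids waiting on DNS, but then the balancer itself becomes a single point of failure unless it's redundant too.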

  • stoned Member
    edited January 2023

    I received a PM that I'll address publicly so it may help others.

    Hello @stoned,
    What are the related applications for this kind of a setup? Could you list a few scenarios?
    Thank you!

    Redundancy is most important, so I have 4 providers in a 7-node cluster. I may remove PVE from the 1GB nodes, as I need to do things and 1GB for a PVE node is too little for anything beyond basic stuff. If you upgrade to servers with 2GB or more RAM, that would be much better.

    The application could be anything you want, really, but my goal was cheap redundancy instead of spending that much or more on a single server.

    I only wanted to see if this could be done, and yes it can, but it's not going to be a viable production solution at 1GB of RAM. I cannot stress this enough. If you want a PVE cluster, your best bet is, at the very minimum, 2GB RAM with the swap settings here.

    With 4GB RAM you can keep the same swap setting, but lower swappiness to, say, 10-20.

    I have about 12TB on this cluster: 3x Proxmox Backup Server doing daily syncing, and 4x Proxmox Mail Gateway for MX redundancy.

    Soon I will cluster a couple of SMTP servers together for outbound mail, so in case one fails, another can deliver. For that, I'm probably going to try Postfix and possibly manual syncing of mailboxes.

    Basically, I'm going to clone my SMTP server to at least 4 nodes, and if anything happens to one, I can simply point my PMG relay to the new SMTP server and mail will still go out. Same server, same config, just a different IP.

    I'm also going to have a backup with a hosted mailing service, so in case something happens to my IPs, I can quickly change the transport to Amazon SES, Mailgun, Sendinblue, or whatever.
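
    For the "change the transport" part, here is a minimal Postfix sketch of relaying outbound mail through a hosted service (the SES endpoint, port, and credentials file are assumptions for illustration, not my actual config):

    # /etc/postfix/main.cf - relay all outbound mail through an authenticated smarthost
    relayhost = [email-smtp.us-east-1.amazonaws.com]:587
    smtp_sasl_auth_enable = yes
    smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
    smtp_sasl_security_options = noanonymous
    smtp_tls_security_level = encrypt

    Then run postmap /etc/postfix/sasl_passwd and systemctl reload postfix to apply it.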

  • @stoned said: But I do have a private 7th server acting as a wireguard vpn, so each node can talk to other nodes on the private network. But that goes through the 7th server and makes things very slow. Direct connections to the cluster nodes even over 100mbps seem just fine (backups/replications/migrations take longer on slower network obviously).

    Maybe you should use Netmaker to build a mesh network, so each server can connect to the others directly without sending traffic to the centre node first.

  • Nice project. Do you know how to forward an IP from Proxmox A to B using WireGuard?

    We have 4 usable IPs on Proxmox A, but the resources are too low to make another LXC container.

  • @topper said:

    @stoned said: But I do have a private 7th server acting as a wireguard vpn, so each node can talk to other nodes on the private network. But that goes through the 7th server and makes things very slow. Direct connections to the cluster nodes even over 100mbps seem just fine (backups/replications/migrations take longer on slower network obviously).

    Maybe you should use Netmaker to build a mesh network, so each server can connect to the others directly without sending traffic to the centre node first.

    Oh, I didn't even know about this. Looks fantastic. I'll check it out on a few VMs first in VirtualBox. Thank you!
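
    In the meantime, a plain WireGuard full mesh is also doable by hand. Here's a minimal sketch of one node's config (the keys, tunnel subnet, and ports are placeholders; the 120.0.0.x addresses just reuse the example IPs from earlier in the thread):

    # /etc/wireguard/wg0.conf on node A
    [Interface]
    Address = 10.10.0.1/24
    PrivateKey = <node-a-private-key>
    ListenPort = 51820

    # node B
    [Peer]
    PublicKey = <node-b-public-key>
    Endpoint = 120.0.0.3:51820
    AllowedIPs = 10.10.0.2/32

    # node C
    [Peer]
    PublicKey = <node-c-public-key>
    Endpoint = 120.0.0.4:51820
    AllowedIPs = 10.10.0.3/32

    Each node lists every other node as a [Peer], so traffic goes host-to-host instead of through the central VPN server.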

  • stoned Member
    edited January 2023

    Update:

    The 1GB RAM RackNerd node in my PVE cluster, average loads over the past month:

    We reached 40% CPU once in the last week or so (during a system update).
    We have barely any disk activity, no thrashing at all, no noise, no high disk I/O.

    Every now and again, the pve-cluster service will crash because of OOM, which leads me to think 1GB is just not feasible for PVE, though it can be done. Even when the pve-cluster service crashes, it's just a matter of restarting it. Everything continues to work normally.
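
    Since the fix is always just restarting the service, one thing worth trying (a sketch, not something I've battle-tested on these nodes) is letting systemd restart pve-cluster automatically when it dies:

    # /etc/systemd/system/pve-cluster.service.d/override.conf
    [Service]
    Restart=on-failure
    RestartSec=10

    Then systemctl daemon-reload to pick it up.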

    Here's one container, Alpine, 64MB, nginx proxy manager:

    Here's the second container, Proxmox Mail Gateway:

    @LightBlade said:
    Nice project. Do you know how to forward an IP from Proxmox A to B using WireGuard?

    We have 4 usable IPs on Proxmox A, but the resources are too low to make another LXC container.

    Thanks. Sure. It depends on how you have things set up, but iptables forwarding is pretty easy. You can forward anything short of IPv6-to-IPv4, which can be done using 6tunnel, for example.

    Please ask your question in some detail. Also, it may be time to learn iptables!
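
    As a starting point, here's a rough sketch of forwarding one public IP/port on Proxmox A to a container reachable over the WireGuard tunnel (the IPs, the wg0 interface name, and the port are placeholders - adjust to your setup, and persist the rules with iptables-persistent or similar):

    # on Proxmox A: enable forwarding
    echo 1 > /proc/sys/net/ipv4/ip_forward
    # send traffic hitting the spare public IP on port 443 to the container's address on the tunnel network
    iptables -t nat -A PREROUTING -d 203.0.113.10 -p tcp --dport 443 -j DNAT --to-destination 10.10.0.2:443
    # masquerade over the tunnel so replies come back through Proxmox A
    iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE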

  • So I was wrong that the pve-cluster service gets OOM-killed and has to be restarted because of the low memory of 1GB.

    The same thing happened on a 32GB dedicated server. The service crashed, and the containers continued to work, but there was no administration panel. So I had to log in to the server, restart the service, and everything was back to normal.

    Looks like PVE isn't without its quirks. Still, it seems a better solution than most other things, clustering-wise.

    I'm going to set up another cluster, not PVE this time, with at least 3-4 machines, and see if I can get an HA cluster going with container management using LXD. Try to get away from PVE.

    Stay tuned.

  • stoned Member
    edited March 2023

    Hello. If anyone is still attempting to do this with 1GB nodes, here's a bit more info. Two things:

    1)
    Due to the following, I wanted to upgrade to 2+GB RAM nodes. I talked with RackNerd, where I have the 2x 1GB nodes, and realized they can't simply bump up my plan with rapid elasticity; I'd have to cancel the server and get another one, and that would cost me time changing everything over to the new IPs, etc.

    I was sometimes using 80-100% CPU for minutes at a time. Disk activity was normal, but the CPU was going mad, sometimes so much that corosync wouldn't reach quorum properly. Keep in mind, we're clustering over WAN links here, not local LAN links, and corosync is sensitive to jitter even though it may not use a lot of bandwidth.

    I removed ZRAM and ZSWAP, as various monitors and tests showed the compressed RAM was causing this. Without ZRAM/ZSWAP, and with just the following:

    vm.swappiness = 100
    vm.vfs_cache_pressure = 500
    vm.dirty_background_ratio = 10
    vm.dirty_ratio = 10
    

    It has been more stable, with far less CPU and disk activity on the 1GB RAM nodes. I didn't want to go through changing configs, so I kept the same IPs on the same 1GB RAM nodes instead of nuking the servers and getting larger ones.

    With the above config, even the 1GB RAM nodes are very quiet, running an Exim4 container, Dovecot, etc., plus Proxmox Mail Gateway and Nginx Proxy Manager, all three containers on a 1GB RAM KVM VPS, and performance is pretty good, disk activity is very low, and CPU is very low.

    I'm going to throw a few more static sites on there and do some intense traffic tests on Nginx in a container on the 1GB nodes, to see how far they can be pushed while staying performant.
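
    (For reference, to make those sysctl values survive a reboot, they can go in a drop-in file; the filename here is just an example:)

    # /etc/sysctl.d/99-lowmem-pve.conf
    vm.swappiness = 100
    vm.vfs_cache_pressure = 500
    vm.dirty_background_ratio = 10
    vm.dirty_ratio = 10

    # apply without rebooting
    sysctl --system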

    2)
    corosync is quite strange. I've noticed that over time, the corosync service starts to consume as much RAM as the system will allow. This is the reason corosync gets OOM-killed, and if two nodes are down, the cluster doesn't reach quorum properly. I have the following in cron:

    */5 * * * * for i in {0..9}; do sync; echo 1 > /proc/sys/vm/drop_caches; sync; echo 2 > /proc/sys/vm/drop_caches; sync; echo 3 > /proc/sys/vm/drop_caches; sync; done
    */30 */1 * * * service corosync restart
    

    We drop caches (FS caches, inode caches, etc.) every 5 minutes, 10 times each (still testing this, but it's been stable and keeps needless stuff out of RAM), and restart the corosync service every 30 minutes (per the cron entry above), just in case it starts to consume large amounts of RAM.

    I actually found this to be especially useful on the larger servers, like the dedicated server I have with 32GB RAM.

    Okay, cheers everyone. Please excuse any grammatical errors.

  • Sorry for bumping an old thread - I came across it as I'm about to try something similar with 4 x 2GB Incus nodes + MicroCeph / MicroOVN. 4 nodes is better than 3 for Ceph - it can continue operating with one node down.

    Incus (the new LXD fork) is probably a better low-end choice - a Proxmox cluster needs 2 x NICs on each node & 2 separate networks for reliable operation.

    As WireGuard has such low overhead, running it on each node to create a mesh so they can communicate directly will be a better choice. I forked wg-meshconf to add quantum-resistant security.

    For automating the WireGuard setup, ansible-semaphore (an Ansible web GUI) looks good & integrates with git (Gitea under Alpine works well in unprivileged LXD). Semaphore runs fine under rootless Podman inside unprivileged Ubuntu LXD (rootless Docker doesn't start properly after a reboot for some reason inside LXD).

    I also wrote distrobuilder-menu for creating custom LXD / LXC containers with distrobuilder, which you may find useful.
