Scalable Linux machine

Hello, I want to build an ffmpeg converter machine. It is going to grow, so I need to scale the CPU, RAM, and internet speed every few months.

The objective is to set up the machine once and, when needed, upgrade its CPU, RAM, and internet speed.

I need a small SSD; 20 GB is more than enough.
My CPU needs will be intensive.
Unmetered bandwidth is a must.

Let's see :)

Comments

  • Is your app/site scalable? Does it scale to multiple servers? Sounds like you need managed hosting. Or do you think your ready-made CMS or scripts can scale like YouTube? Or do you need unmanaged?

  • WebProject Host Rep, Veteran

    @nadal28 said: My CPU needs will be intensive.

    Instead of a managed service, I think you need a VPS with dedicated cores.

  • mrl22 Member
    edited December 2020

    It sounds more like you need a scalable cluster with a server that manages jobs for each member. I wrote one of these many years ago.

  • raindog308 Administrator, Veteran

    @nadal28 said: My CPU needs will be intensive.

    Then you don't want a VPS.

  • yoursunny Member, IPv6 Advocate

    I'm designing the same scalable ffmpeg converter for my push-up video site. I think it's not good to rely on a single server and hope the provider can resize your server:

    • This server is a single point of failure.
    • Most low-end providers do not have resizable servers. DigitalOcean and Oracle Cloud have them, but they cost more.
    • There's a limit on the capacity of a single server.

    I'm planning the following architecture:

    • Encoding and serving are separate. There would be separate storage/caching servers to serve videos to clients.
    • Encoding jobs are inserted into a global queue.
    • I can have one or more encoding servers. They would retrieve a job from the queue, download the input file from the storage server, and upload the output file to the storage server (a rough sketch of this loop follows the list).
    • Software for encoding servers is in a Docker container.
    • Encoding servers are either long-term (monthly or annually) or hourly VPS. If there are CPU limits, the Docker container needs the appropriate CPU limit configuration. It's preferable to purchase dedicated cores to avoid the hassle.
    • There would be a monitoring service that checks that the encoding jobs are being completed in a timely manner. If an encoding server has crashed, its ongoing job will be requeued. If there are too few or too many active encoding servers, it can automatically create or destroy hourly VPS to satisfy demand.
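
    For reference, a rough sketch of such an encoding worker in Python. The Redis list used as the queue, the plain-HTTP storage endpoints, and the job field names are placeholders of my own, not something any particular provider gives you:

        # Minimal worker loop: pop a job, download the input, transcode with ffmpeg,
        # upload the result. Queue host, storage URLs, and job fields are placeholders.
        import json
        import subprocess
        import tempfile
        from pathlib import Path

        import redis      # pip install redis
        import requests   # pip install requests

        QUEUE = "encode-jobs"
        r = redis.Redis(host="queue.example.com")

        def run_worker() -> None:
            while True:
                # Block until a job is available, then pop it from the queue.
                _, raw = r.blpop(QUEUE)
                job = json.loads(raw)  # e.g. {"input_url": ..., "output_url": ..., "args": [...]}

                with tempfile.TemporaryDirectory() as tmp:
                    src = Path(tmp) / "input"
                    dst = Path(tmp) / "output.mp4"

                    # Download the input file from the storage server.
                    with requests.get(job["input_url"], stream=True, timeout=600) as resp:
                        resp.raise_for_status()
                        with open(src, "wb") as f:
                            for chunk in resp.iter_content(1 << 20):
                                f.write(chunk)

                    # Transcode; per-job ffmpeg arguments ride along in the job record.
                    subprocess.run(
                        ["ffmpeg", "-y", "-i", str(src), *job.get("args", []), str(dst)],
                        check=True,
                    )

                    # Upload the output file back to the storage server.
                    with open(dst, "rb") as f:
                        requests.put(job["output_url"], data=f, timeout=600).raise_for_status()

        if __name__ == "__main__":
            run_worker()

    On plans with CPU limits, the container can be pinned with something like docker run --cpus=2; with dedicated cores that knob matters less.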
  • mrl22 Member
    edited December 2020

    @yoursunny said:
    I'm designing the same scalable ffmpeg converter for my push-up video site. [...] I'm planning the following architecture: [...]

    As a PHP developer, I wrote a similar system where I had a PHP script in a cronjob that looked at a master server containing a list of conversion jobs. Each encoding machine would run the cron every 5 minutes, take a job from the master server, mark it as in progress, convert the video, then push it back to storage and mark the conversion as complete. Infinitely scalable and simple.
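
    The one detail worth getting right is the "mark as in progress" step, so two machines never claim the same job. A rough sketch of that claim query, in Python and Postgres rather than the PHP I actually used (table and column names are illustrative):

        # Claim exactly one pending job and mark it in progress, atomically.
        # SKIP LOCKED keeps concurrent workers from grabbing the same row.
        import socket

        import psycopg2  # pip install psycopg2-binary

        conn = psycopg2.connect("dbname=jobs host=master.example.com")  # placeholder DSN

        def claim_job():
            with conn, conn.cursor() as cur:
                cur.execute(
                    """
                    UPDATE jobs
                       SET status = 'in_progress', worker = %s
                     WHERE id = (SELECT id FROM jobs
                                  WHERE status = 'pending'
                                  ORDER BY id
                                  LIMIT 1
                                  FOR UPDATE SKIP LOCKED)
                    RETURNING id, input_path, output_path
                    """,
                    (socket.gethostname(),),
                )
                return cur.fetchone()  # None when there is nothing to do

    After the conversion, the worker runs a similar UPDATE to mark the job complete; the cron entry just invokes this script every 5 minutes.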

  • yoursunny Member, IPv6 Advocate

    @mrl22 said:
    [...] Each encoding machine would run the cron every 5 minutes and take a job from the master server [...] Infinitely scalable and simple.

    This is the easy part.
    Now think about:

    • How do you avoid a single point of failure at the master server?
    • Suppose input files and encoding servers are geographically distributed and you know the network transfer speed between locations: how do you dispatch jobs to minimize the time spent on network transfer? Remember that you are wasting the dedicated CPU cores while the encoding server is waiting for a network transfer.
    • Suppose an input file is stored on two servers, but these storage servers have limited network and I/O bandwidth: if too many encoding servers want to download from the same storage server, which storage server do you download an input file from? (A naive baseline is sketched below.)
    • Suppose it takes 10 minutes to provision an encoding server, but you want 98% of jobs to start within 5 minutes: how do you predict how many encoding servers should be running based on (near-future) demand?

    These are just starters when trying to build a new YouTube on a $7 VPS.
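
    For the which-storage-server question, a naive baseline would be to estimate the finish time on each storage server from its bandwidth and current load and download from whichever looks fastest. A sketch, with made-up numbers:

        # Naive mirror selection: assume each storage server's bandwidth is shared
        # evenly among active downloads, estimate the transfer time, pick the minimum.
        def pick_mirror(file_size_bytes, mirrors):
            def eta(mirror):
                share = mirror["bandwidth_bytes_per_s"] / (mirror["active_downloads"] + 1)
                return file_size_bytes / share
            return min(mirrors, key=eta)

        # Example: a 2 GB input, one fast-but-busy mirror and one slower idle one.
        mirrors = [
            {"name": "fra1", "bandwidth_bytes_per_s": 1e9,   "active_downloads": 4},  # ~10 s
            {"name": "ams1", "bandwidth_bytes_per_s": 500e6, "active_downloads": 0},  # ~4 s
        ]
        print(pick_mirror(2e9, mirrors)["name"])  # -> ams1

    It ignores I/O contention, geography, and future arrivals, which is exactly why these questions get hard at scale.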

  • DataIdeas-Josh Member, Patron Provider

    @nadal28 Take a look at our dedicated cores. You can upgrade the packages at any time.
    Dedicated Core Xeon
    https://my2.dataideas.com/index.php?rp=/store/intel-xeon-kvm-vps-dedicated-cpu

    Dedicated Core Ryzen
    https://my2.dataideas.com/index.php?rp=/store/amd-ryzen-kvm-vps-dedicated-cpu

  • 0xbkt Member
    edited December 2020

    You need a job scheduler. Use Nomad.

    Last year I was hired to migrate 2000+ HLS movies from a storage server in OVH to Google Drive in MP4 format. Without Nomad, I would most probably have been crying in pain.

    The setup consisted of a fleet of 20 Scaleway VMs as workers (called 'clients'), all uniformly managed with Terraform, with only one Nomad master (called 'server'). I simply looped over the list of HLS movies, dispatching them to Nomad, and it handled the rest by distributing the jobs to the clients (workers). A Bash script on each worker processed the movie it grabbed and uploaded it to Drive.

    All config and code: https://gist.github.com/0xbkt/2d498ea770a43f2910eb2f3d34611464
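
    The dispatch loop itself was trivial. A hedged sketch of the idea in Python rather than my actual scripts (those are in the gist); it assumes a parameterized Nomad job named "transcode" that reads a "source" meta key:

        # Loop over the movie list and hand each one to Nomad as a dispatched job;
        # Nomad then spreads the instances across the client (worker) nodes.
        import subprocess

        with open("movies.txt") as f:  # one HLS playlist URL per line (placeholder file)
            movies = [line.strip() for line in f if line.strip()]

        for url in movies:
            subprocess.run(
                ["nomad", "job", "dispatch", "-meta", f"source={url}", "transcode"],
                check=True,
            )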

  • akhfa Member
    edited December 2020

    @0xbkt said:
    You need a job scheduler. Use Nomad. [...]

    Thanks for sharing. This is an example where I think Nomad is more suitable than Kubernetes because of its simpler architecture 🤔

  • @akhfa said:
    Thanks for sharing. This is an example where I think Nomad is more suitable than Kubernetes because of its simpler architecture 🤔

    Absolutely yes. Nomad had been ringing in my head from the very moment I took up the job. I'd never tried Kubernetes before, though. I just skimmed through its docs, and honestly it's obviously way harder to manage and get started with. One should just pick the right tool for whatever they're doing.
