Need a server with 200+ TB HDD and 50+ TB traffic for under $1000/month
For collecting data from the internet, storage of that data and running simulations on that data, I need a machine with the following specs:
- 200+ TB HDD net after RAID6 (like 12 x 20 TB, 16 x 16 TB, 20-24 x 12 TB or similar; see the quick capacity check after this list). Software RAID would be ok
- 2 x 960 GB SSD software or hardware RAID1 for the OS, consumer grade SSDs are ok
- 24+ cores of modern CPU (AMD / Intel / fast ARM also ok)
- 192+ GB RAM
- 1 GBit/s NIC, 50 TB/month traffic (mostly incoming)
- Server location: All countries except Germany (for geolocation diversity)
- Contract running time: 6 months; prepayment for 6 months is ok for renowned server companies
- Ubuntu 22.04 LTS; I can install the OS, so delivery as a "rescue system" is ok
- Budget: $500/month
- Delivery / beginning of contract between 1st and 15th of July
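Not an offer, just a sanity check on the numbers: a minimal Python sketch (assuming classic RAID6, i.e. two drives' worth of parity and no hot spares) of the usable capacity for each proposed layout, plus how long the 50 TB/month quota needs to fill 200 TB.

```python
# Sanity check: usable RAID6 capacity for the proposed layouts, assuming
# classic RAID6 (two drives lost to parity) and no hot spares.
def raid6_usable_tb(drives: int, size_tb: int) -> int:
    return (drives - 2) * size_tb

for drives, size_tb in [(12, 20), (16, 16), (20, 12), (24, 12)]:
    usable = raid6_usable_tb(drives, size_tb)
    print(f"{drives} x {size_tb} TB -> {usable} TB usable")
# 12 x 20 TB -> 200, 16 x 16 TB -> 224, 20 x 12 TB -> 216, 24 x 12 TB -> 264

# Filling 200 TB purely from the net at the 50 TB/month quota:
print(f"{200 / 50:.0f} months")  # 4 months
```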
Comments
Please share the porn when you're done.
I presume you already have a couple of SX133's given that Germany is out of the question?
(deleted)
Nah, I asked for INCOMING traffic of 50 TB/month. Would need outgoing 50 EB/month (Exabytes) to share all that smut 😂
Yeah, Hetzner is great value for money.
Not if you’re only sharing it with me….
Don’t you have your own collection?
I would think someone should be able to offer this within your budget.
I mean, why would I pay to store something that's freely available within a couple of mouse clicks?
Top contribution.
@MrRadic can probably do this for you.
It's going to take more time to recoup the cost than 6 months, wouldn't make sense for us to build it.
You obviously know your use case best, but for that amount of data, there's probably a way to shard it and process chunks on different servers and aggregate it on some other servers.
Another fairly obvious observation is that most of your data is being computed (otherwise, at 50 TB/month of incoming traffic, it would take 4 of the 6 months just to fill the drives from the net), so you might even benefit from more cores writing to SSD on cheaper hosts.
For this kind of use I suggest rclone + Dropbox Business; then you can add as many streaming servers as your traffic demands.
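A minimal sketch of that idea, assuming rclone is installed and a Dropbox remote named `dropbox` has already been set up via `rclone config` (the remote name and paths here are placeholders):

```python
# Push collected data to a pre-configured Dropbox remote so additional
# streaming servers can pull from it. Remote name and paths are placeholders.
import subprocess

def push_to_dropbox(local_dir: str, remote_path: str) -> None:
    # `rclone copy` uploads new/changed files; --transfers parallelizes uploads.
    subprocess.run(
        ["rclone", "copy", local_dir, f"dropbox:{remote_path}",
         "--transfers", "8", "--progress"],
        check=True,
    )

push_to_dropbox("/data/collected/2022", "archive/2022")
```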
So which contract term would be fine with you?
Let's see if @PulsedMedia have some
Thank you for your remarks.
The more servers I deploy, the more time I will spend on housekeeping. That's why I would prefer the "one big box" solution, even if it ends up costing $200 or $300 more per month than a set of smaller machines. Otherwise, sharding would be possible with the event time as a parameter (like 2019 data goes to box 1, 2020 data to box 2, and so on). For full simulations on all past data, the CPUs in the 2019 box could process the 2019 data, the CPUs in the 2020 box the 2020 data, and so on.
But for certain reasons, I run simulations on the current data (the last few months) more often, and then I would be limited to only the CPUs in the box holding the 2022 data. Disk read rates are too high for NFS or similar, at least over 1 Gbit/s Ethernet (which tops out at roughly 125 MB/s). And moving to 10 Gbit/s Ethernet would drive costs up a lot.
To make matters worse, the data collection / preprocessing task already keeps a substantial number of cores busy. Doing everything on eight-core desktop CPUs would therefore seriously limit the resources left for the mentioned "last months" simulations.
To sum it up: one big box is easier to administer and more flexible in how the collected data can be used.
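For what it's worth, the per-year setup described above is a classic scatter/gather: each box simulates its own year and one machine aggregates the partial results. A rough sketch, where the hostnames and the `simulate` command are entirely hypothetical:

```python
# Scatter/gather over year-sharded boxes: each box runs the simulation on
# its local year of data; we collect the partial results over SSH.
# Hostnames and the `simulate` command are hypothetical placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

YEAR_BOXES = {
    2019: "box1.example.com",
    2020: "box2.example.com",
    2021: "box3.example.com",
    2022: "box4.example.com",
}

def run_year(year: int, host: str) -> str:
    result = subprocess.run(
        ["ssh", host, f"simulate --year {year} --data /data/{year}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout  # partial result for that year

with ThreadPoolExecutor() as pool:
    partials = list(pool.map(lambda item: run_year(*item), YEAR_BOXES.items()))
# aggregate(partials) would combine the per-year results here
```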
Are you simulating data patterns from unstructured data points? That could be a machine-learning workload as well.
If you require faster processing for data points and machine learning on dedicated GPU servers, you can talk to @lentro about resources.
@lentro represents Tensordock here on the LET forum.
For your setup, try these guys for a quote. I have personally been very happy working with them so far. Their pricing is hard to beat.
https://www.gorillaservers.com/services.html
[email protected]
Thanks for the mention!
@Need100TB
If you're running simulations or training models on your data, you might want to try a GPU server to take advantage of hardware acceleration (or a CPU-only server). We are more expensive than other LET providers, but our resources are dedicated and we have 10 Gbit networking. We also bill by the hour and charge only for storage while your server is off. And we have an API, too.
Theoretically, you could create a program to start a processing/modeling server via our API, upload your data, process the data, and stop the server. That way, you only pay for the compute you actually use.
P.S. -- remember to make backups! 200TB would be a lot of data to lose.
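A hypothetical sketch of that start -> process -> stop loop. To be clear, the base URL, endpoint paths, and JSON fields below are made-up placeholders, not Tensordock's actual API; check the provider's docs for the real calls.

```python
# Hypothetical start -> process -> stop lifecycle. The base URL, endpoint
# paths, and JSON fields are made-up placeholders, NOT the provider's real API.
import requests

API = "https://api.example-provider.com/v1"        # placeholder
HEADERS = {"Authorization": "Bearer <api-token>"}  # placeholder

def run_batch_job() -> None:
    # 1. Start a processing/modeling server (placeholder endpoint/payload).
    server = requests.post(f"{API}/servers", headers=HEADERS,
                           json={"type": "gpu", "storage_tb": 10}).json()
    server_id = server["id"]
    try:
        # 2. Upload the data and run the job (e.g. rsync + ssh; omitted here).
        pass
    finally:
        # 3. Stop the server so storage is the only thing still billed.
        requests.post(f"{API}/servers/{server_id}/stop", headers=HEADERS)

run_batch_job()
```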
Yeah, I think we can fill your need at that budget; sounds like it's not a problem.
A 24- or 36-drive AMD EPYC 24/36-core box should fit into that budget; from order, it will likely take 4-8 weeks to get parts from Supermicro. They've been constantly low on stock in Europe.
That is actually a run-of-the-mill server for us, and a bit under-specced compared to how we typically set this type of server up. We have many similarly specced servers in production right now and were just considering buying more.
Even your schedule matches up perfectly with the time it takes to get parts.
Background:
We've been in business for more than 12 years now, we are well known in our own niche. We have our own network, hardware and datacenter in Helsinki, Finland.
Contact us by emailing sales ** pulsedmedia - com (remember to link this thread) and let's get your server needs sorted out!
@nvme thanks for the ping!
Hi Need200TB
Family, or simply a prankster-at-large?
prank
Bully! Bully, I say!