Looking for a price match, 16 Dedis
We had servers with Delimiter but today their payment processor has decided they don't want our money, and I'm looking for a competitive deal.
What I'm after: 16 Servers
Reason for usage: We fingerprint websites and look at application usage (analytics IDs, JavaScript libraries, etc., pretty much like 'builtwith'). It's essentially a simple crawler written in C that fetches the home page and some inner pages of websites. The servers grab and process the pages.
What I'm after:
- 1 IPv4 per server, no IPv6 required
- CPUs we had on the old boxes as shown by /proc/cpuinfo:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
- 8GB RAM per server
- 500GB HD per server
- 5TB B/W, may look for more down the line so would want a cheapish add-on option
- IP space doesn't need to be terribly clean, it may get an automated abuse report or two purely because of anal sysadmins doing so for crawlers.
The specs are a general guideline.
Delimiter were charging $300/m for these. If you're a provider or offering a suggestion, I need the company to be established, e.g. several years in business already. I would pay monthly and am looking to avoid a setup fee (but would be willing to commit to quarterly).
Comments
old ones, even duals... http://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E5420+@+2.50GHz
I can offer you like 10 x E3v3 servers ( http://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E3-1231+v3+@+3.40GHz ) for $540 with 1TB HDD and 100Mbps unlimited.
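For anyone weighing the 5TB monthly allowance against an unmetered 100Mbps port, a rough conversion helps. The sketch below assumes decimal terabytes and a 30-day month (both are assumptions, providers may meter differently), and the function names are made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day billing month

def tb_per_month_to_mbps(tb):
    """Average rate in Mbit/s needed to move `tb` decimal terabytes in a month."""
    bits = tb * 1e12 * 8
    return bits / SECONDS_PER_MONTH / 1e6

def mbps_to_tb_per_month(mbps):
    """Decimal terabytes moved in a month by a port saturated at `mbps` Mbit/s."""
    bits = mbps * 1e6 * SECONDS_PER_MONTH
    return bits / 8 / 1e12

print(round(tb_per_month_to_mbps(5), 1))    # average Mbit/s for 5TB/month
print(round(mbps_to_tb_per_month(100), 1))  # TB/month at a saturated 100Mbps
```

Under those assumptions, 5TB a month is roughly a 15Mbps average, while a 100Mbps port kept busy all month would move about 32TB.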
I can offer you something. But not established as long as you'd like.
Do you need them as dedicated servers? A number of our clients run similar applications but use KVM based instances rather than dedicated boxes.
It's processor intensive.
To clarify, I'm happy with fewer boxes but need comparable CPU performance. I would be OK with 8 boxes, same RAM and disk.
waaaat!?
@ricardo
If you had 16 dedicated servers and were only paying $300 a month for all of them, I am not sure anyone could do 19 bucks a month for a dedicated server. The rack space alone would cost more than that, plus the bandwidth.
On the hard drive space, are you using all that space or is that just what they gave you?
I can set you up on a KVM server that is in Dallas. I can set each KVM server up with 8GB of RAM.
The main server has dual E5-2650 CPUs, SSD drives, RAID 10.
Clearly Delimiter could.
@Ricardo,
Is your traffic mostly inbound, outbound, or both? Does location matter?
$300 for all 16 servers? Do you really need 16 servers, or would fewer but more powerful ones also do?
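For reference, the per-server arithmetic behind the figures quoted so far can be sketched as below. It assumes, as the replies do, that $300 covered all 16 boxes, and that the $540 E3v3 offer is the total for 10 servers; `per_server` is just an illustrative helper.

```python
def per_server(total_per_month, servers):
    """Monthly cost of one server when a flat total covers `servers` boxes."""
    return total_per_month / servers

print(per_server(300, 16))  # 18.75, the "19 bucks" mentioned above
print(per_server(540, 10))  # 54.0 for the E3v3 offer, if $540 is the total
```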
Not anymore though.
I'm running a crawler so inbound, if I'm thinking from the right perspective. Answered the other question already.
By my choice.
@ricardo,
Long term usage or short term? I have some short term servers I could give you.
https://lowendbox.com/blog/joes-datacenter-20month-dual-l5420-w-8gb-ram-500gb-hdd-kansas-city/
It seems comparable in performance/price. However, I haven't tried this provider myself, so I can't vouch for the quality.
https://virmach.com/vds-dedicated-servers/
8 of those may fit your needs and come close to your price tag if you pay quarterly. SSD is even available if you don't need the HDD space... and maybe buy monthly first and switch to quarterly in steps to spread the costs over three months :-)
Or ask @virmach for a custom deal. Especially if you need fewer IPs, he might be able to cut some dollars? Just a wild guess ;-)
You could put a bunch of hetzner (auction) boxes on the job: free inbound bw on those things iirc
Or 4 of these for similar CPU scores: https://www.wholesaleinternet.net/cart/?id=279 which will cost you LESS per month
@ricardo QuickPacket is running a special on our Dual Xeon L5640 systems in Atlanta at $49.99 per system per month. You could probably accomplish the same thing with fewer systems since each server has 12 cores/24 threads. Order link is in my signature.
+1 for the CPU/cost ratio - use the additional IPs with proxmox to split them into pieces for easier management/deployment and backup capabilities
Some good suggestions at comparable rates. Thanks. I'll do some more sums to pin down my specific requirements, but the ones mentioned are in the ballpark of what I'm after. RAM/disk requirements are quite low; CPU and bandwidth will be the bottlenecks. I'll be taking a closer look over the next week. Cheers.
Sure. Then have his app perform slower because it depends on single-core clock speed, not core count, plus inbound limited to far less bandwidth (a 100 or 1000 Mbit port divided among the VPSes).
Hetzner is cheap for i7s, especially if you don't pay VAT - Seflow also ok, but already more expensive. JoesDC/Wholesale (essentially same crap) should work also, but the HE/Cogent mix is pretty bad.
@ricardo Is your crawling app heavily multi-threaded / multi-process, or is it single-threaded? I.e. is single-thread performance more important, or are more cores / threads better?
Lots of threads. Give me a moment and I'll type out a fairly detailed description of how it's done. Then hopefully I'll get the creme-de-la-creme analysis from you lot.
This is a pretty detailed gist of what the boxes will do, and I'm open to ideas of what hardware (or offer) would fit the bill. The one caveat I'd add is that extra bandwidth and more boxes are a distinct possibility.
My previous incarnation of this used a headless browser (PhantomJS) because I was interested in JavaScript manipulating the DOM, but I've found out it doesn't make all that much difference for the fingerprints I'm interested in. I ran around 30 threads per box and was maxing out on CPU, but using libcurl I imagine the CPU load will be 90% lower for crawling and more intensive on the PCRE front, assuming I crawl quicker.
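As a rough illustration of the threads-per-box setup described above, here is a sketch in Python rather than C with libcurl. The pool size of 30 comes from the post; `crawl`, `fetch_with_urllib` and the injectable `fetch` parameter are hypothetical names, and this is not the poster's actual crawler.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch_with_urllib(url, timeout=10):
    """Dumb fetch of one page, no JavaScript execution (stand-in for libcurl)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def crawl(urls, fetch=fetch_with_urllib, workers=30):
    """Fetch every URL on a thread pool; return {url: body, or None on failure}."""
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception:
            return None  # a real crawler would log this and perhaps retry

    with ThreadPoolExecutor(max_workers=workers) as pool:
        bodies = list(pool.map(safe_fetch, urls))
    return dict(zip(urls, bodies))
```

Passing the fetcher in as a parameter keeps the download step separate from the processing step, which matters here since the post expects the regex work, not the fetching, to dominate CPU.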
What load are you currently pulling on your boxes? Or are they cancelled due to payment malfunction?
I've updated my post regarding load.
His current systems seem to have dual CPUs, so each should be close to one E3.
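The dual-CPU reading follows from the /proc/cpuinfo excerpt in the first post: the last processor index shown is 7, and the E5420 is a quad-core part without Hyper-Threading. A small sketch of that inference, assuming the excerpt was the final processor block on the box (the helper name is made up):

```python
CPUINFO_EXCERPT = """\
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
"""

def parse_cpuinfo_block(text):
    """Turn 'key : value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

info = parse_cpuinfo_block(CPUINFO_EXCERPT)
logical_cpus = int(info["processor"]) + 1  # indices start at 0; assumes last block
CORES_PER_E5420 = 4                        # quad-core, no Hyper-Threading
sockets = logical_cpus // CORES_PER_E5420
print(info["model name"], logical_cpus, sockets)
```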
https://speedykvm.com/#vdedi
V-DEDICATED #1 (1x KVM VPS on dedicated host node) with coupon LET, $21.75/mo for;
Seems ideal for a crawler as you have the dedicated resources, dedicated gigabit port, etc. If you need more storage get a storage plan too, or larger vdedi and nfs mount it.
Wow. That's some serious number crunching.
For string matching, Pire looks right up your alley for combining multiple regex matches in a single pass. There's also Google's RE2 and PCRE-SLJIT compared in this benchmark. For C++, I like Boost::Xpressive's static regexes.
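To illustrate the idea of combining many patterns into a single pass, here is a small sketch using Python's standard `re` module rather than Pire, RE2 or PCRE. The three fingerprints are invented examples, not the poster's rule set, and a backtracking alternation like this does not give the performance characteristics of the libraries named above.

```python
import re

# Hypothetical fingerprints: name -> pattern (illustrative only).
FINGERPRINTS = {
    "google_analytics": r"UA-\d{4,10}-\d{1,4}",
    "jquery": r"jquery[^\s>]*\.js",
    "wordpress": r"/wp-content/",
}

# One alternation with a named group per fingerprint, compiled once.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pat})" for name, pat in FINGERPRINTS.items()),
    re.IGNORECASE,
)

def fingerprint(html):
    """Scan the page once and return the set of fingerprint names that matched."""
    return {m.lastgroup for m in COMBINED.finditer(html)}
```

The named groups are what let a single scan report which rule fired; the individual patterns must not contain capturing groups of their own for `lastgroup` to stay reliable.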
Are you avoiding XML parsing for performance reasons? I would have thought libxml2 parsing with XPath matching would be easier, and with 1000 regexes per page, you will amortize the expense of building the XML tree.
EDIT: Meant Boost::Xpressive and not Boost::Spirit
You might want to grab a few of the Intel Atom C2750 from Online.net. €15.99 each, pretty much the same price and better specs.
Online's IP reputation is pretty crappy though.
You probably missed the point that those Delimiter boxes come with dual CPUs, so I think the C2750 won't deliver better performance ;-)
@Incero
That looks like a nice deal, what part of it isn't dedicated then? Would you do a deal on 10 paid quarterly? I'd probably have a few more things to check out before going with it, maybe I can PM.
Thanks for the thoughts on it, rince. Previously PhantomJS already had the DOM tree, so that's where the heavy lifting was done. I haven't looked into anything other than PCRE regexes for now, but will check out the other libraries you mentioned. Fact is I couldn't really get into the inner workings of PhantomJS, and I'm pretty confident anything I do in C will outperform it, particularly given all the stuff a headless browser has to do compared to a dumb grab via cURL. I probably will do some XML parsing, as the majority of fingerprints are located in specific elements; I don't look at the whole page for each regex (e.g. hundreds of them just look at src and href attributes). The <head>, <a> and <script> elements contain most of the stuff, so I can even skip most of the string and just parse them sequentially with a subset of the regexes.
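The element-scoped approach described above can be sketched roughly as follows, using Python's standard `html.parser` instead of libxml2 or a C parser. The two rules and the class and function names are hypothetical; the point is only that URL-based rules see `src`/`href` values from the relevant elements, not the whole page.

```python
import re
from html.parser import HTMLParser

# Hypothetical subset of rules that only need to see URL attributes.
URL_RULES = {
    "jquery": re.compile(r"jquery[^\s>]*\.js", re.I),
    "wordpress": re.compile(r"/wp-content/", re.I),
}

class UrlAttrCollector(HTMLParser):
    """Collect src/href values from <a>, <script> and anything inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        if tag in ("a", "script") or self.in_head:
            self.urls.extend(v for k, v in attrs if k in ("src", "href") and v)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def match_url_rules(html):
    """Return names of URL rules matching any collected src/href value."""
    collector = UrlAttrCollector()
    collector.feed(html)
    collector.close()
    return {name for name, rx in URL_RULES.items()
            if any(rx.search(u) for u in collector.urls)}
```

Body text that merely mentions a library name is never handed to these rules, which is where the saving over whole-page matching comes from.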
Thanks to others for posting insights. I'm not too clued up on like-for-like CPU comparisons but have a slightly better idea today. It does seem quite reasonable that I can get similar CPU performance for a similar price.