Anyone running Cassandra on a LEB?
I know, I know, it's java, so it's definitely not a good idea to run it on OVZ. They recommend 8GB of RAM for Cassandra, so I figured it would be a great idea to try running it on a 256MB KVM. It crashed and burned, hard. Unfortunately, I can't find much information about running Cassandra in a low memory environment... when I see a question on a forum like "Cassandra keeps running out of memory, can I decrease memory usage?", everyone's answer is "Throw more memory at it!" That's all fine and dandy, but this is Low End Talk.
So, does anyone here have any experience with Cassandra, especially in a low end environment? Is it even possible?
Comments
I don't know much about Cassandra, but usually the idea is to use efficient code instead of "throwing more memory at it." But then, when "The largest known Cassandra cluster has over 300 TB of data in over 400 machines." (cassandra.apache.org) it might be difficult to fit this onto a 256MB KVM.
Throw more swap at it!
I know j/k xD
I don't know anything about this project =(
But it doesn't seem to be the right software for this level of hardware/servers.
I think MongoDB might do a better job as a NoSQL solution for low end boxes haha
And the next point: what do you need Cassandra for? It only makes sense in a big cluster.
Also, Cassandra isn't very fast at read operations; it's OK at write operations.
Have you ever tried to build a database and an application with Cassandra as the backend? When I tried it, it was difficult. And if you grew up with MySQL / standard SQL, it's even more difficult.
If you need a "cool" and fast database, try Redis. I've already used it on a LEB and it ran just fine. I was able to write(!) ~60k-80k objects/strings to Redis!
And come on, first build an application that really needs a Cassandra cluster. Big projects like Wikipedia are using MySQL, and reddit is using a kind of MySQL. And if your app gets sooo popular, there are many options to optimize it.
But those are just two cents from a 16-year-old kiddie :P
Maybe you will laugh, but people are recommending a very large amount of swap for Redis clusters. It's because all data in Redis is kept in RAM as well as on disk. But RAM is limited... and many people say that having parts of a Redis database in swap is still faster than old MySQL clusters oO
That's not optimization. It's expanding your resources.
:P
I meant what someone can do if their app gets popular quickly, and that coders don't have to worry about optimizing at the beginning. Just focus on programming your app.
Look, nobody, or almost nobody, optimizes their apps during the development process. Facebook started with simple PHP and MySQL.
Now they have HipHop for PHP, their own database, many MySQL optimizations, and so on.
I have, actually. And as far as I am aware it is pretty common amongst those that have limited resources.
That really depends on how well it scales. For example, it might be inefficient with a small number of servers but be optimized for more nodes, i.e. double the number of servers but get 4x the speed.
I'm not looking for optimization, I'm looking for high availability and the ability to do writes on any node. MySQL isn't suitable for this (yes, I know about MySQL Cluster, but it's a pain to set up and it has a lot of limitations). Redis doesn't seem to allow for writes to occur on multiple nodes (actually, it seems to allow writes to slaves, but they don't propagate to the master). Likewise, MongoDB doesn't allow for multi-master environments either. Cassandra seems to fit the bill perfectly, except for not running on LEBs.
Riak.
When invoking a Java binary, the following command-line options:
-Xmx256M -Xms256M
set the maximum (-Xmx) and initial (-Xms) heap size, in this case giving the program 256 MB of heap (the default maximum was 64 MB on older JVMs),
so you could try tuning/lowering those for Cassandra. Here are some links:
http://rimuhosting.com/knowledgebase/linux/java/-Xmx-settings
http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp
Also search for "java xmx" and "java tuning" on Google.
good luck!
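To check which heap limits a JVM actually ends up with after passing flags like these, the runtime can be asked directly. A minimal sketch (the class name `HeapReport` is made up for illustration and has nothing to do with Cassandra itself):

```java
// Hypothetical helper, not part of Cassandra: prints the heap limits of the running JVM.
final class HeapReport {
    // Convert a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx, or the JVM's own default when -Xmx is absent.
        System.out.println("max heap:       " + toMiB(rt.maxMemory()) + " MiB");
        // totalMemory() is the heap currently reserved; it starts out near -Xms.
        System.out.println("committed heap: " + toMiB(rt.totalMemory()) + " MiB");
        // freeMemory() is the unused part of the committed heap.
        System.out.println("free in heap:   " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Compile it with `javac HeapReport.java` and run it as, for example, `java -Xmx128M -Xms128M HeapReport`. The reported maximum should be roughly the -Xmx value; some garbage collectors report slightly less.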
With the heap size set to 96MB or less, it errors out ("Too small initial heap") and it still causes the OOM killer to start killing stuff off. At 128MB, it starts, but it causes OOM and a kernel panic. I just don't think Cassandra is going to play well with a small amount of RAM.
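One likely reason a 128 MB heap still triggers the OOM killer on a 256 MB box is that the heap is only part of what a JVM process uses: thread stacks, class metadata, and off-heap buffers all sit on top of it. A back-of-the-envelope sketch, where the class name and every number are illustrative assumptions, not measurements of Cassandra:

```java
// Rough footprint estimate. All inputs used in main() are placeholders, not measured values.
final class FootprintEstimate {
    // Heap + thread stacks + everything else the JVM keeps outside the heap, in MiB.
    static long estimateMiB(long heapMiB, int threads, long stackKiBPerThread, long otherNonHeapMiB) {
        long stacksMiB = (threads * stackKiBPerThread + 1023) / 1024; // round up to whole MiB
        return heapMiB + stacksMiB + otherNonHeapMiB;
    }

    // True if the estimate fits in RAM after setting some aside for the OS and other processes.
    static boolean fits(long estimateMiB, long ramMiB, long reservedForOsMiB) {
        return estimateMiB <= ramMiB - reservedForOsMiB;
    }

    public static void main(String[] args) {
        // Assumed: 128 MiB heap, 100 threads with 512 KiB stacks, 64 MiB of other non-heap memory.
        long estimate = estimateMiB(128, 100, 512, 64);
        System.out.println("estimated footprint: " + estimate + " MiB");
        // Assumed: 256 MiB box with 48 MiB kept for the kernel and other daemons.
        System.out.println("fits in 256 MiB box: " + fits(estimate, 256, 48));
    }
}
```

With guesses like these, the process outgrows the box even though the heap alone is only half of the RAM, which would match the behaviour described above.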
Also, googling for "cassandra low memory" gives interesting links like:
http://blog.mikiobraun.de/2010/08/-cassandra-tips.html
http://jonathanhui.com/cassandra-performance-tuning-and-monitoring#CassandraPerformanceTuningandMonitoring-CassandraMemoryCacheTunning
Are you using the OpenJDK JRE? With generic Java apps that exhibited OOM issues with OpenJDK on a 128/128 OVZ, I've had much better results with IBM's JRE instead.
I'll try that, thanks @quirkyquark
This thread is...interesting.
Are you looking for a setup where you can commit on any node and the nodes are allowed to be inconsistent (i.e., you commit on node 1 and nodes 3 and 4 may not reflect that for a little bit)? If so, MySQL and many other products are fine.
If you want an RDBMS where all nodes are perfectly in sync, such that if you commit on node #1 and a microsecond later you query node #32 and it gives you a perfectly consistent answer that reflects that commit while still being highly performant, then you want Oracle RAC or IBM's DB2 pureScale. Be prepared to spend hundreds of thousands of dollars.
You are posting on a Low End Box forum talking about multi-master HA. Perhaps your needs are not as extreme as you suggest and you could probably live with MySQL clustering quite nicely?
MySQL 6.0 is rumored to have true synchronous replication...but of course, the commit on node 1 doesn't complete until the commit on node 2 (and 3, 4, etc.) also succeeds (two-phase commit), so unless you're in the same DC talking over a private network, it doesn't scale very well. And even then...there are limits.
If you don't want an RDBMS and are looking for NoSQL, then the options broaden. Of course, NoSQL and RDBMS are two completely different solutions.
An alternative to rolling your own is to look at Amazon's SimpleDB, DynamoDB, MongoHQ, etc. Heck, Amazon SimpleDB will give you 1GB free forever.
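To make the two-phase commit point concrete, here is a toy, in-memory sketch. The `Node` class and `write` method are hypothetical; there is no networking, no timeout handling, and no failure recovery, and it is not how MySQL or any other product implements it. It only shows why a write cannot complete until every node has agreed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy two-phase commit over in-memory "nodes"; purely illustrative.
final class TwoPhaseCommitSketch {

    static final class Node {
        final String name;
        final boolean healthy;     // an unhealthy node votes "no" during prepare
        String committed = null;   // last value this node has committed
        private String staged = null;

        Node(String name, boolean healthy) {
            this.name = name;
            this.healthy = healthy;
        }

        // Phase 1: stage the value and vote yes, or refuse.
        boolean prepare(String value) {
            if (!healthy) return false;
            staged = value;
            return true;
        }

        // Phase 2: make the staged value visible.
        void commit() { committed = staged; staged = null; }

        // Abort: drop the staged value.
        void rollback() { staged = null; }
    }

    // Returns true only if every node accepted and applied the write.
    static boolean write(List<Node> nodes, String value) {
        List<Node> prepared = new ArrayList<>();
        for (Node n : nodes) {
            if (n.prepare(value)) {
                prepared.add(n);
            } else {
                // A single "no" vote aborts the write everywhere.
                for (Node p : prepared) p.rollback();
                return false;
            }
        }
        // All nodes voted yes, so tell each one to commit.
        for (Node n : nodes) n.commit();
        return true;
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(new Node("node1", true), new Node("node2", true));
        System.out.println("all healthy: " + write(cluster, "x=1"));

        List<Node> degraded = Arrays.asList(new Node("node1", true), new Node("node2", false));
        System.out.println("one node down: " + write(degraded, "x=2"));
    }
}
```

Every write has to wait for the slowest node's vote, which is why this approach suffers once the nodes are spread over a slow or unreliable link.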
What about running Cassandra on these 6GB nodes (https://vpsdime.com/index.php#product)? I'm seduced by their large amount of RAM, which perhaps makes them suitable for hosting Cassandra?