Multi-master replication for SQL/NoSQL/FUSE
Hello experts,
I have some development servers lying around and would like to make better use of them. My dedicated servers have free HDD space and the virtual servers have free SSD space available.
Right now I'm working with MySQL/MariaDB and PostgreSQL on the SQL side and with MongoDB on the NoSQL side. There is no replication at all, just polling and pushing data between the servers.
What I'd really like to have is a true multi-master replication solution where I can
- write on any node
- read from any node
- add new nodes anytime
- remove existing nodes anytime
It doesn't have to be synchronous; some delay is acceptable.
If no stable database solution is possible, a distributed file system would also help, where I can allocate the free HDD/SSD space on the different servers to one distributed pool and mount it via FUSE.
What solutions are you guys using in similar environments?
Comments
Cassandra (using).
CockroachDB (planning to test sometime).
Galera (MySQL / MariaDB / Percona)?
It's not that great over high latency WAN links though (although I expect no distributed SQL would be).
Other than that, we use it for our MySQL (3x droplets).
-SQL: CockroachDB
-NoSQL: Couchbase (general use), Cassandra (large scale use)
Thanks for your answers so far. Do those work with differently sized nodes, or do they all have to be about the same size? I've tested Galera, but I couldn't find sharding capabilities there. MongoDB has sharding, but not multi-master.
I've been running Galera for 4-5 years now. In the meantime I've been adding and removing nodes without any downtime. It works great, and I can highly recommend it.
I'm not sure what sharding is btw. I guess you can't have it all.
Scalable multi-master ACID SQL databases are basically mythical beasts. You really need a dedicated database cluster tuned for performance and extremely low-latency links between them. The same is true of distributed file-systems.
If you care about data consistency, it's generally most performant to have a single master on the fastest hardware you can manage, with streaming replication and fail-over. Reading from the secondary improves performance a bit. If you don't care about data consistency, and are okay with losing a lot of data when one of your DBs suddenly goes down, then async replication can help, but it's still not magically scalable, and I'd consider very carefully whether you actually need it before going through all the hassle.
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
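To make the single-master suggestion above a bit more concrete, here is a minimal sketch of the routing side: writes go to the primary, reads are spread round-robin over the secondaries. Everything in it is an assumption for illustration. `Router`, `Executor`, and `is_write` are hypothetical names, and the "connections" are plain callables standing in for whatever database driver you actually use.

```python
import itertools
from typing import Callable, List

# Hypothetical stand-in for a real DB connection: takes SQL, returns a result.
Executor = Callable[[str], str]

WRITE_PREFIXES = ("insert", "update", "delete", "create", "alter", "drop")


def is_write(sql: str) -> bool:
    """Crude check on the first keyword; a real router needs more care."""
    return sql.lstrip().lower().startswith(WRITE_PREFIXES)


class Router:
    """Send writes to the primary and spread reads over the replicas."""

    def __init__(self, primary: Executor, replicas: List[Executor]):
        self.primary = primary
        # With no replicas, every statement falls back to the primary.
        self._next = itertools.cycle(replicas) if replicas else None

    def execute(self, sql: str) -> str:
        if is_write(sql) or self._next is None:
            return self.primary(sql)
        return next(self._next)(sql)


if __name__ == "__main__":
    router = Router(
        primary=lambda sql: "primary",
        replicas=[lambda sql: "replica-1", lambda sql: "replica-2"],
    )
    for stmt in ("INSERT INTO t VALUES (1)", "SELECT 1", "SELECT 2"):
        print(stmt, "->", router.execute(stmt))
```

Keep in mind that a replica can lag behind the primary, so a read issued right after a write may not see it. If you need read-your-own-writes, send those reads to the primary as well.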
If you want to go small...
http://rqlite.com
SQLite with a Raft consensus engine.
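One thing to keep in mind when adding and removing nodes in anything Raft-based: the cluster only accepts writes while a majority of nodes is reachable. A quick sketch of the arithmetic (the function names are mine, not part of any tool's API):

```python
def raft_quorum(nodes: int) -> int:
    """Smallest majority of a cluster with the given number of voting nodes."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be down while a majority is still reachable."""
    return nodes - raft_quorum(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "nodes: quorum", raft_quorum(n),
              "tolerates", tolerated_failures(n), "down")
```

So going from 3 to 4 nodes does not buy any extra fault tolerance, which is why odd cluster sizes are the usual choice.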