sharding
MongoDB to Use Sharding with $lookup Aggregation Operator
As the docs you quote indicate, you can’t use $lookup on a sharded collection. So the best practice workaround is to perform the lookup yourself in a separate query. Perform your aggregate query. Pull the “localField” values from your query results into an array, possibly using Array#map. Perform a find query against the “from” collection, … Read more
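The steps in the excerpt above can be sketched in Python. This is only a minimal sketch under stated assumptions, not the original answer's code: `manual_lookup`, `fetch_foreign`, and the field-name parameters are hypothetical names, and `fetch_foreign` stands for any callable that accepts a MongoDB-style filter document (a pymongo collection's `find` method would fit). It approximates `$lookup` and does not reproduce its handling of missing or null fields.

```python
def manual_lookup(local_docs, fetch_foreign, local_field, foreign_field, as_field):
    """Roughly emulate $lookup client-side: one extra query, then an in-memory join.

    All parameter names are hypothetical; fetch_foreign is assumed to take a
    MongoDB-style filter dict and return an iterable of documents.
    """
    local_docs = list(local_docs)
    # Step 1: pull the distinct "localField" values (assumed hashable),
    # keeping first-seen order.
    keys = list(dict.fromkeys(
        doc[local_field] for doc in local_docs if local_field in doc
    ))
    # Step 2: a single find against the "from" collection using $in.
    matches = {}
    for foreign in fetch_foreign({foreign_field: {"$in": keys}}):
        matches.setdefault(foreign[foreign_field], []).append(foreign)
    # Step 3: attach the matching documents under the "as" field.
    return [
        {**doc, as_field: matches.get(doc.get(local_field), [])}
        for doc in local_docs
    ]
```

The join costs one additional round trip regardless of how many documents the aggregate returned, which is usually preferable to issuing one find per result.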
Are there any REAL advantages to NoSQL over RDBMS for structured data on one machine?
If you’re starting off on a single server, then many advantages of NoSQL go out the window. The biggest advantages of the most popular NoSQL databases are high availability and less downtime. Relaxing requirements to eventual consistency can lead to performance improvements as well. It really depends on your needs. Document-based – If your data fits well … Read more
When do you start additional Elasticsearch nodes? [closed]
Let’s clarify the terminology a little first: Node: a running Elasticsearch instance (a Java process). Usually every node runs on its own machine. Cluster: one or more nodes with the same cluster name. Index: more or less like a database. Type: more or less like a database table. Shard: effectively a Lucene index. Every index … Read more
Extreme Sharding: One SQLite Database Per User
The place where this will fail is if you have to do what’s called “shard walking” – which is finding out all the data across a bunch of different users. That particular kind of “query” will have to be done programmatically, asking each of the SQLite databases in turn – and will very likely be … Read more
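A minimal Python sketch of what such "shard walking" might look like, using the standard `sqlite3` module. The function name `shard_walk` and the idea of passing a list of database file paths are assumptions made for illustration; the original answer does not specify an implementation.

```python
import sqlite3

def shard_walk(db_paths, sql, params=()):
    """Run the same query against every per-user database, one at a time.

    Returns a list of tuples: the database path followed by the row's columns.
    """
    rows = []
    for path in db_paths:
        conn = sqlite3.connect(path)
        try:
            # Each shard is queried in turn; there is no cross-database join.
            for row in conn.execute(sql, params):
                rows.append((path, *row))
        finally:
            conn.close()
    return rows
```

The cost grows linearly with the number of user databases, since every file has to be opened and queried separately, which is why this kind of query is the weak spot of the one-database-per-user design.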
MySQL Partitioning / Sharding / Splitting – which way to go?
You will definitely start to run into issues on that 42 GB table once it no longer fits in memory. In fact, as soon as it does not fit in memory anymore, performance will degrade extremely quickly. One way to test is to put that table on another machine with less RAM and see how … Read more
Database partitioning – Horizontal vs Vertical – Difference between Normalization and Row Splitting?
Partitioning is a rather general concept and can be applied in many contexts. When applied to relational data, it usually refers to decomposing your tables either row-wise (horizontally) or column-wise (vertically). Vertical partitioning, aka row splitting, uses the same splitting techniques as database normalization, but usually the term (vertical / horizontal) data … Read more
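To make the row-wise versus column-wise distinction concrete, here is a small Python sketch operating on rows held as dictionaries. The function names, the modulo rule for choosing a partition, and the assumption of an integer key are all illustrative choices, not something the excerpt prescribes.

```python
def split_horizontally(rows, key, n_parts):
    """Row-wise split: each partition holds whole rows, chosen by the key.

    Assumes row[key] is an integer; the modulo rule is only one possible scheme.
    """
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[row[key] % n_parts].append(row)
    return parts

def split_vertically(rows, key, column_groups):
    """Column-wise split: each partition holds some columns plus the key,
    so the original rows can be rebuilt by joining on the key."""
    return [
        [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]
```

In the horizontal case every partition has the same columns but a subset of the rows; in the vertical case every partition has all the rows but a subset of the columns, with the key repeated in each.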
MongoDB querying performance for over 5 million records
This is searching for a needle in a haystack. We’d need some output of explain() for those queries that don’t perform well. Unfortunately, even that would fix the problem only for that particular query, so here’s a strategy on how to approach this: Ensure it’s not because of insufficient RAM and excessive paging. Enable the DB … Read more
MySQL sharding approaches?
The best approach for sharding MySQL tables is to not do it unless it is totally unavoidable. When you are writing an application, you usually want to do so in a way that maximizes velocity, that is, developer speed. You optimize for latency (time until the answer is ready) or throughput (number of answers per … Read more
ElasticSearch: Unassigned Shards, how to fix?
By default, Elasticsearch will re-assign shards to nodes dynamically. However, if you’ve disabled shard allocation (perhaps you did a rolling restart and forgot to re-enable it), you can re-enable shard allocation.
# v0.90.x and earlier
curl -XPUT 'localhost:9200/_settings' -d '{ "index.routing.allocation.disable_allocation": false }'
# v1.0+
curl -XPUT 'localhost:9200/_cluster/settings' -d '{ "transient" : { "cluster.routing.allocation.enable" : … Read more