I’d say a solution will be highly dependent of what exactly you need to do. A solution to write thousands of records per second might be very different from incrementing a counter in the example you provided. More so, there could be no tables at all to handle such load. Consistency/availability requirements are also missing in your question and depending on them the entire architecture may be very different.
Anyway, back to your specific simplistic case and your options
Option 1 (Master slave replication)
The problem you’ll face here is database locking – every increment would require a record lock to avoid race conditions and you’ll quickly get your processes writing to your db waiting in a queue and your system down. Even under a moderate load )
Option 2 (Sharding the DB)
Your assumption is correct, not much different from p.1
Option 3 (Batched updates)
Very close. A caching layer provided by a light-weight storage providing concurrent atomic incremens/decrements with persistence not to lose your data. We’ve used redis for a similar purpose although any other key-value database would do as well – there are literally dozens of such databases around.
A key-value database, or key-value store, is a data storage paradigm
designed for storing, retrieving, and managing associative arrays, a
data structure more commonly known today as a dictionary or hash table
The solution would look as follows:
incoming requests → your backend server -> kv_storage (atomic increment(product_id))
And you’ll have a “flushing” script running i.e. */5 that does the following (simplified):
- for every
product_idin kv_storage read its currentvalue - update your db counter (
+= value) - decrement the
valuein kv_storage
Further scaling
- if the script fails nothing bad would happen – the updates would arrive on next run
- if your backend boxes can’t handle load – you can easily add more boxes
- if a single key-value db can’t handle load – most of them support scaling over multiple boxes or a simple sharding strategy in your backend scripts would work fine
- if a single “flushing” script doesn’t keep up with increments – you can scale them to multiple boxes and decide what key ranges are handled by each one