- If you take a look at the README file in the Apache Cassandra git repo, it says:
Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster.
Row store means that like relational databases, Cassandra organizes data by rows and columns.
-
Column-oriented (columnar) databases store their data on disk column by column.
For example, take a table named Bonuses:
ID  Last   First  Bonus
1   Doe    John   8000
2   Smith  Jane   4000
3   Beck   Sam    1000
-
In a row-oriented database management system, the data would be stored like this:
1,Doe,John,8000;2,Smith,Jane,4000;3,Beck,Sam,1000;
-
In a column-oriented database management system, the data would be stored like this:
1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;
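To make the difference between the two layouts concrete, here is a minimal Python sketch that serializes the Bonuses table both ways. The function names and the in-memory `rows` list are made up for illustration; a real database engine writes binary pages, not comma-separated strings.

```python
# Hypothetical in-memory copy of the Bonuses table from the example.
rows = [
    (1, "Doe", "John", 8000),
    (2, "Smith", "Jane", 4000),
    (3, "Beck", "Sam", 1000),
]


def serialize_row_oriented(rows):
    """Keep all values of one row together, one row after another."""
    return "".join(",".join(str(v) for v in row) + ";" for row in rows)


def serialize_column_oriented(rows):
    """Keep all values of one column together, one column after another."""
    columns = zip(*rows)  # transpose: rows become columns
    return "".join(",".join(str(v) for v in col) + ";" for col in columns)


row_layout = serialize_row_oriented(rows)
column_layout = serialize_column_oriented(rows)
print(row_layout)     # 1,Doe,John,8000;2,Smith,Jane,4000;3,Beck,Sam,1000;
print(column_layout)  # 1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;
```

Summing the Bonus column only needs the last segment of the column layout, while fetching everything about one employee only needs one segment of the row layout. That is the usual trade-off between the two.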
-
Cassandra is basically a column-family store.
-
Cassandra would store the above data as:
"Bonuses" : {
row1 : { "ID":1, "Last":"Doe", "First":"John", "Bonus":8000},
row2 : { "ID":2, "Last":"Smith", "First":"Jane", "Bonus":4000}
...
}
-
Also, the number of columns in each row doesn’t have to be the same. One row can have 100 columns and the next row can have only 1 column.
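As a rough mental model only (this is not how Cassandra lays out data internally), a column family can be pictured as a map from row key to a map of column name to value. In the Python sketch below, `row4` and the helper functions are hypothetical additions used to show that rows need not share the same set of columns.

```python
# Illustrative model: column family = {row key: {column name: value}}.
bonuses = {
    "row1": {"ID": 1, "Last": "Doe", "First": "John", "Bonus": 8000},
    "row2": {"ID": 2, "Last": "Smith", "First": "Jane", "Bonus": 4000},
    "row3": {"ID": 3, "Last": "Beck", "First": "Sam", "Bonus": 1000},
    "row4": {"ID": 4},  # hypothetical sparse row with a single column
}


def column_count(family, row_key):
    """Number of columns actually present in one row."""
    return len(family[row_key])


def get_column(family, row_key, column, default=None):
    """Look up one column; absent columns simply are not stored."""
    return family.get(row_key, {}).get(column, default)


print(column_count(bonuses, "row1"))         # 4
print(column_count(bonuses, "row4"))         # 1
print(get_column(bonuses, "row4", "Bonus"))  # None
```

A missing column costs nothing in this model because it is never written, unlike a fixed-width relational row that would carry a NULL in its place.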
-
Read this for more details.