What exactly is a wide column store?

Question

Let’s start with the definition of a wide column database.

Its architecture uses (a) persistent, sparse matrix, multi-dimensional
mapping (row-value, column-value, and timestamp) in a tabular format
meant for massive scalability (over and above the petabyte scale).

A relational database is designed to maintain the relationship between the entity and the columns that describe the entity. A good example is a Customer table. The columns hold values describing the Customer’s name, address, and contact information. All of this information is the same for each and every customer.

A wide column database is one type of NoSQL database.

Maybe this is a better image of four wide column databases.

Wide column databases

My understanding is that the first image at the top, the Column model, is what we called an entity/attribute/value table. It’s an attribute/value table within a particular entity (column).

For Customer information, the first wide-area database example might look like this.

Customer ID    Attribute    Value
-----------    ---------    ---------------
     100001    name         John Smith
     100001    address 1    10 Victory Lane
     100001    address 3    Pittsburgh, PA  15120

Yes, we could have modeled this for a relational database. The power of the attribute/value table comes with the more unusual attributes.

Customer ID    Attribute    Value
-----------    ---------    ---------------
     100001    fav color    blue
     100001    fav shirt    golf shirt

Any attribute that a marketer can dream up can be captured and stored in an attribute/value table. Different customers can have different attributes.

The Super Column model keeps the same information in a different format.

Customer ID: 100001
Attribute    Value
---------    --------------
fav color    blue
fav shirt    golf shirt

You can have as many Super Column models as you have entities. They can be in separate NoSQL tables or put together as a Super Column family.

The Column Family and Super Column family simply gives a row id to the first two models in the picture for quicker retrieval of information.

Leave a Comment Cancel reply