Monday, July 9, 2007

Sharding

Shard, or sharding, has become the new buzzword lately. The term seems to have been popularized by Google and made more prominent by Flickr and Digg, who credit it with helping them scale up performance. Not being familiar with the term, I did some research around the web to try to understand what this new technology is.

My initial investigation left me more confused. Based on Digg's description, it sounded like all they are doing is traditional data partitioning (vertical or horizontal, I forget which) or maybe database clustering to distribute load.
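For reference, the two kinds of partitioning differ in what gets split: horizontal partitioning divides the rows, vertical partitioning divides the columns. Here is a minimal sketch in Python. The user rows, the field names, and the modulo rule are all made up for illustration and are not taken from Digg's setup.

```python
# Hypothetical user rows; the field names are made up for illustration.
users = [
    {"id": 1, "name": "ann", "bio": "likes photos"},
    {"id": 2, "name": "bob", "bio": "likes news"},
    {"id": 3, "name": "cat", "bio": "likes maps"},
    {"id": 4, "name": "dan", "bio": "likes code"},
]


def horizontal_partition(rows, num_parts):
    """Split by row: each partition holds whole rows for a subset of ids."""
    parts = [[] for _ in range(num_parts)]
    for row in rows:
        # Assumed rule: the id modulo the partition count picks the partition.
        parts[row["id"] % num_parts].append(row)
    return parts


def vertical_partition(rows, columns):
    """Split by column: keep the key plus the chosen columns of every row."""
    return [{"id": row["id"], **{c: row[c] for c in columns}} for row in rows]


by_row = horizontal_partition(users, 2)           # two sets of complete rows
names_only = vertical_partition(users, ["name"])  # every row, fewer columns
bios_only = vertical_partition(users, ["bio"])
```

From what I can tell, the descriptions of sharding are closest to the horizontal case, with each set of rows living on its own database server.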

When I looked at the comments made by Google, they also spoke only about the physical layout of their database. On closer examination, though, it turned out that they work with Hibernate's sharding mechanism alongside their database architecture. Hibernate is a software ORM (object-relational mapping) solution. ORM is not new either. Essentially, it means creating an object for accessing data that hides the underlying database architecture from the caller. Hibernate takes this one step further by allowing multiple data sources behind a single data object.
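To make the idea of one data object over multiple data sources concrete, here is a rough sketch in Python. It is not Hibernate's API. The UserStore class, the users table, and the modulo routing rule are assumptions of mine, and in-memory SQLite databases stand in for separate database servers.

```python
import sqlite3


class UserStore:
    """One data object in front of several databases (the shards)."""

    def __init__(self, num_shards):
        # Each in-memory SQLite database stands in for a separate server.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for db in self.shards:
            db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def _shard_for(self, user_id):
        # Assumed routing rule: user id modulo the number of shards.
        return self.shards[user_id % len(self.shards)]

    def save(self, user_id, name):
        db = self._shard_for(user_id)
        db.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        db.commit()

    def find(self, user_id):
        row = self._shard_for(user_id).execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None

    def count(self):
        # A query that spans shards has to ask each one and combine the results.
        return sum(
            db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
            for db in self.shards
        )


store = UserStore(num_shards=2)
store.save(1, "ann")
store.save(2, "bob")
store.save(3, "cat")
```

The calling code only ever talks to store.save, store.find, and store.count. Which database a given row lives in is decided inside the object, which is the abstraction part of the story.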

In the end, it seems that sharding is a new term for using ORM and database partitioning together.
