Database Sharding Blog

Saturday, March 28, 2009

Google's Infrastructure Discussed

IT World has published an interview with Google's Vice President of Research, Alfred Spector, that provides details about Google's IT infrastructure and strategy:

Google uses what is now termed "cloud computing." We have numerous clusters, each containing large numbers of computers. The clusters run a distributed computing infrastructure that uses Linux on each computer. All the computers are then tied together with high-performance networking and distributed computing software. For example, we have built and deployed a global file system called the Google File System that provides scalable, fault-tolerant storage; a record-oriented data storage system for tabular data called BigTable; and a computational programming model called MapReduce that allows our batch jobs to use the inherent parallelism in our clusters.
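To make the MapReduce idea mentioned above a little more concrete, here is a minimal single-process sketch in Python. It is not Google's implementation and is not taken from the interview; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. It only shows the shape of the model: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step combines each group. In a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence on a list of documents (hypothetical driver)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the cluster failed", "the cluster recovered"]))
```

Because each map call looks at only one document and each reduce call looks at only one key, the work can be split across machines without the steps needing to coordinate, which is the "inherent parallelism" the quote refers to.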

[As for] the exact number of machines, locations and clusters we have, suffice it to say that we have so many individual elements in our fabric that an enormous amount of attention is paid to fault tolerance, because with so many elements operating, there are exceedingly frequent component failures.

Could other companies emulate that kind of architecture?

First, there really are economies of scale in running systems that can support many services on a common fabric. Second, relating to the services model we espouse, there are great simplifications to releasing software as a Web-based service, because services don't have to be tested and deployed on a large number of different customer environments. Instead, software can be released to a small number of machines in a more controlled cloud and then accessed by browsers.

A third benefit is that since a software service is a logically centralized notion, the history of interactions of very many users can be aggregated and thus be the basis for various types of self-learning systems. Google uses this concept to learn to correct spelling mistakes, but businesses can use similar notions to better meet the needs of employees or customers by learning, for example, of common errors, unfulfilled product searches, etc.
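As a rough illustration of the aggregation idea in the last paragraph, the Python sketch below counts how often users retype one query as another and keeps the most common rewrite as a suggested correction. This is an assumption-laden toy, not a description of how Google's spelling correction works: the input format (pairs of typed query and follow-up query), the function name build_corrections, and the min_count threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_corrections(rewrites, min_count=2):
    """Learn typed -> corrected mappings from aggregated (typed, retyped) pairs.

    `rewrites` is a hypothetical log of query pairs gathered across many users.
    A correction is kept only if at least `min_count` users made the same rewrite.
    """
    counts = defaultdict(Counter)
    for typed, retyped in rewrites:
        if typed != retyped:
            counts[typed][retyped] += 1

    corrections = {}
    for typed, counter in counts.items():
        best, seen = counter.most_common(1)[0]  # most frequent rewrite for this query
        if seen >= min_count:
            corrections[typed] = best
    return corrections


if __name__ == "__main__":
    log = [
        ("gogle", "google"),
        ("gogle", "google"),
        ("gogle", "goggle"),
        ("bigtabel", "bigtable"),
    ]
    print(build_corrections(log))
```

The same counting pattern could be pointed at the business cases the quote mentions, for example tallying product searches that returned no results.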
