How GOOGLE Works

ZDNet:
The numbers alone are enough to make your eyes water.

  • Over four billion Web pages, each an average of 10KB, all fully indexed

  • Up to 2,000 PCs in a cluster

  • Over 30 clusters

  • 104 interface languages including Klingon and Tagalog

  • One petabyte of data in a cluster -- so much that hard disk error rates of 10^-15 begin to be a real issue

  • Sustained transfer rates of 2Gbps in a cluster

  • An expectation that two machines will fail every day in each of the larger clusters

  • No complete system failure since February 2000

It is one of the largest computing projects on the planet, arguably employing more computers than any other single, fully managed system (we're not counting distributed computing projects here), some 200 computer science PhDs, and 600 other computer scientists.

http://insight.zdnet.co.uk/hardware/servers/0,39020445,39175560,00.htm
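
Two of the figures in the list above can be checked with simple arithmetic. The sketch below is only a rough illustration, not part of the article. It assumes decimal units (1 petabyte = 10^15 bytes), reads the error rate as one error per 10^15 bits read, and assumes failures are spread evenly across a 2,000-PC cluster. The function names are made up for this handout.

```python
# Back-of-envelope check of two figures from the ZDNet list.
# Assumptions (not from the article): decimal units, and an error
# rate of one error per 10**15 bits read.

PETABYTE_BYTES = 10**15
BITS_PER_BYTE = 8
ERROR_RATE_PER_BIT = 1e-15


def expected_read_errors(num_bytes, rate_per_bit=ERROR_RATE_PER_BIT):
    """Expected number of bit errors when reading num_bytes once."""
    return num_bytes * BITS_PER_BYTE * rate_per_bit


def machine_days_between_failures(cluster_size, failures_per_day):
    """Average days between failures for one machine,
    assuming failures are spread evenly over the cluster."""
    return cluster_size / failures_per_day


if __name__ == "__main__":
    # Reading a full petabyte once: roughly 8 expected errors.
    print(expected_read_errors(PETABYTE_BYTES))
    # 2,000 machines, 2 failures a day: about 1,000 days per machine.
    print(machine_days_between_failures(2000, 2))
```

Under these assumptions, an error rate that is negligible on one desktop disk produces several errors every time a whole cluster's data is read, and a machine that fails only once in about three years still means daily repairs at cluster scale.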




Schmidt:
Technology is always evolving, and companies--not just search companies--can't be afraid to take advantage of change.

When was the last time a computer or network was built to take advantage of cheap bandwidth, cheap DRAM, and plentiful PCs? Most companies are still building mainframes based on old computer architectures.

At Google, for example, we found it costs less money and it is more efficient to use DRAM as storage as opposed to hard disks--which is kind of amazing. It turns out that DRAM is 200,000 times more efficient when it comes to storing seekable data. In a disk architecture, you have to wait for a disk arm to retrieve information off of a hard-disk platter. DRAM is not only cheaper, but queries are lightning fast.

Search technology has a lot of room for improvement, be it algorithms or computer architecture.

[http://www.pcworld.com/news/article/0,aid,81685,00.asp]
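
The interview does not say where the 200,000 figure comes from. One plausible reading is a comparison of random access times, sketched below. The latency values are illustrative assumptions chosen for this handout (about 10 ms for a disk seek, about 50 ns for a DRAM access), not numbers given by Schmidt.

```python
# Compare random-access latency of disk and DRAM.
# Both latencies are assumed, illustrative values.

DISK_SEEK_SECONDS = 10e-3     # about 10 milliseconds per seek
DRAM_ACCESS_SECONDS = 50e-9   # about 50 nanoseconds per access


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast device answers one random read."""
    return slow_seconds / fast_seconds


def random_reads_per_second(latency_seconds):
    """Random reads one device can serve per second, one at a time."""
    return 1.0 / latency_seconds


if __name__ == "__main__":
    print(speedup(DISK_SEEK_SECONDS, DRAM_ACCESS_SECONDS))
    print(random_reads_per_second(DISK_SEEK_SECONDS))
    print(random_reads_per_second(DRAM_ACCESS_SECONDS))
```

With these assumed values the ratio comes out near 200,000: a disk serves on the order of a hundred random reads per second, while DRAM serves millions.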



Links

Recording of a speech by Jim Reese http://technetcast.ddj.com/tnc_play_stream.html?stream_id=420

The Magic That Makes Google Tick: http://insight.zdnet.co.uk/hardware/servers/0,39020445,39175560,00.htm

Google Cluster Architecture http://www.computer.org/micro/mi2003/m2022.pdf

Interview with Jim Reese http://www.hpworld.com/hpworldnews/hpw009/02nt.html

Is Google Broken? http://www.w3reports.com/index.php?itemid=549

PigeonRank: http://www.google.com/technology/pigeonrank.html

Anatomy of a Search Engine: http://www-db.stanford.edu/~backrub/google.html

The Secret of Google's Power http://blog.topix.net/archives/000016.html

Google Server Design http://news.cnet.com/8301-1001_3-10209580-92.html

Underground Server Farm http://www.wired.com/news/business/0,1367,48104,00.html?tw=wn_story_related

RAM at HowStuffWorks http://computer.howstuffworks.com/ram.htm/printable


Questions:

Hardware

  1. How many servers are there (approximately)?

  2. Describe the hardware in a Google server.

  3. What is a “server farm”?

  4. Why are heat and power significant issues in a server farm? Why is this an even bigger problem for Google?

  5. Explain the difference between Dynamic RAM (DRAM) and Static RAM (SRAM).

  6. Why would Google use DRAM instead of hard disks?

  7. Do servers break? Approximately how often?

  8. Explain the term “mean time between failures” and give an approximate value for a hard disk.

  9. How does Google avoid a “single point of failure”?

  10. What special arrangements does Google make to enable them to use very cheap hardware?

Software

  1. Which browser do the Google servers use?

  2. What standard software is running on the Google servers?

  3. Explain what “page rank” is and how it works.

  4. What does it mean when a “web-spider” is “crawling” the web?

  5. What is an “index”?

  6. Why does Google “cache” pages?

  7. Calculate the total data storage requirements for 4 billion pages averaging 10 KB each.

System

  1. What operating system does Google use? Why?

  2. Describe some differences between a server OS and a desktop OS.

  3. What major change in the OS did Google program themselves?

  4. How does a large block size in the file system improve disk drive performance?

  5. What does “parallel” mean?

  6. Explain “load balancing”.

  7. What does “scalability” refer to?

  8. What does “redundant” mean?

  9. What does “fault tolerant” mean?

  10. Describe how Google can stay “up” 24/7, despite frequent changes to the Web and frequent hardware failures.

Vocabulary

Megabyte, Gigabyte, Terabyte, Petabyte, Gbps

http, HTML, client, server, host, spider, crawl, DNS, IP, query, Boolean, bandwidth, cache

rackmount, 1U, 2U

parallel, cluster, load-balancing, multi-threading, multi-processor, scalability, fault-tolerance

IDE, SCSI, drive array, RAID

CPU, RAM, SRAM, DRAM, L2 cache