Appendix 2: Hardware Technology Forecast

A2.1 Introduction

Extrapolation of past trends to predict the future can be dangerous, but it has proven to be the best tool for predicting the evolution of computer hardware technology. Over the last four decades, computer technology has improved each year. Occasionally technological breakthroughs have occurred and occasionally development has stalled but, viewed on semi-log paper plotted over the last four decades, the trend lines march inexorably up and to the right in almost-straight lines.

The typical trend line is canonized by Gordon Moore's law that "circuits-per-chip increases by a factor of 4 every 3 years." This observation has been approximately true since the early RAM (random-access memory) chips of 1970. Related to this is Bill Joy's law that Sun Microsystems' processor MIPS (millions of instructions per second) double every year. Though Sun's own technology has not strictly obeyed this law, the industry as a whole has.

Two recent surprises have been a slowdown in the DRAM price decline and an acceleration in magnetic recording density growth.

These "new" trends appear to be sustainable though at different rates: The DRAM industry is moving more slowly than before, but progress in magnetic recording technology is much more rapid. Both of these trends are due to financial rather than technical issues. In particular, 16-Mb DRAMs are being produced, and the 64-Mb and 256-Mb generations are working in the laboratory.

While the slowest of these trends is moving at a 40% deflator per year, some trends are moving at 50% or 60% per year. A 35% deflator means that technology will be 20 times cheaper in 10 years (see Figure A2-1). A 60% deflator means that technology will be 100 times cheaper in 10 years: A $1B system will cost $10M.
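
The multipliers in Figure A2-1 follow directly from compounding: an annual deflator of d makes technology (1 + d)^10 times cheaper after a decade. A quick check of the figures quoted above:

```python
# Ten-year improvement implied by an annual technology "deflator":
# a deflator of d means the technology gets (1 + d) times cheaper per year.
def ten_year_multiplier(deflator: float, years: int = 10) -> float:
    return (1.0 + deflator) ** years

print(round(ten_year_multiplier(0.35)))  # 20  -- "20 times cheaper in 10 years"
print(round(ten_year_multiplier(0.60)))  # 110 -- roughly the 100x in the text
```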

Figure A2-1: Year vs. Savings for Various Technology Growth Rates

The implications of this growth encourage EOSDIS to

Consider Figure A2-2. It shows the EOSDIS storage demand, as predicted by HAIS, and the price of that storage each year. The upper curve shows cumulative_storage(t), the amount of storage needed for EOSDIS in year t. According to the forecast, storage grows to about 15 PB in 2007. The bottom curve shows the cost of the entire store if it were bought that year: disk_price(t) x cumulative_storage(t). The graph assumes 10% of the data is on disk and 90% is on tape. After the year 2001, the price of an entirely new archive declines because technology is improving at 40% per year (a conservative rate) while the archive is growing at a constant rate. One implication of this analysis is that NASA could keep the entire archive online (rather than nearline) in 2007 by investing in a $200M disk farm. If disk technology were to improve at 60%/year (as it is now), the prices in 2007 would be almost 10 times lower than this prediction.
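
The shape of the bottom curve in Figure A2-2 can be sketched with a toy model. The growth profile and 1995 base price below are illustrative assumptions, not the HAIS forecast numbers; the point is only that an archive growing by a constant amount each year, multiplied by an exponentially falling price, must eventually decline in total cost.

```python
# Toy model of Figure A2-2's cost curve: cost(t) = price(t) * storage(t).
# Assumed (illustrative) inputs: archive ramps linearly to 15 PB by 2007,
# and storage price falls 40% per year from an arbitrary 1995 base.
def storage_pb(year):               # cumulative_storage(t), linear ramp
    return 15.0 * (year - 1994) / (2007 - 1994)

def price_per_pb(year):             # blended storage price, 40% deflator
    return 50e6 / 1.40 ** (year - 1995)

cost = {y: storage_pb(y) * price_per_pb(y) for y in range(1995, 2008)}
peak = max(cost, key=cost.get)
# After the peak year, buying the whole archive new gets cheaper every year.
assert all(cost[y + 1] < cost[y] for y in range(peak, 2007))
```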

Figure A2-2: EOSDIS Storage Size and Cost with a 40% per Year Deflator

Along with these price reductions, there has been a commoditization of the computing industry. A comparable restructuring of the long-haul communications industry is in progress. Today, one can buy commodity hardware and software at very low prices. These components benefit from economies of scale, amortizing the engineering costs over millions of units. Traditional "mainframe" and "supercomputer" hardware and software, though, sell only thousands of units. Consequently, the fixed costs of designing, building, distributing, and supporting such systems are distributed over many fewer units. This makes such software very expensive to buy and use.

Even within the commodity business, there are two distinct price bands: servers and desktops. Servers are typically higher-performance, more carefully constructed, and more expandable. But they also cost more. Servers are typically 2 or 3 times more expensive per unit than clients. We expect this distinction to continue because it reflects true costs rather than pricing anomalies. To gain a sense of the relative costs of commodity versus boutique prices, consider representative prices of various components in Table A2-1.

Table A2-1: Comparison of Three Price Bands: Boutique vs. Commodity Components

             $/SPECint    $/MB RAM    $/MB disk    $/tape drive    $/DBMS
Mainframe    25,000       1,000       5            30,000          100K
Server       200          100         1            3,000           20K
Desktop      50           30          1            500             200

Traditionally, mainframe peripherals (disks and tapes) were superior to commodity components, but that is no longer true: IBM 3480-3490 tapes are 10-100 times "smaller" than DLT tapes and no faster. Commodity SCSI disks provide very competitive performance (10 MB/s) and reliability. Arrays of such disks form the basis of RAID technology. High levels of integration, mass production, and wide use create more reliable and better tested products.

Even today, an architecture that depends on traditional "mainframe" or "boutique" hardware and software is much more expensive than one based on commodity components. We expect this trend to continue. Consequently, we believe that EOSDIS should be based on arrays of commodity processors, memories, disks, tapes, and communications networks. Hardware and software should comply with de facto or de jure standards and be available from multiple vendors. Examples of such standards today are C, SQL, the X Window System, DCE, SNMP, Ethernet, and ATM.

The design should be "open" enough to allow EOSDIS to select the most cost-effective hardware and software components on a yearly basis. In particular, EOSDIS should avoid "boutique" hardware and software architectures. Rather, wherever possible, it should use commodity software and hardware.

Table A2-2 summarizes our technology forecasts, showing what $1M can buy today (1995), and in 5 and 10 years. For example, today you can buy 100 server nodes (processors and their cabinets), each node having a 100 SPECint processor and costing about $10K. The package is a .1 Tera-op-per-second computing array (.1 Topps) spread among 100 nodes.

Table A2-2: What $1M Can Buy

Year    Topps @ nodes    RAM       Disk @ drives    Tape robots     LAN     WAN
1995    .1 Top @ 100     10 GB     2 TB @ 200       20 TB @ 100     FDDI    T1
2000    .5 Top @ 100     50 GB     15 TB @ 400      100 TB @ 100    ATM     ATM
2005    3 Top @ 1000     250 GB    100 TB @ 1000    1 PB @ 100      ?       ?

A2.2 Hardware

A2.2.1 Processors

Microprocessors have changed the economics of computing completely. The fastest scalar computers are single-chip computers. These computers are also very cheap, starting at about $1,000 per chip when new, but declining to $50 per chip (or less) as the manufacturing process matures. There is every indication that clock rates will continue to increase at 50% per year for the rest of the decade.

Initially, some thought that RISC (Reduced Instruction Set Computer) designs were the key to the increasing speed of microprocessors. Today, RISC processors have floating-point instructions, square-root instructions, integrated memory management, and PCI I/O interfaces; they are far from the minimal RISC ideal. Additionally, Intel's x86 CISC (Complex Instruction Set Computer) processors continue to be competitive. Indeed, it appears that the next step is for the RISC and CISC lines to merge as super-pipelined VLIW computers.

This means that faster, inexpensive microprocessors are coming. Modern software (programming languages and operating systems) insulates applications from dependence on hardware instruction sets, so we expect EOSDIS will be able to use the best microprocessors of each generation.

Current servers are in the 100-MHz (Intel) to 300-MHz (DEC Alpha) range. Their corresponding SPECint ratings are in the 75-150 range (per processor). Perhaps more representative are the TPC-A transactions-per-second ratings, which show these machines to be in the 150-350 TPS-A range. The current trend is a 40% annual decline in cost and a 50% annual increase in clock speed. Today, $1 buys 1 teraop (1 trillion instructions) if the machine is depreciated over 3 years. Table A2-3 indicates current and predicted processor prices. In the year 2000 and beyond, additional speedup will come from multiple processors per chip.
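
The "trillion instructions per dollar" figure can be checked by arithmetic, assuming (roughly) that 1 SPECint corresponds to executing about a million instructions per second. That equivalence is an approximation for this sketch, not a SPEC definition.

```python
# 1995 server price: $100/SPECint (Table A2-3). Assume ~1 MIPS per SPECint.
dollars_per_specint = 100
instr_per_sec_per_dollar = 1e6 / dollars_per_specint   # 10,000 ips per $
seconds_in_3_years = 3 * 365 * 24 * 3600
instr_per_dollar = instr_per_sec_per_dollar * seconds_in_3_years
print(f"{instr_per_dollar:.2e}")   # ~9.5e11: about a trillion instructions
```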

Table A2-3: Cost of Processor Power (Commodity Servers). Desktops Are 2 Times Cheaper, Mainframes 100 Times More Expensive

Year    SPECint    $/SPECint    Top/$    TPS-A/CPU
1995    100        100          1        300
2000    500        20           5        1,000
2005    2,000      4            20       3,000

Beyond the relentless speedup, the main processor trend is to use multiple processors, either as shared-memory multiprocessors or as clusters of processors in a distributed-memory or multi-computer configuration. We return to this development in Section A2.2.7, System Architecture.

A2.2.2 RAM

Dynamic memory is almost on the historic trend line of a 50% price decline per year. A new generation appears about every 3 years (the actual rate seems to be 3.3 years) with each successive generation being 4 times larger than the previous. Memory prices have not declined much over the last year, holding at about $30/MB for PCs, $100/MB for servers, and $1,000/MB for mainframes. Right now, 4-Mb-chip DRAMs are standard, 16-Mb chips are being sampled, 64-Mb chips are in process, and 256-Mb chips have been demonstrated.
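
A rough annualization of the figures above, using the 3.3-year generation gap and the $30/MB (1995) to $6/MB (predicted 2000) prices from Table A2-4; this is a consistency check on the quoted trends, not new data:

```python
# Annualize the DRAM figures quoted in the text (rough check).
capacity_growth = 4 ** (1 / 3.3) - 1       # 4x per ~3.3-year generation
price_decline = 1 - (6 / 30) ** (1 / 5)    # $30/MB (1995) -> $6/MB (2000)
print(f"{capacity_growth:.0%} per year more bits per chip")  # ~52%/year
print(f"{price_decline:.0%} per year price decline")         # ~28%/year,
# below the historic 50%/year trend line -- DRAM is "moving more slowly"
```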

DRAM speeds are increasing slowly, about 10%/year. High-speed static RAM chips are being used to build caches that match the memory speed to fast processors.

The main trends in memory hardware are a movement to very large main memories, 64-bit addressing, and shared memory for SMPs. Table A2-4 shows current and predicted prices and typical memory configurations.

Table A2-4: Size and Cost of RAM

Year    b/chip    $/MB    Desktop memory    Server memory/CPU
1995    4 Mb      30      16 MB             128 MB
2000    256 Mb    6       128 MB            1 GB
2005    1 Gb      1       1 GB              8 GB

A2.2.3 Magnetic Disk

Dramatic improvements have taken place in disk performance, capacity, and price over the last few years. Disk capacity has doubled every year over the last 3 years. Disk speeds have risen from 3600 to 9200 RPM, decreasing rotational latency and increasing the transfer rate by a factor of 2.5. In addition, small form-factor disks have cut seek latency by a factor of 2. This progress has been driven largely by demand for desktop storage devices and departmental servers, which has created a very competitive and dynamic market for commodity disk drives. Multimedia applications (sound, image, and some video) have expanded storage requirements in both commerce and entertainment.

Technological advances like thin-film heads and magneto-resistive heads will yield several more product generations. Technologists are optimistic that they can deliver 60%/year growth in capacity and speed over the rest of the decade.

Disk form factors are shrinking: 3.25-inch disks are being superseded by 1-inch small fast disks. Large 3.25-inch disks are emerging as giant "slow" disks (online tapes).

Traditional consumers of disk technology have been concerned with two disk metrics:

EOSDIS will be storing and accessing relatively large objects and moving large quantities of data. This, in turn, means that EOSDIS will have many disks and so will use disk arrays to build reliable storage while exploiting parallel disk bandwidth to obtain high transfer rates. So, the EOSDIS project is more concerned about

Access time is related to kox, but the other metrics are new (kox, mox, and gox measure how many kilobyte, megabyte, and gigabyte objects, respectively, can be read per unit time; scans measures how often the entire store can be read). Tiles (hyper-slabs) of satellite data are likely to come in MB units, thus our interest in mox. Landsat images, simulation outputs, and time-series analyses are GB objects, hence our interest in gox. Certain queries may need to scan a substantial fraction of the database to reprocess or analyze it, hence our interest in scans. Disks have excellent characteristics on all these metrics (except dollars-per-reliable-GB) when compared to optical disk or tape. Table A2-5 shows current and predicted prices and performance metrics for disk technology.

Section A2.2.7, System Architecture, discusses how RAID technology can be used to build disk arrays with high mox, gox, and scans and with low dollars-per-reliable-GB.

Table A2-5: Disk Drive Capacity and Price Over Time

Year    Capacity    Cost/drive ($)    Cost/GB ($)    Transfer rate    Access time    kox     mox    gox    scans
1995    10 GB       3K                300            5 MB/s           15 ms          237K    17K    18     43
2000    50 GB       2K                40             10 MB/s          10 ms          356K    33K    36     17
2005    200 GB      1K                5              20 MB/s          7 ms           510K    49K    54     6

A2.2.4 Optical Disk

Optical data recording emerged a decade ago with great promise as both an interchange medium and an online storage medium. The data distribution application (CD-ROM) has been a great success but, for a variety of reasons, read-write optical disks have lagged magnetic recording in capacity and speed. More troubling still is that progress has been relatively slow, so that now magnetic storage has more capacity and lower cost than online optical storage.

Optical disks read at less than 1 MB/s and write at half that speed, 10 times slower than magnetic disks. This means they have poor mox, gox, and scans ratings. Optical disks are also more expensive per byte than magnetic disks. To alleviate this disadvantage, optical disks are organized in a jukebox with a few read-write stations multiplexed across a hundred platters. These optical disk robots have 100 times worse mox, gox, and scans (since platter switches are required and there is so much more data per reader), but at least jukeboxes offer a 3-fold cost/GB advantage over magnetic storage.

Unless there is a technological breakthrough, we do not expect optical storage to play a large role in EOSDIS. EOSDIS or third parties may publish topical CD-ROM data products for individuals with low-bandwidth Internet connections. A current CD-ROM would take less than a minute to download on a "slow" ATM link.

A2.2.5 Tape

Tape technology has produced three generations in the last decade. In the early 1980s, reel-to-reel tapes were replaced by IBM's 3480 tape cartridges. These tapes were more reliable, more automatic, and more capacious than the previous technology. Today, however, the 3480 is an antique technology, much more expensive and 100 times less capacious than current technology. We were dismayed to learn that NASA has mandated the 3480 as its storage standard. IBM's successor technology, the 3490, is 10 times more capacious but still about 100 times more expensive than current technology.

Helical scan (8-mm) and DAT (4-mm) tapes formed a second storage generation. These devices were less reliable than the 3480 and had one-tenth the transfer rate, but they compensated with drives and robots that were 10 times less expensive and with cartridges that could store 10 times more data. This combination gave these drives a 100-fold price/performance advantage over 3480 technology.

Starting in 1991, a third generation of tape, now called Digital Linear Tape (DLT), came on the scene. It has the reliability and performance of the 3480 family, the cost of the 8-mm tapes and drives, and 10 times the capacity. A current DLT cartridge has a 10-GB capacity (uncompressed) and transfers at 4 MB/s, and the drives cost $3K. A robot stacker managing 10 tapes costs less than $10K. Three-TB storage silos built from this technology are available today for $21/GB. These silos are 10 times cheaper than Storage Technology Corporation silos built from 3490 technology. Put another way, the 10-PB EOSDIS database would cost $210M in current DLT technology, and $2B in 3490 technology. DLT in the year 2000 should be 10 times cheaper: $2M per nearline petabyte. IBM has announced plans to transition from 3490 to DLT in its next tape products.
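
The archive arithmetic in this paragraph checks out directly from the quoted per-GB prices:

```python
# Cost of the 10-PB EOSDIS archive at the quoted tape prices.
PB_IN_GB = 1_000_000
dlt_total = 21 * 10 * PB_IN_GB      # $21/GB on DLT silos   -> $210M
old_total = 200 * 10 * PB_IN_GB     # ~$200/GB on 3490      -> $2B
dlt_2000_per_pb = (dlt_total / 10) / 10   # 10x cheaper by 2000 -> ~$2M/PB
print(dlt_total, old_total, dlt_2000_per_pb)
```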

DLT tape densities are doubling every 2 years, and data rates are doubling every 4 years. These trends are expected to continue for the rest of the decade. Less encouraging is that the mox, gox, and scans numbers for tape do not improve much; indeed, scans gets worse because transfer rates do not keep pace with density. This encourages us to move to a tape architecture that uses many inexpensive tape drives rather than a few expensive ones. In addition, it encourages us to read and write large (100-MB) objects from tape so that the tape is transferring data most of the time rather than picking or seeking.

Table A2-6: DLT Tape Drive Capacity and Price Over Time (Uncompressed). Compressed Capacity and Data Rate Are Typically 2 Times Greater

Year    Capacity    Cost/GB ($)    Transfer rate    Mount and rewind time    kox    mox    gox    scans
1995    10 GB       300            3 MB/s           1 minute                 60     60     9      26
2000    100 GB      30             8 MB/s           1 minute                 60     60     19     7
2005    1 TB        3              20 MB/s          1 minute                 60     60     33     2
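
Table A2-6's metrics can be derived from capacity and transfer rate with a one-minute mount-and-rewind charged to every access, assuming (as for disks) that kox/mox/gox count objects read per drive-hour and scans counts full-tape reads per day. Small-object rates are pinned at 60/hour by the mount time, while scans decay because capacity grows faster than transfer rate:

```python
# Derive tape kox/mox/gox/scans from capacity, transfer rate, mount time.
def tape_metrics(capacity_gb, rate_mb_per_s, mount_s=60):
    per_hour = lambda object_mb: 3600 / (mount_s + object_mb / rate_mb_per_s)
    scans_per_day = 86400 / (mount_s + capacity_gb * 1000 / rate_mb_per_s)
    return per_hour(0.001), per_hour(1.0), per_hour(1000.0), scans_per_day

metrics = {year: tape_metrics(cap, rate)
           for year, cap, rate in [(1995, 10, 3), (2000, 100, 8), (2005, 1000, 20)]}
# e.g. 2005: gox ~33/hour, but only ~1.7 scans/day -- far below 1995's ~26.
```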

We discuss tape robots in Section A2.2.7, System Architecture. Briefly, the choice is between 100 small tape robots or 1 large silo. We believe the former is more economical and scalable than the latter. The tape-robot farm has better mox, gox, and scans. In particular, it can scan the data 1.7 times per day, even in the year 2005, while the silo (with only 16 read/write stations) will take 6 times longer to do a scan.

A2.2.6 Networking

FDDI, Fast-Ethernet, and ATM are competing to displace Ethernet's dominance of the LAN marketplace. Fast-Ethernet and FDDI propose to offer 100 Mb/s data rates, while ATM offers either 150 Mb/s or 800 Mb/s. Deploying any of these technologies will require replacing existing Ethernet wiring. Only ATM and proprietary networks are competing in the WAN arena. The telephone companies are committed to providing ATM service as the standard digital (voice and data) high-speed port into the public network, which may settle the LAN issue in favor of ATM.

ISDN is a relatively slow and expensive technology (100 times slower than Ethernet), which was obsolete when introduced.

In any event, EOSDIS will not have to innovate in the area of high-speed networks. Progress will be driven by the non-government sector (commerce, entertainment, and communications). The NASA 500 and, indeed, most scientists will have high-speed (more than 100 Mb/s) desktop access to the Internet by 1997.

A2.2.7 System Architecture

A pattern emerges from these technology discussions: Commodity hardware components are fast, reliable, and inexpensive. The challenge is to build large systems from arrays of these components. Computer vendors are all embarked on building such systems.

Nodes are being built from a large-memory array connected to 10-100 processors. These shared-memory multiprocessors are typified by Symmetric Multi-Processors (SMP) from Sequent, Pyramid, Encore, Compaq, Sun Microsystems, DEC, and SGI. There are interesting differences among these machines but, from the viewpoint of EOSDIS, they offer low cost per SPECint or SPECfp. They typically run a UNIX-like operating system and provide a good basis for parallel computing.

There is still considerable debate about how far shared-memory multiprocessors can scale. TMI, KSR, and some at SGI claim the design scales to 1,000s of processors. Others think the limit is much lower. All agree that to use commodity components and achieve modular growth, fault containment, and high availability, one should partition a large processor array into groups of SMPs connected by a high-speed network. Tandem, Teradata, Digital's VMS-cluster and Gigaswitch, and IBM's SP2 and Sysplex typify such cluster architectures.

A combination of distributed shared memory and clusters should allow EOSDIS to scale processor arrays to 1,000s of processors.

Processors are getting much faster, but DRAMs are not keeping pace. Consequently, to deliver the power of these processors, systems will exploit multi-level caches. These relatively small (10s of MB) processor-private caches will mask the latencies of the SMP's multi-GB shared memory. Cache designs will distinguish the fast SMPs from the slow. Again, EOSDIS need not worry about this because computer vendors will compete to build compute and storage servers for EOSDIS and many other high-end applications.

The previous section suggested that by the year 2005, EOSDIS DAACs will have 1,000s of disks storing several PBs of data. Management of these disk farms must be automatic (hence the interest in System Managed Storage). The following benefits accrue by grouping disks into arrays of 10 or 100 to be managed as a RAID-5 set:

Disk arrays are already a commercial off-the-shelf (COTS) technology. The EOSDIS architecture should count on them to provide low-cost, high-performance storage subsystems for its processor arrays.

The implication for tapes should be obvious. Traditionally, tape robots consisted of a few expensive ($30K) tape read/write stations fed by a robot that moved tape cartridges in a silo. A typical configuration is a $1.8M STC silo, storing 6,000 3490 tapes, with 16 stations. It holds about 9 TB at a little more than $200/GB. It takes 3 days to scan the contents of the silo, and the silo delivers only 112 gox.

Exabyte and others pioneered the construction of low-cost ($10K) tape robots that can manage an array of 10-100 cartridges. Today, the DLT equivalent of this approach is an array of 180 DLT tape robots each with a 10-tape stacker holding 200 GB on 10 tapes. This array would cost about the same as the STC silo, store 4 times as much data, provide 2 scans per day (12 times better than the silo), provide 1,800 gox (15 times better than the silo), and would be easier to buy and scale as demand grows.
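
The farm-versus-silo comparison above can be checked arithmetically. The 20 GB-per-tape figure below is the compressed DLT capacity (an assumption needed to match "200 GB on 10 tapes" per stacker, since Table A2-6 lists 10 GB uncompressed):

```python
# Check the tape-robot-farm vs. STC-silo comparison from the text.
silo_tb, silo_gox = 9, 112          # one STC silo, 16 read/write stations
farm_tb = 180 * 10 * 20 / 1000      # 180 stackers x 10 tapes x 20 GB -> 36 TB
print(farm_tb / silo_tb)            # 4.0: 4x the silo's capacity, same cost
print(round(1800 / silo_gox))       # 16: roughly the "15 times better" gox
```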