Computer Technology Forecast for Virtual Observatories

Jim Gray

MSR-TR-2000-102 | September 2000

The Virtual Observatory, which will integrate all the world's astronomy data into one world-wide telescope, will consume vast quantities of storage, processing, and bandwidth. It will also require new kinds of software. This talk discusses the surprise-free technology predictions indicating that the Virtual Observatory can house several petabytes of data and make that data quickly accessible to everyone, everywhere. It argues that the memory hierarchy will force a scale-out approach, with many commodity components each managing a part of the VO database or pipeline. It makes three main points: (1) Technology progress in processors, storage, and networks is keeping pace with the growth of astronomy datasets. (2) Access times are not improving as rapidly as capacity, so we must use parallelism to narrow this gap. With parallelism, we will be able to scan the archives within an hour. This parallelism should be automatic, and I believe database technology can provide it. (3) The Virtual Observatory will be a federation of database systems, so the federation needs to define interoperability standards. FITS is a good start, but the community could benefit from looking at progress in the commercial sector in using directories, portals, XML, and SOAP for data interchange.
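The arithmetic behind point (2) can be sketched in a few lines of Python. This is only a back-of-envelope illustration: the function names are hypothetical, and the 100 MB/s per-disk scan rate is an assumed figure chosen for the example, not a number taken from the report. It assumes the archive is spread evenly across disks that are all read in parallel.

```python
import math

PB = 10**15  # bytes in a petabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)


def scan_hours(total_bytes, n_disks, bytes_per_sec_per_disk):
    """Hours to read total_bytes once, with the data spread evenly
    over n_disks that are all scanned in parallel."""
    aggregate_rate = n_disks * bytes_per_sec_per_disk
    return total_bytes / aggregate_rate / 3600


def disks_needed(total_bytes, bytes_per_sec_per_disk, target_hours=1.0):
    """Smallest number of parallel disks that finishes the scan
    within target_hours."""
    bytes_per_disk_in_window = bytes_per_sec_per_disk * target_hours * 3600
    return math.ceil(total_bytes / bytes_per_disk_in_window)


if __name__ == "__main__":
    rate = 100 * MB  # ASSUMPTION: illustrative per-disk sequential scan rate
    print("one disk, 1 PB:", round(scan_hours(PB, 1, rate)), "hours")
    print("disks for a 1-hour scan of 1 PB:", disks_needed(PB, rate))
```

Under these assumed numbers, a single disk would need thousands of hours to scan one petabyte, while a few thousand disks working in parallel bring the scan under an hour. That is the sense in which a scale-out design closes the gap between capacity and access time.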