The Virtual Observatory
by Suzanne Ross
Once upon a time, the sky was an immeasurable frontier, the epitome of how far we
could reach, a sentinel guarding the borders of our grasp. But that's not quite true
anymore, thanks to the Sloan Digital Sky Survey and the international group of
collaborators, including Microsoft researchers Jim Gray and Don Slutz. They're helping
the astronomers dig deep into the mysteries of quasars, brown dwarfs, black holes, and
galaxies.
The Sloan Digital Sky Survey (SDSS) began twelve years ago, and even in a world
saturated by Moore's Law advances, the project has advanced astronomy by light-years.
The first step for the SDSS was developing a telescope that could digitally map about
half of the Northern sky in five spectral bands from ultraviolet to the near infrared. By the
time the Sloan survey is finished it will have mapped the positions and absolute
brightness of more than 100 million celestial objects and measured the distance to more
than a million galaxies and quasars.
The database will contain about 40 terabytes of raw data—more than enough to hold
the Library of Congress. With this much information, scientists hope to see large-scale
patterns that will answer questions about how the universe evolved.
One of Sloan's goals is to build a 3D map of the sky by measuring the spectra and
redshift of the brightest million galaxies. In addition, the survey is getting spectra for a
hundred thousand faint galaxies that have quasar candidates. Quasars are the most
luminous known objects in our universe; they emit between 10 and 1000 times the
energy of our entire galaxy. The relentless expansion of the universe stretches the light
waves from a receding quasar to longer and longer wavelengths—shifting into the red
end of the electromagnetic spectrum. The higher the redshift, the farther away the quasar
is from us. Until the Sloan survey, the highest measured redshift for a quasar was 4.9,
which places it between 3000 and 6000 megaparsecs from Earth. The light from a quasar
at redshift 4.9 takes more than 10 billion years to reach Earth.
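To make the stretching concrete: redshift, usually written z, is the fractional increase in
wavelength, so light emitted at one wavelength arrives at (1 + z) times that wavelength.
The short sketch below applies that relation to the redshifts mentioned in this article.
The choice of hydrogen's Lyman-alpha line as the example is an assumption made for
illustration, and the function names are hypothetical.

```python
# Rest-frame wavelength of the hydrogen Lyman-alpha line, in nanometers.
# Using this particular line is an illustrative assumption, not something
# the article specifies.
LYMAN_ALPHA_NM = 121.567


def observed_wavelength(rest_nm: float, z: float) -> float:
    """Wavelength measured on Earth for light emitted at rest_nm by a source at redshift z."""
    return rest_nm * (1.0 + z)


def redshift(rest_nm: float, observed_nm: float) -> float:
    """Redshift z implied by an emitted and an observed wavelength."""
    return (observed_nm - rest_nm) / rest_nm


if __name__ == "__main__":
    # Redshift values quoted in the article.
    for z in (4.9, 5.0, 5.1, 5.3, 5.8):
        shifted = observed_wavelength(LYMAN_ALPHA_NM, z)
        print(f"z = {z}: {LYMAN_ALPHA_NM} nm arrives at about {shifted:.1f} nm")
```

At these redshifts, light that left the quasar in the far ultraviolet arrives at the red and
near-infrared end of the spectrum.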
Another mystery the SDSS hopes to uncloak involves a force that most people rarely think
about, though we rely on it every day: gravity. The weakest of the four fundamental
forces in the universe, it is also one of the most curious. It's much more complex than an
apple falling on Newton's head. SDSS hopes to study the nature, amount, and
distribution of dark matter — an unknown and invisible material that some astronomers
think accounts for 90 percent of the gravity in the universe.
Alex Szalay, a theoretical cosmologist at Johns Hopkins University, is leading the effort to
design the archives for the project. Szalay said that the group made a few major
discoveries during the beta-test of the whole system. "We've already detected up to the
order of 10 million stars and galaxies in the little bit of data that we have taken. And we
looked at the distribution and tried to separate the typical from the rare. We used another
telescope to follow up each of the rare events. We found the 40 or 50 funniest objects in
the sky. And some of those turned out to be the highest, most distant quasars.
"First we found a 5, then a 5.1, then a 5.3 and then a 5.8. These are still now the most
distant objects from the earth that anyone has ever seen. Another new class of objects
that we have found was the so-called brown dwarfs. Those are funny objects. They are
between stars and planets. Bigger than planets, but smaller than stars. They are very
funny because what we call stars have nuclear burning in the center and the planets on
the other hand only reflect light. And these are too small to have nuclear burning inside,
but still they are so big that as gravity pulls it together it heats up. It's not quite hot
enough to start the nuclear burning but enough to have a heat of their own, so they
radiate in the infrared. And we detected them through the infrared radiation."
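The idea of separating the typical from the rare, which Szalay describes above, can be
sketched in a few lines. This is only an illustration of one common approach, flagging
values that sit far from the median, and is not the SDSS team's actual method. The
color values are made up for the example, and the function name and threshold are
hypothetical.

```python
from statistics import median


def rare_objects(values, threshold=5.0):
    """Return the indices of values that lie far from the median.

    Distance is measured in units of the median absolute deviation (MAD),
    scaled by 1.4826 so it is comparable to a standard deviation for
    normally distributed data.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # Every value is (nearly) identical, so nothing stands out.
        return []
    scale = 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - center) / scale > threshold]


if __name__ == "__main__":
    # Made-up color measurements for ten objects; one is far redder than the rest.
    colors = [0.41, 0.38, 0.45, 0.40, 0.43, 0.39, 2.90, 0.42, 0.44, 0.37]
    print(rare_objects(colors))
```

In this toy data set only the object at index 6 is flagged. In practice the flagged
objects would be the candidates handed to another telescope for follow-up.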
Microsoft researcher Jim Gray got involved in the project when Szalay needed some
help designing the database and analyzing the data. Szalay and Gray met through a
mutual friend who grew up with Szalay and attended college with Gray.
Gray is helping Szalay speed up the system. "Right now we have only a fraction of the
data. When we have it fully populated, using today's technology, the full queries will take
a fairly long time. So, we hope to use clever algorithms and large memory machines--all
sorts of tricks. This is where Jim's expertise comes in. He's helped us already to
reorganize the data and that resulted in much faster queries," Szalay said.
Szalay is also looking for more intelligent software tools to help him find new patterns in
the data through data mining techniques.
Gray's work with TerraServer, a project led by Tom Barclay, taught the group new ways
to process and store large amounts of data. TerraServer was started as a project to
showcase the robust abilities of SQL Server. Barclay gathered satellite pictures from the
United States Geological Survey (USGS) and Spin-2, and put the data online so that
people could access the pictures from their home computers. TerraServer even proved
helpful to a military officer who wanted to make sure his ship wouldn't go aground on a
rock he knew was somewhere in the bay he needed to navigate.
While TerraServer proved to be an excellent resource for the Government, its main job
was to show how terabytes of data could be successfully stored and accessed online.
The difference between TerraServer and the upcoming project, SkyServer, is that
TerraServer doesn't have a strong data mining component. It consists primarily of images
of the earth taken from airplanes and satellites, along with scanned topographic maps,
and it holds little data about cities, roads, and demographics.
The astronomy datasets, on the other hand, start with images and spectra but are almost
immediately translated into descriptions of the objects. The survey can measure about
400 parameters for each object, including its shape, whether it's a galaxy or a star,
and its colors.
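Once each object is reduced to a row of parameters, a question about the sky becomes a
database query. The sketch below shows the idea with Python's built-in sqlite3 module
standing in for SQL Server. The table name, the handful of columns, and the sample rows
are all hypothetical and are not the real SkyServer schema.

```python
import sqlite3

# In-memory database used as a stand-in for the real archive.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE objects (
        obj_id   INTEGER PRIMARY KEY,
        kind     TEXT,   -- 'galaxy' or 'star'
        ra       REAL,   -- position on the sky, in degrees
        dec      REAL,
        mag_blue REAL,   -- brightness in a bluer band (smaller means brighter)
        mag_red  REAL    -- brightness in a redder band
    )
    """
)

# Made-up rows for illustration only.
sample_rows = [
    (1, "galaxy", 180.1, 0.5, 18.2, 17.1),
    (2, "star", 180.3, 0.7, 16.0, 15.8),
    (3, "galaxy", 181.0, 1.2, 19.0, 18.7),
    (4, "galaxy", 181.4, 1.0, 20.1, 18.9),
]
conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?, ?, ?)", sample_rows)


def red_galaxies(connection, min_color):
    """Return ids of galaxies whose color (blue minus red magnitude) exceeds min_color."""
    cursor = connection.execute(
        "SELECT obj_id FROM objects "
        "WHERE kind = 'galaxy' AND mag_blue - mag_red > ? "
        "ORDER BY obj_id",
        (min_color,),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    print(red_galaxies(conn, 1.0))
```

The same pattern extends to any of the measured parameters: the selection lives in the
query, and the images themselves never need to be reopened.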
Extracting the objects from all of this raw data is a challenging task. It took a million
lines of software code developed by the ten astronomy groups that make up the SDSS
consortium. Fermilab, near Chicago, Illinois, runs the data pipeline. The project
collaborators originally planned to deliver the data with a year's delay. This would give
them the chance to make some of the first discoveries and also to make sure that the
calibration of the data is accurate.
Jim Gray's group suggested publishing the public data as a Web site similar to the
TerraServer. Szalay and the SDSS project leaders agreed. "Essentially we decided to
take all the engineering data that we used for these early discoveries and package it
very nicely in a version similar to the TerraServer which we call SkyServer. It will
be released between June and July for the public," Szalay said.
Szalay feels that the impact of the SkyServer on education will be substantial. He says,
"People simply love to gaze at the sky. And if we can connect this to the excitement of
scientific discovery I think we can motivate a lot of small kids to have a different attitude
to our science."
Astronomy has long had a large community of amateur astronomers: some are engaged in
other sciences, some are in school, and some just like to look at the sky. SkyServer
will be a tremendous help to those who would otherwise never have access to the power
of a telescope like the Sloan Digital telescope.
There is even some discussion that the SDSS might one day set aside a part
of the sky as a project for school kids who will get to see the data first. Szalay said, "Jim
has some very nice ideas about how to tie the SkyServer site to educational content. So
hopefully we will find teachers who will volunteer, who will get excited about writing
curricula and teacher guides around this Web site."
Gray is excited about the SkyServer project. "I've been working with Alex Szalay on
building that server. Microsoft is in fact contributing the hardware and some of the
support dollars for that server and it will be running Microsoft software, so we're also
contributing much of the software." Fermilab will operate the server.
Szalay hopes that the Sloan Survey, combined with others' efforts, will form a Virtual
Observatory and cause an entire paradigm shift in astronomy and other sciences. He
said, "The exciting thing about this will be the question of how you do science when you
collect and have at your fingertips so much data that you simply can't even think of
displaying it on your computer screen."