Data Mining the SDSS SkyServer Database

An earlier paper described the Sloan Digital Sky Survey’s (SDSS) data management needs [Szalay1] by defining twenty database queries and twelve data visualization tasks that a good data management system should support. We built a database and interfaces to support both the query load and a website for ad hoc access. This paper reports on the database design, describes the data loading pipeline, and presents the query implementation and performance. Each query typically translated to a single SQL statement. Most queries run in less than 20 seconds, allowing scientists to explore the database interactively. This paper is an in-depth tour of those queries. Readers should first study the companion overview paper “The SDSS SkyServer – Public Access to the Sloan Digital Sky Server Data” [Szalay2].


In  Distributed Data and Structures 4: Records of the 4th International Meeting

Publisher  Carleton Scientific

Details

Type  Inproceedings
Pages  189-210
Number  MSR-TR-2002-01
ISBN  1-894145-13-5
Institution  Microsoft Research
Address  Paris, France