Scientific DataSet

Established: March 12, 2010

Scientific DataSet (SDS) is a managed library for reading, writing, and sharing array-oriented scientific data, such as time series, matrices, satellite or medical imagery, and multidimensional numerical grids.

 

It features:

  • Rich metadata to create self-descriptive data packages.
  • Support for several common data formats, such as comma-separated values (CSV), Network Common Data Form (NetCDF), and Hierarchical Data Format (HDF5).
  • The ability to scale up from simple text files to multi-terabyte Windows Azure archives.
  • Concurrent access to the data from multiple computing agents in multicore and distributed settings.
  • Consistency checks and transactional updates.

A lighter package that treats the NetCDF library as an external dependency is available at http://github.com/predictionmachines/sdslite. This package is also available via NuGet.

The library was developed as a collaboration between the Microsoft Research (MSR) Computational Science Lab in Cambridge, UK, and the Computer Science Department at Moscow State University.

You can read more about the library and how to get started in the Introduction to Scientific DataSet document.

People

Vassily Lyutsarev

Principal Research Software Development Engineer