ApproxFS

Established: June 2, 2015

File System for Approximate Storage

Approximate storage trades data storage precision for other desirable characteristics such as energy savings, higher performance, or higher density. It places different sets of data in memories or storage with different precision guarantees. For example, data may be partitioned into a “precise” portion stored with the highest possible reliability and fidelity (e.g., an unrecoverable bit error rate of 10^-16) and an “approximate” portion stored with lower reliability (e.g., an unrecoverable bit error rate of 10^-4). Multiple levels of approximation are possible (e.g., unrecoverable bit error rates of 10^-4 and 10^-7).
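To give a feel for what these error rates mean in practice, the following minimal sketch estimates the expected number of unrecoverable bit errors in a stored object. It assumes, purely for illustration, that the rate is expressed per bit and that errors are independent and uniformly distributed; the function name `expected_bit_errors` is hypothetical and not part of ApproxFS.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def expected_bit_errors(num_bytes, bit_error_rate):
    """Expected unrecoverable bit errors, assuming independent per-bit errors."""
    return num_bytes * 8 * bit_error_rate


if __name__ == "__main__":
    # Example rates taken from the text: one precise level, two approximate levels.
    for rate in (1e-16, 1e-7, 1e-4):
        errors = expected_bit_errors(MIB, rate)
        print(f"rate {rate:g}: about {errors:.3g} expected bit errors per MiB")
```

Under these assumptions, one MiB stored at a rate of 10^-4 would see roughly 839 erroneous bits on average, while the same data at 10^-16 would almost never see one. That gap is what makes it worthwhile to separate data that tolerates errors from data that does not.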

File systems have long been used to organize storage space and the data stored in it, but they have always assumed uniformly precise storage. To organize approximate storage, we propose that a file system be made aware of the different levels of approximation (or precision) offered by different regions of the storage substrate. Such a file system should provide a mechanism in its API that lets running programs specify which pieces of data require which level of fidelity in storage. For example, the API could offer an extended write call that takes the requested fidelity as a parameter in addition to the data to be written. The file system should also organize its data structures (blocks, inodes, etc.) so that a data block is never stored in a substrate that offers lower fidelity than requested, although storing it in higher-fidelity storage is acceptable. It may also store blocks from the same file that have similar fidelity requirements contiguously in the appropriate region of storage to improve spatial locality. Finally, the file system should be able to collect all the data blocks of a given file from multiple storage regions with different fidelity and return them to an application as a single, integral file (or in some other mutually agreed format).
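The ideas in this paragraph can be sketched as a toy in-memory model. This is not the ApproxFS implementation or its API: the names `Region`, `ToyApproxFS`, `write`, and `read`, and the choice to express fidelity as a maximum tolerated unrecoverable bit error rate, are all assumptions made for illustration. The sketch shows an extended write call that takes a fidelity parameter, a placement rule that never picks a region with lower fidelity than requested, and a read that reassembles a file from blocks spread across regions.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A storage region with a fixed (hypothetical) unrecoverable bit error rate."""
    name: str
    error_rate: float  # lower rate means higher fidelity
    blocks: list = field(default_factory=list)


class ToyApproxFS:
    def __init__(self, regions):
        # Try the least reliable regions first so that precise storage
        # is only used when the request actually needs it.
        self.regions = sorted(regions, key=lambda r: r.error_rate, reverse=True)
        self.files = {}  # file name -> ordered list of (region, block index)

    def _pick_region(self, max_error_rate):
        # Never choose a region whose error rate exceeds the requested bound;
        # a region with higher fidelity than requested is acceptable.
        for region in self.regions:
            if region.error_rate <= max_error_rate:
                return region
        raise ValueError("no region offers the requested fidelity")

    def write(self, name, data, max_error_rate):
        """Extended write: append data to a file with a fidelity requirement."""
        region = self._pick_region(max_error_rate)
        region.blocks.append(bytes(data))
        self.files.setdefault(name, []).append((region, len(region.blocks) - 1))
        return region.name

    def read(self, name):
        """Collect a file's blocks from all regions and return them in order."""
        return b"".join(region.blocks[i] for region, i in self.files[name])


if __name__ == "__main__":
    fs = ToyApproxFS([
        Region("precise", 1e-16),
        Region("medium", 1e-7),
        Region("approximate", 1e-4),
    ])
    print(fs.write("photo", b"HEADER", 1e-16))  # metadata needs precise storage
    print(fs.write("photo", b"pixels", 1e-4))   # pixel data tolerates errors
    print(fs.read("photo"))
```

In this sketch, blocks written with the same fidelity requirement land in the same region's block list, which loosely mirrors the contiguous placement described above. A real file system would also have to keep its own metadata, such as the block map, in precise storage.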

People

Karin Strauss

Senior Principal Research Manager, Microsoft Research AI4Science