Conventional depth imaging is based on light, using either cameras or lasers. These systems require hardware operating at high sampling frequencies and precise calibration, and they consume considerable power (the new Kinect is one example). We investigate the potential of ultrasound for depth acquisition, with skeletal tracking in mind as an application. We propose simple hardware for depth imaging, built from cheap, low-power, off-the-shelf sensors. Even with a small array and a rudimentary design, we obtain promising results comparable with the state of the art for ultrasound in air. Departing from B-mode imaging, we show how to leverage sound source localization to obtain usable depth images. We further propose an algorithm for improving the frame rate of the system, and a novel 3D deconvolution algorithm based on convex optimization.