How the visual system integrates the information provided by several depth cues is a central question in vision research. Here, we present a model of how the human visual system combines disparity and velocity information. The model assigns a depth interpretation to a subspace defined by the covariation of the two signals. We show that human performance is consistent with the model's predictions, and we compare these predictions with those of another theoretical approach, the modified weak-fusion model. We discuss the validity of each approach as a model of human perception of 3-D shape from multiple depth cues.