Models are simplified representations of some real-world system that, we hope, generalize to make useful predictions about new observations within that system.
Building a model that is both accurate (makes reasonably good predictions) and generalizable (makes those predictions about never-before-seen instances) is hard. It is especially hard when the system we are modeling is an aggregate of sub-systems, each with distinct causal mechanisms driving the outcome to be predicted. Failing to recognize this heterogeneity is one of the fundamental AI failure types, and might be referred to as a “heterogeneous awareness failure”.
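To make the heterogeneity problem concrete, here is a minimal synthetic sketch (all data and segment names are invented for illustration): two sub-systems relate the same feature to the outcome in opposite ways, so a single model fit to the pooled data does far worse than one model per segment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical sub-systems with opposite feature/outcome relationships.
x_a = rng.uniform(0, 1, 500)
y_a = 2.0 * x_a + rng.normal(0, 0.1, 500)          # segment A: positive slope
x_b = rng.uniform(0, 1, 500)
y_b = -2.0 * x_b + 2.0 + rng.normal(0, 0.1, 500)   # segment B: negative slope

def fit_and_mse(x, y):
    """Least-squares line fit; return mean squared error on the same data."""
    coeffs = np.polyfit(x, y, deg=1)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# One global model over the pooled data vs. one model per segment.
pooled_mse = fit_and_mse(np.concatenate([x_a, x_b]),
                         np.concatenate([y_a, y_b]))
per_segment_mse = (fit_and_mse(x_a, y_a) + fit_and_mse(x_b, y_b)) / 2

print(f"pooled model MSE:     {pooled_mse:.3f}")
print(f"per-segment mean MSE: {per_segment_mse:.3f}")
```

An aggregate error metric on the pooled model would hide the fact that the model is systematically wrong in both segments at once, which is exactly the kind of failure the section above describes.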
The Uber team’s release of the Manifold tool is both a recognition that this type of failure exists and a helpful step towards addressing it.
Taking advantage of visual analytics techniques, Manifold allows ML practitioners to look beyond overall summary metrics and detect which subsets of data a model is predicting inaccurately. Manifold also helps explain the potential causes of poor model performance by surfacing differences in feature distributions between better- and worse-performing subsets of data. Moreover, it can display how several candidate models achieve different prediction accuracies on each subset of data, providing justification for advanced treatments such as model ensembling.
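The underlying idea can be sketched in a few lines, independent of Manifold's own API (the dataset, feature names, and shift score below are all hypothetical stand-ins): split instances into high-loss and low-loss groups, then look for features whose distributions differ between the two groups.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dataset: the "distance" feature drives errors only for a
# subset of instances; labels and predictions are synthetic stand-ins.
n = 1000
distance = rng.exponential(5.0, n)
hour = rng.integers(0, 24, n).astype(float)
y_true = distance * 2.0 + rng.normal(0, 1, n)
# A model that systematically under-predicts long-distance instances.
y_pred = np.where(distance > 8, distance * 1.2, distance * 2.0)

loss = np.abs(y_true - y_pred)

# Manifold-style comparison: worst-performing vs. best-performing quartile.
hi = loss >= np.quantile(loss, 0.75)   # high-loss group
lo = loss <= np.quantile(loss, 0.25)   # low-loss group

for name, feat in [("distance", distance), ("hour", hour)]:
    # Standardized mean difference as a crude distribution-shift score.
    shift = abs(feat[hi].mean() - feat[lo].mean()) / feat.std()
    print(f"{name:>8}: shift score = {shift:.2f}")
```

Here "distance" scores far higher than "hour", flagging it as the feature most associated with poor performance. Manifold presents the same kind of comparison visually, over real models and datasets, rather than with a single summary statistic.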