Active Learning via Label-Adapted Diffusion (or how to crack Big data with Small data)
- Kushnir D.
Active-Transductive Learning with Label-Adapted Kernels This paper presents an efficient active-transductive approach for classification. A common approach of active learning algorithms is to focus on querying points near the class boundary in order to refine it. However, for certain data distributions, this approach has been shown to lead to uninformative samples. More recent approaches consider combining data exploration with traditional refinement techniques. These techniques typically require tuning sampling of unexplored regions with refinement of detected class boundaries. They also involve significant computational costs for the exploration of informative query candidates. We present a novel iterative active learning algorithm designed to overcome these shortcomings by using a linear running-time activetransductive learning approach that naturally switches from exploration to refinement. The passive classifier employed in our algorithm builds a random-walk on the data graph based on a modified graph geometry that combines the data distribution with current label hypothesis; while the query component uses the uncertainty of the evolving hypothesis. Our supporting theory draws the link between the spectral properties of our iteration matrix and a solution to the minimal-cut problem for a fused hypothesis-data graph. Experiments demonstrate computational complexity that is orders of magnitude lower than state-of-the-art, and competitive results on benchmark data and real churn prediction data.