CVPR, 2012. (ORAL)
We address the problem of segmenting and recognizing objects in real world images, focusing on challenging articulated categories such as humans and other animals. For this purpose, we propose a novel design for region-based object detectors that integrates efficiently top-down information from scanning-windows part models and global appearance cues. Our detectors produce class-specific scores for bottom-up regions, and then aggregate the votes of multiple overlapping candidates through pixel classification. We evaluate our approach on the PASCAL segmentation challenge, and report competitive performance with respect to current leading techniques. On VOC2010, our method obtains the best results in 6/20 categories and the highest performance on articulated objects.