|
Weakly supervised semantic segmentation with image-level labels is challenging because it must compensate for the absence of mask labels. Methods based on image-level labels primarily use class activation maps (CAMs) to approximate the segmentation mask. Because the resulting pseudo masks focus only on the class-specific discriminative regions of objects, various methods have been explored to expand them so that they cover the ground truth. Contemporary methods tend to lower the threshold that separates objects from background so that CAMs achieve higher object coverage. However, too much background is then misclassified into the pseudo masks, and downstream tasks are trained on excessive noise. To overcome this problem, we propose the Adaptive and Attention-joint Supervision method (AAJS). AAJS divides the classification network into two branches and imposes a tendency toward expansion on one and toward convergence on the other for their class-specific features, i.e., the regions activated by the CAM. Each branch is adaptively constrained by the other according to the confidence of the features. The class-specific features are thereby enriched, yielding a more accurate CAM at a lower threshold. Moreover, we propose an adaptive feature dropout method to prevent the classification network from relying too heavily on discriminative regions. Experiments on PASCAL VOC 2012 show that AAJS matches or exceeds the state-of-the-art performance of existing methods. |
|
Keywords: Computer Vision; Weakly Supervised Segmentation; Image-level; Self-supervised; Attention |
|
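
To make the threshold trade-off described in the abstract concrete, the following is a minimal sketch of how a CAM is commonly turned into a pseudo mask and how lowering the threshold raises object coverage while admitting background. It is not the AAJS method itself. The function names, the toy 2x4 feature maps, the classifier weights, and the two thresholds (0.8 and 0.25) are all illustrative assumptions, not values from the paper.

```python
def class_activation_map(features, class_weights):
    """Generic CAM for one class (illustrative, not the AAJS variant).

    features: list of C channels, each an H x W nested list of floats.
    class_weights: list of C classifier weights for the target class.
    Returns an H x W map, rectified and scaled so that its peak is 1.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [
        [sum(w * ch[y][x] for w, ch in zip(class_weights, features))
         for x in range(width)]
        for y in range(height)
    ]
    cam = [[max(v, 0.0) for v in row] for row in cam]  # keep positive evidence only
    peak = max(max(row) for row in cam)
    if peak == 0.0:
        return cam
    return [[v / peak for v in row] for row in cam]


def pseudo_mask(cam, threshold):
    """Mark a location as object (1) when its activation reaches the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


def coverage_and_noise(mask, ground_truth):
    """Return (fraction of object locations covered, number of background locations marked)."""
    pairs = [(m, g) for m_row, g_row in zip(mask, ground_truth)
             for m, g in zip(m_row, g_row)]
    object_total = sum(g for _, g in pairs)
    covered = sum(1 for m, g in pairs if m == 1 and g == 1)
    false_positives = sum(1 for m, g in pairs if m == 1 and g == 0)
    return covered / object_total, false_positives


# Hypothetical toy input: channel 0 responds to the object (right half),
# channel 1 responds to the background and carries a negative weight.
features = [
    [[0.2, 0.4, 0.6, 1.0], [0.1, 0.3, 0.5, 0.9]],
    [[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]],
]
class_weights = [1.0, -0.2]
ground_truth = [[0, 0, 1, 1], [0, 0, 1, 1]]

cam = class_activation_map(features, class_weights)
high = coverage_and_noise(pseudo_mask(cam, 0.8), ground_truth)
low = coverage_and_noise(pseudo_mask(cam, 0.25), ground_truth)
print("threshold 0.80:", high)
print("threshold 0.25:", low)
```

In this toy case the high threshold keeps only the most discriminative column of the object, whereas the low threshold covers the whole object but also marks one background location, which is the kind of noise the abstract attributes to threshold lowering.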