    This article introduces the causal prompting network (CPNet), an approach that enhances causal intervention in image captioning. By applying visual prompt engineering in the feature space, the method aims to improve performance on causal intervention tasks. Because CPNet is flexible and adaptable, it can be incorporated into any existing causal intervention-based image captioning framework. Specifically, two types of visual prompts, the causal region of interest (RoI) prompt (CRP) and the causal matching prompt (CMP), are employed to refine the feature representations. CRP is applied to the RoI features of objects to enhance them with deconfounded causal features. Meanwhile, CMP strengthens the contextual representation of confounders linked to the image captioning task. To evaluate CPNet's effectiveness, extensive experiments are conducted on the Microsoft Common Objects in Context (MS-COCO) and Flickr30k datasets, with results validated using the Karpathy split. Experimental results demonstrate that CPNet surpasses other state-of-the-art (SOTA) image captioning methods.

    Citation

    Youngjoon Yu, Yeonju Kim, Yong Man Ro. Advancing Causal Intervention in Image Captioning With Causal Prompt. IEEE Transactions on Neural Networks and Learning Systems. 2024 Nov 14;PP


    PMID: 40030378
