Prompting Techniques: Multimodal CoT & Graph Prompting

Multimodal CoT Prompting


Zhang et al. (2023) proposed a multimodal chain-of-thought (CoT) prompting approach. Traditional CoT works on the language modality alone. In contrast, Multimodal CoT incorporates both text and vision into a two-stage framework. The first stage is rationale generation, which draws on the multimodal information. The second stage is answer inference, which uses the generated rationales to produce the final answer.
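The two-stage flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `multimodal_cot`, `MultimodalCoTResult`, `toy_rationale_model`, and `toy_answer_model` are hypothetical, the "models" are stand-in functions rather than real vision-language models, and passing the rationale forward by appending it to the question text is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical stand-ins: an "image" is a list of pixel intensities in [0, 1],
# and a "model" is any callable that maps (text, image) to generated text.
Image = Sequence[float]
Model = Callable[[str, Image], str]


@dataclass
class MultimodalCoTResult:
    rationale: str
    answer: str


def multimodal_cot(question: str, image: Image,
                   rationale_model: Model, answer_model: Model) -> MultimodalCoTResult:
    # Stage 1: rationale generation from both the text and the image.
    rationale = rationale_model(question, image)
    # Stage 2: answer inference. Assumption: the rationale is appended to the
    # question text, and the image is supplied again alongside it.
    augmented_text = f"{question}\nRationale: {rationale}"
    answer = answer_model(augmented_text, image)
    return MultimodalCoTResult(rationale=rationale, answer=answer)


def toy_rationale_model(text: str, image: Image) -> str:
    # Toy "vision" step: describe the image by its mean brightness.
    brightness = sum(image) / len(image)
    if brightness > 0.5:
        return "The image is mostly bright."
    return "The image is mostly dark."


def toy_answer_model(text: str, image: Image) -> str:
    # Toy "inference" step: read the answer off the rationale in the text.
    return "day" if "bright" in text else "night"


result = multimodal_cot("Is it day or night?", [0.9, 0.8, 0.7],
                        toy_rationale_model, toy_answer_model)
print(result.rationale)
print(result.answer)
```

Keeping the two stages as separate callables mirrors the separation between rationale generation and answer inference: the second stage only sees the rationale through its input, so either stage could be swapped out independently.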


The Multimodal CoT model (1B parameters) outperforms GPT-3.5 on the ScienceQA benchmark.

Image Source: Zhang et al. (2023)


Graph Prompting

Liu et al. (2023) introduce GraphPrompt, a new prompting framework for graphs intended to improve performance on downstream tasks.
