Original Paper: https://arxiv.org/abs/2305.09993

By: Weijia Xu, Andrzej Banburski-Fahey, Nebojsa Jojic

Abstract:

We introduce Reprompting, an iterative sampling algorithm that automatically learns the Chain-of-Thought (CoT) recipes for a given task without human intervention. Through Gibbs sampling, Reprompting infers the CoT recipes that work consistently well for a set of training samples by iteratively sampling new recipes using previously sampled recipes as parent prompts to solve other training problems. We conduct extensive experiments on 20 challenging reasoning tasks. Results show that Reprompting outperforms human-written CoT prompts substantially by +9.4 points on average. It also achieves consistently better performance than the state-of-the-art prompt optimization and decoding algorithms.


Summary Notes


Streamlining AI Reasoning with Automated Reprompting

The landscape of artificial intelligence is continuously advancing, with Large Language Models (LLMs) such as ChatGPT and InstructGPT leading the way.

These models excel across many tasks but often struggle with complex, multi-step reasoning. Traditionally, overcoming this limitation required manually crafting Chain-of-Thought (CoT) prompts for each task, a process that does not scale.

The Reprompting technique changes this by automatically generating and refining CoT prompts through Gibbs sampling, substantially improving LLM performance without human intervention.
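To make the loop concrete, below is a minimal Python sketch of a Reprompting-style sampling loop. It is an illustration under stated assumptions, not the paper's exact algorithm: `llm_generate` (the LLM call) and `check_answer` (the answer verifier) are hypothetical placeholders, and the acceptance rule is simplified to keeping only recipes that yield correct answers.

```python
import random

def reprompting(train_set, llm_generate, check_answer,
                n_iters=1000, k_shots=3):
    """Minimal sketch of a Reprompting-style loop (Gibbs-sampling flavor).

    train_set    : list of (question, gold_answer) pairs
    llm_generate : callable(prompt) -> str, wraps the LLM (assumed placeholder)
    check_answer : callable(output, gold) -> bool, verifies answers (assumed)
    """
    # Each training problem keeps its current CoT recipe (initially none).
    recipes = {i: None for i in range(len(train_set))}

    for _ in range(n_iters):
        # Pick one problem to (re)solve, conditioning on the others.
        i = random.randrange(len(train_set))
        question, gold = train_set[i]

        # Sample up to k parent recipes from the other problems' current states.
        parents = [j for j, r in recipes.items() if r is not None and j != i]
        shots = random.sample(parents, min(k_shots, len(parents)))

        # Build a few-shot prompt from the parent recipes, then query the LLM.
        prompt = ""
        for j in shots:
            q_j, _ = train_set[j]
            prompt += f"Q: {q_j}\nA: {recipes[j]}\n\n"
        prompt += f"Q: {question}\nA:"
        candidate = llm_generate(prompt)

        # Keep the new recipe only if it leads to the correct answer,
        # so step-by-step solutions that work propagate across problems.
        if check_answer(candidate, gold):
            recipes[i] = candidate

    return [r for r in recipes.values() if r is not None]
```

The key design choice mirrors the paper's idea: each training problem's recipe is resampled conditioned on recipes from other problems, so CoT solutions that generalize well spread through the pool over iterations.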

In-Context Learning Explained

At the core of LLM capabilities is in-context learning: presenting the model with worked examples directly in the prompt to guide its responses on new inputs.

CoT prompts are crucial here, as they provide a detailed walkthrough of the reasoning needed to solve complex tasks, boosting LLM effectiveness in multi-step reasoning.
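As a concrete illustration, the snippet below builds a toy few-shot CoT prompt; the worked tennis-ball example is a standard illustration from the CoT literature, not taken from this paper.

```python
# A toy Chain-of-Thought prompt for in-context learning. The worked
# example spells out intermediate reasoning steps before the final
# answer, which is what separates CoT from plain few-shot prompting.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?\n"
    "A:"
)

# Sent to an LLM, this prompt should elicit a step-by-step solution
# ending in an answer of 9, following the exemplar's format.
print(cot_prompt)
```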