Large language models (LLMs) have shown significant prowess in answering simple questions rapidly. However, when it comes to tackling complex tasks that necessitate reasoning and planning, special prompting techniques are required to augment their abilities. These techniques, often known as “System 2” schemes, push LLMs to generate intermediate steps towards problem-solving. While effective, System 2 techniques tend to slow down LLM applications and increase computational costs.
In cognitive science, System 1 and System 2 describe two distinct modes of thinking. System 1 thinking is fast, intuitive, and automatic, whereas System 2 thinking is slow, deliberate, and analytical. LLMs are frequently likened to System 1 thinking, excelling in generating text quickly but struggling with tasks that demand conscious reasoning and strategic planning.
Recent advances in AI research have demonstrated that LLMs can emulate System 2 thinking by employing prompting techniques that necessitate generating intermediate reasoning steps before delivering a final answer. For instance, “Chain of Thought” instructs LLMs to explicate their reasoning process step by step, resulting in more precise outcomes for logical reasoning tasks. Despite the enhanced accuracy stemming from explicit reasoning, many System 2 prompting methods incur higher inference costs and response latencies, hindering their widespread adoption in production systems.
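To make the difference concrete, here is a minimal sketch of a direct (System 1 style) prompt next to a Chain of Thought prompt. The prompt wording, the function names, and the "Answer:" line convention are illustrative assumptions, not the wording used in any particular paper.

```python
from typing import Optional


def direct_prompt(question: str) -> str:
    # System 1 style: ask for the answer with no intermediate reasoning.
    return f"Question: {question}\nAnswer:"


def chain_of_thought_prompt(question: str) -> str:
    # System 2 style: ask for step-by-step reasoning first.
    # The "Answer:" line convention is an assumption made for easy parsing.
    return (
        f"Question: {question}\n"
        "Let's think step by step, then give the final answer "
        "on a line starting with 'Answer:'.\n"
    )


def extract_final_answer(response: str) -> Optional[str]:
    # Scan from the end so the last "Answer:" line wins.
    for line in reversed(response.splitlines()):
        stripped = line.strip()
        if stripped.lower().startswith("answer:"):
            return stripped.split(":", 1)[1].strip()
    return None  # the response did not follow the expected format
```

The Chain of Thought response contains every reasoning step as generated tokens, which is where the extra latency and inference cost come from.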
Inspired by the way deliberate tasks gradually become automatic in human cognition, Meta AI researchers introduced “System 2 distillation” for LLMs. Distillation is a common machine learning technique that typically trains a smaller model (the “student”) on the outputs of a larger model (the “teacher”). System 2 distillation departs from this setup: it has no separate teacher model. Instead, it distills the model’s own System 2 reasoning into its System 1 generation.
System 2 distillation begins by prompting the LLM to solve a problem with a System 2 technique. The responses are then checked for correctness with an unsupervised mechanism such as self-consistency, which samples several responses to the same question and keeps the answer they agree on most often. The intermediate reasoning steps are discarded and only the final answers are retained. Fine-tuning the model on the resulting pairs of original question and final answer teaches it to skip the reasoning steps and produce the answer directly.
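The data-building part of this procedure can be sketched as follows. This is a simplified illustration, not the researchers' implementation: `system2_answer` is a hypothetical callable that runs the model with a System 2 prompt and returns only the parsed final answer, and the sample count and the 0.75 agreement threshold are assumed values.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


def majority_answer(
    answers: List[Optional[str]], min_agreement: float = 0.75
) -> Optional[str]:
    # Self-consistency filter: keep the most common answer only if
    # enough of the samples agree on it (threshold is an assumption).
    valid = [a for a in answers if a is not None]
    if not valid:
        return None
    answer, count = Counter(valid).most_common(1)[0]
    return answer if count / len(answers) >= min_agreement else None


def build_distillation_set(
    questions: List[str],
    system2_answer: Callable[[str], Optional[str]],  # hypothetical model call
    n_samples: int = 8,
    min_agreement: float = 0.75,
) -> List[Dict[str, str]]:
    dataset = []
    for question in questions:
        # Sample several System 2 responses; reasoning steps are already
        # dropped because the callable returns only the final answer.
        samples = [system2_answer(question) for _ in range(n_samples)]
        answer = majority_answer(samples, min_agreement)
        if answer is not None:
            # Train on the plain question and the final answer only.
            dataset.append({"prompt": question, "completion": answer})
    return dataset
```

Questions whose sampled answers disagree too much are dropped, so no labeled data is needed. The resulting prompt and completion pairs can then be passed to whatever fine-tuning setup is in use.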
Meta AI researchers evaluated System 2 distillation on various reasoning tasks using different prompting techniques, such as Chain-of-Thought, System 2 Attention, Rephrase and Respond, and Branch-Solve-Merge. Results demonstrated that System 2 distillation significantly enhances LLM performance in complex reasoning tasks, often surpassing original System 2 methods’ accuracy. Moreover, distilled models exhibit quicker response times and lower computational requirements by eliminating intermediate reasoning steps.
While System 2 distillation improves LLM performance on many tasks, the researchers noted that not every reasoning skill can be distilled, much as some tasks always require deliberate thought from humans. Tasks involving complex mathematical reasoning, for instance, proved difficult to distill. Further research is needed on how well System 2 distillation works on smaller models and how it affects the model’s performance on broader tasks. Possible contamination in LLM benchmarks also needs to be addressed to ensure accurate evaluations.
System 2 distillation offers a promising way to fold deliberate reasoning into fast inference. A model that has distilled its System 2 behavior can handle some complex tasks at close to System 1 cost, much as practiced skills become automatic in human cognition. As research in this field progresses, distillation techniques of this kind could become a regular part of LLM pipelines.