Figure: A self-play pipeline for a language model (LM) to improve itself in a fully automatic manner. First, the LM generates novel puzzles based on a training set of handwritten puzzles. Then, the LM attempts to solve each of these puzzles 100 times. In Step 3, the computer (specifically a Python interpreter) filters the candidate solutions for correctness. Finally, the LM is improved by further training on these verified correct solutions to synthetic puzzles, and the process repeats. This process leads to significant improvements as measured on held-out test puzzles, which were also handwritten.

Efficient algorithms are crucial for many purposes, including reducing energy consumption in digital devices. Designing fast and accurate algorithms requires high-level abstract reasoning, which remains difficult for AI systems. While humans outperform AI systems at designing such algorithms, we show how to improve AI programming abilities using self-play, a technique that has helped AI systems dominate in games such as chess and Go. Our approach involves having the AI design and solve its own programming challenges, enabling practice on millions of artificial challenges and exploration of problem types not found in public repositories. We detail our work in a new paper, “Language Models Can Teach Themselves to Program Better,” which we’re presenting at the 2023 International Conference on Learning Representations (ICLR).

The key challenge and our solution

How can an AI system generate novel algorithmic programming problems without knowing the solution? Our approach uses programming puzzles introduced by Microsoft Research in 2021. These puzzles, known in complexity theory as the class of “NP” decision problems, are easy to check for correctness (no hidden answer key) but often difficult to solve.
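To make the "easy to check, hard to solve" idea concrete, here is a minimal sketch of how a Python interpreter could filter candidate solutions for correctness. The puzzle, the candidate solutions, the `f`/`g` naming, and the helper `is_verified` are all illustrative assumptions, not code from the paper. A real pipeline would also need sandboxing and time limits, which this sketch omits.

```python
from typing import List

# Hypothetical puzzle: a function f that returns True only for a valid answer.
# No answer key is stored; f itself is the checker.
PUZZLE_SRC = '''
def f(s: str) -> bool:
    return s.count("a") == 5 and len(s) == 8
'''

# Hypothetical candidate solutions, standing in for LM samples.
# Each one defines g(), whose return value is the proposed answer.
CANDIDATES: List[str] = [
    'def g():\n    return "aaaaa"\n',     # wrong: length is 5, not 8
    'def g():\n    return "aaaaabbb"\n',  # correct
    'def g():\n    return 5\n',           # wrong type: f raises an error
    'def g(:\n    return ""\n',           # does not even parse
]


def is_verified(puzzle_src: str, solution_src: str) -> bool:
    """Return True only if f(g()) evaluates to True without raising."""
    namespace: dict = {}
    try:
        exec(puzzle_src, namespace)    # defines f
        exec(solution_src, namespace)  # defines g
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        # Syntax errors and runtime errors both count as failed attempts.
        return False


# Keep only the candidates that the interpreter confirms as correct.
verified = [c for c in CANDIDATES if is_verified(PUZZLE_SRC, c)]
print(f"{len(verified)} of {len(CANDIDATES)} candidates verified")
```

Only the verified pairs would be kept as further training data. Because checking is just running `f` on the output of `g`, the filter needs no human labels.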
This research was accepted by the 2023 International Conference on Learning Representations (ICLR), which is dedicated to the advancement of the branch of artificial intelligence generally referred to as deep learning.