'Show Your Working': ChatGPT Performance Doubled w/ Process Rewards (+Synthetic Data Event Horizon)

Posted by admin
I have not only read the 'Let's Verify Step by Step' paper, released less than 24 hours ago, I have also combed through the release notes and appendix, read most of the linked papers, and run my own tests. It's true: performance is massively boosted, and not just for mathematics but for science and other domains too. I'll show you comparisons with GPT 3 and PaLM 2, and demonstrate that new records are coming soon. I will also cover the 'synthetic data event horizon' and what might have gone into GPT 4's training. I'll show you how a PRM (process-supervised reward model) works vs an ORM (outcome-supervised reward model), and why finetuning is still relevant. Plus I'll cover reactions from Jan Leike, Ilya Sutskever, Sam Altman and more. I will also feature the highly relevant paper 'Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting', and share a glimpse from Rob Miles of just how weirdly GPT 4 might think.

Verify Paper: https://cdn.openai.com/improving-mathematical-reasoning-with-process-supervision/Lets_Verify_Step_by_Step.pdf
Release Page: https://openai.com/research/improving-mathematical-reasoning-with-process-supervision#samples
Altman Tweet: https://twitter.com/sama/status/1664018190840614912
Language Models Don’t Always Say What They Think: https://arxiv.org/pdf/2305.04388.pdf
Sparks of AGI MATH Comparison: https://arxiv.org/pdf/2303.12712v5.pdf
PaLM 2 Comparison: https://ai.google/static/documents/palm2techreport.pdf
AP Chemistry Calc: https://www.albert.io/blog/ap-chemistry-score-calculator/
Altman Synthetic Data (min 4): https://www.youtube.com/watch?v=1egAKCKPKCk&t=764s
Anthropic View: https://www.anthropic.com/index/core-views-on-ai-safety
Jan Leike Tweet: https://twitter.com/janleike/status/1663977494058520576
Rob Miles' Tweet: https://twitter.com/robertskmiles/status/1663534255249453056

https://www.patreon.com/AIExplained
Posted June 1, 2023