
Paul Christiano – Preventing an AI Takeover

I talked with Paul Christiano (the world's leading AI safety researcher) about:

– Does he regret inventing RLHF?
– What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
– Why he has relatively modest timelines (40% by 2040, 15% by 2030)
– Why he's leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon
– His current research into a new proof system, and how this could solve alignment by explaining a model's behavior
– and much more.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations. For more information and to apply, please see this application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/ The deadline to apply is November 9th; make sure to check out those roles before they close.

Follow me on Twitter: https://twitter.com/dwarkesh_sp

Timestamps
(00:00:00) – What do we want post-AGI world to look like?
(00:24:25) – Timelines
(00:45:28) – Evolution vs gradient descent
(00:54:53) – Misalignment and takeover
(01:17:23) – Is alignment dual-use?
(01:31:38) – Responsible scaling policies
(01:58:25) – Paul's alignment research
(02:35:01) – Will this revolutionize theoretical CS and math?
(02:46:11) – How Paul invented RLHF
(02:55:10) – Disagreements with Carl Shulman
(03:01:53) – Long TSMC but not NVIDIA