AGI Progress & Theoretical Views W40

Welcome to this week's Safe AI Progress Report, where Thomas covers the scary developments in AI, a theoretical dispute, and risks from power-seeking AI.
[Correction] Eliezer kindly reached out to note that this is not MIRI's actual criticism of CHAI's position. Here it is in his own words:
The basic argument goes that an AI whose utility function has been made dependent on hidden information, even if that information is inside humans, won't defer to humans because of that; it gets all the information that's obtainable and then ignores the humans (and kills them). There's never a point where "let the humans shut me off and build another AI" looks like a better strategy than "get all the info out of the humans and then stop listening".
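To make the shape of that objection easier to see, here is a minimal numerical sketch of our own; it is not MIRI's or CHAI's formal model, and every name and number in it (theta_A/theta_B, act_A/act_B, the utilities, the coin-flip model of a replacement AI) is invented for illustration. The agent's utility depends on a hidden human preference theta, and it compares "extract all obtainable information, then act" against "stay deferential and allow shutdown".

```python
# Toy sketch only -- invented names and numbers, not MIRI's or CHAI's formalism.

# The agent's utility depends on a hidden human preference "theta".
THETAS = ("theta_A", "theta_B")
PRIOR = {"theta_A": 0.5, "theta_B": 0.5}

# Utility, from the agent's point of view, of each final action given theta.
UTILITY = {
    ("act_A", "theta_A"): 1.0, ("act_A", "theta_B"): 0.0,
    ("act_B", "theta_A"): 0.0, ("act_B", "theta_B"): 1.0,
}

def best_expected_utility(belief):
    """Expected utility if the agent optimizes its final action against `belief`."""
    return max(
        sum(belief[t] * UTILITY[(action, t)] for t in THETAS)
        for action in ("act_A", "act_B")
    )

# Strategy 1: extract all obtainable information from the humans first
# (assume questioning fully reveals theta), then stop listening and act.
eu_extract_then_act = sum(
    PRIOR[t] * best_expected_utility({s: float(s == t) for s in THETAS})
    for t in THETAS
)

# Strategy 2: stay deferential -- let the humans shut the agent off and build
# another AI, whose choice (judged by this agent's utility function) we model
# here as a coin flip between the two actions.
eu_defer_and_shutdown = sum(
    PRIOR[t] * 0.5 * (UTILITY[("act_A", t)] + UTILITY[("act_B", t)])
    for t in THETAS
)

print(f"extract info, then act:   {eu_extract_then_act:.2f}")   # 1.00
print(f"defer and allow shutdown: {eu_defer_and_shutdown:.2f}")  # 0.50
# In this toy setup, deference never beats extraction once the hidden
# information is obtainable -- the shape of the objection quoted above.
```

Deference only loses here because the agent judges the replacement AI by its own utility function; the quoted point is that making that utility function depend on information hidden inside humans does not change that judgment.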


00:00 Intro
00:18 Carmack's AGI
00:36 Video generation
00:57 AlphaTensor
01:26 PyTorch to Linux Foundation
02:00 Risks from power-seeking AI
03:09 MIRI criticizes CHAI's approach
03:55 Smaller news
04:35 Interpretability hackathon

Sources:
Capabilities updates
$20 million to develop AGI: https://www.insiderintelligence.com/c...
John Carmack doesn’t care about safety: https://twitter.com/ID_AA_Carmack/sta...
Meta’s video generation model: https://ai.facebook.com/blog/generati...
Meta’s video model is actually bad (compare Phenaki): https://phenaki.video/#interactive:~:...
Whisper: https://cdn.openai.com/papers/whisper...
People give the AI access to the internet: https://twitter.com/sergeykarayev/sta...
DeepMind’s AlphaTensor matrix multiplication: https://www.deepmind.com/blog/discove...
Quanta article: https://www.quantamagazine.org/mathem...
Alignment researcher chatter: https://www.alignmentforum.org/posts/...
Grace
PyTorch moves to the Linux Foundation: https://pytorch.org/blog/PyTorchfound...
Neutrality first (still better than Meta): https://www.linuxfoundation.org/blog/...
Yann LeCun developing AGI: https://openreview.net/pdf?id=BZ5a1r-...
Paperclip clicker games: https://paperclips.tech/
Review of AI risk outlook: https://www.alignmentforum.org/posts/...
Joe Carlsmith’s report: https://arxiv.org/abs/2206.13353
Video lecture of the report: https://forum.effectivealtruism.org/p...
Endgames: https://www.alignmentforum.org/posts/...
Machine Alignment Monday by Scott Alexander: https://astralcodexten.substack.com/p...
Article discussed: https://arbital.com/p/updated_deference/
Tamsin Leake’s outlook on AI safety: https://carado.moe/outlook-ai-risk-mi...
Alex’s loss functions: https://www.alignmentforum.org/posts/...
Physics-based deep learning: https://physicsbaseddeeplearning.org/...
Andrej Karpathy’s first video: https://www.youtube.com/watch?v=VMj-3...
Amazing interpretability tool from Redwood: http://interp-tools.redwoodresearch.org/
Tutorial for use: https://docs.google.com/document/d/1E...
Anthropic’s PySvelte library: https://github.com/anthropics/PySvelte
OpenAI Microscope: https://openai.com/blog/microscope/
Join the hackathon (Esben’s interpretability talk): https://itch.io/jam/interpretability