Extreme AI Risk W05
In this week's newsletter, we explore the alignment of modern large models and examine criticisms of extreme AI risk arguments.
Join the Alignment Jam hackathon this weekend to get experience in doing ML safety research! https://ais.pub/scale
Opportunities
- Join Ought in a senior role: https://ais.pub/ought
- Be the communications expert of the Fund for Alignment Research: https://ais.pub/farcomm
- Refer a cool friend to the Redwood Research summer internship! https://ais.pub/referrr
- Or apply for it yourself: https://ais.pub/internrr
Sources
- LLMs are implicitly topic models: https://arxiv.org/abs/2301.11916
- Conditioning predictive models: https://arxiv.org/abs/2302.00805
- Goal alignment: https://arxiv.org/abs/2302.00813
- Alignment with human representations supports robust few-shot learning: https://arxiv.org/abs/2301.11990
- MusicLM from Google is very good: https://arxiv.org/abs/2301.11325
- Nuno Sempere’s criticism of high AI risk: https://nunosempere.com/blog/2023/01/23/my-highly-personal-skepticism-braindump-on-existential-risk/
- Neel Nanda’s mechanistic interpretability quickstart guide: https://www.alignmentforum.org/posts/jLAvJt8wuSFySN975/mechanistic-interpretability-quickstart-guide
- The memorization → generalization double descent phenomenon: https://www.alignmentforum.org/posts/KzwB4ovzrZ8DYWgpw/more-findings-on-memorization-and-double-descent
- Inner misalignment in simulator LLMs: https://www.alignmentforum.org/posts/FLMyTjuTiGytE6sP2/inner-misalignment-in-simulator-llms
- Action at a distance in the loss landscape: https://www.jessehoogland.com/note/spooky-action-at-a-distance-in-the-loss-landscape
- Luciano Floridi on ML Street Talk: https://youtu.be/YLNGvvgq3eg
  - His article about superintelligence: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4222347
