A Self-Replicating GPT-4!

In this week's MLAISU, we're covering the latest technical safety developments with GPT-4, looking at Anthropic's safety strategy, and covering the fascinating Japanes...

Interpretability on Go and Language Models

This week, we take a look at interpretability applied to a Go-playing neural network, glitchy tokens, and the opinions and actions of top AI labs and entrepreneurs. Opportu...

Bing Wants to Kill Humanity W07

We look at Bing going bananas, see that certification mechanisms can be exploited and that scaling oversight seems like a solvable problem from our latest hackathon re...

Will Microsoft and Google start an AI arms race? W06

We would not be an AI newsletter without covering the past week’s releases from Google and Microsoft, but we will use this chance to introduce the concept of AI race dy...

Extreme AI Risk W05

In this week's newsletter, we explore the alignment of modern large models and examine criticisms of extreme AI risk arguments.

