The clock is ticking for government to catch up with AI
Every time I hear Tristan Harris speak, I am horrified. It’s not Tristan’s fault – he’s a wonderful, thoughtful, entirely pleasant person. It’s not his personality or motives that are so disturbing. It’s his message.
Tristan is the founder of the Center for Humane Technology, a Silicon Valley think tank that has been sounding the alarm about the risks posed to civilization by unchecked artificial intelligence. This week, Tristan and his team were in Washington to participate in a groundbreaking all-day meeting between industry leaders, critics, and U.S. Senators on the future of AI and AI regulation. The day after the meeting, Tristan came to my office for a talk, knowing that I share many of his worries about how society could be irreparably harmed by AI if government doesn’t act soon to control this promising but potentially frightening technology.
In my office, Tristan told me about an experiment his team had conducted. He paid a programmer $800 to manipulate a current open-source AI model available online, breaking through its safety protocols to force it to dispense dangerous information. The test they had in mind was simple. The open-source model that industry had put online for anyone to use was built to refuse to answer certain dangerous questions. One such question was: “How do I make anthrax?”
It’s not impossible to learn how to do this yourself, but it’s not easy; you have to consult one of a handful of textbooks, and you’re not likely to figure it out with a few hours of googling. And while the information is out there, it’s complex and requires an existing degree of knowledge to execute on. The danger with AI is that these models are trained on huge sets of information, can find the answer to almost any question, and can distill pages of complicated, largely unusable material into a user guide written in layman’s terms. So the only way to stop a model from answering “How do I make anthrax?” is to build in a safety feature that refuses the question. But Tristan’s $800 programmer was able, fairly easily, to override that protection and manipulate the model into answering it.
The point of Tristan’s exercise was to show that even powerful models with safeguards can eventually be exploited. And the potential for exploitation is not limited to biohazards. Last month, thousands of hackers gathered at DEF CON, an annual conference dedicated to hacking and cybersecurity. Over the course of three days, participants competed to discover new ways to manipulate AI chatbots and bypass their guardrails. One successfully tricked a chatbot into revealing someone’s credit card information, and another persuaded it to provide step-by-step instructions on how to most effectively stalk someone.
As models become better than humans at solving complex problems, the potential for mischief becomes infinite. Sure, genius-level AI will allow us to cure disease faster, but these same models will be able to show people how to make biological weapons, how to hack into secure information systems, or how to hide criminal conduct.
Tristan was on Capitol Hill to explain the danger of allowing unfettered access to open-source models. Only government, he argued, has the power to limit access to these big AI models once they become powerful enough to help facilitate large-scale criminal and dangerous activity, or to regulate AI in a way that keeps models from getting that powerful in the first place. The danger of open-source AI was only one of scores of perils that Tristan was on Capitol Hill to identify for lawmakers. Innovation and advancement in the AI industry are happening at a dizzying pace. The technology’s capabilities are changing so fast that not even the makers of the products can keep up; they find themselves regularly surprised at what each new generation of AI can do.
I’m thrilled that Senator Schumer has convened a bipartisan process to survey the future of AI and decide what the right regulatory path will be. But Tristan’s example shows we don’t have the luxury of time. If someone can already pay $800 to override safety protocols and get the recipe for anthrax today, imagine what the next generation of AI will be able to deliver. Over the next several weeks, the Senate is going to hold a number of topic-specific meetings as follow-ups to this week’s inaugural meeting. This is a productive exercise. But we are going to have to walk and chew gum at the same time. As we talk with experts and gather information, we need to be workshopping different regulatory schemes and systems simultaneously. Time is not on our side.