
Interview Of The Week: Stuart Russell, Director Of The Center For Human-Compatible AI

Stuart Russell is a Professor of Computer Science at the University of California at Berkeley, holder of the Smith-Zadeh Chair in Engineering, and Director of the Center for Human-Compatible AI and the Kavli Center for Ethics, Science, and the Public. He is a recipient of the IJCAI Computers and Thought Award and Research Excellence Award and held the Chaire Blaise Pascal in Paris. In 2021 he received the OBE from Her Majesty Queen Elizabeth and gave the Reith Lectures. He is an Honorary Fellow of Wadham College, Oxford, an Andrew Carnegie Fellow, and a Fellow of the American Association for Artificial Intelligence, the Association for Computing Machinery, and the American Association for the Advancement of Science. His book “Artificial Intelligence: A Modern Approach” (with Peter Norvig) is the standard text in AI, used in 1500 universities in 135 countries. His research covers a wide range of topics in artificial intelligence, with a current emphasis on the long-term future of artificial intelligence and its relation to humanity. He has developed a new global seismic monitoring system for the nuclear-test-ban treaty and is currently working to ban lethal autonomous weapons. He recently spoke to The Innovator about generative AI.

Q: There is a lot of buzz around generative AI and some concern. What’s your take?

SR: Generative AI is not real artificial intelligence, but it is giving us a taste of what it would be like to be in a world where real intelligence is on tap like electricity.

It is quite difficult to predict what the concerns should be. When the Internet started, we didn’t have concerns about security. If we did, we would have built a different architecture. When social media started, we did not anticipate the thousands of ways it could be abused and how it could lead to depression, polarize people, and promote extreme content. So, whatever we say now about generative AI – whether it produces text, images, and probably video and music as well – there are sure to be 10X as many things we didn’t think about.

With GPT-3 there are already some obvious things, such as the turbocharging of deepfakes. For example, there was a deepfake video call in which someone purporting to be the mayor of Kyiv spoke with a number of European politicians. Most of them were fooled. It is only a matter of time before similar things start happening to your spouse and your kids. With individualized phishing, if a chatbot starts a conversation with you it will be very hard to tell whether it is a real person. I am already seeing fake people with complete resumes on LinkedIn. The EU AI Act includes a provision that says people have the right to know whether they are dealing with a machine. I am not sure we need legislation for content, but we at least need standards to ensure that anything generated by generative AI – text, images, videos, etc. – is labeled as such. That is very difficult at the moment as there is no agreed standard.

The premise of the work we are doing at the Center for Human-Compatible AI is that we need a new model of how we design all AI systems. The more general problem with technology is that we design it to do a given thing, but it is difficult to know how humans will end up using it. The current model essentially relies on humans specifying an objective up front. That model works in the lab, for example with puzzles and games that have an objective built in, but it doesn’t work in the real world, where you can’t completely specify the objectives.


Q: You were quoted as saying that the recent discovery of a weakness in some of the most advanced Go-playing machines – which allowed a human to win – underscores how the world has been far too hasty to ascribe superhuman levels of intelligence to machines. Please elaborate.

SR: The specific case of Go is quite revealing. When AlphaGo and its successors beat the best human players in 2016 and 2017, we all thought the programs had won—they were just better, in the same way that chess engines are just better than humans. Subsequently, Go programs became ridiculously superhuman: KataGo, the best program, is rated around 5200, compared to the best human at 3800. In other words, we’re toast.

But some of my students, Michael Dennis and Adam Gleave, suspected that there were Go positions the programs would fail to understand. First, they designed one by hand, and it worked—the programs all made the wrong move. But the position was very contrived and wouldn’t arise in a real game. So we developed a program that probed for weaknesses in KataGo, and it found a general pattern: a way of playing that beat KataGo every time, even though it wasn’t a good way to play Go at all. One of our team, Kellin Pelrine, is a decent amateur Go player (rated about 2300), and he was able to take the same basic idea and use it to beat all the leading Go programs. Basically, they have not learned the most basic concept in Go—what a group of stones is, and whether it is alive or dead. Perhaps they have learned many fragmentary approximations to the concept, covering many typically occurring cases, but we found a very simple case—a sort of circular sandwich—where all the programs (which are built by different teams using different neural network designs and different training data) make the same basic mistake and throw away all their stones. So we were all fooled. The programs aren’t actually good at Go, in the sense of understanding the game properly. They’re just very good at appearing to be good at Go, based partly on having played billions of games.
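To make the probing idea concrete, here is a toy sketch in Python. It is purely illustrative and is not the method or code the Berkeley team used: a deliberately flawed "victim" policy for a simple game of Nim is frozen, and a handful of candidate strategies are played against it. The probe turns up a strategy that beats this victim every game even though it is not good Nim play in general, which is the flavour of exploit Russell describes.

```python
# Toy sketch of adversarial probing (illustrative only -- not the KataGo attack).
# Freeze a flawed "victim" policy for 21-stone Nim (take 1-3 stones per turn,
# whoever takes the last stone wins), then test simple fixed counter-strategies
# against it to see which ones exploit its blind spot.
import random

def victim_move(n, opponent_last_take):
    """Frozen victim: perfect Nim play (leave a multiple of 4), except for a
    deliberate blind spot -- it blindly mirrors any one-stone take."""
    if opponent_last_take == 1:
        return 1                                  # the exploitable blind spot
    return min(n, n % 4 or random.randint(1, 3))  # otherwise, perfect play

def play(adversary, stones=21):
    """The adversary moves first; whoever takes the last stone wins."""
    n = stones
    while True:
        take = min(adversary(n), n)
        n -= take
        if n == 0:
            return "adversary"
        n -= min(victim_move(n, take), n)
        if n == 0:
            return "victim"

# Probe the frozen victim with a handful of candidate strategies.
candidates = {
    "always take 1 (weak play in general)": lambda n: 1,
    "always take 2": lambda n: 2,
    "always take 3": lambda n: 3,
    "textbook-perfect play": lambda n: n % 4 or 1,
}
for name, strategy in candidates.items():
    wins = sum(play(strategy) == "adversary" for _ in range(200))
    print(f"{name}: beats the victim in {wins}/200 games")
```

Run it and the "always take 1" strategy wins every game against this particular victim, purely because it keeps triggering the blind spot, much as the circular-sandwich pattern does against the Go programs.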

I think the same thing is going on with large language models, except worse. Because they produce coherent answers in excellent English, we assume they are intelligent. That is, we assume they know things and answer questions based on knowledge, as we do. But they probably don’t know anything at all. Here’s an example my friend Prasad Tadepalli sent me:

Q: Which is bigger? An elephant or a cat?
A: An elephant is bigger than a cat.

Q: Which is not bigger than the other? An elephant or a cat?
A: Neither an elephant nor a cat is bigger than the other.

I say “probably don’t know anything” because we really have no idea how they work, but they are trained simply to predict the next word given the preceding 3,000 words, based on billions of words of training data. In a sense, they draw on the most similar stretches of text in the training data, make appropriate transformations, and spit out the next word. They are probably very sophisticated parrots. They don’t know there is a world, that language is about that world, that questions have real answers relative to that world, that facts must be consistent with each other, and so on.
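The "predict the next word" loop Russell refers to can be written down in a few lines. The sketch below is illustrative only and uses the small, open GPT-2 model as a stand-in (GPT-3 and ChatGPT are far larger and served differently), but the mechanism is the same: the model only ever scores possible next tokens given the text so far, and generation is that single step repeated.

```python
# Minimal sketch of autoregressive next-token generation, using the open GPT-2
# model as a stand-in for larger systems such as GPT-3 (illustrative only).
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Which is bigger, an elephant or a cat?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(30):                      # generate 30 tokens, one at a time
        logits = model(input_ids).logits     # a score for every possible next token
        next_id = logits[0, -1].argmax()     # greedy choice: the most likely token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

Nothing in this loop consults a model of the world or checks the output against facts; the continuation is simply whatever the training data makes statistically likely, which is why exchanges like the elephant-and-cat example above can go wrong.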

Q: Will the technology deceive and confuse most humans into thinking it understands – and is that dangerous? 

SR: Absolutely, yes, there’s plenty of evidence of this happening already, such as the Google engineer who believed the LaMDA chatbot was sentient. People with less technical sophistication have no way to inoculate themselves against the illusion. There are obvious dangers, such as people becoming psychologically dependent on an algorithm that has no more feelings for them than a paperclip does, or people believing that the output from chatbots is in some way related to the truth. In many cases, the output is a complete hallucination.

Q: Microsoft is already building generative AI into its business tools, and corporates are starting to experiment with generative AI with an eye to developing new business models, according to a new report from the BCG Henderson Institute, Boston Consulting Group’s think tank. The think tank is advising companies to experiment with use cases that go beyond short-term productivity gains. Given the technology’s current limitations, how would you advise corporates to use the technology?

SR: If you are a politician and you are writing a speech, you might ask ChatGPT to pretend it is a member of the opposition party and write a critique. That is a perfectly legitimate use, and it is low stakes. ChatGPT should never be entrusted with anything that is high stakes. People talk about using it in customer service, but how does the company get it to stick to the truth about your products and policies? It might sell you a product that doesn’t exist or approve an insurance policy for a house on Pluto. A rule of thumb a CEO could use is: what part of my business would I entrust to a six-year-old? A six-year-old has more common sense and is a lot more grounded than ChatGPT.

Q: The attitude of tech companies seems to be that the only way to perfect generative AI is to test it in the wild. Given what you’ve said, is it responsible for generative AI technology to be unleashed without any standards or guardrails in place?

SR: That seems to be an incredibly irresponsible attitude, but it arises only because these systems are not designed at all. The machine learning process has produced something we don’t understand and are not sure how to use, just as evolution produced wild dogs, and twenty-odd thousand years ago we were faced with the problem of how to fit them into human life. We gradually figured it out by trial and error and a lot of selective breeding—even though we didn’t know how dogs actually work, because we didn’t design them! A more responsible approach would be to design AI systems so that we do understand them and can ensure that they work as intended.

Q: Assuming we do need guardrails, what should they look like? An essay published this week, co-authored by NYU Professor Gary Marcus and a member of the Canadian parliament, argued that it’s time for governments to consider frameworks that allow for AI research under a set of rules providing ethical standards and safety, while pausing the widespread public dissemination of potentially risky new AI technologies – with severe penalties for misuse – until we can be assured of the safety of new technologies that the world frankly doesn’t yet fully understand. Do you agree?

SR: Basically, yes. I’ve been arguing for several years that we need something like an FDA for algorithms. We’ve already nearly destroyed our democratic societies with the simple learning algorithms that run the recommender systems in social media, because they have learned to amplify disinformation and manipulate people into becoming more extreme versions of themselves. Giving away tools that produce unlimited amounts of totally convincing false information, fake videos, impersonations, etc., is going to make things far worse. How many more disasters do we need before governments take action?


About the author

Jennifer L. Schenker

Jennifer L. Schenker, an award-winning journalist, has been covering the global tech industry from Europe since 1985, working full-time, at various points in her career, for the Wall Street Journal Europe, Time Magazine, International Herald Tribune, Red Herring and BusinessWeek. She is currently the editor-in-chief of The Innovator, an English-language global publication about the digital transformation of business. Jennifer was voted one of the 50 most inspiring women in technology in Europe in 2015 and 2016 and was named by Forbes Magazine in 2018 as one of the 30 women leaders disrupting tech in France. She has been a World Economic Forum Tech Pioneers judge for 20 years. She lives in Paris and has dual U.S. and French citizenship.