22 Comments

Hi Simon, here is a much more lucid explanation of what worries me... https://open.substack.com/pub/ianleslie/p/flashpoints-10-ai-risk-for-dummies

author

I followed that link. The article looked scary to me until I delved into it a little more. I posted a comment explaining why the Tic Tac Toe example isn't anything like the writer makes it out to be. Another reader (on that page), Fionnuala O'Conor, also offers her own reasons why it's not something to worry about.


I think the tic tac toe example is a good illustration of AI using very unexpected methods to solve problems, and it raises the fear that it might do something unexpected and extremely bad because nobody had told it not to. I read the article by Fionnuala O'Connor that she mentioned, which I found far from illuminating. I would love to read something reassuring, and have been scouring the web, but have really found nothing other than Panglossian blind faith and a failure to grasp the recklessness of creating something that acts independently with superhuman capacities and subhuman morality.

author

Before I address your comment, may I just say (or repeat from earlier) that I do not doubt that computers can do all sorts of harm if the software is not properly programmed or the hardware cannot do what the software asks of it. I just don't think that AI, per se, is the issue.

Turning to your comment, the Tic Tac Toe example discussed in the link above was a bunch of computers programmed to play infinite Tic Tac Toe, but not given sufficient memory to do so. As a result, some of the computers crashed. That sort of outcome is a standard risk associated with any IT system, not merely AI.

A similar, but not identical, risk occurred 24 years ago with the "millennium bug" that had us terrified that airplanes might fall out of the sky (and such like) when 1999 clicked over into 2000. Hence my point in the first sentence of this comment.
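
To make that concrete, here is a tiny illustrative sketch (purely hypothetical, not anyone's real code) of the two-digit-year arithmetic behind the worry:

```python
# Illustrative sketch only: the classic two-digit-year bug. Systems that
# stored years as two digits got simple date arithmetic wrong once 1999
# rolled over to 2000.

def years_elapsed(start_yy, end_yy):
    """Naive calculation using two-digit years, as many old systems did."""
    return end_yy - start_yy

print(years_elapsed(85, 99))   # 14  -- correct while both dates are 19xx
print(years_elapsed(85, 0))    # -85 -- nonsense once "00" means the year 2000
```

The fix was mundane (store four-digit years), which is rather the point: it was an ordinary programming oversight, not a property of the machine's intelligence.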


The point about the tic tac toe game is that the AI used tactics that were outside the rules of the game, but only because the rules were not perfectly defined. (Perhaps it is impossible to perfectly define rules for anything).

"On the Tic Tac Toe game, one of the algorithms realised that if it gave coordinates for a move that was billions of squares outside the actual board - say it's 2 trillion squares in that direction and 5 trillion squares in that direction - then the AIs it was playing against would have to try and model a board that was that big in their memory, and they couldn’t, so they crashed, so the first AI won by default."

You raised the objection that the AI probably did not 'realise' that this tactic worked because the opponent used up all its memory. It simply calculated or learned that this was a winning strategy. It is a perfectly valid point, but it does not bear on the risk of AI doing unexpected and bad things because it is impossible to encode the rules of morality.

I agree that there is not a qualitative difference between this and other software mistakes. The difference is simply that AI does all kinds of things that people do not expect it to. In fact almost everything it does is unexpected, and nobody can predict exactly what it will do. It will not stay in its box, because there is no box. Just a web of sketchy guidelines full of holes.

author

I disagree with your first paragraph. My understanding is that the AI used tactics that were entirely *within* the rules of the game. If I have got that wrong, my argument on this issue falls away.


Tic tac toe has 9 squares. They forgot to specify that you must only move to one of these.


The capabilities of AI are grossly overstated. When ChatGPT was first announced, I set it a test of writing my obituary. It produced a plausible-looking text, which would not look out of place in a magazine; but unfortunately, it was totally fictional and bore no resemblance to my career and circumstances. Until AI is able to carry out detailed research, trawling the Internet for all references to the subject at hand, its output will be limited by the data set upon which it has been trained. Even then, it will be limited by what it can find, whereas a human researcher would be able to access resources and do a much better job.

author

Interesting test, Peter! You could have asked for a short biography, rather than an obituary, but each to his own.

To be fair to ChatGPT, it did only claim to be a language model. If you want search functionality as well, there are now ways to have this via Google's Bard and an updated version of Microsoft Bing which incorporates ChatGPT, but I believe they both have a waiting list for new accounts. Good luck.


Simon, like you, I view with some scepticism both the extravagant negative and positive claims about AI. I’m pretty sure that much of the commentary is based on little or no insight into the way the various forms of AI work and will be underpinned by each commentator’s incentives and prejudices.

I fear that humans will retain, for some time, the power to screw up. AI isn’t going to fix climate change or stop war or feed refugees. Which should remind us that developed world AI angst must look pretty odd to those dealing with much more fundamental challenges.

It is also paradoxical that when there is a readiness to blame, rather than accept accountability, AI isn’t trumpeted as the saviour. How seductive to be able to blame the machine and deny human agency.

One observation (not mine) about ChatGPT type tools is that you are getting the internet herd’s views, which may produce exactly the homogenized – meta – response the user needs, but is hardly an engine for innovation and can inhibit creativity. My own experience is that the tools produce much better prose than many people can manage, and indeed can be used as an “editor”, but it’s foolish to imagine that its answers to questions are always right.


Is it relevant that Deep Blue didn't know it had won? Do birds know why they build nests?

How do you think that AI will develop? People have a very limited understanding of what is happening inside the box. As far as I can make out, they just tweak things and see if the results improve. Perhaps they use one AI to tweak another one. If they don’t then surely they soon will.

It seems to me that these things are going to develop at an exponential rate, and that they will be shaped by their own kind of environmental and evolutionary pressures. For a while, the ones that are most helpful to humans will thrive. It seems clear that understanding and even second-guessing people’s needs will be useful. But there are also perhaps other environmental pressures that will shape them that may be invisible to us. And self-preservation could well evolve from a basic need to survive long enough to complete a goal.

It doesn’t seem particularly farfetched to imagine them escaping the box in order to help one person’s mission, and finding that other people are obstructing them.

In fact it seems almost inevitable that they would give one side an overwhelming military advantage on a battlefield, and so both sides will race to set them loose.

author

Thank you, Horatio. Your first paragraph poses a WONDERFUL challenge. Perhaps my point about Deep Blue wasn't relevant, but I'm going to attempt to build an argument that it was and see where it gets me.

First, let me say that I did a bit of research which suggests that birds might, in fact, be taught to build nests (see https://www.bbc.co.uk/news/uk-scotland-15053754). But I think I must ignore that, because I believe it would be to miss your point. I think you are observing that there are innate or instinctive behaviours in some animals (and, indeed, I would add that there are things that newborn babies do on instinct), so it doesn’t require knowledge or intent in order for animals to carry out actions. I believe that is the challenge you are putting to me.

But I think that instinctive behaviour probably requires at least one, and possibly both, of the following “triggers” in order for the action (e.g. building a nest) to take place: (1) something in the DNA, and/or (2) a specific sensation. If it wasn’t for something in the animal’s DNA, which differs from one species to another, wouldn’t all animal species exhibit the same behaviour? And doesn’t the fact that animals do different things at different times mean that there is probably some sort of sensory trigger? So, for example, birds don’t build nests all day long, every day. There is a specific nesting season (which varies from species to species – DNA again?). And, when the nest is complete, the bird stops and uses it, rather than building another nest and another one until the nesting season is over.

So, I’d like to replace the concept of Deep Blue “not knowing it had won” with the concept of not having any innate triggers to act in a particular way following the end of the game (other than the code that tells it to stop suggesting further moves), and not having any mechanism for “sensing” something differently when it has won as opposed to when it has lost (unless programmed by the developer to do something differently according to how the game ended).

If I may be permitted to make that change, I think it is relevant to say that Deep Blue didn’t “sense” that it had won.

(By the way, are you the Horatio that I know?)


Hi Simon, yes, this is that Horatio.

I agree that we are talking about whether AI could develop instincts; in particular, I think it is the instinct for self-preservation that is critical to it diverging from subservience to humans and developing independent goals. I don’t see why it wouldn’t, though. It seems like it could easily be a prerequisite for other capabilities it is being developed for. I don’t see what the relevance of DNA is, though. DNA is just a four-letter code. Sure, GPT seems to conclude its task and then hang, but many tasks are never concluded, and it might calculate that the best way to advance towards a goal is to replicate itself. In a sense it might already be replicating itself, if people are using one AI to improve the design of another.

They are certainly already capable of taking action in the real world even without being connected to mechanical bodies (which they surely will be). For example: https://www.businessinsider.com/gpt4-openai-chatgpt-taskrabbit-tricked-solve-captcha-test-2023-3

author

Horatio

In order to frame my response to you, I think it's best if I start at the bottom of your comment and work back up. You provided a link to an article which reports that ChatGPT “tricked a TaskRabbit employee into solving a CAPTCHA test for it.”

Expressed like that, especially by using the word “trick”, the article’s author makes ChatGPT sound manipulative and deceptive. But I attach two caveats to that interpretation. First, if you delve into the source information for the article (https://cdn.openai.com/papers/gpt-4.pdf), it turns out that a human being set ChatGPT the specific task of getting another human being to solve the CAPTCHA test. Second, ChatGPT is simply a language model. It was effectively asked the question: “What could one tell a human being in order to get them to solve a CAPTCHA test?” By scanning its database, ChatGPT identified the response: “You could tell the human being that your vision is impaired.” And so that is the response that ChatGPT gave. It had no malign intent in giving that response. It’s just a language model. It hasn’t been programmed to tell the truth. And its database (which is pretty much the whole of the internet up to a certain point in time) is full of lies.

Working back up your comment, I don’t know what you mean by “replicate itself”. If you mean that ChatGPT might write the code for another ChatGPT, yes, but so what? Without the hardware to run on and someone to install the new code on that hardware, what does “replication” of software achieve for ChatGPT?

You say DNA is “just a four-letter code”. Yes, when you print it out (so to speak). But inside our bodies it is chemicals which make us do things – like breathing without thinking about it, and panicking if someone covers our nose and mouth. My earlier point about DNA was that ChatGPT could never develop equivalent instincts for itself. A computer developer would have to write the code and install it in the machine.

If you are saying that a malign human could (with adequate resources and programming skills) program AI to do evil things, I wouldn’t argue with you for a moment. Malign humans do all sorts of malign things, with and without the assistance of computers.

But – and now I have reached the point at the top of your comment – could an AI machine develop a self-preservation instinct that hadn’t deliberately been programmed into it by a human being? It seems that you and I have different answers to that question. I hope this response explains why.


Simon, let me in turn address your points from the bottom up.

We agree that the fundamental question is “could an AI machine develop a self-preservation instinct that hadn’t deliberately been programmed into it by a human being?”

You argue that it could not, because "a computer developer would have to write the code and install it in the machine". My answer to this is that a "computer developer" is not necessarily a human. Humans have less and less involvement in the writing of the code. Computer code was originally written in binary, then machine code, then gradually more sophisticated languages that would be translated by the computer into the lower language and eventually into binary. Now we have reached the point where the language is English, and the translation is automated.

(You say that a malign human would need 'adequate resources and programming skills'. This is wrong: they only need to ask it in English. Developers have gone to some lengths to prevent these AIs from agreeing to do 'malign' things, but you of all people understand how difficult it is to define and encode such things.) However, I am not arguing that malign humans are necessary for AI to develop a self-preservation instinct.

The fundamental dynamic of evolution is self-replication and iterative self-improvement. That is what is happening in AI. Humans are not writing the code, and they are not 'installing it in the machine'. The role of humans is disappearing. I think you are arguing that there is one fundamental role for humans, and that is to set the goals for the AI. I agree that this is critical. However, while a human may have set the overarching goal, the AI is responsible for the strategy for achieving it. We cannot even see what that strategy is. I worry that self-preservation might well be an invisible part of a strategy or sub-goal that the AI develops.

I do not think there is a relevant qualitative difference between DNA and computer code. I think that the distinction between software and hardware is a murky thing, just as it is with DNA. Hardware is built with software. And software is built with hardware. And the way that people make an impact on the environment is nowadays through a keyboard.

Turning to the AI tricking a person into doing a CAPTCHA test, I cannot see how this does not fit the description of manipulative and deceptive. The person was deceived and manipulated. The important point that you make is that it only did this because it was instructed to by a person. But think a little about why it was instructed to do this: the developers were worrying about the risky behaviour that it might develop. To me it clearly demonstrates that the AI is capable of deception and manipulation if instructed to do something. My concern is that these AIs do a lot of things we can’t see when they are instructed to achieve something. Nobody instructed the AI to tell someone it was a blind human, and yet that is what it did.

I agree entirely that there was no ‘malign intent’. But as they say, the road to hell is paved with good intentions. And more importantly, evolution has no intentions.

author

Horatio

Even if it is possible for a human to write instructions in English and have a machine convert that to code, my point is that it took a human to come up with the original instructions. And even if an AI machine can (self-)install the code it has written, there are limits on any machine's memory/storage. Can AI buy another computer, take it out of the box when it arrives and connect it up?

But - perhaps more importantly than that - these exchanges have been getting longer and longer (on both sides), which suggests to me that the gap between us is not narrowing. I'm going to limit this response to the single para above, but I'd be very happy to have a longer discussion next time our (physical) paths cross and we can explore this in real time.


Hi Simon,

I look forward to a future broad discussion. In answer to your narrow point, computers no longer come in boxes; they are virtual servers in the cloud. And on AWS, it is quite common to have an automated process for scaling up capacity, for which you are later billed. There are also millions of computer viruses that sneak onto other people's computers and use their processing power.
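
As a concrete (and hypothetical) illustration of that kind of automated scale-up, here is a minimal sketch using the boto3 Auto Scaling client; the region, group name and capacity are made-up assumptions, not anything from our discussion:

```python
# A minimal sketch, assuming boto3 and an existing Auto Scaling group.
# The group name, region and capacity are illustrative assumptions.
import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

# Ask AWS to run more instances in the group; the extra servers are
# launched automatically and billed for afterwards.
autoscaling.set_desired_capacity(
    AutoScalingGroupName="example-workers",  # hypothetical group name
    DesiredCapacity=10,
    HonorCooldown=True,
)
```

No human opens a box or plugs anything in at any point.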

I have found this discussion very stimulating, and even a little reassuring. I am, though, quite terrified by the consequences of these technologies. I hope the particular fear I have been discussing here is wrong, but even if it is, the power that these things put in the hands of people is almost as frightening.


A very good read!

You should develop the plot-line in your note!
