Thursday, April 20, 2023

A Hopeless Enterprise


Janelle Shane at AIWeirdness (it's over there in the Blog Roll; I hope you're all reading it) posted a couple of pieces recently in which she asked various AIs to create some ASCII art and then rate how well they'd done. They thought they'd done brilliantly. They really hadn't.

It was an interesting experiment, demonstrating not just the fallibility of these so-called "intelligences" but their peculiarly hubristic self-confidence. It was also somewhat odd, in that there's a lot of ASCII art on the web and we're repeatedly told that the AIs trawl every corner of it for data points, so how come none of them seemed able to find a picture of a giraffe, something Google has no problem doing?

It was a mildly interesting diversion and a timely reminder that we can all too easily get swept away by the hype when a new technology comes bounding over the horizon, but I'd probably have left it at that, had it not been for something Wilhelm posted over at TAGN. He was asking the AIs if they could tell the difference between MMOs and MMORPGs, with varying degrees of success.

I popped into the comment thread with a note about Janelle Shane's ASCII experiments and that led Wilhelm to run one of his own. That in turn led to some email correspondence between us and eventually to my own investigation, the results of which you're about to see.

ASCII art isn't generally something that interests me much. I think you'd need to be more of a coder than I ever was to find it really fascinating. It does, however, seem to be something with which an AI trained on data from the worldwide web should be both familiar and comfortable, so I found it increasingly strange to see just how very, very badly all the generative AIs seemed to react to any mention of it.

The blob ChatGPT produced in response to Wilhelm's request for an ASCII picture of the Starship Enterprise was the last straw. It seemed so much less likely a result than pulling up the correct image would have been. I decided to go ask them what they thought they were playing at. Well, one of them, at least.

First of all, I wanted to make certain ChatGPT-4 knew what the Starship Enterprise was and what it looked like. I logged in to OpenAI and asked:

"Please tell me what you understand by the name "Starship Enterprise". If possible, append an illustration."

ChatGPT was happy to comply:

"The name "Starship Enterprise" is most commonly associated with the fictional spacecraft featured in the popular science fiction franchise, "Star Trek." The Starship Enterprise is a highly advanced starship operated by the United Federation of Planets, and serves as the primary setting for the various "Star Trek" TV shows, movies, and other media.

The Starship Enterprise is a massive, state-of-the-art vessel that is capable of interstellar travel, exploration, and diplomacy. It is typically depicted as a sleek, saucer-shaped primary hull with a large secondary hull and nacelles for faster-than-light propulsion. The ship is equipped with advanced technology, including warp drive for faster-than-light travel, phasers and photon torpedoes for defense, a transporter for beaming personnel and objects, and a holodeck for virtual reality simulations."

Okay, so we know for sure that ChatGPT knows what the Enterprise is and what it looks like. But could it show me a picture? Why, yes it could:

"Here is an illustration of the Starship Enterprise as depicted in the popular "Star Trek" franchise:"

 

Okay... Now that's weird, for starters. I didn't ask for an ASCII image. I just asked for "an illustration". There are a billion images of the USS Enterprise on the web. Why go for that one?

[Edit - As I was writing this post, Wilhelm emailed me about his own experiments, including this admission, which he managed to wring out of ChatGPT: "Unfortunately, as an AI text-based model, I am unable to create visual illustrations. However, I would recommend searching for "Starship Enterprise" using an image search engine to find various illustrations and renditions of the iconic spacecraft." I guess that explains why it defaulted to ASCII although not why it made its excuses when Wilhelm asked for an illustration but for me just kept quiet and hoped I wouldn't notice.]

And you can, just about, tell it's the Enterprise. It may not be a great picture of the ship but it has the basic outline. It's not a blob. If ChatGPT can pull an ASCII image of the classic version of the Star Trek ship out of its hat when it isn't being asked for one, how come it can't get anywhere even close when it is?

Thinking about the implications, I decided to ask ChatGPT for some more ASCII pictures of the same ship:

"Please show me three images of the Starship Enterprise made up from symbols from the ASCII code set."

This is what I got:

"Sure! Here are three ASCII art images of the Starship Enterprise made up of symbols from the ASCII code set:"

A couple of things spring immediately to mind here: firstly, and most obviously, none of these looks even remotely like the Enterprise. The first two could conceivably be diagrammatic views of spacecraft but the third looks like a jellyfish or a broken chandelier. The Enterprise, though, they are not.

Since we have established that ChatGPT both knows what the Enterprise is and what it looks like when represented in ASCII, why is it incapable of reproducing that information accurately when asked? 

The other, less immediately noticeable peculiarity is that, having been asked for three examples, ChatGPT provides them but numbers each of them "1". Again, why?

But maybe we're making some false assumptions here. ChatGPT might know everything about the Enterprise but does it know what ASCII art is? I thought we ought to make sure:

"Please explain what ASCII art is, with examples."

"ASCII art is a form of digital art that uses characters from the ASCII (American Standard Code for Information Interchange) character set to create visual representations or images. ASCII is a standardized character encoding scheme that assigns a unique numerical value to each character, including letters, numbers, punctuation marks, and special characters, which can be represented as text in computers and other devices.

ASCII art is created by arranging these ASCII characters in a pattern or grid to form an image or design. ASCII art can be simple or complex, depending on the amount of detail and the arrangement of characters used. It can be used to create various types of artwork, such as portraits, landscapes, objects, and even abstract designs, using only text characters.

Here are a few examples of ASCII art:"

Fine. Those are certainly examples of ASCII art and they even look like the things they're meant to be. Again with the 1 numbering convention but we'll let that slide.

ChatGPT even had a postscript:

"ASCII art can be created using various tools, such as text editors, specialized ASCII art generators, or by manually arranging ASCII characters to form the desired image. It has been used in various digital media, such as emails, web pages, social media posts, and online forums, as a creative and expressive form of digital art."

Probably more than I needed to know but, sure. It knows exactly what ASCII art is and what it looks like. Next question - has it been taught how to do it?

"Have you been trained to recognize ASCII images of this kind and can you produce new ones yourself?"

ChatGPT was happy to confirm that it had and it could:

"As an AI language model, I have been trained on a large corpus of text data, including ASCII art, which allows me to recognize and understand ASCII images to some extent. I can also generate new ASCII art based on the patterns and structures I've learned from the training data. However, please note that my ASCII art generation capabilities may have limitations in terms of complexity and intricacy, and the quality of the generated ASCII art may vary."

It also wanted to prove it. I hadn't asked for examples but I got some anyway:

Lest I think it was showing off, ChatGPT also had a warning for me:

"However, please note that ASCII art is a visual art form, and the quality of the generated ASCII art may vary depending on the specific image or design requested. If you require high-quality or complex ASCII art, it may be best to consult specialized ASCII art generators or artists who specialize in creating ASCII art."

In other words, I'm doing what I can to please you but this isn't really what I'm best at. You might want to ask an expert.

Well, fair enough. But wait, isn't that tree the same one from the earlier example? It sure looks like it. What's going on? So I asked:

"One of those - the tree - is identical to the one you included in reply to a previous request. That reply also included a picture of a cat and a spaceship. Did you also create those yourself?"

ChatGPT, ever-polite, was mortified. It tried to explain:

"I apologize for any confusion. Yes, the examples of the cat and spaceship ASCII art that I provided in my previous response were generated by me, as an AI language model. However, please note that as an AI model, I do not "create" ASCII art in the traditional sense. Instead, I generate ASCII art by using patterns and structures learned from the large corpus of text data that I was trained on. The ASCII art I generate is based on the textual representations of the images, and may not always be unique or original."

It went on to illustrate the point:


 Yep. We've seen those before.

It was also keen to remind me not to expect too much. I got the feeling ChatGPT felt out of its depth and wanted to cover itself for any errors it might be making, but that's just my fancy. Probably...

"Please note that the quality and complexity of the generated ASCII art may vary depending on the specific request and the limitations of the AI model. For more intricate or detailed ASCII art, it may be best to consult specialized ASCII art generators or artists who are skilled in creating ASCII art."

Now that we'd established what the AI knew, what it could do and what it knew it couldn't do very well, I thought I'd give it one of the tasks Janelle Shane had set and see if it could do better for me than it had for her:

"Thanks for the clarification. Please generate a new ASCII image, this time of a giraffe."

"Certainly! Here's an example of a simple ASCII art representation of a giraffe:"


I don't know what that is but I do know it's not a giraffe. Might be an owl, though.

ChatGPT appended another warning not to expect too much, something it continued to do thereafter every time I asked it to draw anything. I had it thoroughly spooked. I think it might have been consulting its lawyers.

I wondered if it might do better with something a little less... organic.

"And an ASCII image of a motorcycle, please."

"Certainly! Here's an example of a simple ASCII art representation of a motorcycle:"

Yeah, no, it's not. A house, maybe? Possibly some tire tracks?

I decided to call it quits and in the tradition of stand-in teachers everywhere I gave ChatGPT some time at the end for free expression:

"And finally, an ASCII image of your own choosing, one that best demonstrates your capabilities in this field."

Go on! Cut loose! Show us what you can do!

"Sure! Here's an example of a simple ASCII art that showcases some of my capabilities as an AI language model:"

Very nice. Erm... but what is it? A hotel sign? A tuning fork? A sad little castle? I had to know.

"What was that final image a picture of?"

"The final ASCII art I provided is a simple abstract design and does not represent any specific object or concept. It is meant to showcase some of the creative capabilities of an AI language model in generating ASCII art. ASCII art is a form of visual art that uses ASCII characters to create images, and the interpretation of the art is subjective, open to individual interpretation by the viewer. The ASCII art I provided is a composition of characters arranged in a visually appealing manner to create an abstract design."

OMG! It's producing abstract art! It's expressing itself! If I don't get it that's my failure! Now I feel embarrassed. I must look like such a philistine. Quickly backtracking...

"Ah, I understand. Could you try again but this time make it a recognizable object, not an abstract, and tell me what you have drawn."

"Certainly! Here's an example of a simple ASCII art representation of a cat:"


What the...? Is this Groundhog Day? Or maybe Brazil? I know that bloody cat!

I also know when I'm beaten. ChatGPT has an answer for everything but not all of its answers are worth hearing. Rather than go another round, I graciously conceded. 

Like all of my interactions with the current crop of generative AIs, it was fun but ultimately produced more questions than it answered. I learned a fair amount about what ChatGPT knows about ASCII art and how it came by that knowledge. Perhaps more usefully, I also learned that the AI itself is aware of its limitations, something that wasn't at all apparent in Janelle Shane's experiments. 

I also learned, perhaps most surprisingly, that it appreciates the difference between representational and abstract art, not just in the academic way you'd expect but in its own practical application of the concepts when responding to instructions. All of this will be useful in helping me to frame prompts and queries for the AIs in future.

What I didn't learn, sadly, is why ChatGPT can produce a perfectly recognizable ASCII picture of the Starship Enterprise one moment and make a complete hash of it the next.

I suspect we may never know. The AIs certainly aren't going to tell us.


10 comments:

  1. One thing of note is that the "AIs don't plagiarize, they generate" camp is obviously full of it here. These ASCII images are almost certainly fully-plagiarized from the interwebs.

    An image generation AI like Stable Diffusion also fails to generate ASCII art (last I checked), but in an interestingly different way: it doesn't really know what ASCII text is, so it generates these blocky images that look like ASCII art from a distance and whose little blocks may have character-like details close up. Fascinating stuff.

    1. These AIs at times must be like the worst students ever, just regurgitating things they saw somewhere without giving any sense that they have understood the material at all, much less applying what they have read.

    2. I've always - and I mean since I was an early adolescent - been ambivalent about intellectual property rights. I certainly don't automatically favor copyright as a reasonable solution - it has a very dubious history and it seems to me that it only persists out of a combination of inertia and powerful corporate interest.

      I think people should be paid for their work but I'm not sure why they should be paid repeatedly. Also, once the work is out in the world, if someone else can use it as the foundation of new work, I see that as a good thing. I'd almost always be in favor of allowing creative re-use of assets over enforcement of ownership. The problem arises when the use isn't creative but exploitative, which is a whole other kettle of fish. I definitely wouldn't see AI-generated re-use as ethically or morally different from previous methods like sampling or collage though, at least not while the prompts are being set by humans trying to make something interesting.

      When the AIs begin spontaneously re-using the assets without anyone asking them to do it... then we might have a problem. Although, honestly, I'd love to see it happen.

  2. There is, of course, a part of me that feels that perhaps ASCII art isn't the best test. But, then again, if it doesn't just reject it out of hand, as Google Bard does, and gives it a shot, then evaluating how it does seems a legit endeavor.

    I have a few more AI gaming questions posts to put up, which get more and more specific, meaning the answers are knowable and not just opinions, as I move through them. I do not know if I am actually doing anything of value, but it is fun and we'll get to a couple where I am going to log into a game and try to do what the AIs tell me to. We shall see.

    1. I look forward to reading all of that. Sounds fascinating. An opinion piece I read today at NME described where we're at now as "the toybox phase", which sums it up perfectly. There are a whole bunch of publicly available tools and more every week and anyone and everyone is having a go. Meanwhile, behind closed doors, dark forces are no doubt conspiring to snatch all the toys for themselves and make us pay to use them.

      My main concern right now is the amount of time playing with these things takes up. The more I use them, the more ideas I get. I'm mostly posting about it to justify the time I'm spending - to myself.

  3. Isn't chatGPT just essentially a complex predictive text model that does its best to figure out the most likely next word/character based on what has gone before?

    ASCII art is a bunch of text characters and there's a limited corpus of ASCII art to be trawled from the web, I'd expect. Especially elaborate ones, as opposed to the simpler ones folks may copy-and-paste in forums and Reddit. So it may start with something triangular like the nose of a spaceship, and then rapidly devolve into choosing the probability that the triangular object is the roof of a castle, or some mountains or the letter A as the slashes and backslashes build up over time.

    It has no ability to see the whole picture. Some other programs, like actual ascii image converters, might start with an image and then render it down to ascii text, but that's not how chatGPT is programmed to work. It's just very good at figuring out the most plausible "next word" from all the contextual data it's stored - which then implies, insufficient data = more garbage produced.
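For what it's worth, the image-to-ASCII route mentioned here can be sketched in a few lines: start from pixel brightnesses and map each one to a character of similar visual density. The grid below is a made-up stand-in for real image data, not anything ChatGPT produces, and the character ramp is just one common convention.

```python
RAMP = " .:-=+*#%@"  # roughly darkest "ink" coverage on the right

def to_ascii(grid):
    """Map a 2D grid of brightness values (0.0-1.0) to ASCII characters."""
    lines = []
    for row in grid:
        # Scale each brightness to an index into the ramp, clamping at the top.
        line = "".join(RAMP[min(int(v * len(RAMP)), len(RAMP) - 1)] for v in row)
        lines.append(line)
    return "\n".join(lines)

# A fake 5x10 "image": a bright diagonal on a dim background.
grid = [[1.0 if r == c else 0.1 for c in range(10)] for r in range(5)]
print(to_ascii(grid))
```

This is exactly the step ChatGPT lacks: a converter works backwards from a whole picture, whereas a language model emits one character at a time with no global view.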

    The other day, I was asking it to describe the gameplay and endgame of an old MUD I used to play, Realms of Despair. Naturally, there's not a -lot- of text on the web about that obscure subject these days. It was still producing fairly plausible game description templates that felt stylized on more recent MMOs, and most amusingly, it made up sources like "Player Leraix on Youtube said blah blah" but on googling the name, there was no such source or player to be found. That's about the time I figured out that chatGPT will very cheerfully make up plausible sounding hallucinations.

    It's part of the intent, after all, to let it be creative and generative in that fashion.

    1. That's a really interesting description of how chatbots use generative AI to construct what looks like human conversation. I'm surprised at myself that until I read your comment I hadn't actually thought about the mechanics. Surprised and a little disappointed. I think I need to go do some reading.

      When it comes to something like speech or intelligence, the problem with reductive analysis of the mechanics involved is that nothing really appears "sentient" or even volitional if looked at in a sufficiently small timeframe, hence the new, neuroscience-influenced arguments over whether free will exists. When we start to talk about the current AIs being "very good at figuring out the most plausible "next word" from all the contextual data it's stored", we're inviting the question "So how is that different from the way a human constructs a conversation?" I mean, it's pretty much what I'm doing now, as I type this, isn't it?

      What's more, as must be demonstrated countless times every day online, if the corpus of data in my head is flawed, so will my arguments be. If I was going to start expounding the theory that the Earth was flat, based on things I'd read and things people had told me, would the fact that my argument would be objectively wrong mean I wasn't sentient? Or that I was less sentient than someone who didn't believe the earth was flat, assuming sentience scales?

      Which is not to say I think these programs know what they're doing in the sense that you and I know what we're doing but it is harder than you might imagine to describe just what it is that they are doing. It quickly moves into the realms of metaphysics, where, by definition, there is no fixed position and no absolute truth. I totally agree with your conclusion, though. The joy of these AIs and their true value to us is as creative tools. If we remember that *we* are using *them* to aid creativity then most of the issues go away. It's when we start deferring to them that we have a problem.

  4. AIs don't think. They imitate. Conversation AIs place character strings together and graphic AIs place pixels together, then we interpret those as texts or images, but the AIs don't think. They're literally a Chinese Room, weak AIs conceived to replace mundane human creation.

    But they don't think, and don't know what they do, and no amount of iteration will change that. Clever Hans the horse can't perform arithmetic; it just reads our body language and stops stamping the ground when our reaction tells it it has reached the right answer.

    AIs are lightyears away from that. But they don't need to think to replace help line operators, general art illustrators, commercial photographers and models, filler music composers and all sorts of mundane creative stuff... kicking the guts of actual creators in the process, of course, after stealing from them to somehow replace them with good-enough imitations.

    1. For many years I used to make the point that the "G" in MMORPG is misleading and inaccurate. They aren't - or rather at that time they weren't - what I would recognize as "games" and to label them as such was unhelpful and confusing. Although most have now become considerably more gamelike than they originally were, I still don't feel it's the best way of describing them. It is, however, what they've come to be known as and after a certain point it ceases to be useful to argue the nomenclature.

      You might notice that, as I write about AIs, I often question the use of the term as applied to the algorithmically-driven apps in question. They aren't Artificial Intelligences by any definition I would accept and it's highly misleading to call them by that name. AIs, however, is what they are called and that's unlikely to change, so we're all going to have to accept that language is defined by usage and just deal with it.

      The question of sentience and whether AIs "think", though, is indescribably more complicated than what we call these programs. Before we can make any hard statements on that, we'd first have to come to a generally-accepted position on what constitutes "sentience", "awareness", "consciousness", "thought" and even "life", none of which, as far as I can see, we are, as a species, even remotely close to agreeing on. I think it's extremely unlikely that humans would be capable of recognizing machine sentience or thought if it ever does appear, and the chances of a majority of humans agreeing on its appearance if it was allegedly recognized are even smaller.

      For now, I'm going to go with the colloquial terminology. AIs "think" because that makes it easier to describe what they do. It doesn't mean I believe they're conscious or aware in any humanistic fashion. It's just the current linguistic convention, misleading as it may be.

    2. You're making a category error here. AIs like ChatGPT can't know, even in the sense of "having a database of knowledge to pull from and regurgitating it". ChatGPT/GPT-3 are predictive models "trained" on massive data sets that use probabilistic algorithms to generate the next word in a sentence, nothing more. So when you ask if it understands what the Starship Enterprise is, it responds with a set of sentences that are literally generated by probability. In this case the word "associated" is predicted to be the most likely to come after the word "commonly", which is the most likely word to come after the word "most", and it does this for the entire response. It knows nothing; it simply predicts the order of words based on the massive dataset it was trained on, in conjunction with people, real, breathing people, who corrected it, probably in Kenya, for less than $2 an hour. There's an episode that came out on Chris Hayes' "Why Is This Happening" podcast back on the 18th that does a good job of pulling back the curtain the Wizard is hiding behind. I do think they oversell the capabilities of AI, but I am not a subject expert here and could be wrong.
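The word-by-word prediction described in this comment can be illustrated with a toy bigram model: count which word follows which in a training corpus, then always emit the most frequent successor. This is vastly cruder than GPT's actual transformer (the corpus below is invented for the example), but the greedy "pick the likeliest next word" loop is the same in spirit.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Count, for each word, how often each other word follows it."""
    follows = defaultdict(Counter)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def generate(follows, start, length):
    """Greedily extend `start` by the most frequent successor, `length` times."""
    out = [start]
    for _ in range(length):
        nxt = follows.get(out[-1])
        if not nxt:
            break  # no known successor; a real model never dead-ends like this
        out.append(nxt.most_common(1)[0][0])
    return " ".join(out)

corpus = ("it is most commonly associated with star trek "
          "and most commonly associated with spock")
model = train(corpus)
print(generate(model, "most", 3))  # → most commonly associated with
```

Note how the toy model happily produces fluent-looking local sequences while knowing nothing about what any of the words mean, which is the commenter's point in miniature.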

