Yesterday, as I was sitting at the PC with a stack of real-life loot beside me, I had the urge to do one of those "What I got for Christmas" posts that, now I come to think of it, no-one really does any more, if indeed they ever did. I guess it feels a bit braggy, although what kind of social leverage anyone could get from seven Buffy the Vampire Slayer novels (used) or a three-inch tall felt fox (original purpose: Christmas tree ornament) is hard to imagine.
Then I read this post from AI Weirdness. It inspired me to go play with Stable Diffusion, one of the many AI toys in my playbox I keep forgetting to use. I'm always signing up to these things, bookmarking them and then doing nothing with them for months. Maybe I should make a New Year's Resolution about it. Those always work.
What got me started this time was Janelle Shane's comment that suggestions from Ada, the smallest AI, "tended to be harder to illustrate". I wondered about that so I picked one of Ada's card descriptions, "A frantic octopus watches Santa descend from the sky out of a dark, drapery-clad window" and ran it through Stable Diffusion to see what would happen.
This is what I got.
I think all of those would make excellent Christmas cards. In fact, rather than buying my cards next year, I might just print those four out and glue them over some of the many unused cards I have from previous Christmases. Even with the cost of printer ink it would probably save me some money.
Then I had the bright idea of inputting a list of gifts I'd received and letting SD illustrate them for me. I have a medium-term goal of getting an AI to write posts for me so I need the practice. So, it appears, does Stable Diffusion. The results I got were... shall we say "mixed"?
The first prompt I used was "chilli chocolate, a felt fox and a hat that lights up". This is what I got:
That's a lot of felt foxes and a fair spattering of hats, none of which light up. Also there are chillis but no chocolate. I give it five out of ten.
Next, I tried "gloves, cowboy bebop and a whole load of buffy the vampire slayer merch". Not sure why I went all ee cummings about it nor if that makes a difference to the result but I think it came out pretty well:
There are some very recognisable images in there - Buffy and Spike particularly - and I love the way SD has collaged them. It reminds me strongly of the kind of thing I did in art class back in the early 1970s.
It's very curious how the AI went purely for drawings rather than live-action, given both shows exist in both formats. Also that it picked Spike from Cowboy Bebop rather than any of the other main characters. I wonder if that has anything to do with there being a character called Spike in both shows? Gloves are well represented, too, albeit in only one out of the four versions, but there's no sign of any "merch". I think that's a solid eight out of ten. Nine, even.
By contrast, my next prompt was the least successful of them all. This is what "Supergirl, Robert B Parker's Spenser and a stripy velour hoodie" got me:
That's awful. It's just pictures of Supergirl. I mean, don't get me wrong, I have nothing against pictures of Supergirl, but where's Spenser? Or the hoodie?
Oh, wait... there is one picture of Supergirl, the one that looks like a mash-up of a photo and a cartoon, where she's wearing a maroon hoodie with the S logo. Is that an actual look she wore in the comics? I think it might be. Also there's one where she doesn't have the S logo at all, which is weird. I don't think I can give this one more than four out of ten.
Following on, we come to "presidents of america, veronica mars and some jelly beans". You'll note the capitals, which reappeared for a moment, have vanished again.
I don't think I can argue with this, although aesthetically I don't find it especially pleasing. I think I can see Kennedy, Bush Sr. and possibly Hoover in there, although I get the feeling the AI has played a game of mix and match with most of the presidents on show. It also seems to have slapped Kristen Bell's head onto a presidential torso, which is kind of horrific.
There's no arguing about the jelly beans. They get a whole image to themselves. I'm a little surprised the AI didn't make something of the Ronald Reagan/Jelly Bean connection but maybe that's why we still need humans - for the trivia. All in all, even though it's ugly as hell, nine out of ten for accuracy.
Two more to go. I saved one of the best and definitely the cutest for last. First up, "melanie martinez, halsey and lou reed CDs". For some reason I capitalized "CDs" but none of the names. It was late. Otherwise, I have no excuses.
I really like this set. I'm not absolutely sold on any of the images being Halsey but I'm willing to give the AI the benefit of the doubt because it's worked so hard to include all the elements in the prompt and managed to integrate them so well. I think the first panel is supposed to be a series of CD covers, albeit in dimensions that would better fit a cassette box. I'd be first in line for a copy of "Do Love" by Melanie and Lou. The one with the hand holding up the Melanie/Halsey collab is inspired, too. Another eight out of ten. I could be convinced to go as far as nine.
And finally... "a cute baby fox asleep in a nest". This is a bit of a gimme for Stable Diffusion, to be fair. It's just one image, not a list of unrelated items. Still, the result is so eerily similar to the actual present I received it's uncanny.
Well, the third one is, anyway. I'll give that one nine out of ten and the rest about six or seven.
That's not all the presents I got or even necessarily the best of them (that would be the polyptych diorama Mrs Bhagpuss made for me) but it'll do for now. Tomorrow I'm going to visit my mother. We'll see what she's got me and if it's odd enough (not an unlikely possibility) maybe we'll find out what Stable Diffusion makes of that.
Nice work!
You've come upon a couple of the weaknesses of Stable Diffusion and similar AIs quite early.
Stable Diffusion struggles hard with figuring out how to deal with multiple subjects in the same prompt: it tends either to treat the prompt as all one thing or to pick one of the subjects and ignore the rest. If you describe your gifts one at a time you'll get much better results, as you found with your fox nest.
Stable Diffusion also likes rendering pretty young women more than anything else for some reason. (Hint: the images that trained it were harvested from the Internet.) That's likely why once you put Supergirl in there that's what you were going to get; it also explains the paucity of Lou Reed.
Merry Christmas!
I suspect if I broke the lists up and gave them more context SD would do a better job with them. It's interesting that it did a really good job on the presidents/veronica/jelly beans one, particularly in the first frame, which has all three elements perfectly balanced. I'm guessing the main reason it failed so spectacularly on the Supergirl/Spenser/hoodie one is that Spenser has a very low internet presence, even though there was a TV show for a while, and "hoodie" is just too generic to have much of a hook.
I might try to get some better results on other lists by giving clearer instructions.
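For what it's worth, the one-gift-at-a-time advice could be sketched in a few lines of Python. This is only a rough sketch under some assumptions: `split_prompt` is a made-up helper name, not part of Stable Diffusion or any library, and it assumes the list is separated by commas and the word "and". Each line it prints would then be entered as its own prompt.

```python
import re


def split_prompt(prompt: str) -> list[str]:
    """Split a list-style prompt into one prompt per item.

    Hypothetical helper for illustration only: it assumes items are
    separated by commas and/or the standalone word "and".
    """
    # Break on commas, or on "and" surrounded by whitespace.
    parts = re.split(r",|\s+and\s+", prompt)
    # Trim whitespace and drop any empty fragments.
    return [part.strip() for part in parts if part.strip()]


gifts = "chilli chocolate, a felt fox and a hat that lights up"
for single_prompt in split_prompt(gifts):
    print(single_prompt)
```

A split this naive would also break up any item that legitimately contains the word "and", so a list like that would probably need splitting by hand.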