The Almost Birth of Augmented Creativity

Image generated by dall-e-3 with the ChatGPT UI. The full prompt is at the bottom of the article.

A few years ago I started writing a graphic novel. I had a good idea, some interesting characters, and software that provided templates for comic book pages. But I lacked one important thing: even the tiniest amount of skill in the visual arts. There were some clip art figures and stock backgrounds I could use to suggest what should be in each panel, but the human figures the software provided were terrible. The result was less than satisfactory. FAR less.

I eventually gave up the project because it was taking a tremendous amount of time to produce something unusable. I had a lot of fun, though, and I’ve been trying ever since to think of a way (that doesn’t involve spending money) to revive the project.

Fast forward to today, when we have artificial intelligence that can produce amazing images from a text prompt describing what you want. I use ChatGPT nearly every day, though not usually for images. But why not? I tried an experiment. I described a scene corresponding to a very early panel in the story: a police detective, a forensic technician, and some equipment inside an apartment where a murder had occurred. It took a couple of tries, but I got something decent.

The AI has a bit of a mind of its own. First it made an image that looked almost like a photograph. It was good, but it looked like it was set in a lab, not an apartment. I told it I wanted the same picture (more or less) in a comic book style. The result still looked like it was in some kind of lab but was otherwise great (see below).

An early effort at getting ChatGPT/dall-e-3 to illustrate my graphic novel.

The next step happened the next day. I uploaded the previous picture and told it I wanted a picture of that same detective talking to a uniformed cop in the apartment building’s hallway. Aside from the fact that it was pathologically incapable of making the two people actually face each other when they talked, and that it shaved 20 years off the detective’s apparent age, it was good.

I imported that into PowerPoint, added some word balloons, and voila! See below.

BUT I wasn’t done. I wanted to see if I could keep the story going. So a couple of days later, I uploaded this image and told it to give me another angle on the same conversation. The first attempt wasn’t bad, except it removed the uniformed cop’s moustache! I made it put that back, and it gave the detective a hat. I told it to remove the hat and it did, hair and all! After a few more attempts, I decided, “Good enough for now.” See below. They don’t look like exactly the same people. Pretty close, but not quite the same. And no matter what I tell it, they still won’t face each other. But it’s a start.

Some discussion

  • Part of the problem has to do with a setting called “temperature.” Very basically speaking, the hotter the temperature of an AI model, the more randomness there is in the output. Setting it very low might get us more consistent results. I don’t know; it’s something to look into (see the sketch after this list).
  • The problem of people not facing each other is a big one but not horrible. I mean, I can live with it.
  • BUT the one about people changing ages and facial hair between one picture and the next is crazy. I think the basic problem is that modern image generation AI was developed to do one image at a time. The idea of a related series of them is not the “base use case.” But there is software out there already for making videos with specific avatars. This shouldn’t be that much different. Or maybe it is. I don’t know.
  • Just a little more improvement in the results and maybe I can actually make that graphic novel some day. I’d like that. It does mean I would have to finish the story, too. I think it needs a serious rewrite before I can do that.
  • This is what I meant by the title. “Augmented Creativity.” There are bits I can do better than the machine. There are bits that it does better than I do. Working together, we produce something neither of us could do alone. And if it ever sells, I get to keep all the profits. I paid my $20 for access to the AI this month! But like I said in the title, it’s only “almost” there. I’m anxious for more.
  • But, you might ask, won’t it eventually reach the point where the machine doesn’t even need you for this? It can already write stories. My personal belief is that its creativity is severely limited. Really, it just fills words into “templates” that it learned from its training data. It isn’t really creating anything new. And I don’t think the current AI technology, known as “generative AI,” has the ability to go beyond those limits, although it might come close enough to them to satisfy many people.
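
To make the temperature point a little more concrete, here is a minimal sketch of what setting it explicitly might look like, assuming you call the OpenAI Python client directly instead of going through the ChatGPT UI the way I did. The model names and the prompt are just illustrative, and as far as I can tell the temperature knob belongs to the text model that writes the detailed prompt, not to the dall-e-3 image endpoint itself.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Ask a chat model to write the detailed image prompt. A low temperature
    # makes repeated runs describe the scene (and characters) more consistently.
    chat = client.chat.completions.create(
        model="gpt-4o",      # illustrative choice of text model
        temperature=0.2,     # 0 to 2; lower means less randomness
        messages=[{
            "role": "user",
            "content": "Write a detailed image prompt: a police detective and "
                       "a forensic technician in an apartment where a murder "
                       "occurred, comic book style.",
        }],
    )
    detailed_prompt = chat.choices[0].message.content

    # The image endpoint takes the prompt; it does not (as far as I know)
    # expose a temperature parameter of its own.
    image = client.images.generate(
        model="dall-e-3",
        prompt=detailed_prompt,
        size="1024x1024",
    )
    print(image.data[0].url)

Whether that would actually keep the detective looking the same age from one panel to the next is exactly the experiment I haven’t run yet.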

Yes, eventually the machines will catch up and be able to do everything. But they haven’t yet, and anyway, a machine can’t enjoy the creation process for me. That part is all mine.


Here’s the prompt for the image at the top. It’s not the prompt I gave; that was only about 8 words long. This is the prompt the AI developed for itself based on what I gave it: “An imaginative depiction of a robot resembling Leonardo da Vinci, engaged in the act of painting. The robot should have a humanoid form with visible mechanical elements and a Renaissance-era attire, reflecting da Vinci’s period. The setting is an artist’s studio, filled with Renaissance-era art tools, canvases, and a hint of an unfinished painting that mirrors the style of that era. The robot should be holding a paintbrush, with a thoughtful expression as if pondering over the artwork.”