AI Text-to-Image Processors: Threat to Creatives or New Tool in the Toolbox?




An image produced from scratch by a video game designer using an AI tool recently won an art contest at the Colorado State Fair, as was widely reported. Some artists are alarmed, but should they be?

For several years, AI has been integrated into the tools used daily by artists, from computational photography in Apple’s iPhone to image enhancement tools from Topaz Labs and Lightricks, and even open source applications. But because an image generated entirely by an AI tool has won a competition, some see it as a tipping point – a sign of a coming AI disaster that will lead to widespread job displacement for those who work in creative fields such as graphic design and illustration, photography, journalism, creative writing, and even software development.


The winning image was generated using Midjourney, a cloud-based text-to-image tool developed by a small research lab of the same name that “explores new ways of thinking and expands the imaginative powers of the human species.” The tool is the product of AI neural networks trained on a large number of images. The company has not disclosed its tech stack, but CEO David Holz said it uses very large AI models with billions of parameters, adding, “They are trained on billions of images.” Although Midjourney only recently came out of stealth mode, hundreds of thousands of people are already using the service.

Similar tools are suddenly proliferating, including OpenAI’s DALL-E and Google’s Imagen. According to a Vanity Fair story, Imagen provides “photorealistic images [that] are even more indistinguishable from reality.” Stability.ai’s Stable Diffusion is another new text-to-image tool; it is open source and can run locally on a PC with a good graphics card. Stable Diffusion can also be used through art generator services including Artbreeder, Pixelz.ai and Lightricks.

Using is believing

As a passionate amateur photographer who exhibits his work in galleries, I fear that these tools mark the end of photography. I decided to try Midjourney myself to see what it could produce and to better think about the possible ramifications. The following image was generated by trying variations on these text prompts: “An emerald green lake backed by steep Canadian Rockies + A few patches of snow on the mountains + Soft morning light + Mountains with green coniferous forest + Sunrise sun + 4K UHD.”

Canadian Rockies by Gary Grossman via Midjourney

This seems like an amazing result for a novice user. The total time from when I first accessed the system to the final image was under 30 minutes. I have to admit I experienced childlike wonder watching the image materialize in seconds from the prompts I provided. It brought to mind a 60-year-old quote from science fiction writer and futurist Arthur C. Clarke: “Any sufficiently advanced technology is indistinguishable from magic.” It was like magic.

Other Midjourney users display far more sophistication. For example, one user produced an “alien cat” image from over 30 text prompts, including: “cat + alien with rainbow shimmering scales, bright, hyper detailed, micro detail, ultra large angle, octane rendering, realistic…”. It seems that more detailed prompts can lead to more sophisticated and higher quality images.

Bella Gritty’s Alien Cat via Midjourney

These text-to-image AI tools are already good enough for commercial use. Creative artist Karen X. Cheng was hired to create an AI-generated cover image for Cosmopolitan. To help generate ideas and the final image, she used DALL-E, or more specifically the latest version, DALL-E 2. Cheng describes the process, including finding the right set of prompts, noting that she generated thousands of images and changed the text prompts hundreds of times over many hours before finding a suitable image.


Text-to-image: new tool or threat to a way of life?

In a post on LinkedIn, Cheng commented, “I think the natural reaction is to fear that AI will replace human artists. Admittedly, that thought crossed my mind, especially at first. But the more I use DALL-E, the less I see it as a replacement for humans, and the more I see it as a tool for humans to use – an instrument to play.”

I had the same feeling using Midjourney. I posted the image of the Canadian Rockies on Flickr, an image sharing site for artists – primarily photographers and digital artists – and asked for opinions. Specifically, I wanted to know whether people consider an AI image generator to be an abomination and a threat or just another tool. A professional replied: “I also played with Midjourney. I am a creative! How can I not play around with it to see what it can do? I am of the opinion that the results are art, even if they are generated by AI. A human imagination creates the prompt and then arranges the results or tries to coax a different result from the system. I think it’s wonderful.”

A common refrain in the AI debate is that it will destroy jobs. The answer to this concern is often two-fold: first, many existing jobs will be augmented by AI, so that humans and machines working together will produce better results by extending human creativity, not replacing it; second, AI will also create new jobs, possibly in areas that did not exist before.

Entrepreneur and influencer Rob Lennon predicted recently that AI text and image generators will open up new career opportunities, specifically citing “prompt engineering.” Prompt engineering is the art of knowing how to write a prompt to get optimal results from an AI. The best prompts are concise while giving the AI enough context to understand the desired outcome. Already, PromptBase has started marketing this service. Its platform allows prompt engineers to “sell text descriptions that reliably output a certain art style or topic on a specific AI platform.”

New York magazine photo editor Megan Paetzhold put DALL-E to the test with assignments she would normally give to artists on her team. In the end, she called it “a toss up” and noted, “DALL-E never gave me a good picture on the first try – there was always a workshop process.” She added, “As I refined my techniques, the process began to be surprisingly collaborative; I was working with DALL-E rather than using it. DALL-E would show me its work and I would adjust my prompt until I was satisfied.”

Isn’t there a dark side?

Obviously, these tools can be used to produce high quality content. While many creative jobs may ultimately be at risk, for now text-to-image generators are an example of people and machines working together in a new realm of artistic exploration. Ethically, the key is to disclose that an image or text was created using an AI generator so people know the content was produced by a machine. They may or may not like the result, and in that regard, it’s no different than any other creative endeavor.

This outlook will not satisfy everyone. Many writers, photographers, illustrators and other creatives – even while agreeing that AI generation tools still lack refinement – think it’s only a matter of time before they, the creative professionals, are replaced by machines. Bloomberg Technology Editor Vlad Savov summed up these arguments, calling these tools both stifling and rogue for artists. He may ultimately be right, though as one respondent to my Flickr query noted, “It’s a different kind of art, which isn’t necessarily bad and potentially allows for incredible creativity.” Another wrote: “I don’t feel threatened by AI. Everything changes.” That it does. I guess we just thought there would be more time.

It is possible that these tools are just one more tool in the artist’s kit. They will be used to produce images and text that will be appreciated and sold. As Jesus Diaz writes in Fast Company: “Once you’ve tried a text-to-image program, the joy of artificial intelligence seems undeniable despite the many dangers that lie ahead.” This does not automatically mean that more traditional creative activities will disappear. Ironically, there might come a time in the not-too-distant future when “man-made” will have a cachet, and work produced without an AI image or text generator might command a premium.

Gary Grossman is Senior Vice President of Technology Practice at Edelman and Global Head of the Edelman AI Center of Excellence.
