Eye on AI - July 15th, 2022
Welcome to Aigora's "Eye on AI" series, where we round up exciting news at the intersection of consumer science and artificial intelligence!
This week’s focus is on text-to-art AI and includes a look at Meta’s new ‘Make-A-Scene’ AI, which adds a new layer of context to more traditional text-to-art tools, and the highly accessible DALL-E Mini.
Meta’s New AI Research Tool May Rival OpenAI’s DALL-E
If you’ve been following the AI art world, you may have noticed the recent explosion of text-to-art AI. First, there was OpenAI’s DALL-E 2, which is generally regarded as the pinnacle of text-to-art AI. Its success led to the creation of the more accessible but less proficient DALL-E mini (no relation to DALL-E 2), Google’s Imagen, and, just last week, Meta’s Make-A-Scene.
However, according to Meta’s press release on their new creation, Make-A-Scene is somewhat different from its more traditional text-to-art AI predecessors. Not only does it give the power of its AI to artists and non-artists alike, but it also produces more predictable outcomes by utilizing human drawings as a second input.
“[Make-A-Scene] demonstrates how people can use both text and simple drawings to convey their visions with greater specificity using a variety of elements,” reads the editorial. “[It] captures the scene layout to enable nuanced sketches as input. It can also generate its own layout with text-only prompts, if that’s what the creator chooses. The model focuses on learning key aspects of the imagery that are more likely to be important to the creator, like objects or animals.”
To put this more simply, Make-A-Scene users type in a description of their desired image and supplement the text with a human-drawn sketch; the AI then produces numerous images based on the combination of text and drawing, yielding more accurate output. It's this combination that Meta believes will make their AI more accessible.
“... people should be able to shape and control the content a system generates,” the editorial continues. “It should be intuitive and easy to use so people can leverage whatever modes of expression work best for them, whether speech, text, gestures, eye movements or even sketches to bring their vision to life.”
How successful Make-A-Scene is at rendering more accurate artwork remains to be seen. It does appear to produce fairly accurate renderings of a user's desired image, though the solution is not yet available to test. So far as we've seen, it can't yet accept inputs like gestures and eye movements. However, adding sketches into the mix should provide better context for more accurate image renderings.
How DALL-E Mini Is Educating the World in AI Art Deception
Let’s continue with a look at DALL-E mini, the more accessible text-to-art AI tool mentioned above. According to the WIRED article “DALL-E Mini Is the Internet's Favorite AI Meme Machine,” DALL-E mini has become a meme-making machine in the past few months.
“The outwardly simple app, which generates nine images in response to any typed text prompt, was launched nearly a year ago by an independent developer,” writes WIRED contributor Will Knight. “But after some recent improvements and a few viral tweets, its ability to crudely sketch all manner of surreal, hilarious, and even nightmarish visions suddenly became meme magic.”
Unlike DALL-E 2, DALL-E mini, a simplified version of its more powerful counterpart, is available for use by everyone, which means it has a high potential for abuse. The warning on the DALL-E mini web page, which cautions that the tool may “reinforce or exacerbate societal biases” or “generate images that contain stereotypes against minority groups,” points to this concern.
DALL-E 2 comes with filters to block out sensitive content and revokes access to frequent abusers. Not so with DALL-E mini, which has no filters at all. Users can type anything they want to create an image. Thankfully, the images created by DALL-E mini are cruder and more smudged than the highly realistic renderings of DALL-E 2.
“Delangue of Hugging Face says it’s good that the DALL-E Mini’s creations are much cruder than those made with DALL-E 2 because their glitches make clear the imagery is not real and was generated by AI,” continues Knight. “He argues that this has allowed DALL-E Mini to help people learn firsthand about the emerging image-manipulation capabilities of AI, which have mostly been kept locked away from the public.”
Given the abuse we’ve seen across social media platforms, finding the right balance of guardrails and accessibility may prove difficult for text-to-art AI. Let’s hope DALL-E mini’s accessibility teaches users about potential abuse, so that if abuse becomes more frequent in its more realistic counterparts, people will be better able to distinguish AI-generated images from real ones.
How artificial intelligence is boosting crop yield to feed the world
NFTs become physical experiences as brands offer in-store minting
OpenAI’s new AI learned Minecraft by watching 70,000 hours of YouTube
That's it for now. If you'd like to receive email updates from Aigora, including weekly video recaps of our blog activity, click on the button below to join our email list. Thanks for stopping by!