Mollick Presents The Meaning Of New Image Generation Models


Forbes, 08-04-2025
Image caption: A paintbrush illustrates the concept of generative AI art, capturing the fusion of human imagination and artificial intelligence as brushstrokes evolve into intricate patterns.
What does it mean when AI can build smarter pictures?
We found out a few weeks ago, as both Google and OpenAI unveiled new image generation models that are fundamentally different from what has come before.
A number of important voices have chimed in on what this is likely to mean, but I hadn't yet covered this timely piece by Ethan Mollick at One Useful Thing, in which the MIT graduate takes a detailed look at these new models, evaluating how they work and what they're likely to mean for human users.
The Promise of Multimodal Image Generation
Essentially, Mollick explains that traditional image generation systems amounted to a handoff from one model to another.
'Previously, when a Large Language Model AI generated an image, it wasn't really the LLM doing the work,' he writes. 'Instead, the AI would send a text prompt to a separate image generation tool and show you what came back. The AI creates the text prompt, but another, less intelligent system creates the image.'
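To make that handoff concrete, here's a minimal sketch of the old two-step pipeline. The function names are purely illustrative stand-ins, not anyone's actual code: the point is simply that the LLM only produces a prompt string, and a separate system produces the pixels.

```python
# Illustrative sketch of the old two-model handoff (hypothetical helpers).
# The LLM never "sees" the image; it only writes a text prompt.

def chat_llm(user_request: str) -> str:
    """Hypothetical LLM call that rewrites the user's request as an image prompt."""
    return f"Detailed image prompt for: {user_request}"

def diffusion_model(prompt: str) -> bytes:
    """Hypothetical call to a separate text-to-image diffusion model."""
    return b"...image bytes..."

def old_pipeline(user_request: str) -> bytes:
    prompt = chat_llm(user_request)   # step 1: text in, text out
    image = diffusion_model(prompt)   # step 2: a different, "less intelligent" system paints
    return image                      # the LLM never inspects or reasons about the result
```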
Diffusion Models Are So 2021
The old models also mostly relied on diffusion to do their work.
How does diffusion work?
Traditional models lean on this single technique to generate images, with no deeper reasoning behind the pixels.
I remember writing up an explanation of diffusion about a year ago, drawing on how my colleague Daniela Rus has presented it at conferences.
It goes something like this: the diffusion model takes an image and introduces noise until the image is abstracted away, then denoises it step by step to form a brand-new image that resembles what the computer already knows from looking at images that match the prompt.
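For readers who want to see the intuition in code, here is a toy numerical sketch of that noise-then-denoise loop. It is a deliberately simplified illustration, not a real diffusion model: a real model learns to predict the noise from the image alone, whereas this toy version is handed the answer.

```python
import numpy as np

# Toy sketch of the diffusion idea: start from noise, then iteratively
# "denoise" toward something that looks like the training data.
rng = np.random.default_rng(0)
target = rng.random((8, 8))        # stand-in for "what images are supposed to look like"
x = rng.normal(size=(8, 8))        # start from pure noise

for step in range(50):
    predicted_noise = x - target   # a real model would *predict* this from x alone
    x = x - 0.1 * predicted_noise  # small denoising step toward the learned distribution

print(np.abs(x - target).mean())   # shrinks toward 0: the noise has been "removed"
```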
Here's the thing: if that's all the model does, you're not going to get an informed picture. You'll get a new picture that looks like a prior picture, or more accurately like the thousands of pictures the computer saw on the Internet, but you won't get a picture with actionable information that the model itself has reasoned through.
Now we have multimodal control, and that's fundamentally different.
No Elephants?
Mollick gives the example of a prompt that asks the model to create an image of a room with no elephants in it, annotated to show why there are no elephants in the room.
Here's the prompt: 'show me a room with no elephants in it, make sure to annotate the image to show me why there are no possible elephants.'
When you hand this to a traditional model, it shows you some elephants, because it doesn't understand the context of the prompt, or what it means. Furthermore, a lot of the text that you'll get is complete nonsense, or even made-up characters. That's because the model didn't know what letters actually looked like – it was getting that from training data, too.
Mollick then shows what happens when you hand the same prompt to a multimodal model: it gives you exactly what you want, a room with no elephants, with annotations like 'the door is too small' explaining why the elephants couldn't be in there.
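If you want to try the experiment yourself, a request like this is a short script against a multimodal image model. The sketch below assumes the OpenAI Python SDK and its image-generation endpoint with the "gpt-image-1" model; treat the model name and response fields as assumptions to verify against current documentation.

```python
import base64
from openai import OpenAI  # assumes the SDK is installed and OPENAI_API_KEY is set

client = OpenAI()

prompt = ("show me a room with no elephants in it, make sure to annotate "
          "the image to show me why there are no possible elephants")

# Model name and response fields are assumptions based on OpenAI's image API.
result = client.images.generate(model="gpt-image-1", prompt=prompt, size="1024x1024")

with open("no_elephants.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```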
Challenges of Prompting Traditional Models
I know from personal experience that this is how the traditional models worked: as soon as you asked them not to put something in, they would put it in, because they didn't understand your request.
Another major difference is that traditional models would change the fundamental image every time you asked for a correction or a tweak.
Suppose you had an image of a person, and you asked for a different hat. You might get an image of an entirely different person.
The multimodal image generation models know how to preserve the result that you wanted and change it in just one small way.
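In practice, that targeted tweak looks less like regenerating from scratch and more like an edit call, where you hand the existing image back along with a small instruction. Here is a sketch under the same assumptions as the earlier example (OpenAI Python SDK, "gpt-image-1"); check the endpoint details against the current docs before relying on them.

```python
import base64
from openai import OpenAI  # same assumptions as above: SDK installed, API key configured

client = OpenAI()

# Ask for one small change while keeping the rest of the picture intact.
# images.edit and the gpt-image-1 model name are assumptions to verify.
result = client.images.edit(
    model="gpt-image-1",
    image=open("portrait.png", "rb"),
    prompt="Keep this exact person and scene, but change the hat to a red beret.",
)

with open("portrait_red_beret.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```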
Preserving Habitats
Mollick gives another example of how this works: he shows an otter holding a particular sort of display in its hands, and then the same otter appearing in different environments with different styles of background.
This also shows the detailed integration of multimodal image generators.
A Whole Pitch Deck
As a use case scenario, Mollick shows how you could take one of these multimodal models and have it design an entire pitch deck for guacamole, or anything else.
All you have to do is ask for that type of deck, and the model will get right to work, looking at what else is on the Internet, synthesizing it, and giving you the result.
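One way to picture that workflow in code is a simple loop over slide descriptions, reusing the same assumed image API as in the earlier sketches. This is an illustration of the pattern, not Mollick's actual setup.

```python
import base64
from openai import OpenAI  # same assumed SDK and model as in the sketches above

client = OpenAI()

slides = [
    "Title slide: 'Guac-to-Go', a guacamole delivery startup, bold modern branding",
    "Problem slide: office workers can't get fresh guacamole, simple infographic",
    "Solution slide: 15-minute guacamole delivery, clean product mockup",
]

for i, description in enumerate(slides, start=1):
    # One image per slide; a consistent style is requested in the prompt itself.
    result = client.images.generate(
        model="gpt-image-1",  # assumed model name; check current documentation
        prompt=f"Pitch deck slide in a consistent flat design style. {description}",
    )
    with open(f"slide_{i}.png", "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))
```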
As Mollick mentions, this will make all sorts of human work obsolete very quickly.
We will need well-considered frameworks for dealing with that shift.