Latest news with #Imagen4

Engadget
25-06-2025
- Entertainment
- Engadget
Google's Imagen 4 text-to-image model promises 'significantly improved' boring images
Google has unveiled its latest text-to-image model, Imagen 4, with the usual promise of "significantly improved text rendering" over the previous version, Imagen 3. The company also introduced a new deluxe version called Imagen 4 Ultra, designed to follow more precise text prompts if you're willing to pay extra. Both arrive in a paid preview in the Gemini API and for limited free testing in Google AI Studio.

Google describes the main Imagen 4 model as "your go-to for most tasks" with a price of $0.04 per image. Imagen 4 Ultra, meanwhile, is for "when you need your images to precisely follow instructions," with the promise of "strong" output compared to other image generators like DALL-E and Midjourney. That model boosts the price by 50 percent to $0.06 per image.

The company showed off a range of images, including a three-panel comic generated by Imagen 4 Ultra showing a small spaceship being attacked by a giant blue... space lizard? with some sound effects like "Crunch!" and, inexplicably, "Had!!" The image followed the listed prompt beat for beat and looked okay, not unlike a toon rendering from a 3D app. Another prompt read "front of a vintage travel postcard for Kyoto: iconic pagoda under cherry blossoms, snow-capped mountains in distance, clear blue sky, vibrant colors." Imagen 4 output that to a "T," albeit in a generic style lacking any charm. Another image showed a hiking couple waving from atop a rock, and another a fake "avant garde" fashion shoot. The images were definitely of good quality and followed the text prompts precisely, but they still looked highly machine generated.

Imagen 4 is fine and does seem a mild improvement from before, but I'm not exactly wowed by it, particularly compared to the market leaders, DALL-E 3 and Midjourney 7. Plus, following an initial rush of enthusiasm, the public seems to be getting sick of AI art, with the main use case apparently being spammy ads on social media or at the bottom of articles.
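For readers who want to try the paid preview themselves, here is a minimal sketch of what an Imagen 4 request through the Gemini API could look like. It assumes the google-genai Python SDK and an assumed preview model ID along the lines of "imagen-4.0-generate-preview"; the exact model names, config fields and pricing should be checked against Google's current API documentation.

```python
# Hedged sketch: generate one image with Imagen 4 via the Gemini API.
# Assumes the google-genai SDK (pip install google-genai) and a paid-preview
# API key; the model ID below is an assumed preview name, not confirmed.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

result = client.models.generate_images(
    model="imagen-4.0-generate-preview",  # assumed ID; Ultra would be a separate, pricier model
    prompt=(
        "front of a vintage travel postcard for Kyoto: iconic pagoda under "
        "cherry blossoms, snow-capped mountains in distance, clear blue sky, "
        "vibrant colors"
    ),
    config=types.GenerateImagesConfig(number_of_images=1),
)

# Each generated image comes back as raw bytes; billing is per image
# ($0.04 standard, $0.06 Ultra, per the preview pricing quoted above).
with open("kyoto_postcard.png", "wb") as f:
    f.write(result.generated_images[0].image.image_bytes)
```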


Android Authority
22-06-2025
- Android Authority
I tested Gemini's latest image generator and here are the results
Back in November, I tested the image generation capabilities within Google's Gemini, which was powered by the Imagen 3 model. While I liked it, I ran into its limitations pretty quickly. Google recently rolled out its successor, Imagen 4, and I've been putting it through its paces over the last couple of weeks. I think the new version is definitely an improvement, as some of the issues I had with Imagen 3 are now thankfully gone. But some frustrations remain, meaning the new version isn't quite as good as I'd like.

So, what has improved?

The quality of the images produced has generally improved, though the improvement isn't massive. Imagen 3 was already generally good at creating images of people, animals, and scenery, but the new version consistently produces sharper, more detailed images. When it comes to generating images of people (which is only possible with Gemini Advanced), I had persistent issues with Imagen 3 where it would create cartoonish-looking photos, even when I wasn't asking for that specific style. Prompting it to change the image to something more realistic was often a losing battle. I haven't experienced any of that with Imagen 4. All the images of people it generates look very professional, perhaps a bit too much so, which is something we'll touch on later.

One of my biggest frustrations with the older model was the limited control over aspect ratios. I often felt stuck with 1:1 square images, which severely limited their use case. I couldn't use them for online publications, and printing them for a standard photo frame was out of the question. While Imagen 4 still defaults to a 1:1 ratio, I can now simply prompt it to use a different one, like 16:9, 9:16, or 4:3. This is the feature I've been waiting for, as it makes the images created far more versatile and usable.

Imagen 4 also works a lot more smoothly. While I haven't found it to be noticeably faster (although a faster model is reportedly in the works), there are far fewer errors. With the previous version, Gemini would sometimes show an error message saying it couldn't produce an image for an unknown reason. I have received none of those with Imagen 4. It just works.

Still looks a bit too retouched

While Imagen 4 produces better images, is more reliable, and allows for different aspect ratios, some of the issues I encountered when testing its predecessor are still present. My main problem is that the images often aren't as realistic as I'd like, especially when creating close-ups of people and animals. Images tend to come out quite saturated, and many feature a prominent bokeh effect that professionally blurs the background. They all look like they were taken by a photographer with 15 years of experience instead of by me, just pointing a camera at my cat and pressing the shutter. Sure, they look nice, but a 'casual mode' would be a fantastic addition: something more realistic, where the lighting isn't perfect and the subject isn't posing like a model. I prompted Gemini to make an image more realistic by removing the bokeh effect and generally making it less perfect. The AI did try, but after prompting it three or four times on the same image, it seemed to reach its limit and said it couldn't do any better. Each new image it produced was a bit more casual, but it was still quite polished, clearly hinting that it was AI-generated.
You can see that in the images above, going from left to right. The first one includes a strong bokeh effect, and the man has very clear skin, while the other two progress to the man looking older and more tired. He even started balding a bit in the last image. It's not what I really meant when prompting Gemini to make the image more realistic, although it does come out more casual.

Imagen 4 does a much better job with random images like landscapes and city skylines. These images, taken from afar, don't include as many close-up details, so they look more genuine. Still, it can be hit or miss. An image of the Sydney Opera House looks great, although the saturation is bumped up quite a bit: the grass is extra green, and the water is a picture-perfect blue. But when I asked for a picture of the Grand Canyon, it came out looking completely artificial and wouldn't fool anyone into thinking it was a real photo. It did perform better after a few retries, though.

Editing is better, but not quite there

One of my gripes with the previous version was its clumsy editing. When asked to change something minor, like the color of a hat, the AI would do it, but it would also generate a brand new, completely different image. The ideal scenario would be to create an image and then be allowed to edit every detail precisely, such as changing a piece of clothing, adding a specific item, or altering the weather conditions while leaving everything else exactly as is. Imagen 4 is better in this regard, but not by much. When I prompted it to change the color of a jacket to blue, it created a new image. However, by specifically asking it to keep all other details the same, it managed to maintain a lot of the scenery and subject from the original. That's what happened in the examples above. The woman in the third image was the same, and she appeared to be in a similar room, but her pose and the camera angle were different, making it more of a re-shoot than an edit.

Here's another example of a cat eating a popsicle. I prompted Gemini to change the color of the popsicle, and it did, and it kept a lot of the details. The cat's the same, and so is most of the background. But the cat's ears are now sticking out, and the hat is a bit different. Still, a good try.

Despite its shortcomings, Imagen 4 is a great tool

Even with its issues and a long wishlist of missing functionality, Imagen 4 is still among the best AI image generators available. Most of the problems I've mentioned are also present in other AI image-generation software, so it's not as if Gemini is behind the competition. It seems there are significant technical hurdles that need to be overcome before these types of tools can reach the next level of precision and realism. Other limitations are still in place, such as the inability to create images of famous people or generate content that violates Google's safety guidelines. Whether that's a good or a bad thing is a matter of opinion. For users seeking fewer restrictions, there are alternatives like Grok.

Have you tried out the latest image generation in Gemini? Let me know your thoughts in the comments.
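The aspect-ratio control the reviewer gets by prompting the Gemini app also appears to be exposed as an explicit parameter on the API side. As a rough, unverified sketch (again assuming the google-genai SDK and an assumed Imagen 4 preview model ID), the ratio would be set in the generation config rather than in the prompt text:

```python
# Hedged sketch: request a 16:9 image instead of the default 1:1 square.
# Model ID and the set of supported ratios are assumptions to verify
# against the current Imagen documentation.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

result = client.models.generate_images(
    model="imagen-4.0-generate-preview",
    prompt="a candid snapshot of a cat on a kitchen floor, soft natural light, no bokeh",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        aspect_ratio="16:9",  # ratios like "1:1", "4:3", "9:16" are also commonly supported
    ),
)

with open("cat_wide.png", "wb") as f:
    f.write(result.generated_images[0].image.image_bytes)
```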


Forbes
17-06-2025
- Business
- Forbes
Adobe Firefly Gets More Third-Party AI Models And Unveils New Mobile App
Adobe is rapidly expanding its Partner Model Integration Program with new AI models from Google, OpenAI and Black Forest Labs.

Software giant Adobe says it is now adding more image and video models from third parties like Ideogram, Luma AI, Pika and Runway to Adobe Firefly. These new models are part of Adobe's Partner Model Integration Program and join AI models from the likes of Google, OpenAI and Black Forest Labs. All models will be available in the latest version of Adobe's Firefly Boards.

Adobe has also announced that Google's latest text-to-image model, Imagen 4, will join the program and be available in Firefly's Text-to-Image feature, while Veo 3 will be integrated into both the Text-to-Video and Image-to-Video features. These new models will give content creators even more choice and flexibility for experimenting with AI models. All models will be available from within Adobe's trusted workflow without users having to leave Firefly or switch to other apps.

Alexandru Costin, vice president of Generative AI and Partner Model Integration at Adobe, says: 'We built the Firefly app to be the ultimate one-stop shop for creative experimentation—where you can explore different AI models, aesthetics, and media types all in one place. Every new partner model we add gives creators even more flexibility to experiment, iterate, and push their ideas further.'

Users can now use the Adobe Firefly mobile app and then switch back to the desktop version when they get back to the office.

Adobe says that no matter which model a creator opts to use and upload from within its products, the content will never be used to train generative AI models. This has always been true with Adobe's Firefly models, but it's also an essential condition of Adobe's third-party partnership agreements. As part of Adobe's commitment to transparency, Content Credentials are automatically appended to any AI-generated content produced within Firefly, so the user will always know when something was created with Firefly or one of its partner models. Thanks to these partnership agreements, Adobe's customers now get access to all the models via a single Adobe sign-in and plan, so there's no need to juggle separate accounts or subscription plans.

Today, Adobe is also unveiling innovations to Firefly Boards for creative professionals and their teams to access new ways of developing and collaborating on hundreds of concepts across multiple media formats… and all in one place. Firefly Boards give access to Adobe's commercially safe Firefly models as well as the growing roster of partner models, and the latest additions will help unlock more creativity and flexibility. These innovations in Firefly Boards include Generate Video, which adds videos generated by Adobe's commercially safe Firefly Video model and partner models, including Google Veo, Ray2 by Luma AI and Pika's text-to-video generator, directly to a Board. Users can make Iterative Edits to images, experimenting with further changes via conversational text prompts using Black Forest Labs' Flux.1 Kontext and OpenAI's image generation capabilities. Test prompts can easily be turned into video clips using the new Firefly mobile app. Another new feature is the ability to bring order to ideas with a single click: Boards can neatly organize all the visual elements into a clean, presentation-ready layout.
Users can also link Adobe documents so that any updates or changes made in other Adobe apps automatically sync to Boards content.

Adobe is expanding its mobile offering with the launch of the Firefly mobile app for iOS and Android. The new apps bring AI-first creativity to creators no matter where they are. For example, users can generate images and video from anywhere using popular generative AI features, including Generative Fill, Generative Expand, Text to Image, Text to Video and Image to Video. Just like the desktop Firefly app, creators will have the flexibility to choose between Adobe's commercially safe Firefly models or partner models from Google and OpenAI, depending on their creative needs for Text to Image, Text to Video and Image to Video. Anything created in the Firefly mobile app will automatically sync with the creator's Creative Cloud account. This means they can start creating on a mobile device and then switch over to desktop when they get back to the office. Alternatively, work begun on a desktop can be continued on the move by switching over to the Firefly mobile app.

Business Standard
04-06-2025
- Business
- Business Standard
Google-backed Glance AI to be part of Samsung mobile phones in the US
Google-backed Glance, a consumer tech company, announced on Wednesday that Glance AI, an AI commerce platform, will be available on Samsung phones in the US. Glance AI, which is powered by Google's Imagen 4 AI generator, brings personalised shopping to the lock screen. Glance AI allows users to instantly visualise themselves in outfits and destinations they would never imagine and purchase their favourites with just a tap.

Naveen Tewari, Founder and Chief Executive Officer, Glance, said: 'Glance AI helps consumers discover and visualise what's possible—starting with an outfit that makes them look and feel great—and own it with just a tap on the platform. Samsung's commitment to enable Glance AI across its US devices will enable consumers to enjoy a fully user-opted-in experience where inspirational commerce and content converge.'

Glance, a subsidiary of mobile advertising firm InMobi, is a five-year-old company. It has over 300 million global users, with 235 million in India. Glance as a software platform is available on over 450 million smartphones globally. Some of the OEMs that Glance works with include Samsung, MI, Vivo, Oppo, realme, Jio and others.

Jason Shim, Senior Director and Head of Samsung Galaxy Store USA, said in a statement: 'Glance AI is a perfect example of the kind of high-quality and unique content we strive to deliver. By using AI to personalise content and shopping directly on the lock screen, it brings a smarter, more dynamic experience that reflects the forward-thinking spirit of the Galaxy Store.'

In India, while Samsung phones may not have this feature on their lock screen, users can download the app from the Google Play Store and Apple's App Store. In an earlier announcement, the company had said that users can now explore looks and products tailored to their preferences and complete purchases seamlessly from over 400 global brands. Glance AI is a fully opt-in platform, with privacy and user control built into its core design. Users can explore looks, save or share them, set them as wallpapers, and visualise themselves in unique looks and collections driven by global trends and occasions.

In an earlier interaction with Business Standard, Tewari had said that the firm has set a timeframe of 12 months to reach profitability. This is on the back of its foray into the US market and also because of some of its artificial intelligence (AI)-powered offerings. Glance revenue for FY24, according to a report filed by Entrackr, stood at Rs 614 crore, up 89 per cent on a year-on-year basis. The firm's loss was Rs 929 crore in FY24, down from Rs 1,094 crore in FY23.


CNET
30-05-2025
- Business
- CNET
I Spent $125 to Generate 5 AI Videos a Day With Google's Veo 3. The Sound Sets It Apart
I am just a girl who wants to be on a warm beach, but most of the time I'm trapped behind a computer screen. So, like any reporter who tests and reviews AI, I make my days more bearable by using these AI programs to create alternative-timeline versions of myself somewhere where Jimmy Buffett is playing and you can smell the salt. Here's what Google's newest AI video model, Veo 3, came up with.

My beach bonfire party dream-turned-prompt is usually my first test as I put a new AI generator through its paces. And I admit, I had pretty low expectations for Veo 3. While I did see some social media posts gawking at Veo 3's capabilities, I've seen enough slop and hallucinations to approach with skepticism. Google's AI creative products, in particular, have always felt like a bit of an afterthought to me, something the company adds on to its extensive Gemini offerings to compete with the other tech heavyweights. But this year at the company's annual I/O developer conference, Google's Imagen 4, Veo 3 and Flow all took center stage.

So I dove into Veo 3. Without spoiling anything, I walked away from Veo feeling like this was the next natural step for Google, with one feature in particular giving the company an edge that might make it a more serious contender in the AI creative space. But there are serious limits and annoyances that I hope are addressed soon. Here's how my experience went and what you need to know.

Veo 3 availability, pricing and privacy

Veo 3 is currently available for Gemini Ultra users in the US and enterprise Vertex users. In other words, you'll need to pay up to play around with the new Veo. Ultra is Gemini's newest, priciest tier at $250 per month. (It's currently half off at $125 per month for three months.) Vertex is Google's AI enterprise platform, and you'll know if you have access to it. If you don't want to pay hundreds of dollars for access to Google's AI video tools -- and I don't blame you -- you can try out Veo 2 with Google AI's Pro plan. I found that the one-month free trial is enough time to figure out if you want to pay the $20 per month fee to continue using it. You can check out my hands-on testing with that model for more info.

Google's Gemini privacy policy says the company can collect your info to improve its technologies, which is why it recommends not sharing any confidential information with Gemini. You also agree to Google's prohibited use policy, which outlaws the creation of abusive or illegal content.

My wild ride with Veo 3

The most impressive thing about Veo 3 is its new audio generation capabilities. You don't have to tell Gemini in your prompt that you want sound; it will automatically add it. This is a first among competitors like OpenAI's Sora and Adobe's Firefly, and it certainly gives Google a huge edge. While the AI audio is a nice perk, it isn't perfect. If you're familiar with the somewhat clunky nature of AI-generated music and dialogue, you'll be able to identify it immediately. But there were times when it flowed more naturally. The clashing metal sounds and grunts in my alien fight scene were timed perfectly to their attacks, something that would've been difficult to add on my own afterward. But the dinosaur-like aliens also literally say "roar" and "hiss" instead of making those noises. My kayaker's paddling very nearly matched up with the water sloshing sound. The nature ambience in that video was particularly lovely and added a layer of depth that's been missing from AI videos.
My dream beach bonfire partiers didn't sound like any party I've ever been to, but still, points for being first and relatively unproblematic. Of course, while the audio was nice, it doesn't take away from the weird eccentricities that continue to plague AI generators. I ran into a few hiccups, mostly with people's faces, a notoriously hard thing for AI to mimic. But compared to the glaringly obvious errors I ran into with Veo 2, the new generation does appear to have made real improvements, as Google claimed it did.

I run into hallucinations a lot when I'm testing AI image and video generators, so the first thing I do is look for whether a service gives me the ability to edit its output. Veo 3 doesn't offer any editing tools, which is a bummer. It's certainly something that's going to make it less useful for professional creators, who are used to fine-tuning editing tools and need to make precise tweaks for their projects. You can send a follow-up prompt asking for specific changes. For example, I asked Veo to change the angle in the previous video so I could see her face, which the program handled well. With Veo 3, you'll typically have to wait 3 to 5 minutes for a new, edited video to load, though. Veo 3 has the longest generation time of any AI video generator I've tested. But the addition of audio to the videos excuses the longer wait time in my eyes.

The worst part of Veo 3 is how quickly I hit my daily generation limit. After only five videos, I was barred for an entire 24-hour period -- something that really annoyed me and made it much harder to assess. Google's VP of Gemini and Google Labs, Josh Woodward, said in a post on X/Twitter that Ultra subscribers like me have the highest number of generations that reset daily, in the regular Gemini app and in Flow. For me, that limit in Gemini was five videos. Flow's limit is 125, according to Woodward. I reached out to Google to get clarity on the daily limit Woodward mentions for Ultra users creating through Gemini. Here's the response: "Google AI Ultra subscribers get the highest level of access to Veo 3, our state-of-the-art video generation model, which they can use in both the Gemini app and Flow, our new AI filmmaking tool."

The limits are another sign that this isn't a tool meant for professional creation and iterative editing. You need to spend time thoughtfully crafting your prompt, and if Google flubs a face or glitches, you're likely to run out of credits fast and end up out of luck. Veo 3 is better suited for AI enthusiasts who want to dip their toes into video creation, not creators experimenting with AI.

Is Veo 3 worth the cost?

After an underwhelming experience with Veo 2, I had reservations about what to expect from Veo 3's usefulness and accuracy. But the new model was impressive, the audio especially, even though it's still missing some key features. Let me be clear: there is no rational reason to spend hundreds of dollars on a Gemini Ultra plan only to use Veo 3. If you want to dabble for fun, you can do that with Veo 2 for hundreds less per month, and if you're a creative professional, Veo 3 still lacks crucial features like editing. The Ultra plan does offer other perks, like YouTube Premium, 30 terabytes of storage and access to the newest Gemini models. So if you want any of those things, then, yeah, pay up and go play around with Veo 3. But it's not worth it on its own.

Veo 3 isn't the revolutionary upgrade those social media posts might lead you to believe. It is the next generation, better than last month's Veo 2, and it shows real promise for Google's future AI video endeavors. But be prepared to pay up if you want to try it out.
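For developers rather than Gemini-app subscribers, Veo is also reachable programmatically. The sketch below shows roughly how a Veo video request could be issued and polled with the google-genai SDK; the model ID is an assumed preview name, the config fields may differ, and (as the review notes) generation can take minutes, so the call is a long-running operation rather than an instant response.

```python
# Hedged sketch: request a short Veo clip and poll the long-running job.
# Assumes the google-genai SDK and paid access; "veo-3.0-generate-preview"
# is an assumed model ID, not a confirmed one.
import time

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",
    prompt="a beach bonfire party at dusk, friends laughing, waves and an acoustic guitar audible",
    config=types.GenerateVideosConfig(aspect_ratio="16:9", number_of_videos=1),
)

# Video generation runs asynchronously and can take several minutes, so poll until done.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download the finished clip (on Veo 3, audio is generated alongside the video).
video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("bonfire.mp4")
```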