Latest news with #Imagen4


Tom's Guide
a day ago
- Tom's Guide
I put 5 of the best AI image generators to the test using NightCafe — this one took the top spot
Competition in the AI image generator space is intense, with multiple companies like Ideogram, Midjourney and OpenAI hoping to convince you to use their offerings. That is why I'm a fan of NightCafe and have been using it for a few years. It has all the major models in one place, including DALL-E 3, Flux, Google Imagen and Ideogram.

I've created a lot of AI images over the years and every model brings something different. For example, Flux is a great general-purpose model available in different versions, Imagen 4 is incredible for realism and Ideogram does text better than anything but GPT-4o. With NightCafe you can run the same prompt across multiple models to see which you prefer, or even create a realistic image of, say, a train station using Google Imagen, then use that as a starter image for an Ideogram project to overlay a caption or stylized logo. NightCafe also offers most of the major video models, including Kling, Runway Gen-4, Luma Dream Machine and Wan 2.1, but for this test we're focusing on image models.

Having all those models to hand is a great way to test each of them to find the one that best matches your personal aesthetic — and they're each more different than you think. As well as the 'headline' models like Flux and Imagen, there are also community models that are fine-tuned versions of Flux and Stable Diffusion. For this test I focused on the core models: OpenAI GPT Image-1, Recraft v3, Google Imagen 4, Ideogram 3 and Flux Kontext. I've come up with a prompt to try across each model: it requires a degree of photorealism, presents a complex scene and includes a subtle text requirement.

Google's Imagen 4 is the model you'll use if you ask the Gemini app to create an image of something for you. It's also the model used in Google Slides when you create images. This was the first image for this test, and while it captured the smoke rising, it over-emphasised it a little. It did create a visually compelling scene and followed the requirement for the two people in the scene. It captured the correct vehicle, but there's no sign of the text.

Black Forest Labs' Flux models are among the most versatile and are open source. With the arrival of the Kontext variant, we got image models that also understand natural language better. This means, a bit like OpenAI's native image generation in GPT-4o, it gives much more accurate results, especially when rendering text or complex scenes. Flux Kontext captured the 'Cafe Matin' text perfectly, got the woman right, and it somehow feels more French than Imagen, but I don't think it's as photographically accurate.

GPT Image-1, not to be confused with the 2018 original GPT-1 model, is a multimodal model from OpenAI designed for improved render accuracy; it is used by Adobe, Figma, Canva and NightCafe. Like Kontext, it has a better understanding of natural language prompts. One downside to this model is that it can't do 9:16 or 16:9 images, only variants of square. It captured the truck and the name, but I don't think the scene is as good. It also randomly generated a second umbrella, and the placement of hands feels unreal.

Ideogram has been one of my favorite AI image models since it launched. Always able to generate legible text, it is also more flexible in terms of style than the other models. The Ideogram website includes a well-designed canvas and a built-in upscaler.
The result isn't perfect (the barista leans at an odd angle), but the lighting is more realistic, and the scene is more believable, with the truck on the sidewalk instead of the road. It also feels more modern, and the text is both legible and well designed.

Recraft is more of a design model, perfect for both rendered text and illustration, but that doesn't mean it can't create a stunning image. When it hit the market it shook things up, beating other models to the top of leaderboards. Here, though, I wasn't overly impressed with the output. Yes, it's the most visually striking, in part thanks to the space given to the scene, but it over-emphasises the smoke, and where is the barista? Also, for a model geared around text — there's no sign writing.

While Flux had a number of issues visually, it was the most consistent, and it included legible sign writing. If I were using this commercially, as a stock image, I'd go with the Google Imagen 4 image, but from a purely visual perspective — Flux wins.

What you also get with Flux Kontext is easy adaptation. You could use a secondary prompt to change the truck color or replace the old lady with a businessman. You can do that in Gemini, but not with Imagen; you'd need to use native image generation from Gemini 2+. If you want to make a change to any image using Kontext, even if it wasn't a Kontext image originally, just click on the image in NightCafe and select "Prompt to Edit". It costs about 2.5 credits and is just a simple descriptive text prompt away.

I used the most expensive version of each model for this test, the one that takes the most processing time on each image, which allowed for the fairest comparison. What surprises me is just how differently each model interprets the same descriptive prompt. But it doesn't surprise me how much better they've all got at following that description.

What I love about NightCafe, though, is that it's a one-stop shop for AI content. It isn't just a place to use all the leading image and video models; it contains a large community with a range of games, activities and groups centered around content creation. You can also edit, enhance, fix faces, upscale and expand any image you create within the app.

Engadget
25-06-2025
- Entertainment
- Engadget
Google's Imagen 4 text-to-image model promises 'significantly improved' boring images
Google has unveiled its latest text-to-image model, Imagen 4, with the usual promise of "significantly improved text rendering" over the previous version, Imagen 3. The company also introduced a new deluxe version called Imagen 4 Ultra, designed to follow more precise text prompts if you're willing to pay extra. Both arrive in paid preview in the Gemini API and for limited free testing in Google AI Studio.

Google describes the main Imagen 4 model as "your go-to for most tasks" with a price of $0.04 per image. Imagen 4 Ultra, meanwhile, is for "when you need your images to precisely follow instructions," with the promise of "strong" output results compared to other image generators like DALL-E and Midjourney. That model boosts the price by 50 percent to $0.06 per image.

The company showed off a range of images, including a three-panel comic generated by Imagen 4 Ultra showing a small spaceship being attacked by a giant blue... space lizard? with some sound effects like "Crunch!" and, inexplicably, "Had!!" The image followed the listed prompt beat for beat and looked okay, not unlike a toon rendering from a 3D app. Another prompt read "front of a vintage travel postcard for Kyoto: iconic pagoda under cherry blossoms, snow-capped mountains in distance, clear blue sky, vibrant colors." Imagen 4 output that to a "T," albeit in a generic style lacking any charm. Another image showed a hiking couple waving from atop a rock and another, a fake "avant garde" fashion shoot.

The images were definitely of good quality and followed the text prompts precisely, but they still looked highly machine generated. Imagen 4 is fine and does seem a mild improvement from before, but I'm not exactly wowed by it — particularly compared to the market leaders, DALL-E 3 and Midjourney 7. Plus, following an initial rush of enthusiasm, the public seems to be getting sick of AI art, with the main use case apparently being spammy ads on social media or at the bottom of articles.
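For readers who want to see what the paid preview looks like from the developer side, here is a minimal sketch of generating an image with Imagen 4 through the Gemini API using Google's google-genai Python SDK. The model identifiers below are assumptions based on Google's preview naming and may not match the IDs currently listed in AI Studio; the per-image prices quoted above are billed by the API itself rather than set in code.

```python
# Minimal sketch: generating an image with Imagen 4 via the Gemini API
# using the google-genai Python SDK (pip install google-genai).
# NOTE: the model IDs below are assumptions and may differ from the
# preview identifiers shown in your Google AI Studio account.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # paid preview requires a billed key

MODEL_STANDARD = "imagen-4.0-generate-preview-06-06"      # assumed ID, ~$0.04/image per the article
MODEL_ULTRA = "imagen-4.0-ultra-generate-preview-06-06"   # assumed ID, ~$0.06/image, stricter prompt following

response = client.models.generate_images(
    model=MODEL_STANDARD,
    prompt=("Front of a vintage travel postcard for Kyoto: iconic pagoda under "
            "cherry blossoms, snow-capped mountains in distance, clear blue sky"),
    config=types.GenerateImagesConfig(number_of_images=1),
)

# Each generated image comes back as raw bytes; write it to disk.
for i, generated in enumerate(response.generated_images):
    with open(f"kyoto_postcard_{i}.png", "wb") as f:
        f.write(generated.image.image_bytes)
```

Swapping MODEL_STANDARD for MODEL_ULTRA is the only change needed to target the pricier, more instruction-faithful variant.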


Android Authority
22-06-2025
- Android Authority
I tested Gemini's latest image generator and here are the results
Back in November, I tested the image generation capabilities within Google's Gemini, which was powered by the Imagen 3 model. While I liked it, I ran into its limitations pretty quickly. Google recently rolled out its successor — Imagen 4 — and I've been putting it through its paces over the last couple of weeks. I think the new version is definitely an improvement, as some of the issues I had with Imagen 3 are now thankfully gone. But some frustrations still remain, meaning the new version isn't quite as good as I'd like.

So, what has improved?

The quality of the images produced has generally improved, though the improvement isn't massive. Imagen 3 was already generally good at creating images of people, animals, and scenery, but the new version consistently produces sharper, more detailed images. When it comes to generating images of people — which is only possible with Gemini Advanced — I had persistent issues with Imagen 3 where it would create cartoonish-looking photos, even when I wasn't asking for that specific style. Prompting it to change the image to something more realistic was often a losing battle. I haven't experienced any of that with Imagen 4. All the images of people it generates look very professional — perhaps a bit too much, which is something we'll touch on later.

One of my biggest frustrations with the older model was the limited control over aspect ratios. I often felt stuck with 1:1 square images, which severely limited their use case. I couldn't use them for online publications, and printing them for a standard photo frame was out of the question. While Imagen 4 still defaults to a 1:1 ratio, I can now simply prompt it to use a different one, like 16:9, 9:16, or 4:3. This is the feature I've been waiting for, as it makes the images created far more versatile and usable.

Imagen 4 also works a lot more smoothly. While I haven't found it to be noticeably faster — although a faster model is reportedly in the works — there are far fewer errors. With the previous version, Gemini would sometimes show an error message saying it couldn't produce an image for an unknown reason. I have received none of those with Imagen 4. It just works.

Still looks a bit too retouched

While Imagen 4 produces better images, is more reliable, and allows for different aspect ratios, some of the issues I encountered when testing its predecessor are still present. My main problem is that the images often aren't as realistic as I'd like, especially when creating close-ups of people and animals. Images tend to come out quite saturated, and many feature a prominent bokeh effect that professionally blurs the background. They all look like they were taken by a photographer with 15 years of experience instead of by me, just pointing a camera at my cat and pressing the shutter. Sure, they look nice, but a 'casual mode' would be a fantastic addition — something more realistic, where the lighting isn't perfect and the subject isn't posing like a model.

I prompted Gemini to make an image more realistic by removing the bokeh effect and generally making it less perfect. The AI did try, but after prompting it three or four times on the same image, it seemed to reach its limit and said it couldn't do any better. Each new image it produced was a bit more casual, but it was still quite polished, clearly hinting that it was AI-generated.
You can see that in the images above, going from left to right. The first one includes a strong bokeh effect, and the man has very clear skin, while the other two progress to the man looking older and older, as well as more tired. He even started balding a bit in the last image. It's not what I really meant when prompting Gemini to make the image more realistic, although it does come out more casual.

Imagen 4 does a much better job with random images like landscapes and city skylines. These images, taken from afar, don't include as many close-up details, so they look more genuine. Still, it can be hit or miss. An image of the Sydney Opera House looks great, although the saturation is bumped up quite a bit — the grass is extra green, and the water is a picture-perfect blue. But when I asked for a picture of the Grand Canyon, it came out looking completely artificial and wouldn't fool anyone into thinking it was a real photo. It did perform better after a few retries, though.

Editing is better, but not quite there

One of my gripes with the previous version was its clumsy editing. When asked to change something minor — like the color of a hat — the AI would do it, but it would also generate a brand new, completely different image. The ideal scenario would be to create an image and then be allowed to edit every detail precisely, such as changing a piece of clothing, adding a specific item, or altering the weather conditions while leaving everything else exactly as is. Imagen 4 is better in this regard, but not by much. When I prompted it to change the color of a jacket to blue, it created a new image. However, by specifically asking it to keep all other details the same, it managed to maintain a lot of the scenery and subject from the original. That's what happened in the examples above. The woman in the third image was the same, and she appeared to be in a similar room, but her pose and the camera angle were different, making it more of a re-shoot than an edit.

Here's another example of a cat eating a popsicle. I prompted Gemini to change the color of the popsicle, and it did, and it kept a lot of the details. The cat's the same, and so is most of the background. But the cat's ears are now sticking out, and the hat is a bit different. Still, a good try.

Despite its shortcomings, Imagen 4 is a great tool

Even with its issues and a long wishlist of missing functionality, Imagen 4 is still among the best AI image generators available. Most of the problems I've mentioned are also present in other AI image-generation software, so it's not as if Gemini is behind the competition. It seems there are significant technical hurdles that need to be overcome before these types of tools can reach the next level of precision and realism. Other limitations are still in place, such as the inability to create images of famous people or generate content that violates Google's safety guidelines. Whether that's a good or a bad thing is a matter of opinion. For users seeking fewer restrictions, there are alternatives like Grok. Have you tried out the latest image generation in Gemini? Let me know your thoughts in the comments.
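The aspect-ratio control described above is a prompt-level feature in the Gemini app, but the same knob is exposed as an explicit parameter when Imagen is called through the Gemini API. The sketch below assumes the same google-genai Python SDK and preview model ID as the earlier example; the ratio strings shown are the ones documented for Imagen 3, which I am assuming carry over to Imagen 4.

```python
# Minimal sketch: requesting a 16:9 image instead of the default 1:1 square
# via the google-genai Python SDK. The model ID is an assumed preview name;
# the aspect_ratio values listed are documented for Imagen 3 and assumed
# to carry over to Imagen 4.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_images(
    model="imagen-4.0-generate-preview-06-06",  # assumed preview model ID
    prompt="A casual, unretouched snapshot of a tabby cat on a kitchen floor",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        aspect_ratio="16:9",  # also accepts "1:1", "3:4", "4:3", "9:16"
    ),
)

with open("cat_16x9.png", "wb") as f:
    f.write(response.generated_images[0].image.image_bytes)
```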


Forbes
17-06-2025
- Business
- Forbes
Adobe Firefly Gets More Third-Party AI Models And Unveils New Mobile App
Adobe is rapidly expanding its Partner Model Integration Program with new AI models from Google, OpenAI and Black Forest Labs.

Software giant Adobe says it is now adding more image and video models from third parties like Ideogram, Luma AI, Pika and Runway to Adobe Firefly. These new models are part of Adobe's Partner Model Integration Program and join AI models from the likes of Google, OpenAI and Black Forest Labs. All models will be available in the latest version of Adobe's Firefly Boards. Adobe has also announced that Google's latest text-to-image model, Imagen 4, will join the program and will be available in Firefly's Text-to-Image feature, while Veo 3 will be integrated into both the Text-to-Video and Image-to-Video features.

These new models will give content creators even more choice and flexibility for experimenting with AI models. All models will be available from within Adobe's trusted workflow without users having to leave Firefly or switch to other apps. Alexandru Costin, vice president of Generative AI, Adobe Partner Model Integration, says: 'We built the Firefly app to be the ultimate one-stop shop for creative experimentation—where you can explore different AI models, aesthetics, and media types all in one place. Every new partner model we add gives creators even more flexibility to experiment, iterate, and push their ideas further.'

Users can now use the Adobe Firefly mobile app and then switch back to the desktop version when they get back to the office.

Adobe says that no matter which model a creator opts to use and upload from within its products, the content will never be used to train generative AI models. This has always been true with Adobe's Firefly models, but it's also an essential condition of Adobe's third-party partnership agreements. As part of Adobe's commitment to transparency, Content Credentials are automatically appended to any AI-generated content produced within Firefly, so the user will always know when something was created with Firefly or one of its partner models. Thanks to these partnership agreements, Adobe's customers now get access to all the models via a single Adobe sign-in and plan, so there's no need to juggle separate accounts or subscription plans.

Today, Adobe is also unveiling innovations to Firefly Boards for creative professionals and their teams to access new ways of developing and collaborating on hundreds of concepts across multiple media formats… and all in one place. Firefly Boards give access to Adobe's commercially safe Firefly models as well as the growing roster of partner models. The latest additions will help unlock more creativity and flexibility. These latest innovations in Firefly Boards include Generate Video, which adds videos generated by Adobe's commercially safe Firefly Video model and partner models (including Google Veo, Ray2 by Luma AI and Pika's text-to-video generator) directly to a Board. Users can make Iterative Edits to images, experimenting and refining them with conversational text prompts using Black Forest Labs' Flux.1 Kontext and OpenAI's image generation capabilities. Test prompts can easily be turned into video clips using the new Firefly mobile app. Another new feature is the ability to bring order to ideas with a single click: Boards can neatly organize all the visual elements into a clean, presentation-ready layout.
Users can also link Adobe documents so that any updates or changes made in other Adobe apps automatically sync to Boards content.

Adobe is expanding its mobile offering with the launch of the Firefly mobile app for iOS and Android. The new apps bring AI-first creativity to creators no matter where they are. For example, users can generate images and video from anywhere using popular generative AI features, including Generative Fill, Generative Expand, Text to Image, Text to Video and Image to Video. Just like the desktop Firefly app, creators will have the flexibility to choose between using Adobe's commercially safe Firefly models or partner models from Google and OpenAI, depending on their creative needs for Text to Image, Text to Video and Image to Video. Anything that's created in the Firefly mobile app will automatically sync with the creator's Creative Cloud account. This means they can start creating on a mobile device and then switch over to desktop when they get back to the office. Alternatively, work that was begun on a desktop can be continued on the move by switching over to the Firefly mobile app.

Business Standard
04-06-2025
- Business
- Business Standard
Google-backed Glance AI to be part of Samsung mobile phones in the US
Google-backed Glance, a consumer tech company, announced on Wednesday that Glance AI—an AI commerce platform—will be available on Samsung phones in the US. Glance AI, which is powered by Google's Imagen 4 AI generator, brings personalised shopping to the lock screen. Glance AI allows users to instantly visualise themselves in outfits and destinations they would never imagine and purchase their favourites with just a tap.

Naveen Tewari, Founder and Chief Executive Officer, Glance, said: 'Glance AI helps consumers discover and visualise what's possible—starting with an outfit that makes them look and feel great—and own it with just a tap on the platform. Samsung's commitment to enable Glance AI across its US devices will enable consumers to enjoy a fully user-opted-in experience where inspirational commerce and content converge.'

Glance, a subsidiary of mobile advertising firm InMobi, is a five-year-old company. It has over 300 million global users, with 235 million in India. Glance as a software platform is available on over 450 million smartphones globally. Some of the OEMs that Glance works with include Samsung, MI, Vivo, Oppo, realme, Jio and others.

Jason Shim, Senior Director and Head of Samsung Galaxy Store USA, said in a statement: 'Glance AI is a perfect example of the kind of high-quality and unique content we strive to deliver. By using AI to personalise content and shopping directly on the lock screen, it brings a smarter, more dynamic experience that reflects the forward-thinking spirit of the Galaxy Store.'

In India, while Samsung phones may not have this feature on their lock screen, users can download the app from the Google Play Store and Apple's App Store. In an earlier announcement, the company had said that users can now explore looks and products tailored to their preferences and complete purchases seamlessly from over 400 global brands. Glance AI is a fully opt-in platform, with privacy and user control built into its core design. Users can explore looks, save or share them, set them as wallpapers, and visualise themselves in unique looks and collections driven by global trends and occasions.

In an earlier interaction with Business Standard, Tewari had said that the firm has set a timeframe of 12 months to reach profitability. This is on the back of its foray into the US market and also because of some of its artificial intelligence (AI)-powered offerings. Glance revenue for FY24, according to a report filed by Entrackr, stood at Rs 614 crore, up 89 per cent on a year-on-year basis. The firm's loss was Rs 929 crore in FY24, down from Rs 1,094 crore in FY23.