Stable Diffusion 3.5 follows your prompts more closely and generates more diverse people

Oct 23, 2024 - 00:30

Stable Diffusion, an open-source alternative to AI image generators like Midjourney and DALL-E, has been updated to version 3.5. The new model tries to right some of the wrongs (which may be an understatement) of the widely panned Stable Diffusion 3 Medium. Stability AI says the 3.5 model adheres to prompts better than other image generators and competes with much larger models in output quality. In addition, it’s tuned for a greater diversity of styles, skin tones and features without needing to be prompted to do so explicitly.

The new model family comes in three flavors. Stable Diffusion 3.5 Large is the most powerful of the trio, with the highest output quality of the bunch, and Stability AI claims it leads the market in prompt adherence. The company says the model is suitable for professional use at 1-megapixel resolution.

Meanwhile, Stable Diffusion 3.5 Large Turbo is a “distilled” version of the larger model, focusing more on efficiency than maximum quality. Stability AI says the Turbo variant still produces “high-quality images with exceptional prompt adherence” in four steps.

Finally, Stable Diffusion 3.5 Medium (2.5 billion parameters) is designed to run on consumer hardware, balancing quality with simplicity. Billed as easier to customize, the model can generate images at resolutions between 0.25 and 2 megapixels. However, unlike the first two models, which are available now, Stable Diffusion 3.5 Medium doesn’t arrive until October 29.
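
For a rough sense of what those megapixel figures mean in pixel dimensions, here is a minimal Python sketch. It rests on two assumptions that are not from Stability AI's announcement: that one megapixel is counted as 1024 x 1024 pixels, and that image sides are rounded down to a multiple of 16. The function name `square_side_for_megapixels` is purely illustrative.

```python
import math


def square_side_for_megapixels(megapixels: float, multiple: int = 16) -> int:
    """Return the side length, in pixels, of a square image of roughly
    the given megapixel count.

    Assumptions (not from the announcement): 1 MP = 1024 * 1024 pixels,
    and the side is rounded down to a multiple of `multiple`.
    """
    total_pixels = int(megapixels * 1024 * 1024)
    side = math.isqrt(total_pixels)  # integer square root, rounded down
    return (side // multiple) * multiple


for mp in (0.25, 1.0, 2.0):
    side = square_side_for_megapixels(mp)
    print(f"{mp} MP is roughly {side} x {side} pixels")
```

Under those assumptions, the 0.25-to-2-megapixel range works out to square images between about 512 and 1,440 pixels on a side, with the 1-megapixel figure quoted for the Large model landing at 1,024 x 1,024.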

The new trio follows the botched Stable Diffusion 3 Medium in June. The company admitted that the release “didn’t fully meet our standards or our communities’ expectations,” as it produced some laughably grotesque body horror in response to prompts that asked for no such thing. Stability AI’s repeated mentions of exceptional prompt adherence in today’s announcement are likely no coincidence.

Although Stability AI only briefly mentioned it in its announcement blog post, the 3.5 series has new filters to better reflect human diversity. The company describes the new models’ human outputs as “representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting.”

Let’s hope it’s sophisticated enough to account for subtleties and historical sensitivities, unlike Google’s debacle from earlier this year. Without being prompted to do so, Gemini produced collections of egregiously inaccurate historical “photos,” like ethnically diverse Nazis and US Founding Fathers. The backlash was so intense that Google didn’t bring back the generation of images of people until six months later.

This article originally appeared on Engadget at https://www.engadget.com/ai/stable-diffusion-35-follows-your-prompts-more-closely-and-generates-more-diverse-people-184022965.html?src=rss
