COMPOSING PROMPTS FOR PHOTOREALISTIC IMAGES
This post is not directed towards every AI artist who is exploring this revolutionary new art form. It can’t be all things to all people, and I know that I don’t have the expertise, or the knowledge of the fundamentals, to intelligently discuss creating illustrations, watercolors, oil paintings, video game simulations, 3D models and so forth.
In this installment of the series on AI imaging we will be discussing how to create effective prompts for photorealistic images. From a photographer’s point of view. From someone who spent fifty years behind the lenses of cameras of all types. From someone whose entire life revolved around the business of professional photography. From someone who knows and understands the fundamentals of the craft.
What Makes A Great Image?
During my career as a professional photographer I had the privilege of judging the annual print competition for WPPI (Wedding and Portrait Photographers International). This is, arguably, the most prestigious photography competition in the world of wedding and portrait photography. The competitors were the-best-of-the-best from all around the globe.
As judges we were charged with the responsibility of viewing, scoring and offering critique on incredible images submitted by the world’s best image makers. There are numerous parameters by which we would evaluate this photographic art. I’m going to discuss just a few of those in relation to how I use those traditional guidelines when creating effective prompts for photorealistic images.
Impact
When you view an image for the very first time it should “WOW” you. Your first reaction to seeing a potential award-winning image should be one of astonishment. If an image looks the same as hundreds of images you have seen before, then the artist has not delivered on the impact factor.
To achieve this important quality of impact, I start my prompts with a description of the scene, as I picture it in my mind’s eye, including details that I know will give the image that desired impact. The prompt for the live performance image seen above started as follows: “In this photorealistic image, a heavy metal band guitarist is seen playing with intense energy and passion in front of a roaring crowd at an arena concert. The musician is waist up, and the spotlight shines brightly on him, illuminating his face and amplifying his stage presence. The stage is alive with colorful accent stage lights, which add to the electrifying atmosphere of the show.”
Composition
By default most AI image generators are going to place your subject dead center in the frame. Why? Because that’s what most casual photographers do and the AI bots have scraped billions of those images in building their data sets.
To get your AI image generator to adhere to what professional image makers refer to as the rule of thirds, you have to give it that guidance in your prompt. The prompt for the bridal image above included the instructions: “…the bride is standing in the right 1/3 of the frame looking to her right. Her eyes are focused on her bridal bouquet of white and pastel pink flowers, which is placed in the left 1/3 of the frame…”
Technical Expertise
If you wish to use AI to create extraordinary special effects photographs you should have a good working knowledge of what goes into creating such images in a traditional photographic environment. I see so many prompts being shared online where the image makers throw a “word salad” of random photographic terms at the AI bot, with no command of the underlying fundamentals of the instructions they’re giving. You need to have a good command of photographic fundamentals if you want to compose effective prompts for photorealistic images.
You can’t just type “Depth Of Field” (or “DOF” as some of the truly lazy prompters do) without telling your bot what depth of field characteristic you’re looking for. Do you want a shallow depth of field that puts your subject in sharp focus against an out-of-focus background? Or do you, instead, want a deep depth of field that keeps all of the elements of the image in sharp focus? It’s up to you to specify this in your prompt. I’ve said it before: “Computers don’t do what you want them to do. They do what you tell them to do.”
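To put numbers on the difference between “shallow” and “deep”, here is a minimal Python sketch of the standard hyperfocal-distance approximation for depth of field. The 85mm focal length, 2 meter subject distance, 0.03mm circle of confusion and the two apertures are illustrative assumptions, not settings taken from any prompt in this post. An AI image generator does not run these optics; the sketch only shows what the two terms mean in a traditional photographic environment.

```python
def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Return the (near, far) limits of acceptable sharpness in millimeters.

    Standard hyperfocal-distance approximation. The far limit is infinite
    when the subject sits at or beyond the hyperfocal distance.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    if subject_mm >= hyperfocal:
        far = float("inf")
    else:
        far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far


# Assumed example: 85mm lens, subject 2 meters (2000 mm) away.
for f_number in (2, 11):
    near, far = depth_of_field_mm(85, f_number, 2000)
    print(f"f/{f_number}: sharp zone is roughly {far - near:.0f} mm deep")
```

The wide aperture gives a sharp zone only a few centimeters deep, while the small aperture stretches it several times further. That is the choice you are making when you write “shallow” or “deep” in a prompt.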
In the explosive image of the motorcycle helmet above the prompt included highly technical instructions such as: “…lens aperture is set at f/11 to insure that the entire helmet, and the pieces breaking away from it, all remain in sharp focus. The studio soft box that illuminates the scene is fitted with an electronic flash that is set for a flash duration of 1/20,000 of a second to insure that all the flying bits and dust particles are frozen in place.”
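Why does such a short flash duration freeze the debris? The arithmetic is simple: the distance an object travels during the exposure equals its speed multiplied by the flash duration. The Python sketch below is only an illustration. The 10 meters per second fragment speed and the 1/200 second comparison duration are assumed values, not figures from the prompt.

```python
def blur_mm(speed_m_per_s, duration_s):
    """Distance in millimeters an object travels while the flash is lit."""
    return speed_m_per_s * duration_s * 1000


FRAGMENT_SPEED = 10.0  # meters per second; assumed for illustration

for label, duration in (("1/200 s", 1 / 200), ("1/20,000 s", 1 / 20_000)):
    print(f"{label}: fragment moves {blur_mm(FRAGMENT_SPEED, duration):.2f} mm")
```

At the assumed speed, a fragment travels about 50 mm during the longer duration but only about half a millimeter during 1/20,000 of a second, which is why the flying bits read as frozen in place.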
Lighting
AI image generators have the ability to create incredibly complex lighting scenarios. Once again, it is up to you as the image maker to compose your prompts for photorealistic images accordingly.
The prompt for this cosplay/fashion photograph included very specific instructions on the lighting effect I envisioned: “…her skin is flawless and seems to shimmer in the light of the genie’s lamp that she is holding. In reality she is being illuminated by a large soft box light source just outside the right frame of the image. The area behind the model falls into darkness with just the light of a pair of suspended sconce lamps being recorded.”
Story Telling
They say “a picture is worth a thousand words”. As an image competition judge I always had to consider whether or not the artist effectively told a story with their creation. Every exceptional image tells a story.
The prompt for the image above started as follows: “A breathtaking and emotionally charged full color photograph capturing a heartwarming moment when a rugged cowboy snuggles up to the face of his loyal horse, at the end of a long day in the saddle. This mesmerizing, waist up, photo-realistic image effortlessly conveys the profound bond between humans and animals, transcending the barriers that separate their individual worlds.”
To further emphasize the story that an image is telling I always end my prompts with a statement of what the essence of the image is. In this image the last sentence of the prompt read as follows: “This extraordinary, high-resolution photograph is a testament to the unconditional love that can exist in the natural world and the profound connections that can form between its inhabitants.”
Trolls Need Not Apply
By now you are guessing that my prompts are pretty long and detailed. You are correct. I’m sure that a few keyboard warriors are going to “attack” this post saying that they can achieve similar results with far fewer words. (I find it humorous that they don’t have an extra two minutes to compose detailed, effective prompts for photorealistic images, but they can find an hour to argue about it on social media.)
It’s true that you can achieve photorealistic output using far less-detailed prompts than I choose to use. If your entire motive is to surprise yourself, then show off “your” artwork on social media, by all means write short, concise prompts that have little to no direction or emotion, and see what the AI bot comes up with. It will be pleasing. It will be fun. It will also look just like everything else that other uninspired folks are producing.
If, however, you are commissioned to create a very specific image for a client, you can’t rely on “getting lucky” when you prompt the AI image generator to create exactly what you need. The bot will do what you tell it to do. You have to take the time and trouble to instruct it properly. All that being said, I will share with you the complete prompt that was used for the image below.
Putting It All Together
The complete prompt: “In this dramatic, high fashion, photorealistic waist up image, a rugged, handsome, fit male model, with five o’clock shadow, standing on the left side of the frame, wears a formal white tuxedo shirt with no jacket, while looking towards the window that is illuminating the scene; an elegant hotel room with marble walls, ornate windows and a small table out of focus in the background with the remains of his room service breakfast upon it. The photograph is masterfully captured using a Nikon Z7II mirrorless camera, known for its high resolution and accurate color rendition, paired with the versatile Nikkor Z MC 85mm f/1.2 S lens, celebrated for its exceptional sharpness, vibrant colors, natural bokeh and outstanding image quality. The camera settings are carefully chosen with an aperture of f/4 to create a shallow depth of field placing focus on the model while allowing the background elements to fall out of focus. The composition features a warm, amber, golden hour light and white balance is set to 4000K to bathe the scene in a soft, radiant glow. This extraordinary, high-resolution photograph is a testament to masculinity and poise, forever preserved in a stunning wedding day image. --ar 5:4 --v 5.1”
At just over 200 words a lot of folks might call this prompt excessive. I say that all that detail is necessary when composing prompts for photorealistic images. Especially if you want your bot to deliver exactly what you envision. The Midjourney bot that I use for my AI creations will recognize up to 6,000 characters in a prompt. I only used 1,200 of them in this prompt. There’s actually a whole lot more potential to give the bot even more to work with. There’s obviously a reason this much functionality is built into the AI image generators. It’s up to us creatives to push the envelope and see what this technology is really capable of. What it all comes down to is whether you want to create a very specific image or if you just want to create “something”.
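The structure used throughout this post (scene, composition, technical details, lighting, a closing statement of the image’s essence, then the trailing parameters) can be sketched as a small Python helper that joins the parts in order and reports how much of the character budget they use. The function name, the constant and the shortened sample sentences are hypothetical and exist only for illustration. The 6,000 character figure is simply the limit quoted above; the code does not verify it against Midjourney.

```python
MAX_PROMPT_CHARS = 6000  # limit quoted in this post; treated here as an assumption


def build_prompt(parts, parameters=""):
    """Join non-empty prompt parts in order and append trailing parameters."""
    text = " ".join(part.strip() for part in parts if part.strip())
    if parameters.strip():
        text = f"{text} {parameters.strip()}"
    if len(text) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(text)} characters; limit is {MAX_PROMPT_CHARS}")
    return text


# Shortened, hypothetical stand-ins for each section of a full prompt.
parts = [
    "In this dramatic, photorealistic waist up image, a male model looks towards a window.",  # scene
    "He is standing on the left side of the frame.",  # composition
    "An aperture of f/4 lets the background elements fall out of focus.",  # technical
    "Warm, amber, golden hour light bathes the scene in a soft glow.",  # lighting
    "This photograph is a testament to masculinity and poise.",  # essence
]

prompt = build_prompt(parts, "--ar 5:4 --v 5.1")
print(f"{len(prompt.split())} words, {len(prompt)} of {MAX_PROMPT_CHARS} characters used")
```

Keeping each section as its own entry makes it easy to see which of the judging criteria a prompt already covers and which ones still need a sentence or two.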
What’s Ahead?
I can only answer that question by repeating what I stated in my previous post: “Breakthroughs and advancements are being made every day. Most of the information in this article will be obsolete before the year is out. What I can say with certainty is that the advances that will be made will be mind-boggling. Revolutionary. They will continue to advance and astound as more and more photographers embrace AI imaging.”
I hope that more and more professional image makers embrace and start exploring this emerging technology. Remember: we’re dealing with machine learning technology here. I, for one, would like to see those machines learning from the best, most imaginative, most devoted folks in the craft.
As always, we welcome your comments below. You can quickly browse through all of the great features that are published on the Roadcraft USA blog on its interactive visual index page. Be sure to subscribe to Roadcraft USA. We send out monthly email notifications about new features that hit the blog.