Malaga AI Conference: My Key Lessons and Insights

Last Thursday and Friday, I focused on generative artificial intelligence. A recap.

Christoph Raethke invited a hand-picked group to the Innovation Campus Malaga: sun, 25 degrees, and a workshop. Two AI luminaries, Peter Kabel (Business & Design) and Boris Eldagsen (Art & Photography), took us through the subject in depth. The topic has existed for decades, of course, and by now everyone knows and uses ChatGPT. But how does generative AI actually work? Which data is used, and how? Which tools suit which tasks? How can I influence the results? Several things became clearer to me, and I was struck by how much the details decide the quality.

Shit in, shit out

Image tools such as Dall-E, Midjourney and Stable Diffusion combine a language model, which lets them understand prompts in the first place, with data from millions of images. Depending on how those images were described and tagged, the tools deliver different results for the same subject.

Of course, you can have Lake Biel drawn in the style of Hodler or generated as a photo in the style of Helmut Newton. But first check how much the model has actually learned about Hodler and, if necessary, expand your prompt. At haveibeentrained.com you can check whether your subject or keyword appears in the training data.

If you ask ChatGPT what defines Hodler's or Newton's style and use those adjectives in the prompt, the results can improve massively. Depending on the prompt, you tap into different learned data. That is why you should always write your prompts in English: you draw on a broader data base and get more accurate results. Also make them as specific as possible.

Wes Anderson, for example: pastel tones and muted hues, dreamy, nostalgic atmosphere, vintage and retro elements, balanced, symmetrical, etc.
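
To make the idea concrete, here is a minimal sketch in Python of how a bare style name can be expanded into the adjectives that describe it. The table STYLE_DESCRIPTORS and the function expand_style are hypothetical names chosen for illustration; the descriptors are the ones listed above, and in practice you would collect them yourself, for example by asking ChatGPT.

```python
# Hypothetical lookup table: style name -> descriptors that define it.
# The entries are taken from the Wes Anderson example in the text.
STYLE_DESCRIPTORS = {
    "Wes Anderson": [
        "pastel tones and muted hues",
        "dreamy, nostalgic atmosphere",
        "vintage and retro elements",
        "balanced",
        "symmetrical",
    ],
}


def expand_style(subject: str, style: str) -> str:
    """Name the style and spell out its known descriptors in one prompt."""
    descriptors = STYLE_DESCRIPTORS.get(style, [])  # unknown style: no extras
    parts = [f"{subject} in the style of {style}", *descriptors]
    return ", ".join(parts)


print(expand_style("Lake Biel", "Wes Anderson"))
```

A style that is missing from the table simply yields the short prompt, which is a hint that you still need to research its descriptors.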

A new profession: the prompt writer?

Of course, you can have a “cool lobby in 3d render style” generated. With such short prompts, however, you leave most of the work to the AI, and the results are often unsatisfying. You can and should describe colours, materials, moods, emotions, trends, decades, fashions or technical details such as camera model, perspective, film type, lighting, exposure time, focal length, etc. For example, instead of asking for an image from above, you can add “Drone Footage” to your prompt. A “professional” prompt can then look like this:

“The parametric hotel lobby is a sleek and modern space with plenty of natural light. The lobby is spacious and open with a variety of seating options. The front desk is a sleek white counter with a parametric design. The walls are a light blue color with parametric patterns. The floor is a light wood color with a parametric design. There are plenty of plants and flowers throughout the space. The overall effect is a calm and relaxing space. occlusion, moody, sunset, concept art, octane rendering, 8k, highly detailed, concept art, highly detailed, beautiful scenery, cinematic, beautiful light, hyperreal, octane render, hdr, long exposure, 8K, realistic, fog, moody, fire and explosions, smoke, 50mm f2.8”

Depending on the tool, the order of the words and various control characters such as (()) or [[]] can affect the result. At a minimum, you should define:

Style
Artist
Formats
Boosters
Vibes
Perspective
Technical
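
The checklist above can be sketched as a small Python helper that assembles a prompt from these seven dimensions and reports which ones are still empty. This is only an illustration under my own assumptions: the names DIMENSIONS, build_prompt and missing_dimensions are hypothetical, the fixed comma-separated order is one arbitrary choice, and the example values are borrowed from the lobby prompt above.

```python
# The seven dimensions from the checklist, in the order they are joined.
DIMENSIONS = ["style", "artist", "formats", "boosters",
              "vibes", "perspective", "technical"]


def build_prompt(subject: str, **dimensions: list) -> str:
    """Join the subject and all filled-in dimensions into one prompt string."""
    unknown = sorted(set(dimensions) - set(DIMENSIONS))
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    parts = [subject]
    for name in DIMENSIONS:
        parts.extend(dimensions.get(name, []))  # skip dimensions left empty
    return ", ".join(parts)


def missing_dimensions(**dimensions: list) -> list:
    """Return the checklist items that have no keywords yet."""
    return [name for name in DIMENSIONS if not dimensions.get(name)]


spec = {
    "style": ["concept art"],
    "boosters": ["8k", "highly detailed"],
    "vibes": ["moody"],
    "perspective": ["drone footage"],
    "technical": ["50mm f2.8"],
}
print(build_prompt("parametric hotel lobby", **spec))
print("still missing:", missing_dimensions(**spec))
```

Because word order can matter depending on the tool, keeping the order in one list makes it easy to rearrange and compare results.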

The art of prompting: Besides syntax and a good feel for language, one thing is decisive: specialist knowledge. The prompts of an art historian, a professional photographer or a graphic designer can look very different because of what each of them knows. In short, it is an art, and it takes a lot of experience to write the right prompts. By applying your knowledge, you strongly influence the result and, I think, create something new.

Would you share the prompts and dimensions you have discovered and, say, collected in an Excel sheet? Hmm, they will become your secret treasure. Your AI knowledge!

This is also where the work of Boris comes in, who has just won a Sony World Photography Award 2023 for his AI-generated image. And that brings us straight to the next topic: AI art is more than a tool; it is a process.

Processes and toolsets

Dall-E, Midjourney and especially the open-source-based Stable Diffusion are certainly the leading tools. But within each tool there are thousands of parameters, functions, plug-ins and presets to choose from. Before an image is finished, you often jump back and forth a lot, depending on the tool.

Johannes Vermeer's girl in the original

Besides art, applications in marketing are of course exciting. Imagine you have a backpack you want to market, but headquarters has only provided three photos with a male model.

As the photos were unfortunately taken in landscape format, you extend the image vertically with outpainting so you can use it for social media.
With inpainting, you make certain areas in the photo a little calmer.
You also want the product to be larger and the crop a little wider. Instead of simply zooming in on the photo (which would make it blurry), you let the AI calculate the missing information with upscaling.
In a second step, you replace the backpack with two other colour variants.
Then you replace the man with a female model and change the setting of the shots to iconic local places in Zurich, Basel, Lucerne, Geneva and Bern.

Within a few hours, you have produced 90 different ads without ever having been to different locations, and you have saved an additional shoot with a female model.
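
One way to arrive at the figure of 90 is to multiply the options from the steps above. The breakdown is my assumption, not something spelled out in the workshop: three photos, three backpack colours (the original plus two variants), two models and five cities. A short Python sketch with placeholder labels enumerates the combinations:

```python
from itertools import product

# Assumed breakdown; all labels are placeholders for illustration.
photos = ["photo 1", "photo 2", "photo 3"]
colours = ["original colour", "colour variant A", "colour variant B"]
models = ["male model", "female model"]
cities = ["Zurich", "Basel", "Lucerne", "Geneva", "Bern"]


def ad_variants() -> list:
    """Return every photo/colour/model/city combination as one ad brief."""
    return [
        {"photo": photo, "colour": colour, "model": model, "city": city}
        for photo, colour, model, city in product(photos, colours, models, cities)
    ]


variants = ad_variants()
print(len(variants))  # 3 * 3 * 2 * 5 combinations
print(variants[0])
```

Each dictionary could serve as the brief for one inpainting or background-replacement run.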

August Kamp × DALL·E outpainting of the girl

At the moment, many steps still have to be carried out in several different tools, but new, specialised applications are appearing every day. ControlNet will help to steer such processes better and influence them in detail. One example of a specialised application is www.headshotpro.com: you upload photos of your employees and automatically generate good staff headshots that meet your corporate design standards and look consistent.

Of course, there are still many limits and shortcomings

But the AI is learning at breathtaking speed, and a tsunami of new tools arrives every day. Fun fact: hands seem to be the hardest thing for AI, because they are depicted so differently from image to image. Midjourney v5.0 seems to be slowly getting a handle on this.

There are astonishing results in video too, but the road to perfect output is longer there. Time and three dimensions are, of course, harder to learn than a flat photo or text. It remains exciting.

As always with new developments: resistance

Of course there are ethical dimensions that need to be discussed, tested and learned, but in principle you should approach the matter proactively. With its new AI regulation, the EU seems to be missing a historic opportunity and leaving development to the US and Asia instead.

The UK, by contrast, is investing more than £1bn in this area.

AI is a complex topic, and much depends on who uses it and how. So is this really not art? I say: it is! Similar accusations have been made before:

“Photography is the sworn enemy of painting; it is the refuge of all failed painters, the untalented and the lazy.” Charles Baudelaire (1821-1867)