Text to image AI – first thoughts

It seems culture, philosophy and technology have been converging around language and semantic structures for a while: first through the ideas of postmodern philosophers, and now through text-to-image generators and language models like GPT-3/4, which have been made widely accessible through platforms like Midjourney and OpenAI’s chatbot.

This leads to many intriguing questions, such as ‘What does it mean to be seduced by the new?’, ‘How is value determined?’, and ‘How will the role of imagination, skill and curation change?’
Also, ‘Is this thing alive yet?’, which obviously has pretty deep implications for how we think about intelligence and evolution. My first thought on that is about how we tend to anthropomorphise, sometimes fetishise, or even deify the things we create (even cars have facial features). So perhaps, while we explore the ‘uncanny valley’, it’s good to remember all those engineers writing code.

buddhist tibetan thanka with thin rainbows and wispy spiral buddhist clouds, lotus flowers, flowing robes, flowing sash, and chinese ink mountains dynamic, superflat, gouache style --v 4

Falling down the rabbit-hole of Midjourney prompting, it becomes immediately apparent that synthesis and novel combinations of pop iconography are in abundance. Ever wondered what masks from science fiction and superhero movies would look like as Chinese porcelain? There is a huge amount of kitsch, cyber/steampunk mashup and every kind of popular ArtStation / DeviantArt theme. Let’s call this ‘Contemporary Pop Art’, or maybe ‘Metamodern Kawaii’. This kind of imagery is clearly going to become extremely commonplace. It’s already making its way into animation and is heading fast towards mainstream screen entertainment, and in magazine editorial we are already starting to see some AI-created illustrations.

Without forming too many immediate judgements about value or mediocrity, I must admit I enjoy the freedom and play of being able to lean into my own particular preferences for kitsch: Chris Foss / Boris Vallejo / Moebius-style science fiction with spaceships (creating concept art and a potential album cover for a band I play in called Council of Neptune), or rendering the best sportscar design of all time as a dune buggy, just for fun.

I named my Instagram account ‘statistical unconscious’ because one of the first things that occurred to me about platforms such as Midjourney is that they represent a collective unconscious: an actual manifestation of humanity’s net output on the internet (in its current version including hundreds of billions of data points), connected through linguistic nodes and taxonomies, and reinforced and adapted through interaction to evolve a user-driven aesthetic. So while I don’t yet perceive a distinct intelligence per se, it can seem like a kind of Jungian dreaming, especially with the roll-the-dice aspect that evokes familiar tools like Tarot or the I Ching. When the results of your prompt are mysterious, resonant and compelling, hey presto, we feel as though we have contacted some divine oracle.

In terms of how directive prompt design can be, here is an interesting article describing the bias and preference/style present in simply asking for a circle. Racial bias is also present, as we might expect, not because of the algorithm but because of the inherent bias of the content it incorporates.

While the inner workings of the algorithm are something of a mystery, we can start to discern something about what might be happening under the hood just by using it. Getting specific results proves to be more of a challenge, which for me is the most interesting area to explore, for example when prompting Buddhist iconography. The statistical element becomes apparent when, for example, I type “buddhist tibetan thanka with thin rainbows, wispy buddhist clouds, lotus flowers, flowing robes, flowing sash, dynamic, superflat, gouache --v 4” and one of the images looks like a painting by Hilma af Klint. I assume the algorithm is generating a set of keywords which includes, say, “spiritual painting”: something like making a cluster of related content and then drawing randomly from it. Occasionally it may pull in something from the edge of the cluster.
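
To make that guess concrete, here is a toy sketch in Python of "a cluster you draw from at random". It is only an illustration of my intuition and not a description of how Midjourney actually works: the concept names, the distance values and the `sharpness` parameter are all made up for the example.

```python
import random

# Hypothetical cluster: each concept gets an invented distance from the
# centre of the prompt's meaning (0.0 = core, 1.0 = outer edge).
CLUSTER = {
    "Tibetan thangka": 0.0,
    "mandala": 0.2,
    "lotus motif": 0.3,
    "spiritual painting": 0.6,
    "early abstract art": 0.9,
}


def draw(rng, sharpness=2.0):
    """Pick one concept; concepts nearer the centre are more likely."""
    names = list(CLUSTER)
    # Closer concepts get a larger weight; the small offset keeps the
    # edge of the cluster possible, just rare.
    weights = [(1.0 - CLUSTER[name] + 0.05) ** sharpness for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def tally(n=1000, seed=4):
    """Draw n times with a fixed seed and count how often each concept appears."""
    rng = random.Random(seed)
    counts = {name: 0 for name in CLUSTER}
    for _ in range(n):
        counts[draw(rng)] += 1
    return counts


if __name__ == "__main__":
    for name, count in tally().items():
        print(f"{name:20s} {count}")
```

Run it and most draws land near the centre, with the occasional visitor from the edge, which is roughly how the af Klint lookalike felt.
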
It’s likely that the end-to-end path from prompt to results will develop to include prompt-based editing, iterative steps and more specificity. More control means less generic results.

Using one of my own images, a digital collage using vectorized source material from textile designs: https://s.mj.run/1pDsSkdAoto posterized psychedelic nature collage, screenprint, magic --v 4 --no characters, people, faces

It’s worth considering the idea of attribution, since it is possible to literally generate an image in someone else’s style. An earlier reference point is the birth of sampling in the 1980s, with the debate, reaction and new artistic forms that arose from it. I’ve seen some discussion about this on the Midjourney Discord, and my assessment of the more intelligent commentary is that most artists would actually benefit from being copied, and most of the output generated is not going to gain much traction anyway. In most cases the moat of branding and personal relationships is more likely to mean that artists gain more recognition, and their work, especially if handmade, will become even more desirable and collectable. It is also currently possible to opt out of being included in the corpus of imagery used to train the model on Midjourney, if an artist wishes not to be assimilated. There are also some ideas about royalties being discussed, but I believe these come from an attempt to graft a concept from NFTs (where the smart contract grants ongoing royalties back to the artist with each sale) onto a fundamentally different technology, and ultimately this would not even be practically possible to do.

Things can also start to get quite meta when we use OpenAI’s chatbot to write formatted prompts with controlled randomness. The chatbot is such a game-changing development that it would need a separate article. As an example, it can write WordPress code, and is context-aware enough to provide corrections or alternatives based on your response. It is an amazing technology that will change things in ways we probably cannot yet imagine.
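
The "formatted prompts with controlled randomness" idea can also be sketched without a chatbot, in a few lines of Python. This is a minimal sketch and not what the chatbot produces: the word lists and the `build_prompt` function are hypothetical, and the only Midjourney-specific assumption is that parameters are typed with two hyphens (`--v`, `--no`), as in the prompts shown in this post. The randomness is controlled by the seed, so the same seed always gives the same prompt.

```python
import random

# Hypothetical word lists; swap in your own subjects, styles and modifiers.
SUBJECTS = ["buddhist clouds", "lotus flowers", "chinese ink mountains"]
STYLES = ["gouache", "screenprint", "Hokusai woodblock print"]
MODIFIERS = ["dynamic", "superflat", "psychedelic", "halftone pattern"]


def build_prompt(seed, n_modifiers=2, version=4, exclude=("text", "frame")):
    """Assemble one prompt string; the seed makes the random picks repeatable."""
    rng = random.Random(seed)
    parts = [rng.choice(SUBJECTS), rng.choice(STYLES)]
    # sample() picks distinct modifiers, so none is repeated in one prompt.
    parts += rng.sample(MODIFIERS, n_modifiers)
    prompt = ", ".join(parts) + f" --v {version}"
    if exclude:
        prompt += " --no " + ", ".join(exclude)
    return prompt


if __name__ == "__main__":
    # Print a small batch of variations to paste into Midjourney.
    for seed in range(3):
        print(build_prompt(seed))
```

Changing the seed rolls the dice again, while the template keeps every prompt in the same format.
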

I think it’s worth celebrating the massive democratisation that a free (at least to get started) art creation tool offers for general imagining. And while there are going to be some jobs that disappear or become rarer, such as game concept artist or editorial illustrator, the creation of worlds is likely to be an interesting area of creativity, and one that may lead to new creative talents. Whether that means metaverse(s) (crypto-related or not, but hopefully at least partially open-source and user-generated) or virtual experiences designed by software houses or artists, this current iteration of AI is likely to continue expanding quickly. 2D to 3D, in-painting, out-painting and linking up different specialised algorithms offer both almost limitless possibilities and the challenge of how to do something focused, specific, coherent and useful.

a bold, graphic Japanese screenprint with buddhist clouds, colourful interlocking layered geometric shapes and patterns, dynamic, psychedelic, halftone pattern, Hokusai woodblock print --v 4 --no text, caption, title, frame --ar 2:3


I’m going to make a few predictions about what we might see emerge as themes:

  • The phenomenon of ‘Japlish’ or ‘Chinglish’ is likely to mutate, and text as ambiguous or even meaningless decoration will expand. In particular the English language will become even more malleable.
  • New mythologies will be created around juxtaposition of ancient and modern. New Age people will be really into this, and ‘techno-shamanism’ will expand.
  • Some artists with social and reputational capital will exploit International Art Speak to legitimise and differentiate their AI-generated work, even though the work may be much the same as standard prompt generation. Possibly either hyper-kitsch or ultra-grotesque.
  • At some point in a near-dystopian future, psychometric research is going to find very powerful ways to interface with our sensory apparatus, and find out exactly which buttons to press. I would be more concerned about Meta’s Metaverse for this.
  • Architecture and Design will experience a decentralising revolution, at least in the conceptual layer, and many ideas will surface that may well end up changing how we make things and what things get made.
  • Collective art making will become more prominent.

In terms of my own art making, I intend to explore generating ideas or layers that are repurposed and transformed, perhaps as a scaffold for specific details, patterns, layouts etc., and to see how to bring some of this collective wisdom into the physical space. Going beyond literally superflat screens or giclée prints into 3D printing, or exploring the meeting of, say, T’ang poetry and molecular chemistry (code is inherently poetic and Zen), adds dimensionality and depth to work both conceptually and experientially. I can contemplate visual tropes that coalesce as nodes of meaning, culture and geopolitics, and distil from them an essence which feels beautiful, poetic and relevant. Recently I have found it fascinating to use my own digital collage images as source material and expand on a specific aesthetic.

Arabic tessellated microprocessor pattern flat 2d --v 4

I look forward to reading some analysis of these technologies from a philosophy of aesthetics and culture/meaning perspective. I’m also expecting to see unnecessarily intellectualised, overcomplicated explanations of something that most people will just instinctively understand or appreciate. Do we need indecipherable International Art Speak post-structuralist dialectic, disguising the fact that the artist has intrinsically adopted a nihilist position, chosen the mirroring of banality, and is simultaneously pronouncing themselves the genius-judge of ‘multisectional nodes of refractory semiotics within the third eye of the awakening posthuman anthroposcene cybermind’? Or do memes and humour actually do a much better and more honest job of expressing complexity?

As always, market forces will determine value based on their own algorithms, these new tools will be used in as many ways as there are artists and thinkers, and there will be work that stands out as somehow embodying perfection. I look forward to seeing it.

Follow my Midjourney research at https://www.instagram.com/statisticalunconscious/